
    HBM on GPU: Thermal Challenges and Solutions

    By FreshUsNews, January 15, 2026

    Peek inside the package of AMD’s or Nvidia’s most advanced AI products, and you’ll find a familiar arrangement: The GPU is flanked on two sides by high-bandwidth memory (HBM), the most advanced memory chips available. These memory chips are placed as close as possible to the computing chips they serve in order to cut down on the biggest bottleneck in AI computing: the energy and delay in getting billions of bits per second from memory into logic. But what if you could bring computing and memory even closer together by stacking the HBM on top of the GPU?

    Imec recently explored this scenario using advanced thermal simulations, and the answer, delivered in December at the 2025 IEEE International Electron Devices Meeting (IEDM), was a bit grim. 3D stacking doubles the operating temperature inside the GPU, rendering it inoperable. But the team, led by Imec’s James Myers, didn’t just give up. They identified several engineering optimizations that ultimately could whittle down the temperature difference to nearly zero.

    Imec started with a thermal simulation of a GPU and four HBM dies as you’d find them today, inside what’s called a 2.5D package. That is, both the GPU and the HBM sit on a substrate called an interposer, with minimal distance between them. The two types of chips are linked by thousands of micrometer-scale copper interconnects built into the interposer’s surface. In this configuration, the model GPU consumes 414 watts and reaches a peak temperature of just under 70 °C, which is typical for a processor. The memory chips consume an additional 40 W or so and get somewhat less hot. The heat is removed from the top of the package by the kind of liquid cooling that’s become common in new AI data centers.


    “While this approach is currently used, it doesn’t scale well for the future, especially because it blocks two sides of the GPU, limiting future GPU-to-GPU connections inside the package,” Yukai Chen, a senior researcher at Imec, told engineers at IEDM. In contrast, “the 3D approach leads to higher bandwidth, lower latency.… An important improvement is the package footprint.”

    Unfortunately, as Chen and his colleagues found, the most straightforward version of stacking (simply putting the HBM chips on top of the GPU and adding a block of blank silicon to fill in a gap at the center) shot up temperatures in the GPU to a scorching 140 °C, well past a typical GPU’s 80 °C limit.

    System Technology Co-optimization

    The Imec team set about trying a number of technology and system optimizations aimed at lowering the temperature. The first thing they tried was throwing out a layer of silicon that was now redundant. To understand why, you first have to get a grip on what HBM really is.

    This form of memory is a stack of as many as 12 high-density DRAM dies. Each has been thinned down to tens of micrometers and is shot through with vertical connections. These thinned dies are stacked one atop another and connected by tiny balls of solder, and this stack of memory is vertically connected to another piece of silicon, called the base die. The base die is a logic chip designed to multiplex the data, packing it into the limited number of wires that can fit across the millimeter-scale gap to the GPU.

    But with the HBM now on top of the GPU, there’s no need for such a data pump. Bits can flow directly into the processor without regard for how many wires happen to fit along the side of the chip. Of course, this change means moving the memory control circuits from the base die into the GPU and therefore changing the processor’s floorplan, says Myers. But there should be ample room, he suggests, because the GPU will no longer need the circuits used to demultiplex incoming memory data.


    Cutting out this middleman of memory cooled things down by only a little less than 4 °C. But, importantly, it should massively boost the bandwidth between the memory and the processor, which is important for another optimization the team tried: slowing down the GPU.

    That might seem contrary to the whole purpose of better AI computing, but in this case, it’s an advantage. Large language models are what are called “memory-bound” problems. That is, memory bandwidth is the main limiting factor. But Myers’s team estimated that 3D stacking HBM on the GPU would boost bandwidth fourfold. With that added headroom, even slowing the GPU’s clock by 50 percent still leads to a performance win, while cooling everything down by more than 20 °C. In practice, the processor might not need to be slowed down quite that much. Increasing the clock frequency to 70 percent led to a GPU that was only 1.7 °C hotter, Myers says.
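
    To see why a slower clock can still come out ahead on a memory-bound workload, here is a minimal sketch based on a roofline-style bound, in which attainable throughput is the lower of a compute ceiling and a memory ceiling. The roofline framing is an assumption brought in for illustration, not Imec's model, and the function name and the unitless numbers below are hypothetical; only the fourfold bandwidth gain and the 50 and 70 percent clock settings come from the article.

```python
def attainable_throughput(peak_compute, mem_bandwidth, arithmetic_intensity,
                          clock_scale=1.0, bandwidth_scale=1.0):
    """Roofline-style bound: the lower of the compute and memory ceilings.

    All inputs are illustrative, unitless quantities (hypothetical values).
    """
    compute_ceiling = peak_compute * clock_scale
    memory_ceiling = mem_bandwidth * bandwidth_scale * arithmetic_intensity
    return min(compute_ceiling, memory_ceiling)


# Hypothetical numbers chosen only so that the baseline is memory-bound.
PEAK = 1000.0      # compute ceiling at full clock
BANDWIDTH = 2.0    # memory bandwidth in the 2.5D package
INTENSITY = 100.0  # operations performed per unit of data moved

# 2.5D baseline: limited by memory, not by compute.
baseline = attainable_throughput(PEAK, BANDWIDTH, INTENSITY)

# 3D stacking: 4x bandwidth (from the article), clock cut to 50 percent.
stacked_half_clock = attainable_throughput(
    PEAK, BANDWIDTH, INTENSITY, clock_scale=0.5, bandwidth_scale=4.0)

# Same stack with the clock at 70 percent.
stacked_70_clock = attainable_throughput(
    PEAK, BANDWIDTH, INTENSITY, clock_scale=0.7, bandwidth_scale=4.0)

print(baseline, stacked_half_clock, stacked_70_clock)
```

    Under these made-up inputs the half-clock stacked design is limited by compute rather than by memory, yet it still beats the baseline. In this simple model the win holds whenever the baseline workload was using less than half of the compute ceiling.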

    Optimized HBM

    Another big drop in temperature came from making the HBM stack and the area around it more thermally conductive. That included merging the four stacks into two wider stacks, thereby eliminating a heat-trapping region; thinning out the top die of the stack, which is usually thicker; and filling in more of the space around the HBM with blank pieces of silicon to conduct more heat.

    With all of that, the stack now operated at about 88 °C. One final optimization brought things back to near 70 °C. Usually, some 95 percent of a chip’s heat is removed from the top of the package, where in this case water carries the heat away. But adding similar cooling to the underside as well drove the stacked chips down a final 17 °C.
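
    The temperatures quoted in the article can be checked against each other with a little arithmetic. The sketch below only restates those reported figures; the variable names and the `headroom` helper are illustrative, and the 2.5D baseline is rounded to 70 °C even though the article says "just under" that value.

```python
# Figures quoted in the article, in degrees Celsius.
BASELINE_2_5D = 70.0         # 2.5D package peak ("just under 70", rounded here)
NAIVE_3D = 140.0             # HBM placed directly on the GPU, no optimizations
GPU_LIMIT = 80.0             # typical GPU limit cited in the article
OPTIMIZED_TOP_COOLED = 88.0  # after the stack optimizations, top-side cooling only
DOUBLE_SIDED_DROP = 17.0     # further drop from adding cooling to the underside


def headroom(temperature, limit=GPU_LIMIT):
    """Margin to the limit: positive is below the limit, negative is over it."""
    return limit - temperature


naive_ratio = NAIVE_3D / BASELINE_2_5D
final_temperature = OPTIMIZED_TOP_COOLED - DOUBLE_SIDED_DROP
gap_to_baseline = final_temperature - BASELINE_2_5D

print(f"Naive 3D stack vs. 2.5D baseline: {naive_ratio:.1f}x")
print(f"After double-sided cooling: {final_temperature:.0f} C")
print(f"Gap to 2.5D baseline: {gap_to_baseline:.0f} C")
print(f"Headroom to the GPU limit: {headroom(final_temperature):.0f} C")
```

    With top-side cooling alone the optimized stack is still above the cited limit. It is the underside cooling that brings it back under the limit and to within about a degree of the rounded 2.5D baseline.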

    Although the research presented at IEDM shows it might be possible, HBM-on-GPU isn’t necessarily the best choice, Myers says. “We are simulating other system configurations to help build confidence that this is or isn’t the best choice,” he says. “GPU-on-HBM is of interest to some in industry,” because it puts the GPU closer to the cooling. But it would likely be a more complex design, because the GPU’s power and data would have to flow vertically through the HBM to reach it.
