• AI energy demands could be lowered by large single-wafer chips
  • Researchers say these can overcome the limitations faced by GPUs
  • Cerebras and Tesla already use these huge chips, with special cooling systems to manage heat

Engineers at the University of California Riverside are exploring a new approach to artificial intelligence hardware that could tackle both performance and sustainability.

In a peer-reviewed paper, published in the journal Device, the team investigated the potential of wafer-scale accelerators – giant computer chips that span an entire silicon wafer, rather than the small individual dies used in today’s GPUs.

“Wafer-scale technology represents a major leap forward,” said Mihri Ozkan, a professor of electrical and computer engineering at UCR and lead author of the paper. “It enables AI models with trillions of parameters to run faster and more efficiently than traditional systems.”

Like monorails

These chips, like Cerebras’ Wafer-Scale Engine 3 (WSE-3), which we’ve covered previously, contain up to 4 trillion transistors and 900,000 AI-focused cores on a single unit. Another wafer-scale processor, Tesla’s Dojo D1, houses 1.25 trillion transistors and close to 9,000 cores per module.

The processors remove the delays and energy losses common in systems where data travels between multiple chips.

“By keeping everything on one wafer, you avoid the delays and power losses from chip-to-chip communication,” Ozkan said.

Traditional GPUs remain important thanks to their lower cost and modularity, but as AI models grow in size and complexity, GPUs begin to run into performance and energy barriers.

“AI computing isn’t just about speed anymore,” Ozkan explained. “It’s about designing systems that can move massive amounts of data without overheating or consuming excessive electricity.”

Wafer-scale systems have important environmental benefits too. Cerebras’ WSE-3, for example, can perform up to 125 quadrillion operations per second, while using far less energy than GPU setups.
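To put that throughput claim in perspective, here is a rough, back-of-envelope efficiency calculation. The 125 quadrillion operations per second comes from the article; the 10,000-watt power draw is the upper figure the article cites later for wafer-scale chips, used here as an assumed worst case rather than a measured value:

```python
# Back-of-envelope ops-per-watt estimate (illustrative, not a benchmark).
# Peak throughput is from the article; the power draw is the article's
# "up to 10,000 watts" figure, assumed here as a worst case.

wse3_ops_per_sec = 125e15   # 125 quadrillion operations per second
power_watts = 10_000        # assumed upper-bound draw

ops_per_watt = wse3_ops_per_sec / power_watts
print(f"~{ops_per_watt:.2e} ops per watt")  # ~1.25e+13
```

Even under this pessimistic power assumption, the chip would deliver on the order of ten trillion operations per watt, which is the kind of headroom the efficiency argument rests on.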

“Think of GPUs as busy highways – effective, but traffic jams waste energy,” Ozkan said. “Wafer-scale engines are more like monorails: direct, efficient, and less polluting.”

One major challenge remains, however: the age-old issue of heat. Wafer-scale chips can draw up to 10,000 watts of power, nearly all of which is converted to heat, so advanced cooling systems are needed to prevent overheating and maintain performance.

Cerebras uses a glycol-based cooling loop built into the chip, while Tesla has a liquid system that spreads coolant evenly over the chip’s surface.

Via Tech Xplore
