GPUs in the Sky With Diamonds
The modern narrative around artificial intelligence (AI) often fixates on chips as saviors, but there is more to the story than meets the eye.
The modern narrative around artificial intelligence (AI) often obsesses over GPUs as celestial saviors: glittering chips that will single-handedly propel humanity into a new industrial revolution. Yet this vision floats on incomplete information. GPUs, for all their parallel-processing prowess and the soaring market caps of their manufacturers, are merely the tip of an iceberg submerged in a sea of unsolved engineering challenges and bottlenecks. The true battleground for AI’s future lies not in the chips themselves but in the intricate infrastructure required to feed them data. Moving bits into and out of data centers at the speeds demanded by next-generation AI workloads will require reinventing nearly every layer of the technological stack, from silicon to satellites and back.
For starters, consider the “memory wall1”, a term that describes the existential crisis at the heart of computing: the widening gap between compute throughput and the latency, bandwidth, and density scaling of memory. Today’s processors, including GPUs, spend most of their energy and time waiting for memory hierarchies to deliver data. High-bandwidth memory (HBM) and 3D chip packaging offer partial relief, but they strain against thermal limits and manufacturing complexity. Even if a GPU could theoretically process exaflops, it remains shackled by the latency of fetching data from DRAM or storage. On top of that, DRAM currently needs roughly ten years to double in density.
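To see why the memory wall bites, a toy roofline model helps: compare the time an operation spends computing against the time it spends moving bytes, and the larger of the two wins. The peak figures below (100 TFLOP/s of compute, 2 TB/s of HBM bandwidth) are illustrative assumptions, not any vendor’s spec:

```python
# Back-of-the-envelope roofline check: is a kernel compute- or memory-bound?
# All hardware figures here are illustrative assumptions, not vendor specs.

def roofline_time(flops, bytes_moved, peak_flops, peak_bw):
    """Return (time_s, bottleneck) under a simple roofline model."""
    t_compute = flops / peak_flops
    t_memory = bytes_moved / peak_bw
    return max(t_compute, t_memory), ("compute" if t_compute >= t_memory else "memory")

# Hypothetical accelerator: 100 TFLOP/s compute, 2 TB/s HBM bandwidth.
PEAK_FLOPS = 100e12
PEAK_BW = 2e12

# A memory-bound op: elementwise add of two large float32 arrays
# (1 FLOP per 12 bytes moved: two reads and one write per element).
n = 2**28  # elements
t, bound = roofline_time(flops=n, bytes_moved=12 * n,
                         peak_flops=PEAK_FLOPS, peak_bw=PEAK_BW)
print(bound)  # memory: the ALUs finish in microseconds, then wait on HBM
```

For such low-arithmetic-intensity operations, the memory time dominates by orders of magnitude, which is exactly the idle waiting the paragraph above describes.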
Solving this requires not just better (new!) memory technologies but also rethinking system architectures2 to minimize data movement.
GPUs still sit on old-fashioned Printed Circuit Boards (PCBs3), a nearly 70-year-old concept built on lossy copper tracks, noisy connectors, and an intricate, error-prone manufacturing process4. PCB traces degrade electrical signals as they travel, and the faster the signal, the greater the degradation, which calls for breakthroughs in ultra-low-loss substrate materials and copper foils to sustain signal integrity at terabit-per-second speeds. As chiplet-based designs proliferate to sidestep the slowdown of Moore’s law, inter-chiplet interfaces must juggle Tbps data rates across nanometer-scale traces, battling crosstalk, impedance mismatches, and losses. The industry’s reliance on legacy methods, optimized for GHz-era demands, now collides with the need for optical interconnects.
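The scale of the signal-integrity problem can be sketched with a first-order loss model (a common rule of thumb, not a field-solver result): dielectric loss grows roughly linearly with frequency and loss tangent, while skin-effect conductor loss grows with the square root of frequency. The coefficients and laminate values below are assumptions chosen for illustration:

```python
import math

# Rough insertion-loss sketch for a PCB trace. First-order rule of thumb,
# not a field-solver result; coefficients are illustrative assumptions.

def trace_loss_db(f_ghz, length_in, dk=3.7, df=0.02, k_cond=0.1):
    """Approximate total insertion loss in dB for a trace.

    dk/df:  laminate dielectric constant and loss tangent (assumed values)
    k_cond: assumed conductor-loss coefficient in dB/inch at 1 GHz; real
            values depend on copper roughness and trace geometry.
    """
    dielectric = 2.3 * f_ghz * df * math.sqrt(dk)   # dB/inch, common rule of thumb
    conductor = k_cond * math.sqrt(f_ghz)           # dB/inch, skin effect
    return (dielectric + conductor) * length_in

# Same 10-inch trace at 28 GHz: standard FR-4 vs. an ultra-low-loss laminate.
fr4 = trace_loss_db(28.0, 10.0, dk=4.4, df=0.02)
low_loss = trace_loss_db(28.0, 10.0, dk=3.0, df=0.002)
print(f"FR-4: {fr4:.1f} dB, low-loss laminate: {low_loss:.1f} dB")
```

Under these assumed numbers the ordinary laminate loses tens of dB over a routine trace length, which is why every extra GHz of signaling rate pushes designers toward exotic materials or optics.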
The problem only gets worse at the macro scale. Data centers are evolving into distributed fabrics, with workloads split across GPU clusters, edge nodes, and even non-terrestrial platforms. Moving data between these layers demands fiber-optic backbones capable of handling petabit-scale traffic, yet today’s fiber networks struggle under the weight of protocol overheads and physical-layer inefficiencies. Meanwhile, wireless links, envisioned as bridges to orbital data infrastructure or drone-based relays, must contend with the harsh realities of mmWave physics. The wavelengths at mmWave frequencies are so short that the molecular composition of the air plays a major role in defining propagation properties. Radio waves are dramatically attenuated by atmospheric absorption from oxygen and water molecules, and also blocked by moving objects like cars and people. Reconfigurable intelligent surfaces5 and metamaterials promise to bend radio waves to our will, but deploying them at scale remains a materials-science and logistics challenge.
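A quick link-budget sketch shows how unforgiving mmWave is: Friis free-space path loss alone rises 20 dB for every tenfold increase in frequency, and near the 60 GHz oxygen absorption line the atmosphere adds attenuation on the order of 15 dB/km on top. The distances and absorption figures below are illustrative:

```python
import math

C = 3e8  # speed of light, m/s

def fspl_db(d_m, f_hz):
    """Free-space path loss in dB (Friis formula)."""
    return 20 * math.log10(4 * math.pi * d_m * f_hz / C)

def link_loss_db(d_m, f_hz, absorption_db_per_km):
    """FSPL plus a flat atmospheric absorption term (assumed uniform over the path)."""
    return fspl_db(d_m, f_hz) + absorption_db_per_km * d_m / 1000.0

# 60 GHz sits in an oxygen absorption peak (~15 dB/km is a commonly cited figure);
# at 6 GHz gaseous absorption is negligible by comparison.
print(f"200 m @ 60 GHz: {link_loss_db(200, 60e9, 15):.1f} dB")
print(f"200 m @  6 GHz: {link_loss_db(200, 6e9, 0.01):.1f} dB")
```

The tenfold jump in carrier frequency costs 20 dB before absorption is even counted, which is the raw physics behind the beamforming, relays, and intelligent surfaces discussed above.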
And the tooling! Design tools need to massively step up their game. Without systemic automation in design, manufacturing, and testing, from chiplets to boards and backplanes, AI systems will never achieve the yield and reliability required for mission-critical deployments. Design tools remain deeply siloed: EDA, CAD, and software tools still mind their own business. Better multiphysics simulation, PLM that does not suck, and enterprise software architectures that do not turn against us6 are the least we should demand.
The road ahead is not merely technical but existential. Each layer of the stack—from the atomic precision of 3D packaging to the geopolitical logistics of chips and undersea fiber cables—requires unprecedented collaboration between physicists, engineers, and policymakers. The cost of failure is not stagnation but collapse: a world where GPUs sit idle, starved of data, while the infrastructure meant to sustain them buckles under its own complexity.
Of course, as compute density scales, so does power density. There is still no such thing as a free lunch. Future architectures target teraflops-per-watt ratios yet still demand kilowatt-scale power delivery per chip. Even meaningful improvements (e.g., gallium nitride (GaN) transistors or 3D switched-capacitor architectures) barely offset the soaring currents required for 3nm-class logic. Worse, the aforementioned “memory wall” forces GPUs to idle at 30-40% utilization while waiting for data, wasting energy on static leakage currents that scale with transistor count.
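The energy cost of that idling can be sketched with simple arithmetic: even when stalled, the chip keeps burning static power. The 35% utilization and 40% static-power share below are assumed round numbers, not measurements:

```python
# Illustrative energy split for an accelerator stalled on memory.
# Both input figures are assumptions chosen for the sketch.

def wasted_energy_fraction(utilization, static_fraction):
    """Fraction of total energy spent while the chip does no useful work.

    utilization:     fraction of time the ALUs are busy (e.g., 0.35)
    static_fraction: share of peak power drawn even when idle
                     (leakage, clock trees, always-on logic)
    """
    busy = utilization * 1.0                    # busy time at full power
    idle = (1 - utilization) * static_fraction  # idle time at static power only
    return idle / (busy + idle)

# A GPU idling 65% of the time while drawing 40% of peak power when idle:
print(f"{wasted_energy_fraction(0.35, 0.40):.0%} of the energy buys no FLOPs")
```

Under these assumptions, well over a third of the electricity delivered to the chip produces no computation at all, which is why the memory wall is as much a power problem as a performance one.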
Cooling these systems is no longer a matter of airflow but a war against physics. A single GPU rack can dissipate staggering amounts of heat, with localized thermal fluxes exceeding 1 kW/cm² at hotspots like interconnects and HBM stacks. Traditional air cooling, bounded by the physics of convection, collapses at these power densities. Liquid cooling, whether via cold plates, immersion tanks, or direct-to-chip microfluidics, is now mandatory, but it introduces its own nightmares. Two-phase immersion cooling, while efficient, demands engineered dielectric fluids with precise boiling points and hermetic rack designs to prevent fluid degradation and galvanic corrosion at copper-aluminum interfaces. Geez.
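The plumbing implications follow from basic thermodynamics, Q = ṁ·cp·ΔT: the heat removed equals mass flow times heat capacity times the coolant’s temperature rise. A minimal sketch, assuming a hypothetical 120 kW rack and a 10 K allowed coolant temperature rise:

```python
# How much coolant does a dense rack need? Minimal sketch using Q = m_dot * c_p * dT.
# Rack power and allowed temperature rise are assumed values, not measurements.

def coolant_flow_lpm(heat_w, delta_t_k, cp=4186.0, density=1000.0):
    """Liters per minute of water-like coolant needed to absorb heat_w watts
    with a delta_t_k temperature rise (cp in J/(kg*K), density in kg/m^3)."""
    m_dot = heat_w / (cp * delta_t_k)        # mass flow, kg/s
    return m_dot / density * 1000.0 * 60.0   # convert m^3/s -> L/min

# A hypothetical 120 kW rack with a 10 K coolant temperature rise:
print(f"{coolant_flow_lpm(120_000, 10):.0f} L/min of water-equivalent coolant")
```

Roughly a bathtub of water per minute, per rack, continuously; multiply by thousands of racks and the pumps, manifolds, and leak detection become infrastructure problems in their own right.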
As we can see, AI’s future does not depend on GPUs alone but on our ability to reinvent the mundane—the cables, connectors, and protocols that form the nervous and circulatory system of computation. And pushing the limits of thermodynamics. The diamonds are not the chips but the unsung innovations buried in the deep trenches of materials science and systems engineering.
Tech to Keep an Eye On: 6G and The Advent of Intelligent Surfaces
As mobile networks evolve and user density keeps climbing, the industry keeps pushing toward higher frequency bands to meet the demand for greater data throughput. That push requires more complex antenna designs and access techniques, combined with ever more sophisticated radio access technology.