The NVIDIA GB300 NVL72 isn't just another AI server—it's a 1.36-tonne, liquid-cooled, 120 kW exascale rack that reshapes what's possible inside a power plant. With 72 Blackwell Ultra GPUs, 36 Grace CPUs, 20 TB of HBM3e memory, and 130 TB/s of NVLink bandwidth in a single cabinet, the GB300 NVL72 turns a server room into an AI factory capable of running trillion-parameter plant LLMs, multi-modal foundation models, and real-time digital twins on-prem. This guide breaks down what's actually inside the rack, the deployment topology that fits a generating station, the power and cooling reality you'll need to engineer around, and how it stacks up against H200 and GB200 NVL72.
Upcoming iFactory Ai Live Webinar:
GB300 NVL72 Inside the Power Plant
Join the iFactory team for a live walk-through of NVIDIA's flagship rack-scale AI system deployed inside operating power plants. We'll cover specs, power topology, cooling loops, workload mapping, and the full migration path from H200 to GB300, drawing on 1,000+ enterprise implementations.
One Rack. 72 GPUs. Acting as a Single Massive GPU.
The GB300 NVL72 is a fully liquid-cooled, rack-scale system where 72 Blackwell Ultra GPUs and 36 Grace CPUs are stitched together by fifth-gen NVLink into a single 72-GPU NVLink domain. The whole rack behaves like one accelerator. Book a 30-minute briefing if you want our engineers to model the rack inside your existing plant footprint.
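What "behaves like one accelerator" means in practice is that any NCCL-backed framework sees the rack as 72 flat peers whose collectives stay on the NVLink fabric. Below is a minimal, hypothetical PyTorch sketch; the launch command, script name, and tray-to-process mapping are illustrative assumptions, not NVIDIA tooling.

```python
# Minimal sketch: one collective across the 72-GPU NVLink domain with PyTorch + NCCL.
# Illustrative launch (18 compute trays x 4 GPUs each):
#   torchrun --nnodes=18 --nproc_per_node=4 nvlink_domain_check.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")          # NCCL keeps intra-rack traffic on NVLink/NVSwitch
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)

    # One scalar per GPU; the all-reduce below never leaves the NVLink fabric.
    x = torch.tensor([float(dist.get_rank())], device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        # Sum of ranks 0..71 is 2556 when all 72 GPUs participate.
        print(f"all-reduce over {dist.get_world_size()} GPUs -> {x.item():.0f}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

The same pattern carries over to tensor- or expert-parallel layouts: inside the 72-GPU domain the partitioning is a software choice rather than a network-topology constraint.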
Compute, Memory, and Interconnect — The Numbers That Matter
Three numbers do most of the work explaining why GB300 NVL72 exists: 1.1 exaFLOPS of dense FP4 compute, 20 TB of HBM3e at 569 TB/s aggregate bandwidth, and 130 TB/s of NVLink fabric inside one cabinet.
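For readers who want to see where the headline figures come from, here is a back-of-envelope sketch that rebuilds them from commonly quoted per-GPU Blackwell Ultra numbers; treat the per-GPU values as approximations rather than datasheet figures.

```python
# Back-of-envelope: rack-level GB300 NVL72 figures from approximate per-GPU specs.
GPUS_PER_RACK = 72
HBM_PER_GPU_GB = 288            # HBM3e per Blackwell Ultra GPU
FP4_DENSE_PER_GPU_PFLOPS = 15   # approximate dense FP4 per GPU

total_hbm_tb = GPUS_PER_RACK * HBM_PER_GPU_GB / 1000                  # ~20.7 TB (the ~20 TB headline)
total_fp4_eflops = GPUS_PER_RACK * FP4_DENSE_PER_GPU_PFLOPS / 1000    # ~1.1 exaFLOPS dense

print(f"Aggregate HBM3e : {total_hbm_tb:.1f} TB")
print(f"Dense FP4       : {total_fp4_eflops:.2f} exaFLOPS")
```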
Four Reasons Power Plants Are Choosing GB300 NVL72
A power plant is not a hyperscaler—but the AI workloads it runs are starting to look like hyperscaler-class problems. Four pressures push thermal, gas, hydro, and nuclear operators toward GB300-class infrastructure on-prem.
A foundation model fine-tuned on plant DCS history, P&IDs, maintenance logs, and OEM manuals easily breaks 1T parameters. The 20 TB of unified HBM3e in a single GB300 rack holds such a model without sharding across racks: simpler topology, lower latency. A rough sizing check follows these four drivers.
A Cosmos-class world foundation model rendering the boiler furnace and turbine hall in real time produces 4M tokens per 5-second sequence. Blackwell Ultra runs that 30× faster than Hopper—on-prem, behind the plant firewall.
"Why did Unit 3 trip?" is a reasoning query, not a classifier. DeepSeek-R1-class models running test-time scaling demand 10× more inference compute. GB300 delivers exactly that—1.5× FP4 FLOPS and 2× attention vs Blackwell.
Nuclear, defense, and regulated thermal sites can't route control-room data through a hyperscaler. The GB300 NVL72 lets you run frontier models inside the plant security perimeter—no operational data leaves the site.
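The sizing check promised under the first driver: a crude estimate of whether a 1T-parameter plant model plus its serving state fits in one rack's HBM. Every input here is an illustrative assumption (FP8 weights, a flat overhead multiplier), not a measured deployment.

```python
# Crude memory-fit check for a 1T-parameter plant foundation model on one GB300 NVL72.
# All inputs are illustrative assumptions, not measured values.
PARAMS = 1.0e12              # 1T parameters
BYTES_PER_PARAM = 1.0        # FP8 weights
SERVING_OVERHEAD = 2.0       # rough multiplier for KV cache, activations, runtime buffers
RACK_HBM_TB = 20.0           # unified HBM3e in the rack

weights_tb = PARAMS * BYTES_PER_PARAM / 1e12
total_tb = weights_tb * SERVING_OVERHEAD
headroom = RACK_HBM_TB - total_tb

print(f"Weights: {weights_tb:.1f} TB, with serving overhead: ~{total_tb:.1f} TB")
print(f"Headroom left in {RACK_HBM_TB:.0f} TB of rack HBM: ~{headroom:.0f} TB")
```

Even with generous overhead the model stays inside one NVLink domain, which is the point of the first driver: no cross-rack sharding for the plant LLM.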
Where the Rack Sits Inside an Operating Power Plant
A GB300 NVL72 inside a power plant is not just "another rack in the IT room." It needs a dedicated AI infrastructure zone with its own electrical, cooling, and network topology. Schedule a topology walkthrough with our deployment engineers and we'll model exactly where the rack lands in your station layout.
120 kW per Rack — What Your Facility Actually Needs
A GB300 NVL72 cabinet draws roughly 120 kW continuous. That's 5–15× beyond the air-cooling ceiling of a standard server rack and forces real engineering decisions on power delivery and coolant loops; the spec list below and the heat-balance sketch after it show the scale.
Power delivery
- ~120 kW continuous rack draw
- 480V 3-phase distribution required
- 8 power shelves × 33 kW each
- 48V DC busbar for stable delivery
- 300A circuit minimum per rack

Liquid cooling
- Direct-to-chip cold plates per GPU/CPU
- 252 quick-connects across 18 trays
- 30–45°C inlet coolant (warm water)
- 20 L/min flow rate per GPU
- CDU sized 150–200 kW per rack

Physical and structural
- Standard 42U rack footprint
- 1.36 t (~3,000 lb) cabinet weight
- ~440 psf floor loading
- M12 anchoring recommended
- Dual-side service access required

Network fabric
- 800 Gb/s per GPU (ConnectX-8)
- Quantum-X800 InfiniBand or Spectrum-X Ethernet
- 2,000+ cables per multi-rack pod
- PCIe Gen6 host bus
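To put the coolant figures above in facility terms, here is a minimal heat-balance sketch. The rack load comes from the spec list; the temperature-rise values are illustrative design assumptions, and the real numbers come from NVIDIA and the CDU vendor.

```python
# Heat balance for one GB300 NVL72 rack: Q = m_dot * c_p * delta_T.
# Rack load from the spec list above; delta_T values are illustrative assumptions.
RACK_HEAT_KW = 120.0     # ~120 kW continuous rack draw
CP_WATER = 4.186         # kJ/(kg*K), specific heat of water
DENSITY_KG_PER_L = 0.99  # approximate at warm-water loop temperatures

for delta_t in (5.0, 10.0, 15.0):                       # coolant temperature rise, K
    mass_flow = RACK_HEAT_KW / (CP_WATER * delta_t)     # kg/s
    vol_flow_lpm = mass_flow / DENSITY_KG_PER_L * 60.0  # L/min per rack
    print(f"delta_T = {delta_t:4.1f} K -> ~{vol_flow_lpm:5.0f} L/min per rack")
```

However the loop is tuned, rejecting 120 kW into warm water means hundreds of litres per minute through each rack, which is why the CDU, quick-connects, and facility water supply are engineering deliverables rather than rack accessories.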
What You Actually Run on a GB300 NVL72 Inside a Plant
The GB300 NVL72 isn't justified by one workload—it's justified by hosting all of them concurrently from one rack. Triton multiplexes inference across the 72-GPU NVLink domain. Talk to our support team for a workload-fit assessment specific to your plant.
1T+ parameter model fine-tuned on DCS logs, P&IDs, maintenance records, OEM manuals, and incident reports. Engineers query in natural language; the model retrieves the right context, summarizes, and reasons.
Cosmos-class world foundation model renders boiler furnace, turbine hall, or substation in real time. Operators see what the AI sees: physics-stable, photo-realistic, updated at 60 FPS.
DeepSeek-R1-class reasoning agent with test-time scaling. Answers "Why did Unit 3 trip?" by walking through DCS history, alarms, and maintenance logs with chain-of-thought.
Combined vision + LLM model processes thermal cameras, drone inspection footage, and acoustic sensors. Detects flame instability, blade erosion, and steam leaks across the plant.
RL-PPO combustion optimizer, CNN turbine vibration model, and DNN NOx soft sensor running concurrently. Triton serves all three with sub-50 ms inference per call; a minimal client sketch follows these workload profiles.
When the rack is not at peak inference load, it trains. Continuous fine-tuning on fresh plant data keeps models aligned with current fuel mix, equipment state, and operating regime.
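The client sketch promised above: a minimal Triton call against the hypothetical NOx soft sensor. The endpoint, model name, tensor names, and feature count are placeholders chosen for illustration; they show the serving pattern, not a shipped model.

```python
# Minimal Triton HTTP client for a hypothetical "nox_soft_sensor" model.
# Endpoint, model name, tensor names, and feature layout are illustrative placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="triton.plant.local:8000")

# One row of DCS process features (placeholder: 24 float32 tags).
features = np.random.rand(1, 24).astype(np.float32)

inp = httpclient.InferInput("INPUT__0", list(features.shape), "FP32")
inp.set_data_from_numpy(features)
out = httpclient.InferRequestedOutput("OUTPUT__0")

result = client.infer(model_name="nox_soft_sensor", inputs=[inp], outputs=[out])
print("Predicted NOx:", result.as_numpy("OUTPUT__0"))
```

Because Triton multiplexes models across the 72-GPU domain, the combustion optimizer, vibration model, and soft sensor can sit behind the same endpoint; the caller never needs to know which GPUs served the request.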
GB300 NVL72 vs H200 vs B200 vs GB200 NVL72
Choosing between Hopper-era H200, single-node B200, and the rack-scale GB200 / GB300 NVL72 is a million-dollar facility decision. Here's the side-by-side that drives it.
| Spec | H200 (HGX 8-GPU) | B200 (HGX 8-GPU) | GB200 NVL72 | GB300 NVL72 |
|---|---|---|---|---|
| Architecture | Hopper | Blackwell | Blackwell | Blackwell Ultra |
| GPUs / rack | 8 | 8 | 72 | 72 |
| HBM per GPU | 141 GB HBM3e | 192 GB HBM3e | 192 GB HBM3e | 288 GB HBM3e |
| Total HBM | 1.13 TB | 1.5 TB | 13.4 TB | 20 TB |
| FP4 (rack, dense) | n/a (FP8 only) | 72 PFLOPS | 720 PFLOPS | 1,100 PFLOPS |
| NVLink BW | 900 GB/s/GPU | 1.8 TB/s/GPU | 130 TB/s rack | 130 TB/s rack |
| NVLink domain | 8 GPUs | 8 GPUs | 72 GPUs | 72 GPUs |
| Power per rack | ~10 kW | ~14 kW | ~120 kW | ~120 kW |
| Cooling | Air or RDHx | Air or DLC | Direct liquid | Direct liquid |
| Best for | Drop-in upgrade | Enterprise AI nodes | Trillion-param training | Reasoning + 50× factory output |
From Hopper to Blackwell Ultra — The Performance Curve
NVIDIA describes GB300 NVL72 as a 50× increase in AI factory output over Hopper-based platforms: roughly 10× tokens per second per user multiplied by 5× tokens per second per megawatt.
The 16-Week GB300 NVL72 Deployment Inside a Power Plant
A first-time GB300 deployment inside an operating plant is not a server install. It's a coordinated facility, electrical, cooling, network, and software project.
What Plant Engineers Ask Before Deploying GB300
These come up in every GB300 NVL72 scoping call. Reach out to our support team for tailored answers on your specific plant.
Do we need a full GB300 NVL72, or is a smaller node enough?
If your plant LLM is under 70B parameters and you don't need real-time digital twins, an 8-GPU B200 node is plenty. GB300 NVL72 becomes essential when you need a 1T+ parameter model in unified memory or 50× factory throughput for reasoning workloads.
Can the rack go in our existing server room?
Almost certainly not. Standard server rooms are designed for 8–25 kW per rack. GB300 NVL72 requires 480V 3-phase distribution, a dedicated CDU, and ~440 psf floor loading. Plan for a purpose-built AI infrastructure room.
Is liquid cooling mandatory?
Yes. NVIDIA explicitly mandates direct-to-chip liquid cooling for GB300 NVL72. Air cooling tops out at 8–25 kW per rack and 5–15 W/cm² heat flux; the B300 die runs at 500–600 W/cm². There is no air-cooled alternative.
What's the lead time, and how hard is the migration from H200?
Typical lead time is 8–12 weeks for the rack plus 4–6 weeks for facility prep. Migration from H200 is straightforward at the model layer, since the CUDA stack is unchanged, but the facility layer needs a full electrical and cooling redesign.
Why Power Plants Choose iFactory for GB300 NVL72 Deployment
A GB300 NVL72 inside a power plant is not a hyperscaler deployment. It is an OT-grade install with safety, sovereignty, and uptime constraints that hyperscaler integrators don't think about. Book a deployment-readiness review and we'll model the rack inside your station before you sign a PO.
Get a GB300 NVL72 Site Plan for Your Power Plant
Thirty minutes with our deployment engineers. Bring your plant layout, electrical capacity, and cooling water spec. We'll identify whether GB300 NVL72 fits your facility today—or what needs to change before it can—and give you a concrete 16-week deployment path.