Complete NVIDIA GB300 NVL72 Power Plant Deployment Guide

By Will Jackes on May 2, 2026


The NVIDIA GB300 NVL72 isn't just another AI server—it's a 1.36-tonne, liquid-cooled, 120 kW exascale rack that reshapes what's possible inside a power plant. With 72 Blackwell Ultra GPUs, 36 Grace CPUs, 20 TB of HBM3e memory, and 130 TB/s of NVLink bandwidth in a single cabinet, the GB300 NVL72 turns a server room into an AI factory capable of running trillion-parameter plant LLMs, multi-modal foundation models, and real-time digital twins on-prem. This guide breaks down what's actually inside the rack, the deployment topology that fits a generating station, the power and cooling reality you'll need to engineer around, and how it stacks up against H200 and GB200 NVL72.

MAY 13, 2026 11:30 AM EST, ORLANDO

Upcoming iFactory AI Live Webinar:
GB300 NVL72 Inside the Power Plant

Join the iFactory team for a live walk-through of NVIDIA's flagship rack-scale AI system deployed inside operating power plants. We'll cover specs, power topology, cooling loops, workload mapping, and the full migration path from H200 to GB300, drawing on 1,000+ enterprise implementations.

Full GB300 NVL72 spec breakdown
120 kW power & liquid cooling design
Plant LLM & digital twin workload mapping
H200 → B200 → GB300 migration path
What Is GB300 NVL72

One Rack. 72 GPUs. Acting as a Single Massive GPU.

The GB300 NVL72 is a fully liquid-cooled, rack-scale system where 72 Blackwell Ultra GPUs and 36 Grace CPUs are stitched together by fifth-gen NVLink into a single 72-GPU NVLink domain. The whole rack behaves like one accelerator. Book a 30-minute briefing if you want our engineers to model the rack inside your existing plant footprint.
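As a software-level illustration of what "one accelerator" means in practice, here is a minimal sketch of a 72-rank NCCL collective spanning the rack's 18 compute trays. The torchrun launch line, tray hostname, and script name are placeholders, and the stack assumed is a standard PyTorch + NCCL environment rather than anything GB300-specific.

```python
# Minimal sketch: bring all 72 GPUs (18 trays x 4 GPUs) into one NCCL job.
# Launched per tray with torchrun, e.g.:
#   torchrun --nnodes 18 --nproc_per_node 4 --rdzv_backend c10d \
#            --rdzv_endpoint tray01:29500 nvlink_domain_check.py
# Hostname, port, and script name are placeholders for your own fabric.
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")        # NCCL rides the NVLink fabric
    local_rank = int(os.environ["LOCAL_RANK"])     # 0-3 within a compute tray
    torch.cuda.set_device(local_rank)

    # All-reduce crosses tray boundaries over the NVLink Switch trays, so the
    # 72 GPUs behave like one shared accelerator for collectives.
    t = torch.ones(1, device="cuda") * dist.get_rank()
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        # Sum of ranks 0..71 = 2,556 confirms every GPU joined the domain.
        print(f"world_size={dist.get_world_size()}, all_reduce sum={t.item():.0f}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```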

GB300 NVL72 RACK
Rack layout: 18 compute trays (each with 4 Blackwell Ultra GPUs, 2 Grace CPUs, and cold-plate liquid cooling) interleaved with NVLink switch trays, plus 8 power shelves at 33 kW each feeding a 48V DC busbar.

At a glance: 72 Blackwell Ultra GPUs · 36 Grace Arm CPUs · 2,592 Neoverse V2 cores · 18 compute trays · ~120 kW rack power draw · 3,000 lb (1.36 t) cabinet weight
Specs Deep Dive

Compute, Memory, and Interconnect — The Numbers That Matter

Three numbers do most of the work explaining why GB300 NVL72 exists: 1.1 exaFLOPS of dense FP4 compute, 20 TB of HBM3e at 569 TB/s aggregate bandwidth, and 130 TB/s of NVLink fabric inside one cabinet.

COMPUTE · NVFP4
1.1 ExaFLOPS Dense FP4
FP4 (dense): 1,100 PFLOPS (1,400 PFLOPS with sparsity)
FP8 / FP6: 720 PFLOPS
FP16 / BF16: 360 PFLOPS
TF32: 180 PFLOPS
Per-GPU FP4: 15 PFLOPS
MEMORY · HBM3e
20 TB HBM3e + 17 TB LPDDR5X
HBM3e per GPU: 288 GB
Total HBM3e: 20 TB
HBM bandwidth: 8 TB/s per GPU
Aggregate bandwidth: 569 TB/s
HBM stack: 12-high
FABRIC · NVLink 5
130 TB/s Inside the Rack
NVLink per GPU: 1.8 TB/s
Total NVLink bandwidth: 130 TB/s
NVLink domain: 72 GPUs
NIC per GPU: 800 Gb/s ConnectX-8
Network: Quantum-X800 InfiniBand / Spectrum-X Ethernet
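These rack-level aggregates follow directly from the per-GPU figures. The short Python sketch below reproduces them from the numbers quoted in the cards above and shows where the marketing rounding sits.

```python
# Reproduce the rack-level aggregates from the per-GPU figures quoted above.
GPUS = 72

fp4_dense_per_gpu_pflops = 15   # Blackwell Ultra dense FP4 per GPU
hbm_per_gpu_gb = 288            # HBM3e per GPU
hbm_bw_per_gpu_tbs = 8          # HBM3e bandwidth per GPU
nvlink_per_gpu_tbs = 1.8        # NVLink 5 bandwidth per GPU

print(f"FP4 dense:        {GPUS * fp4_dense_per_gpu_pflops / 1000:.2f} EF")  # ~1.08 EF, quoted as 1.1 EF
print(f"Total HBM3e:      {GPUS * hbm_per_gpu_gb / 1000:.1f} TB")            # ~20.7 TB, quoted as 20 TB
print(f"Aggregate HBM BW: {GPUS * hbm_bw_per_gpu_tbs} TB/s")                 # 576 TB/s peak; 569 TB/s quoted
print(f"NVLink fabric:    {GPUS * nvlink_per_gpu_tbs:.0f} TB/s")             # ~130 TB/s
```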
Why It Matters

Four Reasons Power Plants Are Choosing GB300 NVL72

A power plant is not a hyperscaler—but the AI workloads it runs are starting to look like hyperscaler-class problems. Four pressures push thermal, gas, hydro, and nuclear operators toward GB300-class infrastructure on-prem.

Trillion-Parameter Plant LLMs Need 20 TB Memory

A foundation model fine-tuned on plant DCS history, P&IDs, maintenance logs, and OEM manuals easily breaks 1T parameters. The 20 TB unified HBM3e on a single GB300 rack holds these models without sharding across racks—simpler topology, lower latency.
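A rough sizing sketch makes the memory argument concrete. It counts only model weights (KV cache and activations are workload-dependent), and the 20 TB figure comes from the spec section above.

```python
# Weight footprint of a 1T-parameter plant LLM at common serving precisions,
# compared against the 20 TB of HBM3e in one GB300 NVL72 rack. Only weights
# are counted; KV cache and activations depend on context length and batch.
PARAMS = 1.0e12
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}
RACK_HBM_TB = 20.0

for precision, nbytes in BYTES_PER_PARAM.items():
    weights_tb = PARAMS * nbytes / 1e12
    print(f"{precision}: {weights_tb:.1f} TB of weights "
          f"({weights_tb / RACK_HBM_TB:.0%} of rack HBM3e)")
# FP4 weights of a 1T model occupy ~0.5 TB -- under 3% of the rack -- leaving
# the rest of the 20 TB for KV cache, long contexts, and co-hosted models.
```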

Real-Time Digital Twins Need NVFP4 + NVLink

A Cosmos-class world foundation model rendering the boiler furnace and turbine hall in real time produces 4M tokens per 5-second sequence. Blackwell Ultra runs that 30× faster than Hopper—on-prem, behind the plant firewall.

Reasoning Inference Needs Test-Time Scaling

"Why did Unit 3 trip?" is a reasoning query, not a classifier. DeepSeek-R1-class models running test-time scaling demand 10× more inference compute. GB300 delivers exactly that—1.5× FP4 FLOPS and 2× attention vs Blackwell.

Sovereign AI Means Inside the Fence

Nuclear, defense, and regulated thermal sites can't route control-room data through a hyperscaler. The GB300 NVL72 lets you run frontier models inside the plant security perimeter—no operational data leaves the site.

Power Plant Topology

Where the Rack Sits Inside an Operating Power Plant

A GB300 NVL72 inside a power plant is not just "another rack in the IT room." It needs a dedicated AI infrastructure zone with its own electrical, cooling, and network topology. Schedule a topology walkthrough with our deployment engineers and we'll model exactly where the rack lands in your station layout.

ZONE 1 · Plant Floor
DCS Controllers · PLC Cabinets · Plant Sensors (Vibration, Thermo) · Edge Jetson Modules
↓ OPC-UA · TLS · Industrial Ethernet
ZONE 2 · Plant Data Aggregation
PI Historian (AVEVA) · Time-Series Buffer · DMZ Firewall · OT/IT Bridge
↓ Private Link · Audit-Logged
ZONE 3 · GB300 NVL72 AI Infrastructure Room
GB300 NVL72 Rack (~120 kW) · CDU (150–200 kW capacity) · 480V 3-phase Switchgear · Quantum-X800 InfiniBand · Triton Inference Server · Model Registry + MLOps
↓ 800 Gb/s · ConnectX-8
ZONE 4 · Application & Operations
Plant LLM Console · Digital Twin Viewer · Reasoning Agent UI · Control-Room Dashboards
The GB300 rack sits in a dedicated AI infrastructure room—physically separated from the control room but connected through a hardened OT/IT bridge. CDU, switchgear, and rack are co-located to keep coolant runs short and electrical losses low. Floor loading must support ~440 psf at the rack footprint.
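For orientation, here is a minimal sketch of a Zone 4 application querying the plant LLM served by Triton in the Zone 3 AI room. The hostname, model name, and tensor names (text_input / text_output) are placeholders; substitute the names your own Triton model repository exposes.

```python
# Minimal sketch: Zone 4 client querying the plant LLM behind the OT/IT bridge.
# Endpoint, model name, and tensor names are placeholders for your deployment.
import numpy as np
import tritonclient.http as httpclient

TRITON_URL = "triton.ai-room.plant.local:8000"   # hypothetical Zone 3 endpoint

client = httpclient.InferenceServerClient(url=TRITON_URL)

query = np.array(["Why did Unit 3 trip last night?"], dtype=object)
text_in = httpclient.InferInput("text_input", [1], "BYTES")
text_in.set_data_from_numpy(query)

result = client.infer(
    model_name="plant-llm",                      # hypothetical model name
    inputs=[text_in],
    outputs=[httpclient.InferRequestedOutput("text_output")],
)
print(result.as_numpy("text_output")[0].decode("utf-8"))
```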
Power & Cooling

120 kW per Rack — What Your Facility Actually Needs

A GB300 NVL72 cabinet draws roughly 120 kW continuous. That's 5–15× beyond the air-cooling ceiling of a standard server rack and demands real engineering decisions on power delivery and coolant loops.

Power Delivery
  • ~120 kW continuous rack draw
  • 480V 3-phase distribution required
  • 8 power shelves × 33 kW each
  • 48V DC busbar for stable delivery
  • 300A circuit minimum per rack
Liquid Cooling Loop
  • Direct-to-chip cold plates per GPU/CPU
  • 252 quick-connects across 18 trays
  • 30–45°C inlet coolant (warm water)
  • ~170 L/min rack coolant flow at ~10°C rise (see sizing sketch below)
  • CDU sized 150–200 kW per rack
Physical Footprint
  • Standard 42U rack footprint
  • 1.36 t (~3,000 lb) cabinet weight
  • ~440 psf floor loading
  • M12 anchoring recommended
  • Dual-side service access required
Network Fabric
  • 800 Gb/s per GPU (ConnectX-8)
  • Quantum-X800 InfiniBand or
  • Spectrum-X Ethernet supported
  • 2,000+ cables per multi-rack pod
  • PCIe Gen6 host bus
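As a sanity check on the cooling list above, the single-phase heat balance Q = ṁ · cp · ΔT fixes the loop flow a CDU must sustain. The sketch below assumes a water-like coolant and a 10°C loop temperature rise; both are typical assumptions, not GB300 requirements.

```python
# Back-of-envelope coolant sizing for a ~120 kW rack using Q = m_dot * cp * dT.
# Assumes water-like coolant (cp ~ 4.186 kJ/kg.K, ~1 kg per litre) and a 10 C
# loop temperature rise -- both assumptions, not NVIDIA requirements.
RACK_LOAD_KW = 120.0
CP_KJ_PER_KG_K = 4.186
DELTA_T_C = 10.0

m_dot_kg_s = RACK_LOAD_KW / (CP_KJ_PER_KG_K * DELTA_T_C)   # kg/s of coolant
flow_l_min = m_dot_kg_s * 60                                # ~1 L per kg for water

print(f"Required loop flow: {m_dot_kg_s:.2f} kg/s (~{flow_l_min:.0f} L/min per rack)")
# ~2.9 kg/s, roughly 170 L/min per rack at a 10 C rise -- which is why the CDU
# is sized at 150-200 kW with headroom rather than matched 1:1 to rack load.
```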
Workloads

What You Actually Run on a GB300 NVL72 Inside a Plant

The GB300 NVL72 isn't justified by one workload—it's justified by hosting all of them concurrently from one rack. Triton multiplexes inference across the 72-GPU NVLink domain. Talk to our support team for a workload-fit assessment specific to your plant.

01
Plant Foundation LLM

1T+ parameter model fine-tuned on DCS logs, P&IDs, maintenance records, OEM manuals, and incident reports. Engineers query in natural language; the model retrieves the right context, summarizes, and reasons.

FP4 inference · 288 GB/GPU · NVLink shared
02
Real-Time Digital Twin

Cosmos-class world foundation model renders boiler furnace, turbine hall, or substation in real time. Operators see what the AI sees: physics-stable, photo-realistic, and updated at 60 FPS.

Cosmos diffusion · 30× vs Hopper
03
Reasoning Agent for Operations

DeepSeek-R1-class reasoning agent with test-time scaling. Answers "Why did Unit 3 trip?" by walking through DCS history, alarms, and maintenance logs with chain-of-thought.

2× attention · 50× factory output
04
Multi-Modal Plant Vision

Combined vision + LLM model processes thermal cameras, drone inspection footage, and acoustic sensors. Detects flame instability, blade erosion, and steam leaks across the plant.

Multi-modal · <100 ms response
05
Heat Rate Optimizer Cluster

RL-PPO combustion optimizer, CNN turbine vibration model, and DNN NOx soft sensor running concurrently. Triton serves all three with sub-50 ms inference per call (see the latency sketch after this workload list).

Triton multi-model · sub-50 ms
06
Synthetic Data & Training

When the rack is not at peak inference load, it trains. Continuous fine-tuning on fresh plant data keeps models aligned with current fuel mix, equipment state, and operating regime.

Off-peak training · MLOps registry
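To ground the latency claim in workload 05, here is a minimal check against a Triton endpoint hosting the three optimizer models. The endpoint, model names, tensor names, and shapes are placeholders; substitute whatever your model registry actually serves.

```python
# Minimal sketch: time one inference round-trip against each of the three
# optimizer models from workload 05. Endpoint, model names, tensor names, and
# shapes are placeholders -- substitute those in your own Triton repository.
import time
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="triton.ai-room.plant.local:8000")

MODELS = {
    "combustion_ppo":  ("obs",      [1, 64]),    # RL-PPO combustion optimizer
    "turbine_vib_cnn": ("spectrum", [1, 1024]),  # CNN vibration model
    "nox_soft_sensor": ("features", [1, 32]),    # DNN NOx soft sensor
}

for model, (tensor, shape) in MODELS.items():
    inp = httpclient.InferInput(tensor, shape, "FP32")
    inp.set_data_from_numpy(np.random.rand(*shape).astype(np.float32))

    start = time.perf_counter()
    client.infer(model_name=model, inputs=[inp])
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"{model}: {latency_ms:.1f} ms")   # target per the text: sub-50 ms
```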
Comparison

GB300 NVL72 vs H200 vs B200 vs GB200 NVL72

Choosing between Hopper-era H200, single-node B200, and the rack-scale GB200 / GB300 NVL72 is a million-dollar facility decision. Here's the side-by-side that drives it.

Spec · H200 (HGX 8-GPU) · B200 (HGX 8-GPU) · GB200 NVL72 · GB300 NVL72
Architecture · Hopper · Blackwell · Blackwell · Blackwell Ultra
GPUs per rack · 8 · 8 · 72 · 72
HBM per GPU · 141 GB HBM3e · 192 GB HBM3e · 192 GB HBM3e · 288 GB HBM3e
Total HBM · 1.13 TB · 1.5 TB · 13.4 TB · 20 TB
FP4 (rack, dense) · n/a (FP8 only) · 72 PFLOPS · 720 PFLOPS · 1,100 PFLOPS
NVLink BW · 900 GB/s per GPU · 1.8 TB/s per GPU · 130 TB/s per rack · 130 TB/s per rack
NVLink domain · 8 GPUs · 8 GPUs · 72 GPUs · 72 GPUs
Power per rack · ~10 kW · ~14 kW · ~120 kW · ~120 kW
Cooling · Air or RDHx · Air or DLC · Direct liquid · Direct liquid
Best for · Drop-in upgrade · Enterprise AI nodes · Trillion-param training · Reasoning + 50× factory output
Generational Leap

From Hopper to Blackwell Ultra — The Performance Curve

NVIDIA describes GB300 NVL72 as a 50× AI factory output increase over Hopper-based platforms. That's a combination of 10× tokens-per-second per user and 5× tokens-per-second per megawatt.

H100 NVL8: 1× (baseline) · H200 NVL8: ~1.7× · B200 NVL8: ~3.5× · GB200 NVL72: ~33× · GB300 NVL72: 50× AI factory output
Relative AI factory throughput on FP4 inference workloads. Source: NVIDIA Blackwell Ultra technical brief, January 2026.
Deployment Path

The 16-Week GB300 NVL72 Deployment Inside a Power Plant

A first-time GB300 deployment inside an operating plant is not a server install. It's a coordinated facility, electrical, cooling, network, and software project.

WK 1–3

Site survey & load study. Floor loading, electrical capacity, cooling, network paths.
WK 4–6

Electrical & CDU prep. 480V 3-phase, switchgear, CDU placement, manifold runs.
WK 7–9

Rack delivery & integration. 1.36 t cabinet placement, anchoring, coolant fill, leak test.
WK 10–12

Network & software stack. InfiniBand, Triton, Mission Control, model registry online.
WK 13–16

Workload migration + go-live. Plant LLM, twin, and reasoning agent into production.
FAQ

What Plant Engineers Ask Before Deploying GB300

These come up in every GB300 NVL72 scoping call. Reach out to our support team for tailored answers on your specific plant.

Do I really need GB300 NVL72, or is a B200 8-GPU node enough?

If your plant LLM is under 70B parameters and you don't need real-time digital twins, an 8-GPU B200 node is plenty. GB300 NVL72 becomes essential when you need a 1T+ parameter model in unified memory or 50× factory throughput for reasoning workloads.

Can my existing IT room handle 120 kW?

Almost certainly not. Standard server rooms are designed for 8–25 kW per rack. GB300 NVL72 requires 480V 3-phase distribution, a dedicated CDU, and ~440 psf floor loading. Plan for a purpose-built AI infrastructure room.

Is liquid cooling really mandatory?

Yes. NVIDIA explicitly mandates direct-to-chip liquid cooling for GB300 NVL72. Air cooling tops out at 8–25 kW per rack and 5–15 W/cm² heat flux—the B300 die runs at 500–600 W/cm². There is no air-cooled alternative.

What's the lead time and what's the migration path from H200?

Typical lead time is 8–12 weeks for the rack plus 4–6 weeks for facility prep. Migration from H200 is straightforward at the model layer—CUDA stack is unchanged—but the facility layer needs full electrical and cooling redesign.

iFactory Approach

Why Power Plants Choose iFactory for GB300 NVL72 Deployment

A GB300 NVL72 inside a power plant is not a hyperscaler deployment. It is an OT-grade install with safety, sovereignty, and uptime constraints that hyperscaler integrators don't think about. Book a deployment-readiness review and we'll model the rack inside your station before you sign a PO.

Generic Hyperscaler Integrator
✕ Treats the plant like a colo data center
✕ No experience with OT data classification
✕ Multi-vendor handoffs for power, cooling, network
✕ Cloud-default architecture, sovereignty afterthought
✕ No plant LLM, twin, or reasoning agent on day one
✕ Generic SLA, no plant-floor response

iFactory
✓ OT-grade install — DCS, PI, safety envelope respected
✓ 50+ pre-built plant connectors, on-prem first
✓ Single team owns rack, CDU, network, and software
✓ Sovereign by default — zero hyperscaler dependency
✓ Plant LLM, digital twin, reasoning agent shipped in 16 weeks
✓ Plant-floor SLA with on-site response
1,000+ enterprise AI deployments · 50+ plant & OT connectors · 16-week GB300 deployment cycle · 99.5% uptime on deployed AI
Book a Free Deployment-Readiness Review

Get a GB300 NVL72 Site Plan for Your Power Plant

Thirty minutes with our deployment engineers. Bring your plant layout, electrical capacity, and cooling water spec. We'll identify whether GB300 NVL72 fits your facility today—or what needs to change before it can—and give you a concrete 16-week deployment path.

1.1 EF FP4 dense compute · 20 TB total HBM3e · 130 TB/s NVLink fabric · ~120 kW rack power draw
