At 03:47 on a Tuesday morning at a 660 MW supercritical unit, the control room operator's chemistry dashboard flashed a single SPC alert: cation conductivity on the condensate pump discharge had crossed its upper control limit for the eighth consecutive sample. Specific conductivity looked normal. pH looked normal. Most plants would have missed it. The SPC algorithm, however, had been tracking a 0.05 µS/cm upward drift for six hours — the signature of a hairline condenser tube leak bleeding cooling water into the steam cycle. Operations isolated the leaking section within ninety minutes. The alternative — undetected hydrogen damage to waterwall tubes — would have cost $1.2 million in repairs and an 8-day forced outage at well over $1 million per day in lost generation. This is the difference SPC makes on the most expensive failure mode in power generation.
SPC for Boiler Water Chemistry — How Statistical Control Prevents Tube Failures Worth $1M+/Day
Setpoint alarms catch failures hours too late. SPC catches the drift days early. This is the operations-grade reference for applying X-bar, EWMA, CUSUM, and Western Electric rules to pH, cation conductivity, silica, and dissolved oxygen — the four parameters that decide whether your boiler tubes survive the year.
Why Boiler Chemistry Is a Statistical Problem, Not a Setpoint Problem
A boiler tube does not fail because chemistry crossed a setpoint at 03:47 last Tuesday. It fails because chemistry sat in a slightly degraded state for weeks, gradually thinning the magnetite layer, accelerating corrosion under deposits, or carrying silica into the LP turbine. By the time a single sample crosses an alarm threshold, the damage is already done. SPC reframes the question from "is this reading bad right now?" to "is this process still in statistical control?" — and the answer arrives 5–14 days earlier.
Setpoint alarms: reactive, alarm when a limit is breached
- Single sample compared against a fixed limit
- Alarm fires after damage has begun
- Operators trained to chase the most recent reading
- Slow drifts and pattern shifts are invisible
- Root cause investigation happens after the fact
SPC: predictive, alert when the process loses control
- Control limits derived from the process, not from spec
- Drift, shift, and trend patterns trigger early warning
- Detection 5–14 days before any single point breaches limits
- Cpk and Cp track process capability month-over-month
- Western Electric rules give pattern-level diagnosis
Four Damage Modes SPC Prevents — Mechanism, Trigger, and Cost
Each major boiler damage mode has a fingerprint in the chemistry data. SPC catches the fingerprint before the damage matures. These are the four that drive most forced outages on supercritical once-through units and subcritical drum units.
Hydrogen Damage
Waterwall tubes — brittle intergranular failure
Flow-Accelerated Corrosion
Carbon-steel feedwater piping, economiser, LP heater drains
Silica Carryover & Turbine Deposits
LP turbine blades and nozzle ring
Caustic Gouging
High-heat-flux waterwall tubes under deposits
Five Critical Parameters — Control Bands SPC Operates Within
EPRI guidelines define Action Level 1, 2, and 3 thresholds for each parameter — these are the upper bounds. SPC operates inside the normal band, watching for statistical drift well before AL1 is reached. The table below shows typical limits for a high-pressure drum boiler on AVT(O); your unit-specific values may differ by pressure class and treatment program.
| Parameter | Sample point | Normal target | AL1 — investigate | AL2 — schedule action | AL3 — load reduction | What it protects against |
|---|---|---|---|---|---|---|
| Cation conductivity | Condensate pump discharge, economiser inlet, steam | < 0.15 µS/cm | 0.15 – 0.30 | 0.30 – 1.0 | > 1.0 µS/cm | Condenser leaks, anion contamination, corrosion drivers |
| pH (boiler water) | Drum / blowdown sample | 9.2 – 9.6 | 9.0 – 9.2 or 9.6 – 9.8 | 8.5 – 9.0 or 9.8 – 10.2 | < 8.5 or > 10.2 | Hydrogen damage, caustic gouging, magnetite stability |
| Silica (boiler water) | Drum sample | < 200 ppb | 200 – 400 | 400 – 1000 | > 1000 ppb | Turbine silica deposits, efficiency loss |
| Sodium (steam) | Saturated and superheated steam | < 2 ppb | 2 – 5 | 5 – 20 | > 20 ppb | Mechanical carryover, deposit-driven turbine failures |
| Dissolved oxygen (feedwater) | Economiser inlet | < 5 ppb AVT(R); 30–150 ppb OT | 5 – 10 | 10 – 20 | > 20 ppb (AVT) | Pitting, oxide transport, FAC if too low on AVT(R) |
Why the EPRI tiers are not enough on their own
Action Levels are absolute thresholds. By the time a sample reaches AL1, the chemistry has already moved a long way from the historical mean. SPC sits inside the normal band and trips on a 2-sigma drift, a nine-point run on one side of the centerline, or a CUSUM excursion that the AL framework will not see for another shift or two. The two tools are complementary: AL defines "how bad," SPC defines "is it moving."
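The two questions can be put side by side in a few lines of Python. This is a minimal sketch, assuming the cation conductivity tiers from the table above. The function names and the baseline figures (centerline 0.080 µS/cm, σ 0.010 µS/cm) are illustrative assumptions, not values from any plant or library.

```python
def action_level(cation_cond_us_cm: float) -> str:
    """Classify a cation conductivity reading against the table's tiers."""
    if cation_cond_us_cm > 1.0:
        return "AL3"
    if cation_cond_us_cm > 0.30:
        return "AL2"
    if cation_cond_us_cm >= 0.15:
        return "AL1"
    return "normal"


def sigma_distance(reading: float, centerline: float, sigma: float) -> float:
    """Distance of a reading from the centerline, in process standard deviations."""
    return (reading - centerline) / sigma


# Illustrative baseline only (assumed, not plant data).
reading = 0.105
level = action_level(reading)                               # the "how bad" question
z = sigma_distance(reading, centerline=0.080, sigma=0.010)  # the "is it moving" question
print(level, round(z, 2))
```

With these assumed numbers the reading is still "normal" on the Action Level scale while sitting 2.5 standard deviations above its own baseline, which is the gap SPC is meant to cover.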
Four Chart Types — Which Catches What in a Chemistry Stream
No single chart catches every failure mode. Operations teams that get this right run a layered system: an Individual-MR chart on every parameter for baseline, an X-bar R chart on grouped samples for shift-to-shift changes, an EWMA for slow drifts, and a CUSUM for early detection of small mean shifts.
Individual-MR (I-MR)
Baseline chart for every parameter
What it does: Plots each sample individually plus a moving range. Best when samples are taken hourly or once per shift and cannot be naturally grouped into subgroups.
Best for: All routine chemistry parameters — pH, cation conductivity, silica, sodium, DO — on a single-stream basis.
Sensitivity: Detects 2-sigma shifts in roughly 9–11 samples; less sensitive to small drifts but transparent and easy for operators to read.
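As a rough illustration of how I-MR limits come out of the data, the sketch below estimates sigma from the average moving range using the standard constants for a span of two (d2 = 1.128, D4 = 3.267). The function name and the eight baseline readings are made up for the example.

```python
from statistics import mean


def imr_limits(samples):
    """Return individuals-chart and moving-range-chart limits."""
    moving_ranges = [abs(b - a) for a, b in zip(samples, samples[1:])]
    center = mean(samples)
    mr_bar = mean(moving_ranges)
    sigma_hat = mr_bar / 1.128          # d2 for a moving range of span 2
    return {
        "center": center,
        "ucl": center + 3 * sigma_hat,  # same as center + 2.66 * mr_bar
        "lcl": center - 3 * sigma_hat,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,       # D4 for span 2
        "sigma_hat": sigma_hat,
    }


# Hypothetical cation conductivity readings in uS/cm.
baseline = [0.080, 0.084, 0.079, 0.083, 0.081, 0.085, 0.080, 0.082]
limits = imr_limits(baseline)
for name, value in limits.items():
    print(f"{name}: {value:.4f}")
```

A real baseline would use far more than eight points. The short list only keeps the arithmetic easy to follow.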
X-bar R
Shift-level subgroup chart
What it does: Plots subgroup mean and range. Designed for processes sampled in natural groups, such as multiple samples per shift or grab samples from parallel streams.
Best for: Parallel sample streams (e.g., make-up demineraliser trains), or shift-level summaries when multiple grab samples are taken per period.
Sensitivity: Tighter control limits than I-MR; the R chart catches variability changes the X-bar chart will not.
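A subgroup chart can be sketched the same way. The example assumes equal-size subgroups of two to five samples and uses the standard tabulated Shewhart constants A2, D3, and D4. The function name and the pH readings are hypothetical.

```python
from statistics import mean

# Standard Shewhart constants (A2, D3, D4) keyed by subgroup size.
XBAR_R_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def xbar_r_limits(subgroups):
    """Compute X-bar and R chart limits from equal-size subgroups."""
    n = len(subgroups[0])
    if any(len(group) != n for group in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = XBAR_R_CONSTANTS[n]
    grand_mean = mean(mean(group) for group in subgroups)
    r_bar = mean(max(group) - min(group) for group in subgroups)
    return {
        "grand_mean": grand_mean,
        "xbar_ucl": grand_mean + a2 * r_bar,
        "xbar_lcl": grand_mean - a2 * r_bar,
        "r_bar": r_bar,
        "r_ucl": d4 * r_bar,
        "r_lcl": d3 * r_bar,
    }


# Hypothetical boiler-water pH, four grab samples per shift.
shifts = [
    [9.40, 9.42, 9.38, 9.41],
    [9.39, 9.43, 9.40, 9.42],
    [9.41, 9.37, 9.40, 9.42],
]
xr = xbar_r_limits(shifts)
print({name: round(value, 4) for name, value in xr.items()})
```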
EWMA
Slow-drift detector
What it does: An exponentially weighted moving average gives more weight to recent samples while smoothing noise. The single best tool for catching slow trends in chemistry.
Best for: Cation conductivity drift (early condenser leak), pH centerline shift, silica creep with load.
Sensitivity: Detects sustained 0.5-sigma shifts in 8–12 samples — far earlier than any individual-point alarm.
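The EWMA recursion is short enough to show in full. The sketch below uses the textbook form z_t = λ·x_t + (1 − λ)·z_{t−1} with time-varying limits, assuming λ = 0.3 and a 3-sigma width. The function name and the eight readings are hypothetical; the centerline and sigma are borrowed from the case study later in this article.

```python
from math import sqrt


def ewma_chart(samples, center, sigma, lam=0.3, width=3.0):
    """Return (sample_number, ewma, lcl, ucl, out_of_control) for each sample."""
    z = center  # start the EWMA at the baseline centerline
    rows = []
    for t, x in enumerate(samples, start=1):
        z = lam * x + (1 - lam) * z
        # Exact limits widen toward their steady-state value as t grows.
        half = width * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        lcl, ucl = center - half, center + half
        rows.append((t, z, lcl, ucl, z > ucl or z < lcl))
    return rows


# Hypothetical slow upward drift in cation conductivity (uS/cm).
readings = [0.083, 0.080, 0.090, 0.095, 0.098, 0.102, 0.103, 0.108]
rows = ewma_chart(readings, center=0.082, sigma=0.012)
first_signal = next((t for t, _, _, _, out in rows if out), None)
print("first EWMA signal at sample:", first_signal)
```

None of these illustrative readings exceeds the 3-sigma individuals limit of 0.118 µS/cm, yet the smoothed series still leaves its control band. That is the behaviour the chart is chosen for.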
CUSUM
Cumulative-sum, earliest detection
What it does: Tracks the cumulative deviation from target. A single small persistent shift accumulates rapidly, making CUSUM the fastest of the four for detecting tiny but meaningful mean shifts.
Best for: High-impact parameters where every sample interval counts, such as steam sodium and cation conductivity on once-through units.
Sensitivity: Detects 0.5-sigma shifts in 5–8 samples; needs an experienced eye to interpret but unbeatable for early warning.
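A tabular CUSUM makes the accumulation idea concrete. This is a sketch under common textbook defaults (slack k = 0.5σ, decision interval h = 5σ), not a tuned configuration. The function name and readings are hypothetical, with the centerline and sigma taken from the case study later in this article.

```python
def tabular_cusum(samples, target, sigma, k=0.5, h=5.0):
    """Return (upper_sum, lower_sum, signal) for each sample."""
    slack = k * sigma       # deviations smaller than this are ignored
    decision = h * sigma    # alarm threshold on the accumulated sum
    upper = lower = 0.0
    rows = []
    for x in samples:
        upper = max(0.0, upper + (x - target) - slack)
        lower = max(0.0, lower + (target - x) - slack)
        rows.append((upper, lower, upper > decision or lower > decision))
    return rows


# Hypothetical slow upward drift in cation conductivity (uS/cm).
readings = [0.083, 0.080, 0.090, 0.095, 0.098, 0.102, 0.103, 0.108]
cusum_rows = tabular_cusum(readings, target=0.082, sigma=0.012)
cusum_signal = next(
    (i for i, (_, _, out) in enumerate(cusum_rows, start=1) if out), None
)
print("first CUSUM signal at sample:", cusum_signal)
```

Lowering h makes the chart faster and noisier, while raising k makes it ignore smaller shifts. Both should be checked against historical data before anyone acts on the alarms.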
Western Electric Rules — Applied to Chemistry, Not Theory
The Western Electric and Nelson rules are pattern-recognition templates that turn a control chart into a diagnostic instrument. The trick for chemistry is knowing which rule maps to which failure mode — the rules below are the ones operations teams actually act on.
One point beyond 3σ
The classic. A single sample more than three standard deviations from the centerline. On cation conductivity, this is the unmistakable signature of a substantial condenser leak or regen breakthrough. Treat as immediate action.
Nine points on one side of centerline
Sustained mean shift. Nine consecutive samples above (or below) the mean — even if no single point breaches a limit. On boiler-water pH, this often signals slow caustic accumulation or phosphate hideout building up.
Six points trending up or down
Active drift. Six samples in monotonic progression. On silica, this is the load-related solubility excursion catching up with you. On DO, it means the deaerator is slowly losing performance.
Two of three points beyond 2σ
Early warning. Two of three consecutive samples beyond two sigma on the same side. The most useful rule on cation conductivity — catches small leaks that have not yet reached the 3-sigma threshold.
Four of five points beyond 1σ
Process shifted, low amplitude. Common on slow pH walk-downs caused by ammonia injection drift. By itself rarely catastrophic, but a reliable early signal that something has changed.
Fifteen points within 1σ
Under-dispersion warning. Often the signature of a stuck analyser or a sample line that has lost flow. Looks like perfect chemistry on the screen, but is actually instrument failure masquerading as control.
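The six patterns above reduce to simple window checks on standardised readings. The sketch below is one possible encoding, assuming a fixed centerline and sigma from the baseline. The function name and sample values are hypothetical, and the six-point trend is read as six strictly rising or falling points.

```python
def western_electric_hits(samples, center, sigma):
    """Map sample index -> names of the rules firing on the window ending there."""
    z = [(x - center) / sigma for x in samples]
    hits = {}

    def window(i, n):
        # Last n z-scores ending at index i, or None if history is too short.
        return z[i - n + 1:i + 1] if i >= n - 1 else None

    def same_side_count(w, threshold):
        # Larger of: points above +threshold, points below -threshold.
        return max(sum(v > threshold for v in w), sum(v < -threshold for v in w))

    for i in range(len(z)):
        fired = []
        if abs(z[i]) > 3:
            fired.append("one point beyond 3 sigma")
        w = window(i, 9)
        if w and same_side_count(w, 0) == 9:
            fired.append("nine points on one side")
        w = window(i, 6)
        if w:
            steps = [b - a for a, b in zip(w, w[1:])]
            if all(s > 0 for s in steps) or all(s < 0 for s in steps):
                fired.append("six points trending")
        w = window(i, 3)
        if w and same_side_count(w, 2) >= 2:
            fired.append("two of three beyond 2 sigma")
        w = window(i, 5)
        if w and same_side_count(w, 1) >= 4:
            fired.append("four of five beyond 1 sigma")
        w = window(i, 15)
        if w and all(abs(v) < 1 for v in w):
            fired.append("fifteen points within 1 sigma")
        if fired:
            hits[i] = fired
    return hits


# Hypothetical cation conductivity readings (uS/cm): two of three beyond 2 sigma.
hits = western_electric_hits(
    [0.083, 0.079, 0.110, 0.088, 0.109, 0.082], center=0.082, sigma=0.012
)
print(hits)
```

A flat-lined analyser shows up here too: fifteen identical readings at the centerline trip only the under-dispersion check.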
Condenser Tube Leak Detected 11 Days Before Spec Breach — 660 MW Supercritical Unit
A real condenser-leak detection event from a supercritical unit running once-through chemistry. The traditional alarm system would have flagged this on day 14, after hydrogen damage had begun. The SPC algorithm flagged it on day 3.
Baseline established
30 days of in-control cation conductivity data on condensate pump discharge. Centerline 0.082 µS/cm, σ = 0.012 µS/cm, no out-of-control patterns.
Hairline leak begins
Mechanical condenser tube failure — pinhole leak below detection on individual samples. Sample-to-sample variation still inside ±3σ.
EWMA tripped
EWMA chart crosses upper limit. Two of three consecutive samples beyond 2σ (Western Electric Rule 4). Centerline shift of 0.05 µS/cm confirmed.
Diagnostic confirmed
Sodium tracer test on cooling water identifies the affected condenser pass. Anion balance shift confirms chloride ingress profile.
Tube plugged on-load
Affected condenser tube isolated and plugged using on-load plugging equipment. No unit derate required. Chemistry returns to baseline within 36 hours.
Where AL1 would have fired
Under the setpoint-only system, cation conductivity would have crossed AL1 (0.15 µS/cm) only on day 14 — by which point hydrogen damage to waterwall tubes would already be underway.
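The arithmetic behind this timeline can be checked directly from the figures quoted above: centerline 0.082 µS/cm, σ = 0.012, a confirmed shift of 0.05 µS/cm, and AL1 at 0.15 µS/cm. The variable names are illustrative, and the calculation is only a sanity check on those published numbers.

```python
center = 0.082   # baseline centerline, uS/cm
sigma = 0.012    # baseline standard deviation, uS/cm
al1 = 0.15       # Action Level 1 threshold, uS/cm
shift = 0.05     # confirmed centerline shift, uS/cm

warn_2s = center + 2 * sigma           # 2-sigma warning line
ucl_3s = center + 3 * sigma            # 3-sigma upper control limit
al1_in_sigma = (al1 - center) / sigma  # how far AL1 sits above baseline
shifted_mean = center + shift          # process mean after the leak develops

print(f"2-sigma line:  {warn_2s:.3f} uS/cm")
print(f"3-sigma UCL:   {ucl_3s:.3f} uS/cm")
print(f"AL1 distance:  {al1_in_sigma:.2f} sigma above centerline")
print(f"shifted mean:  {shifted_mean:.3f} uS/cm")
```

AL1 sits more than five standard deviations above the baseline, so a mean shift of 0.05 µS/cm puts the process well outside its control limits while still leaving it below the first Action Level.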
Process Capability — Cpk Targets for Each Chemistry Stream
SPC tells you whether the process is in control. Cpk tells you whether the process is good enough. A chemistry stream that is in statistical control but capable of breaching spec under normal variation is still a liability. The table sets practical Cpk targets for operations teams running modern condition-based programs.
| Parameter | Cpk < 1.0 | Cpk 1.0 – 1.33 | Cpk 1.33 – 1.67 | Cpk > 1.67 |
|---|---|---|---|---|
| Cation conductivity | Not capable — frequent excursions, urgent program review | Marginal — investigate chronic sources | Capable — sustain current controls | World-class — typical of well-tuned OT plants |
| pH (boiler water) | Major instability — ammonia dosing or phosphate control problem | Acceptable but variable — review dosing loop tuning | Solid — typical of well-run AVT(O) units | Excellent — process drift well within band |
| Silica | Risk of carryover — review make-up quality and blowdown rate | Borderline — load-correlated excursions possible | Capable — sustainable across load swings | Outstanding — minimal turbine deposit risk |
| Dissolved oxygen | Deaerator performance issue or AVT chemistry problem | Variable but bounded | Capable — well-tuned scavenger or OT regime | Tight — typical of single-source make-up units |
How to read this in the morning chemistry meeting
If a parameter is in statistical control but has Cpk under 1.33, the chemistry program needs a structural change — different dosing setpoint, tighter make-up specification, deaerator overhaul, or different treatment regime. If a parameter is out of statistical control, the SPC charts will tell you whether the cause is a step change (new equipment, leak, regen breakthrough), a drift (instrument, dosing tuning), or a variance change (cycling, sample line issue) — each maps to a different corrective action.
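For reference, Cpk is the distance from the process mean to the nearest specification limit, divided by three standard deviations. The sketch below assumes the spec limits are the edges of the normal band in the parameter table (0.15 µS/cm for cation conductivity, 9.2 to 9.6 for pH). The function names and the pH mean and sigma are illustrative assumptions.

```python
def cpk(mean_value, sigma, lsl=None, usl=None):
    """Process capability index; handles one-sided and two-sided specs."""
    ratios = []
    if usl is not None:
        ratios.append((usl - mean_value) / (3 * sigma))
    if lsl is not None:
        ratios.append((mean_value - lsl) / (3 * sigma))
    if not ratios:
        raise ValueError("at least one spec limit is required")
    return min(ratios)


def capability_band(value):
    """Place a Cpk value in one of the four columns of the table above."""
    if value < 1.0:
        return "Cpk < 1.0"
    if value < 1.33:
        return "Cpk 1.0 - 1.33"
    if value <= 1.67:
        return "Cpk 1.33 - 1.67"
    return "Cpk > 1.67"


# Cation conductivity has only an upper limit; baseline taken from the case study.
cc_cpk = cpk(0.082, 0.012, usl=0.15)
# Boiler-water pH is two-sided; mean and sigma here are hypothetical.
ph_cpk = cpk(9.45, 0.04, lsl=9.2, usl=9.6)

print(round(cc_cpk, 2), capability_band(cc_cpk))
print(round(ph_cpk, 2), capability_band(ph_cpk))
```

Cpk is only meaningful once the stream is in statistical control, so the calculation belongs after the chart review, not before it.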
Six-Phase Implementation Roadmap for Operations Teams
SPC for chemistry is not a software install — it is an operations habit backed by software. The phased rollout below is what works on stations that have moved successfully from setpoint thinking to statistical control.
Sample-System Health Check (Weeks 1–2)
SPC is only as good as the sample. Walk down every continuous analyser, verify sample line integrity, calibrate against grab-sample lab data, and resolve any chronic sample temperature or pressure issues. No analytics until the sample is trustworthy.
Baseline Data Collection (Weeks 3–8)
Collect at least 30 days of in-control operation per parameter. Strip out planned excursions (startup, blowdowns, regen). Compute centerline and σ for each parameter, sample point, and operating condition (full load, low load, startup).
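The stratification step might look like the sketch below. It assumes each historian record has already been tagged with an operating condition and a planned-excursion flag; that record layout, the function name, and the readings are all hypothetical. Sigma is estimated here with the plain sample standard deviation, and a moving-range estimate is an equally common choice.

```python
from collections import defaultdict
from statistics import mean, stdev


def stratified_baseline(records):
    """records: iterable of (operating_condition, value, is_planned_excursion)."""
    groups = defaultdict(list)
    for condition, value, is_planned_excursion in records:
        if not is_planned_excursion:   # strip startup, blowdown, regen periods
            groups[condition].append(value)
    return {
        condition: {"n": len(vals), "center": mean(vals), "sigma": stdev(vals)}
        for condition, vals in groups.items()
        if len(vals) >= 2              # stdev needs at least two points
    }


# Hypothetical cation conductivity records (uS/cm).
records = [
    ("full_load", 0.081, False),
    ("full_load", 0.083, False),
    ("full_load", 0.082, False),
    ("full_load", 0.140, True),    # planned excursion, excluded
    ("low_load", 0.090, False),
    ("low_load", 0.094, False),
    ("low_load", 0.092, False),
]
baselines = stratified_baseline(records)
for condition, stats in baselines.items():
    print(condition, stats)
```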
Chart Selection and Limit Calibration (Weeks 9–10)
Assign I-MR and EWMA to every parameter. Add CUSUM on the high-impact ones (cation conductivity, steam sodium). Tune EWMA lambda (typically 0.2–0.4) and CUSUM h, k parameters against historical excursions to balance speed against false-alarm rate.
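Where historical excursions are scarce, a small Monte Carlo run can give a first feel for the speed versus false-alarm trade-off. The sketch below is an idealised assumption-heavy example: readings are independent standard normal noise, limits are the two-sided steady-state EWMA limits, and the function names, trial count, and seed are arbitrary choices for illustration. Real chemistry data is often autocorrelated, so treat the output as a starting point only.

```python
import random
from math import sqrt


def ewma_run_length(shift_sigma, lam, width, rng, max_n=20000):
    """Samples until an EWMA of N(shift, 1) noise leaves its steady-state limits."""
    limit = width * sqrt(lam / (2 - lam))
    z = 0.0
    for n in range(1, max_n + 1):
        z = lam * rng.gauss(shift_sigma, 1.0) + (1 - lam) * z
        if abs(z) > limit:
            return n
    return max_n


def average_run_length(shift_sigma, lam=0.3, width=3.0, trials=300, seed=1):
    """Mean run length over repeated simulated streams (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(ewma_run_length(shift_sigma, lam, width, rng) for _ in range(trials))
    return total / trials


arl_in_control = average_run_length(0.0)   # samples between false alarms
arl_shifted = average_run_length(1.0)      # samples to detect a 1-sigma shift
print(f"in-control ARL: {arl_in_control:.0f} samples")
print(f"1-sigma shift ARL: {arl_shifted:.1f} samples")
```

Repeating the run across the lambda range mentioned above shows how each setting trades detection delay against alarms per month at a given sampling rate.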
Operator Training and Advisory Mode (Weeks 11–16)
Roll the SPC dashboard out in alert-only mode. Train control room and chemistry staff on chart reading, Western Electric rules, and the failure-mode mapping. Track every alert: was it real, was it actionable, did operators respond.
Full Operational Integration (Weeks 17–24)
Codify chemistry SPC alerts into the standing orders. Critical patterns (Rule 1, EWMA exceedance on cation conductivity) trigger mandatory investigation. Monthly Cpk reports become a standing item in the operations review meeting.
Continuous Improvement Loop (Month 6+)
Re-baseline quarterly. Adjust control limits as the process improves. Use Cpk trending to drive structural improvements — deaerator overhaul, dosing loop tuning, make-up plant upgrades — based on the data, not the gut.
Chemistry SPC — Common Operations Questions
Does SPC replace EPRI Action Levels?
No. SPC and EPRI Action Levels work together. AL defines the absolute regulatory and engineering limits for chemistry. SPC defines the statistical envelope of normal operation. AL is "is this damaging the unit right now?" SPC is "is the process moving in a direction that will damage the unit?" Both feeds run in parallel — the SPC alert almost always fires first.
How much baseline data do I need before I can trust SPC?
A minimum of 30 days of in-control operation per parameter, with at least 100 samples. Stratify by operating condition — full load is statistically different from low load and startup. Most plants collect 60–90 days to build robust control limits and then re-baseline after any major equipment change.
What is a reasonable false-alarm rate for chemistry SPC?
EWMA tuned to lambda 0.3 with a 3-sigma upper control limit typically delivers under one false alarm per parameter per month. CUSUM is more sensitive — expect 1–3 per month if tuned aggressively for early detection on cation conductivity. False alarms are a feature, not a bug — they tell you the analyser is being checked and the process is being watched.
Can SPC work with grab samples only, or do I need continuous analysers?
Both work, with caveats. Continuous analysers (cation conductivity, pH, sodium, silica, DO) give you the sample density needed for EWMA and CUSUM to deliver early detection. Grab samples can still drive Individual-MR charts and rule-based detection on slower parameters like iron and copper, but EWMA and CUSUM are statistically weak below about 6 samples per day.
How does this integrate with our DCS and historian?
The SPC layer reads from your historian (PI, GE Proficy, or similar) and writes alerts back to the DCS or to a separate operator dashboard. No DCS programming changes required for read-only deployment. Closed-loop control changes — for example, dosing pump setpoint trim from an SPC signal — are added later, once trust is established.
Do we need to buy NVIDIA AI servers separately?
No. The fully-loaded AI server is supplied pre-configured and pre-loaded with the chemistry SPC dashboards, integration drivers, and rule engines. Rack it, connect power and Ethernet, and the system goes live. Cabling, network integration, DCS/historian connectivity, operator training, and 24×7 remote monitoring are all part of the package.
What is the typical timeline from contract to first live alert?
Live in 6–12 weeks. Three-phase delivery: weeks 1–4 — sample-system health check, hardware install, historian integration. Weeks 5–8 — baseline data collection and model calibration. Weeks 9–12 — operator training and advisory-mode go-live. Most stations reach full operational integration by month six.
Stop Reacting to Chemistry Excursions — Start Predicting Them
Hardware + software bundle. Pre-configured NVIDIA AI server, racked and ready, pre-loaded with chemistry SPC dashboards. Cabling, network and historian integration, operator training, and 24×7 remote monitoring all included. Live in 6–12 weeks. Trusted by 1000+ industrial clients with 99.9% uptime.







