Downtime Tracking Checklist for Manufacturing Plants

By Rebecca Lawson on May 26, 2026


Downtime tracking is the data discipline that converts unplanned production stops from operational anecdotes into an improvement-driving dataset. The difference between a plant that reduces downtime year-over-year and one that manages the same recurring failures in perpetuity is not the maintenance budget or the equipment age — it is the quality of the downtime data and the rigour of the review process applied to it. This downtime tracking checklist covers every element of a complete downtime management system: event capture with timestamps and reason codes, downtime categorisation and code structure, Pareto analysis, MTBF and MTTR tracking, and the action-ownership process that converts downtime data into reliability improvement.

3,600: monthly searches for downtime tracking checklists
23%: average unplanned downtime as a share of available time in discrete manufacturing
Pareto: the top 20% of downtime causes typically account for 80% of lost production time
Auto-capture: iFactory captures downtime from machine signals, with no manual entry and no gaps in the record



Automated Downtime Tracking

Capture Every Downtime Event Automatically and Link It to OEE

iFactory captures downtime events from machine PLCs or operator input, timestamps every start and stop, applies your reason code structure, and links each event to the OEE Availability calculation automatically — no end-of-shift manual entry, no gaps, no disputes.

Downtime capture checklist: timestamped events with reason codes, linked to OEE Availability
Machine downtime checklist: automatic PLC signal capture eliminates manual entry errors
Downtime analysis checklist: Pareto, MTBF, and MTTR generated automatically per shift
Area 1

Downtime Definition — Agree on What You Are Measuring

Before a single downtime event is captured, the organisation must agree on what counts as downtime. The most common source of downtime data disputes is inconsistent definition: one supervisor counts changeovers as downtime, another records them as planned stops. One line operator logs every two-minute jam, another only logs stops that require maintenance intervention. These definition inconsistencies produce a downtime dataset that cannot be used for cross-line comparison, trending, or improvement targeting.

Defn · 01

Unplanned vs. Planned

Unplanned downtime: any stop that was not scheduled in the production plan for that shift. Planned downtime: scheduled maintenance, planned changeovers, breaks, no-order periods. Both must be tracked but in separate categories.

Defn · 02

Minimum Duration Threshold

Stops below the threshold (typically 2–5 minutes) are either excluded from downtime or tracked as minor stoppages in a separate category. Define the threshold before go-live and apply it consistently to every line.

Defn · 03

Changeover Separate from Breakdown

Changeover is a Six Big Losses Setup/Adjustment loss — it has a different improvement methodology (SMED) from equipment breakdown (predictive and preventive maintenance). Mixing them in the same downtime category obscures both.

Defn · 04

Operator-Caused vs. Equipment-Caused

An operator error that stops a machine is downtime, and so is an equipment failure. Both count against OEE Availability but require different corrective actions. Code them separately from day one.

Defn · 05

Process and Material Stops

Material shortages, quality holds, and process engineering stops are downtime events. They do not stop being downtime because maintenance was not called to respond. They count against Availability and must be coded and tracked like any other downtime cause.

Defn · 06

Document the Agreed Definition

The downtime definition is written into the operating procedure for the line. New operators, new supervisors, and new maintenance technicians are trained on the definition before their first shift. Consistent understanding is the foundation of consistent data.
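The definitions above can be written down as a small classification rule, which makes the agreed definition testable. This is a minimal sketch, not a prescribed implementation: the `Stop` record, the `scheduled` flag, and the 2-minute threshold are assumptions chosen for illustration (the text gives 2 to 5 minutes as the typical range).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed threshold; agree on your own value before go-live.
MINOR_STOP_THRESHOLD = timedelta(minutes=2)


@dataclass
class Stop:
    asset_id: str
    start: datetime
    end: datetime
    scheduled: bool  # True if the stop was in the production plan for the shift


def classify(stop: Stop) -> str:
    """Return 'planned', 'minor_stoppage', or 'unplanned'."""
    if stop.scheduled:
        return "planned"
    if stop.end - stop.start < MINOR_STOP_THRESHOLD:
        return "minor_stoppage"
    return "unplanned"
```

Keeping the rule in one place means every line applies the same threshold, which is what makes cross-line comparison possible.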

Area 2

Downtime Reason Codes — The Quality of Your Analysis Depends on This

Reason codes are the most important design decision in a downtime tracking system. A poorly designed reason code list — too generic, too long, poorly labelled, or missing key failure modes — produces a downtime dataset where 30% of events are coded "Other" and the Pareto is meaningless. A well-designed reason code list captures every actual failure mode in a two-level hierarchy that enables both high-level trend analysis and specific root cause investigation.

Build the Code List from Actual Data, Not a Generic Template

Before creating the reason code list, analyse three to six months of historical downtime records — whatever exists, even if incomplete. Identify the actual failure modes that occur on your equipment. The code list should include specific names your operators recognise, not generic engineering categories they have never heard.

Two-Level Hierarchy Is the Minimum

Level 1: broad category (Electrical, Mechanical, Tooling, Process, Material, Operator, Changeover). Level 2: specific cause within the category (Electrical → Servo drive fault; Mechanical → Bearing failure; Tooling → Insert wear). One-level codes do not support root cause analysis.
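As an illustration, the two-level hierarchy can be held as a simple mapping from category to specific causes. The sketch below uses only the example codes named in this section; the empty lists are placeholders, and the function names are hypothetical.

```python
# Level 1 category -> Level 2 specific causes. Entries come from the examples
# in the text; a real list is built from your own downtime history.
REASON_CODES: dict[str, list[str]] = {
    "Electrical": ["Servo drive fault"],
    "Mechanical": ["Bearing failure"],
    "Tooling": ["Insert wear"],
    "Process": [],
    "Material": [],
    "Operator": [],
    "Changeover": [],
}


def is_valid(category: str, cause: str) -> bool:
    """True only if the cause is registered under that category."""
    return cause in REASON_CODES.get(category, [])


def rollup(events: list[tuple[str, str, float]]) -> dict[str, float]:
    """Sum downtime minutes per Level 1 category for trend analysis."""
    totals: dict[str, float] = {}
    for category, _cause, minutes in events:
        totals[category] = totals.get(category, 0.0) + minutes
    return totals
```

The same event list serves both purposes: `rollup` gives the high-level trend, while the Level 2 cause stays on each event for root cause work.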

Monitor "Other" Usage Weekly

"Other" reason code usage above 5% means either the code list is missing a common failure mode, operators are not using the code list correctly, or both. Every week, review all "Other" entries and add specific codes for any that appear more than twice. Within 90 days, "Other" should represent less than 2% of events.

Make Code Selection Fast

If selecting a reason code takes more than 15 seconds, operators will use "Other" or skip the field entirely. Mobile-optimised code selection with search-as-you-type or a short list of most-used codes displayed first is critical for data quality at the operator level.
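The weekly "Other" review described in this area comes down to two small calculations: the share of events coded "Other", and the free-text notes that recur often enough to deserve their own code. The following is a sketch under assumptions: the function names are hypothetical, and notes are matched by simple lower-casing, which a real system would likely refine.

```python
from collections import Counter


def other_share(codes: list[str]) -> float:
    """Fraction of events coded 'Other' (0.0 for an empty list)."""
    if not codes:
        return 0.0
    return sum(1 for c in codes if c == "Other") / len(codes)


def other_notes_to_promote(notes: list[str], min_count: int = 3) -> list[str]:
    """Notes on 'Other' events that appear more than twice (min_count = 3)."""
    counts = Counter(n.strip().lower() for n in notes)
    return sorted(n for n, k in counts.items() if k >= min_count)
```

A share above 0.05 would trigger the code list review; anything returned by `other_notes_to_promote` is a candidate for a new specific code.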

Area 3

Downtime Analysis — Pareto to Improvement Action

A downtime tracking system that captures data without generating weekly analysis and action is a record-keeping system, not a reliability improvement programme. The Pareto principle holds up well in manufacturing downtime: typically around 20% of failure modes account for around 80% of lost production time. Finding and eliminating the top one or two downtime causes, with root cause analysis and permanent corrective action, produces more reliability improvement than addressing 20 minor causes simultaneously.
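A Pareto of downtime is a sort plus a running total. The sketch below assumes each event is a `(reason_code, minutes)` pair; the function name and the sample figures in the usage note are illustrative only.

```python
from collections import defaultdict


def pareto(events: list[tuple[str, float]]) -> list[tuple[str, float, float]]:
    """Return (reason_code, minutes, cumulative_share), largest cause first."""
    totals: dict[str, float] = defaultdict(float)
    for code, minutes in events:
        totals[code] += minutes
    grand_total = sum(totals.values())
    rows = []
    running = 0.0
    for code, minutes in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        running += minutes
        share = running / grand_total if grand_total else 0.0
        rows.append((code, minutes, share))
    return rows
```

For example, with 120 minutes of bearing failures, 40 of insert wear, and 30 of servo drive faults, the first row is the bearing failure at about 63% of lost time, which is the cause to action first.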

01
Daily Downtime Pareto

Generate a daily Pareto of downtime by reason code per line. Post it visibly in the production area or on the shift dashboard. The single top downtime cause for the day should be identifiable by any operator or supervisor at a glance.

02
MTBF Trending per Equipment Class

Mean Time Between Failures per equipment type reveals whether reliability is improving or degrading. A declining MTBF trend on a specific machine class is an early warning signal for predictive maintenance intervention before the failure frequency escalates.

03
MTTR Trending per Code Category

Mean Time To Repair reveals whether maintenance response efficiency is improving. High MTTR on a specific failure type usually points to one of four causes: the right spare parts are not on hand, a technician skill gap, a slow diagnostic process, or an undocumented repair procedure.

04
Repeat Failure Escalation

Any downtime cause that appears three or more times in 30 days on the same asset triggers a formal root cause analysis — not another reactive repair. Repeat failures are the most visible evidence of a systemic issue that reactive maintenance will never solve.

05
Action Ownership and Closure

Every improvement action from the downtime Pareto review has a named owner, a specific action description, and a completion date. Actions are reviewed at the weekly production meeting, and open actions older than the defined resolution window are escalated.
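The repeat failure rule above (three or more occurrences of the same cause on the same asset within 30 days) can be checked mechanically. This is a sketch assuming events arrive as `(asset_id, reason_code, start_time)` tuples; the function name and defaults are hypothetical.

```python
from datetime import datetime, timedelta


def repeat_failures(
    events: list[tuple[str, str, datetime]],
    window_days: int = 30,
    min_count: int = 3,
) -> set[tuple[str, str]]:
    """Return (asset_id, reason_code) pairs with min_count or more events
    whose start times fall within any span of window_days."""
    by_key: dict[tuple[str, str], list[datetime]] = {}
    for asset_id, code, start in events:
        by_key.setdefault((asset_id, code), []).append(start)
    window = timedelta(days=window_days)
    flagged = set()
    for key, starts in by_key.items():
        starts.sort()
        # Compare each event with the one min_count - 1 positions later.
        for i in range(len(starts) - min_count + 1):
            if starts[i + min_count - 1] - starts[i] <= window:
                flagged.add(key)
                break
    return flagged
```

Each flagged pair would open a formal root cause analysis with a named owner rather than another reactive work order.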




Downtime Analytics Platform

Automate Downtime Capture, Pareto, and OEE Link in iFactory

iFactory captures downtime from machine signals or operator input, applies your reason code structure, calculates MTBF and MTTR per equipment class, generates daily and weekly Pareto automatically, and links every downtime event to the OEE Availability calculation.

Downtime monitoring checklist: timestamped events, reason codes, and OEE link automatic
Downtime reporting checklist: daily Pareto, MTBF, MTTR, and trend dashboard per shift
Downtime data checklist: data quality audit and "Other" code usage tracking built in
Checklist

Downtime Tracking Checklist — 30 Items

Use this checklist when implementing or auditing a manufacturing downtime tracking programme. Items cover system setup, event capture, reason code structure, analysis cadence, action tracking, and data quality.

Setup: Downtime Tracking Infrastructure (5 items)
# | Checklist Item | Type | Priority
1 | Downtime definition agreed: unplanned stop only, or includes planned maintenance and changeovers | Pass/Fail | High
2 | Minimum downtime threshold defined; stops below threshold (e.g. 2 min) excluded or tracked separately | Pass/Fail | High
3 | Capture method selected: automatic machine signal, operator tablet entry, or supervisor log | Pass/Fail | High
4 | Every machine/line in scope has a unique asset ID in the downtime tracking system | Pass/Fail | High
5 | Downtime data linked to OEE Availability calculation; same data source, not separate entry | Pass/Fail | High
Capture: Downtime Event Capture (6 items)
# | Checklist Item | Type | Priority
6 | Start timestamp recorded at moment of stop, not estimated later | Pass/Fail | High
7 | End timestamp recorded at moment of restart, not estimated at shift end | Pass/Fail | High
8 | Operator or maintenance technician ID recorded with every downtime event | Pass/Fail | High
9 | Reason code selected from structured list; "Other" usage tracked and kept below 5% | Pass/Fail | High
10 | Equipment sub-system identified: electrical, mechanical, tooling, material, operator, process | Selection | High
11 | Notes field used for first-occurrence descriptions, not relied on for repeat events | Pass/Fail | Med
Codes: Reason Code Structure (5 items)
# | Checklist Item | Type | Priority
12 | Reason code list built from actual historical downtime, not a generic template | Pass/Fail | High
13 | Reason codes organised in 2-level hierarchy: category (equipment failure) → specific cause (spindle motor fault) | Pass/Fail | High
14 | Maintenance, process, material, and operator-caused stops all in separate code families | Pass/Fail | High
15 | Changeover and setup coded separately from unplanned downtime | Pass/Fail | High
16 | Reason code list reviewed and updated quarterly; new failure modes added promptly | Pass/Fail | Med
Analysis: Downtime Analysis & Pareto (5 items)
# | Checklist Item | Type | Priority
17 | Daily downtime Pareto generated per line; top 3 causes visible at shift review | Pass/Fail | High
18 | Weekly downtime Pareto reviewed in production meeting; top cause actioned | Pass/Fail | High
19 | MTBF (Mean Time Between Failures) tracked per equipment class | Pass/Fail | High
20 | MTTR (Mean Time To Repair) tracked per downtime category | Pass/Fail | High
21 | Repeat downtime events (same asset, same reason code, 3+ times in 30 days) trigger formal RCA | Pass/Fail | High
Action: Downtime Action Tracking (4 items)
# | Checklist Item | Type | Priority
22 | Every downtime event above threshold has an assigned maintenance response, open or closed | Pass/Fail | High
23 | Chronic downtime causes (top 3 by frequency) have active improvement projects | Pass/Fail | High
24 | Downtime reduction targets set per line, not only global OEE targets | Pass/Fail | Med
25 | Improvement actions linked to specific downtime reason codes, not a general "improve reliability" | Pass/Fail | High
Quality: Downtime Data Quality (5 items)
# | Checklist Item | Type | Priority
26 | Monthly data quality audit: manual entries spot-checked against machine signals | Pass/Fail | Med
27 | "Other" reason code usage below 5%; any higher triggers code list review | Pass/Fail | High
28 | No unresolved gaps in downtime timeline; every shift has complete start/end accounting | Pass/Fail | High
29 | Downtime data accessible to both maintenance and production teams, not siloed | Pass/Fail | High
30 | Downtime trend visible 13 weeks rolling; seasonal and campaign effects identifiable | Pass/Fail | Med
Types: Pass/Fail, Numeric, Text, Selection. Priority: High, Med.
FAQ

Frequently Asked Questions

What is downtime tracking in manufacturing?

Downtime tracking is the systematic recording of every production stop — its start time, end time, duration, equipment, and cause — to produce a dataset that enables reliability analysis, OEE calculation, and maintenance prioritisation. Effective downtime tracking requires agreed definitions, structured reason codes, timestamped data capture, and a regular review process that converts downtime data into improvement actions. Without downtime tracking, maintenance operates reactively and OEE Availability cannot be calculated accurately.

What is the difference between planned and unplanned downtime?

Planned downtime covers stops that are known in advance and built into the production schedule: planned maintenance windows, scheduled changeovers, breaks, and no-order periods. Planned downtime is excluded from planned production time, which is the denominator of OEE Availability. Changeovers are the usual exception: many OEE conventions count them against Availability as a Setup/Adjustment loss even when they are scheduled, so record which convention your plant uses in the downtime definition. Unplanned downtime is any stop that was not scheduled: equipment breakdown, material shortage, quality hold, tooling failure, or operator error. Unplanned downtime is the Availability loss in OEE and is the target for downtime reduction programmes.
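A short worked calculation shows how the planned and unplanned split feeds OEE Availability. The sketch assumes the common form Availability = run time / planned production time. The function name and the shift figures are illustrative, and whether changeover minutes are passed in as planned or unplanned is a convention left to the caller.

```python
def availability(
    shift_minutes: float,
    planned_stop_minutes: float,
    unplanned_stop_minutes: float,
) -> float:
    """OEE Availability = run time / planned production time."""
    planned_production = shift_minutes - planned_stop_minutes
    if planned_production <= 0:
        raise ValueError("no planned production time in this period")
    run_time = planned_production - unplanned_stop_minutes
    return run_time / planned_production


# Illustrative numbers: 480-minute shift, 60 minutes of planned stops,
# 42 minutes of unplanned stops -> (420 - 42) / 420.
print(round(availability(480, 60, 42), 3))  # 0.9
```

Moving the same 42 minutes from the unplanned to the planned bucket would report 100% Availability, which is why the definition has to be agreed before the data is trusted.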

What are downtime reason codes and why do they matter?

Downtime reason codes are the structured classification system that turns a downtime duration into actionable diagnostic data. Without reason codes, you know how much time was lost but not why. With a well-designed two-level reason code hierarchy, you can Pareto downtime by cause, calculate MTBF per failure mode, identify repeat failures that require root cause analysis, and separate maintenance-driven losses from process and material-driven losses. The quality of the reason code list determines the quality of every downtime analysis your organisation will ever produce. Book a Demo to see iFactory's reason code configuration.

What is MTBF and MTTR in downtime tracking?

MTBF (Mean Time Between Failures) is the average time between unplanned stop events for a specific piece of equipment or equipment class. A declining MTBF trend means failures are becoming more frequent — a signal for predictive or preventive maintenance intervention. MTTR (Mean Time To Repair) is the average time from when a failure occurs to when production restarts. High MTTR indicates maintenance response or repair effectiveness issues — wrong spares, skill gaps, or undocumented repair procedures. Both metrics are calculated automatically by iFactory from timestamped downtime event data.
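Both metrics reduce to simple ratios once events are timestamped. The sketch below uses a common convention (MTBF as operating time divided by number of failures, MTTR as total repair time divided by number of failures); it is not a description of how iFactory computes them, and the function name and inputs are assumptions.

```python
def mtbf_mttr(
    planned_production_hours: float,
    repair_hours: list[float],
) -> tuple[float, float]:
    """Return (MTBF, MTTR) in hours for one asset or equipment class.

    MTBF = operating time / number of failures
    MTTR = total repair time / number of failures
    """
    failures = len(repair_hours)
    if failures == 0:
        raise ValueError("no failures in the period; MTBF and MTTR are undefined")
    total_repair = sum(repair_hours)
    operating = planned_production_hours - total_repair
    return operating / failures, total_repair / failures
```

For instance, three failures totalling 4 hours of repair in a 160-hour period give an MTBF of 52 hours and an MTTR of about 1.33 hours; tracking both per period shows whether failures are becoming more frequent, slower to fix, or both.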

How does iFactory automate downtime tracking?

iFactory connects to machine PLCs, sensors, or operator input devices to capture downtime events automatically at the moment they occur — no end-of-shift manual entry. Each event is timestamped, assigned to the correct asset, and presented to the operator for reason code selection on a mobile device. The system calculates MTBF and MTTR per equipment class, generates daily and weekly Pareto automatically, and links every downtime event to the OEE Availability calculation. Book a Demo to see the downtime module.




Start Tracking Downtime Correctly

Replace Manual Downtime Logs with Automated iFactory Downtime Tracking

iFactory captures every downtime event with timestamp and reason code, links it to OEE, calculates MTBF and MTTR, and generates the daily and weekly Pareto your maintenance and production teams need to eliminate chronic losses.

Machine downtime checklist: automatic capture from PLC signals, no manual entry
Downtime categorisation: structured 2-level reason code hierarchy configured for your equipment
Downtime analysis checklist: daily Pareto, MTBF, MTTR, and trend dashboards — automatic
