Data Collection Checklist for Manufacturing Analytics

By Danielle Montgomery on June 5, 2026

Data collection is the foundation of manufacturing analytics, yet most plants collect less than 40% of the data they could be using. PLCs stream cycle times but not energy consumption. MES tracks work orders but not setup times. Operators enter scrap counts but not root causes. This data collection checklist covers every manufacturing data source — PLCs, IoT sensors, MES, ERP, vision systems, and manual operator inputs — and what each must provide for a complete analytics pipeline. Based on iFactory's integration patterns deployed across 1,000+ plants, the 30 items below ensure your data collection architecture captures, validates, and delivers every signal required for production analytics.

Why Structured Data Collection Matters for Analytics

Manufacturing plants that systematically map and validate all data sources before building analytics achieve 3.2x higher dashboard adoption and reduce data reconciliation effort by 70%. The five data source tiers below — from plant-floor sensors to enterprise business systems — must all feed into a unified analytics layer for a complete operational picture.

30 Data Source Checks: every checklist item required for a complete manufacturing data collection architecture across all source tiers.
6 Source Tiers: PLCs, IoT sensors, MES, ERP, vision systems, and manual entry, each with distinct collection requirements.
7-Day Integration: average timeline from data source audit to unified data stream visible in iFactory's analytics dashboard.
200+ Pre-Built Connectors: iFactory's connector library covers PLC protocols, IoT platforms, MES APIs, ERP modules, and operator entry apps.

Manufacturing Data Source Topology — From Sensor to Dashboard

Data flows through five distinct tiers from the plant floor to business systems. Each tier has unique collection methods, latency requirements, and data formats. A complete analytics deployment connects all five tiers into a unified data pipeline.

L0 Sensors & Actuators
  • Temperature, pressure, vibration, flow sensors
  • Analog and digital I/O signals from field devices
  • Sampling rates: 1 Hz to 10 kHz
  • Protocols: 4-20 mA, 0-10 V, IO-Link, HART
L1 PLCs & Controllers
  • Cycle times, fault codes, production counts
  • Programmable logic controllers and PACs
  • Tag counts: 500–50,000 per controller
  • Protocols: OPC-UA, Modbus TCP, Profinet, EtherNet/IP
L2 SCADA & Edge
  • Supervisory control, historian, edge processing
  • Data aggregation and buffering at line level
  • Typical latency: sub-second to 5 seconds
  • Storage: 1–10 TB per line per year
L3 MES & CMMS
  • Work orders, specifications, quality results
  • Maintenance schedules, spare parts, asset hierarchy
  • Data granularity: per batch, per shift, per work order
  • Integration: REST APIs, database views, flat-file exports
L4 ERP & Business
  • Orders, inventory, costing, shipping schedules
  • ERP, CRM, supply chain, financial systems
  • Update frequency: daily, weekly, real-time API
  • Integration: REST/SOAP APIs, EDI, OData, JDBC

Manufacturing Data Quality Dimensions — What Good Data Looks Like

Data quality determines analytics accuracy. These five dimensions define whether your collected data is usable for production analytics, predictive models, and operational dashboards. Each dimension has a minimum threshold that must be met before data enters the analytics pipeline.


≥95% Completeness

Percentage of expected data points actually captured. Missing timestamps, null values, and gaps of more than 5 minutes reduce completeness below threshold and trigger data quality alerts.
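
As a rough illustration of how this check might be scripted, the sketch below counts the expected sampling slots that actually contain a reading and lists any gap longer than 5 minutes. The function name, the one-minute sampling interval, and the sample data are assumptions made for the example, not iFactory's implementation.

```python
from datetime import datetime, timedelta

def completeness_report(timestamps, start, end, interval, max_gap=timedelta(minutes=5)):
    """Return (completeness_pct, gaps) for one source over the window [start, end)."""
    expected = int((end - start) / interval)
    in_window = sorted(t for t in timestamps if start <= t < end)
    # Count each expected slot at most once so duplicates do not inflate the score.
    slots = {int((t - start) / interval) for t in in_window}
    pct = 100.0 * len(slots) / expected if expected else 0.0
    # Look for long gaps between consecutive readings, including the window edges.
    points = [start] + in_window + [end]
    gaps = [(a, b) for a, b in zip(points, points[1:]) if b - a > max_gap]
    return pct, gaps

start = datetime(2026, 6, 5, 8, 0)
end = start + timedelta(hours=1)
# Hypothetical source: one reading per minute, with minutes 20 to 29 missing.
readings = [start + timedelta(minutes=m) for m in range(60) if not 20 <= m < 30]
pct, gaps = completeness_report(readings, start, end, timedelta(minutes=1))
print(f"{pct:.1f}% complete, {len(gaps)} gap(s) over 5 minutes")
# prints: 83.3% complete, 1 gap(s) over 5 minutes
```

In this example the source falls below the 95% threshold and the single outage is reported with its start and end time, which is the information an alert would need.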


≥98% Accuracy

Percentage of values within expected range and calibrated tolerance. Out-of-range spikes, stuck-at-value readings, and uncalibrated sensor drift all degrade accuracy below acceptable levels.
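
One way to approximate this score is sketched below: a reading counts as bad if it is out of range or belongs to a run of identical values long enough to suggest a frozen sensor. The range limits, the run length of 5, and the sample trace are illustrative assumptions. Slow calibration drift is not detected by this check and would need a reference measurement.

```python
def accuracy_pct(values, low, high, stuck_run=5):
    """Share of readings that are in range and not part of a stuck-at-value run."""
    bad = [not (low <= v <= high) for v in values]
    run_start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the list or when the value changes.
        if i == len(values) or values[i] != values[run_start]:
            if i - run_start >= stuck_run:
                for j in range(run_start, i):
                    bad[j] = True
            run_start = i
    return 100.0 * bad.count(False) / len(values) if values else 0.0

# Hypothetical temperature trace in degrees C: one spike, then a sensor frozen at 71.2.
trace = [70.1, 70.4, 250.0, 70.3, 71.2, 71.2, 71.2, 71.2, 71.2, 70.8]
print(accuracy_pct(trace, low=0.0, high=150.0))  # prints 40.0
```

The whole frozen run is marked bad here, including its first reading. A stricter or looser rule is a design choice that depends on how often the signal legitimately holds steady.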


<5 s Timeliness

Maximum acceptable latency from measurement generation to data availability in the analytics layer. Delays beyond 5 seconds render the data unusable for real-time dashboards and alarm routing.


No Drift Consistency

Data format, units, and naming conventions must be identical across all sources. Mixed units (psi vs bar, Celsius vs Fahrenheit) and inconsistent tag naming cause dashboard errors and analytics failures.
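
A minimal sketch of unit normalisation at ingestion, assuming bar and degrees Celsius as the plant-wide standard units. The unit keys and the choice of standard units are assumptions for illustration. The conversion factors themselves are the standard ones.

```python
# Map each incoming unit to (standard unit, conversion function).
TO_STANDARD = {
    "psi": ("bar", lambda v: v * 0.0689476),
    "bar": ("bar", lambda v: v),
    "degF": ("degC", lambda v: (v - 32.0) * 5.0 / 9.0),
    "degC": ("degC", lambda v: v),
}

def normalise(value, unit):
    """Convert a reading to the standard unit, rejecting units that are not mapped."""
    if unit not in TO_STANDARD:
        raise ValueError(f"unknown unit: {unit!r}")
    std_unit, convert = TO_STANDARD[unit]
    return round(convert(value), 3), std_unit

print(normalise(100.0, "psi"))   # prints (6.895, 'bar')
print(normalise(212.0, "degF"))  # prints (100.0, 'degC')
```

Rejecting unmapped units, rather than passing them through, keeps a mislabelled source from silently mixing scales on a dashboard.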


<1% Duplicates (Uniqueness)

Duplicate data points should not exceed 1% of total records. Redundant sensors reporting the same signal, overlapping historian scopes, and double-counted production events distort KPIs and model training.
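
The duplicate rate could be measured along these lines, assuming a duplicate means the same tag reported twice with the same timestamp. That definition and the tag name are assumptions for the example. Redundant sensors that publish the same signal under different tags would need a separate mapping step.

```python
def duplicate_pct(records):
    """records: iterable of (source_tag, timestamp, value) tuples."""
    records = list(records)
    seen = set()
    duplicates = 0
    for tag, ts, _value in records:
        key = (tag, ts)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return 100.0 * duplicates / len(records) if records else 0.0

# Hypothetical feed: 197 unique readings plus 3 replayed ones.
unique = [("Hall1_Line2_Press01_Count", t, t) for t in range(197)]
feed = unique + unique[:3]
print(duplicate_pct(feed))  # prints 1.5
```

At 1.5% this feed would exceed the 1% threshold.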

Manufacturing Data Collection Checklist — 30 Items

Each checklist item includes the specific action required, type, priority, and status toggles. The type indicates whether the item is a pass/fail check, a structured selection, or a numeric configuration. Priority marks implementation order. Use the Photo, Required, and Critical toggles to track completion.

PLC & Controller Data Collection (5 items)

# | Checklist Item | Type | Priority
1 | PLC tag inventory completed: all controllers inventoried with tag counts, data types (bool, int, real, string), and update rates documented | Pass/Fail | High
2 | OPC-UA server configured on each controller or gateway: security mode set to SignAndEncrypt, endpoint URL documented | Pass/Fail | High
3 | Tag naming convention standardised across all PLCs: no vendor-specific or engineer-specific naming; standard format: Area_Line_Asset_Tag | Pass/Fail | High
4 | Critical tags identified (production count, cycle time, fault codes, downtime status) with minimum 100 ms read rate for real-time analytics | Selection | High
5 | Redundant controllers logged: paired controllers or hot-standby PLCs identified and tagged to prevent double-counting production events | Pass/Fail | Med
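
Item 3 lends itself to an automated check. The sketch below flags tags that do not follow the Area_Line_Asset_Tag format, under the assumption that each of the four segments is alphanumeric with no underscores inside it. The example tag names are invented.

```python
import re

# Four alphanumeric segments joined by underscores: Area_Line_Asset_Tag.
TAG_PATTERN = re.compile(r"[A-Za-z0-9]+(?:_[A-Za-z0-9]+){3}")

def nonconforming_tags(tags):
    """Return the tags that do not match the naming convention."""
    return [t for t in tags if not TAG_PATTERN.fullmatch(t)]

tags = [
    "PackHall_Line2_Filler01_CycleTime",  # conforms
    "Line2_Filler01_CycleTime",           # area segment missing
    "N7:0",                               # raw vendor address
    "johns_test_tag",                     # engineer-specific name
]
print(nonconforming_tags(tags))
# prints: ['Line2_Filler01_CycleTime', 'N7:0', 'johns_test_tag']
```

A pattern check only confirms the shape of the name. Confirming that the area, line, and asset segments exist in the equipment register would be a further step.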
IoT Sensor & Edge Data Collection (5 items)

# | Checklist Item | Type | Priority
6 | Sensor type, measurement range, and accuracy class documented for every IoT sensor on the plant network | Pass/Fail | High
7 | Edge processing device configured to buffer sensor data during network outages: minimum 72 hours of local storage at full sampling rate | Pass/Fail | High
8 | Sampling rate configured per sensor type: vibration ≥10 kHz, temperature ≥1 Hz, pressure ≥10 Hz, current ≥1 kHz | Numeric | High
9 | Wireless sensor network gateway coverage validated: signal strength ≥−70 dBm at all sensor locations with redundancy for critical paths | Numeric | Med
10 | Edge-to-cloud data compression and filtering rules defined: raw data stored at edge, summarised metrics (min, max, avg, stddev) sent to cloud per configurable window | Pass/Fail | Med
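
The summarisation rule in item 10 can be sketched as follows: raw samples are grouped into fixed-size windows and each window is reduced to four values. The window size and the synthetic signal are assumptions for illustration. A real edge device would also attach the window timestamp and sensor ID.

```python
import statistics

def summarise(samples, window):
    """Collapse raw samples into one min/max/avg/stddev record per window."""
    out = []
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        out.append({
            "min": min(chunk),
            "max": max(chunk),
            "avg": statistics.fmean(chunk),
            "stddev": statistics.pstdev(chunk),  # population standard deviation
        })
    return out

# Three seconds of a hypothetical 1 kHz signal, summarised in 1-second windows.
samples = [i % 10 for i in range(3000)]
summary = summarise(samples, window=1000)
reduction = 1 - (len(summary) * 4) / len(samples)
print(len(summary), f"{reduction:.1%}")  # prints: 3 99.6%
```

With these assumed settings, 3,000 raw values become 12 summary values, which is where a reduction of roughly 99% comes from.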
MES & Production Data Collection (5 items)

# | Checklist Item | Type | Priority
11 | MES work order fields mapped to analytics schema: order ID, product code, planned quantity, start time, end time, actual quantity, scrap quantity | Pass/Fail | High
12 | MES API endpoint or database view established: read-only connection with refresh interval ≤30 seconds for near-real-time production tracking | Pass/Fail | High
13 | Quality inspection results linked to work order and production timestamp: defect counts, measurement values, and pass/fail status per batch | Pass/Fail | High
14 | Downtime events synchronised between PLC fault codes and MES downtime categories: each PLC fault code mapped to exactly one MES downtime reason | Pass/Fail | High
15 | CMMS (maintenance) data integration configured: work orders triggered by production events, asset health scores connected to maintenance history | Pass/Fail | Med
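
The "exactly one" rule in item 14 is easy to verify mechanically. This sketch assumes the mapping is available as a list of (fault code, MES reason) pairs. The codes and reasons shown are invented.

```python
def check_fault_mapping(plc_fault_codes, mapping_rows):
    """Return (unmapped, ambiguous) fault codes for a PLC-to-MES mapping table."""
    reasons = {}
    for code, reason in mapping_rows:
        reasons.setdefault(code, set()).add(reason)
    unmapped = sorted(c for c in plc_fault_codes if c not in reasons)
    ambiguous = sorted(c for c, r in reasons.items() if len(r) > 1)
    return unmapped, ambiguous

codes = ["F001", "F002", "F003"]
rows = [
    ("F001", "Jam"),
    ("F002", "Material shortage"),
    ("F002", "Changeover"),  # second reason for the same code
]
print(check_fault_mapping(codes, rows))  # prints: (['F003'], ['F002'])
```

The item passes only when both lists come back empty.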
ERP & Business Data Collection (5 items)

# | Checklist Item | Type | Priority
16 | ERP production order feed configured: planned orders, released orders, in-production orders, and completed orders with timestamps for each status transition | Pass/Fail | High
17 | Material master and BOM data synchronised: each product in analytics must link to its bill of materials for cost-per-unit and material usage calculations | Pass/Fail | High
18 | Inventory transactions exported: material consumption, finished goods receipt, scrap disposal with timestamps accurate to the minute for OEE and yield calculations | Pass/Fail | High
19 | Customer demand and shipping schedule accessible: forecast horizon, confirmed orders, and shipment dates for build-to-order and make-to-stock planning visibility | Pass/Fail | Med
20 | ERP cost centre and GL account mapping loaded: production costs, labour rates, and overhead allocations linked to assets and product codes for profitability analysis | Pass/Fail | Med
Manual Entry & Operator Data Collection (5 items)

# | Checklist Item | Type | Priority
21 | Operator data entry interface designed with drop-downs, barcode scanning, and numeric keypads: no free-text fields for critical production data | Pass/Fail | High
22 | Touchscreen or tablet deployed at each work station: minimum 10-inch display, industrial-rated (IP65), mounted at operator eye level | Pass/Fail | High
23 | Manual data submission auto-timestamped at server receipt: operator cannot backdate entries or modify the recorded timestamp | Pass/Fail | High
24 | Photo capture configured for defect, downtime, and safety incident entries: operator can photograph the issue and attach it to the event record | Pass/Fail | Med
25 | Offline mode enabled: entries queued locally when network is unavailable and auto-synced when connectivity restores, with clear sync-status indicator | Pass/Fail | Med
Data Governance & Validation (5 items)

# | Checklist Item | Type | Priority
26 | Data validation rules configured for every source: range checks, rate-of-change limits, stuck-value detection, and null-rejection with automated alert on violation | Pass/Fail | High
27 | Data retention policy defined per source tier (raw sensor data: 12 months, summarised data: 36 months, aggregated KPIs: 84 months) with purge schedule documented | Pass/Fail | High
28 | Time synchronisation verified across all data sources: maximum clock drift of 100 ms between any two sources using NTP or PTP with regular drift audit | Pass/Fail | High
29 | Data source ownership assigned: each source has an owner responsible for data quality, schema changes, and collection uptime with documented escalation path | Pass/Fail | Med
30 | Monthly data completeness report automated: percentage of expected data points received per source, with trend chart and source-level drill-down for gap analysis | Pass/Fail | Med
Legend: Type: Pass/Fail, Selection, Numeric. Priority: High, Med. Toggles: Photo, Required, Critical.
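
For the drift audit in item 28, note that the limit applies between any two sources, not between each source and the reference clock. The sketch below assumes each source's offset from a reference clock has already been measured in milliseconds. How those offsets are obtained is outside the example, and the source names are invented.

```python
from itertools import combinations

def drift_violations(offsets_ms, limit_ms=100.0):
    """Return (source_a, source_b, drift_ms) for every pair exceeding the limit."""
    return [
        (a, b, abs(offsets_ms[a] - offsets_ms[b]))
        for a, b in combinations(sorted(offsets_ms), 2)
        if abs(offsets_ms[a] - offsets_ms[b]) > limit_ms
    ]

# Each source is within 100 ms of the reference, yet one pair is 125 ms apart.
offsets = {"plc_line1": 12.0, "mes_db": -45.0, "tablet_07": 80.0}
print(drift_violations(offsets))  # prints: [('mes_db', 'tablet_07', 125.0)]
```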

Manufacturing Data Integration Protocols — Connect Every Source

Each data source tier requires specific integration protocols. Use this reference to match the correct protocol to each source type. iFactory's connector library includes all protocols shown below — no custom driver development required for standard industrial equipment.

Protocol | Typical sources | Key characteristics | Latency
OPC-UA (OPC Unified Architecture) | All PLCs, controllers, and industrial automation devices | Platform-independent, built-in security (SignAndEncrypt), browseable address space, historical data access | ≥1 ms
MQTT (MQTT Sparkplug B) | IoT sensors, edge devices, and remote monitoring nodes | Lightweight publish-subscribe, TLS encryption, stateful payload with birth/death certificates, scalable to 10,000+ nodes | ≥10 ms
Modbus TCP / RTU | Legacy PLCs, RTUs, energy meters, drives, and older field devices | Widest device compatibility in industrial environments, register-based addressing, no built-in security (use VPN tunnel) | ≥50 ms
REST API / OData | MES, ERP, CMMS, quality systems, and all modern business applications | Standard HTTP/HTTPS, JSON payload, token-based auth (OAuth2, API key), CRUD operations on structured records | ≥1 s
JDBC / ODBC | Relational databases behind MES, ERP, historian, and quality systems | Direct SQL query access, scheduled polling via cron, read-only replicas preferred for analytics workloads | ≥5 s
Flat File / CSV / Excel | Legacy systems, external suppliers, manual exports, and batch-reporting sources | Schedule-based import (FTP, SFTP, email), schema-on-read parsing, header validation, and error-rejection logging | Near-real-time
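
Register-based addressing has one practical consequence worth showing: Modbus registers are 16 bits wide, so a 32-bit floating-point value usually arrives as two registers that the collector has to recombine. The sketch below assumes the high word comes first. Word order varies by device, so this is an assumption to check against the device manual. No Modbus library is used here, and the register values are illustrative.

```python
import struct

def registers_to_float(high_word, low_word):
    """Combine two 16-bit registers into an IEEE 754 float32 (high word first)."""
    raw = struct.pack(">HH", high_word, low_word)
    return struct.unpack(">f", raw)[0]

# 0x3F80 0x0000 is the float32 bit pattern for 1.0.
print(registers_to_float(0x3F80, 0x0000))
# A typical process value: roughly 123.456.
print(round(registers_to_float(0x42F6, 0xE979), 3))
```

If a device sends the low word first, swapping the two arguments is enough. A value that decodes to something implausibly large or small is the usual sign that the word order is wrong.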

Data Collection Pipeline Maturity Levels

Data collection maturity determines what analytics your plant can support. Each level unlocks new analytics capabilities — from basic reporting at Level 1 to AI-driven optimisation at Level 4. Most plants start at Level 1 or 2 and reach Level 3 within 90 days with a structured integration approach.

Level 1: Manual (Paper & Spreadsheet Collection)

Operators record production data on paper, and supervisors re-key it into Excel. No automated data capture. MES and ERP operate independently. Analytics is limited to weekly manual reports. Data latency: 24–72 hours. Reconciliation effort: 8+ hours per week.

Level 2: Connected (PLC & Sensor Integration)

PLCs and IoT sensors stream data to a central historian or cloud platform. Automated downtime and cycle time capture. Manual data still entered via tablets. MES and ERP not yet integrated. Data latency: sub-second to minutes. Reconciliation effort: 3–5 hours per week.

Level 3: Unified (Full System Integration)

PLC, IoT, MES, ERP, and manual entry all feed a unified analytics layer. Cross-system data validation runs automatically. Real-time dashboards reflect shop-floor and business data together. Data latency: sub-second to near-real-time. Reconciliation effort: <1 hour per week.

Level 4: Intelligent (Self-Healing Data Pipeline)

Data quality checks run continuously with auto-correction routines. Missing data triggers automatic source reconnection or fallback to secondary source. Anomaly detection identifies sensor drift before it affects analytics. Data latency: real-time. Reconciliation effort: automated.

Data Collection Deployment Stages

iFactory deploys manufacturing data collection in four sequential stages. Each stage adds new source tiers and validation layers, building toward a complete unified data architecture that supports every analytics use case in your plant.

01 Discover Week 1: Source Audit & Mapping
  • Inventory all data sources across five tiers — PLC, IoT, MES, ERP, manual
  • Document protocols, tag counts, access credentials, and connectivity status
  • Map source data to standard KPI schema — OEE, downtime, quality, throughput
  • Identify critical gaps — unmonitored assets, manual-only data, unsynchronised clocks
02 Connect Week 2: Primary Source Integration
  • Configure OPC-UA connectors for all PLCs and controllers
  • Deploy IoT gateway with edge buffering for sensor networks
  • Establish MES and ERP read-only API connections or database views
  • Set up operator tablet entry forms with photo capture and offline mode
03 Validate Week 3: Data Quality & Governance
  • Enable automated data validation — range checks, rate-of-change, stuck-value detection
  • Verify time synchronisation across all sources with drift audit report
  • Test data completeness — compare expected vs actual data points per source per hour
  • Assign data ownership and document escalation path per source tier
04 Scale Week 4: Expansion & Automation
  • Roll out data collection to remaining production lines and secondary assets
  • Configure automated monthly data completeness report with source-level drill-down
  • Set up data quality alert routing to source owners and plant analytics team
  • Document data retention policy and automated purge schedule per tier

Manufacturing Data Collection — Frequently Asked Questions

What is the minimum data collection setup needed to start manufacturing analytics?

The minimum viable data collection setup requires three sources: (1) PLC data for OEE — production count, cycle time, and downtime status from each asset, (2) MES or manual entry for quality — scrap count, defect reasons, and pass/fail per work order, and (3) ERP or manual entry for throughput — order quantities, scheduled production, and shipment targets. With just these three sources connected to iFactory's analytics layer, you get shift-level OEE, quality yield, and production vs plan tracking. Additional sources (IoT sensors, CMMS, vision systems) add depth but are not required to start.
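
To make the link between these inputs and OEE concrete, here is a small worked example using the widely used availability × performance × quality definition. The article does not spell out how iFactory computes OEE, so treat the formula choice, the function name, and the shift figures as assumptions.

```python
def oee(planned_min, downtime_min, ideal_cycle_s, total_count, scrap_count):
    """Return (availability, performance, quality, oee) for one shift."""
    run_min = planned_min - downtime_min
    availability = run_min / planned_min
    performance = (ideal_cycle_s * total_count) / (run_min * 60)
    quality = (total_count - scrap_count) / total_count
    return availability, performance, quality, availability * performance * quality

# Hypothetical shift: 480 planned minutes, 60 minutes down, 30 s ideal cycle,
# 756 units produced, 21 of them scrapped.
a, p, q, score = oee(480, 60, 30, 756, 21)
print(f"A={a:.1%} P={p:.1%} Q={q:.1%} OEE={score:.1%}")
# prints: A=87.5% P=90.0% Q=97.2% OEE=76.6%
```

Downtime and counts come from the PLC, scrap from MES or operator entry, and planned time from the schedule, which is why those three sources are enough to get started.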

How do you handle data from older PLCs that don't support OPC-UA?

Older PLCs that do not support OPC-UA can be connected through protocol gateways or edge converters. Common approaches include: using a Modbus TCP gateway if the PLC supports Modbus, deploying an edge device that polls the PLC's native protocol (Allen-Bradley CSP, Siemens S7, Mitsubishi MC) and converts to OPC-UA or MQTT, or installing a protocol converter appliance that presents a modern interface to upstream systems while talking the legacy protocol downstream. iFactory's integration layer supports all major legacy PLC protocols directly and includes pre-built connectors for over 50 PLC brands and models without requiring protocol conversion hardware.

What data should be collected in real time vs batch?

Real-time collection (sub-second to 5-second latency) is required for: production counts and cycle times for OEE calculation, downtime events with start and end timestamps, alarm and fault code activations, and quality measurements from in-line gauges. Batch collection (minute to daily) is sufficient for: work order data from MES, inventory transactions from ERP, maintenance records from CMMS, energy consumption totals, and operator-entered scrap reasons. A common mistake is trying to collect everything in real time, which creates unnecessary network load and storage costs. iFactory's analytics layer supports mixed-latency collection — real-time for operational metrics and batch for business data — and joins them automatically by timestamp and asset ID.
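
Joining mixed-latency data by timestamp and asset ID can be pictured as an as-of join: each real-time event picks up the most recent batch record for the same asset. The sketch below is a simplified stand-in using only the standard library. It is not iFactory's join logic, it ignores work order end times, and all IDs and timestamps (epoch seconds) are invented.

```python
from bisect import bisect_right

def attach_work_order(events, work_orders):
    """Tag each event with the work order most recently started on its asset.

    events: list of (asset_id, timestamp, value)
    work_orders: list of (asset_id, start_timestamp, order_id)
    """
    by_asset = {}
    for asset, start, order in sorted(work_orders):
        starts, orders = by_asset.setdefault(asset, ([], []))
        starts.append(start)
        orders.append(order)
    joined = []
    for asset, ts, value in events:
        starts, orders = by_asset.get(asset, ([], []))
        i = bisect_right(starts, ts) - 1  # last order started at or before ts
        joined.append((asset, ts, value, orders[i] if i >= 0 else None))
    return joined

work_orders = [("FILLER01", 1000, "WO-1"), ("FILLER01", 5000, "WO-2")]
events = [
    ("FILLER01", 900, 1),   # before any order started
    ("FILLER01", 1200, 1),
    ("FILLER01", 5000, 1),
    ("CAPPER02", 1200, 1),  # asset with no work orders
]
for row in attach_work_order(events, work_orders):
    print(row)
```

Events that match no work order are kept with `None` rather than dropped, so gaps in the batch data stay visible.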

How do you ensure data consistency across different source systems?

Data consistency is achieved through four mechanisms: (1) a shared asset hierarchy — every source uses the same asset IDs, line names, and area codes defined in a central equipment register, (2) standard unit conversion — iFactory's ingestion layer automatically converts pressure (psi/bar), temperature (Celsius/Fahrenheit), length (mm/inches), and other common units to a standard system-wide unit, (3) time synchronisation — all sources synchronised via NTP or PTP with automated drift monitoring and alerts when clock skew exceeds 100 ms, and (4) cross-source validation — production counts from PLCs are compared against MES work order quantities and ERP inventory receipts, with discrepancies flagged for manual reconciliation. These four mechanisms eliminate the data reconciliation problems that plague most multi-source analytics deployments.
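
The cross-source validation in mechanism (4) might look something like this in its simplest form, comparing PLC counts with MES quantities per work order. The 1% tolerance, the function name, and the figures are assumptions for illustration.

```python
def count_discrepancies(plc_counts, mes_quantities, tolerance_pct=1.0):
    """Both arguments map work order ID to units produced."""
    flagged = []
    for order_id in sorted(set(plc_counts) | set(mes_quantities)):
        plc = plc_counts.get(order_id)
        mes = mes_quantities.get(order_id)
        if plc is None or mes is None:
            flagged.append((order_id, plc, mes, "missing in one source"))
        elif abs(plc - mes) > tolerance_pct / 100.0 * max(plc, mes, 1):
            flagged.append((order_id, plc, mes, "count mismatch"))
    return flagged

plc = {"WO-1": 1000, "WO-2": 480, "WO-3": 250}
mes = {"WO-1": 996, "WO-2": 500}
for row in count_discrepancies(plc, mes):
    print(row)
# prints:
# ('WO-2', 480, 500, 'count mismatch')
# ('WO-3', 250, None, 'missing in one source')
```

WO-1 differs by 4 units, which is inside the assumed 1% tolerance, so only the other two orders go to manual reconciliation.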

What is the typical bandwidth and storage requirement for manufacturing data collection?

Bandwidth and storage vary significantly by source type. For PLC data (cycle times, fault codes, production counts), a typical plant with 50 PLCs and 20,000 tags requires approximately 2–5 Mbps upstream bandwidth and 1–3 TB per year of storage at 1-second polling. For IoT sensor data (vibration, temperature, pressure at 1 kHz), edge processing is strongly recommended — raw data stays at the edge and only summarised metrics (min, max, avg, stddev) are sent to the cloud, reducing bandwidth by 99%. For MES and ERP data (work orders, inventory, quality records), bandwidth is negligible — typically <1 Mbps — but requires careful API rate-limit management. iFactory's edge gateway handles this automatically with configurable data compression, filtering, and summarisation rules per source type.
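
A back-of-envelope version of the PLC estimate, assuming 16 bytes per sample for value, timestamp, and quality flag. The sample size is an assumption, and real figures depend heavily on protocol overhead and on whether the historian stores every poll or only changes.

```python
def estimate(tags, poll_hz, bytes_per_sample):
    """Return (upstream Mbps, uncompressed TB per year) for a polled tag set."""
    bytes_per_s = tags * poll_hz * bytes_per_sample
    mbps = bytes_per_s * 8 / 1e6
    tb_per_year = bytes_per_s * 365 * 24 * 3600 / 1e12
    return mbps, tb_per_year

mbps, tb = estimate(tags=20_000, poll_hz=1, bytes_per_sample=16)
print(f"{mbps:.2f} Mbps, {tb:.1f} TB/year uncompressed")
# prints: 2.56 Mbps, 10.1 TB/year uncompressed

# Summarisation: a 1 kHz signal reduced to 4 values per 1-second window.
reduction = 1 - 4 / 1000
print(f"{reduction:.1%} fewer values sent")  # prints: 99.6% fewer values sent
```

Under these assumptions the bandwidth lands inside the 2–5 Mbps range quoted above, while the uncompressed storage figure comes out higher than 1–3 TB per year. The difference would have to come from compression or from storing values only when they change, so it is worth confirming which applies when sizing storage.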

How does iFactory handle data security and OT network segmentation?

The iFactory analytics layer is designed for OT network architectures. The edge gateway sits in the plant network (OT network) and initiates all connections — it polls PLCs, reads from historians, and queries MES databases using outbound-only connections. Data is transmitted to the iFactory cloud layer over TLS 1.3 with certificate-based authentication. The gateway never exposes inbound ports and requires no firewall changes in the OT network. For air-gapped plants, iFactory supports a fully on-premise deployment option where the edge gateway, analytics server, and dashboard server all run inside the plant network with no cloud connectivity required. Data retention and purge schedules are configurable per source tier to meet regulatory and internal compliance requirements.


