Data collection is the foundation of manufacturing analytics, yet most plants collect less than 40% of the data they could be using. PLCs stream cycle times but not energy consumption. MES tracks work orders but not setup times. Operators enter scrap counts but not root causes. This data collection checklist covers every manufacturing data source — PLCs, IoT sensors, MES, ERP, vision systems, and manual operator inputs — and what each must provide for a complete analytics pipeline. Based on iFactory's integration patterns deployed across 1,000+ plants, the 30 items below ensure your data collection architecture captures, validates, and delivers every signal required for production analytics.
Unify All Your Plant Data Sources in One Analytics Layer — in One Week
iFactory connects to PLCs, IoT sensors, MES, ERP, and manual entry points through 200+ pre-built connectors. See your unified data stream in a 30-minute session.
Why Structured Data Collection Matters for Analytics
Manufacturing plants that systematically map and validate all data sources before building analytics achieve 3.2x higher dashboard adoption and reduce data reconciliation effort by 70%. The five data source tiers below — from plant-floor sensors to enterprise business systems — must all feed into a unified analytics layer for a complete operational picture.
Manufacturing Data Source Topology — From Sensor to Dashboard
Data flows through five distinct tiers from the plant floor to business systems. Each tier has unique collection methods, latency requirements, and data formats. A complete analytics deployment connects all five tiers into a unified data pipeline.
Tier 1: Field sensors and devices
- Temperature, pressure, vibration, flow sensors
- Analog and digital I/O signals from field devices
- Sampling rates: 1 Hz to 10 kHz
- Protocols: 4-20 mA, 0-10 V, IO-Link, HART
Tier 2: PLCs and controllers
- Cycle times, fault codes, production counts
- Programmable logic controllers and PACs
- Tag counts: 500–50,000 per controller
- Protocols: OPC-UA, Modbus TCP, Profinet, EtherNet/IP
Tier 3: Supervisory control, historian, and edge
- Supervisory control, historian, edge processing
- Data aggregation and buffering at line level
- Typical latency: sub-second to 5 seconds
- Storage: 1–10 TB per line per year
Tier 4: Production and maintenance systems (MES, CMMS)
- Work orders, specifications, quality results
- Maintenance schedules, spare parts, asset hierarchy
- Data granularity: per batch, per shift, per work order
- Integration: REST APIs, database views, flat-file exports
Tier 5: ERP and business systems
- Orders, inventory, costing, shipping schedules
- ERP, CRM, supply chain, financial systems
- Update frequency: daily, weekly, real-time API
- Integration: REST/SOAP APIs, EDI, OData, JDBC
See Every Data Source in Your Plant Unified in One Dashboard
iFactory discovers all available data sources across your plant, maps tags to standard KPIs, and streams everything into a single analytics layer. No coding, no complex ETL configuration.
Manufacturing Data Quality Dimensions — What Good Data Looks Like
Data quality determines analytics accuracy. These five dimensions define whether your collected data is usable for production analytics, predictive models, and operational dashboards. Each dimension has a minimum threshold that must be met before data enters the analytics pipeline.
Completeness: the percentage of expected data points actually captured. Missing timestamps, null values, and gaps of more than 5 minutes pull completeness below threshold and trigger data quality alerts.
Accuracy: the percentage of values within the expected range and calibrated tolerance. Out-of-range spikes, stuck-at-value readings, and uncalibrated sensor drift all degrade accuracy below acceptable levels.
Timeliness: the maximum acceptable latency from measurement generation to data availability in the analytics layer. Delays beyond 5 seconds render the data unusable for real-time dashboards and alarm routing.
Consistency: data format, units, and naming conventions must be identical across all sources. Mixed units (psi vs bar, Celsius vs Fahrenheit) and inconsistent tag naming cause dashboard errors and analytics failures.
Uniqueness: duplicate data points should not exceed 1% of total records. Redundant sensors reporting the same signal, overlapping historian scopes, and double-counted production events distort KPIs and model training.
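As a minimal sketch of how the completeness and duplicate checks described here might be computed for a single signal, the Python below compares unique timestamps against the number expected in a window, reports the share of duplicate records, and lists gaps longer than 5 minutes. The function name `quality_metrics`, its parameters, and the fixed-interval assumption are illustrative only and are not taken from iFactory's product.

```python
from datetime import datetime, timedelta


def quality_metrics(timestamps, expected_interval, window_start, window_end,
                    max_gap=timedelta(minutes=5)):
    """Return (completeness, duplicate_rate, gaps) for one signal.

    Assumes the source is expected to report at a fixed interval.
    """
    expected = int((window_end - window_start) / expected_interval)
    unique = sorted(set(timestamps))
    completeness = len(unique) / expected if expected else 0.0
    duplicate_rate = (
        (len(timestamps) - len(unique)) / len(timestamps) if timestamps else 0.0
    )
    # A gap is any pair of consecutive unique timestamps further apart than max_gap.
    gaps = [(a, b) for a, b in zip(unique, unique[1:]) if b - a > max_gap]
    return completeness, duplicate_rate, gaps


# Hypothetical one-hour window sampled once per minute, with minutes 30-39
# missing and one record received twice.
start = datetime(2024, 1, 1, 6, 0)
stamps = [start + timedelta(minutes=m) for m in list(range(30)) + list(range(40, 60))]
stamps.append(start)
completeness, duplicate_rate, gaps = quality_metrics(
    stamps, timedelta(minutes=1), start, start + timedelta(hours=1)
)
print(f"completeness={completeness:.1%} duplicates={duplicate_rate:.1%} gaps={len(gaps)}")
```

The duplicate rate can then be compared against the 1% limit, and each reported gap can be routed to whoever owns the source.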
Manufacturing Data Collection Checklist — 30 Items
Each checklist item includes the specific action required, type, priority, and status toggles. The type indicates whether the item is a pass/fail check, a structured selection, or a numeric configuration. Priority marks implementation order. Use the Photo, Required, and Critical toggles to track completion.
| # | Checklist Item | Type | Priority | Photo | Req. | Crit. |
|---|---|---|---|---|---|---|
| 1 | PLC tag inventory completed — all controllers inventoried with tag counts, data types (bool, int, real, string), and update rates documented | Pass/Fail | High | ✓ | ✓ | ✓ |
| 2 | OPC-UA server configured on each controller or gateway — security mode set to SignAndEncrypt, endpoint URL documented | Pass/Fail | High | — | ✓ | ✓ |
| 3 | Tag naming convention standardised across all PLCs — no vendor-specific or engineer-specific naming; standard format: Area_Line_Asset_Tag | Pass/Fail | High | — | ✓ | ✓ |
| 4 | Critical tags identified (production count, cycle time, fault codes, downtime status) and read at an interval of 100 ms or faster for real-time analytics | Selection | High | — | ✓ | ✓ |
| 5 | Redundant controllers logged — paired controllers or hot-standby PLCs identified and tagged to prevent double-counting production events | Pass/Fail | Med | — | ✓ | — |
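
The naming convention in item 3 can be checked automatically once the tag inventory exists. The sketch below is one possible check, assuming each of the four segments in `Area_Line_Asset_Tag` is alphanumeric and contains no underscores; the pattern and the sample tag names are hypothetical.

```python
import re

# Assumed interpretation of Area_Line_Asset_Tag: four alphanumeric segments
# separated by single underscores.
TAG_PATTERN = re.compile(r"[A-Za-z0-9]+_[A-Za-z0-9]+_[A-Za-z0-9]+_[A-Za-z0-9]+")


def nonconforming_tags(tags):
    """Return the tags that do not follow the Area_Line_Asset_Tag format."""
    return [tag for tag in tags if not TAG_PATTERN.fullmatch(tag)]


# Hypothetical inventory mixing standard and vendor-style names.
inventory = ["Press_L1_P101_CycleTime", "N7:0", "Motor1.Speed", "Weld_L2_R04_FaultCode"]
print(nonconforming_tags(inventory))
```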
| # | Checklist Item | Type | Priority | Photo | Req. | Crit. |
|---|---|---|---|---|---|---|
| 6 | Sensor type, measurement range, and accuracy class documented for every IoT sensor on the plant network | Pass/Fail | High | ✓ | ✓ | ✓ |
| 7 | Edge processing device configured to buffer sensor data during network outages — minimum 72 hours of local storage at full sampling rate | Pass/Fail | High | — | ✓ | ✓ |
| 8 | Sampling rate configured per sensor type — vibration ≥10 kHz, temperature ≥1 Hz, pressure ≥10 Hz, current ≥1 kHz | Numeric | High | — | ✓ | ✓ |
| 9 | Wireless sensor network gateway coverage validated — signal strength ≥−70 dBm at all sensor locations with redundancy for critical paths | Numeric | Med | — | ✓ | — |
| 10 | Edge-to-cloud data compression and filtering rules defined — raw data stored at edge, summarised metrics (min, max, avg, stddev) sent to cloud per configurable window | Pass/Fail | Med | — | ✓ | — |
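
Item 10 describes keeping raw data at the edge and forwarding only summary statistics per window. A minimal sketch of that summarisation step is shown below; the function name, the fixed-size window, and the use of population standard deviation are assumptions made for illustration.

```python
import statistics


def summarise(samples, window_size):
    """Reduce raw samples to min, max, avg and stddev per fixed-size window."""
    summaries = []
    for i in range(0, len(samples), window_size):
        window = samples[i:i + window_size]
        summaries.append({
            "min": min(window),
            "max": max(window),
            "avg": statistics.fmean(window),
            # Population standard deviation; a sample estimate would use stdev().
            "stddev": statistics.pstdev(window),
            "count": len(window),
        })
    return summaries


# Hypothetical raw readings summarised in windows of two samples each.
print(summarise([1.0, 2.0, 3.0, 4.0], window_size=2))
```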
| # | Checklist Item | Type | Priority | Photo | Req. | Crit. |
|---|---|---|---|---|---|---|
| 11 | MES work order fields mapped to analytics schema — order ID, product code, planned quantity, start time, end time, actual quantity, scrap quantity | Pass/Fail | High | — | ✓ | ✓ |
| 12 | MES API endpoint or database view established — read-only connection with refresh interval ≤30 seconds for near-real-time production tracking | Pass/Fail | High | — | ✓ | ✓ |
| 13 | Quality inspection results linked to work order and production timestamp — defect counts, measurement values, and pass/fail status per batch | Pass/Fail | High | ✓ | ✓ | ✓ |
| 14 | Downtime events synchronised between PLC fault codes and MES downtime categories — each PLC fault code mapped to exactly one MES downtime reason | Pass/Fail | High | — | ✓ | ✓ |
| 15 | CMMS (maintenance) data integration configured — work orders triggered by production events, asset health scores connected to maintenance history | Pass/Fail | Med | — | ✓ | — |
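
Item 14 requires every PLC fault code to map to exactly one MES downtime reason. The sketch below shows one way such a mapping table could be audited; the fault codes, reason names, and function name are hypothetical.

```python
from collections import defaultdict


def check_fault_mapping(known_codes, mapping_rows):
    """Return (unmapped, ambiguous) fault codes.

    unmapped: codes with no MES downtime reason.
    ambiguous: codes mapped to more than one reason.
    """
    reasons = defaultdict(set)
    for code, reason in mapping_rows:
        reasons[code].add(reason)
    unmapped = sorted(code for code in known_codes if code not in reasons)
    ambiguous = sorted(code for code, found in reasons.items() if len(found) > 1)
    return unmapped, ambiguous


# Hypothetical mapping: code 102 has two reasons, code 103 has none.
rows = [(101, "Jam"), (102, "Jam"), (102, "Changeover")]
print(check_fault_mapping([101, 102, 103], rows))
```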
| # | Checklist Item | Type | Priority | Photo | Req. | Crit. |
|---|---|---|---|---|---|---|
| 16 | ERP production order feed configured — planned orders, released orders, in-production orders, and completed orders with timestamps for each status transition | Pass/Fail | High | — | ✓ | ✓ |
| 17 | Material master and BOM data synchronised — each product in analytics must link to its bill of materials for cost-per-unit and material usage calculations | Pass/Fail | High | — | ✓ | ✓ |
| 18 | Inventory transactions exported — material consumption, finished goods receipt, scrap disposal with timestamps accurate to the minute for OEE and yield calculations | Pass/Fail | High | — | ✓ | ✓ |
| 19 | Customer demand and shipping schedule accessible — forecast horizon, confirmed orders, and shipment dates for build-to-order and make-to-stock planning visibility | Pass/Fail | Med | — | ✓ | — |
| 20 | ERP cost centre and GL account mapping loaded — production costs, labour rates, and overhead allocations linked to assets and product codes for profitability analysis | Pass/Fail | Med | — | — | — |
| # | Checklist Item | Type | Priority | Photo | Req. | Crit. |
|---|---|---|---|---|---|---|
| 21 | Operator data entry interface designed with drop-downs, barcode scanning, and numeric keypads — no free-text fields for critical production data | Pass/Fail | High | ✓ | ✓ | ✓ |
| 22 | Touchscreen or tablet deployed at each work station — minimum 10-inch display, industrial-rated (IP65), mounted at operator eye level | Pass/Fail | High | ✓ | ✓ | ✓ |
| 23 | Manual data submission auto-timestamped at server receipt — operator cannot backdate entries or modify the recorded timestamp | Pass/Fail | High | — | ✓ | ✓ |
| 24 | Photo capture configured for defect, downtime, and safety incident entries — operator can photograph the issue and attach it to the event record | Pass/Fail | Med | — | ✓ | — |
| 25 | Offline mode enabled — entries queued locally when network is unavailable and auto-synced when connectivity restores, with clear sync-status indicator | Pass/Fail | Med | — | ✓ | — |
| # | Checklist Item | Type | Priority | Photo | Req. | Crit. |
|---|---|---|---|---|---|---|
| 26 | Data validation rules configured for every source — range checks, rate-of-change limits, stuck-value detection, and null-rejection with automated alert on violation | Pass/Fail | High | — | ✓ | ✓ |
| 27 | Data retention policy defined per source tier — raw sensor data: 12 months, summarised data: 36 months, aggregated KPIs: 84 months — with purge schedule documented | Pass/Fail | High | — | ✓ | ✓ |
| 28 | Time synchronisation verified across all data sources: maximum clock offset of 100 ms between any two sources, using NTP or PTP, with a regular drift audit | Pass/Fail | High | — | ✓ | ✓ |
| 29 | Data source ownership assigned — each source has an owner responsible for data quality, schema changes, and collection uptime with documented escalation path | Pass/Fail | Med | — | ✓ | — |
| 30 | Monthly data completeness report automated — percentage of expected data points received per source, with trend chart and source-level drill-down for gap analysis | Pass/Fail | Med | — | — | — |
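
The four validation rules named in item 26 (range checks, rate-of-change limits, stuck-value detection, and null rejection) can be sketched for a single signal as follows. The thresholds and the function name are illustrative assumptions; real limits would come from each sensor's documented range and process behaviour.

```python
def validate(readings, low, high, max_step, stuck_after):
    """Return (index, rule) pairs for every rule violation in one signal."""
    violations = []
    prev = None  # last non-null value seen
    run = 0      # length of the current run of identical values
    for i, value in enumerate(readings):
        if value is None:
            violations.append((i, "null"))
            continue
        if not (low <= value <= high):
            violations.append((i, "range"))
        if prev is not None:
            if abs(value - prev) > max_step:
                violations.append((i, "rate_of_change"))
            run = run + 1 if value == prev else 1
            if run == stuck_after:  # flag once per stuck run
                violations.append((i, "stuck"))
        else:
            run = 1
        prev = value
    return violations


# Hypothetical temperature readings with a null, a spike and a stuck value.
readings = [20.0, 20.5, None, 95.0, 21.0, 21.0, 21.0]
print(validate(readings, low=0.0, high=80.0, max_step=5.0, stuck_after=3))
```

In this sketch a null reading is reported but does not reset the comparison value, so the next valid reading is still checked against the last valid one.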
Manufacturing Data Integration Protocols — Connect Every Source
Each data source tier requires specific integration protocols. Use this reference to match the correct protocol to each source type. iFactory's connector library includes all protocols shown below — no custom driver development required for standard industrial equipment.
Data Collection Pipeline Maturity Levels
Data collection maturity determines what analytics your plant can support. Each level unlocks new analytics capabilities — from basic reporting at Level 1 to AI-driven optimisation at Level 4. Most plants start at Level 1 or 2 and reach Level 3 within 90 days with a structured integration approach.
Level 1: Manual (Paper & Spreadsheet Collection)
Operators record production data on paper, supervisors enter into Excel. No automated data capture. MES and ERP operate independently. Analytics is limited to weekly manual reports. Data latency: 24–72 hours. Reconciliation effort: 8+ hours per week.
Level 2: Connected (PLC & Sensor Integration)
PLCs and IoT sensors stream data to a central historian or cloud platform. Automated downtime and cycle time capture. Manual data still entered via tablets. MES and ERP not yet integrated. Data latency: sub-second to minutes. Reconciliation effort: 3–5 hours per week.
Level 3: Unified (Full System Integration)
PLC, IoT, MES, ERP, and manual entry all feed a unified analytics layer. Cross-system data validation runs automatically. Real-time dashboards reflect shop-floor and business data together. Data latency: sub-second to near-real-time. Reconciliation effort: <1 hour per week.
Level 4: Intelligent (Self-Healing Data Pipeline)
Data quality checks run continuously with auto-correction routines. Missing data triggers automatic source reconnection or fallback to secondary source. Anomaly detection identifies sensor drift before it affects analytics. Data latency: real-time. Reconciliation effort: automated.
Data Collection Deployment Stages
iFactory deploys manufacturing data collection in four sequential stages. Each stage adds new source tiers and validation layers, building toward a complete unified data architecture that supports every analytics use case in your plant.
Stage 1
- Inventory all data sources across five tiers: PLC, IoT, MES, ERP, manual
- Document protocols, tag counts, access credentials, and connectivity status
- Map source data to standard KPI schema: OEE, downtime, quality, throughput
- Identify critical gaps: unmonitored assets, manual-only data, unsynchronised clocks
Stage 2
- Configure OPC-UA connectors for all PLCs and controllers
- Deploy IoT gateway with edge buffering for sensor networks
- Establish MES and ERP read-only API connections or database views
- Set up operator tablet entry forms with photo capture and offline mode
Stage 3
- Enable automated data validation: range checks, rate-of-change, stuck-value detection
- Verify time synchronisation across all sources with drift audit report
- Test data completeness by comparing expected vs actual data points per source per hour
- Assign data ownership and document escalation path per source tier
Stage 4
- Roll out data collection to remaining production lines and secondary assets
- Configure automated monthly data completeness report with source-level drill-down
- Set up data quality alert routing to source owners and plant analytics team
- Document data retention policy and automated purge schedule per tier
Manufacturing Data Collection — Frequently Asked Questions
What is the minimum data collection setup needed to start manufacturing analytics?
The minimum viable data collection setup requires three sources: (1) PLC data for OEE — production count, cycle time, and downtime status from each asset, (2) MES or manual entry for quality — scrap count, defect reasons, and pass/fail per work order, and (3) ERP or manual entry for throughput — order quantities, scheduled production, and shipment targets. With just these three sources connected to iFactory's analytics layer, you get shift-level OEE, quality yield, and production vs plan tracking. Additional sources (IoT sensors, CMMS, vision systems) add depth but are not required to start.
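To illustrate what shift-level OEE needs from those sources, the sketch below uses the standard definition of OEE as availability × performance × quality. The ideal cycle time is assumed to be known per product, and all input values and names are hypothetical.

```python
def oee(planned_min, downtime_min, ideal_cycle_s, total_count, scrap_count):
    """Return (availability, performance, quality, oee) for one shift."""
    if planned_min <= 0 or total_count <= 0 or downtime_min >= planned_min:
        raise ValueError("shift must have run time and at least one unit produced")
    run_min = planned_min - downtime_min
    availability = run_min / planned_min                         # from PLC downtime status
    performance = (ideal_cycle_s * total_count) / (run_min * 60)  # from PLC counts and cycle time
    quality = (total_count - scrap_count) / total_count          # from MES or manual scrap entry
    return availability, performance, quality, availability * performance * quality


# Hypothetical 8-hour shift: 48 min downtime, 30 s ideal cycle, 800 units, 40 scrapped.
a, p, q, score = oee(planned_min=480, downtime_min=48, ideal_cycle_s=30,
                     total_count=800, scrap_count=40)
print(f"availability={a:.1%} performance={p:.1%} quality={q:.1%} OEE={score:.1%}")
```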
How do you handle data from older PLCs that don't support OPC-UA?
Older PLCs that do not support OPC-UA can be connected through protocol gateways or edge converters. Common approaches include: using a Modbus TCP gateway if the PLC supports Modbus, deploying an edge device that polls the PLC's native protocol (Allen-Bradley CSP, Siemens S7, Mitsubishi MC) and converts to OPC-UA or MQTT, or installing a protocol converter appliance that presents a modern interface to upstream systems while talking the legacy protocol downstream. iFactory's integration layer supports all major legacy PLC protocols directly and includes pre-built connectors for over 50 PLC brands and models without requiring protocol conversion hardware.
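One small step in any such conversion is decoding raw register values into engineering values. As an illustration only, the sketch below combines two 16-bit Modbus registers into a 32-bit IEEE 754 float using Python's standard library. It assumes big-endian word order with the high word first; many devices use a different order, so this must be checked against the device manual. It does not represent iFactory's connector code.

```python
import struct


def registers_to_float(high_word, low_word):
    """Combine two 16-bit registers (high word first) into a float32."""
    raw = struct.pack(">HH", high_word, low_word)  # 4 bytes, big-endian
    return struct.unpack(">f", raw)[0]


# 0x41CC0000 is the IEEE 754 single-precision encoding of 25.5.
print(registers_to_float(0x41CC, 0x0000))
```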
What data should be collected in real time vs batch?
Real-time collection (sub-second to 5-second latency) is required for: production counts and cycle times for OEE calculation, downtime events with start and end timestamps, alarm and fault code activations, and quality measurements from in-line gauges. Batch collection (minute to daily) is sufficient for: work order data from MES, inventory transactions from ERP, maintenance records from CMMS, energy consumption totals, and operator-entered scrap reasons. A common mistake is trying to collect everything in real time, which creates unnecessary network load and storage costs. iFactory's analytics layer supports mixed-latency collection — real-time for operational metrics and batch for business data — and joins them automatically by timestamp and asset ID.
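Joining real-time and batch data by timestamp and asset ID amounts to finding, for each real-time event, the batch record that was active on that asset at that moment. The sketch below shows one way to do this with a sorted lookup. It assumes work orders on the same asset do not overlap and that timestamps are numeric (for example, epoch seconds); all names and values are hypothetical and this is not iFactory's join logic.

```python
from bisect import bisect_right


def attach_work_orders(events, work_orders):
    """Attach the active work order to each real-time event.

    events: list of (asset_id, timestamp, value)
    work_orders: list of (asset_id, start, end, order_id), end exclusive
    """
    by_asset = {}
    for asset, start, end, order_id in work_orders:
        by_asset.setdefault(asset, []).append((start, end, order_id))
    for orders in by_asset.values():
        orders.sort()
    starts = {asset: [o[0] for o in orders] for asset, orders in by_asset.items()}

    joined = []
    for asset, ts, value in events:
        orders = by_asset.get(asset, [])
        # Index of the last work order that started at or before the event.
        i = bisect_right(starts.get(asset, []), ts) - 1
        order_id = orders[i][2] if i >= 0 and ts < orders[i][1] else None
        joined.append((asset, ts, value, order_id))
    return joined


orders = [("P101", 0, 100, "WO-1"), ("P101", 100, 200, "WO-2")]
events = [("P101", 50, 1), ("P101", 100, 1), ("P101", 250, 1), ("P202", 10, 1)]
for row in attach_work_orders(events, orders):
    print(row)
```

Events that fall outside every work order keep `None` as their order ID, which makes unassigned production easy to report.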
How do you ensure data consistency across different source systems?
Data consistency is achieved through four mechanisms: (1) a shared asset hierarchy, in which every source uses the same asset IDs, line names, and area codes defined in a central equipment register; (2) standard unit conversion, in which iFactory's ingestion layer automatically converts pressure (psi/bar), temperature (Celsius/Fahrenheit), length (mm/inches), and other common units to a single system-wide unit; (3) time synchronisation, with all sources synchronised via NTP or PTP and automated drift monitoring that alerts when clock skew exceeds 100 ms; and (4) cross-source validation, in which production counts from PLCs are compared against MES work order quantities and ERP inventory receipts, with discrepancies flagged for manual reconciliation. Together, these mechanisms address the data reconciliation problems that plague most multi-source analytics deployments.
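A minimal sketch of the unit conversion step is shown below. The choice of bar, degrees Celsius, and millimetres as the system-wide units is an assumption made for the example, as are the unit labels and the function name; the source does not say which standard units iFactory uses.

```python
PSI_TO_BAR = 0.0689476  # 1 psi is approximately 0.0689476 bar
INCH_TO_MM = 25.4       # exact by definition


def to_standard(value, unit):
    """Convert a reading to the assumed standard units (bar, degC, mm)."""
    if unit == "psi":
        return value * PSI_TO_BAR, "bar"
    if unit == "degF":
        return (value - 32) * 5 / 9, "degC"
    if unit == "in":
        return value * INCH_TO_MM, "mm"
    if unit in ("bar", "degC", "mm"):
        return value, unit  # already in a standard unit
    raise ValueError(f"no conversion defined for unit {unit!r}")


for reading in [(100.0, "psi"), (212.0, "degF"), (2.0, "in"), (6.5, "bar")]:
    print(reading, "->", to_standard(*reading))
```

Rejecting unknown units rather than passing them through keeps unconverted values from reaching a dashboard unnoticed.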
What is the typical bandwidth and storage requirement for manufacturing data collection?
Bandwidth and storage vary significantly by source type. For PLC data (cycle times, fault codes, production counts), a typical plant with 50 PLCs and 20,000 tags requires approximately 2–5 Mbps upstream bandwidth and 1–3 TB per year of storage at 1-second polling. For IoT sensor data (vibration, temperature, pressure at 1 kHz), edge processing is strongly recommended — raw data stays at the edge and only summarised metrics (min, max, avg, stddev) are sent to the cloud, reducing bandwidth by 99%. For MES and ERP data (work orders, inventory, quality records), bandwidth is negligible — typically <1 Mbps — but requires careful API rate-limit management. iFactory's edge gateway handles this automatically with configurable data compression, filtering, and summarisation rules per source type.
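The PLC and sensor figures above can be sanity-checked with simple arithmetic. The sketch below assumes 16 bytes per sample on the wire (value, timestamp, and protocol overhead) and 4 bytes per sample stored after historian compression, plus a 1-second summary window for sensor data. These are illustrative assumptions rather than iFactory specifications, and real values depend on protocol, data types, and compression settings.

```python
def plc_estimate(tags, poll_interval_s, wire_bytes, stored_bytes):
    """Return (upstream Mbps, TB per year) for polled PLC tags."""
    samples_per_s = tags / poll_interval_s
    mbps = samples_per_s * wire_bytes * 8 / 1e6
    tb_per_year = samples_per_s * stored_bytes * 86_400 * 365 / 1e12
    return mbps, tb_per_year


def summary_reduction(sample_rate_hz, window_s, metrics_per_window=4):
    """Fraction of values saved by sending min, max, avg, stddev per window."""
    raw_values = sample_rate_hz * window_s
    return 1 - metrics_per_window / raw_values


mbps, tb = plc_estimate(tags=20_000, poll_interval_s=1, wire_bytes=16, stored_bytes=4)
print(f"PLC: {mbps:.2f} Mbps upstream, {tb:.2f} TB per year")
print(f"1 kHz sensor, 1 s window: {summary_reduction(1000, 1):.1%} fewer values sent")
```

Under these assumptions the results fall within the 2–5 Mbps and 1–3 TB ranges quoted above, and a 1-second summary window on a 1 kHz signal is consistent with the 99% reduction.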
How does iFactory handle data security and OT network segmentation?
The iFactory analytics layer is designed for OT network architectures. The edge gateway sits in the plant (OT) network and initiates all connections: it polls PLCs, reads from historians, and queries MES databases using outbound-only connections. Data is transmitted to the iFactory cloud layer over TLS 1.3 with certificate-based authentication. The gateway never exposes inbound ports, so no inbound firewall rules are needed in the OT network. For air-gapped plants, iFactory supports a fully on-premise deployment option where the edge gateway, analytics server, and dashboard server all run inside the plant network with no cloud connectivity required. Data retention and purge schedules are configurable per source tier to meet regulatory and internal compliance requirements.
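For readers who want to see what an outbound-only client that enforces TLS 1.3 with certificate-based authentication looks like in general terms, the sketch below builds such a client context with Python's standard `ssl` module. It is a generic illustration, not iFactory's gateway code; the function name and the certificate paths passed to it are hypothetical.

```python
import ssl


def make_gateway_tls_context(client_cert=None, client_key=None):
    """Build a client-side TLS context that refuses anything below TLS 1.3."""
    # The default context verifies the server certificate and hostname.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    if client_cert:
        # Client certificate for mutual (certificate-based) authentication.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


context = make_gateway_tls_context()
print(context.minimum_version, context.verify_mode)
```

Because the context is used only to open outbound connections, nothing in this pattern requires a listening port on the gateway.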
Connect Every Data Source in Your Plant — See a Unified View in 30 Minutes
iFactory's pre-built connectors and edge gateway unify PLC, IoT, MES, ERP, and manual data into one analytics layer. No coding, no complex ETL. Your first dashboard in a 30-minute demo session.







