Internet of Things, Product Marketing
CESMII Member Spotlight
Several years ago, I was involved in a proof-of-concept project at a manufacturing facility that produced pet food. The facility's extruders lost throughput over time. To restore it, the company would take an extruder down for an overhaul, which took six weeks and resulted in lost capacity. They wanted to use mathematical modeling to predict and plan for these overhauls and make the best financial decision for the plant. However, during the analytic process, they discovered that they did not have all the necessary data. Specifically, they had never tracked which screw was in a given extruder at any given time, which made the degradation model impossible to build. Although the project was initially considered a failure, it was actually a success: it revealed the value of data that was not being collected and allowed them to remedy the situation. This is just one example of how advanced analytics can provide visibility and insight into how to improve and clean up “dirty data.”
Dirty data can negatively impact any manufacturing production operation. It can prevent manufacturers from being proactive and predicting quality and reliability issues, which in turn can hurt reputation, efficiency, and profitability. Most companies “think” their MES and Historian data are clean until they apply predictive models and machine learning. To take full advantage of machine learning, manufacturers need to clean their data and make it usable for predictive modeling. What steps can they take to transform their data for use in production operations? Well, the first step may surprise you!
- Actually run through an analytic process and build a predictive model: It is in this process that dirty data is exposed. In addition, you may find data that has yet to be collected, based on the questions you’re trying to answer.
- Identify the source of the dirty data: It’s important to understand where the dirty data is coming from to address the root cause of the problem.
- Validate data inputs: Establishing processes to validate data inputs can help ensure that only accurate and complete data is entered into the system.
- Review and correct data: In some cases, it may be necessary to review and correct data manually. This can be time-consuming but is often necessary to ensure the accuracy of the data.
- Implement data governance policies: Establishing clear policies and procedures for how data is entered, managed, and used can help prevent dirty data from entering the system in the first place.
- Regularly monitor and audit data quality: Regularly reviewing the data to identify and correct errors can help ensure that the data remains accurate over time.
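As a rough illustration of the validation and auditing steps above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Reading` record, its field names, the `validate` and `audit` helpers, and the throughput and gap thresholds are hypothetical and do not come from any particular MES or Historian product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Reading:
    """One hypothetical historian record for an extruder."""
    timestamp: datetime
    extruder_id: str
    screw_id: Optional[str]            # which screw was installed (may be missing)
    throughput_kg_h: Optional[float]   # measured throughput (may be missing)


def validate(reading, min_rate=0.0, max_rate=5000.0):
    """Return a list of problems found in one reading (empty if it looks clean).

    The rate limits are placeholder values, not real equipment limits.
    """
    problems = []
    if not reading.screw_id:
        problems.append("missing screw_id")
    if reading.throughput_kg_h is None:
        problems.append("missing throughput")
    elif not (min_rate <= reading.throughput_kg_h <= max_rate):
        problems.append("throughput out of range")
    return problems


def audit(readings, max_gap=timedelta(hours=1)):
    """Summarise data quality across a batch of readings."""
    report = {"total": len(readings), "flagged": 0, "duplicates": 0, "gaps": 0}
    seen = set()      # (extruder, timestamp) pairs already encountered
    last_seen = {}    # most recent timestamp per extruder
    for r in sorted(readings, key=lambda x: x.timestamp):
        if validate(r):
            report["flagged"] += 1
        key = (r.extruder_id, r.timestamp)
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
        prev = last_seen.get(r.extruder_id)
        if prev is not None and r.timestamp - prev > max_gap:
            report["gaps"] += 1
        last_seen[r.extruder_id] = r.timestamp
    return report


t0 = datetime(2023, 1, 1, 8, 0)
sample = [
    Reading(t0, "EX-1", "SCR-7", 1200.0),
    Reading(t0 + timedelta(minutes=30), "EX-1", None, 1180.0),  # screw not recorded
    Reading(t0 + timedelta(hours=3), "EX-1", "SCR-7", -5.0),    # long gap, bad value
    Reading(t0 + timedelta(hours=3), "EX-1", "SCR-7", -5.0),    # duplicate record
]
print(audit(sample))
```

Running a check like this on a schedule is one way to make the monitoring step routine. The counts it reports (flagged records, duplicates, gaps) also point back to the source of the problem, such as a field that operators are never asked to fill in.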
A systematic and proactive approach to address dirty data is the first step to fully leveraging machine learning in production operations. Doing this ensures the accuracy and completeness of the data being used to achieve objectives such as process optimization and production quality.
However, don’t wait until your data is clean to leverage advanced analytics; make advanced analytics a part of your cleanup process. The data you already have is valuable, and exercising your questions against it will reveal the “dirty spots.” It will point you to where the data needs to be cleaned and where collection mechanisms need to be hardened. Improving manufacturing operations is a continuous improvement endeavor that includes advanced analytic cycles, and data integrity is a part of that.
Are you gearing up to improve your production quality and processes with machine learning in 2023? Think about the data you’re planning to use. Can you confidently say it’s going to provide a good foundation for the project? Prioritizing data transformation using advanced analytics in alignment with your desired outcomes will make all the difference in optimizing your results.