Data-first AI Strategy for Smart Manufacturing
Three studies on meaningfully shared and scaled data; AI tools and applications; interconnectedness with trust; and scaled decision-making, automation, and autonomy.
- AI with Open and Scaled Data Sharing in the Semiconductor Industry: 2023-2025, with the participation of an industry coalition championed by Seagate Technology
- Options for a National Plan for Smart Manufacturing: 2022-2024 National Academies Consensus Study
- Towards Resilient Manufacturing Ecosystems Through AI: 2020-2022 Industry-Academia-Government Workshop Study
Jim Davis (UCLA Office of Advanced Research Computing; Vice Provost IT (CIO/CTO) and Professor Emeritus; executive oversight, CESMII) highlights the findings of three industry-academia-government studies conducted between 2020 and 2025 on AI in Smart Manufacturing/I4.0 enterprises. These studies provide direction for CESMII’s operational AI and data strategies.
In CESMII terms, these studies address producing consistently processed AI-ready data and implementing AI-enabled applications. The focus is on AI-ready data at scale, which requires:
- Data to be “un-siloed” (the focus of CESMII’s API initiative, i3X)
- Data information models (in CESMII terms, SM Profiles) to be finalized and site-approved for data sources and data qualification
- Data types to be labeled and categorized for persistent interpretation, selection, reuse, validation, application certification, and IP protection, all in the context of the operation from which the data are collected
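As a rough illustration of what labeled, contextualized data could look like in practice, the following Python sketch maps site-specific tag names onto a shared vocabulary with units. All names here (`FieldSpec`, `MachineProfile`, `contextualize`, the tag names, the values, and the site and machine identifiers) are hypothetical and invented for illustration; this is not the SM Profile format or the i3X API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Common name and unit for one measured quantity (hypothetical)."""
    name: str
    unit: str


@dataclass
class MachineProfile:
    """Illustrative stand-in for a data information model; not the SM Profile schema."""
    machine_type: str
    tag_map: dict  # site-specific tag name -> FieldSpec


def contextualize(raw, profile, site, machine_id):
    """Relabel raw readings with common names, units, and operating context."""
    records = []
    for tag, value in raw.items():
        spec = profile.tag_map.get(tag)
        if spec is None:
            continue  # unmapped tags are left out rather than guessed at
        records.append({
            "site": site,
            "machine_id": machine_id,
            "machine_type": profile.machine_type,
            "field": spec.name,
            "unit": spec.unit,
            "value": value,
        })
    return records


# Two sites name the same quantity differently (made-up tags and values).
site_a = MachineProfile("plasma_etcher", {"CH1_PRS": FieldSpec("chamber_pressure", "mTorr")})
site_b = MachineProfile("plasma_etcher", {"press_chamber": FieldSpec("chamber_pressure", "mTorr")})

records = (
    contextualize({"CH1_PRS": 12.5, "DEBUG": 1}, site_a, "site_a", "etch-01")
    + contextualize({"press_chamber": 12.8}, site_b, "site_b", "etch-07")
)

for r in records:
    print(r["site"], r["machine_id"], r["field"], r["value"], r["unit"])
```

After this step, readings from both sites carry the same field name and unit along with their operating context, so they can be pooled or compared without site-specific handling.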
See below for an overview of the three studies. We highlight the third, which benchmarks cross-factory and cross-company technical and business experiences in the semiconductor industry, along with the value points of using scaled data and AI to increase productivity.
The three studies were aimed at developing a comprehensive plan for AI-enabled manufacturing. The 2020-2022 Workshop Study, “Towards Resilient Manufacturing Ecosystems Through AI,” was conducted in response to interest from the National Science Foundation and the National Institute of Standards and Technology. The 2022-2024 National Academies Consensus Study, “Options for a National Plan for Smart Manufacturing,” was supported by an award between the National Academies of Sciences and the Advanced Materials and Manufacturing Technologies Office of the Department of Energy. Both studies recognized real-time operational data at scale as the critical resource. Training is focused on increasing the value of data in AI and on how to trust, validate, and certify applications. Industry access to qualified data and its availability at the right time, place, and condition were critical business and technical barriers.
Also in response to interest from the National Science Foundation and the National Institute of Standards and Technology, the third study, “AI with Open and Scaled Data Sharing in the Semiconductor Industry,” was conducted between 2023 and 2025 with the participation of an industry coalition championed by Seagate Technology. The report is summarized in an introductory form, as an executive summary in the Manufacturing Leadership Council Journal, and as a pre-publication synopsis.
The coalition, together with other data, AI, and operational experts, implemented and tested recommendations from the prior two studies and documented business and technical practices, benefits, and experiences. An agreed-upon ML metrology application for increasing productivity was the benchmark use case. The study focused on producing qualified and trusted data at greater scales and lower cost; improving the quality, usefulness, and value of data available for building AI operational solutions; and ensuring qualified data at greater operational scales. An NSF/NIST organizing committee, consisting of Sthitie Bom (Seagate), Jim Davis (UCLA/CESMII), Said Jahanmir (NIST), Bruce Kramer (NIST, formerly NSF), Don Ufford (NIST when work was done), and Greg Vogl (NIST), facilitated the study.
Key Conclusions
- Machine Learning (ML) models demonstrate the ability to control and manage complex behaviors that cannot be modeled using physics-based methods. When the raw manufacturing data needed for training remains siloed and inconsistent, the resulting ML models are narrowly applicable to the set-up used for training. When data from multiple nominally identical machines are treated consistently, ML model performance can improve for all machines, the data become scalable and reusable, and the aggregate cost of the applications is lowered.
- The qualification and readiness of AI-ready data at scale are most clearly revealed when applications are scaled. This runs counter to early-phase manufacturer readiness, which requires building experience with smaller, initial applications. Small individual applications are not necessarily good indicators of the data used at scale. Collaborating on data is a cost-effective way of generating and scaling data for robustness, scaling applications, and accelerating the maturity process with less risk. A data-first strategy that circumvents the scaling pitfalls of siloed solutions requires a common business priority for scaling data, AI, and application models together, including practices for valuing, generating, and sharing AI-ready manufacturing data for progressively building capability.
- The coalition and study participants investigated the training of ML models with datasets derived from processing wafers on multiple plasma etch machines in multiple factories. The workshop study was motivated by the expectation that training across a broader spectrum of data would produce ML models applicable to broader classes of setups. The report discusses an industry coalition experience co-developed and reviewed by a wide range of manufacturing and data science experts. Data ecosystem benefits, workforce implications, and mitigating risks of data sharing are addressed and benchmarked where possible.
- CESMII’s SM Profile, as a data information model for machines, was a key enabler of persistent and consistent contextualization of data from multiple machines located at multiple sites. It was also a key vehicle for engaging factory floor staff from multiple sites in sharing expertise and establishing consistent data practices.
- The Workshop Study anticipates a future business practice in which manufacturing data is harvested and processed at scale to support a range of ML solutions that are robust, accessible, and affordable for manufacturers. Challenges stem largely from legacy mindsets and business practices that silo the data. Processing data is not yet a business priority and is often not done well or consistently. The Workshop reframed the business focus to be on AI-ready data as a refined asset in Smart Manufacturing and on “data-first” as the strategic roadmap for high-value data ecosystems.
About Smart Manufacturing, Data and AI at Scale
Smart Manufacturing (SM) has always been about the right data at the right time in the right form at the right place throughout the enterprise, all the time. The objective is an operational enterprise that is always sensing, learning, and taking proactive action. SM has always been about sensing, data, and action at scale with AI as the key enabler to:
- Increase operational productivity (energy, material, workforce),
- Ensure and improve product quality and precision,
- Adapt continuously for better, faster, cheaper, safer, and more secure performance, and
- Innovate on operations and product together.
AI is the means for producing and scaling trusted and qualified AI-ready data and validated models for smart predictive, preventive, and proactive solutions from factory floor machine to supply chain. Operational data is the essential resource.
SM has its origins in the real-time operational benefits of cyberinfrastructure and use of factory-scale data. It was incorporated into the national plan in response to the U.S. economic downturn in 2007-2009. Through the National Advanced Manufacturing Partnership 1.0 and 2.0 discussions, Smart Manufacturing, based on Advanced Sensing, Control, and Platforms for Manufacturing (ASCPM), became a national priority leading to the formation of CESMII in 2017. The emphasis was on factories, supply chains, and the technical and business democratization of the industry. Since 2020, national interest in AI, critical materials, energy, and widespread industry adoption has sharply expanded to a factory-and-industry scope.
About CESMII
CESMII – the Smart Manufacturing Institute – has a total current investment commitment of $201M from Department of Energy funding and public/private partnership contributions, with a mandate to create a more competitive manufacturing environment in the U.S. through advanced sensing, analytics, modeling, control, and platforms. CESMII is one of 17 Manufacturing USA institutes on this mission to increase manufacturing productivity, global competitiveness, and reinvestment by increasing energy productivity, improving economic performance, and raising workforce capacity. The University of California, Los Angeles (UCLA) is the program and administrative home of CESMII. For more information about CESMII, its history, and Smart Manufacturing, visit cesmii.org.


