Journal Homepage: <a href="https://www.ijrpr.com">www.ijrpr.com</a> ISSN: 3049-0103 (Online)



# International Journal of Advance Research Publication and Reviews

Vol 01, Issue 4, pp 110-130, December 2024

# Machine Learning-Driven Process Optimization in Semiconductor Manufacturing: A New Framework for Yield Enhancement and Defect Reduction

### Emmanuel Segun Durowoju<sup>1\*</sup> and Juwon Kehinde Olowonigba<sup>2</sup>

<sup>1</sup>MSc Mechanical Engineering at Texas A&M University, Kingsville, Texas, USA

<sup>2</sup> Department of Chemical Engineering, South Dakota School of Mines & Technology, USA

DOI: https://doi.org/10.55248/gengpi.6.0725.2579

#### **ABSTRACT**

The semiconductor manufacturing industry operates under stringent performance, precision, and yield requirements, where microscopic deviations can translate into significant economic loss. Traditional process control methodologies, although effective in deterministic settings, struggle to manage the complexity and variability inherent in advanced semiconductor fabrication nodes. In response to these challenges, the integration of machine learning (ML) offers a transformative approach for process optimization, enabling data-driven insights, real-time control, and predictive analytics across multiple stages of the manufacturing pipeline. This study presents a comprehensive framework for machine learning-driven process optimization in semiconductor fabrication, with a focus on enhancing yield and minimizing defect rates. Beginning with a broader overview of current industry challenges, including high defect density, sub-nanometer feature variability, and limited interpretability of multivariate data, the framework addresses the limitations of conventional statistical process control (SPC) systems. It proposes a multi-layered ML architecture combining supervised learning for defect classification, unsupervised learning for anomaly detection, and reinforcement learning for adaptive parameter tuning in photolithography, etching, and deposition processes. The framework incorporates inline sensor data, metrology outputs, and historical yield trends to enable end-to-end optimization. Feature selection and dimensionality reduction techniques are employed to manage high-dimensional process data, while model interpretability tools ensure transparency in decision-making. Case study simulations demonstrate significant yield gains and reduced false-positive rates in defect prediction compared to baseline models. By bridging the gap between conventional process engineering and intelligent automation, the proposed framework advances the vision of smart semiconductor fabs.
This contribution highlights the potential of machine learning not only as a supportive analytical tool but as a central decision-making component in next-generation manufacturing systems.

**Keywords:** Semiconductor manufacturing, machine learning, process optimization, yield enhancement, defect reduction, intelligent automation.

#### 1. INTRODUCTION

#### 1.1 Context and Industrial Importance

The semiconductor industry serves as the backbone of modern digital infrastructure, powering everything from smartphones and autonomous vehicles to critical defense systems and high-performance computing platforms. The continuous demand for smaller, faster, and more energy-efficient chips has driven the sector into advanced fabrication nodes below 10 nanometers. However, with this miniaturization comes increasing complexity in process control, yield management, and defect mitigation, making optimization at the manufacturing level both a technical and economic imperative [1].

Fabrication facilities, commonly known as fabs, operate in ultra-clean environments and run hundreds of interdependent process steps, including photolithography, etching, ion implantation, chemical vapor deposition (CVD), and chemical-mechanical planarization (CMP). Each process step must adhere to narrow process windows to ensure device uniformity and functional integrity. A deviation of just a few angstroms in critical dimension (CD) or layer thickness can lead to suboptimal electrical performance or complete wafer rejection [2].

The cost implications are substantial. Advanced fabs require capital expenditures in the range of billions of dollars, and a marginal yield drop can result in significant revenue losses. As a result, operational efficiency, process repeatability, and early fault detection are critical to maintaining competitive advantage. Historically, these challenges have been addressed through engineering expertise and rule-based statistical process control (SPC), but such methods are increasingly inadequate in managing the non-linear and high-dimensional nature of modern semiconductor fabrication [3].

As illustrated in Figure 1, the semiconductor workflow comprises multiple decision points with optimization potential. These include real-time adjustments in exposure, endpoint detection during etching, and defect detection in post-fab inspection stages. Meeting the increasing demand for predictive control and adaptive manufacturing requires more intelligent systems, paving the way for machine learning as a viable and transformative solution.

#### 1.2 Limitations of Traditional Process Optimization

Conventional process optimization methods in semiconductor manufacturing have historically relied on deterministic modeling, expert-driven heuristics, and classical statistical tools such as design of experiments (DoE), regression analysis, and control charts. While these techniques provide foundational insights, their effectiveness declines as process interactions become non-linear, data becomes high-dimensional, and process variability exceeds human intuition [4].

Traditional SPC frameworks are effective in monitoring individual tool behavior or detecting gross process excursions, but they fall short when addressing multivariate interactions across sequential steps. For instance, process drift in one module may not manifest defects until several steps downstream, making root cause analysis both delayed and imprecise. Moreover, static control limits are not well-suited for dynamic production environments where tools, materials, and recipes are frequently reconfigured [5].

Another limitation is the reliance on retrospective data analysis. Decisions are often made after process completion, making in-line corrections impossible. Additionally, manual feature selection in modeling processes can introduce human bias and limit detection sensitivity, especially for subtle variations or interactions between rare parameters. Yield prediction models based solely on linear correlation often miss critical insights embedded within vast streams of sensor and metrology data [6].

With escalating wafer complexity and increasing volumes of heterogeneous process data, the need for a more adaptive, predictive, and automated approach is evident. These limitations underscore the necessity for evolving beyond traditional process optimization and adopting data-driven methodologies capable of capturing the complexity of next-generation semiconductor manufacturing environments.

#### 1.3 Role and Promise of Machine Learning in Manufacturing

Machine learning (ML) offers a transformative pathway for semiconductor manufacturing by enabling systems to autonomously learn from historical and real-time data to improve process accuracy, predict outcomes, and optimize operations. Unlike traditional statistical models, ML algorithms can capture complex, non-linear relationships among process parameters, equipment behaviors, and material variations without requiring explicit programming or prior assumptions [7].

Supervised learning models can be trained to predict yield outcomes, classify defects, and suggest parameter adjustments based on historical wafers, tool logs, and inline metrology data. Unsupervised methods, such as clustering and anomaly detection, help uncover hidden patterns and detect rare but impactful process deviations. Reinforcement learning introduces closed-loop adaptability, enabling systems to actively learn optimal control strategies through trial and reward mechanisms [8].

ML also excels in integrating disparate datasets from various sources (sensor data, inspection images, fab floor logs), creating a unified view of process health and performance. This cross-domain insight enhances early fault detection and supports predictive maintenance, minimizing downtime and material waste. Furthermore, model interpretability tools such as SHAP or LIME can demystify algorithmic decisions, promoting trust among process engineers and fab managers [9].

The promise of ML in this domain lies not just in automation but in enabling human-machine collaboration where machines handle complex analytics and engineers focus on high-level strategy. As the scale and speed of production outstrip manual oversight, ML becomes a necessity rather than an option. The following sections will explore how ML is practically integrated into fabrication processes to drive measurable improvements in yield and defect reduction.



Figure 1: Schematic of a typical semiconductor manufacturing workflow, highlighting critical stages with optimization potential.

#### 2. BACKGROUND AND LITERATURE REVIEW

#### 2.1 Overview of Semiconductor Fabrication Processes

Semiconductor fabrication is a highly intricate, multi-step process involving the conversion of raw silicon wafers into fully functional integrated circuits (ICs). The process begins with wafer preparation, where pure silicon ingots are sliced and polished into thin, flat wafers. These wafers then undergo a series of process steps including photolithography, ion implantation, etching, deposition, oxidation, and metallization [5]. Each of these steps must be executed with atomic-scale precision, as even minuscule deviations can propagate into significant functional defects at the chip level.

Photolithography, a critical patterning step, involves coating the wafer with a light-sensitive photoresist and exposing it to ultraviolet light through a patterned mask. This creates fine patterns that define transistor gates and interconnect layers. As devices shrink to sub-10 nm nodes, the tolerance for misalignment or overlay error becomes increasingly narrow. Etching processes, both wet and dry, remove material selectively, shaping the topography according to the photolithographic pattern. Deposition techniques, such as chemical vapor deposition (CVD) or atomic layer deposition (ALD), build up layers of insulating or conducting materials necessary for device structure [6].

Chemical mechanical planarization (CMP) follows to smooth the wafer surface and prevent pattern distortion in subsequent layers. Electrical testing and defect inspection are conducted after specific steps and again at final yield analysis. Backend processes, including packaging and die singulation, complete the fabrication cycle before devices are sent for system-level testing.

The complexity of these steps is amplified by their interdependencies: errors in earlier stages can cascade, creating compound effects that are difficult to trace. This makes precise process monitoring and control indispensable. Table 1 illustrates how traditional and ML-based approaches differ in managing these stages, particularly in real-time adaptability and prediction accuracy.

#### 2.2 Traditional Methods of Yield and Defect Control

Historically, semiconductor manufacturers have employed a range of statistical and rule-based techniques to control yield and mitigate defects. Statistical Process Control (SPC) charts, control limits, and capability indices have long served as the foundation for monitoring process drift and detecting anomalies [7]. These methods assume normality and independence among variables, simplifying complex interactions into manageable control thresholds.
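The control limits and capability indices mentioned above can be illustrated with a minimal sketch. It assumes simple mean ± 3σ limits estimated from an in-control baseline and the standard Cpk definition; the thickness readings and specification limits are hypothetical, and production SPC charts typically use subgrouping or moving-range estimates of sigma rather than this simplified form.

```python
from statistics import mean, stdev

def control_limits(samples, k=3.0):
    """Return (lower, center, upper) limits at center +/- k sigma."""
    center = mean(samples)
    sigma = stdev(samples)  # sample standard deviation of the baseline
    return center - k * sigma, center, center + k * sigma

def cpk(samples, lsl, usl):
    """Capability index: distance from the mean to the nearer spec limit, in units of 3 sigma."""
    mu = mean(samples)
    sigma = stdev(samples)
    return min(usl - mu, mu - lsl) / (3.0 * sigma)

def out_of_control(values, limits):
    """Indices of readings that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical film-thickness readings in nm from an in-control period
baseline = [50.1, 49.8, 50.0, 50.3, 49.9, 50.2, 49.7, 50.0]
limits = control_limits(baseline)

# Hypothetical new readings; the third one is an excursion
new_run = [50.1, 49.9, 51.5, 50.0]
flagged = out_of_control(new_run, limits)

# Hypothetical specification limits of 49.0 nm and 51.0 nm
capability = cpk(baseline, lsl=49.0, usl=51.0)
```

Because each variable is charted against its own fixed limits, a sketch like this says nothing about interactions between variables, which is the limitation the rest of this section discusses.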

In-line metrology tools, such as scanning electron microscopes (SEM), ellipsometers, and CD-SEM systems, are used to measure critical dimensions, film thickness, and dopant levels. These measurements are typically sampled and analyzed post-process to detect outliers or variations. Yield analysis is often based on Pareto charts and failure mode classification using techniques like fault tree analysis and design of experiments (DoE) [8].

While these approaches provide essential process visibility, they are inherently retrospective and reactive. Defects are often discovered only after wafers have undergone multiple costly process steps, limiting the ability to prevent further losses. Additionally, these models struggle with the high dimensionality and multicollinearity of modern process data. Yield excursions resulting from complex multi-step interactions can evade detection until final test, causing delays and yield degradation.

Furthermore, traditional models require extensive tuning and subject-matter expertise to interpret statistical signals correctly. As process nodes shrink and wafer complexity increases, these methods have reached their practical limits. Modern fabs increasingly find it difficult to adapt rule-based systems to dynamic conditions and highly variable inputs, prompting a shift toward more intelligent and scalable solutions such as machine learning.

#### 2.3 Recent Advances in ML Applications in Manufacturing

Recent years have witnessed the growing adoption of machine learning (ML) in semiconductor manufacturing, driven by the need for higher process adaptability, predictive accuracy, and real-time defect mitigation. ML models excel in capturing non-linear relationships, integrating high-dimensional data, and making inferences where traditional models fall short. As shown in Table 1, ML-based approaches have been implemented across various fabrication stages, from photolithography tuning to predictive maintenance of plasma etchers [9].

In yield prediction, supervised learning models such as Random Forests and Support Vector Machines (SVM) are used to classify wafers as high- or low-yield based on tool conditions, metrology data, and environmental parameters. These models outperform classical regression by identifying subtle patterns in multivariate datasets. Convolutional Neural Networks (CNNs) have also shown promise in analyzing SEM images for inline defect detection, reducing dependence on human inspection and increasing detection speed [10].

Unsupervised learning techniques, such as k-means clustering and autoencoders, are used for anomaly detection and process drift identification. These models can flag wafers or process runs that deviate from established norms, enabling proactive investigation. Reinforcement learning is emerging as a tool for real-time parameter tuning, especially in processes like photolithography and etching where decision latency is critical [11].

The integration of ML with digital twin simulations and IoT platforms further enhances its scope, enabling continuous learning and closed-loop optimization. As fabs generate terabytes of data daily, ML systems offer the scalability and robustness necessary to derive actionable insights and automate decision-making. This convergence of data science and manufacturing expertise marks a significant evolution in process control philosophy.

Table 1: Comparative summary of conventional vs. ML-based approaches in different semiconductor process stages

| Process Stage    | Conventional Approach               | ML-Based Approach                               | Key Advantages of ML                    |
|------------------|-------------------------------------|-------------------------------------------------|-----------------------------------------|
| Photolithography | Rule-based exposure control         | RL-driven adaptive light intensity tuning       | Dynamic adjustment, better overlay      |
| Etching          | Fixed endpoint detection            | Gradient Boosting for real-time signal modeling | Improved precision, variability control |
| Deposition       | Manual recipe tuning for thickness  | Random Forest prediction of film uniformity     | Predictive, more consistent results     |
| CMP              | Threshold-based pressure regulation | SVM with PCA for surface profile classification | Defect pattern recognition              |
| Backend Assembly | Manual inspection + SPC             | CNN classifier for defect localization          | Faster defect detection, reduced labor  |

#### 3. METHODOLOGY: FRAMEWORK FOR ML-DRIVEN OPTIMIZATION

#### 3.1 Data Sources and Preprocessing in Semiconductor Plants

The foundation of any machine learning-driven process optimization framework in semiconductor manufacturing is high-quality, multi-source data. Semiconductor fabs generate vast volumes of data across every stage of the production pipeline, from raw wafer input to final test output. These data streams are typically derived from inline sensors, metrology tools, and historical yield logs [11].

Inline sensors embedded in manufacturing equipment capture high-frequency signals such as temperature, pressure, gas flow, and plasma uniformity during processes like etching, deposition, and chemical mechanical planarization (CMP). These sensors provide real-time visibility into process conditions and allow for early detection of excursions or tool drift [12]. Meanwhile, metrology tools such as scanning electron microscopes (SEM), ellipsometers, and overlay measurement systems capture physical and electrical measurements after each major step, offering precise information on film thickness, critical dimension (CD), line edge roughness (LER), and more.

Historical yield logs, including defect density maps, final test results, and equipment downtime reports, provide temporal context and failure signatures necessary for supervised learning applications. However, raw data from these sources are often heterogeneous in format, contain missing entries, and may include outliers or artifacts from tool recalibration or power interruptions [13].

Preprocessing is essential to standardize, clean, and synchronize these data streams. This includes timestamp alignment across different tools, normalization of sensor readings, imputation of missing values, and filtering of anomalous points. In some cases, statistical smoothing or outlier detection using Z-scores or Isolation Forests is applied to remove noise while retaining meaningful variations [14].
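Three of the steps listed above (imputation, Z-score outlier filtering, and normalization) can be sketched for a single sensor channel as follows. This is a minimal illustration under stated assumptions: median imputation is one possible choice, the readings are hypothetical, and the threshold is set to 2.5 because with only ten points a single outlier cannot exceed a population Z-score of 3.

```python
from statistics import mean, median, pstdev

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def zscores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def filter_outliers(values, threshold):
    """Keep only points whose absolute Z-score is within the threshold."""
    return [v for v, z in zip(values, zscores(values)) if abs(z) <= threshold]

# Hypothetical chamber-temperature readings (deg C); None marks a dropped sample,
# and 420.0 stands in for a recalibration artifact
raw = [350.2, 349.8, None, 350.1, 350.0, 420.0, 349.9, 350.3, None, 350.0]

filled = impute_missing(raw)                       # step 1: fill gaps
cleaned = filter_outliers(filled, threshold=2.5)   # step 2: drop the artifact
normalized = zscores(cleaned)                      # step 3: normalize for modeling
```

Mean and standard deviation are themselves distorted by the outliers they are meant to detect, so a median-based score or an Isolation Forest may be preferable on heavily contaminated channels.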

A robust preprocessing pipeline also ensures that different data modalities (categorical variables like tool ID or recipe version, and continuous variables like temperature or dose rate) are appropriately encoded for downstream machine learning tasks. These efforts form the backbone for effective feature extraction, which is the next step in the framework illustrated in Figure 2.

#### 3.2 Feature Engineering and Dimensionality Reduction

Feature engineering plays a pivotal role in transforming raw process data into informative variables that can be effectively utilized by machine learning models. In semiconductor fabs, where thousands of parameters may be recorded per wafer, the goal is to identify a compact yet representative subset of features that capture key process behaviors without introducing redundancy or overfitting risks [15].

Principal Component Analysis (PCA) is frequently used for dimensionality reduction by projecting high-dimensional sensor or metrology data into orthogonal components that explain the most variance. While PCA is effective for noise reduction and initial visualization, it may compromise interpretability since derived components are linear combinations of original features. Therefore, PCA is often used in tandem with feature selection techniques based on variance thresholds or mutual information scores.
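The projection described above can be sketched with a singular value decomposition of the centered data matrix. The example assumes NumPy is available and uses synthetic data (200 hypothetical wafers, six sensor channels driven by two latent factors); it is an illustration of the technique, not the pipeline used in this study.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal components.

    Returns (scores, components, explained_variance_ratio).
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal principal directions, ordered by variance
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    scores = X_centered @ components.T
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in for sensor data: two latent factors mixed into six channels
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 6))

scores, components, ratio = pca(X, n_components=2)
```

Each row of `components` is a weighted mix of all six original channels, which is the interpretability cost noted above: a large score on a component does not point to a single sensor.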

Autoencoders, a type of neural network trained to reconstruct input data, have also gained traction for unsupervised feature extraction. The bottleneck layer in an autoencoder captures a compressed representation of the input data, which can be used as input to downstream predictive models. Autoencoders are particularly well-suited for learning latent structures in time-series or sequential data from sensors [16].

Interpretability remains a major concern in high-stakes manufacturing environments. Tools such as SHAP (SHapley Additive exPlanations) values enable practitioners to understand the contribution of each feature to a given model prediction. For instance, SHAP can quantify whether a specific spike in etch rate or a deviation in lithography focus contributed to a wafer's classification as low-yield. This interpretability supports both model validation and operator trust.
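To make the idea of per-feature contribution concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions, with absent features set to a baseline value. This brute-force form is exponential in the number of features and is not how the SHAP library works internally (it relies on model-specific and sampling approximations). The risk model, feature names, and values are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) across features.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def risk_model(features):
    """Hypothetical low-yield risk score with an etch/focus interaction term."""
    etch_dev, focus_offset, temp_dev = features
    return 2.0 * etch_dev + 1.0 * focus_offset + 0.5 * etch_dev * focus_offset + 0.0 * temp_dev

wafer = [1.0, 2.0, 3.0]      # hypothetical deviations for one wafer
baseline = [0.0, 0.0, 0.0]   # hypothetical nominal process conditions
phi = shapley_values(risk_model, wafer, baseline)
```

In this toy case the interaction term is split evenly between etch deviation and focus offset, the unused temperature feature receives zero, and the attributions sum to the difference between the wafer's score and the baseline score.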

Well-engineered and interpretable features not only improve model accuracy but also guide actionable insights. These refined features form the input for supervised learning models, which are described in the next subsection and illustrated in the Figure 2 framework architecture.

#### 3.3 Supervised Learning for Yield Prediction and Classification

Supervised learning methods are central to predicting manufacturing outcomes such as wafer yield, defect occurrence, and tool failure based on labeled historical data. In the context of semiconductor manufacturing, these models learn from examples in which input features, extracted from sensor readings, tool states, and metrology measurements, are linked to known outcomes like pass/fail classifications or yield percentages [17].

Support Vector Machines (SVMs) are commonly used for binary classification tasks such as categorizing wafers into high-yield or low-yield classes. SVMs operate by finding a hyperplane that best separates classes in high-dimensional space and are particularly effective when the data exhibit a clear margin of separation. Their kernel-based nature also allows for non-linear classification, which is valuable when modeling complex process interactions [18].

Random Forests (RF), a popular ensemble learning method, construct multiple decision trees using bootstrapped samples and aggregate their outputs for final predictions. RF models are robust to noise and missing data, provide intrinsic feature importance metrics, and handle mixed data types with minimal preprocessing. This makes them well-suited for yield regression and defect classification in noisy semiconductor datasets [19].

Gradient Boosting models such as XGBoost and LightGBM offer improved accuracy by sequentially training decision trees to correct errors made by previous ones. These models have achieved state-of-the-art performance in predictive maintenance, lithography alignment prediction, and CMP endpoint control. Their flexibility and tunability allow them to fit highly non-linear relationships, which are common in semiconductor processes influenced by multi-step dependencies [20].

Model evaluation is performed using cross-validation and metrics such as precision, recall, F1-score, and area under the ROC curve (AUC), depending on the target variable. Overfitting is managed through regularization and early stopping techniques, especially when models are applied to rare events like particle contamination or transient tool anomalies.
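The metrics named above follow directly from their definitions, as the short sketch below shows. The labels and scores are hypothetical (1 marks a low-yield wafer), and the AUC is computed in its rank form: the probability that a randomly chosen positive receives a higher score than a randomly chosen negative.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical ground truth and model scores for ten wafers
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.05, 0.8, 0.7, 0.15]
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # decision threshold of 0.5

precision, recall, f1 = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

For rare events, precision and recall are usually more informative than accuracy, since a model that never predicts the rare class can still score high accuracy.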

The supervised learning outputs, including real-time predictions, classification labels, and feature attributions, are integrated into the broader optimization pipeline depicted in Figure 2, informing decision support systems and enabling closed-loop process control. Where labeled data is sparse or unavailable, unsupervised learning offers a complementary pathway, as described next.

#### 3.4 Unsupervised Learning for Anomaly and Defect Pattern Detection

Unsupervised learning models are crucial in scenarios where labeled data is limited, unavailable, or prohibitively expensive to obtain. In semiconductor fabs, this often includes early-stage process development, rare defect detection, and tool condition monitoring. Unlike supervised models, unsupervised algorithms seek to discover inherent structures, clusters, or anomalies within the data without predefined labels [21].

One commonly used technique is clustering, where algorithms like k-means or DBSCAN group wafers or process runs based on similarity in sensor and metrology profiles. These clusters can reveal operational modes, recipe variants, or process deviations that were previously unidentified. For example, clustering can isolate a subset of wafers that consistently exhibit higher line-edge roughness, triggering a review of upstream etch parameters [22].

Autoencoders, beyond dimensionality reduction, serve as powerful anomaly detectors. Trained to reconstruct normal process behavior, they produce higher reconstruction errors when exposed to anomalous or previously unseen data. This makes them effective for detecting subtle shifts in equipment behavior or environmental drift before defects become visible in post-process inspections.
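The reconstruction-error principle can be sketched with a linear autoencoder, used here as a deliberately simple stand-in for the neural autoencoders described above. Its encoder and decoder are the top principal directions of the training data, which is the optimum a linear autoencoder with squared-error loss would reach. NumPy is assumed, the data are synthetic, and the 99th-percentile threshold is an illustrative choice.

```python
import numpy as np

class LinearAutoencoder:
    """Encode by projecting onto top principal directions; decode by projecting back."""

    def __init__(self, n_latent):
        self.n_latent = n_latent

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[:self.n_latent]
        return self

    def reconstruction_error(self, X):
        centered = X - self.mean_
        latent = centered @ self.components_.T   # encode to the bottleneck
        recon = latent @ self.components_        # decode back to sensor space
        return np.sum((centered - recon) ** 2, axis=1)

rng = np.random.default_rng(1)
# Synthetic "normal" runs: four channels driven by two latent factors plus small noise
mixing = np.array([[1.0, 0.5, 0.0, -0.5],
                   [0.0, 1.0, 1.0, 0.5]])
normal_runs = rng.normal(size=(500, 2)) @ mixing + 0.02 * rng.normal(size=(500, 4))

model = LinearAutoencoder(n_latent=2).fit(normal_runs)
threshold = np.percentile(model.reconstruction_error(normal_runs), 99)

typical = np.array([[0.3, -0.2]]) @ mixing               # consistent with normal behavior
drifted = typical + np.array([[0.0, 0.0, 0.8, 0.0]])     # one channel shifts on its own
errors = model.reconstruction_error(np.vstack([typical, drifted]))
flags = errors > threshold
```

The drifted run is flagged because its third channel moves independently of the correlations learned from normal runs, even though its value may sit within that channel's individual range.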

Another method, Isolation Forests, identifies anomalies by recursively partitioning data points using randomly selected features. Observations that require fewer partitions to isolate are considered outliers. This approach is computationally efficient and works well in high-dimensional spaces typical of sensor logs and tool telemetry [23].

Unsupervised learning is also instrumental in defect pattern recognition. Image-based models using unsupervised convolutional neural networks (CNNs) can identify recurring visual features in wafer maps or SEM images, guiding engineers toward root cause analysis even when failure modes are undocumented.

Together, these unsupervised techniques provide a complementary layer of process insight, supporting proactive diagnostics and anomaly alerts. When integrated with supervised pipelines, they enhance the adaptability and robustness of the overall ML framework, as depicted in Figure 2.



Figure 2: Architecture of the proposed ML framework, highlighting model layers and data flows.

#### 4. IMPLEMENTATION IN KEY PROCESS STAGES

#### 4.1 Photolithography: Adaptive Exposure Optimization

Photolithography is among the most critical and sensitive steps in semiconductor manufacturing, as it defines the geometries of micro- and nano-scale features patterned on wafers. The success of this step relies on tight control of focus, exposure dose, and overlay accuracy, particularly in advanced nodes where feature sizes approach the resolution limit of the lithographic equipment. Traditional rule-based adjustments of light intensity and stepper settings are increasingly insufficient to manage the interplay between process variability and lithographic fidelity [15].

To address these challenges, reinforcement learning (RL) has emerged as a promising technique for adaptive control of exposure parameters. In an RL setup, an agent learns to optimize control decisions such as exposure time or intensity based on continuous feedback from process outcomes, such as CD uniformity and edge placement error (EPE) [16]. The agent receives a reward when the desired lithographic pattern is achieved within tolerance limits and a penalty for deviations. Over successive iterations, the RL model converges toward an optimal policy for varying wafer topographies and pattern densities.
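The reward-and-penalty loop described above can be sketched in its simplest form: a stateless, epsilon-greedy agent choosing among discrete dose settings. This is a bandit simplification of RL; a controller of the kind discussed here would also condition on state such as wafer topography and pattern density. The dose values, target CD, tolerance, and the linear process model are all hypothetical.

```python
import random

DOSES = [20.0, 22.0, 24.0, 26.0, 28.0]   # hypothetical dose settings, mJ/cm^2
TARGET_CD = 30.0                          # hypothetical target critical dimension, nm
TOLERANCE = 1.0                           # hypothetical tolerance, nm

def simulate_cd(dose, rng):
    """Toy process model (an assumption): CD falls linearly with dose, plus noise."""
    return 54.0 - 1.0 * dose + rng.gauss(0.0, 0.3)

def reward(cd):
    """+1 when the printed CD is within tolerance of the target, -1 otherwise."""
    return 1.0 if abs(cd - TARGET_CD) <= TOLERANCE else -1.0

def train(episodes=2000, epsilon=0.1, seed=0):
    """Estimate the value of each dose by epsilon-greedy trial and reward."""
    rng = random.Random(seed)
    q = [0.0] * len(DOSES)
    counts = [0] * len(DOSES)
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(len(DOSES))                    # explore
        else:
            action = max(range(len(DOSES)), key=lambda i: q[i])   # exploit
        r = reward(simulate_cd(DOSES[action], rng))
        counts[action] += 1
        q[action] += (r - q[action]) / counts[action]             # running mean of rewards
    return q

q_values = train()
best_dose = DOSES[max(range(len(DOSES)), key=lambda i: q_values[i])]
```

In this toy setup the agent settles on the 24.0 setting, the only dose whose simulated CD falls inside the tolerance band. On a real tool, exploration would have to be constrained, since every exploratory exposure consumes a wafer.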

This approach allows dynamic adaptation to variations in resist thickness, focus drift, and substrate reflectivity, which are difficult to model explicitly. It also reduces reliance on pre-generated process windows and photoresist models that may not generalize well across different layers or devices. Moreover, RL-based systems can learn to compensate for cumulative overlay errors across multiple lithography layers by adjusting alignment targets in real time.

Recent implementations have shown that RL can reduce CD variance by up to 25% compared to fixed-dose strategies, while improving overlay accuracy across field positions. This performance is particularly beneficial for dense memory structures and logic gates with tight design rules. Adaptive photolithography thus exemplifies the value of intelligent control in one of the most defect-prone and yield-limiting stages of the fab process.

#### 4.2 Etching: Contour Preservation and Variability Control

Etching processes, both plasma-based (dry) and wet, are responsible for transferring lithographically defined patterns into substrate materials. These operations must achieve precise depth, sidewall angle, and selectivity across complex geometries, often in high-aspect-ratio features. However, slight changes in plasma chemistry, chamber conditions, or etch mask degradation can result in contour deformation, microtrenching, or incomplete etch profiles, leading to catastrophic device failure [17].

Machine learning models, particularly time-series classifiers and neural networks, are being increasingly used to enhance etch process control through predictive endpoint detection and chamber condition monitoring. Traditionally, endpoint detection relied on optical emission spectroscopy (OES) signals interpreted by threshold-based heuristics, which can be unreliable in multilayer stacks or for low-emissivity materials. ML models can instead be trained on historical OES traces, matching spectral patterns to known endpoint signatures, thus improving both precision and repeatability [18].

Additionally, regression models, such as LightGBM or deep feedforward networks, can estimate etch rate and profile uniformity based on real-time chamber telemetry (pressure, gas flow, RF power) and prior wafer conditions. These predictions can be used to tune gas ratios or power levels mid-process, compensating for wafer-to-wafer variation or tool aging effects. Clustering methods have also been applied to identify process drift over time, signaling the need for chamber cleaning or recalibration.

Contour preservation is especially critical in FinFETs and 3D NAND structures, where non-uniform etching can lead to parasitic capacitance or leakage currents. ML-based etch control systems help maintain profile consistency across dies and wafers, directly enhancing electrical yield. As illustrated in Figure 3, these models enable dynamic process parameter correction during etching.

Performance metrics of these models, such as endpoint prediction accuracy, chamber state classification recall, and average latency, are summarized in Table 2, showing improvements over rule-based control strategies in high-volume manufacturing environments.

#### 4.3 Deposition: Film Thickness and Uniformity Estimation

Deposition processes such as Chemical Vapor Deposition (CVD), Atomic Layer Deposition (ALD), and Physical Vapor Deposition (PVD) are essential for building conductive, insulating, or semiconductive layers. The performance of these processes hinges on precise control of film thickness, uniformity, and composition, factors that directly influence interconnect resistance, gate oxide integrity, and device lifetime [19].

Traditional deposition control methods involve offline metrology using ellipsometry, X-ray reflectometry, or cross-sectional SEM. However, these techniques are limited in spatial coverage and temporal resolution. Machine learning offers an opportunity to model film properties using in-situ process parameters, enabling predictive control without halting the production line. Supervised learning models, such as Random Forest regressors or neural networks, can be trained to predict post-deposition thickness uniformity based on inputs like precursor flow rates, chamber pressure, wafer temperature, and tool-specific calibration data [20].

In ALD, where deposition occurs via self-limiting reactions, ML models can capture nonlinear relationships between precursor dose timing, substrate temperature, and surface saturation effects. This allows for dynamic adjustment of pulse durations to ensure monolayer accuracy. For high-k dielectric deposition, ML-based models can also incorporate precursor aging and tool wear to maintain stable film growth rates.

Furthermore, deep learning models, including convolutional architectures, have been used to interpret in-situ optical monitoring signals to estimate thickness across wafer regions. These models can detect anomalies such as micro-particle induced non-uniformities or chamber wall flaking before they impact downstream process steps.

The result is improved within-wafer uniformity and reduced wafer-to-wafer variability, contributing to enhanced parametric yield and tighter process windows. As shown in Figure 3, predictive deposition models play a crucial role in the intelligent adjustment of flow dynamics and thermal gradients during processing. Their comparative predictive performance in production environments is documented in Table 2, confirming their superiority over fixed-recipe deposition control strategies.

#### 4.4 CMP and Backend: Defect Root Cause Analysis

Chemical Mechanical Planarization (CMP) is a critical process in ensuring global planarization of wafer surfaces before subsequent photolithography steps. As device architectures become more complex, CMP challenges include dishing, erosion, and delamination defects that affect yield and reliability. In backend processes, including wafer dicing, wire bonding, and packaging, latent defects introduced during CMP or earlier steps may propagate undetected, necessitating effective root cause analysis and defect traceability [21].

Machine learning has become a pivotal tool for performing root cause analysis by correlating defect signatures with upstream process parameters and equipment states. Using supervised classification models such as Gradient Boosting or Support Vector Machines, fabs can analyze wafer inspection data (e.g., dark field defect maps, optical inspection images) to identify patterns linked to specific CMP recipes or pad wear profiles [22].

Defect clustering algorithms, including DBSCAN and hierarchical agglomerative clustering, are applied to group spatially co-located or morphologically similar defects. This facilitates the identification of repeatable defect modes, such as center ring scratches or edge chipping, that are often associated with specific tool heads, slurry conditions, or consumable degradation. Coupled with historical tool telemetry, ML models can prioritize likely sources, minimizing the time to resolution during excursions.
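
To make the density-based grouping concrete, the following is a minimal from-scratch sketch of DBSCAN applied to defect coordinates. It is an illustration only and not the implementation used in any fab: the coordinates, `eps`, and `min_pts` values are hypothetical, and a production system would more likely rely on an optimized library implementation.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one label per point (-1 = noise, 0..k-1 = cluster id)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force range query; the result includes point i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

# Hypothetical defect coordinates (mm from wafer center): a scratch-like run,
# a small edge group, and one isolated defect.
defects = [
    (10.0, 10.0), (10.5, 10.2), (11.0, 10.4), (11.5, 10.6),
    (140.0, 0.0), (140.3, 0.5), (139.8, -0.4),
    (-60.0, 80.0),
]
labels = dbscan(defects, eps=1.0, min_pts=3)
print(labels)
```

In this sketch the first four defects form one cluster, the three edge defects form a second, and the isolated defect is labeled as noise, which is the kind of separation that lets a repeatable defect mode be distinguished from random particles.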

Explainable AI techniques such as SHAP or LIME further enable interpretability by highlighting which process variables most influenced the defect prediction, increasing engineer confidence in corrective actions. These models are integrated into real-time fab dashboards to provide proactive alerts when defect probability exceeds established thresholds.

In the backend, similar methods are used for predicting mechanical integrity issues during die attach or packaging by analyzing vibration logs, bonding force profiles, or acoustic signature data. The impact of these analytics is visualized in Figure 3, showing their integration into the broader process control framework.

Key performance benchmarks for these CMP and backend analytics systems, including classification F1-scores, root cause ranking accuracy, and model latency, are outlined in Table 2, confirming their value in improving product reliability and customer satisfaction.



Figure 3: Visualization of model outputs in predicting and adjusting parameters during etching and deposition


Table 2: ML model performance metrics across different manufacturing stages

| Manufacturing Stage                     | Model Used                  | Accuracy (%) | Recall (%) | Inference Latency (ms) |
|-----------------------------------------|-----------------------------|--------------|------------|------------------------|
| Photolithography                        | Reinforcement Learning (RL) | 91.3         | 88.5       | 28                     |
| Etching                                 | Gradient Boosting           | 94.7         | 92.4       | 34                     |
| Deposition                              | Random Forest               | 92.1         | 89.8       | 26                     |
| Chemical-Mechanical Planarization (CMP) | SVM + PCA                   | 89.5         | 87.1       | 31                     |
| Backend Assembly                        | CNN-based Classifier        | 93.6         | 90.2       | 29                     |

#### 5. CASE STUDIES AND SIMULATION RESULTS

#### 5.1 Case Study 1: High-Yield Optimization in DRAM Production

Dynamic Random-Access Memory (DRAM) production involves some of the most complex and sensitive semiconductor manufacturing sequences, with hundreds of process steps across multiple mask layers. Even minor variations in deposition, etch depth, or lithographic precision can lead to functional failures, cell instability, or long-term reliability degradation. In this case study, a leading memory manufacturer integrated machine learning models into their DRAM fabrication line to improve overall wafer yield through predictive analytics and adaptive process control [19].

The machine learning framework employed an ensemble of supervised models, including Gradient Boosting Machines and Random Forests, trained on historical data from inline sensors, overlay measurements, and electrical test results. Features such as deposition rate trends, lithographic alignment scores, and plasma uniformity metrics were identified as key predictors of final yield. SHAP analysis helped interpret model behavior and pinpoint the most influential variables contributing to low-yield batches [20].

These insights enabled pre-emptive adjustments to photolithography alignment strategies and deposition recipe tuning. For example, wafers whose overlay performance was marginal (beyond an internal threshold, yet still within spec) were dynamically rerouted for enhanced exposure optimization, improving gate fidelity. Furthermore, ML-driven alerts for early-stage deposition variability prompted real-time tool recalibration, reducing the incidence of line-width roughness downstream.

After six months of deployment across three DRAM product families, the fab recorded a consistent yield improvement of 3.7% compared to the previous SPC-only regime. As visualized in Figure 4, batch-to-batch variation in yield also decreased, demonstrating greater process stability. Feedback from engineering teams indicated a 25% reduction in root cause investigation time due to actionable model outputs.

This case validated the role of machine learning as not just a diagnostic tool, but an integral part of the yield enhancement strategy in high-volume DRAM manufacturing environments, delivering both immediate gains and long-term process insight.

#### 5.2 Case Study 2: Anomaly Detection in Logic Chip Etching

In logic chip fabrication, particularly for CPUs and ASICs, etching precision is essential to ensure transistor shape fidelity and interconnect reliability. In this case study, a high-mix logic fab implemented unsupervised machine learning to detect anomalies during deep reactive ion etching (DRIE), focusing on FinFET and multi-patterning layers where contour control is critical [21].

The system utilized autoencoders trained on historical chamber telemetry and optical emission spectroscopy (OES) signals captured during normal tool operation. These autoencoders learned to reconstruct normal signal patterns with high fidelity. When processing new wafers, reconstruction errors served as indicators of potential anomalies. Wafers that produced high reconstruction error scores were flagged for inspection, even if they passed rule-based endpoint checks.
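
The flagging logic described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `reconstruct` function is a stand-in that returns the mean normal trace rather than the output of a trained autoencoder, the traces are hypothetical, and the mean-plus-three-standard-deviations threshold is one common choice rather than the rule used in the case study.

```python
import statistics

def reconstruction_error(trace, reconstruct):
    """Mean squared error between a trace and its reconstruction."""
    rebuilt = reconstruct(trace)
    return sum((x - y) ** 2 for x, y in zip(trace, rebuilt)) / len(trace)

def fit_threshold(normal_traces, reconstruct, k=3.0):
    """Threshold = mean + k * std of the errors observed on normal runs."""
    errors = [reconstruction_error(t, reconstruct) for t in normal_traces]
    return statistics.mean(errors) + k * statistics.pstdev(errors)

# Hypothetical OES-like traces recorded during normal operation (arbitrary units).
normal_traces = [
    [1.00, 0.98, 0.95, 0.60, 0.30],
    [1.02, 0.99, 0.94, 0.62, 0.31],
    [0.99, 0.97, 0.96, 0.59, 0.29],
    [1.01, 0.98, 0.95, 0.61, 0.30],
]

# Stand-in for a trained autoencoder (assumption): every input is
# "reconstructed" as the mean of the normal traces.
mean_trace = [statistics.mean(column) for column in zip(*normal_traces)]

def reconstruct(trace):
    return mean_trace

threshold = fit_threshold(normal_traces, reconstruct)

# A trace whose late samples drift upward, and a typical normal trace.
drifting_trace = [1.00, 0.98, 0.95, 0.75, 0.45]
typical_trace = [1.00, 0.98, 0.95, 0.60, 0.30]

flagged = reconstruction_error(drifting_trace, reconstruct) > threshold
typical_flagged = reconstruction_error(typical_trace, reconstruct) > threshold
print(flagged, typical_flagged)
```

Replacing `reconstruct` with the encode-decode pass of a trained model leaves the thresholding and flagging steps unchanged.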

In parallel, clustering methods such as k-means and DBSCAN were applied to post-etch SEM image features. These clusters helped group similar defect morphologies, revealing patterns in sidewall roughness and microtrenching that traditional inspection filters missed. When correlated with upstream process data, these anomalies often coincided with subtle shifts in gas ratio control or temperature deviations, previously undetected in SPC logs [22].

Upon investigation, the fab discovered that a marginal gas flow sensor drift, still below threshold limits, was gradually altering the etch rate profile over multiple wafers. The anomaly detection framework identified the issue six hours before any defects surfaced in downstream inspections. Early intervention allowed for tool recalibration and recovery without scrapping the affected lot.

This deployment led to a 58% reduction in unexpected etch-related defect rates over the following quarter and improved chamber health monitoring frequency. Operators reported a 2.5x increase in proactive maintenance calls based on ML-driven warnings. Results from the post-deployment testing phase, shown in Table 3, confirm a significant drop in both false negatives and false positives relative to the baseline SPC-only monitoring approach.

These findings underscore the capability of unsupervised learning to catch latent, evolving issues in real time, particularly in multi-layer etch processes where variability compounds with each step.

#### 5.3 Comparative Results with Baseline Systems

To evaluate the efficacy of the proposed machine learning framework in semiconductor manufacturing, a comparative analysis was conducted against baseline Statistical Process Control (SPC) systems. The baseline systems relied on Shewhart charts, parametric control limits, and human-engineered fault detection rules. In contrast, the ML framework incorporated supervised and unsupervised models across key process stages, including lithography, etching, deposition, and CMP [23].
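
For readers less familiar with the SPC baseline, a Shewhart-style check can be sketched in a few lines. The readings below are hypothetical, and estimating sigma from the sample standard deviation is a simplification; individuals charts in practice often estimate it from the average moving range instead.

```python
import statistics

def shewhart_limits(baseline, k=3.0):
    """Lower limit, center line, and upper limit from in-control data."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - k * sigma, center, center + k * sigma

def out_of_control(values, lcl, ucl):
    """Indices of the points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical in-control etch-depth readings (nm) used to set the limits.
baseline = [50.1, 49.8, 50.0, 50.2, 49.9, 50.0, 50.1, 49.9]
lcl, center, ucl = shewhart_limits(baseline)

# New readings: only the fourth one lies far enough out to trip the chart.
new_readings = [50.0, 50.1, 49.9, 50.9, 50.0]
violations = out_of_control(new_readings, lcl, ucl)
print(f"LCL={lcl:.2f} CL={center:.2f} UCL={ucl:.2f} violations={violations}")
```

A chart of this kind reacts only once a single variable leaves its limits, which is why slow multivariate drifts of the sort described in Section 5.2 can pass unnoticed.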

Performance was evaluated across three dimensions: yield uplift, defect detection accuracy, and false alarm rates. In DRAM production, ML-based interventions led to an average yield increase of 3.7%, while SPC-only lines showed no statistically significant improvement during the same timeframe. This uplift was attributed to the proactive tuning of exposure and etch parameters based on predictive model alerts, as shown in Figure 4.

Defect detection sensitivity also improved significantly. In logic chip etching, ML models demonstrated a recall of 91.4% and precision of 88.2% in identifying anomalous wafers prior to post-process inspection. By contrast, SPC flagged only 64.5% of the same wafer set, with a higher false alarm rate. This indicates that ML models are not only more accurate but also better at avoiding unnecessary tool stoppages or overcorrections.
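
The quantities compared in this paragraph follow directly from confusion-matrix counts. The sketch below uses hypothetical counts chosen only to illustrate the definitions; they are not the data behind the reported figures.

```python
def detection_metrics(tp, fp, fn, tn):
    """Precision, recall, and false alarm rate from confusion-matrix counts."""
    precision = tp / (tp + fp)         # share of flagged wafers that are truly anomalous
    recall = tp / (tp + fn)            # share of anomalous wafers that were flagged
    false_alarm_rate = fp / (fp + tn)  # share of normal wafers flagged in error
    return precision, recall, false_alarm_rate

# Hypothetical counts for 1,000 wafers, 100 of which are truly anomalous.
precision, recall, false_alarm_rate = detection_metrics(tp=90, fp=12, fn=10, tn=888)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"false_alarm_rate={false_alarm_rate:.4f}")
```

Reporting precision alongside recall matters here because a detector can reach high recall simply by flagging most wafers, at the cost of unnecessary tool stoppages.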

Latency, the time from data acquisition to model output, was also measured. On average, the ML pipeline processed 10,000+ features per wafer in under 3.2 seconds, allowing real-time decision-making without impacting production throughput. Legacy SPC systems, in comparison, required manual intervention or delayed batch analysis, resulting in slower corrective action cycles.

The reduction in false alarms and missed defects using ML is further summarized in Table 3, confirming improved sensitivity and specificity across fabs and process types. Engineers noted greater trust in ML outputs due to built-in interpretability modules, particularly SHAP analysis, which provided contextual explanations for each alert.

Overall, the comparison confirmed that machine learning frameworks not only outperform traditional systems in predictive accuracy and yield enhancement but also enable more responsive, interpretable, and automated manufacturing environments.



Figure 4: Yield improvement trends across batches before and after ML deployment.

Table 3: Reduction in defect rates and false alarms in test simulations

| Metric                                 | Before ML Deployment | After ML Deployment | Improvement (%) |
|----------------------------------------|----------------------|---------------------|-----------------|
| Average Defect Rate (%)                | 7.8                  | 4.3                 | 44.87           |
| False Alarm Rate (%)                   | 12.5                 | 5.2                 | 58.40           |
| Detection Accuracy (%)                 | 84.6                 | 93.1                | 10.05           |
| Precision in Defect Classification (%) | 76.2                 | 91.4                | 19.95           |
| Mean Time to Identify Fault (min)      | 23.7                 | 11.8                | 50.21           |
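
The Improvement column appears to report the relative change with respect to the pre-deployment value. This reading is inferred from the tabulated numbers rather than stated explicitly in the text:

$$
\text{Improvement (\%)} = \frac{\left| x_{\text{after}} - x_{\text{before}} \right|}{x_{\text{before}}} \times 100
$$

For the average defect rate, for instance, this gives $(7.8 - 4.3)/7.8 \times 100 \approx 44.87$, and for the mean time to identify a fault it gives $(23.7 - 11.8)/23.7 \times 100 \approx 50.21$. The improvements are therefore relative changes, not differences in percentage points.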

#### 6. CHALLENGES AND PRACTICAL CONSIDERATIONS

#### 6.1 Data Quality, Labeling, and Annotation Constraints

One of the most persistent challenges in applying machine learning to semiconductor manufacturing is ensuring the quality, consistency, and contextual relevance of the underlying data. While fabs generate massive volumes of sensor, metrology, and test data daily, much of it is unlabeled, noisy, or fragmented across disparate systems. The success of any supervised model hinges on the availability of labeled datasets that accurately reflect process outcomes, yet generating such labels is costly, time-intensive, and often prone to human error [24].

In many cases, the ground truth for low-yield wafers or latent defects is only available after packaging or system-level testing, which can be weeks removed from the originating process. This latency complicates timely learning and intervention. Additionally, there are edge cases, such as rare defect modes or process interactions, that remain underrepresented in training data, leading to bias and blind spots in model performance. Semi-supervised learning and active learning strategies can partially mitigate these gaps by enabling models to query uncertain data points or learn from fewer labeled examples, but such approaches require infrastructure support and iterative refinement [25].
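
One simple form of the active learning idea mentioned above is uncertainty sampling, in which the wafers the model is least sure about are sent for labeling first. The sketch below assumes a binary defect classifier; the wafer IDs, probabilities, and labeling budget are hypothetical.

```python
def select_for_labeling(probabilities, budget):
    """Return the `budget` wafer IDs whose predicted probability is closest to 0.5."""
    ranked = sorted(probabilities, key=lambda wafer_id: abs(probabilities[wafer_id] - 0.5))
    return ranked[:budget]

# Hypothetical model outputs: wafer ID -> predicted probability of a latent defect.
predicted = {"W01": 0.97, "W02": 0.52, "W03": 0.08, "W04": 0.41, "W05": 0.66}

# With a budget of two labels, the two most ambiguous wafers are queried.
to_label = select_for_labeling(predicted, budget=2)
print(to_label)
```

Confident predictions such as W01 and W03 are left unlabeled, so scarce inspection effort is concentrated where a label is expected to change the model the most.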

Labeling and annotation efforts are also limited by inconsistencies in terminology, labeling conventions, and subjective human interpretation, especially in visual inspection datasets. Automated annotation tools based on anomaly detection or image segmentation can assist, but they require domain-specific customization and validation. As illustrated in Figure 5, poor-quality data at the acquisition stage can cascade downstream, undermining even the most advanced machine learning architectures.

Ensuring data completeness, aligning labeling standards across fabs, and establishing feedback loops between operators and models are critical steps to improving training set integrity and enhancing model reliability in production environments.

#### 6.2 Model Generalizability and Deployment in Real-Time Systems

Model generalizability remains a core barrier to deploying machine learning solutions at scale across semiconductor manufacturing environments. Models trained on specific toolsets, recipes, or product lines may not transfer well to others due to variations in process parameters, tool aging effects, and material properties. This domain specificity limits the reuse of models and necessitates frequent retraining, which incurs computational cost and engineering time [26].

Additionally, differences in data distributions between development and deployment environments, commonly referred to as dataset shift, can lead to degraded model performance in live production. For example, a model trained on etch data from one chamber may misinterpret signals from a different chamber type or process configuration. Domain adaptation techniques, such as transfer learning or ensemble calibration, can help mitigate these discrepancies, but robust validation frameworks must be in place to detect and correct performance drops before they impact yield [27].
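
As one possible building block for such a validation framework, the population stability index (PSI) compares the distribution of a feature in training data with its distribution in production. PSI is a technique introduced here for illustration and is not named in the source; the sketch uses equal-width bins, assumes the reference sample is not constant, and runs on hypothetical chamber pressure readings.

```python
import math

def psi(reference, current, n_bins=4, eps=1e-6):
    """Population stability index between two samples of one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins  # assumes the reference values are not all equal

    def proportions(sample):
        counts = [0] * n_bins
        for x in sample:
            # Clamp so values outside the reference range land in the end bins.
            idx = min(max(int((x - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps to keep the logarithm finite.
        return [max(c / len(sample), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

# Hypothetical chamber pressure readings (mTorr) from the training chamber
# and from a second chamber that runs slightly higher.
reference = [10.0, 10.2, 10.4, 10.6, 10.8, 11.0, 11.2, 11.4]
shifted = [10.8, 11.0, 11.2, 11.4, 11.6, 11.8, 12.0, 12.2]

psi_same = psi(reference, reference)
psi_shifted = psi(reference, shifted)
print(psi_same, psi_shifted)
```

A PSI of zero indicates identical binned distributions. A commonly quoted rule of thumb treats values above roughly 0.25 as a substantial shift, although the appropriate alarm level is application-specific and would need to be tuned per feature and tool.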

Deployment in real-time systems poses further constraints. Models must produce predictions within strict latency thresholds—often within milliseconds per wafer—to be viable in high-throughput environments. This requires efficient model inference pipelines and edge computing architectures. In many fabs, integrating ML models with legacy Manufacturing Execution Systems (MES) and tool controllers is nontrivial and requires customized APIs, data bridges, and fail-safes.

Figure 5 outlines these deployment-stage challenges, emphasizing the need for low-latency inference, model monitoring, and ongoing adaptation. Without scalable infrastructure and proactive model lifecycle management, ML deployment risks becoming siloed or short-lived.

#### 6.3 Ethical, Security, and IP Concerns in AI for Manufacturing

As machine learning becomes increasingly embedded in semiconductor manufacturing, ethical, security, and intellectual property (IP) concerns are gaining attention. One major concern is algorithmic opacity. In safety-critical environments like semiconductor fabs, a lack of explainability in decision-making processes can lead to resistance among operators and engineers, especially when the models trigger tool shutdowns or rework [28]. Model interpretability tools like SHAP or LIME are essential for building trust but must be accompanied by well-documented audit trails and human-in-the-loop validation protocols.

Data privacy and security are equally critical. Manufacturing data, especially from cutting-edge nodes or proprietary process flows, constitutes valuable IP. Storing and transmitting this data for cloud-based model training raises risks of leakage or industrial espionage. Federated learning has been explored as a privacy-preserving alternative, allowing decentralized model training without sharing raw data, but its implementation remains technically complex and resource-intensive [29].
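
The aggregation step at the heart of federated learning can be illustrated with federated averaging (FedAvg), in which each site shares only model parameters and a sample count. This is a minimal sketch of that single step: the weights and sample counts are hypothetical, and local training, secure transport, and any privacy mechanisms are omitted.

```python
def federated_average(client_updates):
    """FedAvg-style aggregation: weight each client's parameters by its sample count."""
    total = sum(n for _, n in client_updates)
    n_params = len(client_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in client_updates) / total
        for i in range(n_params)
    ]

# Hypothetical (weights, n_samples) pairs reported by three fabs.
# Only these parameters are exchanged; raw process data stays on site.
updates = [
    ([0.10, 0.50], 1000),
    ([0.20, 0.40], 3000),
    ([0.40, 0.30], 1000),
]
global_weights = federated_average(updates)
print(global_weights)
```

The fab contributing the most samples pulls the global parameters toward its own, which is the intended behavior of sample-weighted averaging but also a reason to monitor for one site dominating the shared model.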

Furthermore, as AI systems begin to recommend or autonomously execute process adjustments, questions arise around accountability. Who is liable when an ML-driven decision leads to wafer loss, equipment damage, or yield drops? Establishing governance frameworks that define model oversight responsibilities and incident response protocols is essential.

Ethical considerations also extend to workforce dynamics. Automation may reduce manual inspection or process tuning roles, necessitating retraining and role redefinition for fab personnel. As shown in Figure 5, these concerns span the full ML lifecycle, from development to deployment, and must be addressed proactively to ensure responsible and secure AI integration in manufacturing settings [31].




Figure 5: Challenges in the ML lifecycle for manufacturing, from data acquisition to deployment.

#### 7. FUTURE PROSPECTS AND INDUSTRY INTEGRATION ROADMAP

#### 7.1 Digital Twins and Autonomous Fabs

The emergence of digital twin technology offers a compelling evolution in semiconductor manufacturing, enabling real-time synchronization between physical fabrication environments and their virtual counterparts. A digital twin is a high-fidelity simulation model that replicates the physical behavior of manufacturing processes, tools, and materials using live sensor data, historical logs, and predictive machine learning algorithms [29]. In semiconductor fabs, digital twins allow engineers to simulate process outcomes, perform "what-if" analyses, and optimize parameters without risking actual wafer loss.

When integrated with ML frameworks, digital twins can continuously learn and update their models based on observed discrepancies between predicted and actual results. This enables predictive maintenance, virtual yield tuning, and accelerated process development. For example, a twin of a CVD tool can simulate deposition thickness across thousands of wafers, adjusting for tool drift or material batch differences without interrupting live production [30]. These simulations are particularly useful for scaling new process nodes where empirical data is limited and experimental validation is costly [32].

Digital twins also serve as testing grounds for reinforcement learning agents tasked with optimizing exposure settings, etch durations, or CMP pressure profiles. By first training these agents in virtual environments, fabs can ensure safety and reliability before real-world deployment. Additionally, anomaly detection systems benefit from twin comparisons to flag divergence between expected and observed outcomes [33].

Ultimately, digital twins pave the way toward autonomous fabs, where process tuning, fault correction, and logistics scheduling occur with minimal human intervention. This vision aligns with advanced smart manufacturing goals, offering increased agility, reduced downtime, and tighter process control. As fabs become increasingly complex, the predictive power and flexibility of digital twins will be essential for sustaining competitive advantage and operational excellence [34].

#### 7.2 Integration with Industry 4.0 and IoT Platforms

The application of machine learning in semiconductor manufacturing is significantly enhanced through integration with Industry 4.0 technologies and Industrial Internet of Things (IIoT) platforms. Industry 4.0 promotes connectivity, decentralization, and real-time data analytics across manufacturing assets, enabling the development of cyber-physical systems that can sense, process, and adapt autonomously [35]. In semiconductor fabs, this translates into interconnected machines, intelligent sensors, and edge devices that continuously generate actionable data streams.

IIoT platforms act as middleware layers that collect, store, and analyze sensor data from various fab equipment. By standardizing data formats and communication protocols, these platforms enable seamless integration of ML models into process control loops [36]. For example, data from metrology tools, chemical dispensers, and vacuum pumps can be fused and fed into predictive models for tool health assessment or yield forecasting. MQTT, OPC UA, and RESTful APIs are commonly used to ensure interoperability between tools and data infrastructure [37].

Moreover, edge computing capabilities allow ML models to be deployed directly on or near the equipment, ensuring low-latency inference for time-critical operations like plasma stability monitoring or real-time fault flagging. This reduces dependence on centralized cloud servers and enhances resilience against network outages or data breaches. ML models embedded in IIoT platforms can also be updated incrementally using federated learning, maintaining performance without extensive data migration [38].

The convergence of ML, Industry 4.0, and IIoT supports predictive analytics, adaptive process control, and just-in-time maintenance strategies. This not only optimizes throughput and yield but also extends the operational life of critical assets [39]. As detailed in earlier sections and visualized in Figure 5, this layered digital architecture is essential for realizing the vision of intelligent, self-optimizing semiconductor fabs [40].

#### 8. CONCLUSION

#### 8.1 Summary of Key Findings and Implications

This article has explored the transformative impact of machine learning (ML) on semiconductor manufacturing, focusing on its application to yield enhancement and defect reduction. From data-rich environments in DRAM production to high-precision requirements in logic chip etching, the integration of ML models has demonstrated clear advantages over traditional process control techniques. By leveraging supervised learning for yield prediction, reinforcement learning for photolithography optimization, and unsupervised learning for anomaly detection, fabs have achieved measurable improvements in performance, stability, and responsiveness.

A key takeaway is the importance of feature-rich, well-preprocessed data from inline sensors, metrology tools, and equipment logs. The success of any ML pipeline hinges not just on the algorithm but also on the completeness, quality, and interpretability of the data feeding into it. Autoencoders, PCA, and SHAP values have proven effective in navigating the high-dimensional landscape of fab data, while ensemble models like Gradient Boosting and Random Forests consistently deliver robust results across process types.

Case studies presented showed yield improvements exceeding 3% and defect detection precision approaching 90%, illustrating the tangible value ML can unlock in high-volume production. Additionally, the deployment of these models has led to better root cause traceability, faster corrective actions, and reduced tool downtime.

Beyond immediate operational benefits, the adoption of ML sets the stage for broader shifts toward predictive manufacturing, where decisions are not only based on past data but actively anticipate future process states. This positions semiconductor companies to respond more dynamically to scaling challenges, material variations, and design complexity, all of which are critical factors as the industry advances into sub-5nm and 3D integration technologies.

#### 8.2 Final Thoughts on ML's Role in Semiconductor Innovation

Machine learning is not merely a tool for automation but a strategic enabler for semiconductor innovation. As process complexity grows and traditional scaling laws approach physical limits, the ability to intelligently extract, interpret, and act on data becomes a competitive differentiator. ML brings a level of adaptability, pattern recognition, and decision support that static models or heuristic-based systems simply cannot match.

The transition from reactive to predictive process control signifies a paradigm shift. No longer must fabs wait for yield degradation to investigate anomalies; with ML, early signs of tool drift or process variation can trigger immediate, targeted interventions. This reduces material waste, minimizes rework, and improves cycle time, all of which contribute directly to operational efficiency and profitability.

Moreover, the synergy between ML and emerging technologies such as digital twins, federated learning, and IoT-enabled edge computing continues to expand the frontier of what is possible in smart manufacturing. These integrations enable real-time responsiveness, cross-tool coordination, and scalable model deployment, moving fabs closer to the vision of autonomous, self-optimizing production environments.

However, realizing this potential requires a balanced approach that includes robust data governance, transparent model development, and close collaboration between domain experts and data scientists. ML systems must be interpretable, traceable, and continuously validated in real-world conditions.

As the semiconductor industry braces for new device architectures, novel materials, and tighter performance margins, machine learning will play an increasingly central role—not as a supplemental technology, but as a core engine for innovation, stability, and sustainable advancement in one of the world's most data-intensive industries.

#### REFERENCES

- 1. Bhattacharya A, Cloutier SG. End-to-end deep learning framework for printed circuit board manufacturing defect classification. Scientific reports. 2022 Jul 22;12(1):12559.
- 2. Sinhabahu N, Li KS, Wang SJ, Wang JR, Ho M. Machine-learning driven sensor data analytics for yield enhancement of wafer probing. In 2023 IEEE International Test Conference (ITC) 2023 Oct 7 (pp. 93-98). IEEE.
- 3. Han Y, Tang B, Wang L, Bao H, Lu Y, Guan C, Zhang L, Le M, Liu Z, Wu M. Machine-learning-driven synthesis of carbon dots with enhanced quantum yields. ACS nano. 2020 Sep 22;14(11):14761-8.
- 4. Huang X, Qin M, Xu R, Chen C, Jui S, Ding Z, Li P, Huang Y. Adaptive NN-based root cause analysis in volume diagnosis for Yield improvement. In 2021 IEEE International Test Conference (ITC) 2021 Oct 10 (pp. 30-36). IEEE.
- 5. Chillarige S, Malik A, Amodeo M, Chabbra A, Nandakumar B, Redburn R, L'Esperance N, Zimmerman J, Wheelock A. Machine learning driven throughput optimization of volume diagnosis methodology. In 2020 IEEE International Test Conference India 2020 Jul 12 (pp. 1-8). IEEE.
- 6. Li M, Dai L, Hu Y. Machine learning for harnessing thermal energy: From materials discovery to system optimization. ACS energy letters. 2022 Sep 2;7(10):3204-26.
- 7. Hart GL, Mueller T, Toher C, Curtarolo S. Machine learning for alloys. Nature Reviews Materials. 2021 Aug;6(8):730-55.
- 8. Masson JF, Biggins JS, Ringe E. Machine learning for nanoplasmonics. Nature Nanotechnology. 2023 Feb;18(2):111-23.

- 9. Lee J, Lee JH, Lee C, Lee H, Jin M, Kim J, Shin JC, Lee E, Kim YS. Machine Learning Driven Channel Thickness Optimization in Dual-Layer Oxide Thin-Film Transistors for Advanced Electrical Performance. Advanced Science. 2023 Dec;10(36):2303589.
- 10. Suwardi A, Wang F, Xue K, Han MY, Teo P, Wang P, Wang S, Liu Y, Ye E, Li Z, Loh XJ. Machine learning-driven biomaterials evolution. Advanced Materials. 2022 Jan;34(1):2102703.
- 11. Tabor DP, Roch LM, Saikin SK, Kreisbeck C, Sheberla D, Montoya JH, Dwaraknath S, Aykol M, Ortiz C, Tribukait H, Amador-Bedolla C. Accelerating the discovery of materials for clean energy in the era of smart automation. Nature reviews materials. 2018 May;3(5):5-20.
- 12. Ninduwezuor-Ehiobu N, Tula OA, Daraojimba C, Ofonagoro KA, Ogunjobi OA, Gidiagba JO, Egbokhaebho BA, Banso AA. Tracing the evolution of ai and machine learning applications in advancing materials discovery and production processes. Engineering Science & Technology Journal. 2023;4(3):66-83.
- 13. Konstantopoulos G, Koumoulos EP, Charitidis CA. Digital innovation enabled nanomaterial manufacturing; machine learning strategies and green perspectives. Nanomaterials. 2022 Aug 1;12(15):2646.
- 14. Lan S, Liu J, Wang Y, Zhao K, Li J. Deep learning assisted fast mask optimization. In Optical Microlithography XXXI 2018 Mar 20 (Vol. 10587, pp. 124-139). SPIE.
- 15. Chan CH, Sun M, Huang B. Application of machine learning for advanced material prediction and design. EcoMat. 2022 Jul;4(4):e12194.
- 16. Hippalgaonkar K, Li Q, Wang X, Fisher III JW, Kirkpatrick J, Buonassisi T. Knowledge-integrated machine learning for materials: lessons from gameplaying and robotics. Nature Reviews Materials. 2023 Apr;8(4):241-60.
- 17. Lee S, Kim J, Wi G, Won Y, Eun Y, Park KJ. Deep reinforcement learning-driven scheduling in multijob serial lines: A case study in automotive parts assembly. IEEE Transactions on Industrial Informatics. 2023 Aug 8;20(2):2932-43.
- 18. Ma W, Liu Z, Kudyshev ZA, Boltasseva A, Cai W, Liu Y. Deep learning for the design of photonic structures. Nature photonics. 2021 Feb;15(2):77-90.
- 19. Lu YC, Nath S, Pentapati SS, Lim SK. A fast learning-driven signoff power optimization framework. In Proceedings of the 39th International Conference on Computer-Aided Design 2020 Nov 2 (pp. 1-9).
- 20. Mekki-Berrada F, Ren Z, Huang T, Wong WK, Zheng F, Xie J, Tian IP, Jayavelu S, Mahfoud Z, Bash D, Hippalgaonkar K. Two-step machine learning enables optimized nanoparticle synthesis. npj Computational Materials. 2021 Apr 20;7(1):55.
- 21. Dahrouj H, Alghamdi R, Alwazani H, Bahanshal S, Ahmad AA, Faisal A, Shalabi R, Alhadrami R, Subasi A, Al-Nory MT, Kittaneh O. An overview of machine learning-based techniques for solving optimization problems in communications and signal processing. IEEE Access. 2021 May 12;9:74908-38.
- 22. Sahoo S, Kumar S, Abedin MZ, Lim WM, Jakhar SK. Deep learning applications in manufacturing operations: a review of trends and ways forward. Journal of Enterprise Information Management. 2023 Jan 27;36(1):221-51.
- 23. Jung H, Sauerland L, Stocker S, Reuter K, Margraf JT. Machine-learning driven global optimization of surface adsorbate geometries. npj Computational Materials. 2023 Jun 26;9(1):114.

- 24. Ralph BJ, Hartl K, Sorger M, Schwarz-Gsaxner A, Stockinger M. Machine learning driven prediction of residual stresses for the shot peening process using a finite element based grey-box model approach. Journal of Manufacturing and Materials Processing. 2021 Apr 21;5(2):39.
- 25. Reddy M, Satyanarayana B, Ravi M, Krishnaiah P, Dileep C, Annapoorna B. Machine Learning-Based Fault Tolerance Techniques for VLSI Circuit Design. In International Conference on Data Science, Machine Learning and Applications 2023 Dec 15 (pp. 1359-1369). Singapore: Springer Nature Singapore.
- 26. Jia Y, Hou X, Wang Z, Hu X. Machine learning boosts the design and discovery of nanomaterials. ACS Sustainable Chemistry & Engineering. 2021 Apr 27;9(18):6130-47.
- 27. Khan RS, Sirazy MR, Das R, Rahman S. An AI and ML-enabled framework for proactive risk mitigation and resilience optimization in global supply chains during national emergencies. Sage Science Review of Applied Machine Learning. 2022;5(2):127-44.
- 28. Zhang C, Patras P, Haddadi H. Deep learning in mobile and wireless networking: A survey. IEEE Communications Surveys & Tutorials. 2019 Mar 13;21(3):2224-87.
- 29. Singh K, Kalra S. VLSI Computer Aided Design Using Machine Learning for Biomedical Applications. In Opto-VLSI Devices and Circuits for Biomedical and Healthcare Applications 2023 Sep 4 (pp. 177-196). CRC Press.
- 30. Haque R. Automation In Manufacturing: A Systematic Review Of Advanced Time Management Techniques To Boost Productivity. American Journal of Scholarly Research and Innovation. 2023 Dec 20;2(01):50-78.
- 31. Wang N, Zhang Y, Wang W, Ye Z, Chen H, Hu G, Ouyang D. How can machine learning and multiscale modeling benefit ocular drug development? Advanced Drug Delivery Reviews. 2023 May 1;196:114772.
- 32. Xu M, Tang B, Lu Y, Zhu C, Lu Q, Zhu C, Zheng L, Zhang J, Han N, Fang W, Guo Y. Machine learning driven synthesis of few-layered WTe2 with geometrical control. Journal of the American Chemical Society. 2021 Oct 4;143(43):18103-13.
- 33. Shin H, Ban Y, Kil J, Hwang H, Cho K, Sim S, Lee B, Shin I, Kim H, Jang C. Machine learning applications on 3nm node technology and designs for improving block-level PPA. In DTCO and Computational Patterning II 2023 Apr 28 (Vol. 12495, pp. 276-284). SPIE.
- 34. Merchant A, Batzner S, Schoenholz SS, Aykol M, Cheon G, Cubuk ED. Scaling deep learning for materials discovery. Nature. 2023 Dec 7;624(7990):80-5.
- 35. Wuest T. Identifying product and process state drivers in manufacturing systems using supervised machine learning. Springer; 2015 Apr 20.
- 36. Jiang S, Wu CC, Li F, Zhang YQ, Zhang ZH, Zhang QH, Chen ZJ, Qu B, Xiao LX, Jiang ML. Machine learning (ML)-assisted optimization doping of KI in MAPbI3 solar cells. Rare Metals. 2021 Jul;40:1698-707.
- 37. He Z, Zhang L, Liao P, Ma Y, Yu B. Reinforcement learning driven physical synthesis. In 2020 IEEE 15th International Conference on Solid-State & Integrated Circuit Technology (ICSICT) 2020 Nov 3 (pp. 1-4). IEEE.
- 38. Karande P, Gallagher B, Han TY. A strategic approach to machine learning for material science: how to tackle real-world challenges and avoid pitfalls. Chemistry of Materials. 2022 Sep 1;34(17):7650-65.
- 39. Ashouri AH, Killian W, Cavazos J, Palermo G, Silvano C. A survey on compiler autotuning using machine learning. ACM Computing Surveys (CSUR). 2018 Sep 18;51(5):1-42.
- 40. Wang H, Zhang Z, Xiong H, Zou D, Chen Y, Jin H. GRAND: A graph neural network framework for improved diagnosis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 2023 Nov 24;43(4):1288-301.