Anomaly Detection Continues to Evolve with Topological and State Space Techniques

James Bramante – Data Scientist, INFICON

Background

In the second quarter of 2020, INFICON began beta testing its new SmartFDC™ product. The product uses unsupervised machine learning, a branch of artificial intelligence, to automate the detection of anomalies in time-series process data and to complement the results of conventional fault detection and classification (FDC) techniques.

The Importance of Shape

By default, SmartFDC uses machine learning techniques to construct envelope models that automatically learn process behavior and its variability under normal operating conditions. The envelope models are shape-agnostic: they make no assumptions about ideal process behavior or shape. As a result, SmartFDC models can set limits on tool or sensor data, and those limits shift dynamically across a tool step (Figure 1). As long as the process data has a consistent shape, the model can adapt to that shape. This adaptability allows SmartFDC to monitor any process in a factory with minimal setup.

Figure 1: Three examples of envelope models (green lines) adapting to changes in process data (blue lines) within a tool step.
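
To make the idea of limits that move with the trace concrete, here is a minimal sketch of one possible envelope model. It is not the SmartFDC algorithm, which the article does not describe; it assumes time-aligned training runs and simply takes per-sample quantiles across them. The function names, quantile levels, and margin are hypothetical choices for illustration.

```python
import numpy as np

def fit_envelope(runs, lower_q=0.005, upper_q=0.995, margin=0.1):
    """Learn per-sample lower/upper limits from normal runs.

    runs: array of shape (n_runs, n_samples) holding time-aligned traces
    of the same tool step. Quantile levels and margin are illustrative.
    """
    runs = np.asarray(runs, dtype=float)
    lower = np.quantile(runs, lower_q, axis=0)
    upper = np.quantile(runs, upper_q, axis=0)
    pad = margin * (upper - lower)  # widen the band slightly
    return lower - pad, upper + pad

def out_of_envelope(trace, lower, upper):
    """Boolean mask of samples that fall outside the learned limits."""
    trace = np.asarray(trace, dtype=float)
    return (trace < lower) | (trace > upper)

# Synthetic step: a ramp followed by a hold, with measurement noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
shape = 100.0 + 50.0 * np.clip(t / 0.3, 0.0, 1.0)
training = shape + rng.normal(0.0, 1.0, size=(500, t.size))
lower, upper = fit_envelope(training)

# A run with a short excursion during the hold.
faulty = shape.copy()
faulty[120:130] += 10.0
flags = out_of_envelope(faulty, lower, upper)
print("samples flagged:", int(flags.sum()))
```

Because the limits are computed sample by sample, they follow the ramp and the hold without any assumption about what the step should look like.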

Probabilistic Limits

Distinguishing shape from noise is important when monitoring processes with complex behavior. To improve the performance of SmartFDC, we are investigating machine learning algorithms that explicitly consider process shape. Persistent homology, an application of algebraic topology, can summarize the shape of time-series data based on the distance between local peaks in its trace (Figure 2). With this technique, we can define probabilistic limits on shape, similar to conventional statistical process control limits.

Figure 2: Example algorithm for encoding temperature sensor data with persistent homology to derive a 2-dimensional joint probability distribution defining the range of shapes in that data.
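
The encoding step can be sketched in a few lines. The example below is an illustration under stated assumptions, not the algorithm shown in Figure 2: it computes 0-dimensional persistence of a one-dimensional trace using a superlevel-set filtration, so each local peak is "born" at its own height and "dies" at the height of the valley where it merges into a taller peak. The function name and the convention of pairing the tallest peak with the global minimum are choices made for this sketch.

```python
import numpy as np

def peak_persistence(trace):
    """Return (birth, death) pairs for the local peaks of a 1-D trace.

    Samples are added from highest to lowest. Each connected run of added
    samples is a component whose root is its highest sample. When two
    components meet, the one with the lower peak dies at the current height.
    """
    x = np.asarray(trace, dtype=float)
    n = x.size
    if n == 0:
        return []
    parent = [-1] * n  # -1 marks samples not yet added

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    order = np.argsort(-x, kind="stable").tolist()
    for i in order:
        parent[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                hi, lo = (ri, rj) if x[ri] >= x[rj] else (rj, ri)
                if x[lo] > x[i]:  # skip zero-persistence pairs
                    pairs.append((float(x[lo]), float(x[i])))
                parent[lo] = hi
    # The tallest peak never merges; pair it with the global minimum.
    pairs.append((float(x[order[0]]), float(x.min())))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)

pairs = peak_persistence([0, 3, 1, 5, 2, 4, 0])
for birth, death in pairs:
    print(f"peak at {birth}, merges at {death}, persistence {birth - death}")
```

Small persistence values correspond to noise, while large ones correspond to features of the shape. The (birth, death) pairs collected from many normal runs could then feed a two-dimensional density estimate like the one Figure 2 describes; that step is not shown here.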

Latent State Space Modeling

An alternative method of analyzing shape is to segment the data into shorter intervals in which either the raw values or their time derivative are nearly constant. Once the order in which the intervals occur has been established from training data, shape anomalies can be detected as intervals that appear out of order (Figure 3). When unsupervised, this method is generally known as latent state space modeling. Latent state space modeling also produces variables that can be used later to classify the data in a given tool step and tool bin.

Figure 3: An idealized example of regular process behavior (left) classified into five roughly linear segments and an anomalous case (right) where the segments appear out of order.
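
The out-of-order check can be illustrated with a deliberately simple stand-in. The sketch below is not a latent state space model (for example, a hidden Markov model); it labels each sample by a fixed slope threshold, collapses repeats into an ordered list of segments, and compares that list with the order seen in a reference run. The function names and the tolerance are hypothetical, and the traces are idealized and noise-free, as in Figure 3.

```python
import numpy as np

def segment_labels(trace, flat_tol=0.1):
    """Collapse a trace into an ordered list of segment labels.

    Each step between samples is labeled +1 (rising), -1 (falling) or
    0 (flat) using a fixed slope tolerance; consecutive repeats merge.
    """
    slope = np.diff(np.asarray(trace, dtype=float))
    labels = np.where(slope > flat_tol, 1, np.where(slope < -flat_tol, -1, 0))
    keep = np.concatenate(([True], labels[1:] != labels[:-1]))
    return labels[keep].tolist()

def first_mismatch(observed, expected):
    """Index of the first segment that differs from the expected order."""
    for k, (o, e) in enumerate(zip(observed, expected)):
        if o != e:
            return k
    if len(observed) != len(expected):
        return min(len(observed), len(expected))
    return None

# Reference run: hold, ramp up, hold, ramp down, hold (five segments).
normal = np.concatenate([
    np.zeros(20), np.linspace(0, 10, 20), np.full(20, 10.0),
    np.linspace(10, 5, 20), np.full(20, 5.0),
])
# Anomalous run: the same kinds of segments, but in a different order.
anomalous = np.concatenate([
    np.full(20, 10.0), np.linspace(10, 5, 20), np.full(20, 5.0),
    np.linspace(5, 10, 20), np.full(20, 10.0),
])

expected = segment_labels(normal)
print("expected order:", expected)
print("first mismatch:", first_mismatch(segment_labels(anomalous), expected))
```

A learned model would infer the segments and their transition order from data instead of relying on a hand-set threshold, and would tolerate noise that this sketch does not.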

Characterizing Process Data

Figure 4: An example of 16,000 runs automatically grouped on the basis of three shape characteristics.

Machine learning has applications in process monitoring beyond setting more robust process limits. For example, we are using machine learning algorithms to automatically classify process data based on its shape and other characteristics (Figure 4). Given these classes, we can teach our algorithm to recognize and ignore meaningless data. Applications of this feature include:

Determining when a sensor is returning values outside of its dynamic range.
Distinguishing process steps (where faults can damage wafers) from cleaning or diagnostic steps.

We view this process as teaching our model common sense. This common sense will free our customers' domain experts to focus only on the most complex and costly faults.
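
The first application listed above, a sensor returning values outside its dynamic range, can be illustrated with a hand-written rule. This is not the learned classifier the article describes; it only shows what such data looks like and how a run might be set aside. The range limits, tolerance, and threshold fraction are hypothetical.

```python
import numpy as np

def saturated_fraction(trace, range_min, range_max, tol=1e-6):
    """Fraction of samples pinned at either end of the sensor's range."""
    trace = np.asarray(trace, dtype=float)
    pinned = (trace <= range_min + tol) | (trace >= range_max - tol)
    return float(pinned.mean())

def is_meaningless(trace, range_min, range_max, max_fraction=0.2):
    """Flag a run when too many samples sit at the range limits."""
    return saturated_fraction(trace, range_min, range_max) > max_fraction

# A hypothetical sensor that reports values between 0 and 10.
t = np.linspace(0.0, 2.0 * np.pi, 400)
good = 5.0 + 3.0 * np.sin(t)                      # stays inside the range
clipped = np.clip(5.0 + 10.0 * np.sin(t), 0, 10)  # true signal exceeds it

print("good run flagged:", is_meaningless(good, 0.0, 10.0))
print("clipped run flagged:", is_meaningless(clipped, 0.0, 10.0))
```

Runs flagged this way could be excluded from model training and from alarming, so that limits are learned only from data that reflects the process.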

Research and Development Continues

Already a success in production environments, SmartFDC remains a focus of continued research and development, both to improve its detection capability and to imbue it with common-sense reasoning about process data.