Advanced Analytics Improve Predictive Models

By Joseph Reckamp
January 25, 2022
InTech Magazine
Feature

Summary

Inaccurate or overlooked alerts on manufacturing data can be reduced with proper data handling when developing and deploying predictive models. This article first appeared in the December 2021 issue of Intech magazine.

Data analytics, and specifically predictive analytics, are meant to reduce the number of alarms for process improvements, trend forecasting and predictive maintenance. However, deploying predictive analytics often leads to excessive nuisance alarms, a common problem in process manufacturing control rooms.

Process engineers typically spend days or weeks reducing and eliminating nuisance alarms to ensure control room operators respond with the correct vigilance to the critical alarms that could cause safety incidents or quality deviations. Predictive analytics help drive data-driven decisions for process improvement and optimization, but data and insights must be handled properly.

Predictive analytics 101

Predictive analytics is a method of data analysis in which future values are projected from models fitted to historical data sets. Common techniques for generating these forecasts include regression, event or profile prediction, and classification.

Figure 1: Predictive analytics models create a projection of future data, typically based on a quantifiable method such as regression.

Regression fits a quantifiable model to historical data and extrapolates it to new or future data. Event or profile predictions look for a pattern or shape in the data that typically precedes a particular event or trend. Classification predictions leverage normal data in a model to detect anomalies or abnormalities in the new data (Figure 1). Each of these methods requires a model to be trained on historical data, with the model then used to predict future data or events.

Predictions have three potential outcomes: an accurate prediction, a false positive, or a false negative. An ideal model uses accurate data as an input to accurately predict issues well in advance. False positives occur when the model predicts a nonexistent issue. False negatives occur when the model predicts normal operation and overlooks an issue.

In practice, most predictive analytics models exhibit some of both error types, because every possible combination of inputs to and outputs from the model is unlikely to be known and accounted for in the model. False negative errors can result in significant cost to the organization from missed issues, and false positive errors often cause unnecessary adjustments or maintenance. Psychological costs are also incurred, because operators change their behavior once they lose trust in the model results.
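The three outcomes described above can be tallied with a few lines of Python. This is only a minimal sketch: the function name `outcome` and the two example lists are hypothetical and stand in for a model's alerts and the issues later confirmed on the plant.

```python
from collections import Counter

def outcome(predicted_issue, actual_issue):
    """Label one prediction as accurate, false_positive, or false_negative."""
    if predicted_issue == actual_issue:
        return "accurate"
    # The model and reality disagree: an alert with no issue is a false positive,
    # a missed issue is a false negative.
    return "false_positive" if predicted_issue else "false_negative"

# Hypothetical example data: True means "issue", False means "normal operation".
predicted = [True, False, True, False, False, True]
actual = [True, False, False, True, False, True]

counts = Counter(outcome(p, a) for p, a in zip(predicted, actual))
print(counts)
```

Tracking the two error counts separately, rather than a single accuracy figure, keeps the cost of missed issues distinct from the cost of nuisance alerts.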

Regardless of whether the predictive analytics algorithms used are transparent, open-source formulas or proprietary, compiled algorithms, all methods require historical data for training and fitting the predictive model. Making sure the historical input data to the models is understood, cleansed, contextualized, and well prepared for modeling is critical for deploying accurate predictive models.

Leveraging hybrid first principles models

Most predictive analytics models are empirical, meaning that the form of the model is chosen according to whatever best fits the training data. Therefore, if the training data portrays a linear relationship between variables, a linear regression is used for fitting the model. However, inherent noise within data sets, complex relationships among variables, and even the selection of the specific data set used for model training can all result in a training data set that is not representative of all data available. For example, a training data set may show a linear trend, resulting in selection of a linear model, whereas the full data set may show curvature or edge effects that suggest a higher order model should have been applied.
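The risk can be illustrated with a small sketch. It assumes a hypothetical process with slight curvature (`true_process`, invented for illustration) and fits a straight line by ordinary least squares using only a narrow training range. The line looks adequate inside that range and degrades quickly outside it.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def true_process(x):
    """Hypothetical process with mild curvature (an assumption for illustration)."""
    return x + 0.02 * x ** 2

# Train only on a narrow window, where the data looks almost linear.
train_x = [float(x) for x in range(0, 11)]
train_y = [true_process(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)

def max_abs_error(xs):
    """Largest absolute difference between the fitted line and the true process."""
    return max(abs(slope * x + intercept - true_process(x)) for x in xs)

error_inside = max_abs_error(train_x)              # within the training range
error_outside = max_abs_error([20.0, 30.0, 40.0])  # extrapolated region
print(error_inside, error_outside)
```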

A better method is to use first principles engineering equations that describe the relationships among variables. These first principles equations are laws or theorems taught in engineering education or published in literature. They were derived from analyzing significant amounts of data.

For example, the performance of a filtration membrane can be analyzed by membrane resistance, which Darcy’s law describes as directly proportional to the transmembrane pressure and inversely proportional to the viscosity and the flux. Alternatively, a chemical reaction with a known nth order power law equation would indicate that an nth order polynomial fit should be used for the concentration of the components to describe the change in concentration over time.

Understanding and applying first principles models typically has two major benefits. First, the predictive analytics models are often dimensionally reduced by default. In the Darcy’s law example, multiple parameters, such as pressures, flow rates, surface area of the membrane, and viscosity, are reduced to a single membrane resistance variable (Figure 2).

Second, the model fit is more likely to represent data outside of the training window, as the engineering principle should hold true beyond the training data range in most cases. Overfitting is a major concern with model development, as it often degrades results on new data. Leveraging relationships grounded in first principles engineering laws addresses this issue, because it avoids using a higher order model than necessary.
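The Darcy’s law reduction described above can be sketched as a soft sensor calculation. The function and parameter names here are hypothetical, SI units are assumed, and the transmembrane pressure is taken as the common tangential flow filtration form (average of feed and retentate pressure minus permeate pressure). The article does not specify which of the measured flow rates enter the calculation, so using the permeate flow for the flux is an assumption.

```python
def transmembrane_pressure(p_feed, p_retentate, p_permeate):
    """Transmembrane pressure in Pa (assumed tangential flow filtration form)."""
    return (p_feed + p_retentate) / 2.0 - p_permeate

def membrane_resistance(p_feed, p_retentate, p_permeate,
                        permeate_flow, area, viscosity):
    """Membrane resistance in 1/m from Darcy's law: R = TMP / (viscosity * flux).

    Pressures in Pa, permeate_flow in m^3/s, area in m^2, viscosity in Pa*s.
    """
    tmp = transmembrane_pressure(p_feed, p_retentate, p_permeate)
    flux = permeate_flow / area  # m/s
    return tmp / (viscosity * flux)

# Hypothetical operating point, values chosen only for illustration.
resistance = membrane_resistance(
    p_feed=2.0e5, p_retentate=1.0e5, p_permeate=0.5e5,
    permeate_flow=1.0e-5, area=0.5, viscosity=1.0e-3,
)
print(resistance)  # about 5e12 1/m
```

Several raw measurements collapse into one signal, which can then be trended or regressed on its own.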

Reducing noise with data cleansing

Data cleansing is often associated only with removing invalid or not-a-number data before running it through an algorithm, but there are many more considerations to improve the model and reduce false alerts. Process data often has inherent noise that causes a noisy model output, which triggers false alerts from small spikes in the data. These alerts can be minimized through data cleansing methods such as smoothing filters, which eliminate or reduce the noise passed into the model.

Figure 2: An advanced analytics application, such as Seeq, can leverage Darcy’s law on a tangential flow filtration system. Three pressure sensors, three flow rate sensors, viscosity, and the surface area of the membrane were dimensionally reduced to a membrane resistance soft sensor that can be regressed and projected into the future to determine the appropriate maintenance period.

Numerous smoothing filters exist, with the most common types being low-pass filters such as the Loess method, Savitzky-Golay method, or a moving average filter. Applying these filters to the process data reduces the noise in the model inputs, resulting in a model with reduced noise that is less likely to trigger a false alert by oscillating above the alert limit. In addition to signal smoothing, other types of data cleansing include outlier removal, time shifting data to adjust for process dynamics, and eliminating data that is not relevant, such as when the process is not running.
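As a rough illustration of the effect, the sketch below applies a trailing moving average to a noisy signal and counts how often the raw and smoothed signals cross an alert limit. The signal values, window length, and limit are hypothetical, and the helper names are not from any particular analytics product.

```python
def moving_average(values, window):
    """Trailing moving average; early points use the samples available so far."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def count_alerts(values, limit):
    """Count upward crossings of the alert limit."""
    return sum(1 for prev, cur in zip(values, values[1:]) if prev <= limit < cur)

# Hypothetical noisy signal oscillating around an alert limit of 10.5.
raw = [9.0, 10.6, 9.2, 10.7, 9.1, 10.8, 9.3, 10.6, 9.0, 10.7]
limit = 10.5

smoothed = moving_average(raw, window=4)
print(count_alerts(raw, limit), count_alerts(smoothed, limit))
```

Smoothing trades noise for delay: a longer window suppresses more spikes but also responds more slowly to a genuine shift, so the window length is a tuning decision.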

When data cleansing methods such as signal smoothing, time shifting, or outlier removal are applied to the training data, those same methods should be applied online to the live data when the model is operationalized. Otherwise, the model is trained on clean data and then applied to noisy data, which can produce a completely different set of false positives or false negatives.
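One way to keep the two paths consistent is to route both through the same code. The sketch below is an assumption about how this might be organized, not a description of any specific tool: a hypothetical `OnlineMovingAverage` class smooths live samples one at a time, and the training data is cleansed by replaying history through that same class.

```python
from collections import deque

class OnlineMovingAverage:
    """Trailing moving average that accepts one sample at a time."""

    def __init__(self, window):
        # deque with maxlen drops the oldest sample automatically
        self._buffer = deque(maxlen=window)

    def update(self, value):
        self._buffer.append(value)
        return sum(self._buffer) / len(self._buffer)

def cleanse_history(values, window):
    """Cleanse training data by replaying it through the online filter."""
    smoother = OnlineMovingAverage(window)
    return [smoother.update(v) for v in values]

WINDOW = 3  # hypothetical setting, shared by training and deployment

# Training path: historical data replayed through the filter.
training_inputs = cleanse_history([10.0, 12.0, 11.0, 13.0], WINDOW)

# Live path: the same filter, fed sample by sample as data arrives.
live_filter = OnlineMovingAverage(WINDOW)
live_inputs = [live_filter.update(v) for v in [10.0, 12.0, 11.0, 13.0]]

print(training_inputs, live_inputs)
```

Because both paths share one implementation and one window setting, identical raw data yields identical model inputs.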

Similarly, anyone developing predictive models of time series data should understand historian data archiving methods such as compression. Training data is typically post-compression data from the data historian, but online or live data may be precompression. Applying a model trained on post-compression data to precompression live data can cause lack-of-fit issues, resulting in numerous false positives and negatives.

Providing context for alerts

Predictive analytics models are often not applicable 24/7. A certain model may only be applicable when a particular product is being produced, in a specific mode of operation, or even simply when the equipment is running. The most common time for models to falsely alert is immediately after a process change, which is when operators are often flooded with nuisance alarms.

As part of the model development process, determine which sections of the process make sense for alerts, and which ones do not. From there, build context to automatically segment or suppress alerts during time frames that are not periods of interest.

A common method of providing context to reduce false alerting is to suppress alerts during equipment startup, shutdown, or product changeovers. Additionally, model alerts occurring within a certain period of time after a manual adjustment to a process set point are often segmented into a separate visual indicator instead of a notified alert. Those deviations are expected until the process returns to steady state at the new set point (Figure 3).
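The set point case can be sketched as a simple routing rule. Times are plain minutes here, and the function name, the 30-minute settling window, and the example timestamps are all hypothetical.

```python
def route_alerts(alert_times, setpoint_change_times, settle_minutes):
    """Split alerts into operator notifications and visual-only indicators.

    An alert raised within settle_minutes after any set point change is
    treated as expected and shown only as a visual indicator.
    """
    notify, visual_only = [], []
    for t in alert_times:
        in_settling = any(0 <= t - change < settle_minutes
                          for change in setpoint_change_times)
        (visual_only if in_settling else notify).append(t)
    return notify, visual_only

# Hypothetical example: one set point change at minute 60.
notify, visual_only = route_alerts(
    alert_times=[5, 62, 75, 200],
    setpoint_change_times=[60],
    settle_minutes=30,
)
print(notify, visual_only)
```

The same pattern extends to startup, shutdown, or changeover periods by treating each as a time window in which alerts are rerouted or suppressed.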

Understanding model validity

Analogous to alerts not being valid during all modes of operation, models are not valid for all possible input parameters. A predictive analytics model is only able to predict situations that the model has been trained upon. For example, if a predictive quality model was created for a reaction with training data between 40°C and 60°C, it cannot reliably predict what the quality will be if the reaction temperature is 70°C.

Figure 3: False alerts during product changeover can be easily suppressed to a visual indicator using Seeq and not notified to operators as alarms.

Models are only valid in the range of the process inputs provided to the predictive model. For continuous processes, this generally is distilled down to upper and lower static limits for each model input, whereas batch processes require a batch profile of inputs that could adjust over time. It is important for the model to build in checkpoints for valid process input ranges to avoid causing false alerts when the model is not valid.

The results of predictive analytics models are often displayed to operators through dashboards, which include alerts or notifications. However, these operators are typically not responsible for model development or for understanding when the model is valid or invalid. Therefore, it is up to the model developer to build in that information and make sure the operators are aware of these conditions.

The modeler can decide the best method for dealing with model validity. A common approach is to create boundaries around each of the input signals that enclose most of the training data. For example, a setting of ±2 standard deviations is commonly used to cover approximately 95% of the training data, assuming the data is roughly normally distributed. These boundaries are then turned into static limits for continuous processes, or reference profiles for batch processes (Figure 4).

Figure 4: Seeq was used to create these reference profiles, which were applied against model input parameters to determine when the batch model was invalid. In this graph, the total base added went beyond the model validity period, so all suspected alerts beyond that point in time were suppressed.

Excursions from these model validity bands can then be used to suppress model results, along with associated alerts to operators, when the model is not valid. The model developer should be notified of these excursions, so he or she can extend the training data to expand the model validity range.
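For a continuous process with a single input, the static-limit approach might look like the sketch below. The training temperatures echo the 40°C to 60°C example above, while the function names and the stand-in `quality_model` are hypothetical.

```python
from statistics import mean, stdev

def validity_limits(training_values, n_sigma=2.0):
    """Static validity limits at mean +/- n_sigma sample standard deviations."""
    center = mean(training_values)
    spread = stdev(training_values)
    return center - n_sigma * spread, center + n_sigma * spread

def gated_prediction(model, input_value, limits):
    """Return the model prediction, or None when the input is outside limits."""
    low, high = limits
    if not (low <= input_value <= high):
        return None  # model not valid here: suppress the result and its alerts
    return model(input_value)

def quality_model(temperature):
    """Hypothetical stand-in for a trained predictive quality model."""
    return 0.5 * temperature + 70.0

training_temps = [40.0, 45.0, 50.0, 55.0, 60.0]
limits = validity_limits(training_temps)

print(gated_prediction(quality_model, 50.0, limits))  # inside the limits
print(gated_prediction(quality_model, 70.0, limits))  # outside the limits
```

Returning no result at all when the input is out of range makes the excursion easy to log and report back to the model developer.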

Transferring knowledge from R&D

One limitation with predictive models is the amount of data available about the process to build the predictive model. Since predictive models require a breadth of input parameters to create a wide validity range to predict future events, new processes are often at a disadvantage in the quantity of data available.

However, it is important to realize that in most cases, the data does not have to be data from the same equipment, size, or manufacturing site. Although some parameters will be affected by process configuration and scale up, a research and development (R&D) organization typically attempts to minimize the impact of scale up. Therefore, process data from R&D laboratory experiments, or even manufacturing at other sites, can often be used for model development.

One of the advantages of R&D laboratory experiments is that oftentimes there are design of experiments (DoEs) executed on the process, with a wide range of process inputs tested. These DoEs provide data regarding additional failure modes and wider process parameter ranges than would typically be observed in the manufacturing environment. Using R&D data alongside available manufacturing data provides a much greater model validity range, along with more accurate model development to reduce false positives or false negatives.

Errors will always occur, even when all the correct engineering procedures are followed for model development. As the time horizon for predictions lengthens, these errors increase and confidence in the accuracy of the model decreases. Despite these difficulties, employing the techniques and tips described in this article will substantially decrease false positives and negatives, with corresponding improvements to plant operations and maintenance.

All figures courtesy of Seeq

About The Author

Joseph Reckamp is an analytics engineering group manager at Seeq Corporation, specializing in the pharmaceutical industry. He received his BS and MS in chemical engineering from Villanova University and has worked in the pharmaceutical industry throughout his career, including stints in R&D with GlaxoSmithKline and production with Evonik.
