Google for Industry: Taming Big Data

By Bert Baeck, CEO, TrendMiner 

We all see how the Internet has transformed the world. However, we often forget how difficult it was to navigate the web in the early days of the Digital Age. Prior to the introduction of directories and search engines like AltaVista, Yahoo and Google, locating the information you needed in the vast amounts of data stored on the web was time-consuming and frustrating.

The same comparison applies to companies trying to leverage the huge amounts of data collected by their various systems. Companies have been very good at amassing data, but until now they have lacked the affordable and effective tools needed to search and interpret it for actionable information.

The Challenge of Big Data

For many years, various automation software vendors have been trying to simplify data retrieval and representation. We all know more in-depth insight into operations and processes boosts productivity and cuts costs, but accessing the right information quickly hasn’t been easy or economical.

When process historians were introduced in the 1980s, their function was to store process data and generate reports to satisfy regulatory requirements. They were not designed to be easily mined or to have their data visualized for predictive analytics.

In more recent years, leading companies have realized that accessing and interpreting the historical time series data captured by historians could optimize operations. They knew a lot of valuable time series data was already gathered in process historians, but only five percent was being transformed into actionable information. It was simply too hard, too expensive and too time-consuming for their engineers to use historian data for process improvement.

Industrial Internet of Things (IIoT)

In addition to the huge amounts of data being collected every day in plants across the world, the industrial sector is undergoing another revolutionary transformation: the Industrial Internet of Things (IIoT) or Industry 4.0. IIoT proposes a new way to improve business intelligence and consequently boost operational efficiency using data-driven analytics.

Along with increased functionality, IIoT promises to create even more data. It will introduce new types of data and formats, increase data volume and ideally transform operational decision making from reactive to proactive. 

Companies wanting to benefit from big data will require a new approach to accessing information and analytics.  Many traditional architectures and processes will not be able to deliver timely insights, resulting in missed opportunities.

Another crucial paradigm shift in the IIoT transformation, triggered by the abundance of data, is that it’s not about ‘seeing everything’ anymore. Enormous amounts of data will quickly overwhelm anyone not able to easily locate specific information; therefore, the ability to manage and interpret information will be the key to success.

To be successful in today’s world, companies need to unlock their historian data and demand analytics solutions that help workers find information fast with minimal effort.

A History Lesson

Early attempts to transform raw data into useful actionable information relied on data modeling. The problem with this approach is that it usually requires complex IT projects and data scientists to build and maintain models. This makes the projects time-consuming and expensive, so only a few searches are implemented.

We developed the concept of Google for Industry, an approach for easily accessing specific information from historians, from our experiences in the process industries. It was invented by former process engineers from Covestro (then known as Bayer MaterialScience) in 2008. 

These engineers had worked with nearly all types of analytics models and identified their limitations for scaling up beyond pilot projects. They started with the same problem faced by nearly all process engineers: how to execute searches quickly and intuitively, just as people use Google in their daily lives.

Google for Industry is Born

One of the first challenges our engineers faced was that existing solutions were based on data modeling. Data modeling solutions required a data scientist, making them very time-consuming and expensive to implement. Moreover, they were sensitive to change and not flexible at all.

All modeling technologies must go through the same steps:

  1. Data preparation
  2. Data modeling
  3. Data validation
  4. Bringing the model live

Therefore, each time a model is changed, it must go through the same cycle. Not only is data modeling time-consuming, it was also not designed for dynamic processes, because it is often based on assumptions about stationarity and data distribution that do not hold in a real process with variables.

Data Modeling Difficulties

Requires significant engineering: Data cleaning, filtering, modeling, validating, iterating on results/models

Sensitive to change: Users needed continual training

Requires data scientist: Plants had to hire additional workers, or engineers spent too much time trying to be data scientists

Not plug and play: Installation and deployment required significant investments in time and money

Black box engineering: User cannot see how results are determined

In one of our initial projects, the manager of a chemical plant struggled with significant variance between operator teams, which had a direct impact on throughput. He wanted to know if we could learn from the data what the ‘Formula 1’ shift operator team was doing differently. To make that possible, we had to search through the entire process data history and figure out what different ‘manual actions’ were taken in similar events that led to better or worse plant performance. We knew there had to be a better way.

This was the impetus for creating a solution that replaced the labor-intensive data modeling approach with pattern recognition.

Ever wondered how Shazam, the app that helps identify a song from a small sample, works? Shazam looks for patterns in the song that match patterns in its database. What’s more interesting is that it ignores most of the song’s millions of “data points” and focuses on a few intense moments of “high energy content.” Seemingly, this approach would create a lot of wrong matches, but Shazam is very accurate, even in environments with a lot of background noise.

The key to Shazam’s success is that it searches for and recognizes distinctive patterns instead of trying to match all aspects of a song. And it does so quickly. Shazam understood that the key to enabling big data analytics is to focus on patterns rather than trying to crunch all the data.

The Shazam example is a simplification of the more complex search performed by our Google for Industry concept. Our engineers created a user-friendly, highly sophisticated search for process data in which advanced pattern recognition algorithms find either similar or deviating behavior. Users just choose a reference period over one or more tags, and the system then finds similar behavior throughout the entire data history.
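To make the idea of a reference-period search concrete, here is a minimal sketch in Python. It is an illustration under simple assumptions, not the pattern recognition algorithm used in the software: it slides a window over the history of a single tag, normalizes each window so that shape rather than absolute level is compared, and ranks the windows by Euclidean distance to the reference. The function names `z_normalize` and `find_similar` are hypothetical.

```python
import math


def z_normalize(window):
    """Scale a window to zero mean and unit variance so that shape, not level, is compared."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)  # a flat window has no shape to compare
    return [(v - mean) / std for v in window]


def find_similar(history, reference, top_k=3):
    """Return up to top_k (start_index, distance) pairs for the non-overlapping
    windows of `history` that most resemble `reference`."""
    ref = z_normalize(reference)
    n = len(reference)
    scored = []
    for start in range(len(history) - n + 1):
        win = z_normalize(history[start:start + n])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(win, ref)))
        scored.append((dist, start))
    scored.sort()  # smallest distance first

    matches = []
    for dist, start in scored:
        # Skip windows that overlap a match we already kept.
        if all(abs(start - kept) >= n for kept, _ in matches):
            matches.append((start, dist))
        if len(matches) == top_k:
            break
    return matches
```

Because each window is normalized, a bump from 2 to 6 and back counts as the same pattern as a bump from 1 to 3 and back. Whether that is desirable depends on the question being asked; a search for similar absolute values would skip the normalization step.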

It didn’t take long for the Google for Industry concept to grow. People wanted to search for far more than similar patterns in the past. Thus, we extended the search capabilities to basically everything typically sought in daily searches: behavioral patterns, slopes, operator actions, certain switch patterns, Boolean conditions based on tags, drift, oscillation of a certain frequency, anomalies, event frames, context, and more.

Our Google for Industry approach started with search, but its vision goes much further. We knew it was the first step in creating a fast, affordable and user-friendly method for unlocking historian data to help industrial companies find new areas of improvement.

How It Works

We have developed this software with a high-performance discovery analytics engine for process measurement data to deliver the power and ease of Internet search engines to industrial applications. Through an intuitive web-based trend client, users can start searching for trends with patent-pending pattern recognition and machine learning technologies.

Using value or digital-step searches to filter time periods in or out, or finding something that looks similar, is just the beginning. Process engineers can also search for particular operating regimes, process drifts, operator actions, process instabilities or oscillations.
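A value-based search of this kind can be pictured with the short Python sketch below. It is only an illustration and makes several assumptions: samples are evenly spaced, each sample is a dictionary of tag values, and the tag names `temperature` and `valve_open` as well as the function `find_periods` are hypothetical. The function returns the periods during which a Boolean condition on the tags holds for a minimum number of consecutive samples.

```python
def find_periods(samples, condition, min_length=1):
    """Return inclusive (start_index, end_index) pairs for every run of at least
    `min_length` consecutive samples for which `condition(sample)` is true."""
    periods = []
    start = None  # index where the current run began, or None if not in a run
    for i, sample in enumerate(samples):
        if condition(sample):
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                periods.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the data.
    if start is not None and len(samples) - start >= min_length:
        periods.append((start, len(samples) - 1))
    return periods


# Example: periods where the reactor is hot while the valve is open.
temperatures = [70, 85, 90, 88, 75, 82, 95, 96]
valve_states = [True, True, True, False, True, True, True, True]
samples = [
    {"temperature": t, "valve_open": v}
    for t, v in zip(temperatures, valve_states)
]
hot_and_open = find_periods(
    samples,
    lambda s: s["temperature"] > 80 and s["valve_open"],
    min_length=2,
)
print(hot_and_open)
```

The resulting periods can then be used to filter time ranges in or out before running a further search.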

Root cause analysis is very important to keep operations running as efficiently as possible. The causal and influence factor search algorithms can show users the underlying reasons behind process anomalies or a deviating batch.

Comparing behavior between an anomaly and a normal operating period is often a painful process. This software instead uses advanced search and diagnostic capabilities that allow users to compare a large number of transitions, focusing on both similarities and differences. This provides a faster and more accurate analysis of a continuous production process.

In addition, live displays let users see process values as they evolve, while the software can also predict the most probable future evolution of these values by matching them against historical values.
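One simple way to picture prediction by historical matching is a nearest-neighbor forecast: find the stretch of history that looks most like the latest values and report what happened next. The Python sketch below illustrates that general technique only; it is not a description of how the software computes its predictions, and the function name `forecast_from_history` is hypothetical.

```python
def forecast_from_history(history, recent, horizon):
    """Find the historical window closest to `recent` (squared Euclidean distance)
    and return the `horizon` values that followed it. Returns [] if the history
    is too short to contain a window plus its continuation."""
    n = len(recent)
    best_start = None
    best_dist = float("inf")
    # Only consider windows that still have `horizon` samples after them.
    for start in range(len(history) - n - horizon + 1):
        window = history[start:start + n]
        dist = sum((a - b) ** 2 for a, b in zip(window, recent))
        if dist < best_dist:  # strict comparison keeps the earliest best match
            best_start, best_dist = start, dist
    if best_start is None:
        return []
    return history[best_start + n:best_start + n + horizon]
```

A more careful version would average the continuations of several close matches and report their spread, so that the user sees a likely range rather than a single line.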

From Reactive to Proactive Decision Making

The idea is not just to examine past process behavior more effectively, but to optimize operations through proactive decision making.

Firstly, we provide a means to capture this information from multiple users – process engineers, operators, maintenance staff, etc. – in one single environment connected to the relevant process data in order to provide context.

The next step is creating a golden profile by using search features to find the best transitions or batches of a given type among multiple historical transitions. The golden profile is then used for predictive monitoring. Users can take a live view of a process and overlay it on the golden profile to verify that recent changes are behaving as expected. This allows users to proactively adjust settings to reach optimal performance, or to test whether changes will produce the right results before they are implemented.

The software also provides sophisticated alarm capabilities. Instead of merely sending an alert that a problem is occurring, the software can help to configure meaningful alarms that prevent problems from happening.

For example, a process engineer can review the operators’ annotations of a problem that occurred in the past and then perform a search that finds four comparable anomalies. Based on an overlay of the resulting periods, the engineer can easily determine alarm limits to prevent a similar occurrence in the future.
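As a rough sketch of the golden profile idea, the Python below builds a band from a handful of good batches and flags the points where a live run leaves it. The assumptions are ours, for illustration only: batches are already aligned and of equal length, the band is the pointwise mean plus or minus a multiple of the standard deviation, and the names `build_golden_profile` and `deviations` are hypothetical.

```python
import statistics


def build_golden_profile(batches, tolerance=3.0):
    """Return one (lower, mean, upper) tuple per time step, computed across
    the aligned, equal-length `batches` selected as the best runs."""
    profile = []
    for values in zip(*batches):  # all batch values at the same time step
        mean = statistics.mean(values)
        spread = statistics.pstdev(values)
        profile.append((mean - tolerance * spread, mean, mean + tolerance * spread))
    return profile


def deviations(live, profile):
    """Return the time-step indices where the live value falls outside the band.
    A live run shorter than the profile is checked only as far as it has progressed."""
    return [
        i
        for i, (value, (lower, _, upper)) in enumerate(zip(live, profile))
        if not lower <= value <= upper
    ]
```

Note that where all selected batches agree exactly, this band collapses to a single value, so a practical version would enforce a minimum band width.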

As we have seen, user-friendly search combined with predictive analytics, both driven by powerful algorithms, makes proactive decision making much simpler. This leads to a much more productive solution for the process engineer.