Designing an Intelligent System for High Reliability Production Line


Manufacturing processes today are under continuous pressure to improve yield, throughput, and ultimately ROI. This brings reliability and maintenance into sharp focus: the goal is to minimize breakdowns and reduce unplanned downtime. To maximize efficiency and operational excellence, maintenance and reliability (M&R) teams are increasingly turning to Industry 4.0 technologies and advanced approaches to reliability.
In this blog post, we will discuss how reliability techniques have progressed and how operations teams are using AI to maximize efficiency.

The story so far

Reliability Centered Maintenance (RCM) has been around for a long time but, due to the historical difficulty and cost of getting real-time data, has focused on Failure Mode, Effects and Criticality Analysis (FMECA) and statistical analyses to determine which equipment to service and when. Now that sensors, data transfer, data storage, and advanced analytics are becoming very cost effective, more powerful RCM approaches like Condition Based Maintenance (CBM) and Predictive Maintenance (PdM) are feasible in more and more applications.

Let us look at the different approaches RCM uses to define maintenance plans:

  • Scheduled, a.k.a. Preventative Maintenance (PM), cycles: occur on a fixed schedule
  • Condition based: occur as needed based on measurements
  • Predictive: use measurements and analytic models to estimate how much time is left before service is needed
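The three approaches above can be contrasted with a minimal sketch. All of the readings, thresholds, and service intervals below are hypothetical, chosen only to illustrate how the triggers differ:

```python
from statistics import mean

# Hypothetical vibration readings (mm/s RMS), one sample per day
readings = [2.1, 2.3, 2.6, 3.0, 3.5, 4.1]

# Preventative: service on a fixed schedule, regardless of condition
def pm_due(days_since_service, interval_days=90):
    return days_since_service >= interval_days

# Condition-based: service when a measurement crosses a threshold
def cbm_due(latest_reading, threshold=4.0):
    return latest_reading >= threshold

# Predictive: extrapolate the degradation trend to estimate days remaining
def pdm_days_remaining(history, threshold=4.0):
    slope = mean(b - a for a, b in zip(history, history[1:]))  # avg daily change
    if slope <= 0:
        return None  # no degradation trend detected
    return (threshold - history[-1]) / slope

print(pm_due(95))                        # True: the schedule has elapsed
print(cbm_due(readings[-1]))             # True: the threshold was reached
print(pdm_days_remaining(readings[:-1])) # estimated days until the threshold
```

Note that only the predictive function gives advance notice: it answers "how long until service is needed" rather than "is service needed right now".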

All of these are alternatives to reactive (break-fix) maintenance. Preventative Maintenance is simple to set up and conservative in preventing failures, but it drains resources with more than the minimum necessary maintenance cycles. The result is lost productivity, higher consumables cost, and more manpower required.
The next approach in the progression is CBM. It can range from simple rules to complex rules, and from manual data collection (e.g., with analysis sensors and inspection tools) to automated periodic collection with screening sensors and low-frequency sampling.
CBM relies on specialized analysis tools for specific types of inspection. The availability of low-cost sensors and wireless connectivity such as LPWAN, 3G/4G cellular, 5G, and Wi-Fi 6 makes retrofitting easier than ever before. CBM algorithms have also made progress: they have moved from working only with simple, univariate rules to detecting more complex conditions using multivariate analysis across a wide range of sensors. SaaS delivery also makes the analytics that exploit this sensing hardware more accessible and always available than before.
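To illustrate that progression, here is a minimal sketch contrasting a univariate threshold rule with a simple multivariate rule that combines each sensor's deviation from a healthy baseline (the sensor names, readings, and limits are hypothetical, and the combined z-score norm is a simplified stand-in for fuller multivariate methods):

```python
from statistics import mean, stdev

# Hypothetical baseline data from two sensors during healthy operation
baseline = {
    "vibration": [2.0, 2.1, 1.9, 2.2, 2.0, 2.1],
    "temperature": [61.0, 60.5, 61.2, 60.8, 61.1, 60.6],
}

# Univariate rule: flag any single sensor past its own fixed limit
def univariate_alarm(sample, limits={"vibration": 4.0, "temperature": 75.0}):
    return any(sample[s] > limits[s] for s in sample)

# Multivariate rule: flag when the combined deviation from the healthy
# baseline is large, even if no single sensor exceeds its own limit
def multivariate_alarm(sample, baseline, z_limit=3.0):
    total = 0.0
    for sensor, history in baseline.items():
        z = (sample[sensor] - mean(history)) / stdev(history)
        total += z * z
    return total ** 0.5 > z_limit  # norm of the per-sensor z-scores

sample = {"vibration": 2.5, "temperature": 63.0}  # each within its own limit
print(univariate_alarm(sample))              # False: no single limit exceeded
print(multivariate_alarm(sample, baseline))  # True: jointly abnormal
```

The multivariate rule catches the condition the univariate rule misses: each sensor looks individually acceptable, but together they sit far from normal operation.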
Taking CBM to the next level is Predictive Maintenance. Here, we look for better precursors and use those precursors to estimate time to failure with more complex modeling. This is the next frontier, enabled by the same technologies and techniques that make CBM such an exciting space. PdM can also examine larger volumes of data, including higher-bandwidth sensing and process parameters. As a result, it finds earlier warning signs, uncovers more unknown unknowns, provides better explanations, and tolerates noise and process variation better.
Being able to say when maintenance is needed is powerful for scheduling both people and production, but it is difficult to do well; it remains at the cutting edge of maintenance strategy.

Major hurdles to predictive maintenance today

Seeking perfection from the outset is one major roadblock. Refusing to start anything unless it works perfectly stalls, and eventually leads to the abandonment of, many valuable projects.
For example, an organization wants to implement predictive maintenance but will do so only if it can predict every single failure mode, even the most complex ones. That is a very high bar to set at the outset of such a project.
Instead, it should focus on finding a single, common failure mode on one piece of equipment in order to see real value from the strategy. This does three things:

  1. It solves a real problem

  2. It engages stakeholders to find more such work

  3. It lets the organization start learning how to do this. It’s not easy, and experience shows there are many systemic issues, such as people, data quality and processes, that will hinder progress. Learning about them this way, on a smaller project, provides great returns at lower risk once the decision is made to scale up.

A key challenge is answering “what is the right problem?” A use-case-based approach works best here: review possible applications with a range of potential participants and see which ones stir an “I have that problem too” moment. The goal is to find a couple of things that resonate, that move us one step closer to showing the art of the possible. The goal is not to find the “perfect” problem that will achieve a 10x ROI in one year.
Decision makers also need to adopt a monitoring mindset rather than a root-cause analysis mindset. Put another way, having and acting on imperfect knowledge now can be more important than having perfect knowledge later. Achieving awareness of what’s going on in the line, at scale, with good-enough data from commodity sensors, driven by analysis tools that your existing operations team can actually use without the help of data scientists or highly credentialed reliability experts, can go a long way to achieving value, and achieving it quickly.
Fear of regulatory non-compliance is another hurdle organizations face. A paper on innovations in pharma manufacturing makes an important distinction between verifying and assisting in verification. Verification is a three-stage process with a high level of analysis and testing required to get approval, and yes, it is difficult to change after that regulatory approval is granted. However, many CBM and PdM techniques can supplement verification rather than replace it. For example, a continuous verification process may have established strict control limits based on months of characterization. Predicting when maintenance is necessary does not replace that verification process. Rather, it assists the process in being more effective by warning when the equipment is likely to start experiencing failures as defined by the verification process’ control limits. With this assistance, the maintenance team can take proactive steps to avoid ever failing in the first place. Assistance like this does not require a costly requalification.
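As a sketch of this kind of assistance, consider a hypothetical monitored parameter drifting toward an upper control limit established by the verification process. The limit, readings, and lead time below are illustrative assumptions, not values from any real qualification:

```python
# Sketch: warn before a monitored parameter drifts into the verification
# process's established control limit (all values below are hypothetical)
UPPER_CONTROL_LIMIT = 5.0   # from the validated verification process
WARNING_DAYS = 14           # lead time the maintenance team wants

def days_until_limit(history, limit=UPPER_CONTROL_LIMIT):
    """Estimate days until the daily readings reach the control limit."""
    daily_drift = (history[-1] - history[0]) / (len(history) - 1)
    if daily_drift <= 0:
        return None  # stable or improving: no warning needed
    return (limit - history[-1]) / daily_drift

readings = [3.0, 3.1, 3.3, 3.4, 3.6]  # one reading per day
remaining = days_until_limit(readings)
if remaining is not None and remaining <= WARNING_DAYS:
    print(f"Schedule maintenance: ~{remaining:.0f} days until control limit")
```

Nothing here changes the validated limits themselves; the sketch only adds an early warning so the team can act before the process ever reaches them.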
Another aspect of “perfection” is framing the problem in terms of data scarcity: “I must have a complete dataset before I can start a predictive maintenance program.” This just isn’t true. Too often, seeking more data is an excuse to delay the uncertain aspects of the work; getting more data is comparatively easy to do and involves a well-defined process. Instead, an intelligence-first approach works much better. The idea behind it is simple: data collection is not the goal, solving problems is, and that goal can be achieved by working on real-time data. Working in this “intelligence-first” way also circumvents two of the biggest issues in dealing with historical data:

  1. It is hard to understand what really happened in the past. Reliability record keeping is widely known to be an area where most organizations struggle.

  2. Subject matter experts are reluctant to revisit old problems because they need to focus on solving today’s problems. By working on issues that are happening now, it is much easier to get the subject matter expert input required to build good models and to make decisions based on those models.

Unfortunately, traditional data-science-based approaches are difficult to use with intelligence-first. Data scientists are amazingly skilled at what they do, but they typically lack domain expertise in the area where the models are applied. This makes them a bit of a fifth wheel that can slow down the process of discovering and interpreting problems. This is why predictive analytics software needs to be usable by operations teams directly: it shortens learning cycles, ultimately leading to results that can scale across the entire plant.

Using software to improve a reliability program

There are three basic approaches organizations take to step up their predictive operations:

  1. Specialized sensor applications

  2. Customized prediction software

  3. Software-defined reliability approach

In a specialized sensor application, organizations implement sensors or instrumented equipment packages built to find and report particular failure modes on certain models of that vendor’s products. This approach works for a limited set of equipment performance issues on a limited set of assets, but it does not account for process issues where the equipment is used. To scale this approach, organizations need vendor-specific solutions for all of their different equipment, covering all of their different issues. This gets very expensive, very fast, and may not even be possible for equipment that was deployed years ago despite having decades of productive life remaining.
The second approach is to use customized prediction software. This approach brings in data scientists to consult with SMEs and model specific problems, which are then implemented in production software. As we have already discussed, this is not a scalable solution: it can be slow to implement and, when based on historical data, does not engage the front-line operations teams in their day-to-day work.
The last option is to use software that takes a general approach to reliability, works with any and all sources of operational data, and can be applied to a wide range of assets and problems, making it easier to train and manage people. This is possible by directly engaging the operations teams instead of data scientists. Such software needs to hide the complexity inherent in AI from end users, enabling them to use it themselves and learn effectively. This kind of intelligence-first approach can start providing actionable insights in a matter of weeks instead of years.

About The Author

Nikunj Mehta is the founder & CEO of Falkonry and a new member of ISA’s Smart Manufacturing & IIoT Division. Nikunj founded Falkonry after realizing that valuable operational data produced in industrial infrastructure goes mostly unutilized in the energy, manufacturing & transportation sectors. Nikunj believes hard business problems can be solved by combining machine learning, user-oriented design & partnerships.

Prior to Falkonry, Nikunj led software architecture & customer success for C3 IoT. Earlier, he led innovation teams at Oracle focused on database technology & led the creation of the Indexed DB standard for databases embedded inside all modern browsers. He holds both Masters & Ph.D. degrees in Computer Science from the University of Southern California. He has contributed to standards at both W3C and IETF and is also a member of the ACM.
