Maintenance Strategy

December 2012
By Bill Lydon, Editor
Maintenance strategy is becoming an important topic because of aging systems in developed countries and a shortage of experienced personnel in all areas of the world. The goal of a maintenance strategy is to achieve the highest automation system availability consistent with the organization’s safety, capital investment, and profit goals. Availability is the fraction of time the system is operational and functioning correctly. Whenever the system is less than 100% available, production profits are lost. What is the best way to achieve higher availability?
Reliability-Centered Maintenance
Reliability-Centered Maintenance (RCM) provides some ideas and thought-provoking questions. RCM establishes or improves a maintenance program using a systematic, structured approach based on the consequences of failure, emphasizing the functional importance of system components and their failure and maintenance history. The concept has its roots in the early 1960s, in maintenance strategies developed for commercial aircraft around the introduction of wide-body jets into airline service. A major concern of airlines was that existing time-based preventive maintenance programs would threaten the economic viability of larger, more complex aircraft. The airlines' experience with the RCM approach was that maintenance costs remained roughly constant while the availability and reliability of their planes improved. RCM is now standard practice for most of the world's airlines.
Technical standard SAE JA1011, Evaluation Criteria for RCM Processes, starts with these seven questions:
1.  What is the item supposed to do and what are its associated performance standards?
2.  In what ways can it fail to provide the required functions?
3.  What are the events that cause each failure?
4.  What happens when each failure occurs?
5.  In what way does each failure matter?
6.  What systematic task can be performed proactively to prevent, or to diminish to a satisfactory degree, the consequences of the failure?
7.  What must be done if a suitable preventive task cannot be found?
Levels of criticality are assigned to the consequences of failure. Some functions are not critical and are left to "run to failure," while other functions must be preserved at all costs. Maintenance tasks are selected to address the dominant failure causes, so the process directly targets maintenance-preventable failures. Failures caused by unlikely events, unpredictable acts of nature, and the like usually receive no action, provided their risk is trivial or at least tolerable. When the risk of such failures is very high, RCM encourages the user to consider changing something that will reduce the risk to a tolerable level.
The goal is a maintenance program that focuses scarce economic resources on those items that would cause the most disruption if they were to fail. RCM emphasizes the use of predictive maintenance techniques in addition to traditional preventive measures.
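The criticality ranking described above can be illustrated with a small sketch. The 1-to-5 scoring scales, the tolerable-risk threshold, and all names (`FailureMode`, `risk_score`, `recommend`) are assumptions made for this example; they are not taken from SAE JA1011, and a real RCM analysis would use the organization's own risk criteria.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    item: str
    description: str
    likelihood: int        # hypothetical scale: 1 (rare) to 5 (frequent)
    consequence: int       # hypothetical scale: 1 (negligible) to 5 (severe)
    proactive_task: bool   # True if a suitable preventive/predictive task exists


def risk_score(fm: FailureMode) -> int:
    """Simple likelihood x consequence score (an assumed ranking method)."""
    return fm.likelihood * fm.consequence


def recommend(fm: FailureMode, tolerable: int = 6) -> str:
    """Map a failure mode to one of three outcomes discussed in the text."""
    if risk_score(fm) <= tolerable:
        return "run to failure"
    if fm.proactive_task:
        return "proactive maintenance task"
    # No suitable task and the risk is not tolerable: change something.
    return "consider a design change"


modes = [
    FailureMode("indicator lamp", "burnout", 2, 1, False),
    FailureMode("feed pump", "seal leak", 4, 4, True),
    FailureMode("controller", "power supply failure", 2, 5, False),
]

# Rank so that scarce resources go to the most disruptive items first.
ranked = sorted(modes, key=risk_score, reverse=True)
for fm in ranked:
    print(f"{fm.item}: score {risk_score(fm)}, {recommend(fm)}")
```

Sorting by score puts the items that would cause the most disruption at the top of the work list, which is the point of the ranking.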
Higher Reliability Devices
The mean time between failures (MTBF) and mean time to repair (MTTR) of devices are important contributors to availability. Certainly, not all plants can afford the redundant and triple-redundant systems deployed in critical production. However, using devices with higher MTBF and lower MTTR can improve availability. These measurements can be applied to existing systems: list the devices that have been unreliable in your system(s) and use that list as a starting point for evaluating weak parts to replace with more reliable ones. It is wise to factor MTBF and MTTR into purchase decisions rather than relying on first cost only. When evaluating products to purchase, ask manufacturers for MTBF and MTTR data. Vendors are typically required to supply this data when quoting large projects and government work.
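To make the link between these figures and availability concrete, here is a minimal sketch using the commonly cited steady-state relationship, availability = MTBF / (MTBF + MTTR). The function names and the device figures are hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(devices: list[tuple[float, float]]) -> float:
    """Availability of a chain in which every device must work.

    Each entry is (mtbf_hours, mttr_hours); the result is the product
    of the individual availabilities.
    """
    result = 1.0
    for mtbf, mttr in devices:
        result *= availability(mtbf, mttr)
    return result


def downtime_hours_per_year(avail: float) -> float:
    """Expected unavailable hours per year for a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical comparison: an existing device vs. a candidate replacement.
existing = availability(mtbf_hours=10_000, mttr_hours=8)
candidate = availability(mtbf_hours=50_000, mttr_hours=4)
print(f"existing:  {existing:.5f}, {downtime_hours_per_year(existing):.1f} h/yr down")
print(f"candidate: {candidate:.5f}, {downtime_hours_per_year(candidate):.1f} h/yr down")
```

The series calculation shows why a short list of weak devices matters: in a chain where every device must work, the overall availability can never exceed that of the worst device.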
Predictive Maintenance
Predictive maintenance refers to maintenance based on the actual condition and performance of a component. Maintenance is not performed according to fixed preventive schedules but rather when certain changes in characteristics are noted. Examples of predictive approaches include corrosion sensors and vibration sensors. There has also been an increase in analytic software used to predict failures based on ongoing real-time analysis of information from the automation system.
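As a rough illustration of condition-based triggering, the sketch below flags a device when a monitored value, such as vibration amplitude, stays above a limit derived from its own healthy baseline. The baseline length, the three-sigma limit, the consecutive-reading rule, and the name `first_alarm_index` are assumptions for this example; commercial analytics packages use far more sophisticated models.

```python
from statistics import mean, stdev


def first_alarm_index(readings, baseline_count=20, sigma_limit=3.0, consecutive=3):
    """Return the index of the reading that raises an alarm, or None.

    The first `baseline_count` readings are treated as healthy and define
    the limit: mean + sigma_limit * standard deviation. An alarm is raised
    once `consecutive` readings in a row exceed that limit, which filters
    out isolated spikes.
    """
    baseline = readings[:baseline_count]
    if len(baseline) < 2:
        raise ValueError("need at least two baseline readings")
    limit = mean(baseline) + sigma_limit * stdev(baseline)

    run = 0
    for i, value in enumerate(readings[baseline_count:], start=baseline_count):
        run = run + 1 if value > limit else 0
        if run >= consecutive:
            return i
    return None


# Hypothetical vibration readings (mm/s): a stable baseline, then a rising trend.
healthy = [2.0, 2.1, 1.9, 2.0, 2.1, 1.9, 2.0, 2.1, 1.9, 2.0]
trend = healthy + [2.1, 2.3, 2.4, 2.5]
print(first_alarm_index(trend, baseline_count=10))
```

Maintenance is then scheduled when the alarm is raised rather than on a fixed calendar interval.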
Improving System Design
System designs can improve reliability and maintainability to optimize availability under a set of constraints, such as time and cost-effectiveness. Generally this requires involving maintenance people in design reviews, since they understand the maintenance issues. Keep this in mind for upgrades and retrofits as well as new projects. A plant walk-through with your maintenance people to discuss the areas they know to be problems can provide the basis for changes that improve reliability.
Better Preventative Maintenance
It is easy to let preventative maintenance tasks go undone when budgets are tight, but this is not wise. Tracking preventative maintenance costs also provides valuable information for decisions about replacing equipment. These cost profiles can provide a major part of the investment justification for upgrades.
Remote Expert Services
Many vendors are starting to offer remote services that allow onsite people to bring in experts electronically to help with difficult problems. This can be as simple as a web meeting using an iPad in the field to show and discuss problems with remote experts, or as advanced as specialized remote interfaces to your systems. Your staff needs to stay knowledgeable about the core parts of your automation system. However, it is difficult to know everything, so remote services help fill the gaps.
Over the last two years, the majority of automation vendors have been promoting their services to customers as a way to lower costs. These services give users another building block for their maintenance strategy. Brochures and presentations aside, achieving efficient operations is ultimately the user’s responsibility, and ongoing maintenance is an important part of it. Many vendors offer outsourced maintenance services, which may be an advantage but need to be analyzed. The biggest factor to consider is the risk of downtime relative to the savings from outsourcing. In-house people can respond more quickly when they have the required training and knowledge. The challenge is to decide, using risk analysis, which items are critical enough to be maintained by trained in-house staff and which can be outsourced.
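One simple way to frame the downtime-versus-savings comparison is to estimate the expected annual downtime cost under each arrangement. The sketch below is a simplified model under stated assumptions: downtime per failure is response time plus repair time, failures occur at a constant average rate, and every figure shown is hypothetical.

```python
def expected_annual_downtime_cost(failures_per_year, response_hours,
                                  repair_hours, cost_per_downtime_hour):
    """Expected yearly cost of downtime for one item.

    Assumes each failure costs (response + repair) hours of lost production.
    """
    hours_per_failure = response_hours + repair_hours
    return failures_per_year * hours_per_failure * cost_per_downtime_hour


def outsourcing_net_benefit(annual_savings, in_house_cost, outsourced_cost):
    """Savings minus the extra downtime cost; positive favors outsourcing."""
    return annual_savings - (outsourced_cost - in_house_cost)


# Hypothetical critical item: 4 failures/yr, 2 h repair, $10,000 per downtime hour.
in_house = expected_annual_downtime_cost(4, 0.5, 2, 10_000)   # 30-minute response
outsourced = expected_annual_downtime_cost(4, 4, 2, 10_000)   # 4-hour response
net = outsourcing_net_benefit(60_000, in_house, outsourced)
print(f"in-house: {in_house:,.0f}  outsourced: {outsourced:,.0f}  net: {net:,.0f}")
```

With these made-up figures, the slower response outweighs the contract savings for this item, while a less critical item with a low downtime cost per hour could easily come out the other way.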