Procedia Engineering 63 (2013) 852 - 860
www.elsevier.com/locate/procedia
The Manufacturing Engineering Society International Conference, MESIC 2013
Methodology for the Maintenance Centered on the Reliability on
facilities of low accessibility
J.A. Sainz*, M.A. Sebastian
Department of Manufacturing Engineering, National Distance University of Spain (UNED), C/ Juan del Rosal 12, 28040-Madrid, Spain.
Abstract
This paper presents the importance of applying a maintenance technique that precisely satisfies the different needs of the production process, independently of the technical complexity of the industrial plant or the difficulty of access to its facilities. This is the case of plants with a high automation level or wind farms located in remote places with low accessibility. In addition, the situations studied have in common a low level of physical operation in their production process.
© 2013 The Authors. Published by Elsevier Ltd.
Selection and peer-review under responsibility of Universidad de Zaragoza, Dpto Ing Diseño y Fabricacion
Keywords: Maintenance; reliability; RCM; wind turbine.
* Corresponding author. Tel.: +34 - 986 - 813648. E-mail address: jgonzalez1100@alumno.uned.es
ISSN 1877-7058. doi:10.1016/j.proeng.2013.08.279
1. Introduction
Added to the difficulty of applying the different maintenance techniques optimally and at a reasonable economic cost, there is the difficulty of doing so in complex industrial facilities. There is therefore a need for a suitable technique that reduces the likelihood of inadequate maintenance, whether through an excess of maintenance activities or a lack of them, Alsyouf (2009). In this way, important economic losses would be avoided without generating safety risks, environmental risks or risks to the working system. Two examples of this type of facility are industrial plants with a high level of automation and flexibility and limited physical operation, and facilities placed in locations that are difficult to access.
Robotized plants and wind farms are examples of these two types, respectively. Advances in the efficiency and productivity of this type of asset are currently considerable. An inadequate maintenance plan may reduce their performance, so such plans should be avoided.
For the assets studied in this paper, the key point lies in detecting failures when they give some indication that they are about to occur. It is then possible to study the trend through the selection, measurement and monitoring of relevant parameters that represent the good operation of the facility under analysis. These parameters can be temperature, pressure, vibration, linear velocity, angular speed, noise level, thickness, dielectric strength or oil viscosity. Continuous monitoring provides a historical record of the characteristic under analysis, Lee (2008), and is extremely useful for repetitive failures. Moreover, this system operates on the assets while they are working, without having to move them from their location, and so avoids the production losses that stoppages represent.
2. Methodology
In the above conditions, applying the RCM maintenance technique is optimal, since the method is used to determine what must be done to ensure that a machine or system continues to perform its functions. Apart from complying with the specifications of the SAE JA 1011 standard, the particular application to limited access facilities requires the development of maintenance techniques that improve management efficiency, through a strategic work plan which synthesizes new developments in a coherent model and makes it possible to evaluate and apply the methods of greatest value. With this technique it is possible to ensure that the physical elements preserve the reliability they possess from their design, manufacture and testing, by performing repairs at the most opportune time so that minor failures are prevented from producing more serious problems, Scarf (2010). To make this possible it is necessary to develop a maintenance plan which includes corrective, preventive, predictive and proactive maintenance tasks, applied according to the criticality of each component, Zhong (2011), and adjusted at all times to the established production targets while optimizing safety and environmental preservation, Yang (2010). Table 1 outlines the principal composition of the RCM methodology, based on which assets are, a priori, more suitable for each maintenance type.
Table 1. Composition of the different techniques to be applied within the RCM by type of asset
For a complete and effective application of RCM to limited access facilities, the reliability of the plant under study must be evaluated using the following methods: FMECA (Failure Mode, Effects and Criticality Analysis), RCA (Root Cause Analysis), mathematical modeling (for example Weibull analysis) and instrumental technology for condition monitoring (CMS: condition monitoring system). These actions, complemented by performance measurement tools such as OEE (Overall Equipment Effectiveness), provide accurate and reliable information about the suitability of the tasks and the improvements in production processes.
2.1 FMECA (Failure Mode, Effects and Criticality Analysis)
It is a methodology intended to identify and analyze potential failures, to quantify the effect that these failures have on the normal operation of the production system in question, and to minimize the effects on production. It facilitates the planning of work activities, Basten et al (2012), and of the strategies to optimize the reliability of the installation, by setting out the functions of each component and its functional failures with their corresponding failure modes. FMECA uses three parameters: severity, occurrence and detection probability, which together form the basis of an equitable risk assessment, the RPN (Risk Priority Number). Quantifying this parameter requires an early qualification of failures, and it allows prioritizing the corrective, preventive, predictive and proactive tasks to be done in order to eradicate or control these failures.
The RPN is calculated with the following equation:
RPN = Severity x Occurrence x Detection Probability = S x O x D (1)
Once the improvement activities derived from the FMECA procedure have been carried out, the RPN is evaluated again, and the process is repeated until the failures are under control. Table 2 shows the FMECA action process schematically.
Table 2. Schematic overview of FMECA process.
[Schematic: RPN calculation; corrective, preventive, predictive and proactive tasks.]
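As a minimal sketch of equation (1), the RPN can be computed for each failure mode and used to rank the maintenance tasks. The 1 to 10 rating scale, the `FailureMode` class and the example failure modes and scores are illustrative assumptions, not values from this study.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One failure mode with its three FMECA ratings (assumed 1-10 scale)."""
    name: str
    severity: int
    occurrence: int
    detection: int

    def rpn(self) -> int:
        # Equation (1): RPN = S x O x D
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings are assumed to lie between 1 and 10")
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return the failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn(), reverse=True)


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("bearing overheating", severity=8, occurrence=4, detection=3),
    FailureMode("gearbox oil degradation", severity=6, occurrence=5, detection=5),
    FailureMode("speed sensor coil break", severity=7, occurrence=2, detection=2),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn()}")
```

After the improvement tasks have been carried out, the ratings would be updated and the ranking recomputed, which mirrors the iterative re-evaluation of the RPN.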
2.2 CMS (condition monitoring system)
To improve the probability of detecting failures, continuous condition monitoring systems (CMS) are installed. They provide real-time information on the selected parameters, Utne et al (2012), such as temperature, amperage and vibration level, and so give early warning of non-standard conditions in important components such as bearings, gearboxes, axles and electric machines. This is done with sensors that collect the operating characteristics of the set of components that form the productive system. Installing these devices and tracking the selected variables enables the detection of non-standard conditions, which facilitates failure analysis and increases the detection probability. Figure 1 shows an example installation of such equipment in a wind turbine with difficult access.
Fig 1. Installing the CMS system in a wind turbine placed in a difficult access location
The early warning obtained in this way enables an equitable mitigation of the risk assessed by the RPN; with CMS the corresponding task categories can be scheduled effectively, producing a decline in failures with a consequent increase in availability and reduction in repair costs. When the data are plotted against time, the result under normal conditions is a straight line; as the measured parameter varies because of an incipient failure, the line progressively bends in proportion to the worsening condition of the component under analysis, until a functional failure is reached. This is known as the PF curve, shown in Figure 2.
Fig 2. PF curve with net PF interval
The curve shows when the failure starts and how the condition deteriorates to the point where the failure can be detected (P); then, if the failure is not corrected, deterioration continues, usually very fast, until it reaches the point of functional failure (F). The time between P and F is called the net PF interval. Figure 3 shows the temperature record of a bearing in which the average temperature increased by 16 degrees Celsius to reach the permissible safety limit set on the machine at 86 degrees Celsius, giving a net PF interval of 30 days.
[Plot: bearing temperature (55 to 85 degrees Celsius) versus date, 19-07-2012 to 18-08-2012.]
Fig 3. Temperature record of a bearing from a CMS system
Condition monitoring does not prevent the failure, and therefore it does not affect the failure rate or the reliability of the equipment; but if the monitoring data are correctly interpreted, they lead to maintenance aimed at eliminating or reducing the consequences of failure. These maintenance tasks should be effective enough to permanently remove the root cause.
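The trend analysis behind the PF curve can be sketched as a straight-line fit to the monitored parameter, extrapolated to the alarm limit. This is only an illustration: the linear model, the function names and the synthetic readings are assumptions; only the 86 degrees Celsius limit and the 16 degree rise over 30 days are taken from the bearing example in this section.

```python
def fit_line(days, values):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(days)
    if n < 2:
        raise ValueError("at least two readings are needed")
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("readings must be taken at different times")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_limit(days, values, limit):
    """Days from the last reading until the fitted trend reaches the limit.

    Returns None when the trend is flat or falling (no crossing expected).
    """
    slope, intercept = fit_line(days, values)
    if slope <= 0:
        return None
    crossing_day = (limit - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Synthetic readings: 70 degrees C at day 0, rising 16 degrees in 30 days.
days = list(range(0, 16))                  # readings for days 0 to 15
temps = [70 + 16 / 30 * d for d in days]   # assumed perfectly linear trend
remaining = days_until_limit(days, temps, limit=86.0)
print(f"Estimated days until the 86 C limit: {remaining:.1f}")
```

A real record would be noisy and the deterioration is usually faster near the functional failure, so a straight-line estimate of this kind tends to be optimistic and would need to be refreshed with each new reading.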
2.3 RCFA (Root Cause Failure Analysis)
Root cause analysis is a method that makes it possible to solve problems for any type of failure by identifying the causes that trigger it. It is based on a logical process consisting of the analysis of the events in a failure, through which the root causes are discovered. RCFA also has the advantage of being able to establish a pattern of errors in the machine as a whole, substantially reducing the periods of non-availability and non-functionality. The steps of the RCFA methodology are: a rapid response to a non-standard condition, preserving as much valid evidence as possible; a reliable verification of the root causes; and an additional RCFA after a reasonable time. This analysis quickly leads to the conclusion that root causes result in two distinct types of failure:
• Sporadic failures, which correspond to a deviation from the standard in normal operation; as soon as the root cause that triggered the failure is eliminated, everything returns to normal. After a sporadic failure has been eliminated, the system almost always returns to normal functionality. The analysis shows that these events are usually rare and almost never related to other events of the same type.
• Chronic failures, which are very frequent events; when they are eliminated or their root causes controlled, functionality is restored to its peak and the expected work level rises. These failures are difficult to control or eradicate, which is only achieved by applying failure analysis, and they are often accepted as a normal part of production costs. When these failures are aggregated over a period of time and combined with other chronic failures that may exist, they affect the production of the machine very noticeably. Finding and controlling the root causes of these failures results in increased availability rates, reduced production losses and maximized productivity and competitiveness of the productive system.
Figure 4 shows graphically the availability losses caused by chronic failures versus sporadic failures in a spooler machine. Availability is understood here as the property of a system to perform the functions intended for it, keeping its ability to work under the prescribed regimes and operating conditions during the required time interval. The limited access to the machine is due to the high temperature at which it works and the cooling time needed before it can be operated on.
Fig 4. Loss of availability due to chronic failures versus sporadic failures
The figure shows how chronic failures constantly subtract availability, mainly because of the difference between the current average work capacity and the theoretical design capacity, and because of the numerous short interruptions (micro stops) that the machine has in its normal operation. In contrast, the sporadic failure shown, whose root cause was a broken speed sensor coil, produced a complete stop at the time of the fault but was quickly solved by replacing the sensor, after which the machine returned to normal function.
2.4 Modeling
To analyze the nature of the operational phenomena in facilities and equipment, statistics are very useful as a support for quantifying the parameters. The historical behavior of the phenomena is characterized from the operation and failure periods that have occurred since commissioning. The conditions that determine the operating time of the equipment are so numerous that it is not possible to say exactly when the next failure will occur. However, it is possible to express the probability that the equipment is in operation or out of service at any given time. These times are associated with the cumulative distribution function of a random variable, defined as the sum of the probabilities of the possible values of the variable that are lower than or equal to a preset value. The random variable consists of the operating times and downtimes of the equipment or system in a given period. The Weibull distribution is very appropriate for its parameterization, as it is effective and relatively simple to use in the reliability evaluation of a system: it quantifies the probability of failure in the performance of the system's functions from the failure probabilities of its components, based on their operation times. It has three parameters:
• The shape factor a, also denoted β (beta), which determines the form of the density, distribution and failure rate functions and represents the regularity of failure occurrence. It takes three ranges of values, β < 1, β = 1 and β > 1, which define the three phases of the "bathtub curve" shown in Figure 5.
Fig 5. "Bathtub curve"
This curve shows the evolution of the failure rate over time and the corresponding value of the beta shape parameter of the evaluated equipment. Table 3 shows the characteristics of each phase with its main causes, and how the effects can be mitigated in each phase.
• The parameter b, also called characteristic life, which is the time interval between the end of the guaranteed lifetime and the time by which failure can be expected to have occurred with 63% probability.
• The guaranteed lifetime c, the period during which failures do not occur; failures only become significant after time c has passed.
Figure 6 shows a study based on the Weibull distribution for the rear bearing of an alternator in a 2 MW wind turbine, quantifying the energy produced (MWh) instead of the lifetime. From the Weibull analysis it is determined that the root cause of the failure is wear-out, as the beta parameter is 2.4. The bearing has a guaranteed life of 8,030.02 MWh and, after generating a further 6,533.85 MWh beyond that production, it will have a 63% probability of having failed.
Fig 6. Study based on Weibull distribution for an alternator bearing
Table 3. Performance in the areas of the "bathtub curve"
Phase | Characteristic | Caused by | Diminished by
Phase I: running-in or infant mortality | Failure rate decreases gradually with time | Manufacturing defects; defective components; poor quality; conditions outside standard; inadequate design | Operation review; quality control; fault analysis
Phase II: maturity or lifetime | Beta value and failure rate remain fixed | Unsuitable maintenance; random loads higher than expected; unexpected events or fortuitous failure causes | Component redesign; redundant system; process and operation technical review
Phase III: aging | Failure rate increases | Fatigue; corrosion; aging; friction; cyclic loads; hidden defects; low safety factors | Proactive tasks; planned component replacement; conservation state analysis
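The three-parameter Weibull model described in this section can be sketched as follows. The standard form F(t) = 1 - exp(-((t - c) / b)^a) for t > c is assumed here, with a the shape factor (beta), b the characteristic life and c the guaranteed lifetime. The function names are illustrative, and the numerical values are those quoted for the alternator bearing, read as 8030.02 and 6533.85 units of production.

```python
import math


def weibull_failure_probability(t, shape, scale, guaranteed):
    """Cumulative probability of failure F(t) for a three-parameter Weibull."""
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if t <= guaranteed:
        return 0.0  # no failures are expected within the guaranteed lifetime
    return 1.0 - math.exp(-(((t - guaranteed) / scale) ** shape))


def weibull_failure_rate(t, shape, scale, guaranteed):
    """Failure rate h(t); decreasing, constant or increasing with the shape."""
    if t <= guaranteed:
        return 0.0
    return (shape / scale) * ((t - guaranteed) / scale) ** (shape - 1)


def bathtub_phase(shape):
    """Map the shape factor to the three phases of the bathtub curve."""
    if shape < 1:
        return "Phase I: running-in or infant mortality"
    if shape == 1:
        return "Phase II: maturity or lifetime"
    return "Phase III: aging"


# Values quoted for the alternator bearing (shape, characteristic life,
# guaranteed life), in units of production rather than time.
shape, scale, guaranteed = 2.4, 6533.85, 8030.02
f_at_characteristic_life = weibull_failure_probability(
    guaranteed + scale, shape, scale, guaranteed
)
print(bathtub_phase(shape))
print(f"F(c + b) = {f_at_characteristic_life:.3f}")
```

Evaluating F at c + b gives 1 - 1/e whatever the shape factor, which is where the 63% figure attached to the characteristic life comes from.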
2.5 OEE (Overall Equipment Effectiveness)
The OEE, or Overall Equipment Effectiveness, is a parameter used to measure the production efficiency of industrial machinery with respect to its ideal equivalent, Relkar and Nandurkar (2012). Based on the fundamental principles that you cannot manage what is not measured and that measuring what is not managed is useless, its application in this work is invaluable. OEE is calculated with the following equation, in which all terms are expressed as percentages:
OEE = Availability x Performance x Quality
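A minimal sketch of this calculation follows. The function name and the sample percentages are illustrative assumptions, not measurements from this work; each factor is entered as a percentage, as in the equation above, and converted to a fraction before multiplying.

```python
def oee(availability_pct, performance_pct, quality_pct):
    """Overall Equipment Effectiveness, returned as a percentage."""
    factors = (availability_pct, performance_pct, quality_pct)
    for value in factors:
        if not 0 <= value <= 100:
            raise ValueError("each factor must be a percentage from 0 to 100")
    result = 1.0
    for value in factors:
        result *= value / 100.0  # convert each percentage to a fraction
    return result * 100.0


# Hypothetical values for illustration only.
example = oee(availability_pct=90.0, performance_pct=95.0, quality_pct=99.0)
print(f"OEE = {example:.1f}%")
```

Because the three factors multiply, a moderate loss in each one compounds into a noticeably lower OEE, which is why the indicator is useful for locating where the losses originate.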
If, because of an inadequate maintenance plan, a machine works below its effectiveness, operational resources are wasted and profitability suffers accordingly, Sun et al (2008). OEE measurement can identify the problems that generate work losses, making it possible to work on their causes and eliminate them systematically. When maintenance tasks require significant financial investment, investors should expect tangible results, and if these results are not satisfactory, the investment must be redesigned. With OEE it is possible to continuously monitor, measure, identify and quantify the efficiency of the maintenance process, assessing how much a failure affects the whole facility and facilitating the evaluation of the different improvement proposals. This process gives a balanced view of the most important elements on which to initiate specific, targeted improvements to the problems identified.
A better OEE means greater reliability and availability of the equipment, and therefore the maintenance technique applied at a site with difficult access, such as the wind turbine shown in Figure 7 located at about 2000 meters, will be more effective and efficient.
3. Conclusions
Flexible factories with a high level of automation, or facilities placed in locations that are difficult to access, need a detailed definition of the parameters that are fundamental to good functioning, such as clear goals, suitable information systems for decision making, control of the relevant maintenance activities and research on technological management, in order to reach optimal levels of reliability through the maintenance operations performed. The aim of the maintenance tasks is to keep the equipment in good condition through a sound implementation of RCM which, together with the FMECA and RCA techniques, mathematical modeling, CMS and the measurement of results through OEE, makes it possible to find and eliminate the causes of failures and anomalies, aiming at zero failures and maximum operational readiness. The organization of the maintenance system can thus have a detailed, global and specific foresight of the actions and routes to be taken. This methodology also allows all the services and resources needed for the maintenance operations to be defined in advance, establishing methods that make the continuous improvement of activities and management possible and contributing to the required efficiency.
Fig 7. Wind turbine representing a facility of low accessibility placed at a height of 2000 m.
References
A.I. Alsyouf. Maintenance practices in Swedish industries: Survey results. International Journal of Production Economics, 121 (2009), pp. 212-213.
A.S. Relkar, K.N. Nandurkar. Optimizing & Analysing Overall Equipment Effectiveness (OEE), through design of experiments. Procedia Engineering, 38 (2012), pp. 2973-2980.
D.L. Yang, S.J. Yang. Minimizing the makespan on single-machine scheduling with aging effect and variable maintenance activities. Omega, 38 (2010), pp. 528-533.
I.B. Utne, T. Brurok, H. Rodseth. A structured approach to improved condition monitoring. Journal of Loss Prevention in the Process Industries, 25 (2012), pp. 478-488.
J. Sun, L. Xi, S. Du. Reliability modeling and analysis of serial-parallel hybrid multioperational manufacturing system considering dimensional quality, tool degradation and system configuration. International Journal of Production Economics, 114 (2008), pp. 149-164.
L. Zhong, S. Youchao, O. Ekene. Disassembly sequence planning for maintenance based on metaheuristic method. An International Journal, 83 (2011), pp. 138-145.
P.A. Scarf, C.A. Cavalcante. Hybrid block replacement and inspection policies for a multi-component system with heterogeneous component lives. European Journal of Operational Research, 206 (2010), pp. 384-394.
R. Basten, M.C. Heijden, J.M.J. Schutten. Joint optimization of level of repair analysis and spare parts stocks. European Journal of Operational Research, 222 (2012), pp. 474-483.
S. Lee. Product lifecycle management in aviation maintenance, repair and overhaul. Computers in Industry, 59 (2008), pp. 296-303.