CM Analysis

Our efforts to prevent failures are of little use if, when failures do happen, we cannot respond adequately. We must also remember that a high percentage of maintenance man-hours goes to solving equipment failures that maintenance did not detect itself, but that were reported by production staff. This percentage varies widely among companies: from those where 100% of the work is corrective maintenance and not even lubrication is scheduled, to the very few in which all interventions are planned. As a rough estimate, on average over 70% of total maintenance time is spent solving unscheduled failures.

Managing corrective maintenance effectively means:

  • Carrying out interventions quickly, so that the equipment is returned to service in the shortest possible time (low MTTR, mean time to repair)
  • Carrying out reliable interventions, and adopting measures so that the failure does not recur for a sufficiently long period (high MTBF, mean time between failures)
  • Consuming the fewest possible resources (both labour and materials)
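
As a minimal sketch of how these two indicators are commonly computed, assume a simple log of repair durations and the total operating hours over the same period. The function names and figures are illustrative and do not come from any particular plant.

```python
def mttr(repair_hours):
    """Mean time to repair: average duration of the recorded repairs."""
    return sum(repair_hours) / len(repair_hours)


def mtbf(operating_hours, failure_count):
    """Mean time between failures: operating time divided by number of failures."""
    return operating_hours / failure_count


# Hypothetical data: three repairs during 900 hours of operation.
repairs = [2.0, 4.0, 3.0]
operating = 900.0

print("MTTR (h):", mttr(repairs))
print("MTBF (h):", mtbf(operating, len(repairs)))
```

A falling MTTR together with a rising MTBF over successive periods would indicate that the first two goals are being met.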

The time needed to return equipment to service after a failure is distributed as follows:

  • Detection time. The time between the origin of the problem and its detection. Detection time and total resolution time are related: in general, the sooner a fault is detected, the less damage it will have caused and the easier and cheaper it will be to repair. This time can be reduced by developing ways to detect failures in their initial phase, such as daily routine inspections, verification of operating parameters, and adequate training of production personnel.
  • Communication time. The time between detection of the problem and locating the maintenance team. It depends heavily on the information and communication systems that reach maintenance personnel and their managers. A good maintenance organization keeps this time very short, even negligible within the total elapsed time. Reducing it requires an agile communication system involving as few people as possible, and a means of reaching maintenance staff without having to look for them (mobile phones, walkie-talkies, pagers, etc.).
  • Waiting time. The time between communication of the failure and the start of the repair. It includes waiting for workers to be free to attend to the incident, the paperwork needed before intervening (stopping the equipment, requesting a work order, obtaining a work permit, isolating the equipment, etc.) and the staff's travel from wherever they are to the site of the incident. This time is affected by several factors, among others: the number of maintenance workers available, the complexity or simplicity of the work order management system, the safety measures that must be taken, and the distance from the maintenance workshop to the plant. It can be reduced with a properly sized staff, agile management of work orders and work permits, and a minimal distance between the workshop and the equipment (the optimal location for the maintenance workshop is therefore the center of the plant).
  • Diagnosis of the breakdown. The time the maintenance operator needs to determine what is happening to the equipment and how to fix it. It is affected by staff training and experience, and by the quality of the available technical documentation (drawings, breakdown history, lists of breakdowns and solutions, etc.). It can be reduced by keeping drawings and manuals near the equipment (which is not always possible) and by drawing up breakdown lists that detail the symptoms, causes and solutions of failures that have happened in the past or that may occur.
  • Collection of tools and technical means. Once it is clear what has to be done, the personnel responsible for the repair may need some time to bring the necessary means to the place of intervention. This time is usually affected by the distance from the workshops or tool stores to the place of intervention, by whether operators anticipate and bring the tooling they expect to need when an intervention is communicated, and by the amount of resources available in the plant. To reduce it, it is advisable to locate the workshops properly (see waiting time above), to acquire ‘healthy’ habits such as attending breakdowns with a standard toolbox, and to equip the workshop with the resources that may be needed for the type of equipment in the plant.
  • Collection of spare parts and materials. The time until the materials needed for the intervention are delivered. It includes the time needed to locate the spare parts in the warehouse (if they are in stock); to place the relevant orders (if they are not) and have the supplier deliver them to the plant; to condition them (if some preliminary work is required); to verify that they meet their specifications; and to bring them to the place of use. This time is affected by the amount of material held in stock, the organization of the warehouse, the agility of the purchasing department, and the quality of the suppliers. To optimize it, you need a properly sized warehouse, a fast and efficient purchasing service, and quality suppliers with a service-oriented attitude.
  • Repair of the breakdown. The time needed to fix the problem so that the equipment is ready to produce again. It is strongly affected by the extent of the problem and by the knowledge and skills of the personnel involved in resolving it. To optimize it, it is necessary to have a preventive maintenance system that avoids major breakdowns, and an effective, motivated and well-trained staff.
  • Functional tests. The time needed to verify that the equipment has been properly repaired. Time spent on functional testing is usually a good investment: if equipment is not returned to service until it has been shown to meet all its specifications, the number of work orders decreases, and with it all the times listed in the first six points. This time depends primarily on which tests are chosen. To optimize it, it is desirable to determine the minimum set of tests that verifies the equipment is in perfect condition, and to write protocols and procedures that clearly detail which tests are needed and how to carry them out.
  • Commissioning. The time between the complete resolution of the failure and the equipment being put back into service. It is affected by the speed and flexibility of communications. To optimize it, as with communication time, you need effective communication systems and agile administrative procedures that do not delay the return of the equipment to service.
  • Report writing. The maintenance documentation system must record at least the most important incidents in the plant, with an analysis detailing the symptoms, cause, solution and preventive measures. Section 4.5, Failure Analysis, discusses in detail what these reports should contain.

It is easy to see that, within the total time needed to resolve an incident or breakdown, the repair time itself can be very small. It is also easy to see that maintenance management strongly influences this total: at least 7 of the 10 times above are affected by the organization of the department.

Very few companies collect and analyze the time spent in each of these phases, because the individual times are difficult to separate from one another. Although recording these times for every corrective intervention can be tedious and unprofitable (the high cost would not be justified by the savings the study could achieve), it is important to take occasional samples to learn how the downtime of production equipment is distributed. The conclusions can be valuable in deciding which low-cost actions could reduce the MTTR.
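
An occasional sampling study of this kind needs little more than a tally. The sketch below assumes a few hypothetical sampled interventions, each recording the minutes spent per phase, and reports each phase's share of the total downtime. The phase names and figures are illustrative only.

```python
from collections import defaultdict


def phase_shares(samples):
    """Return each phase's fraction of the total downtime across all samples."""
    totals = defaultdict(float)
    for sample in samples:
        for phase, minutes in sample.items():
            totals[phase] += minutes
    grand_total = sum(totals.values())
    return {phase: minutes / grand_total for phase, minutes in totals.items()}


# Hypothetical sampled interventions (minutes per phase).
samples = [
    {"detection": 30, "communication": 10, "waiting": 60, "repair": 40, "testing": 20},
    {"detection": 10, "communication": 10, "waiting": 40, "repair": 60, "testing": 20},
]

shares = phase_shares(samples)
# Print phases from largest to smallest share of downtime.
for phase, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{phase:15s} {share:.0%}")
```

In this invented sample, waiting accounts for as much downtime as the repair itself, which is the kind of finding that points to low-cost organizational actions.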

Active corrective maintenance time can be analyzed as seven steps as per MIL-HDBK-472, and the mean time to repair (MTTR) can be determined from the probability that each repair will be necessary and the average time required to perform that repair. The steps include:

  • Localization. Determining the system fault without using test equipment.
  • Isolation. Verification of the system fault using test equipment.
  • Accessing the fault.
  • Interchange. Replacing or repairing the fault.
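
One common way to read the MTTR statement above is as a weighted average: each possible repair contributes its average duration, weighted by how likely it is to be needed. The sketch below assumes each repair task comes with a relative likelihood (for example a failure rate) and an average repair time. The task values are hypothetical and the function is an illustration, not the handbook's own procedure.

```python
def weighted_mttr(tasks):
    """Weighted mean repair time.

    tasks: iterable of (weight, mean_repair_hours) pairs, where weight is the
    relative likelihood that the repair will be needed.
    """
    tasks = list(tasks)
    total_weight = sum(weight for weight, _ in tasks)
    return sum(weight * hours for weight, hours in tasks) / total_weight


# Hypothetical repair tasks: (likelihood, average hours to repair).
tasks = [
    (0.5, 1.0),  # frequent, quick repair
    (0.3, 2.0),
    (0.2, 4.0),  # rare, slow repair
]

print(f"MTTR: {weighted_mttr(tasks):.2f} h")
```

Because the weights are normalized inside the function, they can be given as probabilities, failure rates or simple failure counts.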

BIT

Complex electronic systems such as laboratory instruments, avionics, communications networks and process control systems now frequently include built-in test (BIT) facilities. BIT consists of additional hardware and software used to carry out functional tests on the system. BIT might be designed to be activated by the operator, or it might monitor the system continuously or at set intervals.

BIT can be very effective in increasing system availability and user confidence in the system. However, BIT inevitably adds complexity and cost and can therefore increase the probability of failure. Additional sensors might be needed as well as BIT circuitry and displays. In microprocessor-controlled systems BIT can be largely implemented in software.

BIT can also adversely affect apparent reliability by falsely indicating that the system is in a failed condition. This can be caused by failures within the BIT, such as failures of sensors, connections, or other components. BIT should therefore be kept simple, and limited to monitoring of essential functions which cannot otherwise be easily monitored.
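
As a rough illustration of a software BIT kept deliberately simple, the sketch below checks only a small set of essential parameters against limits. The parameter names and limits are hypothetical. A missing reading is reported separately from an out-of-range value, since it may point to a failure of the BIT's own sensor or connection rather than of the system.

```python
# Hypothetical essential parameters and their acceptable (low, high) limits.
ESSENTIAL_LIMITS = {
    "supply_voltage": (4.75, 5.25),
    "temperature_c": (0.0, 70.0),
}


def run_bit(readings, limits=ESSENTIAL_LIMITS):
    """Return a list of fault messages; an empty list means all checks passed."""
    faults = []
    for name, (low, high) in limits.items():
        value = readings.get(name)
        if value is None:
            # No reading: possibly a BIT sensor or connection failure.
            faults.append(f"{name}: no reading")
        elif not low <= value <= high:
            faults.append(f"{name}: {value} outside [{low}, {high}]")
    return faults


print(run_bit({"supply_voltage": 5.0, "temperature_c": 25.0}))
print(run_bit({"supply_voltage": 5.6}))
```

Such a function could be called on operator request or from a periodic timer, matching the activation modes described above.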
