A failure reporting, analysis, and corrective action system (FRACAS) is a system, sometimes implemented in software, that provides a process for reporting, classifying, and analyzing failures, and for planning corrective actions in response to them. It is typically used in an industrial environment to collect data on system failures and to record and analyze them. A FRACAS may manage multiple failure reports and produces a history of failures and corrective actions. It records the problems related to a product or process, together with their associated root causes and failure analyses, to assist in identifying and implementing corrective actions.
The FRACAS method was developed by the US government and first introduced for use by the US Navy and all Department of Defense agencies in 1985. The FRACAS process is a closed loop with the following steps:
- Failure Reporting (FR). Failures and faults related to a system, a piece of equipment, software, or a process are formally reported on a standard form (e.g. a Defect Report or Failure Report).
- Analysis (A). The failure is analyzed in order to identify its root cause.
- Corrective Actions (CA). Corrective actions are identified, implemented, and verified to prevent recurrence of the failure.
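The closed-loop character of these steps can be illustrated with a small state machine. This is only a minimal sketch: the stage names, the `FailureItem` class, and the example notes are hypothetical and are not taken from any FRACAS standard or tool. The one point it tries to show is that a corrective action that fails verification sends the item back to analysis rather than letting it close.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names chosen for this sketch.
    REPORTED = "failure reported"
    ANALYZED = "root cause identified"
    CORRECTED = "corrective action implemented"
    CLOSED = "corrective action verified"


# Allowed transitions. A failed verification returns the item to analysis,
# which is what closes the loop.
TRANSITIONS = {
    Stage.REPORTED: {Stage.ANALYZED},
    Stage.ANALYZED: {Stage.CORRECTED},
    Stage.CORRECTED: {Stage.CLOSED, Stage.ANALYZED},
    Stage.CLOSED: set(),
}


@dataclass
class FailureItem:
    report_id: str
    stage: Stage = Stage.REPORTED
    history: list = field(default_factory=list)

    def advance(self, new_stage: Stage, note: str = "") -> None:
        """Move to new_stage if the transition is allowed, recording it."""
        if new_stage not in TRANSITIONS[self.stage]:
            raise ValueError(
                f"cannot move from {self.stage.name} to {new_stage.name}"
            )
        self.history.append((self.stage, new_stage, note))
        self.stage = new_stage


# Illustrative (made-up) failure item passing through the loop twice.
item = FailureItem("FR-001")
item.advance(Stage.ANALYZED, "suspected connector corrosion")
item.advance(Stage.CORRECTED, "changed connector plating")
item.advance(Stage.ANALYZED, "failure recurred in retest")
item.advance(Stage.CORRECTED, "added strain relief")
item.advance(Stage.CLOSED, "no recurrence in retest")
print(item.stage.name, len(item.history))
```

Keeping the transition history on the item gives the record of failures and corrective actions that a FRACAS is expected to produce.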
Common FRACAS outputs may include: part number, part name, OEM, field MTBF, MTBR, MTTR, spares consumption, reliability growth, and failure/incident distribution by type, location, part number, serial number, symptom, etc.
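As an illustration of how a few of these outputs might be derived from collected failure data, the sketch below computes MTBF, MTTR, and a failure distribution by type. It assumes the common simple definitions (MTBF as cumulative operating time divided by the number of failures, MTTR as total repair time divided by the number of repairs); the records and the operating-hours figure are invented for the example.

```python
from collections import Counter

# Hypothetical records: (part_number, failure_type, repair_hours)
failures = [
    ("PN-100", "design", 2.0),
    ("PN-100", "manufacturing", 4.0),
    ("PN-200", "design", 3.0),
]
total_operating_hours = 1500.0  # assumed cumulative operating time

# Mean time between failures: operating time per observed failure.
mtbf = total_operating_hours / len(failures)

# Mean time to repair: average repair duration per failure.
mttr = sum(hours for _, _, hours in failures) / len(failures)

# Failure distribution by type.
by_type = Counter(ftype for _, ftype, _ in failures)

print(f"MTBF = {mtbf:.1f} h, MTTR = {mttr:.1f} h")
print(dict(by_type))
```

The same grouping with `Counter` could be applied to location, part number, serial number, or symptom, provided those fields are captured in each report.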
It is essential that all failures which occur during development testing are carefully reported and investigated. It can be very tempting to categorize a failure as irrelevant, or not likely to cause problems in service, especially when engineers are working to tight schedules and do not want to be delayed by filling in failure reports. However, time and costs will nearly always be saved in the long run if the first occurrence of every failure mode is treated as a problem to be investigated and corrected. Failure modes which affect reliability in service can often be traced back to incidents during development testing, when no corrective action was taken.
A failure review board should be set up with the task of assessing failures, instigating and monitoring corrective action, and monitoring reliability growth. An important part of the board’s task is to ensure that the corrective action is effective in preventing any recurrence of failure. The board should consist of:
- The project reliability engineer.
- The designer.
- Others who might be able to help with the solutions, such as the quality engineer, production or test engineer.
The failure review board should operate as a team which works together to solve problems, not as a forum to argue about blame or to consign failure reports to the ‘random, no action required’ category. Its recommendations should be actioned quickly or reported to the project management if the board cannot decide on immediate action, for example if the solution to the problem requires more resources.
US MIL-HDBK-781 provides a good description of failure reporting methods. Since a consistent reporting system should be used throughout the programme, MIL-STD-781 can be recommended as the basis for this.
These data should be recorded for each failure:
- Description of failure symptoms, and effect of failure.
- Immediate repair action taken.
- Equipment operating time at failure (e.g. elapsed time indicator reading, mileage).
- Operating conditions.
- Date/time of failure.
- Failure classification (e.g. design, manufacturing, maintenance-induced).
- Report of investigation into failed component and reclassification, if necessary.
- Recommended action to correct failure mode.
- Corrective action follow-up (test results, etc.).
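The data items listed above map naturally onto a single record type. The following is a minimal sketch, assuming a Python dataclass; the class name, field names, the choice of hours for operating time, and the sample values are all hypothetical and do not come from MIL-HDBK-781 or any particular FRACAS tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FailureReport:
    # Recorded when the failure is reported.
    symptoms: str
    effect: str
    immediate_repair: str
    operating_time_hours: float  # assumed unit; could equally be mileage
    operating_conditions: str
    occurred_at: datetime
    classification: str  # e.g. "design", "manufacturing", "maintenance-induced"
    # Filled in later as the investigation and corrective action proceed.
    investigation_report: Optional[str] = None
    recommended_action: Optional[str] = None
    follow_up: Optional[str] = None

    def open_items(self) -> list:
        """Return the names of the later-stage fields that are still empty."""
        pending = []
        for name in ("investigation_report", "recommended_action", "follow_up"):
            if getattr(self, name) is None:
                pending.append(name)
        return pending


# Illustrative (made-up) report with only the investigation completed.
report = FailureReport(
    symptoms="intermittent loss of output signal",
    effect="unit shut down during test",
    immediate_repair="connector reseated",
    operating_time_hours=312.5,
    operating_conditions="vibration test, 40 degrees C",
    occurred_at=datetime(2024, 1, 15, 10, 30),
    classification="design",
    investigation_report="fretting found on connector pins",
)
print(report.open_items())
```

Making the later fields optional lets a report be filed immediately, while `open_items` shows which reports still lack a recommended action or follow-up.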