Simulation refers to the creation and use of a computer model in order to replicate and analyze the behavior of a real system. As the complexity of a model increases (e.g. as repairs, resource utilization, throughput, preventive maintenance, inspections and other factors are to be considered), simulation quickly becomes the only feasible approach. Some advantages of simulation analysis are:
- Real-world complex systems with stochastic components can be represented accurately with simulation models, but rarely with analytical models.
- Simulation allows experimenting with a system without disrupting it. For example, the performance of an existing system can be evaluated in varying conditions. Alternative designs of a system can also be tested against a requirement.
When simulation is the approach of choice, one has to be aware of the disadvantages of a simulation study. Some disadvantages include:
- The time and cost to develop, populate, simulate, validate and analyze a model are often high.
- The results vary from run to run so the accuracy of the results is dependent on the number of simulations. Because of this, optimization is more challenging than comparing a fixed number of alternatives.
- Results obtained via simulation techniques are often harder to “sell” than results from analytical models.
- The large volume of numerical information often makes the analyst overconfident in the results. A simple example of this would be the number of significant figures often presented in study results.
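The run-to-run variation mentioned above can be illustrated with a minimal sketch. It assumes a single repairable unit with exponentially distributed times to failure and repair; the function names and the values of `mtbf`, `mttr` and `mission_time` are hypothetical and chosen only for illustration. The standard error of the estimated availability shrinks roughly with the square root of the number of runs.

```python
import random
import statistics

def simulate_availability(mtbf, mttr, mission_time, rng):
    """One replication: fraction of mission_time that a single repairable unit is up."""
    t, uptime = 0.0, 0.0
    while t < mission_time:
        up = rng.expovariate(1.0 / mtbf)        # time until the next failure
        uptime += min(up, mission_time - t)     # count only uptime inside the mission
        t += up
        if t >= mission_time:
            break
        t += rng.expovariate(1.0 / mttr)        # repair duration (unit is down)
    return uptime / mission_time

def replicate(n_runs, seed=0, mtbf=100.0, mttr=10.0, mission_time=1000.0):
    """Run n_runs independent replications; return (mean, standard error)."""
    rng = random.Random(seed)
    results = [simulate_availability(mtbf, mttr, mission_time, rng)
               for _ in range(n_runs)]
    mean = statistics.mean(results)
    std_err = statistics.stdev(results) / n_runs ** 0.5
    return mean, std_err

for n in (10, 100, 1000):
    mean, se = replicate(n)
    print(f"{n:5d} runs: availability ~ {mean:.4f}, standard error ~ {se:.4f}")
```

Reporting the standard error alongside the mean is one way to avoid presenting more significant figures than the number of runs supports.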
If you have determined that it is appropriate to use simulation for system analysis, the next section describes some basic steps.
Basic Steps for Performing a Simulation Study
A simulation study may be divided into different steps. Many variations have been put forth in literature. Some basic steps that should be included in any study are outlined below.
Define the problem – Define the problem and the overall objectives of the study. Define the specific questions to be answered by the study. Identify the performance measures that will be used to evaluate the efficiency of the system. In a RAM analysis, this may be reliability, availability and/or throughput, along with costs. A life cycle cost (LCC) may also be desirable. Decide on the time-frame of the study and the required resources. This step should allow a full understanding of the scope of the study, keeping in mind that changes may occur as the study progresses. It involves a collaborative effort among all stakeholders: management, project manager, simulation analysts and subject matter experts (SMEs).
Define the system – This step requires a definition of the different elements and the ways they interact with the system. The system structure may be defined, possibly via a block diagram and/or a reliability block diagram (RBD), along with operating procedures and environment. The system definition should be kept as simple as possible allowing for complexity to be added as needed. The level of detail chosen is a factor that can determine the success or failure of a simulation study. At this point, it is important to reach an agreement between the stakeholders regarding the validity of the conceptual model before additional time and money are spent. The limitations and shortfalls should be discussed. The main goal of this step is to define a conceptual model of the system that is adequate for solving the problems and questions defined in the previous step.
Collect the data – This step in the study is labor-intensive as a large amount of data and processing may be required. Quantities of interest need to be collected, such as the probability distributions for failure and repair. If available, data on the existing system should be collected to validate the model. It is important to document the assumptions as the study progresses, especially during this step. Assumptions then can be reviewed during the different validation milestones.
Construct the model – If the choice has not been made, the analysts will need to decide whether to use a programming language (such as C or C++), a general simulation environment (such as Excel® or RENO) or high-level simulation software (such as BlockSim), where no programming is required by the user. The choice will influence the level of complexity of the system to be captured. For example, some assumptions may have to be made when using a commercial off-the-shelf package that may not be necessary when using a programming language. However, the time to develop the model will generally be lower with high-level software.
Verify the model – At this point, the analysts need to ensure that the model is actually doing what is expected. If the expected performance output collected is based on the actual system, this is the time to check to make sure the model matches the real system. The analysts and SMEs should check the model for correctness. Sensitivity analysis can be used to determine the impact of different factors on the performance of the system. This may assist in focusing on the critical aspects of the model.
Design the simulation – Some of the things that should be determined:
- What should the initial conditions of the system be?
- If steady-state results are of interest, what should the warm up period be?
- What should the mission time be? (This is likely to have been determined when the problem was defined.)
- How many simulation runs should be used? This will be directly tied to the accuracy of the results.
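The link between the number of runs and accuracy can be sketched as follows. This is one common rule of thumb, not necessarily the method used by any particular tool: take a small pilot set of runs, estimate the standard deviation, and size the study so that an approximate 95% confidence interval on the mean has a chosen half-width. The function name, the pilot values and the normal-approximation factor `z = 1.96` are assumptions made for this example.

```python
import math
import statistics

def required_runs(pilot_results, half_width, z=1.96):
    """Approximate number of runs so that the confidence interval on the mean
    has the requested half-width (normal approximation, z = 1.96 for ~95%)."""
    s = statistics.stdev(pilot_results)
    n = math.ceil((z * s / half_width) ** 2)
    return max(len(pilot_results), n)   # never fewer than the pilot runs already made

# Made-up availability results from ten pilot runs (illustrative only).
pilot = [0.91, 0.89, 0.93, 0.90, 0.92, 0.88, 0.94, 0.91, 0.90, 0.92]
print(required_runs(pilot, half_width=0.005))
```

Because the required number of runs grows with the inverse square of the half-width, halving the interval width requires roughly four times as many runs.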
Run the model and analyze the output – Run the model following the simulation design from the previous step. Generally, the objective in this phase is to determine the performance of one or more alternatives of the system so that they can be compared. A statistically sound analysis of the simulation output must be performed. Extensive literature on this topic is available and should be well understood. A perfectly good model may go to waste if this step is not done carefully.
Document, present and use the results – In this step, formal documentation should be compiled regarding the assumptions, the simulation model and its validation and the results of the study. In a simulation study, the process of discovery and understanding of the system is often as valuable as the results. Ideally, the bulk of the information to be documented has already been collected and is readily available. This information will be key, not only for the current and future understanding of the system, but also for the credibility of the study. If the results are both valid and credible, they can now be used as part of the decision-making process.
Markov Analysis
Markov analysis provides a means of analysing the reliability and availability of systems whose components exhibit strong dependencies. Other systems analysis methods (such as the Kinetic Tree Theory method employed in fault tree analyses) generally assume component independence, which may lead to optimistic predictions for the system availability and reliability parameters. Some typical dependencies that can be handled using Markov models are:
- Components in cold or warm standby
- Common maintenance personnel
- Common spares with a limited on-site stock
The major drawback of Markov methods is that Markov diagrams for large systems are generally very large, complicated and difficult to construct. However, Markov models may be used to analyse smaller systems with strong dependencies that require accurate evaluation. Other analysis techniques, such as fault tree analysis, may be used to evaluate large systems using simpler probabilistic calculation techniques. Large systems that exhibit strong component dependencies in isolated and critical parts of the system may be analysed using a combination of Markov analysis and simpler quantitative models.
The state transition diagram identifies all the discrete states of the system and the possible transitions between those states. In a Markov process, the transition frequencies between states depend only on the current state probabilities and the constant transition rates between states. In this way, the Markov model does not need to know the history of how the state probabilities have evolved in time in order to calculate future state probabilities. Although a true Markovian process would only consider constant transition rates, computer programs allow time-varying transition rates to be defined. These time-varying rates must be defined with respect to absolute time or phase time (the time elapsed since the beginning of the current phase).
As the size of the Markov diagram increases the task of evaluating the expressions for time-dependent unavailability by hand becomes impractical. Computerised numerical methods may be employed, however, to provide a fast solution to large and complicated Markov systems. In addition these numerical methods may be extended to allow the modelling of phased behaviour and time-dependent transition rates.
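A minimal sketch of such a numerical solution is given below. It assumes a hypothetical system of two identical components sharing a single repair crew (one of the dependencies listed earlier), with states 0, 1 and 2 denoting the number of failed components. The rates `lam` and `mu`, the function names and the step size are illustrative assumptions, and the explicit Euler scheme is used only for brevity; production tools generally use more robust integrators.

```python
def markov_step(p, rates, dt):
    """One explicit Euler step of the state equations for a Markov model.

    p     : list of current state probabilities
    rates : dict mapping (from_state, to_state) -> constant transition rate
    """
    new_p = list(p)
    for (i, j), rate in rates.items():
        flow = p[i] * rate * dt      # probability moving from state i to state j
        new_p[i] -= flow
        new_p[j] += flow
    return new_p

def solve(p0, rates, t_end, dt=0.01):
    """Integrate the state probabilities from time 0 to t_end."""
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        p = markov_step(p, rates, dt)
    return p

lam, mu = 0.01, 0.1   # assumed failure and repair rates (per hour)
rates = {
    (0, 1): 2 * lam,  # either of the two working components fails
    (1, 2): lam,      # the remaining component fails
    (1, 0): mu,       # the single crew repairs the failed component
    (2, 1): mu,       # only one repair at a time, so the rate is mu, not 2 * mu
}
p = solve([1.0, 0.0, 0.0], rates, t_end=500.0)
print("P(both components failed) after 500 h ~", p[2])
```

The shared crew appears in the model as a repair rate of `mu` out of state 2, where independent components would be repaired at `2 * mu`; this is the kind of dependency that independence-based methods cannot represent.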
If a system or component can be in one of two states (e.g. failed, non-failed), and we can define the probabilities associated with these states on a discrete or continuous basis, the probability of being in one or the other at a future time can be evaluated using state-space (or state-time) analysis. In reliability and availability analysis, the variables of interest are the failure probability and the probability of being returned to an available state, that is, the failure rate and the repair rate.
The best-known state-space analysis technique is Markov analysis. The Markov method can be applied under the following major constraints:
- The probabilities of changing from one state to another must remain constant, that is, the process must be homogeneous. Thus the method can only be used when a constant hazard or failure rate assumption can be justified.
- Future states of the system are independent of all past states except the immediately preceding one. This is an important constraint in the analysis of repairable systems, since it implies that repair returns the system to an ‘as new’ condition.
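As an illustration of these constraints, consider the simplest continuous-time case: a single repairable component with a constant failure rate $\lambda$ and a constant repair rate $\mu$ (symbols introduced here for the example, not taken from the text). With $P_0(t)$ the probability of being available and $P_1(t)$ the probability of being failed, the state equations are

$$\frac{dP_0}{dt} = -\lambda P_0(t) + \mu P_1(t), \qquad \frac{dP_1}{dt} = \lambda P_0(t) - \mu P_1(t)$$

and, assuming the component starts in the available state, $P_0(0) = 1$, the availability is

$$P_0(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu)t}$$

Both rates are constant and the right-hand sides depend only on the current state probabilities, which is what the two constraints require.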
Example
In analysing switching by Business Class customers between airlines the following data has been obtained by British Airways (BA):
| Last flight by | Next flight by BA | Next flight by competition |
|---|---|---|
| BA | 0.85 | 0.15 |
| Competition | 0.10 | 0.90 |
For example if the last flight by a Business Class customer was by BA the probability that their next flight is by BA is 0.85. Business Class customers make 2 flights a year on average. Currently BA have 30% of the Business Class market. What would you forecast BA’s share of the Business Class market to be after two years?
Solution – We have the initial system state s1 given by s1 = [0.30, 0.70] and the transition matrix P is given by
$$P = \begin{pmatrix} 0.85 & 0.15 \\ 0.10 & 0.90 \end{pmatrix}^{2} = \begin{pmatrix} 0.7375 & 0.2625 \\ 0.1750 & 0.8250 \end{pmatrix}$$
where the square arises because Business Class customers make 2 flights a year on average, so P describes the transitions over one year. Hence after one year has elapsed the state of the system is s2 = s1P = [0.34375, 0.65625].
After two years have elapsed the state of the system is s3 = s2P = [0.368, 0.632]. Note that the elements of s2 and of s3 each add to one (as required).
So after two years have elapsed BA’s share of the Business Class market is 36.8%
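The arithmetic of this example can be checked with a short sketch in plain Python. The helper `mat_mul` and the variable names are introduced here for illustration only.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

per_flight = [[0.85, 0.15],
              [0.10, 0.90]]                   # switching probabilities per flight
per_year = mat_mul(per_flight, per_flight)    # two flights per year on average

state = [[0.30, 0.70]]                        # initial shares: [BA, competition]
for year in (1, 2):
    state = mat_mul(state, per_year)
    print(f"after year {year}: {[round(x, 5) for x in state[0]]}")
```

The first element of the state after the second iteration is BA's forecast share, which rounds to 0.368.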