## Evolutionary algorithms

Evolutionary algorithms (EA) are optimisation methods that approach the sought solution step by step, inspired by natural evolution. In each iteration, the candidate solutions are evaluated on the basis of a target function and the best ones are selected. The selected solutions are then recombined and mutated for the following iteration, which explains the reference to evolution.

Evolutionary algorithms are particularly useful if the optimisation problem is very complex and the mathematical description (model) as well as the solution to the question (e.g. by means of numerical methods) are laborious. Typical application cases for evolutionary algorithms include parameter estimation (e.g. the adjustment of a multi-parameter Weibull function) and system design (e.g. weight of a cable harness, temperature load on a controller, turbulence load within a wind farm).
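The loop of evaluation, selection, recombination and mutation described above can be sketched in a few lines. This is a minimal illustration rather than a production optimiser: the function name `evolve`, the population size, the mutation strength and the quadratic test function `sphere` are all assumptions chosen for the example.

```python
import random


def evolve(objective, bounds, pop_size=30, generations=200, sigma=0.05, seed=0):
    """Minimise `objective` with a simple elitist evolutionary algorithm."""
    rng = random.Random(seed)
    # Initial population: random points inside the search bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half according to the target function.
        pop.sort(key=objective)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Recombination: average the two parents gene by gene.
            child = [(x + y) / 2 for x, y in zip(a, b)]
            # Mutation: add Gaussian noise, then clip to the bounds.
            child = [
                min(hi, max(lo, g + rng.gauss(0, sigma * (hi - lo))))
                for g, (lo, hi) in zip(child, bounds)
            ]
            children.append(child)
        pop = parents + children
    return min(pop, key=objective)


def sphere(x):
    """Hypothetical target function with its minimum at (2, -1)."""
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2


best = evolve(sphere, bounds=[(-5, 5), (-5, 5)])
```

For a parameter estimation task, `sphere` would be replaced by a goodness-of-fit measure, for example the squared deviation between a candidate Weibull function and the observed failure data.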

Selected references:

- Weibull analysis taking into account a sub-population model.
- Reliability forecast in the field of automobile telecommunication.
- Optimisation and vague data in the development process.

What we can offer:

- Were you unable to achieve satisfactory results with conventional methods, and did numerical methods supply no useful solutions?

We can offer you the mathematical description and EA-based solutions to your individual questions – get in touch, we will be glad to help.

## Neural networks

Artificial neural networks (ANN) are used to design and solve models. They are able to generalise abstract structures and identify patterns, even with incomplete or fuzzy data. The actual model should be seen as a black-box within which the model is mapped from the input parameters to the solution via artificial neurons and their links, thus explaining the term neural networks.

Artificial neural networks are particularly useful if the problem is very complex and the mathematical description (model) as well as the solution to the problem are practically impossible (no knowledge of the influencing variables and the functional relationships). Typical application cases for ANN include parameter estimation (e.g. the adjustment of a multi-parameter Weibull function), lifetime prediction (e.g. taking into account climatic influencing variables) through to the prediction of share prices.

Selected references:

- Lifetime prediction taking into account changing climatic conditions.
- Neural parameter estimation for the use of failure data.
- Neuro-fuzzy reliability optimisation method with respect to life cycle costs.
- Optimisation and vague data in the development process.

What we can offer:

- Were you unable to obtain a satisfactory result with conventional methods, or were no data and bases available to design models?

We can offer you the mathematical description and ANN-based solutions to your individual questions – get in touch, we will be glad to help.

## Man-machine system – event analyses

Event analyses in the field of man-machine systems (MMS) are used to determine the cause of and measures against accidents that have occurred in particular on account of the interaction between the subsystems individual, team, organisation, organisation environment and technology as well as from their sequential linking and interaction (see also VDI 4006, DIN EN 60300-3-1). Event analyses focus on the human factor with its interfaces to organisation and technology.

MMS event analyses are primarily suitable to investigate and reconstruct serious events (e.g. accidents at work, wrong decisions, false entries) – measures can be derived from the results that take into account the interaction of the subsystems individual, team, organisation, organisation environment and technology.

Selected references:

- Development of a prototype tool for the quantitative prediction of the human behavioural reliability in manual assembly work.

What we can offer:

- Are you struggling with process difficulties in the fields and interfaces of man-technology-organisation, and have there been any accidents at work in the recent past, for example?

We can help you understand the interfaces and develop solutions to eliminate process difficulties – get in touch, we will be glad to help.

## PAAG / HAZOP

The PAAG method (also known as HAZOP, Hazard and Operability) is a systematic procedure (inductive) to determine hazards, to investigate the relevant causes and effects, and to develop suitable countermeasures (see also DIN EN 60300-3-1). PAAG stands for “Prognose, Auffinden der Ursache, Abschätzung der Auswirkungen und Gegenmaßnahmen” (prediction, locate the cause, estimate the consequences and countermeasures). The PAAG method is essentially comparable to the FMEA, but unlike FMEA it uses defined key words (e.g. no, more, less, partly…), to identify deviations from the target function of the unit in question.

The PAAG method is thus ideal for a hazard analysis in the process industry, but can also be applied to all fields of technology. The benefits of this method lie in the early identification of deviations, hazards and weak points as well as the documentation of the status of the product development and process planning (general product knowledge).
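The key-word principle can be illustrated with a small sketch that combines each guide word with each process parameter to produce the list of candidate deviations a team would then discuss. The guide words are those named above. The process parameters and the function name `deviations` are hypothetical and serve only as an illustration.

```python
from itertools import product

# Guide words named in the text; the process parameters are hypothetical examples.
GUIDE_WORDS = ["no", "more", "less", "partly"]
PARAMETERS = ["flow", "pressure", "temperature"]


def deviations(guide_words, parameters):
    """Return every guide word / parameter pair as a candidate deviation."""
    return [f"{word} {param}" for param, word in product(parameters, guide_words)]


worksheet = deviations(GUIDE_WORDS, PARAMETERS)
```

Each entry of `worksheet` (for example "no flow") is only a prompt. Causes, effects and countermeasures still have to be worked out by the analysis team.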

Selected references:

- Preparation of a reliability concept for a power supply to a safety-critical infrastructure
- Risk and safety analysis to assess the hazard potential and improve the process-related operational safety of fuel cell heaters (BZH)

What we can offer:

- Do you need a risk assessment within the scope of a new development or change of operations?
- Have there been any faults/hazardous incidents recently?

We can offer the necessary tools, help you with their application and help you prepare the necessary documents – get in touch, we will be glad to help.

## Event tree analysis

The event tree analysis (ETA, see also DIN EN 60300-3-1) is suitable to map chains of events starting from an initial event (inductive). It is used if all of the possible paths of subsequent events (e.g. hazardous incident scenarios), their sequence and the most likely order have to be mapped and investigated.

The event tree analysis is often combined with the fault tree analysis (FTA) in practice, e.g. during a quantitative risk assessment (QRA). The safety levels (layers of protection) contained in the system are hereby mapped in the event tree. The probabilities with which the event sequences (e.g. failure of a safety level) can be followed in the event tree can be determined on the basis of methods such as the fault tree analysis or the Monte-Carlo simulation (MCS), for example.
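As a rough sketch of how path frequencies follow from the safety levels, the example below enumerates all branches of an event tree. It assumes that the safety levels fail independently of each other and that every level is demanded on every path. Real event trees often prune branches. The function name `event_tree` and all numbers are hypothetical.

```python
from itertools import product


def event_tree(initiating_frequency, barrier_failure_probs):
    """Enumerate all paths through an event tree with independent barriers.

    Returns a dict mapping a tuple of barrier outcomes (True = barrier works)
    to the frequency of that path.
    """
    paths = {}
    for outcome in product([True, False], repeat=len(barrier_failure_probs)):
        freq = initiating_frequency
        for works, p_fail in zip(outcome, barrier_failure_probs):
            freq *= (1.0 - p_fail) if works else p_fail
        paths[outcome] = freq
    return paths


# Hypothetical numbers: 0.1 initiating events per year, two layers of protection.
paths = event_tree(0.1, [0.01, 0.05])
worst_case = paths[(False, False)]  # frequency of both layers failing
```

In practice, the failure probability of each safety level would come from a fault tree analysis or a Monte-Carlo simulation, as described above.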

Selected references:

- Preparation of a reliability concept for a power supply to a safety-critical infrastructure
- Risk and safety analysis to assess the hazard potential and improve the process-related operational safety of fuel cell heaters (BZH)

What we can offer:

- Does your system have several safety levels (layers of protection) that are difficult to map overall using methods such as FMEA or the fault tree analysis?
- Do incidents have to be investigated on account of the hazard potential within the scope of a site-specific hazard & risk assessment?
- Have there been any faults/hazardous incidents recently?

We can offer the necessary tools, help you with their application and help you prepare the necessary documents – get in touch, we will be glad to help.

## Quantitative risk assessment (QRA)

The quantitative risk assessment (QRA) is used to assess the risk of technical systems taking into account the event and exposure sequences. During a quantitative risk assessment, hazards are identified and assessed, the corresponding event sequences (scenarios) are mapped allowing for the technical safety levels and the resulting exposure sequences are modelled, analysed and assessed on the basis of the site environment.

The quantitative risk assessment is often used in practice to analyse and assess hazardous incident sequences with a high risk potential. Methods such as FMEA, HAZOP, event tree analysis, fault tree analysis and Monte-Carlo simulation are used and combined to perform the steps hazard identification and assessment, event and exposure sequence as well as risk assessment.

Selected references:

- Assessment of external sources of risk (kinds of external effects: pressure wave from explosion, aircraft crash) on the operation of nuclear power plants.
- Publications on the modelling, analysis and assessment of the external source of risk of pressure waves from an explosion.

What we can offer:

- Do you need a quantitative risk assessment with an evaluation of the hazards and risks within the scope of an operating license, new development or change of operations?
- Have there been any faults/hazardous incidents recently?

We can offer the necessary tools, we draft the quantitative risk assessment and help you prepare the necessary documents – get in touch, we will be glad to help.

## Fuzzy logic

Fuzzy logic enables the subjective knowledge that is available in the organisation or from experts to be used and integrated in existing approaches to safety and reliability analyses. A further advantage is that the fuzziness and subjectivity of the input data can still be identified in the results of the analysis.

Fuzzy logic has proven its worth in safety and reliability analyses in combination with tools such as FMECA or the fault tree to integrate fuzzy data (e.g. failure rates). The fuzzy data is first fuzzified (transposed to a set framework of fuzzy mathematics). The result is then generated according to the calculation rules of the superordinate tool (FMECA, fault tree, etc.). The subsequent defuzzification transposes the fuzzy result into a concrete value.
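The sequence of fuzzification, calculation and defuzzification can be sketched with triangular fuzzy numbers. This is one common simplification and not necessarily the approach used in the referenced projects. The class `TriFuzzy`, the gate functions and all probabilities are assumptions for illustration. The gates are applied to the three characteristic points only, which is exact at the support and the peak but approximates the shape in between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriFuzzy:
    """Triangular fuzzy probability: lowest, most plausible and highest value."""
    low: float
    mode: float
    high: float

    def centroid(self):
        """Defuzzify to one concrete value (centre of gravity of the triangle)."""
        return (self.low + self.mode + self.high) / 3.0


def fuzzy_and(x, y):
    """AND gate for independent events, applied point by point."""
    return TriFuzzy(x.low * y.low, x.mode * y.mode, x.high * y.high)


def fuzzy_or(x, y):
    """OR gate for independent events, applied point by point."""
    def combine(p, q):
        return 1.0 - (1.0 - p) * (1.0 - q)
    return TriFuzzy(combine(x.low, y.low), combine(x.mode, y.mode),
                    combine(x.high, y.high))


# Hypothetical expert estimates for two basic events.
sensor = TriFuzzy(0.01, 0.02, 0.04)
valve = TriFuzzy(0.001, 0.002, 0.005)

top = fuzzy_or(sensor, valve)   # fuzzy result: the spread stays visible
crisp = top.centroid()          # concrete value after defuzzification
```

The width between `top.low` and `top.high` shows how much of the uncertainty in the expert estimates carries through to the result.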

Selected references:

- Fuzzy FMEA to analyse the risk from a pressure tank system.
- Fuzzy fault tree analysis for a vehicle sensor system.
- Methodical comparison of the Boolean and fuzzy fault tree analysis taking a flight control system as an example.

What we can offer:

- Does your system contain uncertainties, are you unable to identify concrete input variables (e.g. probabilities of occurrence, probabilities of detection, failure rates)?
- Are the experts who have been consulted unable to agree on the identification of the input variables (e.g. probabilities of occurrence, probabilities of detection, failure rates)?

We can offer the necessary tools, help you with their application and help you prepare the necessary documents – get in touch, we will be glad to help.

## Fault tree analysis

The **fault tree analysis** *(FTA)*, according to DIN EN 61025 also called the **fault status tree analysis**, can be used as a safety and reliability analysis for all kinds of plants and systems, including *common mode failures* and *human errors*. This is a deductive analysis based on the principles of Boolean algebra. The logical connections between component or subsystem failures leading to an unwanted event *(top event)* are determined. The results of the analyses allow an assessment of the system in terms of reliability, availability and safety.

The following figure shows a series and a parallel system as a reliability block diagram and as a fault tree. In the parallel system, basic event 1 AND basic event 2 have to occur for the top event to occur. The top event in the series system, on the other hand, takes place through the occurrence of basic event 1 OR basic event 2.
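The AND/OR relationship for the two systems can be made concrete with a short calculation. The sketch assumes statistically independent basic events. The function names and the probabilities are chosen for illustration only.

```python
def and_gate(*probs):
    """Probability that all independent input events occur."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def or_gate(*probs):
    """Probability that at least one independent input event occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


# Hypothetical probabilities for basic event 1 and basic event 2.
p1, p2 = 0.01, 0.02

parallel_top = and_gate(p1, p2)  # parallel system: both must fail
series_top = or_gate(p1, p2)     # series system: one failure is enough
```

With these numbers the parallel system reaches a top event probability of 0.0002, while the series system reaches 0.0298.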

The goals of the fault tree analysis in detail are:

- The systematic identification of all possible causes and combinations of a failure that lead to an unwanted event.
- The determination of reliability parameters (e.g. occurrence probabilities for the failure combinations, occurrence probabilities for the unwanted event or non-availability of the system when requested).
- To generate a graphical representation in a kind of tree structure (logical combinatorial network) with input and output variables.
- To compare various design proposals through probabilistic reliability and safety predictions, to identify weak points and analytically verify required reliability and safety requirements.

The fault tree analysis is ideal for presenting data that is relevant for reliability and safety and for analysing large, complex systems that may often consist of thousands of **minimal cut sets** (these are minimal combinations of events that lead to the unwanted top event). Such fault trees are therefore prepared and evaluated with the help of computers.

Further knowledge of the systems considered by FTA can be gained through **importance parameters**. The effects of the individual basic events (often components) on reliability and safety can be determined with these evaluation parameters so as to objectify and quantify questions with respect to a system optimization, weak point analysis, error detection or maintenance strategies, for example.
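One widely used importance parameter is the Birnbaum importance: the difference between the top event probability with a basic event certain to occur and with it certain not to occur. The sketch below computes it for a small hypothetical system described by its minimal cut sets. It enumerates all component states, which is only feasible for a handful of basic events, and it assumes independent basic events. The text does not specify which importance measures are used in practice, so the choice of Birnbaum here is an assumption.

```python
from itertools import product


def top_probability(cut_sets, p):
    """Exact top event probability by enumerating all component states."""
    names = sorted(p)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        failed = {n for n, s in zip(names, states) if s}
        weight = 1.0
        for n, s in zip(names, states):
            weight *= p[n] if s else 1.0 - p[n]
        # The top event occurs if any minimal cut set has completely failed.
        if any(cs <= failed for cs in cut_sets):
            total += weight
    return total


def birnbaum(cut_sets, p, name):
    """Birnbaum importance: sensitivity of the top event to one basic event."""
    return (top_probability(cut_sets, {**p, name: 1.0})
            - top_probability(cut_sets, {**p, name: 0.0}))


# Hypothetical system: A alone, or B together with C, causes the top event.
cut_sets = [frozenset({"A"}), frozenset({"B", "C"})]
p = {"A": 0.01, "B": 0.1, "C": 0.2}

ranking = sorted(p, key=lambda n: birnbaum(cut_sets, p, n), reverse=True)
```

In this example the single-event cut set `A` dominates the ranking, which is the typical signature of a weak point without redundancy.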

The staff at IQZ use qualitative and quantitative fault tree analyses in the field of functional safety, for example, to obtain initial estimates of the probability of the occurrence of a defined top event at a very early stage of the product development process. The software-assisted generation and calculation is able to compare different system configurations so as to achieve the optimum safety, reliability and efficiency.

IQZ can access all common failure rate databases as a basis. The use of the Wuppertal reliability forecast model also allows the integration of customer-oriented failure rates.

## Reliability forecast model (ZPM)

Many questions in quality assurance – particularly in Warranty Management – stem directly from reliability forecasts, for example the determination of the expected failures in field use or the expected warranty costs. However, they also include the determination of the final stockpiling and spare parts quantities, the evaluation of various concepts, constructions and technologies over long periods of use as well as the estimation of possible risks. The following explanations relate to the automotive sector, but can in principle be transferred to all sectors where technical products are manufactured and used.

Trial series, endurance runs and accelerated tests are performed to ensure the basic function and robustness of vehicle components. However, they are unable to answer the questions posed above because the sample sizes are small and, above all, because the correlation between tests and real use in the field is unknown. Field data, on the other hand, reflect the real conditions of use and all relevant load factors, making them a good basis to describe the failure behaviour in the field. A sufficient number of failures are usually available for larger serial deliveries, thus guaranteeing a good statistical power. However, the field failures can only be completely recorded during the warranty period, which is why the data is censored in terms of time and hence incomplete.

The difference in time between registering a vehicle and the failure of the component in question is unsuitable as a measure of the component's load. The absolute operating time would be needed for this, something that is not normally recorded and is thus unknown. The distance travelled up to the failure is an adequate substitute for this, as has been proven by extensive testing. However, the driving behaviour during the observations that are restricted to the warranty period differs greatly (from only a few thousand to over 100 thousand km). Consequently, this has to be taken into account in reliability statements.

The figure below shows the procedure for a reliability forecast.

Reliability forecast models can be used for the following purposes, amongst others:

Warranty management

- Calculation of future guarantee and warranty costs
- Risk management for the extended warranty period
- Calculation of serial replacement requirement or final stocking quantities for spare parts management
- Exposure of warranty fraud
- Possibility of supplier assessment/checks
- Statistical analysis of the delay in approval (optimisation of supply chain management)
- Statistical analysis of the delay in reporting (optimisation of the flow of information between supplier-customer or internal)

Functional safety (ISO 26262, IEC 61508)

- Calculation of own failure rates taking into account the specific load of your components
- Possibility of verifying the “Proven-in-Use” pursuant to ISO 26262
- Proof of compliance with standards

Information research & development

- Evaluation of system modifications
- Help in choosing the component
- Use of components proven in operation

Optimisation of the test and inspection planning (actual load during use in the field)

- Feedback for research and development

General advantages

- Statistical analysis of the company’s own field data by means of proven models
- Years of use with well-known OEMs and suppliers
- Both a pure analysis of the data and an interpretation by experts is possible
- Continuous further development (state-of-the-art)

**You can download a summary of the Wuppertal reliability forecast model here:**

- German version: IQZ_Zuverlässigkeitsprognosemodell
- German-English version: IQZ_Reliablity-Prognosis-Modell
- English version: IQZ_Reliability-Prognosis-Modell

## Test and inspection planning

The goal of test and inspection planning is to prove that a required property satisfies the requirements over a defined feature (usually time, cycles, actuations). In reliability engineering, it is generally checked whether the minimum required reliability (probability of survival) of a random sample (also: batch, group) from a population can be proven over a defined period.

The overall procedure is based on so-called hypothesis testing, which checks whether the random sample is compatible with the null hypothesis. If this is the case, the null hypothesis is retained. If the null hypothesis is rejected, the alternative hypothesis is assumed.

**Test methods**

The actual test methods can be very complex and expensive. The first thing to be done is thus to define a sensible and hence meaningful test, because even an inexpensive or quick test is of no use if the required statements cannot be derived from the test results.

After a suitable test has been defined, the main interest is in the time and the costs of the test.

Some tests will be described briefly below with the test procedures as well as their advantages and disadvantages.

**Success-Run-Test**

With this method, the random sample is tested over a period of time to prove a defined minimum reliability. As the name says, all tested parts have to “survive” the test without failing.

One advantage is that the test is very simple. The result is also unambiguous and is not subject to any subjective evaluation as long as the failure has been clearly defined. Moreover, as soon as the general failure behaviour (infant mortality/wear-out) is known, an optimisation can be carried out between test duration and number of test pieces. For example, an infant mortality behaviour generally means that more parts can be tested and the test period can be shortened. The circumstances are reversed for a wear-out behaviour.

The main disadvantage is that no statements can be made on the failure probability curve from the test results. This is particularly unfavourable in the event of a component failure since not only is the result of the test negative, but little additional information can also be gained. This test can therefore be very unfavourable for wearing components and thus long test times.
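The trade-off between test duration and number of test pieces in a test without failures can be sketched with the relationship commonly used for this purpose, n = ln(1 - C) / (L^b · ln R). Here R is the reliability to be proven, C the confidence level, L the ratio of test duration to required lifetime and b the Weibull shape parameter. The text above does not state a formula, so this relationship, the function name and the numbers are assumptions. It presumes a two-parameter Weibull distribution with known shape parameter.

```python
import math


def success_run_sample_size(reliability, confidence,
                            lifetime_ratio=1.0, weibull_shape=1.0):
    """Number of test pieces that must all survive to prove `reliability`.

    lifetime_ratio: test duration divided by the required lifetime.
    weibull_shape:  assumed shape parameter b (< 1 infant mortality, > 1 wear-out).
    """
    n = math.log(1.0 - confidence) / (
        lifetime_ratio ** weibull_shape * math.log(reliability)
    )
    return math.ceil(n)


# Prove R = 90 % with 90 % confidence, testing for exactly one lifetime.
n_base = success_run_sample_size(0.90, 0.90)

# Wear-out behaviour (b = 2): doubling the test time cuts the sample size sharply.
n_wear_out = success_run_sample_size(0.90, 0.90, lifetime_ratio=2.0, weibull_shape=2.0)

# Infant mortality (b = 0.5): doubling the test time saves far fewer parts.
n_infant = success_run_sample_size(0.90, 0.90, lifetime_ratio=2.0, weibull_shape=0.5)
```

With these inputs the sketch yields 22, 6 and 16 test pieces respectively. This matches the statement above that longer tests pay off for wear-out behaviour, whereas for infant mortality it is more economical to test more parts for a shorter time.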

**Sudden-Death-Test**

In this test, the random sample is divided into several classes, each of which has the same number of test pieces. Each class is tested until one of the test pieces fails. The test is not continued with the intact test pieces. The failure behaviour is then determined by means of a rank-size distribution since the failure data is a special case of censoring.

One advantage of this test is the categorisation in classes so that it can often be performed on smaller test rigs too. The determination of the failure function is also very helpful since it generates a lot of knowledge about the component.

However, this test can be unfavourable for components with a failure behaviour due to wear since the test durations may be very long in the individual classes. The greater work involved in planning the test is often seen as a disadvantage too, though this is often negligible compared with the long test times.

**Lifetime test**

In lifetime tests, the test pieces are tested over a pre-defined period of time, whereby a certain number of failures are allowed. The failed test pieces are not replaced. The minimum reliability can then be defined by means of the determined failure times and by specifying a confidence interval. In principle, the Success-Run-Test is a special kind of lifetime test.

The advantages of this test are that test pieces are allowed to fail and that the parameters can easily be determined by the Weibull distribution, a method that is very popular in reliability engineering.

However, the test may also take a long time with this method.

**End-of-Life-Test (EOL)**

In this test, the inspection lot is tested until all components fail. The failure times are then used to determine the failure behaviour. This test is a special kind of Sudden-Death-Test in which each class has only one test piece, and it is usually only used to ensure robustness.

The advantage of this test method over others is that everything is known about the failure behaviour of the random sample. However, it can only be used sensibly with a corresponding number of test pieces.

One disadvantage is usually the very long test duration.

Other methods can be added to reduce the test time in test and inspection planning, for example increasing the load through temperature, moisture etc.

## Risk simulation

The architecture of modern systems and components is characterised by a steady increase in complexity and a growing networking of different systems. This leads to an increase in the functions of these systems, but at the same time places much higher demands on their reliability.

One consequence of this networking is that it is harder to perform a reliability analysis using classic mathematical and stochastic methods due to confusing mathematical expressions that are impossible or hard to solve, complex technical dependences or simply on account of an excessively high number of components or subsystems.

The depiction, simulation and analysis of these complex problems in the technical and economic field is the domain of the Monte-Carlo simulation.

The Monte-Carlo simulation is a computer-aided simulation method that can reliably map and analyse even complex relationships by a large number of simulated system runs. This takes place by means of random numbers that are used to generate a random event for each simulation run. An estimator is developed in this way from a larger number of simulation runs to replace the conventional reliability analysis or other analytical calculations.
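A minimal sketch of this principle: each simulation run draws random failure times for the components, determines whether the system survives the mission time, and the share of failed runs serves as the estimator. The system structure (one component in series with a redundant pair), the exponential lifetimes and all failure rates are hypothetical. The example is deliberately simple enough for an analytical result to exist as a cross-check.

```python
import math
import random

RATE_A = 1e-4   # hypothetical failure rate of component A per hour
RATE_B = 5e-4   # hypothetical failure rate of each redundant component B


def simulate_unreliability(mission_time, runs=100_000, seed=1):
    """Estimate the probability that the system fails before `mission_time`."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(runs):
        # One simulation run: draw a random lifetime for every component.
        t_a = rng.expovariate(RATE_A)
        t_b1 = rng.expovariate(RATE_B)
        t_b2 = rng.expovariate(RATE_B)
        # A is in series with the redundant pair B1/B2.
        system_life = min(t_a, max(t_b1, t_b2))
        if system_life < mission_time:
            failures += 1
    return failures / runs


estimate = simulate_unreliability(1000.0)

# Analytical result for comparison (only possible because the example is simple).
analytic = 1.0 - math.exp(-RATE_A * 1000.0) * (
    1.0 - (1.0 - math.exp(-RATE_B * 1000.0)) ** 2
)
```

For realistic systems with repair strategies, dependencies or degraded states, the analytical line usually cannot be written down any more, while the simulation loop only needs a more detailed system model.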

The possible fields of use for a Monte-Carlo simulation are very varied and can be employed for an integrated system, reliability or risk assessment (both technical and economic).

Selected technical fields of application:

- Classic reliability and safety analyses for (highly) complex systems
- Functional analyses to consider degraded functional states (subsystem failures)
- Use in the field of tolerance calculations, above all for non-linear tolerances and tolerance chains
- Simulation of a highly-reliable system with an extremely low probability of error
- Use in early phases of research and development for which no field data analyses can be performed.
- Integrated use in the field of product development to validate and secure development processes
- …

Selected economic fields of application:

- Analysis of the technical consequences on economic factors
- Targeted risk assessment taking a number of dynamic risk factors and objects into account
- Support in contract design (in particular guarantee and warranty terms)
- Comparison of different contract strategies based on concrete simulations
- Analysis of the availability of technical or organisational units and the development of an optimisation strategy based on this
- …

Thanks to its many years of experience in the field of simulation methods, IQZ can offer you both the methodical and applied competence to plan, perform, implement and monitor Monte-Carlo simulations in several fields of application.

## Monte-Carlo simulation (MCS)

The Monte-Carlo simulation (MCS), named after Monaco’s Monte Carlo district with its famous casino, is a simulation method to model random parameter ranges and their distribution function with the goal of solving certain integral, general and partial differential equations etc. with sufficient accuracy through stochastic modelling. Complex systems of equations of a stochastic or deterministic nature that are impossible or hard to solve analytically can thus be numerically (“playfully”) solved in a mathematical context using MCS. The mathematical foundation is hereby the “law of large numbers.”
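The role of the law of large numbers can be illustrated by estimating an integral as the average of randomly sampled function values. The sketch below is a plain Monte-Carlo integration under simple assumptions (uniform sampling, one dimension). The function name `mc_integrate` and the integrand are chosen for illustration only.

```python
import math
import random


def mc_integrate(f, a, b, runs=200_000, seed=42):
    """Estimate the integral of f over [a, b] and its standard error."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(runs):
        y = f(rng.uniform(a, b))
        total += y
        total_sq += y * y
    mean = total / runs
    variance = max(total_sq / runs - mean * mean, 0.0)
    estimate = (b - a) * mean
    # The standard error shrinks with the square root of the number of runs.
    std_error = (b - a) * math.sqrt(variance / runs)
    return estimate, std_error


# Integral of exp(-x^2) over [0, 1], which has no elementary antiderivative.
estimate, std_error = mc_integrate(lambda x: math.exp(-x * x), 0.0, 1.0)
```

Because the error decreases only with the square root of the number of runs, halving it requires roughly four times as many simulation runs.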

Today, the Monte-Carlo simulation is seen as the only practical method to calculate complex, multi-dimensional systems of equations that cannot be solved analytically, or only with a great deal of calculation effort. Accordingly, the MCS is put to successful use in not only the physical field but also in game theory and mathematical economics, the theory of message transmission and not least in the theory of operation and reliability theory when analysing and considering

- complex fault trees,
- dependencies on states and changes in states,
- random distribution functions and time dependencies,
- flexible maintenance and repair strategies as well as
- dynamic influencing variables (“dynamic reliability theory”)

for example.

One current focus of research is the so-called “dynamic reliability theory.” The complex equations of the system transport theory to describe dynamic changes in systems that are taken as a basis here can normally only be successfully evaluated with MCS. The MCS allows the sufficiently accurate modelling of real conditions such as stochastic dependencies, time dependencies, ageing processes and physical influencing variables with no restrictions.

It can generally be said that the MCS is becoming increasingly important in industry since complex and expensive field tests can be wholly or partly substituted by computer simulations. Some of the benefits of this are that material and testing costs for field and laboratory tests can be saved, the test conditions are identical and the results are reproducible. What’s more, the results can be easily compared with those of other simulations and analysed.

The Monte-Carlo simulation benefits constantly from more powerful computer systems, which make even very large numbers of simulation runs feasible. As a result, the results that are obtained with a Monte-Carlo simulation are becoming increasingly accurate.

## Weibull analysis

The Weibull analysis is a widespread method to assess technical lifetimes and is used in the field of reliability engineering to address different problems. Three common failure behaviours

I. Infant mortalities

II. Random failures

III. Wear-outs

can be selectively depicted and described using the Weibull distribution.
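For reference, the two-parameter Weibull distribution can be written as follows, where $t$ is the lifetime variable, $T$ the characteristic lifetime and $b$ the shape parameter (the symbol names are a common convention and an assumption here, since the text does not define them):

$$
F(t) = 1 - \exp\left[-\left(\frac{t}{T}\right)^{b}\right], \qquad
\lambda(t) = \frac{b}{T}\left(\frac{t}{T}\right)^{b-1}
$$

The failure rate $\lambda(t)$ decreases for $b < 1$ (infant mortalities), is constant for $b = 1$ (random failures) and increases for $b > 1$ (wear-outs), so the shape parameter alone indicates which of the three failure behaviours is present.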

The Weibull analysis is used in practice to analyse and assess field complaints and to validate tests statistically (e.g. proof of operational strengths) so that targeted measures can be derived from the results of the analysis. The goal in both application cases is to describe the existing database and possibly forecast future probabilities of occurrence considering the corresponding statistical confidence.
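A simple way to describe an existing set of failure times is median rank regression: the failure times are ranked, each rank is assigned an estimated failure probability, and a straight line is fitted in the Weibull probability plot. The sketch below is only one of several possible estimation methods and makes the following assumptions: a complete sample without censored data, Benard's approximation for the median ranks, and a regression of the transformed probability on the logarithm of time. The function names and the failure times are hypothetical.

```python
import math


def fit_weibull(failure_times):
    """Estimate Weibull shape b and characteristic lifetime T (complete sample)."""
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        f = (i - 0.3) / (n + 0.4)                # Benard's median rank approximation
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))  # linearised Weibull probability
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    shape = sxy / sxx                            # slope of the fitted line = b
    scale = math.exp(mean_x - mean_y / shape)    # intercept gives T
    return shape, scale


def failure_probability(t, shape, scale):
    """Forecast the failure probability F(t) from the fitted parameters."""
    return 1.0 - math.exp(-((t / scale) ** shape))


# Hypothetical failure times in operating hours.
shape, scale = fit_weibull([1200, 2100, 2900, 3800, 5200])
forecast = failure_probability(2000, shape, scale)
```

Field data from a warranty period is censored, as the section on the reliability forecast model explains, so a real analysis would need a method that accounts for the units that have not yet failed.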

Industrial application cases:

- Forecast future failure probabilities in field observations: determine the number of defective parts, serial replacement requirement, final stocking quantities, warranty and goodwill costs, …
- Validate tests statistically: use in reliability growth and in the design of Success-Run-Tests, Sudden-Death-Tests, …
- Determine KPIs from the point of view of operational stability
- Analyse load features that affect the lifetime (voltage, cycles, kilometres, …)
- Determine failure probabilities at an assigned load time
- Graphic visualisation option to supplement non-parametric tests (e.g. significance tests)

What we can offer:

We can offer the necessary tools, analyse your data and help you prepare the necessary evaluations including document(s) to fit your application case – get in touch, we will be glad to help.