When replacement pricing and downtime profit losses are considered, repairing failed equipment can be up to 50% more costly than the preventative maintenance that could have averted the failure altogether. However, cost-effective preventative maintenance decisions require accurate analytical models that can predict future failure patterns from recent operating history. In the airline industry, data prognostics for preventative engine maintenance are essential for minimizing downtime costs and maximizing safety for the travelling public.
During this study, AlgoTactica examined the time-series histories of 248 aircraft turbofan engines which, after a period of normal operation, each developed one of two possible component degradations leading to failure. By analyzing engine sensor outputs, an algorithm was trained to identify early trends indicative of degradation and then predict the operational lifetime remaining before failure actually occurred. Predicted lifetimes are defined in remaining operational cycles, where a single cycle comprises start-up, an extended run at cruise power, and shutdown. Cycles are expressed in t-minus notation: t-60, for instance, indicates 60 operational cycles remaining before failure at time zero.
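The t-minus labelling above can be sketched as a simple computation. This is a minimal illustration, not the study's actual pipeline; the function name and list representation are assumptions:

```python
def tminus_labels(n_cycles):
    """Label a run-to-failure history of n_cycles in t-minus notation.

    The final recorded cycle is t-0 (failure); the first cycle of the
    history is t-(n_cycles - 1). Labels are returned oldest-first.
    """
    return [-(n_cycles - 1 - i) for i in range(n_cycles)]

labels = tminus_labels(5)
# A 5-cycle history is labelled [-4, -3, -2, -1, 0]
```

A regression target built this way lets the model predict a signed count of cycles remaining, which matches how the forecasts are reported later in this study.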
For each engine, 3 channels of operational-setting data were available, plus 21 channels of sensor output data. Before data features could be engineered for trend detection, large variations in the sensor signals caused by changes in operational settings had to be removed. For each sensor channel, a single binary regression tree was fitted, with the operational settings as the independent variables and the sensor signal as the dependent variable. As an example, the Feature Engineering Sequence plot illustrates how data features were engineered from a pressure sensor for each of 4 engines. The first-row time series shows the raw sensor output with amplitude shifts due to operational setting changes, while the second row shows the sensor residuals after using the regression trees to predict and subtract the influence of the settings. The third row shows the final engineered feature, derived from a 50-point sliding-window power-integration operation on the residuals. For each engine, the last 100 cycles before failure are shown, and an upward warning trend emerges near the end of the feature.
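The two-stage feature engineering described above can be sketched as follows. This is a hedged reconstruction using scikit-learn and synthetic stand-in data; the channel shapes, coefficients, and the choice of sum-of-squares as the power integral are illustrative assumptions, not the study's exact recipe:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in: 3 operational settings whose changes shift the
# level of one sensor channel, plus a slow degradation trend.
n = 400
settings = rng.integers(0, 3, size=(n, 3)).astype(float)
regime_offset = settings @ np.array([5.0, -3.0, 2.0])   # setting-driven shifts
degradation = np.linspace(0.0, 1.0, n) ** 2             # slow upward trend
sensor = regime_offset + degradation + rng.normal(0.0, 0.05, n)

# Stage 1: regress the sensor on the operational settings with a binary
# regression tree, then subtract the prediction to strip regime shifts.
tree = DecisionTreeRegressor().fit(settings, sensor)
residuals = sensor - tree.predict(settings)

# Stage 2: 50-point sliding-window power integration of the residuals
# (here, the sum of squared residuals over each window).
window = 50
power = np.convolve(residuals ** 2, np.ones(window), mode="valid")
```

Because the tree captures only the setting-dependent level of the sensor, the residuals retain the degradation trend, and the windowed power feature rises as failure approaches.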
Using an early-stopping strategy, a range of regression models was then built on a training set of 174 engines. These models were designed to predict the number of cycles remaining before failure from trending patterns in the 21 engineered features. The best-performing model was a gradient-boosted regression tree ensemble, which scored an RMS error of 11.38 cycles on a 50-engine validation set and 10.04 cycles on a final 25-engine test set.
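A gradient-boosted regression tree ensemble with early stopping can be sketched as below. This uses scikit-learn's `GradientBoostingRegressor` on synthetic stand-in features; the hyperparameters, data shapes, and train/validation split are assumptions for illustration only and do not reproduce the study's reported scores:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)

# Synthetic stand-in: 21 engineered feature channels and a
# remaining-cycles target (shapes mirror the study, values do not).
X = rng.normal(size=(800, 21))
y = X @ rng.normal(size=21) + rng.normal(0.0, 0.5, size=800)

X_train, X_val = X[:600], X[600:]
y_train, y_val = y[:600], y[600:]

# Gradient-boosted regression tree ensemble; boosting halts once the
# score on an internal hold-out split stops improving (early stopping).
model = GradientBoostingRegressor(
    n_estimators=500,          # upper bound; early stopping may use fewer
    learning_rate=0.05,
    validation_fraction=0.2,   # internal split used for early stopping
    n_iter_no_change=10,
    random_state=0,
).fit(X_train, y_train)

rmse = mean_squared_error(y_val, model.predict(X_val)) ** 0.5
```

Early stopping here caps model complexity automatically: the fitted attribute `model.n_estimators_` reports how many boosting rounds were actually used before validation performance plateaued.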
The Forecast Variance boxplots show the performance of the model when making predictions on the test set. Overall, the forecasted number of cycles remaining before failure has a strong association with the actual number, and most importantly, as they approach zero, the predicted and actual values agree very closely. There is greater variability in the t-100 to t-50 range because, in some cases, early warning trends did not emerge from the data until fewer than 50 cycles remained. The Remaining Service Life plot matrix shows the individual relationship between predicted and actual remaining cycles for each of the 25 engines in the test set. Here too, predicted and actual agree well, indicating that for each engine, an observer tracking only trends in the predicted values would acquire early warning of an emerging trajectory towards a failure event.