Reliability Centred Maintenance Meets AI: A Modern Framework

How ML models refine FMEA probabilities using real operating data. A practical framework for integrating AI outputs into RCM decision logic.

Written by Shane Scriven

Your FMEA is probably wrong. Not because it was poorly done — but because it was frozen in time.

Most Failure Mode and Effects Analyses are written once, reviewed occasionally, and left to age quietly in a document management system. The failure probabilities they contain were estimated by experienced engineers using the best information available at the time. That was good practice. It still is.

But those estimates were made before your organisation had three, five, or ten years of condition monitoring data. Before your CMMS held thousands of work orders with failure descriptions. Before machine learning could read those records and find patterns no individual could spot manually.

The question isn't whether AI will change RCM. It is already changing it. The question is how to integrate AI outputs into existing RCM methodology without throwing out what works.

What AI Actually Changes in RCM

RCM has always rested on understanding failure modes, their consequences, and the most effective maintenance response. That logic is sound. AI doesn't replace it. What AI changes are the inputs.

1. Failure Mode Probabilities Become Data Driven

Traditional FMEAs estimate how likely each failure mode is based on engineering experience and whatever operational data is available. In practice, this means educated guesswork shaped by the loudest voice in the room.

ML models trained on real operating data can do something different. They can parse thousands of work orders, correlate them against known failure modes, and produce probability estimates grounded in what actually happened — not what the team remembers happening.
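
As a rough illustration of the classification step, the sketch below matches free-text work orders to FMEA failure modes by keyword overlap. It is a deliberately simple baseline, not the AI agent described in this article. The failure mode names, the keywords, the sample work orders, and the `min_score` threshold are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical FMEA extract: failure mode -> descriptive keywords (assumed for illustration).
FMEA_KEYWORDS = {
    "bearing wear": {"bearing", "vibration", "noise", "overheating"},
    "seal leak": {"seal", "leak", "leaking", "drip"},
    "impeller erosion": {"impeller", "erosion", "flow", "cavitation"},
}

def tokenize(text):
    """Lowercase a work order description and split it into a set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def classify(work_order_text, min_score=1):
    """Return the failure mode sharing the most keywords, or None if below min_score."""
    tokens = tokenize(work_order_text)
    scores = {mode: len(tokens & keywords) for mode, keywords in FMEA_KEYWORDS.items()}
    best_mode = max(scores, key=scores.get)
    return best_mode if scores[best_mode] >= min_score else None

# Hypothetical work order descriptions.
work_orders = [
    "Pump P-101 high vibration, bearing noise reported",
    "Mechanical seal leaking at drive end",
    "Replaced gauge glass",
]

# Observed counts per failure mode; None collects records that matched nothing.
counts = Counter(classify(wo) for wo in work_orders)
```

The counts per failure mode are what feed the probability estimates. Records that return `None` are the ones a reliability engineer would still need to review by hand.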

We ran this exact exercise recently for an oil and gas operator. An AI agent scanned over 5,000 historic work orders from SAP across a single asset class, covering one year of operations. Coupled with the existing FMEA, the agent was able to parse and infer the correct failure mode for over 90% of those records.

The impact was immediate. Failure modes that the team had ranked as unlikely turned out to be frequent. Others they'd prioritised were barely occurring. The Weibull models built from the AI classified data told a fundamentally different story to the ones built from the original FMEA estimates.

Even at 50% accuracy, that kind of backfit is substantial. At 90%, it changes your entire RCM strategy.
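
To make the Weibull step concrete, here is a minimal sketch of fitting a two-parameter Weibull model to the times-to-failure of one classified failure mode, using median rank regression. It assumes complete (uncensored) failure times, which real CMMS data rarely provides, and the sample hours are hypothetical. It is not the method used in the engagement described above.

```python
import math

def fit_weibull_mrr(times_to_failure):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Assumes complete (uncensored) failure times for a single failure mode.
    """
    t = sorted(times_to_failure)
    n = len(t)
    # Bernard's approximation for the median rank of the i-th ordered failure.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    # Linearise the Weibull CDF: ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta).
    x = [math.log(ti) for ti in t]
    y = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    beta = sxy / sxx                          # slope of the fitted line
    eta = math.exp(x_mean - y_mean / beta)    # from intercept = -beta * ln(eta)
    return beta, eta

# Hypothetical operating hours to failure for one AI classified failure mode.
beta, eta = fit_weibull_mrr([410, 620, 790, 950, 1180, 1400])
```

A shape parameter above 1 points to wear-out behaviour and below 1 to early-life failures, which is why reclassified failure data can change the recommended maintenance task and not just its interval.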

2. Maintenance Intervals Respond to Real Conditions

Static maintenance intervals are the default in most organisations. An FMEA says a failure mode develops over a certain timeframe, so maintenance gets scheduled accordingly. The interval is set once and reviewed — if you're lucky — during the next RCM review cycle.

Condition based maintenance has been around for decades. What's new is the ability to use ML models to translate condition signals into maintenance timing decisions at scale. Not one asset at a time, but across fleets.

When your vibration monitoring system flags a change in bearing condition, an ML model can estimate the remaining useful life, compare it against the planned maintenance window, and recommend whether to pull the task forward, leave it, or extend the interval. That recommendation is generated automatically, based on the same degradation physics that RCM has always relied on, just applied continuously rather than once per review cycle.
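
The comparison between remaining useful life and the planned window can be sketched as follows. This assumes roughly linear degradation, which is a simplification, and the readings, the failure threshold, the `safety_margin`, and the `extend_factor` are hypothetical values chosen for illustration.

```python
def estimate_rul_days(readings, failure_threshold):
    """Estimate remaining useful life from (day, value) readings with a straight-line fit.

    Returns None when the trend is flat or improving.
    """
    n = len(readings)
    d_mean = sum(d for d, _ in readings) / n
    v_mean = sum(v for _, v in readings) / n
    sxx = sum((d - d_mean) ** 2 for d, _ in readings)
    slope = sum((d - d_mean) * (v - v_mean) for d, v in readings) / sxx
    if slope <= 0:
        return None
    intercept = v_mean - slope * d_mean
    day_at_threshold = (failure_threshold - intercept) / slope
    last_day = readings[-1][0]
    return max(0.0, day_at_threshold - last_day)

def recommend(rul_days, days_to_planned_task, safety_margin=1.5, extend_factor=3.0):
    """Compare estimated RUL with the time remaining until the planned task."""
    if rul_days is None:
        return "review for extension"   # no measurable degradation trend
    if rul_days < days_to_planned_task * safety_margin:
        return "pull forward"
    if rul_days > days_to_planned_task * extend_factor:
        return "review for extension"
    return "leave"

# Hypothetical vibration velocity trend (day, mm/s) and alarm threshold.
trend = [(0, 2.0), (10, 2.5), (20, 3.0), (30, 3.5)]
rul = estimate_rul_days(trend, failure_threshold=7.1)
action = recommend(rul, days_to_planned_task=60)
```

The output is worded as a recommendation on purpose: an extension in particular is something an engineer should review rather than something to apply automatically.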

3. FMEAs Update as New Patterns Emerge

This is the part most organisations haven't considered yet. Your FMEA is a living document in theory. In practice, it updates when someone has time to review it. Which means it doesn't update.

An AI system that continuously classifies work orders against your failure mode taxonomy can flag when something new appears. A failure mode your FMEA didn't capture. A combination of conditions that hasn't been documented. A pattern that only becomes visible across a large enough dataset.

This doesn't mean the AI writes your FMEA for you. It means the AI raises a hand and says: "I've seen something you haven't documented." Your reliability engineers still decide what to do about it. But they're deciding based on evidence rather than waiting for a catastrophic surprise.
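
One simple way to "raise a hand" is to look for terms that keep recurring in work orders the classifier could not match to any documented failure mode. The sketch below assumes you already have that list of unmatched records. The vocabulary, the sample texts, and the `min_occurrences` threshold are hypothetical.

```python
import re
from collections import Counter

def flag_emerging_terms(unclassified_texts, known_vocabulary, min_occurrences=3):
    """Return terms that recur in unclassified work orders but are absent from the FMEA vocabulary."""
    counts = Counter()
    for text in unclassified_texts:
        # Count each term once per work order so one long description cannot dominate.
        terms = set(re.findall(r"[a-z]{4,}", text.lower()))
        counts.update(terms - known_vocabulary)
    return sorted(term for term, n in counts.items() if n >= min_occurrences)

# Hypothetical FMEA vocabulary and unmatched work order descriptions.
known = {"bearing", "seal", "leak", "impeller"}
unmatched = [
    "Coupling guard cracked",
    "Crack found on coupling hub",
    "Coupling misaligned, cracked bolt",
    "Painted handrail",
]
flags = flag_emerging_terms(unmatched, known)
```

A flagged term is only a prompt for review. Whether it becomes a new failure mode in the FMEA remains an engineering decision.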

Where Traditional RCM Still Holds Firm

Here's the thing: AI changes the inputs to RCM, not the framework itself.

The RCM decision logic — the tree that maps failure modes to maintenance strategies — is timeless. It asks the right questions in the right order. Is the failure hidden or evident? What are the consequences? Is there a condition based task that works? Those questions don't change because you have better data.
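
For readers who think in code, the decision logic can be sketched as a small function. This is a heavily simplified illustration, not the full RCM decision diagram: it ignores the "worth doing" economic tests and task combinations, and the field names and strategy labels are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    hidden: bool                      # not evident to operators under normal conditions
    safety_or_environmental: bool     # failure could hurt people or breach a regulation
    on_condition_task_feasible: bool  # a detectable warning exists with enough lead time
    scheduled_task_feasible: bool     # scheduled restoration or discard would work

def select_strategy(fm: FailureMode) -> str:
    """Map a failure mode to a maintenance strategy using simplified RCM logic."""
    # Proactive tasks come first in every branch when they are technically feasible.
    if fm.on_condition_task_feasible:
        return "on-condition task"
    if fm.scheduled_task_feasible:
        return "scheduled restoration or discard"
    # No proactive task works: the default action depends on visibility and consequence.
    if fm.hidden:
        return "failure-finding task"
    if fm.safety_or_environmental:
        return "redesign"
    return "run to failure"

# Hypothetical failure mode for illustration.
strategy = select_strategy(FailureMode(
    name="bearing wear",
    hidden=False,
    safety_or_environmental=False,
    on_condition_task_feasible=True,
    scheduled_task_feasible=False,
))
```

Note that nothing in the function refers to where the inputs came from. Better data changes the answers to the questions, not the questions themselves.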

Expert engineering judgement still wins for low data, high consequence failure modes. When you're dealing with safety critical systems and you don't have enough failure history to train a reliable model, the conservative engineering estimate is the right call. Absence of failure data is not evidence of low risk.
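
The last point can be quantified. The sketch below computes an upper confidence bound on the failure rate after a period with zero observed failures. It assumes a constant failure rate (a Poisson process), which will not hold for every failure mode, and the function name is illustrative.

```python
import math

def zero_failure_rate_upper_bound(exposure_years, confidence=0.95):
    """Upper confidence bound on a constant failure rate after zero observed failures.

    Assumes a Poisson process, so P(no failures in T) = exp(-rate * T).
    Solving exp(-rate * T) = 1 - confidence for rate gives the bound.
    """
    return -math.log(1.0 - confidence) / exposure_years

# Three failure-free years on a single asset.
bound = zero_failure_rate_upper_bound(3.0)
```

Under that assumption, three failure-free years on one asset only rules out rates above roughly one failure per year at 95% confidence. For a high consequence failure mode, that is nowhere near enough evidence to relax a conservative estimate.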

The organisations getting this right are treating AI as a challenge function for their existing RCM programme — not a replacement for it. The AI provides better inputs. The RCM logic processes them. The engineer makes the call.

When Your ML Model and Your FMEA Disagree

This will happen. It should happen. A disagreement between your model and your FMEA is not a problem to resolve — it's a signal to investigate.

Three scenarios and how to handle them:

The model flags a failure mode that isn't in your FMEA. Ask whether you have failure history for it. If yes, your FMEA has a gap. If no, the model may be overfitting noise. Validate with engineering judgement before acting.

The model predicts a higher probability than your FMEA. Check whether operating conditions have shifted since the FMEA was written. Context drift is the most common explanation. If conditions haven't changed, compare data sources: is the model trained on representative data?

The model predicts a lower probability than your FMEA. Don't automatically reduce your estimate. The conservative position exists for a reason. Only consider a guided reduction if you have three or more years of clean operating data with no occurrence and the failure mode is not safety critical.
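
The handling rules above can be written down as a small triage function. This sketch assumes the three scenarios are: the model flags a failure mode missing from the FMEA, the model's probability is higher than the FMEA's, and the model's probability is lower. The parameter names and the returned wording are hypothetical, and this is not the downloadable flowchart.

```python
def triage(in_fmea, model_prob, fmea_prob, has_failure_history,
           conditions_changed, clean_years_no_occurrence, safety_critical):
    """Suggest a next step when the ML model and the FMEA disagree."""
    # Scenario 1 (assumed): the model flags a failure mode the FMEA does not contain.
    if not in_fmea:
        if has_failure_history:
            return "FMEA gap: review and document the failure mode"
        return "possible overfitting: validate with engineering judgement"
    # Scenario 2 (assumed): the model's probability is higher than the FMEA's.
    if model_prob > fmea_prob:
        if conditions_changed:
            return "context drift: revisit the FMEA for current operating conditions"
        return "check whether the training data is representative"
    # Scenario 3 (assumed): the model's probability is lower than the FMEA's.
    if model_prob < fmea_prob:
        if clean_years_no_occurrence >= 3 and not safety_critical:
            return "consider a guided reduction"
        return "keep the conservative FMEA estimate"
    return "no disagreement"

# Hypothetical case: the model says a safety critical failure mode is rarer than the FMEA does.
step = triage(in_fmea=True, model_prob=0.01, fmea_prob=0.05,
              has_failure_history=False, conditions_changed=False,
              clean_years_no_occurrence=5, safety_critical=True)
```

Every branch returns something to investigate or confirm, never an automatic change to the FMEA.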

We've built a decision flowchart that walks through this logic step by step. Download it below.

A Practical Implementation Roadmap

If you want to integrate AI into your RCM programme, here's the sequence that works:

Phase 1: Use AI to parse your existing work orders against your FMEA failure modes. This is where 80% of the value sits. You already have the data. You just haven't read it yet.

Phase 2: With AI classified failure data, your statistical models become dramatically more accurate. This changes maintenance intervals, spares holdings, and risk profiles.

Phase 3: Feed real time condition signals into your failure models so maintenance timing responds to what's actually happening, not what was predicted five years ago.

Phase 4: Set up continuous work order classification so your FMEA stays current. New failure patterns get flagged. Probability estimates evolve. Your RCM programme becomes self improving.

The key is starting with Phase 1. It requires no sensors, no new infrastructure, and no capital investment. Just your existing CMMS data and an AI agent capable of reading it.

The Honest Assessment

Most organisations have years of valuable operational data sitting in their CMMS. Nobody is analysing it. The work orders are written, closed, and forgotten. The failure intelligence they contain never makes it back into the FMEA, never updates the Weibull model, never changes the maintenance strategy.

AI makes that extraction practical for the first time. Not perfect — practical. Even a model that correctly classifies half your work orders gives you dramatically more information than you had before.

RCM is not going away. It's getting better inputs. The organisations that recognise this early will build maintenance strategies grounded in what their assets are actually doing, not what an engineer estimated five years ago.

The RCM decision tree hasn't changed. The data feeding it has.

Download the Free Decision Framework

"When Your ML Model and Your FMEA Disagree" — a step by step flowchart for handling disagreements between AI predictions and your FMEA.

Want to Explore AI Integration for Your RCM Programme?

SAS Asset Management works with asset intensive organisations to connect operational data to maintenance strategy. Get in touch.
