What 3,126 Work Orders Taught Us About Your FMEA

Maintenance work orders are the richest record of how your assets fail. We took 3,126 real work orders, classified them against a real FMEA, and found 337 describing failures the FMEA did not contain.

Written by Shane Scriven


Every reliability manager we have ever worked with knows that their CMMS data is messier than it should be. Most of them have made peace with it. The dashboards still get built, the KPIs still get reported, and the maintenance strategy still gets written, all on top of a foundation that everyone privately knows is unreliable.

SAS-AM has come to believe this is the single biggest unaddressed problem in asset management. Not the absence of data. The absence of trust in the data we already have. Over the last six weeks our team tested whether that has to be the case. This is what we found.

The problem in one sentence

Maintenance work orders are the richest record of how your assets actually fail. They are also almost entirely unstructured. Free text descriptions, inconsistent abbreviations, cause and effect blurred together, recorded by tradespeople under time pressure at the end of a shift.

The information is in there. Extracting it at scale has historically required a team of analysts and a year of effort. Most organisations decide it is not worth the cost. That decision has held up for thirty years. It does not hold up anymore.

What we built

We took a single dataset of 3,126 work orders from a real asset-intensive operation and built a pipeline that does three things.

First, it ingests the raw work order text along with whatever metadata exists — asset ID, component, date, work type. Second, it classifies each work order against the asset's existing FMEA taxonomy, returning a specific failure mode, a confidence score, and any secondary matches. Third, it surfaces work orders that do not fit any existing failure mode, flagging them as taxonomy gaps rather than discarding them.
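As a rough illustration of those three steps, here is a minimal Python sketch. It is not the pilot's pipeline. The record fields follow the metadata listed above, but the class names, the `run_pipeline` function, the toy FMEA and the keyword-matching `keyword_classifier` are all assumptions made for the example. In particular, the keyword matcher is only a stand-in for the language model call, so the sketch runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkOrder:
    text: str                          # raw free-text description
    asset_id: Optional[str] = None     # metadata is optional: use whatever exists
    component: Optional[str] = None
    date: Optional[str] = None
    work_type: Optional[str] = None


@dataclass
class Classification:
    failure_mode: Optional[str]        # None means no FMEA mode fits
    confidence: float                  # 0.0 to 1.0
    secondary: list[str] = field(default_factory=list)


def keyword_classifier(wo: WorkOrder, fmea: dict[str, list[str]]) -> Classification:
    """Hypothetical stand-in for the model: score each mode by keyword hits."""
    words = wo.text.lower()
    scores = {mode: sum(kw in words for kw in kws) / len(kws)
              for mode, kws in fmea.items()}
    ranked = sorted((m for m, s in scores.items() if s > 0),
                    key=lambda m: -scores[m])
    if not ranked:
        return Classification(None, 0.0)
    return Classification(ranked[0], scores[ranked[0]], ranked[1:])


def run_pipeline(orders, fmea, classify=keyword_classifier):
    """Classify every work order; keep non-fitting ones as gaps, never discard."""
    classified, gaps = [], []
    for wo in orders:
        result = classify(wo, fmea)
        (gaps if result.failure_mode is None else classified).append((wo, result))
    return classified, gaps


# Toy FMEA taxonomy: failure mode -> illustrative keywords (invented for this example).
FMEA = {
    "bearing seizure": ["bearing", "seized"],
    "seal leak": ["seal", "leak"],
}
orders = [
    WorkOrder("Pump brg seized, replaced bearing", asset_id="P-101"),
    WorkOrder("Oil leak at shaft seal, bearing noisy"),
    WorkOrder("Corroded junction box lid"),
]
classified, gaps = run_pipeline(orders, FMEA)
for wo, c in classified:
    print(wo.text, "->", c.failure_mode, c.confidence, c.secondary)
print("gaps:", [wo.text for wo, _ in gaps])
```

The design point worth copying is the return type: a primary mode, a confidence score and a list of secondary matches for every work order, with a separate list for the ones that fit nothing.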

The classifier itself is a large language model with a structured prompt. The interesting part is not the model. The interesting part is the confidence framework wrapped around it. Every classification carries a measurable confidence level, and the output is grouped into four buckets the maintenance team can actually use.
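To make the bucketing idea concrete, the sketch below assigns one classification to a bucket. Everything numeric here is an assumption: the article does not publish its thresholds, and it names only three of the four buckets (High, Review and Gap), so the fourth is given the placeholder name `LOW` purely for illustration.

```python
from enum import Enum
from typing import Optional


class Bucket(Enum):
    HIGH = "High"
    REVIEW = "Review"
    GAP = "Gap"
    LOW = "Low"  # hypothetical name: the fourth bucket is not named in the article


# Assumed cut-offs, chosen only to make the example runnable.
HIGH_THRESHOLD = 0.85
REVIEW_THRESHOLD = 0.50


def assign_bucket(failure_mode: Optional[str], confidence: float) -> Bucket:
    """Map a (failure mode, confidence) pair to one of four buckets."""
    if failure_mode is None:
        return Bucket.GAP          # nothing in the FMEA fits: a taxonomy gap
    if confidence >= HIGH_THRESHOLD:
        return Bucket.HIGH         # accept without manual review
    if confidence >= REVIEW_THRESHOLD:
        return Bucket.REVIEW       # plausible match, worth an engineer's look
    return Bucket.LOW              # too uncertain to act on


print(assign_bucket("seal leak", 0.92))   # Bucket.HIGH
print(assign_bucket("seal leak", 0.60))   # Bucket.REVIEW
print(assign_bucket(None, 0.0))           # Bucket.GAP
```

Whatever the real cut-offs are, keeping them as named constants makes the framework auditable: a reviewer can see exactly where "High" ends and "Review" begins.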

The numbers

Pipeline completion

Processed cleanly

3,102 of 3,126 work orders made it through the classifier with no manual triage.

Actionable

Mapped to a failure mode

2,286 work orders landed in the High, Review or Gap buckets with a measurable confidence score.

Multi match

Matched two or more modes

1,273 work orders fit multiple failure modes. Not noise — a taxonomy ambiguity signal.

The killer finding

Taxonomy gaps surfaced

337 work orders described failure modes happening in the field that the FMEA did not contain at all. Roughly one in ten.

Why the gap finding matters

If you operate a complex asset, your maintenance strategy is built on a failure mode and effects analysis. Every PM, every condition monitoring task, every spare parts decision, every reliability KPI traces back to the FMEA somewhere.

If the FMEA is missing one in ten of the failures actually happening in your operation, then one in ten of those failures sits outside every decision that traces back to it. You are running a maintenance strategy against failure modes you cannot see.

This is not a failure of the analyst who built the FMEA. FMEAs are point-in-time documents. They get built when an asset is commissioned, signed off, filed, and rarely opened again. Meanwhile the asset ages, operating conditions drift, modifications get made, new failure modes emerge. The FMEA does not update itself.

What we found is that the work order history has been quietly recording those missed failure modes for years. The information was always there. There was just no affordable way to extract it at scale.

Three numbers every reliability manager should be tracking

If you are running a reliability programme on top of an FMEA that was built more than two years ago, three numbers should matter to you.

Metric one

Your actionable percentage

What proportion of your work orders can be confidently mapped to a known failure mode. Higher is better — it means the data layer underneath your reliability programme is trustworthy.

Pilot result: 73.7% (borderline).

Below 70% suggests either your FMEA is too coarse or your work order quality is too poor. Both are fixable — but only once you have measured.


Metric two

Your multi match rate

Where the same work order maps to multiple failure modes, your taxonomy has overlap. This is not a defect of the classifier. It is a defect of the FMEA, and it is the reason your engineers argue about which mode to assign at criticality reviews.

Pilot result: 40.7% (high overlap).

Above 30% means your taxonomy carries overlap your engineers privately know about. The fix sits in the FMEA, not the data.
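One way to turn a multi-match rate into a list of FMEA fixes is to count which pairs of failure modes keep getting matched by the same work order. The sketch below assumes each work order's matches are available as a list of mode names. The function name and the sample data are invented for illustration and are not taken from the pilot.

```python
from collections import Counter
from itertools import combinations


def overlapping_mode_pairs(matches: list[list[str]]) -> Counter:
    """Count how often each pair of failure modes is matched by the same work order."""
    pairs = Counter()
    for modes in matches:
        # sorted(set(...)) makes ("a", "b") and ("b", "a") the same key
        for pair in combinations(sorted(set(modes)), 2):
            pairs[pair] += 1
    return pairs


# Invented sample: one inner list per work order.
matches = [
    ["bearing wear", "lubrication failure"],
    ["bearing wear", "lubrication failure", "misalignment"],
    ["seal leak"],
    ["lubrication failure", "bearing wear"],
]

pairs = overlapping_mode_pairs(matches)
multi_match_rate = sum(len(set(m)) > 1 for m in matches) / len(matches)

print(f"multi match rate: {multi_match_rate:.0%}")
for (a, b), n in pairs.most_common():
    print(f"{a} / {b}: {n}")
```

The pairs at the top of that list are the candidates for merging, splitting or redefining in the FMEA.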


Metric three

Your gap rate

The proportion of work orders that describe failures your FMEA does not contain. Anything above 5% is a strong signal that your maintenance strategy is being built on an outdated picture of how your assets actually fail.

Pilot result: 10.8% (FMEA gap).

Above 5% means failure modes are happening in the field that your FMEA does not capture. That is the killer finding from the pilot.


These three numbers tell you, for the first time, how much you should trust the data layer underneath your reliability programme.
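The three pilot percentages can be reproduced from the counts reported earlier. One caveat: the article does not state which denominator each metric uses. The choices below are inferred, being the only ones that reproduce the published figures: the actionable percentage is taken over the 3,102 work orders processed cleanly, and the other two over all 3,126. The thresholds are the ones quoted for each metric above.

```python
# Counts reported in the article.
TOTAL = 3126        # work orders in the dataset
PROCESSED = 3102    # made it through the classifier with no manual triage
ACTIONABLE = 2286   # landed in the High, Review or Gap buckets
MULTI_MATCH = 1273  # matched two or more failure modes
GAPS = 337          # fit no failure mode in the FMEA


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Denominators are inferred, not stated in the article (see lead-in).
actionable_pct = pct(ACTIONABLE, PROCESSED)   # 73.7
multi_match_pct = pct(MULTI_MATCH, TOTAL)     # 40.7
gap_pct = pct(GAPS, TOTAL)                    # 10.8

# Thresholds quoted in the article.
flags = {
    "actionable below 70%": actionable_pct < 70,
    "multi match above 30%": multi_match_pct > 30,
    "gap rate above 5%": gap_pct > 5,
}

print(actionable_pct, multi_match_pct, gap_pct)
for name, raised in flags.items():
    print(f"{name}: {'FLAG' if raised else 'ok'}")
```

Run against your own counts, the same few lines show which of the three thresholds your programme crosses. When comparing sites, state the denominators explicitly so the percentages stay comparable.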

Talk to us

If you want the intake brief for the free work order analysis, get in touch or reply to the LinkedIn post that brought you here.

SAS Asset Management provides advanced analytics, expert asset management services and maturity assessments to help asset owners realise their value.
