Small Models, Big Impact: Running Asset Intelligence on Existing Industrial Hardware
Why lightweight AI models running on existing PLCs and edge devices are outperforming cloud-dependent alternatives in the field.

There's a persistent assumption in conversations about AI that bigger models are better models. More parameters, more training data, more compute - surely that means more accurate predictions?
At the edge, this logic breaks down. And understanding why reveals something important about what makes AI useful for asset management versus what makes it impressive on benchmarks.
Why small models win at the edge
Consider what's actually required to run a machine learning model at an asset location. You need hardware that can survive industrial environments - vibration, temperature extremes, dust, humidity. You need power consumption that won't overwhelm local electrical capacity. You need processing that happens fast enough to matter. And ideally, you need all of this at a cost that doesn't eliminate the business case.
Large language models and massive neural networks are engineering marvels. They're also completely impractical for edge deployment in most asset management contexts. A model that requires a GPU cluster and draws kilowatts of power isn't going on a pump in a remote pump station.
Small models - we're talking models that can run on industrial-grade hardware with power budgets measured in watts - turn out to be remarkably effective for the focused tasks that matter in asset management.
Matching models to tasks
The key insight is that asset management AI doesn't need to understand arbitrary questions or generate creative text. It needs to answer specific, well-defined questions about specific assets.
Vibration analysis - Is this bearing degrading? Is there an imbalance developing? Has the machine's signature changed in a concerning way? These questions have been well-understood analytically for decades. The ML adds speed and automation, not fundamentally new capability. A model with a few thousand parameters can handle this reliably.
Thermal trending - Is this equipment running hotter than it should? Is the temperature trajectory concerning? Again, the underlying physics is well-understood. The model needs to learn what "normal" looks like for a specific asset and flag deviations. This doesn't require billions of parameters.
Anomaly detection - Does this pattern of sensor readings look like patterns we've seen before, or is something unusual happening? Techniques like isolation forests and autoencoders can identify outliers with models small enough to run on a Raspberry Pi (a minimal sketch follows below).
Classification - Given these symptoms, which of these known failure modes is most likely? Decision trees and small neural networks handle this well when trained on relevant historical data.
The pattern is consistent: focused questions with well-defined answer spaces can be addressed with computationally efficient models.
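To make the anomaly detection case concrete, here is a minimal sketch using scikit-learn's IsolationForest on synthetic per-reading feature vectors. The feature names, values, and contamination setting are illustrative assumptions, not figures from a real deployment.

```python
# Minimal sketch: anomaly detection on per-reading feature vectors with an
# isolation forest. Features and thresholds are illustrative only.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Pretend "normal" operation: RMS vibration, bearing temperature, motor current.
normal = np.column_stack([
    rng.normal(2.0, 0.2, 5000),   # RMS velocity, mm/s
    rng.normal(55.0, 3.0, 5000),  # bearing temperature, degrees C
    rng.normal(12.0, 0.5, 5000),  # motor current, A
])

model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
model.fit(normal)

# A new reading that drifts away from the learned envelope.
reading = np.array([[3.4, 68.0, 13.2]])
print(model.predict(reading))            # -1 flags an anomaly, 1 is normal
print(model.decision_function(reading))  # lower scores are more anomalous
```

A model like this trains in seconds on a year of readings and runs inference comfortably within a few milliseconds on modest edge hardware.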
What efficiency looks like in practice
A vibration analysis model running on edge hardware might have 10,000 to 100,000 parameters. Compare that to GPT-class models with hundreds of billions of parameters. The difference isn't just academic - it translates directly to what hardware you need and what it costs to operate.
Power consumption. A small model running inference on efficient edge hardware might draw 5-10 watts. That's solar-panel territory for remote assets. A cloud-scale model needs orders of magnitude more power.
Hardware cost. Industrial-grade edge devices capable of running lightweight ML models cost hundreds to low thousands of dollars. The infrastructure to run large models costs more than most individual assets are worth.
Inference speed. Small models produce predictions in milliseconds. When you need to detect an anomaly before a pump cavitates, milliseconds matter.
Reliability. Simpler models with fewer moving parts are easier to validate, easier to debug when something goes wrong, and more predictable in operation. In regulated environments, this transparency matters.
The efficiency/accuracy trade-off (and when it doesn't exist)
There's a common assumption that smaller models must be less accurate. Sometimes this is true - but often it isn't, for an important reason.
Large models are designed to generalise across diverse tasks. They need massive capacity because they're trying to learn patterns that apply to everything from chess to poetry to medical diagnosis.
Asset management models don't need to generalise that broadly. They need to be very good at recognising patterns in specific types of sensor data for specific types of equipment. This narrower scope means smaller models can achieve high accuracy because they're not wasting capacity on irrelevant capabilities.
A model trained specifically to detect bearing faults in centrifugal pumps doesn't need to know anything about language, images, or even other types of rotating equipment. It can use all its capacity for the one thing it needs to do well.
In practice, well-designed small models often outperform large general-purpose models on focused tasks because they're optimised for exactly that task rather than trying to be good at everything.
Techniques that make small models work
Several approaches help maximise capability in minimal computational footprints.
Knowledge distillation. Train a large, capable model with abundant compute, then train a smaller model to replicate its behaviour. The small model learns to approximate the large model's predictions without needing the same capacity. The computational cost shifts to training time rather than inference time.
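As a rough illustration, here is what a distillation loss can look like in PyTorch. The temperature, the weighting, and the framing as a small fault classifier are assumptions for the sketch, not a prescribed recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """Blend the teacher's soft targets with the ordinary hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)               # rescale so gradients stay comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage sketch: the large teacher runs only during training, typically offline;
# only the small student ever ships to the edge device.
# with torch.no_grad():
#     teacher_logits = teacher(batch)
# loss = distillation_loss(student(batch), teacher_logits, labels)
```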
Quantisation. Reduce the numerical precision of model parameters. Instead of using 32-bit floating point numbers, use 8-bit or even 4-bit integers. This reduces memory requirements and speeds up computation with minimal accuracy impact for many applications.
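A stripped-down version of post-training quantisation looks something like this: a symmetric, per-tensor scheme in NumPy. Real toolchains are more sophisticated, but the sketch shows where the 4x memory saving comes from.

```python
import numpy as np

def quantise_int8(weights: np.ndarray):
    """Symmetric post-training quantisation of a weight tensor to int8."""
    scale = np.max(np.abs(weights)) / 127.0                  # one scale per tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantise(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 64).astype(np.float32)              # a layer's weights
q, scale = quantise_int8(w)
error = np.abs(w - dequantise(q, scale)).mean()
print(f"int8 storage: {q.nbytes} bytes vs float32: {w.nbytes} bytes, "
      f"mean abs error: {error:.5f}")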
Pruning. Remove parameters that contribute little to model performance. Many trained networks have redundant connections that can be eliminated without significantly affecting predictions.
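Magnitude pruning, the simplest variant, is just a thresholding operation. The sparsity level below is an arbitrary example, and in practice a round of fine-tuning usually follows to recover any lost accuracy.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the smallest-magnitude weights; keep the rest unchanged."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask

w = np.random.randn(128, 32).astype(np.float32)
pruned = magnitude_prune(w, sparsity=0.7)
print(f"non-zero weights: {np.count_nonzero(pruned)} of {w.size}")
```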
Architecture selection. Choose model architectures designed for efficiency. MobileNets, SqueezeNets, and similar architectures achieve reasonable accuracy with a fraction of the parameters of standard architectures.
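The arithmetic behind that efficiency claim is straightforward. For one example layer, the depthwise-separable factorisation that MobileNet-style architectures use needs a fraction of the parameters of a standard convolution (the layer sizes below are arbitrary):

```python
# Rough parameter count: a standard 3x3 convolution vs the depthwise-separable
# factorisation used by MobileNet-style architectures (example sizes only).
k, c_in, c_out = 3, 64, 128

standard = k * k * c_in * c_out            # 73,728 parameters
separable = k * k * c_in + c_in * c_out    # 576 + 8,192 = 8,768 parameters

print(standard, separable, round(standard / separable, 1))  # roughly 8x fewer
```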
Domain-specific features. Instead of feeding raw sensor data into a neural network and hoping it learns useful representations, engineer features based on domain knowledge. FFT-based frequency features for vibration analysis, for example, compress relevant information into a form that simpler models can use effectively.
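For vibration work, that feature engineering can be as simple as summing spectral energy in a handful of frequency bands. The sample rate, band edges, and synthetic signal below are purely illustrative.

```python
import numpy as np

def band_energies(signal: np.ndarray, fs: float, bands) -> np.ndarray:
    """Compress a raw vibration waveform into per-band spectral energies."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return np.array([spectrum[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in bands])

fs = 10_000                       # sample rate, Hz
t = np.arange(0, 1.0, 1.0 / fs)
# Illustrative signal: shaft rotation at 50 Hz plus a weak tone at 1.6 kHz.
signal = np.sin(2 * np.pi * 50 * t) + 0.05 * np.sin(2 * np.pi * 1600 * t)

bands = [(0, 200), (200, 1000), (1000, 2000), (2000, 5000)]  # example bands
features = band_energies(signal, fs, bands)
print(features)                   # four numbers instead of 10,000 raw samples
```

A classifier fed four well-chosen numbers can be orders of magnitude smaller than one asked to learn the same structure from 10,000 raw samples.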
When you do need more capability
Small models aren't the answer to every question. Some use cases genuinely benefit from larger models or cloud-based processing.
Complex multi-modal analysis. If you're trying to combine vibration, thermal, acoustic, and visual data to make integrated predictions, the complexity may exceed what small models handle well.
Natural language interfaces. If you want operators to query system state using natural language rather than structured interfaces, you probably need larger language models - and that probably means cloud processing.
Novel pattern discovery. If you're trying to find previously unknown failure modes rather than detecting known ones, you may need more exploratory analytical approaches that benefit from larger models.
Continuous retraining. If your model needs to adapt continuously to changing conditions, the training process may need more compute than edge hardware provides.
The answer in these cases is usually a hybrid architecture - small models at the edge handling time-critical decisions, larger models in the cloud handling complex analysis that can tolerate latency.
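The skeleton of that hybrid pattern is simple: score locally, act locally, and escalate only suspicious windows to the cloud. The endpoint URL, threshold, and helper names below are hypothetical placeholders, not part of any particular product.

```python
import json
import urllib.request

ANOMALY_THRESHOLD = 0.8
CLOUD_ENDPOINT = "https://example.com/api/deep-analysis"   # placeholder URL

def trip_local_alarm(score: float) -> None:
    # Stand-in for whatever local action matters: open a relay, log, notify SCADA.
    print(f"local alarm: anomaly score {score:.2f}")

def escalate_to_cloud(features, score: float) -> None:
    """Best-effort, latency-tolerant path; the edge decision has already been made."""
    payload = json.dumps({"features": list(features), "score": score}).encode()
    req = urllib.request.Request(
        CLOUD_ENDPOINT, data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=10)
    except OSError:
        pass  # cloud analysis is optional; never let it block the edge loop

def handle_window(features, edge_score_fn) -> None:
    score = edge_score_fn(features)          # small model, runs in milliseconds
    if score >= ANOMALY_THRESHOLD:
        trip_local_alarm(score)              # time-critical, never waits on the network
        escalate_to_cloud(features, score)   # deeper analysis can lag behind
```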
Getting practical about model selection
When deploying AI at the edge, model selection should start with the use case rather than the technology.
What question are you trying to answer? Be specific. "Is this pump healthy?" is too vague. "Is this bearing showing wear patterns that indicate failure within the next 30 days?" is something a model can actually learn.
What accuracy is actually required? You don't need 99.9% accuracy for most maintenance applications. You need to catch serious problems before they cause serious consequences. A model that detects 85% of significant faults with a low false positive rate may be more operationally useful than one that detects 95% but generates far more false alarms (the back-of-envelope comparison after this list shows why).
What compute is available? This constrains what's possible. If you're limited to low-power industrial hardware, you're working with small models whether you want to or not.
What data do you have? Sophisticated models are useless without appropriate training data. Simple models trained on good data outperform complex models trained on poor data.
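To see why the false-alarm rate dominates, here is a back-of-envelope comparison. The window counts, fault counts, and rates are invented purely to illustrate the trade-off.

```python
# Hypothetical comparison of two fault-detection models screening
# 1,000 asset-weeks that contain 20 genuine developing faults.
windows, true_faults = 1_000, 20
healthy = windows - true_faults

# Model A: modest recall, very few false alarms.
caught_a = round(0.85 * true_faults)   # 17 faults caught
false_a = round(0.01 * healthy)        # ~10 nuisance alarms

# Model B: higher recall, but a much busier alarm list.
caught_b = round(0.95 * true_faults)   # 19 faults caught
false_b = round(0.08 * healthy)        # ~78 nuisance alarms

print(f"A: {caught_a} caught, {false_a} false alarms")
print(f"B: {caught_b} caught, {false_b} false alarms")
```

Under these invented numbers, each extra fault Model B catches costs roughly 34 additional nuisance alarms - exactly the kind of trade-off that erodes operator trust.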
The goal isn't the most impressive model - it's the model that reliably improves decisions about your specific assets in your specific operating context. Often, that's a surprisingly small model doing a well-defined job well.
From models to maintenance decisions
The shift toward efficient, focused models reflects a broader evolution in how AI supports reliability engineering. Getting value from AI in asset management isn't primarily about algorithmic sophistication - it's about connecting model outputs to maintenance decisions in ways that improve outcomes. Our exploration of how AI is transforming reliability engineering examines this integration in depth.
The hardware landscape enabling edge AI deployment continues to advance. Dell's edge AI predictions highlight how purpose-built edge compute is becoming more capable while maintaining the power efficiency essential for asset management applications. For organisations planning edge deployments, understanding where edge hardware is heading helps inform both platform selection and deployment timeline decisions.