A Deep Dive into Time Series Foundation Models for SLB Surface Production Assets: Unveiling the Future
Introduction
Time series data are ubiquitous across domains and industries and hold significant importance in many real-world dynamical systems such as weather, economics, and energy. Due to this prevalence, techniques for analyzing time series data have been studied extensively using statistical methods, machine learning, and deep learning models. Although deep learning-based methods have surpassed traditional time series methods in some instances, they are still limited to one model per dataset.
Recent advancements in the language and vision domains have revealed that large language models (LLMs) and vision language models (VLMs) have robust pattern recognition capabilities. This development has sparked growing interest in Foundation Models (FMs) for time series data and motivated several key questions, such as: (1) Can we design one large model that can learn diverse time series characteristics? (2) Can these large time series foundation models (TSFMs) be used out-of-the-box on new, unseen data? (3) While classical machine learning advocates task-specific models, can a TSFM be task agnostic? Existing research has explored some of these questions and shown early promise for a universal model for time series analysis. Although these models perform well on public open-source datasets, their performance degrades significantly when applied to domain-specific data.
In this article, we discuss a time series foundation model tailored for SLB equipment datasets. This model offers numerous benefits, with key advantages including its ability to learn the general characteristics, behavior patterns, and some physical properties of the equipment.
Figure 1. Production Assets Time Series Foundation Model takes multi-modal inputs to unlock diverse downstream tasks
Background
SLB possesses a unique combination of domain experts and data resources. These domain experts have deep knowledge of energy systems. Additionally, we have built an ecosystem for capturing sensor data, contextual data from manuals and documents, and equipment data across multiple sectors, including production, drilling, and well construction. This inheritance, built over the years, gives us the ability to design a time series foundation model tailored to the nuances of our domain. As noted earlier, time series are hard to model because of their varied characteristics. Some of the challenging aspects of time series are:
- Changing temporal resolution (frequency)
- Inconsistent number of channels (univariate vs multivariate)
- Varying length of time series
- Difference in amplitudes of time series
- Changing context length, prediction length
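Many published TSFMs cope with these variations through per-instance normalization (which removes amplitude differences) and fixed-length patching with a padding mask (which reconciles varying lengths). The sketch below illustrates that general idea in plain numpy; the patch length and padding scheme are illustrative assumptions, not the preprocessing of any specific model named above.

```python
import numpy as np

def to_patches(series, patch_len=16, pad_value=0.0):
    """Instance-normalize a 1-D series and split it into fixed-length patches.

    Padding plus a validity mask lets series of different lengths share one
    batch; per-instance normalization removes amplitude differences.
    """
    x = np.asarray(series, dtype=float)
    mu, sigma = x.mean(), x.std() + 1e-8
    x = (x - mu) / sigma                      # reversible z-normalization
    n_patches = int(np.ceil(len(x) / patch_len))
    padded = np.full(n_patches * patch_len, pad_value)
    padded[:len(x)] = x
    mask = np.zeros(n_patches * patch_len, dtype=bool)
    mask[:len(x)] = True                      # marks real (non-padded) samples
    return padded.reshape(n_patches, patch_len), mask.reshape(n_patches, patch_len), (mu, sigma)

# A 50-sample series becomes 4 patches of 16, the last one partially padded
patches, mask, stats = to_patches(np.sin(np.linspace(0, 10, 50)), patch_len=16)
```

The stored `(mu, sigma)` pair lets forecasts be mapped back to the original scale of the sensor.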
To address these challenges, we investigated state-of-the-art TSFMs such as MOIRAI, Lag-Llama, MOMENT, TimesFM, UNITS, and Tiny Time Mixers (TTMs), and benchmarked them to understand their strengths and limitations.
Model Capabilities
The primary motivation behind training large models is their ability to learn complex data distributions and exhibit superior generalization on unseen test instances. By training on extensive datasets, these models discern underlying patterns in the data and learn a latent space – a high-dimensional space where the input data are encoded semantically. The encoding of input data into this latent space is termed an embedding, or vector representation. Of note, proximity in this latent space is indicative of the similarity or dissimilarity between the corresponding inputs in the original space.
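The "proximity means similarity" property is typically measured with cosine similarity between embedding vectors. A minimal sketch with toy hand-made embeddings (the vectors are illustrative, not real model outputs):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1 = identical direction)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy embeddings: two windows of similar behavior, one dissimilar window
e_normal_1  = np.array([0.9, 0.1, 0.2])
e_normal_2  = np.array([0.8, 0.2, 0.1])
e_anomalous = np.array([-0.7, 0.6, 0.3])

sim_near = cosine_similarity(e_normal_1, e_normal_2)   # close to 1: nearby in latent space
sim_far  = cosine_similarity(e_normal_1, e_anomalous)  # much lower: far apart
```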
As a result of learning such meaningful embeddings, these large models become adept at undertaking various downstream tasks, even in zero-shot or few-shot learning scenarios (as shown in Figure 1), contingent on the input data. Nonetheless, despite their ability to learn generic representations, large models often require fine-tuning for domain-specific datasets, particularly when there is a statistical discrepancy between the training and test data. The fine-tuning strategy therefore becomes crucial, as it allows the model to adapt to the nuances of domain-specific information, often with just a handful of examples.
Surface Production Asset Modeling
For production asset modeling, we started our investigation by curating synthetic and real datasets for production assets and employing open-source TSFMs, mainly for forecasting and anomaly detection. We designed our experiments to measure the capabilities of TSFMs under three settings: (1) univariate vs. multivariate forecasting, (2) zero-shot vs. few-shot fine-tuning, and (3) visualization of input embeddings for anomaly detection. Our experiments suggest that while these models exhibit similar forecasting performance on synthetic data, they show poor forecast results on real datasets. With a multivariate modeling approach and fine-tuning, some of these models can improve forecast results; however, they do not consistently produce reliable forecasts. Further, when we analyzed the PCA and t-SNE projections of the embeddings, many of these models could not preserve semantic meaning after dimensionality reduction and failed to discern anomalies in the data.
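A PCA projection of embeddings, as used in the visualization experiments above, can be sketched directly with numpy's SVD. The synthetic clusters below stand in for nominal and anomalous embedding sets; a model that preserves semantic meaning should keep them separable after projection.

```python
import numpy as np

def pca_project(embeddings, n_components=2):
    """Project embeddings onto their top principal axes via SVD."""
    X = np.asarray(embeddings, float)
    Xc = X - X.mean(axis=0)                       # center before SVD
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # (n_samples, n_components)

rng = np.random.default_rng(0)
nominal   = rng.normal(0.0, 0.1, size=(50, 8))    # tight nominal cluster
anomalous = rng.normal(3.0, 0.1, size=(5, 8))     # shifted anomalous cluster
proj = pca_project(np.vstack([nominal, anomalous]))
# The two clusters remain well separated in the 2-D projection
```

t-SNE behaves analogously but is nonlinear; scikit-learn's `TSNE` is the usual choice for that variant.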
Case Study - Exploring Time Series Foundation Models for Compressor PHM Analysis
In this study, we delved into different time series tasks for Prognostics and Health Management (PHM) analysis using the latent space representations (embeddings) of compressor data produced by a time series foundation model. For any given piece of equipment, we analyzed multiple sensor measurements to determine the system's state at any point in time, with the goal of performing anomaly detection, fault isolation, and remaining useful life (RUL) prediction.
Forecasting
Forecasting is crucial for predicting the future state of equipment based on historical data. We applied auto-regressive forecasting to predict imminent failures and long-term forecasting to estimate an asset's lifecycle.
We fine-tuned our time series foundation model on compressor data (19 sensors) and noted an improvement in forecasting accuracy compared to the zero-shot approach.
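The auto-regressive scheme mentioned above can be sketched generically: a one-step forecaster is rolled forward, feeding each prediction back into its context window. The persistence predictor here is a toy stand-in for the fine-tuned TSFM, which would take its place in practice.

```python
import numpy as np

def autoregressive_forecast(history, one_step_model, horizon):
    """Roll a one-step forecaster forward; each prediction re-enters the context."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        y_hat = one_step_model(np.asarray(window))
        preds.append(y_hat)
        window = window[1:] + [y_hat]   # slide the context window forward
    return np.array(preds)

# Toy stand-in for a trained model: persistence of the last observation
last_value = lambda w: float(w[-1])
forecast = autoregressive_forecast([1.0, 2.0, 3.0], last_value, horizon=4)
```

Long-horizon forecasts produced this way accumulate the model's own errors, which is one reason fine-tuning on in-domain sensor data matters.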
Figure 2: A workflow for time series forecasting using TSFM (left) - Forecasting results on compressor sensors using a fine-tuned TSFM (right)
Anomaly Detection
Anomalies are rare abnormal patterns that deviate from normal behavior. By comparing the embedding of observed and forecasted measurements, we can identify anomalies using techniques such as cosine similarity, distance metrics, and unsupervised clustering. Fine-tuning our Time Series Foundation Model (TSFM) on nominal data enhances anomaly detection.
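The cosine-similarity comparison described above can be sketched as follows. Observed and forecasted window embeddings are compared row by row, and windows whose similarity falls below a threshold are flagged; the threshold value and toy 2-D embeddings are illustrative assumptions.

```python
import numpy as np

def detect_anomalies(observed_emb, forecast_emb, threshold=0.8):
    """Flag windows whose observed embedding drifts from the forecasted one.

    Cosine similarity below `threshold` marks the window as anomalous.
    """
    obs = np.asarray(observed_emb, float)
    fc = np.asarray(forecast_emb, float)
    sims = np.sum(obs * fc, axis=1) / (
        np.linalg.norm(obs, axis=1) * np.linalg.norm(fc, axis=1) + 1e-12)
    return sims < threshold

# Three windows: the first two match the forecast, the last deviates sharply
obs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
fc  = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
flags = detect_anomalies(obs, fc)
```

Distance metrics or unsupervised clustering over the same embeddings follow the same pattern, differing only in the scoring function.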
Figure 3: A workflow for anomaly detection using TSFM embedding (left) - Anomaly Detection results on compressor sensors using a TSFM (right)
Fault Diagnosis
After detecting an anomaly, we pinpoint its cause using a Diagnosis matrix (D-matrix). By comparing embeddings from individual sensors, we isolate the anomaly.
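A minimal sketch of D-matrix fault isolation: each row encodes which sensors a fault mode is expected to disturb, and the observed per-sensor anomaly flags are matched to the closest row. The matrix entries and fault names here are hypothetical, not taken from the compressor study.

```python
import numpy as np

# Hypothetical diagnosis matrix: rows = fault modes, columns = sensors.
# A 1 means the fault is expected to disturb that sensor's embedding.
d_matrix = np.array([
    [1, 1, 0, 0],   # fault expected on sensors 1-2
    [0, 0, 1, 1],   # fault expected on sensors 3-4
    [1, 0, 0, 1],   # fault expected on sensors 1 and 4
])
fault_names = ["valve leak", "bearing wear", "seal failure"]  # illustrative labels

def isolate_fault(sensor_flags, d_matrix, fault_names):
    """Pick the fault whose signature best matches the per-sensor anomaly flags."""
    flags = np.asarray(sensor_flags)
    mismatches = np.abs(d_matrix - flags).sum(axis=1)  # Hamming distance per row
    return fault_names[int(np.argmin(mismatches))]

# Sensors 3 and 4 flagged anomalous -> matches the second fault signature
fault = isolate_fault([0, 0, 1, 1], d_matrix, fault_names)
```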
Figure 4: Diagnosis matrix (D-matrix) for fault isolation (top) - Fault diagnosis results on compressor using a TSFM (bottom)
Remaining Useful Life (RUL) Prediction
Predicting the Remaining Useful Life (RUL) involves monitoring the gradual degradation of equipment. We use embeddings to measure the separation in latent space and forecast sensor values against predefined thresholds. The RUL is calculated as the difference between the present time and the predicted time when the asset reaches its end-of-life (EOL).
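The threshold-crossing step of the RUL calculation can be sketched as follows: a forecasted health indicator is scanned for its first crossing of the EOL threshold, and the RUL is the elapsed time until that crossing. The linear degradation forecast and threshold value are illustrative.

```python
import numpy as np

def estimate_rul(forecast, threshold, dt=1.0):
    """RUL = time until the forecasted indicator first reaches the EOL threshold.

    Returns None if no crossing occurs within the forecast horizon.
    """
    fc = np.asarray(forecast, float)
    crossed = np.nonzero(fc >= threshold)[0]
    return float(crossed[0] + 1) * dt if crossed.size else None

# Toy degradation forecast: indicator drifts linearly toward an EOL threshold of 1.0
health_forecast = 0.1 * np.arange(1, 20)      # 0.1, 0.2, ..., 1.9
rul = estimate_rul(health_forecast, threshold=1.0)
```

In practice the indicator would come from embedding separation in latent space or forecasted sensor values, as described above, rather than a synthetic ramp.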
Figure 5: A workflow for estimating remaining useful life (RUL) for an asset (left) - RUL Prediction results on compressors using a TSFM (right)
Lessons Learnt
We explored publicly available time series foundation models for asset modeling across multiple downstream tasks. Our studies indicate that a model trained on public datasets may not achieve good zero-shot performance on our assets' data; however, results may improve after fine-tuning. To train a foundation model that generalizes well across various assets or equipment, the model may need the capability to handle multivariate time series sequences and possibly to incorporate the governing physical equations. Additionally, training such a model may require an enormous amount of time series data, from a few million to billions of observations, to learn the underlying patterns.
We believe that this technology is still in its nascent stages and has significant potential to evolve in the coming months. Our preliminary results on fine-tuning, adapting, and applying these models to our own data show promise, but we are also acutely aware of the current limitations. We are actively working on addressing these challenges to enhance the reliability and generalizability of the models for production asset modeling.