Overview of Subsurface Foundation Models

Introduction

Over the past decade, machine learning (ML) has seen rapid advancements, particularly models utilizing convolutional neural networks (CNNs), generative adversarial networks (GANs), and transformers. Transformers, driven by self-attention mechanisms and self-supervised learning, have excelled in natural language processing (NLP) and have expanded into fields like computer vision (CV) and speech processing. Their success stems from their ability to capture long-range dependencies in data, scale to complex models, and leverage large datasets. These models demonstrate emergent behaviors like in-context learning, which allows them to address new tasks during inference.
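To make the idea of long-range dependency capture concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration only, not any production model: the learned query, key, and value projections of a real transformer are omitted, and the toy `tokens` sequence is an assumption made up for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x is a list of token vectors (lists of floats). Real transformers apply
    learned query/key/value projections; they are omitted here for brevity.
    Returns (outputs, weights), where weights[i][j] is how much token i
    attends to token j.
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights

# Toy sequence (hypothetical): the first and last tokens are similar,
# the four tokens between them are different.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
```

In this toy sequence, the first token attends to the last token as strongly as it attends to itself, and more strongly than to its immediate neighbor. The weight depends on content similarity rather than on distance within the sequence.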

Transformers have changed how machine learning handles large-scale data across domains, particularly for tasks that involve long-range dependencies and complex patterns. Their capacity for transfer learning also allows pretrained transformer models to be fine-tuned for specific tasks, which reduces the need for extensive task-specific training, accelerates deployment, and improves overall resource efficiency. This combination of computational efficiency and adaptability makes transformers a powerful tool in modern AI/ML pipelines, including those for subsurface assessments.

SLB closely tracks AI/ML innovations, recognizing their potential for subsurface assessment. Transformers offer opportunities for improved data fusion, automation, and tackling previously challenging problems. However, adapting popular AI/ML models to subsurface tasks presents challenges due to the unique nature of seismic and wellbore log data, which are sequential, multimodal, and often noisy. These models need to capture both local and contextual dependencies while being robust to noise.

Not all problems require the long-range dependency capture that transformers offer. For many wellbore analysis tasks, for instance, capturing a reasonably wide local context suffices. While convolutional networks capture less context, they offer other advantages, such as better accuracy with less data. Provided a network is sufficiently deep, either transformer or convolutional processing units can handle task complexity and generalize. Our seismic foundation model and log foundation model are both deep networks: the former relies on transformers, while the latter capitalizes on the advantages of convolutional elements.
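How wide a "reasonably wide local context" can get in a deep convolutional network can be worked out with the standard receptive-field formula for stacked stride-1 1-D convolutions, RF = 1 + Σ (kernel_size − 1) × dilation. The sketch below applies it to two hypothetical layer stacks. The layer configurations are assumptions chosen for illustration and are not the architecture of the log foundation model.

```python
def receptive_field(layers):
    """Receptive field (in samples) of stacked stride-1 1-D convolutions.

    layers is a list of (kernel_size, dilation) pairs, one per layer.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

# Hypothetical stacks: eight plain layers versus eight layers whose
# dilation doubles at every level (1, 2, 4, ..., 128).
plain = [(3, 1)] * 8
dilated = [(3, 2 ** i) for i in range(8)]

print(receptive_field(plain))    # 17
print(receptive_field(dilated))  # 511
```

With the same depth and kernel size, the dilated stack sees 511 samples instead of 17, which shows how a deep convolutional design can cover a wide depth window without attention.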

SLB is exploring Foundation Models (FM), Large Multi-Modal Models (LMM), and Agentic AI to optimize seismic and wellbore log analysis, acknowledging the complexity and specific requirements of subsurface workflows. While transformers are leading AI/ML development today, the field is rapidly evolving, and new innovations are likely to emerge soon.

Applicability of Foundation Models in Subsurface Assessments Using Seismic Data

Transformer models enhance subsurface seismic data assessment by improving the ability to analyze large-scale data, handle multiple data modalities, tolerate noise, and automate complex tasks, leading to more accurate and efficient geological insights. Together, these capabilities have made seismic interpretation tasks considerably more efficient. ML approaches have been used successfully in the past, but models that are not trained on large, diverse datasets may generalize poorly, leading to a swarm of fit-for-single-purpose models. Transformer-based models address these challenges by providing simple approaches for customizing or fine-tuning pre-trained models. Such pre-trained models provide a rich feature set that implicitly captures multi-scale geological features as well as the geological nuances within seismic data, and hence provide an excellent starting point for further interpretation ML modeling.

We have used publicly available data for experimentation and training of our seismic FM. Our models generalize very well: they can quickly adapt to new seismic data that they were not trained on. Even with minimal input from subject matter experts, the model can detect features at all scales, such as shallow hazards, faults, intrusions (salt or volcanics), channels, and other stratigraphic features. While there are various applications for these models, we share some of our initial efforts here.

  1. Multiple Geological Feature Identification: Our seismic FM generalizes very well and can predict key geological features on datasets it has never seen before. The design of the approach lends itself to being more “geologically aware”, which improves both user interaction with the model and the ability to fine-tune the model using geological inputs. The former enhances usability by integrating natural language interactions, while the latter helps experts create a corpus for use in model customization.

    For a deeper dive into the implementation of geological feature identification, please refer to the following blog: Geofeature Identification

  2. Geofeature Segmentation/Extraction: Once the features of interest are identified, our seismic FM can be used to segment and isolate them as geobodies or other interpretation artefacts (horizons, volumes, faults, etc.), as the requirements or workflow dictate. SLB’s seismic FM is an example of a model built using self-learning methods, and the remarkable set of geofeatures it can identify is evidence of the expressive features it extracts from the data. Such pretrained models can be used for multi-class semantic segmentation of geobodies, although a few labeled samples are needed. Our implementation gives users additional control over predictions based on the quality of the data at hand and the complexity of the subsurface environment and structure. In the current implementation, to handle higher complexity (generally correlated with high geofeature variability), we let the expert guide the model with labels. In the future, these requirements could be minimal to none.

    For a deeper dive into the implementation of geological feature segmentation and extraction, please refer to the following blog: Geofeature Segmentation
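The idea that a pretrained model needs only a few labeled samples can be illustrated with a nearest-centroid classifier over frozen embeddings. This is a generic few-shot sketch, not the method used in the seismic FM. The 3-D embedding values, the class names, and the helper functions `centroids` and `classify` are all hypothetical; a real encoder would produce much higher-dimensional features per voxel or patch.

```python
import math

def centroids(labeled):
    """Mean embedding per class from a dict {label: [embedding, ...]}."""
    out = {}
    for label, vectors in labeled.items():
        dim = len(vectors[0])
        out[label] = [sum(v[c] for v in vectors) / len(vectors) for c in range(dim)]
    return out

def classify(embedding, class_centroids):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(class_centroids,
               key=lambda label: math.dist(embedding, class_centroids[label]))

# Hypothetical embeddings from a frozen, pretrained encoder:
# two expert-labeled samples per geofeature class.
labeled = {
    "fault":   [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "salt":    [[0.1, 0.9, 0.2], [0.0, 0.8, 0.1]],
    "channel": [[0.1, 0.1, 0.9], [0.2, 0.0, 0.8]],
}
class_centroids = centroids(labeled)

# Unlabeled embeddings to classify (also hypothetical).
unlabeled = [[0.85, 0.15, 0.05], [0.05, 0.2, 0.95]]
predictions = [classify(e, class_centroids) for e in unlabeled]
print(predictions)  # ['fault', 'channel']
```

The encoder is never retrained here. Only the class centroids depend on the labels, which is why two samples per class are enough in this toy setting. Harder cases would typically call for a small trainable head on top of the same frozen features.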

Applicability of Foundation Models in Subsurface Assessments Using Well Log Data

Deep learning models have been increasingly applied to well log analysis, offering advanced methods for interpreting complex subsurface data. The sequential, noisy, and non-stationary characteristics of well data make deep learning a well-suited approach for developing prediction models. Using publicly available data, we have formulated our Well Log Foundation Model to handle these log data characteristics and generalize across various tasks.

A key feature of our log FM implementation is its ability to provide direct inference of data anomalies (bad holes, outliers, unusual geological conditions, etc.), reservoir properties, lithologies, and missing data imputations. Additionally, with minimal inputs, the log FM can automatically extract features and perform formation evaluations.
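To show what anomaly flags and imputed samples look like on a log curve, the sketch below uses two classical baselines: a robust z-score based on the median and the median absolute deviation (MAD), and linear interpolation across gaps. These are deliberately simple stand-ins and are not how the log FM works. The gamma-ray values, the 3.5 threshold, and the function names are assumptions for illustration.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Flag samples whose robust z-score (median/MAD based) exceeds threshold.

    None entries (missing samples) are never flagged.
    """
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    mad = statistics.median([abs(v - med) for v in present])
    if mad == 0:
        return [False] * len(values)
    return [v is not None and 0.6745 * abs(v - med) / mad > threshold
            for v in values]

def fill_gaps(values):
    """Linearly interpolate interior runs of None; edge gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled

# Hypothetical gamma-ray samples (API units) with a two-sample gap and a spike.
gamma_ray = [60.0, 62.0, None, None, 68.0, 300.0, 66.0, 64.0]
flags = flag_outliers(gamma_ray)
filled = fill_gaps(gamma_ray)
```

Here only the 300.0 spike is flagged, and the gap is filled with values on the straight line between its neighbors. A learned model is expected to do better than this baseline where curves are non-stationary or where gaps span lithology changes, which interpolation cannot reconstruct.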

By leveraging the power of deep learning, well log analysis has become more automated, accurate, and efficient, leading to better insights into subsurface conditions and more informed decision-making in the oil and gas industry. While there are various applications for these models, we share some of our initial efforts here.

  1. Petrophysical Assistant: The log FM implementation provides useful insights into the data through direct inference of the basic reservoir properties that a user may need in an acreage screening assessment. The log FM acts as an efficient assistant, providing quick results for the thousands of wells needed to screen acreage at basin to multi-basin scale. Beyond speeding up the screening workflow, our implementation can also help generate multiple what-if scenarios in near real time, resulting in better confidence in overall assessments.

    For a deeper dive into the implementation, please refer to the following blogs: LogFM implementation and LogFM - Petrophysical Assistant

Path Forward

As Foundation Models advance, their role in subsurface assessments will expand, driving innovation, reducing costs, and improving the accuracy and efficiency of oil & gas exploration and production. In the short term, the natural progression is to expand the current models’ capabilities to handle more diverse geological settings and depositional environments and to provide direct inferences.

  1. Cross-Disciplinary Integration: Foundation models could bridge different domains within oil and gas, such as geology, geophysics, and engineering, by creating integrated workflows that account for multiple disciplines. This integration would enable more holistic assessments, improving the overall understanding of reservoir behavior and optimizing resource recovery. For instance, the log FM could provide key stratigraphic sequences to the seismic FM for extraction, and the seismic FM could then be used to QC marker propagation between wells, thus enriching the outputs from each model.
  2. Hyper-Personalization: Future model implementations would power personalization for geoscientists and engineers, providing tailored recommendations, visualizations, and predictions based on specific project needs. These models would assist in interpreting complex data and automating routine tasks, allowing experts to focus on higher-level decision-making.
  3. Self-Learning Systems: Longer term, future foundation models could evolve into self-learning systems that continuously update and improve their performance as new data becomes available. This continuous learning loop would enhance the accuracy of subsurface assessments over time, leading to smarter and more efficient operations.