Augmenting Drilling Risk Management with Generative AI

Oilfield operators and oilfield service providers currently manage well construction risks by analysing historical daily drilling reports (DDRs) to identify significant events, their relevance to current operations, and appropriate risk control measures. However, these reports often lack sufficient detail, and near misses are frequently underreported, which increases the likelihood of high-impact events that are challenging to predict and mitigate. This post discusses how an on-premises implementation of generative AI workflows enhances the identification, classification, characterization, and summarization of past drilling events from historical DDR activity descriptions.

Managing Well Construction Risks with Generative AI

Drilling operations are inherently complex, involving significant health, safety, and financial risks due to harsh subsurface conditions and complex borehole trajectories. Traditional risk management methods like Hazard Assessment and Risk Control (HARC) and Failure Mode and Effects Analysis (FMEA) aim to mitigate these risks, but their effectiveness depends on the quality of the data they rely on. Generative AI offers a way to improve these traditional approaches by providing automated analysis and enhanced detection of hidden risks.

In this post, we describe how generative AI helps with:

Identifying and categorizing hidden non-productive time (NPT)
Characterizing near-miss events
Generating event summaries

Finding and characterizing near-miss events

Near-miss events, such as tight spots, indicate a potential increase in high-impact risks, such as stuck pipe. However, these minor events are often resolved quickly and are therefore underreported. Because they are frequently discussed in the DDR activity descriptions, tuning a Large Language Model (LLM) to process all available data in the DDR allows these near-miss events to be identified and classified, providing valuable insight into potential risks.

Table 1: A DDR snapshot describing the activities being performed, along with undesired events not reported as non-productive time (NPT). Such descriptions serve as a source for generative AI. Tuning an LLM to recognize and categorize events like tight spots, stuck pipe, and lost circulation provides a more comprehensive risk assessment.

Recovering these under-reported events has a significant impact on the HARC workflow, which is based on the cheese model shown in Figure 1.

Figure 1: Emmental cheese model of an oversimplified stuck pipe event causation, where hazards represented on the left (tight spots, …) are mitigated by prevention measures (the cheese), resulting in infrequent materialized stuck pipe events. (source: )

A risk assessment based solely on the materialized high-impact events underestimates the risk, especially when the underlying hazards are under-reported.

This is the area where generative AI provides an opportunity:

We tune a Large Language Model (LLM) to recognize and categorize events (Tight Spot, Stuck Pipe, Loss Circulation, …)
For each category, we then use zero-shot prompting to extract the key characteristics of the event (depth of event, rates/volumes, …)
Finally, we tune an LLM to summarize lengthy drilling events to further ease the HARC/FMEA creation and evaluation processes

Figure 2: Basic activity workflow

Categorizing Drilling Events

The first step in characterizing near-miss events is categorizing them. Using a parameter-efficient fine-tuning (PEFT) approach, the pre-trained LLM can be specialized to assign drilling event categories. This process improves accuracy in recognizing and categorizing drilling events. [Link to DDR Use Case blog]

Figure 3: Instruction fine-tuning workflow: the training dataset is converted into an instruction template in the form of prompt-completion pairs.

Figure 4: PEFT workflow for DDR event identification.
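As an illustration of the conversion described in Figure 3, here is a minimal sketch of how labelled activity descriptions could be turned into prompt-completion pairs. The instruction wording, the section markers, and the `Example` type are assumptions made for this sketch; the template actually used in our workflow is not reproduced in this post.

```python
from dataclasses import dataclass

# Categories named in this post; the real label set is longer (assumption).
CATEGORIES = ["Tight Spot", "Stuck Pipe", "Loss Circulation"]

# Hypothetical instruction text, not the template used in the actual workflow.
INSTRUCTION = (
    "Classify the drilling activity description into one of the following "
    "categories: {categories}."
)


@dataclass
class Example:
    description: str  # free-text DDR activity description
    category: str     # label assigned to the description


def to_prompt_completion(example: Example) -> dict:
    """Convert one labelled example into a prompt/completion pair."""
    if example.category not in CATEGORIES:
        raise ValueError(f"Unknown category: {example.category!r}")
    prompt = (
        "### Instruction:\n"
        + INSTRUCTION.format(categories=", ".join(CATEGORIES))
        + "\n\n### Input:\n"
        + example.description.strip()
        + "\n\n### Response:\n"
    )
    # The completion is the text the model is trained to generate.
    return {"prompt": prompt, "completion": example.category}
```

Each pair can then be fed to whichever fine-tuning framework is in use; the prompt ends right where the model is expected to start generating the category.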

Tuning makes the models much better at categorizing drilling events: precision rises from 0.18 for the pre-trained Llama2 13B to 0.76 after fine-tuning, and reaches 0.85 with a fine-tuned Mistral 7B.

Model	Precision (all classes weighted equally)	Recall (all classes weighted equally)
Llama2 13B	0.18	0.07
Fine Tuned Llama2 13B	0.76	0.72
Fine Tuned Mistral 7B	0.85	0.90

Performance metrics of pre-trained model vs tuned model for the drilling event classification task, applied on testing data.
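For readers who want to reproduce this kind of metric, the sketch below computes precision and recall with every class weighted equally, which we assume corresponds to what is usually called macro averaging. The function name and the toy labels are illustrative only.

```python
from collections import Counter


def macro_precision_recall(y_true, y_pred):
    """Precision and recall averaged over classes, each class weighted equally."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gets a false positive
            fn[truth] += 1  # true class gets a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # A class that is never predicted (or never present) scores 0.
        precisions.append(tp[label] / predicted if predicted else 0.0)
        recalls.append(tp[label] / actual if actual else 0.0)

    return sum(precisions) / len(labels), sum(recalls) / len(labels)


truth = ["Tight Spot", "Tight Spot", "Stuck Pipe", "Stuck Pipe"]
predicted = ["Tight Spot", "Stuck Pipe", "Stuck Pipe", "Stuck Pipe"]
precision, recall = macro_precision_recall(truth, predicted)
```

Because every class counts the same, a rare category such as a stuck pipe influences the score as much as a frequent one, which suits a task where the rare events are the ones that matter.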

Characterizing drilling events

Once events are categorized, the next step is to extract the failure mode for each event, such as fluid loss rates and depths. This step is crucial because even minor events can be indicative of more significant problems. Zero-shot prompting is applied to extract information, followed by additional checks to detect and remove hallucinations.

Figure 5: Example of activity description referring to a tight spot. The tight spot is semantically related to the 5280-5283 FT interval. A regular expression approach would not know the difference and would also extract the 5270-5300 FT interval.
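The limitation mentioned in the caption can be shown with a small sketch. The activity description below is hypothetical (it only mimics the two intervals quoted in Figure 5), and the pattern is one plausible regular expression rather than one we actually used.

```python
import re

# Matches depth intervals such as "5280-5283 FT".
DEPTH_INTERVAL = re.compile(r"(\d+)\s*-\s*(\d+)\s*FT", re.IGNORECASE)


def extract_intervals(text: str) -> list:
    """Return every (top, bottom) depth interval found in the text."""
    return [(int(top), int(bottom)) for top, bottom in DEPTH_INTERVAL.findall(text)]


# Hypothetical activity description, written for this example only.
sample = (
    "WASHED AND REAMED 5270-5300 FT. "
    "OBSERVED TIGHT SPOT AT 5280-5283 FT, WORKED STRING FREE."
)
intervals = extract_intervals(sample)
# intervals == [(5270, 5300), (5280, 5283)]
```

The pattern returns both intervals and has no way of telling which one the tight spot refers to, which is the semantic link a language model can pick up.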

Here, a zero-shot prompting approach is applied to extract information about the tight spot, followed by drilling engineering logic to detect and remove hallucinations. The main risk associated with using a large language model is that it can identify tight spots and hazards where there were none. While detecting hallucinations using purely statistical models is very difficult, applying a catalog of domain guardrails allows likely hallucinations to be identified. For example, depending on the event category, we check that tight spots or lost circulation events are located in open hole intervals, and disregard the others as hallucinations.
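One such guardrail could look like the sketch below. It is a simplified illustration under stated assumptions: the `ExtractedEvent` type, the set of open-hole-only categories, and the use of the last casing shoe depth and current hole depth to delimit the open hole are choices made for this example, not a description of our full guardrail catalog.

```python
from dataclasses import dataclass


@dataclass
class ExtractedEvent:
    category: str
    top_depth_ft: float
    bottom_depth_ft: float


# Assumption for this sketch: these categories can only occur in open hole.
OPEN_HOLE_ONLY = {"Tight Spot", "Loss Circulation"}


def is_plausible(event: ExtractedEvent,
                 casing_shoe_depth_ft: float,
                 hole_depth_ft: float) -> bool:
    """Return False when the extracted event is a likely hallucination."""
    # An interval whose top is deeper than its bottom is inconsistent.
    if event.top_depth_ft > event.bottom_depth_ft:
        return False
    if event.category in OPEN_HOLE_ONLY:
        # Open hole spans from the last casing shoe to the current hole depth.
        return (casing_shoe_depth_ft <= event.top_depth_ft
                and event.bottom_depth_ft <= hole_depth_ft)
    return True


events = [
    ExtractedEvent("Tight Spot", 5280, 5283),  # inside the open hole
    ExtractedEvent("Tight Spot", 3000, 3010),  # inside casing: rejected
]
kept = [e for e in events if is_plausible(e, casing_shoe_depth_ft=4000, hole_depth_ft=6000)]
```

Rules of this kind are cheap to evaluate and easy for a drilling engineer to audit, which is why they are a useful complement to the model output.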

Figure 6: Zero-shot prompt template used to extract tight spot information from the activity description (sample) into a predefined format (json_schema)
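A template of this kind can be sketched as follows. The placeholder names `sample` and `json_schema` come from the caption above; the prompt wording, the schema fields, and the helper functions are assumptions made for this example, and the call to the model itself is left out.

```python
import json

# Hypothetical schema; the fields of the schema shown in Figure 6 may differ.
TIGHT_SPOT_SCHEMA = {
    "type": "object",
    "properties": {
        "top_depth_ft": {"type": "number"},
        "bottom_depth_ft": {"type": "number"},
    },
    "required": ["top_depth_ft", "bottom_depth_ft"],
}

# Hypothetical wording; no examples are given to the model (zero-shot).
PROMPT_TEMPLATE = (
    "Extract the tight spot information from the activity description below.\n"
    "Answer only with a JSON object that follows this schema:\n"
    "{json_schema}\n\n"
    "Activity description:\n{sample}\n"
)


def build_prompt(sample: str) -> str:
    """Fill the template with the schema and one activity description."""
    return PROMPT_TEMPLATE.format(
        json_schema=json.dumps(TIGHT_SPOT_SCHEMA, indent=2),
        sample=sample,
    )


def parse_response(response: str):
    """Return the parsed answer, or None if it does not match the schema."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key in TIGHT_SPOT_SCHEMA["required"]:
        value = data.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
    return data
```

Rejecting any answer that does not parse or lacks the required numeric fields is a first, purely structural filter before the domain guardrails are applied.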

Applying zero-shot prompts to a historical data set of 4,000 wells revealed approximately 17,000 previously unrecognized tight spots. This approach does not require prior training on specific data, demonstrating the versatility of generative AI in risk management.

Figure 7: Examples of the drilling event characteristics extracted using a zero-shot prompting approach

Summarizing drilling events

Summarizing drilling activities is critical for well engineers to understand the context of events quickly. However, the text to analyze for a single event can be extremely long, as shown in Figure 8, where the length of the drilling event descriptions is on par with that of literary classics.

Figure 8: Length of the activity descriptions (character count) for the sections, with the character counts of famous novels as a reference point

Here, we applied an approach similar to the one we took for categorizing drilling events: we specialized an LLM in summarizing drilling events.

To do so, we used pairs of {Activity Descriptions} – {Daily Summary}, {Activity Descriptions} – {Section Summary}, …

Figure 9: Parameter Efficient Fine Tuning (PEFT) applied for tuning a model for drilling activities summarization

The tuned model outperforms the pre-trained model on the ROUGE metrics computed on the test data.

Figure 10: Precision & Recall ROUGE1 metrics of a summarization prompt using Mistral7B-instruct-v0.2

Figure 11: Precision & Recall ROUGE1 metrics of a summarization prompt using a tuned Mistral7B-instruct-v0.2

The ROUGE metrics show that tuning improves the summarization capabilities of the model. The generated summaries also use the same syntax as the actual summaries, as in the example below.

Actual Summary	continue performance drilling 16" vertical hole section to section total depth @ 10554 feet. circulate to clean hole. spot 500 barrel of lubricated pill on bottom. drop gyro, pull out of hole 16" performance bottom hole assembly to 175
Tuned Model Summary	continue perform drilling 16'' section to 10554 feet. Circulate and clean hole before pullout of hole. spot lube pill. pull out of hole 16'' bottom hole assembly

Figure 12: Actual summary vs predicted summary
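For reference, ROUGE-1 precision and recall compare the unigrams of the generated summary with those of the actual summary. The sketch below is a simplified version written for this post, with a basic lowercase alphanumeric tokenizer and no stemming; these are assumptions, and dedicated ROUGE packages may tokenize differently and therefore give slightly different numbers.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1(reference: str, candidate: str) -> tuple:
    """Return (precision, recall) of the unigram overlap."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    return precision, recall


precision, recall = rouge1("circulate to clean hole", "Circulate and clean hole")
# precision == 0.75 and recall == 0.75: three of the four unigrams match
```

Precision penalizes words in the generated summary that are absent from the actual one, while recall penalizes content of the actual summary that the model left out.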

Feeding the outputs into a Drilling Planning Risk Management system

Generative AI workflows can identify and characterize hidden events and near misses in historical data. These insights can be integrated into Drilling Planning tools, allowing well engineers to perform HARC/FMEA workflows more efficiently when planning well construction.

Figure 13: Offset Well Analysis Stick chart in SLB Drilling Planning Platform DrillPlan

Conclusion

Generative AI offers significant opportunities for augmenting risk management in well construction. By identifying hidden risks, categorizing drilling events, and summarizing lengthy activity descriptions, these AI techniques can be instrumental in improving the quality of risk management workflows. Using locally hosted models on cost-efficient infrastructure ensures data privacy and residency while supporting the well construction risk management process.

References

Alexey Ruzhnikov et al., "Development and Application of Digital Solutions for Automatic Hazard Identification During Well Planning Stage," IADC/SPE Asia Pacific Drilling Technology Conference, 2022.

Biography

Valerian Guillot is a Data Science Technical Lead at the Montpellier Technology Center, SLB. Throughout his 15 years in the oil and gas service industry, he has worked on geostatistics, petrophysics, AI applied to wellbore interpretation, and drilling equipment predictive maintenance. He currently focuses on well construction performance and risk management.

Alexey Ruzhnikov holds a Ph.D. in Petroleum Engineering and currently serves as the Principal Engineering Manager at Schlumberger. His areas of interest encompass Engineering Management Systems, Risk Management, Operational Performance, and the study of lost circulation. With 22 years of diverse experience in the field, Alexey has authored 57 publications across various peer-reviewed journals. Additionally, he volunteers as an expert on the Standards Upstream Committee for API, the Well Committee for IOGP, and as a Technical Programme Committee Member for SPE.

Lee Ming Xiang is a Subsurface Domain Data Scientist at KL Innovation Factori, SLB. Throughout her professional career, she has focused on both geophysics and data science for the oil and gas industry. She is experienced in seismic interpretation, seismic inversion, and seismic processing for several geological basins. In her role as a domain data scientist, she actively contributes her skills in both the production and subsurface domains. She has worked on more than 15 data science projects and published 6 papers in 3 years. She is dedicated to advancing Natural Language Processing (NLP), Machine Learning, and automation in the oil and gas industry.