Data Workspace Conversational Insights: Revolutionizing Data Discovery and Insight Extraction
Data Workspace conversational insights
Today, Data Workspace features an advanced conversational system, powered by Generative AI (Artificial Intelligence), designed to help users discover, search, and summarize domain-specific information across multiple data sources and modalities. This Enterprise Semantic Search capability, built on top of OSDU (Open Subsurface Data Universe), provides an integrated platform for efficient data management and utilization.
Leveraging state-of-the-art Generative AI and machine learning techniques, our system offers intuitive, context-aware search functionality that lets users efficiently navigate the extensive datasets stored within OSDU. By automating discovery, search, and summarization, our solution gives users quick access to relevant information, streamlines their workflows, and supports well-informed decisions. This approach enhances the value and usability of users' data assets and keeps the experience seamless and productive.
Next generation capabilities
- Cross-modal search capability
The system seamlessly integrates information from various data sources and modalities, whether structured or unstructured, enabling comprehensive data exploration and insight extraction. Generative AI technology is used to interpret the context behind the user query and efficiently route searches across multiple data sources. The ability of Generative AI to translate natural language queries into database query languages such as SQL, SPARQL, and GraphQL makes it easy to adapt search workflows to diverse database schemas. In addition, seamless integration of the conversational agent with Data Workspace's built-in custom viewers (Map, Well log, Trajectory, Seismic, Document, and Plot) enables users to discover and visualize retrieved information from different modalities, instilling confidence in the search results.
Figure 1. DW conversational insights manages the complex task of translating users' natural language queries into database queries that traverse the OSDU entity relationships. Retrieved records are summarized and semantically routed to data viewers customized to different entity types (map, log, seismic, document, etc.).
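The routing idea described above can be illustrated with a small sketch. This is not the Data Workspace implementation: a simple keyword match stands in for the Generative AI intent classification, and the routing table, the `Route` type, and the `route_query` function are hypothetical names chosen for illustration. Only the viewer names come from the text.

```python
from dataclasses import dataclass

# Hypothetical routing table: entity type -> (trigger keywords, viewer name).
# The keywords are illustrative assumptions; a real system would rely on an LLM.
ROUTES = {
    "well_log": ({"log", "gamma", "porosity"}, "Well log viewer"),
    "seismic": ({"seismic", "survey"}, "Seismic viewer"),
    "trajectory": ({"trajectory", "deviation"}, "Trajectory viewer"),
    "document": ({"report", "document", "summary"}, "Document viewer"),
}


@dataclass
class Route:
    entity_type: str
    viewer: str


def route_query(query: str) -> Route:
    """Pick the entity type whose keywords overlap most with the query.

    Falls back to a generic well search shown in the map viewer
    when no keyword matches.
    """
    tokens = set(query.lower().replace("?", " ").split())
    best, best_score = None, 0
    for entity_type, (keywords, viewer) in ROUTES.items():
        score = len(tokens & keywords)
        if score > best_score:
            best, best_score = Route(entity_type, viewer), score
    return best or Route("well", "Map viewer")


if __name__ == "__main__":
    print(route_query("Show gamma ray log for well A-1"))
    print(route_query("List wells in block 7"))
```

The point of the sketch is the shape of the decision (query in, entity type and viewer out), not the matching logic itself.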
- Context-aware search
Our AI-driven system understands the context of user queries, providing more accurate and relevant search results than traditional keyword-based searches. The structured search workflows powered by Generative AI enable schema-aware search that traverses the complex entity relationships and schema definitions of data sources such as OSDU, ProSource, and OpenWorks. The arduous task of parsing the complex relationships across data types is managed by the Generative AI workflow, which plans efficient query generation and traversal patterns.
Figure 2. DW conversational insights enables efficient exploratory data analysis with automated visualizations generated over retrieved OSDU data records. The Generative AI workflow enables users to quickly query, transform, and visualize domain data with its inherent relationships.
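To make the notion of planning a traversal across entity relationships concrete, here is a minimal sketch. It assumes a heavily simplified, hypothetical relationship map (not the actual OSDU schema) and uses a plain breadth-first search in place of the Generative AI planner; `RELATIONSHIPS` and `traversal_path` are illustrative names.

```python
from collections import deque

# Simplified, hypothetical parent -> child entity relationships.
# This is an assumption for illustration, not the actual OSDU schema.
RELATIONSHIPS = {
    "Field": ["Well"],
    "Well": ["Wellbore"],
    "Wellbore": ["WellLog", "Trajectory"],
    "WellLog": [],
    "Trajectory": [],
}


def traversal_path(start, target):
    """Breadth-first search for the shortest chain of entity types.

    Returns the list of entity types from start to target,
    or None when the target cannot be reached.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in RELATIONSHIPS.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    # Which entities must a query join to get from a field to its logs?
    print(traversal_path("Field", "WellLog"))
```

A query such as "logs for all wells in a field" would then be generated as a chain of joins or lookups along the returned path.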
Tapping into state-of-the-art retrieval-augmented generation (RAG) and document summarization techniques, Data Workspace users can now rapidly extract valuable insights without manually sifting through large volumes of unstructured data. In addition, the system cites document references for every generated response, so users can trace the original information source for further validation and inspection.
Figure 3. DW conversational insights taps into unstructured document data to summarize information from multiple pages or documents based on the user query. Citations provided along with the extracted information point to the original data source, enabling efficient user validation.
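The retrieval-with-citations pattern can be sketched as follows. This is a toy illustration under stated assumptions: word overlap stands in for the embedding-based retrieval a RAG system would normally use, no text generation step is shown, and the `Chunk` type, the `retrieve_with_citations` function, and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    document: str  # source file name (hypothetical)
    page: int      # page the text was taken from
    text: str


def retrieve_with_citations(query, chunks, top_k=2):
    """Rank chunks by word overlap with the query.

    Returns up to top_k (text, citation) pairs so every retrieved
    passage carries a pointer back to its source document and page.
    """
    query_words = set(query.lower().split())
    scored = []
    for chunk in chunks:
        overlap = len(query_words & set(chunk.text.lower().split()))
        if overlap:
            scored.append((overlap, chunk))
    # Stable sort: ties keep their original document order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(c.text, f"{c.document}, p. {c.page}") for _, c in scored[:top_k]]


if __name__ == "__main__":
    corpus = [
        Chunk("report_a.pdf", 3, "reservoir porosity averages twelve percent"),
        Chunk("report_b.pdf", 7, "drilling started in march"),
        Chunk("report_a.pdf", 9, "porosity decreases with depth in the reservoir"),
    ]
    for text, citation in retrieve_with_citations("reservoir porosity", corpus):
        print(f"{text} [{citation}]")
```

Keeping the citation attached to each passage from retrieval onward is what lets a generated summary point back to its sources.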
- Analytical reasoning
Domain queries are often complex, involving multiple passes over a multitude of data sources accompanied by analytical reasoning. The AI search assistant is powered by agentic LLM (Large Language Model) workflows that enable complex task decomposition and action planning for accurate, efficient, and adaptable search strategies across a multitude of data types. The ability of Generative AI models to translate natural language into executable code makes it possible to transform data for summarization and data visualization tasks.
Figure 4. DW search assistant uses analytical reasoning to route search workflows, perform data transformations and incorporate pre-trained knowledge to highlight key observations in retrieved data and make recommendations.
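Task decomposition and action planning can be pictured as a plan of small tool calls executed in sequence. The sketch below is an assumption-laden illustration rather than the product's workflow: the plan is hard-coded where an LLM would generate it, and the records, tool names, and `run_plan` function are all made up for the example.

```python
from statistics import mean

# Toy in-memory records; field names and values are invented for illustration.
RECORDS = [
    {"well": "A-1", "depth_m": 2100.0},
    {"well": "A-2", "depth_m": 2500.0},
    {"well": "B-1", "depth_m": 1800.0},
]


def search(state, prefix):
    """Keep records whose well name starts with the given prefix."""
    return [r for r in RECORDS if r["well"].startswith(prefix)]


def aggregate(state, field):
    """Average a numeric field over the records from the previous step."""
    return mean(r[field] for r in state)


def summarize(state, label):
    """Render the previous step's numeric result as a sentence."""
    return f"{label}: {state:.1f}"


TOOLS = {"search": search, "aggregate": aggregate, "summarize": summarize}


def run_plan(plan):
    """Execute (tool, argument) steps in order, feeding each result to the next."""
    state = None
    for tool_name, argument in plan:
        state = TOOLS[tool_name](state, argument)
    return state


if __name__ == "__main__":
    # A plan an LLM planner might produce for:
    # "What is the average depth of the A wells?"
    plan = [
        ("search", "A-"),
        ("aggregate", "depth_m"),
        ("summarize", "Mean depth of A wells (m)"),
    ]
    print(run_plan(plan))
```

Separating the plan (data) from the tools (code) is one common way to keep such workflows adaptable: new capabilities are added by registering another tool.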
- Enhanced usability and scalability
The intuitive conversational interface makes it easy for users of all technical backgrounds to interact with the system, promoting widespread adoption and maximizing productivity. Built-in multilingual capabilities enable non-English speakers to interact with the system. Built on the scalable OSDU platform, our system can handle large volumes of data efficiently, ensuring that performance remains robust even as data grows.
Figure 5. DW conversational insights utilizes multilingual capabilities of large language models to enable non-English speakers to perform cross-modal search with summarization of insights.
Domain use cases
Below are a few examples of domain use cases where the Data Workspace conversational insights system can revolutionize how users interact with data and derive value. Through these scenarios, we aim to highlight how friction in user workflows is reduced, leading to more efficient and effective outcomes.
Scenario 1: Streamlined Data Discovery
Persona: Domain User, Petrel/Techlog user
Background: As a Geologist preparing to start interpretation on a specific field, you are faced with the challenge of sifting through vast amounts of diverse data accumulated over the years. This includes well reports, logs, seismic data, and more. Traditionally, this would involve performing complex search operations across multiple databases and manually compiling the necessary information. However, with Generative AI search workflows, this process is significantly streamlined. Instead of tedious manual searches, you can simply execute natural language queries to pinpoint and extract the exact combination of data you need. The assistant assimilates information from various sources, allowing you to create a comprehensive data package ready for use in interpretation tools like Petrel, Techlog, and others. This not only saves time but also helps ensure that nothing is missed.
Scenario 2: Enhancing Data Quality and Completeness
Persona: Data Manager/Consumer
Background: As a Data Manager, ensuring the quality and completeness of data is paramount for supporting various downstream tasks, such as analysis, modelling, and decision-making. However, structured records often face challenges related to data quality, including issues with record completeness and consistency. With Generative AI workflows, these challenges are addressed more effectively. The system identifies gaps in the data, offers recommendations for enhancements, and even taps into unstructured data sources to improve overall data quality. This advanced approach ensures that your data is not only complete but also consistent and reliable, providing a strong foundation for all subsequent tasks.
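As a rough illustration of what identifying gaps in structured records can mean in practice, the sketch below computes a completeness ratio and lists missing attributes per record. The required field names and the `completeness_report` function are hypothetical; the actual system's checks and recommendations are not described in this article.

```python
# Hypothetical set of attributes every well record is expected to carry.
REQUIRED_FIELDS = ("name", "operator", "spud_date")


def completeness_report(records):
    """Return (ratio, gaps) for a list of record dictionaries.

    ratio is the share of required fields that are filled across all records;
    gaps maps each incomplete record id to its list of missing fields.
    """
    gaps = {}
    filled = 0
    for record in records:
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        filled += len(REQUIRED_FIELDS) - len(missing)
        if missing:
            gaps[record["id"]] = missing
    total = len(records) * len(REQUIRED_FIELDS)
    ratio = filled / total if total else 1.0
    return ratio, gaps


if __name__ == "__main__":
    wells = [
        {"id": "w1", "name": "A-1", "operator": "X", "spud_date": "2001-05-01"},
        {"id": "w2", "name": "A-2", "operator": "", "spud_date": None},
    ]
    ratio, gaps = completeness_report(wells)
    print(f"completeness: {ratio:.0%}, gaps: {gaps}")
```

The gap list is the kind of input from which a follow-up step could look for the missing values in unstructured sources such as well reports.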
Scenario 3: Summarizing and Extracting Information from Reports
Persona: Domain User
Background: Reservoir engineers frequently need to extract and summarize technical information from extensive reports to support their interpretation studies. This task can be challenging, given the sheer volume of unstructured data these reports often contain. Generative AI workflows streamline this process by efficiently navigating through large volumes of unstructured data, identifying key information, and summarizing it in a concise manner. The system then transforms the relevant data into structured representations that are easy to consume and integrate into further analysis. This approach not only saves time but also ensures that engineers have access to the most critical information needed for accurate and informed decision-making.
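Turning report text into a structured representation can be sketched with a simple pattern-based extractor. This is only a stand-in: the workflow described above relies on Generative AI rather than regular expressions, and the pattern, the property names, and the `extract_properties` function are assumptions made for illustration.

```python
import re

# Matches phrases such as "porosity of 12.5 %" or "permeability of 150 mD".
# Property names match case-insensitively; the pattern is illustrative only.
PATTERN = re.compile(
    r"(?P<property>porosity|permeability) of "
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>%|mD)",
    re.IGNORECASE,
)


def extract_properties(text):
    """Return one dictionary per property mention found in the text."""
    return [
        {
            "property": match["property"].lower(),
            "value": float(match["value"]),
            "unit": match["unit"],
        }
        for match in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sentence = "The sandstone shows a porosity of 12.5 % and Permeability of 150 mD."
    for row in extract_properties(sentence):
        print(row)
```

Each returned dictionary is a row that can be loaded into a table and combined with structured records for further analysis.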
Scenario 4: Exploratory analysis and visualization of ProSource data
Persona: ProSource data consumer
Background: ProSource is a powerful E&P (Exploration and Production) data management product suite that is widely used by O&G (Oil & Gas) producers to store valuable subsurface data. For data consumers, being able to search this data and visualize patterns within it is essential for effective exploratory data analysis. Using Data Workspace data connectors, ProSource users can port their data onto the OSDU data platform and benefit from the capabilities of the Data Workspace Search Assistant, not only to query database records but also to perform aggregation and data transformation operations with text prompts. Additionally, the system provides automated visualizations that allow users to dynamically explore patterns in the data, enabling them to make informed decisions with greater confidence. This comprehensive approach enhances the user experience, turning ProSource data into actionable insights with ease.