Enterprise-scale artificial intelligence (AI) for industrial operations demands a well-planned strategy, an effective methodology, and efficient infrastructure, systems, and tools. Successfully implementing a single enterprise AI application usually requires developing, deploying, and maintaining a large number of machine learning (ML) models, often thousands of them.

One good example of an enterprise-scale AI application is ML-based virtual metering for production optimization. A virtual meter (VM) is a model that mimics the function of a physical sensor, such as a flow rate sensor, and provides measurements of critical physical quantities in an oil and gas production facility, thereby giving real-time visibility into ongoing operations. In the absence of physical sensors, ML-based virtual metering provides timely and granular visibility into production, distribution, and transportation operations across many industries. ML-based virtual meters are widely used in upstream oil and gas production, petrochemicals manufacturing, and energy management for buildings and campuses.

In this blog, we describe how BHC3™ Production Optimization, an enterprise AI application for upstream oil and gas operations, uses ML-based virtual metering to help operators improve production.

Upstream Oil and Gas Production

Oil and gas production is arguably the most important part of the energy production life cycle, encompassing all activities other than the exploration and development of a field. It can generally be divided into two separate cycles: daily production control and reservoir management. The goal of daily production control is to meet short-term objectives such as production targets or utilization rates of surface facilities; by their nature, these activities operate on short timescales of days to weeks. Reservoir management, on the other hand, consists of producing the available resources in the reservoir efficiently over the long term. In addition to the parameters that affect daily activities, inputs such as recompletion, gas or water injection shutoff, and stimulation are important for this process.

Figure 1. Schematic illustrating basic elements of an oil and gas production system.

Why Upstream Oil and Gas Production Needs Virtual Metering

For both short- and long-term planning, it is very important for the operator to know the production rate of every well. Due to regulations and business needs, the operator needs to know the flow rate of the individual phases (for example oil, gas, and water) produced from each well. Installing and maintaining physical sensors capable of providing real-time flow rate measurements and other quantities, such as downhole pressure, for every single well in a field is very expensive. Moreover, these physical sensors often require significant calibration to maintain accuracy; as a result, they are not available in the majority of oil and gas fields.

Traditionally, the industry has relied on an alternative technique, well testing, as a practical way to measure flow rates. During a well test, the operator diverts the production from a well to a test separator, where the phases are separated and the production rate of each phase is measured individually and accurately. This process is typically done a few times a week or month, depending on the complexity of the field and the cost of conducting well tests. The main limitation of well tests is that they are infrequent and cannot account for production changes between tests. Virtual metering overcomes this limitation while providing the additional advantage of continuous production monitoring.

Figure 2. Schematic illustrating a typical virtual metering use case, where wells are connected to both production and test lines. Only one well is connected to the test line at a time. Infrequent flow rate measurements for each well (for different phases) are available only from the periods when the well is connected to the test line (during well tests).

In the context of upstream oil and gas production, a virtual meter is a model that estimates the flow rate of a single phase for an individual well when the well is not connected to a test separator. There are many possible approaches to virtual metering. One approach is physics-based modeling, where the physical laws governing the dynamics of a given component are used to model the flow rate or other variables of interest from measured values. For example, one could build a physics-based model for the flow rate of a liquid through a valve based on the geometry of the valve and the pressures before and after it. In practice, there are two main challenges with this approach. First, the exact physics governing even simple operations can be very complex. For example, in the presence of multiple phases, accurately modeling the pressure drop across a valve is a complex and computationally expensive task. Second, this approach is hard to scale. Because of the distinct operating conditions of different wells, a separate physical model is needed for an accurate prediction in each case. These models need to be carefully selected and tuned (a process known as history matching), which can be prohibitively time-consuming for a large deployment with hundreds to thousands of wells.
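
As a simplified illustration of the physics-based approach, the sketch below estimates single-phase liquid flow through a valve from the measured pressure drop using a standard orifice-style equation. The discharge coefficient, orifice area, and fluid density are assumed values chosen only for illustration, not field-calibrated parameters.

```python
import math

def valve_liquid_flow_rate(p_upstream_pa, p_downstream_pa,
                           discharge_coeff=0.62, orifice_area_m2=1e-3,
                           density_kg_m3=850.0):
    """Estimate a single-phase liquid volumetric flow rate (m^3/s) through a valve
    using a simplified orifice equation: Q = Cd * A * sqrt(2 * dP / rho).
    All coefficients here are illustrative assumptions, not calibrated values."""
    dp = max(p_upstream_pa - p_downstream_pa, 0.0)
    return discharge_coeff * orifice_area_m2 * math.sqrt(2.0 * dp / density_kg_m3)

# Example: roughly 20 bar pressure drop across the valve
print(valve_liquid_flow_rate(60e5, 40e5))
```

Even this toy model needs a per-valve discharge coefficient, and multiphase flow would require far more elaborate correlations, which is exactly the scalability issue described above.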

One effective approach that addresses the challenges associated with physics-based modeling is supervised machine learning. With supervised machine learning, measurements that are frequently available both inside and outside well tests are used to build features, which in turn are used to predict the value of the quantity of interest (e.g., the flow rate of a given phase). Examples of these measurements include data from bottomhole and wellhead sensors, such as pressure and temperature readings. These features, along with the production rates measured in previous well tests, are used to train virtual metering models. Outside of well tests, the same input features are used to estimate the production rate of each phase when no direct measurement of the target variable is available. In theory, this allows flow rates to be estimated at the same frequency at which the input features are available.
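
A minimal sketch of this supervised formulation is shown below, assuming a time-indexed pandas DataFrame per well with hypothetical column names and a scikit-learn gradient boosted regressor as one of many possible model choices: rows with a well-test measurement serve as labeled training data, and the trained model estimates the rate at every other timestamp.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# df: one row per timestamp for a single well; column names are illustrative.
# 'oil_rate' is populated only during well tests and is NaN otherwise.
FEATURES = ["bottomhole_pressure", "wellhead_pressure",
            "wellhead_temperature", "choke_opening"]

def train_virtual_meter(df: pd.DataFrame) -> GradientBoostingRegressor:
    labeled = df.dropna(subset=["oil_rate"])           # rows from well tests
    model = GradientBoostingRegressor()
    model.fit(labeled[FEATURES], labeled["oil_rate"])  # train on well-test labels
    return model

def estimate_rates(model, df: pd.DataFrame) -> pd.Series:
    # Predict at every timestamp where the input features are available
    return pd.Series(model.predict(df[FEATURES]), index=df.index, name="oil_rate_vm")
```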

Implementing an AI-based Virtual Metering Application

Data and Data Model

The first prerequisite for an effective implementation of a virtual metering solution is access to high-fidelity, up-to-date data from different data sources. This includes all frequent measurements that can be used as input features, as well as the well test results that will serve as target values for training the models. Moreover, for a typical oil and gas field, this data needs to be available for many pads, each containing multiple production wells. The number of wells for a production optimization application can easily be on the order of hundreds to thousands. Organizing and unifying all the data sources necessary for virtual metering at this scale requires an efficient data lake and a data model that enables easy and reliable access to every data source.

BHC3 Production Optimization comes with an extensible data model that can easily be configured and tailored for a new virtual metering deployment. The relationships between all key assets and measurements in an oil and gas field are predefined in this data model. Once the data model is configured, all static and time series data related to a field can be accessed efficiently through it.
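
The actual BHC3 data model is part of the product; the sketch below only illustrates, with hypothetical Python dataclasses, the kind of relationships (pad, well, sensor series, well test) that such a data model needs to capture so that features and labels can be assembled reliably.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple

@dataclass
class WellTest:
    start: datetime
    end: datetime
    oil_rate: float        # measured at the test separator
    gas_rate: float
    water_rate: float

@dataclass
class SensorSeries:
    name: str              # e.g., "bottomhole_pressure"
    unit: str
    samples: List[Tuple[datetime, float]] = field(default_factory=list)

@dataclass
class Well:
    well_id: str
    sensors: List[SensorSeries] = field(default_factory=list)
    well_tests: List[WellTest] = field(default_factory=list)

@dataclass
class Pad:
    pad_id: str
    wells: List[Well] = field(default_factory=list)
```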

ML Model Development

Developing a good machine learning model to serve as a virtual meter is a highly iterative process that typically requires careful exploration of the data sources and experimentation with different feature sets, learning techniques, and ML models. This ad hoc exploration and experimentation is simplified with the help of the BHC3-hosted Jupyter service. The service is tightly integrated with the BHC3 AI Suite, providing an interface to the application data model and simplifying experimentation by bringing together, in one place, multiple data sources that are typically not easily accessible in a standardized way. This enables data scientists and subject matter experts (SMEs) to iterate over ideas efficiently and build an effective solution in a timely manner.
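
Typical experiments in such a notebook include resampling raw sensor data to a common frequency and deriving rolling statistics as candidate features. The sketch below shows this pattern with pandas; the column names and window lengths are assumptions for illustration, not the features used in any specific deployment.

```python
import pandas as pd

def build_candidate_features(raw: pd.DataFrame) -> pd.DataFrame:
    """raw: time-indexed sensor readings for one well (illustrative column names)."""
    hourly = raw.resample("1h").mean()                      # align sensors on a common grid
    feats = pd.DataFrame(index=hourly.index)
    feats["bhp"] = hourly["bottomhole_pressure"]
    feats["whp"] = hourly["wellhead_pressure"]
    feats["dp_bh_wh"] = feats["bhp"] - feats["whp"]         # pressure drop along the well
    feats["bhp_roll_24h"] = feats["bhp"].rolling("24h").mean()
    feats["choke_change"] = hourly["choke_opening"].diff()  # recent operating changes
    return feats.dropna()
```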

Model Deployment and Model Maintenance

In practice, multiple virtual meters are required per well. For example, in a conventional oil field these could be virtual meters for oil, gas, and water rates. Unconventional fields developed using technologies such as Steam-Assisted Gravity Drainage (SAGD) may use virtual meters for emulsion, gas, and steam rates. In a typical deployment, serving these virtual meters for a field with hundreds of wells requires hundreds to thousands of machine learning models. In a recent deployment of BHC3 Production Optimization, over 300 models were deployed to provide predictions for four virtual meters (emulsion, gas, vapor, flashed vapor) for more than 200 wells.

Figure 3. Example virtual meter features (top four) and the label along with predictions (bottom) for a virtual meter that predicts the emulsion production rate of an individual well.

Managing these machine learning models is a challenging task. First, every model needs to be trained and hyperparameter-tuned. Once the models are trained, they need to be productionized to continuously generate predictions and scores, and these outputs need to be persisted and efficiently served to the client through the application front end. Finally, the quality of the predictions needs to be continuously monitored, and the models must be retrained or replaced when performance degrades. Performing these tasks at the scale required for a field-level virtual metering application poses many technical challenges; in fact, most of these challenges are common to all enterprise-scale AI applications. Building tools to address them for a virtual metering application from scratch is a daunting and expensive venture.

BHC3 Production Optimization solves these problems by leveraging BHC3’s model deployment framework. Thanks to the elastic, multi-node architecture of the BHC3 AI Suite, thousands of models can be trained, processed, or tuned simultaneously using asynchronous compute jobs. Moreover, the model deployment framework enables the user to easily configure the logic that defines the data used for training the virtual meters. For example, the user can specify whether, for one type of virtual meter (e.g., gas production rate), the training data should be shared between all wells in a pad or collected individually for each well. This logic is flexible and can be defined by filters based on any part of the application data model.

Figure 4. Different virtual meters can be configured and assigned to serve individual wells or a user-defined group of wells. In the example above, one emulsion virtual meter is trained for each well using only the data available from that well. In contrast, the gas VM is shared by all wells in a pad and trained using all data available from that pad. These groupings are easily achieved using simple filters in the BHC3 AI Suite.
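
The configuration below is not the BHC3 API; it is only a hypothetical sketch of how the per-well and per-pad groupings shown in Figure 4 might be expressed as simple filters over well-test records.

```python
import pandas as pd

# Hypothetical virtual-meter configuration: each entry names the target phase
# and how training data is grouped (one model per well vs. one model per pad).
VM_CONFIG = {
    "emulsion_rate": {"group_by": "well_id"},   # one emulsion VM per well
    "gas_rate":      {"group_by": "pad_id"},    # one gas VM shared across a pad
}

def training_groups(well_tests: pd.DataFrame, target: str):
    """Yield (group_key, training_frame) pairs according to the configured filter.
    well_tests is assumed to carry 'well_id' and 'pad_id' columns."""
    group_col = VM_CONFIG[target]["group_by"]
    for key, group in well_tests.groupby(group_col):
        yield key, group
```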

The application is preconfigured with specific machine learning models. However, the user can easily configure the application to select any other ML model from a library of machine learning pipelines available through the BHC3 AI Suite. Examples of available pipelines include random forest regressors, linear regressors, gradient boosted regressors, and neural network regressors. All trained models are instances of BHC3 AI Suite Types and are persisted and version-controlled by the BHC3 AI Suite.
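
A hypothetical registry of interchangeable regressors, sketched below with scikit-learn estimators, illustrates how a configured model type could be swapped without changing the surrounding training code; the actual BHC3 pipeline library is accessed through the AI Suite rather than instantiated this way.

```python
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# Illustrative mapping from a configuration string to an estimator factory.
PIPELINE_LIBRARY = {
    "random_forest":    lambda: RandomForestRegressor(n_estimators=200),
    "linear":           lambda: LinearRegression(),
    "gradient_boosted": lambda: GradientBoostingRegressor(),
    "neural_net":       lambda: MLPRegressor(hidden_layer_sizes=(64, 32)),
}

def make_model(name: str):
    return PIPELINE_LIBRARY[name]()   # e.g., make_model("gradient_boosted")
```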

In addition to training the models asynchronously, BHC3 Production Optimization can tune the hyperparameters of the ML models to ensure that the best performance is achieved for a given machine learning model. The application comes with predefined logic for how hyperparameter tuning is performed, but the user can configure the logic used for each hyperparameter tuning iteration. One key customizable part of this logic is the metric used to choose the best-performing model. In a recent deployment, the SMEs decided to use a custom metric in place of the default mean absolute error metric.
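
The sketch below shows one common way to express such tuning with a custom metric, using scikit-learn's cross-validated grid search and make_scorer. The metric and parameter grid are illustrative assumptions, not the application's defaults or the metric used in that deployment.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV

def mean_absolute_percentage_error(y_true, y_pred):
    # Example of a custom metric an SME might prefer over plain mean absolute error
    return np.mean(np.abs((y_true - y_pred) / np.clip(np.abs(y_true), 1e-6, None)))

custom_scorer = make_scorer(mean_absolute_percentage_error, greater_is_better=False)

param_grid = {"n_estimators": [100, 300], "max_depth": [2, 3, 5]}
search = GridSearchCV(GradientBoostingRegressor(), param_grid,
                      scoring=custom_scorer, cv=5)
# search.fit(X_train, y_train) selects the hyperparameters that minimize the custom metric
```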

As with training and hyperparameter tuning, predictions for all virtual meters are continuously generated using asynchronous jobs and are made available to the end user as new data is streamed to the BHC3 platform. The framework also continuously monitors the performance of the machine learning models on new well tests that were not used in training the virtual meters, to assess the quality of the predictions. In case of performance degradation, the framework can automatically retrain the corresponding models using the latest available data. Both the performance criteria and the retraining logic can be configured and customized for a new deployment of BHC3 Production Optimization. Low latency in updating model predictions and scores is an important requirement for virtual metering deployments, and the multi-node architecture of the BHC3 AI Suite meets it with a cost-effective solution.
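
A simplified version of such a monitoring check is sketched below: the live model is scored against well tests that arrived after training, and retraining is triggered when the error exceeds a configured threshold. The threshold and error measure are illustrative assumptions; the actual criteria are configured per deployment.

```python
from sklearn.metrics import mean_absolute_error

ERROR_THRESHOLD = 0.15  # illustrative relative-error threshold

def check_and_retrain(model, new_tests, features, target, retrain_fn):
    """new_tests: well tests not seen during training (DataFrame with features and target)."""
    y_true = new_tests[target]
    y_pred = model.predict(new_tests[features])
    rel_error = mean_absolute_error(y_true, y_pred) / max(abs(y_true.mean()), 1e-6)
    if rel_error > ERROR_THRESHOLD:
        # Performance degraded: retrain the virtual meter on the latest available data
        return retrain_fn()
    return model
```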

Importance of Subject Matter Expertise

Although BHC3 Production Optimization streamlines most of the steps needed to build an effective virtual metering solution, subject matter experts play a major role in a successful deployment. First, identifying the time series that are most reliable and trusted by the field engineers and technicians greatly expedites feature engineering and feature selection; this requires the help of a subject matter expert who is familiar with the specific field. As an example, in a recent deployment of BHC3 Production Optimization, a team of two data scientists was able to test many feature engineering ideas proposed by SMEs at scale (on 200+ production wells) and, in a short amount of time, arrive at a list of features that achieved the required prediction accuracy.

Second, there are operations and procedures, such as commingled well tests, that are specific to each field and may be difficult to detect without familiarity with their details. An SME can help identify these edge cases and guide the deployment team to ensure they are incorporated into the configuration of the BHC3 Production Optimization application. Finally, SMEs can validate the predictions and the explainability of the models used in BHC3 Production Optimization. This final step is critical to ensure that the application is configured properly and integrated into the production system.

Find out more about BHC3 Production Optimization and how to deploy virtual metering at bakerhughesc3.ai/production-optimization.

About the authors

Amir H. Delgoshaie is a Data Science Manager at C3 AI, where he has worked on the development and deployment of multiple large-scale AI applications for the utility, energy, and manufacturing sectors. He holds a Ph.D. in Energy Resources Engineering from Stanford University and master’s and bachelor’s degrees in Mechanical Engineering from ETH Zurich and Sharif UT. Prior to C3 AI, he developed algorithms and software at various research and industrial institutions.

Z. Larry Jin is a Lead Data Scientist at C3 AI. He has worked on multiple projects across a variety of industries, including manufacturing and oil and gas. Before C3 AI, he received his Ph.D. and master’s degree in Energy Resources Engineering from Stanford University, where he focused on applying AI and deep learning to the energy sector.

Riyad Muradov is a Senior Data Scientist at C3 AI, with experience in developing and scaling data-driven applications in the energy and financial sectors. He holds a master’s degree in Energy Resources Engineering from Stanford University with a focus on geostatistics and reservoir characterization.

Johnny Chen is a Senior Data Scientist at Baker Hughes, where he combines his background and experience in chemical engineering and machine learning to develop advanced analytics solutions for the oil and gas industry, including virtual metering and optimization solutions at scale. He holds a master’s degree in Chemical Engineering from the University of Calgary, specializing in advanced process control and optimization. His areas of interest include deep learning, reinforcement learning, application development, and advanced control systems.

Nikhil Gulati is Head of Applied ML at Baker Hughes, with over 12 years of experience in data science, machine learning, and software engineering. He is responsible for machine learning engineering, data science, and implementation services for O&G customers globally for BHC3 products. Before joining Baker Hughes, he was a Data Science Leader at GE, where he developed and deployed ML and AI solutions for GE’s top industrial customers across different verticals. He holds an MS and a Ph.D. in Electrical Engineering from Drexel University, Philadelphia, USA, specializing in signal processing, machine learning, and advanced controls.