October 2023
Special focus: Advances in drilling

Optimizing BHA and fluid selection with a machine learning-based drilling system recommender

An innovative digital solution prepares an operational database and stores performance results of drill bits, motor power sections, RSS, BHA configurations and drilling fluids. After parameters of a planned drilling run are set, a multidimensional distance-based approach selects similar previous drilling runs for analysis. The process enables rational technology selection recommendations.
Greg Skoff, Fatma Mahfoudh, Cheolkyun Jeong and Sergey Makarychew-Mikhailov, SLB

Imagine that you are a drilling engineer tasked with planning an upcoming well. Perhaps you have experience drilling similar wells near the planned well, or maybe this is an exploration well in a new area. Your goal is to drill the well within the expected budget while meeting all wellbore objectives, such as adhering to the designed trajectory and maintaining a certain wellbore quality.

You must design the drilling system, including the bottomhole assembly (BHA) and the drilling fluid, to drill each interval, or hole section, of the well. The BHA consists of a drill bit, a rotary steerable system (RSS), likely a motor, and often measurement-while-drilling and logging-while-drilling tools. The drilling fluid must satisfy all requirements for hole cleaning and wellbore stability. Also, the drilling fluid must be compatible with internal components of the BHA.  

The drilling engineer has numerous options for each of these components. A key decision is to determine which components to select. Furthermore, the engineer needs to determine whether the individual component selections are well-suited for the entire drilling system.  

Today, drilling engineers have many methods available to assist them in this decision-making process. Commonly considered factors include local offset well analysis, prior experience, physical modeling, tool availability and/or cost, and consultant advice from service companies. Because these inputs are numerous and complex, there is a need for a high-level, data-driven workflow to assist the decision-making process. Such a workflow would reduce the risk of suboptimal decision-making and improve the efficiency of the well construction process.


The well construction industry continues to accumulate large quantities of data from global operational experience. Contained within this experience is a wide variety of both success and failure. The industry’s digital transformation aims to use these data to their fullest potential to reduce costs and improve efficiency. Digital technologies for data science and ML applications are rapidly becoming both easy to access and use. The stage is set for rapid growth and adoption of digital technologies within the industry to assist the human decision-making process. 

Recommender, or recommendation, systems in the large data volume era are changing the decision-making process in many ways, and the engines behind these systems are becoming smarter by using various statistical and machine-learning approaches. In the field of well construction, drilling performance is heavily influenced by the decisions made by engineers. However, these decisions are influenced greatly by the experiences of those very people.  

For example, a drilling engineer might remember the outstanding or atrocious run previously witnessed but may not evaluate the statistics of all the runs from a rig. Our team recognized the business goal to reduce operational time and increase consistency and realized that providing historical equipment selection and performance results through an intuitive user interface could help achieve these goals. 

An interdisciplinary team of data scientists, software engineers, and domain experts in the company collaborated to develop a novel recommender system to support data-driven decision-making in the well design process. Initially, the team partnered with an external ML consulting company, Pluto7, to draw on its experience with feedback-based recommendation systems, such as those used to rate items on Netflix and Amazon.

However, our solution is novel in that we primarily use statistical measures of performance rather than user feedback; thus, we ventured out on our own to continue development. The authors have worked directly with internal end-users to ensure that our approach and the user interface satisfy the needs of today’s decision-makers. We reworked and improved the first solution to include the motor power section, BHA, RSS and drilling fluid recommendations. The ML tool has the potential to add other criteria, such as cost and inventory availability.

This article is a follow-up to our team’s initial paper, which focused on the development of the drilling fluid recommender. In this article, we aim to describe other components that have been developed, including PDC bit, motor power section and BHA design, as well as share some insights into our efforts toward a holistic drilling system recommender.  


In early 2020, we began developing the techniques and technologies for the drilling system recommender (DSR). Initially, two separate proof-of-concept engines were developed: one for PDC drill bit design and one for motor power section configuration. Building on this initial work, recent efforts have increased the scope of these engines, developed additional engines for the BHA type and drilling fluid, and deployed a web app that provides a user interface (UI) to interact with these engines and obtain the recommendations. This effort led to user acceptance testing of the engines and web app, in which expert users provided feedback that the development team used to refine the final products.

Recommender systems are a popular class of artificial intelligence tools, implemented in many consumer applications. Music, video, and book recommendation services, product-buying recommendations, social platform content recommendations, and dating match advisers are just a few examples available today. In fact, at least one cloud computing provider (Google) now offers a fully managed service to deliver these consumer-type recommendations to its customers.

A key difference between the work that this article presents, and these consumer-type recommendation systems, is that while the latter relies primarily on user preferences, the DSR relies on historical performance and other KPIs that are important within the context of the decision to be made. The KPI concept allows the recommendations to be based on what is ultimately important to the user; e.g., the average ROP of the run, the success rate of similar runs drilled with a certain tool, or the cost of the specific tool. The importance of each KPI can be fine-tuned, using the KPI importance weights, which the user can modify, based on their preferences. 

Ultimately, the usefulness of recommendations comes from the data on which the engine is built.  The authors’  service company has numerous global data systems that contain operational experience from the business units for drill bits, directional drilling services, and drilling fluids. To power the new DSR, an extensive effort was made to gather, clean and prepare global operational data into a new database. 


The DSR development relied heavily on collecting, organizing, and standardizing data from multiple legacy well construction databases within the service company. These databases included a legacy bits and tools database, a legacy directional drilling service database, and a legacy drilling fluid database. Together, they hold data from hundreds of thousands of wells drilled worldwide. The databases are refreshed daily with new operational data and are made available to internal data science and data analytics workflows.

As of October 2022, the number of global wells, sections, and runs since 2010 per database is summarized in Table 1. Note that a run count for the legacy drilling fluid database is not provided, because this database stores data per day and per section, but not per drilling run. 

All these data were hosted on a cloud environment, ensuring that access was controlled, using the minimum access concept, and conforming to strict data residency and data rights-of-use rules. Making the data available on the cloud opened the door to using current best practices for developing and deploying data science applications to a global user base. 

To standardize the data, a common table schema was established. Also, an extensive data engineering pipeline was developed to transform the data into a set of tables that are usable by everyone from data scientists to business users.


The data engineering phase begins with accessing well construction data from the three major databases, which are maintained by the three well construction business units. All the databases have complex schemas and hundreds of tables connected by different primary keys. Initially, the level of detail required by the recommender systems is defined, and the relevant tables and features are selected.

Depending on whether the application is for entire well intervals, well sections, or individual drilling runs, a run summary is used as the lowest level, which aggregates information from daily summaries and high-resolution time series data for drilling operations. Structured Query Language (SQL) queries are formulated to extract the necessary information from multiple raw tables and convert it into a series of compact tables that share the unified schema and feature notation.
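
The aggregation from daily summaries to a run summary can be illustrated with a minimal sketch. The table name, column names and values below are hypothetical and are not taken from the service company's databases; an in-memory SQLite database stands in for the real sources.

```python
import sqlite3

# In-memory database with a hypothetical daily summary table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE daily_summary (
        run_id TEXT, report_date TEXT, footage_m REAL, drilling_hours REAL
    );
    INSERT INTO daily_summary VALUES
        ('R1', '2022-01-01', 300.0, 10.0),
        ('R1', '2022-01-02', 200.0, 10.0),
        ('R2', '2022-01-05', 150.0, 5.0);
    """
)

# Collapse the daily records into one compact row per drilling run.
RUN_SUMMARY_SQL = """
SELECT run_id,
       SUM(footage_m)                       AS footage_m,
       SUM(drilling_hours)                  AS drilling_hours,
       SUM(footage_m) / SUM(drilling_hours) AS avg_rop_m_per_hr
FROM daily_summary
GROUP BY run_id
ORDER BY run_id
"""
run_summary = conn.execute(RUN_SUMMARY_SQL).fetchall()

for row in run_summary:
    print(row)
```

Each output row holds the run identifier, total footage, total drilling hours and the resulting average ROP for that run.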

In the second step, the data are cleaned and processed as follows. Numerous features, including well location, operator, bit type, BHA, motor and mud, are cleaned automatically with grouping SQL or Python scripts or, occasionally, with manually created grouping tables. All numerical features, which by convention are originally stored in the International System of Units, are converted to oilfield units for improved readability. Outliers in key features, which originate from either inaccurately aggregated or erroneously entered manual data, are identified.

Several consistency checks are performed to ensure there are no mismatches in key well section and drilling run properties; e.g., location identifiers, top and bottom depths (measured depth and true vertical depth) and inclination. Outliers and inconsistent values are corrected wherever possible; otherwise, they are nullified. Records (runs, intervals, wells) with too many nullified or missing values are then removed. The resulting data sets are smaller, occasionally with data loss exceeding 25%, but they provide much higher-quality and more reliable data sources. The cleaning and processing workflows are fully automated, using Dataiku Data Science Studio version 10.0.
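
A minimal pandas sketch of these cleaning steps follows. It is an illustration only, not the Dataiku workflow itself: the column names, the 0 to 180 degree inclination range and the 50% missing-value threshold are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

M_TO_FT = 3.280839895  # metres to feet

def clean_runs(runs: pd.DataFrame, max_missing_fraction: float = 0.5) -> pd.DataFrame:
    """Convert units, nullify bad values and drop sparse records (assumed rules)."""
    out = runs.copy()

    # Convert depths from SI to oilfield units.
    out["top_md_ft"] = out.pop("top_md_m") * M_TO_FT
    out["bottom_md_ft"] = out.pop("bottom_md_m") * M_TO_FT

    # Consistency check: the bottom depth must be deeper than the top depth.
    bad_depth = out["bottom_md_ft"] <= out["top_md_ft"]
    out.loc[bad_depth, ["top_md_ft", "bottom_md_ft"]] = np.nan

    # Outlier check: inclination outside an assumed physical range is nullified.
    bad_inclination = ~out["inclination_deg"].between(0, 180)
    out.loc[bad_inclination, "inclination_deg"] = np.nan

    # Remove records with too many nullified or missing values.
    feature_cols = ["top_md_ft", "bottom_md_ft", "inclination_deg"]
    missing_fraction = out[feature_cols].isna().mean(axis=1)
    return out[missing_fraction <= max_missing_fraction].reset_index(drop=True)

runs = pd.DataFrame(
    {
        "run_id": ["A", "B", "C"],
        "top_md_m": [1000.0, 2000.0, 1500.0],
        "bottom_md_m": [1500.0, 1800.0, 2500.0],  # run B is inconsistent
        "inclination_deg": [5.0, 30.0, 400.0],    # run C has an outlier
    }
)
cleaned = clean_runs(runs)
print(cleaned)
```

In this example, run B loses both depth values and is removed, while run C keeps its depths and only has its inclination nullified.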

Recommender engine methods. Because numerous recommender engines have been developed, multiple approaches for engine architecture have been explored within the present work. The similarity of the planned run or interval to that of offset runs or intervals is the basis of the drilling fluid, BHA, and motor power section suggestions. This similarity is evaluated using a multidimensional distance algorithm, such as the Euclidean or Manhattan distance algorithms. While the drilling fluid recommender uses this approach directly, the BHA and motor power section recommenders use an unsupervised ML approach to cluster the offset runs. The multidimensional distance is then used to determine the most similar runs within the planned run’s predicted cluster.

The bit design recommender, instead, uses a supervised ML approach. Classification models are used to predict multiple key bit design features, based on the input run features. This design was selected for two reasons. The first is the substantial number of bit designs available within the service company’s product portfolio. The second is the rate at which new bit designs are introduced: it is desirable to be able to recommend new bit designs that have not yet been field-tested. The user may well want to consider these unproven options together with the proven options.

Drilling system recommender app. The DSR web application was developed in Python version 3.7, using open-source libraries, such as Pandas, SciPy, Scikit-Learn, and Streamlit. Streamlit, which turns Python data scripts into shareable web apps without requiring front-end experience, was a key library, because it allowed our team to quickly build a web-based UI, which we deploy directly to the end-users and rapidly iterate upon, based on their feedback. The DSR web app is deployed as a Docker container, hosted on the Microsoft Azure App Service. This web app uses a continuous integration, continuous deployment (CI/CD) pipeline, so that when changes are pushed to the Azure DevOps code repository, the app is automatically rebuilt and redeployed.

The end-user of the DSR web app is a drilling engineer, who is planning an upcoming well. The web app has multiple pages—one page for each decision to be made. These decisions are typically made in a predetermined order; e.g., first, the drilling fluid is selected, followed by the BHA, then the motor power section, and finally, the PDC drill bit. For each decision, the user is required to enter the parameters of the planned run, because parameters are essential for the decision to be made.  

For example, when using the BHA recommender page, the important parameters include the run type, the mud type, the geographic location, the water depth, the top measured depth, the planned drilling distance, the top and bottom true vertical depths, the top and bottom inclination angles, the wellbore diameter, the maximum dogleg severity and the average bottomhole temperature.  

After inputting the required data, the recommendations are displayed, and additional analyses can be performed by modifying the input data, changing the KPI importance weights, or simply analyzing the various data visualizations. The web app has been deployed to a global group of drilling engineers. For each recommender page, feedback sessions are held regularly, and the development team uses this feedback to rapidly iterate and improve user experience. Throughout this process, countless improvements have been made to the web app, to the engines, and even to the underlying database and data engineering pipeline. The authors believe that this feedback from actual users is essential to the success of the developed application. 


This section is divided into subsections for each recommender engine presented within the current work. 

Drilling fluid recommender. While drilling fluid is an essential technology in the well construction process, drilling performance is not usually a top priority considered when choosing a drilling fluid. However, drilling fluids can be helpful in achieving drilling objectives. Being a complex chemical system, drilling fluids also have numerous and often contradictory requirements, which makes the proper choice challenging. Our recent effort to profile drilling fluids in a variety of properties to understand similarities and differences in their performance and cost led us to the concept of the drilling fluid recommender. 

The objective of the drilling fluid recommender is to recommend the best drilling fluid for the planned well interval. Because drilling fluids are usually selected for the full well interval (section), no detailed information on individual drilling runs is required, making a higher-level interval summary table sufficient. The smaller number of entities in the historical database allows for more intensive real-time computations for the recommendation engine without sacrificing user experience. Instead of clustering by similarity, which is done for drilling runs in the BHA and motor power section recommenders and is usually performed offline, the offset intervals here are selected using the multidimensional distance, calculated in real time.

The target interval properties, entered by the user in the planning phase, are similar to the properties used in other recommenders. The target interval is placed into a multidimensional feature space populated with historical intervals. Distances between the target and all historical intervals are calculated, using either Euclidean or Manhattan distance algorithms. The smaller the distance, the more similar the target interval is to its offset.

The similarity is characterized by a normalized metric, the so-called offset similarity index, which ranges from zero (most dissimilar) to one (identical). The user can select the number of historical offset intervals for analysis by varying the similarity index value. As in the other recommenders, additional filters can also be applied to refine the offset interval selection.
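
The distance-based offset selection can be sketched as follows. The article does not give the normalization used for the offset similarity index, so the formula here (one minus the distance divided by the largest distance among the offsets) is an assumption, as are the interval names, the pre-scaled feature values and the 0.25 threshold.

```python
import math

def distance(a, b, metric="euclidean"):
    """Multidimensional distance between two equally scaled feature vectors."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if metric == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if metric == "manhattan":
        return sum(diffs)
    raise ValueError(f"unknown metric: {metric}")

def similarity_index(target, offsets, metric="euclidean"):
    """Map distances to [0, 1]: 1 is identical, 0 is the farthest offset.

    Assumed normalization; the published index may be defined differently.
    """
    dists = {name: distance(target, feats, metric) for name, feats in offsets.items()}
    d_max = max(dists.values())
    if d_max == 0:
        return {name: 1.0 for name in dists}
    return {name: 1.0 - d / d_max for name, d in dists.items()}

# Hypothetical intervals with features already scaled to the 0-1 range.
target = [0.5, 0.5]
offsets = {"I1": [0.5, 0.5], "I2": [0.8, 0.9], "I3": [0.0, 0.0]}

sims = similarity_index(target, offsets)
# Keep only the offsets at or above a user-selected similarity threshold.
selected = sorted(name for name, s in sims.items() if s >= 0.25)
print(sims, selected)
```

Raising the threshold shrinks the set of offset intervals used for the analysis, which mirrors how the user controls the number of offsets in the app.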

For all drilling fluids used in selected historical offset intervals, multiple aggregated metrics are quickly calculated, which allows for ranking fluids by many performance characteristics. In addition to ROP, time activity distributions are provided, making it easy to analyze the time used on various remedial activities (hole conditioning, stuck pipe), which are sometimes associated with fluid performance.  

Distributions of various fluid properties are also reported, as well as several indicators of the fluid cost and complexity; e.g., numbers of products used in the fluid formulation and numbers of fluid treatments required to maintain fluid within the specification range. All of the fluids used in the offset intervals are ranked in each of these performance characteristics. Depending on local requirements or constraints and other considerations, the user is empowered to weight performance metrics differently to find an optimal fluid system.

BHA recommender. While drilling fluids are selected for a well interval, the BHA is selected for the planned drilling run. Multiple different BHAs can be used within the same well interval, because the interval might be divided into multiple runs, each with their own unique challenges. For example, a horizontal well might have both vertical and curve portions in the same 8¾-in. interval. In this case, the BHA for the curve section will almost certainly be different from the BHA for the vertical section. For these reasons, the BHA recommender, and the other recommenders presented in this paper, are focused on the drilling run requirements. 

The objective of the BHA engine is to recommend the best combination of the BHA type and the RSS model for the planned run. The BHA type is classified as one of five options—rotary (no motor, no RSS), straight motor, steerable motor, RSS, and motor-assisted RSS.  For both the RSS and the motor-assisted RSS BHA types, there are nine different RSS models offered by the service company. Hence, in total, there are 21 different BHA type options from which to select. The legacy directional drilling database is used for this engine, because it contains the best information on the directional tools used and performance of the drilling runs. 
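
The count of 21 options follows from three BHA types without an RSS plus two RSS-based types, each combined with nine RSS models. A short enumeration makes this explicit; the RSS model names are placeholders, because the article does not list them.

```python
# BHA types that do not involve an RSS model.
NON_RSS_TYPES = ["rotary", "straight motor", "steerable motor"]
# BHA types that must be paired with one of the RSS models.
RSS_TYPES = ["RSS", "motor-assisted RSS"]
# Placeholder names for the nine RSS models (not the real product names).
RSS_MODELS = [f"RSS model {i}" for i in range(1, 10)]

bha_options = [(bha_type, None) for bha_type in NON_RSS_TYPES]
bha_options += [(bha_type, model) for bha_type in RSS_TYPES for model in RSS_MODELS]

print(len(bha_options))  # 3 + 2 * 9 = 21
```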

The required input features for the planned run include the run type (simple run directional classification, such as vertical, tangent, curve, lateral, etc.), mud type (water-based mud or nonaqueous fluid), mud density, geographic location, top measured depth, drilling distance, top and bottom true vertical depth, top and bottom inclination, wellbore diameter, maximum dogleg severity, average bottomhole temperature, and water depth (for offshore wells). Before training an ML model, categorical features typically need to be encoded numerically, with one-hot encoding being a popular method. Finally, the input feature dimensionality and collinearity are reduced with principal component analysis. 
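
The encoding and dimensionality reduction steps might look like the following scikit-learn sketch. The feature subset, the sample values, the standard scaling of numerical features and the 95% explained-variance target are assumptions made for illustration; the article does not state the settings used in the DSR.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical historical runs with two categorical and two numerical features.
runs = pd.DataFrame(
    {
        "run_type": ["vertical", "curve", "lateral", "vertical",
                     "curve", "lateral", "vertical", "lateral"],
        "mud_type": ["WBM", "NAF", "NAF", "WBM", "WBM", "NAF", "NAF", "WBM"],
        "top_tvd_ft": [500.0, 7000.0, 9500.0, 800.0, 6500.0, 9800.0, 1200.0, 9000.0],
        "mud_density_ppg": [8.6, 9.5, 12.0, 8.8, 9.2, 12.5, 9.0, 11.5],
    }
)
categorical = ["run_type", "mud_type"]
numerical = ["top_tvd_ft", "mud_density_ppg"]

# One-hot encode the categorical features and scale the numerical ones.
# sparse_threshold=0.0 forces a dense array, which PCA requires.
preprocess = ColumnTransformer(
    [
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("scale", StandardScaler(), numerical),
    ],
    sparse_threshold=0.0,
)

# Keep enough principal components to explain 95% of the variance.
pipeline = Pipeline(
    [("preprocess", preprocess), ("pca", PCA(n_components=0.95, svd_solver="full"))]
)
reduced = pipeline.fit_transform(runs)

# A planned run is transformed with the already fitted pipeline.
planned_run = pd.DataFrame(
    [{"run_type": "curve", "mud_type": "NAF", "top_tvd_ft": 7200.0, "mud_density_ppg": 9.8}]
)
planned_reduced = pipeline.transform(planned_run)
print(reduced.shape, planned_reduced.shape)
```

Fitting the pipeline once on historical runs and reusing it for the planned run keeps both in the same reduced feature space.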

The BHA recommender engine is based on an unsupervised clustering method. First, all historical runs are split into two datasets – one for model training, the other for model testing. Using the training dataset, a model is trained, using the k-means algorithm. The approximately 140,000 total runs are separated into 35 clusters of similar runs, with an average of 4,000 runs per cluster. The number of clusters was selected, using the elbow method. The silhouette score, which can be between -1 and 1, is used to evaluate how well the clustering model performs. A silhouette score of 0.67 was achieved on the testing dataset.  
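
The clustering workflow can be sketched on synthetic data. The blobs below are a stand-in for the encoded run features; the cluster counts and scores they produce are unrelated to the 35 clusters and the 0.67 silhouette score reported above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the run feature matrix: four well-separated groups.
centers = [[0, 0], [10, 0], [0, 10], [10, 10]]
X, _ = make_blobs(n_samples=600, centers=centers, cluster_std=1.0, random_state=0)
X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

# Elbow method: record the within-cluster sum of squares (inertia) per k
# and look for the k after which the improvement flattens out.
inertia_by_k = {}
for k in range(2, 8):
    fitted = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_train)
    inertia_by_k[k] = fitted.inertia_

chosen_k = 4  # read off the inertia curve for this synthetic example
model = KMeans(n_clusters=chosen_k, n_init=10, random_state=0).fit(X_train)

# Silhouette score on held-out runs, using their predicted cluster labels.
test_score = silhouette_score(X_test, model.predict(X_test))
print(inertia_by_k, round(test_score, 2))
```

The elbow is usually judged by inspecting the inertia curve, so the choice of k remains partly a matter of judgment.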

We observed that most clusters are segregated by three categorical features; i.e., the geographic location, the run type, and the mud type. Numerical features with high importance to the clustering included top and bottom true vertical depth, top inclination, and mud density. While the geographic location and many of the other input features function as proxy features for the formation drilled, it is recognized that improved data availability of the actual formation types and formation characteristics drilled should improve the run clustering approach.

Fig. 1. The BHA recommender web app.

After the planned run parameters are entered into the app by the user, the k-means model is used to predict to which cluster the planned run is assigned. This yields a group of similar offset runs that are used for additional evaluation. Furthermore, within the cluster, the multidimensional Euclidean distance between the planned run and all offset runs is computed and quantified by the so-called offset similarity index; thus, the final recommendations can be based on only the most similar runs within the cluster. Also, a set of filters is provided, so that the end-user can manually refine the offset run selection if so desired.
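
A minimal sketch of this two-stage lookup is shown here, using synthetic feature vectors in place of real offset runs. The function name and the number of returned runs are hypothetical, and the conversion of distances into the offset similarity index is left out.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic offset runs: three groups of 50 runs with three features each.
rng = np.random.default_rng(0)
group_centers = ([0, 0, 0], [5, 5, 5], [10, 0, 5])
offset_runs = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 3)) for c in group_centers]
)
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(offset_runs)

def most_similar_offsets(planned_run, top_n=5):
    """Return indices and distances of the closest runs in the predicted cluster."""
    # Stage 1: assign the planned run to a cluster.
    cluster = model.predict(planned_run.reshape(1, -1))[0]
    member_idx = np.flatnonzero(model.labels_ == cluster)
    # Stage 2: rank the cluster members by Euclidean distance.
    dists = np.linalg.norm(offset_runs[member_idx] - planned_run, axis=1)
    order = np.argsort(dists)[:top_n]
    return member_idx[order], dists[order]

planned = np.array([5.2, 4.9, 5.1])
idx, dists = most_similar_offsets(planned)
print(idx, dists.round(3))
```

Restricting the distance calculation to one cluster keeps the online step cheap, because the clustering itself can be done offline.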

Once the most similar offset runs are discovered, they are used to determine which BHA types were used in the offset runs, as well as how each BHA type performed. KPIs are generated for each BHA type, including the median ROP, the median drilled footage, the 90th percentile achieved-maximum dogleg severity, the percentage of successful runs overall, and the total number of runs. Finally, a simple weighted average overall score is computed, based on the five KPIs and their individual KPI importance weights, which can be modified by the web app user. Figure 1 shows the BHA recommender web app interface. 
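
The KPI aggregation and weighted scoring can be sketched with pandas. The offset data are invented, and two details are assumptions: each KPI is min-max normalized across the candidate BHA types before weighting, and higher values are treated as better for every KPI.

```python
import pandas as pd

# Hypothetical offset runs selected for a planned run.
offsets = pd.DataFrame(
    {
        "bha_type": ["RSS", "RSS", "RSS", "steerable motor", "steerable motor"],
        "rop_ft_hr": [100.0, 120.0, 140.0, 80.0, 90.0],
        "footage_ft": [3000.0, 4000.0, 5000.0, 2000.0, 2500.0],
        "max_dls": [3.0, 4.0, 5.0, 6.0, 8.0],
        "success": [True, True, False, True, True],
    }
)

# One row of KPIs per BHA type.
kpis = offsets.groupby("bha_type").agg(
    median_rop=("rop_ft_hr", "median"),
    median_footage=("footage_ft", "median"),
    p90_dls=("max_dls", lambda s: s.quantile(0.9)),
    success_pct=("success", lambda s: 100.0 * s.mean()),
    run_count=("success", "count"),
)

def overall_score(kpi_table, weights):
    """Weighted average of min-max normalized KPIs (assumed normalization)."""
    w = pd.Series(weights, dtype=float)
    span = (kpi_table.max() - kpi_table.min()).replace(0, 1.0)
    normalized = (kpi_table - kpi_table.min()) / span
    return (normalized[w.index] * w).sum(axis=1) / w.sum()

equal_weights = {name: 1.0 for name in kpis.columns}
success_first = dict(equal_weights, success_pct=5.0)

score_equal = overall_score(kpis, equal_weights)
score_success = overall_score(kpis, success_first)
print(kpis)
print(score_equal.sort_values(ascending=False))
print(score_success.sort_values(ascending=False))
```

With equal weights the RSS option ranks first in this toy data set, while emphasizing the success rate moves the steerable motor to the top, which illustrates how the importance weights change the recommendation.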

Fig. 2. t-distributed stochastic neighbor embedding plot of 140,000 drilling runs, where the data points are colored by cluster number.

Figure 2 shows a t-distributed stochastic neighbor embedding plot, which is a 2D representation of the approximately 140,000 runs using a manifold-learning approach that embeds high-dimensional datapoints in a lower dimensional space. This is a popular approach to visually observing clusters in a dataset. The markers on this plot are colored by the cluster number, and the plot shows that the clusters are, for the most part, distinct and well separated.  
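
A plot of this kind can be produced with scikit-learn's t-SNE implementation, as in the sketch below. The feature matrix is synthetic and the perplexity value is an arbitrary choice for a small example; the plotting call itself is omitted.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in: 60 runs with six features, drawn from two groups.
rng = np.random.default_rng(0)
features = np.vstack(
    [rng.normal(loc=c, scale=0.3, size=(30, 6)) for c in (0.0, 3.0)]
)
cluster_labels = np.repeat([0, 1], 30)  # would come from the k-means model

# Embed the six-dimensional points in two dimensions.
embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(features)

# embedding[:, 0] and embedding[:, 1] can then be drawn as a scatter plot,
# with cluster_labels used to color the markers.
print(embedding.shape)
```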

The characteristics of the clusters can also be compared to each other. Figure 3 shows that most clusters are equally populated (pie chart). The figure also shows that most clusters are composed of runs from a specific L5GeoUnit, run type, and mud type. Furthermore, some clusters might contain a few runs from other geounits, run types, or mud types.

Fig. 3. Cluster analytics.

Motor power section recommender. After the BHA type is selected, additional tool selections can be made. If the BHA type is either a straight motor, steerable motor, or motor-assisted RSS, then the motor power section must be selected. Thus, the objective of the motor power section engine is to recommend the best motor power section configuration (PSC). The PSC is defined as the motor diameter, lobe configuration, and number of stages. Based on this definition, there are approximately 150 different motor PSCs that have been used in the last 10 years. The required input features are quite similar to those of the BHA engine, with the addition of the BHA type, which based on domain experience, was deemed necessary to make the motor recommendation. 

Similar to the BHA recommender engine, the motor power section recommender is based on an unsupervised clustering method. First, all historical runs are clustered into groups of similar runs, using the k-means algorithm. Then, the k-means model is used to predict to which cluster the planned run is assigned.  As a result, we have a group of similar offset runs that are used for supplementary evaluation. Additionally, within the cluster, the multidimensional Euclidean distance between the planned run and all offset runs is computed and  quantified by the so-called offset similarity index, so that the final recommendations can be based on only the most similar runs within the cluster. 

Fig. 4. Motor PSC architecture flow chart.

Once the most similar offset runs are found, they are used to determine which motor PSCs were used in the offset runs, as well as how each motor PSC performed. KPIs are generated for each PSC, including the median ROP, the median drilled footage, the percentage of successful runs overall, the percentage of runs that ended for a reason other than downhole motor failure, and the total number of runs. Finally, a simple weighted average overall score is computed, based on the five KPIs and their individual KPI importance weights, which can be modified by the web app user.

Additionally, filters are provided for motor power section specifications, such as motor outer diameter and motor flow speed ratio (revolutions per gallon), which the user can use to limit the PSC options considered. The overall architecture of the motor PSC recommender is shown in Fig. 4. The web app UI is similar to that of the BHA engine and is shown in Fig. 5.  

Fig. 5. Motor PSC recommender UI app.

Drill bit recommender. Finally, once the drilling fluid, BHA, and motor have been selected, a drill bit should be selected that is not only compatible with, but complements and maximizes, the performance of the remainder of the drilling system. However, drill bit designs are much more numerous than even motor PSCs. PDC drill bits, which drill far more global footage than roller cone bits, are available in many different varieties. These varieties can be generally characterized by the number of blades, the size of the PDC cutters, and the bit body material.  

PDC bits are becoming even more diverse, due to the explosion in shaped diamond element technology. Various types of shaped diamond element PDC bits are yielding enhanced drilling performance in some applications but are making the task of selecting a drill bit design significantly more complex. Not only are there numerous drill bit designs,  but new designs are frequently released.  This situation presents another challenge, because newly released bit designs have limited or no actual field performance data on which to base the KPIs used by a drill bit recommender engine.  

Fig. 6. An illustration of predicted bit design features. Four key design features were predicted by the pretrained ML engines.

For these reasons, the objective of the drill bit recommender engine is to predict four key PDC bit design features, rather than a specific drill bit. The bit design features predicted by the ML engine are the number of blades, the cutter diameter, the shaped diamond element technology (or PDC if no shaped cutter technology is used), and the bit body material type (either matrix or steel). As an example, Fig. 6 shows the prediction results of the ML engine for each design feature. In this example case, the ML engine recommends five blades, 11-mm cutter diameter, PDC cutter technology, and steel body material as the most preferable bit design features for the particular planned run. 

The drill bit engine architecture is shown in Fig. 7. Based on the predicted bit design features, a search engine is used to find all PDC bit designs that match the predicted design feature set. The KPIs are then computed for each bit design, when field performance data are available. When these data are not available, which is possible for newly introduced bit designs, the bit designs are still displayed, because the user might want to consider them. Because historical bit selection information has accumulated in the data, the authors aimed to develop an ML engine that characterizes the relationship between drilling environment features and the design features of the bits that were selected.
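
The matching step can be sketched as a simple filter over a design catalog. The catalog entries, design names and KPI values are hypothetical; the predicted feature set reuses the example from Fig. 6.

```python
# Hypothetical bit design catalog (not real product data).
catalog = [
    {"design": "Design A", "blades": 5, "cutter_mm": 11, "cutter_tech": "PDC", "body": "steel"},
    {"design": "Design B", "blades": 5, "cutter_mm": 11, "cutter_tech": "PDC", "body": "steel"},
    {"design": "Design C", "blades": 6, "cutter_mm": 13, "cutter_tech": "PDC", "body": "matrix"},
]

# Field KPIs exist only for designs that have been run; Design B is new.
field_kpis = {"Design A": {"median_rop_ft_hr": 150.0, "runs": 42}}

def find_matching_designs(predicted_features):
    """Return every design matching all predicted features, with KPIs if any."""
    matches = []
    for entry in catalog:
        if all(entry[key] == value for key, value in predicted_features.items()):
            # Designs without field data are kept, with kpis set to None.
            matches.append({"design": entry["design"], "kpis": field_kpis.get(entry["design"])})
    return matches

predicted = {"blades": 5, "cutter_mm": 11, "cutter_tech": "PDC", "body": "steel"}
recommendations = find_matching_designs(predicted)
print(recommendations)
```

Design B appears in the result without KPIs, which corresponds to a newly introduced bit design that has not yet been field-tested.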

Fig. 7. Drill bit recommender engine architecture.

Based on that relationship, the developed system can recommend potentially successful PDC bit designs for a planned drilling run. The input features are typically known during the planning phase and include hole size, upper and lower depth, trajectory, BHA type, and fluid type and density. As shown in Fig. 6, the ML engine learns the relationship between drilling environments (run features) and corresponding choices (bit design features). The ML engine helps the decision-maker choose the relevant bit designs easily, based on the recommendations generated.

To predict each key bit feature, the results from six different classification models are collected. Those models are Bagging (BGG), Extra Trees (ExtraTree), K-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LGBM), Random Forest (RF), and eXtreme Gradient Boosting (XGB). This ensemble approach is beneficial, because each modeling algorithm characterizes the relationship between input features and target classification features differently.

The authors attempted to capture, as far as possible, the influence of the different ML algorithms in the modeling architecture. Because the bit feature predictions from the various models may differ, presenting the results from all models can give end-users a different perspective, which might be helpful in making their final decision. In Fig. 6, all ML models agree on PDC for the cutter technology type, but the KNN model predicts a different body material type. This ensemble approach also provides flexibility for the final recommendation, which helps in considering various KPIs, tool availability and user preferences.
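
The idea of collecting one prediction per model can be sketched with scikit-learn. This example covers a single target (for instance, the number of blades, encoded as three classes) on synthetic data, and it includes only the four model types available in scikit-learn; the LGBM and XGB models are omitted because they require separate packages. The use of ExtraTreesClassifier for the ExtraTree model and all hyperparameters are assumptions.

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic run features (X) and one encoded bit design feature (y).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)
X_train, y_train = X[:-1], y[:-1]
planned_run = X[-1:]  # the last row plays the role of the planned run

models = {
    "BGG": BaggingClassifier(random_state=0),
    "ExtraTree": ExtraTreesClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=0),
}

# Train each model and keep its individual prediction for the planned run.
votes = {
    name: int(model.fit(X_train, y_train).predict(planned_run)[0])
    for name, model in models.items()
}

# The per-model votes are shown to the user; a majority is one possible summary.
majority_class, majority_count = Counter(votes.values()).most_common(1)[0]
print(votes, majority_class, majority_count)
```

Keeping the individual votes visible, rather than only the majority, matches the article's point that disagreement between models is itself useful information for the user.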

A data set was prepared for the proof-of-concept, including only the drilling runs in the Delaware basin, West Texas, since 2015, where PDC drill bits were used; hence, bit design details were known. These drilling runs generated a data set of approximately 9,500 total runs. 

Fig. 8. Results of the ML engine prediction capability for bit design features in testing dataset (=1,899 runs).

The prediction accuracy of the ensemble model was confirmed with the testing dataset, which contained approximately 1,900 total runs. Figure 8 shows the performance of the ML classifiers. In this particular basin data set, the ML engines accurately predicted all four bit design features for 52% of the testing runs and at least three bit design features for 80% of the testing runs. These results indicate that the trained ML engines can infer which drill bit design features are relevant to the given drilling conditions.
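
The two reported rates count, for each run, how many of the four design features were predicted correctly. A small sketch of that calculation follows; the four example runs are invented and do not reproduce the 52% and 80% figures.

```python
def feature_match_rates(true_rows, predicted_rows):
    """Share of runs with all four, and with at least three, features correct."""
    correct_counts = [
        sum(t == p for t, p in zip(true_row, pred_row))
        for true_row, pred_row in zip(true_rows, predicted_rows)
    ]
    n_runs = len(correct_counts)
    return {
        "all_four": sum(c == 4 for c in correct_counts) / n_runs,
        "at_least_three": sum(c >= 3 for c in correct_counts) / n_runs,
    }

# Each row: (blades, cutter diameter in mm, cutter technology, body material).
actual = [
    (5, 11, "PDC", "steel"),
    (6, 13, "PDC", "matrix"),
    (5, 16, "shaped", "steel"),
    (7, 13, "PDC", "matrix"),
]
predicted = [
    (5, 11, "PDC", "steel"),     # 4 of 4 correct
    (6, 13, "PDC", "steel"),     # 3 of 4 correct
    (6, 13, "shaped", "steel"),  # 2 of 4 correct
    (7, 13, "PDC", "matrix"),    # 4 of 4 correct
]

rates = feature_match_rates(actual, predicted)
print(rates)
```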

Following the proof-of-concept, a second-generation bit recommendation engine was developed that expanded the geographic scope to all of U.S. land. This expanded geographic scope resulted in a significant increase in both the total number of runs and drill bit designs, because drill bit designs are typically developed for a specific geographic basin of which there are many in U.S. land, including the Permian, Anadarko, Gulf Coast, Williston and Appalachian basins.  

A dataset containing a total of approximately 58,000 drill bit runs from U.S. land since 2015 was prepared for this drill bit engine. Again, only drilling runs, in which PDC drill bits were utilized and bit design details were known, were used for model training and testing. 

While the overall accuracy of the models predicting the number of PDC blades and the PDC cutter diameter was 75% for the Delaware basin proof-of-concept data set, it was improved to 82% for the larger U.S. land data set. This demonstrates that using a larger, more varied data set for training can improve the performance of the classification models. The accuracy and F1 scores for the two models are shown in Tables 2 and 3.   

Fig. 9. Drill bit recommender web app UI.

Lastly, the engine based on the U.S. land data set was used within the web app. The interface of the web app is similar to that of the other recommenders. User inputs, including the planned run parameters, the KPI importance weights, and optional bit design feature filters are provided on the sidebar, on the left side of the screen. The recommended bit designs, including images of the top three bit designs, and a complete data table listing all recommendations, computed KPIs, and overall scores, are provided on the main section of the screen.

Fig. 10. Drill bit recommendations table.

As with the other recommender UIs, additional visualizations are provided that allow the users to explore data on their own.

The KPIs include the median drilled footage, the median ROP, the run success rate, and the bit severe dull rate. The run success rate is computed from a Boolean feature based on the reason the drilling run ended: a run that ended for a positive reason, such as reaching total depth, is considered a success, while a run that ended for a negative reason, such as penetration rate, is considered a failure.

Likewise, the bit severe dull rate is computed from a Boolean feature based on the IADC dull grade of the bit. Figure 9 shows the web app UI. Figure 10 shows the recommendations table. Figure 11 shows an interactive scatter plot, where each data point represents a bit design (approximately 400 total). Median drilled footage is shown on the X-axis, and the median ROP is shown on the Y-axis. Data points are colored by the overall score from the recommendation engine. Additional details of the bit design are shown when the mouse hovers over the data point.
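The four KPIs and a weighted overall score can be sketched in Python as follows. The article does not describe the scoring formula, so the min-max normalization and weighted average used here are assumptions that stand in for the authors' statistical approach. The field names, the set of positive end-of-run reasons, and the sample runs are also hypothetical.

```python
from collections import defaultdict
from statistics import median

POSITIVE_REASONS = {"TD"}               # assumption: only total depth counts as success
LOWER_IS_BETTER = {"severe_dull_rate"}  # a lower severe dull rate is preferable


def compute_kpis(runs):
    """Aggregate per-run records into the four KPIs for each bit design."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["bit_design"]].append(run)
    kpis = {}
    for design, rows in grouped.items():
        n = len(rows)
        kpis[design] = {
            "median_footage": median(r["footage_ft"] for r in rows),
            "median_rop": median(r["rop_ft_per_hr"] for r in rows),
            "success_rate": sum(r["reason_pulled"] in POSITIVE_REASONS for r in rows) / n,
            "severe_dull_rate": sum(r["severe_dull"] for r in rows) / n,
        }
    return kpis


def score_designs(kpis, weights):
    """Rank designs by a weighted average of min-max normalized KPIs."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("KPI weights must sum to a positive number")
    scores = {design: 0.0 for design in kpis}
    for name, weight in weights.items():
        values = [k[name] for k in kpis.values()]
        lo, hi = min(values), max(values)
        for design, k in kpis.items():
            # Designs that tie on a KPI all receive a neutral 0.5.
            norm = 0.5 if hi == lo else (k[name] - lo) / (hi - lo)
            if name in LOWER_IS_BETTER:
                norm = 1.0 - norm
            scores[design] += weight * norm / total_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical runs for two bit designs, "A" and "B".
sample_runs = [
    {"bit_design": "A", "footage_ft": 5000, "rop_ft_per_hr": 100, "reason_pulled": "TD", "severe_dull": False},
    {"bit_design": "A", "footage_ft": 6000, "rop_ft_per_hr": 120, "reason_pulled": "TD", "severe_dull": False},
    {"bit_design": "B", "footage_ft": 3000, "rop_ft_per_hr": 150, "reason_pulled": "PR", "severe_dull": True},
    {"bit_design": "B", "footage_ft": 4000, "rop_ft_per_hr": 130, "reason_pulled": "TD", "severe_dull": False},
]
kpi_weights = {"median_footage": 1, "median_rop": 1, "success_rate": 2, "severe_dull_rate": 1}

kpis = compute_kpis(sample_runs)
ranking = score_designs(kpis, kpi_weights)
print(ranking)
```

In this toy case design B drills faster, but design A ranks first because the weights favor run success and footage. Changing the weights is the code analogue of adjusting the KPI importance settings in the web app sidebar.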

Fig. 11. Median performance of approximately 400 bit designs. The data points are colored by the recommendation engine overall score.


In addition to the drilling fluid recommendations reported previously, three new recommender systems have been developed to assist in selecting the BHA, the motor power section and the PDC drill bit design. These recommenders use ML algorithms to learn from global offset well data in finding the most similar drilling runs, or directly predict features of the drilling tools. A statistical approach is then used to score the selection options and rank them in terms of performance and the user’s context of which KPIs are most important for their decision.  

To power the new DSR, an extensive effort was made to gather, clean and prepare global operational data into a new database. This operational database includes the selection decisions and performance results of drill bits, motor power sections, RSS, BHA configurations, and drilling fluids. Furthermore, it is likely that this is the largest drilling operations database within the industry. A data engineering pipeline has been built to make the database usable for data-driven decision-making.

To put the data into the hands of the decision-makers, a web app was developed and deployed, using the Python Streamlit library. The new digital solution has been deployed to a global group of drilling engineers. The development team uses feedback collected through user acceptance testing sessions to rapidly iterate and improve the user experience. The app was developed using a modern, agile approach, including a continuous integration, continuous delivery (CI/CD) pipeline to automate deployment of new app changes.

While drilling engineers today have access to a vast amount of data and information, it often cannot be used in a practical and efficient way. The new solution places all of the previous drilling system technology selection choices and results into the hands of the drilling engineers, to empower them to make their best decisions. This effort shows how ML and innovative software deployment methods can, in fact, assist the human decision-making process and succeed in the goals of digital transformation. Proper drilling system selection is critical to achieve industry-wide goals of reduced operational time and increased consistency. Future goals of the ML-based drilling system recommender are joint and ultimately holistic system recommendations.  


This article contains excerpts from SPE paper 212559-MS, “Machine learning-based drilling system recommender: Towards optimal BHA and fluid technology selection,” presented at the SPE/IADC International Drilling Conference, Stavanger, Norway, March 7-9, 2023. 

About the Authors
Greg Skoff
Greg Skoff is a domain expert and data scientist with 12 years of experience at SLB, specializing in drill bits, drilling optimization and the well construction process. He has a rich background in mechanical engineering and is a graduate of University of Colorado, Boulder. Combined with his technical knowledge and expertise in data analytics/science, Mr. Skoff helps operators and SLB clients make informed decisions, based on insights gleaned from their data.
Fatma Mahfoudh
Cheolkyun Jeong
Sergey Makarychew-Mikhailov