Machine Learning for Geothermal Resource Assessment in Oil & Gas Basins

By Henry Green on June 1, 2026


The oil and gas industry has spent decades drilling, logging, and characterizing sedimentary basins across the United States and internationally, generating an extraordinary volume of subsurface data that describes rock formation temperatures, reservoir permeability, fluid chemistry, geothermal gradients, and stratigraphic architecture. That same data is now the primary input for machine learning models that can identify commercially viable geothermal resources within basins that operators originally drilled for hydrocarbons. Machine learning is changing how exploration teams evaluate dual-use subsurface potential in oil and gas basins: it converts legacy well logs, bottom-hole temperature records, and formation evaluation data into geothermal prospectivity maps that would otherwise take years of dedicated exploration campaigns to produce. This guide covers the technical foundation of ML-driven geothermal assessment, how oil and gas subsurface datasets feed those models, and how iFactory AI's industrial intelligence platform helps energy operators manage the operational transition from hydrocarbon extraction to hybrid energy production.

Machine Learning · Geothermal Assessment · Oil & Gas Basin Analytics · Energy Transition
Unlock Geothermal Potential Hidden in Your Existing Well Data
iFactory AI's ML platform converts legacy well logs, bottom-hole temperature records, and formation data into geothermal prospectivity maps — without new drilling or field campaigns.

Why Oil & Gas Basins Are the Best Starting Point for Geothermal Exploration

Hot sedimentary basins throughout the United States represent one of the most underutilized geothermal resources on earth — not because the resource is poorly understood, but because the economic pathway to development has historically required dedicated exploration investment that competing hydrocarbon projects always outranked. What has changed is the data infrastructure. Because of extensive oil and gas development in many of these systems, their reservoir properties are often well characterized at depth, making them substantially lower-risk targets for geothermal development than greenfield exploration in volcanic or tectonic settings. According to NREL's 2019 GeoVision study, hot sedimentary basins in the U.S. alone could provide as much as 7.5 million gigawatt-hours of heat — a resource that sits beneath existing oil and gas infrastructure in basins operators already know.

The Williston Basin, Permian Basin, Anadarko Basin, and Appalachian Basin all contain formation temperatures and reservoir permeabilities in deeper intervals that make geothermal direct-use and power generation technically viable. The question is not whether the resource exists. It is where within those basins the temperature-permeability combination justifies development, and how accurately machine learning models can predict that combination from existing well data without additional drilling.

7.5M: gigawatt-hours of geothermal heat potential identified in U.S. hot sedimentary basins (GeoVision study, NREL)
26.7%: share of evaluated oil & gas wells in Wyoming found suitable for geothermal conversion at temperatures above 150°F at 6,000 ft depth
1.24%: average error rate of physics-informed neural networks predicting geothermal temperature profiles in repurposed oil & gas wells
XGBoost: top-performing ML algorithm for subsurface temperature prediction, outperforming physics-based models with 7.3°C mean absolute error vs. 8.76°C

Machine Learning Techniques Applied to Geothermal Resource Assessment

Geothermal resource assessment in oil and gas basins is fundamentally a prediction problem: given a dataset of formation temperatures, depths, rock properties, heat flow measurements, and geological structure, predict the temperature and permeability at any point in the basin where a well could be drilled. Machine learning is particularly well-suited to this problem because the input variables interact nonlinearly, the training data is abundant in mature basins, and the prediction task can be validated against held-out well records before any capital is committed to new drilling.

Gradient Boosting (XGBoost / LightGBM)
The most widely validated ML approach for subsurface temperature prediction from well log features. XGBoost outperforms physics-based thermal gradient models in accuracy when trained on sufficient formation data, achieving mean absolute errors of 7.3°C versus 8.76°C for conventional models. Particularly effective for basin-scale prospectivity mapping where spatial interpolation between well control points is the primary task.
PRIMARY METHOD
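The boosting idea behind XGBoost can be sketched without the library. What follows is a from-scratch illustration, not XGBoost itself: it repeatedly fits depth-1 regression trees (stumps) to the residuals of a squared-error loss, using depth as the only feature. The well data, the learning rate, the round count, and every function name are illustrative assumptions rather than values from the studies quoted above.

```python
# Minimal gradient boosting with regression stumps (squared-error loss).
# Illustrative only: synthetic data, hypothetical names, one feature (depth).

def fit_stump(x, residuals):
    """Find the split on x that minimizes squared error of the residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # the largest value cannot split
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_predict(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_boosted(x, y, n_rounds=200, learning_rate=0.1):
    """Start from the mean, then add shrunken stumps fitted to residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [
            p + learning_rate * stump_predict(stump, xi)
            for p, xi in zip(preds, x)
        ]
    return base, stumps, learning_rate


def boosted_predict(model, xi):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, xi) for s in stumps)


# Synthetic depth (km) and corrected temperature (deg C) pairs.
depth_km = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
temp_c = [38.0, 52.0, 70.0, 81.0, 99.0, 112.0, 131.0, 140.0]

model = fit_boosted(depth_km, temp_c)
train_mae = sum(
    abs(boosted_predict(model, d) - t) for d, t in zip(depth_km, temp_c)
) / len(temp_c)
```

The training error here says nothing about generalization; a production workflow would use a maintained library and score the model on wells it has not seen.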
LSTM Neural Networks for Time-Series Well Data
Long Short-Term Memory networks are applied to time-series well production and temperature data — particularly for geothermal fields where temperature measurements have been collected over extended production histories. LSTM models built on geothermal water temperature records from production wells deliver highly accurate predictions of reservoir thermal response under different extraction scenarios.
PRIMARY METHOD
Physics-Informed Neural Networks (PINNs)
PINNs embed physical laws — heat conduction equations, fluid flow dynamics, pressure-temperature relationships — directly into the neural network training loss function. For abandoned oil and gas well conversion analysis, PINNs achieve 1.24% average prediction error relative to finite-difference numerical benchmarks while operating with minimal pre-existing measurement data, making them practical for wells where historical temperature logs are incomplete.
EMERGING METHOD
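One way to write down what "embedding physical laws in the loss function" means is shown below. This is a sketch under a strong assumption: one-dimensional, steady-state heat conduction with a depth-dependent conductivity k(z) and a heat-production term A(z). The studies referenced above may use different or additional governing equations (fluid flow, pressure-temperature coupling), and the symbols here are illustrative choices.

```latex
\mathcal{L}(\theta)
  = \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d}
      \bigl(T_\theta(z_i) - T_i^{\mathrm{obs}}\bigr)^2}_{\text{data misfit}}
  + \lambda\,
    \underbrace{\frac{1}{N_c}\sum_{j=1}^{N_c}
      \left(\frac{d}{dz}\!\left(k(z)\,\frac{dT_\theta}{dz}\right)\bigg|_{z=z_j}
      + A(z_j)\right)^{2}}_{\text{physics residual}}
```

Here T_θ is the network's temperature prediction, the z_i are depths with measured temperatures, the z_j are collocation depths where no measurement is needed, and λ is a tunable weight. The second term is why such a model can be trained with few temperature logs: it penalizes profiles that violate the conduction equation even where no data exists.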
Clustering for Hydraulic Flow Unit Delineation
K-means, DBSCAN, and hierarchical clustering algorithms delineate hydraulic flow units within sedimentary formations — the fundamental reservoir quality metric for geothermal prospecting. Applied to the Williston Basin, clustering-enhanced ML models significantly improved reservoir permeability estimation accuracy versus single-algorithm approaches, enabling precise reservoir quality index mapping across formations not previously evaluated for geothermal potential.
SUPPORTING METHOD
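A minimal sketch of flow-unit delineation, assuming the conventional reservoir quality index (RQI) and flow zone indicator (FZI) definitions from core permeability and porosity, followed by plain one-dimensional k-means on log10(FZI). The sample values, the choice of three clusters, and the function names are hypothetical; DBSCAN or hierarchical clustering would replace only the second function.

```python
import math


def flow_zone_indicator(perm_md, porosity):
    """FZI from permeability (mD) and effective porosity (fraction)."""
    rqi = 0.0314 * math.sqrt(perm_md / porosity)  # reservoir quality index, microns
    phi_z = porosity / (1.0 - porosity)           # pore-to-grain volume ratio
    return rqi / phi_z


def kmeans_1d(values, k, n_iter=50):
    """Plain 1-D k-means with deterministic, evenly spaced initial centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each value to its nearest center.
        labels = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Hypothetical samples: (permeability in mD, porosity as a fraction).
samples = [
    (0.5, 0.08), (0.8, 0.09), (1.2, 0.10),
    (45.0, 0.15), (60.0, 0.16),
    (900.0, 0.22), (1200.0, 0.24),
]
log_fzi = [math.log10(flow_zone_indicator(k_md, phi)) for k_md, phi in samples]
labels, centers = kmeans_1d(log_fzi, k=3)
```

Each resulting label stands for one hydraulic flow unit; a separate permeability model can then be fitted per unit instead of one model for the whole formation.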
Bayesian Neural Networks for Uncertainty Quantification
Bayesian approaches to geothermal ML provide not just a resource potential prediction but a probability distribution over that prediction, quantifying the confidence interval around temperature and permeability estimates at any location. The approach has been validated in the Great Basin region, which hosts 1,200 MWe of installed capacity, and it gives exploration teams the uncertainty maps needed to prioritize drilling candidates by risk-adjusted resource potential.
RISK MANAGEMENT
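The shape of the output (a prediction plus an interval) can be illustrated with something far simpler than a Bayesian neural network. The sketch below swaps in a conjugate-normal Bayesian linear model for the geothermal gradient, T = T_surface + g * z, with a known measurement noise. The prior, the noise level, the surface temperature, and the well data are all assumptions made for illustration.

```python
import math


def posterior_gradient(depths_km, temps_c, surface_temp_c,
                       noise_sd, prior_mean, prior_sd):
    """Normal posterior (mean, sd) for gradient g in T = T_surface + g * z."""
    precision = 1.0 / prior_sd ** 2
    precision += sum(z * z for z in depths_km) / noise_sd ** 2
    weighted = prior_mean / prior_sd ** 2
    weighted += sum(
        z * (t - surface_temp_c) for z, t in zip(depths_km, temps_c)
    ) / noise_sd ** 2
    return weighted / precision, math.sqrt(1.0 / precision)


def predictive_interval(z_km, surface_temp_c, g_mean, g_sd,
                        noise_sd, z_score=1.96):
    """Approximate 95% interval for a new temperature reading at depth z_km."""
    mean = surface_temp_c + g_mean * z_km
    sd = math.sqrt((z_km * g_sd) ** 2 + noise_sd ** 2)
    return mean - z_score * sd, mean, mean + z_score * sd


# Hypothetical corrected temperatures (deg C) at four depths (km).
depths_km = [1.8, 2.4, 3.1, 3.6]
temps_c = [66.0, 85.0, 101.0, 121.0]

g_mean, g_sd = posterior_gradient(
    depths_km, temps_c, surface_temp_c=10.0,
    noise_sd=5.0, prior_mean=30.0, prior_sd=10.0,
)
low, mid, high = predictive_interval(4.0, 10.0, g_mean, g_sd, noise_sd=5.0)
```

The interval widens with depth because uncertainty in the gradient is multiplied by z, which is the behavior an uncertainty map is meant to expose when ranking locations far from well control.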
Random Forest for Multi-Variable Feature Importance
Random forest models applied alongside XGBoost provide interpretable feature importance rankings — identifying which well log variables (bottom-hole temperature, formation lithology, porosity, thermal conductivity) most strongly predict geothermal potential at a given depth in a specific basin. This interpretability makes random forest a key tool for presenting resource assessment findings to engineering teams and investment committees who need to understand model drivers, not just model outputs.
INTERPRETABILITY

The Oil & Gas Data Advantage: What Legacy Well Records Contain for Geothermal ML

The competitive advantage that oil and gas operators hold in geothermal resource assessment is data density. A mature production basin in the Permian, Williston, or Anadarko may have thousands of wells with detailed subsurface logs, bottom-hole temperature measurements, formation evaluation reports, and production histories spanning decades. Converting that legacy asset into geothermal prospectivity intelligence requires a structured data pipeline that extracts the relevant variables, standardizes measurement units and depth references, and feeds them into ML models calibrated for thermal resource prediction rather than hydrocarbon reservoir characterization.

Data type: Bottom-Hole Temperature (BHT)
Source in O&G well records: Drilling completion reports, mud log headers, wireline log headers
ML application in geothermal assessment: Primary training label for subsurface temperature prediction models; baseline for geothermal gradient mapping
Quality consideration: BHT measurements are thermally disturbed during drilling and require equilibration correction before use as ML training data

Data type: Wireline Formation Logs
Source in O&G well records: Gamma ray, resistivity, neutron-density, sonic velocity logs
ML application in geothermal assessment: Feature inputs for lithology classification, porosity prediction, and thermal conductivity estimation by formation
Quality consideration: Log suite completeness varies by well age; pre-1990 wells often lack full suites required for multi-variable models

Data type: Formation Water Chemistry
Source in O&G well records: DST reports, produced water analyses, completion records
ML application in geothermal assessment: Geochemical thermometer cross-validation; reservoir fluid temperature estimation for direct-use applications
Quality consideration: Sampling depth and contamination history must be verified; mixed-zone sampling produces misleading temperature estimates

Data type: Well Depth & Casing Records
Source in O&G well records: Completion diagrams, state regulatory filings, API records
ML application in geothermal assessment: Candidate well screening for conversion feasibility; casing diameter governs geothermal fluid extraction rates
Quality consideration: Regulatory filings are authoritative but may predate well modifications; field verification required before conversion commitment

Data type: Production Fluid Volumes
Source in O&G well records: Monthly production reports, state regulatory databases
ML application in geothermal assessment: Flow rate potential estimation; co-production screening for simultaneous oil-heat extraction models
Quality consideration: Wells producing over 10,000 bpd combined fluid are prime candidates for on-site electrical generation via geothermal

Data type: Seismic Surveys
Source in O&G well records: 2D/3D seismic datasets, velocity models, interpreted horizons
ML application in geothermal assessment: Structural mapping of heat-trapping formations; fault identification for enhanced geothermal system siting
Quality consideration: Seismic resolution at geothermal target depths (3–6 km) varies with survey vintage and processing methodology

Geothermal ML Assessment Workflow: From Basin Data to Drilling Decision

The operational workflow that converts existing oil and gas basin data into a geothermal drilling decision involves five distinct analytical stages, each dependent on the output of the previous. Understanding this workflow is important for energy operators evaluating whether their existing data infrastructure is sufficient to support a machine learning geothermal assessment without additional field data collection — which is the case for most mature U.S. basins with comprehensive well records.

01
Data Aggregation and Standardization
Well records sourced from state regulatory databases, operator data rooms, and national datasets (USGS, NREL GDR). Bottom-hole temperature measurements corrected for drilling thermal disturbance using equilibration models. Formation log data standardized for depth reference and unit consistency across wells drilled under different regulatory regimes and operator conventions. Data quality flags applied to records with incomplete suites or suspect measurement provenance.
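The BHT correction in this stage can be sketched with a Horner-plot fit. This is one common equilibration method and an assumption here, since the text does not name a specific model. It needs at least two BHT readings taken at different shut-in times in the same well; the readings, circulation time, and function name below are synthetic and hypothetical.

```python
import math


def horner_equilibrium_temp(circulation_hours, shut_in_hours, bht_c):
    """Least-squares Horner fit: BHT = T_eq + m * ln((tc + dt) / dt).

    Returns (T_eq, m). The intercept is the equilibrium temperature
    because the log term tends to zero as shut-in time grows.
    """
    if len(shut_in_hours) < 2:
        raise ValueError("Horner correction needs at least two BHT readings")
    x = [math.log((circulation_hours + dt) / dt) for dt in shut_in_hours]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(bht_c) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, bht_c))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Synthetic readings generated from a known answer, so the fit can be checked.
true_t_eq_c, true_slope, circulation_hours = 120.0, -15.0, 4.0
shut_in_hours = [6.0, 12.0, 18.0]
readings_c = [
    true_t_eq_c + true_slope * math.log((circulation_hours + dt) / dt)
    for dt in shut_in_hours
]

t_eq_c, slope = horner_equilibrium_temp(
    circulation_hours, shut_in_hours, readings_c
)
```

Wells with a single BHT reading cannot be corrected this way and would need an empirical depth-based correction instead, which is one reason for the data quality flags mentioned above.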
02
Feature Engineering for Geothermal ML Models
Formation-level thermal conductivity estimated from lithology classification and porosity logs. Geothermal gradient calculated per well from corrected BHT records and depth data. Hydraulic flow unit classification applied using clustering algorithms to delineate permeability-porosity relationships by formation. Spatial features — proximity to known heat anomalies, fault systems, volcanic centers — computed for each well location as additional model inputs.
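Two of the features in this stage reduce to short formulas. The sketch below converts field units, computes a per-well geothermal gradient, estimates bulk thermal conductivity with a geometric-mean mixing model, and combines the two into a conductive heat flow via Fourier's law. The mixing model is one common choice and an assumption here; the surface temperature, the matrix conductivity, and the porosity are hypothetical inputs, while the 150°F at 6,000 ft example reuses the figure quoted earlier in the article.

```python
def fahrenheit_to_celsius(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0


def feet_to_km(depth_ft):
    return depth_ft * 0.3048 / 1000.0


def geothermal_gradient(bht_c, surface_c, depth_km):
    """Average gradient in deg C per km between surface and measurement depth."""
    return (bht_c - surface_c) / depth_km


def bulk_conductivity(matrix_k, porosity, fluid_k=0.6):
    """Geometric-mean mixing of matrix and pore-fluid conductivity, W/(m K).

    fluid_k = 0.6 approximates water; matrix_k depends on lithology.
    """
    return matrix_k ** (1.0 - porosity) * fluid_k ** porosity


def heat_flow_mw_m2(conductivity, gradient_c_per_km):
    """Fourier's law: W/(m K) times deg C/km gives mW/m^2 directly."""
    return conductivity * gradient_c_per_km


# Corrected BHT of 150 deg F at 6,000 ft; surface temperature is assumed.
bht_c = fahrenheit_to_celsius(150.0)
depth_km = feet_to_km(6000.0)
gradient = geothermal_gradient(bht_c, surface_c=10.0, depth_km=depth_km)

# Hypothetical sandstone-like matrix conductivity and log-derived porosity.
bulk_k = bulk_conductivity(matrix_k=3.5, porosity=0.15)
heat_flow = heat_flow_mw_m2(bulk_k, gradient)
```

Standardizing units before computing the gradient matters because records from different operators mix Fahrenheit with Celsius and feet with metres.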
03
Model Training, Validation, and Uncertainty Quantification
XGBoost, random forest, and LSTM models trained on the prepared well dataset with temperature measurements as target labels and formation properties as input features. Models validated against held-out wells not used in training to assess generalization accuracy. Bayesian uncertainty quantification applied to produce confidence intervals on temperature predictions at non-well locations. Model ensemble outputs combined for final prospectivity mapping.
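Validation against held-out wells only means something if whole wells are held out, so that no well contributes records to both sides of the split. The sketch below shows that split together with a linear-gradient baseline and a mean absolute error score. The records are synthetic, the field names are hypothetical, and the baseline stands in for whichever model is being evaluated.

```python
import random


def split_by_well(records, holdout_fraction=0.25, seed=0):
    """Hold out whole wells, never individual records."""
    well_ids = sorted({r["well_id"] for r in records})
    random.Random(seed).shuffle(well_ids)
    n_holdout = max(1, round(len(well_ids) * holdout_fraction))
    holdout = set(well_ids[:n_holdout])
    train = [r for r in records if r["well_id"] not in holdout]
    test = [r for r in records if r["well_id"] in holdout]
    return train, test


def fit_gradient_baseline(records):
    """Ordinary least squares of temperature on depth: (intercept, slope)."""
    n = len(records)
    z_mean = sum(r["depth_km"] for r in records) / n
    t_mean = sum(r["temp_c"] for r in records) / n
    szz = sum((r["depth_km"] - z_mean) ** 2 for r in records)
    szt = sum(
        (r["depth_km"] - z_mean) * (r["temp_c"] - t_mean) for r in records
    )
    slope = szt / szz
    return t_mean - slope * z_mean, slope


def mean_absolute_error(model, records):
    intercept, slope = model
    errors = [
        abs(intercept + slope * r["depth_km"] - r["temp_c"]) for r in records
    ]
    return sum(errors) / len(errors)


# Synthetic dataset: 8 wells, 2 temperature records each.
records = []
for i in range(8):
    offset_c = ((i % 3) - 1) * 2.0  # per-well deviation from the trend
    for depth_km in (2.0 + 0.2 * i, 3.0 + 0.2 * i):
        records.append({
            "well_id": f"W{i}",
            "depth_km": depth_km,
            "temp_c": 12.0 + 29.0 * depth_km + offset_c,
        })

train, test = split_by_well(records)
baseline = fit_gradient_baseline(train)
holdout_mae = mean_absolute_error(baseline, test)
```

Any ML model can be scored with the same `mean_absolute_error` call on the same `test` set, which gives a like-for-like comparison against the gradient baseline.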
04
2D / 3D Prospectivity Mapping and Candidate Well Screening
ML model predictions extended spatially to generate 2D constant-temperature maps at target depths across the basin. Reservoir quality index computed for each mapped location as a combined function of predicted temperature and permeability. Existing well infrastructure screened against temperature, casing diameter, and fluid volume criteria to identify conversion candidates — prioritizing wells where temperatures above 150°F at accessible depths overlap with casings of 4 inches or greater diameter.
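The candidate screen can be expressed as a few threshold checks. The thresholds below are the ones quoted in this article (above 150°F for direct use, above 275°F with more than 10,000 bpd for power generation, casing of at least 4 inches); applying the casing check to both tiers is an assumption, and the wells and field names are hypothetical.

```python
from dataclasses import dataclass

DIRECT_USE_MIN_F = 150.0   # temperature must exceed this for direct use
POWER_MIN_F = 275.0        # temperature must exceed this for power generation
POWER_MIN_BPD = 10_000.0   # combined fluid rate must exceed this for power
MIN_CASING_IN = 4.0        # casing diameter must be at least this


@dataclass
class Well:
    name: str
    temp_f: float
    casing_in: float
    fluid_bpd: float


def classify(well):
    """Return the conversion tier a well qualifies for."""
    if well.casing_in < MIN_CASING_IN or well.temp_f <= DIRECT_USE_MIN_F:
        return "not a candidate"
    if well.temp_f > POWER_MIN_F and well.fluid_bpd > POWER_MIN_BPD:
        return "power generation"
    return "direct use"


# Hypothetical wells for illustration.
wells = [
    Well("A", temp_f=160.0, casing_in=5.5, fluid_bpd=2_000.0),
    Well("B", temp_f=290.0, casing_in=7.0, fluid_bpd=12_000.0),
    Well("C", temp_f=300.0, casing_in=3.5, fluid_bpd=15_000.0),
    Well("D", temp_f=140.0, casing_in=7.0, fluid_bpd=500.0),
]
tiers = {w.name: classify(w) for w in wells}
```

In practice `temp_f` would be the model's predicted temperature at the accessible depth, so the uncertainty on that prediction carries straight through to the tier assignment.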
05
Economic Screening and Development Prioritization
Candidate wells and new drill locations ranked by estimated resource temperature, flow capacity, proximity to heat demand centers, and development cost. Techno-economic analysis computed for direct-use and power generation scenarios at each priority location. Output is a ranked development portfolio with uncertainty-adjusted resource estimates, capital cost ranges, and development sequencing recommendations — the deliverable that connects the ML assessment to investment decision-making.
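A first-pass input to the techno-economic analysis is the thermal power a well can deliver, which follows from mass flow, specific heat, and the temperature drop across the surface equipment. The sketch below assumes fresh-water density and specific heat (produced brines differ) and a hypothetical 70°C outlet temperature; the flow rate and inlet temperature reuse the 10,000 bpd and 135°C figures quoted in this article.

```python
BBL_TO_M3 = 0.158987       # one oil barrel in cubic metres
SECONDS_PER_DAY = 86_400.0


def thermal_power_mw(flow_bpd, inlet_c, outlet_c,
                     density=1000.0, specific_heat=4186.0):
    """Thermal power in MW from flow rate and temperature drop.

    density in kg/m^3 and specific_heat in J/(kg K) default to
    fresh-water values, which is an approximation for produced brine.
    """
    volumetric_m3_s = flow_bpd * BBL_TO_M3 / SECONDS_PER_DAY
    mass_flow_kg_s = volumetric_m3_s * density
    return mass_flow_kg_s * specific_heat * (inlet_c - outlet_c) / 1e6


power_mw = thermal_power_mw(flow_bpd=10_000.0, inlet_c=135.0, outlet_c=70.0)
```

Under these assumptions the result is roughly 5 MW of heat. Electrical output from a binary cycle plant would be a fraction of that, set by a conversion efficiency that this sketch does not model.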
Apply Machine Learning to Your Basin's Geothermal Resource Potential
iFactory AI connects your existing subsurface data, production records, and well logs into a unified ML assessment pipeline — delivering geothermal prospectivity maps, candidate well rankings, and development economics from data you already own.

iFactory AI's Role in Geothermal-Oil & Gas Integration Operations

The geothermal resource assessment phase that machine learning enables is only the first operational challenge for oil and gas operators pursuing energy diversification. Once a geothermal development decision is made and production begins, the operational management challenge shifts to running a hybrid energy facility: managing co-produced fluids, monitoring wellbore thermal performance, scheduling maintenance across both hydrocarbon and geothermal assets, and tracking the carbon abatement contribution of geothermal generation against scope 1 emissions baselines.

iFactory AI Capabilities for Geothermal-Oil & Gas Hybrid Operations
Real-time wellbore temperature and flow monitoring for co-producing oil and geothermal wells — detecting thermal decline trends before they affect production economics
Predictive maintenance for geothermal surface equipment — pumps, heat exchangers, binary cycle turbines — with condition-based alerts replacing calendar-interval replacement schedules
Digital twin modeling of converted oil and gas wells operating in geothermal mode — simulating thermal depletion rates, reinjection scheduling, and reservoir pressure management
Automated carbon intensity tracking per barrel equivalent produced — attributing scope 1 emission reductions from geothermal displacement of gas-fired generation to corporate sustainability reporting
Integrated production scheduling for facilities running simultaneous hydrocarbon extraction and geothermal heat or power generation — shared infrastructure conflict detection and optimization
Subsurface data historian integration — connecting DCS, SCADA, and formation monitoring systems into a unified data layer that feeds both ML resource models and operational dashboards
ML-First
Assessment Approach
Geothermal prospectivity from existing well data — no new drilling required for initial screening
10,000+
Wells Screened
Permian Basin shut-in wells evaluated for geothermal conversion using ML temperature and flow criteria
Digital Twin
Operational Model
Reservoir thermal depletion and reinjection scheduling simulated before production commitment
Scope 1
Carbon Reporting
Automated per-source emissions attribution for hybrid oil-geothermal production facilities

Expert Review: What ML-Driven Geothermal Assessment Changes for O&G Operators

The conventional objection to geothermal development in oil and gas basins has always been resource uncertainty — nobody wanted to drill a dedicated geothermal well into a sedimentary formation when the temperature and flow rate combination was genuinely unknown. Machine learning has fundamentally changed that risk profile. We ran an ML assessment of the Williston Basin using nothing but existing well logs and bottom-hole temperature records from the state regulatory database — no new field work, no new seismic acquisition. The XGBoost model predicted formation temperatures across the basin at 3 km and 5 km depths with better accuracy than the thermal gradient maps we had been using for years. It identified a cluster of formation intervals in the western basin where predicted temperatures consistently exceeded 130°C at drillable depths, with porosity values in the range required for direct geothermal extraction. That analysis cost a fraction of what a single exploration well would have cost, and it gave us a ranked list of conversion candidates from existing infrastructure that we could take to an investment committee with quantified uncertainty bounds on the resource estimate.
Senior Reservoir Engineer
Geothermal Energy Transition Practice, Integrated Oil & Gas — Williston and Anadarko Basin Operations, 19 Years
The thing that most operators miss when they first evaluate ML for geothermal assessment is that the data quality problem is actually smaller than it looks. Yes, bottom-hole temperatures in old completion reports are thermally disturbed and require correction. Yes, log suites from pre-1990 wells are often incomplete. But the correction algorithms are mature, and XGBoost and random forest models are robust to missing features in ways that classical geostatistical approaches are not. In the Tularosa Basin study we published, the physics-informed ML approach worked with the data that existed — it did not require us to go back and remeasure. The model computed aquifer temperatures, viscous heat flux, and advective heat flux from existing data inputs and identified geothermally active locations that no prior assessment had flagged. That is the practical value of ML for operators who are sitting on decades of legacy well data and asking whether it is sufficient to make a geothermal development decision. In most mature basins, it is.
Geothermal ML Research Lead
Energy Geoscience and Machine Learning — National Laboratory Geothermal Research, 14 Years, AAPG Certified

Frequently Asked Questions

A minimum of 50–100 wells with corrected bottom-hole temperatures and at least partial wireline log suites is sufficient to train baseline XGBoost or random forest models; accuracy improves significantly with datasets exceeding 500 wells in basins with dense historical drilling programs.
XGBoost models trained on well log data achieve mean absolute errors of 7.3°C versus 8.76°C for conventional physics-based thermal models — a measurable accuracy improvement that compresses the resource uncertainty range used in economic screening.
Yes — iFactory AI connects to Petrel, Kingdom, PPDM-standard databases, state regulatory data feeds, and production historian systems, enabling geothermal ML model inputs to be populated directly from existing data infrastructure without manual data extraction.
Wells producing fluids above 150°F (66°C) are viable for direct-use applications; temperatures above 275°F (135°C) combined with flow rates exceeding 10,000 bpd support on-site electrical generation using binary cycle turbine technology.
iFactory AI tracks per-source energy generation in real time and automatically attributes scope 1 emission reductions from geothermal displacement of gas-fired generation, producing audit-ready carbon intensity reports per barrel of oil equivalent produced.

Conclusion: The Geothermal Resource Was Already There — ML Makes It Findable

The geothermal resource potential of U.S. oil and gas basins is not a future opportunity. It is a present one, sitting underneath existing infrastructure and already partially characterized by decades of drilling and formation evaluation. What machine learning provides is the analytical capability to convert that partially characterized subsurface knowledge into actionable geothermal prospectivity intelligence, at a cost and on a timeline that were not achievable with conventional geostatistical approaches. XGBoost models predict subsurface temperatures more accurately than physics-based gradient models. Bayesian neural networks provide uncertainty-quantified resource maps that give investment committees the confidence intervals they require. Physics-informed neural networks evaluate abandoned well conversion potential from existing completion records without a single new measurement.

For oil and gas operators navigating the energy transition under regulatory pressure, investor sustainability mandates, and carbon price exposure, geothermal development from existing basin infrastructure represents the most capital-efficient pathway to meaningful scope 1 emission reduction. The data to evaluate that opportunity is already in production data rooms and state regulatory databases. Machine learning is the tool that extracts the geothermal signal from it — and iFactory AI is the operational intelligence platform that manages the resulting hybrid energy assets from assessment through production.

Turn Your Basin's Well Data Into a Geothermal Development Roadmap
iFactory AI's machine learning platform connects your existing subsurface records, well logs, and production data into a geothermal prospectivity assessment — delivering candidate well rankings, resource temperature maps, and development economics without new field data collection.
ML Subsurface Assessment
Existing Data Compatible
Digital Twin Operations
Carbon Reporting Included
Predictive Maintenance
