
factpages-py

Python library for accessing Norwegian Offshore Directorate (Sodir) FactPages data.

Access comprehensive petroleum data from the Norwegian Continental Shelf including fields, discoveries, wellbores, facilities, licenses, production data, and more.


Installation

pip install factpages-py

Quick Start

from factpages_py import Factpages

# Initialize client with data directory
fp = Factpages(data_dir="./factpages_data")

# Download core datasets
fp.refresh()

# Access a field by name (case-insensitive)
troll = fp.field("troll")
print(troll.name)            # TROLL
print(troll.operator)        # Equinor Energy AS
print(troll.status)          # Producing
print(troll.id)              # 46437

# Or access by ID
troll = fp.field(46437)

Setup & Data Refresh

Setting Up the Database

from factpages_py import Factpages

# Initialize with custom data directory (default: ./factpages_data)
fp = Factpages(data_dir="./my_petroleum_data")

# The database is stored as SQLite at: ./my_petroleum_data/factpages.db

Refresh Methods

Basic Refresh (Maintenance Mode)

# Maintenance: fetches core entities if missing, then fixes mismatches + stale data
# Default: 10% limit on datasets to refresh, 25 days staleness threshold
fp.refresh()

# More aggressive maintenance (50% of datasets)
fp.refresh(limit_percent=50)

# Refresh specific datasets
fp.refresh('field')
fp.refresh(['field', 'discovery', 'wellbore'])

# Force re-download even if data exists
fp.refresh('field', force=True)

Maintenance mode priorities:

  1. Core entities: Downloads field, discovery, wellbore, facility, company, licence if missing
  2. Row count mismatches: High priority, always refreshed regardless of limit
  3. Stale datasets: Older than 25 days, refreshed up to the limit
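The selection logic for stale datasets can be pictured roughly like this. The 25-day threshold and percentage limit match the defaults stated above; the dataset timestamps, the oldest-first ordering, and the rounding of the limit are hypothetical illustrations, not the library's actual implementation:

```python
from datetime import datetime, timedelta

# Hypothetical last-sync timestamps per dataset (not real library state).
now = datetime(2024, 6, 1)
last_sync = {
    "field": now - timedelta(days=3),
    "discovery": now - timedelta(days=40),
    "wellbore": now - timedelta(days=30),
    "facility": now - timedelta(days=26),
}

STALE_AFTER = timedelta(days=25)
LIMIT_PERCENT = 50  # refresh at most 50% of datasets per run

# Oldest first, so the most out-of-date data is fixed within the limit.
stale = sorted(
    (name for name, ts in last_sync.items() if now - ts > STALE_AFTER),
    key=lambda name: last_sync[name],
)
limit = max(1, len(last_sync) * LIMIT_PERCENT // 100)
to_refresh = stale[:limit]
print(to_refresh)  # ['discovery', 'wellbore']
```

Here 'facility' is also stale but falls outside the 50% limit; a later maintenance run (or fp.fix(), which has no limit) would pick it up.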

Full Database Download

# Download ALL available datasets (75+ tables)
results = fp.refresh('all')
print(f"Downloaded {results['synced_count']} datasets")

Check API Statistics

# Get stats for all datasets (cached for 3 days, auto-refreshes if stale)
stats = fp.stats()
print(f"Total remote records: {stats['total_remote_records']:,}")
print(f"Missing datasets: {len(stats['missing'])}")
print(f"Changed datasets: {len(stats['changed'])}")

# Force refresh stats from API
stats = fp.stats(force_refresh=True)

Fix All Stale Data

# Download all stale and missing datasets (no limit)
results = fp.fix()
print(f"Fixed {results['synced_count']} datasets")

# Fix only stale (don't download new datasets)
results = fp.fix(include_missing=False)

Check Data Quality

report = fp.check_quality()
print(f"Health Score: {report['health_score']}%")
print(f"Fresh (<7d): {report['fresh_count']}")
print(f"Stale (>30d): {report['stale_count']}")
print(f"Missing: {report['missing_count']}")

Entity Access

The library provides 14 entity types with rich object-oriented access.

Entity Accessor Methods

Each entity type has an accessor with these methods:

# Get entity by name or ID
troll = fp.field("troll")           # By name (case-insensitive)
troll = fp.field(46437)             # By npdid

# Get random entity (no arguments)
random_field = fp.field()           # Returns a random field

# List all entity names
fp.field.list()                     # ['AASTA HANSTEEN', 'ALBUSKJELL', ...]

# List all entity IDs
fp.field.ids()                      # [43437, 43444, 43451, ...]

# Count entities
fp.field.count()                    # 141

# Get all as DataFrame
fp.field.all()                      # DataFrame of all fields

Fields

troll = fp.field("troll")

print(troll)                        # Formatted display
print(troll.name)                   # TROLL
print(troll.id)                     # 46437
print(troll.operator)               # Equinor Energy AS
print(troll.status)                 # Producing
print(troll.hc_type)                # OIL/GAS
print(troll.discovery_year)         # 1979

Discoveries

sverdrup = fp.discovery("johan sverdrup")

print(sverdrup.name)                       # 16/2-6 Johan Sverdrup
print(sverdrup.id)                         # 18387202
fp.discovery.count()                       # 638

Wellbores

well = fp.wellbore("31/2-1")

print(well.name)                  # 31/2-1
print(well.id)                    # 398
fp.wellbore.count()               # 9731

Facilities

platform = fp.facility("TROLL A")
print(platform)
print(platform.kind)
print(platform.phase)
print(platform.water_depth)

Pipelines

pipe = fp.pipeline("STATPIPE")
print(pipe)
print(pipe.medium)
print(pipe.from_facility)
print(pipe.to_facility)

Licences

licence = fp.licence("PL001")
print(licence)
print(licence.status)
print(licence.granted_date)

Companies

equinor = fp.company("equinor")
print(equinor)
print(equinor.short_name)
print(equinor.org_number)

Additional Entity Types

All entity types support the same accessor methods:

# Plays (geological)
fp.play("UPPER JURASSIC")         # By name
fp.play()                         # Random
fp.play.list()                    # All names
fp.play.count()                   # 71

# Blocks
fp.block("34/10")                 # By name
fp.block.count()                  # Number of blocks

# Quadrants
fp.quadrant("34")                 # By name

# Onshore facilities (TUF)
fp.tuf("KOLLSNES")                # By name

# Seismic surveys
fp.seismic("NPD-1901")            # By name

# Stratigraphy
fp.stratigraphy("DRAUPNE")        # By name

# Business arrangements
fp.business_arrangement("TROLL UNIT")

Exploring Data

Entity Display

Each entity has a formatted print() output:

troll = fp.field("troll")
print(troll)

Output:

FIELD: TROLL
============================================================
Status:     Producing              Area:      North Sea
HC Type:    OIL/GAS               Discovered: 1979
Operator:   Equinor Energy AS

Partners (PL054):
  Equinor Energy AS                    30.58%  (operator)
  Petoro AS                            56.00%
  ...

Explore: .reserves  .wells  .licensees  .operators

Related Tables

Access related data directly as DataFrames:

troll = fp.field("troll")

# Direct attribute access for related tables
troll.field_reserves        # DataFrame of reserves
troll.field_licensee_hst    # DataFrame of licensee history
troll.field_operator_hst    # DataFrame of operator history

# Generic related() method
troll.related('field_reserves')
troll.related('discovery')  # Related discoveries

Exploring Connections

troll = fp.field("troll")

# Get list of connected tables
connections = troll.connections
print(connections['incoming'])  # Tables that reference this field
# ['field_reserves', 'field_licensee_hst', 'field_operator_hst', ...]

print(connections['outgoing'])  # Base tables this field references
# ['company', 'wellbore']

# Get actual filtered data for all connections
full_conns = troll.full_connections
reserves_df = full_conns['incoming']['field_reserves']
operator_df = full_conns['outgoing']['company']
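The nested structure above can be summarized with plain Python. This sketch uses toy DataFrames shaped like full_connections; the table names match the examples above, but the rows and column names inside them are made up:

```python
import pandas as pd

# Toy stand-in for troll.full_connections: a nested dict of DataFrames.
full_conns = {
    "incoming": {
        "field_reserves": pd.DataFrame({"oil": [1.2, 0.8]}),
        "field_operator_hst": pd.DataFrame({"cmpLongName": ["Equinor Energy AS"]}),
    },
    "outgoing": {
        "company": pd.DataFrame({"cmpLongName": ["Equinor Energy AS"]}),
    },
}

# Count how many related rows each connected table holds.
summary = {
    direction: {table: len(df) for table, df in tables.items()}
    for direction, tables in full_conns.items()
}
print(summary)
# {'incoming': {'field_reserves': 2, 'field_operator_hst': 1},
#  'outgoing': {'company': 1}}
```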

Partners and Ownership

troll = fp.field("troll")

# Get partners with ownership info
partners = troll.partners
print(partners)
# Partners (5):
# ============================================================
# Company                                   Share %  Operator
# ------------------------------------------------------------
# Equinor Energy AS                           30.58  *
# Petoro AS                                   56.00
# ...

# Iterate partners (list of dicts)
for partner in partners:
    print(f"{partner['company']}: {partner['share']}%")
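Since partners iterates as a list of dicts, ordinary Python suffices for aggregation. A self-contained sketch on sample records; the 'company' and 'share' keys appear in the loop above, while the 'operator' key and the third record are assumptions:

```python
# Sample partner records shaped like the list of dicts above.
partners = [
    {"company": "Equinor Energy AS", "share": 30.58, "operator": True},
    {"company": "Petoro AS", "share": 56.00, "operator": False},
    {"company": "Others", "share": 13.42, "operator": False},
]

# Shares should sum to 100% for a complete ownership picture.
total = sum(p["share"] for p in partners)
operator = next(p["company"] for p in partners if p["operator"])

print(f"Total share: {total:.2f}%")  # Total share: 100.00%
print(f"Operator: {operator}")       # Operator: Equinor Energy AS
```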

Raw DataFrame Access

Direct Table Access

# Get any table as DataFrame
fields_df = fp.db.get('field')
discoveries_df = fp.db.get('discovery')
wellbores_df = fp.db.get('wellbore')

# Shorthand via fp.df() (auto-syncs if auto_sync=True)
fields_df = fp.df('field')

# Safe access (returns None if not exists)
df = fp.db.get_or_none('field_reserves')

Convenience Methods

# Get all fields
fields = fp.fields()

# Filter by status
producing = fp.fields(status='Producing')

# Get all discoveries
discoveries = fp.discoveries()

# Filter by year
discoveries_2023 = fp.discoveries(year=2023)

# Get all wellbores
wellbores = fp.wellbores()

List Available Tables

# Tables downloaded locally
fp.list_tables()
# ['block', 'company', 'discovery', 'facility', 'field', ...]

# Filter by prefix
fp.list_tables('field')
# ['field', 'field_reserves', 'field_licensee_hst', ...]

# Tables available on API
fp.api_tables()
# ['block', 'business_arrangement_area', 'company', ...]

# Filter API tables
fp.api_tables('wellbore')
# ['wellbore', 'wellbore_casing', 'wellbore_core', ...]

Download Without Storing

# Download data but don't store locally
df = fp.download('field')

# Download and store
df = fp.download('field', store=True)

# Download with filter
df = fp.download('wellbore', where="wlbStatus='COMPLETED'")

Database Status

Print Status

fp.status()

Output:

Database Status
================
Location: ./factpages_data/factpages.db
Tables: 53
Total records: 450,234

Top tables by size:
  wellbore_formation_top    125,432 records
  wellbore                   9,731 records
  licence_licensee_hst       8,234 records
  ...

Detailed Database Info

# List all tables
tables = fp.db.list_datasets()

# Get record count for a table
count = fp.db.get_record_count('wellbore')

# Check if table exists
exists = fp.db.has_dataset('field')

# Get last sync time
last_sync = fp.db.get_last_sync('field')

Graph Building

Export data for knowledge graph libraries:

Quick Export

from factpages_py import Factpages
import rusty_graph

fp = Factpages()
fp.refresh()

graph = rusty_graph.KnowledgeGraph()

# One-liner bulk loading
export = fp.graph.export_for_graph()
graph.add_nodes_bulk(export['nodes'])
graph.add_connections_from_source(export['connections'])

Step-by-Step Loading

# Get all nodes with column renaming
all_nodes = fp.graph.all_nodes(rename=True)
for node_type, df in all_nodes.items():
    print(f"{node_type}: {len(df)} nodes")

# Get connection specifications
connectors = fp.graph.all_connectors()
print(f"Found {len(connectors)} connection types")

# Load specific entity types
field_nodes = fp.graph.nodes('field', rename=True)
wellbore_nodes = fp.graph.nodes('wellbore', rename=True)

Configuration

Custom Client Configuration

from factpages_py import Factpages, ClientConfig

config = ClientConfig(
    timeout=60,                    # Request timeout (seconds)
    connect_timeout=10,            # Connection timeout
    max_retries=5,                 # Max retry attempts
    rate_limit=0.2,                # Min seconds between requests
    pool_connections=20,           # Connection pool size
)

fp = Factpages(config=config)

Auto-Refresh Mode

# Auto-download missing datasets when accessed
fp = Factpages(auto_sync=True)

# Now this will auto-download 'field' if not present
troll = fp.field("troll")

Available Datasets

The API provides access to 75+ datasets organized by category:

Category            Examples                                              Description
------------------  ----------------------------------------------------  ---------------------------------
Core Entities       field, discovery, wellbore, facility                  Main petroleum entities
Company             company                                               Operator and licensee information
Licensing           licence, licence_licensee_hst                         Licence data and history
Field Details       field_reserves, field_licensee_hst                    Field-specific tables
Discovery Details   discovery_reserves, discovery_operator_hst            Discovery-specific tables
Wellbore Details    wellbore_core, wellbore_dst, wellbore_formation_top   Well data
Facility Details    facility_function, pipeline                           Infrastructure
Seismic             seismic_acquisition                                   Survey data
Stratigraphy        strat_litho, strat_chrono                             Geological formations
Administrative      block, quadrant                                       Geographic divisions

# See all available datasets
fp.api_tables()

# Get dataset categories
from factpages_py import LAYERS, TABLES
print(f"Layers (with geometry): {len(LAYERS)}")
print(f"Tables (no geometry): {len(TABLES)}")

Examples

Find All Producing Fields

fp = Factpages()
fp.refresh('field')

producing = fp.fields(status='Producing')
print(f"Found {len(producing)} producing fields")

for _, row in producing.iterrows():
    print(f"  {row['fldName']}: {row['cmpLongName']}")

Analyze Wellbore Depths

import pandas as pd

fp = Factpages()
fp.refresh('wellbore')

wellbores = fp.wellbores()
print(f"Average depth: {wellbores['wlbTotalDepth'].mean():.0f}m")
print(f"Deepest wellbore: {wellbores['wlbTotalDepth'].max():.0f}m")
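The same aggregations can be tried on a small self-contained sample without downloading anything. wlbTotalDepth is the column used above; the wellbore-name column and all values here are made up:

```python
import pandas as pd

# A toy wellbore table (depths in metres, values invented for illustration).
wellbores = pd.DataFrame({
    "wlbWellboreName": ["31/2-1", "31/2-2", "31/2-3", "31/2-4"],
    "wlbTotalDepth": [1823.0, 2541.0, 3105.0, 2211.0],
})

print(f"Average depth: {wellbores['wlbTotalDepth'].mean():.0f}m")  # 2420m

# nlargest keeps the full rows, so names stay attached to depths.
deepest = wellbores.nlargest(2, "wlbTotalDepth")
print(deepest[["wlbWellboreName", "wlbTotalDepth"]])
```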

Export Field-Company Relationships

fp = Factpages()
fp.refresh(['field', 'field_licensee_hst', 'company'])

# Build relationship table
fields = fp.db.get('field')
licensees = fp.db.get('field_licensee_hst')
companies = fp.db.get('company')

# Merge for full picture
relationships = licensees.merge(
    fields[['fldNpdidField', 'fldName']],
    on='fldNpdidField'
).merge(
    companies[['cmpNpdidCompany', 'cmpLongName']],
    on='cmpNpdidCompany'
)

print(relationships[['fldName', 'cmpLongName', 'fldLicenseeDateValidFrom']])
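One pitfall when joining history tables like this: the default inner merge silently drops licensee rows whose field ID has no match, e.g. when a table has not been downloaded yet. A self-contained sketch with made-up rows shows how pandas' indicator flag surfaces such orphans:

```python
import pandas as pd

# Made-up rows: one licensee row references a field id (99999)
# that is missing from the field table.
fields = pd.DataFrame({"fldNpdidField": [46437], "fldName": ["TROLL"]})
licensees = pd.DataFrame({"fldNpdidField": [46437, 99999], "cmpNpdidCompany": [1, 2]})

# A left merge keeps every licensee row; _merge marks the unmatched ones.
merged = licensees.merge(fields, on="fldNpdidField", how="left", indicator=True)
orphans = merged[merged["_merge"] == "left_only"]
print(len(orphans))  # 1 licensee row with no matching field
```

Checking for orphans before switching back to an inner merge makes the relationship table's row count trustworthy.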

License

MIT License

Acknowledgments

Data provided by the Norwegian Offshore Directorate (Sodir).
