Python library for Norwegian Petroleum Factpages data
Project description
factpages-py
Python library for accessing Norwegian Petroleum Directorate (Sodir) FactPages data.
Access comprehensive petroleum data from the Norwegian Continental Shelf including fields, discoveries, wellbores, facilities, licenses, production data, and more.
Table of Contents
- Installation
- Quick Start
- Setup & Data Refresh
- Entity Access
- Exploring Data
- Raw DataFrame Access
- Database Status
- Graph Building
- Configuration
- Available Datasets
- Examples
Installation
pip install factpages-py
Quick Start
from factpages_py import Factpages
# Initialize client with data directory
fp = Factpages(data_dir="./factpages_data")
# Download core datasets
fp.refresh()
# Access a field by name (case-insensitive)
troll = fp.field("troll")
print(troll.name) # TROLL
print(troll.operator) # Equinor Energy AS
print(troll.status) # Producing
print(troll.id) # 46437
# Or access by ID
troll = fp.field(46437)
Setup & Data Refresh
Setting Up the Database
from factpages_py import Factpages
# Initialize with custom data directory (default: ./factpages_data)
fp = Factpages(data_dir="./my_petroleum_data")
# The database is stored as SQLite at: ./my_petroleum_data/factpages.db
Refresh Methods
Basic Refresh (Maintenance Mode)
# Maintenance: fetches core entities if missing, then fixes mismatches + stale data
# Default: 10% limit on datasets to refresh, 25 days staleness threshold
fp.refresh()
# More aggressive maintenance (50% of datasets)
fp.refresh(limit_percent=50)
# Refresh specific datasets
fp.refresh('field')
fp.refresh(['field', 'discovery', 'wellbore'])
# Force re-download even if data exists
fp.refresh('field', force=True)
Maintenance mode priorities:
- Core entities: Downloads
field,discovery,wellbore,facility,company,licenceif missing - Row count mismatches: High priority, always refreshed regardless of limit
- Stale datasets: Older than 25 days, refreshed up to the limit
Full Database Download
# Download ALL available datasets (75+ tables)
results = fp.refresh('all')
print(f"Downloaded {results['synced_count']} datasets")
Check API Statistics
# Get stats for all datasets (cached for 3 days, auto-refreshes if stale)
stats = fp.stats()
print(f"Total remote records: {stats['total_remote_records']:,}")
print(f"Missing datasets: {len(stats['missing'])}")
print(f"Changed datasets: {len(stats['changed'])}")
# Force refresh stats from API
stats = fp.stats(force_refresh=True)
Fix All Stale Data
# Download all stale and missing datasets (no limit)
results = fp.fix()
print(f"Fixed {results['synced_count']} datasets")
# Fix only stale (don't download new datasets)
results = fp.fix(include_missing=False)
Check Data Quality
report = fp.check_quality()
print(f"Health Score: {report['health_score']}%")
print(f"Fresh (<7d): {report['fresh_count']}")
print(f"Stale (>30d): {report['stale_count']}")
print(f"Missing: {report['missing_count']}")
Entity Access
The library provides 14 entity types with rich object-oriented access.
Entity Accessor Methods
Each entity type has an accessor with these methods:
# Get entity by name or ID
troll = fp.field("troll") # By name (case-insensitive)
troll = fp.field(46437) # By npdid
# Get random entity (no arguments)
random_field = fp.field() # Returns a random field
# List all entity names
fp.field.list() # ['AASTA HANSTEEN', 'ALBUSKJELL', ...]
# List all entity IDs
fp.field.ids() # [43437, 43444, 43451, ...]
# Count entities
fp.field.count() # 141
# Get all as DataFrame
fp.field.all() # DataFrame of all fields
Fields
troll = fp.field("troll")
print(troll) # Formatted display
print(troll.name) # TROLL
print(troll.id) # 46437
print(troll.operator) # Equinor Energy AS
print(troll.status) # Producing
print(troll.hc_type) # OIL/GAS
print(troll.discovery_year) # 1979
Discoveries
sverdrup = fp.discovery("johan sverdrup")
print(sverdrup.name) # 16/2-6 Johan Sverdrup
print(sverdrup.id) # 18387202
fp.discovery.count() # 638
Wellbores
well = fp.wellbore("31/2-1")
print(well.name) # 31/2-1
print(well.id) # 398
fp.wellbore.count() # 9731
Facilities
platform = fp.facility("TROLL A")
print(platform)
print(platform.kind)
print(platform.phase)
print(platform.water_depth)
Pipelines
pipe = fp.pipeline("STATPIPE")
print(pipe)
print(pipe.medium)
print(pipe.from_facility)
print(pipe.to_facility)
Licences
licence = fp.licence("PL001")
print(licence)
print(licence.status)
print(licence.granted_date)
Companies
equinor = fp.company("equinor")
print(equinor)
print(equinor.short_name)
print(equinor.org_number)
Additional Entity Types
All entity types support the same accessor methods:
# Plays (geological)
fp.play("UPPER JURASSIC") # By name
fp.play() # Random
fp.play.list() # All names
fp.play.count() # 71
# Blocks
fp.block("34/10") # By name
fp.block.count() # Number of blocks
# Quadrants
fp.quadrant("34") # By name
# Onshore facilities (TUF)
fp.tuf("KOLLSNES") # By name
# Seismic surveys
fp.seismic("NPD-1901") # By name
# Stratigraphy
fp.stratigraphy("DRAUPNE") # By name
# Business arrangements
fp.business_arrangement("TROLL UNIT")
Exploring Data
Entity Display
Each entity has a formatted print() output:
troll = fp.field("troll")
print(troll)
Output:
FIELD: TROLL
============================================================
Status: Producing Area: North Sea
HC Type: OIL/GAS Discovered: 1979
Operator: Equinor Energy AS
Partners (PL054):
Equinor Energy AS 30.58% (operator)
Petoro AS 56.00%
...
Explore: .reserves .wells .licensees .operators
Related Tables
Access related data directly as DataFrames:
troll = fp.field("troll")
# Direct attribute access for related tables
troll.field_reserves # DataFrame of reserves
troll.field_licensee_hst # DataFrame of licensee history
troll.field_operator_hst # DataFrame of operator history
# Generic related() method
troll.related('field_reserves')
troll.related('discovery') # Related discoveries
Exploring Connections
troll = fp.field("troll")
# Get list of connected tables
connections = troll.connections
print(connections['incoming']) # Tables that reference this field
# ['field_reserves', 'field_licensee_hst', 'field_operator_hst', ...]
print(connections['outgoing']) # Base tables this field references
# ['company', 'wellbore']
# Get actual filtered data for all connections
full_conns = troll.full_connections
reserves_df = full_conns['incoming']['field_reserves']
operator_df = full_conns['outgoing']['company']
Partners and Ownership
troll = fp.field("troll")
# Get partners with ownership info
partners = troll.partners
print(partners)
# Partners (5):
# ============================================================
# Company Share % Operator
# ------------------------------------------------------------
# Equinor Energy AS 30.58 *
# Petoro AS 56.00
# ...
# Iterate partners (list of dicts)
for partner in partners:
print(f"{partner['company']}: {partner['share']}%")
Raw DataFrame Access
Direct Table Access
# Get any table as DataFrame
fields_df = fp.db.get('field')
discoveries_df = fp.db.get('discovery')
wellbores_df = fp.db.get('wellbore')
# Shorthand via fp.df() (auto-syncs if auto_sync=True)
fields_df = fp.df('field')
# Safe access (returns None if not exists)
df = fp.db.get_or_none('field_reserves')
Convenience Methods
# Get all fields
fields = fp.fields()
# Filter by status
producing = fp.fields(status='Producing')
# Get all discoveries
discoveries = fp.discoveries()
# Filter by year
discoveries_2023 = fp.discoveries(year=2023)
# Get all wellbores
wellbores = fp.wellbores()
List Available Tables
# Tables downloaded locally
fp.list_tables()
# ['block', 'company', 'discovery', 'facility', 'field', ...]
# Filter by prefix
fp.list_tables('field')
# ['field', 'field_reserves', 'field_licensee_hst', ...]
# Tables available on API
fp.api_tables()
# ['block', 'business_arrangement_area', 'company', ...]
# Filter API tables
fp.api_tables('wellbore')
# ['wellbore', 'wellbore_casing', 'wellbore_core', ...]
Download Without Storing
# Download data but don't store locally
df = fp.download('field')
# Download and store
df = fp.download('field', store=True)
# Download with filter
df = fp.download('wellbore', where="wlbStatus='COMPLETED'")
Database Status
Print Status
fp.status()
Output:
Database Status
================
Location: ./factpages_data/factpages.db
Tables: 53
Total records: 450,234
Top tables by size:
wellbore_formation_top 125,432 records
wellbore 9,731 records
licence_licensee_hst 8,234 records
...
Detailed Database Info
# List all tables
tables = fp.db.list_datasets()
# Get record count for a table
count = fp.db.get_record_count('wellbore')
# Check if table exists
exists = fp.db.has_dataset('field')
# Get last sync time
last_sync = fp.db.get_last_sync('field')
Graph Building
Export data for knowledge graph libraries:
Quick Export
from factpages_py import Factpages
import rusty_graph
fp = Factpages()
fp.refresh()
graph = rusty_graph.KnowledgeGraph()
# One-liner bulk loading
export = fp.graph.export_for_graph()
graph.add_nodes_bulk(export['nodes'])
graph.add_connections_from_source(export['connections'])
Step-by-Step Loading
# Get all nodes with column renaming
all_nodes = fp.graph.all_nodes(rename=True)
for node_type, df in all_nodes.items():
print(f"{node_type}: {len(df)} nodes")
# Get connection specifications
connectors = fp.graph.all_connectors()
print(f"Found {len(connectors)} connection types")
# Load specific entity types
field_nodes = fp.graph.nodes('field', rename=True)
wellbore_nodes = fp.graph.nodes('wellbore', rename=True)
Configuration
Custom Client Configuration
from factpages_py import Factpages, ClientConfig
config = ClientConfig(
timeout=60, # Request timeout (seconds)
connect_timeout=10, # Connection timeout
max_retries=5, # Max retry attempts
rate_limit=0.2, # Min seconds between requests
pool_connections=20, # Connection pool size
)
fp = Factpages(config=config)
Auto-Refresh Mode
# Auto-download missing datasets when accessed
fp = Factpages(auto_sync=True)
# Now this will auto-download 'field' if not present
troll = fp.field("troll")
Available Datasets
The API provides access to 75+ datasets organized by category:
| Category | Examples | Description |
|---|---|---|
| Core Entities | field, discovery, wellbore, facility | Main petroleum entities |
| Company | company | Operator and licensee information |
| Licensing | licence, licence_licensee_hst | License data and history |
| Field Details | field_reserves, field_licensee_hst | Field-specific tables |
| Discovery Details | discovery_reserves, discovery_operator_hst | Discovery-specific tables |
| Wellbore Details | wellbore_core, wellbore_dst, wellbore_formation_top | Well data |
| Facility Details | facility_function, pipeline | Infrastructure |
| Seismic | seismic_acquisition | Survey data |
| Stratigraphy | strat_litho, strat_chrono | Geological formations |
| Administrative | block, quadrant | Geographic divisions |
# See all available datasets
fp.api_tables()
# Get dataset categories
from factpages_py import LAYERS, TABLES
print(f"Layers (with geometry): {len(LAYERS)}")
print(f"Tables (no geometry): {len(TABLES)}")
Examples
Find All Producing Fields
fp = Factpages()
fp.refresh('field')
producing = fp.fields(status='Producing')
print(f"Found {len(producing)} producing fields")
for _, row in producing.iterrows():
print(f" {row['fldName']}: {row['cmpLongName']}")
Analyze Wellbore Depths
import pandas as pd
fp = Factpages()
fp.refresh('wellbore')
wellbores = fp.wellbores()
print(f"Average depth: {wellbores['wlbTotalDepth'].mean():.0f}m")
print(f"Deepest wellbore: {wellbores['wlbTotalDepth'].max():.0f}m")
Export Field-Company Relationships
fp = Factpages()
fp.refresh(['field', 'field_licensee_hst', 'company'])
# Build relationship table
fields = fp.db.get('field')
licensees = fp.db.get('field_licensee_hst')
companies = fp.db.get('company')
# Merge for full picture
relationships = licensees.merge(
fields[['fldNpdidField', 'fldName']],
on='fldNpdidField'
).merge(
companies[['cmpNpdidCompany', 'cmpLongName']],
on='cmpNpdidCompany'
)
print(relationships[['fldName', 'cmpLongName', 'fldLicenseeDateValidFrom']])
License
MIT License
Acknowledgments
Data provided by the Norwegian Offshore Directorate (Sodir).
Links
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file factpages_py-0.1.29.tar.gz.
File metadata
- Download URL: factpages_py-0.1.29.tar.gz
- Upload date:
- Size: 134.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4731f5d4e166730c47d45f87635db2a6df5f769ea8cc31af842a9f328d1a06e4
|
|
| MD5 |
4078308a8d5e737f099251dc5c61bb60
|
|
| BLAKE2b-256 |
95cad0090e69a805b7420a3bfbfeea3bfaf926030ab7389b204080c80a2bba4c
|
File details
Details for the file factpages_py-0.1.29-py3-none-any.whl.
File metadata
- Download URL: factpages_py-0.1.29-py3-none-any.whl
- Upload date:
- Size: 130.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2d58b014693ede9e06986746d9b6ac103a57763b03214f926ff95b2119e3e366
|
|
| MD5 |
61bc99ac90ae4f5bbe5781f9bbfd6d79
|
|
| BLAKE2b-256 |
5eb739058d34670fb49579e3ed66f610ccab91a9cc26e3766d4d9e92c7d70425
|