High-level Python client for INE Portugal (Statistics Portugal) API
pyptine - INE Portugal API Client
High-level Python client for Statistics Portugal (INE) API. Query and download statistical data from INE Portugal with a simple, intuitive interface.
Features
- 🎯 High-level Convenience API: Simple interface for common data retrieval and analysis tasks.
- ⚡ Async Support: Non-blocking I/O with AsyncINE for concurrent requests using httpx.
- 📊 Multiple Output Formats: Export data to pandas DataFrames, JSON, or CSV with ease.
- 📈 Data Visualization: Interactive plotly charts (line, bar, area, scatter) directly from data.
- 🔬 Statistical Analysis: Built-in YoY growth, MoM changes, moving averages, and EMA calculations.
- 💾 Smart Caching: Disk-based caching reduces redundant API calls, speeding up repeated queries.
- 🔍 Metadata Browsing: Search and discover indicators, themes, and dimensions.
- 🖥️ Enhanced CLI: Rich formatting with progress bars, tables, and colored output.
- 📑 True Pagination: Efficient streaming of large datasets with get_all_data().
- 📖 Modern Python: Fully type-annotated for better developer experience and IDE support.
- ✅ Well-Tested: Comprehensive test suite with 81% code coverage (239 tests).
- 🔄 API Compatible: Supports both old and new INE API response formats seamlessly.
Installation
pip install pyptine
For development, install with all extra dependencies:
pip install "pyptine[dev,docs]"
Quick Start
from pyptine import INE
# Initialize the client
ine = INE(language="EN")
# 1. Search for an indicator
print("Searching for 'gdp' indicators...")
results = ine.search("gdp")
for indicator in results[:5]:  # Print top 5 results
    print(f"- {indicator.varcd}: {indicator.title}")
# 2. Get data for a specific indicator
varcd = "0004167" # Resident population
print(f"\nFetching data for indicator {varcd}...")
response = ine.get_data(varcd)
# 3. Convert to a pandas DataFrame
df = response.to_dataframe()
print("\nData as DataFrame:")
print(df.head())
# 4. Export data to a CSV file
output_file = "population_data.csv"
print(f"\nExporting data to {output_file}...")
ine.export_csv(varcd, output_file)
print("Done!")
Async API
For concurrent requests and non-blocking I/O, use the AsyncINE client:
import asyncio
from pyptine import AsyncINE
async def main():
    async with AsyncINE(language="EN") as ine:
        # Fetch a single indicator
        response = await ine.get_data("0004167")
        df = response.to_dataframe()
        print(df.head())
        # Fetch multiple indicators concurrently
        responses = await asyncio.gather(
            ine.get_data("0004167"),
            ine.get_data("0004127"),
            ine.get_data("0008074"),
        )
        # Stream large datasets chunk by chunk
        async for chunk in ine.get_all_data("0004127", chunk_size=40000):
            df_chunk = chunk.to_dataframe()
            print(f"Processing {len(df_chunk)} rows...")
asyncio.run(main())
AsyncINE Features:
- Non-blocking I/O for faster concurrent requests
- Async iterator for memory-efficient pagination
- Same API as the synchronous INE client
- Automatic connection pooling and retries
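When many indicators are fetched at once, it can help to cap how many requests are in flight. The following is a minimal sketch of that pattern using asyncio.gather with an asyncio.Semaphore. The fake_get_data coroutine and the fetch_many helper are hypothetical stand-ins written for this example, not part of pyptine; in real use the awaited call would be something like AsyncINE.get_data().

```python
import asyncio


async def fake_get_data(varcd: str) -> dict:
    """Hypothetical stand-in for an awaitable such as AsyncINE.get_data()."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"varcd": varcd}


async def fetch_many(varcds: list[str], limit: int = 2) -> list[dict]:
    """Fetch several indicators concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(varcd: str) -> dict:
        # Only `limit` coroutines can hold the semaphore simultaneously
        async with semaphore:
            return await fake_get_data(varcd)

    # gather() returns results in the same order as its arguments
    return await asyncio.gather(*(bounded(v) for v in varcds))


results = asyncio.run(fetch_many(["0004167", "0004127", "0008074"]))
print([r["varcd"] for r in results])
```

Because gather() preserves argument order, each result can be matched back to the indicator code that produced it.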
Command-Line Usage
The pyptine CLI provides a convenient way to access INE data from your terminal, with rich formatting and progress indicators for a better user experience.
# Search for indicators related to "pib" (GDP in Portuguese)
pyptine search "pib"
# Get detailed information about a specific indicator
pyptine info 0004127
# Download data for an indicator to a CSV file (with progress bar)
pyptine download 0004127 --output data.csv
# Download data and filter by dimensions
pyptine download 0004167 --output filtered_data.csv -d Dim1=S7A2023 -d Dim2=PT
# List all available statistical themes (in formatted table)
pyptine list-commands themes
# List all indicators (with pagination support)
pyptine list-commands indicators --limit 50
# View available dimensions for an indicator
pyptine dimensions 0004167
# Clear the local cache
pyptine cache clear
CLI Features:
- Rich Formatting - Tables, panels, and colored output for better readability
- Progress Indicators - Spinners and progress bars for long-running operations
- Error Handling - Centralized, user-friendly error messages with context
- Better Organization - Data displayed in well-formatted tables rather than plain text
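The repeated -d Dim1=S7A2023 -d Dim2=PT options in the download example correspond to the dimensions dictionary accepted by the Python API. As a rough sketch of how such KEY=VALUE pairs could be turned into that dictionary (parse_dimension_filters is a hypothetical helper for illustration, not the CLI's actual code):

```python
def parse_dimension_filters(pairs: list[str]) -> dict[str, str]:
    """Turn ['Dim1=S7A2023', 'Dim2=PT'] into {'Dim1': 'S7A2023', 'Dim2': 'PT'}."""
    filters: dict[str, str] = {}
    for pair in pairs:
        # partition() splits on the first '=' only, so values may contain '='
        key, sep, value = pair.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"Expected KEY=VALUE, got {pair!r}")
        filters[key] = value
    return filters


filters = parse_dimension_filters(["Dim1=S7A2023", "Dim2=PT"])
print(filters)
```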
Documentation
Initializing the Client
The INE class is the main entry point.
from pyptine import INE
from pathlib import Path
# Default client (language='EN', cache=True)
ine = INE()
# Client with Portuguese language
ine_pt = INE(language="PT")
# Disable caching
ine_no_cache = INE(cache=False)
# Use a custom cache directory
ine_custom_cache = INE(cache_dir=Path("/path/to/custom/cache"))
Working with Indicators
Searching for Indicators
You can search for indicators by keyword and filter by theme or sub-theme.
# Basic search
results = ine.search("unemployment rate")
# Search within a specific theme
results = ine.search("employment", theme="Labour market")
Getting Indicator Metadata
Retrieve detailed information about an indicator, including its dimensions.
metadata = ine.get_metadata("0004167")
print(f"Title: {metadata.title}")
print(f"Unit: {metadata.unit}")
print(f"Source: {metadata.source}")
# List available dimensions
dimensions = ine.get_dimensions("0004167")
for dim in dimensions:
    print(f"\nDimension: {dim.name}")
    for value in dim.values[:5]:  # Show first 5 values
        print(f"- {value.code}: {value.label}")
Fetching and Exporting Data
Getting Data
The get_data method returns a DataResponse object, which can be easily converted to different formats.
response = ine.get_data("0004127")
# Convert to pandas DataFrame
df = response.to_dataframe()
# Convert to a dictionary
data_dict = response.to_dict()
# Get data as a JSON string
json_str = response.to_json()
Filtering Data with Dimensions
Use the dimensions parameter to filter data before downloading.
# Get data for the year 2023 and region "Portugal"
# Note: Dimension values use specific codes (e.g., 'S7A2023' for year 2023)
filtered_response = ine.get_data(
    "0004167",
    dimensions={
        "Dim1": "S7A2023",  # Year 2023
        "Dim2": "PT",  # Geographic region 'Portugal'
    },
)
df_filtered = filtered_response.to_dataframe()
Exporting Data
You can export data directly to CSV or JSON files.
# Export to CSV
ine.export_csv("0004127", "output.csv")
# Export to JSON with pretty printing
ine.export_json("0004127", "output.json", pretty=True)
# Export filtered data
ine.export_csv(
    "0004167",
    "filtered_output.csv",
    dimensions={"Dim1": "S7A2023"},
)
Working with Large Datasets
For datasets that exceed the default limit of 40,000 data points, use the get_all_data() method, which handles pagination automatically:
from pyptine.client.data import DataClient
client = DataClient(language="EN")
# Fetch data in chunks (default chunk_size=40,000)
for chunk in client.get_all_data("0004127"):
    df = chunk.to_dataframe()
    print(f"Processed {len(df)} rows")
    # Process each chunk
# Custom chunk size
for chunk in client.get_all_data("0004127", chunk_size=5000):
    # Process smaller chunks
    pass
# Combine all chunks into a single dataset
all_chunks = list(client.get_all_data("0004127"))
all_data = [point for chunk in all_chunks for point in chunk.data]
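To make the chunking pattern concrete, here is a small self-contained sketch of a generator that yields fixed-size chunks and of how the chunks recombine into the full dataset. It operates on an in-memory list of made-up points; iter_chunks is a hypothetical helper and does not reflect how get_all_data() is implemented internally.

```python
from typing import Iterator, Sequence


def iter_chunks(points: Sequence[dict], chunk_size: int = 40_000) -> Iterator[list[dict]]:
    """Yield consecutive slices of `points`, each at most `chunk_size` long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(points), chunk_size):
        yield list(points[start:start + chunk_size])


# Synthetic data for illustration only
points = [{"period": str(2000 + i), "value": i} for i in range(12)]

# Only one chunk is held in memory per iteration
sizes = [len(chunk) for chunk in iter_chunks(points, chunk_size=5)]
print(sizes)

# Flattening the chunks restores the original sequence
all_points = [p for chunk in iter_chunks(points, chunk_size=5) for p in chunk]
```

The last chunk is shorter than chunk_size whenever the total is not an exact multiple, so downstream code should not assume a fixed chunk length.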
Visualizing Data
Create interactive visualizations directly from indicator data, without first converting it to a DataFrame:
# Get data and create interactive line chart
response = ine.get_data("0004127")
fig = response.plot(chart_type="line")
fig.show()
# Different chart types
fig_bar = response.plot_bar()
fig_area = response.plot_area()
fig_scatter = response.plot_scatter()
# Customize visualization
fig = response.plot_line(
    markers=True,
    x_column="Period",
    y_column="value",
)
# Color by dimensions (if data has dimension columns)
fig = response.plot_line(color_column="region")
# Save to HTML for sharing
fig.write_html("indicator_plot.html")
# Further customization with plotly
fig.update_layout(height=600, width=1200, title="Custom Title")
fig.show()
Available Visualization Methods:
- plot(chart_type) - Generic plot with selectable chart type
- plot_line() - Interactive line chart with optional markers
- plot_bar() - Bar chart for categorical comparison
- plot_area() - Stacked area chart for trends
- plot_scatter() - Scatter plot with optional size and color dimensions
All methods support:
- Interactive plotly charts with hover, zoom, and pan
- Custom column selection for x/y axes
- Color coding by dimension columns
- Export to HTML, PNG, or other formats
Advanced Data Analysis
Perform statistical calculations on indicator data directly within the library:
# Get data and calculate year-over-year growth
response = ine.get_data("0004127")
yoy_response = response.calculate_yoy_growth()
df_yoy = yoy_response.to_dataframe()
print(df_yoy[['Period', 'value', 'yoy_growth']])
# Calculate month-over-month changes
mom_response = response.calculate_mom_change()
df_mom = mom_response.to_dataframe()
# Calculate simple moving average (3-period)
ma_response = response.calculate_moving_average(window=3)
df_ma = ma_response.to_dataframe()
# Calculate exponential moving average
ema_response = response.calculate_exponential_moving_average(span=5)
df_ema = ema_response.to_dataframe()
# Chain multiple analyses
result = response.calculate_yoy_growth().calculate_moving_average(window=2)
df = result.to_dataframe()
print(df[['Period', 'value', 'yoy_growth', 'moving_avg']])
Available analysis methods on DataResponse:
- calculate_yoy_growth() - Year-over-year percentage change
- calculate_mom_change() - Month-over-month percentage change
- calculate_moving_average(window) - Simple moving average
- calculate_exponential_moving_average(span) - Exponential weighted moving average
All methods support custom value_column and period_column parameters to work with different data structures.
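For readers who want to see the arithmetic behind these measures, the following plain-Python sketch computes each one on a short synthetic series. It assumes the usual textbook definitions, including the smoothing factor alpha = 2 / (span + 1) that pandas uses for a span; whether pyptine computes its columns in exactly this way (for example, how it seeds the EMA or handles missing periods) is an assumption, not something confirmed by the library.

```python
def yoy_growth(values, periods_per_year=1):
    """Percentage change versus the value one year earlier; None where undefined."""
    out = []
    for i, v in enumerate(values):
        j = i - periods_per_year
        if j < 0 or values[j] == 0:
            out.append(None)
        else:
            out.append((v - values[j]) / values[j] * 100.0)
    return out


def moving_average(values, window):
    """Simple mean of the last `window` values; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def ema(values, span):
    """Recursive EMA seeded with the first value, alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


annual = [100.0, 110.0, 121.0, 133.1]  # synthetic series growing 10% per year
print(yoy_growth(annual))
print(moving_average(annual, window=3))
print(ema(annual, span=3))
```

For monthly data, passing periods_per_year=12 compares each month with the same month of the previous year, and periods_per_year=1 on the same series gives a month-over-month change instead.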
API Reference
INE Class
The main class for interacting with the INE API.
INE(language: str = "EN", cache: bool = True, cache_dir: Optional[Path] = None, cache_ttl: int = 86400)
| Method | Description |
|---|---|
| search(query, ...) | Search for indicators. |
| get_data(varcd, ...) | Get data for an indicator as a DataResponse object. |
| get_metadata(varcd) | Get detailed metadata for an indicator. |
| get_dimensions(varcd) | Get available dimensions for an indicator. |
| get_indicator(varcd) | Get catalogue information for a single indicator. |
| validate_indicator(varcd) | Check if an indicator code is valid. |
| list_themes() | Get a list of all available themes. |
| export_csv(varcd, ...) | Export indicator data to a CSV file. |
| export_json(varcd, ...) | Export indicator data to a JSON file. |
| clear_cache() | Clear all cached data. |
| get_cache_info() | Get statistics about the cache. |
Links & Resources
- PyPI Package: https://pypi.org/project/pyptine/
- GitHub Repository: https://github.com/randsley/pyptine
- Issue Tracker: https://github.com/randsley/pyptine/issues
- INE Portal: https://www.ine.pt/
Development
Setup
To set up the development environment:
# Clone the repository
git clone https://github.com/randsley/pyptine.git
cd pyptine
# Install in editable mode with development dependencies
pip install -e ".[dev]"
# Install pre-commit hooks to ensure code quality
pre-commit install
Running Tests
# Run all tests
pytest
# Run tests with coverage report
pytest --cov=src/pyptine --cov-report=term-missing
Code Quality
This project uses black for formatting, ruff for linting, and mypy for type checking.
# Format code
black src/ tests/
# Lint code
ruff check src/ tests/
# Type check
mypy src/
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository.
- Create your feature branch (git checkout -b feature/amazing-feature).
- Commit your changes (git commit -m 'Add amazing feature').
- Push to the branch (git push origin feature/amazing-feature).
- Open a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.