Tools to download and process public datasets from Peru (INEI & BCRP).

PyPeruStats

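PyPeruStats downloads and processes public datasets from Peruvian data sources.

Sources: INEI (Instituto Nacional de Estadística e Informática), BCRP (Banco Central de Reserva del Perú)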

Installation

pip install pyperustats

INEI

Parameters Description

MICRODATOS_INEI

survey: Survey type ('enaho', 'enapres', or 'endes')
- Data available up to the third quarter of 2024

download_default

format: Output file format
- 'csv': CSV files
- 'stata': Stata files
- 'spss': SPSS files
force: Force re-download of existing files
remove_zip: Remove ZIP files after extraction
workers: Number of workers for parallel download
zip_dir: Directory to store ZIP files

organize_files

dir_output: Directory where the organized files will be saved
order_by: Organization method
- 'modules': Structure "mod_01/year_n.csv"
- 'year': Structure "year_n/mod_n"
ext_documentation: List of file extensions treated as documentation (for example ['pdf'])
delete_master_dir: Delete the master directory after organizing

Usage

from pyPeruStats import MICRODATOS_INEI, print_tree

# survey options: 'enaho', 'enapres', 'endes' (data available up to 2024, quarter 3)
enaho = MICRODATOS_INEI(survey="enaho")
modules = enaho.modules
# Preview the modules found for this survey
print(modules.head(2))

   codigo_modulo                                      modulo      anio
0              1  Características de la Vivienda y del Hogar  2024 ...
1              2   Características de los Miembros del Hogar  2024 ...

# First list: years; second list: module codes (codigo_modulo)
downloaded = enaho.search(
    [2021, 2023, 2004, 2006, 2007, 2008], [1, 2, 3, 8]
).download_default(
    format='csv',  # 'csv', 'stata', or 'spss'
    force=False,  # True re-downloads ZIP files that already exist
    remove_zip=False,  # True removes the ZIP files after extraction
    workers=4,  # number of parallel download workers
    zip_dir="trash_zips"  # directory where the ZIP files are stored
)

# Show the downloaded files
print_tree('./trash_zips/')

📁 trash_zips
└── 📁 inei_enaho_download
    ├── 📁 2004
        ├── 📁 2004_01
        │   └── 📁 280-Modulo01
        │   │   ├── 📄 CED-01-100 2004.pdf
        │   │   ├── 📄 Diccionario.pdf
        │   │   ├── 📄 enaho01-2004-100.dta
        │   │   └── 📄 Ficha Tecnica - 2004.pdf
        ├── 📁 2004_02
....

result_files = downloaded.organize_files(
    dir_output="./data_inei/",  # where the organized files will be saved
    order_by="modules",  # "modules": mod_01/year_n.csv; "year": year_n/mod_n
    ext_documentation=['pdf'],  # file extensions treated as documentation
    delete_master_dir=False  # True deletes the master directory after organizing (use with caution)
)
print_tree("./data_inei/")  # print the resulting file structure

📁 data_inei
├── 📁 documentation_pdf
    ├── 📄 2004_01_ced-01-100_2004.pdf
    ├── 📄 2004_01_diccionario.pdf
    ├── 📄 2004_01_ficha_tecnica_-_2004
...
└── 📁 modules
    ├── 📁 001
        ├── 📄 2004.dta
        ├── 📄 2006.dta
        ├── 📄 2007.dta
        ├── 📄 2008.csv
        ├── 📄 2021.csv
        └── 📄 2023.csv
    ├── 📁 002
        ├── 📄 2004.dta
        ├── 📄 2006.dta
        ├── 📄 2007.dta
        ├── 📄 2008.csv
....
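
Because the organized output mixes formats across years (for example 2004.dta next to 2008.csv), it can help to index the files before loading them. The following is a minimal sketch that relies only on the directory layout shown above; `index_modules` is a hypothetical helper and is not part of PyPeruStats.

```python
from pathlib import Path


def index_modules(modules_dir):
    """Map each module folder to {year: file path}.

    Hypothetical helper: assumes the "modules/<module>/<year>.<ext>"
    layout shown in the tree above and skips anything else.
    """
    index = {}
    for path in sorted(Path(modules_dir).glob("*/*")):
        # Keep only files whose name (without extension) is a year.
        if path.is_file() and path.stem.isdigit():
            index.setdefault(path.parent.name, {})[int(path.stem)] = path
    return index


if __name__ == "__main__":
    for module, years in index_modules("./data_inei/modules").items():
        for year, path in sorted(years.items()):
            print(module, year, path.suffix)
```

The file suffix can then be used to pick a reader, for instance `pandas.read_stata` for .dta files and `pandas.read_csv` for .csv files.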

Notes

- Parallel downloads (more `workers`) are faster but consume more resources.
- Keeping the original ZIP files as a backup is recommended.
- Check the available disk space before downloading multiple years or modules.
- Documentation files are organized in a separate directory.
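
The disk-space check mentioned above can be sketched with the standard library. `has_free_space` and the 5 GB default are illustrative assumptions, not part of PyPeruStats; the actual size of a download depends on the surveys, years, and modules requested.

```python
import shutil


def has_free_space(path=".", required_gb=5.0):
    """Return True if the filesystem holding `path` has at least
    `required_gb` gigabytes free. The threshold is an assumed value."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    if not has_free_space(".", required_gb=5.0):
        print("Low disk space: consider fewer years/modules or remove_zip=True")
```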

BCRP

Current Issues with the Source Data

Inconsistent Data Formats Across Frequencies
- Spanish month abbreviations
  Example: "Ene05" (January 2005).
- Complex date strings
  Example: "31Ene05" combines the day, the Spanish month abbreviation, and a two-digit year, so it must be parsed.
- Quarterly indicators
  Example: "T113" means the first quarter of 2013 and must be converted to a standard format.

Additional Steps Required for Proper DataFrame Conversion
- Converting non-standard date strings to a format recognized by pandas or similar libraries.
- Harmonizing date formats across daily, monthly, quarterly, and annual frequencies.

Slow Response Time from the BCRP UI
- The platform is often slow to return data, which slows down workflows.
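
The date formats listed above can be normalized with a small parser. This is a minimal sketch and not the library's implementation: `parse_bcrp_period` is a hypothetical name, the 1970 pivot for two-digit years is an assumption, accepting "Set" as a variant of "Sep" is an assumption, and quarters are mapped to the first day of the quarter by choice.

```python
import re
from datetime import date

# Spanish three-letter month abbreviations. "set" (setiembre) is accepted
# alongside "sep" as an assumption about spelling variants.
MONTHS_ES = {
    "ene": 1, "feb": 2, "mar": 3, "abr": 4, "may": 5, "jun": 6,
    "jul": 7, "ago": 8, "sep": 9, "set": 9, "oct": 10, "nov": 11, "dic": 12,
}

_QUARTERLY = re.compile(r"T([1-4])(\d{2})")            # e.g. "T113"
_DAILY = re.compile(r"(\d{1,2})([A-Za-z]{3})(\d{2})")  # e.g. "31Ene05"
_MONTHLY = re.compile(r"([A-Za-z]{3})(\d{2})")         # e.g. "Ene05"


def _full_year(two_digits):
    """Expand a two-digit year; the 1970 pivot is an assumption."""
    yy = int(two_digits)
    return 2000 + yy if yy < 70 else 1900 + yy


def _month_number(abbr):
    try:
        return MONTHS_ES[abbr.lower()]
    except KeyError:
        raise ValueError(f"Unknown month abbreviation: {abbr!r}") from None


def parse_bcrp_period(label):
    """Convert a daily, monthly, or quarterly label to a datetime.date."""
    text = label.strip()
    if m := _QUARTERLY.fullmatch(text):
        first_month = 3 * (int(m.group(1)) - 1) + 1
        return date(_full_year(m.group(2)), first_month, 1)
    if m := _DAILY.fullmatch(text):
        return date(_full_year(m.group(3)), _month_number(m.group(2)), int(m.group(1)))
    if m := _MONTHLY.fullmatch(text):
        return date(_full_year(m.group(2)), _month_number(m.group(1)), 1)
    raise ValueError(f"Unrecognized period label: {label!r}")


if __name__ == "__main__":
    for label in ["Ene05", "31Ene05", "T113"]:
        print(label, "->", parse_bcrp_period(label).isoformat())
```

The resulting `date` objects can be passed to `pandas.to_datetime` to build a datetime index.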

Features

- Seamless data retrieval across different time frequencies
- Automatic conversion of Spanish date formats to standard datetime
- Parallel processing capabilities
- Built-in caching mechanism
- Flexible data processing
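
To illustrate how parallel retrieval and caching can work together in general, here is a minimal sketch using a thread pool and an in-memory dictionary. It does not describe how PyPeruStats implements these features: `fetch_all` and the injected `fetch_one` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(codes, fetch_one, cache=None, workers=4):
    """Fetch each series code once, in parallel, reusing cached results.

    `fetch_one` is any callable that takes a series code and returns its
    data (a hypothetical stand-in for a real request function).
    """
    cache = {} if cache is None else cache
    cleaned = [code.strip() for code in codes]
    # Deduplicate while preserving order, then skip codes already cached.
    missing = [code for code in dict.fromkeys(cleaned) if code not in cache]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for code, result in zip(missing, pool.map(fetch_one, missing)):
            cache[code] = result
    return {code: cache[code] for code in cleaned}


if __name__ == "__main__":
    shared_cache = {}
    fake_fetch = lambda code: f"data for {code}"  # placeholder fetcher
    print(fetch_all(["PD38032DD", "RD38085BM"], fake_fetch, shared_cache))
    # A second call with the same cache triggers no new fetches.
    print(fetch_all(["PD38032DD"], fake_fetch, shared_cache))
```

Threads suit this kind of workload because the time is spent waiting on network responses rather than on computation.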

from pyPeruStats import BCRPDataProcessor

# Define series codes by frequency
# (diarios = daily, mensuales = monthly, trimestrales = quarterly, anuales = annual)
diarios = ["PD38032DD", "PD04699XD"]
mensuales = ["RD38085BM", "RD38307BM"]
trimestrales = ["PD37940PQ", "PN38975BQ"]
anuales = [
    "PM06069MA",
    "PM06078MA",
    "PM06101MA",
    "PM06088MA",
    "PM06087MA",
    "PM06086MA",
    "PM06085MA",
    "PM06084MA",
    "PM06083MA",
    "PM06082MA",
    "PM06081MA",
    "PM06070MA",
]

# Combine all frequencies
all_freq = diarios + mensuales + trimestrales + anuales

# Initialize processor
processor = BCRPDataProcessor(
    all_freq, 
    start_date="2002-01-02", 
    end_date="2023-01-01", 
    parallel=True
)

# Process data
data = processor.process_data(save_sqlite=True)

# Access DataFrames by frequency
anuales_df = data.get("A")
trimestrales_df = data.get("Q")
mensuales_df = data.get("M")
diarios_df = data.get("D")
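
In the example codes above, the last letter of each series code matches the frequency key used to retrieve its DataFrame ("D", "M", "Q", "A"). Assuming that pattern holds in general, which the source does not state, a list of codes can be cleaned and grouped before being passed to the processor. `group_by_frequency` is a hypothetical helper, not part of PyPeruStats.

```python
from collections import defaultdict


def group_by_frequency(codes):
    """Group series codes by their final letter.

    Assumption: the last character encodes the frequency, as in the
    example codes (D = daily, M = monthly, Q = quarterly, A = annual).
    Stray whitespace such as tabs is stripped first.
    """
    groups = defaultdict(list)
    for raw in codes:
        code = raw.strip().upper()
        if code:
            groups[code[-1]].append(code)
    return dict(groups)


if __name__ == "__main__":
    sample = ["PD38032DD", "RD38085BM", "PD37940PQ", "\tPM06088MA"]
    print(group_by_frequency(sample))
```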

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Apache 2.0

Contact

fr.jhonk@gmail.com

TODO

BCRP
- Download statistical data from BCRP
- Implement advanced data search functionality
- Create autoplot functionality (inspired by ggplot)
- Set up GitHub repository and backup mechanism
- Add comprehensive documentation
- Create example notebooks
