A tool for managing survey/administrative data and importing them into OpenFisca

OpenFisca Survey Manager

[EN] Introduction

OpenFisca is a versatile, free microsimulation software package. You can check the online documentation for more details.

This repository contains the Survey-Manager module, to work with OpenFisca and survey data.

It provides two main features:

A Python API to access data in Hierarchical Data Format (HDF) or Parquet.
A script that transforms SAS, Stata, SPSS, and CSV data files into HDF data files, along with some metadata so they can be used by the Python API. Parquet input files are kept as is.

For French survey data, the openfisca-france-data repository may contain useful information on the next steps.

Environment

OpenFisca-Survey-Manager runs on Python 3.9+. It is tested on 3.10, 3.11 and 3.12.

Usage

Installation

Install with pip

If you're developing your own script or want to run OpenFisca-Survey-Manager without editing it, you don't need its source code. The package only needs to be installed in your environment. To do so, install it with pip:

pip install --upgrade pip
pip install openfisca-survey-manager

This should not display any error and end with:

Successfully installed [... openfisca-survey-manager-xx.xx.xx ...]

The package comes with the build-collection command, which we will use in the next steps.
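As a quick sanity check after installation, a short script can confirm that the distribution is registered in the current environment and that the build-collection command is on the PATH. This is only a sketch using the standard library; the helper names `installed_version` and `command_available` are hypothetical and not part of OpenFisca-Survey-Manager.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def command_available(command):
    """Return True if an executable with this name is found on the PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    print("openfisca-survey-manager:", installed_version("openfisca-survey-manager"))
    print("build-collection on PATH:", command_available("build-collection"))
```

If the first line prints None, the package is not installed in the interpreter that ran the script, which usually means another environment is active.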

If you want to improve this module, please see the Development section below.

Install with Conda

Create an environment and install openfisca-survey-manager:

conda create -n survey python=3.9
conda activate survey
conda install -c conda-forge -c openfisca openfisca-survey-manager

You are ready to go!

To exit your environment:

conda deactivate

Getting the configuration directory path

To be able to use OpenFisca-Survey-Manager, you have to create two configuration files:

raw_data.ini,
and config.ini.

To know where to copy them to, use the following command:

build-collection --help

You should get the following result.

usage: build-collection [-h] -c COLLECTION [-d] [-m] [-p PATH] [-s SURVEY]
                        [-v]

optional arguments:
  -h, --help            show this help message and exit
  -c COLLECTION, --collection COLLECTION
                        name of collection to build or update
  -d, --replace-data    erase existing survey data HDF5 file (instead of
                        failing when HDF5 file already exists)
  -m, --replace-metadata
                        erase existing collection metadata JSON file (instead
                        of just adding new surveys)
  -p PATH, --path PATH  path to the config files directory (default =
                        /your/path/.config/openfisca-survey-manager)
  -s SURVEY, --survey SURVEY
                        name of survey to build or update (default = all)
  -v, --verbose         increase output verbosity

Take note of the default configuration directory path shown in the description of the -p PATH, --path PATH option. This is the directory where you will put your raw_data.ini and config.ini files. In this example, it is /your/path/.config/openfisca-survey-manager.

If you want to use a different path, you can pass the --path /another/path option to build-collection. This feature is still experimental though.
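Once you know the configuration directory, a small check like the one below can tell you which of the two required files are still missing. It is a sketch, not part of the package: `missing_config_files` is a hypothetical helper, and the directory used in the example run is an assumption that you should replace with the path reported by build-collection --help.

```python
from pathlib import Path

# The two file names required by the text above.
REQUIRED_FILES = ("raw_data.ini", "config.ini")


def missing_config_files(config_dir):
    """Return the required config file names that are not present in config_dir."""
    config_dir = Path(config_dir)
    return [name for name in REQUIRED_FILES if not (config_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: a default located under the home directory; use the path
    # printed by `build-collection --help` on your machine instead.
    config_dir = Path.home() / ".config" / "openfisca-survey-manager"
    print(config_dir, "missing:", missing_config_files(config_dir))
```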

Editing the config files

Configuration files are INI files (text files).

The raw_data.ini lists your input surveys while config.ini specifies the paths to SurveyManager outputs.

raw_data.ini and config.ini must not be committed (they are already ignored by .gitignore).

raw_data.ini, for inputs configuration

To initialise your raw_data.ini file, you can follow these steps:

Copy the template file raw_data_template.ini to the configuration directory you identified in the previous step and rename it to raw_data.ini, for example /your/path/.config/openfisca-survey-manager/raw_data.ini.
Edit it by adding a section title for your survey. For example, if you name your survey housing_survey, add the line:

[housing_survey]

Add a reference to the location of your raw data directory (SAS, Stata DTA, SPSS, or CSV files). For Windows paths, use / instead of \ to separate folders. You do not need quotes, even when the path contains spaces.

Your file should look like this:

[housing_survey]

2014 = /path/to/your/raw/data/HOUSING_2014

You can also set multiple surveys as follows:

[revenue_survey]

2014 = /path/to/your/raw/data/REVENUE_2014
2015 = /path/to/your/raw/data/REVENUE_2015
2016 = /path/to/your/raw/data/REVENUE_2016

[housing_survey]

2014 = /path/to/your/raw/data/HOUSING_2014
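To see how such a file breaks down into sections and keys, you can parse it with Python's standard configparser module. This is an illustrative sketch only: whether the Survey Manager reads the file this way internally is an assumption, and `list_surveys` is a hypothetical helper.

```python
from configparser import ConfigParser

# Same content as the multi-survey example above.
RAW_DATA_INI = """
[revenue_survey]

2014 = /path/to/your/raw/data/REVENUE_2014
2015 = /path/to/your/raw/data/REVENUE_2015
2016 = /path/to/your/raw/data/REVENUE_2016

[housing_survey]

2014 = /path/to/your/raw/data/HOUSING_2014
"""


def list_surveys(ini_text):
    """Map each section name to a {key: raw data directory} dictionary."""
    # interpolation=None keeps any "%" character in a path literal.
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    for collection, series in list_surveys(RAW_DATA_INI).items():
        for key, directory in series.items():
            print(collection, key, directory)
```

Values are read up to the end of the line, which is why paths containing spaces need no quotes.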

config.ini, for outputs configuration

To initialise your config.ini file:

Copy its template file config_template.ini to your configuration directory and rename it to config.ini, for example /your/path/.config/openfisca-survey-manager/config.ini.
Define a collections_directory path where the SurveyManager will generate the JSON description of your survey inputs and outputs, for example /.../openfisca-survey-manager/transformed_housing_survey. For a housing_survey, you will get a /.../openfisca-survey-manager/transformed_housing_survey/housing_survey.json file.
Define an output_directory where the generated HDF file will be stored. This directory can be a sub-directory of your collections_directory.
Define a tmp_directory that will store temporary calculation results. Its content will be deleted at the end of the calculation. This directory can be a sub-directory of your collections_directory.

Your config.ini file should look similar to this:

[collections]

collections_directory = /path/to/your/collections/directory

[data]

output_directory = /path/to/your/data/output/directory
tmp_directory = /path/to/your/data/tmp/directory

Make sure those directories exist, otherwise the script will fail.
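One way to avoid that failure is to create the directories from the config file itself before running the script. The sketch below assumes the section and option names shown in the example above; `ensure_directories` is a hypothetical helper, not a function provided by the package.

```python
from configparser import ConfigParser
from pathlib import Path

# (section, option) pairs taken from the config.ini example above.
DIRECTORY_OPTIONS = (
    ("collections", "collections_directory"),
    ("data", "output_directory"),
    ("data", "tmp_directory"),
)


def ensure_directories(config_path):
    """Create every directory named in config.ini and return their paths."""
    parser = ConfigParser(interpolation=None)
    with open(config_path, encoding="utf-8") as config_file:
        parser.read_file(config_file)
    directories = [
        Path(parser[section][option]) for section, option in DIRECTORY_OPTIONS
    ]
    for directory in directories:
        # parents=True also creates missing parent folders;
        # exist_ok=True makes the call safe to repeat.
        directory.mkdir(parents=True, exist_ok=True)
    return directories
```

Calling `ensure_directories("/your/path/.config/openfisca-survey-manager/config.ini")` would create the three directories if they do not exist yet.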

Building the HDF5 files

This step reads your configuration files and your survey data and generates an HDF5 file (.h5) for your survey. To build the HDF5 files, we'll use the build-collection script.

Here is an example for one survey with one series: our housing_survey, which only has a 2014 series. We pass the survey name as the collection (with the -c option) and build the HDF5 file with this command:

build-collection -c housing_survey -d -m -v

The -d and -m options put you on the safe side, as they remove previous outputs if they exist.

It will generate:

- A housing_survey.json listing a housing_survey_2014 survey with both:
  - your input tables and your input file paths in an informations key,
  - the transformed survey path in an hdf5_file_path key.
- Your transformed survey in a housing_survey_2014.h5 file.
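To check where each transformed survey was written, you can read the generated JSON file with the standard json module. The sketch below relies on the two keys named above, but the surrounding layout (surveys stored under a top-level "surveys" key) is an assumption, and the example content is hand-written for illustration; inspect your own file before relying on it. `hdf5_paths` is a hypothetical helper.

```python
import json


def hdf5_paths(collection_json_text):
    """Return {survey name: hdf5_file_path} from a collection JSON document."""
    description = json.loads(collection_json_text)
    # Assumption: surveys are stored under a top-level "surveys" key.
    surveys = description.get("surveys", {})
    return {name: survey.get("hdf5_file_path") for name, survey in surveys.items()}


if __name__ == "__main__":
    # Hypothetical content; a real housing_survey.json may hold more keys.
    example = json.dumps(
        {
            "surveys": {
                "housing_survey_2014": {
                    "informations": {},
                    "hdf5_file_path": "/path/to/your/data/output/directory/housing_survey_2014.h5",
                }
            }
        }
    )
    print(hdf5_paths(example))
```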

build-collection, what else?

As build-collection --help shows, other options exist. Here are other usage examples.

If you have multiple series of one survey, like the revenue_survey, you can build only the 2015 series with:

build-collection -c revenue_survey -s 2015 -d -m -v

Or if you want to specify a different configuration directory path:

build-collection -p /another/path -c housing_survey -s 2014 -d -m -v

The --path /another/path option is still experimental though.

It should work. If it doesn't, please do not hesitate to open an issue.
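If you call build-collection from a Python script, for instance to rebuild several series in a loop, it can help to assemble the argument list in one place. The sketch below only uses the flags listed in the --help output above; `build_collection_command` is a hypothetical helper, and the example prints the command rather than running it.

```python
import shlex


def build_collection_command(collection, survey=None, path=None,
                             replace_data=False, replace_metadata=False,
                             verbose=False):
    """Return the build-collection argument list for the given options."""
    command = ["build-collection"]
    if path is not None:
        command += ["-p", str(path)]  # config files directory
    command += ["-c", collection]  # collection to build or update
    if survey is not None:
        command += ["-s", str(survey)]  # single survey instead of all
    if replace_data:
        command.append("-d")  # erase the existing HDF5 file
    if replace_metadata:
        command.append("-m")  # erase the existing collection JSON file
    if verbose:
        command.append("-v")
    return command


if __name__ == "__main__":
    command = build_collection_command(
        "revenue_survey", survey=2015,
        replace_data=True, replace_metadata=True, verbose=True,
    )
    # Printed only; pass the list to subprocess.run(command, check=True)
    # to actually execute it.
    print(shlex.join(command))
```

Passing the arguments as a list rather than a single string avoids quoting problems when a path contains spaces.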

Parquet files

Parquet files can be used as input files. They will not be converted to HDF5. As a Parquet file can only contain one table, we add a "parquet_file" key to each table in a survey. This key contains the path to the Parquet file, or to the folder containing several Parquet files for the same table.

If you use a folder, you have to name your files with the following pattern: some_name_-<number>.parquet and keep only the files for the same table in that folder.

If each table is stored in a single file, you can keep the files for different tables in the same folder.
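Before pointing a "parquet_file" key at a folder, you may want to check that every file in it follows the naming pattern. The sketch below interprets `<number>` as one or more digits after the last hyphen, which is an assumption about the pattern; `check_parquet_folder` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumption: "<number>" means one or more digits placed after the last hyphen.
PARQUET_PART = re.compile(r"^(?P<stem>.+)-(?P<number>\d+)\.parquet$")


def check_parquet_folder(file_names):
    """Split file names into those matching the pattern and those that do not."""
    matching, other = [], []
    for name in file_names:
        (matching if PARQUET_PART.match(name) else other).append(name)
    return matching, other


if __name__ == "__main__":
    folder = Path(".")  # replace with the folder holding one table's files
    names = sorted(p.name for p in folder.glob("*.parquet"))
    print(check_parquet_folder(names))
```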

Development

To contribute to OpenFisca-Survey-Manager, you can use uv for a modern development workflow.

Clone the repository

git clone https://github.com/openfisca/openfisca-survey-manager.git
cd openfisca-survey-manager

Install dependencies and dev tools
```
uv sync
```
Run tests
```
uv run pytest
```
Linting and formatting

We use ruff for linting and formatting.
```
uv run ruff check .
uv run ruff format .
```



OpenFisca-Survey-Manager 6.6.0
