Read your own data — pure-Python parser for locked Nesstar survey files

Nesstar Converter

Researchers, statistical agencies, and national archives spent years building survey datasets that document the economic and social life of entire populations — often with public money. Those datasets ended up locked in .Nesstar, a proprietary binary format whose only reader was a discontinued Windows desktop application. The company folded. The servers went offline. The licenses expired. But the data didn't stop mattering. nesstar-converter is a pure-Python binary parser that reads the format directly — no .exe, no Windows, no institutional subscription — and writes Parquet, CSV, Stata, Excel, and more. Your data. Open formats. Any platform.


What this does

  • Reverse-engineered binary parser — reads .Nesstar files directly in Python, no proprietary executable involved
  • Writes open formats — Parquet, CSV, TSV, Excel, Stata, JSON, JSONL, fixed-width text
  • Validates against official exports — cell-level comparison with Nesstar Explorer output
  • Runs everywhere — Linux, macOS, Windows; Python 3.10+

Quick start

Install:

pip install nesstar-converter

Convert:

nesstar-converter convert survey.Nesstar ddi.xml ./output --formats csv,parquet,stata

Validate:

nesstar-converter validate ./output ./exported_text

The DDI XML is auto-detected if it sits beside the .Nesstar file.
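
As a rough illustration of what "sits beside" could mean in practice, the sketch below looks for an XML file in the same folder as the data file. The function name `find_sibling_ddi` and its matching rules (same stem first, otherwise a single unambiguous XML file) are assumptions made for this example; the package's actual detection logic is not documented here and may differ.

```python
from pathlib import Path
from typing import Optional


def find_sibling_ddi(nesstar_path: str) -> Optional[Path]:
    """Return an XML file located beside the .Nesstar file, or None.

    Hypothetical helper for illustration only; not part of nesstar-converter.
    """
    data_file = Path(nesstar_path)
    folder = data_file.parent
    if not folder.is_dir():
        return None

    # First choice: an XML file sharing the data file's stem
    # (survey.Nesstar -> survey.xml).
    same_stem = folder / (data_file.stem + ".xml")
    if same_stem.is_file():
        return same_stem

    # Fallback: accept an XML file only when exactly one candidate exists,
    # so an ambiguous folder never yields a silent wrong guess.
    candidates = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".xml"
    )
    return candidates[0] if len(candidates) == 1 else None
```

Returning None on ambiguity keeps the caller in control: passing the DDI path explicitly, as in the convert command above, always remains an option.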


Supported formats

| Format | Extension | Best for |
| --- | --- | --- |
| parquet | .parquet | Python, R, DuckDB, long-term storage |
| csv | .csv | Excel, LibreOffice, Google Sheets |
| tsv | .tsv | Tab-separated workflows |
| excel | .xlsx | Non-technical users who just want a spreadsheet |
| stata | .dta | Stata, with leading zeros preserved |
| json | .json | Web apps, structured interchange |
| jsonl | .jsonl | Streaming pipelines |
| fwf | .txt | Fixed-width text |
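
The note on leading zeros in the stata row matters because survey identifiers such as state or district codes are labels, not quantities. The sketch below uses only the standard library and made-up rows (the column names and values are hypothetical) to show how a naive numeric cast loses that information while a text column keeps it.

```python
import csv
import io

# Hypothetical rows: "09" and "007" are identifier codes, not numbers.
rows = [{"state": "09", "district": "007", "hh_size": "4"}]

# Write the rows to an in-memory CSV, then read them back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["state", "district", "hh_size"])
writer.writeheader()
writer.writerows(rows)
buffer.seek(0)

# Kept as text, the codes survive the round trip unchanged.
as_text = list(csv.DictReader(buffer))

# Cast to integers, the leading zeros are gone and cannot be recovered
# without knowing the original field width.
as_numbers = [{key: int(value) for key, value in row.items()} for row in as_text]
```

Whichever output format is chosen, it is worth checking that code columns are still text before merging blocks on them.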

nesstar-converter vs ihsn/nesstar-exporter

The IHSN tool wraps the official Windows binary. It is not a replacement for it — you still need the .exe.

| Dimension | ihsn/nesstar-exporter | nesstar-converter |
| --- | --- | --- |
| Core approach | Python wrapper around NesstarExporter.exe | Pure-Python binary parser |
| Requires NesstarExporter.exe | Yes | No |
| OS model | Windows-oriented workflow | Linux / macOS / Windows |
| Reads binary directly | No | Yes |
| Reverse-engineered format support | No | Yes |
| Parquet output | No | Yes |
| RDF / DDI export via official tool | Yes | No |
| Validation against text exports | No built-in validation layer | Yes |
| Install model | Repo scripts + external exe path | Standard Python package / console script |

Evidence: the IHSN repo's own README, config.json, src/config.py, and src/exporter.py all require a path to NesstarExporter.exe and shell out to it with subprocess.run(...).


Who uses Nesstar

| Institution / repository | Country / region | Status |
| --- | --- | --- |
| NSD / Sikt | Norway | Original Nesstar developer and ESS host |
| UK Data Archive / UK Data Service | United Kingdom | Co-developer and former Nesstar WebView operator |
| European Social Survey | Pan-European | Disseminated through Nesstar from 2004 |
| Statistics Canada / ODESI | Canada | Licensed the full Nesstar suite |
| GESIS ZACAT | Germany | Former Nesstar WebView catalog |
| Sciences Po / CDSP | France | Documented migration away from Nesstar |
| SSJDA / CSRDA | Japan | Documented Nesstar deployment |
| IHSN / World Bank | Global | Still distributes Nesstar Publisher and migration tooling |
| India MoSPI / NSO | India | Active distributor of .Nesstar survey files |
| DataFirst / Stats SA | South Africa | Legacy archive and testing target |

Full evidence and source links: docs/global-coverage.md.


Validation coverage

Validation is reported at two levels. Cell-level means a row-for-row, value-for-value match against official exports. Structure-level means file counts and variable counts were confirmed, but the distributor did not ship the companion DDI XML needed for full binary re-validation.

| Survey | Years / rounds | Level | Result |
| --- | --- | --- | --- |
| EUS | 38th Round (1983) | Cell-level | 9/9 blocks, 3.4M rows, zero mismatches |
| HCES | 38th, 45th, 66th | Cell-level | 27/28 blocks, 23.4M+ rows, zero mismatches |
| PLFS | 2017-18 to 2022-23 | Structure-level | 24/24 exports matched NADA dictionary row/column counts |

PLFS raw packages include .Nesstar files but omit the companion DDI XML, so current evidence is structural. Cell-level re-validation awaits DDI availability.
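
To make the two validation levels concrete, here is a minimal sketch of the difference between a structural check and a cell-by-cell check. It is not the package's validator: `compare_tables` and `read_delimited` are hypothetical names, and the sample data is invented for the example.

```python
import csv
import io
from itertools import zip_longest


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def compare_tables(converted, official):
    """Return (structure_ok, mismatches) for two tables of string cells.

    Illustrative only; the real validation logic may differ.
    """
    converted = [list(row) for row in converted]
    official = [list(row) for row in official]

    # Structure-level: same row count and same column count in every row.
    structure_ok = len(converted) == len(official) and all(
        len(a) == len(b) for a, b in zip(converted, official)
    )

    # Cell-level: compare every value position by position and record
    # (row, column, converted_value, official_value) for each difference.
    mismatches = []
    for i, (a, b) in enumerate(zip_longest(converted, official, fillvalue=[])):
        for j, (x, y) in enumerate(zip_longest(a, b, fillvalue=None)):
            if x != y:
                mismatches.append((i, j, x, y))
    return structure_ok, mismatches


# A converted CSV and a tab-separated "official" export with one differing cell.
converted = read_delimited("id,age\n001,34\n002,51\n")
official = read_delimited("id\tage\n001\t34\n002\t15\n", delimiter="\t")
structure_ok, mismatches = compare_tables(converted, official)
```

In this example the structural check passes even though one cell differs, which is why a structure-level result is weaker evidence than a cell-level one.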


Python API

from nesstar_converter import convert_nesstar, show_info

show_info("survey.Nesstar", "ddi.xml")
convert_nesstar("survey.Nesstar", "ddi.xml", "./output", formats=["csv", "parquet"])
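
For archives that hold many files, the call above can be wrapped in a small batch loop. The sketch below is one possible shape, not part of the package: `convert_tree` is a hypothetical helper, it assumes each DDI file shares its data file's name with an `.xml` suffix, and it takes the converter as a parameter that is expected to accept the same arguments as the `convert_nesstar` call shown above.

```python
from pathlib import Path
from typing import Callable, Dict, List, Optional


def convert_tree(
    root: str,
    out_root: str,
    convert: Callable[..., object],
    formats: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Convert every *.Nesstar file under root; return a per-file status report.

    Hypothetical helper. `convert` is called as
    convert(nesstar_path, ddi_path, output_dir, formats=[...]).
    """
    formats = formats or ["csv", "parquet"]
    report: Dict[str, str] = {}

    for data_file in sorted(Path(root).rglob("*.Nesstar")):
        # Assumption: the DDI XML shares the data file's stem.
        ddi = data_file.with_suffix(".xml")
        if not ddi.is_file():
            report[str(data_file)] = "skipped: no DDI XML"
            continue

        # One output folder per survey file keeps the blocks separate.
        out_dir = Path(out_root) / data_file.stem
        out_dir.mkdir(parents=True, exist_ok=True)
        try:
            convert(str(data_file), str(ddi), str(out_dir), formats=formats)
            report[str(data_file)] = "ok"
        except Exception as exc:
            # Record the failure and continue with the remaining files.
            report[str(data_file)] = f"failed: {exc}"
    return report
```

With the package installed, passing `convert_nesstar` as the `convert` argument, for example `convert_tree("./raw", "./output", convert_nesstar)`, would run the loop against real files. Injecting the converter also makes the loop easy to dry-run with a stub.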

Limitations

  • Expects DDI metadata. Without the companion DDI XML, the parser cannot yet do full extraction from the binary alone.
  • Data conversion, not RDF packaging. For DDI/RDF export via the official legacy toolchain, the IHSN wrapper exists — but still requires NesstarExporter.exe.
  • Legacy ecosystems vary. Different institutions used different Nesstar-era conventions; community test cases from outside India are especially valuable.
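
To see what the companion DDI XML contributes, it can help to list the variables it declares. The sketch below assumes a DDI Codebook style document in which each variable is a `var` element with a `name` attribute and an optional `labl` child; the sample XML is invented, `list_variables` is a hypothetical helper, and how nesstar-converter itself consumes the DDI is not described here.

```python
import xml.etree.ElementTree as ET

# Invented sample in the DDI Codebook style (namespace shown for DDI 2.5).
SAMPLE_DDI = """<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <var ID="V1" name="HHID"><labl>Household ID</labl></var>
    <var ID="V2" name="AGE"/>
  </dataDscr>
</codeBook>"""


def local(tag: str) -> str:
    """Strip an XML namespace prefix: '{ns}var' -> 'var'."""
    return tag.rsplit("}", 1)[-1]


def list_variables(ddi_xml: str):
    """Return [(name, label)] for each var element, ignoring namespaces."""
    root = ET.fromstring(ddi_xml)
    variables = []
    for elem in root.iter():
        if local(elem.tag) != "var":
            continue
        # Use the first labl child as the label; empty string if absent.
        label = next(
            (child.text or "" for child in elem if local(child.tag) == "labl"),
            "",
        )
        variables.append((elem.get("name", ""), label.strip()))
    return variables


variables = list_variables(SAMPLE_DDI)
```

Matching on local tag names rather than a fixed namespace is a deliberate choice, since files from different archives may declare different DDI versions.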

Contributing

  • Test on non-Indian Nesstar files and report results
  • Share evidence of .Nesstar / .NSDstat datasets still in circulation
  • Help improve metadata recovery for archives that omit DDI XML

Docs: docs/TECHNICAL.md · docs/global-coverage.md


Citation

If you use this in research, please cite via CITATION.cff.

License

MIT
