
SAS XPORT file reader


Python reader for SAS XPORT data transport files (*.xpt).

What’s it for?

XPORT is the binary file format used by a bunch of United States government agencies for publishing data sets. It made a lot of sense if you were trying to read data files on your IBM mainframe back in 1988.

The official SAS specification for XPORT is relatively straightforward. The hardest part is converting IBM-format floating point to IEEE-format, which the specification explains in detail.
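To give a feel for that conversion, here is a minimal sketch, not the module's own implementation. It assumes the usual IBM double layout (1 sign bit, a 7-bit base-16 exponent biased by 64, and a 56-bit fraction). The name ibm_to_float is hypothetical, and mapping a zero fraction with a nonzero first byte to NaN is a simplifying assumption about how missing values are stored.

```python
import struct


def ibm_to_float(data):
    """Convert 8 bytes of IBM hexadecimal floating point to a Python float.

    Hypothetical helper for illustration; not the xport module's own code.
    """
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    # Big-endian layout: 1 sign bit, 7 exponent bits, 56 fraction bits.
    (bits,) = struct.unpack(">Q", data)
    sign = -1.0 if bits >> 63 else 1.0
    exponent = (bits >> 56) & 0x7F       # base-16 exponent, biased by 64
    fraction = bits & ((1 << 56) - 1)    # numerator of a fraction over 2**56
    if fraction == 0:
        # Assumption: all-zero bytes are a true zero; any other first byte
        # with a zero fraction is treated as a missing value (NaN).
        return 0.0 if bits == 0 else float("nan")
    return sign * (fraction / float(1 << 56)) * 16.0 ** (exponent - 64)


one = ibm_to_float(b"\x41\x10\x00\x00\x00\x00\x00\x00")       # 1.0
negative = ibm_to_float(b"\xc2\x76\xa0\x00\x00\x00\x00\x00")  # -118.625
```

The IBM fraction carries 56 bits while an IEEE double keeps 53, so very precise values may be rounded in the last bits.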

How do I use it?

This module mimics the csv module of the standard library:

import xport
with open('example.xpt', 'rb') as f:
    for row in xport.reader(f):
        print(row)

Each row will be a namedtuple, with an attribute for each field in the dataset. Values in the row will be either a unicode string or a float, as specified by the XPT file metadata. Note that since XPT files are in an unusual binary format, you should open them using mode 'rb'.
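As a rough illustration of what working with such a row looks like, the sketch below builds a stand-in namedtuple by hand. The Row type and its field names (SUBJECT, AGE, SEX) are hypothetical; with a real file the field names come from the XPT metadata.

```python
from collections import namedtuple

# Hypothetical stand-in for one row; a real reader would build this type
# from the field names stored in the XPT file.
Row = namedtuple("Row", ["SUBJECT", "AGE", "SEX"])
row = Row(SUBJECT="A001", AGE=34.0, SEX="F")

age = row.AGE            # access a value by field name
subject = row[0]         # access a value by position
as_dict = row._asdict()  # mapping of field name to value
```

Because rows are tuples, they also unpack and compare like ordinary tuples.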

You can also use the xport module as a command-line tool to convert an XPT file to CSV (comma-separated values):

$ python -m xport example.xpt > example.csv
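If you would rather do the conversion from your own Python 3 code, something along these lines should work. This is a sketch built on the standard csv module, not the command-line tool's actual implementation. The helper rows_to_csv and the sample data are hypothetical; in real use you would pass it xport.reader(f) and the reader's field names.

```python
import csv
import io
from collections import namedtuple


def rows_to_csv(rows, fields, out):
    """Write a header line and then every row to the text stream `out`."""
    writer = csv.writer(out)
    writer.writerow(fields)
    writer.writerows(rows)


# Stand-in data; real rows would come from xport.reader(f).
Row = namedtuple("Row", ["NAME", "SCORE"])
sample = [Row("alice", 1.5), Row("bob", 2.0)]

buffer = io.StringIO()
rows_to_csv(sample, Row._fields, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file, open it with newline='' as the csv documentation recommends.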

The reader object also has a handful of metadata attributes:

  • reader.fields – Names of the fields in each observation.

  • reader.version – SAS version number used to create the XPT file.

  • reader.os – Operating system used to create the XPT file.

  • reader.created – Date and time that the XPT file was created.

  • reader.modified – Date and time that the XPT file was last modified.

Random access to records

If you want to access specific records, you should either consume the reader into a list or use one of the itertools recipes for quickly consuming and throwing away unnecessary elements.

# Each snippet is an alternative; use a freshly opened file f for each.

# Collect all the records in a list for random access
rows = list(xport.reader(f))

# Select only the record at index 42 (zero-based)
from itertools import islice
row = next(islice(xport.reader(f), 42, None))

# Select only the last 42 records
from collections import deque
rows = deque(xport.reader(f), maxlen=42)
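The same patterns can be wrapped in small helpers in the style of the itertools recipes. The sketch below uses a plain generator as a stand-in for the reader; nth and tail are illustrative names and are not part of the xport module.

```python
from collections import deque
from itertools import islice


def nth(iterable, n, default=None):
    """Return the item at zero-based index n, or default if there is none."""
    return next(islice(iterable, n, None), default)


def tail(iterable, n):
    """Return the last n items as a list, keeping at most n in memory."""
    return list(deque(iterable, maxlen=n))


# Stand-in for xport.reader(f): any iterator of records works the same way.
records = ("record-%d" % i for i in range(100))
row = nth(records, 42)
```

Both helpers consume the iterator as they go, so skipped records are gone once they have been passed over.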

Recent changes

  • Improved the API.

  • Fixed handling of NaNs.

  • Fixed piping the file from stdin in Python 3.

Authors

Original version by Jack Cushman, 2012. Major revision by Michael Selik, 2016.
