pytde
A Python parser for legacy Tableau Data Extract (TDE) files.
Overview
pytde reads TDE files (the legacy Tableau data format used before Hyper) and converts them to pandas DataFrames. This is useful for accessing data from older Tableau extracts without needing Tableau Desktop or Server.
Installation
pip install pytde
Or install from source:
git clone https://github.com/ronreiter/pytde.git
cd pytde
pip install -e .
Quick Start
from pytde import read_tde
# Read a TDE file
tables = read_tde('data.tde')
# Get the Extract table as a DataFrame
df = tables['Extract']
# Work with the data
print(df.head())
print(df.describe())
API Reference
read_tde(file_path: str) -> dict[str, pd.DataFrame]
Read a TDE file and return its contents as pandas DataFrames.
from pytde import read_tde
tables = read_tde('sales_data.tde')
df = tables['Extract']
read_tde_metadata(file_path: str) -> dict
Read TDE file metadata without fully parsing the data.
from pytde import read_tde_metadata
metadata = read_tde_metadata('sales_data.tde')
print(metadata['columns']) # List of column names
print(metadata['column_types']) # Column data types
print(metadata['file_size']) # File size in bytes
print(metadata['format_version']) # TDE format version
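Because the metadata is available without a full parse, it can serve as a cheap pre-check before loading a file. The sketch below is a hypothetical helper, not part of pytde. It assumes only that the metadata is a dict whose 'columns' entry is a list of names, as shown above. The sample dict is hand-written from the CLI output in this README rather than produced by read_tde_metadata, and the value types in it are assumptions.

```python
def missing_columns(metadata: dict, required: list[str]) -> list[str]:
    """Return the required column names that are absent from metadata['columns']."""
    present = set(metadata.get("columns", []))
    return [name for name in required if name not in present]


# Hand-written stand-in for the dict returned by read_tde_metadata().
sample_metadata = {
    "columns": ["Region", "Sales", "Sales Person"],
    "column_types": {"Region": "string", "Sales": "double", "Sales Person": "string"},
    "file_size": 51306,
    "format_version": 2,
}

missing = missing_columns(sample_metadata, ["Region", "Profit"])
print(missing)
```

With a real file, the dict returned by read_tde_metadata would take the place of sample_metadata.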
TDEParser class
For more control over parsing, use the TDEParser class directly:
from pytde import TDEParser
parser = TDEParser('data.tde')
tables = parser.parse()
metadata = parser.get_metadata()
# Access internal state
print(parser.xml_metadata) # Embedded XML schema
print(parser.column_entries) # Column index entries
Command Line Interface
pytde includes a CLI tool for quick inspection of TDE files:
pytde data.tde
Output:
Parsing TDE file: data.tde
------------------------------------------------------------
Format version: 2
File size: 51306 bytes
Columns found: ['Region', 'Sales', 'Sales Person']
Column types: {'Region': 'string', 'Sales': 'double', 'Sales Person': 'string'}
============================================================
Table: Extract
Shape: (43, 3)
Columns: ['Region', 'Sales', 'Sales Person']
...
Supported Data Types
| TDE Type | Python/Pandas Type |
|---|---|
| string | object (str) |
| double | float64 |
| integer | int64 |
| date | object* |
| datetime | object* |
| boolean | bool |
*Date/datetime support is limited in the current version.
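The table above can be written as a lookup to predict the dtypes of a parsed DataFrame from the 'column_types' metadata. This is a minimal sketch: TDE_TO_PANDAS and expected_dtypes are illustrative names that pytde does not export, and falling back to "object" for unlisted types is an assumption.

```python
# Mapping copied from the Supported Data Types table.
TDE_TO_PANDAS = {
    "string": "object",
    "double": "float64",
    "integer": "int64",
    "date": "object",
    "datetime": "object",
    "boolean": "bool",
}


def expected_dtypes(column_types: dict[str, str]) -> dict[str, str]:
    """Map each column's TDE type name to the pandas dtype name from the table."""
    # Assumption: unknown TDE types are treated as generic Python objects.
    return {name: TDE_TO_PANDAS.get(tde_type, "object")
            for name, tde_type in column_types.items()}


column_types = {"Region": "string", "Sales": "double", "Sales Person": "string"}
dtypes = expected_dtypes(column_types)
print(dtypes)
```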
TDE File Format
TDE files are binary files that store columnar data optimized for Tableau's data engine. Key features:
- Little-endian byte ordering
- Block-based structure with markers (f0ca1278 for data, f1ca1278 for index)
- Dictionary encoding for string columns
- Embedded XML metadata for schema definitions
For a detailed format specification, see TDE.MD.
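As a rough illustration of the block markers and byte ordering listed above, the following sketch scans a byte buffer for the two marker values. It assumes the markers appear in the file as exactly the byte sequences f0 ca 12 78 and f1 ca 12 78; that reading, the find_markers helper, and the synthetic sample buffer are all assumptions made for illustration and do not reflect how pytde locates blocks internally.

```python
import struct

# Assumption: markers are stored as these literal byte sequences.
DATA_MARKER = bytes.fromhex("f0ca1278")
INDEX_MARKER = bytes.fromhex("f1ca1278")


def find_markers(buf: bytes) -> list[tuple[int, str]]:
    """Return sorted (offset, kind) pairs for every marker found in buf."""
    hits = []
    for marker, kind in ((DATA_MARKER, "data"), (INDEX_MARKER, "index")):
        pos = buf.find(marker)
        while pos != -1:
            hits.append((pos, kind))
            pos = buf.find(marker, pos + 1)
    return sorted(hits)


# Synthetic buffer for demonstration, not a real TDE file.
sample = b"\x00\x00" + DATA_MARKER + b"abcd" + INDEX_MARKER + b"\x01"
offsets = find_markers(sample)
print(offsets)

# Little-endian read: the same four bytes interpreted as an unsigned 32-bit integer.
data_marker_as_int = struct.unpack("<I", DATA_MARKER)[0]
print(hex(data_marker_as_int))
```

The final lines show what little-endian ordering means in practice: the first byte in the file is the least significant byte of the integer.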
Limitations
- Index decoding: The row-to-value mapping for string columns uses a fallback distribution when exact indices cannot be decoded
- Date/time columns: Limited support for date and datetime types
- Compression: Some compressed TDE files may not be fully supported
- Large files: Memory usage scales with file size (entire file is loaded into memory)
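Since the entire file is loaded into memory, one simple precaution is to check the file size before parsing. The helper below is a hypothetical sketch using only the standard library. The byte budgets are arbitrary, and file size is only a rough proxy, because the resulting DataFrames need memory as well.

```python
import os
import tempfile


def fits_in_budget(path: str, max_bytes: int) -> bool:
    """Return True if the file at path is no larger than max_bytes."""
    return os.path.getsize(path) <= max_bytes


# Demonstrate on a 1 KiB temporary file standing in for a TDE file.
with tempfile.NamedTemporaryFile(suffix=".tde", delete=False) as tmp:
    tmp.write(b"\x00" * 1024)
    demo_path = tmp.name

small_ok = fits_in_budget(demo_path, 4096)  # 1024 <= 4096
too_big = fits_in_budget(demo_path, 512)    # 1024 > 512
os.remove(demo_path)

print(small_ok, too_big)
```

In practice the check would run before read_tde, with a budget chosen for the machine at hand.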
Development
Setup
git clone https://github.com/ronreiter/pytde.git
cd pytde
pip install -e ".[dev]"
Running Tests
pytest
Running Tests with Coverage
pytest --cov=pytde --cov-report=html
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Acknowledgments
This parser was developed through reverse engineering of the TDE file format. Special thanks to the open source community for their work on understanding proprietary file formats.