Data list management library

Dlist

A list of dictionaries with database-like operations. Think of it as a lightweight, in-memory table where each row is a Python dictionary.

Requires Python ≥ 3.10

Quickstart
Indexing
Four core operations
Arithmetic
Sorting
Input/Output
Quick reference
Project structure
License

Quickstart

from dlist import Dlist

data = [
    {'name': 'alice', 'age': '30', 'city': 'london'},
    {'name': 'bob',   'age': '25', 'city': 'london'},
    {'name': 'charlie', 'age': '28', 'city': 'paris'},
]

d = Dlist(data, id_='name')

d.filter(city='london')          # select rows  → new Dlist (2 items)
d.assign(status='active')        # set fields   → new Dlist (3 items, all with status)
d.partition('city')              # group by key → {'london': Dlist, 'paris': Dlist}
d.tree()                         # explore structure in terminal

Installation

pip install pydlist

Or for development:

pip install pydlist[dev]

The distribution is published on PyPI as pydlist; the import name is dlist.

Core concepts

Data model

A Dlist is a list of dictionaries. Scalar values are stored as strings; list values are preserved as-is. An optional id_ field acts as a unique primary key.

d = Dlist([
    {'id': '1', 'product': 'laptop',   'price': '1200'},
    {'id': '2', 'product': 'mouse',    'price': '25'},
    {'id': '3', 'product': 'keyboard', 'price': '80'},
], id_='id')

Nested dictionaries

Values can be nested dicts. Access nested keys using __ (double underscore), similar to Django's ORM syntax:

data = [
    {'name': 'john', 'address': {'city': 'NYC', 'zip': '10001'}},
    {'name': 'jane', 'address': {'city': 'LA',  'zip': '90001'}},
]
d = Dlist(data, id_='name')

d.vals('address__city')              # → ['LA', 'NYC']
d.filter(address__city='NYC')        # → Dlist with john

Internally, the nested structure is always preserved. The __ syntax is just for addressing.
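
To picture how a __ path addresses a nested value, here is a minimal plain-Python sketch. It is not the library's implementation; get_nested is a hypothetical helper name introduced only for illustration, and returning a default for missing keys is an assumption.

```python
def get_nested(record, path, default=None):
    """Follow a path such as 'address__city' through nested dicts.

    Hypothetical helper, not part of dlist. Returns `default` as soon as
    one step of the path is missing.
    """
    current = record
    for part in path.split('__'):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


record = {'name': 'john', 'address': {'city': 'NYC', 'zip': '10001'}}

city = get_nested(record, 'address__city')       # nested lookup
name = get_nested(record, 'name')                # top-level keys work too
missing = get_nested(record, 'address__state')   # absent key -> default
```

The record itself is never flattened; only the lookup walks the path.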

List values

Fields can contain lists. List values are preserved (not coerced to strings) and get special treatment in filter, vals, and partition:

data = [
    {'id': '1', 'name': 'photo1', 'tags': ['photo', 'landscape']},
    {'id': '2', 'name': 'video1', 'tags': ['video', 'interview']},
    {'id': '3', 'name': 'mixed',  'tags': ['photo', 'video']},
    {'id': '4', 'name': 'doc1',   'tags': 'report'},
]
d = Dlist(data, id_='id')

d.filter(tags='photo')       # → items 1, 3 (membership test)
d.filter(tags='-photo')      # → items 2, 4 (exclude)
d.filter(tags='*port*')      # → item 4 (glob on each element)
d.vals('tags')               # → ['interview', 'landscape', 'photo', 'report', 'video']
d.vals('tags', count=True)   # → {'photo': 2, 'video': 2, ...}

When displayed in tables, lists are shown as comma-separated values: photo, landscape.
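
The matching rules for list fields can be sketched in plain Python with the standard fnmatch module. This is an illustration of the behaviour described above, not the library's code; matches and ids_matching are hypothetical names, and treating a scalar as a one-element list is an assumption.

```python
from fnmatch import fnmatchcase


def matches(value, pattern):
    """Sketch of the rules above (hypothetical helper, not dlist's code)."""
    if pattern.startswith('-'):
        # A leading '-' inverts the test.
        return not matches(value, pattern[1:])
    # Assumption: a scalar behaves like a one-element list.
    elements = value if isinstance(value, list) else [value]
    # The pattern is tried against each element; plain text matches exactly.
    return any(fnmatchcase(element, pattern) for element in elements)


rows = [
    {'id': '1', 'tags': ['photo', 'landscape']},
    {'id': '2', 'tags': ['video', 'interview']},
    {'id': '3', 'tags': ['photo', 'video']},
    {'id': '4', 'tags': 'report'},
]


def ids_matching(pattern):
    return [row['id'] for row in rows if matches(row['tags'], pattern)]


photo = ids_matching('photo')       # ['1', '3']
no_photo = ids_matching('-photo')   # ['2', '4']
port = ids_matching('*port*')       # ['4']
```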

Exploring data

`keys()` — available keys

d.keys()                    # all keys (sorted)
d.keys('address')           # subkeys under 'address'
d.keys(All=True)            # only keys present in every record
d.keys(count=True)          # {'name': 3, 'address': 3, ...}

`vals()` — values for a key

d.vals('name')              # ['jane', 'john']
d.vals('address__city', count=True)   # {'NYC': 1, 'LA': 1}
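
As a rough mental model, counting values amounts to feeding every value (or every list element) into a counter. The sketch below uses plain Python with hypothetical data and a hypothetical value_counts helper; counting each list element once per record is an assumption, not confirmed library behaviour.

```python
from collections import Counter


def value_counts(rows, key):
    """Count values for `key`; list values contribute each element once."""
    counts = Counter()
    for row in rows:
        if key not in row:
            continue  # records without the key are ignored
        value = row[key]
        counts.update(value if isinstance(value, list) else [value])
    return counts


# Hypothetical sample data, not taken from the examples above.
rows = [
    {'name': 'john', 'langs': ['en', 'fr']},
    {'name': 'jane', 'langs': ['en']},
    {'name': 'joe'},
]

counts = value_counts(rows, 'langs')   # frequencies per value
unique_sorted = sorted(counts)         # sorted unique values
```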

`tree()` — visual overview

d.tree()

Dlist (2 items, id='name')
├── name    2/2 ['john' (1), 'jane' (1)]
└── address
    ├── city    2/2 ['NYC' (1), 'LA' (1)]
    └── zip     2/2 ['10001' (1), '90001' (1)]

Each leaf shows: key unique_values/records_with_key [top values (count), ...]

Use d.tree(root='address') to zoom into a subtree, or d.tree(top=5) to show more values.

Indexing

Access records by integer position, by id, or by a list of either:

d[0]                # → dict (first record)
d['john']           # → dict (by id)
d[[0, 1]]           # → Dlist (subset)
d[['john', 'jane']] # → Dlist (subset by ids)

Assignment

Assignment merges the new dict into the existing record:

d['john'] = {'phone': '555-1234'}
# john now has name, address AND phone

d[['john', 'jane']] = {'verified': 'True'}
# both records get the new key

Setting or generating an id

Promote an existing field to id, or generate sequential ids:

d2 = d.set_id('name')                        # existing field → id
d2 = d.sort('name').set_id('id', generate='R')  # sort first, then generate R01, R02, …
d2 = d.set_id('id', generate='E', digits=5)  # E00001, E00002, …

When digits is omitted, set_id uses one more digit than the record count needs (5 records → 2 digits, 150 → 4). With generate, it raises ValueError if the key already exists in records.
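
The digit rule can be worked through in a few lines of plain Python. generate_ids is a hypothetical helper written for this illustration, not the library's function; it only reproduces the padding arithmetic stated above.

```python
def generate_ids(count, prefix, digits=None):
    """Return ids such as R01, R02, ... (hypothetical helper, not dlist's code)."""
    if digits is None:
        # One more digit than the record count needs: 5 -> 2, 150 -> 4.
        digits = len(str(count)) + 1
    return [f'{prefix}{index:0{digits}d}' for index in range(1, count + 1)]


auto = generate_ids(5, 'R')               # ['R01', 'R02', 'R03', 'R04', 'R05']
fixed = generate_ids(2, 'E', digits=5)    # ['E00001', 'E00002']
large = generate_ids(150, 'R')            # 4 digits: 'R0001' ... 'R0150'
```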

Four core operations

The four core operations are non-mutating: each returns new objects (a Dlist, or a dict of Dlists for partition) and leaves the original untouched.

`filter(**query)` — select rows

Returns a subset of records matching the query.

d.filter(city='london')              # exact match
d.filter(name='a*')                  # glob pattern
d.filter(city='-paris')              # exclude value
d.filter(phone=True)                 # key must exist
d.filter(phone=False)                # key must NOT exist
d.filter(city='london', age='30')    # AND (all must match)
d.filter(tags='photo')               # list membership (if tags is a list)

Pass complement=True to get both the matching records and the rest:

matched, rest = d.filter(complement=True, city='london')
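
The query rules for scalar fields (AND across conditions, True/False for key existence, a leading '-' for exclusion, glob patterns) can be sketched in plain Python. record_matches and filter_rows are hypothetical names, and the sketch assumes that a record lacking the key never satisfies a string condition; the library may decide that case differently.

```python
from fnmatch import fnmatchcase


def record_matches(record, query):
    """True when the record satisfies every condition (logical AND)."""
    for key, wanted in query.items():
        if wanted is True:                 # key must exist
            if key not in record:
                return False
        elif wanted is False:              # key must NOT exist
            if key in record:
                return False
        elif key not in record:            # assumption: missing key never matches
            return False
        elif wanted.startswith('-'):       # exclusion
            if fnmatchcase(record[key], wanted[1:]):
                return False
        elif not fnmatchcase(record[key], wanted):   # exact or glob match
            return False
    return True


def filter_rows(rows, **query):
    """Return (matched, rest), mirroring complement=True."""
    matched = [row for row in rows if record_matches(row, query)]
    rest = [row for row in rows if not record_matches(row, query)]
    return matched, rest


rows = [
    {'name': 'alice', 'age': '30', 'city': 'london'},
    {'name': 'bob', 'age': '25', 'city': 'london'},
    {'name': 'charlie', 'age': '28', 'city': 'paris'},
]

matched, rest = filter_rows(rows, city='london', age='30')
```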

`map(f, inputL, outputL)` — transform fields

Applies a function to each record, reading from inputL keys and writing to outputL keys.

# Single output — return a bare value
d2 = d.map(lambda name: name.upper(), ['name'], ['upper_name'])

# Multiple outputs — return a tuple or list
d2 = d.map(lambda name, age: (name.upper(), int(age) + 1),
           ['name', 'age'], ['upper', 'next_age'])

Apply only to matching records (others are kept unchanged):

d2 = d.map(lambda price: str(float(price) * 0.9),
           ['price'], ['price'],
           query={'product': 'laptop'})

If f returns None, the record is kept unchanged. If an input key is missing, the record is skipped.
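
A plain-Python sketch of these rules follows. map_rows is a hypothetical stand-in, not the library's method. It assumes that inputs are passed positionally in the order of the input list, and that "skipped" means the record is passed through unchanged; neither detail is confirmed here.

```python
def map_rows(rows, func, inputs, outputs):
    """Apply func to each row and return new dicts (originals untouched)."""
    result = []
    for row in rows:
        new_row = dict(row)
        # Assumption: a row missing any input key is passed through unchanged.
        if all(key in row for key in inputs):
            returned = func(*(row[key] for key in inputs))
            if returned is not None:       # None -> keep the record as it is
                # One output: bare value. Several outputs: tuple or list.
                values = returned if len(outputs) > 1 else (returned,)
                new_row.update(zip(outputs, values))
        result.append(new_row)
    return result


rows = [{'name': 'alice', 'age': '30'}, {'name': 'bob'}]

single = map_rows(rows, lambda name: name.upper(), ['name'], ['upper_name'])
double = map_rows(rows,
                  lambda name, age: (name.upper(), str(int(age) + 1)),
                  ['name', 'age'], ['upper', 'next_age'])
```

In this sketch bob lacks an age, so the two-input call leaves his record as it was.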

`assign(query={}, **kwargs)` — set constant values

Shorthand for map when you just want to set fixed values:

d2 = d.assign(status='active')                          # all records
d2 = d.assign(query={'city': 'london'}, status='uk')    # only matching

`partition(key)` — group by key

Splits into a dict of Dlists, one per unique value:

groups = d.partition('city')
# {'london': Dlist(2 items), 'paris': Dlist(1 items)}

Records where the key is missing are grouped under the key None.
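
Grouping of this kind is easy to picture with a defaultdict. The sketch below handles scalar values only and uses a hypothetical partition_rows helper; it is not the library's implementation, and list-valued keys are left out.

```python
from collections import defaultdict


def partition_rows(rows, key):
    """Group rows by the value stored under key (scalar values only)."""
    groups = defaultdict(list)
    for row in rows:
        # row.get returns None when the key is missing, so those rows
        # end up together under the None group.
        groups[row.get(key)].append(row)
    return dict(groups)


rows = [
    {'name': 'alice', 'city': 'london'},
    {'name': 'bob', 'city': 'london'},
    {'name': 'charlie', 'city': 'paris'},
    {'name': 'dora'},                      # hypothetical record without a city
]

groups = partition_rows(rows, 'city')
```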

Arithmetic

dl1 + dl2    # merge: combine records, dl1 values win on conflict
dl1 - dl2    # subtract: remove dl2 records from dl1
dl1 == dl2   # equality: same records and same id

With an id, + merges dicts with matching ids and appends new ones. Without an id, records are compared by full dict equality.
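
The id-based merge can be sketched with ordinary dicts. merge_by_id is a hypothetical helper, not the library's operator, and it merges only the top level of each record; how nested dicts are combined is not covered here.

```python
def merge_by_id(left, right, id_key):
    """Sketch of left + right when both sides share an id field."""
    merged = {row[id_key]: dict(row) for row in left}
    for row in right:
        row_id = row[id_key]
        if row_id in merged:
            # Right-hand values only fill gaps; left-hand values win on conflict.
            merged[row_id] = {**row, **merged[row_id]}
        else:
            merged[row_id] = dict(row)     # new id: append the record
    return list(merged.values())


left = [{'id': '1', 'price': '10'}]
right = [{'id': '1', 'price': '99', 'stock': '5'},
         {'id': '2', 'price': '7'}]

combined = merge_by_id(left, right, 'id')
```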

Sorting

Unlike the four core operations, sort() works in place. It returns self for chaining:

d.sort()                          # sort by id field
d.sort('name')                    # sort by any key
d.sort('price', reverse=True)     # descending
d.sort('address__city')           # nested key
d.sort(['type', 'name'])          # multi-key: primary by type, secondary by name
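
A multi-key sort boils down to sorting on a tuple of values, one per key. The sketch below is plain Python with a hypothetical lookup helper; placing records with a missing key first (by sorting them as the empty string) is an assumption.

```python
def lookup(record, path):
    """Resolve a '__' path; missing keys sort as '' (an assumption)."""
    current = record
    for part in path.split('__'):
        if not isinstance(current, dict) or part not in current:
            return ''
        current = current[part]
    return current


rows = [
    {'type': 'video', 'name': 'b'},
    {'type': 'photo', 'name': 'z'},
    {'type': 'video', 'name': 'a'},
]

keys = ['type', 'name']   # primary key first, then secondary
rows.sort(key=lambda row: tuple(lookup(row, key) for key in keys))
names = [row['name'] for row in rows]
```

Note that a plain string comparison like this one orders '1200' before '25'. Whether the library compares numeric-looking strings as numbers is not stated here.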

Input/Output

Reading JSON files

d = Dlist.read('data.json', id_='id')          # from file
d = Dlist.read('[{"id": "1", "name": "a"}]')   # from JSON string
d = Dlist.read([{'id': '1', 'name': 'a'}])     # from list of dicts

Join multiple JSON files by a shared key (requires duckdb):

d = Dlist.read_pivot('evidence/*.json')                     # glob pattern
d = Dlist.read_pivot({'files.json': 'file',
                      'types.json': 'type'}, pivot='id')    # explicit mapping

Reading Excel files

Parse .xlsx workbooks with automatic header detection and subcategory recognition:

d = Dlist.read('report.xlsx')                          # format auto-detected from .xlsx
d = Dlist.read('report.xlsx', header_row=3)            # explicit header row
d = Dlist.read('report.xlsx', sheets=[1, 2])           # specific sheets
d = Dlist.read('report.xlsx', header_row={1: 3, 2: 1}) # per-sheet header

Each record gets a row field (original row number). Multi-sheet workbooks add a tab field. Subcategory rows (column A empty, column B has a label) inject a category field into subsequent records.

d = Dlist.read('report.xlsx', header_row=3, sheets=[1], id_='ID')
d.filter(category='Forensic*')    # filter by subcategory
d.partition('category')           # group by subcategory
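
The subcategory rule described above can be sketched on rows that have already been read from a sheet. This illustration does not touch a real workbook and is not the library's parser; rows_to_records, the sample rows, and storing the row number as a string are all assumptions.

```python
def rows_to_records(header, rows, first_row=2):
    """Turn raw cell rows into records, carrying the current subcategory."""
    records, category = [], None
    for offset, cells in enumerate(rows):
        if not cells[0] and cells[1]:
            # Column A empty, column B has a label: this is a subcategory row.
            category = cells[1]
            continue
        record = dict(zip(header, cells))
        record['row'] = str(first_row + offset)   # original sheet row number
        if category is not None:
            record['category'] = category
        records.append(record)
    return records


header = ['ID', 'Item']
rows = [
    ('', 'Forensic tools'),     # subcategory row (sheet row 2)
    ('1', 'imager'),
    ('2', 'write blocker'),
    ('', 'Reports'),            # next subcategory row (sheet row 5)
    ('3', 'summary'),
]

records = rows_to_records(header, rows)
```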

The Excel formatter writes subcategory labels to column B, so write → read round-trips preserve categories.

See docs/parsers.md for full details.

Formatted output

Write tables in multiple formats — ASCII, Markdown, LaTeX, Excel, and JSON:

print(d.write(format='ascii', keys=['id', 'name']))
print(d.write(format='md', keys=['id', 'name']))
print(d.write(format='latex', keys=['id', 'name']))
d.write('output.xlsx', keys=['id', 'name'])   # format auto-detected from .xlsx
d.write('data.json')                           # format auto-detected from .json

Categorized output groups records by one or more keys:

print(d.write(format='ascii', ctree=['type'], keys=['id', 'name'],
              titles={'type': 'Type: {}', 'id': 'ID', 'name': 'Name'}))

All table formatters default to width='auto' (columns sized to content). Override with width=10 for fixed-width or maxwidth=20 to cap auto-width.
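
To show how the three width settings interact, here is a small plain-Python sketch. column_widths and render are hypothetical helpers, not the library's formatter; sizing columns to include the header text and truncating over-long cells are assumptions.

```python
def column_widths(rows, keys, width='auto', maxwidth=None):
    """Pick a width per column: fixed, or sized to content with an optional cap."""
    if width != 'auto':
        return {key: width for key in keys}
    widths = {}
    for key in keys:
        # Assumption: the header text counts towards the column width.
        longest = max([len(key)] + [len(str(row.get(key, ''))) for row in rows])
        widths[key] = min(longest, maxwidth) if maxwidth else longest
    return widths


def render(rows, keys, **options):
    """Render a bare-bones table; over-long cells are truncated."""
    widths = column_widths(rows, keys, **options)

    def line(cells):
        return ' | '.join(str(cell)[:widths[key]].ljust(widths[key])
                          for key, cell in zip(keys, cells))

    body = [line([row.get(key, '') for key in keys]) for row in rows]
    return '\n'.join([line(keys)] + body)


rows = [{'id': '1', 'name': 'keyboard'}, {'id': '2', 'name': 'mouse'}]

auto = column_widths(rows, ['id', 'name'])                  # sized to content
capped = column_widths(rows, ['id', 'name'], maxwidth=5)    # auto, capped at 5
fixed = column_widths(rows, ['id', 'name'], width=10)       # fixed width
table = render(rows, ['id', 'name'], maxwidth=5)
```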

See docs/formatters.md for full details on each format, styling options, multi-column layout, pagination, and Excel subcategory output.

Quick reference

Method	Description
Creation
`Dlist(data, id_='key')`	Create from list of dicts
`Dlist.read('file.json')`	Read from JSON file, string, or list
`Dlist.read('file.xlsx')`	Read from Excel (format auto-detected)
`Dlist.read_pivot('*.json')`	Join multiple JSON files by shared key
Querying
`d.filter(**query)`	Select rows matching query → new Dlist
`d.filter(complement=True, ...)`	Select + return non-matching rows
`d.vals('key')`	Unique values for a key
`d.vals('key', count=True)`	Value frequencies as dict
`d.keys()`	All available keys
`d.tree()`	Visual structure overview
Indexing
`d[0]`, `d['id_val']`	Get record by position or id
`d[[0, 1]]`, `d[['a', 'b']]`	Get subset → new Dlist
`d['id_val'] = {...}`	Merge dict into record
Transforming
`d.map(f, inputs, outputs)`	Apply function to each record → new Dlist
`d.assign(**kwargs)`	Set constant values → new Dlist
`d.assign(query={...}, **kw)`	Set values on matching records only
`d.partition('key')`	Group by key → dict of Dlists
`d.set_id('key')`	Promote existing field to id
`d.set_id('id', generate='R')`	Generate sequential ids (R01, R02, …)
`d.sort('key')`	Sort in place (returns self)
`d.sort(['k1', 'k2'])`	Multi-key sort (primary, secondary, …)
Arithmetic
`d1 + d2`	Merge two Dlists
`d1 - d2`	Subtract records
`d1 == d2`	Equality check
Output
`d.write()`	JSON string
`d.write('file.json')`	Save to JSON file
`d.write('file.xlsx', keys=[...])`	Save to Excel (auto-detected)
`d.write(format='ascii', keys=[...])`	ASCII table string
`d.write(format='md', keys=[...])`	Markdown table string
`d.write(format='latex', keys=[...])`	LaTeX table string
`d.write(..., ctree=['key'])`	Categorized output

Project structure

dlist/
├── src/dlist/
│   ├── __init__.py      # Package entry point
│   ├── dlist.py         # Dlist class (core operations)
│   ├── helpers.py       # flattenDict, structDict, merge_dicts, getDitem
│   ├── formatters/      # ASCII, Markdown, LaTeX, Excel, JSON output
│   └── parsers/         # JSON, Excel input, pivot joins
├── tests/
│   ├── test_core.py         # core operations + list fields
│   ├── test_formatters.py   # ASCII, Markdown, LaTeX, Excel
│   └── test_excel_parser.py # Excel parser
├── docs/
│   ├── formatters.md    # formatter reference
│   └── parsers.md       # parser reference (Excel details)
├── example_input.xlsx   # sample Excel file for parser demo
└── pyproject.toml

License

MIT

Project details

Package: pydlist 2.0.0 (released Feb 23, 2026)
Development status: 3 - Alpha
Intended audience: Developers
License: OSI Approved :: MIT License
