martens

Succinct small scale data manipulation

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language

Project description

Martens

Succinct small scale data manipulation

Free software: MIT license
Documentation: https://martens.readthedocs.io.

Usage

To use Martens in a project:

import martens

The package is freely available on PyPI under the MIT license.

About

Martens is a Python package for data manipulation. It is designed for data that is too small to be worth, for example, uploading into a cloud data warehouse for ease of processing, but which is still useful to you: the kind of data that was probably passed to you in a spreadsheet or CSV file and needs to be transformed quickly into what you want.

The primary aim of Martens is to enable data manipulation code that is:

Flexible
Succinct
Easily readable and maintainable
Lightweight

And finally, reasonably performant. That is to say, the intent and philosophy is not to rely on libraries like NumPy, which may boost performance compared to base Python. Rather, martens fits neatly around concepts from core Python. This comes with benefits to flexibility and a minimal build profile.

The design is heavily inspired by dplyr from the R universe.

Example code

Importing data is simple:

source_data = martens.SourceFile(file_path=file_path).dataset

Generally speaking, martens will infer the file type from the file extension provided, but you can specify a file type to override it. Access the underlying dataframe with the dataset property.
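
As a rough illustration of "infer from the extension unless told otherwise", here is a plain-Python sketch. It is not martens code: the function name, the file_type parameter name and the extension mapping are all hypothetical, and this README does not list which file types martens actually recognises.

```python
from pathlib import Path

# Hypothetical mapping, for illustration only; the file types martens
# actually recognises are not listed in this README.
EXTENSION_TO_TYPE = {".csv": "csv", ".json": "json"}


def resolve_file_type(file_path, file_type=None):
    """Return the explicit file_type if given, else infer it from the extension."""
    if file_type is not None:
        # An explicit choice always wins over the extension.
        return file_type
    suffix = Path(file_path).suffix.lower()
    if suffix not in EXTENSION_TO_TYPE:
        raise ValueError(f"Cannot infer a file type from {file_path!r}")
    return EXTENSION_TO_TYPE[suffix]


inferred = resolve_file_type("people.CSV")
overridden = resolve_file_type("export.txt", file_type="csv")
```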

Dataframes

A martens dataframe is really just a dict with string keys and equal-length lists as values. Any column of the dataframe can be accessed as follows, and the result is always a list:

source_data['age']
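
To make the "dict of equal-length lists" idea concrete, here is a sketch using only built-in Python types. It does not use martens at all, and the column names and values are invented for the example.

```python
# Plain-Python illustration of the "dict of equal-length lists" idea.
# This is not martens code; the column names and values are made up.
people = {
    "name": ["Ann", "Bob", "Cy"],
    "age": [34, 27, 45],
    "gender": ["Female", "Male", "Male"],
}

# Every column must have the same number of entries.
lengths = {len(column) for column in people.values()}
assert len(lengths) == 1

# Looking up a column by name gives back an ordinary list.
ages = people["age"]


def row(data, i):
    """A "row" is just the i-th entry of every column."""
    return {name: column[i] for name, column in data.items()}


second_row = row(people, 1)
```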

There’s no such thing as a dataframe index in martens; indexes are not at all useful for the type of data that martens is designed to handle. But we can quickly add a standard, integer-based column to use as an id in later steps:

source_data.with_id('person_id')
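
Conceptually, adding an id column to a dict of lists could look like the sketch below. This is a stand-in written in plain Python, not the martens implementation; in particular, starting the ids at 0 and returning a new dict rather than modifying the input are assumptions.

```python
def with_id(data, name):
    """Return a copy of `data` with an extra integer column called `name`.

    Assumption: ids start at 0 and follow the existing row order.
    """
    n_rows = len(next(iter(data.values()), []))
    return {**data, name: list(range(n_rows))}


people = {"name": ["Ann", "Bob", "Cy"], "age": [34, 27, 45]}
people_with_id = with_id(people, "person_id")
```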

Filtering with functions

Filtering is best done with functions or lambdas (but doesn’t have to be):

source_data.filter(lambda gender: gender == 'Male')

The key innovation of martens is that the argument names of a function are matched against the column names of the dataframe. The function then operates, row by row, on the data in the columns whose names correspond to its own arguments. That is, the argument names of your functions are important and determine how each function interacts with the dataframe. This allows for succinct, readable and flexible code. For example, you can use any function to filter, so long as its argument names correspond to existing columns in the dataframe and it returns something that can ultimately be resolved to either true or false.
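
The name-matching idea can be sketched in a few lines of standard-library Python. The filter_rows helper below is hypothetical and is not how martens is necessarily implemented; it only shows how inspect.signature can be used to look up a function's argument names and feed it the matching columns.

```python
import inspect


def filter_rows(data, func):
    """Keep the rows of a dict-of-lists for which func(...) is truthy.

    The argument names of `func` decide which columns it receives.
    """
    arg_names = list(inspect.signature(func).parameters)
    n_rows = len(next(iter(data.values()), []))
    keep = [
        i for i in range(n_rows)
        if func(*(data[name][i] for name in arg_names))
    ]
    return {name: [column[i] for i in keep] for name, column in data.items()}


people = {
    "name": ["Ann", "Bob", "Cy"],
    "age": [34, 27, 45],
    "gender": ["Female", "Male", "Male"],
}

# `gender` is matched to the "gender" column purely by its name.
males = filter_rows(people, lambda gender: gender == "Male")

# A two-argument function receives two columns, again matched by name.
older_males = filter_rows(people, lambda age, gender: gender == "Male" and age > 30)
```

In this sketch, a function whose argument name does not match any column fails with a KeyError, which mirrors the requirement that argument names correspond to existing columns.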

Mutate and apply

Similarly, we can quickly create new columns on the fly using data from existing columns:

source_data.mutate(lambda age: 365*age,'age_in_days')

Again, there is significant flexibility here. Any arbitrary function with any arbitrary return value will do, as long as all of its arguments can be resolved using existing columns of the dataframe.

If you just want the output without adding to the dataframe, use apply:

source_data.apply(lambda age: 7*age)
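
The difference between the two can be sketched in plain Python. The helpers below are illustrative stand-ins, not the martens methods: apply returns just the list of results, while mutate returns the data with that list attached as a new column.

```python
import inspect


def apply(data, func):
    """Call func once per row, matching its argument names to columns."""
    arg_names = list(inspect.signature(func).parameters)
    columns = [data[name] for name in arg_names]
    return [func(*values) for values in zip(*columns)]


def mutate(data, func, name):
    """Return a copy of data with the results of func stored under `name`."""
    return {**data, name: apply(data, func)}


people = {"name": ["Ann", "Bob"], "age": [34, 27]}

dog_years = apply(people, lambda age: 7 * age)
with_days = mutate(people, lambda age: 365 * age, "age_in_days")
```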

Stack, stretch and squish

Sometimes, we don’t want to simply create a new column with the required features. If the output of your function resolves to a list, you can choose to stack the output vertically. This will produce a new dataframe with additional rows and the existing columns expanded (repeated):

source_data.mutate_stack(lambda age: list(range(age)),)
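
What stacking does to the shape of the data is easiest to see on a tiny example. The sketch below is a plain-Python approximation, not the martens implementation; the helper, the new column name "year", and the choice to drop rows whose function result is an empty list are all assumptions.

```python
import inspect


def mutate_stack(data, func, name):
    """Emit one output row per item returned by func for each input row.

    Assumptions: `name` is not already a column, and input rows that
    produce an empty list contribute no output rows.
    """
    arg_names = list(inspect.signature(func).parameters)
    n_rows = len(next(iter(data.values()), []))
    out = {col: [] for col in data}
    out[name] = []
    for i in range(n_rows):
        for item in func(*(data[arg][i] for arg in arg_names)):
            # Repeat the existing values alongside each new item.
            for col in data:
                out[col].append(data[col][i])
            out[name].append(item)
    return out


people = {"name": ["Ann", "Bob"], "age": [2, 3]}
stacked = mutate_stack(people, lambda age: list(range(age)), "year")
```

Here the two input rows become five output rows: two for Ann and three for Bob, with the name and age values repeated on each.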

We might instead want to create multiple new columns simultaneously:

source_data.mutate_stretch(some_function_returning_tuple_of_2,names=['A','B'])
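
Stretching goes sideways rather than down: each element of the returned tuple lands in its own new column. The following is a plain-Python sketch of that idea, not the martens implementation; the helper and the column names "decades" and "years" are invented for the example.

```python
import inspect


def mutate_stretch(data, func, names):
    """Spread each tuple returned by func across the new columns in `names`."""
    arg_names = list(inspect.signature(func).parameters)
    columns = [data[arg] for arg in arg_names]
    new_columns = {name: [] for name in names}
    for values in zip(*columns):
        result = func(*values)
        if len(result) != len(names):
            raise ValueError("function must return one value per new column")
        for name, item in zip(names, result):
            new_columns[name].append(item)
    return {**data, **new_columns}


people = {"name": ["Ann", "Bob"], "age": [34, 27]}

# divmod returns a 2-tuple, so it fills two new columns at once.
stretched = mutate_stretch(people, lambda age: divmod(age, 10), ["decades", "years"])
```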

More complex code

If you are using martens the way it was intended, your code will tend to have large blocks of three or more lines, with each new operation being a method call on the dataframe from the previous line. That is, chaining commands is common:

import re
from math import prod

import martens as mt


def solve(data_input):
    # One row per line of the input text
    data = mt.Dataset({'line': data_input.split('\n')})
    # Every whole number on a line, with its start and end positions
    num_match = lambda line: [match for match in re.finditer(r'\b\d+\b', line)]
    num_matches = data.with_id('num_line_no') \
        .mutate_stack(num_match, 'match').with_id('num_id') \
        .mutate(lambda match: int(match.group()), name='num_match') \
        .mutate(lambda match: match.start(), name='num_start') \
        .mutate(lambda match: match.end(), name='num_end')
    # Position of every character that is neither a digit nor a dot
    chr_match = lambda line: [m.start() for m in re.finditer(r'[^.0-9]', line)]
    chr_matches = data.with_id('chr_line_no') \
        .mutate_stack(chr_match, 'chr_match') \
        .with_id('chr_id').select(['chr_line_no', 'chr_match', 'chr_id'])
    # Keep number/character pairs that sit next to each other
    all_matches = num_matches.merge(chr_matches) \
        .filter(lambda chr_line_no, num_line_no: abs(chr_line_no - num_line_no) <= 1) \
        .filter(lambda chr_match, num_start, num_end: num_start - 1 <= chr_match <= num_end)
    # Characters adjacent to two or more numbers, and the product of those numbers
    gear_match = all_matches.group_by(['chr_id'], other_cols=['num_id', 'num_match']) \
        .mutate(lambda num_id: len(num_id), 'num_count') \
        .filter(lambda num_count: num_count >= 2) \
        .mutate(lambda num_match: prod(num_match), 'gear_ratio')
    return {
        'part one': sum(all_matches.unique_by(['num_id', 'num_match'])['num_match']),
        'part two': sum(gear_match['gear_ratio'])
    }
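
The grouping step in the example relies on the grouped columns becoming lists: after group_by, num_id is passed to len and num_match to prod. A plain-Python sketch of that behaviour is shown below. It is a hypothetical helper inferred from how the example uses the result, not the martens implementation, and the sample values are made up.

```python
from math import prod


def group_by(data, keys, other_cols):
    """Collapse rows sharing the same key values, collecting other_cols into lists."""
    n_rows = len(next(iter(data.values()), []))
    groups = {}  # key tuple -> {column name: list of values in that group}
    for i in range(n_rows):
        key = tuple(data[k][i] for k in keys)
        bucket = groups.setdefault(key, {col: [] for col in other_cols})
        for col in other_cols:
            bucket[col].append(data[col][i])
    out = {col: [] for col in list(keys) + list(other_cols)}
    for key, bucket in groups.items():
        for k, value in zip(keys, key):
            out[k].append(value)
        for col in other_cols:
            out[col].append(bucket[col])
    return out


matches = {
    "chr_id": [0, 0, 1],
    "num_id": [5, 6, 6],
    "num_match": [467, 35, 35],
}
grouped = group_by(matches, ["chr_id"], other_cols=["num_id", "num_match"])

# Each grouped cell is now a list, so list functions apply directly.
gear_ratios = [prod(nums) for nums in grouped["num_match"] if len(nums) >= 2]
```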

Extensibility

A martens dataframe can often be used in place of a pandas dataframe or similar in another package. For example, in plotly:

import plotly.express as px
px.bar(dataframe, x='column1', y='column2')

What’s next

This is just the beginning of this project, and I hope it is useful to someone, somewhere. There are many, many feature and speed improvements that I would like to implement. Of course, feedback is welcome: raise an issue or otherwise get in touch and I’ll do my best to respond.

History

0.2.1 (2024-01-11)

First release featured on PyPI.

Release history

This version

0.4.26

Dec 1, 2025

0.4.25

Aug 22, 2025

0.4.24

Aug 22, 2025

0.4.22

Aug 13, 2025

0.4.21

Aug 13, 2025

0.4.20

Aug 8, 2025

0.4.19

Aug 8, 2025

0.4.18

Jul 29, 2025

0.4.17

Jul 29, 2025

0.4.15

Jul 29, 2025

0.4.14

Jul 29, 2025

0.4.13

Jun 23, 2025

0.4.12

Jun 19, 2025

0.4.11

Jun 19, 2025

0.4.10

Jun 18, 2025

0.4.9

Jun 18, 2025

0.4.8

Jun 13, 2025

0.4.7

Jun 11, 2025

0.4.6

May 30, 2025

0.4.5

May 29, 2025

0.4.4

May 27, 2025

0.4.3

May 27, 2025

0.4.1

May 27, 2025

0.4.0

May 19, 2025

0.3.28

May 19, 2025

0.3.27

May 2, 2025

0.3.26

May 2, 2025

0.3.25

May 2, 2025

0.3.24

May 1, 2025

0.3.23

May 1, 2025

0.3.22

Apr 29, 2025

0.3.21

Apr 22, 2025

0.3.20

Apr 14, 2025

0.3.19

Apr 11, 2025

0.3.18

Apr 11, 2025

0.3.17

Apr 11, 2025

0.3.16

Apr 11, 2025

0.3.15

Apr 7, 2025

0.3.14

Apr 7, 2025

0.3.9

Apr 7, 2025

0.3.8

Dec 18, 2024

0.3.7

Dec 5, 2024

0.3.6

Dec 3, 2024

0.3.5

Dec 3, 2024

0.3.4

Nov 22, 2024

0.3.1

Aug 2, 2024

0.2.1

Jan 11, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

martens-0.4.26.tar.gz (30.7 kB view details)

Uploaded Dec 1, 2025 Source

Built Distribution

martens-0.4.26-py2.py3-none-any.whl (14.2 kB view details)

Uploaded Dec 1, 2025 Python 2, Python 3

File details

Details for the file martens-0.4.26.tar.gz.

File metadata

Download URL: martens-0.4.26.tar.gz
Upload date: Dec 1, 2025
Size: 30.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for martens-0.4.26.tar.gz
Algorithm	Hash digest
SHA256	`f65fe051d1684cd5d59ee5ae5b1d9c9ea63a7946ba403214588c44e6b9c654a4`
MD5	`bcbc81c17112f261853efb996bcb3be3`
BLAKE2b-256	`7d9da2b441fe4c85f6f5827a63fc9505637cad0c1970d818c40d281a3c950499`

See more details on using hashes here.

File details

Details for the file martens-0.4.26-py2.py3-none-any.whl.

File metadata

Download URL: martens-0.4.26-py2.py3-none-any.whl
Upload date: Dec 1, 2025
Size: 14.2 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for martens-0.4.26-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`d9d8b20692ccfb466488ba805de2435e55b79e90b8b689865ff9374d6e974dca`
MD5	`cdded3dfeeda8f298208e7054fc2199e`
BLAKE2b-256	`c839c0c8fa05f80484b681915df09a0dd06721a84694f4a87595c7e719721672`

See more details on using hashes here.
