Ingest gesturally-structured data into models with multiple export formats
ingesture
This package is not even close to usable, and is just a sketch at the moment. If for some reason you see it and would like to work on it with me, feel free to open an issue :)
Declare your data
Even the most disorganized data system has some structure. We want to be able to recover it without demanding that the entire acquisition process be reworked.
To do that, we can use a family of specifiers to tell ingesture where to get metadata:
from datetime import datetime
from ingesture import Schema, spec
from pydantic import Field

class MyData(Schema):
    # parse metadata in a filename
    subject_id: str = Field(...,
        description="The ID of a subject of course!",
        spec=spec.Path('electrophysiology_{subject_id}_*.csv')
    )
    # parse multiple values at once
    date: datetime
    experimenter: str
    date, experimenter = Field(...,
        spec=spec.Path('{date}_{experimenter}_optodata.h5')
    )
    # from inside a .mat file
    other_meta: int = Field(...,
        spec=spec.Mat(
            path='**/notebook.mat',   # 2 **s mean we can glob recursively
            field=('nb', 1, 'user')   # index recursively through the .mat
        )
    )
    # and so on
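To make the filename specifiers above more concrete, here is a minimal sketch of how a template such as 'electrophysiology_{subject_id}_*.csv' could be turned into a regular expression. This is not ingesture's implementation: the helper names `template_to_regex` and `parse_filename` are hypothetical, and the sketch assumes that `{name}` captures the shortest possible run of characters while `*` matches anything.

```python
import re
from typing import Dict, Optional

# Matches either a "{name}" placeholder or a "*" wildcard in a template.
_TOKEN = re.compile(r"\{(\w+)\}|\*")


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Compile a filename template into a regex with one named group per placeholder."""
    parts = []
    pos = 0
    for token in _TOKEN.finditer(template):
        # Literal text between tokens is escaped so "." stays a literal dot.
        parts.append(re.escape(template[pos:token.start()]))
        if token.group(1):
            # "{name}" becomes a non-greedy named capture group (an assumption).
            parts.append(f"(?P<{token.group(1)}>.+?)")
        else:
            # "*" matches any run of characters, including none.
            parts.append(".*")
        pos = token.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))


def parse_filename(template: str, filename: str) -> Optional[Dict[str, str]]:
    """Return the captured metadata, or None when the filename does not fit the template."""
    match = template_to_regex(template).fullmatch(filename)
    return match.groupdict() if match else None


subject = parse_filename(
    "electrophysiology_{subject_id}_*.csv",
    "electrophysiology_mouse01_session3.csv",
)
session = parse_filename(
    "{date}_{experimenter}_optodata.h5",
    "2022-03-01_jonny_optodata.h5",
)
```

The captured values are plain strings; converting `date` into a `datetime` would be left to the model's own field validation.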
Then, parse your schema from a folder:
data = MyData.make('/home/lab/my_data')
Or a bunch of them!
data = MyData.make('/home/lab/my_datas/*')
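How `make` treats a single folder versus a glob is not specified yet. One possible reading, sketched with the standard library only, is that the path is expanded into directories and the schema is built once per directory. The names `expand_targets`, `make_all`, and the `build` callback are hypothetical stand-ins, not part of ingesture.

```python
import glob
import os
import tempfile
from typing import Callable, List


def expand_targets(path: str) -> List[str]:
    """Expand a path or glob pattern into a sorted list of existing directories."""
    return sorted(p for p in glob.glob(path) if os.path.isdir(p))


def make_all(path: str, build: Callable[[str], dict]) -> List[dict]:
    """Call the hypothetical per-folder builder once for each matched directory."""
    return [build(folder) for folder in expand_targets(path)]


def demo() -> List[str]:
    """Build a throwaway tree and return the folder names that were picked up."""
    with tempfile.TemporaryDirectory() as root:
        # Two session folders and one stray file that should be ignored.
        for name in ("session_a", "session_b"):
            os.makedirs(os.path.join(root, name))
        open(os.path.join(root, "notes.txt"), "w").close()
        records = make_all(
            os.path.join(root, "*"),
            lambda folder: {"folder": os.path.basename(folder)},
        )
    return [record["folder"] for record in records]
```

A path without wildcards expands to itself when it exists, so the single-folder and many-folder calls can share one code path.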
Multiple Strategies
todo
Hierarchical Modeling
Our data is rarely a single type. Often there is a repeatable substructure that is paired with different macro-structures: e.g. you have open-ephys data within a directory that is paired with behavioral data in one experiment and with optical data in another.
Make submodels and recombine them freely...
todo
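Until this section is written, the idea of reusable submodels can be illustrated with a rough sketch. Plain dataclasses stand in for `Schema` here, and every class and field name is hypothetical. The same ephys submodel is combined with behavioral data in one experiment model and with optical data in another.

```python
from dataclasses import asdict, dataclass


@dataclass
class EphysData:
    """Repeatable substructure shared by several experiment types."""
    sampling_rate_hz: float
    n_channels: int


@dataclass
class BehaviorData:
    task: str


@dataclass
class OpticalData:
    wavelength_nm: float


@dataclass
class BehaviorExperiment:
    """Macro-structure 1: ephys paired with behavioral data."""
    ephys: EphysData
    behavior: BehaviorData


@dataclass
class OpticalExperiment:
    """Macro-structure 2: the same ephys submodel paired with optical data."""
    ephys: EphysData
    optical: OpticalData


ephys = EphysData(sampling_rate_hz=30000.0, n_channels=64)
behavior_exp = BehaviorExperiment(ephys=ephys, behavior=BehaviorData(task="nosepoke"))
optical_exp = OpticalExperiment(ephys=ephys, optical=OpticalData(wavelength_nm=470.0))

# asdict recurses into nested dataclasses, giving one nested dict per experiment.
behavior_dict = asdict(behavior_exp)
```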
Export Data
Once we have data in an abstract model, we want to be able to export it to multiple formats! To do that we need two things: an interface that describes the basic methods of interacting with each format (e.g. .csv files are written differently than hdf5 files), and a mapping from our model fields to locations, attributes, and names in the target format.
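The two ingredients described above, a per-format interface and a field mapping, might look something like the following sketch. It is an illustration under assumptions rather than the package's design: `Exporter`, `JSONExporter`, `CSVExporter`, and the `field_map` argument are all hypothetical names, and only JSON and CSV are shown because they need nothing outside the standard library.

```python
import csv
import io
import json
from abc import ABC, abstractmethod
from typing import Any, Dict


class Exporter(ABC):
    """Hypothetical interface: each format decides how a record is written."""

    def __init__(self, field_map: Dict[str, str]):
        # Maps model field names to names in the target format.
        self.field_map = field_map

    def rename(self, record: Dict[str, Any]) -> Dict[str, Any]:
        """Apply the mapping; unmapped fields keep their own names."""
        return {self.field_map.get(key, key): value for key, value in record.items()}

    @abstractmethod
    def dumps(self, record: Dict[str, Any]) -> str:
        """Serialize one record to text in the target format."""


class JSONExporter(Exporter):
    def dumps(self, record: Dict[str, Any]) -> str:
        return json.dumps(self.rename(record), sort_keys=True)


class CSVExporter(Exporter):
    def dumps(self, record: Dict[str, Any]) -> str:
        renamed = self.rename(record)
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(renamed), lineterminator="\n")
        writer.writeheader()
        writer.writerow(renamed)
        return buffer.getvalue()


record = {"subject_id": "mouse01", "experimenter": "jonny"}
field_map = {"subject_id": "Subject"}
as_json = JSONExporter(field_map).dumps(record)
as_csv = CSVExporter(field_map).dumps(record)
```

Keeping the mapping separate from the writer means the same model can be exported to a new format by adding one subclass and one mapping.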
Pydantic base export
json
From the Field specification
class MyData(Schema):
    subject_id: str = Field(
        spec = ...,
        nwb_field = "NWBFile:subject_id"
    )
From a Mapping object
class NWB_Map(Mapping):
    subject_id = 'NWBFile:subject_id'

class MyData(Schema):
    subject_id: str = Field(...)
    __mapping__ = NWB_Map
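The mapping strings above are not documented further. One plausible reading, and it is only an assumption, is that 'NWBFile:subject_id' means "attribute `subject_id` on the `NWBFile` container". Under that reading, applying a mapping to a parsed record could be sketched as below. The function `apply_mapping` is hypothetical and does not touch any real NWB library.

```python
from typing import Any, Dict


def apply_mapping(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Dict[str, Any]]:
    """Group record values by target container, following 'Container:attribute' strings."""
    exported: Dict[str, Dict[str, Any]] = {}
    for field, target in mapping.items():
        if field not in record:
            # Fields absent from the record are skipped rather than exported as None.
            continue
        container, separator, attribute = target.partition(":")
        if not separator or not attribute:
            raise ValueError(f"expected 'Container:attribute', got {target!r}")
        exported.setdefault(container, {})[attribute] = record[field]
    return exported


nwb_map = {"subject_id": "NWBFile:subject_id"}
exported = apply_mapping({"subject_id": "mouse01", "unmapped": 1}, nwb_map)
```

Fields without an entry in the mapping are left out of the export in this sketch; whether they should be dropped or passed through is a design choice the package has yet to make.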