A library with core primitives for analysis shared across all `Aind.Behavior` tasks

Project description

aind-behavior-core-analysis

A repository with core primitives for analysis shared across all `Aind.Behavior` tasks.

This repository is part of a bigger infrastructure that is summarized here.

⚠️ Caution:
This repository is currently under active development and is subject to frequent changes. Features and APIs may evolve without prior notice.

Installing and Upgrading

To install from a local clone, run the following command from the root directory of the repository:

pip install .

Otherwise, install the latest release from PyPI (add `--upgrade` to update an existing installation):

pip install aind-behavior-core-analysis
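
To confirm that the installation succeeded, one option is to query the installed distribution metadata. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and is not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution name as used in the pip command above.
name = "aind-behavior-core-analysis"
found = installed_version(name)
print(f"{name}: {found if found is not None else 'not installed'}")
```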

Getting started and API usage

The library provides two main functionalities: data contracts for standardized data loading and quality control tools for data validation.

Creating and Using Data Contracts

Data contracts provide a standard way to access and load data from various sources. Here's a simple example:

from pathlib import Path
from aind_behavior_core_analysis.contract import Dataset, DataStreamCollection
from aind_behavior_core_analysis.contract.csv import Csv
from aind_behavior_core_analysis.contract.text import Text

# Define the dataset structure
dataset_root = Path("path/to/dataset")
my_dataset = Dataset(
    name="my_dataset",
    version="1.0.0",
    description="Example dataset",
    data_streams=[
        DataStreamCollection(
            name="Behavior",
            description="Behavior data",
            data_streams=[
                Csv(
                    name="Position",
                    description="Animal position data",
                    reader_params=Csv.make_params(
                        path=dataset_root / "behavior/position.csv",
                    ),
                ),
                Text(
                    name="Log",
                    description="Session log file",
                    reader_params=Text.make_params(
                        path=dataset_root / "behavior/session.log",
                    ),
                ),
            ],
        ),
    ],
)

# Load a specific stream
position_data = my_dataset["Behavior"]["Position"].load().data
print(f"Position data shape: {position_data.shape}")

# Load all streams and handle errors
my_dataset.load_all()
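
To try the example without real recordings, a throwaway dataset with the same relative layout (`behavior/position.csv` and `behavior/session.log`) can be generated first. This is a minimal sketch using only the standard library: the helper `make_sample_dataset`, the column names (borrowed from the QC example in this README), and every value written are placeholders invented for illustration, not real data or part of the package.

```python
import csv
import tempfile
from pathlib import Path


def make_sample_dataset(root: Path) -> Path:
    """Create behavior/position.csv and behavior/session.log under root."""
    behavior_dir = root / "behavior"
    behavior_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder rows; the column names are an assumption.
    rows = [
        {"timestamp": 0.0, "x": 0.0, "y": 0.0, "speed": 0.0},
        {"timestamp": 0.1, "x": 1.0, "y": 0.5, "speed": 1.0},
        {"timestamp": 0.2, "x": 2.0, "y": 1.0, "speed": 1.0},
    ]
    with open(behavior_dir / "position.csv", "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["timestamp", "x", "y", "speed"])
        writer.writeheader()
        writer.writerows(rows)

    # Placeholder log content.
    (behavior_dir / "session.log").write_text("session started\nsession ended\n")
    return root


# Write the sample files into a fresh temporary directory.
dataset_root = make_sample_dataset(Path(tempfile.mkdtemp()))
print(sorted(path.name for path in (dataset_root / "behavior").iterdir()))
```

Pointing `dataset_root` in the example above at the returned directory should give the `Csv` and `Text` streams files to read.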

Quality Control of Primary Data

The QC module helps validate your data to ensure it meets specific requirements:

import aind_behavior_core_analysis.qc as qc

# Using the dataset created above
data_stream = my_dataset["Behavior"]["Position"]

# Create and run test suites
runner = qc.Runner()

# Add test suites for different data types
runner.add_suite(qc.csv.CsvTestSuite(data_stream))

# Or create your own custom test suite
class MyCustomTestSuite(qc.Suite):
    def __init__(self, data_stream):
        self.data_stream = data_stream

    def test_has_expected_columns(self):
        """Check if data has required columns."""
        expected_cols = {"timestamp", "x", "y", "speed"}
        # Compute the missing columns once and branch on the result
        missing = expected_cols - set(self.data_stream.data.columns)
        if missing:
            return self.fail_test(None, f"Missing columns: {missing}")
        return self.pass_test(None, "All required columns present")

runner.add_suite(MyCustomTestSuite(data_stream))

# Run all tests and display results
results = runner.run_all_with_progress()
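
The column check inside the custom suite above is a plain set difference, so it can be exercised on its own before wrapping it in a suite. The sketch below is independent of the library; `missing_columns` and `EXPECTED_COLUMNS` are hypothetical names used only for illustration.

```python
from typing import Iterable, Set

# Same column names as in the custom test suite example.
EXPECTED_COLUMNS = frozenset({"timestamp", "x", "y", "speed"})


def missing_columns(
    columns: Iterable[str], expected: Iterable[str] = EXPECTED_COLUMNS
) -> Set[str]:
    """Return the expected column names that are absent from columns."""
    return set(expected) - set(columns)


print(missing_columns(["timestamp", "x", "y"]))  # {'speed'}
print(missing_columns(["timestamp", "x", "y", "speed", "heading"]))  # set()
```

An empty result means every required column is present; extra columns are ignored, which matches the `issubset` check in the suite.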

For more detailed examples, please check the Examples folder.


Contributors

Contributions to this repository are welcome! However, please ensure that your code adheres to the recommended DevOps practices below:

Linting

We use ruff as our primary linting tool.

Testing

Please add tests whenever you add a new feature. To run the existing tests, run `uv run pytest` from the root of the repository.

Lock files

We use uv to manage our lock files and therefore encourage everyone to use uv as a package manager as well.
