Functions to help with postprocessing dynamic foraging data.

aind-dynamic-foraging-data-utils

Scope

  • Purpose: Ingests NWB files and outputs dataframes with the relevant information. Focused on dynamic foraging. Other tasks can branch and build task-specific utils.
  • Inputs are NWBs, outputs are dataframes (tidy and not).
  • Dependencies: xarray (includes numpy and pandas), scikit-learn (includes scipy), matplotlib.

Installation

To use the software, in the root directory, run

pip install -e .

To develop the code, run

pip install -e .[dev]

Usage

Accessing data from an NWB file

To load an NWB file

import aind_dynamic_foraging_data_utils.nwb_utils as nwb_utils
nwb = nwb_utils.load_nwb_from_filename(<filepath>)

To extract a pandas dataframe of trials

df_trials = nwb_utils.create_df_trials(nwb)

To extract a pandas dataframe of events

df_events = nwb_utils.create_df_events(nwb)

To extract a pandas dataframe of photometry data

df_fip = nwb_utils.create_df_fip(nwb)

By default, all of these functions adjust timestamps so that t(0) is the time of the first go cue. To disable this adjustment, pass adjust_time=False.
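To make the default concrete, here is a minimal sketch of what re-referencing timestamps to the first go cue amounts to. It is not the package's implementation: `shift_to_first_go_cue` and the sample values are hypothetical, and only the standard library is used.

```python
def shift_to_first_go_cue(timestamps, go_cue_times):
    """Return timestamps re-referenced so the first go cue is at t = 0.

    Hypothetical helper for illustration only; the package performs its
    own adjustment internally unless adjust_time=False is passed.
    """
    t0 = min(go_cue_times)  # time of the first go cue
    return [t - t0 for t in timestamps]


go_cues = [12.5, 20.0, 31.25]     # made-up go cue times in seconds
licks = [10.0, 12.5, 13.0, 21.5]  # made-up event times in seconds
shifted = shift_to_first_go_cue(licks, go_cues)
print(shifted)  # [-2.5, 0.0, 0.5, 9.0]
```

Events that occur before the first go cue end up with negative timestamps, which is worth keeping in mind when filtering by time.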

Time alignment tools

To align a data variable to a set of timepoints and create an event-triggered response, use the alignment module. For example, to align FIP data to each go cue:

import aind_dynamic_foraging_data_utils.alignment as alignment

etr = alignment.event_triggered_response(
    df_fip.query('event == "<FIP channel>"'),
    "timestamps",
    "data",
    df_trials['goCue_start_time_in_session'].values,
    t_start = 0,
    t_end = 1,
    output_sampling_rate=40
    )
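For readers unfamiliar with event-triggered responses, the idea can be sketched in plain Python: for every event, the signal is resampled onto a common grid of times relative to that event. The code below is an illustrative assumption, not the alignment module's implementation. The names `interp` and `event_triggered_sketch`, the choice of linear interpolation, and the inclusion of both window endpoints are all hypothetical.

```python
from bisect import bisect_left


def interp(t, times, values):
    """Linearly interpolate values at time t; times must be sorted."""
    i = bisect_left(times, t)
    if i < len(times) and times[i] == t:
        return values[i]
    if i == 0 or i == len(times):
        return float("nan")  # t lies outside the recorded range
    t_lo, t_hi = times[i - 1], times[i]
    frac = (t - t_lo) / (t_hi - t_lo)
    return values[i - 1] + frac * (values[i] - values[i - 1])


def event_triggered_sketch(times, values, event_times, t_start, t_end, rate):
    """Resample the signal onto a common time grid around each event."""
    n = int(round((t_end - t_start) * rate)) + 1
    grid = [t_start + k / rate for k in range(n)]  # time relative to event
    rows = []
    for event_number, t_event in enumerate(event_times):
        for rel_t in grid:
            value = interp(t_event + rel_t, times, values)
            rows.append((event_number, rel_t, value))
    return rows


# Made-up signal: a ramp sampled every 0.5 s, so value == time.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
values = list(times)
rows = event_triggered_sketch(times, values, [1.0, 2.0], t_start=0, t_end=1, rate=4)
print(rows[1])  # (0, 0.25, 1.25)
```

Each row is (event number, time relative to the event, interpolated value), a tidy layout that is convenient for averaging across events at each relative time.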

Code ocean utility code

To attach data, create a token on Code Ocean with all read/write permissions, and make sure to attach the token to your capsule.

Then, you should be able to access the token via os.getenv(token_name).
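A small guard around os.getenv can make a missing token fail early with a clear message. This is a sketch only: `read_token` is a hypothetical helper and "MY_CODE_OCEAN_TOKEN" is a placeholder name, set here purely so the example runs on its own.

```python
import os


def read_token(token_name):
    """Return the token stored in the environment, or fail with a clear message."""
    token = os.getenv(token_name)
    if not token:
        raise RuntimeError(
            f"Environment variable {token_name!r} is not set; "
            "attach the token to the capsule first."
        )
    return token


# "MY_CODE_OCEAN_TOKEN" is a placeholder name for this demo only.
os.environ["MY_CODE_OCEAN_TOKEN"] = "dummy-value"
print(read_token("MY_CODE_OCEAN_TOKEN") == "dummy-value")  # True
```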

Get list of assets

To get a list of code ocean assets for a subject

import aind_dynamic_foraging_data_utils.code_ocean_utils as cou
results = cou.get_subject_assets(my_id)

Users can give a list of required data modalities

import aind_dynamic_foraging_data_utils.code_ocean_utils as co
# FIP data
results = co.get_subject_assets(<subject_id>, modality=['fib'])

# FIP and behavior-videos
results = co.get_subject_assets(<subject_id>, modality=['fib','behavior-videos'])

# any modalities (default)
results = co.get_subject_assets(<subject_id>, modality=[])

Or supply an additional filter string

results = co.get_subject_assets(<subject_id>, extra_filter = <my docdb query string>)

Or filter by a task type:

results = co.get_subject_assets(<subject_id>, task=['Uncoupled Baiting', 'Coupled Baiting'])

Attach data

The 'code_ocean_asset_id' column gives the data asset IDs on Code Ocean. The 'id' column is the docDB ID.

To attach a long list of data, simply call

co.attach_data(results['code_ocean_asset_id'].values)
results = co.add_data_asset_path(results)

where results is the dataframe returned by get_subject_assets and 'code_ocean_asset_id' is the 16 digit data asset ID from Code Ocean.

Load data

To get the dataframes from the NWBs, call get_all_df_for_nwb:

import glob

filename_sessions = glob.glob("../data/**/nwb/behavior**")
SAVED_LOC = '../scratch/dfs'
interested_channels = ['G_1_dff-poly', 'R_1_dff-poly', 'R_2_dff-poly']

get_all_df_for_nwb(filename_sessions, loc = SAVED_LOC, interested_channels = interested_channels)

where filename_sessions is the list of folder locations for the NWBs, loc is the folder where the dataframes will be saved, and interested_channels lists the channels you want to save for df_fip.

All dataframes are saved per session, other than df_trials (this is because some df_trials have 2 y coordinates for the lick tube, some have 1).

To load the dataframes, use:

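import os
import pandas as pd

# Join with os.path.join so a path separator is placed between folder and file name
df_sess = pd.read_csv(os.path.join(SAVED_LOC, 'df_sess.csv'), index_col=False)
df_events = pd.read_csv(os.path.join(SAVED_LOC, 'df_events.csv'), index_col=False)
df_trials = pd.read_csv(os.path.join(SAVED_LOC, 'df_trials.csv'), index_col=0)
df_fip = pd.read_csv(os.path.join(SAVED_LOC, 'df_fip.csv'), index_col=False)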

To check which fitted models are already available for a session, use:

check_avail_model_by_nwb_name('746345_2024-11-22_09-55-54.nwb')

where you input the name of the session (formatted as <subject_ID>_<collection_date>_<collection_time>.nwb; sometimes a prefix of behavior_ is needed). Currently the models that are fitted on all sessions should include:

['QLearning_L2F1_softmax', 'QLearning_L1F1_CK1_softmax', 'WSLS', 'QLearning_L1F0_epsi', 'QLearning_L2F1_CK1_softmax']

You can find out more about these models by going here.
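The session-name format described above (<subject_ID>_<collection_date>_<collection_time>.nwb, with an optional behavior_ prefix) can be checked before querying for models. The sketch below is an assumption based only on the example name shown in this section; `SESSION_PATTERN` and `parse_session_name` are hypothetical and not part of the package.

```python
import re
from datetime import datetime

SESSION_PATTERN = re.compile(
    r"^(?:behavior_)?(?P<subject>[^_]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_(?P<time>\d{2}-\d{2}-\d{2})\.nwb$"
)


def parse_session_name(name):
    """Split a session file name into subject ID and collection datetime."""
    match = SESSION_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected session name: {name!r}")
    collected = datetime.strptime(
        f"{match['date']} {match['time']}", "%Y-%m-%d %H-%M-%S"
    )
    return match["subject"], collected


subject, collected = parse_session_name("746345_2024-11-22_09-55-54.nwb")
print(subject, collected.isoformat())  # 746345 2024-11-22T09:55:54
```

Names that carry the behavior_ prefix parse to the same subject ID and datetime, so the helper can be used to normalize a mixed list of file names.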

To enrich df_sess and df_trials with the model information, you can use

nwb_name_for_models = [filename.split('/')[-1].replace('behavior_', '') for filename in filename_sessions]
SAVED_LOC = '../scratch/dfs'
get_foraging_model_info(df_trials, df_sess, nwb_name_for_models, loc = SAVED_LOC)

Here df_trials and df_sess are the dataframes created by get_all_df_for_nwb, and nwb_name_for_models is a list of session names formatted the same way as the input to check_avail_model_by_nwb_name.

Contributing

Linters and testing

There are several libraries used to run linters, check documentation, and run tests.

  • Please test your changes using the coverage library, which will run the tests and log a coverage report:
coverage run -m unittest discover && coverage report
  • Use interrogate to check that modules, methods, etc. have been documented thoroughly:
interrogate .
  • Use flake8 to check that code is up to standards (no unused imports, etc.):
flake8 .
  • Use black to automatically format the code into PEP standards:
black .
  • Use isort to automatically sort import statements:
isort .

Pull requests

For internal members, please create a branch. For external members, please fork the repository and open a pull request from the fork. We'll primarily use Angular style for commit messages. Roughly, they should follow the pattern:

<type>(<scope>): <short summary>

where scope (optional) describes the packages affected by the code changes and type (mandatory) is one of:

  • build: Changes that affect build tools or external dependencies (example scopes: pyproject.toml, setup.py)
  • ci: Changes to our CI configuration files and scripts (examples: .github/workflows/ci.yml)
  • docs: Documentation only changes
  • feat: A new feature
  • fix: A bugfix
  • perf: A code change that improves performance
  • refactor: A code change that neither fixes a bug nor adds a feature
  • test: Adding missing tests or correcting existing tests
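The header pattern and type list above can be checked mechanically, for example in a local pre-commit script. The following is a rough sketch under that assumption; `HEADER_PATTERN` and `parse_commit_header` are hypothetical names, and the regular expression is a simplification rather than the full Angular convention.

```python
import re

COMMIT_TYPES = ("build", "ci", "docs", "feat", "fix", "perf", "refactor", "test")

# <type>(<scope>): <short summary>, with the scope being optional.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()]+)\))?: (?P<summary>\S.*)$"
)


def parse_commit_header(header):
    """Return (type, scope, summary), or None if the header does not match."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        return None
    return match["type"], match["scope"], match["summary"]


print(parse_commit_header("docs: update README"))
# ('docs', None, 'update README')
print(parse_commit_header("chore: tidy up"))
# None
```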

Semantic Release

The table below, from semantic release, shows which commit message gets you which release type when semantic-release runs (using the default configuration):

Commit message | Release type
fix(pencil): stop graphite breaking when too much pressure applied | Patch Fix Release, Default release
feat(pencil): add 'graphiteWidth' option | Minor Feature Release
perf(pencil): remove graphiteWidth option, with the footer "BREAKING CHANGE: The graphiteWidth option has been removed. The default graphite width of 10mm is always used for performance reasons." | Major Breaking Release

(Note that the BREAKING CHANGE: token must be in the footer of the commit)
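The mapping in the table above can be written as a small function. This is a sketch that encodes only the three rows shown here, not semantic-release's actual commit analyzer; `release_type` is a hypothetical name, and treating any line after the header that starts with the token as a footer is a simplification.

```python
def release_type(commit_message):
    """Map a commit message to the release type listed in the table above.

    Returns None for messages the table does not cover.
    """
    header, _, rest = commit_message.partition("\n")
    # The BREAKING CHANGE: token only counts when it starts a later line.
    if any(line.startswith("BREAKING CHANGE:") for line in rest.splitlines()):
        return "major"
    if header.startswith("feat"):
        return "minor"
    if header.startswith("fix"):
        return "patch"
    return None


breaking = (
    "perf(pencil): remove graphiteWidth option\n"
    "\n"
    "BREAKING CHANGE: The graphiteWidth option has been removed.\n"
    "The default graphite width of 10mm is always used for performance reasons."
)
print(release_type(breaking))  # major
print(release_type("feat(pencil): add 'graphiteWidth' option"))  # minor
```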

Documentation

To generate the rst source files for documentation, run

sphinx-apidoc -o doc_template/source/ src 

Then to create the documentation HTML files, run

sphinx-build -b html doc_template/source/ doc_template/build/html

More info on sphinx installation can be found here.
