The OmicIDX project collects, reprocesses, and republishes metadata from multiple public genomics repositories, including the NCBI SRA, BioSample, and GEO databases. The metadata is published through the cloud data warehouse platform BigQuery, a set of performant search and retrieval APIs, and a set of JSON-format files for easy incorporation into other projects.

New process

Steps

Download XML
Create basic JSON
Upload JSON to S3
Munge basic JSON to Parquet
Munge Parquet to:
- experiment joined
- sample joined
- run joined
- study with aggregates
- Include aggregates in Spark jobs:
  - number of samples, experiments, and runs
  - sample, experiment, and run accessions (as arrays)
Save munged Spark data (JSON, Parquet)
Create Elasticsearch index mappings
Drop existing Elasticsearch mappings
Load Elasticsearch index mappings
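
The "study with aggregates" step above can be illustrated with a minimal sketch in plain Python rather than Spark. The function name and the record keys ('study', 'sample', 'experiment', 'run') are assumptions made for illustration and may not match the actual omicidx schema or Spark job.

```python
from collections import defaultdict


def aggregate_by_study(records):
    """Roll run-level records up to one summary per study.

    Each record is assumed (hypothetically) to be a dict with the keys
    'study', 'sample', 'experiment', and 'run', each an accession string.
    """
    # Collect the distinct accessions seen at each level, per study.
    grouped = defaultdict(
        lambda: {"sample": set(), "experiment": set(), "run": set()}
    )
    for rec in records:
        for level in ("sample", "experiment", "run"):
            grouped[rec["study"]][level].add(rec[level])

    # Emit counts and sorted accession arrays for every study.
    summaries = []
    for study, levels in sorted(grouped.items()):
        summary = {"study": study}
        for level, accessions in levels.items():
            summary[f"{level}_count"] = len(accessions)
            summary[f"{level}_accessions"] = sorted(accessions)
        summaries.append(summary)
    return summaries
```

In a Spark job the same idea would typically be a group-by on the study accession with distinct counts and collected sets; the sketch only shows the shape of the result.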

lambda

zip lambdas.zip lambda_handlers.py sra_parsers.py

aws lambda create-function --function-name sra_to_json --zip-file fileb://lambdas.zip --handler lambda_handlers.lambda_return_full_experiment_json --runtime python3.6 --role arn:aws:iam::377200973048:role/lambda_s3_exec_role

Invoke

aws lambda invoke --function-name sra_to_json --log-type Tail --payload '{"accession":"SRX000273"}' /tmp/abc.txt
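
The same invocation could be made from Python. The sketch below assumes a boto3-style Lambda client is passed in; the helper name `invoke_sra_to_json` is hypothetical, and it assumes the function's response payload is JSON.

```python
import json


def invoke_sra_to_json(client, accession, function_name="sra_to_json"):
    """Invoke the Lambda function synchronously and decode its JSON payload.

    `client` is assumed to behave like boto3.client("lambda").
    """
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        LogType="Tail",  # ask for the tail of the execution log
        Payload=json.dumps({"accession": accession}).encode("utf-8"),
    )
    # The response payload is a file-like stream of bytes.
    return json.loads(response["Payload"].read())
```

With boto3 installed and AWS credentials configured, `invoke_sra_to_json(boto3.client("lambda"), "SRX000273")` would be the equivalent of the CLI call above.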

Concurrency

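An AWS account has a default limit of 1000 concurrent Lambda executions per region. Reserve concurrency for a specific function to cap how many instances of it can run at once.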

aws lambda put-function-concurrency --function-name sra_to_json --reserved-concurrent-executions 20

timeout and memory

aws lambda update-function-configuration --function-name sra_to_json --timeout 15

logging

https://github.com/jorgebastida/awslogs

awslogs get /aws/lambda/sra_to_json ALL --watch

dynamodb

aws dynamodb scan --table-name sra_experiment --select "COUNT"
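
A count scan returns results page by page, so a Python equivalent has to follow `LastEvaluatedKey` and sum the per-page counts. This is a minimal sketch assuming a boto3-style DynamoDB client is passed in; the helper name `count_table_items` is hypothetical.

```python
def count_table_items(client, table_name="sra_experiment"):
    """Count items in a DynamoDB table by paging through a COUNT scan.

    `client` is assumed to behave like boto3.client("dynamodb").
    """
    total = 0
    kwargs = {"TableName": table_name, "Select": "COUNT"}
    while True:
        page = client.scan(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No further pages: the running total is the final count.
            return total
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```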

GEO

python -m omicidx.geometa --gse=GSE10

Prints JSON to stdout, one line per entity.
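
Output in this one-record-per-line form can be consumed with a small reader. The sketch below makes no assumption about the fields inside each record; the function name `read_json_lines` is hypothetical.

```python
import json


def read_json_lines(lines):
    """Yield one parsed JSON object per non-empty input line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)
```

Passing `sys.stdin` as `lines` would let a script read the piped output of the command above record by record.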

omicidx 0.3.8
