A Django library for running Nextflow pipelines and storing their results.
django-nextflow
django-nextflow is a Django app for running Nextflow pipelines and storing their results in a database within a Django web app.
Installation
django-nextflow is available through PyPI:
pip install django-nextflow
You must install the Nextflow executable itself separately: see the Nextflow Documentation for help with this.
Setup
To use the app within Django, add django_nextflow to your list of INSTALLED_APPS.
You must define four values in your settings.py:
- NEXTFLOW_PIPELINE_ROOT - the location on disk where the Nextflow pipelines are stored. All references to pipeline files will use this as the root.
- NEXTFLOW_DATA_ROOT - the location on disk to store execution records.
- NEXTFLOW_UPLOADS_ROOT - the location on disk to store uploaded data.
- NEXTFLOW_PUBLISH_DIR - the name of the folder published files will be saved to. Within an execution directory, django-nextflow will look in NEXTFLOW_PUBLISH_DIR/process_name for output files for that process.
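As an illustration, the four settings might look like the sketch below. The base directory and folder names are hypothetical placeholders, and plain strings are used on the assumption that the app accepts ordinary filesystem paths; adjust both to your project.

```python
from pathlib import Path

# Hypothetical project location; in a real settings.py this is usually
# derived from the location of the settings file itself.
BASE_DIR = Path("/srv/myproject")

# Root folder that pipeline paths such as "workflows/main.nf" are relative to.
NEXTFLOW_PIPELINE_ROOT = str(BASE_DIR / "pipelines")

# Where execution records are stored.
NEXTFLOW_DATA_ROOT = str(BASE_DIR / "data")

# Where uploaded data is stored.
NEXTFLOW_UPLOADS_ROOT = str(BASE_DIR / "uploads")

# A folder name (not a full path): looked up inside each execution directory.
NEXTFLOW_PUBLISH_DIR = "results"
```

Note that the first three values are locations on disk, while NEXTFLOW_PUBLISH_DIR is only a folder name.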
Usage
Begin by defining one or more Pipelines. These are .nf files somewhere within the NEXTFLOW_PIPELINE_ROOT you defined:
from django_nextflow.models import Pipeline
pipeline = Pipeline.objects.create(path="workflows/main.nf")
You can also provide paths to a JSON input schema file (structured using the nf-core style) and a config file to use when running it:
pipeline = Pipeline.objects.create(
    path="workflows/main.nf",
    description="Some useful pipeline.",
    schema_path="main.json",
    config_path="nextflow.config"
)
print(pipeline.input_schema) # Returns inputs as dict
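To give a feel for what an nf-core style schema contains, here is a minimal sketch that flattens one into a dictionary of parameters. It is not how django-nextflow builds input_schema. It assumes the common nf-core layout in which parameters are grouped under a top-level "definitions" (or "$defs") key, and the function name flatten_params and the sample schema are hypothetical.

```python
import json

# A tiny hand-written schema in the nf-core style (illustrative only).
SCHEMA_TEXT = """
{
  "title": "Example pipeline parameters",
  "type": "object",
  "definitions": {
    "input_output_options": {
      "title": "Input/output options",
      "type": "object",
      "required": ["input"],
      "properties": {
        "input": {"type": "string", "description": "Path to input file."},
        "outdir": {"type": "string", "default": "./results"}
      }
    }
  }
}
"""


def flatten_params(schema):
    """Collect every parameter from every group into a single dict."""
    groups = schema.get("definitions") or schema.get("$defs") or {}
    params = {}
    for group in groups.values():
        required = set(group.get("required", []))
        for name, spec in group.get("properties", {}).items():
            params[name] = {
                "type": spec.get("type"),
                "default": spec.get("default"),
                "required": name in required,
            }
    return params


params = flatten_params(json.loads(SCHEMA_TEXT))
print(sorted(params))
```

A flattened view like this can be handy for building a form, or for validating a params dictionary before it is passed to a pipeline.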
To run the pipeline:
execution = pipeline.run(params={"param1": "xxx"})
This will run the pipeline using Nextflow, and save database entries for three different models:
- The Execution that is returned represents the running of this pipeline on this occasion. It stores the stdout and stderr of the command, and has a get_log_text() method for reading the full log file from disk. A directory will be created in NEXTFLOW_DATA_ROOT for the execution to take place in.
- ProcessExecution records for each process that executes within the running of the pipeline. These also have their own stdout and stderr, as well as status information.
- Data records for each file published by the processes in the pipeline. Note that this is not every file produced, but specifically those output by the process via its output channel. For this to work, the processes must be configured to publish these files to a particular directory name (the one that NEXTFLOW_PUBLISH_DIR is set to), and to a subdirectory within that directory with the process's name.
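The directory convention described above can be sketched in plain Python. This is not the library's own code: published_files is a hypothetical helper, and the folder and process names in the demo are made up. It only shows where a process's published files are expected to sit, namely in execution directory / NEXTFLOW_PUBLISH_DIR / process name.

```python
import tempfile
from pathlib import Path


def published_files(execution_dir, publish_dir, process_name):
    """Return the files found in <execution_dir>/<publish_dir>/<process_name>."""
    target = Path(execution_dir) / publish_dir / process_name
    if not target.is_dir():
        # The process published nothing (or published somewhere else).
        return []
    return sorted(p for p in target.iterdir() if p.is_file())


# Demo: build a fake execution directory and look up one process's outputs.
with tempfile.TemporaryDirectory() as execution_dir:
    out = Path(execution_dir) / "results" / "ALIGN"
    out.mkdir(parents=True)
    (out / "aligned.txt").write_text("example output")

    found = [p.name for p in published_files(execution_dir, "results", "ALIGN")]
    missing = published_files(execution_dir, "results", "TRIM")
    print(found, missing)
```

On the Nextflow side, processes typically publish files using the publishDir directive, so its target needs to follow the same folder/process-name layout.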
If you want to supply a file for which there is a Data object as the input to a pipeline, you can do so as follows:
execution = pipeline.run(params={"param1": "xxx"}, data_params={"param2": 23})
...where 23 is the ID of the Data object.
The Data objects above were created by running some pipeline, but you might want to create one from scratch without running a pipeline. You can do so either from a path string, or from a Django UploadedFile object:
from django_nextflow.models import Data
data1 = Data.create_from_path("/path/to/file.txt")
data2 = Data.create_from_upload(django_upload_object)
The file will be copied to NEXTFLOW_UPLOADS_ROOT in this case.
Changelog
0.2
14th November, 2021
- Pipelines now have description fields.
- Data objects now have creation time fields.
- Added upstream data objects as well as downstream to process executions.
0.1.1
3rd November, 2021
- Fixed duration string parsing.
0.1
29th October, 2021
- Initial models for pipelines, execution, process executions and data.