Quilt Python API https://quiltdata.com
Python
The Quilt Python connector uses the Quilt REST API and, if it is installed, SQLAlchemy (http://docs.sqlalchemy.org/) to access and update data sets in Quilt. Quilt tables are available as dictionaries or Pandas (http://pandas.pydata.org/) DataFrames.
The Quilt Python connector is available via PyPI: https://pypi.python.org/pypi/quilt
pip install quilt
Connection
To use the Quilt Python connector, install it from PyPI or add this repository to your PYTHONPATH, then import quilt.
Connect to Quilt by creating a Connection object:
import quilt
connection = quilt.Connection(username=None)
Quilt username: *enter your username*
Password: *enter your password*
The connection will contain a list of your Quilt tables:
connection.tables
Search for Data Sets
You can also find tables by searching your own tables and Quilt’s public data sets:
connection.search('term')
Get Table
Get a table by its table id using get_table:
t = connection.get_table(1234)
Create a New Table
Using the connection, you can create new tables in Quilt. To create an empty table:
t = connection.create_table(name, description)
To create a table from an input file:
t = connection.create_table(name, description, inputfile=path_to_input_file)
Or, to create a new table from a DataFrame:
t = connection.save_df(df, name, description="table description")
Table
Each Table object has a list of Columns:
mytable.columns
After the columns have been fetched, each column is also available as an attribute of the table:
mytable.column1
Accessing Table Data
Tables are iterable. To access table data:
for row in mytable:
    print(row)
Search
Search for matching rows in a table by calling search:
for row in mytable.search('foo'):
    print(row)
Order By
Sort the table by any column or set of columns. To set the ordering, pass the column’s field (its name in the database) as a string:
mytable.order_by('column1')
You can find column field names with their “.field” attribute:
mytable.order_by(mytable.column1.field)
You can sort by multiple columns by passing a list of fields.
mytable.order_by(['column2', 'column1'])
To sort in descending order, add a “-” in front of the column field name:
mytable.order_by('-column1')
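The ordering rules above can be illustrated without a Quilt connection. The following is a minimal sketch that applies the same conventions (a single field or a list of fields, with a leading “-” for descending order) to a plain list of dictionaries. The function sort_rows and the sample rows are hypothetical and are not part of the quilt package; Quilt itself may implement ordering differently.

```python
def sort_rows(rows, fields):
    """Sort dict rows by one field name or a list of field names.

    Hypothetical helper, not part of quilt. A leading "-" on a field
    name means descending order.
    """
    if isinstance(fields, str):
        fields = [fields]
    result = list(rows)
    # Apply the keys from last to first. Python's sort is stable, so the
    # first field in the list ends up as the primary sort key.
    for field in reversed(fields):
        descending = field.startswith("-")
        name = field[1:] if descending else field
        result.sort(key=lambda row: row[name], reverse=descending)
    return result


rows = [
    {"column1": 2, "column2": "b"},
    {"column1": 1, "column2": "b"},
    {"column1": 3, "column2": "a"},
]

by_col1_desc = sort_rows(rows, "-column1")
by_col2_then_col1 = sort_rows(rows, ["column2", "column1"])
```

Here by_col1_desc lists the rows with column1 descending, and by_col2_then_col1 sorts by column2 first and breaks ties with column1.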
Limit
Limit the number of rows returned by calling limit(number_of_rows).
Putting it all together
Search, order_by and limit can be combined to return just the data you want to see. For example, to return the top 2 finishers with the name Sally from a table of race results (race_results: [name_000, time_001]), you could write:
for result in race_results.search('Sally').order_by('-time_001').limit(2):
    print(result)
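To see how the chained call above narrows the data step by step, here is a minimal in-memory sketch. LocalQuery is a hypothetical stand-in written for this illustration and is not part of the quilt package. It assumes that search matches a term against any value in a row, ignoring case, which may differ from how Quilt matches rows. The sample data is invented.

```python
class LocalQuery:
    """Hypothetical in-memory stand-in for a table query (not part of quilt)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def search(self, term):
        # Assumption: keep rows where any value contains the term,
        # ignoring case.
        term = term.lower()
        return LocalQuery(
            row for row in self._rows
            if any(term in str(value).lower() for value in row.values())
        )

    def order_by(self, field):
        # A leading "-" means descending order.
        descending = field.startswith("-")
        name = field[1:] if descending else field
        return LocalQuery(
            sorted(self._rows, key=lambda row: row[name], reverse=descending)
        )

    def limit(self, count):
        return LocalQuery(self._rows[:count])

    def __iter__(self):
        return iter(self._rows)


race_results = LocalQuery([
    {"name_000": "Sally Ride", "time_001": 31.2},
    {"name_000": "Bob Stone", "time_001": 29.0},
    {"name_000": "Sally Field", "time_001": 35.5},
    {"name_000": "Sally Mann", "time_001": 28.4},
])

selected = list(race_results.search("Sally").order_by("-time_001").limit(2))
```

Each method returns a new query object, which is what makes the calls chainable: search keeps the three Sally rows, order_by sorts them by time_001 descending, and limit keeps the first two.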
Pandas DataFrame
Access a table’s data as a Pandas DataFrame by calling mytable.df().
You can also combine the querying methods above to access particular rows:
race_results.search('Sally').order_by('-time_001').limit(2).df()
Gene Math
Quilt supports intersect and subtract for tables that store genomic regions. These operations assume that each table has columns storing the chromosome, start, and end of each region. The function get_bed_cols tries to infer those columns from the column names.
If the guessing fails, or to override the guess, set the chromosome, start, and end columns explicitly with set_bed_cols:
mytable.set_bed_cols(mytable.chr_001, mytable.start_002, mytable.end_003)
Once the bed columns are set for both tables, they can be intersected and subtracted.
result = tableA.intersect(tableB)
result = tableA.intersect_wao(tableB)
result = tableA.subtract(tableB)
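For readers new to interval arithmetic, the sketch below shows what intersecting and subtracting genomic regions means in general. It is a simplified local illustration, not Quilt's implementation: the functions intersect and subtract here are hypothetical helpers that work on plain (chromosome, start, end) tuples, and the half-open coordinate convention (start included, end excluded) is an assumption borrowed from the BED format.

```python
def intersect(regions_a, regions_b):
    """Return the overlapping part of each pair of regions on the same chromosome."""
    overlaps = []
    for chrom_a, start_a, end_a in regions_a:
        for chrom_b, start_b, end_b in regions_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # empty overlaps are skipped
                overlaps.append((chrom_a, start, end))
    return overlaps


def subtract(regions_a, regions_b):
    """Return the parts of regions_a that no region in regions_b covers."""
    remaining = []
    for chrom, start, end in regions_a:
        pieces = [(start, end)]
        for chrom_b, start_b, end_b in regions_b:
            if chrom_b != chrom:
                continue
            next_pieces = []
            for piece_start, piece_end in pieces:
                if start_b >= piece_end or end_b <= piece_start:
                    # No overlap: keep the piece unchanged.
                    next_pieces.append((piece_start, piece_end))
                    continue
                if piece_start < start_b:
                    next_pieces.append((piece_start, start_b))
                if end_b < piece_end:
                    next_pieces.append((end_b, piece_end))
            pieces = next_pieces
        remaining.extend((chrom, s, e) for s, e in pieces)
    return remaining


# Invented sample regions, half-open coordinates assumed.
table_a = [("chr1", 100, 200), ("chr2", 50, 80)]
table_b = [("chr1", 150, 250), ("chr1", 110, 120)]

shared = intersect(table_a, table_b)
left_over = subtract(table_a, table_b)
```

With this sample data, shared holds the two chr1 overlaps, and left_over holds the chr1 fragments that table_b does not cover plus the untouched chr2 region.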
Development
Install the dependencies and run the tests with:
pip install -r requirements.txt
pip install pytest pytest-cov
pytest --cov=quilt/ tests