Wrapper to download FlyBase data with Python, easily and quickly.
FlyBaseDownloads
A Python package that simplifies downloading data from FlyBase. Most of the bulk data described on the official FlyBase wiki can be downloaded. The library aims to mirror the organization of the source as closely as possible: although it is not an official FlyBase package, it is organized by data class/type and provides direct downloads of the current bulk data files from the FlyBase FTP site. For more information, visit the official FlyBase wiki.
- Usage and Installation
- Synonyms
- Genes
- Genetic interaction table
- RNA-Seq RPKM values
- RNA-Seq RPKM values matrix
- Single Cell RNA-Seq Gene Expression
- Physical interaction MITAB file
- Functional complementation table
- FBgn to DB Accession IDs
- FBgn to Annotation ID
- FBgn to GLEANR IDs
- FBgn to FBtr to FBpp IDs
- FBgn to FBtr to FBpp IDs (expanded)
- FBgn exons to Affy1
- FBgn exons to Affy2
- Genes Sequence Ontology (SO) data
- Genes map table
- Best gene summaries
- Automated gene summaries
- Gene Snapshots
- Unique protein isoforms
- Non-coding RNAs
- Enzyme
- Gene Ontology annotation files GO
- Gene Groups
- Alleles and Stocks
- Homologs
- Human disease
- Organisms
- Ontology Terms
- Insertions
- Clones
- References
Usage and Installation
To make FlyBase files easy to find, the class and method names follow the FlyBase file names as closely as possible. To access the data, follow these steps:
- Install the library using the pip command.
  pip install FlyBaseDownloads
- Import the library into your file.
  import FlyBaseDownloads as FBD
- Access the different classes of the library described below.
Synonyms
To download the file, execute the following command.
Synonyms = FBD.Synonyms.get()
The file reports current symbols and synonyms for the following objects in FlyBase: genes (FBgn), alleles (FBal), balancers (FBba), aberrations (FBab), transgenic constructs (FBtp), insertions (FBti), transcripts (FBtr), and proteins (FBpp).
Columns Description
| Column heading | Content Description |
|---|---|
| primary_FBid | Primary FlyBase identifier for the object |
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin |
| current_symbol | Current symbol used in FlyBase for the object |
| current_fullname | Current full name used in FlyBase for the object |
| fullname_synonym(s) | Non-current full name(s) associated with the object (pipe separated values) |
| symbol_synonym(s) | Non-current symbol(s) associated with the object (pipe separated values) |
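The two synonym columns hold pipe-separated values, so a reverse lookup from an old symbol to its primary ID needs them split first. The following is a minimal sketch, not part of FlyBaseDownloads: it assumes the rows are available as dictionaries keyed by the column headings above (the return type of `FBD.Synonyms.get()` is not stated here). The helper name `build_synonym_index` is hypothetical, and the single sample row is an illustrative excerpt assembled from the dpp examples in the GAF table of this document, not downloaded data.

```python
from collections import defaultdict

def build_synonym_index(rows):
    """Map every current or non-current symbol to the primary FBids that use it."""
    index = defaultdict(set)
    for row in rows:
        names = [row["current_symbol"]]
        # Synonym columns are pipe-separated; an empty cell contributes no names.
        names += [s for s in row["symbol_synonym(s)"].split("|") if s]
        for name in names:
            index[name].add(row["primary_FBid"])
    return dict(index)

# Illustrative row only (assumed layout, not downloaded data).
rows = [{
    "primary_FBid": "FBgn0000490",
    "organism_abbreviation": "Dmel",
    "current_symbol": "dpp",
    "current_fullname": "decapentaplegic",
    "fullname_synonym(s)": "",
    "symbol_synonym(s)": "shv|blk|ho|Tg",
}]

index = build_synonym_index(rows)
print(index["shv"])  # {'FBgn0000490'}
```

Sets are used as values because the same synonym may, in principle, point to more than one object.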
Genes
For convenience, first bind the class to a short name:
Genes = FBD.Genes
Then call the method that corresponds to the data you need.
Genetic interaction table
To download the file, execute the following command.
Genetic_interaction_table = Genes.Genetic_interaction_table()
Columns Description
| Column heading | Content Description |
|---|---|
| Starting_gene(s)_symbol | Current FlyBase symbol of gene(s) involved in the starting genotype |
| Starting_gene(s)_FBgn | Current FlyBase identifier (FBgn#) of gene(s) involved in the starting genotype |
| Interacting_gene(s)_symbol | Current FlyBase symbol of gene(s) involved in the interacting genotype |
| Interacting_gene(s)_FBgn | Current FlyBase identifier (FBgn#) of gene(s) involved in the interacting genotype |
| Interaction_type | Type of interaction observed, either 'suppressible' or 'enhanceable' |
| Publication_FBrf | Current FlyBase identifier (FBrf#) of publication from which the data came |
RNA-Seq RPKM values
To download the file, execute the following command.
RNASeq_values = Genes.RNASeq_values()
Columns Description
| Column heading | Content Description |
|---|---|
| Release_ID | The D. melanogaster annotation set version from which the gene model used in the analysis derives |
| FBgn# | The unique FlyBase gene ID for this gene |
| GeneSymbol | The official FlyBase symbol for this gene |
| Parent_library_FBlc# | The unique FlyBase ID for the dataset project to which the RNA-Seq experiment belongs |
| Parent_library_name | The official FlyBase symbol for the dataset project to which the RNA-Seq experiment belongs |
| RNASource_FBlc# | The unique FlyBase ID for the RNA-Seq experiment used for RPKM expression calculation |
| RNASource_name | The official FlyBase symbol for the RNA-Seq experiment used for RPKM expression calculation |
| RPKM_value | The RPKM expression value for the gene in the specified RNA-Seq experiment |
| Bin_value | The expression bin classification of this gene in this RNA-Seq experiment, based on RPKM value. Bins range from 1 (no/extremely low expression) to 8 (extremely high expression) |
| Unique_exon_base_count | The number of exonic bases unique to the gene (not overlapping exons of other genes). Field will be blank for genes derived from dicistronic/polycistronic transcripts |
| Total_exon_base_count | The number of bases in all exons of this gene |
| Count_used | Indicates if the RPKM expression value was calculated using only the exonic regions unique to the gene and not overlapping exons of other genes (Unique), or, if the RPKM expression value was calculated based on all exons of the gene regardless of overlap with other genes (Total). RPKM expression values are typically reported for the "Unique" count, except for genes on dicistronic/polycistronic transcripts, in which case the "Total" count is reported |
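Since Bin_value is a coarse 1 to 8 expression class, it lends itself to simple threshold filtering. The sketch below assumes the rows are dictionaries with string values, as they would be when read from a tab-separated file; the helper name `genes_at_or_above_bin`, the placeholder gene IDs, and the sample names are all hypothetical.

```python
def genes_at_or_above_bin(rows, min_bin):
    """Return sorted gene IDs whose Bin_value is >= min_bin in any experiment."""
    if not 1 <= min_bin <= 8:
        raise ValueError("Bin_value ranges from 1 to 8")
    return sorted({r["FBgn#"] for r in rows if int(r["Bin_value"]) >= min_bin})

# Hypothetical rows with placeholder IDs; only the columns used here are shown.
rows = [
    {"FBgn#": "FBgn0000001", "RNASource_name": "sample_A", "Bin_value": "1"},
    {"FBgn#": "FBgn0000002", "RNASource_name": "sample_A", "Bin_value": "6"},
    {"FBgn#": "FBgn0000002", "RNASource_name": "sample_B", "Bin_value": "8"},
    {"FBgn#": "FBgn0000003", "RNASource_name": "sample_B", "Bin_value": "5"},
]

print(genes_at_or_above_bin(rows, 6))  # ['FBgn0000002']
```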
RNA-Seq RPKM values matrix
To download the file, execute the following command.
RNASeq_values_matrix = Genes.RNASeq_values_matrix()
Columns Description
| Column heading | Content Description |
|---|---|
| gene_primary_id | The unique FlyBase gene ID for this gene. |
| gene_symbol | The official FlyBase symbol for this gene. |
| gene_fullname | The official full name for this gene. |
| gene_type | The type of gene: e.g., protein_coding_gene, non_protein_coding_gene. |
| DATASAMPLE_NAME_(DATASET_ID) | Each subsequent column reports the gene RPKM values for the sample listed in the header. The dataset "FBlc" ID is listed in parentheses, and can be pasted into FlyBase search to access more information on the sample from the "dataset" report. |
Single Cell RNA-Seq Gene Expression
To download the file, execute the following command.
SingleCellRNASeq_Gene_Expression = Genes.Single_Cell_RNA_Gene_Expression()
Columns Description
| Column heading | Content Description |
|---|---|
| Pub_ID | The FlyBase FBrf ID for the reference in which the expression was reported. |
| Pub_miniref | The FlyBase citation for the publication in which the expression was reported. |
| Clustering_Analysis_ID | The FlyBase FBlc ID for the dataset representing the clustering analysis. |
| Clustering_Analysis_Name | The FlyBase name for the dataset representing the clustering analysis. |
| Source_Tissue_Sex | The sex of the source tissue used for the experiment: male, female or mixed. |
| Source_Tissue_Stage | The life stage of the source tissue used for the experiment, using only high-level terms: embryonic stage, larval stage, pupal stage, adult stage or mixed. |
| Source_Tissue_Anatomy | The anatomical region of the source tissue used for the experiment; only "mixed" is shown if there are many. |
| Cluster_ID | The FlyBase FBlc ID for the dataset representing the cell cluster. |
| Cluster_Name | The FlyBase name for the dataset representing the cell cluster. |
| Cluster_Cell_Type_ID | The FlyBase FBbt ID for the cell type represented by the cell cluster. |
| Cluster_Cell_Type_Name | The FlyBase name for the cell type represented by the cell cluster. |
| Gene_ID | The FlyBase FBgn ID for the expressed gene. |
| Gene_Symbol | The FlyBase symbol for the expressed gene (ASCII-format). |
| Mean_Expression | The average level of expression of the gene across all cells of the cluster in which the gene is detected at all. |
| Spread | The proportion of cells in the cluster in which the gene is detected. |
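Mean_Expression is averaged only over the cells in which the gene is detected, and Spread is the fraction of cells in which it is detected. Their product therefore estimates the mean over every cell in the cluster, provided that undetected cells are counted as zero. That zero-count treatment is an assumption of this sketch rather than something stated by FlyBase, and `cluster_wide_mean` is a hypothetical helper name. The per-cell values are made up to check the arithmetic.

```python
def cluster_wide_mean(mean_expression, spread):
    """Mean over all cells of a cluster, counting undetected cells as zero."""
    if not 0.0 <= spread <= 1.0:
        raise ValueError("Spread is a proportion and must lie in [0, 1]")
    return mean_expression * spread

# Made-up cluster of 4 cells in which the gene is detected in 2.
cells = [0.0, 0.0, 3.0, 5.0]
detected = [x for x in cells if x > 0]
mean_expression = sum(detected) / len(detected)  # mean over detecting cells: 4.0
spread = len(detected) / len(cells)              # fraction of detecting cells: 0.5

print(cluster_wide_mean(mean_expression, spread))  # 2.0
print(sum(cells) / len(cells))                     # 2.0, the direct per-cell mean
```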
Physical interaction MITAB file
To download the file, execute the following command.
Physical_interaction_MITAB = Genes.Physical_interaction_MITAB()
Columns Description
| Column number | Column heading | General format | FlyBase example | Content description |
|---|---|---|---|---|
| 1 | ID(s) Interactor A | database:identifier | flybase:FBgn0002121 | The unique FlyBase identifier for the first gene of the interacting pair. |
| 2 | ID(s) Interactor B | - | - | The unique FlyBase identifier for the second gene of the interacting pair. |
| 3 | Alt ID(s) Interactor A | database:identifier | flybase:CG2671\| entrez gene/locuslink:33156 | The alternative gene identifiers currently provided are FlyBase annotation IDs (CG#) and NCBI’s Entrez Gene ID, separated by “\|”. |
| 4 | Alt ID(s) Interactor B | - | - | - |
| 5 | Alias(es) Interactor A | database:name(alias type) | flybase:l(2)gl(gene name) | The official FlyBase gene symbol. It is referred to as “gene name” to adhere to the psi-mi ontology. |
| 6 | Alias(es) Interactor B | - | - | - |
| 7 | Interaction Detection Method(s) | ontology:identifier(method name) | psi-mi:"MI:0006"(anti bait coimmunoprecipitation) | The assay used to detect the interaction, taken from the psi-mi ontology. |
| 8 | Publication 1st Author(s) | surname initial(s) (publication year) | Betschinger K. (2003) | The first author and year of the publication where the interaction is described. |
| 9 | Publication ID(s) | database:identifier | flybase:FBrf0157155\|pubmed:12629552 | The unique FlyBase identifier for the publication followed by the unique PubMed identifier (if there is one) separated by “\|”. |
| 10 | Taxid Interactor A | taxid:identifier | taxid:7227("Drosophila melanogaster") | The NCBI taxonomy identifier for the source organism of the interactor. The vast majority of interactors in FlyBase come from D. melanogaster. There are, however, a few interspecies interactions consisting of a D. melanogaster interactor and an interactor of a different species. |
| 11 | Taxid Interactor B | - | - | - |
| 12 | Interaction Type(s) | ontology:identifier(interaction type) | psi-mi:"MI:0915"(physical association) | Taken from the psi-mi ontology. Most often “physical association” for FlyBase. |
| 13 | Source Database(s) | ontology:identifier(database name) | psi-mi:"MI:0478"(flybase) | All interactions are curated by FlyBase. |
| 14 | Interaction Identifier(s) | database:identifier | flybase:FBrf0157155-13.coIP.WB | The unique FlyBase identifier for this interaction. |
| 15 | Confidence Value(s) | - | - | Not applicable |
| 16 | Expansion Method(s) | - | - | Not applicable |
| 17 | Biological Role(s) Interactor A | - | - | Not applicable |
| 18 | Biological Role(s) Interactor B | - | - | Not applicable |
| 19 | Experimental Role(s) Interactor A | ontology:identifier(experimental role name) | psi-mi:"MI:0496"(bait) | The role played by the interactor in the experiment. Taken from the psi-mi ontology. |
| 20 | Experimental Role(s) Interactor B | - | - | - |
| 21 | Type(s) Interactor A | ontology:identifier(interactor type name) | psi-mi:"MI:0326"(protein) | The molecule type. For FlyBase, these are limited to protein or ribonucleic acid. Taken from the psi-mi ontology. |
| 22 | Type(s) Interactor B | - | - | - |
| 23 | Xref(s) Interactor A | - | - | Not applicable |
| 24 | Xref(s) Interactor B | - | - | Not applicable |
| 25 | Interaction Xref(s) | database:identifier | flybase:FBig0000000103 | Cross references for the interactions. For FlyBase, these include an interaction group identifier (FBig) and possibly a collection identifier (FBlc) separated by “\|”. All experiments that show an interaction between the products of gene A and gene B are compiled into an A-B interaction group, such that all interactions are associated with an interaction group identified by an FBig number. Interactions identified as part of a large scale study are also associated with the collection identifier, or FBlc number. |
| 26 | Annotation(s) Interactor A | topic:text isoform- | comment:a isoform | Information on whether the interaction is specific to a particular interactor isoform. |
| 27 | Annotation(s) Interactor B | - | - | - |
| 28 | Interaction Annotation(s) | topic:text | comment:Phosphorylated isoforms of @l(2)gl@ are absent when @aPKC@ is knocked down by RNAi. | Describes the source(s) of the interaction participants and includes free text comments about the interaction. |
| 29 | Host Organism(s) | - | - | Not applicable |
| 30 | Interaction Parameters | - | - | Not applicable |
| 31 | Creation Date | - | - | Not applicable |
| 32 | Update Date | - | - | Not applicable |
| 33 | Checksum Interactor A | - | - | Not applicable |
| 34 | Checksum Interactor B | - | - | Not applicable |
| 35 | Interaction Checksum | - | - | Not applicable |
| 36 | Negative | - | FALSE | All interactions in FlyBase are positive. |
| 37 | Feature(s) Interactor A | feature_type:range(text) | sufficient binding region:aa 1-58(N-terminal region) | Describes features of Interactor A such as binding sites, mutations that disrupt the interaction, epitope tags, etc. |
| 38 | Feature(s) Interactor B | - | - | - |
| 39 | Stoichiometry Interactor A | - | - | Not applicable |
| 40 | Stoichiometry Interactor B | - | - | Not applicable |
| 41 | Identification Method(s) Participant A | - | - | Not applicable |
| 42 | Identification Method(s) Participant B | - | - | Not applicable |
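Several MITAB columns pack more than one value into a field, using the `database:identifier` and `ontology:identifier(name)` formats listed above. The sketch below parses both, using the FlyBase examples from the table as input. The function names are hypothetical, and treating a lone `-` as an empty field follows the general PSI-MI TAB convention rather than anything specific to this package.

```python
import re

# Matches ontology:"identifier"(name),
# e.g. psi-mi:"MI:0006"(anti bait coimmunoprecipitation)
_ONTOLOGY_RE = re.compile(r'^(?P<db>[^:]+):"(?P<id>[^"]+)"\((?P<name>.*)\)$')

def parse_xrefs(field):
    """Split a 'database:identifier|database:identifier' field into a dict."""
    if field == "-":  # '-' marks an empty MITAB field
        return {}
    pairs = (item.strip().split(":", 1) for item in field.split("|"))
    # Note: a repeated database prefix would keep only its last identifier.
    return {db: ident for db, ident in pairs}

def parse_ontology_term(field):
    """Return (database, identifier, name) from an ontology-style field."""
    match = _ONTOLOGY_RE.match(field)
    if match is None:
        raise ValueError(f"unrecognised ontology field: {field!r}")
    return match["db"], match["id"], match["name"]

print(parse_xrefs("flybase:FBrf0157155|pubmed:12629552"))
# {'flybase': 'FBrf0157155', 'pubmed': '12629552'}
print(parse_ontology_term('psi-mi:"MI:0006"(anti bait coimmunoprecipitation)'))
# ('psi-mi', 'MI:0006', 'anti bait coimmunoprecipitation')
```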
Functional complementation table
To download the file, execute the following command.
Functional_complementation = Genes.Functional_complementation()
Columns Description
| Column heading | Content Description |
|---|---|
| Dmel gene (symbol) | Current FlyBase symbol of Dmel gene. |
| Dmel gene (FBgn) | Current FlyBase identifier (FBgn#) of Dmel gene in column 1. |
| Functionally complementing ortholog (symbol) | Current FlyBase symbol of a non-Dmel ortholog of the Dmel gene in column 1 where this non-Dmel gene has been shown to functionally complement the Dmel gene. |
| Functionally complementing ortholog (FBgn#) | Current FlyBase identifier (FBgn#) of a non-Dmel ortholog of the Dmel gene in column 1 where this non-Dmel gene has been shown to functionally complement the Dmel gene. |
| Supporting_FBrf | Current FlyBase identifier (FBrf#) of the publication that provides support for the functional complementation statement (the publication that reported the suppression of a mutant phenotype of the Dmel gene by a transgenic construct/mutant allele of the non-Dmel ortholog). |
FBgn to DB Accession IDs
To download the file, execute the following command.
FBgn_toDB_Accession_IDs = Genes.FBgn_toDB_Accession_IDs()
Columns Description
| Column heading | Content Description |
|---|---|
| gene_symbol | Current symbol of gene. |
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the gene. |
| primary_FBgn# | Current FlyBase identifier (FBgn#) of gene. |
| nucleotide_accession | EMBL/GenBank/DDBJ nucleotide accession associated with the gene. |
| na_based_protein_accession | EMBL/GenBank/DDBJ protein accession associated with the gene and the nucleotide accession in the preceding 'nucleotide_accession' column. |
| UniprotKB/Swiss-Prot/TrEMBL_accession | UniProtKB/SwissProt/TrEMBL protein accession associated with the gene. |
| EntrezGene_ID | NCBI Entrez ID associated with the gene. |
| RefSeq_transcripts | NCBI RefSeq transcript accession associated with the gene. |
| RefSeq_proteins | NCBI RefSeq protein accession associated with the gene and the transcript accession in the preceding 'RefSeq_transcripts' column. |
FBgn to Annotation ID
To download the file, execute the following command.
FBgn_toAnnotation_ID = Genes.FBgn_toAnnotation_ID()
Columns Description
| Column heading | Content Description |
|---|---|
| gene_symbol | Current symbol of gene. |
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the gene. |
| primary_FBgn# | Current FlyBase identifier (FBgn#) of gene. |
| secondary_FBgn#(s) | Secondary FlyBase identifier(s) (FBgn#) associated with the gene (comma separated values). |
| annotation_ID | Current annotation identifier associated with the gene. |
| secondary_annotation_ID(s) | Secondary annotation identifier(s) associated with the gene (comma separated values). |
FBgn to GLEANR IDs
To download the file, execute the following command.
FBgn_toGLEANR_IDs = Genes.FBgn_toGLEANR_IDs()
Columns Description
| Column heading | Content Description |
|---|---|
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the gene. |
| gene_symbol | Current FlyBase gene symbol. |
| primary_FBgn# | Current FlyBase identifier (FBgn#) of the gene. |
| GLEANR_ID | GLEANR identifier assigned by the AAA Consortium. |
FBgn to FBtr to FBpp IDs
To download the file, execute the following command.
FBgn_to_FBtr_to_FBpp = Genes.FBgn_to_FBtr_to_FBpp()
Columns Description
| Column heading | Content Description |
|---|---|
| FlyBase_FBgn | Current FlyBase identifier (FBgn#) of the gene. |
| FlyBase_FBtr | Current FlyBase identifier (FBtr#) of a transcript encoded by the gene listed in the preceding 'FlyBase_FBgn' column. |
| FlyBase_FBpp | Current FlyBase identifier (FBpp#) of a polypeptide encoded by the transcript listed in the preceding 'FlyBase_FBtr' column, where this is relevant. |
FBgn to FBtr to FBpp IDs (expanded)
To download the file, execute the following command.
FBgn_to_FBtr_to_FBpp_exp = Genes.FBgn_to_FBtr_to_FBpp_expanded()
Columns Description
| Column heading | Content Description |
|---|---|
| organism | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the gene. |
| gene_type | The type of gene, represented by a Sequence Ontology term. |
| gene_ID | Current "FBgn" identifier of gene. |
| gene_symbol | Current symbol of the gene. |
| gene_fullname | Current full name of the gene. |
| annotation_ID | Current FlyBase annotation identifier of the gene. |
| transcript_type | The type of transcript, represented by a Sequence Ontology term. |
| transcript_ID | Current FlyBase annotation identifier of the transcript. |
| transcript_symbol | Current symbol of the transcript. |
| polypeptide_ID | Current FlyBase annotation identifier of the polypeptide. |
| polypeptide_symbol | Current symbol of the polypeptide. |
FBgn exons to Affy1
To download the file, execute the following command.
FBgn_exons2affy1 = Genes.FBgn_exons2affy1()
The file is generated by testing for overlaps, no matter how small, of the locations of Affy1 oligos in the genome with the locations of gene exons, as defined by the Dmel gene models for the current release of FlyBase. If the location of an Affy1 oligo shows any kind of overlap with an exon of a gene, a Gene=>Affy reference is recorded in this file.
The extent of the overlap has no influence on the inclusion of a crossreference in this file. The overlap might be just one nucleotide, or it could be an exact match to the exon. For interpretation of the significance of a partial overlap please contact Affymetrix.
The file includes the following Dmel genes:
- Nuclear genes located to the sequence
It excludes:
- genes not located to the sequence
- mitochondrial genes
The first column of each line is the FBgn ID, and the second is the ID of the Affy1 oligo that overlaps with an exon of that gene.
FBgn exons to Affy2
To download the file, execute the following command.
FBgn_exons2affy2 = Genes.FBgn_exons2affy2()
This file is built in the same way as the Affy1 file, but for Affy2 oligos.
Genes Sequence Ontology (SO) data
To download the file, execute the following command.
Genes_Sequence_Ontology = Genes.Genes_Sequence_Ontology()
Columns Description
| Column heading | Content Description |
|---|---|
| gene_primary_id | The unique FlyBase gene ID for this gene. |
| gene_symbol | The official FlyBase symbol for this gene. |
| so_term_name | The SO term name. |
| so_term_id | The SO term primary identifier. |
Genes map table
To download the file, execute the following command.
Genes_map = Genes.Genes_map()
Columns Description
| Column heading | Content Description |
|---|---|
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the gene. |
| current_symbol | Current FlyBase gene symbol. |
| primary_FBid | Current FlyBase identifier (FBgn#) of gene. |
| recombination_loc | Recombination map location. |
| cytogenetic_loc | Cytogenetic location. |
| sequence_loc | Genomic location. |
Best gene summaries
To download the file, execute the following command.
Best_gene_summaries = Genes.Best_gene_summaries()
Columns Description
| Column heading | Content Description |
|---|---|
| FBgn_ID | Current FlyBase identifier number for the gene. |
| Gene_Symbol | Current FlyBase symbol of the gene. |
| Summary_Source | The source of the gene summary. |
| Summary | The gene summary text. |
Automated gene summaries
To download the file, execute the following command.
Automated_gene_summaries = Genes.Automated_gene_summaries()
Columns Description
| Column heading | Content Description |
|---|---|
| FlyBase ID. | The valid FlyBase identifier number for the gene. |
| Summary | The gene summary as a string of plain text. |
Gene Snapshots
To download the file, execute the following command.
Gene_Snapshots = Genes.Gene_Snapshots()
Columns Description
| Column heading | Content Description |
|---|---|
| FBgn_ID | Current FlyBase identifier number for the gene. |
| GeneSymbol | Current FlyBase symbol of the gene. |
| GeneName | Current FlyBase name of the gene. |
| datestamp | Date on which the information was last reviewed. |
| gene_snapshot_text | Gene snapshot information for the gene. Cases that are in progress or are deemed to have insufficient data to summarize are stated as such |
Unique protein isoforms
To download the file, execute the following command.
Unique_protein_isoforms = Genes.Unique_protein_isoforms()
Columns Description
| Column heading | Content Description |
|---|---|
| FBgn | Current FlyBase identifier (FBgn#) of the gene. |
| FB_gene_symbol | Current FlyBase gene symbol of the gene. |
| representative_protein | Current FlyBase protein symbol of the representative protein isoform. |
| identical_protein(s) | Current FlyBase protein symbol(s) of identical protein isoforms. |
Non-coding RNAs
To download the file, execute the following command.
Noncoding_RNAs = Genes.Noncoding_RNAs()
This file reports all ncRNAs with gene models supported by FlyBase in JSON format, as submitted to RNAcentral. Pseudogenes are excluded. In addition to the symbols and IDs for ncRNAs, this file also includes their associated gene, genomic location, sequence, Sequence Ontology classification, etc.
Enzyme data
To download the file, execute the following command.
Enzyme = Genes.Enzyme()
Columns Description
| Column heading | Content Description |
|---|---|
| group_id | FlyBase gene group (FBgg) ID of the relevant terminal group within the ENZYMES (FBgg0001715) hierarchy (only terminal groups contain members). |
| group_name | FlyBase gene group (FBgg) name of relevant terminal group within the ENZYMES (FBgg0001715) hierarchy (only terminal groups contain members). |
| group_GO_ID | The GO molecular function term ID on the given gene group. Multiple entries are separated with a pipe. |
| group_GO_name | The GO molecular function term name on the given gene group. Multiple entries are separated with a pipe. |
| group_EC_number | The EC number on the given gene group, if present. (This is computed, corresponding to the EC cross-reference on the GO molecular function term.) |
| group_EC_name | The EC name on the given gene group, if present. (This is computed, corresponding to the EC cross-reference on the GO molecular function term.) |
| gene_id | The current FlyBase gene ID (FBgn) of the gene. |
| gene_symbol | The current FlyBase symbol of the gene. |
| gene_name | The current FlyBase name of the gene. |
| gene_EC_number | The EC number(s) associated with the gene, if present. Multiple entries are separated with a pipe. (This is computed, corresponding to the EC cross-reference(s) on any positive GO molecular function term(s) annotated to the gene.) |
| gene_EC_name | The EC name(s) associated with the gene, if present. Multiple entries are separated with a pipe. (This is computed, corresponding to the EC cross-reference(s) on any positive GO molecular function term(s) annotated to the gene.) |
Gene Ontology annotation files
For convenience, first bind the class to a short name:
GOAnn = FBD.GOAnn
Then call the method that corresponds to the data you need.
Gene Association File - GAF
Columns Description
| Column heading | Content Description |
|---|---|
| DB | The database contributing the gene_association file. FB File: always "FB" for gene_association.fb. |
| DB Object ID | A unique identifier in the database for the item being annotated. FB File: This is always the primary FlyBase identifier number for a Drosophila gene. Example: FBgn0000490 |
| DB Object Symbol | A (unique and valid) symbol to which the DB Object ID is matched. FB File: This is always the valid gene symbol for a Drosophila gene. Example: dpp |
| Qualifier | For each GO annotation, one of the following gene product to term relations is used: 'acts_upstream_of', 'acts_upstream_of_negative_effect', 'acts_upstream_of_positive_effect', 'colocalizes_with', 'contributes_to', 'enables', 'involved_in', 'is_active_in', 'located_in', 'part_of'. This column may also contain the 'NOT' qualifier, separated by a pipe (“\|”) from the gene product to term relation, which makes the annotation statement a negation. |
| GO ID | The unique GO identifier for the GO term attributed to the DB_Object_ID. Example: GO:0005160 |
| DB:Reference | The unique identifier for the reference to which the GO annotation is attributed. FB File: Each FlyBase reference including published literature, conference abstracts, personal communications, sequence records and computer files has a unique 7 digit identifier (an FBrf). Where this reference is a published paper with a PubMed identifier, the PubMed ID is also listed in column 6, separated from the FBrf with a pipe (“\|”). Example: FB:FBrf0136863\|PMID:11432817 |
| Evidence | The evidence code for the GO annotation; one of IMP, IGI, IPI, ISS, IDA, IEP, IEA, TAS, NAS, ND, IC, RCA, HDA, HMP, HGI, HEP, IBA |
| With (or) From | FB File: This column contains the identifier for annotations where the evidence code is IGI, IPI, ISS, IEA or IC. For IGI the database gene symbol and identifier is listed. For ISS and IPI the identifier can be a gene symbol and identifier, or a sequence (protein or nucleic acid) identifier. For IC, the GO identifier of the term used as the basis of a curator inference is given. IGI example: FLYBASE:rpr; FB:FBgn0011706, ISS example: UniProt:P35569, ISS example: EMBL:AF064523, ISS example: SGD_LOCUS:COP1; SGD:S0002304, IC example: GO:0045298 |
| Aspect | Which ontology the GO term belongs to: Function ( F ), Process ( P ) or Component ( C ). Example: P |
| DB Object Name | FB File: The full name of the FlyBase gene. Example: decapentaplegic. Where a FlyBase gene has no full name (e.g. Pten), this field is left blank. |
| DB Object Synonym | Alternative names by which the database object is known. FB File: Multiple synonyms of a FlyBase gene are separated by a pipe (“\|”). Example: M(2)LS1\|shortvein\|Dm-DPP\|dpp\|Dpp\|DPP\|CG9885\|TGF-beta\|TGF-&bgr\|TGF-b\|Hin-d\|l(2)10638\|shv\|DPP-C\|ho\|M(2)23AB\|blk\|l(2)22Fa\|l(2)k17036\|Tg\|TGF&bgr |
| DB Object Type | The type of object being annotated. Always a gene for FlyBase data. FB file: always "gene" for gene_association.fb. |
| Taxon | The taxonomic identifier of the species encoding the gene product. Example: taxon:7227 |
| Date | The date of last annotation update, in the format 'YYYYMMDD'. At present this date is the same for all annotations and corresponds to the date of the latest FlyBase update; we are in the process of changing our system so that dates more accurately reflect the date the annotation is made. Example: 20040821 |
| Assigned by | The source of the GO annotation. |
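Three GAF columns need a little parsing before use: Qualifier (an optional 'NOT' joined to the relation by a pipe), DB:Reference (FBrf and PubMed IDs joined by a pipe), and Date ('YYYYMMDD'). The sketch below handles each one using the example values from the table. The function names are hypothetical, and the 'NOT' example string is constructed from the description above rather than taken from a real file.

```python
from datetime import datetime

def parse_qualifier(value):
    """Split the Qualifier column into (is_negated, relation)."""
    parts = value.split("|")
    negated = "NOT" in parts
    relations = [p for p in parts if p != "NOT"]
    return negated, (relations[0] if relations else None)

def parse_reference(value):
    """Split 'FB:FBrf...|PMID:...' into a {prefix: identifier} dict."""
    return dict(item.split(":", 1) for item in value.split("|"))

def parse_date(value):
    """Convert a 'YYYYMMDD' string into a datetime.date."""
    return datetime.strptime(value, "%Y%m%d").date()

print(parse_qualifier("NOT|involved_in"))               # (True, 'involved_in')
print(parse_reference("FB:FBrf0136863|PMID:11432817"))  # {'FB': 'FBrf0136863', 'PMID': '11432817'}
print(parse_date("20040821"))                           # 2004-08-21
```

`parse_qualifier` does not depend on whether 'NOT' comes before or after the relation, since the order is not specified above.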
Gene groups
For convenience, first bind the class to a short name:
Gene_groups = FBD.Gene_groups
Then call the method that corresponds to the data you need.
Gene group
To download the file, execute the following command.
Gene_groups_data = Gene_groups.Gene_group()
Columns Description
| Column heading | Content Description |
|---|---|
| FB_group_id | Current FlyBase identifier (FBgg##) of Gene Group. |
| FB_group_symbol | Current FlyBase symbol of Gene Group. |
| FB_group_name | Current FlyBase full name of Gene Group. |
| Parent_FB_group_id | Current FlyBase identifier (FBgg##) of parent of given Gene Group (if relevant). |
| Parent_FB_group_symbol | Current FlyBase symbol of parent of given Gene Group (if relevant). |
| Group_member_FB_gene_id | Current FlyBase identifier (FBgn##) of member gene (if terminal group). |
| Group_member_FB_gene_symbol | Current FlyBase symbol of member gene (if terminal group). |
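Because only terminal groups list member genes, collecting every gene under a higher-level group means walking the parent/child links first. The sketch below does this under stated assumptions: rows are dictionaries keyed by the column headings above, empty cells are empty strings, and a group appears on one row per parent/member combination. The helper names and all FBgg/FBgn IDs in the sample are placeholders, not real FlyBase records.

```python
from collections import defaultdict

def build_group_maps(rows):
    """Return (children, members) dictionaries keyed by FB_group_id."""
    children, members = defaultdict(set), defaultdict(set)
    for r in rows:
        if r["Parent_FB_group_id"]:
            children[r["Parent_FB_group_id"]].add(r["FB_group_id"])
        if r["Group_member_FB_gene_id"]:
            members[r["FB_group_id"]].add(r["Group_member_FB_gene_id"])
    return children, members

def genes_under(group_id, children, members):
    """Collect member genes of a group and of every group below it."""
    genes, stack, seen = set(), [group_id], set()
    while stack:
        gid = stack.pop()
        if gid in seen:  # guards against a group reachable by two paths
            continue
        seen.add(gid)
        genes |= members.get(gid, set())
        stack.extend(children.get(gid, ()))
    return genes

def row(group, parent, gene):
    """Build a sample row with only the columns used in this sketch."""
    return {"FB_group_id": group, "Parent_FB_group_id": parent,
            "Group_member_FB_gene_id": gene}

# Placeholder hierarchy: one parent group with two terminal child groups.
rows = [
    row("FBgg9999001", "", ""),
    row("FBgg9999002", "FBgg9999001", "FBgn9999001"),
    row("FBgg9999002", "FBgg9999001", "FBgn9999002"),
    row("FBgg9999003", "FBgg9999001", "FBgn9999003"),
]

children, members = build_group_maps(rows)
print(sorted(genes_under("FBgg9999001", children, members)))
# ['FBgn9999001', 'FBgn9999002', 'FBgn9999003']
```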
Gene groups with HGNC IDs
To download the file, execute the following command.
Gene_groups_HGNC = Gene_groups.Gene_groups_HGNC()
Columns Description
| Column heading | Content Description |
|---|---|
| FB_group_id | Current FlyBase identifier (FBgg##) of Gene Group. |
| FB_group_symbol | Current FlyBase symbol of Gene Group. |
| FB_group_name | Current FlyBase full name of Gene Group. |
| HGNC_family_ID | HGNC ID of equivalent human 'gene family'. |
Pathway group
To download the file, execute the following command.
Pathway_group = Gene_groups.Pathway_group()
Columns Description
| Column heading | Content Description |
|---|---|
| FB_group_id | Current FlyBase identifier (FBgg##) of Pathway Gene Group. |
| FB_group_symbol | Current FlyBase symbol of Pathway Gene Group. |
| FB_group_name | Current FlyBase full name of Pathway Gene Group. |
| Parent_FB_group_id | Current FlyBase identifier (FBgg##) of parent of given Pathway Gene Group (if relevant). |
| Parent_FB_group_symbol | Current FlyBase symbol of parent of given Pathway Gene Group (if relevant). |
| Group_member_FB_gene_id | Current FlyBase identifier (FBgn##) of member gene (if terminal group). |
| Group_member_FB_gene_symbol | Current FlyBase symbol of member gene (if terminal group). |
Alleles and Stocks
For convenience, first bind the class to a short name:
Alleles_Stocks = FBD.Alleles_Stocks
Then call the method that corresponds to the data you need.
Stock data
To download the file, execute the following command.
Stock = Alleles_Stocks.Stock()
Columns Description
| Column heading | Content Description | Example |
|---|---|---|
| FBst | The unique identifier assigned to this stock by FlyBase. | FBst0000002 |
| collection_short_name | A short name for the stock collection that holds the stock. | Bloomington |
| stock_type_cv | The controlled vocabulary term and unique identifier that describe the state of the stock. | living stock ; FBsv:0000002 |
| species | Abbreviation (from the Species Abbreviations list) indicating the species of the stock. | Dmel |
| FB_genotype | Genetic components of the stock corresponding to alleles, aberrations, balancers, or insertions in FlyBase. May be empty. | w[*]; betaTub60D[2] Kr[If-1]/CyO |
| description | Genetic components of the stock as provided to FlyBase by the collection that holds the stock. | FlyTrap: ZCL1796 III |
| stock_number | The stock identifier provided to FlyBase by the collection that holds the stock. May be empty. | 110818 |
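The stock_type_cv column combines a controlled vocabulary term and its identifier in one cell, separated by a semicolon. A minimal sketch for splitting it, using the example value from the table (`parse_stock_type` is a hypothetical helper name):

```python
def parse_stock_type(value):
    """Split 'term ; identifier' into (term, identifier)."""
    term, _, identifier = value.partition(";")
    return term.strip(), identifier.strip()

print(parse_stock_type("living stock ; FBsv:0000002"))
# ('living stock', 'FBsv:0000002')
```

A value without a semicolon comes back as the term with an empty identifier.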
Genetic interactions
To download the file, execute the following command.
Genetic_interactions = Alleles_Stocks.Allele_genetic_interactions()
Columns Description
| Column heading | Content Description |
|---|---|
| allele_symbol | Current FlyBase allele symbol. |
| allele_FBal# | Current FlyBase identifier (FBal#) of allele. |
| interaction | Interaction information associated with allele. |
| FBrf# | Current FlyBase identifier (FBrf#) of publication from which data came. |
Phenotypic data
To download the file, execute the following command.
Phenotypic_data = Alleles_Stocks.Phenotypic()
Columns Description
| Column heading | Content Description |
|---|---|
| genotype_symbols | Current FlyBase symbol(s) of the components that make up the genotype. |
| genotype_FBids | Current FlyBase identifier(s) of the components that make up the genotype. |
| phenotype_name | Phenotypic name associated with the genotype. |
| phenotype_id | Phenotypic identifier associated with the genotype. |
| qualifier_names | Qualifier name(s) associated with phenotypic data for genotype. |
| qualifier_ids | Qualifier identifier(s) associated with phenotypic data for genotype. |
| reference | Current FlyBase identifier (FBrf#) of publication from which data came. |
Alleles to Genes
To download the file, execute the following command.
Alleles_toGenes = Alleles_Stocks.FBal_to_FBgn()
Columns Description
| Column heading | Content Description |
|---|---|
| AlleleID | Current FlyBase identifier (FBal#) of the allele. |
| AlleleSymbol | Current symbol of the allele. |
| GeneID | Current FlyBase identifier (FBgn#) of the gene. |
| GeneSymbol | Current symbol of the gene. |
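A common use of this table is to look up the gene for an allele, or to list all alleles of a gene. The sketch below shows both lookups on plain Python rows. The source does not state what type the download call returns, so the rows here are hand-written dicts keyed by the column headings above, and every ID and symbol in them is made up for illustration. The function names are hypothetical.

```python
def gene_by_allele(rows):
    """Map each allele ID (FBal#) to its gene ID (FBgn#)."""
    return {row["AlleleID"]: row["GeneID"] for row in rows}


def alleles_by_gene(rows):
    """Group allele IDs under the gene ID they belong to."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["GeneID"], []).append(row["AlleleID"])
    return grouped


# Made-up rows shaped like the table above (not real FlyBase records).
rows = [
    {"AlleleID": "FBal0000001", "AlleleSymbol": "geneA[1]",
     "GeneID": "FBgn0000001", "GeneSymbol": "geneA"},
    {"AlleleID": "FBal0000002", "AlleleSymbol": "geneA[2]",
     "GeneID": "FBgn0000001", "GeneSymbol": "geneA"},
    {"AlleleID": "FBal0000003", "AlleleSymbol": "geneB[1]",
     "GeneID": "FBgn0000002", "GeneSymbol": "geneB"},
]

allele_to_gene = gene_by_allele(rows)
gene_to_alleles = alleles_by_gene(rows)
```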
Homologs
To facilitate its usage, it is suggested to access the data using the following command.
Homologs = FBD.Homologs
Then call the method that corresponds to the data you need.
Drosophila Paralogs
To download the file, execute the following command.
Dmel_Paralog = Homologs.Drosophila_Paralogs()
Columns Description
| Column heading | Content Description |
|---|---|
| FBgn_ID | Current FlyBase identifier (FBgn#) of the D. melanogaster gene. |
| GeneSymbol | Current FlyBase gene symbol of the D. melanogaster gene. |
| Arm/Scaffold | Arm upon which the D. melanogaster gene is localized. |
| Location | Location of D. melanogaster gene on the arm. |
| Strand | Strand of D. melanogaster gene ('1' indicates the positive strand, '-1' indicates the negative strand). |
| Paralog_FBgn_ID | Current FlyBase identifier (FBgn#) of the paralogous gene. |
| Paralog_GeneSymbol | Current FlyBase gene symbol of the paralogous gene. |
| Paralog_Arm/Scaffold | Arm upon which the paralogous gene is localized. |
| Paralog_Location | Location of paralogous gene on the arm. |
| Paralog_Strand | Strand of paralogous gene ('1' indicates the positive strand, '-1' indicates the negative strand). |
| DIOPT_score | DIOPT 'score' for the paralog call (i.e. the number of individual algorithms that support the call). |
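Because `DIOPT_score` counts the algorithms that support a call, a simple threshold is one way to keep only the better-supported paralog pairs. The sketch below applies such a threshold and also turns the documented strand codes into `+` and `-`. It is an illustration under assumptions: the rows are hand-written dicts with made-up values, the values are treated as strings that may need converting to integers, and the function names are hypothetical.

```python
def strand_symbol(strand):
    """Translate the documented strand code ('1' or '-1') to '+' or '-'."""
    return {"1": "+", "-1": "-"}[str(strand).strip()]


def well_supported(rows, min_score):
    """Keep rows whose DIOPT_score is at least min_score."""
    return [row for row in rows if int(row["DIOPT_score"]) >= min_score]


# Made-up rows for illustration; only the columns used here are included.
rows = [
    {"FBgn_ID": "FBgn0000001", "Strand": "1",
     "Paralog_FBgn_ID": "FBgn0000002", "DIOPT_score": "2"},
    {"FBgn_ID": "FBgn0000001", "Strand": "1",
     "Paralog_FBgn_ID": "FBgn0000003", "DIOPT_score": "9"},
    {"FBgn_ID": "FBgn0000004", "Strand": "-1",
     "Paralog_FBgn_ID": "FBgn0000005", "DIOPT_score": "5"},
]

kept = well_supported(rows, min_score=5)
strands = [strand_symbol(row["Strand"]) for row in kept]
```

The threshold of 5 is arbitrary; the source gives no recommended cutoff.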
Human Orthologs
To download the file, execute the following command.
Human_Orthologs = Homologs.Human_Orthologs()
Columns Description
| Column heading | Content Description |
|---|---|
| Dmel_gene_ID | Current FlyBase identifier (FBgn#) of the D. melanogaster gene. |
| Dmel_gene_symbol | Current FlyBase gene symbol of the D. melanogaster gene. |
| Human_gene_HGNC_ID | HGNC ID of orthologous human gene. |
| Human_gene_OMIM_ID | OMIM ID of orthologous human gene. |
| Human_gene_symbol | HGNC gene symbol of orthologous human gene. |
| DIOPT_score | DIOPT 'score' for orthology call (i.e. the number of individual algorithms that support the call). |
| OMIM_Phenotype_IDs | OMIM Phenotype ID of orthologous human gene (comma separated values). |
| OMIM_Phenotype_IDs[name] | OMIM Phenotype ID of orthologous human gene (with the corresponding OMIM name in square brackets). |
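A fly gene can appear on several rows, one per human ortholog. The following sketch keeps the ortholog with the highest `DIOPT_score` for each fly gene and splits the comma-separated `OMIM_Phenotype_IDs` value into a list. It assumes the table is available as a list of dicts keyed by the column headings above (the source does not state the return type). All values and function names below are invented for illustration.

```python
def best_orthologs(rows):
    """Return {Dmel_gene_ID: row}, keeping the highest-scoring ortholog.

    When two rows tie, the one seen first is kept.
    """
    best = {}
    for row in rows:
        gene = row["Dmel_gene_ID"]
        current = best.get(gene)
        if current is None or int(row["DIOPT_score"]) > int(current["DIOPT_score"]):
            best[gene] = row
    return best


def phenotype_ids(row):
    """Split the comma-separated OMIM_Phenotype_IDs value into a list."""
    value = row.get("OMIM_Phenotype_IDs") or ""
    return [item.strip() for item in value.split(",") if item.strip()]


# Made-up rows; symbols and IDs are not real FlyBase, HGNC or OMIM data.
rows = [
    {"Dmel_gene_ID": "FBgn0000001", "Human_gene_symbol": "HUMA",
     "DIOPT_score": "3", "OMIM_Phenotype_IDs": ""},
    {"Dmel_gene_ID": "FBgn0000001", "Human_gene_symbol": "HUMB",
     "DIOPT_score": "11", "OMIM_Phenotype_IDs": "100001,100002"},
    {"Dmel_gene_ID": "FBgn0000002", "Human_gene_symbol": "HUMC",
     "DIOPT_score": "7", "OMIM_Phenotype_IDs": "100003"},
]

best = best_orthologs(rows)
```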
Human disease
To facilitate its usage, it is suggested to access the data using the following command.
Human_disease = FBD.Human_disease
Human disease model data
To download the file, execute the following command.
Human_disease_model = Human_disease.Disease_model_annotations()
Columns Description
| Column heading | Content Description |
|---|---|
| FBgn ID | Current FlyBase identifier (FBgn#) of the gene associated with the allele of an experimental annotation, or the D. melanogaster ortholog of a human gene associated with a disease in OMIM. |
| Gene symbol | Current FlyBase symbol of the gene in column 1. |
| HGNC ID | HGNC ID of the gene identified in column 1 where it is a human gene (experimental-based annotations only). |
| DO qualifier | Type of association between the object of annotation and the disease - one of 'model of', 'ameliorates', 'exacerbates', 'DOES NOT model', 'DOES NOT ameliorate' or 'DOES NOT exacerbate'. |
| DO ID | Disease Ontology (DO) ID. |
| DO term | Disease Ontology (DO) term. |
| Allele used in model (FBal ID) | Current FlyBase identifier (FBal#) of allele (experimental-based annotations only). |
| Allele used in model (symbol) | Current FlyBase symbol of allele (experimental-based annotations only). |
| Based on orthology with (HGNC ID) | HGNC ID of the human ortholog used for annotations based on orthology to human disease genes. |
| Based on orthology with (symbol) | HGNC gene symbol of the human ortholog used for annotations based on orthology to human disease genes. |
| Evidence/interacting alleles | Evidence code, with interacting allele(s) where appropriate. For experimental-based annotations, the evidence code is one of: 'inferred from mutant phenotype', 'in combination with', 'modeled by', 'is ameliorated by', 'is exacerbated by', 'is NOT ameliorated by' or 'is NOT exacerbated by'. Interacting alleles are given as 'FLYBASE:<allele_symbol>; FB:<FBal_ID>', with multiple alleles separated by a comma. For orthology-based annotations, the evidence code is 'inferred from electronic annotation'. |
| Reference (FBrf ID) | Current FlyBase identifier (FBrf#) of the source publication. |
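The 'Evidence/interacting alleles' column combines an evidence code with zero or more alleles written as `FLYBASE:<allele_symbol>; FB:<FBal_ID>`. The sketch below separates the two with a regular expression. It is based only on the format described in the table; the exact spacing, the assumption that the evidence code precedes the first allele, and the sample string are guesses for illustration, and the function names are hypothetical.

```python
import re

# One interacting allele: 'FLYBASE:<allele_symbol>; FB:<FBal_ID>'
ALLELE_PATTERN = re.compile(r"FLYBASE:(.+?); FB:(FBal\d+)")


def interacting_alleles(evidence):
    """Return a list of (allele_symbol, FBal_ID) pairs found in the text."""
    return ALLELE_PATTERN.findall(evidence)


def evidence_code(evidence):
    """Return the text before the first allele, assumed to be the code."""
    return evidence.split("FLYBASE:", 1)[0].strip()


# Made-up value following the documented pattern.
sample = ("is ameliorated by FLYBASE:geneA[1]; FB:FBal0000001, "
          "FLYBASE:geneB[2]; FB:FBal0000002")

code = evidence_code(sample)
alleles = interacting_alleles(sample)
```

A value with no interacting alleles, such as 'inferred from mutant phenotype', yields an empty list and is returned unchanged as the evidence code.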
Human orthologs
To download the file, execute the following command.
Human_Orthologs = Human_disease.Human_Orthologs()
This file is identical to the Human Orthologs file described in the 'Homologs' section above.
Organisms
Species list
To download the file, execute the following command.
Species = FBD.Organisms.Species_list()
Columns Description
| Column heading | Content Description |
|---|---|
| Genus | The genus designation of the organism. |
| Species name | The species designation of the organism. |
| Abbreviation | The standard FlyBase prefix for the species. This abbreviation is used in FlyBase as the first part of the symbol (before the '\') of any object, e.g. a gene or allele, that originates from this species. This column may be blank, if no individual report page exists for that species in FlyBase. |
| Common name | The NCBI Taxonomy Database common name of the organism. This column may be blank. |
| Ncbi-taxon-id | The NCBI Taxonomy Database Taxon ID for the organism. This column may be blank. |
| drosophilid | If the species is from the family Drosophilidae, this column is filled in with 'y'. |
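The abbreviation is the part of a symbol that precedes the backslash, as in `Scer\GAL4`. A minimal sketch of separating the prefix from the rest of a symbol follows. The function name is hypothetical, and returning `None` for symbols without a prefix is a choice made for this example rather than documented FlyBase behaviour.

```python
def split_species_prefix(symbol):
    """Split 'Abbr\\name' into (abbreviation, name).

    Symbols without a backslash are returned with None as the prefix.
    """
    prefix, separator, rest = symbol.partition("\\")
    if separator:
        return prefix, rest
    return None, symbol


print(split_species_prefix("Scer\\GAL4"))  # ('Scer', 'GAL4')
print(split_species_prefix("w"))           # (None, 'w')
```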
Ontology Terms
To facilitate its usage, it is suggested to access the data using the following command.
Ontology = FBD.Ontology_Terms
Fly anatomy
FBbt = Ontology.FBbt()
Fly development
FBdv = Ontology.FBdv()
FlyBase controlled vocabulary
FBcv = Ontology.FBcv()
Stock ontology
FBsv = Ontology.FBsv()
Gene ontology
GO = Ontology.GO()
Image ontology
FBbi = Ontology.FBbi()
Human disease ontology
DO = Ontology.DO()
Insertions
To facilitate its usage, it is suggested to access the data using the following command.
Insertions = FBD.Insertions
Map data for insertions
To download the file, execute the following command.
map_Insertions = Insertions.Map_insertions()
Columns Description
| Column heading | Content Description |
|---|---|
| insertion_symbol | Current symbol of insertion. |
| FBti# | Current FlyBase identifier (FBti#) of insertion. |
| genomic_location | Genomic location of insertion. |
| range | Range (t/f) indicates whether the genomic location is a range or a single base. |
| orientation | Orientation (1/0) indicates orientation of insertion on chromosome. |
| estimated_cytogenetic_location | Estimated cytogenetic location based on correlation of genomic location and estimated genomic location of cytological bands. |
| observed_cytogenetic_location | Observed cytogenetic location reported in the literature. |
Frequently-used GAL4 drivers
To download the file, execute the following command.
Gal4 = Insertions.GAL4_drivers()
This file lists, in JSON format, all GAL4 drivers that have been curated to at least 21 references and/or are among the 150 most frequently requested GAL4 stocks from the Bloomington Drosophila Stock Center. In addition to the symbols and IDs for Scer\GAL4 alleles, the file includes their associated transposon or insertion, associated gene, expression pattern in controlled vocabulary stage and anatomy terms, stocks, and publications, all with IDs, as well as free text expression pattern descriptions. Apart from publications and stocks, the same data is also available from FlyBase in TSV format.
Clones
To facilitate its usage, it is suggested to access the data using the following command.
Clones = FBD.Clones
cDNAs: FBcl to acc. ID
To download the file, execute the following command.
c_cDNAs = Clones.cDNA_clone_data()
Columns Description
| Column heading | Content Description |
|---|---|
| FBcl# | Current FlyBase identifier (FBcl#) of cDNA clone. |
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the clone. |
| clone_name | Clone name. |
| dataset_metadata_name | Name of dataset associated with clone. |
| cDNA_accession(s) | EMBL/GenBank/DDBJ cDNA accession number. |
| EST_accession(s) | EMBL/GenBank/DDBJ EST accession number. |
Genomic: FBcl to acc. ID
To download the file, execute the following command.
c_genomic = Clones.genomic_clone_data()
Columns Description
| Column heading | Content Description |
|---|---|
| FBcl# | Current FlyBase identifier (FBcl#) of genomic clone. |
| organism_abbreviation | Abbreviation (from the Species Abbreviations list) indicating the species of origin of the clone. |
| clone_name | Clone name. |
| accession | EMBL/GenBank/DDBJ cDNA accession number. |
References
FlyBase FBrf to PubMed ID to PMCID to DOI
To download the file, execute the following command.
References = FBD.References.FBrf_PMid_PMCid_doi()
Columns Description
| Column heading | Content Description |
|---|---|
| FBrf | The unique FlyBase ID for this publication. |
| PMID | The unique PubMed ID for this publication. |
| PMCID | The unique PubMed Central ID for this publication, if applicable. |
| DOI | The digital object identifier assigned to the publication. |
| pub_type | The publication type (for example, paper, review, erratum, abstract, or book). |
| miniref | A short citation listing the first author, year of publication, journal, volume, issue and page numbers. |
| pmid_added | The FlyBase release in which the publication was first incorporated into the FlyBase bibliography. Note: as this report was first generated for the fb_2012_01 release, all publications associated with a PubMed ID prior to this release have pmid_added = fb_2011_10. |
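Release labels such as `fb_2012_01` and `fb_2011_10` can be compared once they are split into a year and a release number. The sketch below assumes every `pmid_added` value follows the `fb_<year>_<number>` pattern of those two examples; the function name `parse_release` is hypothetical.

```python
def parse_release(release):
    """Turn 'fb_2012_01' into (2012, 1) so releases sort chronologically."""
    prefix, year, number = release.split("_")
    if prefix != "fb":
        raise ValueError(f"unexpected release label: {release!r}")
    return int(year), int(number)


first_report = parse_release("fb_2012_01")
backfilled = parse_release("fb_2011_10")

# Tuples compare element by element, so the earlier release sorts first.
print(backfilled < first_report)  # True
```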
Author:
Javiera Quiroz, email.