
This package provides an interface to the AlphaFold Protein Structure Database. It downloads entry metadata and AlphaFold files (e.g. mmCIF, PAE, PDB).


AlphaFetcher

AlphaFetcher facilitates fetching and downloading protein metadata and related files from the AlphaFold Protein Structure Database using Uniprot access codes.


🌟 Features

  • Batch Import: Input single or multiple Uniprot access codes seamlessly.

  • Parallel Processing: Efficiently fetch metadata using multithreading.

  • Flexible Downloads: Choose among various file types - PDB, CIF, BCIF, PAE image, and PAE data files.

  • Optimal Performance: Easily adjust the number of workers for threaded tasks.
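
To illustrate the batch import feature, here is a minimal sketch of how a single code or a list of codes could be normalized into one list. The helper name normalize_codes is hypothetical, and the upper-casing and de-duplication steps are assumptions made for the example, not documented AlphaFetcher behavior.

```python
from typing import List, Union


def normalize_codes(proteins: Union[str, List[str]]) -> List[str]:
    """Return a de-duplicated list of access codes, preserving input order."""
    # A bare string is treated as one code, not as an iterable of characters.
    if isinstance(proteins, str):
        proteins = [proteins]
    seen = set()
    result = []
    for code in proteins:
        # Assumption for this sketch: codes are compared case-insensitively.
        code = code.strip().upper()
        if code and code not in seen:
            seen.add(code)
            result.append(code)
    return result
```

Handling the string case first matters because iterating over a Python string yields single characters rather than one access code.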


🔧 Installation

We recommend PyPI installation:

pip install alphafetcher

💡 Usage

from alphafetcher import AlphaFetcher

# Instantiate the fetcher
# The base_savedir parameter allows you to set a base directory where files will be saved.
# Inside this directory, two separate directories for pdb and cif files will be created.
fetcher = AlphaFetcher(base_savedir="my_savedir")

# Add desired Uniprot access codes
fetcher.add_proteins(["A1KXE4", "H0YL14", "B2RXH2", "A8MVW5"])

# Retrieve metadata
fetcher.fetch_metadata(multithread=True, workers=4)
# Metadata available at fetcher.metadata_dict

# Commence download of specified files
fetcher.download_all_files(pdb=True, cif=True, multithread=True, workers=4)

📜 Documentation

Initialization

  • AlphaFetcher(base_savedir: str)
    • Description: Initialize the fetcher with a base save directory. The base_savedir is where the downloaded pdb and cif files will be stored. Inside this directory, two subdirectories will be automatically created: one for pdb files and another for cif files.
    • Parameters:
      • base_savedir: The base directory where the pdb and cif files will be saved.
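
The directory layout described above can be sketched with pathlib. This is only an illustration: the function name prepare_save_dirs and the subdirectory names pdb_files and cif_files are assumptions, and the names AlphaFetcher actually creates may differ.

```python
from pathlib import Path
from typing import Dict


def prepare_save_dirs(base_savedir: str) -> Dict[str, Path]:
    """Create one subdirectory per file type under the base directory."""
    base = Path(base_savedir)
    # Hypothetical subdirectory names, chosen for this example only.
    dirs = {"pdb": base / "pdb_files", "cif": base / "cif_files"}
    for path in dirs.values():
        # parents=True also creates the base directory if it is missing;
        # exist_ok=True makes repeated calls safe.
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```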

Methods

  • add_proteins(proteins: Union[str, List[str]])

    • Description: Add the provided Uniprot access codes for fetching. Either a single string or a list of strings is accepted.
  • fetch_metadata(multithread: bool = False, workers: int = 10)

    • Description: Fetches the metadata for the added Uniprot access codes. This metadata is needed to download the relevant files and is stored in the metadata_dict attribute (fetcher.metadata_dict in the usage example above).
  • download_all_files(uniprot_access: str, pdb: bool = False, cif: bool = False, bcif: bool = False, pae_image: bool = False, pae_data: bool = False)

    • Description: Initiates download for the specified file types linked to the given Uniprot codes.
    • Select the file types to download by setting the corresponding parameters to True.
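
As a rough illustration of how boolean flags can select which files to download, the sketch below maps each flag to a URL field in a metadata record. The helper select_urls is hypothetical, and the field names (pdbUrl, cifUrl, bcifUrl, paeImageUrl, paeDocUrl) are an assumption about the metadata layout; inspect fetcher.metadata_dict to confirm the actual keys.

```python
from typing import Dict, List

# Assumed mapping from download flags to URL fields in a metadata record.
FLAG_TO_FIELD = {
    "pdb": "pdbUrl",
    "cif": "cifUrl",
    "bcif": "bcifUrl",
    "pae_image": "paeImageUrl",
    "pae_data": "paeDocUrl",
}


def select_urls(record: Dict[str, str], **flags: bool) -> List[str]:
    """Return the URLs in `record` whose matching flag is set to True."""
    urls = []
    for flag, field in FLAG_TO_FIELD.items():
        # Skip file types that were not requested or are absent from the record.
        if flags.get(flag, False) and field in record:
            urls.append(record[field])
    return urls


# Made-up record for demonstration; these are not real AlphaFold URLs.
example_record = {
    "pdbUrl": "https://example.org/entry.pdb",
    "cifUrl": "https://example.org/entry.cif",
}
selected = select_urls(example_record, pdb=True, cif=True)
```

Since every flag defaults to False, nothing is selected unless a file type is requested explicitly.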

For a comprehensive guide, users are encouraged to view the docstrings incorporated within the source code.


⚠️ Limitations

Always respect the AlphaFold Protein Structure Database terms of service, and avoid flooding it with excessive concurrent requests. Consider lowering the number of workers to reduce the request rate.
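
To show why the number of workers bounds the load on the server, here is a small sketch using the standard library's ThreadPoolExecutor. It is not AlphaFetcher's implementation: fetch_all and ConcurrencyProbe are hypothetical names, and the probe replaces the network call with a short sleep so the example runs offline.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_all(codes: List[str], fetch_one: Callable[[str], dict],
              workers: int = 4) -> Dict[str, dict]:
    """Run fetch_one for each code with at most `workers` calls in flight."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with codes.
        results = list(pool.map(fetch_one, codes))
    return dict(zip(codes, results))


class ConcurrencyProbe:
    """Stand-in for a network request that records peak concurrency."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __call__(self, code: str) -> dict:
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # simulated round trip, no real request is made
        with self._lock:
            self._active -= 1
        return {"code": code}


probe = ConcurrencyProbe()
metadata = fetch_all(["A1KXE4", "H0YL14", "B2RXH2", "A8MVW5"], probe, workers=2)
```

After the run, probe.peak never exceeds the workers value, which is the property to rely on when tuning request density.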


🙌 Contributing

We welcome your contributions! To collaborate:

  1. Fork this repository.
  2. Commit your changes.
  3. Open a pull request with your updates.

📖 Authors and Acknowledgment


📄 License

This project is licensed under the GNU General Public License v3 (GPLv3).

