
A Python package for integrating data from multiple resources


pyBioDataFuse


💪 Getting Started

BioDataFuse is a query-based Python tool for integrating biomedical databases. It provides a modular framework for data wrangling that supports context-specific knowledge graph creation and graph-based analyses, and its user-friendly interface lets users build knowledge graphs dynamically from their own input data. The underlying Python package, pyBiodatafuse, harmonizes data by aggregating diverse sources through modular queries. BioDataFuse also provides plugin capabilities for Cytoscape and Neo4j, allowing local graph hosting. Ongoing work extends the utility of the graphs through tasks such as link prediction.

To learn more about the package, read our documentation.

Creating your own graph

To generate your own graph, check out our tutorial notebook in examples.

Graphs can be exported to Cytoscape, Neo4j, and GraphDB. You can use the following functions:

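# Import paths are assumed from the package layout; check the documentation for your installed version
from pyBiodatafuse.graph import cytoscape, neo4j
from pyBiodatafuse.graph.rdf import BDFGraph

# on neo4j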
neo4j.load_graph(pygraph, uri="bolt://localhost:7687", username="YOUR_USERNAME", password="YOUR_PASSWORD")  # change username and password

# on cytoscape
cytoscape.load_graph(pygraph, network_name="YOUR_CUSTOM_NAME")

# rdf ttl files
bdf = BDFGraph(
    base_uri="https://biodatafuse.org/YOUR_CUSTOM_NAME/",
    version_iri="https://biodatafuse.org/example/YOUR_CUSTOM_NAME.ttl",
    orcid="YOUR_ORCID",
    author="YOUR_NAME",
)

bdf.generate_rdf(combined_df, combined_metadata)  # Generate the RDF from the (meta)data files from the example runs
bdf.serialize(
    "YOUR_CUSTOM_NAME.ttl",
    format="ttl",
)
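The Neo4j call above takes the username and password as plain arguments. One way to keep them out of notebooks and scripts is to read them from the environment first. The following is a minimal sketch under that assumption: the helper `neo4j_settings` and the variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` are hypothetical and are not part of pyBiodatafuse.

```python
import os
from typing import Dict, Mapping, Optional


def neo4j_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect Neo4j connection settings from environment variables.

    The variable names used here are assumptions made for this sketch;
    pyBiodatafuse itself does not read them.
    """
    env = os.environ if env is None else env
    required = ("NEO4J_USERNAME", "NEO4J_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        # Fall back to the local Bolt address used in the example above
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": env["NEO4J_USERNAME"],
        "password": env["NEO4J_PASSWORD"],
    }
```

The returned dictionary uses the same keyword names as the `neo4j.load_graph` call above, so it could be passed on as `neo4j.load_graph(pygraph, **neo4j_settings())`.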

🚀 Installation

The most recent release can be installed from PyPI with:

$ pip install pyBiodatafuse

The most recent code and data can be installed directly from GitHub with:

$ pip install git+https://github.com/BioDataFuse/pyBiodatafuse.git
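To confirm which version ended up in the active environment after either install route, a small check along these lines may help. It relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of pyBiodatafuse.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "pyBiodatafuse") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "pyBiodatafuse is not installed")
```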

👐 Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

👋 Attribution

⚖️ License

The code in this package is licensed under the MIT License.

📖 Citation

The work started as part of the ELIXIR BioHackathon 2023, bringing together multiple Core Data Resources.

Gadiya, Y., Ammar, A., Willighagen, E., Martinat, D., Sima, A. C., Balci, H., & Abbassi Daloii, T. (2023). BioHackEU23 report: Extending interoperability of experimental data using modular queries across biomedical resources. BioHackrXiv Preprints. https://doi.org/10.37044/osf.io/mhsqp

🍪 Cookiecutter

This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.

🛠️ For Developers


This final section of the README is for those who want to get involved by making a code contribution.

Development Installation

To install in development mode, use the following:

$ git clone https://github.com/BioDataFuse/pyBiodatafuse.git
$ cd pyBiodatafuse
$ pip install -e .

🥼 Testing

After cloning the repository and installing tox with pip install tox, the unit tests in the tests/ folder can be run reproducibly with:

$ tox

Additionally, these tests are automatically re-run with each commit in a GitHub Action.

📖 Building the Documentation

The documentation can be built locally using the following:

$ git clone https://github.com/BioDataFuse/pyBiodatafuse.git
$ cd pyBiodatafuse
$ tox -e docs
$ open docs/build/html/index.html

The documentation build automatically installs the package as well as the docs extra specified in setup.cfg. Sphinx plugins like texext can be added there. Additionally, they need to be added to the extensions list in docs/source/conf.py.

📦 Making a Release

After installing the package in development mode and installing tox with pip install tox, the commands for making a new release are contained within the finish environment in tox.ini. Run the following from the shell:

$ tox -e finish

This script does the following:

  1. Uses Bump2Version to switch the version number in the setup.cfg, src/pyBiodatafuse/version.py, and docs/source/conf.py to not have the -dev suffix
  2. Packages the code in both a tar archive and a wheel using build
  3. Uploads to PyPI using twine. Be sure to have a .pypirc file configured to avoid the need for manual input at this step
  4. Pushes to GitHub. You'll need to make a release that goes with the commit where the version was bumped.
  5. Bumps the version to the next patch. If you made big changes and want to bump the version by minor, you can run tox -e bumpversion -- minor afterwards.
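The version arithmetic in steps 1 and 5 can be illustrated with a short sketch. This is not how Bump2Version is implemented, and the real behaviour depends on the project's bump configuration; the functions below are hypothetical and assume version strings of the form MAJOR.MINOR.PATCH with an optional -dev suffix.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int, bool]:
    """Split 'MAJOR.MINOR.PATCH' or 'MAJOR.MINOR.PATCH-dev' into its parts."""
    is_dev = text.endswith("-dev")
    core = text[: -len("-dev")] if is_dev else text
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch, is_dev


def release_version(text: str) -> str:
    """Step 1: drop the -dev suffix to obtain the release version."""
    major, minor, patch, _ = parse_version(text)
    return f"{major}.{minor}.{patch}"


def next_dev_version(text: str, part: str = "patch") -> str:
    """Step 5: bump the chosen part and mark the result as a development version."""
    major, minor, patch, _ = parse_version(text)
    if part == "patch":
        patch += 1
    elif part == "minor":
        # A minor bump resets the patch number
        minor, patch = minor + 1, 0
    else:
        raise ValueError(f"unsupported part: {part!r}")
    return f"{major}.{minor}.{patch}-dev"


if __name__ == "__main__":
    print(release_version("1.3.0-dev"))         # 1.3.0
    print(next_dev_version("1.3.0"))            # 1.3.1-dev
    print(next_dev_version("1.3.1-dev", "minor"))  # 1.4.0-dev
```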
