Python implementation of the Coreference-based Graph Search (CGS) algorithm.
Coreference-based Graph Search (CGS)
pycgs is the Python implementation of the CGS algorithm, which maps each term in a set of coreference relationships to its primary term.
Documentation
The documentation for pycgs is available on ShennongDoc, the documentation website of ShennongAlpha.
You can also contribute to the documentation by submitting a pull request to the ShennongDoc GitHub repository.
Foundational CGS
from pycgs import cgs
relationships = [('A', 'B'), ('B', 'C'), ('D', 'B'), ('E', 'F')]
primary_terms = cgs.foundational_cgs(relationships)
print(primary_terms)
# Output:
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'F', 'F': 'F'}
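The output above is consistent with reading each pair as a directed edge from a term to the term it refers to, and resolving every term to the terminal node of its chain. The following is a minimal sketch of that reading only, not the pycgs implementation; the function name `resolve_primary_terms` and the one-outgoing-edge-per-term assumption are hypothetical.

```python
def resolve_primary_terms(relationships):
    """Map every term to the terminal term reached by following its edges.

    Assumption (not taken from pycgs): each (source, target) pair means
    "source refers to target", and a term has at most one outgoing edge.
    """
    successor = {}
    terms = set()
    for source, target in relationships:
        successor[source] = target
        terms.update((source, target))

    def follow(term):
        # Walk the chain until a term with no outgoing edge is reached.
        # The `seen` set stops the walk if the input contains a cycle.
        seen = set()
        while term in successor and term not in seen:
            seen.add(term)
            term = successor[term]
        return term

    return {term: follow(term) for term in sorted(terms)}


relationships = [('A', 'B'), ('B', 'C'), ('D', 'B'), ('E', 'F')]
print(resolve_primary_terms(relationships))
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'F', 'F': 'F'}
```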
Weighted CGS
from pycgs import cgs
weighted_relationships = [('A', 'B', 1), ('B', 'C', 2), ('D', 'B', 1), ('B', 'E', 1)]
primary_terms = cgs.weighted_cgs(weighted_relationships)
print(primary_terms)
# Output:
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'E'}
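In this example B has two outgoing edges (to C with weight 2 and to E with weight 1), and the output resolves B to C. One reading that reproduces the output is to keep only the highest-weight outgoing edge of each term before following the chain. The sketch below illustrates that reading under stated assumptions; it is not the pycgs implementation, `resolve_weighted_primary_terms` is a hypothetical name, and the tie-breaking rule (first edge seen wins) is an arbitrary choice.

```python
def resolve_weighted_primary_terms(weighted_relationships):
    """Resolve terms by following each term's highest-weight outgoing edge.

    Assumption (not taken from pycgs): a higher weight means a stronger
    preference, and ties keep the edge that was seen first.
    """
    best_edge = {}  # source -> (weight, target)
    terms = set()
    for source, target, weight in weighted_relationships:
        terms.update((source, target))
        if source not in best_edge or weight > best_edge[source][0]:
            best_edge[source] = (weight, target)

    def follow(term):
        # Follow preferred edges to a terminal term; stop on a cycle.
        seen = set()
        while term in best_edge and term not in seen:
            seen.add(term)
            term = best_edge[term][1]
        return term

    return {term: follow(term) for term in sorted(terms)}


weighted = [('A', 'B', 1), ('B', 'C', 2), ('D', 'B', 1), ('B', 'E', 1)]
print(resolve_weighted_primary_terms(weighted))
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'E'}
```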
PrimaryTermExtractor
PrimaryTermExtractor is a class that extracts primary terms from a text, using a dictionary that maps each term to its primary term.
from pycgs.cgs import PrimaryTermExtractor
# Create a dictionary mapping terms to their primary terms
primary_term_dict = {
    "Artemisia annua Part-aerial": "nmm-0001",
    "Qing-hao": "nmm-0001",
    "黄花蒿地上部": "nmm-0001",
    "青蒿": "nmm-0001",
    "Ephedra sinica Stem-herbaceous": "nmm-0003",
    "Cao-ma-huang": "nmm-0003",
    "草麻黄草质茎": "nmm-0003",
    "草麻黄": "nmm-0003",
}
# Initialize the PrimaryTermExtractor
extractor = PrimaryTermExtractor(primary_term_dict)
# Extract primary terms from a mixed language text
text = "Both Artemisia annua Part-aerial and 草麻黄草质茎 are Natural Medicinal Materials and are used in traditional Chinese medicine."
result = extractor.extract_primary_terms(text)
print(result)
# Output:
# {'Artemisia annua Part-aerial': 'nmm-0001', '草麻黄草质茎': 'nmm-0003'}
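Note that the result contains 草麻黄草质茎 but not its substring 草麻黄, even though both are dictionary keys. A longest-match-first scan would produce that behaviour. The sketch below shows one such scan for illustration; how PrimaryTermExtractor matches terms internally is not stated here, and `extract_terms` is a hypothetical helper, not part of pycgs.

```python
def extract_terms(text, primary_term_dict):
    """Find dictionary terms in text, preferring longer terms on overlap.

    Illustrative only: a plain substring scan, longest terms first.
    """
    matches = []  # (start, end, term) spans that have been accepted
    for term in sorted(primary_term_dict, key=len, reverse=True):
        start = text.find(term)
        while start != -1:
            end = start + len(term)
            # Accept the span only if it overlaps no accepted span.
            if all(end <= s or start >= e for s, e, _ in matches):
                matches.append((start, end, term))
            start = text.find(term, start + 1)
    matches.sort()  # report terms in order of appearance
    return {term: primary_term_dict[term] for _, _, term in matches}


terms = {
    "Artemisia annua Part-aerial": "nmm-0001",
    "草麻黄草质茎": "nmm-0003",
    "草麻黄": "nmm-0003",
}
text = "Both Artemisia annua Part-aerial and 草麻黄草质茎 are used."
print(extract_terms(text, terms))
# {'Artemisia annua Part-aerial': 'nmm-0001', '草麻黄草质茎': 'nmm-0003'}
```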
Cite this work
@misc{yang2024shennongalphaaidrivensharingcollaboration,
    title={ShennongAlpha: an AI-driven sharing and collaboration platform for intelligent curation, acquisition, and translation of natural medicinal material knowledge},
    author={Zijie Yang and Yongjing Yin and Chaojun Kong and Tiange Chi and Wufan Tao and Yue Zhang and Tian Xu},
    year={2024},
    eprint={2401.00020},
    archivePrefix={arXiv},
    primaryClass={cs.AI},
    url={https://arxiv.org/abs/2401.00020},
}
Download files
Source Distribution
pycgs-1.1.0.tar.gz (8.1 kB)
Built Distribution
pycgs-1.1.0-py3-none-any.whl (9.3 kB)
File details
Details for the file pycgs-1.1.0.tar.gz.
File metadata
- Download URL: pycgs-1.1.0.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.5 Darwin/24.0.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | deb6e125bd68eb79e4a8509c0844f2922f5742f43028b70731f644c1ae3bffa3
MD5 | a1e25cdb438a8dedcbc710838e9f7004
BLAKE2b-256 | 8f7b410e37b7e70d7fd258e008446344d34b86067ac8397620754ffb31956fd1
File details
Details for the file pycgs-1.1.0-py3-none-any.whl.
File metadata
- Download URL: pycgs-1.1.0-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.5 Darwin/24.0.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5867df3a0474f1f7133c879c9e0eb04fe18afb3abe3975842d99747c1843992d
MD5 | 6aa2825374d2e932a0daa4afcc5419d3
BLAKE2b-256 | 0c4421c7113278b368d89ff72f78dcb795b4da626d63cecdb888f27a53a5e2a8