Python implementation of the Coreference-based Graph Search (CGS) algorithm.
Project description
Coreference-based Graph Search (CGS)
This is the Python implementation of the CGS algorithm.
Documentation
The documentation for pycgs is available on ShennongDoc, the documentation website of ShennongAlpha.
To contribute to the documentation, submit a pull request to the ShennongDoc GitHub repository.
Foundational CGS
from pycgs import cgs
relationships = [('A', 'B'), ('B', 'C'), ('D', 'B'), ('E', 'F')]
primary_terms = cgs.foundational_cgs(relationships)
print(primary_terms)
# Output:
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'F', 'F': 'F'}
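The example above suggests that each pair `(source, target)` means "source refers to target", and that a term's primary term is the terminal term reached by following those links. The following is a minimal sketch under that assumption, not pycgs's actual implementation; the function name `follow_links_to_primary` is hypothetical, and the sketch assumes each term has at most one outgoing link.

```python
def follow_links_to_primary(relationships):
    """Map every term to the terminal term reached by following its links."""
    next_term = {}  # source -> target
    terms = {}      # every term seen, in first-seen order
    for source, target in relationships:
        # Assumption: if a term has several outgoing links, keep the first one.
        next_term.setdefault(source, target)
        terms[source] = None
        terms[target] = None

    primary = {}
    for term in terms:
        seen = {term}
        current = term
        # Walk forward until a term with no outgoing link is reached.
        # The `seen` set stops the walk if the links form a cycle.
        while current in next_term and next_term[current] not in seen:
            current = next_term[current]
            seen.add(current)
        primary[term] = current
    return primary


relationships = [('A', 'B'), ('B', 'C'), ('D', 'B'), ('E', 'F')]
print(follow_links_to_primary(relationships))
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'F', 'F': 'F'}
```

On this input the sketch reproduces the output shown above; how pycgs itself treats cycles or branching terms is not stated here.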
Weighted CGS
from pycgs import cgs
weighted_relationships = [('A', 'B', 1), ('B', 'C', 2), ('D', 'B', 1), ('B', 'E', 1)]
primary_terms = cgs.weighted_cgs(weighted_relationships)
print(primary_terms)
# Output:
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'E'}
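In the weighted example, `B` links to both `C` (weight 2) and `E` (weight 1), and the output resolves `B` to `C`. One reading consistent with that output is that each term follows its highest-weight outgoing link. The sketch below illustrates that reading only; it is not pycgs's implementation, the name `follow_heaviest_links` is hypothetical, and tie-breaking (here, the first link seen wins) is an assumption.

```python
def follow_heaviest_links(weighted_relationships):
    """Map every term to the terminal term reached via its heaviest links."""
    best_link = {}  # source -> (target, weight) of the heaviest outgoing link
    terms = {}      # every term seen, in first-seen order
    for source, target, weight in weighted_relationships:
        terms[source] = None
        terms[target] = None
        # Keep only the heaviest link per source; ties keep the earlier link.
        if source not in best_link or weight > best_link[source][1]:
            best_link[source] = (target, weight)

    primary = {}
    for term in terms:
        seen = {term}
        current = term
        # Follow the heaviest link until a term with no outgoing link is
        # reached; the `seen` set guards against cycles.
        while current in best_link and best_link[current][0] not in seen:
            current = best_link[current][0]
            seen.add(current)
        primary[term] = current
    return primary


weighted_relationships = [('A', 'B', 1), ('B', 'C', 2), ('D', 'B', 1), ('B', 'E', 1)]
print(follow_heaviest_links(weighted_relationships))
# {'A': 'C', 'B': 'C', 'C': 'C', 'D': 'C', 'E': 'E'}
```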
PrimaryTermExtractor
PrimaryTermExtractor is a class that extracts primary terms from a text, using a dictionary that maps each term to its primary term.
from pycgs.cgs import PrimaryTermExtractor
# Create a dictionary mapping terms to their primary terms
primary_term_dict = {
    "Artemisia annua Part-aerial": "nmm-0001",
    "Qing-hao": "nmm-0001",
    "黄花蒿地上部": "nmm-0001",
    "青蒿": "nmm-0001",
    "Ephedra sinica Stem-herbaceous": "nmm-0003",
    "Cao-ma-huang": "nmm-0003",
    "草麻黄草质茎": "nmm-0003",
    "草麻黄": "nmm-0003",
}
# Initialize the PrimaryTermExtractor
extractor = PrimaryTermExtractor(primary_term_dict)
# Extract primary terms from a mixed-language text
text = "Both Artemisia annua Part-aerial and 草麻黄草质茎 are Natural Medicinal Materials and are used in traditional Chinese medicine."
result = extractor.extract_primary_terms(text)
print(result)
# Output:
# {'Artemisia annua Part-aerial': 'nmm-0001', '草麻黄草质茎': 'nmm-0003'}
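Note that the text contains 草麻黄草质茎, which begins with the shorter dictionary term 草麻黄, yet only the longer term appears in the output. That is consistent with longest-match-first scanning. The sketch below shows one way to get that behaviour with the standard `re` module; it is an illustration under that assumption, not the way `PrimaryTermExtractor` is actually implemented, and the name `longest_match_extract` is hypothetical.

```python
import re


def longest_match_extract(text, primary_term_dict):
    """Return {matched term: primary term} for dictionary terms found in text."""
    if not primary_term_dict:
        return {}
    # Regex alternation tries alternatives left to right, so putting longer
    # terms first makes a longer term win over any shorter term it starts with.
    ordered_terms = sorted(primary_term_dict, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered_terms))
    return {
        match.group(0): primary_term_dict[match.group(0)]
        for match in pattern.finditer(text)
    }


terms = {
    "Artemisia annua Part-aerial": "nmm-0001",
    "草麻黄草质茎": "nmm-0003",
    "草麻黄": "nmm-0003",
}
text = "Both Artemisia annua Part-aerial and 草麻黄草质茎 are Natural Medicinal Materials."
print(longest_match_extract(text, terms))
# {'Artemisia annua Part-aerial': 'nmm-0001', '草麻黄草质茎': 'nmm-0003'}
```

A regex alternation is simple but may slow down for very large dictionaries; a trie or Aho-Corasick automaton would be a common alternative in that case.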
Cite this work
@misc{yang2024shennongalphaaidrivensharingcollaboration,
    title={ShennongAlpha: an AI-driven sharing and collaboration platform for intelligent curation, acquisition, and translation of natural medicinal material knowledge},
    author={Zijie Yang and Yongjing Yin and Chaojun Kong and Tiange Chi and Wufan Tao and Yue Zhang and Tian Xu},
    year={2024},
    eprint={2401.00020},
    archivePrefix={arXiv},
    primaryClass={cs.AI},
    url={https://arxiv.org/abs/2401.00020},
}
Download files
File details
Details for the file pycgs-1.1.0.tar.gz.
File metadata
- Download URL: pycgs-1.1.0.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.5 Darwin/24.0.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | deb6e125bd68eb79e4a8509c0844f2922f5742f43028b70731f644c1ae3bffa3 |
| MD5 | a1e25cdb438a8dedcbc710838e9f7004 |
| BLAKE2b-256 | 8f7b410e37b7e70d7fd258e008446344d34b86067ac8397620754ffb31956fd1 |
File details
Details for the file pycgs-1.1.0-py3-none-any.whl.
File metadata
- Download URL: pycgs-1.1.0-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.5 Darwin/24.0.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5867df3a0474f1f7133c879c9e0eb04fe18afb3abe3975842d99747c1843992d |
| MD5 | 6aa2825374d2e932a0daa4afcc5419d3 |
| BLAKE2b-256 | 0c4421c7113278b368d89ff72f78dcb795b4da626d63cecdb888f27a53a5e2a8 |
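The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard `hashlib` module; the helper names `sha256_of_file` and `matches_digest` are hypothetical, and the local file path is whatever location you downloaded the file to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_digest("pycgs-1.1.0.tar.gz", "deb6e125bd68eb79e4a8509c0844f2922f5742f43028b70731f644c1ae3bffa3")` should return `True` for an unmodified copy of the source distribution saved under that name in the current directory.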