A Python package for generating knowledge graphs from textual inputs.
eKnowledge
eKnowledge is a Python package designed to facilitate the generation of knowledge graphs from textual inputs using various language models. The package uses natural language processing to extract relationships and constructs from code snippets or other structured text.
Installation
To install eKnowledge, you can use pip:
pip install eknowledge
Usage
eKnowledge supports various language models for processing input text, including locally hosted models and remote API-based models. Below is an example using the ChatOllama model, which requires a model downloaded locally from ollama.com. The recommended model for local usage is "codestral:22b-v0.1-q2_K".
Example with Local Language Model (ChatOllama)
from eknowledge import execute_graph_generation
from langchain_community.chat_models import ChatOllama
from langchain_huggingface import HuggingFaceEmbeddings
# Configure the language model
MISTRAL_MODEL = "codestral:22b-v0.1-q2_K"
MAX_TOKENS = 1500
# Initialize the model with the desired configuration
llm = ChatOllama(model=MISTRAL_MODEL, max_tokens=MAX_TOKENS)
# Sample Python code to process
input_text = """
def factorial(x):
    if x == 1:
        return 1
    else:
        return (x * factorial(x-1))

num = 3
print("The factorial of", num, "is", factorial(num))
"""
# Generate the knowledge graph
embed = HuggingFaceEmbeddings()
graph = execute_graph_generation(input_text, llm, embed, max_attempts=1)
print(graph)
# Output:
# [
#     {'from_node': 'factorial', 'relation': 'depends_on', 'to_node': 'x'},
#     {'from_node': 'factorial', 'relation': 'composed_of', 'to_node': 'factorial(x-1)'},
#     {'from_node': 'num', 'relation': 'is_a', 'to_node': '3'},
#     {'from_node': 'factorial(num)', 'relation': 'used_for', 'to_node': 'print function'}
# ]
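Because the result is a plain list of edge dictionaries with 'from_node', 'relation', and 'to_node' keys (as in the output above), it can be post-processed with ordinary Python. A minimal sketch that prints the extracted edges as triples:

# Iterate over the returned edge list and print each edge as a readable triple.
for edge in graph:
    print(f"{edge['from_node']} --{edge['relation']}--> {edge['to_node']}")
# e.g. factorial --depends_on--> x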
Input Parameters
The execute_graph_generation function takes several parameters to configure the generation of the knowledge graph. Here are the details of each parameter:
- text (str): The input text from which the knowledge graph is generated. This should be a string containing the textual content or code snippet you want to analyze.
- language_model (ChatOllama or another compatible model instance): The language model used to process the input text. This model should be capable of understanding and analyzing the structure of the provided text.
- embedding_model (HuggingFaceEmbeddings instance): The model used to create embeddings for the text, which facilitate retrieval of related text segments and enhance the language model's understanding of the context.
- relations (dict): A dictionary defining the possible relations between nodes in the generated knowledge graph. Each key-value pair specifies a relation type and its associated properties or rules.
- max_attempts (int): The maximum number of iterative attempts the function will make to generate the knowledge graph before stopping. This parameter helps control execution time and resources.
- process_chunk_size (int): The size of the chunks the text is split into for processing. Smaller chunks may yield more detailed analysis at the cost of higher computational overhead.
- embedding_chunk_size (int): The size of the text chunks used specifically for generating embeddings. This size can affect the granularity of the embeddings and thus the detail of the retrieved context.
These parameters allow you to fine-tune the knowledge graph generation process to the specific needs or constraints of your application; the sketch below shows a fuller call.
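For reference, here is a sketch of a call that sets each of the parameters above explicitly. The keyword names follow the parameter list; the relations dictionary is a hypothetical illustration (its exact schema is not documented here), and the chunk sizes are arbitrary example values.

from eknowledge import execute_graph_generation
from langchain_community.chat_models import ChatOllama
from langchain_huggingface import HuggingFaceEmbeddings

llm = ChatOllama(model="codestral:22b-v0.1-q2_K", max_tokens=1500)
embed = HuggingFaceEmbeddings()

# Hypothetical relations dictionary: keys name the relation types and the
# values describe them; the real schema expected by the package may differ.
relations = {
    "depends_on": "one entity requires another",
    "composed_of": "one entity contains or is built from another",
    "is_a": "one entity is an instance or kind of another",
    "used_for": "one entity serves as input to another",
}

graph = execute_graph_generation(
    "def add(a, b): return a + b",  # text: the snippet to analyze
    llm,                            # language_model
    embed,                          # embedding_model
    relations=relations,            # assumed keyword name, per the list above
    max_attempts=2,
    process_chunk_size=512,         # example values; tune for your input
    embedding_chunk_size=256,
)
print(graph)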
Features
- Supports multiple language models, including remote API-based and locally hosted models.
- Extracts structured knowledge from unstructured text.
- Can be used in various domains like academic research, software development, and data science.
Contributing
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
License
This project is licensed under the MIT License.
The codestral model is licensed under the Mistral AI Non-Production License.