Python package for ESCARGOT that enables solving elaborate problems with LLMs and Knowledge Graphs

ESCARGOT

Overview

LLMs such as GPT-4 often hallucinate and struggle to integrate external knowledge. Retrieval-Augmented Generation (RAG) tries to address this by supplying external information, but it is limited by context length and by imprecise vector similarity search. ESCARGOT addresses these issues by combining LLMs with a dynamic Graph of Thoughts and knowledge graphs, which improves output reliability and reduces hallucinations. In the results below, ESCARGOT scores higher than a standard RAG setup on every dataset, with the largest gap on open-ended questions that demand high precision. ESCARGOT also makes its reasoning process more transparent: both the code it generates and its knowledge requests can be vetted, in contrast to the black-box nature of LLM-only or RAG-based approaches.

Results

Dataset                                 GPT 3.5 Turbo   Standard RAG   ESCARGOT
Open-ended 1-hop (508 questions)        3.3%            50.2%          81.0%
Open-ended 2-hop (450 questions)        3.5%            12.8%          91.8%
True/False 1-hop (560 questions)        55.9%           73.0%          80.7%
True/False 2-hop (540 questions)        26.7%           64.4%          77.6%
Multiple Choice 1-hop (498 questions)   42.6%           77.7%          94.6%
Multiple Choice 2-hop (419 questions)   49.9%           81.9%          94.2%

Key Features

  1. Dynamic GoT Generation: ESCARGOT dynamically generates a Python-executable Graph of Thoughts (GoT) that integrates with knowledge graphs. This dynamic approach ensures improved accuracy and contextual relevance compared to static GoT frameworks.

  2. Strategic Planning and Execution: ESCARGOT's workflow includes strategy generation, assessment, code generation, XML conversion, and execution. This multi-step process ensures that each strategy is thoroughly evaluated and executed efficiently.

  3. Advanced Knowledge Retrieval:

    • Cypher Queries: ESCARGOT uses Cypher queries to extract structured, precise information from knowledge graphs stored in graph databases such as Memgraph.
    • Vector Database Requests: As a backup, ESCARGOT runs similarity searches against a vector database, so it can still handle a query when the Cypher query fails.
  4. Direct Python Execution: The system supports direct Python execution for tasks requiring high precision. By converting knowledge into executable Python code, ESCARGOT ensures reliable and efficient computation, reducing the risk of errors and hallucinations.

  5. Self-Debugging and Adaptability: ESCARGOT includes built-in self-debugging capabilities. It can autonomously analyze and revise code if errors occur during execution, ensuring resilience and reducing the need for manual intervention.

  6. Enhanced Accuracy and Reduced Hallucinations: By integrating structured knowledge retrieval and direct code execution, ESCARGOT minimizes the risk of hallucinations and improves the overall accuracy of reasoning and computation.
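
The retrieval fallback (feature 3) and the self-debugging loop (feature 5) can be illustrated with a small, generic sketch. This is not ESCARGOT's implementation: the function names `retrieve_with_fallback` and `run_with_self_debugging`, the callables passed in, and the retry limit are all hypothetical and stand in for whatever the library does internally.

```python
from typing import Callable, List, Optional


def retrieve_with_fallback(
    question: str,
    run_cypher: Callable[[str], List[str]],
    run_vector_search: Callable[[str], List[str]],
) -> List[str]:
    """Try the structured query first; fall back to similarity search."""
    try:
        rows = run_cypher(question)
        if rows:
            return rows
    except Exception:
        pass  # a failed query is treated like an empty result
    return run_vector_search(question)


def run_with_self_debugging(
    code: str,
    revise: Callable[[str, str], str],
    max_attempts: int = 3,
) -> dict:
    """Execute generated code, asking `revise` for a fix after each failure.

    `revise(code, error_message)` is a hypothetical hook; in a real system
    it would be a call to the LLM.
    """
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        namespace: dict = {}
        try:
            exec(code, namespace)  # run the candidate code in a fresh namespace
            return namespace
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
            code = revise(code, last_error)
    raise RuntimeError(f"gave up after {max_attempts} attempts: {last_error}")
```

Passing the query and revision steps in as callables keeps the control flow visible without tying the sketch to any particular database client or model API.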


Quick Install Through Pip

pip install escargot

Example Usage

Here's how to configure and use the Escargot library with your chosen models and databases:

Configuration

First, set up your configuration with the necessary parameters:

config = {
    "azuregpt35-16k" : {
        "model_id":"gpt-35-turbo-16k", 
        "prompt_token_cost": 0.001,
        "response_token_cost": 0.002,
        "temperature": 0.7,
        "max_tokens": 2000,
        "stop": None,
        "api_version": "API_VERSION",
        "api_base": "API_BASE_HERE",
        "api_key": "API_KEY_HERE",
        "embedding_id":"text-embedding-ada-002"
    },
    "memgraph" : {
        "host": "MEMGRAPH_URL",
        "port": 7687
    },
    "weaviate" : {
        "api_key": "WEAVIATE_API_KEY",
        "url": "WEAVIATE_URL",
        "db": "WEAVIATE_DB",
        "limit": 200
    }
}
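
To keep credentials out of source files, the same dictionary can be assembled from environment variables. The sketch below reuses the keys and default values from the configuration above; the environment variable names (`AZURE_API_VERSION`, `MEMGRAPH_HOST`, and so on) and the `build_config` helper are assumptions made for illustration, not names the library reads itself.

```python
import os
from typing import Mapping


def build_config(env: Mapping[str, str] = os.environ) -> dict:
    """Build the Escargot configuration dictionary from environment variables.

    The variable names are hypothetical; rename them to match your deployment.
    A missing required variable raises KeyError.
    """
    return {
        "azuregpt35-16k": {
            "model_id": "gpt-35-turbo-16k",
            "prompt_token_cost": 0.001,
            "response_token_cost": 0.002,
            "temperature": 0.7,
            "max_tokens": 2000,
            "stop": None,
            "api_version": env["AZURE_API_VERSION"],
            "api_base": env["AZURE_API_BASE"],
            "api_key": env["AZURE_API_KEY"],
            "embedding_id": "text-embedding-ada-002",
        },
        "memgraph": {
            "host": env["MEMGRAPH_HOST"],
            # The port is optional and falls back to the value used above.
            "port": int(env.get("MEMGRAPH_PORT", "7687")),
        },
        "weaviate": {
            "api_key": env["WEAVIATE_API_KEY"],
            "url": env["WEAVIATE_URL"],
            "db": env["WEAVIATE_DB"],
            "limit": 200,
        },
    }
```

Calling `config = build_config()` then yields a dictionary with the same shape as the literal shown above.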

Initializing Escargot

Initialize the Escargot instance with your configuration. Escargot will automatically connect to the Memgraph database and retrieve all the node types and relationships.

from escargot import Escargot

escargot = Escargot(
    config, model_name="azuregpt35-16k"
)

Ask a question

To query the knowledge graph, call the ask method on the Escargot instance. The optional parameters debug_level and answer_type control the verbosity of logging and the format of the response.

Example of extracting information from the knowledge graph

# Basic question
response = escargot.ask("What is the function of the gene APOE?")
# Example response:
# The gene APOE has several functions, including low-density lipoprotein
# particle receptor binding, protein homodimerization activity, steroid
# binding, heparin binding, tau protein binding, amide binding, lipoprotein
# particle receptor binding, and protein-lipid complex assembly.

# Advanced usage with additional parameters
response = escargot.ask(
    "What is the function of the gene APOE?",
    debug_level=0,          # Logging verbosity, 0 to 3 (default: 0)
    answer_type="natural",  # Response format: "natural" (default) or "array"
)

Example of extracting an array of genes from the knowledge graph

escargot.ask("List the genes that are associated with the Alzheimer's disease", answer_type="array")
{'genes_list': ['GSK3B',
  'CASP3',
  'CHRNB2',
  'IGF2',
  'IQCK',
  'MS4A4A',
  'IL1B',
  'ACE',
  'VEGFA',
  'WWOX',
  'DPYSL2',
  'MIR4467',
...
  'INSR',
  'ABCA7',
  'SORL1',
  'HLA-DRB5',
  'ACHE']}
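
The array answer above arrives as a dictionary that maps a key ('genes_list') to a list. The source does not say whether that key name is stable across questions, so downstream code may prefer not to depend on it. The helper below is a minimal sketch under that assumption; `flatten_array_answer` is a hypothetical name and not part of the library.

```python
from typing import Any, List


def flatten_array_answer(answer: Any) -> List[Any]:
    """Collect the items of an array-style answer into one list.

    Accepts a dict of lists (as in the example above), a plain list, or a
    single value. Order is preserved and duplicates are dropped. Items are
    assumed to be hashable, such as strings.
    """
    if isinstance(answer, dict):
        groups = list(answer.values())
    elif isinstance(answer, (list, tuple)):
        groups = [answer]
    else:
        groups = [[answer]]

    seen = set()
    items: List[Any] = []
    for group in groups:
        members = group if isinstance(group, (list, tuple)) else [group]
        for item in members:
            if item not in seen:
                seen.add(item)
                items.append(item)
    return items
```

For example, `flatten_array_answer({'genes_list': ['GSK3B', 'CASP3']})` returns `['GSK3B', 'CASP3']` whatever the key is called.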

Manually Configure Knowledge Graph Schema (bypasses automated database schema extraction)

Retrieving the schema takes time, so it is useful to extract it once and then pass it in when instantiating Escargot. Supplying node_types and relationship_types manually skips the automated schema retrieval described above, and Escargot uses the configuration you provide instead:

from escargot import Escargot

escargot = Escargot(
    config,
    node_types="BiologicalProcess, BodyPart, CellularComponent, Datatype, Disease, Drug, DrugClass, Gene, MolecularFunction, Pathway, Symptom",
    relationship_types="""CHEMICALBINDSGENE
CHEMICALDECREASESEXPRESSION
CHEMICALINCREASESEXPRESSION
DRUGINCLASS
DRUGCAUSESEFFECT
DRUGTREATSDISEASE
GENEPARTICIPATESINBIOLOGICALPROCESS
GENEINPATHWAY
GENEINTERACTSWITHGENE
GENEHASMOLECULARFUNCTION
GENEASSOCIATEDWITHCELLULARCOMPONENT
GENEASSOCIATESWITHDISEASE
SYMPTOMMANIFESTATIONOFDISEASE
BODYPARTUNDEREXPRESSESGENE
BODYPARTOVEREXPRESSESGENE
DISEASELOCALIZESTOANATOMY
DISEASEASSOCIATESWITHDISEASET""",
    model_name="azuregpt35-16k"
)
escargot.memgraph_client.schema = """Node properties are the following:
Node name: 'BiologicalProcess', Node properties: ['commonName']
Node name: 'BodyPart', Node properties: ['commonName']
Node name: 'CellularComponent', Node properties: ['commonName']
Node name: 'Disease', Node properties: ['commonName']
Node name: 'Drug', Node properties: ['commonName']
Node name: 'DrugClass', Node properties: ['commonName']
Node name: 'Gene', Node properties: ['commonName', 'geneSymbol', 'typeOfGene']
Node name: 'MolecularFunction', Node properties: ['commonName']
Node name: 'Pathway', Node properties: ['commonName']
Node name: 'Symptom', Node properties: ['commonName']
Relationship properties are the following:
The relationships are the following:
(:Drug)-[:CHEMICALBINDSGENE]-(:Gene)
(:Drug)-[:CHEMICALDECREASESEXPRESSION]-(:Gene)
(:Drug)-[:CHEMICALINCREASESEXPRESSION]-(:Gene)
(:Drug)-[:DRUGINCLASS]-(:DrugClass)
(:Drug)-[:DRUGCAUSESEFFECT]-(:Disease)
(:Drug)-[:DRUGTREATSDISEASE]-(:Disease)
(:Gene)-[:GENEPARTICIPATESINBIOLOGICALPROCESS]-(:BiologicalProcess)
(:Gene)-[:GENEINPATHWAY]-(:Pathway)
(:Gene)-[:GENEINTERACTSWITHGENE]-(:Gene)
(:Gene)-[:GENEHASMOLECULARFUNCTION]-(:MolecularFunction)
(:Gene)-[:GENEASSOCIATEDWITHCELLULARCOMPONENT]-(:CellularComponent)
(:Gene)-[:GENEASSOCIATESWITHDISEASE]-(:Disease)
(:Symptom)-[:SYMPTOMMANIFESTATIONOFDISEASE]-(:Disease)
(:BodyPart)-[:BODYPARTUNDEREXPRESSESGENE]-(:Gene)
(:BodyPart)-[:BODYPARTOVEREXPRESSESGENE]-(:Gene)
(:Disease)-[:DISEASELOCALIZESTOANATOMY]-(:BodyPart)
(:Disease)-[:DISEASEASSOCIATESWITHDISEASET]-(:Disease)"""
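
Because the node and relationship names appear both in the constructor arguments and in the schema text, the two can drift apart when edited by hand. The sketch below derives both strings from a schema in the format shown above, using regular expressions that assume exactly that layout; `summarize_schema` is a hypothetical helper, not a library function.

```python
import re
from typing import List, Tuple

# Matches lines such as: Node name: 'Gene', Node properties: [...]
NODE_RE = re.compile(r"Node name: '([^']+)'")
# Matches lines such as: (:Gene)-[:GENEINPATHWAY]-(:Pathway)
REL_RE = re.compile(r"\(:(\w+)\)-\[:(\w+)\]-\(:(\w+)\)")


def summarize_schema(schema: str) -> Tuple[str, str]:
    """Return (node_types, relationship_types) derived from a schema string.

    node_types is comma-separated and relationship_types is newline-separated,
    mirroring the constructor arguments in the listing above.
    """
    node_names: List[str] = NODE_RE.findall(schema)
    rel_names: List[str] = []
    for _source, rel, _target in REL_RE.findall(schema):
        if rel not in rel_names:  # keep first occurrence, preserve order
            rel_names.append(rel)
    return ", ".join(node_names), "\n".join(rel_names)
```

The two returned strings can be compared with, or passed as, the node_types and relationship_types arguments. In the listing above, node_types also names Datatype, which has no entry in the schema text, so compare the derived values with your own before replacing them.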
