LangChain integration for Cala — the verified knowledge graph API for AI agents

langchain-cala

LangChain retriever for Cala — the verified knowledge graph API for AI agents.

Cala returns structured, typed, source-cited data instead of raw web scrapes. Every document includes source and verified metadata fields so your agent can reason over facts with full traceability.

pip install langchain-cala

Why Cala?

Web search APIs                       Cala
------------------------------------  --------------------------------------
Returns URLs + scraped HTML           Returns structured, typed JSON
Agent must parse, deduplicate, guess  Agent reasons directly over clean data
No source tracing                     Every fact cites its source
Hallucination risk                    Verified entity graph

Quickstart

from langchain_cala import CalaRetriever

retriever = CalaRetriever(api_key="YOUR_CALA_API_KEY")
docs = retriever.invoke("Latest funding rounds for Barcelona AI startups")

for doc in docs:
    print(doc.page_content)
    print(doc.metadata["source"])

Get your API key at console.cala.ai.
You can also set it as an environment variable: export CALA_API_KEY=your-key
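
One way to keep the key out of source code is to read it from the environment before constructing the retriever. The following is a minimal sketch that assumes only the CALA_API_KEY variable name mentioned above; the resolve_api_key helper is hypothetical and not part of langchain-cala.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Return the explicit key if given, else the CALA_API_KEY env variable.

    Hypothetical helper for illustration; not part of langchain-cala.
    """
    key = explicit or os.environ.get("CALA_API_KEY")
    if not key:
        raise ValueError("No Cala API key: pass one or set CALA_API_KEY")
    return key


# Usage (requires langchain-cala to be installed):
# from langchain_cala import CalaRetriever
# retriever = CalaRetriever(api_key=resolve_api_key())
```

Failing early with a clear message avoids sending a request with an empty key.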


Modes

CalaRetriever supports three modes matching Cala's API:

"search" (default) — Natural language questions

Ask a free-text question and get a synthesised answer with source citations.

retriever = CalaRetriever(api_key="...", mode="search")
docs = retriever.invoke("Who are the key investors in European AI infrastructure?")

# docs[0] → synthesised answer
# docs[1:] → source citations with URLs
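
To work with the answer and its citations separately, the layout described in the comments above can be unpacked with a small helper. This is a sketch under that assumption (docs[0] is the answer, docs[1:] are citations with a "source" field); split_answer_and_citations is a hypothetical name, and the stand-in objects only mimic the page_content and metadata attributes of a LangChain Document so the example runs without an API key.

```python
from types import SimpleNamespace


def split_answer_and_citations(docs):
    """Split search-mode results into (answer_text, citation_sources).

    Assumes docs[0] is the synthesised answer and docs[1:] are citations
    carrying a "source" metadata field.
    """
    if not docs:
        return "", []
    answer = docs[0].page_content
    sources = [d.metadata.get("source", "") for d in docs[1:]]
    return answer, sources


# Stand-in objects with made-up content, not real Cala output.
example_docs = [
    SimpleNamespace(page_content="Example answer.", metadata={"type": "answer"}),
    SimpleNamespace(
        page_content="Cited passage.",
        metadata={"type": "citation", "source": "https://example.com/a"},
    ),
]
answer, sources = split_answer_and_citations(example_docs)
print(answer)
print(sources)
```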

"query" — Structured dot-notation queries

Query the entity graph directly using Cala's dot-notation syntax.

retriever = CalaRetriever(api_key="...", mode="query")
docs = retriever.invoke("OpenAI.founded.year")
# → "2015"

docs = retriever.invoke("Mistral.headquarters.city")
# → "Paris"
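
A dot-notation query reads as an entity name followed by a field path. The sketch below splits a query on the first dot, which mirrors the "entity" and "field" metadata keys listed under Document metadata. It is illustrative only: split_dot_query is a hypothetical helper, and how Cala itself parses queries (for example, entity names containing dots) is not specified here.

```python
def split_dot_query(query: str):
    """Split "Entity.field.subfield" into ("Entity", "field.subfield")."""
    entity, _, field = query.partition(".")
    if not entity or not field:
        raise ValueError(f"Expected 'Entity.field', got {query!r}")
    return entity, field


print(split_dot_query("OpenAI.founded.year"))        # ('OpenAI', 'founded.year')
print(split_dot_query("Mistral.headquarters.city"))  # ('Mistral', 'headquarters.city')
```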

"entities" — Entity discovery

Find and explore entities by name across companies, people, products, research papers, laws, and places.

retriever = CalaRetriever(api_key="...", mode="entities")
docs = retriever.invoke("Factorial HR")
# → Entity card: name, type, description, fields (funding, location, founded…)

Parameters

Parameter  Type                             Default           Description
---------  -------------------------------  ----------------  -------------------------
api_key    str                              CALA_API_KEY env  Your Cala API key
mode       "search" | "query" | "entities"  "search"          API mode
k          int                              5                 Max documents returned
timeout    int                              30                Request timeout (seconds)
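
If these settings come from user input or a config file, it can help to check them before building the retriever. The following sketch mirrors the documented defaults in a small dataclass. RetrieverSettings and VALID_MODES are hypothetical names, and whether CalaRetriever performs similar validation itself is not stated here.

```python
from dataclasses import dataclass

VALID_MODES = ("search", "query", "entities")


@dataclass(frozen=True)
class RetrieverSettings:
    """Hypothetical container mirroring the documented parameters."""

    mode: str = "search"
    k: int = 5
    timeout: int = 30

    def __post_init__(self):
        # Reject values outside the documented options before any request is made.
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}, got {self.mode!r}")
        if self.k < 1:
            raise ValueError("k must be at least 1")
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")


settings = RetrieverSettings(mode="query", k=3)
# These fields could then be passed on, for example:
# CalaRetriever(api_key="...", mode=settings.mode, k=settings.k, timeout=settings.timeout)
```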

Use in a RAG chain

from langchain_cala import CalaRetriever
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

retriever = CalaRetriever(api_key="...")

prompt = ChatPromptTemplate.from_template("""
Answer the question based only on the context below.
Every fact in your answer must be traceable to a source.

Context:
{context}

Question: {question}
""")

def format_docs(docs):
    return "\n\n".join(
        f"{doc.page_content}\n[Source: {doc.metadata.get('source', 'cala')}]"
        for doc in docs
    )

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o")
    | StrOutputParser()
)

answer = chain.invoke("Which EU AI startups raised Series A in 2024?")
print(answer)
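
To see what the {context} string looks like without calling any API, format_docs can be run on stand-in documents. This is a sketch with made-up content; the SimpleNamespace objects only imitate the page_content and metadata attributes that format_docs reads.

```python
from types import SimpleNamespace


def format_docs(docs):
    # Same formatting as in the chain above: content, then a source tag.
    return "\n\n".join(
        f"{doc.page_content}\n[Source: {doc.metadata.get('source', 'cala')}]"
        for doc in docs
    )


# Stand-in documents with hypothetical content, not real Cala output.
sample = [
    SimpleNamespace(page_content="Fact A.", metadata={"source": "https://example.com/a"}),
    SimpleNamespace(page_content="Fact B.", metadata={}),  # no source: falls back to "cala"
]
context = format_docs(sample)
print(context)
```

Each document becomes its content followed by a [Source: ...] line, with a blank line between documents, so the model sees every fact next to its origin.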

Use as a LangGraph tool

from langchain_cala import CalaRetriever
from langchain_core.tools import create_retriever_tool

retriever = CalaRetriever(api_key="...")

cala_tool = create_retriever_tool(
    retriever,
    name="cala_knowledge",
    description=(
        "Search Cala's verified knowledge graph for structured, "
        "source-cited facts about companies, people, funding, research, and more. "
        "Returns typed data your agent can reason over immediately."
    ),
)

# Add to your LangGraph agent's tools list
tools = [cala_tool]  # append any other tools your agent uses

Document metadata

Every Document returned includes metadata for full traceability:

{
    "source": "https://techcrunch.com/...",   # or "cala:knowledge_search"
    "verified": True,
    "type": "answer",            # one of "answer", "citation", "structured_result", "entity"
    "query": "original query string",
    # mode-specific extras:
    "entity": "OpenAI",          # query mode
    "field": "founded.year",     # query mode
    "entity_id": "ent_xxx",      # entities mode
    "entity_type": "Company",    # entities mode
}
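
These fields make it straightforward to post-filter results. The sketch below collects the unique sources of documents marked as verified, using only the "verified" and "source" keys shown above. verified_sources is a hypothetical helper, and the sample documents are stand-ins with made-up values.

```python
from types import SimpleNamespace


def verified_sources(docs):
    """Return unique sources of verified documents, in first-seen order."""
    seen = []
    for doc in docs:
        meta = doc.metadata
        source = meta.get("source")
        if meta.get("verified") and source and source not in seen:
            seen.append(source)
    return seen


# Stand-in documents with hypothetical metadata values.
sample_docs = [
    SimpleNamespace(page_content="a", metadata={"source": "https://example.com/a", "verified": True}),
    SimpleNamespace(page_content="b", metadata={"source": "https://example.com/a", "verified": True}),
    SimpleNamespace(page_content="c", metadata={"source": "https://example.com/c", "verified": False}),
]
print(verified_sources(sample_docs))
```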

Installation

pip install langchain-cala

Requirements: Python ≥ 3.9, langchain-core ≥ 0.2, requests ≥ 2.28, pydantic ≥ 2.0


Development

git clone https://github.com/your-username/langchain-cala
cd langchain-cala
pip install -e ".[dev]"
pytest tests/ -v

License

MIT
