Diffbot Python Library

Python client library for Diffbot APIs.

Installation

pip install git+https://github.com/diffbot/diffbot-python.git

Or, for local development, install in editable mode from a clone of the repository:

pip install -e ".[dev]"

Usage

Authentication

Set your Diffbot API token in the DIFFBOT_API_TOKEN environment variable, either in your shell or in a .env file.

export DIFFBOT_API_TOKEN=<TOKEN>
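
A minimal sketch of reading that variable from Python and passing it to the client explicitly. The helper names load_token and make_client are illustrative and not part of the library. Whether Diffbot picks the variable up on its own is not stated here, so the sketch passes token= the same way the examples in this README do.

```python
import os


def load_token(env=None, name="DIFFBOT_API_TOKEN"):
    """Return the API token from the environment, or raise if it is unset."""
    env = os.environ if env is None else env
    token = (env.get(name) or "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env")
    return token


def make_client():
    """Build a client from the environment (hypothetical helper)."""
    # Imported lazily so load_token() works even without the package installed.
    from diffbot import Diffbot

    return Diffbot(token=load_token())
```

Note that os.environ only sees values from a .env file if something loads that file first (python-dotenv is one option); this sketch does not do that itself.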

Extract structured content

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
data = db.extract("https://www.example.com")

Ask Diffbot LLM

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
for chunk in db.ask([{"role": "user", "content": "What's the capital of France?"}]):
    print(chunk, end="")
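
If you want the full answer as one string rather than printing it as it streams, the chunks can be joined. This is a small sketch; collect_answer is a hypothetical helper, and it assumes each chunk is a plain str, which the print(chunk, end="") usage above suggests but does not guarantee.

```python
def collect_answer(chunks):
    """Join streamed text chunks into a single string."""
    parts = []
    for chunk in chunks:
        # Assumption: every chunk is a str fragment of the answer.
        parts.append(chunk)
    return "".join(parts)


# Usage with the client from the example above:
# answer = collect_answer(db.ask([{"role": "user", "content": "..."}]))
```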

Crawl a site for structured content

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
for event in db.crawl("https://www.example.com", hops=1):
    print(event)

Query the Knowledge Graph

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
results = db.dql('type:Organization name:"Diffbot"')

Web Search

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
results = db.web_search("diffbot knowledge graph")
for r in results["search_results"]:
    print(r["score"], r["title"], r["pageUrl"])
    print(r["content"])
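
The result keys used above (search_results, score, title, pageUrl) are enough to filter and rank hits on the client side. The sketch below assumes score is numeric and that higher means more relevant; the helper name, the 0.5 threshold, and the sample data are made up for illustration.

```python
def top_results(results, min_score=0.5, limit=5):
    """Return up to `limit` hits scoring at least `min_score`, best first."""
    hits = [
        r for r in results.get("search_results", [])
        if r.get("score", 0) >= min_score
    ]
    # Sort descending by score; hits without a score count as 0.
    hits.sort(key=lambda r: r.get("score", 0), reverse=True)
    return hits[:limit]


# Hypothetical data shaped like the response used in the example above.
sample = {
    "search_results": [
        {"score": 0.4, "title": "Low", "pageUrl": "https://www.example.com/low"},
        {"score": 0.9, "title": "High", "pageUrl": "https://www.example.com/high"},
        {"score": 0.7, "title": "Mid", "pageUrl": "https://www.example.com/mid"},
    ]
}
for r in top_results(sample):
    print(r["score"], r["title"], r["pageUrl"])
```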

Entities (NLP)

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
result = db.entities("Apple CEO Tim Cook announced record quarterly earnings.")
for entity in result["entities"]:
    print(entity["name"], entity.get("type"), entity.get("id"))
print("sentiment:", result.get("sentiment"))
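
A common next step is grouping the returned entities by type. The sketch below relies only on the entities, name, and type keys shown above; the helper name and the sample values (including the type labels) are hypothetical.

```python
from collections import defaultdict


def group_entities_by_type(result):
    """Map each entity type to the list of entity names of that type."""
    groups = defaultdict(list)
    for entity in result.get("entities", []):
        # "type" is read with .get() in the example above, so it may be absent.
        groups[entity.get("type") or "unknown"].append(entity["name"])
    return dict(groups)


# Hypothetical data shaped like the response used in the example above.
sample = {
    "entities": [
        {"name": "Apple", "type": "Organization"},
        {"name": "Tim Cook", "type": "Person"},
        {"name": "CEO"},
    ]
}
print(group_entities_by_type(sample))
```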

Async Usage

Extract structured content

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        data = await db.extract("https://www.example.com")
        print(data)

asyncio.run(main())
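
The main reason to reach for the async client is running several requests at once. Here is a minimal sketch using asyncio.gather; extract_many is a hypothetical helper, and it assumes a single DiffbotAsync instance can serve concurrent extract calls, which this README does not confirm.

```python
import asyncio


async def extract_many(db, urls):
    """Extract several URLs concurrently; map each URL to its result.

    Failures are returned as exception objects instead of being raised,
    so one bad URL does not discard the other results.
    """
    results = await asyncio.gather(
        *(db.extract(url) for url in urls),
        return_exceptions=True,
    )
    return dict(zip(urls, results))


# Usage with the async client from the example above:
# async with DiffbotAsync(token="YOUR_TOKEN") as db:
#     pages = await extract_many(db, ["https://www.example.com"])
```

Check each value with isinstance(value, Exception) before using it, since errors come back in place of data.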

Ask Diffbot LLM

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        async for chunk in db.ask([{"role": "user", "content": "What's the capital of France?"}]):
            print(chunk, end="")

asyncio.run(main())

Crawl a site for structured content

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        async for event in db.crawl("https://www.example.com", hops=1):
            print(event)

asyncio.run(main())

Query the Knowledge Graph

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        results = await db.dql('type:Organization name:"Diffbot"')
        print(results)

asyncio.run(main())

Web Search

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        results = await db.web_search("diffbot knowledge graph")
        for r in results["search_results"]:
            print(r["score"], r["title"], r["pageUrl"])
            print(r["content"])

asyncio.run(main())

Entities (NLP)

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        result = await db.entities("Apple CEO Tim Cook announced record quarterly earnings.")
        for entity in result["entities"]:
            print(entity["name"], entity.get("type"), entity.get("id"))
        print("sentiment:", result.get("sentiment"))

asyncio.run(main())

CLI

This library also includes a CLI, invoked as db. Set your token in the environment first, then run a subcommand:

export DIFFBOT_API_TOKEN=your-token-here

db extract https://www.example.com
db ask "What's the capital of France?"
db crawl https://www.example.com --hops 1
db crawl-list-jobs
db crawl-delete-job crawl-1234567890
db web-search "diffbot knowledge graph"
db web-search "diffbot knowledge graph" -n 5 -f json
db entities "Apple CEO Tim Cook announced record quarterly earnings."
db entities "Apple CEO Tim Cook announced record quarterly earnings." -f dql

Tests

Run the mock test suite:

python -m pytest

Run live integration tests against the real API (requires a valid token):

DIFFBOT_TOKEN=your_token python -m pytest -m live
