AI-powered CLI for WordLift knowledge graph and SEO workflows.

worai

Command-line toolkit for WordLift operations and SEO checks. Pronunciation: "waw-RYE".

Docs: https://docs.wordlift.io/worai/

Install

  • pipx install worai
  • pip install worai

Runtime dependency notes:

  • wordlift-sdk>=3.6.0,<4.0.0 (installed automatically by pip)
  • copier (required by worai graph sync create, installed automatically by pip)

If you plan to run seocheck, install Playwright browsers:

  • playwright install chromium

Quick Start

  • worai --help
  • worai seocheck https://example.com/sitemap.xml
  • worai google-search-console --site sc-domain:example.com --client-secrets ./client_secrets.json
  • worai <command> --help

Configuration

Config file (TOML) discovery order:

  • --config
  • WORAI_CONFIG
  • ./worai.toml
  • ~/.config/worai/config.toml
  • ~/.worai.toml
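
The discovery order above can be sketched in a few lines of Python. This is only an illustration, not worai's actual implementation: the function name resolve_config_path and its parameters are hypothetical, and the sketch assumes that an explicit --config or WORAI_CONFIG path is used as given while the default locations are only used when the file exists.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(
    cli_config: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first config candidate, following the documented order."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    # 1. --config and 2. WORAI_CONFIG are explicit choices.
    # Assumption: an explicit path is returned even if the file is missing,
    # so the caller can report a clear error.
    if cli_config:
        return Path(cli_config).expanduser()
    if env.get("WORAI_CONFIG"):
        return Path(env["WORAI_CONFIG"]).expanduser()

    # 3-5. Default locations: the first existing file wins.
    for candidate in (
        cwd / "worai.toml",
        home / ".config" / "worai" / "config.toml",
        home / ".worai.toml",
    ):
        if candidate.is_file():
            return candidate
    return None
```

Passing env, cwd, and home explicitly keeps the lookup easy to test without touching the real environment.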

Profiles:

  • [profile.<name>] with --profile or WORAI_PROFILE

Common keys:

  • wordlift.api_key
  • gsc.id
  • gsc.client_secrets
  • ga.id
  • ga.client_secrets
  • oauth.token (shared token for GSC + GA)
  • postprocessor_runtime (graph sync runtime: subprocess or persistent; profile override supported)
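
As a rough sketch of how a profile value can override a global key, the snippet below resolves postprocessor_runtime from an already parsed config. The function name and the dictionary layout (a top-level key plus a "profile" table keyed by profile name) are assumptions made for illustration; worai's real loader may differ.

```python
from typing import Any, Mapping, Optional

ALLOWED_RUNTIMES = {"subprocess", "persistent"}


def resolve_postprocessor_runtime(
    config: Mapping[str, Any], profile: Optional[str] = None
) -> Optional[str]:
    """Pick postprocessor_runtime, letting the profile value override the global one."""
    value = config.get("postprocessor_runtime")
    if profile:
        # Assumed layout: profiles live under a top-level "profile" table.
        profile_table = config.get("profile", {}).get(profile, {})
        value = profile_table.get("postprocessor_runtime", value)
    if value is not None and value not in ALLOWED_RUNTIMES:
        raise ValueError(f"unsupported postprocessor_runtime: {value!r}")
    return value


# Parsed form of a hypothetical worai.toml with a global value and one profile override.
config = {
    "postprocessor_runtime": "subprocess",
    "profile": {"acme": {"postprocessor_runtime": "persistent"}},
}

print(resolve_postprocessor_runtime(config))          # global value
print(resolve_postprocessor_runtime(config, "acme"))  # profile override
```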

Supported environment variables:

  • WORAI_CONFIG — path to a config TOML file (overrides discovery order).
  • WORAI_PROFILE — profile name under [profile.<name>].
  • WORAI_LOG_LEVEL — default log level (debug|info|warning|error).
  • WORAI_LOG_FORMAT — default log format (text|json).
  • WORDLIFT_KEY — WordLift API key for entity operations.
  • WORDLIFT_API_KEY — alternate WordLift API key name (also accepted by some commands).
  • GSC_CLIENT_SECRETS — path to OAuth client secrets JSON for GSC.
  • GSC_ID — GSC property URL.
  • OAUTH_TOKEN — path to store the shared OAuth token (GSC + GA).
  • GSC_OUTPUT — default output CSV path for GSC export.
  • GA_ID — GA4 property ID for Analytics sections.
  • GA_CLIENT_SECRETS — path to OAuth client secrets JSON for GA4.
  • GSC_TOKEN / GA_TOKEN — legacy aliases for OAUTH_TOKEN (must point to the same file if used).
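
The rule that the legacy aliases must point to the same file as OAUTH_TOKEN could be checked along these lines. This is a sketch under assumptions: resolve_oauth_token_path is a hypothetical helper, and it compares the raw strings, whereas a real implementation might normalize paths first.

```python
from typing import Mapping, Optional

TOKEN_VARS = ("OAUTH_TOKEN", "GSC_TOKEN", "GA_TOKEN")


def resolve_oauth_token_path(env: Mapping[str, str]) -> Optional[str]:
    """Return the shared token path, rejecting aliases that disagree."""
    found = {name: env[name] for name in TOKEN_VARS if env.get(name)}
    if len(set(found.values())) > 1:
        details = ", ".join(f"{name}={path}" for name, path in found.items())
        raise ValueError(f"token variables must point to the same file: {details}")
    # All set variables agree, so any of them gives the shared path.
    for name in TOKEN_VARS:
        if name in found:
            return found[name]
    return None
```

For example, setting only GSC_TOKEN returns that path, while setting OAUTH_TOKEN and GA_TOKEN to different files raises a ValueError.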

.env support:

  • worai loads .env from the current working directory (and parent lookup) at startup.
  • values from .env are treated as environment variables.
  • existing environment variables take precedence over .env values.
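
The three .env rules above (parent lookup, values treated as environment variables, existing variables winning) can be illustrated with a standard-library sketch. It is not how worai loads .env files; the helper names are hypothetical and the parser handles only simple KEY=VALUE lines.

```python
from pathlib import Path
from typing import Dict, MutableMapping, Optional


def find_dotenv(start: Path) -> Optional[Path]:
    """Look for .env in start, then in each parent directory."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_dotenv_into(env: MutableMapping[str, str], start: Path) -> None:
    """Merge .env values into env without overriding existing variables."""
    path = find_dotenv(start)
    if path is None:
        return
    for key, value in parse_dotenv(path.read_text(encoding="utf-8")).items():
        # setdefault keeps any value that is already present in env.
        env.setdefault(key, value)
```

Calling load_dotenv_into(os.environ, Path.cwd()) at startup would apply the same precedence to the real process environment.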

Example environment setup:

# Use $HOME rather than a quoted ~, which the shell does not expand inside quotes.
export WORDLIFT_KEY="wl_..."
export WORAI_CONFIG="$HOME/worai.toml"
export WORAI_PROFILE="dev"
export GSC_CLIENT_SECRETS="$HOME/client_secrets.json"
export OAUTH_TOKEN="$HOME/oauth_token.json"

Example worai.toml:

[defaults]
log_level = "info"

[wordlift]
api_key = "wl_..."

[gsc]
id = "sc-domain:example.com"
client_secrets = "/path/to/client_secrets.json"

[ga]
id = "123456789"
client_secrets = "/path/to/client_secrets.json"

[oauth]
token = "/path/to/oauth_token.json"

Commands

  • seocheck — run SEO checks for sitemap URLs and URL lists.
  • google-search-console — export GSC page metrics as CSV.
  • dedupe — deduplicate WordLift entities by schema:url.
  • canonicalize-duplicate-pages — select canonical URLs using GSC KPIs.
  • delete-entities-from-csv — delete entities listed in a CSV.
  • find-faq-page-wrong-type — find and patch FAQPage typing issues.
  • find-missing-names — find entities missing schema:name/headline.
  • find-url-by-type — list schema:url values by type from RDF.
  • graph — run graph-specific workflows.
  • link-groups — build or apply LinkGroup data from CSV.
  • patch — patch entities from RDF.
  • structured-data — generate JSON-LD/YARRRML mappings or materialize RDF from YARRRML.
  • validate — validate JSON-LD with SHACL shapes (use structured-data validate page for webpage URLs).
  • upload-entities-from-turtle — upload .ttl files with resume.

Command help:

  • worai <command> --help

Autocompletion:

  • worai --install-completion
  • worai --show-completion

Examples

seocheck

  • worai seocheck https://example.com/sitemap.xml
  • worai seocheck https://example.com/sitemap.xml --output-dir ./seocheck-report --save-html
  • worai seocheck https://example.com/sitemap.xml --output-dir ./seocheck-report --no-open-report
  • worai seocheck https://example.com/sitemap.xml --user-agent "Mozilla/5.0 ..."
  • worai seocheck https://example.com/sitemap.xml --sitemap-fetch-mode browser
  • worai seocheck https://example.com/sitemap.xml --no-report-ui
  • worai seocheck https://example.com/sitemap.xml --recheck-failed --recheck-from ./seocheck-report

google-search-console

  • worai google-search-console --site sc-domain:example.com --client-secrets ./client_secrets.json
    • Uses OAuth redirect port 8080 by default.

seoreport (with Analytics)

  • worai seoreport --site sc-domain:example.com --ga-id 123456789 --format html

canonicalize-duplicate-pages

  • worai canonicalize-duplicate-pages --input gsc_pages.csv --output canonical_targets.csv --kpi-window 28d --kpi-metric clicks
  • worai canonicalize-duplicate-pages --input gsc_pages.csv --entity-type Product

dedupe

  • worai dedupe --dry-run

find-faq-page-wrong-type

  • worai find-faq-page-wrong-type ./data.ttl --dry-run --replace-type
  • worai find-faq-page-wrong-type ./data.ttl --patch --replace-type

find-missing-names

  • worai find-missing-names ./data.ttl

find-url-by-type

  • worai find-url-by-type ./data.ttl schema:Service schema:Product

link-groups

  • worai link-groups ./links.csv --format turtle
  • worai link-groups ./links.csv --apply --dry-run --concurrency 4

graph

  • worai --config ./worai.toml graph sync run --profile acme
  • worai graph sync run --profile acme --debug
  • worai graph sync create ./acme-graph
  • worai graph sync create ./acme-graph --template ./graph-sync-template --defaults
  • worai graph sync create ./acme-graph --data-file ./answers.yml --non-interactive
  • worai graph sync create ./acme-graph --vcs-ref v1.2.3
    • graph sync create runs Copier in trusted mode by default so template _tasks execute.
    • Mapping docs (for [profiles.<name>]): docs/graph-sync-mappings-reference.md, docs/graph-sync-mappings-guide.md, docs/graph-sync-mappings-examples.md
    • web_page_import_timeout is set in seconds in worai.toml and converted to milliseconds for the SDK (for example, 60 becomes 60000 ms).
    • postprocessor_runtime = "persistent" in worai.toml sets SDK env POSTPROCESSOR_RUNTIME=persistent for graph sync run (profile value overrides global).
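
The last two notes describe a translation from worai.toml values to what the SDK receives. A minimal sketch of that translation is shown below; the helper names are hypothetical, and only the seconds-to-milliseconds conversion and the POSTPROCESSOR_RUNTIME variable name come from the notes above.

```python
from typing import Dict, Mapping, Optional


def timeout_seconds_to_ms(seconds: float) -> int:
    """Convert a timeout given in seconds to whole milliseconds."""
    return int(round(seconds * 1000))


def graph_sync_env(
    base_env: Mapping[str, str], postprocessor_runtime: Optional[str]
) -> Dict[str, str]:
    """Return a copy of base_env with POSTPROCESSOR_RUNTIME set when configured."""
    env = dict(base_env)
    if postprocessor_runtime:
        env["POSTPROCESSOR_RUNTIME"] = postprocessor_runtime
    return env


print(timeout_seconds_to_ms(60))  # 60000
print(graph_sync_env({}, "persistent"))
```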

patch

  • worai patch ./data.ttl --dry-run --add-types

structured-data

  • worai structured-data create https://example.com/article Review --output-dir ./structured-data
  • worai structured-data create https://example.com/article --type Review --output-dir ./structured-data
  • worai structured-data create https://example.com/article --type Review --debug
  • worai structured-data create https://example.com/article --type Review --max-xhtml-chars 40000 --max-nesting-depth 2
  • worai structured-data generate https://example.com/sitemap.xml --yarrrml ./mapping.yarrrml --output-dir ./out
  • worai structured-data generate https://example.com/page --yarrrml ./mapping.yarrrml --format jsonld
  • worai structured-data inventory https://example.com/sitemap.xml --output ./structured-data-inventory.csv
  • worai structured-data inventory ./urls.txt --output ./structured-data-inventory.csv
  • worai structured-data inventory https://docs.google.com/spreadsheets/d/<id>/edit --sheet-name URLs_US --output ./structured-data-inventory.csv
  • worai structured-data inventory https://example.com/sitemap.xml --destination-sheet-id <spreadsheet_id> --destination-sheet-name Inventory
  • worai structured-data inventory https://example.com/sitemap.xml --output ./structured-data-inventory.csv --concurrency auto
  • worai structured-data inventory /path/to/debug_cloud/us --source-type debug-cloud --output ./structured-data-inventory.csv

validate

  • worai validate jsonld --shape review-snippet --shape schema-review ./data.jsonld
  • worai validate jsonld --format raw https://api.wordlift.io/data/example.jsonld
  • worai structured-data validate page https://example.com/article --shape review-snippet

upload-entities-from-turtle

  • worai upload-entities-from-turtle ./entities --recursive --limit 50

Troubleshooting

  • Playwright missing browsers:
    • playwright install chromium
  • YARRRML conversion:
    • npm install -g @rmlio/yarrrml-parser
  • RML execution:
    • morph-kgc is included in project dependencies
  • Dependency notes:
    • Common runtime libs (e.g., requests, rdflib, tqdm, advertools, Google auth helpers) are provided transitively by wordlift-sdk.
  • OAuth token issues:
    • Remove the token file and re-run worai google-search-console.
    • If you are prompted to re-authenticate on every run, delete the token file to force a new consent flow that includes a refresh token.
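
The token reset can also be scripted. The sketch below assumes the token file is JSON with a refresh_token field, as in the authorized-user format written by Google's auth libraries; whether worai stores its token this way is an assumption, and both helper names are hypothetical.

```python
import json
from pathlib import Path


def has_refresh_token(token_path: Path) -> bool:
    """Return True if the token file exists and holds a non-empty refresh token."""
    try:
        data = json.loads(token_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed files count as "no refresh token".
        return False
    return isinstance(data, dict) and bool(data.get("refresh_token"))


def reset_token_if_needed(token_path: Path) -> bool:
    """Delete the token file when it cannot be refreshed; report whether it was removed."""
    if token_path.exists() and not has_refresh_token(token_path):
        token_path.unlink()
        return True
    return False
```

After a reset, re-running worai google-search-console should start a fresh consent flow.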
