llama-index readers wordpress integration

Project description

Wordpress Loader

pip install llama-index-readers-wordpress

This loader fetches the text of WordPress blog posts using the WordPress API. It uses the BeautifulSoup library to parse the HTML and extract the text of each article.
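To make the two steps concrete, here is a minimal sketch of turning one post object from the WordPress REST API into plain text. It is not the loader's actual code. It assumes the standard REST response shape (`title.rendered` and `content.rendered`), it uses the standard-library `html.parser` in place of BeautifulSoup so that it runs without extra dependencies, and the names `_TextExtractor`, `html_to_text`, and `post_to_text` are made up for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards the tags (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        # Keep only non-empty text fragments, trimmed of surrounding whitespace.
        stripped = data.strip()
        if stripped:
            self._chunks.append(stripped)

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Strip markup from an HTML fragment and return its text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def post_to_text(post: dict) -> str:
    """Combine the rendered title and body of one REST API post object."""
    title = html_to_text(post["title"]["rendered"])
    body = html_to_text(post["content"]["rendered"])
    return f"{title}\n\n{body}"


# Hand-written sample in the shape the REST API returns (not real site data).
sample_post = {
    "title": {"rendered": "Hello world"},
    "content": {"rendered": "<p>Welcome to <strong>my</strong> site.</p>"},
}
print(post_to_text(sample_post))
```

Joining fragments with a single space is a simplification: it keeps words apart across inline tags but does not preserve paragraph breaks.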

Usage

To use this loader, pass the base URL of the WordPress installation (e.g. https://www.mysite.com) and, optionally, a username and an application password for that user. An application password is a credential that WordPress generates for API access, separate from the account's login password.

from llama_index.readers.wordpress import WordpressReader

loader = WordpressReader(
    url="https://www.mysite.com",
    username="my_username",
    password="my_password",
)
documents = loader.load_data()

This loader is designed to load data into LlamaIndex.
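One common way to use the loaded documents is to build a vector index over them. The sketch below is an illustration rather than a prescribed workflow: `build_wordpress_index` is a hypothetical helper, the imports sit inside the function only so the snippet can be defined without the packages installed, and it assumes that `llama-index-core` is available and that an embedding model is configured (by default LlamaIndex expects an OpenAI API key).

```python
from typing import Optional


def build_wordpress_index(
    url: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
):
    """Load a site's content and index it (hypothetical helper)."""
    from llama_index.core import VectorStoreIndex
    from llama_index.readers.wordpress import WordpressReader

    loader = WordpressReader(url=url, username=username, password=password)
    documents = loader.load_data()  # network call to the WordPress site
    # Embeds every document; requires a configured embedding model.
    return VectorStoreIndex.from_documents(documents)
```

With an index in hand, `build_wordpress_index("https://www.mysite.com").as_query_engine().query("...")` would run a question against the site's content, assuming an LLM is configured as well.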

Pages and Posts

By default, the loader retrieves both WordPress pages (static content) and posts (blog entries) from the target site. To change this, set get_pages=False or get_posts=False when initializing the WordpressReader object.
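As a sketch of the two flags, the helpers below build a posts-only and a pages-only reader. `get_pages` and `get_posts` are the parameters described above. The function names are hypothetical, the import is placed inside each function only so the snippet can be defined without the package installed, and credentials are left out on the assumption that the site's content is publicly readable.

```python
def make_posts_only_reader(url: str):
    """Reader that skips static pages and loads blog posts only."""
    from llama_index.readers.wordpress import WordpressReader

    return WordpressReader(url=url, get_pages=False)


def make_pages_only_reader(url: str):
    """Reader that skips blog posts and loads static pages only."""
    from llama_index.readers.wordpress import WordpressReader

    return WordpressReader(url=url, get_posts=False)
```

For example, `make_posts_only_reader("https://www.mysite.com").load_data()` would return documents for blog entries only.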

Additional Custom Post types

To scrape custom post types besides posts and pages, pass additional_post_types as a comma-separated string (e.g., additional_post_types="custom-pages,custom-posts") when initializing the WordpressReader object.

from llama_index.readers.wordpress import WordpressReader

loader = WordpressReader(
    url="https://www.mysite.com",
    username="my_username",
    password="my_password",
    additional_post_types="webinars,podcasts",
)
documents = loader.load_data()
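For intuition about what the comma-separated value stands for, the sketch below splits it and maps each slug to the collection route that the WordPress REST API conventionally exposes for a post type (`/wp-json/wp/v2/<slug>`). Whether the reader builds its URLs exactly this way is an assumption, and `post_type_endpoints` is a hypothetical helper, not part of the package.

```python
def post_type_endpoints(base_url: str, additional_post_types: str) -> list[str]:
    """Map a comma-separated list of post type slugs to REST collection URLs."""
    base = base_url.rstrip("/")
    # Tolerate spaces around commas and ignore empty entries.
    slugs = [s.strip() for s in additional_post_types.split(",") if s.strip()]
    return [f"{base}/wp-json/wp/v2/{slug}" for slug in slugs]


print(post_type_endpoints("https://www.mysite.com/", "custom-pages, custom-posts"))
```

A custom post type is only reachable this way if the site registers it with REST support, and a site may expose it under a REST base that differs from the slug.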

Download files

Source Distribution

llama_index_readers_wordpress-0.5.0.tar.gz (5.0 kB)

Uploaded Source

Built Distribution

llama_index_readers_wordpress-0.5.0-py3-none-any.whl (4.7 kB)

Uploaded Python 3

File details

Details for the file llama_index_readers_wordpress-0.5.0.tar.gz.

File metadata

  • Download URL: llama_index_readers_wordpress-0.5.0.tar.gz
  • Upload date:
  • Size: 5.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 (uv publish, run in CI on Ubuntu 24.04 noble)

File hashes

Hashes for llama_index_readers_wordpress-0.5.0.tar.gz
Algorithm Hash digest
SHA256 a8c8c82eecc70c37331031275cfa6871346ad07ecc91e57207bfa4f5ac7e81cc
MD5 bab0bb0ec2ae61cbe22c5e6c86f4d624
BLAKE2b-256 c40442a0b6b0c21d53b97569288e53788161e28a806590e9bfb069edcb60396f
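A downloaded archive can be checked against the digest listed above. The following is a minimal sketch using only the standard library. `sha256_of` and `matches_expected` are hypothetical helper names, and the path passed in is assumed to be wherever the source distribution was saved locally.

```python
import hashlib

# SHA256 digest listed above for llama_index_readers_wordpress-0.5.0.tar.gz
EXPECTED_SHA256 = "a8c8c82eecc70c37331031275cfa6871346ad07ecc91e57207bfa4f5ac7e81cc"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_expected("llama_index_readers_wordpress-0.5.0.tar.gz")` on the downloaded file should return True if the archive is intact. pip can enforce the same check at install time through a requirements file with `--hash` entries.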

File details

Details for the file llama_index_readers_wordpress-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: llama_index_readers_wordpress-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 4.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 (uv publish, run in CI on Ubuntu 24.04 noble)

File hashes

Hashes for llama_index_readers_wordpress-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8c671abc8a991414cb7e8a0833ee371a27e2165df9950a548873f4c80a9b4734
MD5 037b9d5155241c4f500d3b6fcd337142
BLAKE2b-256 3491196b46c70fc06438b2edb7e9f104d68e7d34679799ad20fc313f50f6c16c
