llama-index readers wordpress integration
Wordpress Loader
pip install llama-index-readers-wordpress
This loader fetches the text from Wordpress blog posts using the Wordpress API. It also uses the BeautifulSoup library to parse the HTML and extract the text from the articles.
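As a rough illustration of the fetch-then-extract flow described above, the sketch below turns one post object, shaped like a Wordpress REST API response (`title.rendered`, `content.rendered`), into plain text. It is not the loader's own code: the names `post_to_text` and `sample_post` are made up for this example, and Python's standard-library `html.parser` is used as a stand-in for BeautifulSoup so the snippet runs without extra dependencies.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def post_to_text(post):
    """Return (title, plain text) for one REST API post object (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(post["content"]["rendered"])
    parser.close()
    return post["title"]["rendered"], parser.text()


# Hand-written sample in the shape returned by /wp-json/wp/v2/posts.
sample_post = {
    "id": 1,
    "title": {"rendered": "Hello world"},
    "content": {
        "rendered": "<p>Welcome to <strong>my site</strong></p>"
        "<p>Second paragraph.</p><script>x()</script>"
    },
}

title, text = post_to_text(sample_post)
print(title)
print(text)
```

The real loader may handle whitespace, entities, and embedded media differently; the point here is only the shape of the data and the HTML-to-text step.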
Usage
To use this loader, pass the base URL of the Wordpress installation (e.g. https://www.mysite.com) and, optionally, a username and an application password for that user. See the Wordpress documentation on application passwords for details.
from llama_index.readers.wordpress import WordpressReader

loader = WordpressReader(
    url="https://www.mysite.com",
    username="my_username",
    password="my_password",
)
documents = loader.load_data()
This loader is designed to be used as a way to load data into LlamaIndex.
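Wordpress application passwords are sent over HTTP Basic authentication. The sketch below shows what that header looks like for the username and password used in the example above. The helper name `basic_auth_header` is hypothetical, and whether the loader builds the header exactly this way is an assumption; the snippet only illustrates the standard Basic scheme.

```python
import base64


def basic_auth_header(username, application_password):
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    credentials = f"{username}:{application_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("my_username", "my_password")
print(headers["Authorization"].split()[0])  # the scheme name
```

Because Basic authentication only encodes the credentials and does not encrypt them, it is sensible to use an https:// base URL whenever a username and password are supplied.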
Pages and Posts
By default, the loader retrieves both Wordpress pages (static content) and posts (blog entries) from the target site. To change this, set get_pages=False or get_posts=False when initializing the WordpressReader object.
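A minimal sketch of how the two flags could map to the content types that get fetched. The function `selected_post_types` is illustrative only and is not part of the package; the names "pages" and "posts" are the standard Wordpress REST API collection names.

```python
def selected_post_types(get_pages=True, get_posts=True):
    """Return the content types to fetch for the given flags (hypothetical helper)."""
    post_types = []
    if get_pages:
        post_types.append("pages")
    if get_posts:
        post_types.append("posts")
    return post_types


both = selected_post_types()                      # defaults: pages and posts
posts_only = selected_post_types(get_pages=False)  # skip static pages
print(both, posts_only)
```

With the actual loader, the equivalent of `posts_only` would be to pass get_pages=False to WordpressReader, as described above.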
Additional Custom Post types
To load additional custom post types besides posts and pages, pass additional_post_types as a comma-separated string (e.g., additional_post_types="custom-pages,custom-posts") when initializing the WordpressReader object.
from llama_index.readers.wordpress import WordpressReader

loader = WordpressReader(
    url="https://www.mysite.com",
    username="my_username",
    password="my_password",
    additional_post_types="webinars,podcasts",
)
documents = loader.load_data()
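To make the comma-separated format concrete, the sketch below splits such a string into individual slugs and builds the REST API collection URL each one would correspond to. Both helpers (`parse_post_types`, `collection_urls`) are hypothetical, and the assumption that each slug maps to `/wp-json/wp/v2/<slug>` follows the usual Wordpress REST API layout rather than anything stated in this README.

```python
def parse_post_types(additional_post_types):
    """Split a comma-separated string into clean slugs (hypothetical helper)."""
    if not additional_post_types:
        return []
    return [
        slug.strip()
        for slug in additional_post_types.split(",")
        if slug.strip()
    ]


def collection_urls(base_url, additional_post_types=None):
    """Build one REST collection URL per custom post type (assumed URL layout)."""
    root = base_url.rstrip("/") + "/wp-json/wp/v2/"
    return [root + slug for slug in parse_post_types(additional_post_types)]


urls = collection_urls("https://www.mysite.com", "custom-pages,custom-posts")
for url in urls:
    print(url)
```

Note that a custom post type is only reachable through the REST API if it was registered on the site with `show_in_rest` enabled, so a slug that is not exposed this way would not return any content.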