llama-index readers reddit integration
Project description
Reddit Reader
pip install llama-index-readers-reddit
For any subreddit(s) you're interested in, search for relevant posts by keyword(s) and load the text of each matching post and its top-level comments into LlamaIndex or LangChain.
Get your Reddit credentials ready
- Visit Reddit App Preferences (https://www.reddit.com/prefs/apps) or https://old.reddit.com/prefs/apps/
- Scroll to the bottom and click "create another app..."
- Fill out the name, description, and redirect URL for your app, then click "create app"
- Now you should be able to see the personal use script, secret, and name of your app. Store those as environment variables REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, and REDDIT_USER_AGENT respectively.
- Additionally store the environment variables REDDIT_USERNAME and REDDIT_PASSWORD, which correspond to the credentials for your Reddit account.
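Before running the loader, it can help to confirm that all five variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_reddit_vars` is hypothetical and not part of this package.

```python
import os

# Variable names taken from the setup steps above.
REQUIRED_VARS = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
)


def missing_reddit_vars(env=None):
    """Return the required variable names that are unset or empty in env.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_reddit_vars()
    if missing:
        print("Missing Reddit credentials: " + ", ".join(missing))
    else:
        print("All Reddit credentials are set.")
```

Running this as a script prints the names of any variables that still need to be exported, without printing their values.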
Usage
LlamaIndex
from llama_index.core import VectorStoreIndex
from llama_index.readers.reddit import RedditReader

subreddits = ["MachineLearning"]
search_keys = ["PyTorch", "deploy"]
post_limit = 10

# Credentials are expected in the REDDIT_* environment variables set above.
loader = RedditReader()
documents = loader.load_data(
    subreddits=subreddits, search_keys=search_keys, post_limit=post_limit
)
index = VectorStoreIndex.from_documents(documents)

# A VectorStoreIndex is queried through a query engine.
query_engine = index.as_query_engine()
response = query_engine.query("What are the pain points of PyTorch users?")
print(response)
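When choosing `subreddits`, `search_keys`, and `post_limit`, it may be useful to estimate how many posts a call could fetch. The sketch below assumes, and this is not confirmed by the text above, that the loader runs one search per subreddit and keyword pair and that each search returns at most `post_limit` posts. The function `plan_searches` is hypothetical and does not call Reddit.

```python
from itertools import product


def plan_searches(subreddits, search_keys, post_limit):
    """List (subreddit, keyword) pairs and an upper bound on fetched posts.

    Assumption: one search per pair, each returning at most post_limit posts.
    """
    pairs = list(product(subreddits, search_keys))
    return pairs, len(pairs) * post_limit


pairs, max_posts = plan_searches(["MachineLearning"], ["PyTorch", "deploy"], 10)
print(pairs)      # [('MachineLearning', 'PyTorch'), ('MachineLearning', 'deploy')]
print(max_posts)  # 20
```

Under that assumption, the example parameters above would fetch at most 20 posts, plus their top-level comments.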
LangChain
from llama_index.core import VectorStoreIndex

# These import paths match older LangChain releases; newer versions may have moved them.
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory

from llama_index.readers.reddit import RedditReader

subreddits = ["MachineLearning"]
search_keys = ["PyTorch", "deploy"]
post_limit = 10

loader = RedditReader()
documents = loader.load_data(
    subreddits=subreddits, search_keys=search_keys, post_limit=post_limit
)
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Expose the index to the agent as a tool that returns plain text.
tools = [
    Tool(
        name="Reddit Index",
        func=lambda q: str(query_engine.query(q)),
        description="Useful when you want to read relevant posts and top-level comments in subreddits.",
    ),
]
llm = OpenAI(temperature=0)
memory = ConversationBufferMemory(memory_key="chat_history")
agent_chain = initialize_agent(
    tools, llm, agent="zero-shot-react-description", memory=memory
)

output = agent_chain.run(input="What are the pain points of PyTorch users?")
print(output)
This loader is designed to load data into LlamaIndex (formerly GPT Index).
Download files
File details
Details for the file llama_index_readers_reddit-0.2.0.tar.gz.
File metadata
- Download URL: llama_index_readers_reddit-0.2.0.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.10.13 Darwin/23.6.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5e447ba8a91ee7764ac8045f5dff9e0e1e99904316f1eb404ce127f310c4428e
MD5 | 4f4efc1335667057f923b1ad4d9726e6
BLAKE2b-256 | edf8795094f5cfcb3e941331c62227ee01d3442021ced92f9c92d04265fd2c39
File details
Details for the file llama_index_readers_reddit-0.2.0-py3-none-any.whl.
File metadata
- Download URL: llama_index_readers_reddit-0.2.0-py3-none-any.whl
- Upload date:
- Size: 3.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.10.13 Darwin/23.6.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | a09b2d41dbf1b3e11846c6ae1a8eb13a381ab63e379f25ee4eef573ce76d53ee
MD5 | d6617751dedf1b6a2affa65ac73f108c
BLAKE2b-256 | 092f77ce6c3c71c86bddc134c3a94748d3f1141276a2001cfb4fc8b3801ed959