llama-index readers reddit integration

Reddit Reader

For any subreddit(s) you're interested in, search for relevant posts using keyword(s) and load the text of the resulting posts and their top-level comments into LLMs/LangChains.

Get your Reddit credentials ready

  1. Visit Reddit App Preferences (https://www.reddit.com/prefs/apps) or https://old.reddit.com/prefs/apps/
  2. Scroll to the bottom and click "create another app..."
  3. Fill out the name, description, and redirect url for your app, then click "create app"
  4. You should now see the personal use script, secret, and name of your app. Store those as the environment variables REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, and REDDIT_USER_AGENT, respectively.
  5. Additionally store the environment variables REDDIT_USERNAME and REDDIT_PASSWORD, which correspond to the credentials for your Reddit account.
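
Before running the loader, it can help to confirm that all five variables from the steps above are set. The following is a minimal sketch and not part of the package; the helper name `missing_reddit_credentials` is hypothetical.

```python
import os

# Environment variable names taken from the steps above.
REQUIRED_VARS = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
)


def missing_reddit_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; it is not provided by the reader.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_reddit_credentials()
    if missing:
        print("Missing Reddit credentials:", ", ".join(missing))
    else:
        print("All Reddit credentials are set.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.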

Usage

LlamaIndex

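# Assumes the llama-index >= 0.10 package layout that this integration is published for.
from llama_index.core import VectorStoreIndex
from llama_index.readers.reddit import RedditReader

subreddits = ["MachineLearning"]
search_keys = ["PyTorch", "deploy"]
post_limit = 10

# Credentials are taken from the REDDIT_* environment variables set above.
loader = RedditReader()
documents = loader.load_data(
    subreddits=subreddits, search_keys=search_keys, post_limit=post_limit
)
index = VectorStoreIndex.from_documents(documents)

# Indexes are queried through a query engine rather than index.query().
response = index.as_query_engine().query("What are the pain points of PyTorch users?")
print(response)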
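
The loader takes a list of subreddits, a list of keywords, and a post limit. As a rough sketch of how these parameters might combine, assuming (the description above does not state this) that one search is issued per subreddit/keyword pair and that `post_limit` caps each search, the pairs and an upper bound on fetched posts can be enumerated as follows. The names `search_plan` and `max_posts` are hypothetical.

```python
from itertools import product


def search_plan(subreddits, search_keys):
    """List every (subreddit, keyword) pair, subreddit-major order."""
    return list(product(subreddits, search_keys))


def max_posts(subreddits, search_keys, post_limit):
    """Upper bound on posts, ASSUMING post_limit applies to each pair."""
    return len(search_plan(subreddits, search_keys)) * post_limit


plan = search_plan(["MachineLearning"], ["PyTorch", "deploy"])
print(plan)  # [('MachineLearning', 'PyTorch'), ('MachineLearning', 'deploy')]
print(max_posts(["MachineLearning"], ["PyTorch", "deploy"], 10))  # 20
```

Under that assumption, adding keywords or subreddits multiplies the number of searches, so a small `post_limit` keeps indexing cost manageable.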

LangChain

# Assumes the llama-index >= 0.10 package layout that this integration is published for.
from llama_index.core import VectorStoreIndex
from llama_index.readers.reddit import RedditReader

from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory

subreddits = ["MachineLearning"]
search_keys = ["PyTorch", "deploy"]
post_limit = 10

loader = RedditReader()
documents = loader.load_data(
    subreddits=subreddits, search_keys=search_keys, post_limit=post_limit
)
index = VectorStoreIndex.from_documents(documents)

# Indexes are queried through a query engine rather than index.query().
query_engine = index.as_query_engine()

tools = [
    Tool(
        name="Reddit Index",
        # Return plain text so the agent can read the answer.
        func=lambda q: str(query_engine.query(q)),
        description="Useful when you want to read relevant posts and top-level comments in subreddits.",
    ),
]
llm = OpenAI(temperature=0)
memory = ConversationBufferMemory(memory_key="chat_history")
agent_chain = initialize_agent(
    tools, llm, agent="zero-shot-react-description", memory=memory
)

output = agent_chain.run(input="What are the pain points of PyTorch users?")
print(output)

This loader is designed to load data into LlamaIndex (formerly GPT Index) and/or to be used subsequently as a Tool in a LangChain Agent.

Download files

Source Distribution

llama_index_readers_reddit-0.1.0.tar.gz (3.1 kB)

Algorithm    Hash digest
SHA256       2e7a67dfe7aeb4ac6a0b792b25e7b690d59997cc6c9d2ea2d354f063756ca074
MD5          0aaacdd4638ec59b52c4156ab4d4d4cd
BLAKE2b-256  04d90f7ecaffc62e5c3e52994de4b1db963285e750108fd4b1ab0827c272e66f

Built Distribution

llama_index_readers_reddit-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       d7fef2718ff60b4e1d3263a612e69e5d678a001bab5cc3bc73ee1cdb613af983
MD5          491a2470ae391f416e8ae8bb6104c2a4
BLAKE2b-256  e93fd524c5f86f73f8af6f6236b57c61219aa422dfb3a103b156688100ab1955
