
llama-index readers reddit integration


Reddit Reader

For any subreddit(s) you're interested in, search for relevant posts using keyword(s) and load the text of the matching posts and their top-level comments for use with LLMs or LangChain.

Get your Reddit credentials ready

  1. Visit Reddit App Preferences (https://www.reddit.com/prefs/apps) or https://old.reddit.com/prefs/apps/
  2. Scroll to the bottom and click "create another app..."
  3. Fill out the name, description, and redirect url for your app, then click "create app"
  4. You should now see the personal use script, secret, and name of your app. Store these as the environment variables REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, and REDDIT_USER_AGENT, respectively.
  5. Additionally, store the environment variables REDDIT_USERNAME and REDDIT_PASSWORD, which correspond to the credentials for your Reddit account.
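
Before running the loader, it can help to confirm that all five variables from the steps above are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_reddit_vars` is hypothetical and not part of this package.

```python
import os

# Variable names taken from the setup steps above.
REQUIRED_VARS = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
)


def missing_reddit_vars(env=None):
    """Return the required variable names that are unset or empty in env.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_reddit_vars()
    if missing:
        print("Missing Reddit settings:", ", ".join(missing))
    else:
        print("All Reddit settings are present.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to try without touching your real environment.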

Usage

LlamaIndex

# Requires the llama-index-core and llama-index-readers-reddit packages.
from llama_index.core import VectorStoreIndex
from llama_index.readers.reddit import RedditReader

subreddits = ["MachineLearning"]
search_keys = ["PyTorch", "deploy"]
post_limit = 10

loader = RedditReader()
documents = loader.load_data(
    subreddits=subreddits, search_keys=search_keys, post_limit=post_limit
)
index = VectorStoreIndex.from_documents(documents)

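# Query the index through a query engine.
query_engine = index.as_query_engine()
response = query_engine.query("What are the pain points of PyTorch users?")
print(response)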
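
To make the roles of `subreddits`, `search_keys`, and `post_limit` more concrete, here is a rough sketch of the kind of loop a loader like this might run. It is not the package's actual implementation. The function name `collect_texts` is hypothetical, and the `reddit` argument is assumed to be a PRAW-style client whose `subreddit(name).search(query, limit=...)` yields posts with `selftext` and `comments` attributes. In this sketch the limit applies to each subreddit and keyword pair separately.

```python
def collect_texts(reddit, subreddits, search_keys, post_limit=10):
    """Gather post bodies and top-level comment bodies as plain strings.

    `reddit` is an assumed PRAW-style client (see the note above); any
    object with the same attributes works, which keeps this easy to test.
    """
    texts = []
    for name in subreddits:
        subreddit = reddit.subreddit(name)
        for key in search_keys:
            # One search per (subreddit, keyword) pair, capped at post_limit.
            for post in subreddit.search(key, limit=post_limit):
                texts.append(post.selftext)
                for comment in post.comments:
                    # Skip entries without a body, assumed here to be
                    # "load more comments" placeholders.
                    body = getattr(comment, "body", None)
                    if body is None:
                        continue
                    texts.append(body)
    return texts
```

Each returned string would then correspond to one document handed to the index, so a post with many top-level comments contributes many small documents rather than one large one.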

LangChain

# Requires the llama-index-core and llama-index-readers-reddit packages.
from llama_index.core import VectorStoreIndex
from llama_index.readers.reddit import RedditReader

from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory

subreddits = ["MachineLearning"]
search_keys = ["PyTorch", "deploy"]
post_limit = 10

loader = RedditReader()
documents = loader.load_data(
    subreddits=subreddits, search_keys=search_keys, post_limit=post_limit
)
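index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

tools = [
    Tool(
        name="Reddit Index",
        # Tools are expected to return a string, so convert the response object.
        func=lambda q: str(query_engine.query(q)),
        description="Useful when you want to read relevant posts and top-level comments in subreddits.",
    ),
]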
llm = OpenAI(temperature=0)
memory = ConversationBufferMemory(memory_key="chat_history")
agent_chain = initialize_agent(
    tools, llm, agent="zero-shot-react-description", memory=memory
)

output = agent_chain.run(input="What are the pain points of PyTorch users?")
print(output)

This loader is designed to load data into LlamaIndex (formerly GPT Index) and/or to be used subsequently as a Tool in a LangChain Agent.
