
Fetch message history from Discord for LLMs


Discord Fetch

Why Discord Fetch?

Discord conversations contain valuable knowledge, but Discord’s UI makes it hard to:

  • Export community discussions for analysis
  • Archive important conversations before they’re lost
  • Process messages with LLMs or data tools
  • Search across multiple channels efficiently

Discord Fetch solves this by providing a simple interactive CLI that exports Discord messages to clean JSON files, removing all the complexity of Discord’s API.

Installation

pip install discord-fetch

Or install from source:

git clone https://github.com/hamelsmu/discord_fetch.git
cd discord_fetch
pip install -e .


Discord Bot Setup

Before using this tool, you need to create a Discord bot and obtain a token:

1. Create a Discord Application

  1. Go to the Discord Developer Portal
  2. Click “New Application” and give it a name
  3. Navigate to the “Bot” section in the sidebar
  4. Click “Add Bot”
  5. Under the “Token” section, click “Copy” to get your bot token

2. Required Bot Permissions

Your bot needs the following permissions:

  • View Channels - To see the channels in the server
  • Read Message History - To fetch historical messages
  • Read Messages/View Channels - Basic read access

3. Bot Scopes and OAuth2

When inviting your bot to a server, use these scopes:

  • bot - Basic bot permissions
  • applications.commands (optional) - If you plan to add slash commands

4. Invite the Bot to Your Server

  1. In the Discord Developer Portal, go to “OAuth2” > “URL Generator”
  2. Select the bot scope
  3. Select the required permissions listed above
  4. Copy the generated URL and open it in your browser
  5. Select the server you want to add the bot to
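The URL that the generator produces can also be assembled by hand. The sketch below assumes Discord's documented permission bit values (View Channels is bit 10, Read Message History is bit 16) and the standard OAuth2 authorize endpoint. `build_invite_url` and the client ID shown are hypothetical and are not part of discord_fetch.

```python
from urllib.parse import urlencode

# Permission bit flags as documented by Discord (assumed unchanged).
VIEW_CHANNEL = 1 << 10
READ_MESSAGE_HISTORY = 1 << 16


def build_invite_url(client_id: str, with_slash_commands: bool = False) -> str:
    """Return an OAuth2 invite URL for a read-only export bot (illustrative helper)."""
    scopes = ["bot"]
    if with_slash_commands:
        scopes.append("applications.commands")
    query = urlencode({
        "client_id": client_id,
        "permissions": VIEW_CHANNEL | READ_MESSAGE_HISTORY,
        "scope": " ".join(scopes),
    })
    return f"https://discord.com/oauth2/authorize?{query}"


if __name__ == "__main__":
    # "123456789012345678" is a placeholder application (client) ID.
    print(build_invite_url("123456789012345678"))
```

Comparing the result with the URL from the Developer Portal is a quick way to confirm that only read permissions were requested.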

5. Environment Setup

Set this environment variable to your bot token:

DISCORD_TOKEN=your_bot_token_here
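A script that wraps the tool may want to fail early when the variable is missing. The helper below is a minimal sketch; `get_discord_token` is a hypothetical name and is not part of discord_fetch.

```python
import os


def get_discord_token(env=None) -> str:
    """Return the bot token from DISCORD_TOKEN, or raise a clear error."""
    env = os.environ if env is None else env
    token = env.get("DISCORD_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "DISCORD_TOKEN is not set; export it before running discord-fetch."
        )
    return token
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.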

Getting Started in 2 Minutes

Once you’ve set up your Discord bot (see the setup section above), using Discord Fetch is simple:

# Install the tool
pip install discord-fetch

# Set your bot token
export DISCORD_TOKEN=your_bot_token_here

# Run the interactive CLI
discord-fetch

The CLI will guide you through:

  1. Selecting a Discord server from your bot’s servers
  2. Choosing channels to export (one, multiple, or all)
  3. Picking output format (single combined file or separate files)
  4. Watching progress with live progress bars

That’s it! Your Discord messages are now in clean JSON files ready for processing.


What You Can Do With Discord Fetch

🎯 For Quick Channel Exports

Use the interactive CLI when you want to:

  • Export specific channels from a Discord server
  • Archive channels before they’re deleted
  • Get data for one-time analysis

Example: Export your project’s #general and #dev channels

discord-fetch
# Select your server, choose specific channels, export to JSON

📊 For Bulk Server Exports

Perfect when you need to:

  • Archive an entire Discord server
  • Migrate community knowledge to another platform
  • Create regular backups of all channels

Example: Export all 84 channels from a large community

discord-fetch
# Select server, choose "ALL CHANNELS", save to directory

🔧 For Custom Integrations

Use the Python API when you need:

  • Automated exports on a schedule
  • Custom filtering or processing
  • Integration with other Python tools

Example: Auto-export new messages daily

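import asyncio
from datetime import datetime, timedelta
from discord_fetch.core import fetch_messages_since_date

channel_id = 1369370266899185746  # replace with your channel ID

async def main():
    # Fetch only messages from the last 24 hours
    yesterday = datetime.now() - timedelta(days=1)
    return await fetch_messages_since_date(channel_id, yesterday)

messages = asyncio.run(main())  # a bare top-level await only works in a notebook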
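For a scheduled job, a fixed 24-hour window can miss or duplicate messages when runs drift. One option is to remember when the previous export started. The sketch below is an assumption about how such a job might be organized, not something discord_fetch provides; `load_last_run`, `save_last_run`, and the state file layout are hypothetical.

```python
import json
from datetime import datetime, timedelta
from pathlib import Path


def load_last_run(state_file: Path, default_days: int = 1) -> datetime:
    """Return the time of the previous export, or a fallback window."""
    if state_file.exists():
        saved = json.loads(state_file.read_text())
        return datetime.fromisoformat(saved["last_run"])
    return datetime.now() - timedelta(days=default_days)


def save_last_run(state_file: Path, when: datetime) -> None:
    """Record when this export started so the next run resumes from here."""
    state_file.write_text(json.dumps({"last_run": when.isoformat()}))
```

The datetime returned by `load_last_run` can be passed as the second argument to `fetch_messages_since_date`, and `save_last_run` called once the export succeeds.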

See It In Action

Here’s what the interactive CLI looks like:

$ discord-fetch

Discord Channel Fetcher

Available Guilds:
1. AI Evals For Engineers & Technical PMs (84 channels)
2. LangChain Community (126 channels)
3. OpenAI DevDay (45 channels)

Select guild number [1]: 1

Channels in AI Evals For Engineers & Technical PMs:
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ #  ┃ Channel                        ┃ Category         ┃ ID                ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ 0  │ ALL CHANNELS                   │ —                │ —                 │
│ 1  │ general                        │ Text Channels    │ 136866639069682...│
│ 2  │ announcements                  │ Text Channels    │ 137615200096996...│
│ 3  │ introductions                  │ Text Channels    │ 137655886438386...│
│ 4  │ course-discussions             │ Course           │ 138234567891234...│
└────┴────────────────────────────────┴──────────────────┴───────────────────┘

Select channel number (0 for all) [0]: 0

Selected: All 84 channels

Output Options:
1. Concatenate to single file
2. Separate files in directory

Select output option [1]: 2

Enter output directory [./discord_output]: ./ai-evals-export

#general                      ⠸ ████████████████████████████████████████ 100/100 ✓ Complete
#announcements                ⠼ ████████████████████████████████████████ 100/100 ✓ Complete  
#introductions                ⠴ ████████████████████████████████████████ 100/100 ✓ Complete
#course-discussions           ⠦ ██████████████████░░░░░░░░░░░░░░░░░░░░░  47/100 Fetching...
Total                         ⠧ ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   4/84  4/84 completed

✓ Saved 84 files to ./ai-evals-export

Successfully fetched 84 out of 84 channels!
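After an export to a directory, the files can be read back for processing. The sketch below assumes each exported file has a `.json` extension; the file naming scheme is not documented here, so the helper keys results by file stem. `load_exports` is a hypothetical name.

```python
import json
from pathlib import Path


def load_exports(directory: str) -> dict:
    """Map each exported file's stem to its parsed JSON content."""
    exports = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            exports[path.stem] = json.load(fh)
    return exports
```

For the session above, `load_exports("./ai-evals-export")` would return one entry per exported file.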

Which Tool Should I Use?

Use discord-fetch (Interactive CLI) when:

✅ You want a guided experience with visual feedback
✅ You’re exporting channels for analysis or archival
✅ You need to browse and select from available channels
✅ You’re not familiar with Discord channel IDs

Use fetch_discord_msgs (Direct CLI) when:

✅ You know the exact channel ID to export
✅ You want to pipe output directly to another tool
✅ You’re scripting or automating exports
✅ You need stdout output for processing

Use the Python API when:

✅ You need custom filtering or processing logic
✅ You’re building a larger application
✅ You want to fetch only new messages since a date
✅ You need programmatic access to the data

Quick decision: If you’re unsure, start with discord-fetch - it’s the easiest way to get started!

Advanced Python API

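For developers who need programmatic access. The API functions are awaited, so snippets that use a bare await assume a notebook or the python -m asyncio REPL; in a script, wrap the calls in an async function and pass it to asyncio.run().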

Basic Channel Export

import asyncio
from discord_fetch.core import fetch_discord_msgs

channel_id = 1369370266899185746

async def main():
    # Export a single channel
    return await fetch_discord_msgs(
        channel_id,
        save_original=False,    # Don't save to file
        save_simplified=False,  # Don't save to file
        print_summary=False     # Silent operation
    )

# asyncio.run() drives the coroutine from a regular script
original, simplified = asyncio.run(main())

List All Available Channels

from discord_fetch.core import list_all_channels

# Get all channels your bot can see
channels = await list_all_channels()
for ch in channels:
    print(f"{ch['guild_name']} - #{ch['channel_name']} ({ch['channel_id']})")
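When a bot belongs to several servers, it can help to index the result by guild. The sketch below relies only on the three keys used in the loop above; `group_by_guild` is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict


def group_by_guild(channels):
    """Group channel records as {guild name: {channel name: channel ID}}."""
    grouped = defaultdict(dict)
    for ch in channels:
        grouped[ch["guild_name"]][ch["channel_name"]] = ch["channel_id"]
    return dict(grouped)


# Made-up records shaped like the dictionaries printed in the loop above.
sample = [
    {"guild_name": "Guild A", "channel_name": "general", "channel_id": 1},
    {"guild_name": "Guild A", "channel_name": "dev", "channel_id": 2},
    {"guild_name": "Guild B", "channel_name": "general", "channel_id": 3},
]
print(group_by_guild(sample)["Guild A"]["dev"])  # prints 2
```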

Fetch Only New Messages

from discord_fetch.core import fetch_messages_since_date
from datetime import datetime, timedelta

# Get messages from last 7 days
last_week = datetime.now() - timedelta(days=7)
recent = await fetch_messages_since_date(channel_id, last_week)

Find Active Channels

from discord_fetch.core import list_channels_with_new_messages

# Find channels with activity since a date
active = await list_channels_with_new_messages("01-01-2024")
for guild, data in active.items():
    for ch in data['channels_with_new_messages']:
        print(f"{ch['channel_name']}: {ch['new_message_count']} new messages")
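To decide which channels to export first, the result can be flattened and sorted by message count. This sketch uses only the keys that appear in the loop above; `rank_active_channels` is a hypothetical helper and the sample data is made up.

```python
def rank_active_channels(active, top_n=5):
    """Return (guild, channel name, count) tuples, busiest first."""
    rows = [
        (guild, ch["channel_name"], ch["new_message_count"])
        for guild, data in active.items()
        for ch in data["channels_with_new_messages"]
    ]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top_n]


# Made-up data shaped like the structure iterated over above.
sample_active = {
    "Guild A": {"channels_with_new_messages": [
        {"channel_name": "general", "new_message_count": 12},
        {"channel_name": "dev", "new_message_count": 40},
    ]},
    "Guild B": {"channels_with_new_messages": [
        {"channel_name": "help", "new_message_count": 7},
    ]},
}
print(rank_active_channels(sample_active, top_n=2))
```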

Understanding the Output Format

Discord Fetch simplifies complex Discord data into clean JSON:

Original Format (with metadata)

{
  "channel_info": {...},
  "messages": [
    {
      "id": "123456789",
      "author": {...},
      "content": "Message text",
      "timestamp": "2024-01-15T10:30:00",
      "attachments": [...],
      "reactions": [...],
      "reply_to": {...}
    }
  ],
  "threads": {...}
}
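Because the original format keeps timestamps, it can be filtered by date after the fact. The sketch below assumes timestamps are ISO 8601 strings without a timezone offset, as in the sample above; if real exports carry offsets, the bounds passed in would need to be timezone-aware too. `messages_between` is a hypothetical helper.

```python
from datetime import datetime


def messages_between(export: dict, start: datetime, end: datetime) -> list:
    """Return messages whose timestamp falls in the half-open range [start, end)."""
    selected = []
    for msg in export.get("messages", []):
        sent = datetime.fromisoformat(msg["timestamp"])
        if start <= sent < end:
            selected.append(msg)
    return selected


# Made-up export containing only the fields this helper reads.
sample_export = {
    "messages": [
        {"id": "1", "content": "Message text", "timestamp": "2024-01-15T10:30:00"},
        {"id": "2", "content": "Later message", "timestamp": "2024-02-01T08:00:00"},
    ]
}
january = messages_between(sample_export, datetime(2024, 1, 1), datetime(2024, 2, 1))
print([m["id"] for m in january])  # prints ['1']
```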

Simplified Format (for processing)

{
  "channel": "general",
  "conversations": [
    {
      "main_message": {
        "author": "alice",
        "content": "Has anyone tried the new API?"
      },
      "replies": [
        {
          "author": "bob",
          "content": "Yes! It's much faster now"
        },
        {
          "author": "charlie",
          "content": "Agreed, the response time improved by 50%"
        }
      ]
    }
  ]
}

The simplified format:

  • Groups related messages into conversations
  • Removes IDs, timestamps, and metadata
  • Reduces file size by ~75%
  • Perfect for LLMs and analysis tools
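As an illustration of feeding the simplified format to an LLM, the sketch below renders each conversation as a short plain-text thread. It reads only the keys shown in the sample above; `conversations_to_text` and the output layout are hypothetical choices, not part of discord_fetch.

```python
def conversations_to_text(export: dict) -> str:
    """Render simplified-format conversations as plain text, one thread per block."""
    blocks = []
    for convo in export.get("conversations", []):
        main = convo["main_message"]
        lines = [f"{main['author']}: {main['content']}"]
        for reply in convo.get("replies", []):
            # Indent replies so the thread structure stays visible.
            lines.append(f"  {reply['author']}: {reply['content']}")
        blocks.append("\n".join(lines))
    header = f"# {export.get('channel', 'unknown')}"
    return "\n\n".join([header, *blocks])


# Sample data trimmed from the simplified-format example above.
sample_simplified = {
    "channel": "general",
    "conversations": [
        {
            "main_message": {"author": "alice", "content": "Has anyone tried the new API?"},
            "replies": [{"author": "bob", "content": "Yes! It's much faster now"}],
        }
    ],
}
print(conversations_to_text(sample_simplified))
```

The resulting string can be placed directly in a prompt or split per thread for chunked processing.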

