Extensible Next.js/DeepWiki content extractor with zero external dependencies

deepwiki-to-md

English README. The Japanese version is in README_JP.md.

Zero-dependency CLI and Python library to extract Markdown from Next.js/DeepWiki HTML. Includes a small search helper for public repository indexes and an optional chat helper.

CLI: deepwiki-to-md
Requirements: Python 3.8+
Dependencies: Standard library only (optional extras for dev/docs)

Install

pip install deepwiki-to-md

Usage

From local HTML/string (CLI and Python):

# CLI
echo "<html>...</html>" | deepwiki-to-md

# Python API
from deepwiki_to_md import ContentExtractor

html = """
<!doctype html>
<html>...</html>
"""

extractor = ContentExtractor()
md = extractor.extract_from_html(html)
print(md)

From URL (files are saved only when the input is a URL):

# CLI
# Files under .deepwiki are created only for URL input
deepwiki-to-md https://deepwiki.com/microsoft/vscode/some-page --path ./.deepwiki

# Python API (same behavior as the CLI)
from deepwiki_to_md import ContentExtractor, save_markdown_to_library

url = "https://deepwiki.com/microsoft/vscode/some-page"
base_dir = "./.deepwiki"  # equivalent to --path (optional)

extractor = ContentExtractor()
md = extractor.extract_from_url(url)

result = save_markdown_to_library(md, url, base_dir)
print("saved files:")
for p in result["saved_files"]:
    print(" -", p)
print("library index:", result["library_file"])  # .deepwiki/<username>/<library>.md
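
The comment above suggests the library index is written to .deepwiki/<username>/<library>.md. As a rough illustration of that layout, the sketch below derives such a path from a DeepWiki URL using only the standard library. The helper name library_index_path is hypothetical and not part of deepwiki_to_md, and it assumes the first two URL path segments are the username and library; the package's own logic may differ.

```python
from pathlib import Path
from urllib.parse import urlparse


def library_index_path(url: str, base_dir: str = "./.deepwiki") -> Path:
    """Guess the index file path for a DeepWiki URL (hypothetical helper)."""
    # Assumed URL shape: https://deepwiki.com/<username>/<library>/<page...>
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected /<username>/<library> in URL path: {url!r}")
    username, library = segments[0], segments[1]
    # Mirrors the documented layout: <base_dir>/<username>/<library>.md
    return Path(base_dir) / username / f"{library}.md"


print(library_index_path("https://deepwiki.com/microsoft/vscode/some-page"))
```

Treat the return value of save_markdown_to_library as authoritative; this sketch only helps to predict where files are likely to land.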

Search public repository indexes:

# CLI (JSON by default)
deepwiki-to-md --search "Gemini"

# Human-readable development-log style
deepwiki-to-md --search "Gemini" --devlog

# Python API (same search capability)
from search_repository import search_repositories, API_URL

print(API_URL)  # => https://api.devin.ai/ada/list_public_indexes
result = search_repositories("Gemini")
indices = result.get("indices", [])
print("indices:", len(indices))

Chat with Devin API (via CLI):

# Positional argument must be a DeepWiki URL
# JSON output by default
deepwiki-to-md https://deepwiki.com/microsoft/vscode --chat "What is the purpose of this repository?"

# Human-readable output for development logs
deepwiki-to-md https://deepwiki.com/microsoft/vscode --chat "Summarize top features" --devlog

Options for chat via deepwiki-to-md:

--chat MESSAGE: Message to send. Requires a DeepWiki URL as the positional input.
--deep-research: Enable deep research mode for chat.
--config-file PATH: Path to chat config JSON (default: ./config.json). The file must exist and contain complete settings.
--devlog: When used with --chat, prints a human-readable response body and reference files.
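
Because --config-file requires a file that exists and contains complete settings, it can help to check a config before invoking the CLI. The sketch below is one way to do that with the standard library. The function name load_complete_config is hypothetical, and the required key names are left to the caller because this README does not list them.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable


def load_complete_config(path: str, required_keys: Iterable[str]) -> Dict[str, Any]:
    """Load a JSON config and fail early if it is missing or incomplete."""
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"chat config not found: {config_path}")
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    if not isinstance(config, dict):
        raise ValueError("chat config must be a JSON object")
    # required_keys is an assumption: pass whatever keys your config must have.
    missing = [key for key in required_keys if key not in config]
    if missing:
        raise ValueError(f"chat config is missing keys: {', '.join(missing)}")
    return config
```

A call such as load_complete_config("./config.json", ["example_key"]) raises before any chat request is attempted; "example_key" is a placeholder, not a documented setting.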

License

MIT License

More documentation

Library reference (includes both Python API and CLI examples): deepwiki_to_md.md

Chat (Devin API) result object: ChatResult

The chat helper (src/chat.py) returns a ChatResult object instead of a plain dict.

Highlights

- Inherits from dict → works with json.dumps(result) directly.
- Convenient attribute access (e.g., result.response_message) and to_dict().
- print(result) shows a human-readable summary.

Main properties

- sent_message: str
- response_message: Optional[str]
- status_code: Any
- reference_files: List[str]
- reference_file_contents: Dict[str, str]

Example (excerpt)

import asyncio
import json
from chat import load_or_create_config, send_chat_message, ChatResult

async def main() -> None:

    result: ChatResult = await send_chat_message(
        wiki_url='https://deepwiki.com/microsoft/vscode',
        message='What is the purpose of this repository?',
        use_deep_research=False,
    )

    print(result)  # human-readable summary via __str__
    print(result.response_message)  # attribute access
    print(json.dumps(result, indent=2, ensure_ascii=False))  # still a dict

if __name__ == '__main__':
    asyncio.run(main())
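
To show how a result object can be a dict and still offer attribute access and a readable summary, here is a minimal standalone sketch of the pattern. ResultSketch is a hypothetical class written for illustration. It is not the package's ChatResult, and its summary format is invented.

```python
import json
from typing import Any, Dict, List, Optional


class ResultSketch(dict):
    """Dict subclass with attribute-style accessors (not the real ChatResult)."""

    @property
    def sent_message(self) -> str:
        return self.get("sent_message", "")

    @property
    def response_message(self) -> Optional[str]:
        return self.get("response_message")

    @property
    def reference_files(self) -> List[str]:
        return self.get("reference_files", [])

    def to_dict(self) -> Dict[str, Any]:
        # Return a plain dict copy, detached from the subclass.
        return dict(self)

    def __str__(self) -> str:
        # Summary layout is an assumption made for this sketch.
        return (
            f"sent: {self.sent_message}\n"
            f"response: {self.response_message}\n"
            f"references: {len(self.reference_files)} file(s)"
        )


result = ResultSketch(
    sent_message="hello",
    response_message="hi there",
    reference_files=["README.md"],
)
print(result)                   # summary via __str__
print(result.response_message)  # attribute access
print(json.dumps(result))       # serializes like any dict
```

Subclassing dict keeps the object compatible with code that expects a mapping, while the properties give callers named access to the common fields.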

Arguments for chat.py:

--url: URL of the chat interface.
--message: Message to send.
--selector: CSS selector for the chat input (default: textarea).
--button: CSS selector for the submit button (default: button).
--wait: Time to wait for response in seconds (default: 30).
--debug: Enable debug mode.
--output: Output directory (default: ChatResponses).
--deep: Enable "Deep Research" mode (specific to some interfaces).
--headless: Run browser in headless mode.
--format: Output format(s): html, md, yaml, or comma-separated list (default: html).

Note: The chat scraper uses Selenium, which requires a compatible browser installed.
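
The --format option above accepts a single format or a comma-separated list. One way such a value could be parsed and validated with argparse is sketched below. The parser, the parse_formats helper, and the program name are hypothetical and are not taken from chat.py; only the format names and the html default come from the list above.

```python
import argparse
from typing import List

ALLOWED_FORMATS = ("html", "md", "yaml")


def parse_formats(value: str) -> List[str]:
    """Split a comma-separated format list and reject unknown names."""
    formats = [item.strip().lower() for item in value.split(",") if item.strip()]
    if not formats:
        raise argparse.ArgumentTypeError("no output format given")
    unknown = [f for f in formats if f not in ALLOWED_FORMATS]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown format(s): {', '.join(unknown)}")
    return formats


# Hypothetical parser used only to exercise parse_formats.
parser = argparse.ArgumentParser(prog="chat-sketch")
parser.add_argument("--format", type=parse_formats, default=["html"])

args = parser.parse_args(["--format", "md, yaml"])
print(args.format)
```

Raising argparse.ArgumentTypeError lets argparse report a bad value as a normal usage error instead of a traceback.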

License

This project is licensed under the MIT License - see the LICENSE file for details.

Release history

- 2.0.3 (this version): Oct 1, 2025
- 2.0.2: Sep 30, 2025
- 2.0.1: Sep 30, 2025
- 2.0.0: Sep 30, 2025
- 0.3.2: May 6, 2025
- 0.3.1: May 6, 2025
- 0.3.0: May 5, 2025
- 0.2.0: May 5, 2025
- 0.1.0: May 2, 2025

Download files

Source Distribution

deepwiki_to_md-2.0.3.tar.gz (26.3 kB)

Uploaded Oct 1, 2025 Source

Built Distribution

deepwiki_to_md-2.0.3-py3-none-any.whl (25.4 kB)

Uploaded Oct 1, 2025 Python 3

File details

Details for the file deepwiki_to_md-2.0.3.tar.gz.

File metadata

Download URL: deepwiki_to_md-2.0.3.tar.gz
Upload date: Oct 1, 2025
Size: 26.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for deepwiki_to_md-2.0.3.tar.gz
Algorithm	Hash digest
SHA256	`28a8b1e018e25db30b40bda27a08eb1a1999695c0fb596b9c7d396873a856235`
MD5	`d5369fe8188da8b51333f6dfc756d497`
BLAKE2b-256	`3a8ca046247d039a8e467ad56ab7e294f53d1188535aed3a1b7953349ec93a89`

File details

Details for the file deepwiki_to_md-2.0.3-py3-none-any.whl.

File metadata

Download URL: deepwiki_to_md-2.0.3-py3-none-any.whl
Upload date: Oct 1, 2025
Size: 25.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for deepwiki_to_md-2.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58efeea773474eda2ca020d0663ee6a67c269bcf40b6b8c7879e3aaf3c3adcf1`
MD5	`c77ea4bd439dd1e190e7986a4081595a`
BLAKE2b-256	`b34855bee017238cbc496f31b89df2e7e69d353a60b0f555de63d479b2efcea2`
