
Extensible Next.js/DeepWiki content extractor with zero external dependencies

Project description

deepwiki-to-md

A zero-dependency CLI tool that extracts Markdown text from HTML produced by Next.js/DeepWiki.

  • CLI: deepwiki-to-md
  • Requirements: Python 3.7+
  • Dependencies: standard library only (optional features are provided as extras)

Installation

pip install deepwiki-to-md

Usage

  • From local HTML or a string (both CLI and Python):
# CLI
echo "<html>...</html>" | deepwiki-to-md
# Python API
from deepwiki_to_md import ContentExtractor

html = """
<!doctype html>
<html>...</html>
"""

extractor = ContentExtractor()
md = extractor.extract_from_html(html)
print(md)
  • From a URL (files are saved only for URL input):
# CLI
# Only when the input is a URL, the output is split and saved under .deepwiki
deepwiki-to-md https://deepwiki.com/microsoft/vscode/some-page --path ./.deepwiki
# Python API (same behavior as the CLI)
from deepwiki_to_md import ContentExtractor, save_markdown_to_library

url = "https://deepwiki.com/microsoft/vscode/some-page"
base_dir = "./.deepwiki"  # equivalent to --path (optional)

extractor = ContentExtractor()
md = extractor.extract_from_url(url)

result = save_markdown_to_library(md, url, base_dir)
print("saved files:")
for p in result["saved_files"]:
    print(" -", p)
print("library index:", result["library_file"])  # .deepwiki/<username>/<library>.md
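The comment on the last line above gives the index location as `.deepwiki/<username>/<library>.md`. As a rough sketch of how that path could be predicted from a URL, the helper below assumes that `<username>` and `<library>` are the first two path segments of the DeepWiki URL. That mapping is an assumption made for illustration, and `expected_index_path` is a hypothetical helper, not part of deepwiki-to-md.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def expected_index_path(url: str, base_dir: str = "./.deepwiki") -> str:
    """Guess the library index path for a DeepWiki URL.

    Assumption (not confirmed by the package): <username> and <library>
    are the first two segments of the URL path.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected /<username>/<library>/... in {url!r}")
    username, library = segments[0], segments[1]
    # PurePosixPath normalizes the leading "./" of the default base_dir.
    return str(PurePosixPath(base_dir) / username / f"{library}.md")


print(expected_index_path("https://deepwiki.com/microsoft/vscode/some-page"))
```

Comparing this guess with `result["library_file"]` from the real call is a quick way to check whether the assumed mapping holds for a given URL.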
  • Search (public repository indexes):
# CLI (JSON output by default)
deepwiki-to-md --search "Gemini"

# Human-readable devlog format
deepwiki-to-md --search "Gemini" --devlog
# Python API (same search functionality as the CLI)
from deepwiki_to_md import search_repositories, API_URL

print(API_URL)  # => https://api.devin.ai/ada/list_public_indexes
result = search_repositories("Gemini")
indices = result.get("indices", [])
print("indices:", len(indices))
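
Because the CLI prints JSON by default for `--search`, it can also be driven from a script. The following is a minimal sketch, assuming the `deepwiki-to-md` executable is on `PATH` and that its standard output contains only the JSON document. `search_via_cli` and its `runner` parameter are hypothetical names introduced here; only the `--search` flag and the `indices` key come from the examples above.

```python
import json
import subprocess
from typing import Any, Callable, Dict, List


def search_via_cli(
    query: str,
    runner: Callable[..., Any] = subprocess.run,
) -> List[Any]:
    """Run `deepwiki-to-md --search <query>` and return the "indices" list."""
    completed = runner(
        ["deepwiki-to-md", "--search", query],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    payload: Dict[str, Any] = json.loads(completed.stdout)
    # Fall back to an empty list if the key is missing, as in the API example.
    return payload.get("indices", [])
```

Calling `search_via_cli("Gemini")` would then return the same list that the Python API example reads from `result.get("indices", [])`. The `runner` argument exists only so that the function can be exercised without the executable installed.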

License

MIT License

Detailed documentation

Chat (Devin API) result object: ChatResult

The send_chat_message function in the chat helper (src/chat.py) returns a ChatResult object rather than a plain dict.

  • Features

    • It subclasses dict, so json.dumps(result) works as-is.
    • It provides convenient attribute access (result.response_message and so on) and a to_dict() method.
    • print(result) shows a human-readable summary.
  • Main properties

    • sent_message: the message that was sent (str)
    • response_message: the response body (Optional[str])
    • status_code: the status code (Any)
    • reference_files: list of referenced files (List[str])
    • reference_file_contents: contents of the referenced files (Dict[str, str])
  • Example (abridged excerpt)

import asyncio
import json
from chat import load_or_create_config, send_chat_message, ChatResult

async def main() -> None:
    config = load_or_create_config('./config.json')
    if not config:
        raise SystemExit('config missing')
    result: ChatResult = await send_chat_message(
        wiki_url='https://deepwiki.com/microsoft/vscode',
        message='What is the purpose of this repository?',
        config=config,
        use_deep_research=False,
    )

    print(result)  # summary produced by __str__
    print(result.response_message)  # property access
    print(json.dumps(result, indent=2, ensure_ascii=False))  # serializes directly because ChatResult subclasses dict

if __name__ == '__main__':
    asyncio.run(main())
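
To illustrate why a dict subclass can offer attribute access and still serialize with `json.dumps`, here is a minimal sketch. `ChatResultSketch` is a hypothetical stand-in written for this example, not the class in src/chat.py. Only the key names are taken from the property list above; the default values and the summary format of `__str__` are invented.

```python
import json
from typing import Any, Dict, List, Optional


class ChatResultSketch(dict):
    """Hypothetical dict subclass with read-only attribute-style accessors."""

    @property
    def sent_message(self) -> str:
        return self.get("sent_message", "")

    @property
    def response_message(self) -> Optional[str]:
        return self.get("response_message")

    @property
    def reference_files(self) -> List[str]:
        return self.get("reference_files", [])

    def to_dict(self) -> Dict[str, Any]:
        # Shallow copy as a plain dict.
        return dict(self)

    def __str__(self) -> str:
        # Invented summary format, for illustration only.
        status = self.get("status_code")
        return f"ChatResultSketch(status={status}, files={len(self.reference_files)})"


result = ChatResultSketch(
    sent_message="hello",
    response_message="hi there",
    status_code=200,
    reference_files=["README.md"],
)
print(result)                   # uses the overridden __str__
print(result.response_message)  # property backed by the dict key
print(json.dumps(result))       # json accepts dict subclasses directly
```

Storing the data in the dict itself and exposing properties on top keeps a single source of truth: the JSON output and the attribute values cannot drift apart.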
