Yomitoku Client is a Python library for processing SageMaker Yomitoku API outputs with format conversion and visualization capabilities.

Yomitoku Client

Quick Links

📖 English Documentation - Complete guide in English
📖 Japanese Documentation - Complete guide in Japanese
📓 Notebook Guide (English) - Step-by-step notebook tutorial in English
📓 Notebook Guide (Japanese) - Step-by-step notebook tutorial in Japanese

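Yomitoku Client is a Python library that processes the output of the SageMaker Yomitoku API and provides format conversion and visualization features. It bridges Yomitoku Pro's OCR analysis and practical data-processing workflows.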

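Key Features

SageMaker integration: Seamless processing of Yomitoku Pro OCR results
Multiple format support: Conversion to CSV, Markdown, HTML, JSON, and PDF
Searchable PDF generation: Creation of searchable PDFs with an OCR text overlay
Advanced visualization: Document layout analysis, element relationships, and confidence scores
Utility functions: Rectangle calculations, text processing, and image manipulation
Jupyter Notebook support: Ready-to-use examples and workflows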

Installation

Using pip

# Install directly from GitHub
pip install git+https://github.com/MLism-Inc/yomitoku-client.git@main

Using uv (recommended)

# Install directly from GitHub
uv add git+https://github.com/MLism-Inc/yomitoku-client.git@main

Note: If uv is not installed, you can install it with:
curl -LsSf https://astral.sh/uv/install.sh | sh
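
After installing, a quick sanity check is to confirm that Python can locate the package. The sketch below uses only the standard library. The module name `yomitoku_client` is taken from the import in the Quick Start, and `is_installed` is a hypothetical helper name, not part of the library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "yomitoku_client" is the import name used in the Quick Start.
status = "found" if is_installed("yomitoku_client") else "not found"
print(f"yomitoku_client: {status}")
```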

Quick Start

Step 1: Connect to a SageMaker endpoint

import boto3
import json
from yomitoku_client.parsers.sagemaker_parser import SageMakerParser

# Initialize the SageMaker runtime client
sagemaker_runtime = boto3.client('sagemaker-runtime')
ENDPOINT_NAME = 'your-yomitoku-endpoint'

# Initialize the parser
parser = SageMakerParser()

# Invoke the SageMaker endpoint with a document
with open('document.pdf', 'rb') as f:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType='application/pdf',  # or 'image/png', 'image/jpeg'
        Body=f.read(),
    )

# Parse the response
body_bytes = response['Body'].read()
sagemaker_result = json.loads(body_bytes)

# Convert to structured data
data = parser.parse_dict(sagemaker_result)

print(f"Number of pages: {len(data.pages)}")
print(f"Paragraphs on page 1: {len(data.pages[0].paragraphs)}")
print(f"Tables on page 1: {len(data.pages[0].tables)}")

# Access a specific page (page_index: 0 = first page)
page_index = 0  # first page
print(f"Paragraphs on the selected page: {len(data.pages[page_index].paragraphs)}")
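
The request in Step 1 can be wrapped in a small helper that chooses the ContentType from the file extension. This is a sketch, not part of Yomitoku Client: `CONTENT_TYPES`, `content_type_for`, and `invoke_document` are hypothetical names, and the only content types assumed are the three mentioned in the snippet above. The client is passed in as an argument, so the function works with `boto3.client('sagemaker-runtime')` or with a stub in tests.

```python
import json
from pathlib import Path

# Content types mentioned in Step 1; rejecting anything else is an assumption.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def content_type_for(path):
    """Map a file extension to the ContentType sent to the endpoint."""
    suffix = Path(path).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or path}")
    return CONTENT_TYPES[suffix]


def invoke_document(runtime_client, endpoint_name, path):
    """Send one document to the endpoint and return the decoded JSON body."""
    with open(path, "rb") as f:
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType=content_type_for(path),
            Body=f.read(),
        )
    # The response body is a stream; read it once and decode the JSON.
    return json.loads(response["Body"].read())
```

With a real client this would be called as `invoke_document(sagemaker_runtime, ENDPOINT_NAME, 'document.pdf')`, and the returned dictionary passed to `parser.parse_dict`.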

Step 2: Convert the data to different formats

Single-page documents (images)

# Convert to different formats (page_index: 0 = first page)
data.to_csv('output.csv', page_index=0)
data.to_html('output.html', page_index=0)
data.to_markdown('output.md', page_index=0)
data.to_json('output.json', page_index=0)

# Create a searchable PDF from an image
data.to_pdf(output_path='searchable.pdf', img='document.png')

Multi-page documents (PDF)

# Convert all pages (creates a folder structure)
data.to_csv_folder('csv_output/')
data.to_html_folder('html_output/')
data.to_markdown_folder('markdown_output/')
data.to_json_folder('json_output/')

# Create a searchable PDF (adds searchable text to an existing PDF)
data.to_pdf(output_path='enhanced.pdf', pdf='original.pdf')

# Or convert individual pages (page_index: 0 = first page, 1 = second page)
data.to_csv('page1.csv', page_index=0)  # first page
data.to_html('page2.html', page_index=1)  # second page
Table data extraction

# Export tables in various formats (page_index: 0 = first page)
data.export_tables(
    output_folder='tables/',
    output_format='csv',    # or 'html', 'json', 'text'
    page_index=0
)

# For multi-page documents (page_index omitted)
data.export_tables(
    output_folder='all_tables/',
    output_format='csv'
)

# Export only the tables from a specific page
data.export_tables(
    output_folder='page1_tables/',
    output_format='csv',
    page_index=0  # first page
)
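
Exported CSV tables can be read back with the standard library for further processing. This sketch assumes the export writes UTF-8 files ending in `.csv` somewhere under the output folder. The exact file names and encoding are not documented here, so it simply searches for them. `load_csv_tables` is a hypothetical helper.

```python
import csv
from pathlib import Path


def load_csv_tables(folder):
    """Return {relative file path: list of rows} for every CSV under folder."""
    root = Path(folder)
    tables = {}
    for path in sorted(root.rglob("*.csv")):
        # newline="" lets the csv module handle line endings itself.
        with open(path, newline="", encoding="utf-8") as f:
            tables[path.relative_to(root).as_posix()] = list(csv.reader(f))
    return tables
```

For example, `load_csv_tables('tables/')` would return one entry per exported file, each holding its rows as lists of strings.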

Step 3: Visualize the results

Single-image visualization

# Visualize the OCR text
result_img = data.pages[0].visualize(
    image_path='document.png',
    viz_type='ocr',
    output_path='ocr_visualization.png'
)

# Visualize layout details (text, tables, figures)
result_img = data.pages[0].visualize(
    image_path='document.png',
    viz_type='layout_detail',
    output_path='layout_visualization.png'
)

Batch visualization of multiple images

# Visualize the OCR results for all pages at once (saved as 0.png, 1.png, 2.png, ...)
data.export_viz_images(
    image_path='document.pdf',
    folder_path='ocr_results/',
    viz_type='ocr'
)

# Visualize the layout details for all pages at once
data.export_viz_images(
    image_path='document.pdf',
    folder_path='layout_results/',
    viz_type='layout_detail'
)

# Visualize only a specific page
data.export_viz_images(
    image_path='document.pdf',
    folder_path='page1_results/',
    viz_type='layout_detail',
    page_index=0  # first page only
)
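
Because the pages are saved as 0.png, 1.png, 2.png, and so on, a plain alphabetical listing would put 10.png before 2.png. A sketch of listing the images in page order, using only the standard library, follows. `list_page_images` is a hypothetical helper, and files whose name is not a plain number are skipped.

```python
from pathlib import Path


def list_page_images(folder):
    """Return the numbered PNG files in folder, sorted by page number."""
    numbered = [
        p for p in Path(folder).glob("*.png")
        if p.stem.isascii() and p.stem.isdigit()
    ]
    # Sort numerically so that 2.png comes before 10.png.
    return sorted(numbered, key=lambda p: int(p.stem))
```

Calling `list_page_images('ocr_results/')` after the batch export would then give the files in the same order as the document's pages.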

PDF visualization

# Visualize a specific page of a PDF
result_img = data.pages[0].visualize(
    image_path='document.pdf',
    viz_type='layout_detail',
    output_path='pdf_visualization.png',
    page_index=0  # page to visualize
)

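Supported formats

CSV: Tabular data export with proper cell handling
Markdown: Structured document format including tables and headings
HTML: Web-ready format with appropriate styling
JSON: Structured data export containing the complete document structure
PDF: Searchable PDF generation with an OCR text overlay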
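
One way to pick the converter from the output file name is a small lookup table. The method names below are the ones used in Step 2. `CONVERTERS` and `convert_page` are hypothetical and not part of the library. PDF is left out because `to_pdf` takes the source image or PDF rather than a page index.

```python
from pathlib import Path

# Extension -> method name, using the per-page converters shown in Step 2.
CONVERTERS = {
    ".csv": "to_csv",
    ".html": "to_html",
    ".md": "to_markdown",
    ".json": "to_json",
}


def convert_page(data, output_path, page_index=0):
    """Call the converter matching output_path's extension; return its name."""
    suffix = Path(output_path).suffix.lower()
    try:
        method_name = CONVERTERS[suffix]
    except KeyError:
        raise ValueError(f"No converter for {suffix!r}") from None
    getattr(data, method_name)(output_path, page_index=page_index)
    return method_name
```

With the parsed `data` object from Step 1, `convert_page(data, 'report.md')` would dispatch to `data.to_markdown('report.md', page_index=0)`.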

License

Apache License 2.0 - See the LICENSE file for details.

Contact

For questions or support: support-aws-marketplace@mlism.com

Release history

0.2.0 - Mar 2, 2026
0.1.1 - Nov 9, 2025
0.0.2 - Nov 7, 2025
0.0.1 - Oct 13, 2025 (this version)
