Official Python SDK for the OCR Facture France API - Automatic extraction of invoice data via OCR
OCR Facture API - Python SDK
Official Python SDK for the OCR Facture France API. It makes it easier to integrate automatic invoice data extraction into your Python applications.
🚀 Installation
pip install ocr-facture-api
📖 Usage
Basic configuration
from ocr_facture_api import OCRFactureAPI

# Initialize the client
api = OCRFactureAPI(
    api_key="your_rapidapi_key",
    base_url="https://ocr-facture-api-production.up.railway.app",  # Optional
)
Extracting data from an invoice
From a file
# From a local file
result = api.extract_from_file("facture.pdf", language="fra")

# Access the extracted data
invoice_data = result["extracted_data"]
print(f"Invoice number: {invoice_data['invoice_number']}")
print(f"Total incl. VAT: {invoice_data['total_ttc']}€")
print(f"Date: {invoice_data['date']}")

# Confidence scores
confidence = result["confidence_scores"]
print(f"Invoice number confidence: {confidence['invoice_number']}")
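Confidence scores are most useful when they decide which fields need a human review. The sketch below flags low-confidence fields in a response shaped like the one above. It assumes scores are floats between 0 and 1 (the scale is not stated here), and `low_confidence_fields` with its 0.8 threshold is an illustrative helper, not part of the SDK.

```python
def low_confidence_fields(result, threshold=0.8):
    """Return the extracted fields whose confidence is below `threshold`.

    `result` is a dict with "extracted_data" and "confidence_scores" keys.
    Fields that have no score are treated as confidence 0.0 (an assumption).
    """
    data = result.get("extracted_data", {})
    scores = result.get("confidence_scores", {})
    return sorted(
        field for field in data
        if scores.get(field, 0.0) < threshold
    )


# Hand-written sample response, not real API output
sample = {
    "extracted_data": {
        "invoice_number": "F-001",
        "total_ttc": 120.0,
        "date": "2024-01-15",
    },
    "confidence_scores": {"invoice_number": 0.97, "total_ttc": 0.62},
}
print(low_confidence_fields(sample))
```

Here `date` is flagged because it has no score at all, and `total_ttc` because its score falls below the threshold.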
From a base64 image
import base64

# Read the image and encode it as a base64 string
with open("facture.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

result = api.extract_from_base64(image_data, language="fra")
Batch processing
# Process several invoices in a single request
files = ["facture1.pdf", "facture2.pdf", "facture3.pdf"]
batch_result = api.batch_extract(files, language="fra")

# Iterate over the results
for i, result in enumerate(batch_result["results"]):
    if result["success"]:
        print(f"Invoice {i+1}: {result['extracted_data']['invoice_number']}")
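The API reference in this README lists a limit of 10 files per `batch_extract` call, so longer lists have to be split first. A minimal sketch of that split follows; `chunked` is a hypothetical helper written for illustration and the file names are placeholders.

```python
def chunked(items, size=10):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 23 placeholder paths: more than one batch can hold
files = [f"facture{i}.pdf" for i in range(1, 24)]
batches = list(chunked(files))
print([len(batch) for batch in batches])
```

Each batch could then be passed to `api.batch_extract(batch, language="fra")` in turn.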
French compliance validation
# Extract and validate in one call
result = api.extract_from_file("facture.pdf", check_compliance=True)

# Or validate separately
invoice_data = result["extracted_data"]
compliance = api.check_compliance(invoice_data)
if compliance["compliance"]["compliant"]:
    print("✅ Invoice is compliant")
else:
    print(f"❌ Missing fields: {compliance['compliance']['missing_fields']}")
Factur-X generation
# Extract the data
result = api.extract_from_file("facture.pdf")
invoice_data = result["extracted_data"]

# Generate the Factur-X XML
facturx_result = api.generate_facturx(invoice_data)
xml_content = facturx_result["xml"]

# Save the XML
with open("facture_facturx.xml", "w", encoding="utf-8") as f:
    f.write(xml_content)
VAT validation
# `result` comes from an earlier extract_from_file call
invoice_data = result["extracted_data"]
vat_validation = api.validate_vat(invoice_data)
if vat_validation["validation"]["valid"]:
    print("✅ VAT is valid")
else:
    print(f"❌ Errors: {vat_validation['validation']['errors']}")
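What `validate_vat` checks on the server side is not described here. As a local sanity check before or after calling it, the arithmetic relation between the pre-tax total, the VAT rate, and the total including VAT can be verified with `Decimal`. This is a sketch only: `vat_amounts_consistent` and its parameters are illustrative names, not SDK fields.

```python
from decimal import Decimal, ROUND_HALF_UP


def vat_amounts_consistent(total_ht, vat_rate, total_ttc, tolerance="0.01"):
    """Check that total_ht * (1 + vat_rate / 100) matches total_ttc.

    Amounts are converted through str() so floats do not leak binary
    rounding errors. `vat_rate` is a percentage, e.g. "20".
    """
    ht = Decimal(str(total_ht))
    rate = Decimal(str(vat_rate)) / Decimal("100")
    expected = (ht * (Decimal("1") + rate)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return abs(expected - Decimal(str(total_ttc))) <= Decimal(tolerance)


print(vat_amounts_consistent("100.00", "20", "120.00"))
print(vat_amounts_consistent("100.00", "20", "125.00"))
```

The tolerance of one cent absorbs rounding differences between line-level and invoice-level VAT calculation.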
SIRET enrichment
# Enrich with the Sirene API
enrichment = api.enrich_siret("12345678901234")
print(f"Company name: {enrichment['enrichment']['uniteLegale']['denominationUniteLegale']}")
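A SIRET is a 14-digit number whose digits satisfy the Luhn checksum, so obviously malformed values can be rejected locally before spending an API call on them. The helper below is a sketch and not part of the SDK; `is_valid_siret` is a hypothetical name, and it does not handle special cases such as La Poste establishments.

```python
def is_valid_siret(siret):
    """Return True if `siret` is 14 digits and passes the Luhn checksum."""
    digits = siret.replace(" ", "")
    if len(digits) != 14 or not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


print(is_valid_siret("73282932000074"))
print(is_valid_siret("12345678901234"))
```

The placeholder `12345678901234` used in the snippet above fails this check, which is expected for a made-up number.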
Error handling
from ocr_facture_api import (
    OCRFactureAPIError,
    OCRFactureAuthError,
    OCRFactureRateLimitError,
    OCRFactureValidationError,
)

try:
    result = api.extract_from_file("facture.pdf")
except OCRFactureAuthError:
    print("❌ Invalid API key")
except OCRFactureRateLimitError as e:
    print(f"❌ Quota exceeded. Retry in {e.retry_after} seconds")
except OCRFactureValidationError as e:
    print(f"❌ Validation error: {e.message}")
except OCRFactureAPIError as e:
    # Catch-all for any other API error
    print(f"❌ API error: {e.message}")
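Since the rate-limit exception exposes `retry_after`, one option is to wait and retry rather than only printing the error. The sketch below shows that pattern with a stand-in exception class so it runs without the SDK. `call_with_retry` and `RateLimitError` are illustrative names, and `retry_after` is assumed to be a number of seconds.

```python
import time


class RateLimitError(Exception):
    """Stand-in for OCRFactureRateLimitError so the sketch runs on its own."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=3, sleep=time.sleep,
                    rate_limit_error=RateLimitError):
    """Call `func()`, waiting `retry_after` seconds after each rate-limit error.

    The last failure is re-raised once `max_attempts` calls have been made.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except rate_limit_error as exc:
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after)


# Demo with a fake call that fails twice, then succeeds
calls = {"count": 0}
waits = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError(retry_after=2)
    return {"success": True}


# Record the waits instead of actually sleeping
outcome = call_with_retry(flaky, sleep=waits.append)
print(outcome, waits)
```

With the SDK, the same helper could wrap a real call, for example `call_with_retry(lambda: api.extract_from_file("facture.pdf"), rate_limit_error=OCRFactureRateLimitError)`.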
Idempotency
import uuid

# Use an idempotency key to avoid duplicate processing
idempotency_key = str(uuid.uuid4())
result = api.extract_from_file(
    "facture.pdf",
    idempotency_key=idempotency_key,
)

# Reusing the same key returns the same result without reprocessing
result2 = api.extract_from_file(
    "facture.pdf",
    idempotency_key=idempotency_key,
)  # Instant result from the cache
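A random `uuid4` key only protects against duplicates while the key is kept around. If the same file may be submitted again by a later run, a key derived from the file content stays stable across runs. The sketch below is one way to do that; `idempotency_key_for` is a hypothetical helper, and it assumes the API accepts any UUID string as a key, as the `uuid4` example suggests.

```python
import hashlib
import uuid


def idempotency_key_for(content: bytes, language: str = "fra") -> str:
    """Derive a stable UUID string from the file bytes and the OCR language."""
    digest = hashlib.sha256(content).hexdigest()
    # uuid5 is deterministic: the same namespace and name give the same UUID
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"ocr-facture:{language}:{digest}"))


# Sample byte strings standing in for real file contents
key_a = idempotency_key_for(b"%PDF-1.7 sample bytes")
key_b = idempotency_key_for(b"%PDF-1.7 sample bytes")
key_c = idempotency_key_for(b"%PDF-1.7 other bytes")
print(key_a == key_b, key_a == key_c)
```

Including the language in the key keeps a French and an English extraction of the same file from sharing a cached result.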
📚 API Reference
Main methods
- extract_from_file(file_path, language="fra", check_compliance=False, idempotency_key=None) - Extract from a file
- extract_from_base64(base64_string, language="fra", check_compliance=False, idempotency_key=None) - Extract from a base64 string
- batch_extract(files, language="fra", idempotency_key=None) - Batch processing (max 10 files)
- check_compliance(invoice_data) - French compliance validation
- validate_vat(invoice_data) - VAT validation
- enrich_siret(siret) - SIRET enrichment
- validate_vies(vat_number) - VIES validation
- generate_facturx(invoice_data) - Generate Factur-X XML
- parse_facturx(file_path) - Extract the XML from a PDF/A-3 file
- validate_facturx_xml(xml_content) - Validate Factur-X XML
- get_supported_languages() - List of supported languages
- get_quota() - Remaining quota information
- health_check() - API health status
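`validate_vies` takes a VAT number. For French companies that number is conventionally built from the SIREN (the first 9 digits of a SIRET) with a two-digit key in front. The sketch below computes the usual numeric key; `french_vat_number` is an illustrative helper, not an SDK function, and it does not cover VAT numbers from other countries.

```python
def french_vat_number(siren):
    """Build the French intra-community VAT number for a 9-digit SIREN."""
    digits = siren.replace(" ", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("a SIREN is exactly 9 digits")
    # Key = (12 + 3 * (SIREN mod 97)) mod 97, zero-padded to two digits
    key = (12 + 3 * (int(digits) % 97)) % 97
    return f"FR{key:02d}{digits}"


print(french_vat_number("404833048"))  # FR83404833048
```

The resulting string could then be passed to `api.validate_vies(...)` to check it against the VIES service.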
🌍 Supported languages
- fra - French (default)
- eng - English
- deu - German
- spa - Spanish
- ita - Italian
- por - Portuguese
📝 Complete examples
Example 1: Automatic invoice processing
import os

from ocr_facture_api import OCRFactureAPI

api = OCRFactureAPI(api_key=os.getenv("OCR_FACTURE_API_KEY"))

# Process every invoice in a folder
factures_dir = "./factures"
for filename in os.listdir(factures_dir):
    # Compare in lower case so that .PDF or .JPG files are not skipped
    if filename.lower().endswith((".pdf", ".jpg", ".png")):
        filepath = os.path.join(factures_dir, filename)
        try:
            result = api.extract_from_file(filepath, check_compliance=True)
            invoice_data = result["extracted_data"]
            print(f"\n📄 {filename}")
            print(f"  Number: {invoice_data.get('invoice_number')}")
            print(f"  Date: {invoice_data.get('date')}")
            print(f"  Total incl. VAT: {invoice_data.get('total_ttc')}€")
            # Check compliance; .get() avoids a KeyError if the key is absent
            compliance = result.get("compliance", {})
            if compliance.get("compliant"):
                print("  ✅ Compliant")
            else:
                print(f"  ⚠️ Not compliant: {compliance.get('missing_fields')}")
        except Exception as e:
            print(f"❌ Error for {filename}: {e}")
Example 2: Export to CSV
import csv

from ocr_facture_api import OCRFactureAPI

api = OCRFactureAPI(api_key="your_key")

# Process several invoices
files = ["facture1.pdf", "facture2.pdf", "facture3.pdf"]
batch_result = api.batch_extract(files)

# Export to CSV
with open("factures_export.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Number", "Date", "Vendor", "Total incl. VAT", "Confidence"])
    for result in batch_result["results"]:
        if result["success"]:
            data = result["extracted_data"]
            writer.writerow([
                data.get("invoice_number"),
                data.get("date"),
                data.get("vendor"),
                data.get("total_ttc"),
                result["confidence_scores"].get("total_ttc", 0),
            ])
📄 License
MIT License
🤝 Contributing
Contributions are welcome! Feel free to open an issue or a pull request.
Download files
File details
Details for the file ocr_facture_api-2.0.0.tar.gz.
File metadata
- Download URL: ocr_facture_api-2.0.0.tar.gz
- Upload date:
- Size: 12.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa270eb4400f8a04e313e0ac4e9c63c4264a73e18b9ada68130e4b0945a2435d |
| MD5 | df306507b1e514ab59d4a416753bcea5 |
| BLAKE2b-256 | ce1de984690a19095644b91955d08b5989cb6ea24c56091050d8413ac223be0b |
File details
Details for the file ocr_facture_api-2.0.0-py3-none-any.whl.
File metadata
- Download URL: ocr_facture_api-2.0.0-py3-none-any.whl
- Upload date:
- Size: 8.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b0a2d6cdb1ca337fc6db5c3bef885579b265958bb58f1609e3591778079c64a |
| MD5 | e9567b58673e37d8686a65ebe0fb32ee |
| BLAKE2b-256 | 6c70e8d6fde43e89b40ff2ac905ff3ff71ae093a13dd6f19e6f73502a9acb0aa |