Convert product URLs into structured product data.
Product2Schema
Product2Schema is a Python library designed to convert product URLs into structured product data.
It provides both a synchronous and an asynchronous interface, so it can be used from blocking scripts as well as from asyncio-based applications.
The conversion process leverages the Zyte API for fetching page contents and OpenAI for generating the schema from the fetched content.
Features
- Asynchronous and synchronous versions available.
- Converts product URLs into structured product data.
- Utilizes Zyte API for web scraping and OpenAI for AI transformation.
- Easy to integrate with existing Python projects.
Installation
You can install the library from PyPI:
pip install product2schema
Usage
Synchronous Engine
The SyncEngine class provides a synchronous interface for URL transformation.
from product2schema import SyncEngine
# Initialize the engine with your API keys
sync_engine = SyncEngine(openai_key="your_openai_key", zyte_key="your_zyte_key")
# Transform a product URL
response = sync_engine.transform_url("https://example.com/product")
# Access the response details
print(f"Original URL: {response.url}")
print(f"Cost: {response.cost}")
print(f"Product Schema: {response.product_schema}")
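The example above hardcodes the API keys for brevity. A minimal sketch of reading them from the environment instead is shown below. The variable names OPENAI_API_KEY and ZYTE_API_KEY and the helpers load_keys and build_sync_engine are assumptions made for illustration, not part of the library; only the SyncEngine constructor arguments come from the usage shown above.

```python
import os


def load_keys(env=os.environ):
    """Return (openai_key, zyte_key) read from environment variables.

    The variable names OPENAI_API_KEY and ZYTE_API_KEY are assumptions made
    for this sketch; the engine itself only receives the key strings.
    """
    missing = [
        name for name in ("OPENAI_API_KEY", "ZYTE_API_KEY") if not env.get(name)
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["OPENAI_API_KEY"], env["ZYTE_API_KEY"]


def build_sync_engine():
    """Create a SyncEngine using keys taken from the environment."""
    # Imported lazily so load_keys() can be used without the package installed.
    from product2schema import SyncEngine

    openai_key, zyte_key = load_keys()
    return SyncEngine(openai_key=openai_key, zyte_key=zyte_key)
```

Calling build_sync_engine() fails early with a readable message if either key is missing, which keeps secrets out of source control.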
Asynchronous Engine
The AsyncEngine class provides an asynchronous interface for URL transformation and is better suited to code that must not block, such as applications already running an event loop.
import asyncio
from product2schema import AsyncEngine
async def main():
    # Initialize the engine with your API keys
    async_engine = AsyncEngine(openai_key="your_openai_key", zyte_key="your_zyte_key")
    # Transform a product URL
    response = await async_engine.transform_url("https://example.com/product")
    # Access the response details
    print(f"Original URL: {response.url}")
    print(f"Cost: {response.cost}")
    print(f"Product Schema: {response.product_schema}")
# Run the async main function
asyncio.run(main())
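One reason to prefer the asynchronous interface is that several URLs can be processed at the same time. The sketch below shows one possible way to do that with asyncio.gather and a semaphore that caps concurrency. The transform_many helper and the EchoEngine stand-in are hypothetical and exist only so the snippet runs without API keys. Whether a single AsyncEngine instance is safe to share across concurrent calls is an assumption to verify.

```python
import asyncio


async def transform_many(engine, urls, max_concurrency=5):
    """Transform several URLs concurrently, at most max_concurrency at a time.

    `engine` is any object with an awaitable transform_url(url) method, such
    as the AsyncEngine shown above. Results come back in the same order as
    `urls`; a failed URL yields its exception instead of a response.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def transform_one(url):
        # The semaphore limits how many requests are in flight at once.
        async with semaphore:
            return await engine.transform_url(url)

    return await asyncio.gather(
        *(transform_one(url) for url in urls),
        return_exceptions=True,
    )


class EchoEngine:
    """Hypothetical stand-in for AsyncEngine so the sketch runs offline."""

    async def transform_url(self, url):
        await asyncio.sleep(0)  # yield to the event loop like real I/O would
        if not url.startswith("https://"):
            raise ValueError(f"unsupported URL: {url}")
        return {"url": url}
```

With a real engine, the call inside an async function would look like `results = await transform_many(async_engine, urls)`. Each entry in the results should be checked with isinstance(entry, Exception) before its fields are read.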
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the Apache-2.0 License.