Skip to main content

An integration package created by the company LOGYCA to interact with ChatGPT and analyze documents, files and other functionality of the OpenAI library.

Project description

Logyca

LOGYCA public libraries

Package version Python


About us


LOGYCA public libraries: To interact with ChatGPT and analyze documents, files and other functionality of the OpenAI library.

Source code | Package (PyPI) | Samples

To interact with the examples, keep the following in mind

FastAPI example. Through Swagger, you can:

  • https://github.com/logyca/python-libraries/tree/main/logyca-ai/samples/fastapi_async
  • Use the example endpoints to obtain the input schemas for the post method and interact with the available parameters.
  • Endpoint publishing is asynchronous of openai SDK.
  • The model currently used is ChatGPT-4o, no other models have been tested so far.
  • Currently the formats supported to receive files and extract the text to interact with artificial intelligence are: txt, csv, pdf, images, Microsoft (docx, xlsx).

Script example. Through of code, you can:

Environment variables documentation for example: fastapi_async

The examples are built in the Microsoft Azure OpenAI environment, and the variables to use are the following:

.env.sample

# Environment variables documentation:

# API_KEY:
# The general API key used for authentication with services. This key is typically used for accessing cloud-based or other API-driven platforms. Replace '***' with the actual key.

# AZURE_OPENAI_DEPLOYMENT:
# The name or identifier of the OpenAI deployment within Azure. This defines the specific model version and configuration you are using in Azure OpenAI Service. Set this to the name of the deployed model, such as 'chatgpt3.5-turbo-1106'.

# AZURE_OPENAI_ENDPOINT:
# The base URL of the Azure OpenAI Service endpoint. This is the URL where API requests are sent, typically formatted like 'https://<your-endpoint>.openai.azure.com/'.

# AZURE_OPENAI_MODEL_NAME:
# The name of the specific OpenAI model being used in Azure, for example, 'gpt-35-turbo'. This identifies which model variant will be used for processing requests.

# AZURE_OPENAI_MODEL_VERSION:
# The version of the OpenAI model deployed in Azure. This typically reflects updates or optimizations to the model, such as '1106' to indicate a version from November 6th.

# OPENAI_API_KEY:
# The API key provided by OpenAI directly (not through Azure). This is used to authenticate and access OpenAI services outside of Azure.

# OPENAI_API_VERSION:
# The version of the OpenAI API being used. This specifies the version of the API and its capabilities, for example, '2023-03-15-preview'. It dictates the available features and request format.

API_KEY=***
AZURE_OPENAI_DEPLOYMENT=***
AZURE_OPENAI_ENDPOINT=***
AZURE_OPENAI_MODEL_NAME=***
AZURE_OPENAI_MODEL_VERSION=***
OPENAI_API_KEY=***
OPENAI_API_VERSION=***

# Example
# API_KEY=CUSTOM_ABC
# AZURE_OPENAI_DEPLOYMENT=chat4omni
# AZURE_OPENAI_ENDPOINT=azurenameforendpoint
# AZURE_OPENAI_MODEL_NAME=gpt-4o
# AZURE_OPENAI_MODEL_VERSION=2024-05-13
# OPENAI_API_KEY=AZURE_ABC
# OPENAI_API_VERSION=2024-07-01-preview

OCR engine to extract images.

  • Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005 and development was sponsored by Google in 2006

Install

Example for simple conversation.

{
  "system": "Voy a definirte tu personalidad, contexto y proposito.\nActua como un experto en venta de frutas.\nSe muy positivo.\nTrata a las personas de usted, nunca tutees sin importar como te escriban.",
  "messages": [
    {
      "additional_content": "",
      "type": "text",
      "user": "Dime 5 frutas amarillas"
    },
    {
      "assistant": "\n¡Claro! Aquí te van 5 frutas amarillas:\n\n1. Plátano\n2. Piña\n3. Mango\n4. Melón\n5. Papaya\n"
    },
    {
      "additional_content": "",
      "type": "text",
      "user": "Dame los nombres en ingles."
    }
  ]
}

Example for image conversation.

Using public published URL for image

{
  "system": "Actua como una maquina lectora de imagenes.\nDevuelve la información sin lenguaje natural, sólo responde lo que se está solicitando.\nEl dispositivo que va a interactuar contigo es una api, y necesita la información sin markdown u otros caracteres especiales.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/image.png",
        "image_format": "image_url",
        "image_resolution": "auto"
      },
      "type": "image_url",
      "user": "Extrae el texto que recibas en la imagen y devuelvelo en formato json."
    }
  ]
}

Using image content in base64

{
  "system": "Actua como una maquina lectora de imagenes.\nDevuelve la información sin lenguaje natural, sólo responde lo que se está solicitando.\nEl dispositivo que va a interactuar contigo es una api, y necesita la información sin markdown u otros caracteres especiales.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "<base64 image png content>",
        "image_format": "png",
        "image_resolution": "auto"
      },
      "type": "image_base64",
      "user": "Extrae el texto que recibas en la imagen y devuelvelo en formato json."
    }
  ]
}

Example for pdf conversation.

Using public published URL for pdf

{
  "system": "No uses lenguaje natural para la respuesta.\nDame la información que puedas extraer de la imagen en formato JSON.\nSolo devuelve la información, no formatees con caracteres adicionales la respuesta.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/pdf.pdf",
        "pdf_format": "pdf_url"
      },
      "type": "pdf_url",
      "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
    }
  ]
}

Using pdf content in base64

{
  "system": "No uses lenguaje natural para la respuesta.\nDame la información que puedas extraer de la imagen en formato JSON.\nSolo devuelve la información, no formatees con caracteres adicionales la respuesta.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "<base64 pdf content>",
        "pdf_format": "pdf"
      },
      "type": "pdf_base64",
      "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
    }
  ]
}

Example for plain_text conversation.

Using public published URL for plain_text

{
  "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.\n                Te voy a enviar un texto que representa información en formato csv.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/plain_text.csv",
        "file_format": "plain_text_url"
      },
      "type": "plain_text_url",
      "user": "Dame los siguientes datos de la primera fila del documento: Expediente, radicación, Fecha, Numero de registro, Vigencia.\n                A partir de la fila 2 del documento, suma los valores de la columna Valores_A.\n                A partir de la fila 2 del documento, Suma los valores de la columna Valores_B."
    }
  ]
}

Using plain_text content in base64

{
  "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.\n                Te voy a enviar un texto que representa información en formato csv.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "<base64 pdf content>",
        "file_format": "csv"
      },
      "type": "plain_text_base64",
      "user": "Dame los siguientes datos de la primera fila del documento: Expediente, radicación, Fecha, Numero de registro, Vigencia.\n                A partir de la fila 2 del documento, suma los valores de la columna Valores_A.\n                A partir de la fila 2 del documento, Suma los valores de la columna Valores_B."
    }
  ]
}

Example for Microsoft files conversation (Word, Excel).

Using public published URL for Excel file

{
  "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer de la imagen en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/ms_excel.xlsx",
        "file_format": "ms_url"
      },
      "type": "ms_url",
      "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
    }
  ]
}

Using Excel file content in base64

{
    "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer de la imagen en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.",
    "messages": [
      {
        "additional_content": {
          "base64_content_or_url": "<base64 pdf content>",
          "file_format": "xlsx"
        },
        "type": "ms_base64",
        "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
      }
    ]
}

Semantic Versioning

logyca_ai < MAJOR >.< MINOR >.< PATCH >

  • MAJOR: version when you make incompatible API changes
  • MINOR: version when you add functionality in a backwards compatible manner
  • PATCH: version when you make backwards compatible bug fixes

Definitions for releasing versions

  • https://peps.python.org/pep-0440/

    • X.YaN (Alpha release): Identify and fix early-stage bugs. Not suitable for production use.
    • X.YbN (Beta release): Stabilize and refine features. Address reported bugs. Prepare for official release.
    • X.YrcN (Release candidate): Final version before official release. Assumes all major features are complete and stable. Recommended for testing in non-critical environments.
    • X.Y (Final release/Stable/Production): Completed, stable version ready for use in production. Full release for public use.

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Types of changes

  • Added for new features.
  • Changed for changes in existing functionality.
  • Deprecated for soon-to-be removed features.
  • Removed for now removed features.
  • Fixed for any bug fixes.
  • Security in case of vulnerabilities.

[0.0.1aX] - 2024-08-02

Added

  • First tests using pypi.org in develop environment.

[0.1.0] - 2024-08-02

Added

  • Completion of testing and launch into production.

[0.1.1] - 2024-08-16

Added

  • The functions of extracting text from PDF files are refactored, using disk to optimize the use of ram memory and methods are added to extract text from images within the pages of the PDF files.

[0.2.0] - 2024-08-30

Added

  • New feature of attaching documents with txt, csv, docx, xlsx extension

[0.2.1] - 2024-09-16

Added

  • New tiktoken function to count tokens and check model capacity, returning if it meets the maximum_request_tokens requirements for both input and output.

Fixed

  • Extract excel files to output formats json, csv and list.

[0.2.2] - 2024-10-22

Added

  • New functionalities are added to extract images from documents in base64 lists: extract_images_from_pdf_file, extract_images_from_docx_file
  • The Swagger documentation is improved in the FastAPI example, adding the parameter: just_extract_images to the POST method to use the new document image extraction features.

[0.2.3] - 2024-10-31

Added

  • new functionality when extracting text in Excel, you can select only extraction of visible sheets or all sheets.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

logyca_ai-0.2.3.tar.gz (1.2 MB view details)

Uploaded Source

Built Distribution

logyca_ai-0.2.3-py3-none-any.whl (1.2 MB view details)

Uploaded Python 3

File details

Details for the file logyca_ai-0.2.3.tar.gz.

File metadata

  • Download URL: logyca_ai-0.2.3.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.9

File hashes

Hashes for logyca_ai-0.2.3.tar.gz
Algorithm Hash digest
SHA256 2edefbe2eedf1d5852d6fc2e18eec53b9afc69edafbbf583a99ff2c521618f12
MD5 fe44811eba3f0ddf37cb0840faa8db8b
BLAKE2b-256 de1f1045a9d89358252744d2da82d116e69a03888c482c91468ce74442a47217

See more details on using hashes here.

File details

Details for the file logyca_ai-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: logyca_ai-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 1.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.9

File hashes

Hashes for logyca_ai-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 64f59811913130d0e18c63a790736ac07b6b17433723bda99b73c52bbbe09077
MD5 42757fb2acfc82caaaf0c41c75392be7
BLAKE2b-256 8c3b146d0cc8ac654d2fbc3bc01b274eb69f72dd6c306111be0769c68d0232f4

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page