Skip to main content

An integration package created by the company LOGYCA to interact with ChatGPT and analyze documents, files and other functionality of the OpenAI library.

Project description

Logyca

LOGYCA public libraries

Package version Python


About us


LOGYCA public libraries: To interact with ChatGPT and analyze documents, files and other functionality of the OpenAI library.

Source code | Package (PyPI) | Samples

To interact with the examples, keep the following in mind

FastAPI example. Through Swagger, you can:

  • https://github.com/logyca/python-libraries/tree/main/logyca-ai/samples/fastapi_async
  • Use the example endpoints to obtain the input schemas for the post method and interact with the available parameters.
  • Endpoint publishing is asynchronous of openai SDK.
  • The model currently used is ChatGPT-4o, no other models have been tested so far.
  • Currently the formats supported to receive files and extract the text to interact with artificial intelligence are: txt, csv, pdf, images, Microsoft (docx, xlsx).

Script example. Through of code, you can:

Environment variables documentation for example: fastapi_async

The examples are built in the Microsoft Azure OpenAI environment, and the variables to use are the following:

.env.sample

# Environment variables documentation:

# API_KEY:
# The general API key used for authentication with services. This key is typically used for accessing cloud-based or other API-driven platforms. Replace '***' with the actual key.

# AZURE_OPENAI_DEPLOYMENT:
# The name or identifier of the OpenAI deployment within Azure. This defines the specific model version and configuration you are using in Azure OpenAI Service. Set this to the name of the deployed model, such as 'chatgpt3.5-turbo-1106'.

# AZURE_OPENAI_ENDPOINT:
# The base URL of the Azure OpenAI Service endpoint. This is the URL where API requests are sent, typically formatted like 'https://<your-endpoint>.openai.azure.com/'.

# AZURE_OPENAI_MODEL_NAME:
# The name of the specific OpenAI model being used in Azure, for example, 'gpt-35-turbo'. This identifies which model variant will be used for processing requests.

# AZURE_OPENAI_MODEL_VERSION:
# The version of the OpenAI model deployed in Azure. This typically reflects updates or optimizations to the model, such as '1106' to indicate a version from November 6th.

# OPENAI_API_KEY:
# The API key provided by OpenAI directly (not through Azure). This is used to authenticate and access OpenAI services outside of Azure.

# OPENAI_API_VERSION:
# The version of the OpenAI API being used. This specifies the version of the API and its capabilities, for example, '2023-03-15-preview'. It dictates the available features and request format.

API_KEY=***
AZURE_OPENAI_DEPLOYMENT=***
AZURE_OPENAI_ENDPOINT=***
AZURE_OPENAI_MODEL_NAME=***
AZURE_OPENAI_MODEL_VERSION=***
OPENAI_API_KEY=***
OPENAI_API_VERSION=***

# Example
# API_KEY=CUSTOM_ABC
# AZURE_OPENAI_DEPLOYMENT=chat4omni
# AZURE_OPENAI_ENDPOINT=azurenameforendpoint
# AZURE_OPENAI_MODEL_NAME=gpt-4o
# AZURE_OPENAI_MODEL_VERSION=2024-05-13
# OPENAI_API_KEY=AZURE_ABC
# OPENAI_API_VERSION=2024-07-01-preview

OCR engine to extract images.

  • Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005 and development was sponsored by Google in 2006

Install

Example for simple conversation.

{
  "system": "Voy a definirte tu personalidad, contexto y proposito.\nActua como un experto en venta de frutas.\nSe muy positivo.\nTrata a las personas de usted, nunca tutees sin importar como te escriban.",
  "messages": [
    {
      "additional_content": "",
      "type": "text",
      "user": "Dime 5 frutas amarillas"
    },
    {
      "assistant": "\n¡Claro! Aquí te van 5 frutas amarillas:\n\n1. Plátano\n2. Piña\n3. Mango\n4. Melón\n5. Papaya\n"
    },
    {
      "additional_content": "",
      "type": "text",
      "user": "Dame los nombres en ingles."
    }
  ]
}

Example for image conversation.

Using public published URL for image

{
  "system": "Actua como una maquina lectora de imagenes.\nDevuelve la información sin lenguaje natural, sólo responde lo que se está solicitando.\nEl dispositivo que va a interactuar contigo es una api, y necesita la información sin markdown u otros caracteres especiales.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/image.png",
        "image_format": "image_url",
        "image_resolution": "auto"
      },
      "type": "image_url",
      "user": "Extrae el texto que recibas en la imagen y devuelvelo en formato json."
    }
  ]
}

Using image content in base64

{
  "system": "Actua como una maquina lectora de imagenes.\nDevuelve la información sin lenguaje natural, sólo responde lo que se está solicitando.\nEl dispositivo que va a interactuar contigo es una api, y necesita la información sin markdown u otros caracteres especiales.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "<base64 image png content>",
        "image_format": "png",
        "image_resolution": "auto"
      },
      "type": "image_base64",
      "user": "Extrae el texto que recibas en la imagen y devuelvelo en formato json."
    }
  ]
}

Example for pdf conversation.

Using public published URL for pdf

{
  "system": "No uses lenguaje natural para la respuesta.\nDame la información que puedas extraer de la imagen en formato JSON.\nSolo devuelve la información, no formatees con caracteres adicionales la respuesta.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/pdf.pdf",
        "pdf_format": "pdf_url"
      },
      "type": "pdf_url",
      "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
    }
  ]
}

Using pdf content in base64

{
  "system": "No uses lenguaje natural para la respuesta.\nDame la información que puedas extraer de la imagen en formato JSON.\nSolo devuelve la información, no formatees con caracteres adicionales la respuesta.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "<base64 pdf content>",
        "pdf_format": "pdf"
      },
      "type": "pdf_base64",
      "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
    }
  ]
}

Example for plain_text conversation.

Using public published URL for plain_text

{
  "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.\n                Te voy a enviar un texto que representa información en formato csv.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/plain_text.csv",
        "file_format": "plain_text_url"
      },
      "type": "plain_text_url",
      "user": "Dame los siguientes datos de la primera fila del documento: Expediente, radicación, Fecha, Numero de registro, Vigencia.\n                A partir de la fila 2 del documento, suma los valores de la columna Valores_A.\n                A partir de la fila 2 del documento, Suma los valores de la columna Valores_B."
    }
  ]
}

Using plain_text content in base64

{
  "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.\n                Te voy a enviar un texto que representa información en formato csv.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "<base64 pdf content>",
        "file_format": "csv"
      },
      "type": "plain_text_base64",
      "user": "Dame los siguientes datos de la primera fila del documento: Expediente, radicación, Fecha, Numero de registro, Vigencia.\n                A partir de la fila 2 del documento, suma los valores de la columna Valores_A.\n                A partir de la fila 2 del documento, Suma los valores de la columna Valores_B."
    }
  ]
}

Example for Microsoft files conversation (Word, Excel).

Using public published URL for Excel file

{
  "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer de la imagen en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.",
  "messages": [
    {
      "additional_content": {
        "base64_content_or_url": "https://raw.githubusercontent.com/logyca/python-libraries/main/logyca-ai/logyca_ai/assets_for_examples/file_or_documents/ms_excel.xlsx",
        "file_format": "ms_url"
      },
      "type": "ms_url",
      "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
    }
  ]
}

Using Excel file content in base64

{
    "system": "No uses lenguaje natural para la respuesta.\n                Dame la información que puedas extraer de la imagen en formato JSON.\n                Solo devuelve la información, no formatees con caracteres adicionales la respuesta.",
    "messages": [
      {
        "additional_content": {
          "base64_content_or_url": "<base64 pdf content>",
          "file_format": "xlsx"
        },
        "type": "ms_base64",
        "user": "Dame los siguientes datos: Expediente, radicación, Fecha, Numero de registro, Vigencia."
      }
    ]
}

Semantic Versioning

logyca_ai < MAJOR >.< MINOR >.< PATCH >

  • MAJOR: version when you make incompatible API changes
  • MINOR: version when you add functionality in a backwards compatible manner
  • PATCH: version when you make backwards compatible bug fixes

Definitions for releasing versions

  • https://peps.python.org/pep-0440/

    • X.YaN (Alpha release): Identify and fix early-stage bugs. Not suitable for production use.
    • X.YbN (Beta release): Stabilize and refine features. Address reported bugs. Prepare for official release.
    • X.YrcN (Release candidate): Final version before official release. Assumes all major features are complete and stable. Recommended for testing in non-critical environments.
    • X.Y (Final release/Stable/Production): Completed, stable version ready for use in production. Full release for public use.

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Types of changes

  • Added for new features.
  • Changed for changes in existing functionality.
  • Deprecated for soon-to-be removed features.
  • Removed for now removed features.
  • Fixed for any bug fixes.
  • Security in case of vulnerabilities.

[0.0.1aX] - 2024-08-02

Added

  • First tests using pypi.org in develop environment.

[0.1.0] - 2024-08-02

Added

  • Completion of testing and launch into production.

[0.1.1] - 2024-08-16

Added

  • The functions of extracting text from PDF files are refactored, using disk to optimize the use of ram memory and methods are added to extract text from images within the pages of the PDF files.

[0.2.0] - 2024-08-30

Added

  • New feature of attaching documents with txt, csv, docx, xlsx extension

[0.2.1] - 2024-09-16

Added

  • New tiktoken function to count tokens and check model capacity, returning if it meets the maximum_request_tokens requirements for both input and output.

Fixed

  • Extract excel files to output formats json, csv and list.

[0.2.2] - 2024-10-22

Added

  • New functionalities are added to extract images from documents in base64 lists: extract_images_from_pdf_file, extract_images_from_docx_file
  • The Swagger documentation is improved in the FastAPI example, adding the parameter: just_extract_images to the POST method to use the new document image extraction features.

[0.2.3] - 2024-10-31

Added

  • new functionality when extracting text in Excel, you can select only extraction of visible sheets or all sheets.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

logyca_ai-0.2.3rc1.tar.gz (1.2 MB view details)

Uploaded Source

Built Distribution

logyca_ai-0.2.3rc1-py3-none-any.whl (1.2 MB view details)

Uploaded Python 3

File details

Details for the file logyca_ai-0.2.3rc1.tar.gz.

File metadata

  • Download URL: logyca_ai-0.2.3rc1.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.9

File hashes

Hashes for logyca_ai-0.2.3rc1.tar.gz
Algorithm Hash digest
SHA256 794b401e724a6257837b1c55172df11463524453cd405790016ace3c43ddea40
MD5 788f4d250b9dea85e74ce558a76b0d29
BLAKE2b-256 b21b9a86ec1559c195c76601f85f7f14ccf74c53fe8e48a83d30c21789d57151

See more details on using hashes here.

File details

Details for the file logyca_ai-0.2.3rc1-py3-none-any.whl.

File metadata

File hashes

Hashes for logyca_ai-0.2.3rc1-py3-none-any.whl
Algorithm Hash digest
SHA256 85d96d2cd6b23cd0b4d37891e1f5f4d5a0c2fa36b20d016eff8fbcee229e7060
MD5 8c5ad5fce396910db254b79f917582c0
BLAKE2b-256 e6f9b9b54b4c6e1b25af21140b94c5a8f692146bfad8b6b0f3850321a9f5730a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page