
A package for time series forecasting using language models.


LLM4Time

A library for time series forecasting using Large Language Models (LLMs)


🧩 Get Started

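LLM4Time is a Python library for time series forecasting using Large Language Models (LLMs). It provides a modular architecture that covers data preprocessing and handling, prompt generation, forecasting with LLMs, metric evaluation, and interactive evaluation.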

Installation

pip install llm4time

Running the Streamlit interface

In addition, we provide a Streamlit-based interface, offering a more intuitive and practical way to interact with the library.

Follow the steps below to clone the repository, set up the environment, and run the application.

1. Clone the repository

git clone https://github.com/zairobastos/LLM4Time.git
cd LLM4Time

2. Create and activate a virtual environment (Optional)

python -m venv .venv
source .venv/bin/activate      # Bash/Zsh
source .venv/bin/activate.fish # Fish shell (use instead of the line above)

3. Install the dependencies

pip install -e .
pip install -r requirements.txt -r requirements-streamlit.txt

4. Run the application

Using Python 🐍

python app/main.py

Access the application at http://localhost:8501

Or using Docker 🐋

docker compose up

Data preprocessing and handling

1. Data loading

from llm4time.core.data import loader
from llm4time.core.evaluate import Statistics

# Load data from a CSV, XLSX, JSON, or Parquet file
df = loader.load_data("etth2.csv")

# Descriptive statistics
stats = Statistics(df['OT'])
print(f"Mean: {stats.mean}")
print(f"Median: {stats.median}")
print(f"1st Quartile: {stats.first_quartile}")
print(f"3rd Quartile: {stats.third_quartile}")
print(f"Standard Deviation: {stats.std}")
print(f"Minimum: {stats.min}")
print(f"Maximum: {stats.max}")
print(f"Number of missing values: {stats.missing_count}")
print(f"Percentage of missing values: {stats.missing_percentage}")
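
As a rough cross-check of the figures above, the same kind of summary can be approximated with Python's standard `statistics` module. This is a minimal sketch, not the library's implementation: the `summarize` helper and its dictionary keys are hypothetical, missing values are assumed to be `None`, and the quartile convention may differ from the one `Statistics` uses.

```python
import statistics

def summarize(values):
    """Summarize a series where missing observations are None (hypothetical helper)."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3
    q1, _, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(present),
        "median": statistics.median(present),
        "first_quartile": q1,
        "third_quartile": q3,
        "std": statistics.stdev(present),  # sample standard deviation
        "min": min(present),
        "max": max(present),
        "missing_count": missing,
        "missing_percentage": 100 * missing / len(values),
    }

summary = summarize([1.0, 2.0, None, 4.0, 5.0])
```

Here the missing entry is excluded from every statistic and reported separately, which keeps the mean and quartiles from being distorted by gaps.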

2. Data preprocessing

from llm4time.core.data import preprocessor

# Standardize into time series format
df = preprocessor.standardize(
  df,
  date_col='date',    # Column containing dates/timestamps
  value_col='OT',     # Column containing time series values
  duplicates='first'  # How to handle duplicate rows: 'first' keeps the first occurrence
)

# Ensure all timestamps are present
df = preprocessor.normalize(df, freq='h')
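
To illustrate what "ensuring all timestamps are present" could mean for an hourly series, here is a minimal standard-library sketch. It is an assumption about the general idea, not the body of `preprocessor.normalize`: the `fill_hourly_gaps` helper is hypothetical, and absent hours are inserted with a value of `None` so that a later imputation step can fill them.

```python
from datetime import datetime, timedelta

def fill_hourly_gaps(rows):
    """Return (timestamp, value) pairs covering every hour between the first
    and last timestamp; hours absent from `rows` get None (hypothetical helper)."""
    known = dict(rows)
    current, last = min(known), max(known)
    filled = []
    while current <= last:
        filled.append((current, known.get(current)))
        current += timedelta(hours=1)
    return filled

rows = [
    (datetime(2016, 6, 1, 0), 10.0),
    (datetime(2016, 6, 1, 1), 11.0),
    (datetime(2016, 6, 1, 3), 13.0),  # 02:00 is missing
]
filled = fill_hourly_gaps(rows)
```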

3. Missing data imputation

from llm4time.core.data import imputation

# Replace missing values with the column mean
df = imputation.mean(df)
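
The idea behind mean imputation can be sketched in a few lines of plain Python. The `impute_mean` helper below is hypothetical and assumes missing values are represented as `None`; it is not the library's `imputation.mean`.

```python
from statistics import fmean

def impute_mean(values):
    """Replace None entries with the mean of the observed values (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    fill = fmean(observed)
    return [fill if v is None else v for v in values]

imputed = impute_mean([10.0, None, 14.0])
```

Note that the fill value is computed from the observed entries only, so the gaps themselves do not influence it.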

4. Data split

from llm4time.core.data import preprocessor

# Split the dataset into training and validation sets
train, y_val = preprocessor.split(
  df,
  start_date='2016-06-01 00:00:00', # Start of the training set
  end_date='2016-12-01 00:00:00',   # End of the training set
  periods=24                        # Number of periods to forecast
)
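
The split above can be pictured as a training window followed by the `periods` observations to forecast. The following sketch shows that idea with plain Python under stated assumptions: `split_series` is a hypothetical helper, and the end date is treated as exclusive here, which may not match how `preprocessor.split` handles the boundary.

```python
from datetime import datetime, timedelta

def split_series(rows, start, end, periods):
    """Split (timestamp, value) pairs into a training window and the
    `periods` values that follow it (hypothetical helper).
    Assumption: `end` is exclusive; the library may treat it differently."""
    rows = sorted(rows)
    train = [(t, v) for t, v in rows if start <= t < end]
    after = [v for t, v in rows if t >= end]
    return train, after[:periods]

start = datetime(2016, 6, 1)
series = [(start + timedelta(hours=h), float(h)) for h in range(72)]
train_rows, holdout = split_series(
    series, start, start + timedelta(hours=48), periods=24
)
```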

Prompt generation

5. Zero-shot prompt generation

from llm4time.core import prompt
from llm4time.core import PromptType, TSFormat, TSType

content = prompt.generate(
    train,       # Training set [(date, value), ...]
    periods=24,  # Number of periods to forecast
    prompt_type=PromptType.ZERO_SHOT,  # Prompt type: ZERO_SHOT (no examples)
    ts_format=TSFormat.ARRAY,          # Time series format
    ts_type=TSType.NUMERIC             # Type of encoding for series values
)
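
To give a feel for what a zero-shot prompt with an array-formatted series might contain, here is a purely illustrative sketch. The `build_zero_shot_prompt` helper and its wording are hypothetical; the template that `prompt.generate` actually produces is not shown in this document and may look quite different.

```python
def build_zero_shot_prompt(train, periods):
    """Build an illustrative zero-shot prompt (hypothetical wording).
    `train` is a list of (date, value) pairs; only the values are serialized."""
    values = [value for _, value in train]
    return (
        f"Here is a time series with {len(values)} observations:\n"
        f"{values}\n"
        f"Forecast the next {periods} values. "
        "Answer only with an array of numbers."
    )

example = build_zero_shot_prompt(
    [("2016-06-01 00:00:00", 30.5), ("2016-06-01 01:00:00", 31.0)],
    periods=2,
)
```

"Zero-shot" here means the prompt carries only the history and the task, with no worked input/output examples.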

Forecasting with LLMs

6. Initializing an OpenAI model

from llm4time.core.models import OpenAI

model = OpenAI(
  model='gpt-4o',  # OpenAI model to be used.
  api_key='...',   # API key for authentication with the OpenAI service.
  base_url='...'   # Base URL of the OpenAI endpoint.
)

7. Predicting values

# Forecasting
response, prompt_tokens, response_tokens, time_sec = model.predict(
    content,          # Previously generated prompt
    temperature=0.7,  # Level of randomness in the response
    max_tokens=1000   # Maximum number of tokens in the response
)

print("Model response:", response)
print("Prompt tokens:", prompt_tokens)
print("Response tokens:", response_tokens)
print("Execution time (s):", time_sec)
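
The model response is plain text, so it has to be converted into numbers before it can be scored. As a sketch of how an array-style reply might be handled, the hypothetical `parse_array` helper below pulls the numbers out of the first bracketed list it finds. This is an assumption for illustration only and does not describe how the library parses responses.

```python
import re

NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

def parse_array(response):
    """Extract the numbers from the first bracketed array in the text
    (hypothetical helper)."""
    match = re.search(r"\[([^\]]*)\]", response)
    if match is None:
        raise ValueError("no array found in response")
    return [float(token) for token in re.findall(NUMBER, match.group(1))]

parsed = parse_array("Forecast: [30.5, 31, -2.25]")
```

Raising an error when no array is present makes malformed replies visible instead of silently yielding an empty forecast.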

Metric evaluation

8. Error metrics

from llm4time.core import formatter, TSFormat, TSType
from llm4time.core.evaluate.metrics import Metrics

# Converts the response string into a numerical list
y_pred = formatter.parse(
  response,
  ts_format=TSFormat.ARRAY,
  ts_type=TSType.NUMERIC
)

metrics = Metrics(y_val, y_pred)

# Error metrics
print(f"sMAPE: {metrics.smape}") # Symmetric Mean Absolute Percentage Error
print(f"MAE: {metrics.mae}")     # Mean Absolute Error
print(f"RMSE: {metrics.rmse}")   # Root Mean Squared Error
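
For reference, the three metrics can be written out in plain Python. This is a minimal sketch using common textbook definitions, not the code behind `Metrics`. In particular, sMAPE has several variants; the one below divides by the mean of the absolute actual and predicted values and reports a percentage, which is an assumption about the convention.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def smape(y_true, y_pred):
    """Symmetric MAPE in percent: |t - p| / ((|t| + |p|) / 2).
    Pairs where both values are zero contribute zero error."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        total += abs(t - p) / denom if denom else 0.0
    return 100 * total / len(y_true)

actual = [10.0, 20.0, 30.0]
forecast = [12.0, 18.0, 30.0]
errors = (mae(actual, forecast), rmse(actual, forecast), smape(actual, forecast))
```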

Interactive evaluation

9. Plots comparing actual and predicted values

from llm4time.visualization import plots

# Generate a comparison plot between actual and predicted values
plots.plot_forecast("Comparison between actual and predicted values", y_val, y_pred)

# Generate a bar chart comparing descriptive statistics
plots.plot_forecast_statistics("Statistical comparison", y_val, y_pred)

🔍 References

@article{zairo2025prompt,
  title={Prompt-Driven Time Series Forecasting with Large Language Models},
  author={Zairo Bastos and João David Freitas and José Wellington Franco and Carlos Caminha},
  journal={Proceedings of the 27th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
  year={2025}
}

👥 Team

Zairo Bastos: Master’s student - UFC
Wesley Barbosa: Undergraduate student - UFC
Fernanda Scarcela: Undergraduate student - UFC
Carlos Caminha: Academic advisor - UFC
José Wellington Franco: Academic advisor - UFC

📄 License

This project is licensed under the MIT License.

📬 Contact

For questions, suggestions, or feedback:
