Lexical sentiment analysis pipeline for central bank and economic text data.

AutoEconSentiment

auto-econ-sentiment is a reproducible pipeline for measuring economic sentiment in central bank and financial text. The core package is lexical-first: it cleans text, scores it with established economic dictionaries, and exports comparable sentiment outputs. Transformer sentiment is available as an optional extension.

Quick Start

Install the base package:

uv sync

Run the default YAML-configured pipeline:

uv run python -m src.auto_econ_sentiment.pipeline

Or call the Python API:

from auto_econ_sentiment.pipeline import AutoEconSentiment

analyzer = AutoEconSentiment(
    import_file_path="data/raw/basic_tests/monetary_policy_statement.parquet.gzip",
    text_column="text",
    date_column="date",
    export_path="data/sentiment/basic_tests/",
)

analyzer.run(
    clean_config={"tokenize": True, "stem": True},
    dictionaries={"unstemmed": ["correa", "hubert", "lm", "hiv"], "stemmed": ["ap", "bn"]},
    aggregation_methods=["posneg", "allwords"],
    export_results=True,
)
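
To make the lexical step more concrete, here is a minimal sketch of dictionary-based scoring. It is not the package's implementation: the word lists are toy placeholders, the function names are hypothetical, and the formulas assigned to "posneg" and "allwords" are assumptions about what those aggregation names mean (net tone over matched words versus net tone over all tokens).

```python
import re
from collections import Counter

# Hypothetical toy word lists; real dictionaries such as "lm" or "correa" are far larger.
POSITIVE = {"strong", "improve", "growth", "stable"}
NEGATIVE = {"weak", "decline", "risk", "uncertainty"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score(text):
    """Count dictionary matches and compute two assumed aggregation variants."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    pos = sum(counts[word] for word in POSITIVE)
    neg = sum(counts[word] for word in NEGATIVE)
    total = len(tokens)
    return {
        "pos": pos,
        "neg": neg,
        "total": total,
        # Assumed meaning of "posneg": net tone relative to matched words only.
        "posneg": (pos - neg) / (pos + neg) if pos + neg else 0.0,
        # Assumed meaning of "allwords": net tone relative to every token.
        "allwords": (pos - neg) / total if total else 0.0,
    }


result = score("Growth remains strong and stable, but downside risk persists.")
print(result)
```

Under these assumptions, the two variants differ only in the denominator, so "allwords" shrinks toward zero for long documents with few matched words while "posneg" does not.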

Transformer Quick Start

Transformer support is optional so lexical users do not need to install torch or download Hugging Face models.

uv sync --extra transformers

Then enable transformer models in params.yaml:

models:
  transformer:
    enabled: true
    text_column_transformer: text_clean
    aggregation_methods: [bysentence]
    output_schema: shares
    models:
      - name: gtfintechlab/FOMC-RoBERTa
        short_name: fomc
        num_labels: 3
        label_mapping:
          LABEL_0: positive
          LABEL_1: negative
          LABEL_2: neutral
        sentiment_values:
          positive: 1
          negative: -1
          neutral: 0
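
The configuration above can be read as a two-step lookup: raw classifier labels map to sentiment names, and sentiment names map to numeric values. The sketch below illustrates one plausible way such a mapping could turn per-sentence class probabilities into scores and label shares. The function names and the probabilities are made up for illustration, and the interpretation of `bysentence` and `shares` is an assumption rather than the package's documented behavior.

```python
# Mirrors the label_mapping and sentiment_values entries from the YAML above.
LABEL_MAPPING = {"LABEL_0": "positive", "LABEL_1": "negative", "LABEL_2": "neutral"}
SENTIMENT_VALUES = {"positive": 1, "negative": -1, "neutral": 0}


def sentence_score(probabilities):
    """Probability-weighted sentiment value for one sentence."""
    return sum(
        p * SENTIMENT_VALUES[LABEL_MAPPING[label]]
        for label, p in probabilities.items()
    )


def predicted_label(probabilities):
    """Sentiment name of the most probable raw label."""
    raw = max(probabilities, key=probabilities.get)
    return LABEL_MAPPING[raw]


def label_shares(sentences):
    """Fraction of sentences assigned to each sentiment name."""
    labels = [predicted_label(p) for p in sentences]
    return {name: labels.count(name) / len(labels) for name in SENTIMENT_VALUES}


# Made-up classifier probabilities for three sentences of one document.
sentences = [
    {"LABEL_0": 0.7, "LABEL_1": 0.1, "LABEL_2": 0.2},
    {"LABEL_0": 0.1, "LABEL_1": 0.6, "LABEL_2": 0.3},
    {"LABEL_0": 0.2, "LABEL_1": 0.2, "LABEL_2": 0.6},
]

scores = [sentence_score(p) for p in sentences]
document_score = sum(scores) / len(scores)  # assumed sentence-level averaging
shares = label_shares(sentences)
print(scores, document_score, shares)
```

Keeping the mapping explicit matters because raw label names such as `LABEL_0` carry no meaning on their own; a wrong mapping would silently flip the sign of every score.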

The transformer examples in params.yaml show the supported model-list format. Further experimental notes on transformer models are kept in docs/feedback/, which may not exist in every checkout.

Transformer runs export sentiment_transformer.parquet.gzip and, for sentence-level aggregation, sentiment_transformer_sentence_probabilities.parquet.gzip alongside the existing lexical and combined output tables.

What It Does

  • Loads CSV or parquet text data.
  • Cleans and normalizes economic text.
  • Scores text with multiple central bank and financial dictionaries.
  • Optionally scores text with transformer classifiers using explicitly configured label mappings.
  • Exports cleaned text, matched words, counts, probabilities, and sentiment scores.
  • Makes dictionary and model disagreement visible for research workflows.
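
As an illustration of the last point, dictionary disagreement for a single document could be summarized along the following lines. This is a hypothetical sketch, not the package's output format: the function name, the two metrics, and the example scores are all assumptions chosen for illustration.

```python
def disagreement(scores_by_dictionary):
    """Summarize how much a set of per-dictionary scores disagree."""
    values = list(scores_by_dictionary.values())
    # Sign of each non-zero score: +1 for positive, -1 for negative.
    signs = {(v > 0) - (v < 0) for v in values if v != 0}
    return {
        "spread": max(values) - min(values),  # range across dictionaries
        "sign_conflict": len(signs) > 1,      # True if tone direction differs
    }


# Made-up scores for one document under three dictionaries.
doc_scores = {"lm": -0.20, "correa": 0.10, "hubert": 0.05}
report = disagreement(doc_scores)
print(report)
```

A sign conflict is often the more useful flag in practice, since dictionaries built for different text types can differ in scale while still agreeing on direction.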

CBS Speeches Demo

Download the CBS central bank speeches dataset and run the sentiment pipeline:

uv run python -m src.data.cb_speeches_download
uv run python -m src.data.cb_speeches_clean

Then open notebooks/demo_cb_speechs.ipynb to explore the outputs.

Citations

Dataset:

  • Campiglio, E., J. Deyris, D. Romelli, and G. Scalisi (2025). Warning Words in a Warming World: Central Bank Communication and Climate Change. European Economic Review, 105101.

Lexical dictionaries:

  • Loughran, T. and B. McDonald (2011). When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. The Journal of Finance 66, 35-65.
  • Correa, R., K. Garud, J. Londono, and N. Mislang (2017). Sentiment in Central Banks' Financial Stability Reports. International Finance Discussion Paper 1203.
  • Hubert, P. and F. Labondance (2021). The signaling effects of central bank tone. European Economic Review 133, 103684.
  • Stone, P. J., D. C. Dunphy, and M. S. Smith (1966). The General Inquirer: A Computer Approach to Content Analysis.
  • Apel, M. and M. Blix Grimaldi (2014). How Informative Are Central Bank Minutes? Review of Economics 65(1), 53-76.
  • Bennani, H. and M. Neuenkirch (2017). The (Home) Bias of European Central Bankers. Applied Economics 49(11), 1114-1131.
