
Overview

This research repository contains the code and results for the research paper: SETLEXSEM CHALLENGE: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models.

"Set theory has become the standard foundation for mathematics, as every mathematical object can be viewed as a set." -Stanford Encyclopedia of Philosophy

Install

When installing, first upgrade to the most recent pip. This ensures that setup.py runs correctly. An outdated version of pip can fail to run the InstallNltkWordnetAfterPackages command class in setup.py and cause subsequent errors.

/usr/bin/python3 -m venv venv
. venv/bin/activate
python3 -m pip install --upgrade pip
pip install -e .
pip install -e ".[dev,test]"

NLTK words

If you get errors from nltk about the words package not being installed while executing the code in this repository, run:

import nltk
nltk.download("words")

Note that words should be automatically installed by pip when you follow the installation instructions for this package.

Directory Structure

  • configs/
    • configs/experiments contains configuration files that specify hyperparameter settings for running experiments.
    • configs/generation_data contains configuration files for dataset generation.
    • configs/generation_prompt contains configuration files for prompt generation based on previously stored data.
    • configs/post_analysis contains a configuration file for analyzing cost, latency, and performance metrics for one set of hyperparameters in a particular study. This config is used by the script scripts/analysis_for_one_study.py.
    • configs/post_hypothesis contains a configuration file that specifies filtering criteria for generating figures for various hypotheses.
  • notebooks/ has a Jupyter notebook for generating figures that are used in the research paper
  • scripts/ contains Python scripts for running experiments, post-processing the results, and analysis of results
  • setlexsem/ is the module containing the core functions and utilities for analysis, experimentation, data generation, and sampling.
    • analyze contains code for error analysis of post-processed results, plus visualization code and utilities needed for generating figures for each hypothesis.
    • experiment contains code for invoking LLMs and running experiments for a particular hypothesis/study.
    • generate contains code for generating data, sampling synthetic sets, and building prompts, plus utilities needed for data generation.
    • prepare contains helper functions for partitioning words according to their frequencies.

Generating Datasets

Generate Sets with Numbers or Words

To generate your own data, you can run the following:

python setlexsem/generate/generate_data.py --config_path "configs/generation_data/numbers_and_words.yaml" --seed_value 292 --save_data 1
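The config file controls what kind of sets are sampled. A minimal sketch of what such a YAML file might contain is shown below; the keys are illustrative guesses, not the actual schema, so consult configs/generation_data/numbers_and_words.yaml for the real format.

```
# Illustrative only — check configs/generation_data/numbers_and_words.yaml
# for the actual keys this repository expects.
set_types: ["numbers", "words"]   # kinds of items to sample into sets
set_sizes: [2, 4, 8]              # cardinality of each generated set
max_value: 1000                   # upper bound for sampled numbers
num_samples: 100                  # sets to generate per configuration
```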

Generate Sets based on their training-set frequency

To generate sets based on their training-set frequency, we use an approximation based on rank frequency in the Google Books Ngrams corpus.

This requires wget (brew install wget or apt install wget). After installing wget, you need to create deciles.json. The following command downloads the English unigram term frequencies from the Google Books Ngram corpus, filters them to the nltk words English vocabulary, and stores the vocabulary, partitioned by deciles of rank frequency, in data/deciles.json.

scripts/make-deciles.sh

This will take ~10 minutes or more, depending on your bandwidth and the speed of your computer.
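The decile partitioning itself amounts to ranking the vocabulary by corpus frequency and splitting the ranked list into ten equal bins. A minimal sketch in plain Python (the function and variable names here are illustrative, not the actual setlexsem/prepare API):

```python
# Illustrative sketch of decile partitioning by rank frequency; not the
# repository's actual implementation.

def partition_into_deciles(frequencies):
    """Split words into 10 bins, ordered from most to least frequent.

    `frequencies` maps each word to its corpus count.
    """
    ranked = sorted(frequencies, key=frequencies.get, reverse=True)
    n = len(ranked)
    deciles = []
    for i in range(10):
        lo = i * n // 10
        hi = (i + 1) * n // 10
        deciles.append(ranked[lo:hi])
    return deciles

# Toy vocabulary: w0 is the most frequent word, w99 the least.
freqs = {f"w{i}": 100 - i for i in range(100)}
deciles = partition_into_deciles(freqs)
```

With 100 words, each decile holds 10 words, and the first decile contains the most frequent ones.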

Then run the following to generate data.

python setlexsem/generate/generate_data.py --config_path "configs/generation_data/deciles.yaml" --seed_value 292 --save_data 1

Generating Prompts

Example: Sets with Numbers

To generate your own prompts, you can run the following:

python setlexsem/generate/generate_prompts.py --config_path "configs/generation_prompt/test_config.yaml" --seed_value 292 --save_data 1
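Conceptually, each generated prompt presents two sets and asks the model to compute a set operation over them. A rough sketch of that shape (the function below is hypothetical, not the repository's actual prompt builder; the real templates live under setlexsem/generate and configs/generation_prompt):

```python
# Hypothetical illustration of the kind of prompt generated here.
def make_prompt(set_a, set_b, operation="union"):
    """Render two sets and a set-operation question as a prompt string."""
    return (
        f"A = {sorted(set_a)}\n"
        f"B = {sorted(set_b)}\n"
        f"What is the {operation} of A and B? Answer with a set."
    )

prompt = make_prompt({1, 2, 3}, {3, 4}, "intersection")
```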

Running Experiments End-to-End

  1. Create a config file like configs/experiments/anthr_sonnet.yaml
  2. Run the experiment:
python setlexsem/experiment/run_experiments.py --account_number <account-number> --save_file 1 --load_last_run 1 --config_file configs/experiments/anthr_sonnet.yaml

Note: Currently, our experiments depend on Amazon Bedrock and require an AWS account number. However, you can also run experiments using an OPENAI_KEY. We will add more instructions soon.

  3. Post-process the results (check that your study_name is present in the STUDY2MODEL dict in setlexsem/constants.py):
python scripts/save_processed_results_for_study_list.py

  4. Analyze cost, latency, and performance metrics for one set of hyperparameters for a particular study. Enter the hyperparameter values in configs/post_analysis/study_config.json, then run:

python scripts/analysis_for_one_study.py
  5. Generate figures using notebooks/Hypothesis Testing - Manuscript.ipynb. Validate the filtering criteria in configs/post_hypothesis/hypothesis.json.

Test

To run the full suite of tests, you need to provide your account number.

pytest -s .

You will then be prompted to provide your account number.

Coverage Report

pip install pytest-cov
pytest --cov=setlexsem --cov-report=term-missing

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
