
LLM-powered test coverage improver


by Juan Altmayer Pizzorno and Emery Berger at UMass Amherst's PLASMA lab.


About CoverUp

CoverUp automatically generates tests that ensure that more of your code is tested (that is, it increases its code coverage). CoverUp can also create a test suite from scratch if you don't yet have one. The new tests are based on your code, making them useful for regression testing.

CoverUp is designed to work closely with the pytest test framework. To generate tests, it first measures your suite's coverage using SlipCover. It then selects portions of the code that need more testing (that is, code that is uncovered). CoverUp then engages in a conversation with an LLM, prompting for tests, checking the results to verify that they run and increase coverage (again using SlipCover), and re-prompting for adjustments as necessary. Finally, CoverUp optionally checks that the new tests integrate well, attempting to resolve any issues it finds.
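The prompt, check and re-prompt cycle described above can be sketched roughly as follows. This is a minimal illustration, not CoverUp's actual code: Segment, generate_tests, ask_llm and run_test are hypothetical names, and the LLM call and the coverage measurement are passed in as plain functions so the loop can be run without either.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for a portion of code that lacks coverage."""
    name: str
    missing_lines: frozenset


def generate_tests(segments, ask_llm, run_test, max_attempts=3):
    """Collect tests that pass and cover at least one missing line.

    ask_llm(prompt) returns test source code; run_test(code) returns a
    (passed, covered_lines) pair. Both are assumptions of this sketch.
    """
    accepted = []
    for seg in segments:
        prompt = (f"Write a pytest test that executes lines "
                  f"{sorted(seg.missing_lines)} of {seg.name}.")
        for _ in range(max_attempts):
            test_code = ask_llm(prompt)
            passed, covered = run_test(test_code)
            if passed and covered & seg.missing_lines:
                accepted.append(test_code)  # good: runs and adds coverage
                break
            if not passed:
                # failed: ask for a correction
                prompt = "That test failed when run. Please correct it."
            else:
                # useless: passes but adds no coverage
                prompt = ("That test passes but does not reach the missing "
                          "lines. Please adjust it.")
    return accepted
```

Keeping the LLM and the test runner behind two function parameters makes the control flow easy to exercise with fakes; a real tool would also have to handle errors, cost and timeouts.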

For technical details and a complete evaluation, see our arXiv paper, CoverUp: Coverage-Guided LLM-Based Test Generation (PDF).

Installing CoverUp

CoverUp is available from PyPI, so you can install it with

$ python3 -m pip install coverup

LLM model access

CoverUp can be used with OpenAI, Anthropic or AWS Bedrock models; it requires that the access details be defined as shell environment variables: OPENAI_API_KEY, ANTHROPIC_API_KEY or AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_REGION_NAME, respectively.

For example, for OpenAI you would create an account, ensure it has a positive balance and then create an API key, storing its "secret key" (usually a string starting with sk-) in an environment variable named OPENAI_API_KEY:

$ export OPENAI_API_KEY=<...your-api-key...>
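Before starting a run, it can help to confirm which credentials are visible to the process. The sketch below only checks the environment variable names listed above; the configured_providers helper and the provider labels are illustrative and not part of CoverUp.

```python
import os

# Variable names as listed in this README; the grouping is for illustration.
REQUIRED_VARS = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "AWS Bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"],
}


def configured_providers(env=None):
    """Return the providers whose variables are all set and non-empty."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, names in REQUIRED_VARS.items()
        if all(env.get(name) for name in names)
    ]


if __name__ == "__main__":
    print(configured_providers() or "no provider credentials found")
```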

Using CoverUp

If your module is named mymod, its sources are under src and the tests under tests, you can run CoverUp as

$ coverup --source-dir src/mymod --tests-dir tests

CoverUp then creates tests named test_coverup_N.py, where N is a number, under the tests directory.
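To review what was generated, the new files can be listed in numeric order with a few lines of Python. This is a small convenience sketch that assumes only the test_coverup_N.py naming described above; the coverup_tests helper is hypothetical.

```python
import re
from pathlib import Path

# Matches the file naming described above, capturing N.
PATTERN = re.compile(r"test_coverup_(\d+)\.py")


def coverup_tests(tests_dir):
    """Return generated test files in tests_dir, sorted by their number N."""
    found = []
    for path in Path(tests_dir).iterdir():
        match = PATTERN.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]
```

Sorting by the integer N rather than by file name keeps test_coverup_2.py ahead of test_coverup_10.py.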

Example

Here we have CoverUp create additional tests for the popular package Flask:

$ coverup --package src/flask --tests tests
Measuring coverage...  90.9%
Prompting gpt-4o-2024-05-13 for tests to increase coverage...
(in the following, G=good, F=failed, U=useless and R=retry)
100%|███████████████████████████████████████| 92/92 [01:01<00:00,  1.50it/s, G=55, F=122, U=20, R=0, cost=~$4.19]
Measuring coverage...  94.4%

$

In just over a minute, CoverUp increases Flask's test coverage from 90.9% to 94.4%.

Avoiding flaky tests

While evaluating each newly generated test, CoverUp executes it a number of times in an attempt to detect any flaky tests; that can be adjusted with the --repeat-tests and --no-repeat-tests options. If CoverUp detects that a newly generated test is flaky, it prompts the LLM for a correction.
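The idea behind repeated execution can be illustrated with a short sketch: run the same test several times and treat it as flaky if the outcomes differ. This is not CoverUp's implementation; is_flaky and make_intermittent are hypothetical names, and the intermittent test is made deterministic here so the example is reproducible.

```python
import itertools


def is_flaky(test_fn, repeats=5):
    """Run test_fn `repeats` times; report True if outcomes differ."""
    outcomes = set()
    for _ in range(repeats):
        try:
            test_fn()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    return len(outcomes) > 1


def make_intermittent():
    """Build a test that fails on every third call (a stand-in for flakiness)."""
    calls = itertools.count(1)

    def intermittent():
        assert next(calls) % 3 != 0

    return intermittent


def always_passes():
    assert 1 + 1 == 2
```

Note that a test that fails on every run is not flaky by this definition, only consistently failing; a single run cannot tell the two apart from a passing test that happens to be flaky, which is why repetition is needed.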

Test pollution and isolation

CoverUp only adds tests to the suite that, when run by themselves, pass and increase coverage. However, it is possible for tests to "pollute" the state, changing it in a way that causes other tests to fail. By default, CoverUp uses the pytest-cleanslate plugin to isolate tests, working around any (in-memory) test pollution; that can be disabled by passing in the --no-isolate-tests option. CoverUp can also be asked to find and disable the polluting test module or function (--disable-polluting) or simply disable any failing tests (--disable-failing).
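As an illustration of in-memory pollution, consider two checks that share module-level state: each passes when run alone, but the second fails when run after the first. The sketch below is hypothetical and does not reproduce how pytest-cleanslate works; "isolation" is simulated here simply by restoring a snapshot of the shared state before each check.

```python
import copy

# Hypothetical shared state, standing in for any module-level global.
SETTINGS = {"debug": False}


# Named without the test_ prefix so pytest would not collect them.
def enable_debug_check():
    SETTINGS["debug"] = True  # mutates shared state and never restores it
    assert SETTINGS["debug"]


def default_is_not_debug_check():
    assert SETTINGS["debug"] is False  # passes alone, fails after the above


def run(checks, isolate=False):
    """Run checks in order; with isolate=True, reset state before each one."""
    snapshot = copy.deepcopy(SETTINGS)
    results = {}
    for check in checks:
        if isolate:
            SETTINGS.clear()
            SETTINGS.update(copy.deepcopy(snapshot))
        try:
            check()
            results[check.__name__] = "pass"
        except AssertionError:
            results[check.__name__] = "fail"
    # Leave the shared state as it was found.
    SETTINGS.clear()
    SETTINGS.update(snapshot)
    return results
```

Running run([enable_debug_check, default_is_not_debug_check]) reports the second check as failing, while the same call with isolate=True reports both as passing.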

Running CoverUp with Docker

To evaluate the tests generated by the LLM, CoverUp must execute them. For best security and to minimize the risk of damage to your system, we recommend running CoverUp with Docker.

Evaluation

The graph shows CoverUp in comparison to CodaMosa, a state-of-the-art search-based test generator based on the Pynguin test generator. For this experiment, both CoverUp and CodaMosa created tests "from scratch", that is, ignoring any existing test suite. The bars show the difference in coverage percentage between CoverUp and CodaMosa for various Python modules; green bars, above 0, indicate that CoverUp achieved a higher coverage.

As the graph shows, CoverUp achieves higher coverage than CodaMosa for most modules.


Work In Progress

This is an early release of CoverUp. Please enjoy it, and pardon any disruptions as we work to improve it. We welcome bug reports, experience reports, and feature requests (please open an issue).

