How much would it have cost if GPT-4 had written your code?

Cost Of Code

Installation

pip install cost-of-code

Usage

cost-of-code

Arguments

| Argument | Description | Default Value |
|---|---|---|
| --repo-path | The path to the git repository. | ./ |
| --branch-name | The name of the branch to analyze. | master |
| --cost-per-thousand-tokens | The cost (in USD) per thousand tokens according to the current OpenAI pricing. | 0.06 |
| --extension-whitelist | A comma-separated list of file extensions to consider when analyzing the repository. | *.py,*.js,*.java,*.c,*.cpp,*.go |

Running the command prints the total number of tokens in the repository and the estimated cost of generating those tokens with GPT-4, both for the current state of the repo and for all lines added over its history.
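To illustrate how a value such as the default --extension-whitelist could select files, here is a minimal sketch. It assumes the patterns are shell-style globs matched against the file name; the helper names `parse_whitelist` and `is_whitelisted` are hypothetical and not taken from the package.

```python
from fnmatch import fnmatchcase

# Default value from the arguments table.
DEFAULT_WHITELIST = "*.py,*.js,*.java,*.c,*.cpp,*.go"


def parse_whitelist(raw: str) -> list[str]:
    """Split a comma-separated pattern string, dropping blanks and stray spaces."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def is_whitelisted(path: str, patterns: list[str]) -> bool:
    """Return True if the file name (last path component) matches any pattern.

    Assumption: patterns are case-sensitive shell-style globs.
    """
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in patterns)


patterns = parse_whitelist(DEFAULT_WHITELIST)
candidates = ["src/app.py", "README.md", "lib/util.go", "main.cpp"]
selected = [path for path in candidates if is_whitelisted(path, patterns)]
print(selected)
```

Under these assumptions, README.md is skipped because no pattern in the default list matches it.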

Sample output

Total tokens in the current state of the repo: 1334
Estimated cost for the current state of the repo: $0.08
Total tokens in all added lines: 1517
Estimated cost for all added lines: $0.09
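The figures in the sample output follow from simple arithmetic: tokens divided by 1,000, multiplied by the price per thousand tokens. The sketch below reproduces them with the default price of 0.06; the function name `estimate_cost` is illustrative and may not match the package's internals.

```python
def estimate_cost(token_count: int, cost_per_thousand_tokens: float = 0.06) -> float:
    """Estimated cost in USD of generating `token_count` tokens."""
    return token_count / 1000 * cost_per_thousand_tokens


# Token counts taken from the sample output above.
current_state = estimate_cost(1334)  # 1334 / 1000 * 0.06 = 0.08004
all_added = estimate_cost(1517)      # 1517 / 1000 * 0.06 = 0.09102

print(f"Estimated cost for the current state of the repo: ${current_state:.2f}")
print(f"Estimated cost for all added lines: ${all_added:.2f}")
```

Rounded to two decimal places, these give $0.08 and $0.09, matching the sample output.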

How It Works

  1. The script starts by scanning the specified git repository. If the user is not in a git repository, an error message is displayed and the program exits.
  2. It uses GitPython to collect all the files that are currently tracked by git. This means it will not scan files that are ignored by git (like those specified in .gitignore).
  3. It also uses GitPython to collect the lines added in each git commit.
  4. Only the added lines from the git diff patches are considered and tokenized. Lines that have been removed are not tokenized.
  5. It then tokenizes the lines from the tracked files using the tiktoken Python package from OpenAI. tiktoken is a tokenizer that counts tokens in the same way the OpenAI API does.
  6. The tokens from each file are then counted.
  7. It scans through the git commit history to count the number of tokens added to the repo so far. Only the added lines in each git commit are considered for this token count.
  8. The script finally estimates how much it would cost to generate the same number of tokens using GPT-4, based on the configured cost per thousand tokens.
  9. It reports two cost estimates: one for the current state of the repository (total tokens in the code files as they are now), and one for the total added tokens over the entire history of the repository. This gives you a sense of how the cost to generate your codebase with GPT-4 would have accumulated over time as the codebase grew.
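The "added lines only" rule from the steps above can be sketched as follows. This is an illustration under stated assumptions, not the package's actual code: `added_lines` and `count_added_tokens` are hypothetical names, the patch is a hand-written unified diff, and a whitespace split stands in for the tokenizer so the example runs without extra dependencies. With tiktoken installed, one could pass `tiktoken.encoding_for_model("gpt-4").encode` as `encode` instead.

```python
from typing import Callable, Iterable


def added_lines(patch: str) -> list[str]:
    """Return the lines a unified diff adds, without the leading '+'."""
    added = []
    in_hunk = False
    for line in patch.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False   # a new file section starts; headers follow
        elif line.startswith("@@"):
            in_hunk = True    # hunk header; content lines follow
        elif in_hunk and line.startswith("+"):
            added.append(line[1:])
        # Removed ('-') and context (' ') lines are deliberately ignored.
    return added


def count_added_tokens(patches: Iterable[str], encode: Callable[[str], list]) -> int:
    """Sum the token counts of the added lines across all patches."""
    return sum(len(encode("\n".join(added_lines(patch)))) for patch in patches)


SAMPLE_PATCH = """\
diff --git a/hello.py b/hello.py
--- a/hello.py
+++ b/hello.py
@@ -1,2 +1,3 @@
 import sys
-print("hi")
+print("hello")
+print(sys.argv)
"""

print(added_lines(SAMPLE_PATCH))
# Stand-in tokenizer (assumption): split on whitespace.
print(count_added_tokens([SAMPLE_PATCH], str.split))
```

Tracking whether the parser is inside a hunk keeps the `+++ b/hello.py` file header from being mistaken for an added line.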

Please note: this tool assumes that the current cost per 1,000 tokens for using GPT-4 is $0.06, as per OpenAI's current pricing. If OpenAI's pricing changes, you can update this value using the --cost-per-thousand-tokens argument.
