Cost Of Code
How much would it have cost if GPT-4 had written your code?
Installation
pip install cost-of-code
Usage
cost-of-code
Arguments
Argument | Description | Default Value |
---|---|---|
--repo-path | The path to the git repository. | ./ |
--branch-name | The name of the branch to analyze. | master |
--cost-per-thousand-tokens | The cost (in USD) per thousand tokens according to the current OpenAI pricing. | 0.06 |
--extension-whitelist | A comma-separated list of file extensions to consider when analyzing the repository. | *.py,*.js,*.java,*.c,*.cpp,*.go |
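The defaults in the table can be mirrored in a small argparse sketch. This is an illustration only, not the tool's actual parser: build_parser and matches_whitelist are hypothetical names, and matching the whitelist patterns against a file's base name with fnmatch is an assumption about how the patterns might be applied.

```python
import argparse
import fnmatch
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="cost-of-code")
    parser.add_argument("--repo-path", default="./")
    parser.add_argument("--branch-name", default="master")
    parser.add_argument("--cost-per-thousand-tokens", type=float, default=0.06)
    parser.add_argument(
        "--extension-whitelist",
        default="*.py,*.js,*.java,*.c,*.cpp,*.go",
    )
    return parser


def matches_whitelist(path: str, whitelist: str) -> bool:
    """Return True if the file's base name matches any comma-separated pattern.

    Assumption: patterns are shell-style globs such as "*.py".
    """
    patterns = [p.strip() for p in whitelist.split(",") if p.strip()]
    name = os.path.basename(path)
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--branch-name", "main"])
print(args.branch_name)  # main
print(matches_whitelist("src/app.py", args.extension_whitelist))  # True
print(matches_whitelist("README.md", args.extension_whitelist))   # False
```

With the default whitelist, files such as README.md or package.json would be skipped under this assumption, so only source files contribute to the token count.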
Running cost-of-code prints the total number of tokens in the repository and the estimated cost to generate these tokens using GPT-4.
Sample output
Total tokens in the current state of the repo: 1334
Estimated cost for the current state of the repo: $0.08
Total tokens in all added lines: 1517
Estimated cost for all added lines: $0.09
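The figures in the sample output follow from the default price of $0.06 per 1,000 tokens. A minimal sketch of that arithmetic is shown here; estimate_cost is a hypothetical helper name, and rounding to two decimals only at display time is an assumption.

```python
def estimate_cost(token_count: int, cost_per_thousand_tokens: float = 0.06) -> float:
    """Estimated cost in USD: (tokens / 1000) * price per thousand tokens."""
    return token_count / 1000 * cost_per_thousand_tokens


# Token counts taken from the sample output.
print(f"${estimate_cost(1334):.2f}")  # $0.08
print(f"${estimate_cost(1517):.2f}")  # $0.09
```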
How It Works
- The script starts by scanning the specified git repository.
- It uses GitPython to collect all the files that are currently tracked by git. This means it will not scan files that are ignored by git (like those specified in .gitignore).
- It uses GitPython to collect the lines added in each git commit.
- Only the added lines from the git diff patches are considered and tokenized. It does not tokenize lines that have been removed. If the user is not in a git repository, an error message is displayed and the program exits.
- It then tokenizes the lines from these files using the tiktoken Python package from OpenAI. tiktoken is a tokenizer that counts the tokens in the same way the OpenAI API does.
- The tokens from each file are then counted.
- It also uses GitPython to count the number of tokens added to the repo so far by scanning through the git commit history. Only the added lines in each git commit are considered for this token count.
- The script finally estimates how much it would cost to generate the same amount of tokens using GPT-4, based on the current pricing.
- It reports two cost estimates: one for the current state of the repository (total tokens in the code files in the current state), and one for the total added tokens over the entire history of the repository. This gives you a sense of how the cost to generate your codebase with GPT-4 would have accumulated over time as the codebase grew.
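The "added lines only" step can be sketched with the standard library alone. This is a simplified illustration, not the tool's own code: added_lines and count_tokens are hypothetical names, the patch is assumed to be git-style unified diff text, and the whitespace splitter below is only a stand-in for a real tokenizer.

```python
from typing import Callable, Iterable


def added_lines(patch: str) -> list[str]:
    """Collect lines added by a git-style unified diff, without the leading '+'.

    File headers such as '+++ b/app.py' are skipped because they appear
    before the first '@@' hunk marker of each file section.
    """
    added = []
    in_hunk = False
    for line in patch.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file section starts; headers follow
        elif line.startswith("@@"):
            in_hunk = True   # hunk body starts on the next line
        elif in_hunk and line.startswith("+"):
            added.append(line[1:])
    return added


def count_tokens(lines: Iterable[str], encode: Callable[[str], list]) -> int:
    """Sum the number of tokens the given encoder produces for each line."""
    return sum(len(encode(line)) for line in lines)


SAMPLE_PATCH = """\
diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import sys
-print("hi")
+print("hello")
+sys.exit(0)
"""


def whitespace_encode(text: str) -> list:
    """Stand-in tokenizer for the sketch: splits on whitespace."""
    return text.split()


lines = added_lines(SAMPLE_PATCH)
print(lines)  # ['print("hello")', 'sys.exit(0)']
print(count_tokens(lines, whitespace_encode))  # 2
```

If tiktoken is installed, tiktoken.encoding_for_model("gpt-4").encode could be passed as the encode argument in place of the stand-in. Counts obtained line by line may differ slightly from tokenizing a whole file at once, so treat the result as an approximation.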
Please note: this tool assumes that the current cost per 1,000 tokens for using GPT-4 is $0.06, as per OpenAI's current pricing. If OpenAI's pricing changes, you can update this value using the --cost-per-thousand-tokens argument.