Creates basicness-scores of software
Basic Score
How basic is that file?
Measure how unsurprising a text is by asking an LLM. "Unsurprising" is roughly the double negation of "probable", and large language models just predict the next token.
By feeding the text through an LLM and inspecting the probability the model assigns to each known next token, Basic Score produces a probability for every token in your text.
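As a rough illustration of how per-token probabilities could be turned into a single number, here is a minimal sketch. It assumes you already have the probability the model gave to each actual next token. The function name `surprisal_summary` and the choice of mean surprisal and perplexity as aggregates are assumptions made for this example, not necessarily what basicscore computes internally.

```python
import math


def surprisal_summary(token_probs):
    """Summarize probabilities p(token | preceding tokens) for one text.

    Hypothetical helper: the aggregation here is an assumption,
    not basicscore's documented scoring formula.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")

    # Surprisal in bits: a rare token (low p) yields a high value.
    surprisals = [-math.log2(p) for p in token_probs]
    mean_bits = sum(surprisals) / len(surprisals)

    return {
        "mean_surprisal_bits": mean_bits,
        # Perplexity is 2 raised to the mean surprisal in bits.
        "perplexity": 2.0 ** mean_bits,
        # Position of the token the model found least expected.
        "most_surprising_index": max(
            range(len(surprisals)), key=surprisals.__getitem__
        ),
    }


if __name__ == "__main__":
    print(surprisal_summary([0.5, 0.25]))
```

A lower mean surprisal means the model found the text more predictable, which is the sense of "basic" used here.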
Why would you want to do that?
While I don't think this is sufficient on its own, the main idea is to give a third-party opinion on code. I'm currently evaluating the idea by comparing relative scores across third-party code bases and by adding noise to them.
Once the evaluation setup is good I'll implement a RAG loop to create better contexts, to see if I can get code convention violations to penalize the score.
Besides that, this can potentially detect LLM-generated output.
Installation
For CPU inference it's easy:
pip install basicscore
For GPU inference you must force-reinstall llama-cpp-python with GPU support enabled, for example:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
Usage
Typical usage:
basicscore --model model.gguf --gpu-layers 50 --html-out=basicscore_file.html file.py
However, you can also use a config file. It is looked up in the following places, in decreasing order of precedence: the path passed as an argument, then $CWD/.basicscore.json, then $HOME/.config/basicscore/config.json.
There you can specify all the options for ease of use, so that you can run it like this:
basicscore file.py
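The lookup order described above can be sketched as follows. This is only an illustration of the precedence rule, not basicscore's actual code: the function name `find_config` is hypothetical, and falling through to the next location when an explicitly passed path does not exist is an assumption.

```python
from pathlib import Path


def find_config(explicit=None, cwd=None, home=None):
    """Return the first existing config file, or None if there is none.

    Hypothetical helper mirroring the documented precedence:
    explicit argument, then $CWD/.basicscore.json,
    then $HOME/.config/basicscore/config.json.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()

    candidates = []
    if explicit is not None:
        candidates.append(Path(explicit))
    candidates.append(cwd / ".basicscore.json")
    candidates.append(home / ".config" / "basicscore" / "config.json")

    # Highest-precedence location that exists wins.
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    print(find_config())
```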
Why didn't you use tinygrad?
I tried at first, but I could not get acceptable performance on my hardware, and for this to be feasible to evaluate I needed more tokens per second than I could squeeze out of it. A lot has happened since I started this project, and I'll likely revisit it later.
Hashes for basicscore-0.0.2-py2.py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6c6730ee3ca044d5c34112a89604fc4f556ad1b2a2ee1723aea2d47be52453da |
MD5 | 8ebf1fb068a348d2700780f9a4af719a |
BLAKE2b-256 | 0a54877637a0c616f6466c72517558a7ca2478cc9a490e4b80209711c2bde5d0 |