Visualizer for human attention over source code tokens.
Colored Tokens Visualizer
A Python library that colors your source code tokens according to a vector of weights.
Java Example
Given some Java code:
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}
You can create a JSON file that describes the tokens you are going to color. For the snippet above, you have to provide the following JSON file, example_java.json:
[
{
"t": "class",
"i": 0,
"l": 1,
"c": 1
},
{
"t": "HelloWorld",
"i": 1,
"l": 1,
"c": 7
},
{
"t": "{",
"i": 2,
"l": 1,
"c": 18
},
{
"t": "public",
"i": 3,
"l": 2,
"c": 5
},
{
"t": "static",
"i": 4,
"l": 2,
"c": 12
},
...
]
Note that the file contains no meaningless tokens, such as whitespace, newlines or tabs.
**Warning:** If you use our tool to prepare the JSON file, remember that the JavaTokenizer removes comments while the PythonTokenizer keeps them. Implement your own tokenizer if you want a different behavior.
The JSON file is a list of elements, each representing one token. A token contains the following fields:
- t: the string representation of the token
- i: the major index position of the token
- l: the line number of the token
- c: the column number of the token
- si: the minor index position (si = subindex), used when an identifier is split into several tokens
- d: the number of tokens that share the same index, including the token itself
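To make the field layout concrete, here is a minimal sketch that produces the t, i, l and c fields for the Java snippet above. It is not the library's JavaTokenizer: the `tokenize` function, the regular expression and the `first_column` parameter are hypothetical names introduced only for illustration, and the 1-based column offset is an assumption taken from the values shown in example_java.json.

```python
import json
import re

# Hypothetical helper, not part of the library: a naive regex tokenizer
# that only illustrates the expected JSON layout.
# It matches string literals, words, and single punctuation characters,
# so whitespace never becomes a token.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|\w+|[^\w\s]')


def tokenize(code, first_column=1):
    """Return a list of token dicts with the t, i, l and c fields."""
    tokens = []
    for line_number, line in enumerate(code.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            tokens.append({
                "t": match.group(),
                "i": len(tokens),                    # running token index
                "l": line_number,                    # 1-based line number
                "c": match.start() + first_column,   # column of first char
            })
    return tokens


java_code = (
    "class HelloWorld {\n"
    "    public static void main(String[] args) {\n"
    '        System.out.println("Hello, World!");\n'
    "    }\n"
    "}\n"
)
tokens = tokenize(java_code)
print(json.dumps(tokens[:3], indent=2))
```

Passing `first_column=0` gives 0-based columns, which is what the Python example in this document appears to use.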
Python Example
With Python we might want to split a long identifier that contains underscores into separate tokens. That is why we have the fields si and d.
def hello_python():
    print("Hello World!")
You will need to prepare your input like this JSON file, example_python.json:
[
{
"t": "def",
"i": 0,
"l": 1,
"c": 0
},
{
"t": "hello",
"i": 1,
"l": 1,
"c": 4,
"si": 0,
"d": 2
},
{
"t": "python",
"i": 1,
"l": 1,
"c": 10,
"si": 1,
"d": 2
},
{
"t": "(",
"i": 2,
"l": 1,
"c": 16
},
...
]
Here the identifier hello_python is split in two. Both parts must use the same i field and an incremental si (subindex) field. Both should also have a d (dimension) field that records how many tokens the original identifier was split into.
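The splitting rule described here can be sketched as follows. This is only an illustration under stated assumptions: `split_identifier` is a hypothetical helper, not a function of the library, and it assumes that underscores themselves are dropped and that each part keeps the column of its first character.

```python
def split_identifier(token):
    """Split one token dict on underscores into sub-tokens.

    Hypothetical helper: the sub-tokens share the parent's "i" and "l",
    get an incremental "si", and record the number of parts in "d".
    Tokens that do not split into at least two parts are returned as-is.
    """
    parts = [part for part in token["t"].split("_") if part]
    if len(parts) < 2:
        return [dict(token)]

    sub_tokens = []
    search_from = 0
    for sub_index, part in enumerate(parts):
        # Offset of this part inside the original identifier.
        offset = token["t"].index(part, search_from)
        sub_tokens.append({
            "t": part,
            "i": token["i"],
            "l": token["l"],
            "c": token["c"] + offset,
            "si": sub_index,
            "d": len(parts),
        })
        search_from = offset + len(part)
    return sub_tokens


identifier = {"t": "hello_python", "i": 1, "l": 1, "c": 4}
for sub_token in split_identifier(identifier):
    print(sub_token)
```

For hello_python this yields the same two entries as in example_python.json.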
Usage
Once you have the two JSON files, example_java.json and example_python.json, you can use the following code to create the colored version of your tokens with your own weights.
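import json
from random import randint
from source_code import SourceCode

# Load the token list to know how many weights are needed.
with open("example_java.json") as token_file:
    tokens = json.load(token_file)

# One random weight per token; replace these with your own weights.
weights = [randint(1, 10) for _ in tokens]

java_sc = SourceCode("example_java.json")
java_sc.show_with_weights(weights=weights)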
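Before calling the visualizer it can help to check that the weight vector lines up with the tokens. The sketch below assumes the library expects exactly one numeric weight per token entry in the JSON file, which this document suggests but does not state outright. `check_weights` is a hypothetical helper, and the inline token list stands in for the first entries of example_java.json.

```python
import json


def check_weights(tokens, weights):
    """Raise ValueError unless there is exactly one numeric weight per token."""
    if len(weights) != len(tokens):
        raise ValueError(f"expected {len(tokens)} weights, got {len(weights)}")
    for position, weight in enumerate(weights):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, (int, float)):
            raise ValueError(f"weight at position {position} is not a number")
    return list(weights)


# Inline stand-in for the first tokens of example_java.json.
tokens = json.loads(
    '[{"t": "class", "i": 0, "l": 1, "c": 1},'
    ' {"t": "HelloWorld", "i": 1, "l": 1, "c": 7}]'
)
weights = check_weights(tokens, [3, 7])
print(weights)
```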
Hashes for code-attention-visualizer-0.0.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 79ad156a5d1c0fc9f5baa4f17c34c08502a43a1927e73e6d075982cbb0f50db8
MD5 | 67033c62e4dcc5802236664eb8ddd289
BLAKE2b-256 | f11930a9df7de85befe648676106ac1fc322b1cd18f0f861c633f547bff05ea2