ScubaTrace
Next-Generation Codebase Analysis Toolkit.
Install
pip install scubatrace
Features
- Multi-Language Support (C, C++, Java, Python, JavaScript, Go)
- No Need To Compile
- Statement-Based AST Abstraction
- Code Call Graph
- Code Control Flow Graph
- Code Data/Control Dependency Graph
- References Inference
- CPG Based Multi-Granularity Slicing
| Tool | Type | Capabilities | Requires Compilation (Instruction) | Supported Languages | Limitations |
|---|---|---|---|---|---|
| ScubaTrace | Lib | CG/CFG/DataFlow/Slicing | ✅ No | Multiple Languages | |
| Soot | CLI/Lib (Java) | CG/CFG/DataFlow | ❌ Yes | Java (Bytecode) | Cannot directly analyze the source code |
| LLVM | CLI/Lib (C) | CG/CFG/DataFlow | ❌ Yes | C/C++ (IR) | Cannot directly analyze the source code |
| pycallgraph | CLI | CG | ✅ No | Python | Does not provide a library, requires parsing the tool output |
| pycg | CLI | CG | ✅ No | Python | Precision is low, requires parsing the tool output, no longer maintained |
| Jelly | CLI | CG | ✅ No | JavaScript | Incomplete call graph (CG), the generated output requires further processing |
| Infer | OCaml | CG/CFG/DataFlow | ❌ Yes | Multiple Languages | 1. High cost of adaptation |
| CodeQL | QL | CG/CFG/DataFlow | ❌ Required for compiled languages ✅ Not required for interpreted languages | Multiple Languages | 1. Compiled languages require compilation 2. Requires learning QL and using it for analysis 3. Lower performance, slow for large-scale projects |
| Joern | CLI/Scala | CG/CFG/DataFlow | ✅ No | Multiple Languages | 1. The generated CG and other results cannot be used directly and require further processing 2. Generated CGs are prone to resolution errors and output failures 3. Lower performance, slow for large-scale projects |
Usage
Project-Level Analysis
Load a project (codebase)
import scubatrace

proj = scubatrace.CProject("path/to/your/codebase")
Call Graph
# Get the call graph of the project
callgraph = proj.callgraph
# Export call graph to a dot file
proj.export_callgraph("callgraph.dot")
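Once the call graph has been exported, the DOT file can be inspected with nothing but the standard library. The sketch below is a minimal illustration, not part of ScubaTrace: `read_dot_edges` is a hypothetical helper, and it assumes edges appear one per line in the plain `"src" -> "dst"` form. The file ScubaTrace actually writes may use different node IDs or attributes, so the pattern may need adjusting.

```python
import re

# Matches edge lines such as `"main" -> "parse";` or `parse -> tokenize [label="12"];`.
# Assumption: directed-graph syntax with one edge per line.
EDGE_RE = re.compile(r'^\s*"?([^"\s]+)"?\s*->\s*"?([^"\s;\[]+)"?')


def read_dot_edges(dot_text: str) -> list[tuple[str, str]]:
    """Return (source, target) pairs for every edge line in a DOT document."""
    edges = []
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            edges.append((match.group(1), match.group(2)))
    return edges


# Hand-written sample standing in for the contents of callgraph.dot.
sample = '''
digraph callgraph {
    "main" -> "parse";
    "main" -> "report";
    parse -> tokenize [label="12"];
}
'''
edges = read_dot_edges(sample)
```

With a real export, `read_dot_edges(open("callgraph.dot").read())` would yield the same kind of list, provided the file follows the assumed layout.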
Code Search
stat = proj.search_function("relative/path/to/your/file.c", start_line=20)
File-Level Analysis
Load a file from a project
file = proj.files["relative/path/to/your/file.c"]
Function-Level Analysis
Load a function from a file
the_first_func = file.functions[0]
func_in_tenth_line = file.function_by_line(10)
# The examples below use `func` for whichever function you loaded
func = the_first_func
Call Relationships
callers = func.callers
callfrom, callto, callsite_line, callsite_column = (
    callers[0].src,
    callers[0].dst,
    callers[0].line,
    callers[0].column,
)
callees = func.callees
callfrom, callto, callsite_line, callsite_column = (
    callees[0].src,
    callees[0].dst,
    callees[0].line,
    callees[0].column,
)
Function Control Flow Graph
# Export the control flow graph to a dot file
func.export_cfg_dot("cfg.dot")
Function Data Dependency Graph
# Export the data dependency graph to a dot file
func.export_cfg_dot("ddg.dot", with_ddg=True)
Function Control Dependency Graph
# Export the control dependency graph to a dot file
func.export_cfg_dot("cdg.dot", with_cdg=True)
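The exported `.dot` files are plain Graphviz input, so they can be rendered with the `dot` command-line tool. The wrapper below is a small sketch that assumes Graphviz is installed and on `PATH`; `dot_command` and `render_dot` are hypothetical helper names, not ScubaTrace functions.

```python
import shutil
import subprocess
from pathlib import Path


def dot_command(dot_path: str, fmt: str = "svg") -> list[str]:
    """Build the Graphviz command line that renders dot_path into the given format."""
    out_path = Path(dot_path).with_suffix(f".{fmt}")
    return ["dot", f"-T{fmt}", str(dot_path), "-o", str(out_path)]


def render_dot(dot_path: str, fmt: str = "svg") -> bool:
    """Render the file if Graphviz is available; return False when it is not."""
    if shutil.which("dot") is None:
        return False
    subprocess.run(dot_command(dot_path, fmt), check=True)
    return True
```

For example, `render_dot("cfg.dot")` would write `cfg.svg` next to the input file, and `render_dot("ddg.dot", "png")` would write `ddg.png`.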
Function Code Walk
statements_you_interest = list(
    func.walk_backward(
        filter=lambda x: x.is_jump_statement,
        stop_by=lambda x: x.is_jump_statement,
        depth=-1,
        base="control",
    )
)
statements_you_interest = list(
    func.walk_forward(
        filter=lambda x: x.is_jump_statement,
        stop_by=lambda x: x.is_jump_statement,
        depth=-1,
        base="control",
    )
)
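To make the roles of `filter`, `stop_by`, and `depth` more concrete, here is a toy backward walk over a hand-written predecessor map. It shows one plausible reading of those parameters (yield nodes that pass `filter`, stop expanding at nodes that satisfy `stop_by`, treat a negative `depth` as unlimited). It is an assumption for illustration only and is not ScubaTrace's implementation; the graph and the function are made up.

```python
from collections import deque
from typing import Callable, Iterator

# Toy control-flow graph: each statement maps to its predecessors.
PREDECESSORS = {
    "return": ["print"],
    "print": ["if"],
    "if": ["assign"],
    "assign": [],
}


def walk_backward(
    start: str,
    filter: Callable[[str], bool],  # keyword name mirrors the README; shadows the builtin
    stop_by: Callable[[str], bool],
    depth: int = -1,
) -> Iterator[str]:
    """Breadth-first walk over the predecessors of start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node != start:
            if filter(node):
                yield node
            if stop_by(node):
                continue  # do not walk past a stopping node
        if depth >= 0 and dist >= depth:
            continue  # depth budget exhausted on this path
        for pred in PREDECESSORS[node]:
            if pred not in seen:
                seen.add(pred)
                queue.append((pred, dist + 1))


found = list(
    walk_backward(
        "return",
        filter=lambda s: s in {"if", "assign"},
        stop_by=lambda s: s == "if",
    )
)
```

In this toy graph the walk reports `if` and never reaches `assign`, because the walk stops at the first node that satisfies `stop_by`.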
Multi-Granularity Slicing
# Slicing by lines
lines_you_interest = [4, 5, 19]
slice_statements = func.slice_by_lines(
    lines=lines_you_interest,
    control_depth=3,
    data_dependent_depth=5,
    control_dependent_depth=2,
)
# Slicing by statements
statements_you_interest = func.statements[0:3]
slice_statements = func.slice_by_statements(
    statements=statements_you_interest,
    control_depth=3,
    data_dependent_depth=5,
    control_dependent_depth=2,
)
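The depth arguments bound how far a slice grows from the lines or statements of interest along each kind of edge. The sketch below illustrates that idea on a toy dependency graph keyed by line number. It is a simplified model under stated assumptions (undirected reach, only two edge kinds, hypothetical names `bounded_reach` and `toy_slice`) and does not claim to reproduce ScubaTrace's slicing algorithm.

```python
def bounded_reach(seeds, edges, max_depth):
    """Collect nodes within max_depth hops of the seeds, following edges in either direction."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    reached = set(seeds)
    frontier = set(seeds)
    for _ in range(max_depth):
        frontier = {n for f in frontier for n in neighbours.get(f, ())} - reached
        reached |= frontier
    return reached


def toy_slice(lines, data_edges, control_edges, data_depth, control_depth):
    """Union of the depth-bounded data-dependency and control-dependency neighbourhoods."""
    data_part = bounded_reach(lines, data_edges, data_depth)
    control_part = bounded_reach(lines, control_edges, control_depth)
    return sorted(data_part | control_part)


# Toy function: line 1 defines x, line 2 derives y from x, line 3 tests y,
# lines 4 and 5 run under that test, and line 5 also uses x.
DATA_EDGES = [(1, 2), (2, 3), (1, 5)]
CONTROL_EDGES = [(3, 4), (3, 5)]

result = toy_slice([5], DATA_EDGES, CONTROL_EDGES, data_depth=1, control_depth=1)
```

Raising `data_depth` to 2 pulls line 2 into the slice as well, which shows why larger depths produce larger, less focused slices.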
Statement-Level Analysis
Load a statement from a function
the_first_stmt = the_first_func.statements[0]
stmt_in_second_line = the_first_func.statement_by_line(2)
stmt_by_type = func.statements_by_type('tree-sitter Queries', recursive=True)
Statement Controls
pre_controls: list[Statement] = stat.pre_controls
post_controls: list[Statement] = stat.post_controls
Statement Data Dependencies
pre_data_dependents: dict[Identifier, list[Statement]] = stat.pre_data_dependents
post_data_dependents: dict[Identifier, list[Statement]] = stat.post_data_dependents
Statement Control Dependencies
pre_control_dependents: list[Statement] = stat.pre_control_dependents
post_control_dependents: list[Statement] = stat.post_control_dependents
AST Node
You can also get the AST node from a file, function, or statement.
file_ast = file.node
func_ast = func.node
stmt_ast = stat.node
ScubaTrace Landscape