SQL Fuzzing Module
Project description
AMOEBA-6422
Environment Setup
- Install dependencies
- Install postgresql-12 (https://www.postgresql.org/download/)
- Download demo data from dropbox and unzip it to
data
folder - Start postgresql server with data folder:
/usr/lib/postgresql/12/bin/pg_ctl -D ./data start
- If you install postgresql-12 from apt, stop it via
sudo systemctl stop postgresql.service
and give permission to the lock file viasudo chmod o+w /var/run/postgresql
- If you install postgresql-12 from apt, stop it via
- Install dependencies
sudo apt-get install python3-dev libmysqlclient-dev
(if not Debian/Ubuntu, see https://github.com/PyMySQL/mysqlclient#install for more details)pip install -r requirements.txt
for python dependencies
- Setup environment
- Installation of
direnv
is needed. And runeval "$(direnv hook bash)"
to setup the environment (replacebash
withzsh
or any shell you're using). - Build the
MUTATOR
ofAMOEBA
(only necessary if you make changes to theMUTATOR
)cd /workspace/calcite-fuzzing && ./gradlew build -x core:checkstyleMain -x test -x core:checkstyleTest -x core:forbiddenApisTest -x core:autostyleJavaCheck
Quick Start
AMOEBA
is configurable, a launch command template looks like the following:
$ timeout {total_timeout} ./test_driver.py --workers {num_workers} --output {outputfolder} --queries {num_queries_per_worker} --rewriter ./calcite-fuzzing --dbms={dbms_undertest} --validate --num_loops={num_feedbackloops} --feedback={conf_feedback} --dbconf=db_conf_demo.json --query_timeout {per_query_timeout}
You can customize the value of the following options:
- {total_timeout}: timeout for the entire run of
AMOEBA
(unit is seconds) - {workers}: number of parallel workers to invoke
GENERATOR
andMUTATOR
- {output}: location to store the intermediate results and bug reports
- {queries}: number of base queries that are generated by each
GENERATOR
instance - {dbms}: DBMS that
AMOEBA
will evaluate on (i.e.,postgresql
orcockroachdb
) - {num_loops}: number of feedback loops
- {validate}: a boolean argument that decides whether to invoke the
VALIDATOR
after generating the equivalent query pairs - {feedback}: what types of feedbacks to utilize (i.e.,
both
,none
,mutator
, orvalidator
) - {query_timeout}: timeout for executing each query (unit is seconds)
For example, you can launch AMOEBA
with the following command:
timeout 3600 ./test_driver.py --workers 1 --output /home/postgres/exp/1 --queries 200 --rewriter ./calcite-fuzzing --dbms=postgresql --validate --num_loops=100 --feedback=none --dbconf=db_conf_demo.json --query_timeout 30
If AMOEBA
is working correctly, you should expect to see the following progress information is printed:
start query generator
['mutator.py --prob_table=/home/postgres/test/190156/prob_table_190156.json --db_info=/workspace/amoeba_conf/db_conf_demo.json -s seq --queries 100 1>/home/postgres/test/190156/0/log_sa0 2>/home/postgres/test/190156/0/input.sql']
finish query generator
start query rewriter
['java -cp calcite-core-1.22.0-SNAPSHOT-tests.jar:./*:. org.apache.calcite.test.Transformer /home/postgres/test/190156/0']
finish query rewriter
start validator
begin compare plan cost of equivalent queries
compare plan cost /home/postgres/test/190156/0/out/q20.sql
find plan diff /home/postgres/test/190156/0/out/q20.sql
compare plan cost /home/postgres/test/190156/0/out/q13.sql
find plan diff /home/postgres/test/190156/0/out/q13.sql
compare plan cost /home/postgres/test/190156/0/out/q18.sql
compare plan cost /home/postgres/test/190156/0/out/q23.sql
This example command should complete within 20 minutes. You can check the generated intermediate results in /home/postgres/exp/1
. If AMOEBA
discovers potential performance bugs, the generated bug report will live at /home/postgres/exp/1/bugs.md
.
The shortcut CTRL+C can be used to terminate AMOEBA
manually. Otherwise, AMOEBA
will terminate either after a specified experiment timeout is reached or after a specified number of base queries have been examined. The option total_timeout
controls the experiment timeout. The options workers
, queries
, and num_loops
altogether determine the number of base queries that AMOEBA
is going to examine.
Note
- calcite part is based on Apache Calcite 1.22 and the only new class is added on
core/src/test/java/org/apache/calcite/test/Transformer.java
- need to remove all dependencies in this repo and ask users to download it as needed.
- grant appropriate permission to your user to access the database. ref: https://stackoverflow.com/a/23934693/10180666
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for amoeba_db-0.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | babc0653115ca892aa161f5e7194ff1414674766a3f42688f76aeb4ca9620f20 |
|
MD5 | 7b1fbac7fad43f95950f176b5c621223 |
|
BLAKE2b-256 | 6148195752a2f224c211deaebe3269860789ad92d6b1eff656ded23908165902 |