MatplotCLI: create matplotlib visualizations from the command-line

MatplotCLI is a simple utility to quickly create plots from the command-line, leveraging Matplotlib.

plt "scatter(x,y,5,alpha=0.05); axis('scaled')" < sample.json

scatter plot of sample data

plt "hist(x,30)" < sample.json

histogram of x on the sample data

The input data format is JSON lines, where each line is a valid JSON object. See the recipes section to learn how to handle other formats such as CSV.

MatplotCLI executes Python code (passed as an argument) in an environment where some handy imports are already done (e.g. from matplotlib.pyplot import *) and where the input JSON data is already parsed and available in variables, which makes plotting easy. Refer to matplotlib.pyplot's reference and tutorial for comprehensive documentation of the plotting commands.

Data from the input JSON is made available in the following way. Given the input myfile.json:

{"a": 1, "b": 2}
{"a": 10, "b": 20}
{"a": 30, "c$d": 40}

The following variables are made available:

data = {
    "a": [1, 10, 30],
    "b": [2, 20, None],
    "c_d": [None, None, 40]
}

a = [1, 10, 30]
b = [2, 20, None]
c_d = [None, None, 40]

col_names = ("a", "b", "c_d")

So, for a scatter plot a vs b, you could simply do:

plt "scatter(a,b); title('a vs b')" < myfile.json

Notice that JSON property names that are not valid Python identifiers are converted into valid ones (e.g. c$d was converted into c_d).
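
The mapping described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not MatplotCLI's actual source: the helper names to_identifier and columns_from_json_lines are hypothetical, and the sanitization rule (replace every non-word character with an underscore) is an assumption that simply reproduces the c$d example.

```python
import json
import re
from keyword import iskeyword


def to_identifier(name):
    """Assumed rule: non-word characters become underscores."""
    ident = re.sub(r"\W", "_", name)
    # Prefix names that would still be illegal (empty, leading digit, keyword).
    if not ident or ident[0].isdigit() or iskeyword(ident):
        ident = "_" + ident
    return ident


def columns_from_json_lines(lines):
    """Build one list per property, padding missing values with None."""
    data = {}
    n_rows = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        row = json.loads(line)
        for key, value in row.items():
            # A column first seen on row k starts with k leading None values.
            column = data.setdefault(to_identifier(key), [None] * n_rows)
            column.append(value)
        n_rows += 1
        # Columns absent from this row get a None so all lengths match.
        for column in data.values():
            if len(column) < n_rows:
                column.append(None)
    return data
```

With the three lines of myfile.json as input, the returned dictionary matches the data variable shown above, and tuple(data) gives the column names in first-seen order. Two properties that sanitize to the same identifier would collide in this sketch; how MatplotCLI treats that case is not stated here.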

Execution flow

  1. Import matplotlib and other libs;
  2. Read JSON data from standard input;
  3. Execute user code;
  4. Show the plot.

All steps (except step 3) can be skipped through command-line options.
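
The four steps can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not the tool's real code: the function run_user_code and its parameters import_libs and show are hypothetical names, and property names are not sanitized here.

```python
import json


def run_user_code(code, json_lines=None, import_libs=True, show=True):
    """Hypothetical model of the execution flow described above."""
    namespace = {}

    # Step 1: make NumPy and the pyplot functions visible to the user code.
    if import_libs:
        exec("import numpy as np\nfrom matplotlib.pyplot import *", namespace)

    # Step 2: turn JSON lines into one list per property.
    if json_lines is not None:
        rows = [json.loads(line) for line in json_lines if line.strip()]
        names = list(dict.fromkeys(key for row in rows for key in row))
        data = {name: [row.get(name) for row in rows] for name in names}
        namespace.update(data, data=data, col_names=tuple(names))

    # Step 3: run the code passed as argument (the only mandatory step).
    exec(code, namespace)

    # Step 4: open the interactive window unless it was disabled.
    if show and import_libs:
        namespace["show"]()

    return namespace
```

Calling run_user_code("total = sum(a)", ['{"a": 1}', '{"a": 2}'], import_libs=False, show=False) runs only steps 2 and 3, which is convenient for trying the idea without Matplotlib installed.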

Installation

The easiest way to install MatplotCLI is with pip:

pip install matplotcli

Recipes and Examples

Plotting from a JSON array

jq is a very handy utility whenever we need to handle different JSON flavors. The -c option guarantees one JSON object per line in the output.

echo '[
    {"a":0, "b":1},
    {"a":1, "b":0},
    {"a":3, "b":3}
    ]' | 
jq -c .[] | 
plt "plot(a,b)"
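
When jq is not at hand, the same reshaping can be sketched with Python's standard json module. The function name array_to_json_lines is made up for this illustration, and the input is assumed to be a single top-level JSON array.

```python
import json


def array_to_json_lines(text):
    """Return one compact JSON document per element of a JSON array."""
    items = json.loads(text)
    # Compact separators mimic the one-object-per-line output of jq -c.
    return [json.dumps(item, separators=(",", ":")) for item in items]
```

Printing each returned string on its own line produces input in the JSON lines format that MatplotCLI expects.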

Plotting from a CSV

SPyQL is a data querying tool that runs SQL queries with Python expressions on top of different data formats. Here, SPyQL reads a CSV file, automatically detects whether there is a header row, the dialect and the data type of each column, and converts the output to JSON lines before handing it over to MatplotCLI.

cat my.csv | spyql "SELECT * FROM csv TO json" | plt "plot(x,y)"
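
For readers without SPyQL, a rough standard-library equivalent is sketched below. It is far simpler than SPyQL's detection: it assumes a comma-delimited file with a header row, and it guesses types naively (int, then float, otherwise string, with empty cells mapped to null). The names parse_value and csv_to_json_lines are hypothetical.

```python
import csv
import io
import json


def parse_value(text):
    """Naive type guess: empty -> None, then int, then float, else string."""
    if text == "":
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def csv_to_json_lines(csv_text):
    """Convert CSV text (with a header row) into a list of JSON lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        json.dumps({name: parse_value(cell) for name, cell in row.items()})
        for row in reader
    ]
```

Each returned string is one JSON object, so writing them out line by line gives the same kind of stream that the spyql command above feeds into plt.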

Plotting from YAML/XML/TOML

yq (through its yq, xq and tomlq commands) converts YAML, XML and TOML files to JSON, making it easy to plot any of these with MatplotCLI.

cat file.yaml | yq -c | plt "plot(x,y)"
cat file.xml | xq -c | plt "plot(x,y)"
cat file.toml | tomlq -c | plt "plot(x,y)"

Plotting from a parquet file

parquet-tools can dump a parquet file in JSON format. jq -c makes sure that the output has one JSON object per line before handing it over to MatplotCLI.

parquet-tools cat --json my.parquet | jq -c | plt "plot(x,y)"

Plotting from a database

Database CLIs typically have an option to output query results in CSV format (e.g. psql --csv -c query for PostgreSQL, sqlite3 -csv -header file.db query for SQLite).

Here we are visualizing how much space each namespace is taking in a PostgreSQL database. SPyQL converts CSV output from the psql client to JSON lines, and makes sure there are no more than 10 items, aggregating the smaller namespaces in an All others category. Finally, MatplotCLI makes a pie chart based on the space each namespace is taking.

psql -U myuser mydb --csv  -c '
    SELECT 
        N.nspname,
        sum(pg_relation_size(C.oid))*1e-6 AS size_mb
    FROM pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
    GROUP BY 1 
    ORDER BY 2 DESC' | 
spyql "
    SELECT 
        nspname if row_number < 10 else 'All others' as name, 
        sum_agg(size_mb) AS size_mb 
    FROM csv 
    GROUP BY 1 
    TO json" | 
plt "
nice_labels = ['{0}\n{1:,.0f} MB'.format(n,s) for n,s in zip(name,size_mb)];
pie(size_mb, labels=nice_labels, autopct='%1.f%%', pctdistance=0.8, rotatelabels=True)"

pie chart of namespace size in a PostgreSQL database
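
The "keep the largest items and merge the rest" step performed by SPyQL can be sketched in Python as follows. The function name top_n_with_others and its parameters are hypothetical; the sketch assumes the inputs are already sorted by size in descending order, as the ORDER BY 2 DESC clause guarantees in the pipeline above.

```python
def top_n_with_others(names, sizes, max_items=10, other_label="All others"):
    """Keep the first max_items - 1 entries and sum the remainder into one.

    Mirrors the condition row_number < 10: with max_items=10, rows 1 to 9
    keep their own name and every later row is merged into other_label.
    """
    keep = max_items - 1
    out_names = list(names[:keep])
    out_sizes = list(sizes[:keep])
    if len(sizes) > keep:
        out_names.append(other_label)
        out_sizes.append(sum(sizes[keep:]))
    return out_names, out_sizes
```

For example, top_n_with_others(["a", "b", "c", "d"], [4, 3, 2, 1], max_items=3) keeps a and b and merges c and d into a single All others entry of size 3.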

Plotting a function

This example disables reading from stdin and generates the data with NumPy.

plt --no-input "
x = np.linspace(-1,1,2000); 
y = x*np.sin(1/x); 
plot(x,y); 
axis('scaled'); 
grid(True)"

plotting a function

Saving the plot to an image

This example saves the plot to a file without showing the interactive window.

cat sample.json | 
plt --no-show "
hist(x,30); 
savefig('myimage.png', bbox_inches='tight')"

Plot of the global temperature

Here's a complete pipeline from getting the data to transforming and plotting it:

  1. Downloading a CSV file with curl;
  2. Skipping the first row with sed;
  3. Grabbing the year column and 12 columns with monthly temperatures to an array and converting to JSON lines format using SPyQL;
  4. Exploding the monthly array with SPyQL (resulting in 12 rows per year) while removing invalid monthly measurements;
  5. Plotting with MatplotCLI.

curl https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.csv |
sed 1d | 
spyql "
  SELECT Year, cols[1:13] AS temps 
  FROM csv 
  TO json" | 
spyql "
  SELECT 
    json->Year + ((row_number-1)%12)/12 AS year, 
    json->temps AS temp 
  FROM json 
  EXPLODE json->temps 
  WHERE json->temps is not Null 
  TO json" | 
plt "
scatter(year, temp, 2, temp); 
xlabel('Year'); 
ylabel('Temperature anomaly w.r.t. 1951-80 (°C)'); 
title('Global surface temperature (land and ocean)')"

plot of the global temperature
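
The explode step (step 4) is the least obvious part of the pipeline, so here is a small Python sketch of the intended arithmetic. It is not SPyQL code: the name explode_monthly is hypothetical, invalid measurements are assumed to arrive as None, and the month position is taken directly from the array index rather than from SPyQL's row_number.

```python
def explode_monthly(rows):
    """Expand (year, [12 monthly temps]) pairs into one record per month.

    The fractional year places month i (0-based) at year + i/12, which is
    what ((row_number-1)%12)/12 computes when each year yields 12 rows.
    """
    records = []
    for year, temps in rows:
        for month_index, temp in enumerate(temps):
            if temp is None:
                continue  # drop invalid monthly measurements
            records.append({"year": year + month_index / 12, "temp": temp})
    return records
```

Each record corresponds to one point of the scatter plot: the x coordinate is the fractional year and both the y coordinate and the colour are the temperature anomaly.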
