Product analytics over your data warehouse
Mitzu is an open-source product analytics tool that queries your data lake or data warehouse.
Features
- Visualization for:
- Funnels
- Segmentation
- Retention (coming soon)
- User Journey (coming soon)
- Revenue calculations (coming soon)
- User Lookup (coming soon)
- Cohort Analysis
- Standalone web app for non-technical users
- Notebook visual app
- Notebook low-code analytics in Python
- Batch ETL jobs support
Supported Integrations
Mitzu integrates with most modern data lake and warehouse solutions:
- Databricks Spark (SQL)
- Trino / Starburst
- AWS Athena
- PostgreSQL
- MySQL
- Files - SQLite (csv, parquet, json, etc.)
Coming Soon
Quick Start
This section describes how to get started with Mitzu on your local machine. Skip it if you would rather see Mitzu in a prepared notebook or webapp. Otherwise, get ready and fire up your own data-science notebook.
Install the Mitzu Python library
pip install mitzu
Reading The Sample Dataset
The simplest way to get started with Mitzu is in a data-science notebook. In your notebook, read the sample user behavior dataset.
Mitzu can discover your tables in a data warehouse or data lake. For the sake of simplicity, we provide an in-memory SQLite-based table that contains the sample user behavior data.
import mitzu.samples as smp
dp = smp.get_simple_discovered_project()
m = dp.create_notebook_class_model()
Segmentation
The following command visualizes the count of unique users over time who did the page visit action in the last 30 days.
m.page_visit
In the example above, m.page_visit refers to a user event segment, whose notebook representation is a segmentation chart.
If this sounds unfamiliar, don't worry! We will explain everything later.
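To make "count of unique users over time" concrete, here is a minimal pure-Python sketch of that calculation. It does not use Mitzu and is not how Mitzu computes the metric (Mitzu generates SQL for your warehouse); the event tuples and the `unique_users_per_day` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample events: (user_id, event_name, timestamp).
events = [
    ("u1", "page_visit", datetime(2022, 8, 1, 9, 0)),
    ("u1", "page_visit", datetime(2022, 8, 1, 17, 30)),
    ("u2", "page_visit", datetime(2022, 8, 1, 12, 0)),
    ("u2", "purchase", datetime(2022, 8, 1, 12, 5)),
    ("u1", "page_visit", datetime(2022, 8, 2, 8, 0)),
]


def unique_users_per_day(events, event_name):
    """Count distinct users per calendar day for one event type."""
    users_by_day = defaultdict(set)
    for user_id, name, ts in events:
        if name == event_name:
            # A set keeps each user once per day, however often they act.
            users_by_day[ts.date()].add(user_id)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


daily_counts = unique_users_per_day(events, "page_visit")
for day, count in daily_counts.items():
    print(day, count)
```

Note that u1 visits twice on the first day but is counted once, which is the difference between unique users and raw event counts.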
Funnels
You can create a funnel chart by placing the >> operator between two user event segments.
m.page_visit >> m.purchase
This visualizes the conversion rate of users who first did the page_visit action and then did the purchase action within a day, over the last 30 days.
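The conversion rate described above can be sketched in plain Python. This is an illustration under stated assumptions, not Mitzu's implementation: the sample events and the `conversion_rate` helper are hypothetical, and the sketch assumes a user converts if any purchase follows any of their page visits within the window.

```python
from datetime import datetime, timedelta

# Hypothetical sample events: (user_id, event_name, timestamp).
events = [
    ("u1", "page_visit", datetime(2022, 8, 1, 9, 0)),
    ("u1", "purchase", datetime(2022, 8, 1, 10, 0)),   # converts within a day
    ("u2", "page_visit", datetime(2022, 8, 1, 12, 0)),
    ("u2", "purchase", datetime(2022, 8, 4, 12, 0)),   # three days later
    ("u3", "page_visit", datetime(2022, 8, 2, 8, 0)),  # never purchases
    ("u4", "purchase", datetime(2022, 8, 2, 9, 0)),    # never visited first
]


def conversion_rate(events, first, second, window=timedelta(days=1)):
    """Share of users who did `first` and then `second` within `window`."""
    first_times, second_times = {}, {}
    for user_id, name, ts in events:
        if name == first:
            first_times.setdefault(user_id, []).append(ts)
        elif name == second:
            second_times.setdefault(user_id, []).append(ts)

    converted = 0
    for user_id, starts in first_times.items():
        ends = second_times.get(user_id, [])
        # The second event must come at or after the first, inside the window.
        if any(timedelta(0) <= end - start <= window
               for start in starts for end in ends):
            converted += 1
    return converted / len(first_times) if first_times else 0.0


rate = conversion_rate(events, "page_visit", "purchase")
print(f"{rate:.2%}")
```

Only users who performed the first step form the denominator, so u4, who purchased without a page visit, does not affect the rate.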
Filtering
You can apply filters to a user event segment in the following way:
m.page_visit.country_code.is_us >> m.purchase
# You can achieve the same filter with:
# (m.page_visit.country_code == 'us')
#
# You can also apply the >, >=, <, <= and != operators.
With this syntax we have narrowed down our page visit user event segment to page visits from the US.
Stacking filters is possible with the & (and) and | (or) operators.
m.page_visit.country_code.is_us & m.page_visit.acquisition_campaign.is_organic
# If you use the comparison operators, make sure you put the user event segments in parentheses.
# (m.page_visit.country_code == 'us') & (m.page_visit.acquisition_campaign == 'organic')
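The parentheses matter because, in Python, & and | bind more tightly than comparison operators such as ==. The sketch below shows how a filter API of this shape can be built with operator overloading. The `Filter` and `Prop` classes are hypothetical stand-ins written for this example; they are not Mitzu's classes.

```python
class Filter:
    """A hypothetical predicate over an event dictionary."""

    def __init__(self, predicate, label):
        self.predicate = predicate
        self.label = label

    def __and__(self, other):
        # `a & b` is true only when both filters accept the event.
        return Filter(lambda e: self.predicate(e) and other.predicate(e),
                      f"({self.label} AND {other.label})")

    def __or__(self, other):
        # `a | b` is true when either filter accepts the event.
        return Filter(lambda e: self.predicate(e) or other.predicate(e),
                      f"({self.label} OR {other.label})")

    def __call__(self, event):
        return self.predicate(event)


class Prop:
    """A hypothetical event property that builds a Filter when compared."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Filter(lambda e: e.get(self.name) == value,
                      f"{self.name} = {value!r}")


country_code = Prop("country_code")
campaign = Prop("acquisition_campaign")

# Parentheses force both comparisons to run before `&` combines them.
us_organic = (country_code == "us") & (campaign == "organic")

print(us_organic.label)
print(us_organic({"country_code": "us", "acquisition_campaign": "organic"}))
print(us_organic({"country_code": "de", "acquisition_campaign": "organic"}))
```

Without the parentheses, Python would try to evaluate `"us" & campaign` first, which is not the intended filter and fails with a TypeError in this sketch.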
Apply multi-value filtering with the any_of or none_of functions:
m.page_visit.country_code.any_of('us', 'cn', 'de')
# m.page_visit.country_code.none_of('us', 'cn', 'de')
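As a rough mental model, any_of keeps values that are in the given list and none_of keeps values that are not. The standalone functions below are hypothetical helpers that mirror those names for illustration only; in Mitzu they are methods on a property, and the actual filtering happens in the generated SQL.

```python
def any_of(*values):
    """Build a predicate that accepts a value if it is one of `values`."""
    allowed = set(values)
    return lambda value: value in allowed


def none_of(*values):
    """Build a predicate that accepts a value if it is not one of `values`."""
    excluded = set(values)
    return lambda value: value not in excluded


# Hypothetical country codes of five page visits.
visit_countries = ["us", "hu", "cn", "de", "fr"]

kept_by_any_of = [c for c in visit_countries if any_of("us", "cn", "de")(c)]
kept_by_none_of = [c for c in visit_countries if none_of("us", "cn", "de")(c)]

print(kept_by_any_of)
print(kept_by_none_of)
```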
Of course, you can apply filters to every user event segment in a funnel.
m.add_to_cart >> (m.checkout.price_shown <= 1000)
Metrics Configuration
You can call the config method on any funnel or segmentation to define the parameters of the metric.
m.page_visit.config(
    start_dt="2022-08-01",
    end_dt="2022-09-01",
    group_by=m.page_visit.url,
    time_group='total',
)
- start_dt: an ISO datetime string or a Python datetime at which the metric should start.
- end_dt: an ISO datetime string or a Python datetime at which the metric should end.
- group_by: a property that you can refer to from the notebook class model.
- time_group: the time granularity of the query. The possible values are hour, day, week, month, year, and total.
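To illustrate what a time granularity means in practice, the following sketch maps a timestamp to the start of its bucket for each time_group value. The `truncate` helper is hypothetical and written for this example; it assumes weeks start on Monday and treats total as a single bucket, which may differ from what Mitzu or your warehouse does.

```python
from datetime import datetime, timedelta


def truncate(ts, time_group):
    """Map a timestamp to the start of its bucket (None for 'total')."""
    if time_group == "total":
        return None  # one bucket covering the whole date range
    if time_group == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if time_group == "day":
        return day
    if time_group == "week":
        # Assumption: weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if time_group == "month":
        return day.replace(day=1)
    if time_group == "year":
        return day.replace(month=1, day=1)
    raise ValueError(f"unknown time_group: {time_group!r}")


ts = datetime(2022, 8, 17, 14, 35)
for group in ("hour", "day", "week", "month", "year", "total"):
    print(group, truncate(ts, group))
```

Events that map to the same bucket start are aggregated into one data point, so a coarser time_group yields fewer points on the chart.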
Funnels have an extra configuration parameter, conv_window. It has the format <VAL> <TIME WINDOW>, where VAL is a positive integer.
(m.page_visit >> m.checkout).config(
    start_dt="2022-08-01",
    end_dt="2022-09-01",
    group_by=m.page_visit.url,
    time_group='total',
    conv_window='1 day',
)
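The conv_window format can be read as a number followed by a time unit. Below is a minimal sketch of parsing such a string into a Python timedelta. The `parse_window` helper and the set of unit names are assumptions made for this example; the README only shows '1 day', and the units Mitzu actually accepts may differ.

```python
import re
from datetime import timedelta

# Assumed unit names; Mitzu's supported units may differ.
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}


def parse_window(text):
    """Parse '<VAL> <TIME WINDOW>' (for example '1 day') into a timedelta."""
    match = re.fullmatch(r"(\d+)\s+([a-z]+)", text.strip().lower())
    if not match:
        raise ValueError(f"expected '<VAL> <TIME WINDOW>', got {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    if unit.endswith("s"):
        unit = unit[:-1]  # accept plural forms such as 'days'
    if value <= 0 or unit not in _UNIT_SECONDS:
        raise ValueError(f"unsupported conversion window: {text!r}")
    return timedelta(seconds=value * _UNIT_SECONDS[unit])


print(parse_window("1 day"))
print(parse_window("12 hours"))
```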
SQL Generator
For any metric, you can print the SQL code that Mitzu generates by calling the .print_sql() method.
(m.page_visit >> m.checkout).config(
    start_dt="2022-08-01",
    end_dt="2022-09-01",
    group_by=m.page_visit.url,
    time_group='total',
    conv_window='1 day',
).print_sql()
Pandas DataFrames
Similarly, you can access the results as a Pandas DataFrame with the .get_df() method:
(m.page_visit >> m.checkout).config(
    start_dt="2022-08-01",
    end_dt="2022-09-01",
    group_by=m.page_visit.url,
    time_group='total',
    conv_window='1 day',
).get_df()
Usage In Notebooks
Webapp
Mitzu can run as a standalone webapp or embedded inside a notebook.
Connect Your Own Data
Mitzu can connect to your data warehouse or data lake. To get started with your own data integration, please read our handy docs.
Contribution Guide
Please read our Contribution Guide.