
InnoDB & MyISAM Table Parser & Usage Estimator

Project description

MariaDB Table Usage

This tool reports usage statistics for InnoDB and MyISAM tables.

  • For InnoDB, it parses table pages and segment nodes to calculate usage accurately.
  • For MyISAM, it reads file stats.

The tool reports the following information per table:

  • data_length: size of the table data (for InnoDB, this includes the primary/clustered index)
  • index_length: size of the secondary indexes (for MyISAM, the size of the .MYI file)
  • data_free: deleted or uninitialized space (InnoDB only)

Usage

from mariadb_table_usage import usage

data = usage("test_db")

Why not rely on the database?

To get the same information from the database, we usually run:

SELECT data_length, index_length, data_free
FROM information_schema.tables
WHERE table_schema = 'test_db'

This query can cause disk I/O issues on shared database servers, especially when many small databases exist on the same host.

Although MySQL and MariaDB have improved this over time, there are still some fundamental problems.

Problems with information_schema approach

When MariaDB calculates table usage using information_schema.tables, it needs to open each table and inspect its metadata.

If a database has many tables (for example, hundreds or thousands), this creates a few issues:

  • The database has a limited table_open_cache. Opening many tables for metadata queries can evict entries from the cache, which affects both subsequent metadata queries and normal production queries.

  • The database engine treats information_schema queries like regular queries and does not consider system-level disk I/O limits.

  • Unlike normal queries, information_schema queries for these dynamically computed columns do not benefit much from the InnoDB buffer pool and often result in direct disk reads.

  • MariaDB may use multiple threads to calculate usage, which can further increase disk I/O pressure.

For each InnoDB table, the engine typically needs to read at least:

  • The FSP header page
  • The clustered index root page
  • The non-leaf segment node pages

Each page is 16 KB, so even a single table requires multiple disk reads. When this is repeated across many tables, disk I/O can spike quickly.
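The handful of reads listed above can be illustrated with a minimal sketch that pulls two fields out of the FSP header page (page 0) of an .ibd file. The offsets follow the standard InnoDB on-disk layout (a 38-byte FIL header followed by the FSP header, big-endian fields); this is an illustration of the page format, not this tool's actual parser.

```python
# Minimal sketch: read page 0 (the FSP header page) of an .ibd file and
# derive a coarse size estimate. Field names mirror the InnoDB sources;
# the function itself is illustrative.
import struct

PAGE_SIZE = 16 * 1024                 # default InnoDB page size
FIL_HEADER = 38                       # bytes before the FSP header on page 0
FSP_SIZE_OFF = FIL_HEADER + 8         # FSP_SIZE: tablespace size in pages
FSP_FREE_LIMIT_OFF = FIL_HEADER + 12  # FSP_FREE_LIMIT: lowest uninitialized page

def fsp_summary(ibd_path):
    """Return (allocated_bytes, initialized_bytes) from the FSP header page."""
    with open(ibd_path, "rb") as f:
        page0 = f.read(PAGE_SIZE)
    size_pages = struct.unpack_from(">I", page0, FSP_SIZE_OFF)[0]
    free_limit = struct.unpack_from(">I", page0, FSP_FREE_LIMIT_OFF)[0]
    return size_pages * PAGE_SIZE, free_limit * PAGE_SIZE
```

A real estimator would continue from here to the clustered index root page and the segment nodes; the point is only that each step is one more 16 KB page read.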

Sometimes the query appears fast because the required pages are already cached. However, this cache can be cleared:

  • periodically
  • after DDL operations
  • when more than 10% of a table is modified

When the cache is cleared, the same query can suddenly cause heavy disk I/O and impact the entire server.

Frappe Cloud Scenario

In Frappe Cloud, shared servers host a large number of very small sites. At minimum, site migrations happen once a week, and these migrations usually modify a few tables in each site database.

After such operations, metadata caches are often invalidated (theoretically). If usage information is then queried sequentially for each site’s database, information_schema queries can trigger disk reads for every table across many databases. This can lead to a sudden and significant disk I/O spike and, in some cases, a temporary I/O freeze on the server.

Approach of this tool

This tool reads table files directly instead of querying the database engine.

It limits disk reads to a configurable rate (default: 200 ops/sec). This keeps I/O pressure consistent regardless of how fast the files could otherwise be read.
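The rate limiting described above can be sketched as a simple token bucket; the class and parameter names here are illustrative assumptions, not this tool's API.

```python
# Token bucket sketch: admit at most `rate` disk operations per second.
import time

class ReadRateLimiter:
    def __init__(self, rate=200):
        self.rate = rate            # allowed ops per second
        self.tokens = float(rate)   # start with a full bucket
        self.last = time.monotonic()

    def acquire(self):
        """Block until one read operation may proceed."""
        while True:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at the bucket size.
            self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)
```

Each 16 KB page read would call acquire() before touching the disk, so even a burst of many tables drains at the configured rate.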

It also monitors CPU iowait and pauses or stops its work if a configured threshold is reached. This helps keep the database server responsive.
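On Linux, the iowait check can be approximated by sampling the aggregate "cpu" line of /proc/stat twice and computing the fraction of CPU time spent waiting on I/O between the samples. The threshold value and function names below are illustrative assumptions, not this tool's API.

```python
# Sketch of an iowait monitor, assuming a Linux host.
import time

def _cpu_times(stat_path="/proc/stat"):
    # First line is "cpu user nice system idle iowait irq softirq ..."
    with open(stat_path) as f:
        fields = f.readline().split()
    values = [int(v) for v in fields[1:]]
    return sum(values), values[4]      # (total jiffies, iowait jiffies)

def iowait_fraction(interval=0.2):
    """Fraction of CPU time spent in iowait over `interval` seconds."""
    total0, wait0 = _cpu_times()
    time.sleep(interval)
    total1, wait1 = _cpu_times()
    dt = total1 - total0
    return (wait1 - wait0) / dt if dt else 0.0

def should_pause(threshold=0.2):
    # Pause or stop table scans when iowait exceeds the configured threshold.
    return iowait_fraction() > threshold
```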

Concurrency can be controlled to limit parallel table scans. For InnoDB tables, each table scan typically reads a few pages (FSP header, clustered index root, segment nodes). Each page is 16 KB.
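Bounding parallel table scans can be sketched with a fixed-size worker pool; `scan_one` and `max_workers` are hypothetical stand-ins for the tool's internals.

```python
# Cap the number of table scans in flight with a bounded thread pool.
from concurrent.futures import ThreadPoolExecutor

def scan_tables(paths, scan_one, max_workers=2):
    """Apply `scan_one` to each table file, at most `max_workers` at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scan_one, paths))
```

Because each scan only touches a few 16 KB pages, a small worker count keeps throughput reasonable while the rate limiter above bounds the aggregate read rate.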

By combining rate limiting, iowait monitoring, and concurrency control, the tool focuses on stability rather than speed and avoids putting excessive load on the disk.

Resources to Understand

The parser and library were written based on the following resources:

The codebase also includes comments explaining the implementation and references to the corresponding C source code.
