
A simple tool for benchmarking and tracking machine learning models and experiments.


Xetrack

xetrack is a lightweight package for tracking experiment and benchmark data with duckdb. It looks and feels like pandas and is easy to use.

Each instance of the tracker has a "track_id" which is a unique identifier for a single run.

Features

  • Simple
  • Embedded
  • Fast
  • Pandas-like
  • SQL-like
  • Multiprocessing reads and writes

Installation

pip install xetrack

Quickstart

from xetrack import Tracker

tracker = Tracker('database.db',
                  params={'model': 'resnet18'},
                  verbose=False)
tracker.log(accuracy=0.9, loss=0.1, epoch=1)
tracker.latest
{'accuracy': 0.9, 'loss': 0.1, 'epoch': 1, 'model': 'resnet18', 'timestamp': '18-08-2023 11:02:35.162360',
 'track_id': 'cd8afc54-5992-4828-893d-a4cada28dba5'}

tracker.to_df(all=True)  # all runs as dataframe
                    timestamp                              track_id     model  accuracy  loss  epoch
0  21-08-2023 11:32:55.433332  9066505b-4a09-4946-ae4a-c3c957720bba  resnet18       0.9   0.1      1
1  21-08-2023 11:33:01.302282  d4f5c4b7-e372-4d54-9a30-2d8a774fe9cc  resnet18       0.9   0.1      1

Params are values which are added to every future row:

tracker.set_params({'model': 'resnet18', 'dataset': 'cifar10'})

You can also set a value for an entire run with set_value. It is applied "back in time", to the rows the run has already logged:

tracker.set_value('test_accuracy', 0.9)
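
The difference between set_params (future rows only) and set_value (the whole run, including earlier rows) can be sketched with a toy tracker built on the standard-library sqlite3 module. MiniTracker and all of its methods are hypothetical stand-ins written for this illustration, not xetrack's implementation:

```python
import sqlite3
import uuid


class MiniTracker:
    """Toy stand-in for illustration only; not xetrack's implementation."""

    def __init__(self, params=None):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE events (track_id TEXT)")
        self.track_id = str(uuid.uuid4())  # one id per run
        self.params = dict(params or {})

    def _ensure_column(self, name):
        # Add the column on first use so rows can have a growing schema.
        existing = {row[1] for row in self.conn.execute("PRAGMA table_info(events)")}
        if name not in existing:
            self.conn.execute(f'ALTER TABLE events ADD COLUMN "{name}"')

    def set_params(self, params):
        # Params only affect rows logged from now on.
        self.params.update(params)

    def log(self, **values):
        row = {**self.params, **values, "track_id": self.track_id}
        for name in row:
            self._ensure_column(name)
        columns = ", ".join(f'"{c}"' for c in row)
        marks = ", ".join("?" for _ in row)
        self.conn.execute(
            f"INSERT INTO events ({columns}) VALUES ({marks})", list(row.values())
        )

    def set_value(self, key, value):
        # "Back in time": update every row already logged for this run.
        self._ensure_column(key)
        self.conn.execute(
            f'UPDATE events SET "{key}" = ? WHERE track_id = ?', (value, self.track_id)
        )

    def rows(self):
        cur = self.conn.execute("SELECT * FROM events ORDER BY rowid")
        names = [d[0] for d in cur.description]
        return [dict(zip(names, r)) for r in cur.fetchall()]


tracker = MiniTracker(params={"model": "resnet18"})
tracker.log(accuracy=0.8, epoch=1)
tracker.set_params({"dataset": "cifar10"})   # applies to later rows only
tracker.log(accuracy=0.9, epoch=2)
tracker.set_value("test_accuracy", 0.9)      # applies to both rows
rows = tracker.rows()
```

In this sketch the first row has no dataset value because it was logged before set_params was called, while test_accuracy appears on both rows.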

Track functions

You can track any function.

  • The function must return a dictionary or None

tracker = Tracker('database.db', log_system_params=True, log_network_params=True, measurement_interval=0.1)
image = tracker.track(read_image, *args, **kwargs)
tracker.latest
{'result': 571084, 'name': 'read_image', 'time': 0.30797290802001953, 'error': '', 'disk_percent': 0.6,
 'p_memory_percent': 0.496507, 'cpu': 0.0, 'memory_percent': 32.874608, 'bytes_sent': 0.0078125,
 'bytes_recv': 0.583984375}

Or with a wrapper:

@tracker.wrap(name='foofoo')
def foo(a: int, b: str):
    return a + len(b)

foo(1, 'hello')
tracker.latest
{'function_result': 6, 'name': 'foofoo', 'time': 7.867813110351562e-06, 'error': '', 'args': "[1, 'hello']", 'kwargs': '{}', 'disk_percent': 0, 'p_memory_percent': 0, 'cpu': 0, 'memory_percent': 0, 'bytes_sent': 0.0, 'bytes_recv': 0.0, 'model': 'lightgbm', 'timestamp': '18-08-2023 10:59:26.011938', 'track_id': 'a6f99e21-dfd8-4056-98e5-46b2a76fab41'}
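
For readers curious how a tracking wrapper of this kind might work, here is a minimal sketch using only the standard library. The wrap function below and its log callback are hypothetical and written for this illustration; they are not xetrack's implementation, and re-raising the exception after recording it is an assumption of the sketch:

```python
import functools
import time


def wrap(log, name=None):
    """Hypothetical decorator factory: passes one record dict per call to `log`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            record = {
                "name": name or func.__name__,
                "args": str(list(args)),
                "kwargs": str(kwargs),
                "error": "",
            }
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
                record["function_result"] = result
                return result
            except Exception as exc:
                # Assumption: keep the error text, then let the exception propagate.
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call is recorded.
                record["time"] = time.perf_counter() - start
                log(record)

        return wrapper

    return decorator


events = []  # stand-in for a database table


@wrap(events.append, name="foofoo")
def foo(a: int, b: str):
    return a + len(b)


value = foo(1, "hello")
```

After the call, events holds one record with the name, stringified arguments, elapsed time, and the function result.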

Pandas-like

print(tracker)
                                    _id                              track_id                 date    b    a  accuracy
0  48154ec7-1fe4-4896-ac66-89db54ddd12a  fd0bfe4f-7257-4ec3-8c6f-91fe8ae67d20  16-08-2023 00:21:46  2.0  1.0       NaN
1  8a43000a-03a4-4822-98f8-4df671c2d410  fd0bfe4f-7257-4ec3-8c6f-91fe8ae67d20  16-08-2023 00:24:21  NaN  NaN       1.0

tracker['accuracy'] # get accuracy column
tracker.to_df() # get pandas dataframe of current run

SQL-like

You can filter the data with SQL through the tracker's duckdb connection:

  • The sqlite database is attached as db and the table is events

tracker.conn.execute("SELECT * FROM db.events WHERE accuracy > 0.8").fetchall()
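
The same kind of query can also aggregate across runs. The sketch below uses the standard-library sqlite3 module with an in-memory events table as a stand-in, so it runs without xetrack; the rows and the run-a / run-b ids are made up for illustration, and the table is referenced as events rather than db.events because nothing is attached here:

```python
import sqlite3

# In-memory stand-in for the events table; the data is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (track_id TEXT, model TEXT, accuracy REAL, epoch INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("run-a", "resnet18", 0.70, 1),
        ("run-a", "resnet18", 0.85, 2),
        ("run-b", "resnet18", 0.90, 1),
    ],
)

# Best accuracy per run, keeping only rows above a threshold.
# The threshold is passed as a bound parameter instead of being formatted into the SQL.
best = conn.execute(
    "SELECT track_id, MAX(accuracy) FROM events "
    "WHERE accuracy > ? GROUP BY track_id ORDER BY track_id",
    (0.8,),
).fetchall()
print(best)
```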

Analysis

To load the data of all runs in the database as a dataframe for further analysis and plotting:

  • This works even while another process is writing to the database.

from xetrack import Reader
df = Reader('database.db').to_df()

Merge two databases

If you have two databases, and you want to merge them into one, you can use the copy function:

python -c 'from xetrack import copy; copy(source="db1.db", target="db2.db")'
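
To illustrate the idea of copying the rows of one database into another, here is a sketch with the standard-library sqlite3 module. It is not how xetrack's copy function is implemented: merge and make_db are hypothetical helpers, and the sketch assumes both files contain an events table with identical columns.

```python
import os
import sqlite3
import tempfile


def make_db(path, rows):
    """Create a small example database (hypothetical two-column schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE events (track_id TEXT, accuracy REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


def merge(source, target):
    """Append every row of source's events table to target's; return the new row count."""
    conn = sqlite3.connect(target)
    try:
        conn.execute("ATTACH DATABASE ? AS src", (source,))
        conn.execute("INSERT INTO events SELECT * FROM src.events")
        conn.commit()  # commit before detaching the source database
        conn.execute("DETACH DATABASE src")
        return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db1 = os.path.join(tmp, "db1.db")
    db2 = os.path.join(tmp, "db2.db")
    make_db(db1, [("run-a", 0.8), ("run-a", 0.9)])
    make_db(db2, [("run-b", 0.7)])
    total = merge(source=db1, target=db2)
```

The sketch leaves the source file unchanged and does not deduplicate rows, so merging the same source twice would append its rows twice.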
