A print and debugging utility that makes your error printouts look nice

Project description

A common pain point once you start launching ML training jobs on AWS is the lack of a good way to manage and visualize your data. A common practice so far is to upload your experiment data to AWS S3 or Google Cloud Storage buckets. One then quickly realizes that downloading data from S3 can be slow. S3 does not offer a diff-based sync like gcloud-cli's gsutil rsync, which makes it hard to sync a large collection of data that is constantly appended to.

Visualization Dashboard (Preview) :boom:

Incoming: a real-time visualization dashboard (and server!)

An Example Log from ML-Logger

So far, the best way we have found to organize experimental data is to run a centralized instrumentation server. Compared with managing your data on S3, a centralized instrumentation server makes it much easier to move experiments around, run analysis that is co-located with your data, and host visualization dashboards on the same machine. To download data locally, you can use sshfs, samba, rsync, or a variety of remote disks, all of which are faster than S3.

ML-Logger is the logging utility that allows you to do this. To make ML-Logger easy to use, it works with zero configuration and logs to your local hard drive by default. When the logging directory passed to logger.configure(log_directory=<your directory>) is an HTTP endpoint, the logger instantiates a fast, future-based logging client that sends HTTP requests from a separate thread. We optimized the client so that it does not slow down your training code.
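To illustrate what a future-based client that keeps network I/O off the training thread can look like, here is a minimal sketch. It is not ML-Logger's actual implementation: AsyncLogClient, is_http_endpoint, and the injected send callable are hypothetical names chosen for this example, and the real client may be structured differently.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable
from urllib.parse import urlparse


def is_http_endpoint(log_directory: str) -> bool:
    """Return True when the logging target looks like an http(s) URL."""
    return urlparse(log_directory).scheme in ("http", "https")


class AsyncLogClient:
    """Hypothetical client: hands each payload to a background worker thread."""

    def __init__(self, send: Callable[[dict], Any]):
        # A single worker thread processes payloads in submission order.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._send = send  # assumed to perform the actual HTTP request

    def log(self, payload: dict) -> Future:
        # submit() returns immediately, so the caller is not blocked by I/O.
        return self._pool.submit(self._send, payload)

    def close(self) -> None:
        # Wait for all pending requests to finish before exiting.
        self._pool.shutdown(wait=True)
```

In this sketch the training loop would call client.log(...) and move on; calling .result() on the returned future is only needed when the caller wants to confirm that a request has completed.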

API-wise, ML-Logger makes it easy to log textual printouts, simple scalars, numpy tensors, image tensors, and pyplot figures. Because you might also want to read data back from the instrumentation server, it can also load numpy, pickle, text, and binary files remotely.

In the future, we will start building an integrated dashboard with fast search, live figure update and markdown-based reporting/dashboarding to go with ml-logger.

Now give this a try, and profit!

Usage

To install ml_logger, do:

pip install ml-logger

To kickstart a logging server, run

python -m ml_logger.server

In your project files, do:

from params_proto import cli_parse
from ml_logger import logger


@cli_parse
class Args:
    seed = 1
    D_lr = 5e-4
    G_lr = 1e-4
    Q_lr = 1e-4
    T_lr = 1e-4
    plot_interval = 10
    log_dir = "http://54.71.92.65:8081"
    log_prefix = "https://github.com/episodeyang/ml_logger/blob/master/runs"

logger.configure(log_directory="http://some.ip.address.com:2000", prefix="your-experiment-prefix!")
logger.log_params(Args=vars(Args))
logger.log_file(__file__)


for epoch in range(10):
    logger.log(step=epoch, D_loss=0.2, G_loss=0.1, mutual_information=0.01)
    logger.log_keyvalue(epoch, 'some string key', 0.0012)
    # when the step index updates, logger flushes all of the key-value pairs to file system/logging server

logger.flush()

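# Images
import numpy as np
import scipy.misc

# Note: on recent SciPy releases the sample image is provided by scipy.datasets.face()
face = scipy.misc.face()
face_bw = scipy.misc.face(gray=True)
logger.log_image(index=4, color_image=face, black_white=face_bw)
image_bw = np.zeros((64, 64, 1))
image_bw_2 = scipy.misc.face(gray=True)[::4, ::4]

# log a list of frames under the key `animation`, at index 5
logger.log_image(5, animation=[face] * 5)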
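The comment in the training loop above says that key-value pairs are flushed when the step index changes, which is also why a final logger.flush() follows the loop. The sketch below shows one way such step-based buffering could work. It is an illustration only: StepBuffer and its write callback are hypothetical and are not taken from ML-Logger's source.

```python
class StepBuffer:
    """Hypothetical buffer: collects key-value pairs for one step at a time."""

    def __init__(self, write):
        self._write = write  # callable that receives (step, data) on flush
        self._step = None
        self._data = {}

    def log(self, step, **key_values):
        # A new step index means the previous step is complete, so flush it.
        if self._step is not None and step != self._step:
            self.flush()
        self._step = step
        self._data.update(key_values)

    def flush(self):
        # Emit whatever is pending; the last step is only written by an
        # explicit flush(), because no later step arrives to trigger it.
        if self._data:
            self._write(self._step, dict(self._data))
            self._data.clear()
```

Under this design, several log calls that share a step index are merged into one record, and the record is written once the next step begins.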

This version of the logger also prints a tabular summary of the data you are logging to stdout. It:

  • can silence stdout per key (per logger.log call)
  • can print with color: logger.log(timestep, some_key=green(some_data))
  • can print with custom formatting: logger.log(timestep, some_key=green(some_data, percent)), where percent is a formatting function
  • uses the correct unix table characters (please stop using | and +. Use │, ┼ instead)
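As a rough illustration of the table characters mentioned above, the following sketch renders a dictionary as a two-column table with box-drawing characters and centered cells. The function render_table and its width parameter are hypothetical and written for this example; ML-Logger's own rendering code may differ.

```python
def render_table(row, width=20):
    """Render key-value pairs as a two-column box-drawing table."""
    top = "╒" + "═" * width + "╤" + "═" * width + "╕"
    sep = "├" + "─" * width + "┼" + "─" * width + "┤"
    bottom = "╘" + "═" * width + "╧" + "═" * width + "╛"
    lines = [top]
    for i, (key, value) in enumerate(row.items()):
        if i > 0:
            lines.append(sep)  # thin rule between consecutive rows
        # Center both the key and the value inside fixed-width cells.
        lines.append(f"│{str(key):^{width}}│{str(value):^{width}}│")
    lines.append(bottom)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table({"timestep": 1999, "episode": 10}))
```

Keys or values longer than the cell width are not truncated in this sketch, so they would push the right-hand border out of alignment.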

A typical printout of this logger looks like the following:

from datetime import datetime
from ml_logger import ML_Logger

logger = ML_Logger(log_directory=f"/mnt/bucket/deep_Q_learning/{datetime.now():%Y%m%d-%H%M%S.%f}")

# G, RUN and Reporting are the experiment's own parameter classes
logger.log_params(G=vars(G), RUN=vars(RUN), Reporting=vars(Reporting))

This outputs the following:

═════════════════════════════════════════════════════
              G
───────────────────────────────┬─────────────────────
           env_name            │ MountainCar-v0
             seed              │ None
      stochastic_action        │ True
         conv_params           │ None
         value_params          │ (64,)
        use_layer_norm         │ True
         buffer_size           │ 50000
      replay_batch_size        │ 32
      prioritized_replay       │ True
            alpha              │ 0.6
          beta_start           │ 0.4
           beta_end            │ 1.0
    prioritized_replay_eps     │ 1e-06
      grad_norm_clipping       │ 10
           double_q            │ True
         use_dueling           │ False
     exploration_fraction      │ 0.1
          final_eps            │ 0.1
         n_timesteps           │ 100000
        learning_rate          │ 0.001
            gamma              │ 1.0
        learning_start         │ 1000
        learn_interval         │ 1
target_network_update_interval │ 500
═══════════════════════════════╧═════════════════════
             RUN
───────────────────────────────┬─────────────────────
        log_directory          │ /mnt/slab/krypton/machine_learning/ge_dqn/2017-11-20/162048.353909-MountainCar-v0-prioritized_replay(True)
          checkpoint           │ checkpoint.cp
           log_file            │ output.log
═══════════════════════════════╧═════════════════════
          Reporting
───────────────────────────────┬─────────────────────
     checkpoint_interval       │ 10000
        reward_average         │ 100
        print_interval         │ 10
═══════════════════════════════╧═════════════════════
╒════════════════════╤════════════════════╕
│      timestep      │        1999        │
├────────────────────┼────────────────────┤
│      episode       │         10         │
├────────────────────┼────────────────────┤
│    total reward    │       -200.0       │
├────────────────────┼────────────────────┤
│ total reward/mean  │       -200.0       │
├────────────────────┼────────────────────┤
│  total reward/max  │       -200.0       │
├────────────────────┼────────────────────┤
│time spent exploring│       82.0%        │
├────────────────────┼────────────────────┤
│    replay beta     │        0.41        │
╘════════════════════╧════════════════════╛
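
Values can also be colored and given a custom format:

from ml_logger import ML_Logger

# Color and percent also need to be imported; the import is omitted in this snippet
logger = ML_Logger('/mnt/slab/krypton/unitest')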
logger.log(0, some=Color(0.1, 'yellow'))
logger.log(1, some=Color(0.28571, 'yellow', lambda v: f"{v * 100:.5f}%"))
logger.log(2, some=Color(0.85, 'yellow', percent))
logger.log(3, {"some_var/smooth": 10}, some=Color(0.85, 'yellow', percent))
logger.log(4, some=Color(10, 'yellow'))
logger.log_histogram(4, td_error_weights=[0, 1, 2, 3, 4, 2, 3, 4, 5])

Colored output (the values are printed in yellow):

╒════════════════════╤════════════════════╕
│        some        │        0.1         │
╘════════════════════╧════════════════════╛
╒════════════════════╤════════════════════╕
│        some        │     28.57100%      │
╘════════════════════╧════════════════════╛
╒════════════════════╤════════════════════╕
│        some        │       85.0%        │
╘════════════════════╧════════════════════╛
╒════════════════════╤════════════════════╕
│  some var/smooth   │         10         │
├────────────────────┼────────────────────┤
│        some        │       85.0%        │
╘════════════════════╧════════════════════╛
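To make the role of the color and formatter arguments concrete, here is a small sketch of a value wrapper that applies an optional formatter and wraps the result in ANSI color codes. ColoredValue, ANSI_CODES, and this re-implementation of percent are hypothetical stand-ins written to reproduce the output shown above; they are not ML-Logger's Color class.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Standard ANSI escape sequences for foreground colors.
ANSI_CODES = {"yellow": "\x1b[33m", "green": "\x1b[32m"}
ANSI_RESET = "\x1b[0m"


def percent(value: float) -> str:
    """Format a fraction as a percentage with one decimal place."""
    return f"{value * 100:.1f}%"


@dataclass
class ColoredValue:
    value: Any
    color: str
    formatter: Optional[Callable[[Any], str]] = None

    def text(self) -> str:
        # Apply the custom formatter when given, else fall back to str().
        return self.formatter(self.value) if self.formatter else str(self.value)

    def render(self) -> str:
        # Wrap the formatted text in ANSI escape codes for terminal color.
        return f"{ANSI_CODES[self.color]}{self.text()}{ANSI_RESET}"
```

Keeping text() separate from render() lets a table layout measure the visible width of a cell before the invisible escape codes are added.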

TODO:

  • [ ] Integrate with visdom, directly plot locally.

    • (better to keep it separate, because visdom is shitty.)

    • ml_logger does NOT know the full data set. Therefore we should not expect it to do the data processing such as taking mean, reservoir sampling etc. Where should this happen though?

    • just log to visdom for now. Use the primitive plot.ly plotting interface.

    • data: keys/values
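The TODO above asks where processing such as taking a mean or reservoir sampling should happen, given that the logger only sees data one record at a time. Both operations can be computed incrementally over a stream, as the sketch below shows. It is a generic illustration of the two techniques (an incremental mean and Algorithm R for reservoir sampling), not a proposal taken from ML-Logger; the names RunningMean and reservoir_sample are hypothetical.

```python
import random


class RunningMean:
    """Incrementally updated mean that needs only constant memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Shift the mean toward the new value by 1/count of the difference.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream (Algorithm R)."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = item  # replace with probability k / (i + 1)
    return sample
```

Either helper could run on the logging client or on the server; the sketch does not settle that question, it only shows that neither requires the full data set in memory.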
