
Project description

scale-sdk

Python library for SKT SCALE

Install

pip install skt-scale

Configure secret and endpoint

Description

Configure SCALE CLI options. Running this command prompts you for configuration values such as your secret and your SCALE endpoint. If your config file does not exist (the default location is ~/.scale/config), the SCALE CLI creates it for you.

Input

$ scalecli configure 
Enter secret: super-secret
Enter SCALE endpoint: 10.0.0.2:14001

Output

$ cat ~/.scale/config 
{"endpoint": "10.0.0.2:14001", "secret": "super-secret"}  
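Since the config file is plain JSON, it can also be inspected or generated programmatically. A minimal sketch (`load_config` is a hypothetical helper for illustration, not part of the SDK; the SDK may read the file differently):

```python
import json
from pathlib import Path

# Default location created by `scalecli configure` (an assumption based on the docs above).
CONFIG_PATH = Path.home() / ".scale" / "config"


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Read the SCALE config file: plain JSON with 'endpoint' and 'secret' keys."""
    with open(path) as f:
        return json.load(f)
```

For the file shown above, `load_config()["endpoint"]` would return `"10.0.0.2:14001"`.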

Usage

CLI create job

Description

Starts a job

Provide the following arguments:

  • --job_name: job name (ex: my-job)
  • --image_name: Docker image name (ex: scale/tensorflow:1.14)
  • --source_file: Python source file location (ex: /home/user/test.py)
  • --gpu_type: GPU type (ex: rtx-2080-ti)
  • --cpu: CPU count (ex: 1 -> 1 core)
  • --mem: memory in gigabytes (ex: 4 -> 4Gi)
  • --gpu: GPU count (ex: 1 -> 1 GPU)

Input

$ scalecli create_job \
 --job_name=$RANDOM \
 --image_name=scale/tensorflow:1.14-v1-py3 \
 --source_file=./source.py \
 --mem=2 \
 --cpu=1

Output

job id:  be37ae98-f605-4c40-9f6e-70e4087e6ce7
..........[SYSTEM] Train start.
[2021-01-18 03:47:35:1] WARNING: Logging before flag parsing goes to stderr.
[2021-01-18 03:47:35:2] I0118 03:47:32.800786 140663770220352 estimator.py:1790] Using default config.
[2021-01-18 03:47:35:3] W0118 03:47:32.802039 140663770220352 estimator.py:1811] Using temporary folder as model directory: /tmp/tmp4gn_unya
[2021-01-18 03:47:35:4] I0118 03:47:32.803353 140663770220352 estimator.py:209] Using config: {'_model_dir': '/tmp/tmp4gn_unya', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
[2021-01-18 03:47:35:5] graph_options {
[2021-01-18 03:47:35:6]   rewrite_options {
[2021-01-18 03:47:35:7]     meta_optimizer_iterations: ONE
[2021-01-18 03:47:35:8]   }
[2021-01-18 03:47:35:9] }
[2021-01-18 03:47:35:10] , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7feeb27627b8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
[2021-01-18 03:47:35:11] W0118 03:47:32.803683 140663770220352 model_fn.py:630] Estimator's model_fn (<function model_fn at 0x7feeb27b4d90>) includes params argument, but params are not passed to Estimator.
[2021-01-18 03:47:35:12] W0118 03:47:32.891000 140663770220352 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
[2021-01-18 03:47:35:13] Instructions for updating:
[2021-01-18 03:47:35:14] Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
[2021-01-18 03:47:35:15] I0118 03:47:32.996457 140663770220352 estimator.py:1145] Calling model_fn.
[2021-01-18 03:47:35:16] W0118 03:47:32.996738 140663770220352 deprecation_wrapper.py:119] From app.py:16: The name tf.layers.Dense is deprecated. Please use tf.compat.v1.layers.Dense instead.
[2021-01-18 03:47:35:17] 
[SYSTEM] Train completed.

GPU example

scalecli create_job \
 --job_name=$RANDOM \
 --image_name=scale/tensorflow:1.14-v1-py3 \
 --source_file=./source.py \
 --gpu_type=Tesla_P100-PCIE-16GB \
 --mem=2 \
 --gpu=1 \
 --cpu=1

SDK create job

Description

Starts a job

The create_job function accepts the following parameters:

  • job_name: job name (ex: my-job)
  • image_name: Docker image name (ex: scale/tensorflow:1.14)
  • source_file: Python source file location (ex: /home/user/test.py)
  • gpu_type: GPU type (ex: rtx-2080-ti)
  • cpu: CPU count (ex: 1 -> 1 core)
  • mem: memory in gigabytes (ex: 4 -> 4Gi)
  • gpu: GPU count (ex: 1 -> 1 GPU)

Input

cpu_example.py

import os
import random
import string
from scale import Client


def main():
    client = Client()
    random_job_name = "".join(
        random.choice(string.ascii_letters + string.digits) for _ in range(10)
    )
    image_name = "scale/tensorflow:1.14-v1-py3"
    current_dir = os.path.dirname(os.path.realpath(__file__))
    source_file = os.path.join(current_dir, "source.py")
    client.create_job(
        job_name=random_job_name, image_name=image_name, source_file=source_file
    )


if __name__ == "__main__":
    main()

gpu_example.py

import os
import random
import string
from scale import Client


def main():
    client = Client(
        host="http://0.0.0.0:13202", user_id="user", token="secret_token"
    )
    random_job_name = "".join(
        random.choice(string.ascii_letters + string.digits) for _ in range(10)
    )
    image_name = "scale/tensorflow:1.14-v1-py3"
    current_dir = os.path.dirname(os.path.realpath(__file__))
    source_file = os.path.join(current_dir, "source.py")
    gpu_type = "Tesla_P100-PCIE-16GB"
    client.create_job(
        job_name=random_job_name,
        image_name=image_name,
        source_file=source_file,
        gpu_type=gpu_type,
        cpu=1,
        mem=2,
        gpu=1,
    )


if __name__ == "__main__":
    main()

source.py

# -*- coding: utf-8 -*-
# pylint disable

import tensorflow as tf


tf.logging.set_verbosity(tf.logging.INFO)


def input_fn():
    features = tf.data.Dataset.from_tensors([[1.0]]).repeat()
    labels = tf.data.Dataset.from_tensors(1.0).repeat()
    return tf.data.Dataset.zip((features, labels))


def model_fn(features, labels, mode, params):
    layer = tf.layers.Dense(1)
    logits = layer(features)
    loss = tf.losses.mean_squared_error(
        labels=labels, predictions=tf.reshape(logits, [])
    )
    if mode == tf.estimator.ModeKeys.TRAIN:
        step = tf.train.get_or_create_global_step()
        train_op = tf.train.AdamOptimizer().minimize(loss, step)
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, train_op=train_op
        )


def main():
    estimator = tf.estimator.Estimator(model_fn=model_fn)
    estimator.train(input_fn=input_fn, steps=1000)


main()

Output

$ python cpu_example.py
job id:  bc773270-a343-4dd0-9644-55f839de9d84
............[SYSTEM] Train start.
[2021-01-18 03:46:25:1] WARNING: Logging before flag parsing goes to stderr.
[2021-01-18 03:46:25:2] I0118 03:46:22.422873 140147188627264 estimator.py:1790] Using default config.
[2021-01-18 03:46:25:3] W0118 03:46:22.423443 140147188627264 estimator.py:1811] Using temporary folder as model directory: /tmp/tmpccfxb_6l
[2021-01-18 03:46:25:4] I0118 03:46:22.423917 140147188627264 estimator.py:209] Using config: {'_model_dir': '/tmp/tmpccfxb_6l', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
[2021-01-18 03:46:25:5] graph_options {
[2021-01-18 03:46:25:6]   rewrite_options {
[2021-01-18 03:46:25:7]     meta_optimizer_iterations: ONE
[2021-01-18 03:46:25:8]   }
[2021-01-18 03:46:25:9] }
[2021-01-18 03:46:25:10] , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f76247387b8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
[2021-01-18 03:46:25:11] W0118 03:46:22.424028 140147188627264 model_fn.py:630] Estimator's model_fn (<function model_fn at 0x7f762478ad90>) includes params argument, but params are not passed to Estimator.
[2021-01-18 03:46:25:12] W0118 03:46:22.435012 140147188627264 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
[2021-01-18 03:46:25:13] Instructions for updating:
[2021-01-18 03:46:25:14] Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
[2021-01-18 03:46:25:15] I0118 03:46:22.502991 140147188627264 estimator.py:1145] Calling model_fn.
[2021-01-18 03:46:25:16] W0118 03:46:22.503099 140147188627264 deprecation_wrapper.py:119] From app.py:16: The name tf.layers.Dense is deprecated. Please use tf.compat.v1.layers.Dense instead.
[2021-01-18 03:46:25:17] 
[SYSTEM] Train completed.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

skt_scale-3.0.0a1-py3-none-any.whl (7.7 kB)


File details

Details for the file skt_scale-3.0.0a1-py3-none-any.whl.

File metadata

  • Download URL: skt_scale-3.0.0a1-py3-none-any.whl
  • Upload date:
  • Size: 7.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.24.0 setuptools/50.3.0.post20201006 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.6.12

File hashes

Hashes for skt_scale-3.0.0a1-py3-none-any.whl:

  • SHA256: bcbb9b50de976c03d9040bf017e1874a9041c473447a56cbcd125df1a42ef622
  • MD5: 90e7b42553de8273fc3350fc0d8cd7e2
  • BLAKE2b-256: bfe70e1df0656a3a44865a18101464bb1a53a890d8d5aeb1370f66b7f106ab29
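To check a downloaded wheel against the published digest before installing, you can compute its SHA-256 locally. A minimal sketch using only the standard library (`sha256_of` is a hypothetical helper, not part of the SDK):

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare the result against the SHA256 value listed above:
# sha256_of("skt_scale-3.0.0a1-py3-none-any.whl") == "bcbb9b50de..."
```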

See more details on using hashes in the pip documentation.
