CSGHub SDK for downloading and uploading models, datasets, and spaces

CSGHub SDK

Introduction

The CSGHub SDK is a Python client for the CSGHub server. It gives Python developers a straightforward way to operate and manage remote CSGHub instances, whether the goal is to automate tasks, manage data, or integrate CSGHub functionality into a Python application.

Key Features

With just a few lines of code, you can switch the model download URL to OpenCSG, which can speed up model downloads.

Effortlessly connect and interact with CSGHub server instances from your Python code.

Comprehensive API Coverage: Full access to the wide array of functionalities provided by the CSGHub server, ensuring you can perform a broad spectrum of operations.

User-Friendly: Designed with simplicity in mind, making it accessible for beginners while powerful enough for advanced users.

Efficient Data Management: Streamline the process of managing and manipulating data on your CSGHub server.

Automation Ready: Automate repetitive tasks and processes, saving time and reducing the potential for human error.

Open Source: Dive into the source code, contribute, and customize the SDK to fit your specific needs.

The main functions are:

  1. Repo downloading (model/dataset/space/code/mcp/skill)
  2. Repo information query (compatible with Hugging Face)

XNet Accelerated Transfer (New!): Next-generation storage and version control technology for large-scale AI/ML data.

  • Storage Optimization: Significantly reduces storage costs (tested savings > 50%) via intelligent Content-Defined Chunking and deduplication.
  • High-Speed Transfer: Incremental updates transfer only the data chunks that changed, which can make uploads and downloads several times faster.
  • Enabled by Default: Automatically optimizes upload, download, and storage for LFS large files. To disable it, set the environment variable CSGHUB_DISABLE_XNET=true to fall back to standard LFS mode.
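
To make the chunking and deduplication idea above concrete, here is a minimal toy sketch in Python. It is a generic illustration of content-defined chunking, not XNet's actual algorithm or parameters: the function names (chunk, store), the hash table (GEAR), and the size limits are all assumptions made for this example. Chunk boundaries are chosen by a rolling hash of the content, so an insertion near the start of a file changes only the chunks around the edit, and the remaining chunks are already in the store.

```python
import hashlib

# Toy parameters, chosen for illustration only (not XNet's real settings).
MIN_SIZE = 16      # never cut a chunk shorter than this
MAX_SIZE = 1024    # always cut once a chunk reaches this length
MASK = 0xFF        # cut when the low 8 hash bits are zero (about 1 in 256 positions)

# One pseudo-random 32-bit value per byte value, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big") for b in range(256)]


def chunk(data: bytes) -> list[bytes]:
    """Split data at positions chosen by a rolling hash of the content itself."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_SIZE and (h & MASK) == 0) or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def store(data: bytes, chunk_store: dict[str, bytes]) -> int:
    """Add data's chunks to the store; return how many chunks were new."""
    new = 0
    for piece in chunk(data):
        key = hashlib.sha256(piece).hexdigest()
        if key not in chunk_store:
            chunk_store[key] = piece
            new += 1
    return new


# Deterministic pseudo-random sample data (about 20 kB).
original = b"".join(hashlib.sha256(str(n).encode()).digest() for n in range(640))
edited = b"a small edit" + original  # insert bytes at the front

chunk_store: dict[str, bytes] = {}
first_upload = store(original, chunk_store)
second_upload = store(edited, chunk_store)
print(f"first upload: {first_upload} new chunks")
print(f"after edit:   {second_upload} new chunks out of {len(chunk(edited))}")
```

With fixed-size chunks, the same insertion would shift every later boundary and every chunk would look new. Cutting on content is what lets the second upload reuse chunks from the first.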

Get My Token

Visit OpenCSG and click Sign Up in the top right corner to register. Log in with your new username and password, then open Account Settings and find Access Token to obtain your token.

Getting Started

To get started with the CSGHub SDK, ensure you have Python installed on your system. Then, you can install the SDK using pip:

pip install csghub-sdk

# install with train dependencies
pip install "csghub-sdk[train]"
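
To confirm that the install succeeded, you can query the installed version from Python. This is a small sketch that uses only the standard library; the helper name installed_version is hypothetical and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints None if csghub-sdk is not installed in the current environment.
print("csghub-sdk version:", installed_version("csghub-sdk"))
```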

After installation, you can begin using the SDK to connect to your CSGHub server by importing it into your Python script:

import os

from pycsghub.repo_reader import AutoModelForCausalLM, AutoTokenizer

# Token used to authenticate downloads; replace with your own access token
os.environ['CSGHUB_TOKEN'] = 'your_access_token'

# Load the model and tokenizer by repo ID
mid = 'OpenCSG/csg-wukong-1B'
model = AutoModelForCausalLM.from_pretrained(mid)
tokenizer = AutoTokenizer.from_pretrained(mid)

# Tokenize a prompt, generate a continuation, and decode it
inputs = tokenizer.encode("Write a short story", return_tensors="pt")
outputs = model.generate(inputs)
print('result: ', tokenizer.batch_decode(outputs))
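
The example above hard-codes the token in the script. If you would rather keep it out of source files, one option is to read CSGHUB_TOKEN from the environment and prompt only when it is missing. The sketch below is a hypothetical helper, not part of the SDK; it assumes the SDK picks up the variable when a model is loaded, as the example above suggests.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_token(prompt: Callable[[str], str] = getpass) -> str:
    """Return CSGHUB_TOKEN, asking for it only when the variable is unset."""
    token = os.environ.get("CSGHUB_TOKEN")
    if not token:
        token = prompt("CSGHub access token: ")
        # Make the token visible to code that runs later in this process.
        os.environ["CSGHUB_TOKEN"] = token
    return token
```

Call ensure_token() once before loading a model. The prompt argument exists so the helper can be exercised without an interactive terminal.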

Quickly switch download URLs

To switch the model download URL, change the import package name from transformers to pycsghub.repo_reader and set the download token.

import os

os.environ['CSGHUB_TOKEN'] = 'your_access_token'
from pycsghub.repo_reader import AutoModelForCausalLM, AutoTokenizer
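
Because the switch is only an import change, a script can also choose the source at runtime. The following is a minimal sketch under that assumption: import_first_available and load_auto_classes are hypothetical helpers, and the only SDK names used are the ones shown above (pycsghub.repo_reader, AutoModelForCausalLM, AutoTokenizer).

```python
import importlib
from types import ModuleType
from typing import Iterable


def import_first_available(candidates: Iterable[str]) -> ModuleType:
    """Import and return the first module in candidates that is installed."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of the candidate modules could be imported")


def load_auto_classes():
    """Prefer pycsghub.repo_reader, falling back to transformers."""
    module = import_first_available(["pycsghub.repo_reader", "transformers"])
    return module.AutoModelForCausalLM, module.AutoTokenizer
```

With this in place, `AutoModelForCausalLM, AutoTokenizer = load_auto_classes()` keeps the rest of a script unchanged whichever package is installed.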

Install from source code

git clone https://github.com/OpenCSGs/csghub-sdk.git
cd csghub-sdk
pip install .

To also install the model- and dataset-related dependencies, add the train extra:

pip install '.[train]'

Command-line use cases

For detailed command line usage examples, including downloading models/datasets, uploading files/folders, and managing inference/fine-tuning instances, please refer to our CLI documentation.

SDK use cases

For detailed SDK usage examples, including model/dataset downloading, file uploading, directory uploading, and Hugging Face compatible model loading, please refer to our SDK documentation.

Roadmap

  1. Interacting with CSGHub via command-line tools
  2. Management operations such as creation and modification of CSGHub repositories
  3. Model deployment locally or online
  4. Model fine-tuning locally or online
