Package for Inferencing AI at DSC
AIDSC: Lightweight Classroom Inference Client
aidsc is a Python client designed for students and researchers to perform LLM inference on classroom clusters. It acts as a gateway to the Zeus control plane, allowing you to run models on high-performance classroom hardware from your local machine.
Key Features
- Classroom-Ready: Built specifically for running inference on classroom machines that serve LLMs.
- Broad Machine Access: Users can access any classroom machine, and Skynet access is available with special permissions.
- Seamless Remote Access: Automatic SSH tunneling via PAMD to bridge the gap to internal networks.
- Smart Connectivity: Checks for existing local tunnels before prompting for credentials.
- Persistent Credentials: Optional encrypted-style saving to `.env` (git-ignored automatically).
- Lightweight & Typed: Pydantic-validated payloads ensuring your requests always match the Zeus API.
Hardware and Model Capacity
Each classroom machine is equipped with an NVIDIA RTX A5000 GPU with 24GB of VRAM. These machines can run models of up to roughly 35B parameters at 4-bit quantization, though some newer models may be difficult to serve reliably on the older hardware.
Skynet is equipped with 4 NVIDIA A100 GPUs with 80GB of VRAM each, for a total of 320GB. With special permissions, Skynet can run models up to roughly 550B-600B parameters at 4-bit.
All inference runs through vLLM. Models are not offloaded to CPU memory; they are served directly on the available GPU memory. To request additional models, email bdg20b.
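The capacity figures above can be sanity-checked with simple arithmetic on weight memory. The sketch below is a rough estimate only: it assumes weights dominate memory use and ignores the KV cache, activations, and runtime overhead, which is why practical limits sit below the raw VRAM figure. The helper names `weight_memory_gb` and `fits_on_gpu` are illustrative and not part of aidsc.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # Billions of parameters times bytes per parameter gives gigabytes.
    return params_billions * bytes_per_weight


def fits_on_gpu(params_billions: float, vram_gb: float, bits_per_weight: int = 4) -> bool:
    """True if the weights alone fit in VRAM; no allowance for the KV cache."""
    return weight_memory_gb(params_billions, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    print(weight_memory_gb(35))   # 17.5 GB of weights on a 24 GB A5000
    print(weight_memory_gb(600))  # 300.0 GB of weights on Skynet's 320 GB
    print(fits_on_gpu(70, 24))    # False: 35.0 GB of weights exceeds 24 GB
```

Because models are not offloaded to CPU memory, whatever is left after the weights must also hold the KV cache, so long contexts or large batches reduce the largest model that can be served.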
Connectivity Logic
The client automatically resolves connectivity in this order:
- Local Check: Checks `localhost:32553`. If healthy, it proceeds immediately.
- Environment/Dotenv: Looks for `PAMD_USER` and `PAMD_PASS` in your system or `.env` file.
- Interactive Setup: If credentials are missing or the port is closed, it clears the console and guides you through establishing an SSH tunnel.
- Background Persistence: After establishing a tunnel, it asks if you'd like to keep it running in the background after your script exits.
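The resolution order above can be sketched as a small decision function. This is a minimal illustration, not the client's actual code: `port_is_open` and `resolve_connectivity` are hypothetical names, the health check is reduced to a plain TCP connection attempt (the real check may query an HTTP endpoint), and loading values from a `.env` file is left out.

```python
import os
import socket
from typing import Mapping, Optional


def port_is_open(host: str = "127.0.0.1", port: int = 32553, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def resolve_connectivity(tunnel_up: bool, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the next step in the documented order (hypothetical helper)."""
    env = os.environ if env is None else env
    if tunnel_up:
        return "use_existing_tunnel"  # local port already answers
    if env.get("PAMD_USER") and env.get("PAMD_PASS"):
        return "open_tunnel_with_saved_credentials"  # credentials were found
    return "interactive_setup"  # nothing usable, so prompt the user


if __name__ == "__main__":
    print(resolve_connectivity(port_is_open()))
```

Passing the port status in as a boolean keeps the decision logic separate from the network probe, which makes the ordering easy to test without a live tunnel.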
Usage Reference
1. Initialization
```python
from aidsc import aidsc

client = aidsc(
    local_base_url="http://127.0.0.1:32553",  # Override local Zeus URL
    infer_timeout=600.0,                      # Global timeout for inference calls (seconds)
    dotenv_path=".env",                       # Custom path to .env file
)

# Ensure connectivity (checks local port, prompts for SSH if needed)
client.ensure_ready()
```
2. Request Parameters
The LanguageInferenceRequest object allows fine-grained control over the job:
```python
from aidsc import LanguageArgs, LanguageInferenceRequest

req = LanguageInferenceRequest(
    args=LanguageArgs(
        model="gpt-oss-20b",             # The target model name
        questions=["Recipe for rice?"],  # List of prompts/questions
    ),
    max_new_tokens=512,         # Tokens to generate per response
    max_context_length=8192,    # Context window (alias: max_model_len)
    max_server_uptime=604800,   # How long the server can stay alive (seconds)
    total_concurrent_models=1,  # Instances of the model to run
    max_num_seqs=8,             # Max sequences for batching
)
```
3. Inference & Overrides
You can override specific parameters during the infer() call:
```python
response = client.infer(
    req,
    server=117,           # Target a specific machine ID (e.g., 117)
    max_new_tokens=1024,  # Override the request's max_new_tokens
    client_timeout=300,   # Specific timeout for this request
)
```
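One way to reason about overrides is that keyword arguments passed to infer() take precedence over the matching fields on the request, while the request object itself stays unchanged. The sketch below illustrates that precedence rule with a plain dataclass stand-in. `RequestSketch` and `apply_overrides` are hypothetical names, and the real client may merge values differently.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class RequestSketch:
    """Stand-in for a request; field names and defaults mirror the documented ones."""
    max_new_tokens: int = 512
    max_context_length: int = 8192
    max_num_seqs: int = 8


def apply_overrides(req: RequestSketch, **overrides: Any) -> RequestSketch:
    """Return a copy of req with non-None overrides applied; req is not mutated."""
    changes = {key: value for key, value in overrides.items() if value is not None}
    return replace(req, **changes)


if __name__ == "__main__":
    base = RequestSketch()
    effective = apply_overrides(base, max_new_tokens=1024)
    print(base.max_new_tokens, effective.max_new_tokens)  # 512 1024
```

Returning a copy means the same request can be reused across several calls with different per-call settings.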
Configuration Reference
Environment Variables
These can be set in your OS or in a .env file in your project root.
| Variable | Default | Description |
|---|---|---|
| `PAMD_USER` | - | Your FSU/PAMD username (e.g., abc20z). |
| `PAMD_PASS` | - | Your PAMD password (used for sshpass). |
| `aidsc_ZEUS_LOCAL_URL` | `http://127.0.0.1:32553` | The local port where the tunnel maps. |
| `aidsc_ZEUS_REMOTE_URL` | `http://144.174.11.196:32553` | Internal IP of the Zeus orchestration server. |
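To make the relationship between the two URL variables concrete, the sketch below reads them with the documented defaults and derives the port-forwarding value that OpenSSH's `-L` option expects (`local_port:remote_host:remote_port`). The helper names `load_urls` and `forward_spec` are hypothetical, and whether the client builds its tunnel exactly this way is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_LOCAL_URL = "http://127.0.0.1:32553"
DEFAULT_REMOTE_URL = "http://144.174.11.196:32553"


def load_urls(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Read both Zeus URLs, falling back to the documented defaults."""
    env = os.environ if env is None else env
    local = env.get("aidsc_ZEUS_LOCAL_URL", DEFAULT_LOCAL_URL)
    remote = env.get("aidsc_ZEUS_REMOTE_URL", DEFAULT_REMOTE_URL)
    return local, remote


def forward_spec(local_url: str, remote_url: str) -> str:
    """Build the value for ssh -L: local_port:remote_host:remote_port."""
    local, remote = urlsplit(local_url), urlsplit(remote_url)
    return f"{local.port}:{remote.hostname}:{remote.port}"


if __name__ == "__main__":
    print(forward_spec(*load_urls({})))  # 32553:144.174.11.196:32553
```

With the defaults, requests sent to the local URL travel through the tunnel and arrive at the Zeus server on the internal network.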
Client Settings (aidsc())
| Parameter | Type | Description |
|---|---|---|
| `local_base_url` | `str` | URL to reach Zeus (default: `aidsc_ZEUS_LOCAL_URL`). |
| `infer_timeout` | `float` | Max time to wait for an inference result (default: 600.0). |
| `dotenv_path` | `str` | Custom path to load `.env` from. |
Request Settings (LanguageInferenceRequest)
| Field | Default | Description |
|---|---|---|
| `max_new_tokens` | 512 | Max tokens generated for each question. |
| `max_context_length` | 8192 | Total tokens (prompt + completion) allowed. |
| `max_server_uptime` | 604800 | Kill server after this many seconds of idle time. |
| `total_concurrent_models` | 1 | Number of engine instances to spin up. |
| `max_num_seqs` | 8 | Maximum number of concurrent sequences. |
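Since max_context_length covers the prompt and the completion together, the tokens available to the prompt are whatever remains after max_new_tokens is reserved. A small worked example, assuming the limit is enforced as a simple sum (the helper name `prompt_token_budget` is illustrative):

```python
def prompt_token_budget(max_context_length: int = 8192, max_new_tokens: int = 512) -> int:
    """Tokens left for the prompt once the completion allowance is reserved."""
    if max_new_tokens >= max_context_length:
        raise ValueError("max_new_tokens must be smaller than max_context_length")
    return max_context_length - max_new_tokens


if __name__ == "__main__":
    print(prompt_token_budget())                     # 7680 with the defaults
    print(prompt_token_budget(max_new_tokens=1024))  # 7168 after raising the completion limit
```

Raising max_new_tokens in an override therefore shrinks the room left for the prompt unless max_context_length is raised as well.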
Project Architecture
- `client.py`: The high-level API. Handles the "Automatic SSH" logic, process detachment, and console UI.
- `models.py`: Pydantic definitions for all request types.
- `connectivity.py`: Low-level logic for checking local port health.
- `zeus_client.py`: The underlying HTTP client for the Zeus REST API.
Security Note
The client uses sshpass to handle password-based SSH connections to the PAMD jump host. Your .env file is automatically added to .gitignore when you run setup_interactive() to prevent accidental exposure of your credentials.
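If you create a `.env` file by hand instead of through setup_interactive(), you can apply the same protection yourself. The helper below is a minimal sketch of an idempotent "add to .gitignore" step. `ensure_gitignored` is a hypothetical name and not the client's own implementation.

```python
from pathlib import Path


def ensure_gitignored(entry: str = ".env", gitignore: Path = Path(".gitignore")) -> bool:
    """Add entry to the .gitignore file if missing; return True if the file changed."""
    text = gitignore.read_text() if gitignore.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False  # already ignored, nothing to do
    # Start on a fresh line if the existing file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(text + prefix + entry + "\n")
    return True


if __name__ == "__main__":
    changed = ensure_gitignored()
    print("updated .gitignore" if changed else ".env already ignored")
```

Running it twice is safe: the second call finds the entry and leaves the file untouched.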