BYO-Cluster Quickstart
This guide covers the requirements and infrastructure needed to deploy Smart Studio in your own IDC (Internet Data Center) on your own GPU cluster.
Standard Requirements for IDC Deployment
The following CPU and storage servers are required. These servers must be on the same local virtual network as the GPUs.
| No. | Basic Server Configuration | Quantity | Purpose |
|---|---|---|---|
| 1 | CPU: 64 cores, Memory: 128GB, Storage: 1T+ | 3 | Deploy K8S cluster control service + router platform service (required) |
| 2 | GPU: depends on the customer's local GPU configuration and the size of the LLM | N | Model serving; see Minimum Configuration for Different Model Deployments |
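The CPU server row above can be turned into a quick pre-deployment check. The following is a minimal sketch, not part of Smart Studio: the `Server` class, `check_inventory` function, and all field names are hypothetical, and it assumes the figures in row 1 (64 cores, 128 GB, 1 TB) are minimums rather than exact values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """Hypothetical inventory record for one CPU/storage server."""
    name: str
    cpu_cores: int
    memory_gb: int
    storage_tb: float


# Thresholds taken from row 1 of the table, treated as minimums (assumption).
MIN_CORES = 64
MIN_MEMORY_GB = 128
MIN_STORAGE_TB = 1.0
REQUIRED_SERVERS = 3


def meets_spec(server: Server) -> bool:
    """True if the server reaches every threshold in row 1."""
    return (
        server.cpu_cores >= MIN_CORES
        and server.memory_gb >= MIN_MEMORY_GB
        and server.storage_tb >= MIN_STORAGE_TB
    )


def check_inventory(servers: List[Server]) -> List[str]:
    """Return a list of problems; an empty list means the inventory passes."""
    problems = [
        f"{s.name}: below 64 cores / 128 GB / 1 TB"
        for s in servers
        if not meets_spec(s)
    ]
    qualifying = sum(1 for s in servers if meets_spec(s))
    if qualifying < REQUIRED_SERVERS:
        problems.append(
            f"need {REQUIRED_SERVERS} qualifying servers, found {qualifying}"
        )
    return problems


if __name__ == "__main__":
    inventory = [
        Server("cpu-01", 64, 128, 1.0),
        Server("cpu-02", 64, 256, 2.0),
        Server("cpu-03", 32, 128, 1.0),  # too few cores
    ]
    for problem in check_inventory(inventory):
        print(problem)
```

The check covers only the hardware figures in the table. The requirement that these servers share a local virtual network with the GPUs still has to be verified separately.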
Demo of Network Topology
Minimum Requirements for IDC Deployment (POC)
For a minimal proof-of-concept setup, a single GPU server can be used.
| No. | Basic Server Configuration | Quantity | Purpose |
|---|---|---|---|
| 1 | GPU: H20/H100/H200/RTX PRO 6000, CPU: 128+ cores, Memory: 1TB, Storage: 1T+ | 1 | All services (K8S cluster control service + CNI, AI Gateway, Model Serving) deployed on one GPU server, without HA guarantee |
Demo of Network Topology
Minimum Configuration for Different Model Deployments
Send us the name of the model you need, and we will reply with its GPU requirements.
| Model | Minimum Deployment GPU Specification (Cards) | Memory Pool Base Scale |
|---|---|---|
| Minimax-M2.5 | 4 Cards: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G, A100-80G, A800-80G | >4T |
| Qwen3.5-397B-A22B-FP8 | 8 Cards: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G | >4T |
| Deepseek V4-Flash | 4 Cards: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G | >1T |
| Deepseek V4-Pro | 8 Cards: H20-141G, H200-141G; 16 Cards: H100-80G, H20-96G | >1T |
| Deepseek-V3.2 | 8 Cards: H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G; 16 Cards: H100-80G | >8T |
| GLM-5.1 | 8 Cards: H20-141G, H200-141G; 16 Cards: H100-80G, H20-96G | >8T |
| Qwen3.6-27B / Qwen3.6-35B-A3B | 1 Card: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G, A100-80G, A800-80G; 2 Cards: L20-48G, L40s-48G | >1.5T |
| Qwen3.6-Flash | 1 Card: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G, A100-80G, A800-80G; 2 Cards: L20-48G, L40s-48G | >1T |
| Qwen3.7-Plus | 8 Cards: H20-141G, H200-141G; 16 Cards: H100-80G, H20-96G | >4T |
| Qwen3.7-Max | 16 Cards: B300-288G; 24 Cards: B200-188G, H200-141G, H20-141G; 40 Cards: H20-96G | >15T |
Note
Estimates for Qwen3.6-Flash and Qwen3.7-Plus/Max assume the bfloat16 data type and may not be fully accurate.
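To show why the data type matters for these estimates, here is a rough back-of-the-envelope sketch of weight memory: parameter count multiplied by bytes per parameter (2 for bfloat16, 1 for FP8). The function names and the `usable_fraction` default are assumptions made for illustration. The result is only a lower bound for holding the weights and is not how the table values were derived.

```python
import math

# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"bfloat16": 2, "fp8": 1}


def weight_memory_gb(params_billion: float, dtype: str) -> float:
    """Memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def min_cards_for_weights(
    params_billion: float,
    dtype: str,
    card_memory_gb: float,
    usable_fraction: float = 0.9,  # assumed headroom; not from the source
) -> int:
    """Lower bound on the card count needed just to hold the weights."""
    usable_gb = card_memory_gb * usable_fraction
    return math.ceil(weight_memory_gb(params_billion, dtype) / usable_gb)


if __name__ == "__main__":
    # A 397B-parameter model stored in FP8, on 80 GB cards.
    print(weight_memory_gb(397, "fp8"))
    print(min_cards_for_weights(397, "fp8", 80))
    # The same parameter count in bfloat16 needs twice the weight memory.
    print(weight_memory_gb(397, "bfloat16"))
```

The card counts in the table are higher than this bound, presumably because a real deployment also needs memory for the KV cache and activations and must fit a practical parallelism layout.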
Diagram of Infrastructure to Deploy Smart Studio on Local IDC
GPU/CPU Infra → Scheduler & Model Serving → Router Platform & Gateway
