BYO-Cluster Quickstart

This guide covers the requirements and infrastructure needed to deploy Smart Studio on your own local IDC (Internet Data Center) with your GPU cluster.

Standard Requirements for IDC Deployment

The following CPU and storage servers are required. These servers must be on the same local virtual network as the GPUs.

| No. | Basic Server Configuration | Quantity | Purpose |
| --- | --- | --- | --- |
| 1 | CPU: 64 cores, Memory: 128GB, Storage: 1T+ | 3 | Deploy K8S cluster control service + router platform service (required) |
| 2 | GPU: depends on the customer's local GPU configuration and the size of the LLM | N | See the Minimum Configuration for Different Model Deployments |
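
As a rough planning aid, the server count implied by the table above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the default of 8 GPU cards per server is an assumption not stated in this guide, so substitute the density of your own hardware.

```python
import math

# From the table above: 3 CPU servers host the K8S control service
# and the router platform service.
CONTROL_PLANE_SERVERS = 3


def gpu_servers_needed(cards_required: int, cards_per_server: int) -> int:
    """Return N, the number of GPU servers needed to host `cards_required` cards."""
    if cards_required <= 0 or cards_per_server <= 0:
        raise ValueError("card counts must be positive")
    # Round up: a partially filled server still counts as one server.
    return math.ceil(cards_required / cards_per_server)


def total_servers(cards_required: int, cards_per_server: int = 8) -> int:
    """Control-plane servers plus GPU servers.

    ASSUMPTION: 8 cards per server is a placeholder default, not a value
    taken from this guide.
    """
    return CONTROL_PLANE_SERVERS + gpu_servers_needed(cards_required, cards_per_server)


if __name__ == "__main__":
    # A model that needs 16 cards, on servers that each hold 8 cards.
    print(total_servers(16))
```

Under these assumptions, a 16-card deployment on 8-card servers needs 2 GPU servers, for 5 servers in total.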

Demo of Network Topology


Minimum Requirements for IDC Deployment (POC)

For a minimal proof-of-concept setup, a single GPU server can be used.

| No. | Basic Server Configuration | Quantity | Purpose |
| --- | --- | --- | --- |
| 1 | GPU: H20/H100/H200/RTX PRO 6000, CPU: 128+ cores, Memory: 1TB, Storage: 1T+ | 1 | All services (K8S cluster control service + CNI, AI Gateway, Model Serving) deployed on one GPU server, without HA guarantee |
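
Before starting a POC, it can help to compare a candidate host against the single-server row above. The following is a minimal sketch, not an official validation tool: the `Host` type and `poc_shortfalls` function are hypothetical names, and treating "1TB" of memory as 1024 GB is an assumption.

```python
from dataclasses import dataclass

# GPU models listed in the POC table above.
POC_GPUS = {"H20", "H100", "H200", "RTX PRO 6000"}

MIN_CPU_CORES = 128
MIN_MEMORY_GB = 1024   # ASSUMPTION: "1TB" in the table is read as 1024 GB
MIN_STORAGE_TB = 1.0


@dataclass(frozen=True)
class Host:
    """Hypothetical description of a candidate POC server."""
    gpu_model: str
    cpu_cores: int
    memory_gb: int
    storage_tb: float


def poc_shortfalls(host: Host) -> list[str]:
    """Return a list of human-readable gaps; an empty list means the host qualifies."""
    problems = []
    if host.gpu_model not in POC_GPUS:
        problems.append(f"GPU {host.gpu_model!r} is not in the POC list")
    if host.cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU cores {host.cpu_cores} < {MIN_CPU_CORES}")
    if host.memory_gb < MIN_MEMORY_GB:
        problems.append(f"memory {host.memory_gb}GB < {MIN_MEMORY_GB}GB")
    if host.storage_tb < MIN_STORAGE_TB:
        problems.append(f"storage {host.storage_tb}T < {MIN_STORAGE_TB}T")
    return problems


if __name__ == "__main__":
    candidate = Host(gpu_model="H200", cpu_cores=96, memory_gb=1024, storage_tb=2.0)
    for line in poc_shortfalls(candidate):
        print(line)
```

Returning the full list of gaps, rather than stopping at the first one, lets you see everything that needs upgrading in a single pass.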

Demo of Network Topology


Minimum Configuration for Different Model Deployments

Send us the name of the model you need, and we will reply with its GPU requirements.

| Model | Minimum Deployment GPU Specification (Cards) | Memory Pool Base Scale |
| --- | --- | --- |
| Minimax-M2.5 | 4 Cards: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G, A100-80G, A800-80G | >4T |
| Qwen3.5-397B-A22B-FP8 | 8 Cards: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G | >4T |
| Deepseek V4-Flash | 4 Cards: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G | >1T |
| Deepseek V4-Pro | 8 Cards: H20-141G, H200-141G; 16 Cards: H100-80G, H20-96G | >1T |
| Deepseek-V3.2 | 8 Cards: H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G; 16 Cards: H100-80G | >8T |
| GLM-5.1 | 8 Cards: H20-141G, H200-141G; 16 Cards: H100-80G, H20-96G | >8T |
| Qwen3.6-27B / Qwen3.6-35B-A3B | 1 Card: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G, A100-80G, A800-80G; 2 Cards: L20-48G, L40s-48G | >1.5T |
| Qwen3.6-Flash | 1 Card: H100-80G, H20-96G, H20-141G, H200-141G, RTX PRO 6000-96G, A100-80G, A800-80G; 2 Cards: L20-48G, L40s-48G | >1T |
| Qwen3.7-Plus | 8 Cards: H20-141G, H200-141G; 16 Cards: H100-80G, H20-96G | >4T |
| Qwen3.7-Max | 16 Cards: B300-288G; 24 Cards: B200-188G, H200-141G, H20-141G; 40 Cards: H20-96G | >15T |
Note

Estimates for Qwen3.6-Flash and Qwen3.7-Plus/Max assume the bfloat16 dtype and may not be fully accurate.
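
For scripting capacity checks, the model table can be encoded as a small lookup. The sketch below transcribes only three rows as an illustration; the remaining rows would follow the same pattern. The names `MIN_CARDS`, `min_cards`, and `total_gpu_memory_gb` are hypothetical, and the total-memory figure is plain arithmetic (cards times the per-card size in the GPU label), not a number published in this guide.

```python
from typing import Optional

# model -> list of (minimum card count, GPU types that count applies to),
# transcribed from three rows of the table above.
MIN_CARDS = {
    "Deepseek V4-Pro": [
        (8, ["H20-141G", "H200-141G"]),
        (16, ["H100-80G", "H20-96G"]),
    ],
    "Deepseek-V3.2": [
        (8, ["H20-96G", "H20-141G", "H200-141G", "RTX PRO 6000-96G"]),
        (16, ["H100-80G"]),
    ],
    "Qwen3.7-Max": [
        (16, ["B300-288G"]),
        (24, ["B200-188G", "H200-141G", "H20-141G"]),
        (40, ["H20-96G"]),
    ],
}


def min_cards(model: str, gpu: str) -> Optional[int]:
    """Return the minimum card count for `model` on `gpu`, or None if not listed."""
    for cards, gpus in MIN_CARDS.get(model, []):
        if gpu in gpus:
            return cards
    return None


def total_gpu_memory_gb(model: str, gpu: str) -> Optional[int]:
    """Cards times per-card memory, parsed from the '-<size>G' suffix of the GPU label."""
    cards = min_cards(model, gpu)
    if cards is None:
        return None
    per_card_gb = int(gpu.rsplit("-", 1)[1].rstrip("G"))
    return cards * per_card_gb


if __name__ == "__main__":
    print(min_cards("Deepseek V4-Pro", "H100-80G"))
    print(total_gpu_memory_gb("Qwen3.7-Max", "B300-288G"))
```

A `None` result means the combination does not appear in the transcribed rows, which is not the same as it being unsupported; in that case, ask for the GPU requirements as described at the top of this section.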

Diagram of Infrastructure to Deploy Smart Studio on Local IDC

GPU/CPU Infra → Scheduler & Model Serving → Router Platform & Gateway