Create Deployment
Deploy models from Model Gallery, upload your own models, or use existing fine-tuned models with full configuration control.
Step 1: Select a Model
Choose the base model or fine-tuned model you want to deploy.
Model Selection Options
Explore our curated collection of pre-trained models from leading AI providers.
Available Models: Qwen, meta-llama, openai, deepseek-ai, zai-org, Alibaba-NLP, tencent, moonshotai, google, internlm, and more
Configuration Options
Selecting a Model
Enable speculative decoding for faster inference
- Base Models: Choose the base model that best suits your use case. Consider the model's capabilities and limitations, and review its specifications (parameter count, context window, etc.).
- Fine-Tuned Models: Custom models trained on the platform from a base model.
Display name
A descriptive name that helps you identify the deployment on the dashboard (up to 64 characters).
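If you generate display names in a script before entering them, a small check can catch names that exceed the limit. The following is a minimal sketch, not part of the platform: the function name is hypothetical, and trimming whitespace and rejecting empty names are assumptions. Only the 64-character limit comes from this page.

```python
MAX_DISPLAY_NAME_LENGTH = 64  # limit stated in this guide


def validate_display_name(name: str) -> str:
    """Return the trimmed name, or raise ValueError if it is unusable.

    Hypothetical helper: trimming and the non-empty rule are assumptions.
    """
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("display name must not be empty")
    if len(cleaned) > MAX_DISPLAY_NAME_LENGTH:
        raise ValueError(
            f"display name has {len(cleaned)} characters; "
            f"the limit is {MAX_DISPLAY_NAME_LENGTH}"
        )
    return cleaned
```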

Step 2: Configure Resources
Select the appropriate compute resources and deployment settings for your model.
Resource Configuration
Region Selection
Choose the deployment region based on your users' location for optimal latency.
Available Regions: Singapore, Japan, USA-East, Indonesia, Frankfurt, Hong Kong, Malaysia
GPU Type Selection
Select a GPU type based on your model's requirements and performance needs.
Options: NVIDIA A10, NVIDIA L20, etc.
Accelerator Count
Number of accelerators to use per replica. A value is recommended automatically based on model size.
Replicas
Number of replicas to deploy for load balancing and high availability.
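To see how these four settings relate, the sketch below models them as a small Python record. It is illustrative only: the class and field names are hypothetical and do not correspond to a platform API. The region list is copied from this page, and the GPU type is left unchecked because the list of options above is open-ended.

```python
from dataclasses import dataclass

# Regions as listed in this guide.
AVAILABLE_REGIONS = {
    "Singapore", "Japan", "USA-East", "Indonesia",
    "Frankfurt", "Hong Kong", "Malaysia",
}


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical container for the resource settings in Step 2."""
    region: str
    gpu_type: str           # e.g. "NVIDIA A10"; not validated here
    accelerator_count: int  # accelerators per replica
    replicas: int

    def __post_init__(self) -> None:
        if self.region not in AVAILABLE_REGIONS:
            raise ValueError(f"unknown region: {self.region!r}")
        if self.accelerator_count < 1:
            raise ValueError("accelerator_count must be at least 1")
        if self.replicas < 1:
            raise ValueError("replicas must be at least 1")

    @property
    def total_accelerators(self) -> int:
        # Accelerator count is per replica, so the total scales with replicas.
        return self.accelerator_count * self.replicas
```

For example, `DeploymentConfig("Singapore", "NVIDIA A10", 2, 3).total_accelerators` evaluates to 6, because each of the 3 replicas uses 2 accelerators.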

Step 3: Review and Deploy
Review your configuration and cost summary before creating the deployment.
Cost Summary
- GPU Compute Cost
- System Overhead (currently $0)
Model download and storage costs are not included in the Cost Summary.
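For a rough estimate ahead of the review step, the two line items can be combined as in the sketch below. The billing formula (hourly rate per accelerator, multiplied by accelerators per replica, replicas, and hours) is an assumption made for illustration, and the rate in the example is a made-up number, not a published price. Model download and storage costs are left out, matching the Cost Summary.

```python
from decimal import Decimal


def estimate_cost(
    rate_per_accelerator_hour: Decimal,  # hypothetical price input
    accelerator_count: int,
    replicas: int,
    hours: int,
    system_overhead: Decimal = Decimal("0"),  # currently $0 per this guide
) -> Decimal:
    """Estimate GPU compute cost plus system overhead (assumed formula)."""
    gpu_compute = rate_per_accelerator_hour * accelerator_count * replicas * hours
    return gpu_compute + system_overhead


# Example with a made-up rate of 1.50 per accelerator-hour:
# 2 accelerators per replica, 3 replicas, 24 hours.
example_total = estimate_cost(Decimal("1.50"), 2, 3, 24)
```

Using `Decimal` rather than `float` avoids rounding surprises when adding up currency amounts.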
Troubleshooting
1. Error 403: Sales of this resource are temporarily suspended
403: Sales of this resource are temporarily suspended.
- Reason: The selected GPU type is temporarily unavailable in the current region due to high demand and insufficient resources.
- Solution:
  - Try deploying the model in a different region where resources may be available.
  - If the issue persists, contact our support team for assistance.
2. Error: Account has an outstanding balance
Account has an outstanding balance.
- Reason: Your account balance is insufficient to cover the cost of creating or running the deployment.
- Solution: Navigate to the Billing section of your account dashboard and add funds to your balance.
3. Error: Your account information is incomplete
Your account information is incomplete.
- Reason: Your account has not completed the required identity or information verification process.
- Solution: Navigate to the Alibaba Cloud Account Settings page and complete the required identity verification.
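If you create deployments from a script, the advice for the 403 error in item 1 (try a different region) can be expressed as a simple fallback loop. This is a sketch under stated assumptions: `ResourceSuspendedError` and the `create` callable are hypothetical stand-ins, since this page does not describe a programmatic API. Substitute whatever client call and error type you actually use.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


class ResourceSuspendedError(Exception):
    """Hypothetical stand-in for the 403 'sales temporarily suspended' error."""


def deploy_with_region_fallback(
    create: Callable[[str], T],  # hypothetical: creates a deployment in a region
    regions: Iterable[str],
) -> Tuple[str, T]:
    """Try each region in order; return the first region that succeeds."""
    tried = []
    for region in regions:
        try:
            return region, create(region)
        except ResourceSuspendedError:
            # The GPU type is unavailable here; move on to the next region.
            tried.append(region)
    raise RuntimeError(
        f"GPU type unavailable in all tried regions {tried}; contact support"
    )
```

Order the `regions` argument by proximity to your users so that the fallback still favors low latency.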