
How to Choose Models

Selecting the right model for your use case is crucial for getting good results at reasonable cost. This guide compares open-source and API-based options and recommends models by category to help you find the best fit for your project.

Open Source vs. API-Based Models

Compare open-source and closed-source models to determine which type best fits your use case.

Open Source Models

Deploy and customize models with full transparency and control.

  • Full access to model weights and code for customization and optimization
  • Permissive licensing on many models (check each model's license for specific restrictions)
  • Community-driven development and support
  • Full transparency for compliance and security audits

API-Based Models

Leverage optimized, production-ready models with professional support.

  • Superior performance through proprietary optimization
  • Managed infrastructure and automatic updates
  • Dedicated technical support and SLAs
  • Faster iteration and feature releases

Model Selection Tips

  • Select Open Source Models if you need fine-tuning capabilities or want full control over deployment
  • Select API-Based Models if you prioritize stable performance and strong general-purpose capabilities

Understand Model Types

Learn about different model categories and variants to choose the right model for your needs.

Model Types

| Model Type | Description | Representative Models |
| --- | --- | --- |
| Large Language Models (LLM) | Handle natural language tasks such as text generation, comprehension, summarization, and reasoning. | Qwen, GLM, DeepSeek, etc. |
| Vision-Language Models (VLM) | Integrate image and text modalities to support image captioning, visual question answering, and multimodal understanding. | Qwen3-VL, Qwen2.5-Omni, MedGemma, etc. |
| Image Models | Focus on image generation, editing, and recognition through computer vision techniques. | Qwen-Image, Wan, Flux, etc. |
| Video Models | Extend image modeling into temporal sequences, enabling video understanding and captioning. | Wan, CogVideo, HunyuanVideo, etc. |

Model Variants

| Model Variant | Description | Representative Examples |
| --- | --- | --- |
| Base Models | General-purpose models trained on broad data without specialized fine-tuning. Serve as foundational backbones. | Qwen2.5-7B, gpt-oss-20b, etc. |
| Instruct Models | Fine-tuned with human instructions to better follow prompts and improve dialogue quality. | Qwen3-30B-A3B-Instruct-2507-FAST, Hunyuan-A13B-Instruct, etc. |
| Thinking Models | Designed for advanced reasoning, multi-step problem-solving, and complex cognitive tasks. | Qwen3-Next-80B-A3B-Thinking, DeepSeek-R1, etc. |
| Mixture of Experts (MoE) | Utilize dynamic routing across multiple expert subnetworks to balance model capacity and computational efficiency. | Qwen3.5, Qwen3-30B-A3B, etc. |

LLM

Large Language Models excel at natural language understanding, reasoning, and generation across diverse tasks and domains.

| Use Case | Description | Recommended Models |
| --- | --- | --- |
| Code & Development | Generate code, debug, explain logic, assist programming | Open Source: Qwen3-Coder, GLM-4.7, DeepSeek-Coder, etc.<br>API-Based: Qwen3-MAX, Qwen3.5-Plus, etc. |
| Agent | Build autonomous agents for multi-step tasks and tool use | Open Source: Qwen3-235B, DeepSeek-V3.2, GPT-OSS-120B, etc.<br>API-Based: Qwen3-MAX, Qwen3.5-Plus, etc. |
| Reasoning | Perform multi-step logic, math reasoning, structured thinking | Open Source: Qwen3-235B, DeepSeek-R1-Distill-Llama-70B, GPT-OSS-120B, etc.<br>API-Based: Qwen3-MAX, Qwen3.5-Plus, etc. |
| Conversation | Enable multi-turn dialogue for chatbots and assistants | Open Source: Qwen3-32B-FAST, GPT-OSS-20B, GLM-4.7, etc.<br>API-Based: Qwen3.5-Flash, Qwen-Plus, etc. |
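Whether you self-host an open-source LLM or call an API-based one, inference is commonly exposed through an OpenAI-compatible chat-completions endpoint. The sketch below builds such a request payload; the endpoint URL, API key, and default parameters are placeholders, not confirmed values for any specific provider.

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible /chat/completions payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
    }

payload = build_chat_request("Qwen3-32B-FAST", "Explain MoE routing in two sentences.")

# Sending it (placeholder URL and key; substitute your provider's values):
# req = urllib.request.Request(
#     "https://api.example.com/v1/chat/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload shape works for most of the conversation and reasoning models above; only the `model` string and base URL change between deployments.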

VLM

VLMs process images and text together for multimodal understanding tasks.

| Use Case | Description | Recommended Models |
| --- | --- | --- |
| Visual Question Answering (VQA) | Answer complex questions based on images or videos, combining visual perception and language understanding. | Open Source: Qwen3-VL-32B-Thinking-FP8, MedGemma-27b-it, Qwen2.5-VL-72B-Instruct, etc.<br>API-Based: Qwen3.5-Flash, Qwen3-VL-Plus, etc. |
| Image Captioning | Automatically generate descriptive captions for images to improve accessibility and content indexing. | Open Source: Qwen3-32B-FAST, Qwen3-VL-32B-Thinking-FP8, MedGemma-27b-it, etc.<br>API-Based: Qwen3.5-Flash, Qwen3-VL-Plus, etc. |
| Real-time Video Conversation | Engage in spoken conversations about live video streams, enabling interactive, real-time analysis. | Open Source: Qwen2.5-Omni, Qwen3-VL, etc.<br>API-Based: Qwen3.5-Flash, Qwen3-VL-Plus, etc. |
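For VQA-style requests, providers that follow the OpenAI-style multimodal message format accept mixed image and text content parts in a single user turn. A minimal sketch, assuming that format (the model name and image URL are placeholders):

```python
def build_vqa_request(model: str, image_url: str, question: str) -> dict:
    """Build a VQA payload in the OpenAI-style multimodal message format."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image part first, then the question about it.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }

payload = build_vqa_request(
    "Qwen2.5-VL-72B-Instruct",
    "https://example.com/chart.png",  # placeholder image
    "What trend does this chart show?",
)
```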

Image Model

Generate and edit images for business and creative applications.

| Use Case | Description | Recommended Models |
| --- | --- | --- |
| Text to Image (T2I) | Generate detailed images directly from text prompts, ideal for creative design and content creation. | Open Source: Qwen-Image-2512, FLUX.1-dev, etc.<br>API-Based: wan2.6-t2i, qwen-image-plus, etc. |
| Image to Image (I2I) | Modify existing images guided by textual input for style transfer or enhancement. | Open Source: Qwen-Image-Edit-2511, FLUX.1-Kontext-dev, etc.<br>API-Based: wan2.6-image, wan2.5-i2i-preview, etc. |
| E-commerce Product Images | Generate marketing and product display images for e-commerce platforms. | Open Source: Z-Image-Turbo, Qwen-Image-Edit, etc.<br>API-Based: wan2.6-image, wan2.5-i2i-preview, etc. |
| Social Media Content Creation | Create visual content for social media platforms and marketing campaigns. | Open Source: Qwen-Image, FLUX.1-Krea-dev, etc.<br>API-Based: qwen-image-plus, wan2.6-t2i, etc. |
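Image-generation endpoints commonly take a prompt plus output parameters such as image size and count. The sketch below uses the OpenAI-style images payload shape as an assumed convention; check your provider's docs for the exact field names it expects.

```python
def build_image_request(model: str, prompt: str,
                        size: str = "1024x1024", n: int = 1) -> dict:
    """Build an OpenAI-style image-generation payload (assumed field names)."""
    return {"model": model, "prompt": prompt, "size": size, "n": n}

payload = build_image_request(
    "Qwen-Image-2512",
    "A minimalist product shot of a ceramic mug on a linen background, soft light",
)
```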

Video Model

Create videos from text descriptions or static images.

| Use Case | Description | Recommended Models |
| --- | --- | --- |
| Text to Video (T2V) | Create videos based on textual descriptions. | Open Source: Wan2.2-T2V-A14B-Diffusers, LTX-Video-0.9.7-dev, etc.<br>API-Based: wan2.6-t2v, wan2.5-t2v-preview, etc. |
| Image to Video (I2V) | Generate videos from static images with smooth movement synthesis. | Open Source: Wan2.2-I2V-A140B-Diffusers, Wan2.1-I2V-14B-720P-Diffusers, etc.<br>API-Based: wan2.6-i2v, wan2.2-i2v-plus, etc. |
| E-commerce Product Ads | Create product advertisement videos from text or images. | Open Source: Wan2.1-I2V-14B-720P-Diffusers, Wan2.1-T2V-14B-Diffusers, etc.<br>API-Based: wan2.2-i2v-plus, wan2.1-i2v-turbo, etc. |
| Social Media Snippets | Create short video content for social media platforms and marketing. | Open Source: Wan2.2-T2V-1.3B-Diffusers, Wan2.1-T2V-1.3B-Diffusers, etc.<br>API-Based: wan2.6-i2v-flash, wan2.2-kf2v-flash, etc. |
| Character Consistency | Generate animated avatars and digital humans for videos with consistent character representation. | Open Source: Wan2.2-TI2V-5B-Diffusers, Wan2.1-I2V-14B-720P-Diffusers, etc.<br>API-Based: wan2.6-r2v, wan2.6-r2v-flash, etc. |
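Video generation is usually asynchronous: you submit a job, then poll until it finishes and download the result. A generic polling helper is sketched below; the status values ("succeeded", "failed") and response fields are assumptions for illustration, and the final lines simulate a provider rather than calling a real one.

```python
import time

def poll_until_done(fetch_status, interval_s: float = 2.0,
                    timeout_s: float = 600.0) -> dict:
    """Poll a job-status callable until the job finishes or times out.

    fetch_status returns a dict with a "status" field; the exact field names
    and status values are assumptions here, so check your provider's job API.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = fetch_status()
        if job["status"] == "succeeded":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"video generation failed: {job}")
        time.sleep(interval_s)
    raise TimeoutError("video generation did not finish before the timeout")

# Simulated provider for illustration: reports "running" twice, then done.
_states = iter(["running", "running", "succeeded"])
result = poll_until_done(
    lambda: {"status": next(_states), "url": "https://example.com/out.mp4"},
    interval_s=0.01,
)
```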

Explore More Models

For more models, see the Model Gallery.

Next Steps

Deploy Your Model

Create a deployment endpoint for your model to start serving inference requests in production.

Fine-tune Your Model

Use your model as a base for fine-tuning with your custom datasets to improve performance on specific tasks.
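Platforms differ in the dataset format they accept, but a chat-style JSONL file (one JSON object per line) is a common convention for instruction fine-tuning. The record layout below follows the widely used messages format and is an assumption; verify the expected fields against your platform's fine-tuning docs before uploading.

```python
import json

# Example training records in the common chat-style layout (assumed fields).
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a support agent for Acme."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Open Settings > Security and choose 'Reset password'."},
        ]
    },
]

# Serialize one JSON object per line (JSONL), the usual upload format.
jsonl = "\n".join(json.dumps(ex, ensure_ascii=False) for ex in examples)

# Sanity-check: every line parses back and ends with an assistant turn.
for line in jsonl.splitlines():
    record = json.loads(line)
    assert record["messages"][-1]["role"] == "assistant"
```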