How to Choose Models
Selecting the right model for your use case is crucial for optimal performance. This guide recommends models by category, covering both open-source and API-based options, to help you find the best fit for your project.
Open Source vs. API-Based Models
Compare open-source and closed-source models to determine which type best fits your use case.
Open Source Models: Deploy and customize models with full transparency and control.
- Complete code access for customization and optimization
- No usage restrictions
- Community-driven development and support
- Full transparency for compliance and security audits
API-Based Models: Leverage optimized, production-ready models with professional support.
- Superior performance through proprietary optimization
- Managed infrastructure and automatic updates
- Dedicated technical support and SLAs
- Faster iteration and feature releases
- Select Open Source Models if you need fine-tuning capabilities or want full control over deployment
- Select API-Based Models if you prioritize stable performance and strong general-purpose capabilities
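In practice, both options are usually reached through an OpenAI-compatible chat API, so switching between an open-source deployment and an API-based model is largely a matter of changing the base URL and model name. Below is a minimal sketch; the endpoint URLs, API key, and model name are placeholders rather than guaranteed identifiers.

```python
from openai import OpenAI

# Self-hosted open-source model (e.g. served by vLLM or a deployment endpoint);
# the URL, key, and model name below are placeholders.
oss_client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Managed API-based model; base URL and credentials depend on your provider.
api_client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

def ask(client: OpenAI, model: str, prompt: str) -> str:
    """Send a single-turn chat request and return the reply text."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# The same helper works against either client; only the model identifier changes.
print(ask(oss_client, "Qwen3-32B-FAST", "Summarize the trade-offs of MoE models."))
```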
Understand Model Types
Learn about different model categories and variants to choose the right model for your needs.
Model Types
| Model Type | Description | Representative Models |
|---|---|---|
| Large Language Models (LLM) | Handle natural language tasks such as text generation, comprehension, summarization, and reasoning. | Qwen, GLM, DeepSeek, etc. |
| Vision-Language Models (VLM) | Integrate image and text modalities to support image captioning, visual question answering, and multimodal understanding. | Qwen3-VL, Qwen2.5-Omni, MedGemma, etc. |
| Image Models | Focus on image generation, editing, and recognition through computer vision techniques. | Qwen-Image, Wan, Flux, etc. |
| Video Models | Extend image modeling into temporal sequences, enabling video understanding and captioning. | Wan, CogVideo, HunyuanVideo, etc. |
Model Variants
| Model Variant | Description | Representative Examples |
|---|---|---|
| Base Models | General-purpose models trained on broad data without specialized fine-tuning. Serve as foundational backbones. | Qwen2.5-7B, gpt-oss-20b, etc. |
| Instruct Models | Fine-tuned with human instructions to better follow prompts and improve dialogue quality. | Qwen3-30B-A3B-Instruct-2507-FAST, Hunyuan-A13B-Instruct, etc. |
| Thinking Models | Designed for advanced reasoning, multi-step problem-solving, and complex cognitive tasks. | Qwen3-Next-80B-A3B-Thinking, DeepSeek-R1, etc. |
| Mixture of Experts (MoE) | Utilize dynamic routing across multiple expert subnetworks to balance model capacity and computational efficiency. | Qwen3.5, Qwen3-30B-A3B, etc. |
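The variant you choose also changes how you prompt the model: base models are driven by plain text completion, while instruct and thinking models expect their chat template to be applied first. A rough sketch with Hugging Face transformers, assuming an instruct checkpoint (the model name is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Instruct variant: wrap the conversation in the model's chat template first.
model_id = "Qwen/Qwen2.5-7B-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Explain mixture-of-experts routing in two sentences."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

# A base variant (e.g. Qwen2.5-7B) would instead be given raw text and simply continue it.
```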
LLM
Large Language Models excel at natural language understanding, reasoning, and generation across diverse tasks and domains.
| Use Case | Description | Recommended Models |
|---|---|---|
| Code & Development | Generate code, debug, explain logic, assist programming | Open Source: Qwen3-Coder, GLM-4.7, DeepSeek-Coder, etc.<br>API-Based: Qwen3-MAX, Qwen3.5-Plus, etc. |
| Agent | Build autonomous agents for multi-step tasks, tool use | Open Source: Qwen3-235B, DeepSeek-V3.2, GPT-OSS-120B, etc.<br>API-Based: Qwen3-MAX, Qwen3.5-Plus, etc. |
| Reasoning | Perform multi-step logic, math reasoning, structured thinking | Open Source: Qwen3-235B, DeepSeek-R1-Distill-Llama-70B, GPT-OSS-120B, etc.<br>API-Based: Qwen3-MAX, Qwen3.5-Plus, etc. |
| Conversation | Enable multi-turn dialogue for chatbots and assistants | Open Source: Qwen3-32B-FAST, GPT-OSS-20B, GLM-4.7, etc.<br>API-Based: Qwen3.5-Flash, Qwen-Plus, etc. |
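For the Agent use case in particular, these models are commonly driven through the OpenAI-compatible tool-calling interface. The sketch below assumes a placeholder endpoint and a hypothetical `get_weather` tool; the model name is illustrative.

```python
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")  # placeholder endpoint

# Hypothetical tool definition used only for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen3-235B",  # illustrative model name
    messages=[{"role": "user", "content": "Do I need an umbrella in Beijing today?"}],
    tools=tools,
)

# If the model chose to call the tool, its arguments arrive as a JSON string.
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```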
VLM
VLMs process images and text together for multimodal understanding tasks.
| Use Case | Description | Recommended Models |
|---|---|---|
| Visual Question Answering (VQA) | Answer complex questions based on images or videos, combining visual perception and language understanding. | Open Source: Qwen3-VL-32B-Thinking-FP8, MedGemma-27b-it, Qwen2.5-VL-72B-Instruct, etc.<br>API-Based: Qwen3.5-Flash, Qwen3-VL-Plus, etc. |
| Image Captioning | Automatically generate descriptive captions for images to improve accessibility and content indexing. | Open Source: Qwen3-32B-FAST, Qwen3-VL-32B-Thinking-FP8, MedGemma-27b-it, etc.<br>API-Based: Qwen3.5-Flash, Qwen3-VL-Plus, etc. |
| Real-time Video Conversation | Engage in spoken conversations about live video streams, enabling interactive, real-time analysis. | Open Source: Qwen2.5-Omni, Qwen3-VL, etc.<br>API-Based: Qwen3.5-Flash, Qwen3-VL-Plus, etc. |
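A VQA request typically combines an image part and a text part in a single user message. A minimal sketch in the OpenAI-compatible multimodal format follows; the endpoint, image URL, and model name are placeholders.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")  # placeholder endpoint

resp = client.chat.completions.create(
    model="Qwen2.5-VL-72B-Instruct",  # illustrative model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            {"type": "text", "text": "What trend does this chart show?"},
        ],
    }],
)
print(resp.choices[0].message.content)
```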
Image Model
Generate and edit images for business and creative applications.
| Use Case | Description | Recommended Models |
|---|---|---|
| Text to Image (T2I) | Generate detailed images directly from text prompts, ideal for creative design and content creation. | Open Source: Qwen-Image-2512, FLUX.1-dev, etc.<br>API-Based: wan2.6-t2i, qwen-image-plus, etc. |
| Image to Image (I2I) | Modify existing images guided by textual input for style transfer or enhancement. | Open Source: Qwen-Image-Edit-2511, FLUX.1-Kontext-dev, etc.<br>API-Based: wan2.6-image, wan2.5-i2i-preview, etc. |
| E-commerce Product Images | Generate marketing and product display images for e-commerce platforms. | Open Source: Z-Image-Turbo, Qwen-Image-Edit, etc.<br>API-Based: wan2.6-image, wan2.5-i2i-preview, etc. |
| Social Media Content Creation | Create visual content for social media platforms and marketing campaigns. | Open Source: Qwen-Image, FLUX.1-Krea-dev, etc.<br>API-Based: qwen-image-plus, wan2.6-t2i, etc. |
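If you run an open-source image model locally, the diffusers library is a common entry point. A rough text-to-image sketch with FLUX.1-dev follows; it assumes a GPU with enough memory, and the sampling parameters are illustrative defaults rather than tuned values.

```python
import torch
from diffusers import FluxPipeline

# Load the open-weight FLUX.1-dev checkpoint; offload to CPU if GPU memory is tight.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="A minimalist product photo of a ceramic mug on a wooden table, soft daylight",
    height=1024,
    width=1024,
    num_inference_steps=28,  # illustrative value
    guidance_scale=3.5,      # illustrative value
).images[0]

image.save("mug.png")
```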
Video Model
Create videos from text descriptions or static images.
| Use Case | Description | Recommended Models |
|---|---|---|
| Text to Video (T2V) | Create videos based on textual descriptions. | Open Source: Wan2.2-T2V-A14B-Diffusers, LTX-Video-0.9.7-dev, etc.<br>API-Based: wan2.6-t2v, wan2.5-t2v-preview, etc. |
| Image to Video (I2V) | Generate videos from static images with smooth movement synthesis. | Open Source: Wan2.2-I2V-A14B-Diffusers, Wan2.1-I2V-14B-720P-Diffusers, etc.<br>API-Based: wan2.6-i2v, wan2.2-i2v-plus, etc. |
| E-commerce Product Ads | Create product advertisement videos from text or images. | Open Source: Wan2.1-I2V-14B-720P-Diffusers, Wan2.1-T2V-14B-Diffusers, etc.<br>API-Based: wan2.2-i2v-plus, wan2.1-i2v-turbo, etc. |
| Social Media Snippets | Create short video content for social media platforms and marketing. | Open Source: Wan2.2-T2V-1.3B-Diffusers, Wan2.1-T2V-1.3B-Diffusers, etc.<br>API-Based: wan2.6-i2v-flash, wan2.2-kf2v-flash, etc. |
| Character Consistency | Generate animated avatars and digital humans for videos with consistent character representation. | Open Source: Wan2.2-TI2V-5B-Diffusers, Wan2.1-I2V-14B-720P-Diffusers, etc.<br>API-Based: wan2.6-r2v, wan2.6-r2v-flash, etc. |
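For local generation with the `*-Diffusers` checkpoints above, diffusers can usually resolve the matching pipeline class automatically. A hedged sketch, assuming the smaller Wan2.1-T2V-1.3B checkpoint (the repository path and sampling parameters are assumptions, not verified values):

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# DiffusionPipeline selects the pipeline class from the repository config.
pipe = DiffusionPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers",  # assumed repo path for the checkpoint listed above
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

frames = pipe(
    prompt="A short clip of waves rolling onto a beach at sunset",
    num_frames=49,           # illustrative value
    num_inference_steps=30,  # illustrative value
).frames[0]

export_to_video(frames, "waves.mp4", fps=16)
```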
For more models, refer to the Model Gallery.
Next Steps
- Create a deployment endpoint for your model to start serving inference requests in production.
- Use your model as a base for fine-tuning with your custom datasets to improve performance on specific tasks.
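As a rough illustration of the fine-tuning step, a LoRA adapter can be attached to one of the open-source base models with the peft library. The dataset path, base model, and hyperparameters below are placeholders, not recommendations.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "Qwen/Qwen2.5-7B"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Attach a small LoRA adapter instead of updating all weights.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"]))

# Placeholder dataset: any plain-text corpus tokenized to fixed-length samples works here.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
dataset = dataset.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                      remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="qwen-lora", per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("qwen-lora-adapter")  # saves adapter weights only
```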