deepseek-v4-flash

openrouter

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads while maintaining strong reasoning and coding performance, and it uses hybrid attention for efficient long-context processing. The reasoning effort levels high and xhigh are supported; xhigh maps to maximum reasoning. The model is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency matter.
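As a rough illustration of the description above, here is a minimal sketch of building a chat completions request for this model with the reasoning effort and streaming options it supports. It assumes an OpenAI-compatible request body of the kind OpenRouter accepts; the exact endpoint URL, field names for reasoning effort, and accepted effort values should be checked against the provider's current API documentation.

```python
import json

# Assumed OpenRouter-style endpoint (verify against the provider's docs).
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a chat completions request body for deepseek-v4-flash.

    The model slug, the supported effort levels (high / xhigh, where
    xhigh maps to maximum reasoning), and streaming support are taken
    from this page; the "reasoning" field shape is an assumption.
    """
    if effort not in ("high", "xhigh"):
        raise ValueError("this model supports reasoning efforts high and xhigh")
    return {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"effort": effort},
        "stream": True,  # the Features section lists streaming support
    }

body = build_request("Explain mixture-of-experts routing.", effort="xhigh")
print(json.dumps(body, indent=2))
```

Sending the request is then a matter of POSTing this body with an API key in the `Authorization` header.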

Model Type

chat

Cost

20 tokens / 1 megapixel

Features

Stream

Model Settings

Current Version

deepseek-v4-flash

Join the Discussion

Have questions or want to share your experience with deepseek-v4-flash? Join the conversation in our forums.

Visit Forums