DeepSeek Reasoner: Thinking Mode for Math, Code & Logic
DeepSeek Reasoner is DeepSeek’s dedicated “thinking mode” model, built to solve hard math, code, and logic problems with step-by-step chain-of-thought reasoning. Exposed via the deepseek-reasoner endpoint, it first generates an internal reasoning trace and then produces a clear final answer, giving developers higher accuracy on complex tasks and the option to inspect or log the model’s thought process. It’s ideal for AI tutors, research assistants, advanced agents, and any application where how the answer was reached matters as much as the answer itself.
DeepSeek Reasoner: How the deepseek-reasoner Endpoint Thinks Before It Answers
If you’re building agents, tutors, or analysis tools, you’ve probably seen two DeepSeek endpoints:
- deepseek-chat – fast, “normal” chat mode
- deepseek-reasoner – slower, reasoning mode
This article explains what DeepSeek Reasoner actually is, how it works under the hood, and when you should (and shouldn’t) use it.
1. What is DeepSeek Reasoner?
DeepSeek Reasoner is the API endpoint deepseek-reasoner – a reasoning model that generates a Chain of Thought (CoT) before it produces the final answer. You can even access that CoT in the API response via a dedicated field.
Key points:
- Endpoint name: deepseek-reasoner
- Backing model (2025-09+): DeepSeek-V3.2-Exp in “thinking mode”
- Behavior: first think, then answer (internal CoT → final reply)
- Context: up to 128K tokens, like deepseek-chat
In short: DeepSeek Reasoner = V3.2-Exp with reasoning turned on, exposing its internal thought process when you want it.
2. How DeepSeek Reasoner Works (API View)
When you call deepseek-reasoner, the model returns two separate text fields:
- reasoning_content – the Chain of Thought (how the model reasons step by step).
- content – the final answer you actually show to users.
Example structure (simplified):
```json
{
  "choices": [
    {
      "message": {
        "reasoning_content": "Let me compare 9.11 and 9.8: 9.80 > 9.11 ...",
        "content": "9.8 is greater than 9.11."
      }
    }
  ]
}
```
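As a minimal sketch, here is how the two fields can be pulled apart once the response is parsed. The dict below is a hand-made stand-in mirroring the simplified shape shown in this section, with placeholder strings; a real response carries additional fields.

```python
# Separating the two text fields of a (simplified) deepseek-reasoner reply.
# The dict is a stand-in for a parsed API response, not real output.
response = {
    "choices": [
        {
            "message": {
                "reasoning_content": "Step 1: restate the problem. Step 2: ...",
                "content": "Final answer for the user.",
            }
        }
    ]
}

message = response["choices"][0]["message"]
reasoning = message["reasoning_content"]  # internal chain of thought
answer = message["content"]               # user-facing reply

print(answer)  # -> Final answer for the user.
```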
API-specific behavior:
- max_tokens controls the total output, including CoT + final answer (default 32K, max 64K).
- Supported features:
  - JSON Output
  - Chat Completion
  - Chat Prefix Completion (Beta)
- Not supported:
  - Function Calling
  - FIM Completion
- Parameters like temperature, top_p, presence_penalty, and frequency_penalty are accepted for compatibility but have no effect.

This design forces the model to behave deterministically and focus on reasoning quality rather than sampling variety.
3. Model & Pricing Compared to deepseek-chat
As of the V3.2-Exp release:
- Both deepseek-chat and deepseek-reasoner run on DeepSeek-V3.2-Exp.
- deepseek-chat = non-thinking mode; deepseek-reasoner = thinking mode.
From the official pricing page:
- Model version: DeepSeek-V3.2-Exp
- Context length: 128K tokens for both
- Output limits:
  - deepseek-chat: default 4K, max 8K
  - deepseek-reasoner: default 32K, max 64K (to allow long CoT)
- Pricing (per 1M tokens, V3.2-Exp):
  - Input (cache hit): $0.028
  - Input (cache miss): $0.28
  - Output: $0.42
Because Reasoner outputs more tokens (CoT + answer), you’ll usually spend more on output tokens than with deepseek-chat, but you get much stronger reliability on hard tasks.
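To make that trade-off concrete, here is a back-of-the-envelope cost estimator using the per-million-token prices quoted above. The token counts in the example are illustrative assumptions, not measurements.

```python
# Rough USD cost of one deepseek-reasoner call at V3.2-Exp prices
# (per 1M tokens, from the pricing list above).
PRICE_INPUT_HIT = 0.028
PRICE_INPUT_MISS = 0.28
PRICE_OUTPUT = 0.42

def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cache_hit: bool = False) -> float:
    """output_tokens includes both the CoT and the final answer."""
    input_price = PRICE_INPUT_HIT if cache_hit else PRICE_INPUT_MISS
    return (input_tokens * input_price + output_tokens * PRICE_OUTPUT) / 1_000_000

# Example: a 4K-token prompt producing a 20K-token reasoning trace + answer.
print(round(estimate_cost_usd(4_000, 20_000), 4))  # -> 0.0095
```

Note how the output side dominates: even a cache miss on the prompt costs far less than a long reasoning trace.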
4. Multi-Round Conversation: How to Use It Safely
One important detail from the docs: in multi-turn chat you must not feed reasoning_content back into the next request.
The official guide shows this pattern:
- Call deepseek-reasoner with your messages.
- Save:
  - reasoning_content → for logging, analysis, distillation, or internal UI.
  - content → the assistant message you add back into messages.
- Next turn:
  - Append only { "role": "assistant", "content": content } from the previous reply.
  - Do not include reasoning_content in the messages array; otherwise you get a 400 error.
So your loop looks like:
```python
messages = [{"role": "user", "content": "Question 1"}]

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
)

reasoning = response.choices[0].message.reasoning_content
answer = response.choices[0].message.content

# Store reasoning somewhere internal; only feed back the answer:
messages.append({"role": "assistant", "content": answer})
messages.append({"role": "user", "content": "Follow-up question"})
The same logic applies in streaming mode, where you concatenate delta.reasoning_content and delta.content separately.
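The streaming accumulation can be sketched as follows. The chunk objects below are hand-made stand-ins for real stream deltas, built with SimpleNamespace purely so the loop logic is runnable; only the loop itself reflects the documented behavior.

```python
# Sketch: reasoning and answer arrive in separate delta fields and
# must be concatenated into separate buffers.
from types import SimpleNamespace

def fake_chunk(reasoning=None, content=None):
    """Stand-in for one streamed chunk (assumption, not a real SDK object)."""
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

stream = [
    fake_chunk(reasoning="First, restate "),
    fake_chunk(reasoning="the problem..."),
    fake_chunk(content="Here is "),
    fake_chunk(content="the answer."),
]

reasoning_parts, content_parts = [], []
for chunk in stream:
    delta = chunk.choices[0].delta
    if getattr(delta, "reasoning_content", None):
        reasoning_parts.append(delta.reasoning_content)
    if getattr(delta, "content", None):
        content_parts.append(delta.content)

reasoning = "".join(reasoning_parts)
answer = "".join(content_parts)
print(answer)  # -> Here is the answer.
```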
5. When Should You Use DeepSeek Reasoner?
Use deepseek-reasoner when extra thinking time genuinely improves outcomes:
5.1 Math, logic & exams
- Contest math (AIME, Olympiad-style problems)
- Logical puzzles, proofs, step-by-step derivations
- “Show your work” situations where CoT is valuable
DeepSeek’s earlier R1 models already matched or approached OpenAI’s o1 on these benchmarks; Reasoner brings that style of chain-of-thought reasoning into V3.2-Exp with long context and lower cost.
5.2 Complex coding & debugging
- Multi-file / architecture-level reasoning
- Explaining and fixing tricky bugs
- Migration plans and refactors that require several steps
You can combine deepseek-chat for quick edits and deepseek-reasoner for complicated “why is this failing?” questions.
5.3 Agents, research & high-stakes decisions
- Research agents that have to read multiple documents, compare evidence, and justify conclusions
- AI ops agents planning multi-step workflows or runbooks
- Internal decision-support tools where you want a human to review the reasoning before acting
Because Reasoner exposes reasoning_content, you can review or log the CoT as part of your evaluation and safety pipeline instead of treating the model as a black box.
6. When to Stick with deepseek-chat Instead
deepseek-reasoner is not always the right choice:
Use deepseek-chat (non-thinking mode) when:
- You need fast, low-latency replies (customer chat, UI helpers).
- Tasks are simple: small code snippets, short Q&A, boilerplate text.
- You’re cost-sensitive and don’t need long CoT.
Because Reasoner’s larger max_tokens budget and long CoT make responses longer, your bills and latency both go up. For most routine tasks, deepseek-chat is enough.
7. Practical Implementation Tips
Here’s how to get the most from DeepSeek Reasoner in a real product.
7.1 Pattern: dual-model strategy
A common pattern in recent tutorials and blog posts:
- Route simple requests to deepseek-chat (cheap, fast).
- Detect hard tasks (math, long context, low confidence, or explicit “explain step-by-step” requests).
- For those, call deepseek-reasoner once, capture reasoning_content, and cache the result.
This keeps overall cost low while still giving you “reasoning on demand”.
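A routing heuristic for this dual-model strategy might look like the sketch below. The keyword list and length threshold are illustrative assumptions; real routers often also use a classifier or a confidence score from a cheap first pass.

```python
# Hypothetical router for the dual-model strategy: obviously hard
# prompts go to deepseek-reasoner, everything else to deepseek-chat.
HARD_HINTS = ("prove", "step-by-step", "debug", "why is this failing", "olympiad")

def choose_model(prompt: str, max_easy_chars: int = 2000) -> str:
    lowered = prompt.lower()
    if len(prompt) > max_easy_chars or any(h in lowered for h in HARD_HINTS):
        return "deepseek-reasoner"
    return "deepseek-chat"

print(choose_model("Summarise this paragraph."))          # -> deepseek-chat
print(choose_model("Prove that sqrt(2) is irrational."))  # -> deepseek-reasoner
```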
7.2 Hide or show CoT depending on user
- End users: show only content (the final answer).
- Internal tools / expert modes: expose reasoning_content for debugging and education.
- Evaluation harness: store both reasoning_content and content plus ground truth so you can analyze where reasoning goes wrong.
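One way to capture both outputs plus ground truth for such a harness is a small record type like the sketch below; the field names and exact-match scoring are assumptions, not a prescribed schema.

```python
# Hypothetical evaluation record: keep the CoT internal, the answer
# user-facing, and the ground truth alongside for scoring.
from dataclasses import dataclass

@dataclass
class EvalRecord:
    prompt: str
    reasoning_content: str  # internal CoT, never shown to end users
    content: str            # final answer shown to users
    ground_truth: str

    def is_correct(self) -> bool:
        # Naive exact-match scoring; real harnesses normalise more.
        return self.content.strip() == self.ground_truth.strip()

record = EvalRecord(
    prompt="What is 17 * 23?",
    reasoning_content="17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 ...",
    content="391",
    ground_truth="391",
)
print(record.is_correct())  # -> True
```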
7.3 Guardrails and safety
Even in reasoning mode, you still need app-level safety:
- Filter prompts & outputs for sensitive or disallowed content.
- Add domain-specific constraints (e.g., no medical diagnosis, financial-advice disclaimers).
- Log CoT for safety audits, but treat it as sensitive data, since it may contain user info and model-generated speculation.
8. DeepSeek Reasoner in the Bigger Ecosystem
Two broader trends make deepseek-reasoner important:
- Unified “thinking/non-thinking” architecture: DeepSeek’s changelog shows a progression (R1 → V3.1 → V3.2-Exp), each time merging reasoning capability into a more general architecture. Now the same base (V3.2-Exp) powers both chat and reasoning modes.
- Cheaper long-context reasoning: V3.2-Exp’s DeepSeek Sparse Attention (DSA) and architecture refinements cut API costs by 50%+ compared to earlier versions, while preserving benchmark performance. This makes long-context CoT and agents much more affordable.
For builders, that means you can realistically run chain-of-thought agents with 128K context without blowing the budget.
9. Summary: When to Reach for DeepSeek Reasoner
Use DeepSeek Reasoner (deepseek-reasoner) when:
- You need reliable solutions on hard problems, not just fluent answers.
- You value inspectable reasoning (reasoning_content) for review, distillation, or training.
- You’re building agents, tutors, or analysis tools where step-by-step logic matters.
Stick with deepseek-chat for everyday chat, support, and light coding.
DeepSeek Reasoner - Frequently Asked Questions
1. What is DeepSeek Reasoner and how is it different from deepseek-chat?
On multiple subs (r/LocalLLaMA, r/SillyTavernAI, r/JanitorAI_Official) people sum it up like this:
- deepseek-chat = non-thinking mode – fast, general chat model.
- deepseek-reasoner = thinking mode – the same base model, but it generates a full chain-of-thought before the final answer.
Reasoner is meant for hard problems (math, logic, complex code) where extra reasoning is worth the extra latency and token cost.
2. Is DeepSeek Reasoner the same thing as DeepSeek R1?
Reddit threads and DeepSeek’s own release notes link the Reasoner endpoint directly to the R1 reasoning models:
- When R1 launched, the docs explicitly said: use it by setting model=deepseek-reasoner.
- Later updates say both deepseek-chat and deepseek-reasoner are now upgraded to newer bases (V3.1-Terminus, then V3.2-Exp), with Reasoner always mapped to the reasoning / “thinking” mode.
So in practice: deepseek-reasoner = “the current DeepSeek reasoning model” (R1 originally, now the reasoning mode of V3.2-Exp / successors).
3. How do I actually use DeepSeek Reasoner instead of chat in my app / frontend?
Reddit answers this in a very “just change the string” way:
- Anywhere you used deepseek-chat, swap the model name to deepseek-reasoner.
- In some tools (JanitorAI, RooCode, aider, etc.) you must type the model ID exactly as the docs or the provider expect (sometimes just deepseek-reasoner, sometimes deepseek/deepseek-reasoner). Typos or wrong prefixes cause 400 errors.
If a host doesn’t list Reasoner explicitly in its model list, you usually can’t just invent the string – check their docs or UI.
4. What is reasoning_content and what do I do with it?
DeepSeek Reasoner gives you two outputs in each reply:
- reasoning_content – the hidden chain-of-thought
- content – the final user-visible answer
Reddit + integration docs repeat two rules:
- Don’t feed reasoning_content back into the next request. If you include it in the messages array, the API will return a 400 error.
- You can log or store it for:
  - internal debugging / evaluation
  - training data (distilling R1-style reasoning)
  - “expert mode” UIs that show how the model thought things through

Most apps only show content to end users and keep reasoning_content internal.
5. Why doesn’t DeepSeek Reasoner support temperature, top_p or function calling?
One of the most upvoted complaints on r/DeepSeek / r/LLMDevs:
- Reasoner ignores sampling controls – official docs say temperature, top_p, and presence/frequency penalties are accepted but have no effect on deepseek-reasoner.
- Earlier API threads also noted Reasoner didn’t support function calling or FIM, while deepseek-chat did.
The design goal is deterministic, stable reasoning: fewer knobs, more predictable outputs. If you need function calling or playful sampling, people suggest using deepseek-chat plus tools instead.
6. Why is DeepSeek Reasoner slower and more expensive than deepseek-chat?
Reddit + blog posts emphasise two things:
- Reasoner can generate up to ~32K reasoning tokens (and up to 64K total) before the final answer, whereas chat usually stops after a few thousand tokens.
- It therefore uses more output tokens and had higher per-token prices than chat (especially under the original R1 pricing: ~$0.55/M input, ~$2.19/M output, later converging as pricing was updated).
Users on r/LocalLLaMA report that the 670B/R1-style Reasoner can take many seconds or even minutes on very hard prompts – which is fine for research or coding, but not for snappy chatbots.
7. When should I use DeepSeek Reasoner vs DeepSeek Chat?
The typical Reddit advice:
- Use deepseek-chat when you need:
  - fast responses
  - general chat, summarisation, everyday coding
  - lower cost per request
- Use deepseek-reasoner when you need:
  - exam-style math, proofs, logic puzzles
  - deep code reasoning / debugging
  - agents or tutors that must “think out loud” step-by-step
Posts titled “What do you guys prefer between DeepSeek-chat and DeepSeek-reasoner?” usually conclude:
Chat for most things, Reasoner for the hard stuff.
8. How can I stop DeepSeek Reasoner from generating huge walls of reasoning?
People get annoyed when Reasoner dumps hundreds of tokens of CoT for simple questions. Common tips:
- Control max_tokens – since CoT + answer share the same budget, a smaller max_tokens caps runaway reasoning.
- Prompt for concise steps, e.g.:
  - “Show only the essential steps, then the final answer.”
  - “Limit your reasoning to 5 lines.”
- Use Reasoner only for prompts tagged as “hard”, and keep easy prompts on deepseek-chat.
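Combining the first and last tips, a request builder might cap the shared CoT + answer budget per difficulty tier. The helper and its budget numbers below are illustrative assumptions, not official guidance; only the max_tokens semantics come from the docs.

```python
# Hypothetical helper: pick a model and a max_tokens cap per request.
# max_tokens bounds CoT + final answer together on deepseek-reasoner.
def build_request_kwargs(messages: list[dict], hard: bool) -> dict:
    budget = 16_000 if hard else 2_000  # illustrative budgets
    return {
        "model": "deepseek-reasoner" if hard else "deepseek-chat",
        "messages": messages,
        "max_tokens": budget,
    }

kwargs = build_request_kwargs(
    [{"role": "user", "content": "What is 2 + 2?"}], hard=False
)
print(kwargs["model"], kwargs["max_tokens"])  # -> deepseek-chat 2000
```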
There is currently no official reasoning_effort parameter from DeepSeek to directly dial reasoning depth (some wrappers simulate one).
9. Why am I getting errors or timeouts with deepseek-reasoner?
Common Reddit issues:
- 400 / bad request – almost always due to:
  - a wrong model name (deepseek/deepseek-reasoner vs deepseek-reasoner)
  - mistakenly including reasoning_content in the next messages array, which the API explicitly forbids.
- Network / “can’t make request to reasoner (but can with chat)” – users report occasional outages or rate-limit issues affecting Reasoner more than chat.
Debug steps people recommend:
- Copy a minimal code example from the official docs and confirm that works.
- Double-check the model string and headers / base URL.
- Strip any reasoning_content fields from your outgoing messages.
- If it still fails but chat works, assume a temporary upstream issue.
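The stripping step can be done with a small defensive sanitizer run over the history before every request, as in this sketch:

```python
# Defensive sanitizer: drop reasoning_content keys from the outgoing
# messages array, since the API rejects requests that include them.
def strip_reasoning(messages: list[dict]) -> list[dict]:
    """Return a copy of messages with any reasoning_content removed."""
    return [
        {k: v for k, v in msg.items() if k != "reasoning_content"}
        for msg in messages
    ]

history = [
    {"role": "user", "content": "Question 1"},
    {"role": "assistant", "content": "Answer 1", "reasoning_content": "CoT..."},
]
clean = strip_reasoning(history)
print("reasoning_content" in clean[1])  # -> False
```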
10. Can I extract Reasoner’s chain-of-thought and feed it to another model?
Yes – and Reddit loves this trick.
A popular r/LocalLLaMA post shows exactly that:
- Call deepseek-reasoner, capture only the CoT steps.
- Feed that reasoning into another model (e.g. GPT-3.5 or Claude Sonnet) as context, then ask that model to produce the final answer.
People use this both for:
- “Reasoning booster” pipelines – cheap model + imported R1-style reasoning.
- Training-data generation – building datasets of step-by-step solutions for their own RL or distillation.
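The hand-off step of this trick can be sketched as a prompt builder that wraps the captured CoT as context for the second model. The template below is a hypothetical example, not a fixed format.

```python
# Sketch of the "reasoning transfer" trick: wrap borrowed CoT as
# context for a second (cheaper) model to answer from.
def build_transfer_prompt(question: str, borrowed_reasoning: str) -> str:
    return (
        "A reasoning model produced the following working notes:\n\n"
        f"{borrowed_reasoning}\n\n"
        f"Using those notes, answer the question concisely: {question}"
    )

prompt = build_transfer_prompt(
    "Is 391 prime?",
    "391 = 17 * 23, so it has nontrivial factors...",
)
print(prompt.splitlines()[0])  # -> A reasoning model produced the following working notes:
```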
11. Is DeepSeek Reasoner actually good for coding, or should I use other models?
Opinions are mixed on Reddit:
- Some users say Reasoner is great at algorithmic problems and explaining bugs, especially when you want detailed step-by-step reasoning.
- Others, especially on r/ClaudeAI, report that for large real-world codebases they still prefer Claude Sonnet 3.5 or other models – Reasoner sometimes fixes the snippet but doesn’t fully account for interactions across the whole project.