DeepSeek Reasoner: Thinking Mode for Math, Code & Logic
DeepSeek Reasoner is DeepSeek’s dedicated “thinking mode” model, built to solve hard math, code, and logic problems with step-by-step chain-of-thought reasoning. Exposed via the deepseek-reasoner endpoint, it first generates an internal reasoning trace and then produces a clear final answer, giving developers higher accuracy on complex tasks and the option to inspect or log the model’s thought process. It’s ideal for AI tutors, research assistants, advanced agents, and any application where how the answer was reached matters as much as the answer itself.
DeepSeek Reasoner: How the deepseek-reasoner Endpoint Thinks Before It Answers
If you’re building agents, tutors, or analysis tools, you’ve probably seen two DeepSeek endpoints:
- deepseek-chat – fast, “normal” chat mode
- deepseek-reasoner – slower, reasoning mode
This article explains what DeepSeek Reasoner actually is, how it works under the hood, and when you should (and shouldn’t) use it.
1. What is DeepSeek Reasoner?
DeepSeek Reasoner is the API endpoint deepseek-reasoner – a reasoning model that generates a Chain of Thought (CoT) before it produces the final answer. You can even access that CoT in the API response via a dedicated field.
Key points:
- Endpoint name: deepseek-reasoner
- Backing model (2025-09+): DeepSeek-V3.2-Exp in “thinking mode”
- Behavior: first think, then answer (internal CoT → final reply)
- Context: up to 128K tokens, like deepseek-chat
In short: DeepSeek Reasoner = V3.2-Exp with reasoning turned on, exposing its internal thought process when you want it.
2. How DeepSeek Reasoner Works (API View)
When you call deepseek-reasoner, the model returns two separate text fields:
- reasoning_content – the Chain of Thought (how the model reasons step by step).
- content – the final answer you actually show to users.
Example structure (simplified):
```json
{
  "choices": [
    {
      "message": {
        "reasoning_content": "Let me compare 9.11 and 9.8: 9.80 > 9.11 ...",
        "content": "9.8 is greater than 9.11."
      }
    }
  ]
}
```
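As a minimal sketch, here is how the two fields can be pulled apart once the response is parsed. The dict below is a hand-made stand-in mirroring the simplified shape shown in this section, with placeholder strings; a real response carries additional fields.

```python
# Separating the two text fields of a (simplified) deepseek-reasoner reply.
# The dict is a stand-in for a parsed API response, not real output.
response = {
    "choices": [
        {
            "message": {
                "reasoning_content": "Step 1: restate the problem. Step 2: ...",
                "content": "Final answer for the user.",
            }
        }
    ]
}

message = response["choices"][0]["message"]
reasoning = message["reasoning_content"]  # internal chain of thought
answer = message["content"]               # user-facing reply

print(answer)  # -> Final answer for the user.
```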
API-specific behavior:
- max_tokens controls the total output, including CoT + final answer (default 32K, max 64K).
- Supported features:
  - JSON Output
  - Chat Completion
  - Chat Prefix Completion (Beta)
- Not supported:
  - Function Calling
  - FIM Completion
- Parameters like temperature, top_p, presence_penalty, and frequency_penalty are accepted for compatibility but have no effect.

This design forces the model to behave deterministically and focus on reasoning quality rather than sampling variety.
3. Model & Pricing Compared to deepseek-chat
As of the V3.2-Exp release:
- Both deepseek-chat and deepseek-reasoner run on DeepSeek-V3.2-Exp.
- deepseek-chat = non-thinking mode; deepseek-reasoner = thinking mode.
From the official pricing page:
- Model version: DeepSeek-V3.2-Exp
- Context length: 128K tokens for both
- Output limits:
  - deepseek-chat: default 4K, max 8K
  - deepseek-reasoner: default 32K, max 64K (to allow long CoT)
- Pricing (per 1M tokens, V3.2-Exp):
  - Input (cache hit): $0.028
  - Input (cache miss): $0.28
  - Output: $0.42
Because Reasoner outputs more tokens (CoT + answer), you’ll usually spend more on output tokens than with deepseek-chat, but you get much stronger reliability on hard tasks.
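To make that trade-off concrete, here is a back-of-the-envelope cost estimator using the per-million-token prices quoted above. The token counts in the example are illustrative assumptions, not measurements.

```python
# Rough USD cost of one deepseek-reasoner call at V3.2-Exp prices
# (per 1M tokens, from the pricing list above).
PRICE_INPUT_HIT = 0.028
PRICE_INPUT_MISS = 0.28
PRICE_OUTPUT = 0.42

def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cache_hit: bool = False) -> float:
    """output_tokens includes both the CoT and the final answer."""
    input_price = PRICE_INPUT_HIT if cache_hit else PRICE_INPUT_MISS
    return (input_tokens * input_price + output_tokens * PRICE_OUTPUT) / 1_000_000

# Example: a 4K-token prompt producing a 20K-token reasoning trace + answer.
print(round(estimate_cost_usd(4_000, 20_000), 4))  # -> 0.0095
```

Note how the output side dominates: even a cache miss on the prompt costs far less than a long reasoning trace.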
4. Multi-Round Conversation: How to Use It Safely
One important detail from the docs: in multi-turn chat you must not feed reasoning_content back into the next request.
The official guide shows this pattern:
- Call deepseek-reasoner with your messages.
- Save:
  - reasoning_content → for logging, analysis, distillation, or internal UI.
  - content → the assistant message you add back into messages.
- Next turn:
  - Append only { "role": "assistant", "content": content } from the previous reply.
  - Do not include reasoning_content in the messages array; otherwise you get a 400 error.
So your loop looks like:
```python
messages = [{"role": "user", "content": "Question 1"}]

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
)

reasoning = response.choices[0].message.reasoning_content
answer = response.choices[0].message.content

# Store reasoning somewhere internal; only feed back the answer:
messages.append({"role": "assistant", "content": answer})
messages.append({"role": "user", "content": "Follow-up question"})
The same logic applies in streaming mode, where you concatenate delta.reasoning_content and delta.content separately.
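The streaming accumulation can be sketched as follows. The chunk objects below are hand-made stand-ins for real stream deltas, built with SimpleNamespace purely so the loop logic is runnable; only the loop itself reflects the documented behavior.

```python
# Sketch: reasoning and answer arrive in separate delta fields and
# must be concatenated into separate buffers.
from types import SimpleNamespace

def fake_chunk(reasoning=None, content=None):
    """Stand-in for one streamed chunk (assumption, not a real SDK object)."""
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

stream = [
    fake_chunk(reasoning="First, restate "),
    fake_chunk(reasoning="the problem..."),
    fake_chunk(content="Here is "),
    fake_chunk(content="the answer."),
]

reasoning_parts, content_parts = [], []
for chunk in stream:
    delta = chunk.choices[0].delta
    if getattr(delta, "reasoning_content", None):
        reasoning_parts.append(delta.reasoning_content)
    if getattr(delta, "content", None):
        content_parts.append(delta.content)

reasoning = "".join(reasoning_parts)
answer = "".join(content_parts)
print(answer)  # -> Here is the answer.
```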
5. When Should You Use DeepSeek Reasoner?
Use deepseek-reasoner when extra thinking time genuinely improves outcomes:
5.1 Math, logic & exams
- Contest math (AIME, Olympiad-style problems)
- Logical puzzles, proofs, step-by-step derivations
- “Show your work” situations where CoT is valuable
DeepSeek’s earlier R1 models already matched or approached OpenAI’s o1 on these benchmarks; Reasoner brings that style of chain-of-thought reasoning into V3.2-Exp with long context and lower cost.
5.2 Complex coding & debugging
- Multi-file / architecture-level reasoning
- Explaining and fixing tricky bugs
- Migration plans and refactors that require several steps
You can combine deepseek-chat for quick edits and deepseek-reasoner for complicated “why is this failing?” questions.
5.3 Agents, research & high-stakes decisions
- Research agents that have to read multiple documents, compare evidence, and justify conclusions
- AI ops agents planning multi-step workflows or runbooks
- Internal decision-support tools where you want a human to review the reasoning before acting
Because Reasoner exposes reasoning_content, you can review or log the CoT as part of your evaluation and safety pipeline instead of treating the model as a black box.
6. When to Stick with deepseek-chat Instead
deepseek-reasoner is not always the right choice:
Use deepseek-chat (non-thinking mode) when:
- You need fast, low-latency replies (customer chat, UI helpers).
- Tasks are simple: small code snippets, short Q&A, boilerplate text.
- You’re cost-sensitive and don’t need long CoT.
Because Reasoner’s larger max_tokens budget and long CoT make responses longer, your bills and latency both go up. For most routine tasks, deepseek-chat is enough.
7. Practical Implementation Tips
Here’s how to get the most from DeepSeek Reasoner in a real product.
7.1 Pattern: dual-model strategy
A common pattern in recent tutorials and blog posts:
- Route simple requests to deepseek-chat (cheap, fast).
- Detect hard tasks (math, long context, low confidence, or explicit “explain step-by-step” requests).
- For those, call deepseek-reasoner once, capture reasoning_content, and cache the result.
This keeps overall cost low while still giving you “reasoning on demand”.
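A routing heuristic for this dual-model strategy might look like the sketch below. The keyword list and length threshold are illustrative assumptions; real routers often also use a classifier or a confidence score from a cheap first pass.

```python
# Hypothetical router for the dual-model strategy: obviously hard
# prompts go to deepseek-reasoner, everything else to deepseek-chat.
HARD_HINTS = ("prove", "step-by-step", "debug", "why is this failing", "olympiad")

def choose_model(prompt: str, max_easy_chars: int = 2000) -> str:
    lowered = prompt.lower()
    if len(prompt) > max_easy_chars or any(h in lowered for h in HARD_HINTS):
        return "deepseek-reasoner"
    return "deepseek-chat"

print(choose_model("Summarise this paragraph."))          # -> deepseek-chat
print(choose_model("Prove that sqrt(2) is irrational."))  # -> deepseek-reasoner
```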
7.2 Hide or show CoT depending on user
- End users: show only content (the final answer).
- Internal tools / expert modes: expose reasoning_content for debugging and education.
- Evaluation harness: store both reasoning_content and content plus ground truth so you can analyze where reasoning goes wrong.
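One way to capture both outputs plus ground truth for such a harness is a small record type like the sketch below; the field names and exact-match scoring are assumptions, not a prescribed schema.

```python
# Hypothetical evaluation record: keep the CoT internal, the answer
# user-facing, and the ground truth alongside for scoring.
from dataclasses import dataclass

@dataclass
class EvalRecord:
    prompt: str
    reasoning_content: str  # internal CoT, never shown to end users
    content: str            # final answer shown to users
    ground_truth: str

    def is_correct(self) -> bool:
        # Naive exact-match scoring; real harnesses normalise more.
        return self.content.strip() == self.ground_truth.strip()

record = EvalRecord(
    prompt="What is 17 * 23?",
    reasoning_content="17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 ...",
    content="391",
    ground_truth="391",
)
print(record.is_correct())  # -> True
```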
7.3 Guardrails and safety
Even in reasoning mode, you still need app-level safety:
- Filter prompts & outputs for sensitive or disallowed content.
- Add domain-specific constraints (e.g., no medical diagnosis, financial-advice disclaimers).
- Log CoT for safety audits, but treat it as sensitive data, since it may contain user info and model-generated speculation.
8. DeepSeek Reasoner in the Bigger Ecosystem
Two broader trends make deepseek-reasoner important:
- Unified “thinking/non-thinking” architecture: DeepSeek’s changelog shows a progression (R1 → V3.1 → V3.2-Exp), each time merging reasoning capability into a more general architecture. Now the same base (V3.2-Exp) powers both chat and reasoning modes.
- Cheaper long-context reasoning: V3.2-Exp’s DeepSeek Sparse Attention (DSA) and architecture refinements cut API costs by 50%+ compared to earlier versions, while preserving benchmark performance. This makes long-context CoT and agents much more affordable.
For builders, that means you can realistically run chain-of-thought agents with 128K context without blowing the budget.
9. Summary: When to Reach for DeepSeek Reasoner
Use DeepSeek Reasoner (deepseek-reasoner) when:
- You need reliable solutions on hard problems, not just fluent answers.
- You value inspectable reasoning (reasoning_content) for review, distillation, or training.
- You’re building agents, tutors, or analysis tools where step-by-step logic matters.
Stick with deepseek-chat for everyday chat, support, and light coding.
DeepSeek Reasoner - Frequently Asked Questions
1. What is DeepSeek Reasoner and how is it different from deepseek-chat?
On multiple subs (r/LocalLLaMA, r/SillyTavernAI, r/JanitorAI_Official) people sum it up like this:
- deepseek-chat = non-thinking mode – fast, general chat model.
- deepseek-reasoner = thinking mode – the same base model, but it generates a full chain-of-thought before the final answer.
Reasoner is meant for hard problems (math, logic, complex code) where extra reasoning is worth the extra latency and token cost.
2. Is DeepSeek Reasoner the same thing as DeepSeek R1?
Reddit threads and DeepSeek’s own release notes link the Reasoner endpoint directly to the R1 reasoning models:
- When R1 launched, the docs explicitly said: use it by setting model=deepseek-reasoner.
- Later updates say both deepseek-chat and deepseek-reasoner are now upgraded to newer bases (V3.1-Terminus, then V3.2-Exp), with Reasoner always mapped to the reasoning / “thinking” mode.
So in practice: deepseek-reasoner = “the current DeepSeek reasoning model” (R1 originally, now the reasoning mode of V3.2-Exp / successors).
3. How do I actually use DeepSeek Reasoner instead of chat in my app / frontend?
Reddit answers this in a very “just change the string” way:
- Anywhere you used deepseek-chat, swap the model name to deepseek-reasoner.
- In some tools (JanitorAI, RooCode, aider, etc.) you must type the model ID exactly as the docs or the provider expect (sometimes just deepseek-reasoner, sometimes deepseek/deepseek-reasoner). Typos or wrong prefixes cause 400 errors.
If a host doesn’t list Reasoner explicitly in its model list, you usually can’t just invent the string – check their docs or UI.
4. What is reasoning_content and what do I do with it?
DeepSeek Reasoner gives you two outputs in each reply:
- reasoning_content – the hidden chain-of-thought
- content – the final user-visible answer
Reddit + integration docs repeat two rules:
- Don’t feed reasoning_content back into the next request. If you include it in the messages array, the API will return a 400 error.
- You can log or store it for:
  - internal debugging / evaluation
  - training data (distilling R1-style reasoning)
  - “expert mode” UIs that show how the model thought things through

Most apps only show content to end users and keep reasoning_content internal.
5. Why doesn’t DeepSeek Reasoner support temperature, top_p or function calling?
One of the most upvoted complaints on r/DeepSeek / r/LLMDevs:
- Reasoner ignores sampling controls – official docs say temperature, top_p, and presence/frequency penalties are accepted but have no effect on deepseek-reasoner.
- Earlier API threads also noted Reasoner didn’t support function calling or FIM, while deepseek-chat did.
The design goal is deterministic, stable reasoning: fewer knobs, more predictable outputs. If you need function calling or playful sampling, people suggest using deepseek-chat plus tools instead.
6. Why is DeepSeek Reasoner slower and more expensive than deepseek-chat?
Reddit + blog posts emphasise two things:
- Reasoner can generate up to ~32K reasoning tokens (and up to 64K total) before the final answer, whereas chat usually stops after a few thousand tokens.
- It therefore uses more output tokens and had higher per-token prices than chat (especially under the original R1 pricing: ~$0.55/M input, ~$2.19/M output, later converging as pricing was updated).
Users on r/LocalLLaMA report that the 670B/R1-style Reasoner can take many seconds or even minutes on very hard prompts – which is fine for research or coding, but not for snappy chatbots.
7. When should I use DeepSeek Reasoner vs DeepSeek Chat?
The typical Reddit advice:
- Use deepseek-chat when you need:
  - fast responses
  - general chat, summarisation, everyday coding
  - lower cost per request
- Use deepseek-reasoner when you need:
  - exam-style math, proofs, logic puzzles
  - deep code reasoning / debugging
  - agents or tutors that must “think out loud” step-by-step
Posts titled “What do you guys prefer between DeepSeek-chat and DeepSeek-reasoner?” usually conclude:
Chat for most things, Reasoner for the hard stuff.
8. How can I stop DeepSeek Reasoner from generating huge walls of reasoning?
People get annoyed when Reasoner dumps hundreds of tokens of CoT for simple questions. Common tips:
- Control max_tokens – since CoT + answer share the same budget, a smaller max_tokens caps runaway reasoning.
- Prompt for concise steps, e.g.:
  - “Show only the essential steps, then the final answer.”
  - “Limit your reasoning to 5 lines.”
- Use Reasoner only for prompts tagged as “hard”, and keep easy prompts on deepseek-chat.
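Combining the first and last tips, a request builder might cap the shared CoT + answer budget per difficulty tier. The helper and its budget numbers below are illustrative assumptions, not official guidance; only the max_tokens semantics come from the docs.

```python
# Hypothetical helper: pick a model and a max_tokens cap per request.
# max_tokens bounds CoT + final answer together on deepseek-reasoner.
def build_request_kwargs(messages: list[dict], hard: bool) -> dict:
    budget = 16_000 if hard else 2_000  # illustrative budgets
    return {
        "model": "deepseek-reasoner" if hard else "deepseek-chat",
        "messages": messages,
        "max_tokens": budget,
    }

kwargs = build_request_kwargs(
    [{"role": "user", "content": "What is 2 + 2?"}], hard=False
)
print(kwargs["model"], kwargs["max_tokens"])  # -> deepseek-chat 2000
```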
There is currently no official reasoning_effort parameter from DeepSeek to directly dial reasoning depth (some wrappers simulate one).
9. Why am I getting errors or timeouts with deepseek-reasoner?
Common Reddit issues:
- 400 / bad request – almost always due to:
  - a wrong model name (deepseek/deepseek-reasoner vs deepseek-reasoner)
  - mistakenly including reasoning_content in the next messages array, which the API explicitly forbids.
- Network / “can’t make request to reasoner (but can with chat)” – users report occasional outages or rate-limit issues affecting Reasoner more than chat.
Debug steps people recommend:
- Copy a minimal code example from the official docs and confirm that works.
- Double-check the model string and headers / base URL.
- Strip any reasoning_content fields from your outgoing messages.
- If it still fails but chat works, assume a temporary upstream issue.
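The stripping step can be done with a small defensive sanitizer run over the history before every request, as in this sketch:

```python
# Defensive sanitizer: drop reasoning_content keys from the outgoing
# messages array, since the API rejects requests that include them.
def strip_reasoning(messages: list[dict]) -> list[dict]:
    """Return a copy of messages with any reasoning_content removed."""
    return [
        {k: v for k, v in msg.items() if k != "reasoning_content"}
        for msg in messages
    ]

history = [
    {"role": "user", "content": "Question 1"},
    {"role": "assistant", "content": "Answer 1", "reasoning_content": "CoT..."},
]
clean = strip_reasoning(history)
print("reasoning_content" in clean[1])  # -> False
```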
10. Can I extract Reasoner’s chain-of-thought and feed it to another model?
Yes – and Reddit loves this trick.
A popular r/LocalLLaMA post shows exactly that:
- Call deepseek-reasoner, capture only the CoT steps.
- Feed that reasoning into another model (e.g. GPT-3.5 or Claude Sonnet) as context, then ask that model to produce the final answer.
People use this both for:
- “Reasoning booster” pipelines – cheap model + imported R1-style reasoning.
- Training-data generation – building datasets of step-by-step solutions for their own RL or distillation.
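The hand-off step of this trick can be sketched as a prompt builder that wraps the captured CoT as context for the second model. The template below is a hypothetical example, not a fixed format.

```python
# Sketch of the "reasoning transfer" trick: wrap borrowed CoT as
# context for a second (cheaper) model to answer from.
def build_transfer_prompt(question: str, borrowed_reasoning: str) -> str:
    return (
        "A reasoning model produced the following working notes:\n\n"
        f"{borrowed_reasoning}\n\n"
        f"Using those notes, answer the question concisely: {question}"
    )

prompt = build_transfer_prompt(
    "Is 391 prime?",
    "391 = 17 * 23, so it has nontrivial factors...",
)
print(prompt.splitlines()[0])  # -> A reasoning model produced the following working notes:
```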
11. Is DeepSeek Reasoner actually good for coding, or should I use other models?
Opinions are mixed on Reddit:
- Some users say Reasoner is great at algorithmic problems and explaining bugs, especially when you want detailed step-by-step reasoning.
- Others, especially on r/ClaudeAI, report that for large real-world codebases they still prefer Claude Sonnet 3.5 or other models – Reasoner sometimes fixes the snippet but doesn’t fully account for interactions across the whole project.