Claude Conversation Too Long: How to Fix

Claude says a conversation is too long when the thread fills its context window. On paid claude.ai plans the current Opus and Sonnet models (Opus 4.8, 4.7, 4.6 and Sonnet 4.6) use a 500K-token window, while older models use 200K. The fix: ask Claude to summarize the chat and paste that summary into a new conversation, or move reusable material into a Project so retrieval handles it.

The short fix for "conversation too long"

The message means the chat has reached its token budget. Claude reads the entire thread on every turn, so each message, attached file, and reply adds to a running total. Once that total approaches the context window, Claude warns that the next message will exceed the chat limit.

The direct fix has three moves. Ask Claude to summarize the conversation so far. Copy that summary. Open a new chat and paste it as the first message. The new conversation starts with a clean window and the essential context intact.

When code execution is enabled, Claude already summarizes earlier messages automatically as a thread grows, so the hard error is uncommon in everyday use. Very large first messages can still trip the limit. When the warning does appear, the steps above restore momentum in under a minute.

  • Ask Claude to summarize the thread before you hit the wall.
  • Paste the summary into a fresh conversation to rebuild context.
  • Attach only the files the next stage of work actually needs.

Why Claude hits the length limit

A context window is the maximum amount of text a model can consider at once, measured in tokens. On paid claude.ai plans the window depends on the model. Claude Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 support a 500K-token window when chatting with Claude. Older models use a 200K-token window. As a rough guide, 200K tokens is about 150,000 words.

The window holds the whole conversation, not just your latest message. Long back-and-forth threads, pasted documents, code, and prior answers all draw on the same budget. A single very large first message can trip the limit on its own, which is why pasting a huge file early often triggers the warning fast.

  • 500K tokens: window for Opus 4.8, 4.7, 4.6, and Sonnet 4.6 on paid plans.
  • 200K tokens: window for older Claude models.
  • Everything in the thread counts, including Claude's own replies and attached files.
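
The budget arithmetic above can be sketched in a few lines. This is a rough estimate, not Claude's tokenizer: the ratio comes from the rule of thumb that 200K tokens is about 150,000 words, and the function names and the 80% threshold are assumptions made for illustration.

```python
# Rough budget check. Not Claude's tokenizer: the ratio below is derived from
# the rule of thumb that 200K tokens is about 150,000 words.
WORDS_PER_TOKEN = 150_000 / 200_000  # 0.75, an approximation


def estimate_tokens(text: str) -> int:
    """Estimate tokens from a whitespace word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def thread_usage(messages: list[str], window_tokens: int) -> float:
    """Return the estimated fraction of the window used by the whole thread."""
    # Every message counts: your prompts, pasted files, and Claude's replies.
    total = sum(estimate_tokens(message) for message in messages)
    return total / window_tokens


def should_summarize(messages: list[str], window_tokens: int,
                     threshold: float = 0.8) -> bool:
    """Hypothetical rule: hand off once the thread reaches the threshold."""
    return thread_usage(messages, window_tokens) >= threshold
```

With a 200K window and the default threshold, `should_summarize` turns true once the thread reaches roughly 160,000 estimated tokens. Real token counts vary with language, code, and formatting, so treat the result as an early warning rather than an exact measure.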

Continue the chat without losing context

The cleanest way to carry context forward is a summary handoff. Type a prompt such as: "Summarize this conversation so I can continue it in a new chat. Include decisions made, open questions, and any constraints." Claude returns a compact recap that fits easily into a fresh window.

Then open a new conversation and paste that recap as your opening message. You keep the thread of reasoning without dragging along thousands of tokens of earlier detail. Trim attachments to only what the next phase needs, since re-uploading a large file reintroduces the same pressure that ended the last chat.

  • Prompt Claude for a structured summary: decisions, open questions, constraints.
  • Start a new chat and lead with that summary.
  • Re-attach only the files relevant to the next step, not the whole history.
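
If you do this handoff often, the two messages can be templated. The sketch below is a minimal helper, assuming you copy text in and out of the chat by hand; the function names and the layout of the opening message are illustrative, not anything Claude requires.

```python
# Sketch of the summary handoff. The prompt wording follows the example above;
# the function names and the opening-message layout are assumptions.
DEFAULT_SECTIONS = ("decisions made", "open questions", "any constraints")


def join_items(items: tuple[str, ...]) -> str:
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_summary_prompt(sections: tuple[str, ...] = DEFAULT_SECTIONS) -> str:
    """Build the request to send at the end of the long chat."""
    if not sections:
        raise ValueError("list at least one section to include")
    return ("Summarize this conversation so I can continue it in a new chat. "
            f"Include {join_items(sections)}.")


def build_opening_message(summary: str, next_step: str) -> str:
    """Combine Claude's recap with the next task for the new chat's first message."""
    return (f"Context from a previous conversation:\n{summary.strip()}\n\n"
            f"Next step: {next_step.strip()}")
```

Called with no arguments, `build_summary_prompt()` reproduces the prompt quoted above. Stating the next step in the opening message keeps the new chat focused on what remains instead of replaying the old thread.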

Use Projects for large, reusable knowledge

When the same documents need to stay available across many chats, Projects fit better than a single long thread. Once a project's files pass a threshold, Claude switches to a retrieval mode: instead of loading every file into the window at once, it searches the project knowledge and pulls only the relevant sections into active context for each question.

Anthropic states that this retrieval mode lets Projects hold up to 10x more content than the raw window while keeping responses accurate. That makes Projects a durable home for reference material you query repeatedly, rather than re-pasting it into chat after chat.

  • Projects retrieve relevant file sections instead of loading everything.
  • Retrieval mode supports up to 10x more content than the raw window.
  • Good for reference material you query across many separate chats.
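
To make the idea of retrieval concrete, here is a toy sketch that ranks sections by keyword overlap with a question and returns only the best matches. It is not how Projects work internally, which this article does not describe; the scoring rule and every name below are assumptions chosen for simplicity.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question: str, section: str) -> int:
    """Count how often the question's distinct words occur in the section."""
    counts = Counter(tokenize(section))
    return sum(counts[word] for word in set(tokenize(question)))


def retrieve(question: str, sections: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k sections with a nonzero score, best match first."""
    ranked = sorted(sections, key=lambda s: score(question, s), reverse=True)
    return [s for s in ranked[:top_k] if score(question, s) > 0]
```

Only the returned sections would go into the active context, which is why a retrieval-backed store can hold far more than one window. Production systems typically use semantic search rather than raw word counts.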

Keep context across chats with an external memory layer

Summaries and Projects solve context inside one assistant. A different gap appears when the same facts, preferences, and project details need to survive across new chats and even across different assistants. Re-explaining them every time is the recurring cost of a fixed context window.

MemX, an AI memory app from Neural Forge Technologies, addresses that recall angle. It acts as an external memory layer where you store durable context once, then paste the relevant pieces back into a new Claude conversation when you need them. MemX does not replace Claude or any chat assistant; it holds the personal memory you would otherwise lose when a thread ends.

On privacy, MemX is private by architecture: per-user isolation, encryption at rest, and Google Cloud KMS, with on-device handling where applicable. That matters when the context you save includes work details or personal information you reuse over time.

  • Stores durable context once, so you stop re-explaining it in every new chat.
  • Complements Claude and Projects; it does not replace the assistant.
  • Private by architecture: per-user isolation, encryption at rest, Google Cloud KMS, on-device handling.

When a summary is not enough: split the work

Some tasks are too large for any single window, summary or not. For those, Anthropic's own guidance is to break content into smaller pieces and process them separately, or to extract the key sections before sending them to Claude.

A practical pattern: have Claude first identify the most relevant portions of a large document, then work through those portions in focused chats. This keeps each conversation well under the limit and produces tighter answers, because the model spends its budget on what matters rather than on bulk it does not need.

  • Break large inputs into smaller pieces and process each separately.
  • Extract or summarize key sections before sending the full text.
  • Let Claude flag the most relevant portions first, then dive into those.
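
Splitting can be done before anything reaches Claude. The sketch below is one possible approach, assuming a word count is a good enough proxy for tokens: it groups whole paragraphs into chunks under a word limit and hard-splits only paragraphs that are too large on their own. The function name and the limit are hypothetical.

```python
def split_into_chunks(text: str, max_words: int) -> list[str]:
    """Split text into chunks of at most max_words words, on paragraph breaks where possible."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    chunks: list[str] = []
    current: list[str] = []  # paragraphs in the chunk being built
    current_words = 0
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        if not words:
            continue  # skip empty paragraphs
        if len(words) > max_words:
            # Flush the chunk in progress, then hard-split the oversized paragraph.
            if current:
                chunks.append("\n\n".join(current))
                current, current_words = [], 0
            for start in range(0, len(words), max_words):
                chunks.append(" ".join(words[start:start + max_words]))
            continue
        if current_words + len(words) > max_words:
            # Adding this paragraph would overflow, so close the current chunk.
            chunks.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(" ".join(words))  # whitespace inside a paragraph is normalized
        current_words += len(words)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each chunk can then go into its own focused chat, with a short summary of earlier chunks pasted in when later pieces depend on them.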

Key takeaways

  • On paid claude.ai plans, Opus 4.8, 4.7, 4.6 and Sonnet 4.6 use a 500K-token window while older models use 200K; the "too long" message appears when the whole thread nears that ceiling.
  • When code execution is enabled, Claude auto-summarizes older messages, so the hard error is rare during normal use.
  • The fastest fix: ask Claude to summarize the thread, then paste that summary into a new conversation to rebuild context.
  • Claude Projects switch to a retrieval mode for large file sets and can hold up to 10x more content than the raw window, which sidesteps the limit for reusable knowledge.
  • On the Claude API, the 1M-token window is available as standard on current models with no beta header, but for chat the practical fix is a summary handoff or an external memory layer.

Frequently asked questions

Why does Claude say the conversation is too long?

Each chat has a fixed context window, and every message, file, and reply counts toward it. On paid claude.ai plans the current Opus and Sonnet models (Opus 4.8, 4.7, 4.6 and Sonnet 4.6) use a 500K-token window, while older models use 200K. When the running total nears that ceiling, Claude warns that the message will exceed the chat limit.

How long can a Claude conversation be?

It depends on the model. On paid claude.ai plans, Opus 4.8, 4.7, 4.6 and Sonnet 4.6 support a 500K-token context window, while older models use 200K. As a rough guide, 200K tokens is about 150,000 words. The window covers the whole thread, not a single message.

Can I increase the conversation length limit?

You cannot raise the per-chat window for a given model, but switching to Opus 4.8, 4.7, 4.6 or Sonnet 4.6 gives you a 500K-token window instead of 200K. Claude Code on a paid plan reaches a 1M-token window on supported models, and the Claude API offers a 1M-token window as standard on current models, with no beta header. For chat, the practical fix is a new conversation or a Project.

How do I continue a conversation that has become too long?

Ask Claude to summarize the thread, copy that summary, and paste it into a new conversation as the starting message. For recurring context, use a Project so files stay attached, or keep durable notes in an external memory layer you can paste back in.

Does starting a new chat lose my context?

Starting a new conversation gives Claude a fresh context window and does not carry over the previous thread automatically. Your old chat still exists in your history, but the new chat knows nothing about it until you paste a summary or attach the relevant files.