LLM Token Counter — GPT, Claude Token & Cost Estimator

Count GPT-4o, GPT-4, GPT-3.5 and Claude tokens in your browser and estimate API cost before you send the prompt.


About LLM Token Counter

A browser-side tokenizer for OpenAI and Claude models. Paste a prompt, document, or chat history and see exact token counts for cl100k_base (GPT-4, GPT-3.5) and o200k_base (GPT-4o, GPT-4o mini), plus a per-model cost estimate. Text never leaves your device.

Large language models charge per token, not per character. A token is usually a short chunk of text produced by byte-pair encoding, so "hello" might be one token while a rare word splits into three. Knowing the count ahead of time matters for two reasons: staying inside a model's context window, and predicting how much a request will cost before you fire it.
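
You can watch the splitting happen with the Python tiktoken library, whose BPE tables this tool mirrors. A minimal sketch:

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["hello", "antidisestablishmentarianism"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r}: {len(ids)} token(s) -> {pieces}")
```

"hello" encodes to a single token, while the rare compound splits into several, which is exactly why character counts are a poor proxy for cost.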

This tool ships the same BPE tables as the official OpenAI tiktoken library and runs them entirely in your browser. The o200k_base encoding covers GPT-4o and GPT-4o mini; cl100k_base covers GPT-4, GPT-4 Turbo, and GPT-3.5. Anthropic has not released a public tokenizer, so Claude counts use cl100k_base as a close approximation. Expect a few percent of drift on Claude and exact parity on OpenAI models.
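
If you want to check which encoding a given model uses, tiktoken can resolve it from the model name:

```python
import tiktoken

# Resolve the encoding from the model name
for model in ["gpt-4o", "gpt-4o-mini", "gpt-4", "gpt-3.5-turbo"]:
    enc = tiktoken.encoding_for_model(model)
    print(f"{model}: {enc.name}")
# gpt-4o and gpt-4o-mini resolve to o200k_base; the older models to cl100k_base
```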

The cost estimator multiplies your token count by each model's published per-million-token input and output rates. The output figure assumes the same token count comes back; real responses vary. Prices are hardcoded and reflect late-2025 pricing, so if OpenAI or Anthropic shifts rates, the numbers here will need a refresh.
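
The arithmetic is simple enough to reproduce yourself. A sketch using the GPT-4o rates quoted on this page (the helper name and token count are illustrative):

```python
def estimate_cost(tokens: int, price_per_million: float) -> float:
    """Cost is linear: token count times the per-million-token rate."""
    return tokens / 1_000_000 * price_per_million

# GPT-4o rates as quoted on this page; check the provider's pricing page for current figures
GPT_4O_INPUT, GPT_4O_OUTPUT = 2.50, 10.00

n = 12_000  # example prompt size
print(f"input:  ${estimate_cost(n, GPT_4O_INPUT):.4f}")   # $0.0300
print(f"output: ${estimate_cost(n, GPT_4O_OUTPUT):.4f}")  # $0.1200
```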

How to use the LLM Token Counter
  1. Paste your text

    Drop in a prompt, document, chat transcript, or piece of code. The tokenizer downloads once, is cached by your browser, and then encodes live as you type.

  2. Pick the target model

    Choose GPT-4o, GPT-4o mini, GPT-4 Turbo, GPT-3.5 Turbo, or Claude Sonnet. The encoding switches automatically: o200k_base for the GPT-4o family, cl100k_base for the rest.

  3. Read the cost estimate

    See token count, character/word/line stats, and dollar cost for both input and output sides. A per-token chip view shows exactly how your text was split.

Common use cases

Estimate API cost before sending

Check whether a long prompt will cost cents or dollars at GPT-4 Turbo rates before you commit to a batch of requests.

Fit within a context window

Confirm that a RAG chunk plus system prompt and conversation history stays under the 128k or 200k token limit of your target model.
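
A rough budget check along these lines, assuming GPT-4o's 128k window, a hypothetical reply reservation, and ignoring the small per-message overhead chat formatting adds:

```python
import tiktoken

enc = tiktoken.get_encoding("o200k_base")
CONTEXT_LIMIT = 128_000    # GPT-4o context window
REPLY_BUDGET = 4_096       # tokens reserved for the response (assumption)

def fits(system_prompt: str, history: list[str], chunk: str) -> bool:
    """True if everything plus the reply budget stays inside the window."""
    used = sum(len(enc.encode(t)) for t in (system_prompt, chunk, *history))
    return used + REPLY_BUDGET <= CONTEXT_LIMIT
```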

Split long documents for embedding

Find the right split point by measuring tokens directly, rather than guessing from character count, when chunking for a vector store.
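
A naive token-boundary splitter, assuming a fixed 512-token chunk size and no overlap:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def chunk_by_tokens(text: str, max_tokens: int = 512) -> list[str]:
    """Split on token boundaries instead of guessing from character counts."""
    ids = enc.encode(text)
    return [enc.decode(ids[i:i + max_tokens])
            for i in range(0, len(ids), max_tokens)]
```

Real chunkers usually add overlap and snap to sentence boundaries, but the token count is the budget that actually matters.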

Compare tokenization across models

See how the same paragraph tokenizes under cl100k_base vs o200k_base to understand why GPT-4o is often cheaper per character on non-English text.
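
To reproduce the comparison outside the browser, encode the same string under both vocabularies:

```python
import tiktoken

cl100k = tiktoken.get_encoding("cl100k_base")
o200k = tiktoken.get_encoding("o200k_base")

sample = "こんにちは、世界。今日はいい天気ですね。"  # short Japanese sample
print("cl100k_base:", len(cl100k.encode(sample)), "tokens")
print("o200k_base: ", len(o200k.encode(sample)), "tokens")
```

On text like this, o200k_base typically needs noticeably fewer tokens, which is where the per-character savings come from.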

Frequently asked questions
Does my text leave my browser?

No. The tokenizer vocabulary downloads once, then all encoding happens locally. There is no server, no logging, no request back to anyone. You can watch the Network tab to confirm.

Is the Claude count exact?

Close, but not exact. Anthropic has never published the official Claude tokenizer, so the counter uses cl100k_base as an approximation. Expect drift within a few percent on normal English prose, more on code or non-Latin scripts. Use it as a planning estimate, not a billing figure.

What is the difference between cl100k_base and o200k_base?

They are two versions of OpenAI's BPE vocabulary. cl100k_base has about 100k tokens and covers GPT-4 and GPT-3.5. o200k_base has about 200k tokens and was introduced with GPT-4o, compressing non-English text more efficiently. The same input almost always tokenizes to fewer tokens under o200k_base.
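
You can check the vocabulary sizes directly with tiktoken:

```python
import tiktoken

for name in ["cl100k_base", "o200k_base"]:
    enc = tiktoken.get_encoding(name)
    print(f"{name}: {enc.n_vocab:,} vocabulary entries")
```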

What actually counts as a token?

A token is a byte-pair encoded fragment, often a whole common word but sometimes a subword, a punctuation mark, or part of a long compound. Whitespace is usually attached to the following word, so " hello" and "hello" can tokenize differently. The Raw tokens tab shows the exact split.
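
The whitespace behaviour is easy to verify:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for s in ["hello", " hello"]:
    ids = enc.encode(s)
    print(f"{s!r}: ids={ids} pieces={[enc.decode([i]) for i in ids]}")
# In cl100k_base, "hello" and " hello" each encode to a single, different token
```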

Are the prices current?

They reflect late-2025 published rates: GPT-4o at $2.50 input / $10 output per million tokens, GPT-4o mini at $0.15 / $0.60, GPT-4 Turbo at $10 / $30, GPT-3.5 Turbo at $0.50 / $1.50, and Claude Sonnet 4.5 at $3 / $15. Check the provider's pricing page for the latest figures before making budget decisions.
