Made my first AI request of the day and that's the error I was met with. Any ideas anyone?
Ask mode; Claude Sonnet 3.7
Gemini 2.5 went through
I have no idea how tokens work, but maybe your request was too long? Just guessing here.
68 words, 410 characters - and Gemini responded without hesitation - so I'm betting it's Claude.
But that whole token-counting thing has got me lost too.
Let's cheat and use AI to explain it...
Token Calculation by Model
1. OpenAI GPT-4 / GPT-4o
- Tokenizer: Uses tiktoken, a byte pair encoding (BPE) tokenizer.
- Token size:
- 1 token ≈ 4 characters (English), or ~0.75 words.
- Emojis, punctuation, and whitespace are all tokenized.
- How it's calculated: You can use `tiktoken` to tokenize and count tokens (a quick sketch follows below).
- Example: "I love AI." ≈ 4 tokens: ["I", " love", " AI", "."]
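A minimal sketch of counting tokens yourself with the open-source `tiktoken` package (the encoding names are the published ones; pick the one matching your model):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by GPT-4; GPT-4o uses o200k_base.
enc = tiktoken.get_encoding("cl100k_base")

text = "I love AI."
tokens = enc.encode(text)

print(len(tokens))                        # token count for the sentence
print([enc.decode([t]) for t in tokens])  # the individual token strings
```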
2. Anthropic Claude (Opus, Sonnet, Haiku)
- Tokenizer: Custom tokenizer, similar to SentencePiece.
- Token size:
- 1 token ≈ 3–4 characters on average.
- Notable Feature: Claude's tokenizer is more optimized for English than OpenAI's in some ways, meaning fewer tokens for the same sentence.
- How it's calculated: Anthropic has not open-sourced its tokenizer, but third-party tools estimate similar token counts to OpenAI's (a rough estimator sketch follows below).
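Since the tokenizer isn't public, all you can do locally is estimate. A rough heuristic sketch; the 3.5 characters-per-token figure is an assumption based on the averages above, not an official Anthropic number:

```python
import math

def estimate_claude_tokens(text: str, chars_per_token: float = 3.5) -> int:
    """Rough token estimate for Claude models.

    chars_per_token is a guess derived from the ~3-4 characters/token
    average above; the real count can differ noticeably.
    """
    return math.ceil(len(text) / chars_per_token)

print(estimate_claude_tokens("I love AI."))  # ~3 tokens for this short sentence
```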
3. Google Gemini (formerly Bard)
- Tokenizer: Based on SentencePiece, often with BPE or Unigram LM.
- Token size:
- 1 token ≈ 3–4 characters (varies significantly by language).
- Multilingual support: Highly optimized for multilingual input.
- How it's calculated: Use Google's open-source SentencePiece models to replicate (a sketch follows below).
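If you have a trained SentencePiece model file handy, the counting can be replicated directly. A sketch; the model path is hypothetical, since Google's production Gemini tokenizer isn't shipped as a simple file:

```python
# pip install sentencepiece
import sentencepiece as spm

# Hypothetical path - any trained SentencePiece model file works here.
sp = spm.SentencePieceProcessor(model_file="tokenizer.model")

text = "I love AI."
pieces = sp.encode(text, out_type=str)

print(len(pieces))  # token count under this model
print(pieces)       # the subword pieces themselves
```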
Let's break down what 100,701 tokens roughly equates to.
Since token-to-word ratios vary by language and complexity, here's a range:
| Language Complexity | Words per Token | Estimated Words for 100,701 Tokens |
|---|---|---|
| Simple English (average) | ~0.75 | ~75,525 words |
| Complex English (technical/legal) | ~0.66 | ~66,460 words |
So, 100,701 tokens ≈ 66,000–75,000 words.
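The arithmetic behind those numbers is just a ratio; a trivial sketch to make the ranges reproducible:

```python
tokens = 100_701

# Words-per-token ratios from the table above.
for label, words_per_token in [("simple English", 0.75), ("complex English", 0.66)]:
    print(f"{label}: ~{int(tokens * words_per_token):,} words")
```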
Web Page Size Estimate
1. By Word Count
Based on common webpage word counts:
| Page Type | Avg. Word Count | 100,701 Tokens Can Fit... |
|---|---|---|
| Blog Post | ~1,000 | ~66–75 pages |
| Marketing Landing Page | ~500 | ~132–150 pages |
| Technical Docs Page | ~2,000 | ~33–38 pages |
2. By Character Count / File Size
- 1 token ≈ 4 characters, so 100,701 tokens ≈ 402,804 characters
- That's about 400 KB of raw text (uncompressed).
- In terms of HTML page size, with basic styling/markup, that would be about 500–700 KB.
For comparison:
A typical full-featured web article (with light media and CSS) is 200β500 KB.
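The size estimate follows the same way; a sketch assuming plain English text where 1 character is roughly 1 byte:

```python
tokens = 100_701
chars = tokens * 4  # 1 token ≈ 4 characters
kb = chars / 1024   # 1 character ≈ 1 byte for plain ASCII-ish text

print(f"~{chars:,} characters ≈ {kb:.0f} KB of raw text")
```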
Summary:
- 100,701 tokens = about 66,000–75,000 words
- That's like:
- A short novel (e.g., The Great Gatsby)
- A manual or guidebook
- 60–75 blog posts
- A medium-sized website's text content
LOL - showoff -
but how do I get an error on the first paragraph of the day?
Hahahahaha! I asked AI!
Ask Claude. I'm sure it will apologise in a very nice way.
In all seriousness, I don't know, Jimed. Seems very strange though.
See also my explanation about context size and token usage, as we supply Wappler-specific knowledge to the AI models:
So on larger pages it is advisable to use models with larger context size.
We will also soon be adding Google Gemini as a direct provider, which goes up to a 1M-token context size and should be plenty.