01
One message, 20,000 characters.
That's about 3,600 words: roughly a longform magazine feature, eight single-spaced pages, or the transcript of a 25-minute talk. Long enough to develop a real thought; short enough that the conversation stays conversation-shaped rather than sliding into document-dump territory.
Polarist is built on a simple premise: you don't have to say everything at once. Splitting a long thought across several messages isn't a workaround — it's how dialogue works. Polarist remembers what you've said, so the next message can just keep going.
Messages you split across the same topic are threaded together automatically. You don't have to mark them, tag them, or remind the AI that this is still the same thread. It just works.
Why is the cap locale-aware?
Japanese and Korean messages are capped at 10,000 characters instead of 20,000. The reason is simple: one Japanese or Korean character carries roughly twice the information of one Latin letter, so a fair ceiling has to reflect the script's density. X/Twitter made this exact observation in 2017, when it doubled the tweet limit to 280 characters for most languages while keeping Japanese, Chinese, and Korean tweets at 140. Polarist follows the same 2:1 ratio.
The detection runs on the actual content — not on cookies, not on your UI language, not on any header you could spoof. Paste a Japanese article into the English interface and you'll get the CJK cap; write English with the Japanese interface and you'll get the Latin cap. The boundary is the text itself.
Fine print on the detection
This is a coarse Unicode-range heuristic, not a real language identifier. Polarist counts Hiragana, Katakana, common Han ideographs, and Hangul syllables; if they make up 30% or more of the message, you get the CJK cap. Otherwise, you get the Latin cap. Because Han ideographs are counted, Chinese text lands under the CJK cap as well. The 30% threshold is what keeps mixed-script messages sensible: English prose with a stray Chinese character stays far below it and keeps the Latin cap, while Japanese writing with English loanwords typically stays above it and gets the CJK cap.
Vietnamese, Arabic, Hebrew, Thai, and Devanagari-script text (Hindi, for example) all fall into the Latin bucket because they sit outside the ranges we count. The 20,000-character cap is still generous for these scripts in practice, but the classification is a deliberate simplification rather than true linguistic analysis.
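For the curious, the heuristic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Polarist's actual code: the function and constant names are made up, the exact Unicode ranges are a guess at what "common Han ideographs" and the other counted scripts cover, and it assumes that "characters" means Unicode code points and that the 30% share is measured against every character in the message, whitespace and punctuation included.

```python
# Illustrative sketch only. Names, ranges, and counting rules are assumptions,
# not Polarist's real implementation.

# Assumed code point ranges for the scripts that are counted.
CJK_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (assumed "common Han" block)
    (0xAC00, 0xD7A3),  # Hangul syllables
)

LATIN_CAP = 20_000          # cap for everything below the threshold
CJK_CAP = 10_000            # cap once the counted share reaches the threshold
CJK_THRESHOLD_PERCENT = 30  # "30% or more of the message"


def is_counted(ch: str) -> bool:
    """Return True if the character falls in one of the counted ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def message_cap(text: str) -> int:
    """Pick the cap from the text alone; no locale, cookie, or header input."""
    if not text:
        return LATIN_CAP  # assumption: an empty message gets the Latin cap
    counted = sum(1 for ch in text if is_counted(ch))
    # Integer comparison avoids floating-point surprises exactly at 30%.
    if counted * 100 >= CJK_THRESHOLD_PERCENT * len(text):
        return CJK_CAP
    return LATIN_CAP


def fits(text: str) -> bool:
    """Return True if the message is within the cap chosen for its content."""
    return len(text) <= message_cap(text)
```

Under these assumptions, `message_cap("Hello, world")` picks the 20,000 cap and `message_cap("こんにちは世界")` picks the 10,000 cap, regardless of which interface language is active. Arabic or Thai text never matches a counted range, so it always takes the 20,000 cap, which is the simplification noted above.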