SMS & RCS Message Segment Calculator
Compare SMS and RCS segmentation in one place. See how your message will be split into billable segments and how encoding affects cost.
SMS Segment Limits Reference
| Encoding | Single SMS | Multi-part (per segment) |
|---|---|---|
| GSM-7 | 160 chars | 153 chars |
| Unicode (UCS-2) | 70 chars | 67 chars |
Multi-part messages lose 7 characters (GSM-7) or 3 characters (UCS-2) per segment to the User Data Header (UDH) used for reassembly.
What Are SMS Segments and Why Do They Matter?
An SMS segment is the unit carriers use to bill text messages. A single SMS can carry a limited number of characters, and when your message exceeds that limit, it gets split into multiple segments. Each segment is billed separately by your carrier or messaging provider (Twilio, Vonage, Plivo, etc.), so a message that spans 2 segments costs twice as much as one that fits in a single segment.
For businesses sending thousands of transactional or marketing SMS messages, even small differences in message length can have a significant impact on cost. Understanding segment math helps you write messages that stay within budget.
How to Use This Tool
- Type or paste your SMS message into the text area above. The analysis updates live as you type.
- Check the encoding indicator: green means GSM-7 (standard), amber means Unicode (UCS-2).
- Review the segment count and remaining character count to see if your message fits within a single segment.
- If Unicode is detected, check the warning box to see which characters triggered it. Remove or replace those characters to switch back to GSM-7 and reduce segment count.
- Use the preset buttons to quickly compare how encoding and length affect segment count.
How SMS Segments Work
When you send a short text message, it fits inside a single SMS Protocol Data Unit (PDU) and is delivered as one message. The capacity of that PDU depends on the character encoding used:
- GSM-7 encoding: 160 characters per single message.
- Unicode (UCS-2) encoding: 70 characters per single message.
When a message exceeds the single-message limit, the carrier splits it into multiple concatenated segments. Each segment includes a User Data Header (UDH) that contains reassembly instructions — telling the receiving phone the total number of parts and the order to stitch them together. This UDH uses 7 characters worth of space in GSM-7 (or 3 characters in Unicode), reducing the per-segment capacity:
- GSM-7 multi-part: 153 characters per segment (160 - 7 for UDH).
- Unicode multi-part: 67 characters per segment (70 - 3 for UDH).
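The limits above translate into a short calculation. The following is a minimal sketch, not this calculator's actual implementation; the names `LIMITS` and `segment_count` are illustrative, and the input is assumed to be a length already measured in the right unit for the encoding.

```python
import math

# Capacities from the limits above: (single message, per multi-part segment).
LIMITS = {"gsm7": (160, 153), "ucs2": (70, 67)}


def segment_count(units: int, encoding: str) -> int:
    """Estimate segments for a message of `units` length.

    `units` means GSM-7 characters (extended ones counted twice)
    or UTF-16 code units for UCS-2.
    """
    single, multipart = LIMITS[encoding]
    if units <= 0:
        return 0
    if units <= single:
        return 1
    # Past the single-message limit, every segment carries a UDH.
    return math.ceil(units / multipart)


print(segment_count(160, "gsm7"))  # 1
print(segment_count(161, "gsm7"))  # 2
print(segment_count(71, "ucs2"))   # 2
```

Treat the result as an estimate: a real splitter will not cut a two-unit character in half at a segment boundary, so the true count can occasionally be one higher.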
GSM-7 vs Unicode Encoding
GSM-7 is the default encoding for SMS. Its basic table has 128 code points covering the Latin alphabet (A-Z, a-z), digits (0-9), common punctuation, and a handful of accented characters (such as è, é, ù, ì, ò, à, the German umlauts, and Spanish ñ). There is also an "extended" GSM-7 table that includes characters like | ^ ~ [ ] { } \ €. These are supported, but each one counts as 2 characters because it requires an escape sequence.
Unicode (UCS-2) encoding kicks in the moment your message contains any character outside the GSM-7 set. This includes emoji, Chinese/Japanese/Korean characters, Arabic, Hindi, most accented characters beyond the GSM-7 set, curly quotes (“ ”), em dashes (—), and many other symbols. The entire message switches to Unicode — you cannot mix encodings within a single message.
This encoding switch is a common reason businesses are surprised by SMS costs. A 155-character message in GSM-7 fits in 1 segment, but add a single emoji and it becomes Unicode. The emoji itself adds 2 UTF-16 code units, so the message is now 157 code units long and needs 3 segments (157 / 67 = 2.34, rounded up to 3).
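Encoding detection can be sketched as a membership test against the two GSM-7 tables. The tables below are transcribed from the GSM 7-bit default alphabet and are worth checking against the specification before relying on them; the names `GSM7_BASIC`, `GSM7_EXT`, and `analyze` are illustrative and not part of any provider's API.

```python
# GSM-7 basic table (the escape code point itself is omitted).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extended table: each character is sent as escape + character (2 units).
GSM7_EXT = set("\f^{}\\[~]|€")


def analyze(text: str) -> tuple[str, int]:
    """Return (encoding, length in that encoding's counting unit)."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXT for ch in text):
        return "gsm7", sum(2 if ch in GSM7_EXT else 1 for ch in text)
    # Any character outside GSM-7 switches the whole message to UCS-2,
    # which is counted in UTF-16 code units (2 bytes each).
    return "ucs2", len(text.encode("utf-16-le")) // 2


print(analyze("Price: 5€"))                    # ('gsm7', 10)
print(analyze("a" * 155 + "\U0001F60A"))       # ('ucs2', 157)
```

The second call reproduces the example above: 155 plain characters plus one smiley come to 157 code units once the message is forced into Unicode.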
Emojis and Special Characters Can Use More Than 1 Character
In Unicode SMS, characters are counted as UTF-16 code units — not visible characters. Most common letters, digits, and symbols use 1 code unit each. But many emojis and special characters use 2 or more code units, meaning they consume extra space in your message even though they look like a single character on screen.
- Basic emojis like 😊 (smiling face), ❤️ (red heart), and 👍 (thumbs up) each use 2 UTF-16 code units. A single smiley takes up the same space as two regular letters.
- Skin-tone and gender-modified emojis like 👋🏽 (waving hand, medium skin tone) or 👨‍💻 (man technologist) can use 4-7 code units because they append a skin-tone modifier to a base emoji, combine several emoji with zero-width joiners, or both.
- Flag emojis like 🇺🇸 (US flag) and 🇬🇧 (UK flag) use 4 code units each — they are actually two regional indicator symbols combined.
- Family and couple emojis like 👨👩👧👦 can use 11 or more code units because they chain multiple person emojis together with joiners.
- Special symbols outside the basic multilingual plane (mathematical symbols, rare scripts, historic characters) also use 2 code units each.
This means a short, emoji-heavy message can consume far more segment space than it appears to visually. For example, five emojis that look like 5 characters might actually count as 10-25 code units toward your 70-character Unicode limit — potentially pushing a short message into multiple segments. Always check the character count in this calculator, not just the visible length of your message.
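One way to check these counts yourself is to encode the text as UTF-16 and divide the byte length by two. This is a small sketch; the helper name `utf16_units` and the sample dictionary are illustrative, and the emoji are written as escape sequences so the joiners and modifiers stay visible.

```python
def utf16_units(text: str) -> int:
    """Count UTF-16 code units, the unit used for Unicode SMS length."""
    return len(text.encode("utf-16-le")) // 2


samples = {
    "smiling face": "\U0001F60A",
    "red heart (heart + variation selector)": "\u2764\uFE0F",
    "waving hand, medium skin tone": "\U0001F44B\U0001F3FD",
    "man technologist (joined with ZWJ)": "\U0001F468\u200D\U0001F4BB",
    "US flag (two regional indicators)": "\U0001F1FA\U0001F1F8",
    "family of four (three ZWJs)": (
        "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"
    ),
}

for name, emoji in samples.items():
    print(f"{name}: {utf16_units(emoji)} code units")
```

Running this prints 2, 2, 4, 5, 4, and 11 code units respectively, matching the ranges listed above.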
Why One Emoji Can Double Your Message Cost
Consider a transactional SMS: "Your order #1234 has shipped! Estimated delivery: March 20. Track at example.com/t/abc123" — that is about 90 characters in GSM-7, fitting comfortably in 1 segment.
Now add a single smiley emoji at the end: "...abc123 😊". The emoji forces the entire message into Unicode encoding, and the single-segment limit drops from 160 to 70 characters. The message is now 92 code units long (the original text, a space, and 2 code units for the emoji), so it needs 2 segments (92 / 67 = 1.37, rounded up to 2). That one emoji doubled the cost of the message.
For a longer marketing message of 300 characters, switching from GSM-7 to Unicode increases the segment count from 2 (300 / 153 = 1.96) to 5 (300 / 67 = 4.48). That is a 2.5x cost increase from a single non-GSM-7 character.
Tips for Reducing SMS Length and Cost
- Avoid emoji in transactional SMS. Save emoji for marketing messages where the engagement value outweighs the extra segment cost. For order confirmations, shipping updates, and appointment reminders, stick to plain text.
- Watch for easy-to-miss Unicode characters. Copying text from Word, Google Docs, or email can introduce curly quotes (“ ” instead of straight quotes), em dashes, and non-breaking spaces, all of which trigger Unicode encoding. Always paste as plain text.
- Use a URL shortener. Long tracking URLs eat into your character count. Services like Bitly or your messaging provider's built-in link shortening can save 30-80 characters per link.
- Keep messages concise. Every character counts. Remove filler words, use abbreviations where appropriate, and front-load the most important information.
- Be aware of GSM-7 extended characters. Characters like | ^ ~ [ ] { } \ and the Euro sign (€) are in the GSM-7 extended table but count as 2 characters each. If you're near the 160-character boundary, these can push you into a second segment.
- Consider MMS for media-heavy messages. If you need emoji, images, or rich formatting, an MMS may be more cost-effective than a multi-segment Unicode SMS, depending on your carrier's pricing.
Smart Encoding: Automatic Unicode-to-GSM-7 Replacement
Some SMS providers — most notably Twilio — offer a feature called smart encoding that can automatically replace common Unicode characters with their GSM-7 equivalents before sending. This keeps your message in the cheaper GSM-7 encoding without requiring you to manually clean up every message.
Smart encoding typically handles substitutions like:
| Unicode character | GSM-7 replacement | Common source |
|---|---|---|
| “ ” (curly double quotes) | " " (straight quotes) | Word, Google Docs, CMS editors |
| ‘ ’ (curly single quotes) | ' ' (straight apostrophes) | Word, macOS auto-correct |
| — (em dash) | - (hyphen) | Word, markdown renderers |
| – (en dash) | - (hyphen) | Date ranges, Word auto-format |
| … (ellipsis character) | ... (three periods) | macOS/iOS auto-correct |
| (non-breaking space) | (regular space) | HTML editors, copy-paste from web |
How to enable it: On Twilio, set SmartEncoded=true in your API request when sending a message. Other providers may have similar features under different names — check your provider's documentation.
Limitations: Smart encoding only works for Unicode characters that have a reasonable GSM-7 equivalent. It cannot replace emoji, CJK characters, Arabic script, or other characters that have no GSM-7 counterpart. If your message contains any of those, it will still fall back to Unicode encoding regardless of smart encoding settings.
Best practice: Even with smart encoding enabled, it is better to write GSM-7-clean messages from the start. Smart encoding is a safety net for accidental Unicode characters introduced by copy-paste or rich text editors — not a substitute for careful message composition. Use this calculator to check your messages before sending, and rely on smart encoding as a fallback for edge cases.
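If you want to clean messages yourself before they reach the provider, the substitutions in the table above can be approximated locally. This is a hypothetical pre-processing sketch, not Twilio's implementation; the names `REPLACEMENTS` and `gsm7_clean` are made up for illustration, and the mapping covers only the characters listed in the table.

```python
# Unicode character -> GSM-7 replacement, following the table above.
REPLACEMENTS = {
    "\u201C": '"',   # left curly double quote
    "\u201D": '"',   # right curly double quote
    "\u2018": "'",   # left curly single quote
    "\u2019": "'",   # right curly single quote / apostrophe
    "\u2014": "-",   # em dash
    "\u2013": "-",   # en dash
    "\u2026": "...", # ellipsis character
    "\u00A0": " ",   # non-breaking space
}
_TABLE = str.maketrans(REPLACEMENTS)


def gsm7_clean(text: str) -> str:
    """Swap common word-processor characters for GSM-7 equivalents."""
    return text.translate(_TABLE)


print(gsm7_clean("\u201CHi\u201D \u2013 it\u2019s here\u2026"))
# "Hi" - it's here...
```

Characters with no GSM-7 counterpart, such as emoji, pass through unchanged, so a message containing them still ends up in Unicode. Note that replacing the ellipsis character adds two characters to the message length.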
Frequently Asked Questions
What is an SMS segment?
An SMS segment is the smallest billable unit of a text message. A single SMS can hold up to 160 GSM-7 characters or 70 Unicode characters. When your message exceeds that limit, the carrier splits it into multiple segments, each billed individually. For example, a 300-character GSM-7 message uses 2 segments (each holding up to 153 characters due to the reassembly header).
Why does adding an emoji increase my SMS cost?
Emoji are not part of the GSM-7 character set, so including even one emoji forces the entire message into Unicode (UCS-2) encoding. This drops the per-segment capacity from 160 to 70 characters (or 153 to 67 for multi-part messages), which can double or triple the number of segments required — and therefore the cost.
What's the difference between GSM-7 and Unicode?
GSM-7 is the standard encoding for SMS, supporting 128 characters including basic Latin letters, digits, and common punctuation. It allows 160 characters per segment. Unicode (UCS-2) supports virtually every character in every language, plus emoji, but it allows only 70 characters per segment. The entire message must use one encoding — if even one character requires Unicode, the whole message switches to Unicode.
How are multi-part SMS messages reassembled?
Each segment of a multi-part SMS includes a User Data Header (UDH) containing a reference number, the total number of parts, and the sequence number of that particular segment. The receiving phone uses this information to reassemble the segments in the correct order and display them as a single message. This header takes up 7 characters (GSM-7) or 3 characters (Unicode) per segment, which is why multi-part segment limits are lower than single-message limits.
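The 7 and 3 figures follow from the size of the header. Here is a sketch assuming the common concatenation header with an 8-bit reference number, which is 6 bytes long, and a 140-byte SMS payload; the function name `concat_udh` is illustrative.

```python
import math

PAYLOAD_OCTETS = 140  # user-data bytes available in one SMS


def concat_udh(ref: int, total: int, seq: int) -> bytes:
    """Build a 6-byte concatenation UDH (8-bit reference variant)."""
    if not (0 <= ref <= 255 and 1 <= total <= 255 and 1 <= seq <= total):
        raise ValueError("ref must fit in one byte and 1 <= seq <= total <= 255")
    # 0x05 = header length, 0x00 = concatenation element, 0x03 = element length,
    # followed by reference number, total parts, and this part's sequence number.
    return bytes([0x05, 0x00, 0x03, ref, total, seq])


udh = concat_udh(ref=0x2A, total=3, seq=1)
remaining = PAYLOAD_OCTETS - len(udh)          # 134 bytes left for text
gsm7_capacity = (remaining * 8) // 7           # 153 seven-bit characters
ucs2_capacity = remaining // 2                 # 67 two-byte code units
header_septets = math.ceil(len(udh) * 8 / 7)   # 7 characters' worth in GSM-7
print(udh.hex(), gsm7_capacity, ucs2_capacity, header_septets)
# 0500032a0301 153 67 7
```

The same 6 bytes cost 7 GSM-7 characters because 48 bits do not divide evenly into 7-bit units, while in UCS-2 they cost exactly 3 two-byte units.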
How much does each SMS segment cost?
Pricing varies by provider and destination country. On Twilio, a single segment to a US number costs approximately $0.0079 (as of 2024). A 3-segment message would cost 3 times that, or about $0.024. For high-volume senders, these per-segment costs add up quickly, which is why optimizing message length and encoding matters.
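The arithmetic scales linearly, which a few lines can make concrete. This sketch uses the per-segment figure quoted above purely as an assumed example price; `PRICE_PER_SEGMENT` and `campaign_cost` are illustrative names, and current pricing should come from your provider.

```python
from decimal import Decimal

# Assumed example price per segment, taken from the figure quoted above.
PRICE_PER_SEGMENT = Decimal("0.0079")


def campaign_cost(messages: int, segments_per_message: int,
                  price: Decimal = PRICE_PER_SEGMENT) -> Decimal:
    """Total cost when every message uses the same number of segments."""
    return messages * segments_per_message * price


print(campaign_cost(1, 3))        # 0.0237
print(campaign_cost(10_000, 1))   # 79.0000
print(campaign_cost(10_000, 3))   # 237.0000
```

`Decimal` is used instead of floats so that small per-segment prices multiply out without rounding noise.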
Can I use emojis and still keep costs low?
Yes, but you need to keep the total message very short. A Unicode SMS still fits in one segment if it is 70 UTF-16 code units or fewer, and each emoji counts as at least 2 of those. If your message with emoji exceeds that limit, consider whether the engagement benefit of the emoji outweighs the extra segment cost. For transactional messages (order confirmations, shipping alerts, OTP codes), it is almost always better to skip the emoji and stay in GSM-7.