EasySendSMS Help Center
EasySendSMS Knowledge Base

Unicode SMS

Find simple, clear, and practical answers for this EasySendSMS help category.

Articles in this category

Help Article

The Evolution and Impact of Unicode

Introduction to Unicode: Unicode is an international character encoding standard. It assigns a unique number, called a code point, to each character, regardless of language or script, so that text is interpreted consistently across platforms, applications, and devices.

Historical Perspective on Character Encoding: Before Unicode, there were many different character encodings, each assigning numbers to letters and symbols for computers to interpret. No single encoding had room for the characters of all the world's languages, and even the commonly used technical symbols, letters, and punctuation could not all fit into one of them. The encodings also conflicted with each other: the same number could represent different characters in two encodings, and the same character could have different numbers. Computers therefore had to support many encodings, and data was often corrupted when it passed between machines or encodings.

Realization of the Unicode Vision: In October 1991, the Unicode Consortium's goal of replacing these conflicting encodings with a single universal standard was realized with the release of version 1.0 of the Unicode Standard.

Fundamentals of Unicode: At its core, Unicode provides a unique number for every character it encodes, from punctuation, mathematical symbols, and arrows to non-Latin scripts such as Thai, Chinese, and Arabic. As a result, text can be transferred reliably across devices, applications, and platforms without corruption. Unicode has become the backbone of modern software: it is built into the major operating systems, web browsers, laptops, smartphones, and nearly every part of the internet.
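The conflict between legacy encodings described above can be sketched in a few lines of Python. This is only an illustration: the byte value 0xE9 and the two encodings (ISO 8859-1 and Windows-1251) are chosen for the example and do not come from the article.

```python
# One and the same byte, interpreted under two legacy encodings.
# The byte value and both encodings are illustrative choices.
raw = bytes([0xE9])

as_western = raw.decode("latin-1")   # ISO 8859-1 (Western European)
as_cyrillic = raw.decode("cp1251")   # Windows-1251 (Cyrillic)

# The same number yields two different characters.
print(as_western, as_cyrillic)

# In Unicode, each of those characters has its own code point,
# so there is no ambiguity about which one is meant.
for ch in (as_western, as_cyrillic):
    print(ch, f"U+{ord(ch):04X}")
```

A receiver that guesses the wrong legacy encoding displays the wrong character, which is the kind of corruption a single shared standard avoids.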
Guardians of Unicode: The Unicode Consortium, a non-profit organization, develops and promotes the Unicode Standard. The Standard is kept synchronized with the international standard ISO/IEC 10646, so both define the same characters with the same numbers. Both standards support three encoding forms: UTF-8, UTF-16, and UTF-32. All three cover the same character repertoire and can encode more than a million characters.

Deciphering Unicode SMS: A "Unicode SMS" is a message whose content includes at least one character outside the GSM-7 character set. A standard SMS holds up to 160 characters from the GSM-7 set, which comprises the basic Latin letters (A-Z, a-z), digits (0-9), and a limited set of special symbols and accented letters. Unicode can represent any encoded character, but in an SMS each character takes 16 bits instead of GSM-7's 7 bits. As a result, a single Unicode SMS is limited to 70 characters, and longer messages are split into multiple segments.
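The GSM-7 versus Unicode rule above can be sketched as a small Python helper. It is a rough estimate, not the EasySendSMS implementation: the function name `sms_segments` is hypothetical, and the per-segment sizes for concatenated messages (153 for GSM-7, 67 for Unicode) are the commonly used values rather than figures from this article. A real gateway may count differently at segment boundaries.

```python
# GSM 03.38 default alphabet (basic table) and its extension table.
# Characters from the extension table cost two 7-bit units each.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = set("^{}\\[~]|€\f")


def sms_segments(text):
    """Return (encoding, units, segments) for a message body.

    Hypothetical helper: 160/153 units per segment for GSM-7 and
    70/67 for Unicode are assumed, commonly used limits.
    """
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        encoding = "GSM-7"
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi = 160, 153
    else:
        encoding = "UCS-2"
        # Count 16-bit units; characters outside the BMP take two.
        units = len(text.encode("utf-16-be")) // 2
        single, multi = 70, 67

    if units <= single:
        segments = 1
    else:
        segments = -(-units // multi)  # ceiling division
    return encoding, units, segments


print(sms_segments("Hello"))
print(sms_segments("Привет"))
print(sms_segments("я" * 71))
```

A single non-GSM character is enough to switch the whole message to the 70-character limit, so it can be worth checking text for stray typographic quotes or emoji before sending.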

Help Article

The Power of the Unicode Standard

What is Unicode? The Unicode Standard is a universal approach to text representation. It makes it possible to send SMS messages containing characters from virtually any written language. EasySendSMS supports this standard in SMS messaging, with a limit of 70 characters per single message. Longer texts can be sent through the EasySendSMS APIs as concatenated SMS messages.

Understanding the Unicode Mechanism: Unlike the traditional 7-bit code, which is limited to 128 characters, Unicode's UTF-8 encoding form works with 8-bit "code units" and can combine up to four of them to represent one character. This multi-unit scheme raises the encoding capacity from 128 characters to 1,112,064, enough for virtually all of the world's languages to share a single character set. There are even unofficial code assignments for constructed scripts such as Klingon, in ranges set aside for private use, although Klingon has not been officially accepted into the Unicode Standard.

UTF-8 is also efficient: instead of spending four code units on every character, it uses only as many as the character needs. Take the capital letter "A" as an example: its code point, written in hexadecimal, is [0041], and UTF-8 stores it as the single byte [01000001] in binary rather than padding it to four bytes. Without this kind of optimization, the character capacity of a text would drop drastically. SMS itself carries Unicode text in a 16-bit form (UCS-2) rather than UTF-8, which is why a single Unicode message holds 70 characters rather than 160.
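The variable-length behaviour described above can be checked with Python's built-in codecs. This is a minimal sketch: the helper names `encoded_sizes` and `utf8_bits` are hypothetical, and the sample characters are arbitrary.

```python
def encoded_sizes(ch):
    """Bytes needed for one character in each Unicode encoding form."""
    return {
        name: len(ch.encode(name))
        for name in ("utf-8", "utf-16-be", "utf-32-be")
    }


def utf8_bits(ch):
    """UTF-8 bytes of a character, written out in binary."""
    return " ".join(f"{byte:08b}" for byte in ch.encode("utf-8"))


for ch in ("A", "é", "€", "😀"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch), utf8_bits(ch))
```

UTF-8 grows from one to four bytes as the code point gets larger, while UTF-32 always spends four bytes, which is the trade-off the paragraph above describes.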
