Class ilib.GlyphString
Create a new glyph string instance. This string inherits from the ilib.String class, and adds methods that allow you to access whole glyphs at a time.
In Unicode, various accented characters can be created by using a base character followed by one or more combining characters. These appear on the screen to the user as a single glyph. For example, the Latin character "a" (U+0061) followed by the combining diaeresis character "¨" (U+0308) forms the "a with diaeresis" glyph "ä", which looks like a single character on the screen.
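The difference between the decomposed and precomposed forms can be checked in plain JavaScript. This is a minimal sketch that uses only the standard String.prototype.normalize() method and does not depend on ilib; the variable names are illustrative.

```javascript
// "a" (U+0061) followed by the combining diaeresis (U+0308)
const decomposed = "a\u0308";
// The precomposed "a with diaeresis" character (U+00E4)
const precomposed = "\u00E4";

// Both render as one glyph, but the decomposed form occupies two UTF-16 code units.
const decomposedLength = decomposed.length;
const precomposedLength = precomposed.length;

// NFC normalization composes the pair into the single precomposed character.
const composedMatches = decomposed.normalize("NFC") === precomposed;
```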
The big problem with combining characters for web developers is that many CSS engines do not ellipsize text between glyphs. They only deal with single Unicode characters. If a particular space only allows for 4 characters, the CSS engine will truncate a string at 4 Unicode characters and then add the ellipsis (...) character. If the fourth Unicode character is the "a" and the fifth one is the diaeresis, then a string like "xxxäxxx" that is ellipsized at 4 characters will appear as "xxxa..." on the screen instead of "xxxä...".
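The failure described above can be reproduced without a browser. The sketch below only simulates a character-based cut with String.prototype.slice; it is not how any particular CSS engine is implemented, and the variable names are illustrative.

```javascript
// "xxx" + "a" + combining diaeresis (U+0308) + "xxx"
const text = "xxxa\u0308xxx";

// Cutting after 4 UTF-16 code units separates the "a" from its accent.
const cutBetweenCharacters = text.slice(0, 4) + "...";

// Cutting after 5 code units keeps the fourth glyph whole.
const cutBetweenGlyphs = text.slice(0, 5) + "...";
```

The first result displays as "xxxa..." and the second as "xxxä...", which is the difference a glyph-aware truncation is meant to preserve.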
In the Latin script as it is commonly used, accented characters are rarely formed with combining accents, so the above example is mostly illustrative. It is not unheard of, however. The situation is much worse in scripts such as Thai and Devanagari, which normally make very heavy use of combining characters. These scripts do so because Unicode does not include pre-composed versions of the accented characters as it does for Latin, so combining characters are the only way to create these accented and combined forms.
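The lack of pre-composed forms can be observed with Unicode normalization. The following sketch uses only standard JavaScript; the two sample syllables are chosen for illustration and are not taken from the ilib documentation.

```javascript
// Latin: "a" + combining diaeresis has a precomposed equivalent (U+00E4).
const latin = "a\u0308";
// Devanagari: letter "ka" (U+0915) + vowel sign "i" (U+093F)
const devanagari = "\u0915\u093F";
// Thai: letter "ko kai" (U+0E01) + vowel sign "sara i" (U+0E34)
const thai = "\u0E01\u0E34";

// NFC composes the Latin pair into one character, but leaves the other
// two sequences at two characters because no precomposed form exists.
const latinLength = latin.normalize("NFC").length;
const devanagariLength = devanagari.normalize("NFC").length;
const thaiLength = thai.normalize("NFC").length;
```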
The solution to this problem is never to use the CSS property "text-overflow: ellipsis" in your web site. Instead, use a glyph string to truncate text between glyphs rather than between characters.
Glyph strings are also useful for hyphenation and line wrapping, which, like truncation, should be done between glyphs instead of between characters.
The options parameter is optional, and may contain any combination of the following properties:
- onLoad - a callback function to call when the locale data is fully loaded. When the onLoad option is given, this object will attempt to load any missing locale data using the ilib loader callback. When the constructor is done (even if the data is already preassembled), the onLoad function is called with the current instance as a parameter, so this callback can be used with preassembled or dynamic loading or a mix of the two.
- sync - specifies whether to load any missing locale data synchronously or asynchronously. If this option is given as "false", then the "onLoad" callback must be given, as the instance returned from this constructor will not be usable for a while.
- loadParams - an object containing parameters to pass to the loader callback function when locale data is missing. The parameters are not interpreted or modified in any way. They are simply passed along. The object may contain any property/value pairs as long as the calling code is in agreement with the loader callback function as to what those parameters mean.
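As a rough sketch of how these options fit together, the helper below builds an options object for asynchronous loading. The helper name createGlyphStringAsync, its ilibNamespace parameter, and the contents of loadParams are assumptions made for illustration; only the constructor signature and the onLoad, sync, and loadParams option names come from the description above.

```javascript
// Hypothetical helper, not part of ilib. `ilibNamespace` is assumed to be
// the loaded ilib object that exposes the GlyphString constructor.
function createGlyphStringAsync(ilibNamespace, text, onReady) {
    const options = {
        // Load any missing locale data asynchronously.
        sync: false,
        // Passed through untouched to the loader callback; the property
        // shown here is made up and must match what your loader expects.
        loadParams: { source: "example" },
        // Called with the new instance once it is usable.
        onLoad: function (glyphString) {
            onReady(glyphString);
        }
    };
    return new ilibNamespace.GlyphString(text, options);
}
```

Because sync is false here, the returned instance should not be used directly; code that needs the glyph string belongs in the onReady callback.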
Defined in: ilib-dyn-full.js.
Constructor Attributes | Constructor Name and Description |
---|---|
 | ilib.GlyphString(str, options) |
Method Attributes | Method Name and Description |
---|---|
 | Return an iterator that will step through all of the characters in the string one at a time, taking care to step through decomposed characters and through surrogate pairs in the UTF-16 encoding as single characters. |
 | ellipsize(length): Truncate the current string at the given number of glyphs and add an ellipsis to indicate that there is more to the string. |
 | truncate(length): Truncate the current string at the given number of whole glyphs and return the resulting string. |
- Parameters:
- {string|ilib.String=} str
- initialize this instance with this string
- {Object=} options
- options governing the way this instance works
The GlyphString class will return decomposed Unicode characters as a single unit that a user might see on the screen as a single glyph. If the next character in the iteration is a base character and it is followed by combining characters, the base and all its following combining characters are returned as a single unit.
The standard JavaScript String's charAt() method only returns information about a particular 16-bit code unit in the UTF-16 encoding scheme. If the index points to a low or high surrogate, it returns that single surrogate rather than the surrogate pair that represents a character in the supplementary planes.
The iterator instance returned has two methods: hasNext(), which returns true if the iterator has more characters to iterate through, and next(), which returns the next character.
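The charAt() behavior can be seen with any character outside the Basic Multilingual Plane. This sketch uses U+1F600 as an arbitrary example and relies only on standard String methods.

```javascript
// U+1F600 lies in a supplementary plane, so UTF-16 stores it as a surrogate pair.
const face = "\u{1F600}";

const unitCount = face.length;             // number of UTF-16 code units
const highUnit = face.charCodeAt(0);       // 0xD83D, the high surrogate
const lowUnit = face.charCodeAt(1);        // 0xDE00, the low surrogate
const isolated = face.charAt(0);           // a lone surrogate, not the full character
const fullCodePoint = face.codePointAt(0); // 0x1F600, the whole character
```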
- Returns:
- {Object} an iterator that iterates through all the characters in the string
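To illustrate the hasNext()/next() protocol, here is a simplified stand-in written in plain JavaScript. It is not ilib's implementation: the function name glyphIterator is hypothetical, and it approximates a glyph as one non-mark code point followed by any Unicode combining marks (the \p{M} property), which is coarser than full grapheme cluster rules.

```javascript
// Hypothetical stand-in for the iterator described above; not ilib's own code.
function glyphIterator(str) {
    // One unit is a code point that is not a combining mark plus any marks
    // that follow it. The "u" flag makes the regex work on whole code points,
    // so a surrogate pair is treated as one character. A stray leading mark
    // forms a unit of its own.
    const units = str.match(/\P{M}\p{M}*|\p{M}+/gu) || [];
    let index = 0;
    return {
        hasNext: function () {
            return index < units.length;
        },
        // Returning undefined when exhausted is a choice made for this sketch.
        next: function () {
            return index < units.length ? units[index++] : undefined;
        }
    };
}

// "x", then "a" + combining diaeresis, then a supplementary-plane character
const it = glyphIterator("xa\u0308\u{1F600}");
const seen = [];
while (it.hasNext()) {
    seen.push(it.next());
}
```

After the loop, seen holds three units even though the input string is five UTF-16 code units long.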
ellipsize(length)
- Parameters:
- {number} length
- the number of whole glyphs to keep in the string including the ellipsis
- Returns:
- {string} a string truncated to the requested number of glyphs with an ellipsis
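The way the ellipsis counts toward the requested length can be sketched in plain JavaScript. The helper ellipsizeSketch below is hypothetical and is not ilib's code; it assumes a glyph is a base code point plus its combining marks, that the ellipsis is the single character U+2026, and that the ellipsis is always appended.

```javascript
// Hypothetical helper illustrating the length rule; not ilib's implementation.
function ellipsizeSketch(str, length) {
    // Split into approximate glyphs: a non-mark code point plus following marks.
    const glyphs = str.match(/\P{M}\p{M}*|\p{M}+/gu) || [];
    // The ellipsis counts toward the requested length, so keep one glyph fewer.
    const keep = Math.max(length - 1, 0);
    return glyphs.slice(0, keep).join("") + "\u2026";
}

// Requesting 5 glyphs keeps four whole glyphs and adds the ellipsis.
const shortened = ellipsizeSketch("xxxa\u0308xxx", 5);
```

Here shortened is "xxxä…", with the accent still attached to its base character.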
truncate(length)
- Parameters:
- {number} length
- the number of whole glyphs to keep in the string
- Returns:
- {string} a string truncated to the requested number of glyphs
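For comparison with a plain substring, glyph-based truncation can be sketched as follows. The helper truncateSketch is hypothetical and only approximates glyph boundaries with the Unicode combining mark property; ilib's own boundary rules may differ.

```javascript
// Hypothetical helper illustrating truncation between glyphs; not ilib's code.
function truncateSketch(str, length) {
    // Split into approximate glyphs: a non-mark code point plus following marks.
    const glyphs = str.match(/\P{M}\p{M}*|\p{M}+/gu) || [];
    return glyphs.slice(0, Math.max(length, 0)).join("");
}

const sample = "xxxa\u0308xxx";
const byGlyph = truncateSketch(sample, 4);   // keeps the accent with its base
const byCodeUnit = sample.substring(0, 4);   // drops the accent

// Devanagari: "ka" + vowel sign "i", "ta" + vowel sign "aa", "ba"
const hindi = "\u0915\u093F\u0924\u093E\u092C";
const twoSyllables = truncateSketch(hindi, 2);
```

In this sketch byGlyph is five code units long because the combining diaeresis stays with the "a", while byCodeUnit ends in a bare "a".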