The Unicode standard is derived originally from ASCII. ASCII is a 7 bit coded character set with a limited repertoire of 128 character values. Unicode is a 16 bit coding that supports all the required characters for the major languages of the world, plus the technical symbols in common use.
Implementations of JavaScript that conform to the ECMA 262 standard must interpret character values in accordance with the Unicode Standard, version 2.0, and ISO/IEC 10646-1, with UCS-2 (the 2-byte Universal Character Set) as the adopted encoding form, implementation level 3. If the adopted ISO/IEC 10646-1 subset is not otherwise indicated, it is presumed to be the BMP subset, collection 300.
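A minimal sketch of how these 16-bit character values surface in script source, assuming any interpreter that conforms to edition 2 or later; the variable names are purely illustrative:

// Characters can be written with \uXXXX escapes, and charCodeAt()
// returns the 16-bit (UCS-2) code unit value at a given position.
var eAcute = "\u00E9";                      // Latin small letter e with acute
var code   = eAcute.charCodeAt(0);          // 233 (0x00E9)
var alpha  = "A".charCodeAt(0);             // 65, the same as its ASCII value
var omega  = String.fromCharCode(0x03A9);   // Greek capital letter omega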
For most usage, the character set will be the lower 128 characters, which correspond to the ASCII character table. However, internationalization and localization work in progress suggests that non-English speaking users should be able to declare identifiers in their own natural language, with its particular character sets and special symbols. Internet domain name standards are also undergoing revision to support characters outside the ASCII range. ECMAScript edition 3 improves support for handling localized strings, numbers and dates.
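A brief sketch of the locale-sensitive methods that edition 3 adds; the exact output depends entirely on the host implementation and its locale settings, so the values shown in the comments are only examples:

// Locale-sensitive formatting and comparison added in ECMAScript edition 3.
var price = 1234.5;
var text  = price.toLocaleString();            // e.g. "1,234.5" or "1.234,5"
var order = "caf\u00E9".localeCompare("cafe"); // sign depends on the collation rules in use
var stamp = (new Date()).toLocaleString();     // locale-formatted date and time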
The Unicode standard has moved on since version 2.0, which is somewhat old now and does not include the Euro currency symbol among other things; the Euro sign was added in version 2.1, and a version 3.0 standard has since been published.
Unicode characters are more properly referred to as code points.
ECMA 262 edition 2 - section 2
ECMA 262 edition 2 - section 6
ECMA 262 edition 3 - section 2
ECMA 262 edition 3 - section 6
O'Reilly JavaScript Definitive Guide - page 30