Unicode Codes

In databases created with 4D version 11 or later, both the language and the database engine store and process text natively as Unicode characters.

This facilitates the internationalization of 4D applications. Unicode is a standard, unified character set that can represent practically every written language in common use. A character set is a table that maps each character to a number value, for example “a”->97, “b”->98, “5”->53, “œ”->339, and so on. Whereas ASCII values lie between 0 and 127, Unicode values range up to 1,114,111 (hexadecimal 10FFFF), which means that nearly every character of every language can be represented.
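As a rough illustration of the character/number correspondence, the following sketch prints the Unicode value of a few characters. It is written in Python purely for demonstration; it is not 4D code, and the variable names are arbitrary.

```python
# Illustration only (Python, not 4D): ord() returns the Unicode
# number value of a character and chr() does the reverse lookup.
samples = ["a", "b", "5", "œ"]

# Build the character -> number table for the sample characters.
code_points = {ch: ord(ch) for ch in samples}

for ch, value in code_points.items():
    print(f"{ch!r} -> {value}")

# Converting each number back yields the original characters.
round_trip = [chr(value) for value in code_points.values()]
```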

There are several ways to encode Unicode number values: UTF-32 stores each value in one 32-bit integer, UTF-16 uses one or two 16-bit integers per value, and UTF-8 uses one to four 8-bit integers (bytes) per value. 4D mainly uses UTF-16 (like Windows and Mac OS).
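To make the difference between the encodings concrete, the sketch below counts how many integers (code units) each encoding needs for a given character. It uses Python's standard codecs for illustration only; the helper name code_units is hypothetical and nothing here reflects how 4D is implemented internally.

```python
# Illustration only (Python, not 4D). code_units is a hypothetical helper.
def code_units(text: str) -> dict:
    """Return the number of code units needed to encode text."""
    return {
        "UTF-8": len(text.encode("utf-8")),            # 8-bit units
        "UTF-16": len(text.encode("utf-16-le")) // 2,  # 16-bit units
        "UTF-32": len(text.encode("utf-32-le")) // 4,  # 32-bit units
    }

for ch in ["a", "é", "€", "😀"]:
    print(f"U+{ord(ch):04X}", code_units(ch))
```

Characters above 65,535, such as the emoji in the last sample, take two 16-bit integers in UTF-16.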

Sometimes, mainly for needs related to the Internet, 4D uses UTF-8, which is more compact for text made up mostly of common characters (a-z, 0-9) and encodes those characters exactly as ASCII does, so they remain readable in tools that expect ASCII.
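The size advantage of UTF-8 for common characters can be checked with a small sketch. Python is used here for illustration only, and the sample string is arbitrary.

```python
# Illustration only (Python, not 4D): compare encoded sizes of a
# string made up entirely of common characters.
text = "Hello 4D 2024"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# For such text, the UTF-8 bytes are identical to the ASCII bytes.
ascii_identical = utf8 == text.encode("ascii")

print(len(utf8), "bytes in UTF-8")
print(len(utf16), "bytes in UTF-16")
print("Same bytes as ASCII:", ascii_identical)
```

The advantage shrinks or reverses for text dominated by characters outside the ASCII range, since those take two to four bytes each in UTF-8.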

For more information about the Unicode standard, please refer, for example, to the following page:
http://en.wikipedia.org/wiki/Unicode

A list of Unicode codes:
http://en.wikipedia.org/wiki/List_of_Unicode_characters

Warning: In Unicode mode in 4D v11, the following character codes are reserved and must never be included in a text:
0
65534 (FFFE)
65535 (FFFF)
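
One way to guard against the reserved codes listed above is to scan text before handing it to the database. The following is a minimal sketch in Python, not 4D code; the names RESERVED_CODES, find_reserved and strip_reserved are hypothetical, and positions are 0-based Python indexes.

```python
# Illustration only (Python, not 4D). All names are hypothetical.
RESERVED_CODES = frozenset({0, 0xFFFE, 0xFFFF})  # 0, 65534, 65535

def find_reserved(text: str) -> list:
    """Return (position, code) pairs for each reserved character."""
    return [
        (index, ord(ch))
        for index, ch in enumerate(text)
        if ord(ch) in RESERVED_CODES
    ]

def strip_reserved(text: str) -> str:
    """Return text with every reserved character removed."""
    return "".join(ch for ch in text if ord(ch) not in RESERVED_CODES)

sample = "ab\x00c\uffff"
print(find_reserved(sample))
print(strip_reserved(sample))
```

Whether to reject or silently strip such characters depends on the application; the sketch shows both options.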

Compatibility Note: Databases created with a version of 4D prior to version 11 can function in ASCII compatibility mode. For more information, please refer to the EXPORT TEXT section.