Formatted Numeric Data
International Components for Unicode (ICU) is IBM's toolkit for internationalising applications. Initially developed for Java, it has been ported to C/C++.
IBM's ICU
Useful Links
- Home page
- Source Forge
- Licence
- Documentation
- Doxygen API
- ISO Language Codes
- ISO Country Codes
- ISO Currency Codes
- ICU Decimal Format Syntax
- ICU Date/Time Format Syntax
Getting started
Download these files from IBM:
- Source: icu-3.4.1.zip OR icu-3.4.1.tgz
- Documentation: icu-3.4-docs.zip
- User Guide: icu-3.4-userguide.zip
Compile these solutions:
- icu/source/allinone/allinone.sln
- icu/source/samples/all/all.sln
Fix the following errors:
- For projects strsrch and coll
- Move:
if (pOpt->name == 0) { fprintf(stderr, "Unrecognized option \"%s\"\n", pArgName); return FALSE; }
- Upward just after the line:
for (OptSpec *pOpt = opts; pOpt->name != 0; pOpt ++) {
- For project legacy
- Remove the dependencies on:
- ../../../../icu-1-8-1/lib/icuucd.lib
- ../../../../icu-1-8-1/lib/icuind.lib
- For project GDIFontInstance
- Cast the 6th parameter: (LPCWSTR) &ttGlyphs[dyStart]
Please note that samples will NOT run in their compilation directories. In order to run any of the samples you must place the executable within the icu/bin directory, where the required DLLs are situated.
Incorporating ICU within an existing project
Additional include directories:
- $(ICU)\include
Additional library directories:
- $(ICU)\lib
Link with:
- debug: icuucd.lib icuind.lib
- release: icuuc.lib icuin.lib
Make certain these DLLs are present with your executable:
- debug: icuuc34d.dll icuin34d.dll icudt34.dll
- release: icuuc34.dll icuin34.dll icudt34.dll
CEGUI's ICU
IBM's ICU provides many features to support internationalisation. The class I've created encapsulates some (not all) of these features.
Locale
A Locale describes the rules in effect within a country for a specific language. These rules specify the formatting of currencies, numeric values, dates, and other items. Unfortunately these rules do not take into account any modifications made within the Regional Settings of the Windows Control Panel. However, most functions accept a formatting mask. This allows applications to recreate some of the features of the Control Panel, letting users specify their desired formats. The setLocale() and setCurrency() functions configure the class to adopt the rules of the given language, country, and currency.
formatNumber
The formatNumber() function formats a numeric value according to the specified mask. Three versions are available. The version accepting a CEGUI::String has been customised to allow large numbers to be formatted: numbers having up to 18 integer digits and 7 decimal digits. The current implementation of this function is limited to accepting a single format, for positive numbers. If the numeric value to be formatted is negative then a negative sign will precede the formatted value. However, it cannot format a negative value within parentheses, for example formatting -1234.56 as (1,234.56).
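Since a negative value is marked with a leading sign, the parenthesised form could be produced by post-processing the formatted text. The following is only a minimal sketch of that idea: negativeToParenthesesSketch() is a hypothetical helper that is not part of the class, it works on std::string rather than CEGUI::String, and it assumes the sign is a plain '-' in the first position.

```cpp
#include <cassert>
#include <string>

// Hypothetical post-processing helper; not part of the class described here.
// Assumes the formatted text marks a negative value with a leading '-'.
std::string negativeToParenthesesSketch(const std::string& formatted)
{
    if (!formatted.empty() && formatted[0] == '-') {
        // Drop the sign and wrap the remainder in parentheses.
        return "(" + formatted.substr(1) + ")";
    }
    return formatted;  // Values without a leading sign pass through unchanged.
}

// Usage check on text that has already been formatted.
void checkNegativeToParenthesesSketch()
{
    assert(negativeToParenthesesSketch("-1,234.56") == "(1,234.56)");
    assert(negativeToParenthesesSketch("1,234.56") == "1,234.56");
}
```

A locale that places the sign elsewhere, or uses a different sign character, would need a different test than the first-character check used here.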
numberToText
The numberToText() function converts a numeric value into a textual representation. For example, the numeric value 123 is converted into "one hundred twenty-three" in English and "cent vingt-trois" in French.
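To illustrate the kind of conversion involved, here is a small English-only sketch limited to the range 0 to 999. It is not how ICU or numberToText() performs the conversion; englishNumberToTextSketch() is a hypothetical name, and values outside the range are simply returned as digits.

```cpp
#include <cassert>
#include <string>

// Hypothetical English-only illustration; handles 0..999 and nothing else.
std::string englishNumberToTextSketch(int n)
{
    static const char* const units[] = {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
    static const char* const tens[] = {
        "", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"};

    // Outside the range of this sketch: fall back to plain digits.
    if (n < 0 || n > 999) return std::to_string(n);

    std::string text;
    if (n >= 100) {
        text = std::string(units[n / 100]) + " hundred";
        n %= 100;
        if (n == 0) return text;  // Exact hundreds need nothing more.
        text += ' ';
    }
    if (n < 20) {
        text += units[n];
    } else {
        text += tens[n / 10];
        if (n % 10 != 0) {
            text += '-';  // Hyphenate compounds such as "twenty-three".
            text += units[n % 10];
        }
    }
    return text;
}

// Usage check matching the English example in the text.
void checkEnglishNumberToTextSketch()
{
    assert(englishNumberToTextSketch(123) == "one hundred twenty-three");
}
```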
numberToOrdinal
The numberToOrdinal() function converts a numeric value into an ordinal representation. For example, the numeric value 1 is converted into "1st" in English. Support for other languages is either buggy or lacking (or I have improperly coded this feature).
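For English alone, the ordinal suffix can be derived from the last two digits of the value. The sketch below shows that rule; englishOrdinalSketch() is a hypothetical helper, not the ICU mechanism used by numberToOrdinal(), and it makes no attempt to cover other languages.

```cpp
#include <cassert>
#include <string>

// Hypothetical English-only helper; not the ICU-based implementation.
std::string englishOrdinalSketch(long long n)
{
    // Work on the magnitude of the last two digits (the remainder may be negative).
    long long lastTwo = n % 100;
    if (lastTwo < 0) lastTwo = -lastTwo;
    const long long last = lastTwo % 10;

    const char* suffix = "th";
    // 11, 12 and 13 keep "th"; otherwise the final digit decides.
    if (lastTwo < 11 || lastTwo > 13) {
        if (last == 1) suffix = "st";
        else if (last == 2) suffix = "nd";
        else if (last == 3) suffix = "rd";
    }
    return std::to_string(n) + suffix;
}

// Usage check matching the example in the text, plus the "teens" exception.
void checkEnglishOrdinalSketch()
{
    assert(englishOrdinalSketch(1) == "1st");
    assert(englishOrdinalSketch(12) == "12th");
    assert(englishOrdinalSketch(23) == "23rd");
}
```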
formatText
The formatText() function will parse a numeric value and sprinkle its digits into the slots specified by the formatting mask. A North American telephone number of "12223334444" can be formatted into "1 (222) 333-4444" with the mask "0 (000) 000-0000". This non-localised function accepts three format specifiers. A "0" represents a forced digit. If the numerical value possesses a digit at that position then the digit will be displayed, otherwise the placeholder character(s) will be used. A "#" represents a potential digit. If the numerical value possesses a digit at that position then the digit will be displayed, otherwise nothing is displayed. Finally, the apostrophe "'" allows the mask to specify the characters "0", "#", or "'" literally, rather than using them as format specifiers.
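The mask rules above can be sketched in a few lines of standard C++. This is not the actual formatText() code: formatTextSketch() is a hypothetical stand-in using std::string, and several details are assumptions, namely that digits are aligned to the right of the mask, that surplus leading digits are discarded, that the placeholder is a single '_' character, and that an apostrophe escapes only the one character that follows it.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for formatText(); alignment, placeholder and
// escaping behaviour are assumptions, not taken from the real class.
std::string formatTextSketch(const std::string& value,
                             const std::string& mask,
                             char placeholder = '_')
{
    // Keep only the digits of the input value.
    std::string digits;
    for (unsigned char c : value)
        if (std::isdigit(c)) digits.push_back(static_cast<char>(c));

    // Tokenise the mask; an apostrophe turns the next character into a literal.
    enum class Kind { Forced, Optional, Literal };
    struct Token { Kind kind; char ch; };
    std::vector<Token> tokens;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        const char c = mask[i];
        if (c == '\'' && i + 1 < mask.size())
            tokens.push_back({Kind::Literal, mask[++i]});
        else if (c == '0')
            tokens.push_back({Kind::Forced, c});
        else if (c == '#')
            tokens.push_back({Kind::Optional, c});
        else
            tokens.push_back({Kind::Literal, c});
    }

    // Fill the slots from right to left so the digits are right-aligned.
    std::string reversed;
    std::size_t remaining = digits.size();
    for (auto it = tokens.rbegin(); it != tokens.rend(); ++it) {
        if (it->kind == Kind::Literal)
            reversed.push_back(it->ch);
        else if (remaining > 0)
            reversed.push_back(digits[--remaining]);
        else if (it->kind == Kind::Forced)
            reversed.push_back(placeholder);
        // An Optional slot with no digit left produces nothing.
    }
    return std::string(reversed.rbegin(), reversed.rend());
}

// Usage check with the telephone number example from the text.
void checkFormatTextSketch()
{
    assert(formatTextSketch("12223334444", "0 (000) 000-0000") == "1 (222) 333-4444");
    assert(formatTextSketch("3334444", "0 (000) 000-0000") == "_ (___) 333-4444");
}
```

Note that literal characters are always emitted in this sketch, so a mask made of "#" slots can leave empty brackets behind when the value is short.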
formatCurrency
The formatCurrency() function will format a numerical value according to the currently configured locale and currency. It will NOT convert the monetary value of one currency to another.
formatDateTime
The formatDateTime() function will format a date, a time, or a date/time according to the specified mask. The UDate variable type is a double. According to IBM's documentation, "A UDate value is stored as UTC time in milliseconds, which means it is calendar and time zone independent. UDate is the most compact and portable way to store and transmit a date and time." However, I have found that attempting to specify a date AND a time within a single UDate value results in inaccuracies of up to 1 minute 47 seconds. My solution is to use two UDate variables, one to hold the date and another to hold the time.
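One way to picture the two-variable approach is to split a millisecond count into a whole-day part and a time-of-day part. The sketch below is only an illustration under assumptions: UDateSketch is a plain double standing in for ICU's UDate, the date part is taken to be midnight UTC of the day, the time part is the number of milliseconds since that midnight, and splitDateTime() and joinDateTime() are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for ICU's UDate (a double holding UTC milliseconds).
using UDateSketch = double;

constexpr double kMillisPerDay = 24.0 * 60.0 * 60.0 * 1000.0;

// Hypothetical pair: 'date' is midnight UTC of the day, 'time' is the
// number of milliseconds elapsed since that midnight.
struct DateAndTimeSketch {
    UDateSketch date;
    UDateSketch time;
};

// Split a combined value into its whole-day and time-of-day parts.
DateAndTimeSketch splitDateTime(UDateSketch value)
{
    const double date = std::floor(value / kMillisPerDay) * kMillisPerDay;
    return {date, value - date};
}

// Recombine the two parts into a single millisecond count.
UDateSketch joinDateTime(const DateAndTimeSketch& parts)
{
    return parts.date + parts.time;
}

// Usage check: two days plus 01:02:03 after the epoch.
void checkSplitDateTime()
{
    const UDateSketch value = 2.0 * kMillisPerDay + 3723000.0;
    const DateAndTimeSketch parts = splitDateTime(value);
    assert(parts.date == 172800000.0);
    assert(parts.time == 3723000.0);
    assert(joinDateTime(parts) == value);
}
```

Using floor rather than truncation keeps the time part non-negative for values before the epoch.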
localToGmt and gmtToLocal
The localToGmt() and gmtToLocal() functions convert a date and a time between their local and GMT values.
CeguiStringToDateTime
The CeguiStringToDateTime() function will parse a string specifying a date or a time into its corresponding UDate value. Although it is possible to parse a string containing both a date and a time, the resulting UDate value will be inaccurate, varying from its intended value by up to 1 minute 47 seconds. A better approach is to keep the date and the time separate.