Codepage-dependent UTF-8 -> ASCII conversion (solution!)

Posted: Fri Sep 01, 2006 15:46
by nebukadnezzar
Hi folks,

I've written a small function that converts a CEGUI::String to a std::string according to the codepage, so your umlauts are converted too!

The disadvantage of this function:
it depends on the ICU library

The advantage:
it is portable (since ICU is available for *nix, too)

Code: Select all

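#include "unicode/ucnv.h"

#include <string>
#include <vector>

// Converts a CEGUI::String to a std::string in the target codepage.
std::string CEGUIToASCII(const CEGUI::String& cestr)
{
    UConverter *ceguiconv = NULL, *asciiconv = NULL;
    UErrorCode status = U_ZERO_ERROR;
    std::string result;

    if (cestr.size() == 0) return result;

    // NULL selects ICU's default converter.
    asciiconv = ucnv_open(NULL, &status);
    ceguiconv = ucnv_open("utf-8", &status);

    if (U_SUCCESS(status))
    {
        // c_str() yields the string encoded as UTF-8.
        int32_t sourcelen = cestr.utf8_stream_len();
        const char* source = cestr.c_str();
        const char* sourcelimit = source + sourcelen;

        int32_t targetlen = UCNV_GET_MAX_BYTES_FOR_STRING(cestr.length(), ucnv_getMaxCharSize(asciiconv));
        std::vector<char> tmp(targetlen + 1);
        char* target = &tmp[0];
        char* targetlimit = target + targetlen;

        // The conversion call is missing from the listing as posted;
        // ucnv_convertEx (using ICU's internal pivot buffer) is assumed here.
        ucnv_convertEx(asciiconv, ceguiconv, &target, targetlimit,
                       &source, sourcelimit, NULL, NULL, NULL, NULL,
                       true, true, &status);

        if (U_SUCCESS(status))
            result = std::string(&tmp[0], target); // only the bytes actually written
        else
            result = "Error while converting string";
    }
    else
        result = "Error setting up the converters";

    if (asciiconv) ucnv_close(asciiconv);
    if (ceguiconv) ucnv_close(ceguiconv);

    return result;
}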

Posted: Mon Sep 04, 2006 19:59
by TobiasKilian
Hmm, I'd really like to have this feature, but it doesn't work for me:

Code: Select all

   CEGUI::MultiLineEditbox * textWindow = static_cast<CEGUI::MultiLineEditbox *>(wmgr.getWindow((CEGUI::utf8*)"My/Text"));

   CEGUI::String ceguiStr = textWindow->getText();
   String ogreStr = CEGUIToASCII( ceguiStr);

   LogManager::getSingleton().logMessage( "Converting CEGUI to OGRE String:\nIn CEGUI:");
   LogManager::getSingleton().logMessage( ceguiStr.c_str() );
   LogManager::getSingleton().logMessage( "In OGRE: \n");   
   LogManager::getSingleton().logMessage( ogreStr );
   LogManager::getSingleton().logMessage( "\n");

This gives me the following output when my GUI shows "König":

21:52:48: König
21:52:48: In OGRE:
21:52:48: Knig

Posted: Mon Sep 04, 2006 20:47
by nebukadnezzar
This is the output from your ogre.log, I assume...

And since your "ö" was transcoded from two bytes in the CEGUI::String
to one byte in the std::string, the transcoding seems to work...

Your issue could be the encoding/codepage of your log file, which can't display the "ö".

Try setting a breakpoint and looking in the debugger at what the string contains...
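If a debugger is not at hand, dumping the raw bytes as hex also sidesteps the log file's encoding. The following is only a minimal sketch; `hexDump` is a hypothetical helper name, not part of CEGUI, OGRE, or ICU.

```cpp
#include <cstdio>
#include <string>

// Returns the bytes of s as space-separated lowercase hex, e.g. "4b f6".
// hexDump is a hypothetical helper, not a CEGUI/OGRE/ICU function.
std::string hexDump(const std::string& s)
{
    std::string out;
    char buf[4];
    for (std::string::size_type i = 0; i < s.size(); ++i)
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02x", value);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Logging `hexDump(ogreStr)` instead of `ogreStr` shows what the string really holds: "König" is `4b c3 b6 6e 69 67` in UTF-8 and `4b f6 6e 69 67` in ISO-8859-1, so a single `f6` byte means the transcoding worked.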

If it really doesn't work correctly, try selecting the codepage yourself by replacing the first line below with the second:


Code: Select all

asciiconv = ucnv_open(NULL, &status);


Code: Select all

asciiconv = ucnv_open("ISO-8859-15", &status);

This uses the ISO-8859-15 codepage, for example, but you can pick whichever codepage you want.

(If you pass NULL, like I did, ICU determines the codepage itself,
but I don't know how...)

Posted: Tue Sep 05, 2006 04:16
by Rackle
There's some more ICU usefulness within this CEGUI Wiki article: Formatted Numeric Data. Of particular interest is the memory allocated by the ICU library, which can erroneously be reported as a memory leak. My solution was to call u_cleanup() within my destructor.

Posted: Wed Sep 06, 2006 15:15
by TobiasKilian
Thank you for the hint!

I now have it working as expected using:

Code: Select all

asciiconv = ucnv_open("ISO-8859-1", &status);
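To make the two-bytes-to-one-byte transcoding concrete for the ISO-8859-1 case, here is a hand-rolled sketch that does not use ICU. It is an illustration only, not how ICU works internally; `utf8ToLatin1` is a hypothetical name, and replacing unrepresentable characters with '?' is an assumption of this sketch.

```cpp
#include <string>

// Hand-rolled UTF-8 -> ISO-8859-1 conversion (hypothetical helper, not ICU).
// Code points above U+00FF and malformed input are replaced by '?'.
std::string utf8ToLatin1(const std::string& in)
{
    std::string out;
    std::string::size_type i = 0;
    while (i < in.size())
    {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80)
        {
            // Plain ASCII: one byte in both encodings.
            out += static_cast<char>(b0);
            i += 1;
        }
        else if ((b0 == 0xC2 || b0 == 0xC3) && i + 1 < in.size()
                 && (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80)
        {
            // Two-byte sequence encoding U+0080..U+00FF: merge into one byte.
            const unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            out += static_cast<char>(((b0 & 0x03) << 6) | (b1 & 0x3F));
            i += 2;
        }
        else
        {
            // Not representable or malformed: substitute, then skip
            // any continuation bytes belonging to the sequence.
            out += '?';
            i += 1;
            while (i < in.size()
                   && (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80)
                i += 1;
        }
    }
    return out;
}
```

For "König" the UTF-8 bytes `c3 b6` of the "ö" collapse into the single byte `f6`, which is why the converted string is one byte shorter than its UTF-8 form.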