Codepage-dependent UTF-8 -> ASCII conversion (solution!)


nebukadnezzar
Just popping in
Posts: 7
Joined: Sun Jul 09, 2006 10:42


Postby nebukadnezzar » Fri Sep 01, 2006 15:46

Hi folks,

I've written a small function that converts a CEGUI::String to a std::string depending on the codepage, so your umlauts are converted too!

The disadvantage of this function:
it depends on the ICU library ( http://icu.sf.net )

The advantage:
it is portable (since ICU is available for *nix, too)



Code: Select all

#include "unicode/ucnv.h"
#include "CEGUIString.h"  // CEGUI::String (header name as in CEGUI 0.x; adjust to your version)

#include <string>
#include <vector>

// Converts a CEGUI::String (UTF-8 internally) to a std::string in the target
// codepage. Despite the name, the result is not limited to 7-bit ASCII.
std::string CEGUIToASCII(const CEGUI::String& cestr)
{
    UConverter *ceguiconv = NULL, *asciiconv = NULL;
    UErrorCode status = U_ZERO_ERROR;
    std::string result;

    if (cestr.size() == 0) return result;

    // NULL opens ICU's default converter; pass a name to pick a codepage.
    asciiconv = ucnv_open(NULL, &status);
    ceguiconv = ucnv_open("utf-8", &status);

    if (U_SUCCESS(status))
    {
        int32_t sourcelen = cestr.utf8_stream_len();
        const char* source = cestr.c_str();
        const char* sourcelimit = source + sourcelen;

        // Size the output buffer from the character count.
        int32_t targetlen = UCNV_GET_MAX_BYTES_FOR_STRING(cestr.length(), ucnv_getMaxCharSize(asciiconv));
        std::vector<char> tmp(targetlen + 1);
        char* target = &tmp[0];
        char* targetlimit = target + targetlen;

        // ucnv_convertEx() advances target past the last byte it wrote.
        ucnv_convertEx(asciiconv, ceguiconv, &target, targetlimit, &source, sourcelimit,
                       NULL, NULL, NULL, NULL, true, true, &status);

        if (U_SUCCESS(status))
            result.assign(&tmp[0], target);  // copy only the converted bytes, not the unused tail
        else
            result = "Error while converting string";
    }
    else
        result = "Error setting up the converters";

    if (asciiconv) ucnv_close(asciiconv);
    if (ceguiconv) ucnv_close(ceguiconv);

    return result;
}

TobiasKilian
Just popping in
Posts: 2
Joined: Mon Sep 04, 2006 17:42

Postby TobiasKilian » Mon Sep 04, 2006 19:59

Hmm, I'd really like to have this feature, but it doesn't work for me:

Code: Select all

   CEGUI::MultiLineEditbox * textWindow = static_cast<CEGUI::MultiLineEditbox *>(wmgr.getWindow((CEGUI::utf8*)"My/Text"));

   CEGUI::String ceguiStr = textWindow->getText();
   String ogreStr = CEGUIToASCII( ceguiStr);

   LogManager::getSingleton().logMessage( "Converting CEGUI to OGRE String:\nIn CEGUI:");
   LogManager::getSingleton().logMessage( ceguiStr.c_str() );
   LogManager::getSingleton().logMessage( "In OGRE: \n");   
   LogManager::getSingleton().logMessage( ogreStr );
   LogManager::getSingleton().logMessage( "\n");


This gives me the following output when my GUI shows "König":

In CEGUI:
21:52:48: König
21:52:48: In OGRE:
21:52:48: Knig

nebukadnezzar

Postby nebukadnezzar » Mon Sep 04, 2006 20:47

This is the output from your Ogre.log, I assume...

And since your "ö" was transcoded from two bytes in the CEGUI::String
to one byte in the std::string, the transcoding seems to work...
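
To make the "two bytes to one byte" point concrete, here is a minimal sketch that does by hand what the ICU converter does for the Latin-1 range only. The helper name utf8ToLatin1 is hypothetical and is not part of ICU or CEGUI; it assumes the target is ISO-8859-1 and replaces everything else with '?'. Real code should keep using ICU.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of ICU or CEGUI): converts UTF-8 to
// ISO-8859-1, replacing anything it cannot represent with '?'.
std::string utf8ToLatin1(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); )
    {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80)
        {
            // Plain ASCII: one byte in, the same byte out.
            out += static_cast<char>(b0);
            ++i;
        }
        else if ((b0 == 0xC2 || b0 == 0xC3) && i + 1 < in.size()
                 && (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80)
        {
            // Two-byte sequence for U+0080..U+00FF: two bytes in, one byte out.
            const unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            out += static_cast<char>(((b0 & 0x03) << 6) | (b1 & 0x3F));
            i += 2;
        }
        else
        {
            // Outside Latin-1 or malformed: emit '?' and skip the sequence.
            out += '?';
            ++i;
            while (i < in.size()
                   && (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80)
                ++i;
        }
    }
    return out;
}
```

With this, "König" in UTF-8 (6 bytes, "ö" stored as 0xC3 0xB6) becomes 5 bytes with "ö" stored as the single byte 0xF6.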

Your issue could be the encoding/codepage of your log file, which can't display the "ö".

Try setting a breakpoint and check in the debugger what the string contains...
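
If a debugger is not at hand, the same check can be done through the log itself by writing the bytes as hex, which survives any log file encoding. This is only a sketch; toHex is a hypothetical helper and not part of Ogre, CEGUI or ICU.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: renders every byte of a string as two hex digits,
// separated by spaces, so the output is plain ASCII in any codepage.
std::string toHex(const std::string& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];    // high nibble
        out += digits[b & 0x0F];  // low nibble
    }
    return out;
}
```

Logging toHex(ogreStr) with LogManager::getSingleton().logMessage() should then show F6 where the "ö" is if the conversion to a Latin-1 style codepage worked, and C3 B6 if the string is still UTF-8.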

If it really doesn't work correctly, try selecting the codepage yourself.

change

Code: Select all

asciiconv = ucnv_open(NULL, &status);

to

Code: Select all

asciiconv = ucnv_open("ISO-8859-15", &status);

for example, to get the ISO-8859-15 codepage,
or whichever codepage you wish.

(If you pass NULL, like I did, ICU determines the codepage itself,
but I don't know how...)

Rackle
CEGUI Team (Retired)
Posts: 534
Joined: Mon Jan 16, 2006 11:59
Location: Montréal

Postby Rackle » Tue Sep 05, 2006 04:16

There's some more ICU usefulness in this CEGUI wiki article: Formatted Numeric Data. Of particular interest is the memory allocated by the ICU library, which can erroneously be reported as a memory leak. My solution was to call u_cleanup() within my destructor.
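
One way to arrange the "call it in a destructor" idea is a small guard object that runs a cleanup function when it goes out of scope. The sketch below is an assumption about how this could be structured, not Rackle's actual code; CleanupGuard is a hypothetical class, and the function passed in stands in for ICU's u_cleanup().

```cpp
// Hypothetical RAII guard: runs the given cleanup function exactly once,
// when the guard object is destroyed.
class CleanupGuard
{
public:
    typedef void (*CleanupFn)();

    explicit CleanupGuard(CleanupFn fn) : fn_(fn) {}
    ~CleanupGuard() { if (fn_) fn_(); }

private:
    // Non-copyable, so the cleanup cannot run twice by accident.
    CleanupGuard(const CleanupGuard&);
    CleanupGuard& operator=(const CleanupGuard&);

    CleanupFn fn_;
};
```

In an application this might be a long-lived object such as `CleanupGuard icuGuard(&u_cleanup);` (with "unicode/uclean.h" included), destroyed only after every converter has been closed with ucnv_close(), since u_cleanup() expects all ICU objects to be released first.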

TobiasKilian

Postby TobiasKilian » Wed Sep 06, 2006 15:15

Thank you for the hint!

I've got it working as expected now, using

Code: Select all

asciiconv = ucnv_open("ISO-8859-1", &status);


Greetings,
Tobias

