Windows 95, 98 and Me are (mostly) gone. Yes, I know they are still lurking in some places, but development for them has (mostly) stopped.
But the ghosts are still here to haunt users and developers: hundreds of thousands of non-Unicode applications.
And very often questions show up in newsgroups about supporting all kinds of languages on non-matching operating systems (Japanese on English machines, etc.). And when the only solution is “move to Unicode” the answer is usually “the application is way too big, we cannot afford it.”
It looks like the price to convert a lot of code to Unicode is too high. But the price of maintaining non-Unicode applications in a Unicode world, with all kinds of hacks to enable some kind of limping support, is higher.
It looks difficult only because many people don’t know what it implies. Thing is, the conversion process is relatively simple, with big chunks of it being easy to automate.
How to start
Although the tool automates some of the steps, you still have to understand what is going on, because there is enough work left for you too.
Start by reading Michael Kaplan’s “Converting a project to Unicode” series (and don’t ignore the comments, which also bring some useful insights):
- Part 0 (The introduction)
- Part 1 (Business before pleasure)
- Part 2 (‘Sorry, you’re not my type.’ ‘Um, maybe I could change that?’)
- Part 3 (Can I quote you on that?)
- Part 4 (/Delightful, /Delicious, /DUnicode!)
- Part 5 (Are we there yet? Well, not just yet)
- Part 6 (Upon the road not traveled)
- Part 7 (What does it mean to fit things to a ‘T’, anyway?)
- Part 8 (Fitting MSLU into the mix)
- Part 9 (The project’s postpartum postmortem)
The tool was developed using GNU Flex in one afternoon (but with some thinking in advance :-).
It is not bulletproof, and it will not magically do everything for you.
But it will:
- replace a big chunk of the CRT API with the generic versions
- put TEXT wrappers around strings and characters
- “know” to avoid strings/characters already using TEXT or already Unicode
- make no changes in comments or inside strings
- give warnings for several things you will have to fix by hand
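To make the list above concrete, here is a hand-written sketch of the kind of rewrite involved. It is not actual output of the tool: the function names are hypothetical, and the non-Windows branch is only a stand-in I added so the snippet compiles elsewhere.

```c
#include <string.h>

#ifdef _WIN32
#include <windows.h>   /* TCHAR, TEXT */
#include <tchar.h>     /* _tcslen and the other generic CRT names */
#else
/* Stand-ins (ANSI flavour only) so the sketch builds off Windows. */
typedef char TCHAR;
#define TEXT(s) s
#define _tcslen strlen
#endif

/* Before the conversion: narrow characters and the plain CRT call. */
size_t greeting_length_before(void)
{
    const char *msg = "Hello";
    return strlen(msg);
}

/* After: generic type, TEXT wrapper, generic CRT name.
   On Windows this builds as char/strlen or as wchar_t/wcslen,
   depending on whether both UNICODE and _UNICODE are defined. */
size_t greeting_length_after(void)
{
    const TCHAR *msg = TEXT("Hello");
    return _tcslen(msg);
}
```

Both functions return the same length; the point is only that the second one compiles either way.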
Code page conversion
WideCharToMultiByte: you will have to see if the conversion still needs to be done. If the original string was not Unicode and now it is Unicode, no conversion is required.
But you will probably have to do conversions in order to read legacy files.
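A minimal sketch of such a conversion for text read from a legacy file, assuming the file was written in the system ANSI code page (CP_ACP); a real application may have to pass the specific code page instead. The helper name legacy_to_wide is hypothetical, and the mbstowcs branch is only a stand-in so the sketch builds off Windows.

```c
#include <stddef.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <stdlib.h>   /* mbstowcs, used only as a stand-in off Windows */
#endif

/* Converts a NUL-terminated narrow string (for example a line read from a
   legacy file) to wide characters. dst_count is the size of dst in wchar_t
   units, not in bytes. Returns 1 on success, 0 on failure. */
int legacy_to_wide(const char *src, wchar_t *dst, size_t dst_count)
{
    if (dst_count == 0)
        return 0;
#ifdef _WIN32
    /* -1 means "src is NUL-terminated"; the terminator is converted too.
       The call returns 0 if dst is too small. */
    return MultiByteToWideChar(CP_ACP, 0, src, -1, dst, (int)dst_count) > 0;
#else
    size_t n = mbstowcs(dst, src, dst_count);
    if (n == (size_t)-1 || n >= dst_count)
        return 0;   /* invalid input, or no room left for the terminator */
    return 1;
#endif
}
```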
And you might discover that some of the buffers you used to move binary data around are now converted to TCHAR. But it is good practice to use BYTE for that (if not unsigned char). So if you used char, that is really your fault (or the fault of the guy who wrote the code, if you are just a maintainer :-)
GetProcAddress has two traps:
- It takes an ASCII string as parameter, so it should remain a char string (const char *, or LPSTR, or whatever) and the string itself should not be wrapped with TEXT. So if the tool already “fixed” it, you will have to undo it.
- The name of the function might be that of an ANSI version (stuff like LoadLibraryA, or one of the hundreds of such APIs). If you still want to be able to compile your application as ANSI, you will need some conditional code.
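A minimal sketch of such conditional code, assuming the lookup goes through GetProcAddress. MessageBoxA/MessageBoxW in user32.dll is just an illustrative pick, and the macro and helper names are hypothetical.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* The exported name is always a narrow string: no TEXT() around it.
   Only the A/W suffix depends on how the application is compiled. */
#ifdef UNICODE
#define MESSAGEBOX_EXPORT "MessageBoxW"
#else
#define MESSAGEBOX_EXPORT "MessageBoxA"
#endif

#ifdef _WIN32
/* Hypothetical helper: looks up the MessageBox variant matching the build. */
FARPROC find_message_box(void)
{
    /* LoadLibrary is a regular TCHAR API, so TEXT is correct here. */
    HMODULE user32 = LoadLibrary(TEXT("user32.dll"));
    if (user32 == NULL)
        return NULL;
    /* GetProcAddress takes a narrow string in both ANSI and Unicode builds. */
    return GetProcAddress(user32, MESSAGEBOX_EXPORT);
}
#endif
```

Keeping the suffix choice in one macro means the call site itself does not need an #ifdef.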
Most APIs will take the length of a buffer in TCHARs, but sizeof will give you the size in bytes. So if you used to write into a buffer of char, sizeof(buffer) was nice and dandy.
    char buffer[100];
    sprintf( buffer, "Some error: '%s'\n", szError );
Now it becomes:
    TCHAR buffer[100];
    _stprintf( buffer, TEXT("Some error: '%s'\n"), szError );
In the first case sizeof(buffer) is 100, but in the second case it is 100 * sizeof(TCHAR), meaning 200 in a Unicode build. Passing that value where a length is expected can lead to a buffer overrun, because the count should be expressed in characters, not in bytes.
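The difference between the two numbers can be sketched in a few lines. To keep the snippet portable I use plain wchar_t and the standard C swprintf (which takes the buffer size in characters) instead of TCHAR and _stprintf; the function names are hypothetical. Note that sizeof(wchar_t) is 2 on Windows but is often 4 elsewhere.

```c
#include <stddef.h>
#include <wchar.h>

#define BUFFER_CHARS 100

/* Size in bytes: what sizeof reports. */
size_t wide_buffer_bytes(void)
{
    wchar_t buffer[BUFFER_CHARS];
    return sizeof(buffer);
}

/* Size in characters: what length parameters usually expect. */
size_t wide_buffer_chars(void)
{
    wchar_t buffer[BUFFER_CHARS];
    return sizeof(buffer) / sizeof(buffer[0]);
}

/* Formats into a wide buffer. out_chars must be a count of wchar_t
   elements; passing sizeof(buffer) here would overstate the room. */
int format_error(wchar_t *out, size_t out_chars, const wchar_t *error)
{
    return swprintf(out, out_chars, L"Some error: '%ls'\n", error);
}
```

A caller with a local array would pass sizeof(buffer) / sizeof(buffer[0]) as out_chars, never sizeof(buffer) alone.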
Anyway, if you read Michael’s posts you already know what the problems are. If you did not, go there and read them, really.
You will need to fix it by using the number of elements in the buffer rather than its size in bytes.
Personally I prefer _countof (not available in older VS versions), which can detect if it is mistakenly applied on pointers instead of arrays.
What I normally do if I think there is any chance I will use the code on VS 6:
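A minimal sketch of such a fallback, assuming the usual sizeof-ratio definition. The macro body is my assumption of a typical fallback, and the helper function is hypothetical, there only to show the macro in use.

```c
#include <stdlib.h>   /* newer Visual Studio versions define _countof here */

#ifndef _countof
/* Fallback for compilers without _countof. Unlike the C++ version in newer
   Visual Studio releases, this plain macro cannot detect a pointer argument. */
#define _countof(array) (sizeof(array) / sizeof((array)[0]))
#endif

/* Hypothetical helper used only to show the macro in action. */
size_t error_buffer_capacity(void)
{
    wchar_t buffer[100];
    return _countof(buffer);   /* 100 elements, whatever sizeof(wchar_t) is */
}
```

With the guard in place the same source compiles on old and new compilers, and the newer ones keep their stricter built-in version.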
Ok, you can download the tool (with the sources) from here.
If you want to compile it yourself, you will need GNU flex 2.5.2 in the tools folder.
And if you find bugs and you fix them, it would be nice to share that :-)
And just in case it is not obvious, this tool is provided “as is,” no guarantees, no responsibility :-)