Internationalization Cookbook
This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.

An internationalization checklist

Intro

This is something I have maintained for a long time. It is not the first, it is probably not the last. It is also not very original (just how many ways can you say “don’t hard-code the strings” and “use Unicode”?). Most of the items are (or should be) common knowledge. And yes, it does overlap with other checklists out there.

But I have organized it according to my own perspective. I have used it for desktop and server applications, on all kinds of platforms, for development, for testing, and whatnot. So I think it is pretty universal.

Like any checklist, it is a reminder. It does not teach you anything. You have to know what the bullets mean, and whenever you start a new application, add a new feature, or write a test plan, you can go through it and make sure you did not miss anything.

It has also helped me bring some order and structure whenever I have tackled a new project. And I am thinking of using it (again) to give some structure to my site: taking each bullet, explaining it, and then giving some possible solutions using different frameworks (Win32 API, Mac OS X APIs, POSIX, Java, ICU4C, ICU4J, ActionScript, PHP, JavaScript, maybe others).

So, here it is.


I18N checklist

Data processing

  • Unicode, encodings & code pages
  • Any application language – any data language – any OS language
  • File system access, path names
  • File formats (all language editions can read one another’s documents)
  • Communication protocols, network, URLs
  • Interoperability
  • Indexing, searching, etc.
  • The locale info is not frozen
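
To make the first bullet a bit more concrete, here is a minimal Python sketch (my choice of language for illustration, not one of the frameworks listed in the intro). It shows three things that tend to bite in data processing: the same visible text can be stored as different code point sequences, character counts and byte counts are not the same thing, and decoding with the wrong code page garbles text without raising an error. The variable and function names are made up for this example.

```python
import unicodedata

# The same visible text, "café", built two ways.
precomposed = "caf\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# The code point sequences differ, so a naive comparison fails.
naive_equal = precomposed == decomposed


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to the same form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Character count and byte count differ, and the byte count depends on
# the encoding chosen for storage or transport.
sizes = {
    "code points": len(precomposed),
    "utf-8 bytes": len(precomposed.encode("utf-8")),
    "utf-16 bytes": len(precomposed.encode("utf-16-le")),
}

# Decoding UTF-8 bytes with the wrong code page does not fail here;
# it silently produces garbled text.
garbled = precomposed.encode("utf-8").decode("cp1252")
```

Whether NFC is the right normalization form depends on the application; the point is only that both sides of a comparison, an index, or a search go through the same one.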

Content

  • Features important to international markets are included
  • Text and messages are devoid of slang and cultural references
  • Consistent and correct terminology is used in strings
  • Language negotiation
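
As an illustration of the language negotiation bullet, here is a rough Python sketch that picks a UI language from an HTTP Accept-Language header. It is a simplified lookup with parent-tag fallback, not a complete implementation of the relevant standards, and the function names `parse_accept_language` and `negotiate` are my own inventions for this example.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def negotiate(header: str, supported: list[str], default: str) -> str:
    """Pick the first requested language we support, trying parent tags too."""
    available = {tag.lower(): tag for tag in supported}
    for tag in parse_accept_language(header):
        # Fall back from "pt-br" to "pt" by dropping trailing subtags.
        while tag:
            if tag in available:
                return available[tag]
            tag = tag.rpartition("-")[0]
    return default
```

For example, `negotiate("da, en-GB;q=0.8, en;q=0.7", ["en", "fr"], "en")` skips Danish, fails to find `en-gb`, and settles on `en`. A real application would probably also let the user override the negotiated choice.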

Locale/Cultural Awareness

  • Sorting & string comparison, search
  • Text conversions and categorization (case and others)
  • Working with individual characters
  • Number formatting
  • Currency formatting
  • Date formatting
  • Calendar differences
  • Time formatting
  • Time zones, daylight saving time
  • Addresses
  • Personal titles
  • Telephone numbers
  • Measurement Units
  • Paper and envelope sizes
  • Punctuation, separators
  • Spell-checker or thesaurus/dictionary
  • Text to speech, speech recognition, OCR
  • Parsing, validation

Input, Display & Output

  • Layout and text rendering
  • Honoring the system / user settings
  • Fonts, font linking, font fallback
  • Decoration, colors, icons, graphics, sound, voice, animations, video
  • Keyboards and IMEs
  • Input validation
  • Passwords
  • Shortcut-key combinations are accessible on international keyboards
  • Character boundaries, word and line breaking, hyphenation
  • Complex scripts
  • Vertical scripts
  • Console globalization
  • Clipboard
  • Printing
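
The bullets on case conversion, individual characters, and character boundaries are easy to underestimate, so here is a small Python sketch of a few of the traps. It uses only the standard library, which does not do locale-sensitive casing or full grapheme segmentation; for those you would need something like ICU. The helper names are hypothetical.

```python
import unicodedata

# Case conversion can change the length of a string:
# German "ß" uppercases to "SS".
upper_strasse = "straße".upper()


def caseless_equal(a: str, b: str) -> bool:
    """Caseless match using casefold(), which is designed for comparison."""
    return a.casefold() == b.casefold()


# str.lower() is locale-independent, so it always maps "I" to "i".
# Turkish expects the dotless "ı" (U+0131) instead; the standard library
# cannot produce that without being told about the language.
lower_i = "I".lower()


def count_without_combining_marks(text: str) -> int:
    """Rough character count that skips combining marks.

    This is only an approximation of user-perceived characters,
    not real grapheme cluster segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


decomposed = "cafe\u0301"  # "café" with a separate combining accent
```

So `len(decomposed)` reports 5 while a user sees 4 characters, and comparing `"Straße"` with `"STRASSE"` through `lower()` fails where `casefold()` succeeds.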

Localizability

  • Isolate localizable resources
  • Concatenation, variables
  • Buffers are large enough
  • Do not reuse strings
  • Provide context for translators
  • Mark non-localizable strings or characters
  • String handling
  • Mirroring
  • UI Considerations
  • Hard-coded size, coordinates, alignment
  • Runtime resizing / moving / hiding
  • Custom resources
  • 3rd party components
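
For the concatenation and variables bullet, a short Python sketch of the usual advice: keep whole sentences as resources with named placeholders, so a translator can reorder the variables. The message catalog, its key, and the function names are hypothetical, and the Japanese string is only a rough illustration of a language that puts the total first.

```python
# Hypothetical message catalog; keys and translations are illustrative only.
MESSAGES = {
    "en": {"page_of": "Page {page} of {total}"},
    "ja": {"page_of": "{total}ページ中{page}ページ"},
}


def concatenated(page: int, total: int) -> str:
    """Anti-pattern: the English word order is baked into the code."""
    return "Page " + str(page) + " of " + str(total)


def localized(lang: str, key: str, **values) -> str:
    """Look up a whole-sentence template and fill in named placeholders."""
    catalog = MESSAGES.get(lang, MESSAGES["en"])       # unknown language: English
    template = catalog.get(key, MESSAGES["en"][key])   # missing key: English
    return template.format(**values)
```

With this, `localized("ja", "page_of", page=2, total=10)` puts the total first without any code change. The numbers themselves are still not locale-formatted here; that belongs to the number formatting bullet.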

Development and deployment

  • Same code base, language structure, building
  • Single executable model
  • English is just another language
  • Plug-and-Play language support / MUI
  • Installation / deployment
  • Modularity
  • Patches
  • SDKs
  • Support for locale-specific hardware
  • Electronic payment
  • Tech support

Legal issues (still fuzzy)

  • Encryption restrictions
  • Comparative advertising
  • Registration
  • Taxes, finances
  • Privacy, personal information handling
  • Freedom of speech
  • “Politically correct”
  • Local legislation
  • Using ®, ©, ™
  • DRM
