The ICU message format is widely used across translation software and i18n libraries to structure source messages clearly. If you've ever engaged in software localization for a project, you've likely encountered it.
Localization management plays a crucial role in handling complex message formats like ICU. It involves organizing, coordinating, and maintaining all aspects of the localization process, from content extraction and translation to integration and quality assurance. Effective localization management ensures that structured formats are preserved correctly across languages and platforms, reducing errors and improving overall efficiency.
The syntax is intuitive, using curly braces for placeholders and arguments, but it can get confusing since different tools often support different subsets of the format. Understanding the role of software internationalization is essential for leveraging the full potential of the ICU message format in your projects.
In this guide, we'll break down the ICU format and explore how to use it effectively for localization, with practical examples to make it all click.
ICU stands for International Components for Unicode. According to the official docs, it's a set of libraries providing tools for globalizing software systems.
Originally designed for C/C++ and Java, ICU has expanded to other languages like JavaScript. It's known for being portable and delivering consistent results across different platforms.
Many i18n frameworks rely on the ICU message format to handle translations, and we'll dive into some of them in this article. Beyond basic translations, ICU also supports advanced features like plural rules and selection logic, making it ideal for complex localization needs. Efficient translation management can optimize the use of the ICU format, ensuring accurate and timely updates.
Modules in ICU library suite
The ICU library offers a range of modules to support different internationalization needs. We won't dive into every single one, but here's a breakdown of the key components you'll likely use when developing i18n software:
Strings, properties, and CharacterIterator
This core module provides Unicode support for:
Strings: Directly supported by the ICU API.
Properties: Includes C definitions, functions, and some macros.
String iteration: Lets you navigate forward and backward through Unicode characters, returning either the characters themselves or their index values.
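To make the idea of iterating over Unicode characters concrete, here is a minimal sketch in plain JavaScript. It does not use ICU's CharacterIterator; it relies on the language's built-in code point iteration as a rough analogue, and the variable names are purely illustrative.

```javascript
// A string with one character outside the Basic Multilingual Plane (U+1F600).
const sampleText = "a\u{1F600}b";

// String.length counts UTF-16 code units, so the emoji counts twice.
const codeUnitCount = sampleText.length; // 4

// for...of walks the string one code point at a time (forward iteration).
const forwardPoints = [];
for (const ch of sampleText) {
  forwardPoints.push({ char: ch, codePoint: ch.codePointAt(0) });
}

// Backward iteration: reverse the array of code points, not the raw code units,
// so surrogate pairs are never split.
const backwardChars = [...sampleText].reverse();

console.log(codeUnitCount, forwardPoints.length, backwardChars.join(""));
```

The difference between the code unit count and the number of code points is exactly why Unicode-aware iteration matters.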
Conversion basics
Used to convert text between different encoding types. It handles transformations between Unicode and non-Unicode encodings. ICU's converter API supports all major encodings and offers advanced features like fast text conversion and customizable callbacks to manage invalid or unmapped sequences.
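As a rough illustration of encoding conversion and of handling invalid sequences, the sketch below uses JavaScript's standard TextEncoder and TextDecoder rather than ICU's converter API. The helper name decodeStrict is hypothetical.

```javascript
// Encode a string to UTF-8 bytes (TextEncoder always produces UTF-8).
const utf8Bytes = new TextEncoder().encode("caf\u00E9");

// Decode the bytes back into a string.
const roundTrip = new TextDecoder("utf-8").decode(utf8Bytes);

// A byte sequence containing 0xFF, which is never valid in UTF-8.
const invalidBytes = new Uint8Array([0x63, 0xff, 0x64]);

// Default (lenient) mode replaces the invalid byte with U+FFFD.
const lenientResult = new TextDecoder("utf-8").decode(invalidBytes);

// Strict mode throws instead; this hypothetical helper returns null on failure.
function decodeStrict(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (err) {
    return null;
  }
}

console.log(utf8Bytes.length, roundTrip, lenientResult, decodeStrict(invalidBytes));
```

Choosing between the lenient and strict modes plays a similar role to the callbacks mentioned above: you decide what happens when the input cannot be mapped.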
Locales and resources
This module handles everything related to locales, each of which represents a group of users with similar language and cultural expectations. Each locale can contain multiple attributes like language, script, country code, and more.
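The attributes of a locale can be inspected with JavaScript's built-in Intl.Locale. This is a sketch using the standard Intl API, not ICU's own Locale class, and the locale identifiers are just examples.

```javascript
// Parse a locale identifier into its parts.
const traditionalChinese = new Intl.Locale("zh-Hant-TW");

const localeParts = {
  language: traditionalChinese.language, // "zh"
  script: traditionalChinese.script,     // "Hant"
  region: traditionalChinese.region,     // "TW"
};

// Extra attributes, such as a calendar preference, can be attached as options.
const egyptianArabic = new Intl.Locale("ar-EG", { calendar: "islamic" });

console.log(localeParts, egyptianArabic.baseName, egyptianArabic.calendar);
```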
Date/Time services
ICU represents dates and times as a scalar value of type UDate (milliseconds since the Unix epoch in UTC), independent of time zones or calendar systems. The module includes four main classes: Calendar, GregorianCalendar, TimeZone, and SimpleTimeZone.
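The idea of a time-zone-independent scalar can be sketched in JavaScript, where a timestamp is also a count of milliseconds since the Unix epoch. The example below uses the built-in Intl.DateTimeFormat instead of ICU's Calendar and TimeZone classes, and the helper name calendarDateIn is hypothetical.

```javascript
// One instant in time, stored as a plain number: midnight UTC on 1 January 2019.
const instantMs = Date.UTC(2019, 0, 1); // 1546300800000

// The same instant falls on different calendar dates depending on the time zone.
function calendarDateIn(timeZone) {
  return new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "numeric",
    day: "numeric",
  }).format(instantMs);
}

console.log(calendarDateIn("Asia/Tokyo"));       // "1/1/2019"
console.log(calendarDateIn("America/New_York")); // "12/31/2018"
```

The number itself never changes; only its interpretation through a time zone and calendar does.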
Formatting and parsing
This handles most of the formatting work required for localization. It supports currency, date, and number formatting, along with text display and complex message formats like pluralization and selection rules, all of which are essential components of a smooth localization workflow.
Libraries that support the ICU message format
The ICU message format is part of the formatting and parsing module in the ICU library. It's a powerful and flexible module that supports various formatting methods across many programming languages.
Due to the importance of localizing software, a lot of i18n libraries have adopted the ICU message format. Here's a quick rundown of some key libraries that use it:
C/C++: ICU4C (version 68.1) is the full implementation of ICU in C/C++.
Java: ICU4J (version 68.1) is the Java version of the ICU library.
JavaScript: There's no native ICU implementation, but you can use third-party libraries like Angular's i18n module, Globalize, and react-i18n.
PHP: Symfony is a popular PHP framework with built-in support for ICU formats.
Python: PyICU is a Python wrapper that provides ICU functionality.
For managing translations in larger i18n projects, tools like Lokalise are super useful. Lokalise is an all-in-one translation management platform that fully supports the ICU format. It's got features for handling filenames, language codes, collaborative access for translators, and customizing the process for uploading and downloading translations.
Practical usage of the ICU message format
Let's dive into using the ICU message format and see some practical examples. For this section, we'll use YAML files to store translations and JavaScript to implement our scenarios.
We'll be using the format() function from the i18next library. This function follows a format(value, format, locale) signature and returns the formatted message string.
Normal text translation
First, let's translate some simple text between different locales. We'll create a YAML file for each locale (English, Chinese, and Arabic) to store the translated messages.
Here's how the content looks in each file:
messages_en.yaml
"Welcome_to_the_tutorial": "Welcome to the tutorial"
With the translation files set up, we can create a basic formatting file in JavaScript:
formatting.js
format('Welcome_to_the_tutorial')
Pretty straightforward, right? Now, let's move on to handling pluralization with the ICU format in the next section.
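Conceptually, a plain text translation is just a key lookup in the catalog for the active locale. The sketch below illustrates that idea with an in-memory object standing in for the YAML files. The names messageCatalogs and lookupMessage are hypothetical, and the fallback behavior is an assumption rather than how any particular library works.

```javascript
// Hypothetical in-memory stand-in for the per-locale translation files.
// Only the English entry from the example above is included.
const messageCatalogs = {
  en: { Welcome_to_the_tutorial: "Welcome to the tutorial" },
};

// Look up a key for a locale, falling back to English and then to the key itself.
function lookupMessage(locale, key) {
  const catalog = messageCatalogs[locale] ?? messageCatalogs.en;
  return catalog[key] ?? key;
}

console.log(lookupMessage("en", "Welcome_to_the_tutorial"));
```

Returning the key when no translation exists is one common design choice, because it makes missing strings easy to spot in the UI.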
Pluralization
Pluralization is a key feature of the ICU message format, making it easy to handle different text forms based on numeric values. This is powered by the CLDR (Common Locale Data Repository), which defines plural rules for different languages to ensure correct text forms for each target language.
For example, let's say we want to display a sentence like: "I bought one book" or "I bought <number of books> books," depending on the count. First, we'll create YAML files for each locale with the appropriate CLDR-based plural rules.
Translation files:
messages_en.yaml
booksCount: >
  {n, plural, one {# book} other {# books}}
messages_ar.yaml
booksCount: >
  {n, plural, one {# الكتاب} other {# الكتب}}
messages_zh.yaml
booksCount: >
  {n, plural, one {# 书} other {# 图书}}
Now that we have the translation files set up, we can use the format() function to pass in the count value:
formatter.js
format('booksCount', { n: 3 });
For English, this outputs: "3 books".
When uploading translation files with ICU plurals to Lokalise, make sure to enable the Detect ICU plurals option in the upload settings.
This ensures that the plural keys are correctly recognized and you can provide translations for each form.
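To see which CLDR plural category a number falls into for a given language, you can query JavaScript's built-in Intl.PluralRules. This is a sketch that sits alongside the format() call above rather than replacing it; the helper names pluralCategory and formatBooksCount are hypothetical. Note that languages differ in how many categories they use: Arabic has six, while Chinese only uses "other".

```javascript
// Ask the CLDR-backed plural rules which category a number belongs to.
function pluralCategory(locale, n) {
  return new Intl.PluralRules(locale).select(n);
}

console.log(pluralCategory("en", 1)); // "one"
console.log(pluralCategory("en", 3)); // "other"
console.log(pluralCategory("ar", 2)); // "two"
console.log(pluralCategory("ar", 3)); // "few"
console.log(pluralCategory("zh", 1)); // "other"

// A tiny resolver for the English booksCount forms: pick the form for the
// category, then substitute the number for "#".
function formatBooksCount(n) {
  const forms = { one: "# book", other: "# books" };
  const template = forms[pluralCategory("en", n)] ?? forms.other;
  return template.replace("#", String(n));
}

console.log(formatBooksCount(1)); // "1 book"
console.log(formatBooksCount(3)); // "3 books"
```

Because the set of categories varies by language, translators may need to supply more forms than the source message defines.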
Interpolation
The ICU message format makes it easy to handle dynamic text using interpolation. For example, let's say we want to display the sentence: "When I left home, my age was <the_age>," where <the_age> is a variable value.
To implement this, we'll create one JSON file per locale (messages_en.json, messages_ar.json, messages_es.json, and messages_zh.json) and update our formatter.js file to use the format() function.
Translation files:
messages_en.json
{ "left_home_age": "When I left home, my age was {age}." }
Now let's add the format function in formatter.js:
format('left_home_age', { age: 21 });
This function will output:
English message: "When I left home, my age was 21."
Chinese message: "当我离开家时，我的年龄是 21."
See how easy it is to handle interpolation with the ICU message format? It's all about substituting values into the message strings using the correct argument names.
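Under the hood, simple interpolation boils down to replacing each {name} placeholder with the matching value. The following is a minimal sketch of that idea, not the implementation used by any ICU library; the function name interpolate is hypothetical, and it only handles plain {name} arguments without types or nesting.

```javascript
// Replace each simple {name} placeholder with the corresponding value.
// Placeholders without a matching value are left untouched.
function interpolate(template, values) {
  return template.replace(/\{(\w+)\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : placeholder
  );
}

const leftHomeMessage = "When I left home, my age was {age}.";
console.log(interpolate(leftHomeMessage, { age: 21 }));
// "When I left home, my age was 21."
```

Leaving unknown placeholders in place makes a misspelled argument name visible in the output instead of silently dropping it.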
In the next section, we'll cover conditional selection for customizing messages based on different scenarios.
Conditional selection using select
The ICU message format also supports conditional text selection, making it easy to handle scenarios like gender-based pronouns. Let's say we want to display the following text:
"Hello, Your friend <friend's name> is now online. <She/He/They> added a new image to the system."
Here, we need to show the correct pronoun based on the friend's gender. No worries: ICU's select arguments are built just for this.
Translation files:
messages_en.yaml
friend_add_image: >
  Hello, Your friend {friend} is now online. {gender, select, female {She} male {He} other {They}} added a new image to the system.
If the selected locale is English, the output will be: "Hello, Your friend Ann is now online. She added a new image to the system."
Simple, right? This same pattern works for any content that needs conditional rendering based on variables like gender, role, or status. Using select arguments ensures your message strings are flexible and adapt to the correct context across different target languages.
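The selection logic can be sketched with a small resolver. This is a simplified illustration, assuming option texts contain no nested braces; it is not a full ICU parser, and the function name resolveSelect is hypothetical. A real project would normally use a library that implements the complete syntax.

```javascript
// Resolve "{name, select, key {text} ...}" arguments, then simple {name} placeholders.
// Assumption: option texts contain no nested braces.
function resolveSelect(message, values) {
  const selectPattern = /\{(\w+),\s*select,\s*((?:\w+\s*\{[^{}]*\}\s*)+)\}/g;

  const withSelections = message.replace(selectPattern, (match, name, body) => {
    // Collect the options, for example female -> "She".
    const options = new Map();
    for (const [, key, text] of body.matchAll(/(\w+)\s*\{([^{}]*)\}/g)) {
      options.set(key, text);
    }
    // Fall back to the "other" option when the value has no exact match.
    const chosen = String(values[name]);
    return options.has(chosen) ? options.get(chosen) : options.get("other");
  });

  return withSelections.replace(/\{(\w+)\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : placeholder
  );
}

const friendAddImage =
  "Hello, Your friend {friend} is now online. " +
  "{gender, select, female {She} male {He} other {They}} added a new image to the system.";

console.log(resolveSelect(friendAddImage, { friend: "Ann", gender: "female" }));
```

With gender set to "female" this yields the sentence shown above; any value that is not listed falls through to the other branch.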
Number formatting
The ICU message format supports number formatting for two main use cases: currency and percentage formatting. Let's see how to set these up in an i18n application.
For example, suppose you need to display the sentence: "They could achieve a 70% success rate in the project." We'll use ICU's number syntax to format this as a percentage.
Translation files:
messages_en.yaml
success_rate: They could achieve {n, number, percent} success rate in the project.
For English, this outputs: "They could achieve 70% success rate in the project."
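One detail worth checking is the value you pass in: percent formatting multiplies the number by 100, so 70% corresponds to 0.7. The sketch below demonstrates this with JavaScript's built-in Intl.NumberFormat, which is used here as a stand-in for the number formatting a message library performs internally.

```javascript
// Percent style multiplies the input by 100 before appending the percent sign.
const percentFormatter = new Intl.NumberFormat("en-US", { style: "percent" });

const successRate = percentFormatter.format(0.7); // "70%"
const commonMistake = percentFormatter.format(70); // "7,000%"

console.log(`They could achieve ${successRate} success rate in the project.`);
```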
Currency formatting
You can also format currency values using the same syntax by specifying a currency type. Different libraries handle this slightly differently. Here's a quick rundown of how you can set it up depending on the library:
ICU4C (C++): Use the NumberFormat::setCurrency() method.
ICU4C (C API): Set the currency code via unum_setTextAttribute().
ICU4J (Java): Use NumberFormat.setCurrency() for currency formatting.
This approach ensures you get the correct default format for each target language based on locale-specific settings.
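For comparison, here is how currency formatting looks with JavaScript's built-in Intl.NumberFormat. This is a sketch using the standard Intl API rather than the ICU4C or ICU4J calls listed above, and the helper name formatCurrency is hypothetical.

```javascript
// Format an amount for a locale, with the currency given as an ISO 4217 code.
function formatCurrency(amount, locale, currency) {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

console.log(formatCurrency(1234.5, "en-US", "USD")); // "$1,234.50"
console.log(formatCurrency(1234.5, "en-US", "EUR")); // "€1,234.50"
console.log(formatCurrency(1234, "en-US", "JPY"));   // "¥1,234"

// Other locales change the separators and the position of the symbol.
console.log(formatCurrency(1234.5, "de-DE", "EUR"));
```

The locale controls the layout of the number, while the currency code controls the symbol and the default number of decimal places.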
Date/Time formatting
ICU provides four predefined date formats: short, medium, long, and full. If you want to display a date like: "I entered university on 19/02/2017," you just need to choose the right date format in your translation files.
Here's how to set it up:
Translation files:
messages_en.yaml
enter_university: I entered university on {uni_date, date, short}
format('enter_university', { uni_date: new Date('2019-01-01') });
For English (en-US), the short format typically renders this as: "I entered university on 1/1/19". The exact output depends on the library's locale data and on the time zone used when formatting.
That's how easily ICU can handle date formatting according to your application's needs.
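To compare the four predefined styles side by side, here is a sketch using JavaScript's built-in Intl.DateTimeFormat with its dateStyle option. It is an illustration with the standard Intl API rather than the format() call above, and the helper name formatDateStyle is hypothetical. The time zone is pinned to UTC because new Date('2019-01-01') is parsed as midnight UTC, which would otherwise fall on the previous day in time zones west of UTC.

```javascript
// The date from the example above; date-only ISO strings are parsed as UTC.
const uniDate = new Date("2019-01-01");

// Format the date in one of the predefined styles, pinned to UTC.
function formatDateStyle(style) {
  return new Intl.DateTimeFormat("en-US", {
    dateStyle: style,
    timeZone: "UTC",
  }).format(uniDate);
}

console.log(formatDateStyle("short"));  // "1/1/19"
console.log(formatDateStyle("medium")); // "Jan 1, 2019"
console.log(formatDateStyle("long"));   // "January 1, 2019"
console.log(formatDateStyle("full"));   // "Tuesday, January 1, 2019"
```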
Summary
In this article, we explored how to use the ICU message format to localize software applications, helping your localization team manage complex language rules more effectively. The ICU format simplifies localization by offering support for:
Basic text translations
Pluralization
Conditional text (using select)
Interpolation
Number formatting (percentages and currencies)
Date formatting
With its powerful features and wide adoption, ICU is a solid choice for handling complex localization requirements across multiple target languages, making it a valuable asset in any comprehensive localization strategy.
Ilya is a lead of content/documentation/onboarding at Lokalise, an IT tutor and author, web developer, and ex-Microsoft/Cisco specialist. His primary programming languages are Ruby, JavaScript, Python, and Elixir. He enjoys coding, teaching people and learning new things. In his free time he writes educational posts, participates in OpenSource projects, goes in for sports and plays music.