Handling Translations in an Embedded Project

Mattia Maldini · October 13, 2023

The most common class of embedded projects needs some form of interface to allow information to be exchanged between the machine and the user.

This can be done in a number of ways, the most ubiquitous approach being to include a general-purpose display. Being able to present elaborate graphics opens up the possibility of conveying arbitrary information via text.

While a convenient solution, this leads to the new problem of choosing a human language with which to communicate.


If the project is meant for a specific context a single language (typically English) might suffice; it never hurts, however, to aim at foreign markets as well by translating the application into multiple languages.

Although this may seem like a trivial problem, I have witnessed enough approaches to warrant a brief essay on the matter.

At their core, they are all variations of having a string (i.e. const char *) array indexed by the currently selected language for each text message. I'll proceed by listing them in order from least to most convenient, analyzing nuances, advantages and disadvantages for each one.

My examples will be given in C, but most principles apply to any other low-level programming language (like Zig or Rust).

Macro Rows and Selecting Function

This is an odd one, but I've seen it in the wild. Basically, the translations are included in a header file as macros that expand to rows of strings.

#define TRANSLATION_HELLO "Hello", "Ciao", "Halo", "Hola"
#define TRANSLATION_WORLD "World", "Mondo", "Welt", "Mundo"

typedef enum {
    LANGUAGE_ENGLISH = 0,
    LANGUAGE_ITALIAN,
    LANGUAGE_GERMAN,
    LANGUAGE_SPANISH,
} language_t;

To distinguish among them you define a specialized function that takes one argument for each language and an index to select the correct one:

const char *translate(language_t language, const char *english, const char *italian, const char *german, const char *spanish) {
    switch (language) {
        case LANGUAGE_ENGLISH:
            return english;
        case LANGUAGE_ITALIAN:
            return italian;
        case LANGUAGE_GERMAN:
            return german;
        case LANGUAGE_SPANISH:
            return spanish;
        default:
            return NULL;
    }
}

Then the translated string is accessed by invoking translate on the current language and the "translation" macro, like so: translate(LANGUAGE_ITALIAN, TRANSLATION_HELLO);.

This approach has all the benefits of simplicity. It's trivial to implement, maintain and extend; each new translation is very similar to a CSV line, with just an additional #define directive. Any mistake, like adding fewer or more languages than required, will be swiftly caught by the compiler as the "row" is fed directly to an argument list.

I would call this a good first attempt, but there are several issues with it.

First, doing such clever work with macros is rarely a good idea. While the mechanism itself is fairly simple and doesn't stretch what the preprocessor can do, code like this will initially confuse any C programmer reading it. Going through macros also tends to muddle the error messages. Say, for example, that we forgot the Spanish translation in one of the macros. As I said before, the error would be caught, but this would be the description:

main.c: In function 'main':
main.c:16:26: error: too few arguments to function 'translate'
   16 |     printf("%s world\n", translate(0, TRANSLATION_HELLO));
      |                          ^~~~~~~~~
main.c:6:13: note: declared here
    6 | const char *translate(uint16_t index, const char *english, const char *italian, const char *german , const char *spanish) {
      |             ^~~~~~~~~

A quick peek at the definition of TRANSLATION_HELLO would reveal that's where the argument list hides, but just reading the compiler output would not be enough. In fact, it may seem one needs to provide additional arguments after TRANSLATION_HELLO, as translate takes five arguments but only two are given at the call site. All of this adds preventable cognitive load to the reader, which is never a desirable outcome.

Secondly, using macros for recurring string constants can take a significant toll on the resource cost of the final application. Every invocation of translate with a translation macro forces the compiler to allocate space for 4 new strings in the .rodata section, even if they are always the same.

Note: to be precise, gcc is able to bundle repeated inline declarations of the same string in the same file. This means that if main.c uses TRANSLATION_HELLO 10 times, each string will only appear once in the resulting object file. However, if two files use TRANSLATION_HELLO and are then linked together, the result will contain each string twice. I'm not sure how much of this behaviour is guaranteed or specific to my experiments, but I always assume the worst case in my day to day programming tasks.

It's not uncommon for medium sized projects to contain hundreds of message strings over a handful of languages. If even some of them are repeated they can quickly add up to a non-negligible amount of memory, depending on the target platform.

Bottom line, I don't recommend adopting this approach.

String Arrays

The problem of repeated memory allocation applies every time you name a string resource with a macro. The solution in these cases is to use a const variable instead.

It's not directly applicable here because the macro also relies on a bit of metaprogramming to pass the string list as arguments, but we can modify the translate function to accommodate an array instead:

const char *translate(language_t language, const char **translations) {
    return translations[language];
}

The translation rows then become

const char *TRANSLATION_HELLO[] = {"Hello", "Ciao", "Halo", "Hola"};
const char *TRANSLATION_WORLD[] = {"World", "Mondo", "Welt", "Mundo"};

The usage remains unchanged.

This solves the issue of repetition, as each invocation of translate refers to a variable instead of another instance of the same string literals.

As arrays are very flexible tools, it also introduces a few quirks and corresponding variations.

In the first naive attempt I omitted the array's size; having the compiler infer it is convenient but makes it vulnerable to errors of omission or addition. While extra translations won't do any harm, forgetting a language may lead to an out-of-bounds access in the worst case scenario. This can be easily fixed by explicitly stating the number of languages:

const char *TRANSLATION_HELLO[NUM_LANGUAGES] = {"Hello", "Ciao", "Halo", "Hola"};
const char *TRANSLATION_WORLD[NUM_LANGUAGES] = {"World", "Mondo", "Welt", "Mundo"};

Note: while C syntax allows the array's length to be specified in the function signature (e.g. const char *translate(language_t language, const char *translations[NUM_LANGUAGES])) it will be ignored, so it's mostly pointless to include it.

The translation rows are starting to look a little bloated though. Copy and paste can get you a long way, but it's still a pain to read if a good chunk of the line is meaningless, repeated boilerplate.

One way to solve this is to use a two-dimensional matrix instead of a set of simple arrays. Each column still corresponds to a language and now there is a row for each message. To refer to a row we use an index, defined as an enum.

typedef enum {
    TRANSLATION_HELLO = 0,
    TRANSLATION_WORLD,
    NUM_TRANSLATIONS,
} translation_t;

static const char *TRANSLATIONS[NUM_TRANSLATIONS][NUM_LANGUAGES] = {
    [TRANSLATION_HELLO] = {"Hello", "Ciao", "Halo", "Hola"},
    [TRANSLATION_WORLD] = {"World", "Mondo", "Welt", "Mundo"},
};

Adding the explicit index for each row helps mitigate the possibility of a mismatch between the translation_t enum and the actual rows of the matrix.

The translate function should then refer to TRANSLATIONS as a static variable, like so:

const char *translate(language_t language, translation_t translation) {
    return TRANSLATIONS[translation][language];
}

The choice between using individual translation rows and an array of arrays is up to personal preference.

Back to Switch

For a long time I had been satisfied with the last mentioned approach until - by sheer chance - I stumbled upon its cost in terms of memory consumption.

I was working on a project that started to get tight on memory when I ran into the .data section entry for the file that hosted the translation array.

While obvious in hindsight, since all strings I've used so far are constant resources I had always naively assumed the only cost to pay was in the size of the .rodata section, which ends up occupying flash memory - or whichever inexpensive and abundant memory type your code resides in. I was wrong.

Here is the contents of a binary with the TRANSLATIONS table included:

$ objdump -t -j .data translations

a.out:     file format elf64-x86-64

0000000000004020  w      .data  0000000000000000              data_start
0000000000004080 g       .data  0000000000000000              _edata
0000000000004020 g       .data  0000000000000000              __data_start
0000000000004028 g     O .data  0000000000000000              .hidden __dso_handle
0000000000004040 g     O .data  0000000000000040              TRANSLATIONS
0000000000004080 g     O .data  0000000000000000              .hidden __TMC_END__

You can see the TRANSLATIONS symbol spans from address 0x4040 to 0x4080: that's 0x40, a whopping 64 bytes for just two translation rows!

Let's print it out to see it in more detail:

printf("Sizes:\nTotal\t%3lu bytes\nRow\t%3lu bytes\nString*\t%3lu bytes\n",
       sizeof(TRANSLATIONS), sizeof(TRANSLATIONS[0]), sizeof(TRANSLATIONS[0][0]));

The result - on a 64 bit machine - is:

Total    64 bytes
Row      32 bytes
String*   8 bytes

In this example we only have two strings to translate; in a real world scenario they are usually in the order of a few hundred: this means that our table will end up occupying entire kilobytes of RAM just to index a few strings!

That's because, despite the strings themselves residing in the .rodata section - and thus in more abundant flash memory - the table still needs to keep pointers to said strings in RAM; each pointer in turn requires a word of memory (8 bytes on my machine, 4 bytes on a typical 32-bit microcontroller). To get back to the numbers, I realized that translating about 200 messages into four languages was costing my application about 4% of its available RAM!

We can do much better than that. My solution was to partially go back to the first idea and dispatch the required translation through several switches:

const char *translate(language_t language, translation_t translation) {
    switch (translation) {
        case TRANSLATION_HELLO:
            switch (language) {
                case LANGUAGE_ENGLISH:
                    return "Hello";
                case LANGUAGE_ITALIAN:
                    return "Ciao";
                case LANGUAGE_GERMAN:
                    return "Halo";
                case LANGUAGE_SPANISH:
                    return "Hola";
            }
            break;

        case TRANSLATION_WORLD:
            switch (language) {
                case LANGUAGE_ENGLISH:
                    return "World";
                case LANGUAGE_ITALIAN:
                    return "Mondo";
                case LANGUAGE_GERMAN:
                    return "Welt";
                case LANGUAGE_SPANISH:
                    return "Mundo";
            }
            break;

        default:
            break;
    }
    return NULL;
}

Now every string is in the .rodata section and the dispatching logic is implemented as code, thus residing in the .text section. Besides a small stack frame for calling translate, RAM is thus untouched and safe.

This new translation function will suffer from a slight performance decrease: while addressing an array is completed in constant time, walking through a switch can be logarithmic or even linear in the number of options, depending on the optimizations applied by the compiler. I would however judge that in most cases the improvement in memory consumption trumps the additional time overhead.

Only a not-so-small issue remains: this last implementation is at least one order of magnitude more verbose than any previous solution.

Typing ~10 lines of code for every new message would be bad enough, and that's assuming the developer is the one handling the actual translation process. That part is usually outsourced to a third party with enough knowledge to translate the text into the required languages.

Is said party supposed to know a bit of C as well, enough to type the final result using one of the aforementioned approaches? Or are you - the developer - going to copy each translation in your source files by hand?

Luckily this step can be easily optimized with a sprinkle of metaprogramming.

Automatically Generate Sources

I have actually witnessed projects where source files were sent out to the translators with the instruction to fill them out respecting C syntax as much as possible. That is, in my opinion, neither professional nor acceptable. The translation contents should be saved in an appropriate format, like CSV, and then converted during compilation to C sources.

In this way the translations can be edited as a common Office spreadsheet and you don't need to teach non-programmers about details that are irrelevant to their job. Meanwhile the conversion can be implemented by a simple script - maybe in Python - and be integrated in the project's compilation process.
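As a sketch of the input side, the CSV rows can be loaded into a dictionary with Python's csv module before any generation happens (the one-message-per-row layout and function name are assumptions of mine):

```python
import csv
import io

def load_translations(csv_text):
    """Parse 'KEY,english,italian,...' rows into {key: [strings]}."""
    translations = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        translations[row[0]] = row[1:]
    return translations

sample = "HELLO,Hello,Ciao,Halo,Hola\nWORLD,World,Mondo,Welt,Mundo\n"
table = load_translations(sample)
print(table["HELLO"])  # ['Hello', 'Ciao', 'Halo', 'Hola']
```

Using the csv module rather than a naive split(",") also handles quoted fields, which real translations (containing commas) will eventually need.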

Note: metaprogramming is a terrible and powerful tool, to be employed only for the most mundane and verbose of tasks. It happens to apply very well to our predicament.

Just for context, this is part of a script that converts a CSV translation file into sources using the last technique I mentioned:

# Read a CSV file and load it in `translations` ... 
# Open up `h` and `c` files to be generated ...

h.write(f"#ifndef AUTOGEN_FILE_{name.upper()}_H_INCLUDED\n")
h.write(f"#define AUTOGEN_FILE_{name.upper()}_H_INCLUDED\n\n")
c.write("#include <stddef.h>\n")
c.write(f"#include \"AUTOGEN_FILE_{name}.h\"\n\n")

for key, value in translations.items():
    prefix = f"{name}_{key}"

    c.write(f"const char *{prefix}_get({prefix}_t element, unsigned int language) {{\n")
    c.write(f"{TAB}switch (element) {{\n")
    count = 1

    h.write(f"typedef enum {{\n")
    for enum in value.keys():
        case = f"{prefix.upper()}_{enum.upper()}"

        if count == 1:
            h.write(f"{TAB}{case} = 0,\n")
        else:
            h.write(f"{TAB}{case},\n")
        count += 1

        c.write(f"{TAB*2}case {case}:\n")
        c.write(f"{TAB*3}switch (language) {{\n")

        language_index = 0
        for string in value[enum]:
            c.write(f"{TAB*4}case {language_index}: return \"" + string.replace('"', '\\"') + "\";\n")
            language_index += 1
        c.write(f"{TAB*4}default: return NULL;\n")
        c.write(f"{TAB*3}}}\n")
        c.write(f"{TAB*3}break;\n")

    c.write(f"{TAB*2}default: return NULL;\n")
    c.write(f"{TAB}}}\n")
    c.write("}\n\n")

    h.write(f"}} {prefix}_t;\n\n\n")
    h.write(f"const char *{prefix}_get({prefix}_t element, unsigned int language);\n")

h.write(f"\n#endif  // AUTOGEN_FILE_{name.upper()}_H_INCLUDED\n")



Baffled Mention: Separate Asset Bundling

One final approach deserves a mention due to the fact that I've witnessed it in a couple of projects.

Disregarding all that has been said up until now, one could bundle the translations as a separate resource file (e.g. CSV, JSON or even a binary format) and have the application load and parse it at runtime.

While quite straightforward, I don't really see the benefits of this approach. I guess it grants the flexibility of modifying the translations without having to update the whole software, but on the other hand the application now has the additional error condition of being loaded with an incompatible translation set.

Maybe there are other advantages I can't see; maybe if the platform is at a high enough level it's just more convenient to rely on an external file rather than jumping through hoops to embed it in the application.


Whenever I start working on a new project I now include a Python script that converts a CSV translation file to one of the listed implementations. Your mileage may vary, but for me it works like a charm and greatly reduces the overhead of handling different languages.

Comment by jms_nh · October 18, 2023
I have a hard time believing that switch-case statements are the most RAM-efficient way to handle this use case.

My recommendation would be to put the strings in program memory (typically Flash), grouped by language:


and then in RAM, have a buffer and an array of char * that gets initialized to each of the words whenever a language is selected. The buffer just needs to be large enough to hold the longest set of words, and you can use code generation to determine that, rather than create switch-case statements.

This also isn't a new problem -- what have other people done to solve it?

Comment by QL · October 21, 2023

Here is one possible solution to the problem of language translations without switch statements, python scripts, and RAM buffers. Everything conveniently separated into languages and all in ROM.

First, I would define an enumeration of all phrases used in the firmware. For this example, it will look as follows:

enum TR_Phrases {
    TR_HELLO,
    TR_OPEN_FILE,
    TR_GOOD_MORNING,

    . . .

    TR_MAX // keep last
};
Now, I would define string arrays in ROM, each such array located in a separate .c file, one per language:


char const *TR_English[TR_MAX] = { // const string array in ROM
    "hello",
    "open file",
    "good morning",

    . . .
};

char const *TR_Spanish[TR_MAX] = { // const string array in ROM
    . . .
};
(A translation file, such as spanish.c, can be created by a non-programmer by giving them english.c and requesting to translate each English phrase into Spanish, and put them all in the same order.) 

Finally, in the main program, I would use a pointer to the language array:

char const **TR_selected; // currently selected translation

. . .

// select the language translation at runtime...

TR_selected = TR_Spanish; // or TR_German or any other language

Then, whenever I need to get a translated phrase, I would use the TR_selected[] entry. For example, to get a selected translation for TR_OPEN_FILE, I would use:

// print the phrases in the currently selected language...

printf("%s", TR_selected[TR_HELLO]);

. . .

printf("%s %s", TR_selected[TR_HELLO],


I hope this conveys the general idea.

Comment by Maldus512 · October 23, 2023

So you are referring to the possibility of storing the string array itself in ROM memory instead of RAM? That would work, but isn't it an option dependent on the compiler/processor/architecture you are using?

Comment by Maldus512 · October 19, 2023

That is certainly an interesting optimization, thank you for bringing it up.

I'm still not sure, however, how it would improve RAM usage with respect to the switch-case solution. The latter only requires the activation record of the translate function (which I would still use in any case for code readability); while dividing the array size by the number of supported languages, you still have a linear RAM cost with the solution you are proposing. Am I missing something?

Pretty much all of the solutions listed were found in other embedded projects. Open source embedded applications are hard to come by, so my experience is quite limited. Do you know of other relevant instances of this problem?
