This feature is supported for the "Marketing and support" as well as the "Ad-hoc documents" project types.
When you start using Lokalise, chances are you already have content translated into multiple languages. This might include blog posts, email templates, or regular HTML files: in other words, any content type supported by the Marketing and support or Ad-hoc documents projects.
However, in many cases different language versions won't be 100% identical. For example, a translated blog post might have extra empty paragraphs added by mistake, or a slightly different layout. What will happen when you import the original and the translated blog post content to Lokalise? Thanks to a state-of-the-art machine-learning model, Lokalise will do its best to properly align the base and the target paragraphs so that you don't need to do it manually. In other words, even if the target version has some extra paragraphs (or lacks a few paragraphs), all text elements will still be properly connected to the base version.
Moreover, once your source and target content is imported to Lokalise, the extracted entries will be automatically added to the translation memory.
Typical use cases
The translation alignment feature shines in the following scenarios:
You have multiple language versions of the same content, and these versions have identical layout.
Different language versions have different layouts. For example, in one of the versions an unordered list might be replaced with regular paragraphs, or a table might be replaced with a block of text.
Some translations are missing from the target content versions. For example, the base content has an unordered list that is missing from the target file.
Seeing it in action
To see this feature in action, let's use two very simple HTML files representing onboarding email templates.
Suppose you have English and Spanish emails that should be imported to Lokalise (some tags have been removed for brevity):
<h1>¡La bienvenida a bordo!</h1>
If you are not familiar with HTML, don't worry. The main thing to note here is that we have some text elements that we would like to upload to Lokalise. There's a header, a paragraph of text, and a short list with some items.
The content layout looks nearly identical but if you look closer, the last few elements are different:
The English version contains three items but the Spanish version has only two. Moreover, the HTML tags in the Spanish version are different, and there's an extra empty paragraph (<p></p>) as well. In fact, this is very common, especially when the content is modified with the help of a graphical editor.
Now, what is going to happen when these two files are uploaded to Lokalise? Let's find out!
Proceed to the Upload page and add both translation files (alternatively, you can first upload the English version and then the Spanish file):
Here you can already see that Lokalise has detected 6 keys for English but only 5 keys for Spanish. This is because empty tags are ignored.
Let's hit Import files and proceed to the Editor:
As you can see, the list items have been aligned properly, and the empty tags have been discarded! There's no translation for the "Blog" string because it is not present in the Spanish template but of course you can add any missing translations to fill in any unintentional historical gaps.
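The way empty tags drop out of the key count can be illustrated with a minimal Python sketch. This is not Lokalise's actual implementation: the `BLOCK_TAGS` set, the `TextBlockCollector` class, the `text_blocks` helper, and the sample markup are all assumptions made for illustration. It only shows why an empty `<p></p>` produces no entry when text is extracted block by block.

```python
from html.parser import HTMLParser

# Tags treated as text-bearing blocks in this sketch.
# This set is an assumption, not the list Lokalise uses.
BLOCK_TAGS = {"h1", "h2", "h3", "p", "li"}


class TextBlockCollector(HTMLParser):
    """Collect the text of each block element, skipping empty ones."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # non-empty text blocks, in document order
        self._current = None   # text pieces of the block being read, or None

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self._current is not None:
            text = "".join(self._current).strip()
            if text:  # an empty <p></p> contributes nothing
                self.blocks.append(text)
            self._current = None


def text_blocks(html):
    """Return the non-empty text blocks found in an HTML string."""
    collector = TextBlockCollector()
    collector.feed(html)
    collector.close()
    return collector.blocks


# Hypothetical sample, not the templates used in this article.
sample = "<h1>Welcome</h1><p></p><p>Thanks for joining.</p>"
print(text_blocks(sample))  # ['Welcome', 'Thanks for joining.']
```

The sample contains three block elements but yields only two text entries, which mirrors how a file with an extra empty paragraph can end up with fewer detected keys than tags.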
Checking results of the translation alignment
Sometimes the imported keys might still contain issues because the base and the target versions are too different. Thus, to find potential problems we recommend doing the following:
Use the Filter dropdown to find all the Untranslated keys. It may be easiest to review one language at a time, deselecting the other languages from the language filter. This way you'll be able to detect keys with missing target translations. Usually the target translation is empty due to the following:
The corresponding translation has not been detected in the target file
The base content has two different paragraphs whereas the target has only one corresponding paragraph with merged content. For example, this might happen when the base has two separate <p> tags but inside the target version your translator has mistakenly removed the <p> tags and added a line break (<br>) to separate the portions of the text. In this case you'll need to manually split the target text.
To detect this problem you can also search for <br><br> in your content on Lokalise.
Carefully check the entries (files) that have many untranslated keys. This may indicate that the imported target version is not really a translation but entirely different content.
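If you prefer to check for merged paragraphs outside of the editor, the double line break search described above can be sketched in Python. This assumes you have the translation values available as plain strings (for example, copied from an export); the `DOUBLE_BR` pattern and both helper names are hypothetical and not part of any Lokalise tooling.

```python
import re

# Two or more consecutive <br> tags, allowing <br/>, <br /> and
# whitespace between them. Case-insensitive to also catch <BR>.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def has_merged_paragraphs(text):
    """Return True if the text contains a run of two or more <br> tags."""
    return DOUBLE_BR.search(text) is not None


def split_merged_paragraphs(text):
    """Split the text on runs of <br> tags and drop empty pieces."""
    return [part.strip() for part in DOUBLE_BR.split(text) if part.strip()]


# Hypothetical target value where two paragraphs were merged into one.
merged = "Primer párrafo.<br><br>Segundo párrafo."
print(has_merged_paragraphs(merged))    # True
print(split_merged_paragraphs(merged))  # ['Primer párrafo.', 'Segundo párrafo.']
```

A single `<br>` is deliberately not treated as a paragraph boundary here, because a lone line break is often intentional. The split result is only a suggestion: you still need to assign each piece to the correct key manually.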
Translation memory in existing teams
If you are heavily using translation memory within your team, and the memory is already populated with some content, please be very careful when importing large volumes of new content. This is because the translation memory will be automatically populated after the import, and if something goes wrong, your memory might be polluted with many invalid entries that are hard to remove.
Thus, when adopting this new feature and importing large portions of content to an existing team, we strongly recommend creating a separate translation memory storage and using it within the new project.
To achieve that, first open the Team settings:
Switch to the Translation memory tab and click Create new TM:
Give it a name and hit Create TM:
Now you can create a new project to import content to (with the Marketing and support or the Ad-hoc documents type). However, before importing the content, make sure to open the project settings:
Pick the newly created TM from the Translation memory target dropdown because this setting controls where new entries are saved:
Click Save changes at the bottom of the page. Now all the content imported within the current project will be saved to the new translation memory without affecting your existing entries.
If your target content contains some elements that cannot be found within the base version, the extra content will be ignored. Lokalise only supports the same layout in different content versions: the number of paragraphs, headers, and other block-level components should be roughly similar.
For example, suppose your English HTML file has only one paragraph, while the Spanish version of the same document has two:
<p>¡La bienvenida a bordo!</p>
In this example the second paragraph in the Spanish version will be effectively ignored because it cannot be connected to any text within the base version.
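Before importing, it can help to compare how many block-level elements the base and target files contain. The Python sketch below is one rough way to do that and is not how Lokalise aligns content: the `BLOCK_TAGS` set, the helper names, and the English sample text are assumptions made for illustration. It simply counts opening tags, so it also counts empty paragraphs that Lokalise would ignore.

```python
from collections import Counter
from html.parser import HTMLParser

# Block-level tags to compare. This set is an assumption for the sketch.
BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "li", "td", "th"}


class BlockCounter(HTMLParser):
    """Count opening tags of block-level elements."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.counts[tag] += 1


def block_counts(html):
    """Return a Counter mapping each block tag to its number of occurrences."""
    counter = BlockCounter()
    counter.feed(html)
    counter.close()
    return counter.counts


def layout_differences(base_html, target_html):
    """Return {tag: (base_count, target_count)} for tags whose counts differ."""
    base = block_counts(base_html)
    target = block_counts(target_html)
    return {
        tag: (base[tag], target[tag])
        for tag in sorted(set(base) | set(target))
        if base[tag] != target[tag]
    }


# The English text is a hypothetical stand-in; the second Spanish
# paragraph is invented to represent extra target content.
base = "<p>Welcome aboard!</p>"
target = "<p>¡La bienvenida a bordo!</p><p>Texto adicional.</p>"
print(layout_differences(base, target))  # {'p': (1, 2)}
```

A non-empty result does not mean the import will fail. It only flags files where the target has more or fewer blocks than the base, which are the ones most worth reviewing after import.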
The base layout is used
While Lokalise will properly align content even if different versions use slightly different layouts, the exported translations will follow the layout used in the base version.
For example, let's suppose we have an English HTML file with a list and a Spanish version with the same items but in this case the list has been replaced with regular paragraphs:
This content will be aligned properly but when the Spanish version is exported, you'll see the paragraphs being replaced with an unordered list following the layout in the base version.
In other words, Lokalise does not allow having different layouts for different languages: all locales follow the layout provided for the base language.
Transcreated content might not be aligned properly
If the target content contains transcreated text rather than direct translations, the algorithm might not be able to match the base and target texts because their messaging deviates too much to be clearly aligned.
Suppose we have the following English paragraph with a welcoming message and also a Spanish version:
<p>Me alegra darle la bienvenida en este maravilloso día.</p>
However, while the Spanish text conveys the original message, it is very different from the base version. In fact, this is not a direct translation, so the target version is only loosely connected to the base.
Therefore, in this case the English text will be imported to Lokalise but the corresponding Spanish version will be ignored, as Lokalise won't be able to establish a connection between these two texts.