This feature is supported for the Marketing and support as well as the Ad hoc documents project types.

When you begin working with Lokalise, you might already possess content translated into various languages. This content could range from blog posts and email templates to standard HTML files, essentially encompassing any content type that fits within the scope of Marketing and support or Ad hoc documents projects.

It's common for translations not to match their original versions perfectly. For instance, a translated blog post may unintentionally include extra empty paragraphs or have a slightly altered layout. When you import both the original and translated versions of content into Lokalise, a machine-learning model aligns the base text with its translated counterparts, so you don't need to adjust them manually. This ensures that, despite any additional or missing paragraphs in the translated version, all text elements remain correctly linked to the original.

Furthermore, upon importing your source and translated content into Lokalise, the system automatically incorporates the extracted entries into the translation memory.

Typical use cases

The translation alignment feature is particularly useful in scenarios such as:

- Handling multiple language versions of the same content that share an identical layout.
- Dealing with language versions that feature different layouts. For example, one version might substitute an unordered list with paragraphs, or replace a table with a block of text.
- Addressing situations where some translations are absent from the translated versions, for example when an unordered list in the base content is missing from the target file.

Seeing it in action

To see this feature in action, let's use two very simple HTML files representing onboarding email templates.

Suppose you have English and Spanish emails that should be imported to Lokalise (some tags have been removed for brevity):

English

<h1>Welcome onboard!</h1>

<p>We are thrilled to welcome you! Here are some links to help you get started:</p>

<ul>
  <li>Documentation</li>
  <li>Blog</li>
  <li>Forum</li>
</ul>

Spanish

<h1>¡La bienvenida a bordo!</h1>

<p>¡Estamos encantados de darle la bienvenida! Aquí hay algunos enlaces que le ayudarán a empezar:</p>

<p></p>
<p></p>
<p>Documentación</p>
<p>Foro</p>

If you are not familiar with HTML, don't worry. The main thing to note here is that we have some text elements that we would like to upload to Lokalise. There's a header, a paragraph of text, and a short list with some items.

The content layout looks nearly identical but if you look closer, the last few elements are different:

<p></p>
<p></p>
<p>Documentación</p>
<p>Foro</p>

The English version contains three list items but the Spanish version has only two. Moreover, the Spanish version uses different HTML tags (paragraphs instead of a list), and there are extra empty paragraphs (<p></p>) as well. This is very common, especially when the content is modified with the help of a graphical editor.

Now, what is going to happen when these two files are uploaded to Lokalise? Let's find out!

Proceed to the Upload page and add both translation files (alternatively, you can first upload the English version and then the Spanish file):

Here you can already see that Lokalise has detected 6 keys for English but only 5 keys for Spanish. This is because empty tags are ignored.
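To get a feel for why empty tags do not produce keys, here is a minimal Python sketch that pulls text blocks out of the two snippets above and skips empty elements. It is an illustration only, not Lokalise's actual parser: the `TEXT_TAGS` set and the `extract_blocks` helper are assumptions made for this example. Because the snippets were abbreviated, the counts it prints will not necessarily match the numbers shown on the Upload page.

```python
from html.parser import HTMLParser

# Tags treated as text containers in this sketch (an assumption, not Lokalise's list).
TEXT_TAGS = {"h1", "h2", "h3", "p", "li"}


class TextBlockExtractor(HTMLParser):
    """Collects the text of each text-bearing element, skipping empty ones."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._current_tag = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in TEXT_TAGS:
            self._current_tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current_tag:
            text = "".join(self._buffer).strip()
            if text:  # empty elements such as <p></p> produce no entry
                self.blocks.append((tag, text))
            self._current_tag = None


def extract_blocks(html):
    """Hypothetical helper: return (tag, text) pairs for non-empty elements."""
    parser = TextBlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


english = """
<h1>Welcome onboard!</h1>
<p>We are thrilled to welcome you! Here are some links to help you get started:</p>
<ul>
  <li>Documentation</li>
  <li>Blog</li>
  <li>Forum</li>
</ul>
"""

spanish = """
<h1>¡La bienvenida a bordo!</h1>
<p>¡Estamos encantados de darle la bienvenida! Aquí hay algunos enlaces que le ayudarán a empezar:</p>
<p></p>
<p></p>
<p>Documentación</p>
<p>Foro</p>
"""

english_blocks = extract_blocks(english)
spanish_blocks = extract_blocks(spanish)
print(len(english_blocks), len(spanish_blocks))
```

The two empty Spanish paragraphs never reach the `blocks` list, so the Spanish file ends up with one entry fewer than the English one.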

Let's hit Import files and proceed to the Editor:

As you can see, the list items have been aligned properly, and the empty tags have been discarded! There's no translation for the "Blog" string because it is not present in the Spanish template. You can, of course, add any missing translations in the Editor to fill such gaps.

Checking results of the translation alignment

Sometimes the imported keys might still contain issues because the base and the target versions are too different. Thus, to find potential problems we recommend doing the following:

Enable QA checks in the project settings and then use the Filter dropdown to find keys with QA issues. The most useful QA checks are:
- Inconsistent placeholders
- Inconsistent HTML
- Different numbers
- Different URLs
- Different emails
- Different brackets
Use the Filter dropdown to find all the Untranslated keys. It may be easiest to review one language at a time, deselecting the other languages from the language filter. This way you'll be able to detect keys with missing target translations. Usually the target translation is empty due to the following:
- The corresponding translation has not been detected in the target file
- The base content has two different paragraphs whereas the target has only one corresponding paragraph with merged content. For example, this might happen when the base has two separate p tags but inside the target version your translator has mistakenly removed the p tags and added a line break (br) to separate the portions of the text. In this case you'll need to manually split the target text.
 To detect this problem you can also search for a <br> or <br/> in your content on Lokalise.
Carefully check the entries (files) that have many untranslated keys. This may indicate that the imported target version is not really a translation but entirely different content.
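If you prefer to run some of these checks outside the Lokalise UI, the two manual steps above (looking for line breaks in merged paragraphs and spotting files with many untranslated keys) can be sketched in Python. This is only a sketch under stated assumptions: the `files` data shape, the function names, and the 0.5 threshold are all hypothetical, and how you obtain the data is left out.

```python
import re

# Matches <br>, <br/>, and <br /> in any letter case.
BR_PATTERN = re.compile(r"<br\s*/?>", re.IGNORECASE)


def find_merged_candidates(keys):
    """Return key names whose target text contains a line break tag."""
    return [
        name
        for name, entry in keys.items()
        if BR_PATTERN.search(entry["target"] or "")
    ]


def untranslated_ratio(keys):
    """Share of keys in one file whose target translation is empty."""
    if not keys:
        return 0.0
    empty = sum(1 for entry in keys.values() if not (entry["target"] or "").strip())
    return empty / len(keys)


def review_report(files, threshold=0.5):
    """Build a per-file summary; the threshold is an arbitrary assumption."""
    report = {}
    for filename, keys in files.items():
        ratio = untranslated_ratio(keys)
        report[filename] = {
            "merged_candidates": find_merged_candidates(keys),
            "untranslated_ratio": ratio,
            "suspicious": ratio > threshold,
        }
    return report


# Hypothetical data shape: file name -> key name -> base and target text.
files = {
    "welcome.html": {
        "title": {"base": "Welcome onboard!", "target": "¡La bienvenida a bordo!"},
        "intro": {
            "base": "We are thrilled to welcome you!",
            "target": "¡Estamos encantados!<br>Aquí hay algunos enlaces.",
        },
        "links": {"base": "Here are some links.", "target": ""},
    },
    "pricing.html": {
        "title": {"base": "Pricing", "target": ""},
        "body": {"base": "Plans are listed below.", "target": ""},
        "footer": {"base": "Contact us", "target": "Contáctenos"},
    },
}

report = review_report(files)
for filename, summary in report.items():
    print(filename, summary)
```

Keys listed under `merged_candidates` are worth opening in the Editor to split the target text manually, and files marked `suspicious` may not be translations of the base file at all.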

Important notes

Translation memory in existing teams

If you are heavily using translation memory within your team, and the memory is already populated with some content, please be very careful when importing large volumes of new content. This is because the translation memory will be automatically populated after the import, and if something goes wrong, your memory might be polluted with many invalid entries that are hard to remove.

Thus, when adopting this new feature and importing large portions of content to an existing team, we strongly recommend creating a separate translation memory storage and using it within the new project.

To achieve that, first open the Team settings:

Switch to the Translation memory tab and click Create new TM:

Give it a name and hit Create TM:

Now you can create a new project to import content to (with the Marketing and support or the Ad hoc documents type). However, before importing the content, make sure to open the project settings:

Pick the newly created TM from the Translation memory target dropdown because this setting controls where the new entries will be saved:

Click Save changes at the bottom of the page. Now all the content imported within the current project will be saved to the new translation memory without affecting your existing entries.

Missing content

If your target content contains elements that cannot be found within the base version, the extra content will be ignored. Lokalise expects the different content versions to share roughly the same layout: the number of paragraphs, headers, and other block-level components should be similar.

For example, suppose your English HTML file has only one paragraph, while the Spanish version of the same document has two:

English

<p>Welcome!</p>

Spanish

<p>¡La bienvenida a bordo!</p>
<p>¡Estamos encantados de darle la bienvenida!</p>

In this example the second paragraph in the Spanish version will be effectively ignored because it cannot be connected to any text within the base version.
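Before importing, you may want a rough idea of whether a target file has more block-level elements than its base. The following Python sketch compares total counts for the example above. It is not how Lokalise aligns content: the `BLOCK_TAGS` set and both helper functions are assumptions for illustration, and empty elements are counted too, so treat the result as an upper bound.

```python
from collections import Counter
from html.parser import HTMLParser

# Block-level tags considered by this sketch (an assumption).
BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "li", "td", "th"}


class BlockCounter(HTMLParser):
    """Counts opening tags of the block-level elements listed above."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.counts[tag] += 1


def count_blocks(html):
    """Hypothetical helper: return a Counter of block-level tags in the HTML."""
    parser = BlockCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def extra_in_target(base_html, target_html):
    """How many more block elements the target has than the base (never negative)."""
    base_total = sum(count_blocks(base_html).values())
    target_total = sum(count_blocks(target_html).values())
    return max(0, target_total - base_total)


english = "<p>Welcome!</p>"
spanish = (
    "<p>¡La bienvenida a bordo!</p>"
    "<p>¡Estamos encantados de darle la bienvenida!</p>"
)

extra = extra_in_target(english, spanish)
print(f"Target has {extra} block element(s) with no counterpart in the base")
```

Totals are compared rather than per-tag counts because a target file may legitimately use different tags for the same content, for instance paragraphs in place of list items.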

The base layout is used

While Lokalise will properly align content even if different versions use slightly different layouts, the exported translations will follow the layout used in the base version.

For example, let's suppose we have an English HTML file with a list and a Spanish version with the same items but in this case the list has been replaced with regular paragraphs:

English

<ul>
  <li>Documentation</li>
  <li>Blog</li>
  <li>Forum</li>
</ul>

Spanish

<p>Documentación</p>
<p>Blog</p>
<p>Foro</p>

This content will be aligned properly but when the Spanish version is exported, you'll see the paragraphs being replaced with an unordered list following the layout in the base version.

In other words, Lokalise does not allow having different layouts for different languages: all locales follow the layout provided for the base language.
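One way to picture this behavior is to think of the base file as a template whose text slots are filled with translations on export. The Python sketch below models that idea with `string.Template`. The key names (`documentation`, `blog`, `forum`) and the `export` function are hypothetical and only illustrate the principle, not Lokalise's export implementation.

```python
from string import Template

# The layout comes from the base (English) file; each text slot is a key.
# Key names here are invented for the example.
base_layout = Template(
    "<ul>\n"
    "  <li>$documentation</li>\n"
    "  <li>$blog</li>\n"
    "  <li>$forum</li>\n"
    "</ul>"
)

translations = {
    "en": {"documentation": "Documentation", "blog": "Blog", "forum": "Forum"},
    "es": {"documentation": "Documentación", "blog": "Blog", "forum": "Foro"},
}


def export(layout, language):
    """Fill the base layout with the translations for one language."""
    return layout.substitute(translations[language])


spanish_export = export(base_layout, "es")
print(spanish_export)
```

Although the uploaded Spanish file used paragraphs, the exported Spanish text appears inside the `<ul>` and `<li>` elements of the base layout.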

Transcreated content might not be aligned properly

If the target content contains transcreated texts rather than translations, the algorithm might not be able to match the base and target texts, because their messaging deviates too much for a clear alignment.

Suppose we have the following English paragraph with a welcoming message and also a Spanish version:

English

<p>Welcome!</p>

Spanish

<p>Me alegra darle la bienvenida en este maravilloso día.</p>

While the Spanish text conveys the original message, it is very different from the base version. It is not a direct translation, so the target version is only vaguely connected to the base.

Therefore, in this case the English text will be imported to Lokalise but the corresponding Spanish version will be ignored, as Lokalise won't be able to establish a connection between these two texts.
