This feature is currently in beta.

Segmentation splits translations into smaller, relevant chunks. It makes translators more efficient, enriches translation memory, and improves the experience of localizing longer texts.

How does it work? Lokalise automatically splits translation text into smaller chunks using rules based on code elements and on features of the particular language, such as a full stop. You can also do this manually by taking any long text under a translation key, splitting it into multiple smaller segments, and translating them separately. The key is considered fully translated when all its segments in all languages are translated. Upon export, these segments are automatically combined into a single translation string.
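
The round trip described above can be sketched as follows. This is a minimal illustration, not Lokalise's implementation: the regular expression is a naive stand-in for the real language rules, and `split_into_segments` and `SegmentedKey` are hypothetical names introduced only for this example.

```python
import re
from dataclasses import dataclass, field


def split_into_segments(text: str) -> list[str]:
    """Naive stand-in for rule-based splitting: cut after '.', '!' or '?'.

    Trailing whitespace stays with its segment, so joining is lossless.
    """
    return re.findall(r".+?(?:[.!?]+\s*|$)", text, flags=re.S)


@dataclass
class SegmentedKey:
    name: str
    base_segments: list[str]
    # Language code -> one translation per base segment ("" = untranslated).
    translations: dict[str, list[str]] = field(default_factory=dict)

    def is_fully_translated(self) -> bool:
        # Every segment in every language must have a non-empty value.
        return bool(self.translations) and all(
            len(segs) == len(self.base_segments) and all(s.strip() for s in segs)
            for segs in self.translations.values()
        )

    def export(self, lang: str) -> str:
        # On export the segments are combined into one translation string.
        return "".join(self.translations[lang])


key = SegmentedKey("welcome", split_into_segments("Welcome back. Check your inbox."))
key.translations["fr"] = ["Bon retour. ", ""]
partly_done = key.is_fully_translated()
key.translations["fr"][1] = "Consultez votre messagerie."
fully_done = key.is_fully_translated()
exported = key.export("fr")
```

Keeping the trailing whitespace inside each segment is a simplification chosen so that a plain join reproduces the original text.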

Please note that segmentation cannot be disabled once a project is segmented, and it cannot be enabled for an existing non-segmented project. The only way to use it is to create a new project with segmentation enabled from the start.

Getting started

Segmentation is currently supported only for projects of the Software localization type. It is not supported for the Documents type.

When creating a new translation project, set its type to Software localization and enable the Split text into segments option:

Next, upload your translation files as usual. The text under each translation key will be segmented automatically.

How is segmentation performed?

To learn about some edge cases and potential issues that you may encounter when updating your segmented project, check out the corresponding article.

Segmentation has two main components:

  • Language-based segmentation: the text is split based on the rules of the language (processing is performed by NLTK). This type works on all content.

  • Code-based segmentation: this type works only for HTML content. In this case, segmentation is performed using HTML block-level tags (like p, div, article, ul, and others). Take a look at the following example:

In this case, we have five block tags, namely article, p, section, ul, and li. These block tags act as delimiters, and the text is separated accordingly. There are also two inline tags: strong and a. These inline tags are left intact and are not used during segmentation.

Here's the result:

As you can see, all the inline tags are present in the segments, whereas all the block tags were stripped out. However, this does not mean that p, article, and the other block tags are lost: they are simply hidden in the project editor. When you export your translation keys, all block tags and any whitespace between them are restored automatically. Under the hood, these block tags are stored in segment prefixes and suffixes.
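
To make the block-versus-inline distinction concrete, here is a minimal sketch using Python's standard `html.parser`. It is an assumption-laden illustration rather than the actual Lokalise segmenter: the `BLOCK_TAGS` set, the `segment_html` helper, and the sample markup are all hypothetical, and the sketch does not model the prefixes and suffixes that keep the block tags for export.

```python
from html.parser import HTMLParser

# Hypothetical subset of block-level tags that act as delimiters.
BLOCK_TAGS = {"article", "p", "section", "ul", "li", "div"}


class BlockSegmenter(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=False)
        self.segments: list[str] = []
        self._buf: list[str] = []

    def flush(self) -> None:
        text = "".join(self._buf).strip()
        if text:
            self.segments.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()  # a block tag ends the current segment
        else:
            self._buf.append(self.get_starttag_text())  # inline tag is kept

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()
        else:
            self._buf.append(f"</{tag}>")

    def handle_data(self, data):
        self._buf.append(data)

    def handle_entityref(self, name):
        self._buf.append(f"&{name};")

    def handle_charref(self, name):
        self._buf.append(f"&#{name};")


def segment_html(html: str) -> list[str]:
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep any text after the last block tag
    return parser.segments


sample = (
    "<article>"
    "<p>Read the <strong>release notes</strong> first.</p>"
    "<section><ul>"
    "<li>Install the app</li>"
    '<li>Open the <a href="/docs">docs</a></li>'
    "</ul></section>"
    "</article>"
)
segments = segment_html(sample)
```

With this sample, the three resulting segments keep their strong and a tags, while article, p, section, ul, and li only mark where one segment ends and the next begins.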

Segmentation specifics

Translation editor

  • Filters work with translation keys. This means that if you choose to show only unverified items, and one translation segment is unverified, then the whole key will be displayed.

  • When you edit the base language translation for a segment, all the corresponding segments in the target language will become unverified:
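
The two editor rules above can be sketched as follows. The `Seg` class and both functions are hypothetical names used only to illustrate the behavior described here, assuming that "corresponding" segments are those at the same position in each target language.

```python
from dataclasses import dataclass


@dataclass
class Seg:
    value: str
    is_unverified: bool = False


def key_matches_unverified_filter(segments: list[Seg]) -> bool:
    # Filters work on keys: one unverified segment is enough to show the key.
    return any(s.is_unverified for s in segments)


def edit_base_segment(
    base: list[Seg], targets: dict[str, list[Seg]], index: int, new_value: str
) -> None:
    # Editing a base segment marks the segment at the same position
    # in every target language as unverified.
    base[index].value = new_value
    for lang_segments in targets.values():
        lang_segments[index].is_unverified = True


base = [Seg("Hello."), Seg("Goodbye.")]
targets = {"de": [Seg("Hallo."), Seg("Tschüss.")]}
shown_before = key_matches_unverified_filter(targets["de"])
edit_base_segment(base, targets, 1, "See you soon.")
shown_after = key_matches_unverified_filter(targets["de"])
```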

Character limit

  • The character limit is set on a per-key basis, as before.

  • However, the limit counts the total number of characters across all segments of the key. For example, suppose you have a translation key with a character limit set to 100, and this key contains three segments. In the first segment you enter a phrase that is 60 characters long. When you proceed to the second segment, the limit will show as 60/100. In other words, you'll have only 40 characters left for the last two segments.

  • If the key's character limit is changed so that the total length of its segments exceeds the new limit, you'll see a warning message when editing the segments.
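
The arithmetic behind the shared limit can be sketched as below. The `limit_status` function is a hypothetical helper, not part of any Lokalise API, and it assumes the limit counts characters the way Python's `len` does.

```python
def limit_status(segments: list[str], limit: int) -> tuple[int, int, bool]:
    """Return (used, remaining, over_limit) for one key's segments."""
    used = sum(len(s) for s in segments)
    remaining = max(limit - used, 0)
    return used, remaining, used > limit


# Three segments, only the first one filled in with 60 characters.
segments = ["x" * 60, "", ""]
status = limit_status(segments, 100)

# Lowering the limit below the current total triggers the warning case.
status_after_change = limit_status(segments, 50)
```

Here `status` corresponds to the 60/100 counter with 40 characters left, and `status_after_change` flags the key as over its limit.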

Tasks

  • You can create tasks for translation keys only — not for individual segments.

  • You cannot include specific segments in a task or exclude them from it: all segments of a key are always included in the task.

  • The Offline XLIFF task works in the same way as for non-segmented projects.

  • When the initial task analysis runs, each segment is analyzed separately. Suppose you have one translation key with two segments. The first segment has an 85% match for 5 source words. The second segment has a 100% match for 7 source words, so the base value has 12 words in total. The initial analysis calculates translation memory (TM) matches per segment, so in the initial analysis table you'll see:

    • 7 in the TM 100% category

    • 5 in the TM 85-94% category

    • 2 in the Segments total category
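
The per-segment counting in this example can be reproduced with a short sketch. Only the two TM categories named above come from this article; the `other` bucket and the function names are placeholders assumed for illustration.

```python
def tm_bucket(match_percent: int) -> str:
    if match_percent == 100:
        return "TM 100%"
    if 85 <= match_percent <= 94:
        return "TM 85-94%"
    return "other"  # placeholder: remaining categories are not modeled here


def initial_analysis(segments: list[tuple[int, int]]) -> dict[str, int]:
    """segments holds (source_words, match_percent), one entry per segment."""
    table: dict[str, int] = {}
    for words, match in segments:
        bucket = tm_bucket(match)
        table[bucket] = table.get(bucket, 0) + words
    table["Segments total"] = len(segments)
    return table


# One key with two segments: 5 words at 85% and 7 words at 100%.
report = initial_analysis([(5, 85), (7, 100)])
```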

Orders

  • Translation orders can be created for translation keys only — you cannot add individual segments to the order.

Translation memory

Key merge

  • You can merge segmented translation keys, but only when they have the same number of segments.

Translation history

  • Each segment has its own translation history.

  • Each segment also shows the key's entire translation history from before it was segmented (this applies after the initial segmentation).

  • You can switch between the segment and key history using the Segment history switch in the History pane:

Apps (previously known as "integrations")

  • Apps support segmented projects. Upon importing, all texts are segmented automatically, and upon export they are automatically concatenated.

Branching

  • You cannot enable branching for a segmented project.

  • You cannot enable segmentation for a project that has branching enabled.

Changing segment statuses

You can set individual segments to reviewed or verified, as well as assign custom translation statuses. These actions have some specifics, which are summarized in the following table:

| Behavior | Unverified | Reviewed | Custom translation status |
| --- | --- | --- | --- |
| Change in base segment affects status of target segments | Yes | No | No |
| Change of segment value deactivates its status | Yes | Yes | No |
| When splitting a segment, its status is copied to newly created base segments | No | No | Yes |
| When splitting a segment, its status is copied to newly created target segments | Yes | No | No |
| When creating a key through API and forcing the status (is_unverified, is_reviewed), the status is copied to newly created base segments | Yes | Yes | N/A |
| When creating a key through API and forcing the status (is_unverified, is_reviewed), the status is copied to newly created target segments | Yes | No | No |

Manual split and merge

You can perform manual split and merge only on the base language. In other words, only the base language can change the segment count, while other languages follow the existing segment count. Please note that this operation can be done only in multilingual view, and the user must be a contributor for the base language.

Split

To split text into two segments, click your translation to edit it and place the cursor where you want the split to occur. Then press the Split button:

A new segment will be placed below the current one:

Take a look at the following diagram illustrating how splitting works:

Finally, you can perform a mass split on the base language. To do this, select the keys that you would like to split and choose Split into segments from the bulk actions dropdown:

In this case, all existing segments will be shifted as needed.

Merge

You can also merge two segments. To achieve this, start editing one of the segments and click the Merge button:

The currently chosen segment will be merged with the one above it. This means that you cannot perform the merge operation on the topmost segment.
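
The split and merge rules can be sketched on a plain list of base-language segments. Both functions are hypothetical helpers for illustration, and they do not model how target-language segments are adjusted.

```python
def split_segment(segments: list[str], index: int, cursor: int) -> list[str]:
    """Split segments[index] at the cursor; the new segment goes right below."""
    seg = segments[index]
    if not 0 < cursor < len(seg):
        raise ValueError("cursor must be inside the segment text")
    return segments[:index] + [seg[:cursor], seg[cursor:]] + segments[index + 1:]


def merge_with_previous(segments: list[str], index: int) -> list[str]:
    """Merge segments[index] into the segment above it."""
    if index == 0:
        raise ValueError("the topmost segment has nothing above it to merge with")
    merged = segments[index - 1] + segments[index]
    return segments[:index - 1] + [merged] + segments[index + 1:]


original = ["Hello world. Bye."]
after_split = split_segment(original, 0, 13)
after_merge = merge_with_previous(after_split, 1)
```

Merging the second segment back into the first restores the original text, which is why a split followed by a merge is a no-op in this sketch.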

Take a look at the following diagram illustrating how merging works:

Segmentation during key creation

Whenever you create new translation keys in your project with segmentation enabled, their translations will be segmented automatically. Here's an example:

Here, the text has two paragraphs marked with the p tags, and therefore the newly created translation key will contain two segments:

File exporting and segmentation

Upon exporting your translations, the segments of each key are automatically merged into a single translation string. However, there are some exceptions where you'll still receive separate segments in the output:

  • Offline XLIFF — upon export, keys will have new names: lokalise-<key_id>-<segment_number>.

  • Translation object in APIv2.

  • Segment object in APIv2.
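
As an illustration of the Offline XLIFF naming pattern, the sketch below builds and parses names of the form lokalise-<key_id>-<segment_number>. It assumes numeric key IDs and that segment numbers start at 1; neither detail is confirmed here, so the starting number is left as a parameter. Both function names are hypothetical.

```python
import re


def offline_xliff_names(key_id: int, segment_count: int, first_number: int = 1) -> list[str]:
    # first_number is an assumption: the article does not say where numbering starts.
    return [
        f"lokalise-{key_id}-{n}"
        for n in range(first_number, first_number + segment_count)
    ]


def parse_offline_xliff_name(name: str) -> tuple[int, int]:
    """Return (key_id, segment_number), assuming both parts are numeric."""
    match = re.fullmatch(r"lokalise-(\d+)-(\d+)", name)
    if match is None:
        raise ValueError(f"not a segmented key name: {name!r}")
    return int(match.group(1)), int(match.group(2))


names = offline_xliff_names(12345, 3)
parsed = parse_offline_xliff_name(names[-1])
```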

Please note that export filters treat all segments of a key as a group. For example, if you choose to export only the Translated strings, then all segments of a key must be translated for the key to be included in the export. If one of the segments is not translated, the whole translation key is ignored. The same logic applies to other filters. For instance, if you enable the Reviewed only strings filter, then all segments must be reviewed.
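
The all-or-nothing filter logic can be sketched as follows. The `Segment` class and the predicate names are hypothetical and only mirror the two filters mentioned above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Segment:
    value: str
    is_reviewed: bool = False


def all_translated(segments: list[Segment]) -> bool:
    return bool(segments) and all(s.value.strip() for s in segments)


def all_reviewed(segments: list[Segment]) -> bool:
    return bool(segments) and all(s.is_reviewed for s in segments)


def keys_to_export(
    keys: dict[str, list[Segment]], predicate: Callable[[list[Segment]], bool]
) -> list[str]:
    # A key is exported only if every one of its segments passes the filter.
    return [name for name, segments in keys.items() if predicate(segments)]


keys = {
    "intro": [Segment("Bonjour.", True), Segment("Bienvenue.", True)],
    "outro": [Segment("Merci.", True), Segment("")],
    "title": [Segment("Accueil", False)],
}
translated_only = keys_to_export(keys, all_translated)
reviewed_only = keys_to_export(keys, all_reviewed)
```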

Also, if you choose to replace empty translations with the base language value, the whole translation will be replaced, not the individual segments.

Finally, sorting does not affect segment order. For example, if you choose to sort by last updated, the sorting will take the most recently updated segments into consideration and sort the keys accordingly. However, the segment order won't be changed.
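
A minimal sketch of this sorting rule is shown below. The function name is hypothetical, and the newest-first direction is an assumption made for the example.

```python
from datetime import datetime


def sort_keys_by_last_updated(keys: dict[str, list[datetime]]) -> list[str]:
    """keys maps a key name to its segments' update times, in segment order.

    Keys are ordered by their most recently updated segment (newest first,
    an assumption); the per-key segment lists are never reordered.
    """
    return sorted(keys, key=lambda name: max(keys[name]), reverse=True)


keys = {
    "footer": [datetime(2023, 1, 5), datetime(2023, 1, 1)],
    "header": [datetime(2023, 1, 2), datetime(2023, 3, 9)],
}
ordered = sort_keys_by_last_updated(keys)
```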
