AI LQA

Evaluate translation quality with the power of artificial intelligence in a cost-effective way.

Written by Ilya Krukowski
Updated over a week ago

You might also be interested in AI translations, which let you translate keys in bulk, and in the AI suggestions feature.

AI LQA is a new task type that lets you perform localization quality assurance (LQA) on your content in a fully automated way.

It uses an AI assistant built on OpenAI's GPT API to automatically identify linguistic issues, categorize them according to the DQF-MQM framework, and deliver detailed reports with comments and suggested corrections. AI LQA helps improve translation quality without increasing costs.

AI LQA tasks assess and evaluate your existing translations rather than produce new translations.

Introduction

AI LQA is a task type powered by AI, designed to streamline and automate the evaluation of translation quality. Here's what AI LQA offers:

  • Evaluate translation quality: Create tasks to assess translation quality in more than 100 languages (see the full list below).

    • Automatic assignment: The task is automatically assigned to the AI assistant, ensuring a seamless workflow.

    • Locked keys during evaluation: While the task is ongoing, all translation keys are automatically locked until the AI assistant completes the evaluation for each respective language.

  • Completion notifications: Receive notifications when the language evaluation is finished.

  • Quality reports: Generate a comprehensive quality report that includes:

    • A scorecard for each evaluated language.

    • A detailed report with suggested corrections and comments from the AI assistant.

  • Glossary adherence checks: AI LQA performs checks to identify translations that do not adhere to your glossary terms, helping maintain consistency across your content.

AI LQA typically takes just a few minutes to complete, though the time can vary depending on the amount of content and the number of languages involved. Once you've selected the languages and scope, you'll be able to see an estimated time of completion.


Supported languages

AI LQA currently supports the following languages and their variations for different locales:

  1. Afrikaans

  2. Albanian

  3. Arabic

  4. Armenian

  5. Assamese

  6. Azerbaijani

  7. Bashkir

  8. Basque

  9. Belarusian

  10. Bengali

  11. Bihari

  12. Bosnian

  13. Bulgarian

  14. Catalan

  15. Cebuano

  16. Corsican

  17. Croatian

  18. Czech

  19. Danish

  20. Dutch

  21. English

  22. Esperanto

  23. Estonian

  24. Faroese

  25. Finnish

  26. French

  27. Frisian

  28. Galician

  29. Georgian

  30. German

  31. Greek

  32. Gujarati

  33. Haitian Creole

  34. Hebrew

  35. Hindi

  36. Hungarian

  37. Icelandic

  38. Indonesian

  39. Interlingua

  40. Interlingue

  41. Ido

  42. Irish

  43. Italian

  44. Japanese

  45. Javanese

  46. Kannada

  47. Kashmiri

  48. Kazakh

  49. Khmer

  50. Kabuverdianu

  51. Konkani

  52. Korean

  53. Kyrgyz

  54. Lao

  55. Latin

  56. Latvian

  57. Limburgish

  58. Lithuanian

  59. Luxembourgish

  60. Macedonian

  61. Malagasy

  62. Malay

  63. Malayalam

  64. Maltese

  65. Mauritian Creole

  66. Mongolian

  67. Marathi

  68. Neapolitan

  69. Nepali

  70. Norwegian

  71. Occitan

  72. Oriya (Odia)

  73. Pashto

  74. Persian

  75. Piedmontese

  76. Polish

  77. Portuguese

  78. Rhaeto-Romanic

  79. Romanian

  80. Russian

  81. Sardinian

  82. Scottish Gaelic

  83. Serbian

  84. Sindhi

  85. Sinhala

  86. Slovak

  87. Slovenian

  88. Somali

  89. Spanish

  90. Sundanese

  91. Swahili

  92. Swedish

  93. Tagalog

  94. Tamil

  95. Telugu

  96. Thai

  97. Tatar

  98. Turkish

  99. Turkmen

  100. Ukrainian

  101. Urdu

  102. Uzbek

  103. Vietnamese

  104. Welsh

  105. Walloon

  106. Xhosa

  107. Yiddish

  108. Yoruba

  109. Cantonese

  110. Chinese Simplified

  111. Chinese Traditional

  112. Zulu

Note: This is not an exhaustive list. It will be updated as we improve the quality and build confidence in each language. If you'd like us to support additional languages and are willing to assist with quality evaluation, please let us know!


Using AI LQA

AI LQA tasks still use up your AI word quota, even though they don’t create any new translations. Every word that gets checked is counted and taken from your total. If the team linked to the project runs out of AI words, you won’t be able to start an AI LQA task. Check out the Team quotas article to learn more.

Prerequisites

To get started with AI LQA, you have to enable Reviewing for your project.

First, proceed to the project and click More > Settings:

Then, find the Quality assurance section and tick the Reviewing option:

Don't forget to save the changes.

That's it! Now you can start using AI LQA.

Creating a new LQA task

It's important to remember that a translation key cannot be assigned to a new task if it is already part of another ongoing task. To include such a key in a new task, you have two options:

  1. Wait for the existing task to be completed: Once the task is finished, the key will be available for reassignment in a new task.

  2. Remove the key from the current task: If you need to reassign the key immediately, go to the project editor, select the key, and use the Remove from task option available in the bulk actions menu. This will free up the key, allowing it to be included in a different task. However, if the key is already marked as completed in the task, it won't be possible to remove it from that task.
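
The two options above amount to a small decision rule for each key. As a minimal sketch in Python, assuming a hypothetical `KeyState` record (the class, its field names, and the returned labels are illustrative and not part of Lokalise), the rule might be expressed like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyState:
    """Hypothetical snapshot of one translation key (illustrative only)."""
    name: str
    ongoing_task: Optional[str] = None  # ID of the ongoing task holding the key, if any
    completed_in_task: bool = False     # True if the key is marked completed in that task


def next_step(key: KeyState) -> str:
    """Return what has to happen before the key can join a new task."""
    if key.ongoing_task is None:
        # The key is not part of any ongoing task, so it can be assigned right away.
        return "available"
    if key.completed_in_task:
        # Completed keys cannot be removed from their task.
        return "wait for task to finish"
    # Otherwise either option works: remove the key from the task, or wait.
    return "remove from task or wait"
```

A key with no ongoing task is reported as available, while a key that is already completed inside an ongoing task can only be reused once that task finishes.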

To get started with AI LQA, open any project in Lokalise and create a new task. There are two ways to create a task.

Via the editor

Start by selecting multiple keys in the editor. You can do this by ticking the checkboxes next to the keys you want to include in the task.

Once selected, choose Create task... from the bulk actions menu. This will take you directly to the task creation page, where the task scope is automatically set to the keys you’ve selected.

Via the Tasks page

Alternatively, you can create a task from the Tasks page in your project. Simply navigate to your project and go to the Tasks page:

Click Create a task. This will open the task creation wizard, where you can define the task's details and scope.

General task information

In the task creation wizard, you will see a new task type called AI LQA.

Select the AI LQA task, provide a task name, and add a description (you can include additional context here for the AI).

Once you've filled in the necessary details, proceed to the next step.

Adjusting advanced task options

The Advanced options for AI LQA tasks are limited to the following:

  • Tag keys after the task is closed — tag the translation keys included in the current task once it's completed. This helps you easily identify these keys later.

Some options are hidden and automatically enabled by default:

  • Lock translations (non-modifiable) — all translations added to the task will be locked until the AI assistant completes the evaluation for each respective language.

  • Auto-close languages (non-modifiable) — once the AI assistant completes the evaluation for a language, that language will be automatically closed, and the task creator will receive an email notification.

  • Auto-close task (non-modifiable) — the task will automatically close once all the added languages have been completed by the AI assistant, and the task creator will receive an email notification.

Adjusting task scope

Select the scope and languages:

  • Task scope — adjust the filter to select the specific keys that should be included in the task. This allows you to focus the quality evaluation on the most relevant content.

  • Source language — choose the language that will be used as a reference for performing the quality evaluation. This is the language against which the translations will be assessed.

  • Target languages — select one or more languages that you want the AI assistant to evaluate. These are the languages where the quality check will be performed.

  • Task assignees — you won’t be able to modify the assignees, as all languages will be automatically assigned to the AI assistant for evaluation.

Task summary

On the right side of the task creation wizard, you'll find the task summary and your AI words balance:

  • AI words quota: This section shows the AI words quota for your team and the number of AI words that will be consumed by this specific task. Note that AI LQA tasks use the same AI word allowance as other AI-related tasks within Lokalise, so your allowance is shared across all AI functionalities. For more details, refer to the Team quotas article.

  • Estimated delivery time: The summary also provides an estimated delivery time for the task. Please note that this is an approximate figure and may fluctuate depending on the current system load.

If the AI quota is exhausted while the AI LQA task is in progress, the task will be locked until the quota is replenished.


Downloading a report

Once the AI LQA task is completed, the Download report button will become active. Click on it to download the report for the task:

The report will be downloaded in .xlsx format.

Each language evaluated will have its own separate sheet in the report.
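
Because each language has its own sheet, listing the sheet names is a quick way to confirm which languages a downloaded report covers. The sketch below uses only the Python standard library and relies on the general .xlsx layout (a ZIP archive whose `xl/workbook.xml` part lists the sheets). The function name is a placeholder, and this article does not specify how the sheets in the report are named.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace of the main SpreadsheetML part inside an .xlsx workbook.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def list_sheet_names(xlsx_file):
    """Return the sheet names of an .xlsx file in workbook order.

    `xlsx_file` can be a path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    # Each <sheet> element under <sheets> carries its display name.
    return [sheet.get("name") for sheet in root.findall("m:sheets/m:sheet", NS)]
```

Calling `list_sheet_names` with the path of the downloaded report returns one name per sheet in the workbook.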

Report breakdown

At the top of each sheet, you will see a quality metric scorecard summarizing all detected errors. To understand the categories and severity levels used, refer to the Translation quality evaluation framework section.

The scorecard also reports the ETPT (Error Type Penalty Total), which is calculated by multiplying the error count by the severity multiplier.

Below the scorecard, you'll find various calculations and key metrics:

  • Evaluation word count: The number of words that were evaluated for this language.

  • Reference word count: A hypothetical number of words (default is 1000) used to make different scorecards easier to compare. It shows what the penalty score would be for a scope of that many words.

  • Scaling parameter: A multiplier that adjusts the overall penalty total based on the importance of the content. For example, you might give more weight to high-visibility strings on your landing page than to backend strings that aren't customer-facing. The default value is 1.

  • Max score value: The highest possible quality score for a language, usually set at 100.

  • Threshold value: The quality score that determines whether the translation quality for this language is considered a pass or a fail. The default value is 85.

  • Per-word penalty total: Calculated by dividing the absolute penalty total (the sum of the ETPT values) by the evaluation word count.

  • Overall normed penalty total: The per-word penalty total scaled to the reference word count (default 1000 words).

  • Overall quality score: The primary measure of translation quality, calculated by multiplying the per-word penalty total by the maximum score value (usually 100) and subtracting the result from 100, which gives a percentage.

  • Pass/fail rating: Indicates whether the quality score has passed the threshold.
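
The arithmetic behind these metrics can be sketched in a few lines of Python. This is an illustration of the descriptions above, not Lokalise's implementation: the function and field names are made up, applying the scaling parameter to the normed penalty total follows the general MQM scoring model and is an assumption here, and so is treating a score equal to the threshold as a pass.

```python
def score_language(absolute_penalty_total, evaluation_word_count,
                   reference_word_count=1000, scaling_parameter=1.0,
                   max_score=100.0, threshold=85.0):
    """Compute scorecard metrics for one language from its penalty total."""
    if evaluation_word_count <= 0:
        raise ValueError("evaluation_word_count must be positive")

    # Per-word penalty total: absolute penalty total / evaluation word count.
    per_word_penalty = absolute_penalty_total / evaluation_word_count

    # Overall normed penalty total: per-word penalty scaled to the reference
    # word count. Applying the scaling parameter here is an assumption.
    normed_penalty = per_word_penalty * reference_word_count * scaling_parameter

    # Overall quality score: with scaling_parameter = 1 and max_score = 100
    # this equals 100 - per_word_penalty * 100.
    quality_score = max_score * (1 - normed_penalty / reference_word_count)

    return {
        "per_word_penalty_total": per_word_penalty,
        "overall_normed_penalty_total": normed_penalty,
        "overall_quality_score": quality_score,
        # Assumption: a score equal to the threshold counts as a pass.
        "passed": quality_score >= threshold,
    }
```

With an absolute penalty total of 12 over 400 evaluated words, the per-word penalty total is 0.03, the normed penalty total is 30 per 1000 words, and the quality score is about 97, which passes the default threshold of 85. A penalty total of 80 over the same 400 words gives a score of about 80 and fails.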

Below these metrics, a detailed report provides a granular breakdown of the issues the AI found. Each error appears in a separate row with the following information:

  • Suggested correction — the AI provides a corrected translation to fix the identified issue. In the future, these corrections may be available directly as suggestions in the Lokalise UI.

  • Comment — a comment from AI explaining why the issue was flagged and what specifically is wrong with the translation.


Fixing issues using AI Suggestions

After the AI LQA task is completed, you’ll be able to view potential corrections in the AI Suggestions side panel. This panel appears when you're editing a translation:

In the example above, you can see one suggested correction for an issue detected during the AI LQA task.


Translation quality evaluation framework

AI LQA uses the DQF-MQM framework to perform linguistic quality assurance (LQA). This framework applies to both human and machine translation and is designed to standardize error categorization, providing structured data to minimize subjectivity in translation quality assessments.

The results help identify underperforming languages, conduct root cause analysis, and improve localization processes to achieve higher-quality translations.

The framework includes predefined categories and severity levels, with different multipliers based on how critical the error is.

At this time, it’s not possible to modify the existing error categories or adjust their weights.

Error categories

| Category name | Description |
| --- | --- |
| Accuracy | Issues related to the correctness of the translation, including mistranslations, omissions, or additions. |
| Fluency | Issues affecting the naturalness and readability of the translation, such as grammar, syntax, punctuation, or spelling errors. |
| Terminology | Issues involving incorrect or inconsistent use of domain-specific terms. |
| Locale Convention | Issues with adherence to locale-specific conventions, such as date formats, number formats, or currency symbols. |
| Style | Issues related to following a specific style guide or maintaining the correct tone and voice in the translation. |
| Consistency | Issues with maintaining uniformity, such as using different terms for the same concept or inconsistencies in formatting. |
| Coherence | Issues that disrupt the logical flow or organization of the translation, like unclear references or improper sentence structure. |
| Design | Issues related to the visual presentation, including layout, formatting, or font problems. |
| Markup | Issues involving incorrect or missing markup elements, such as tags. |
| Internationalization | Issues affecting the adaptation of the content for a specific audience, including cultural relevance or region-specific examples. |
| Verity | Issues concerning the truthfulness or factual accuracy of the content, like outdated or incorrect information. |

Severity levels

| Severity name | Multiplier | Description |
| --- | --- | --- |
| Neutral | 0 | Issues that have minimal impact on the overall quality and are considered inconsequential or insignificant. |
| Minor | 1 | Small issues that slightly affect translation quality but do not significantly hinder understanding or usability. |
| Major | 5 | Significant issues that affect quality and comprehension, potentially causing confusion or misunderstandings for the target audience. |
| Critical | 25 | Severe issues that compromise the accuracy, clarity, or usability of the content, making the translation unusable or misleading for its intended purpose. |
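
To see how these multipliers feed into the scorecard, here is a small Python sketch that turns a list of detected errors into a penalty total per category (error count times severity multiplier, summed over severities). It assumes the ETPT is tracked per error category, and the `(category, severity)` tuple format is a hypothetical input shape chosen for illustration, not the layout of the downloaded report.

```python
# Severity multipliers from the table above.
SEVERITY_MULTIPLIERS = {"neutral": 0, "minor": 1, "major": 5, "critical": 25}


def error_type_penalty_totals(errors):
    """Sum the severity multipliers of all errors, grouped by category.

    `errors` is an iterable of (category, severity) pairs.
    """
    totals = {}
    for category, severity in errors:
        # Adding the multiplier once per error equals count * multiplier.
        totals[category] = totals.get(category, 0) + SEVERITY_MULTIPLIERS[severity.lower()]
    return totals


def absolute_penalty_total(errors):
    """Add up the penalty totals of all categories."""
    return sum(error_type_penalty_totals(errors).values())
```

For example, one Major accuracy error, two Minor fluency errors, and one Neutral style error give penalty totals of 5, 2, and 0 respectively, for an absolute penalty total of 7. A single Critical error on its own contributes 25.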


Translation scoring vs. AI LQA

Both translation scoring and AI LQA use the MQM framework to evaluate translation quality, but they serve different purposes:

  • Translation scoring

    • Built directly into the editor with real-time scores for each string

    • Integrated into workflows, enabling automation (e.g., routing low scores to human review)

    • Provides a faster, more user-friendly experience for day-to-day translation work

  • AI LQA

    • Generates quality reports in batch format

    • Useful for large-scale assessments and audits

    • Offers less interactivity compared to scoring

Translation scoring is the natural evolution of AI LQA, bringing the same evaluation logic into a more scalable, real-time, and workflow-integrated experience. Over time, it may fully replace AI LQA for most use cases.


Frequently asked questions

💡 Looking for more?
Find all answers in AI: Frequently asked questions!
