
Scoring translation quality

AI scoring in Lokalise shows translation quality and flags issues using the MQM standard.

Written by Ilya Krukowski

This feature is currently in beta.

The AI-powered translation scoring feature lets you quickly check the accuracy of target translations. Each string in the editor receives a score from 0 to 100 based on Multidimensional Quality Metrics (MQM).

  • High score (≥ 80): Translation can usually be auto-approved or lightly reviewed.

  • Low score (< 80): Human review is strongly recommended.

This helps avoid the common trade-off between manually reviewing everything (slow and expensive) and skipping reviews entirely (risky).

What does translation scoring offer?

With translation scoring, you can:

  • Focus human effort only where it matters, cutting review time and cost

  • Get instant, data-driven QA directly in the editor

  • Improve reviewer productivity with specific issue guidance

  • Confidently handle low-priority languages that were previously cost-prohibitive

Unlike generic scoring tools, Lokalise scoring is:

  • Integrated – built into the Lokalise editor and workflows

  • Context-aware – tailored to your project’s content and quality needs

  • Real-time – scores appear instantly per string

  • Cost-efficient – reduces unnecessary reviews and budget spend

Can we really trust an AI score to determine translation quality?

The goal of translation scoring isn’t to replace human reviewers; it’s to help them focus on the content that truly needs attention. Instead of reviewing every string manually (slow and expensive), teams can:

  • Route only low-quality translations (< 80) for human review

  • Skip or lightly edit high-scoring translations

  • Spend review time more strategically, without sacrificing quality

In practice, teams can skip or lightly edit high-scoring translations and only review lower ones, reducing post-editing effort by up to 80% while maintaining quality.

Why do you use MQM to evaluate translation quality?

We use MQM (Multidimensional Quality Metrics) because it’s the most widely recognized standard for evaluating translation quality with AI.

Unlike automatic metrics such as BLEU, COMET, or METEOR that compare output against a single "reference" translation, MQM evaluates the translation itself. It highlights errors, explains their severity, and produces interpretable scores that reflect real-world usability, not just a number.


How to view scores for translations

To see translation scores, open your Lokalise project editor and look for a target translation value. You’ll notice a small lens icon next to it:

This icon only appears for target translations, not source values.

Click the lens to trigger scoring. AI will check the translation quality and show a score. Click the score to get more details:

If the translation has issues, points will be deducted, and you’ll see a list of detected problems:

View scoring issues

Scoring does not consume any extra AI words from your team’s quota.


Scoring and AI tasks: Generating scores in bulk

You can generate scores for many translations at once. To do this, select one or more keys in the editor, then choose Create a task from the bulk actions menu.

Bulk actions menu

In the task settings, select the Automatic translation type (AI-powered task). This task will automatically score all added keys. Scoring doesn’t use your AI words quota. AI words are only counted for the translation itself, not for scoring.

Scoring won’t apply to 100% matches from the translation memory: if a translation comes straight from your translation memory, it won’t be scored. Lokalise only scores translations generated by AI within tasks.


Using scoring in workflows

You can also set up scoring as part of your workflows. To do this, go to the Workflows page and create a new workflow that includes the Review task with AI scoring step. This step comes after the AI translation step, which in turn may be preceded by the TM step.

When setting up the AI scoring step, you’ll see these options:

View AI scoring step options

You can set a quality score threshold; by default, it’s 80. If a translation’s score falls below this number, the translation is automatically added to a review task created during this step. Any translations that could not be scored are also added to the task. Keep in mind that scoring ignores 100% matches from the translation memory; Lokalise only scores AI-generated translations within tasks.
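
To make the routing rule concrete, here is a minimal illustrative sketch in Python (not Lokalise’s actual implementation) of how a translation ends up in the review task, assuming hypothetical fields for the score and the translation memory match status:

```python
# Illustrative sketch only -- not Lokalise's internal code.
# `score` is the 0-100 AI score, or None if the string could not be scored;
# `is_exact_tm_match` is True for 100% translation memory matches (never scored).

DEFAULT_THRESHOLD = 80  # default quality score threshold in the workflow step

def needs_human_review(score, is_exact_tm_match, threshold=DEFAULT_THRESHOLD):
    """Return True if the translation should be added to the review task."""
    if is_exact_tm_match:
        return False          # 100% TM matches are skipped by scoring
    if score is None:
        return True           # anything that wasn't scored goes to review
    return score < threshold  # below the threshold -> human review

# Example: a translation scored 72 is routed to the review task.
print(needs_human_review(72, is_exact_tm_match=False))  # True
```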

You can also customize the review task: give it a name and description, set a due date, and assign team members to handle the translations flagged for review. For more details on review tasks, check the Tasks documentation.


How scoring works

Translation scoring is powered by the MQM (Multidimensional Quality Metrics) framework — the industry standard for evaluating translation quality. MQM breaks down issues into categories such as grammar, spelling, fluency, terminology, and meaning errors. Each issue type is assigned a penalty weight based on severity:

  • Critical (-75 points): Errors that change or break the meaning (e.g., missing words, wrong translation, character limit violations).

  • Major (-25 points): Significant issues with grammar, spelling, terminology, or readability.

  • Minor (-5 points): Small flaws such as unnatural phrasing or extra spaces.

The final score is calculated as:

100 – (total penalties)

Examples:

  • One major (-25) + one minor (-5) → Score: 70

  • Two criticals (-75 each) → Score: 0 (scores cannot go below 0)
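
As a rough sketch of the arithmetic above (issue detection itself is done by the AI and isn’t shown), the penalty weights and the clamping at 0 can be expressed like this:

```python
# Rough sketch of the MQM-style arithmetic described above; the penalty
# weights mirror the list in this article.

PENALTIES = {"critical": 75, "major": 25, "minor": 5}

def mqm_score(issue_severities):
    """Compute a 0-100 score from a list of detected issue severities."""
    total_penalty = sum(PENALTIES[s] for s in issue_severities)
    return max(0, 100 - total_penalty)  # the score cannot go below 0

print(mqm_score(["major", "minor"]))        # 100 - 25 - 5 = 70
print(mqm_score(["critical", "critical"]))  # 100 - 150, clamped to 0
```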

Score thresholds:

  • 100: Perfect translation, ready to publish.

  • 80–99: Good quality, minor improvements possible but safe to release.

  • Below 80: Review strongly recommended.

In workflows, thresholds can be configured to auto-route low scores to post-editing, while high scores can be published with confidence.


Translation scoring vs. AI LQA

Both translation scoring and AI LQA use the MQM framework to evaluate translation quality, but they serve different purposes:

  • Translation scoring

    • Built directly into the editor with real-time scores for each string

    • Integrated into workflows, enabling automation (e.g., routing low scores to human review)

    • Provides a faster, more user-friendly experience for day-to-day translation work

  • AI LQA

    • Generates quality reports in batch format

    • Useful for large-scale assessments and audits

    • Offers less interactivity compared to scoring

Translation scoring is the natural evolution of AI LQA, bringing the same evaluation logic into a more scalable, real-time, and workflow-integrated experience. Over time, it may fully take the place of AI LQA for most use cases.


Frequently asked questions

💡 Looking for more?
Find all answers in AI: Frequently asked questions!
