Quality Checks

Pootle can check for common translation mistakes. Once you submit a translation, it compares certain features of the translation with the original string and identifies potential problems. Checks are displayed in the upper right corner, just above the submit button. While any checks are shown in red, you will stay on the same unit until each check is resolved or muted. Note that you should only mute checks if you know what you're doing. Muted checks will be reviewed periodically by the managers.

If you are not sure what a particular error means or how to fix it properly, use the "Report a problem with this string" link at the top of the unit. The managers will try to assist you as soon as possible.

Below is the description of each quality check.

Accelerators

Checks whether accelerators are consistent between the two strings.

This test can check the different types of accelerators used in different projects, such as Mozilla or KDE. It picks up accelerators that are missing and ones that shouldn't be there.

See accelerators in the localization guide for a full description of accelerators.
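
As a rough illustration only (this is not Pootle's actual code), the comparison can be sketched in Python as below. The function names are hypothetical, and the "&" marker is an assumption; other projects use other marker characters.

```python
import re

def accelerator_keys(text, marker="&"):
    """Return the characters that directly follow the accelerator marker."""
    return re.findall(re.escape(marker) + r"(\w)", text)

def accelerators_consistent(source, target, marker="&"):
    """True when both strings contain the same number of accelerators."""
    return len(accelerator_keys(source, marker)) == len(accelerator_keys(target, marker))
```

With this sketch, "&File" and "&Archivo" are consistent, while "&File" and "Archivo" are not. Entities such as "&amp;" would be miscounted, so a real check needs more care.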

Acronyms

Checks that acronyms that appear are unchanged.

If an acronym appears in the original, this test checks that it also appears in the translation. Translating acronyms is a language decision, but many languages leave them unchanged. In that case, this test is useful for tracking down translations of the acronym and correcting them.

Blank

Checks whether a translation is totally blank.

This checks whether a string has inadvertently been translated as blank, i.e. as spaces only. This is different from an untranslated string, which is completely empty. The test is useful because a string translated as " " will appear to most tools as if it is translated.
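
The difference between blank and untranslated can be shown with a minimal Python sketch. The function name is hypothetical and this is not the toolkit's own implementation.

```python
def is_blank_translation(source, target):
    """True when the source has visible text but the target is whitespace only."""
    return bool(source.strip()) and len(target) > 0 and not target.strip()
```

Here a target of "  " is reported as blank, while an empty target (untranslated) is not.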

Brackets

Checks that the number of brackets in both strings match.

If ([{ or }]) appear in the original this will check that the same number appear in the translation.
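
A minimal Python sketch of this comparison follows. It assumes that each bracket character is counted separately, which the description above does not spell out; the names are hypothetical.

```python
from collections import Counter

BRACKETS = "([{}])"

def bracket_counts(text):
    """Count each bracket character that occurs in the text."""
    return Counter(ch for ch in text if ch in BRACKETS)

def brackets_match(source, target):
    """True when every bracket character occurs equally often in both strings."""
    return bracket_counts(source) == bracket_counts(target)
```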

Romanian: Avoid cedilla diacritics

Checks whether the translation contains an illegal cedilla character.

Cedillas are obsolete diacritics for Romanian:

  • U+0162 Latin capital letter T with cedilla
  • U+0163 Latin small letter T with cedilla
  • U+015E Latin capital letter S with cedilla
  • U+015F Latin small letter S with cedilla

Cedilla-letters are only valid for Turkish (S-cedilla) and Gagauz (S-cedilla and T-cedilla). Fun fact: Gagauz is the only known language to use T-cedilla.

param str1: the source string
param str2: the target (translated) string
return: True if str2 contains a cedilla character
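
Based on the code points and the parameter description above, the test can be sketched in Python like this. It is a minimal illustration and the function name is hypothetical.

```python
# The four code points listed above: T and S with cedilla, upper and lower case.
CEDILLA_CHARS = "\u0162\u0163\u015E\u015F"

def contains_cedilla(str1, str2):
    """Return True if the target string str2 contains a cedilla character."""
    return any(ch in CEDILLA_CHARS for ch in str2)
```

A translation written with the comma-below letters (for example U+0218) passes, while one using U+015E is flagged.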

Compendium conflict

Checks for Gettext compendium conflicts (#-#-#-#-#).

When you use msgcat to create a PO compendium it will insert #-#-#-#-# into entries that are not consistent. If the compendium is used later in a message merge then these conflicts will appear in your translations. This test quickly extracts those for correction.

Translator credits

Checks for messages containing translation credits instead of normal translations.

Some projects have consistent ways of giving credit to translators by having a unit or two where translators can fill in their name and possibly their contact details. This test allows you to find these units easily to check that they are completed correctly, and it also disables other tests that might incorrectly get triggered for these units (such as URLs, e-mails, etc.).

Double quotes

Checks whether double quoting is consistent between the two strings.

Checks double quotes (") to ensure that you have the same number in both the original and the translated string. This test takes into account that several languages use different quoting characters, and will test for them instead.

Double spaces

Checks for bad double-spaces by comparing to original.

This identifies cases where you have [space][space] in your translation but not in the original, or where it appears in the original but not in your translation. Some of these are spurious, and how you correct them depends on the conventions of your language.

Repeated word

Checks for repeated words in the translation.

Words that have been repeated in a translation, e.g. "the the" or "a a", will be highlighted by this test. These are generally typos that need correcting. Some languages may have valid repeated words in their structure; in that case, either ignore those instances or switch this test off.
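
A simplified Python sketch of such a test is shown below. It compares neighbouring words case-insensitively and ignores punctuation between them, which is an assumption rather than the documented behaviour; the function name is hypothetical.

```python
import re

def repeated_words(target):
    """Return every word that is immediately followed by the same word."""
    words = re.findall(r"\w+", target.lower())
    return [first for first, second in zip(words, words[1:]) if first == second]
```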

E-mail

Checks that emails are not translated.

Generally you should not be translating email addresses. This check will look to see that email addresses e.g. info@example.com are not translated. In some cases of course you should translate the address but generally you shouldn't.
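
One possible way to express this check in Python is sketched here. The regular expression is a deliberately simple assumption and does not cover every valid address; the names are hypothetical.

```python
import re

# Simplified address pattern: local part, "@", then a dotted domain name.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def emails_preserved(source, target):
    """True when every address found in the source also appears in the target."""
    return all(address in target for address in EMAIL_RE.findall(source))
```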

Ending punctuation

Checks whether punctuation at the end of the strings matches.

This will ensure that the ending of your translation has the same punctuation as the original. E.g. if it ends in :[space] then so should yours. It is useful for ensuring that you have ellipses [...] in all your translations, not simply three separate full-stops. You may pick up some errors in the original: feel free to keep your translation and notify the programmers. In some languages, characters such as ? or ! are always preceded by a space e.g. [space]? — do what your language customs dictate. Other false positives you will notice are, for example, if through changes in word-order you add "), etc. at the end of the sentence. Do not change these: your language word-order takes precedence.

Note that if you are tempted to leave out a [full-stop] or [colon], or to add a [full-stop] to a sentence, the original punctuation has often been chosen for a reason, e.g. a list where full stops would make it look cluttered. So, initially match the English, and make changes once the program is being used.

This check is aware of several language conventions for punctuation characters, such as the custom question marks for Greek and Arabic, Devanagari Danda, full-width punctuation for CJK languages, etc. Support for your language can be added easily if it is not there yet.
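
The basic idea can be sketched in Python as follows. This simplified version compares the trailing run of punctuation and whitespace literally and knows nothing about the language-specific conventions mentioned above; the function names are hypothetical.

```python
import unicodedata

def trailing_punctuation(text):
    """Return the run of punctuation and whitespace at the end of the text."""
    end = len(text)
    while end > 0 and (
        text[end - 1].isspace()
        or unicodedata.category(text[end - 1]).startswith("P")
    ):
        end -= 1
    return text[end:]

def endpunc_match(source, target):
    """True when both strings end with the same punctuation."""
    return trailing_punctuation(source) == trailing_punctuation(target)
```

Because the ellipsis character and three separate full stops are different strings, this sketch would also report a mismatch between them.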

Ending whitespace

Checks whether whitespace at the end of the strings matches.

Operates the same as endpunc but is only concerned with whitespace. This filter is particularly useful for those strings which will evidently be followed by another string in the program, e.g. [Password: ] or [Enter your username: ]. The whitespace is an inherent part of the string. This filter makes sure you don't miss those important but otherwise invisible spaces!

If your language uses full-width punctuation (like Chinese), the visual spacing in the character might be enough without an added extra space.

Escapes

Checks whether escaping is consistent between the two strings.

Checks escapes such as \n \u0000 to ensure that if they exist in the original string you also have them in the translation.
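
A minimal Python sketch of the idea is given below. It assumes the escapes are stored as literal backslash sequences in the strings (as they are in PO files) and uses a simplified pattern; the names are hypothetical.

```python
import re

# A backslash followed by "u" plus four hex digits, or by any single character.
ESCAPE_RE = re.compile(r"\\(?:u[0-9a-fA-F]{4}|.)")

def escapes(text):
    """Return the sorted list of escape sequences found in the text."""
    return sorted(ESCAPE_RE.findall(text))

def escapes_match(source, target):
    """True when both strings contain the same escape sequences."""
    return escapes(source) == escapes(target)
```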

File paths

Checks that file paths have not been translated.

Checks that paths such as /home/user1 have not been translated. Generally you do not translate a file path, unless it is being used as an example, e.g. your_user_name/path/to/filename.conf.

Functions

Checks that function names are not translated.

Checks that function names e.g. rgb() or getEntity.Name() are not translated.

Old KDE comment

Checks to ensure that no KDE style comments appear in the translation.

KDE style translator comments appear in PO files as "_: comment\n". New translators often translate the comment. This test tries to identify instances where the comment has been translated.

Long

Checks whether a translation is much longer than the original string.

This is most useful in the special case where the translation is multiple characters long while the source text is only 1 character long. Otherwise, we use a general ratio that will catch very big differences but is set conservatively to limit the number of false positives.

Must translate words

Checks that words configured as definitely translatable don't appear in the translation.

If for instance in your language you decide that you must translate 'OK' then this test will flag any occurrences of 'OK' in the translation if it appeared in the source string. You must specify a file containing all of the must translate words using --musttranslatefile.

Newlines

Checks whether newlines are consistent between the two strings.

Counts the number of \n newlines (and variants such as \r\n) and reports an error if they differ.
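
The counting can be illustrated with a short Python sketch. It assumes the strings contain real newline characters and treats \r\n and a lone \r as one newline each; the function names are hypothetical.

```python
def count_newlines(text):
    """Count line breaks, treating \\r\\n and \\r the same as \\n."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.count("\n")

def newlines_match(source, target):
    """True when both strings contain the same number of line breaks."""
    return count_newlines(source) == count_newlines(target)
```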

Romanian: Use "niciun"/"nicio"

Checks for sequences containing 'nici un'/'nici o', which are obsolete Romanian syntax. The correct forms are 'niciun'/'nicio'.
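
A search of this kind could look like the following Python sketch. The pattern and the function name are assumptions made for illustration, not the toolkit's own code.

```python
import re

# "nici" followed by whitespace and the separate word "un" or "o".
NICI_RE = re.compile(r"\bnici\s+(?:un|o)\b", re.IGNORECASE)

def has_obsolete_nici(target):
    """True when the translation contains 'nici un' or 'nici o'."""
    return NICI_RE.search(target) is not None
```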

Don't translate words

Checks that words configured as untranslatable appear in the translation too.

Many brand names should not be translated. This test allows you to easily make sure that words like Word, Excel, Impress, Calc, etc. are not translated. You must specify a file containing all of the no-translate words using --notranslatefile.
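
The logic can be sketched in Python as below. The word list is passed in directly instead of being read from the file, and the function name is hypothetical.

```python
import re

def untranslatable_words_missing(source, target, notranslate_words):
    """Return the configured words that are in the source but not in the target."""
    source_words = set(re.findall(r"\w+", source))
    target_words = set(re.findall(r"\w+", target))
    return [
        word for word in notranslate_words
        if word in source_words and word not in target_words
    ]
```

An empty result means that every protected word from the source survived in the translation.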

Number of plurals

Checks for the correct number of noun forms for plural translations.

This uses the plural information in the language module of the Translate Toolkit. This is the same as the Gettext nplural value. It will check that the number of plurals required is the same as the number supplied in your translation.

Numbers

Checks whether numbers of various forms are consistent between the two strings.

You will see some errors where you have either written the number in full or converted it to the digit in your translation. Also changes in order will trigger this error.
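
A simplified Python sketch shows why both cases are reported. It extracts the digit sequences in order and compares the two lists; the pattern and the names are assumptions.

```python
import re

# Digits, optionally grouped with "." or "," (e.g. 10, 3.5, 1,000).
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")

def numbers_match(source, target):
    """True when both strings contain the same numbers in the same order."""
    return NUMBER_RE.findall(source) == NUMBER_RE.findall(target)
```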

Options

Checks that command line options are not translated.

In messages that contain command line options, such as --help, this test will check that these remain untranslated. These could be translated in the future if programs can create a mechanism to allow this, but currently they are not translated. If the option has a parameter, e.g. --file=FILE, then the test will check that the parameter has been translated.
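
The option-name part of this test can be sketched in Python as follows. The sketch does not look at parameters such as FILE, and the pattern and names are assumptions.

```python
import re

# A long option: "--" followed by a letter, then letters, digits, "_" or "-".
OPTION_RE = re.compile(r"(?<![\w-])--[A-Za-z][\w-]*")

def options_untranslated(source, target):
    """True when every option named in the source also appears in the target."""
    target_options = set(OPTION_RE.findall(target))
    return all(option in target_options for option in OPTION_RE.findall(source))
```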

printf()

Checks whether printf format strings match.

If the printf formatting variables are not identical, then this will indicate an error. Printf statements are used by programs to format output in a human readable form (they are placeholders for variable data). They allow you to specify lengths of string variables, string padding, number padding, precision, etc. Generally they will look like this: %d, %5.2f, %100s, etc. The test can also manage variable reordering using the %1$s syntax. The type and formatting details of each variable are tested to ensure that they are strictly identical, but the variables may be reordered.

See also printf Format String.
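
The comparison, including reordering with the %1$s syntax, can be illustrated with a Python sketch. The pattern covers only common conversions and does not handle a literal %%; it is an assumption for illustration, and the names are hypothetical.

```python
import re

# Optional "N$" position, then flags, width, precision, length and conversion.
PRINTF_RE = re.compile(
    r"%(?:(\d+)\$)?([-+0#]*\d*(?:\.\d+)?(?:hh|h|ll|l|L|z|j|t)?[diouxXeEfFgGcs])"
)

def printf_specs(text):
    """Map each argument position (1-based) to its format specification."""
    specs = {}
    for index, match in enumerate(PRINTF_RE.finditer(text), start=1):
        position = int(match.group(1)) if match.group(1) else index
        specs[position] = match.group(2)
    return specs

def printf_match(source, target):
    """True when both strings expect identical arguments."""
    return printf_specs(source) == printf_specs(target)
```

For example, "%d files in %s" and "%2$s contains %1$d files" match, because each argument position keeps its type.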

Punctuation spacing

Checks for bad spacing after punctuation.

In the case of [full-stop][space] in the original, this test checks that your translation does not remove the space. It checks also for [comma], [colon], etc.

Some languages don't use spaces after common punctuation marks, especially where full-width punctuation marks are used. This check will take that into account.

Pure punctuation

Checks that strings that are purely punctuation are not changed.

This extracts strings like + or - as these usually should not be changed.

Python brace placeholders

Checks whether Python brace format strings match.
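
Brace placeholders are the {name} fields used by Python's str.format(). A minimal sketch of the comparison, using the standard library's string.Formatter to extract the field names, is shown below; the function names are hypothetical and this is not the toolkit's own implementation.

```python
from string import Formatter

def brace_fields(text):
    """Return the sorted placeholder names found in a brace format string."""
    return sorted(
        field for _, field, _, _ in Formatter().parse(text) if field is not None
    )

def braces_match(source, target):
    """True when both strings use the same placeholders."""
    return brace_fields(source) == brace_fields(target)
```

Note that Formatter().parse() raises ValueError for unbalanced braces, so a real check would need to handle malformed translations.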

Number of sentences

Checks that the number of sentences in both strings matches.

Counts the sentences in each string to check that the sentence count is the same in the original and the translation. You may not always want to use this test if you find you often need to restructure your translation, either because the original is badly expressed or because the structure of your language works better that way. Do what works best for your language: it's the meaning of the original you want to convey, not the exact way it was written in the English.

Short

Checks whether a translation is much shorter than the original string.

This is most useful in the special case where the translation is 1 character long while the source text is multiple characters long. Otherwise, we use a general ratio that will catch very big differences but is set conservatively to limit the number of false positives.

Simple capitalization

Checks that the capitalisation of the two strings isn't wildly different.

This will pick up many false positives, so don't be a slave to it. It is useful for identifying translations that don't start with a capital letter (upper-case letter) when they should, or those that do when they shouldn't. It will also highlight sentences that have extra capitals; depending on the capitalisation convention of your language, you might want to change these to Title Case, or change them all to normal sentence case.

Simple plural(s)

Checks for English style plural(s) for you to review.

This test will extract any message that contains words with a final "(s)" in the source text. You can then inspect the message, to check that the correct plural form has been used for your language. In some languages, plurals are made by adding text at the beginning of words, making the English style messy. In this case, they often revert to the plural form. This test allows an editor to check that the plurals used are correct. Be aware that this test may create a number of false positives.

For languages with no plural forms (only one noun form) this test will simply test that nothing like "(s)" was used in the translation.

Single quotes

Checks whether single quoting is consistent between the two strings.

The same as the double quotes check, but for the ' character. Because this character is used in contractions like it's and in possessive forms like user's, this test can output spurious errors if your language doesn't use such forms. If a quote appears at the end of a sentence in the translation, i.e. '., this might not be detected properly by the check.

Starting capitalization

Checks that the message starts with the correct capitalisation.

After stripping whitespace and common punctuation characters, it then checks to see that the first remaining character is correctly capitalised. So, if the sentence starts with an upper-case letter, and the translation does not, an error is produced.

This check is entirely disabled for many languages that don't make a distinction between upper and lower case. Contact us if this is not yet disabled for your language.
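
The steps described above can be sketched in Python as follows. Treating every Unicode punctuation character as "common punctuation" is an assumption, and the function names are hypothetical.

```python
import unicodedata

def first_significant_char(text):
    """Return the first character that is neither whitespace nor punctuation."""
    for ch in text:
        if ch.isspace() or unicodedata.category(ch).startswith("P"):
            continue
        return ch
    return ""

def startcaps_match(source, target):
    """True when both strings start with the same kind of capitalisation."""
    first_source = first_significant_char(source)
    first_target = first_significant_char(target)
    if not first_source or not first_target:
        return True  # nothing to compare
    return first_source.isupper() == first_target.isupper()
```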

Starting punctuation

Checks whether punctuation at the beginning of the strings matches.

Operates as endpunc but you will probably see fewer errors.

Starting whitespace

Checks whether whitespace at the beginning of the strings matches.

As in endwhitespace but you will see fewer errors.

Tabs

Checks whether tabs are consistent between the two strings.

Counts the number of \t tab markers and reports an error if they differ.

Unchanged

Checks whether a translation is basically identical to the original string.

This checks that the translation isn't just a copy of the English original. Sometimes, this is what you want, but other times you will detect words that should have been translated.

URLs

Checks that URLs are not translated.

This checks only basic URLs (http, ftp, mailto etc.) not all URIs (e.g. afp, smb, file). Generally, you don't want to translate URLs, unless they are example URLs (http://your_server.com/filename.html). If the URL is for configuration information, then you need to query the developers about placing configuration information in PO files. It shouldn't really be there, unless it is very clearly marked: such information should go into a configuration file.
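
A minimal Python sketch of a basic URL comparison is given below. The pattern recognises only http, https, ftp and mailto and is an assumption for illustration; the names are hypothetical.

```python
import re

# Basic schemes only, followed by everything up to whitespace or a quote.
URL_RE = re.compile(r"(?:(?:https?|ftp)://|mailto:)[^\s<>\"']+")

def urls_preserved(source, target):
    """True when every URL found in the source also appears in the target."""
    return all(url in target for url in URL_RE.findall(source))
```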

Valid characters

Checks that only characters specified as valid appear in the translation.

Often during character conversion to and from UTF-8 you get some strange characters appearing in your translation. This test presents a simple way to try and identify such errors.

This test will only run if you specify the --validcharsfile command line option. This file contains all the characters that are valid in your language. You must use UTF-8 encoding for the characters in the file.

If the test finds any characters not in your valid characters file then the test will print the character together with its Unicode value (e.g. 002B).
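
The reporting step can be sketched in Python like this. The valid characters are passed in as a string instead of being read from the file, and the function name is hypothetical.

```python
def invalid_characters(target, valid_chars):
    """Return (character, hex code point) pairs for characters not in valid_chars."""
    found = []
    for ch in target:
        if ch not in valid_chars and ch not in found:
            found.append(ch)
    return [(ch, format(ord(ch), "04X")) for ch in found]
```

For the translation "a+b" with valid characters "ab", the sketch reports the plus sign together with the value 002B.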

Placeholders

Checks whether variables of various forms are consistent between the two strings.

This checks to make sure that variables that appear in the original also appear in the translation. It can handle variables from projects like KDE or OpenOffice. It does not at the moment cope with variables that use the reordering syntax of Gettext PO files.

XML tags

Checks that XML/HTML tags have not been translated.

This check finds the number of tags in the source string and checks that the same number are in the translation. If the counts don't match then either the tag is missing or it was mistakenly translated by the translator, both of which are errors.

The check ignores tags, or things that look like tags, that cover the whole string, e.g. <Error>, but will produce false positives for things like "An <Error> occurred", as here Error should be translated. It will also allow translation of the alt attribute in e.g. <img src="bob.png" alt="Image description"> or similar translatable attributes in OpenOffice.org help files.
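
The counting rule and the whole-string exception can be sketched in Python as follows. The tag pattern is a simplification, attribute handling is left out, and the names are hypothetical.

```python
import re

# Something that looks like an opening or closing tag.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")

def tags_match(source, target):
    """True when both strings contain the same number of tags."""
    if TAG_RE.fullmatch(source.strip()):
        return True  # a "tag" covering the whole string is ignored
    return len(TAG_RE.findall(source)) == len(TAG_RE.findall(target))
```

As described above, "An <Error> occurred" would still be reported if the translator translated the word Error, because the sketch cannot tell a placeholder from a real tag.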