Want to know what Translation Memory is?

The explosion in content these days is driving increased demand for translation.  As globalization reduces the cost of entering new markets, demand grows for translation into more and more languages.  As the number of digital marketing and business channels grows, the types of content translators need to be able to work with multiply.

Computer-Aided Translation (CAT) software and Translation Memory are essential tools that allow a translation agency to manage this complexity.

Professional translators use computer-aided translation software to translate.  CAT software is an editing tool with a built-in database, called a ‘translation memory’, that stores content previously translated by human translators.  The translation memory sits behind the editor and stores each translation as it is made.  This makes translating faster and more efficient.

A translation memory is a repository of everything human translators have translated for a particular language, recorded in a database.  Each entry in the database pairs a source-language sentence with its corresponding target-language translation.  The translation memory is built up over time as more and more content is translated.
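As a rough illustration of the entry format described above, here is a minimal Python sketch of a translation memory as a store of source/target sentence pairs.  The class name `TranslationMemory`, its methods and the sample sentences are assumptions made for this example; real CAT tools use their own database formats.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TranslationMemory:
    """A toy translation memory for one language pair (hypothetical structure)."""

    source_lang: str
    target_lang: str
    # Maps each source sentence to its stored human translation.
    entries: Dict[str, str] = field(default_factory=dict)

    def add(self, source: str, target: str) -> None:
        """Store (or overwrite) the translation for a source sentence."""
        self.entries[source] = target

    def lookup(self, source: str) -> Optional[str]:
        """Return the stored translation, or None if the sentence is new."""
        return self.entries.get(source)


tm = TranslationMemory(source_lang="en", target_lang="fr")
tm.add("Save your changes.", "Enregistrez vos modifications.")
tm.add("Open the file.", "Ouvrez le fichier.")

print(tm.lookup("Open the file."))   # the stored translation
print(tm.lookup("Close the file."))  # None: not translated yet
```

Every time a translator confirms a sentence, a call like `add` would grow the store, which is how the memory builds up over time.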

Any new content for translation is matched against this translation memory.  Any sentences in the new content that have already been translated are automatically translated and need not be translated again.  Sentences that differ only slightly from a previously translated sentence can also be retrieved, so they only need a quick edit.  If the content contains many duplicate sentences, you only need to translate each one the first time you encounter it; the rest will be translated automatically.

Content for translation can come in many different formats.  It can be in Microsoft Word, Excel or PowerPoint.  It can be a PDF.  It can be in Adobe InDesign or FrameMaker.  It could be XML content from a content management system or from an Android App.  It could be website HTML.

CAT tools convert different file types to a common file type that the translator works on.  This means a translator does not need to keep licenses for all the different authoring software.  The CAT software extracts the content for translation, so this is all the translator sees; it hides any non-translatable parts so the translator can ignore them.  The translation memory sits behind the CAT tool, continuously updating as the translator progresses through the content.
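To give a feel for what "extracting the content for translation" means, here is a small sketch for the website HTML case using Python's standard `html.parser` module.  The `TextExtractor` class and the sample page are made up for this example; it only illustrates the idea and is not how any particular CAT tool implements its file filters.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips markup, scripts and styles."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.segments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is outside <script>/<style>.
        if text and self._skip_depth == 0:
            self.segments.append(text)


html = (
    "<html><head><style>p { color: red; }</style></head>"
    "<body><h1>Welcome</h1><p>Save your changes.</p></body></html>"
)
extractor = TextExtractor()
extractor.feed(html)
extractor.close()
print(extractor.segments)  # ['Welcome', 'Save your changes.']
```

The translator would see only the two extracted segments; the tags and the style rule stay hidden and would be put back around the translated text on export.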

What is the difference between Translation Memory and Machine Translation?

A Translation Memory is a record of everything that was translated.  Machine Translation means automatically translating content using a Machine Translation engine like Google Translate.  Most CAT software now integrates with the major machine translation engines.  This gives the translator the option of using machine translation to create a first draft translation that they then edit.  If a translator is using machine translation, the usual sequence is to first leverage any previous translations and fuzzy matches from the translation memory, and only then to use machine translation for the non-leveraged parts.  The CAT software will indicate which sentences come from the translation memory and which have been done using machine translation.
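The sequence described above (translation memory first, machine translation only for what is left) can be sketched as follows.  This is a simplified illustration: `fake_machine_translate` is a hypothetical stand-in for a call to a real engine, the 0.75 cutoff is an arbitrary assumption, and Python's `difflib` is used here only as a convenient similarity measure, not because CAT tools necessarily use it.

```python
from difflib import get_close_matches

tm = {
    "Save your changes.": "Enregistrez vos modifications.",
    "Open the file.": "Ouvrez le fichier.",
}


def fake_machine_translate(sentence):
    """Hypothetical stand-in for a call to a real MT engine."""
    return f"[MT draft of: {sentence}]"


def pretranslate(sentence, fuzzy_cutoff=0.75):
    """Return (draft, origin), trying the TM before machine translation."""
    if sentence in tm:
        return tm[sentence], "TM 100% match"
    close = get_close_matches(sentence, list(tm), n=1, cutoff=fuzzy_cutoff)
    if close:
        # The stored translation of the closest sentence still needs editing.
        return tm[close[0]], "TM fuzzy match"
    return fake_machine_translate(sentence), "MT"


for s in ["Open the file.", "Open the files.", "Restart the printer."]:
    print(s, "->", pretranslate(s))
```

Returning the origin alongside the draft mirrors how the CAT software indicates whether a suggestion came from the translation memory or from machine translation.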

What is the difference between Translation Memory and a Glossary?

A Glossary is a bilingual list of key terms.  These can be common industry terminology, or key terms for a particular customer.  Glossaries ensure that key terms are translated consistently across all content.  In regulated industries, such as the pharmaceutical, legal or finance industries, using a glossary is vital to ensure terms are translated accurately and consistently according to industry standards.
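A glossary check can be pictured like this: whenever a key term appears in the source sentence, the approved translation should appear in the target.  The sketch below uses an invented two-term glossary and naive substring matching, so it would miss inflected forms; it is only meant to show the principle.

```python
# Hypothetical English-to-French glossary for a pharmaceutical customer.
glossary = {
    "adverse event": "événement indésirable",
    "dosage": "posologie",
}


def glossary_violations(source, target):
    """List glossary terms in the source whose approved translation
    is missing from the target."""
    src, tgt = source.lower(), target.lower()
    return [
        (term, approved)
        for term, approved in glossary.items()
        if term in src and approved.lower() not in tgt
    ]


ok = glossary_violations(
    "Report any adverse event.", "Signalez tout événement indésirable."
)
bad = glossary_violations("Check the dosage.", "Vérifiez la dose.")
print(ok)   # []
print(bad)  # [('dosage', 'posologie')]
```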

What is Leveraging?

The first step in a new translation project is to analyze the content for translation to determine the number of words for translation.  This is where you see the value of using Translation Memory.  The CAT software compares the content for translation against the content in the translation memory.  Any content that matches something that was already translated will not need to be translated again. This reduces the cost of translation and will allow the translation to be completed faster.
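A very reduced version of this analysis step might look like the sketch below, which counts how many words are already covered by the translation memory and how many are new.  The sample sentences are invented, and counting words by splitting on whitespace is a simplification; CAT tools apply their own segmentation and counting rules.

```python
tm = {
    "Save your changes.": "Enregistrez vos modifications.",
    "Open the file.": "Ouvrez le fichier.",
}


def analyze(segments, tm):
    """Count words already covered by the TM versus words still to translate."""
    report = {"matched_words": 0, "new_words": 0}
    for segment in segments:
        words = len(segment.split())
        key = "matched_words" if segment in tm else "new_words"
        report[key] += words
    return report


new_content = [
    "Save your changes.",
    "Open the file.",
    "Restart the printer before printing.",
]
report = analyze(new_content, tm)
print(report)  # {'matched_words': 6, 'new_words': 5}
```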

What is Fuzzy Matching?

Fuzzy matching means finding previously translated sentences that closely match the sentence to be translated but differ slightly.  For example, if the sentence to be translated is “It was raining in Bangkok on Tuesday”, a fuzzy match might be “It was raining in Tokyo on Tuesday” or “It was raining in Bangkok on Wednesday”.  The stored translation can be reused with some quick editing.  This improves the translator’s productivity, lowers the cost of translation and helps improve consistency.
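Using the Bangkok example, a fuzzy match score can be approximated with a word-level similarity ratio.  The sketch below uses `difflib.SequenceMatcher` from Python's standard library as a stand-in; commercial CAT tools use their own scoring formulas, and the 75% threshold and the second sample sentence are assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Word-level similarity between 0 and 1 (difflib's ratio)."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_fuzzy_match(sentence, tm_sources, threshold=0.75):
    """Return (source, score) for the closest stored sentence, or None."""
    if not tm_sources:
        return None
    best = max(tm_sources, key=lambda s: similarity(sentence, s))
    score = similarity(sentence, best)
    return (best, score) if score >= threshold else None


tm_sources = [
    "It was raining in Tokyo on Tuesday",
    "The office is closed on Tuesday",
]
match = best_fuzzy_match("It was raining in Bangkok on Tuesday", tm_sources)
print(match[0], f"{match[1]:.0%}")  # It was raining in Tokyo on Tuesday 86%
```

Six of the seven words line up, so the score comes out at roughly 86%, and the translator only has to change the city name in the stored translation.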

What is 100% Matching? 

A 100% match is when a sentence for translation exactly matches a sentence that was already translated.

What are Repetitions?

Some content has a lot of repeated sentences.  For example, a questionnaire might have the same content across multiple-choice answers.  CAT software finds these, and the translator treats them in the same way as a 100% match.  They are translated once; the repeated instances are then automatically translated.  This reduces cost, speeds up the translation and ensures consistency.
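The questionnaire case can be sketched as a simple "translate once, reuse everywhere" loop.  In this illustration `fake_translate` is a hypothetical placeholder for the human translator's work, and it records each call so you can see how few sentences actually need translating.

```python
def translate_once(segments, translate):
    """Translate each distinct segment once and reuse it for repetitions."""
    cache = {}
    output = []
    for segment in segments:
        if segment not in cache:
            cache[segment] = translate(segment)
        output.append(cache[segment])
    return output


calls = []


def fake_translate(segment):
    """Placeholder for the translator; records which segments were sent."""
    calls.append(segment)
    return f"<{segment} in target language>"


segments = [
    "Strongly agree", "Agree", "Disagree",
    "Strongly agree", "Agree", "Disagree",
    "How satisfied are you with the service?",
]
translated = translate_once(segments, fake_translate)
print(len(segments), "segments,", len(calls), "translated by hand")
# 7 segments, 4 translated by hand
```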

What are the advantages of a Translation Memory?

  • Productivity & faster time to market: The translator only needs to translate new and unique content, so the translation is done faster.
  • Cost Saving: As you only need to pay for unique content that has not been translated before, this will reduce the overall cost of translation. As you build up your translation memory over time, you will start seeing more leverage from the translation memory.  A typical business can expect to save between 36% and 90% of their translation cost by utilizing translation memory over time.
  • Improved Consistency: By ensuring that content translated once does not need to be translated again, you ensure consistency across different product releases or content types.
  • Training material for customized Machine Translation: Customizing machine translation engines is the optimal way to get better output quality for a particular language. Your translation memory is a central store of all your translations.  Once you have a lot of content in your translation memory, you can use this to build custom machine translation engines that work best on your content.
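The cost-saving point in the list above comes down to simple arithmetic.  The sketch below assumes, as described there, that you pay only for words not already in the translation memory; the word counts and the per-word rate are invented, and real pricing models may differ (for example, by charging a reduced rate for matches rather than nothing).

```python
def translation_cost(total_words, matched_words, rate_per_word):
    """Return (cost_without_tm, cost_with_tm, saving_percent)."""
    new_words = total_words - matched_words
    cost_without_tm = total_words * rate_per_word
    cost_with_tm = new_words * rate_per_word
    saving_percent = 100 * matched_words / total_words
    return cost_without_tm, cost_with_tm, saving_percent


# Hypothetical project: 10,000 words, 4,000 of them already in the TM,
# at an assumed rate of 0.10 per word.
without_tm, with_tm, saving = translation_cost(10_000, 4_000, 0.10)
print(f"Without TM: {without_tm:.2f}")
print(f"With TM:    {with_tm:.2f}")
print(f"Saving:     {saving:.0f}%")
```

As the translation memory grows, the matched share of each new project tends to rise, and the saving rises with it.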

How Secure is my Content in a Translation Memory?

Traditionally, translation memory has been a desktop tool used by translators.  This required sending files for translation directly to the translator by email or FTP.  In recent years, CAT software has moved to a cloud-based environment.  This improves security.  Translators no longer need to download files to translate them; they can work in a secure, online environment where access is controlled and content cannot be downloaded.

Translation Memories as AI Content Stores

Traditionally, translation memories have just been used as stores of translated content.  Increasingly, businesses are seeing value in their translation memory data as a resource for training AI applications.  Machine translation is one example.  Companies can use their translation memory content to train machine translation engines, so they have their own customized engines that have been trained using their own terminology and translation style.  A customized machine translation engine trained like this will generally give more accurate results on that company's content than a generic machine translation engine.

The Future of Translation Memory

Translation memory technology is well-established in the translation industry.  Improved machine translation output quality is increasing the use and acceptance of machine translation.  All major CAT tools now incorporate machine translation.

In the future, machine translation and translation memory will be more tightly integrated.  The most likely developments are:

  1. Machine learning: Automatic and real-time customization of machine translation from translation memory.
  2. Auto-suggestions as you type: Similar to how Google auto-suggests for search as you type search queries.
  3. Subject matter detection: Machine translation works best when you use an engine that has been trained on a particular subject e.g. finance, insurance, automotive, e-commerce. CAT tools will detect the subject matter of content and select the appropriate machine translation engine to use based on this.