Some parts of this page may be machine-translated.

 


Translation Memory Construction and Management in Word-Centered Translation Tasks: From Expert Review to Reuse


2025.5.7


Many translation projects use Microsoft Word for delivery and review. When a new CAT tool is introduced, teams often want to take advantage of translation memory (TM) while keeping the Word-based workflow as intact as possible. This article explains how to create and use translation memory from existing Word documents without changing the current Word-based workflow.


1. The Gap Between Word-Centric Workplaces and the Use of Translation Memory

In translation projects, communication with clients and expert reviews are often conducted in Word. For example, in fields such as pharmaceuticals and law, where the accuracy of technical terms and expressions is crucial, experts in the relevant field commonly handle the final review before delivery. In such settings, it has become standard practice for these experts to leave feedback in Word, the tool they are accustomed to, using the "Track Changes" and "Comments" features.

Building translation memories for CAT tools tends to be postponed because it requires importing Word documents into the CAT tool, or preprocessing them for that purpose. To use translation memories while keeping a Word-centered workflow, Word files need to be prepared so that they can be aligned directly and so that each alignment feeds into translation memory updates.

2. What is Alignment?

"Alignment" refers to the process of comparing the source document and the translated document, pairing the source and translated sentences that have the same meaning on a sentence-by-sentence basis. It is also called "text alignment" or "segment alignment." Generally, since CAT tools translate on a sentence basis, pairs of source and translated sentences are registered in the translation memory. When importing Word documents into the translation memory, it is necessary to perform alignment of the source and translated texts.

3. Why Accurate Alignment Is Essential

The quality of a translation memory depends heavily on whether source and target sentences are paired precisely. In Word documents, inconsistent paragraphs, line breaks, and style settings reduce the accuracy of a CAT tool's automatic alignment, causing source and target sentences with different meanings to be paired by mistake. To carry translations produced in existing Word workflows into the translation memory and reuse them in future translations, secure alignment accuracy by focusing on the following points.

3-1. Unification of Paragraph Styles

Classify headings, main text, and bullet points using Word's style features so that CAT tools can read them in the same segment units.

3-2. Organizing Line Breaks and Spaces

Line breaks at the end of sentences and unnecessary spaces can cause errors in automatic alignment, so remove or standardize them in Word beforehand.
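As a rough illustration of this cleanup, the sketch below normalizes whitespace in plain text that has already been extracted from a Word paragraph. The function name `normalize_paragraph`, the `joiner` parameter, and the set of characters handled are assumptions made for this example, not features of any particular CAT tool. Joining broken lines with a space suits English text, while Japanese text would usually be joined with an empty string.

```python
import re


def normalize_paragraph(text: str, joiner: str = " ") -> str:
    """Collapse line breaks and redundant spaces inside one paragraph.

    `joiner` replaces each run of line breaks: use " " for English
    and "" for Japanese, which does not separate words with spaces.
    """
    # Treat newlines, carriage returns, and vertical tabs (which may
    # appear where Word had a manual line break) as line breaks.
    text = re.sub(r"[\x0b\r\n]+", joiner, text)
    # Convert ideographic (full-width) and non-breaking spaces to plain spaces.
    text = text.replace("\u3000", " ").replace("\u00a0", " ")
    # Collapse runs of spaces and tabs, then trim both ends.
    return re.sub(r"[ \t]+", " ", text).strip()
```

Running the same normalization on both the source and the translated text before alignment keeps the two sides consistent.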

3-3. Exclusion of Sections Not Requiring Translation

Remove elements that are excluded from translation (e.g., footnotes, comments, text boxes), because they interfere with alignment.

4. Expert Review in Word and Reflection in Translation Memory

Post-translation expert reviews are often conducted using Word's "Track Changes" and "Comments" features. Many workplaces also have an established revision workflow in which the previous and new versions are compared with Word's document comparison function so that only the changed sections are reviewed. Running alignment again on the final reviewed Word file, including the corrected parts, registers the latest translation pairs in the translation memory, so that past review results can be reused in subsequent work.

5. Existing Word → Translation Memory Workflow

5-1. Organizing Word Documents

Add "_EN" or "_JP" to each file name, along with a version number and date, so that source and translated documents can be managed centrally. Prepare separate working copies.
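One possible way to keep such names consistent is to generate them from a single helper. The sketch below is only an illustration: the function name `build_name`, the two-digit version format, and the `YYYYMMDD` date format are assumptions, and any scheme works as long as source and translation share the same base title.

```python
from datetime import date


def build_name(title: str, lang: str, version: int, day: date) -> str:
    """Return a file name such as 'protocol_EN_v03_20250507.docx'."""
    # Only the two suffixes mentioned in this workflow are accepted.
    if lang not in {"EN", "JP"}:
        raise ValueError(f"unexpected language suffix: {lang}")
    # Zero-padded version and a sortable date keep listings in order.
    return f"{title}_{lang}_v{version:02d}_{day:%Y%m%d}.docx"
```

Because the language suffix is the only part that differs, the source and translated files for the same version sort next to each other in a folder listing.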

5-2. Preprocessing

Organize paragraph styles in the copied Word document, and clean up line breaks, spaces, and sections that do not require translation.

5-3. Execute Alignment

Load the original Word file and the translated file into the CAT tool's alignment feature, review the automatically extracted pairs of source and target texts, and make corrections.

5-4. Export to Translation Memory

Export the alignment results in TMX format and merge them into a new or existing translation memory.
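CAT tools normally produce the TMX file themselves, but it can help to see roughly what such a file contains. The following sketch builds a minimal TMX 1.4 style document from aligned sentence pairs using only Python's standard library. The function name `build_tmx`, the header values (for example the tool name "alignment-sketch"), and the language codes "en" and "ja" are placeholders chosen for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="ja"):
    """Return a TMX document (as a string) for (source, target) pairs."""
    root = ET.Element("tmx", version="1.4")
    # Header values below are illustrative placeholders.
    ET.SubElement(root, "header", {
        "creationtool": "alignment-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for src, tgt in pairs:
        # One translation unit (tu) holds one variant (tuv) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(root, encoding="unicode")
```

Each aligned pair becomes one `tu` element with two `tuv` children, which is the sentence-pair structure that a translation memory stores.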

5-5. Translation by Translators

Import the source file to be translated into the CAT tool and translate it using the translation memory. After translation, use the CAT tool's export feature to output a Word document. For revisions, use Word's document comparison feature to prepare a document that compares the old and new translated files.

5-6. Expert Review and Realignment

Conduct the expert review in Word and make corrections. Then realign the corrected files and add the corrected source and target text pairs to the translation memory.

5-7. Operation and Maintenance

Remove duplicate entries and fix incorrect ones in the translation memory, and manage its versions regularly.
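Duplicate cleanup can be pictured with a small sketch. It assumes each entry is a hypothetical `(source, target, changed)` tuple, where `changed` is an ISO date string such as "2025-03-02", and it keeps only the most recently changed translation for each source sentence. Treating sources that differ only in case or spacing as duplicates is a design choice made for this example and may be too aggressive for some projects.

```python
def dedupe_entries(entries):
    """Keep one entry per source sentence, preferring the latest date.

    `entries` is an iterable of (source, target, changed) tuples;
    `changed` is an ISO date string, so plain string comparison
    orders the dates correctly.
    """
    latest = {}
    for src, tgt, changed in entries:
        # Normalize spacing and case so near-identical sources collide.
        key = " ".join(src.split()).casefold()
        if key not in latest or changed >= latest[key][2]:
            latest[key] = (src, tgt, changed)
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(latest.values())
```

In this sketch, a correction made during a later expert review replaces the older translation of the same sentence instead of sitting next to it.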

6. Automatic Alignment by AI

Traditionally, automatic alignment of source and translated texts has relied on sentence length and order of appearance. Accuracy was not always high, however, and manual correction of the alignment took considerable time. In recent years, with the arrival of AI such as ChatGPT, automatic alignment that takes the meaning of sentences into account has become possible. Even if the layout differs somewhat, it can pair source and translated sentences with similar meanings, reducing the manual preprocessing and correction work required. For more details, please see the blog article below.

Creating Translation Memory with AI: How to Automatically Pair English and Japanese
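For comparison, the traditional length-and-order approach mentioned in this section can be sketched as a small dynamic program. This is a simplified illustration, not the algorithm of any specific CAT tool: the function name `align_by_length`, the `ratio` parameter (expected target characters per source character), and the `penalty` value are all assumptions, and the cost is a plain difference in character counts instead of a statistical model.

```python
def align_by_length(src, tgt, ratio=1.0, penalty=3.0):
    """Align two sentence lists in order, using character lengths only.

    Allowed groupings are 1-1, 1-0, 0-1, 2-1 and 1-2 (source-target).
    Returns a list of (source_sentences, target_sentences) pairs.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j]: best cost of aligning src[:i] with tgt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    moves = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in moves:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                # Mismatch between scaled source length and target length.
                step = abs(s_len * ratio - t_len)
                # Anything other than a plain 1-1 pairing costs extra.
                if (di, dj) != (1, 1):
                    step += penalty
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)
    # Follow the back-pointers from the end to recover the pairs.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs
```

Because this sketch looks only at lengths and order, two unrelated sentences of similar length can still be paired, which is the weakness that meaning-based alignment is intended to address.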

7. Summary

Word has become the standard working environment in many translation settings. To leverage its strengths while running a translation memory in parallel, precise alignment is essential. By carefully preprocessing documents, for example unifying paragraph styles and removing noise, and by feeding updates from expert reviews back through realignment, the benefits of translation memory can be maximized even in a Word-centered environment.

At Human Science, we also support the introduction of CAT tools, the construction of translation memories, and the improvement of translation workflows. For questions or requests, please contact us using the inquiry form below.

Contact Us

 
