Some parts of this page may be machine-translated.

 


Trends in Machine Translation and Common Mistakes and Corrections (Post-Editing) Tips ④ ~Considering Readability by Splitting Text into Two or More Parts~


2021.5.28


In machine translation, when the original text is a single sentence, the translated text tends to be generated as a single sentence as well. Additionally, a characteristic of the Japanese language is the tendency to combine multiple ideas into a single sentence with complex grammar and structure. Therefore, in some cases, it is necessary to split the translated English text into two or more sentences to enhance readability.

 

This article introduces a few such examples.


1. Example of splitting one sentence into two ①

  • Original text: However, deviation alone cannot accurately represent the variability of the data, so variance and standard deviation are necessary.
  • Machine translation: However, deviation alone cannot accurately represent the degree of variability in the data, so variance and standard deviation are required.
  • After correction: However, the condition of data variation cannot be expressed correctly using only deviation. To do this, we also need variance and standard deviation.

 

Because the original text is a single sentence, the machine translation is also a single sentence. The post-edited version is easier to read because it breaks the original text into two parts, as shown below.

 

"However, the deviation alone cannot accurately represent the variability of the data. (To represent it accurately) variance and standard deviation are necessary."

 

Machine translation sometimes splits a single source sentence into two or more sentences on its own. Caution is needed, however, because the split may fall at an unnatural point. See the following example.

2. Example of splitting one sentence into two ②

  • Original text: Variance is a measure of the degree of dispersion, represented by the average of the squared deviations, indicating that a larger variance corresponds to greater dispersion.
  • Machine translation: The variance is an index showing the magnitude of the variation. It is the average value of the sum of squared deviations, and the larger the variance, the larger the variation.
  • After correction: Variance is an indicator of the extent of variation, and is the mean value of the sum of squared deviations. The greater the variance, the greater the variation.

 

The machine translation divides the text as follows.

 

"Variance is a measure of the degree of dispersion. (Variance is) the average of the squared deviations, and the larger the variance, the greater the dispersion."

 

For readability, however, we recommend dividing the original text as follows.

 

"Variance is a measure of the degree of dispersion, representing the average of the squared deviations. A larger variance indicates a greater degree of dispersion."
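A difference in sentence count between the source and the machine translation is a cheap signal that the engine changed the segmentation on its own, which is where an unnatural split like the one above can hide. The following is a minimal sketch, not a definitive tool. The function names are hypothetical, the sentence splitter is deliberately naive, and the English rendering of the original text stands in for the Japanese source, which is not shown in this article.

```python
import re

# Split after English terminators followed by whitespace, or directly after
# Japanese terminators (which are normally not followed by a space).
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+|(?<=[。！？])")


def count_sentences(text: str) -> int:
    """Naive sentence count; abbreviations such as "Ver. 3.2" are miscounted."""
    parts = _SENTENCE_BOUNDARY.split(text.strip())
    return len([p for p in parts if p.strip()])


def segmentation_changed(source: str, translation: str) -> bool:
    """True when the translation has a different sentence count than the source."""
    return count_sentences(source) != count_sentences(translation)


# Stand-in for the Japanese source: the English rendering from example 2.
SOURCE = (
    "Variance is a measure of the degree of dispersion, represented by the "
    "average of the squared deviations, indicating that a larger variance "
    "corresponds to greater dispersion."
)
MT = (
    "The variance is an index showing the magnitude of the variation. "
    "It is the average value of the sum of squared deviations, and the "
    "larger the variance, the larger the variation."
)

print(segmentation_changed(SOURCE, MT))  # prints True: one sentence became two
```

A flagged segment is not necessarily wrong. It only tells the post-editor to check whether the split falls at a natural point.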

3. Example of splitting one sentence into two ③

  • Original text: Therefore, we investigated the temperature and the number of visitors for one month, created a scatter plot using that data, and made a forecast for the preparation quantity.
  • Machine translation: Therefore, I investigated the temperature and the number of visitors for one month, created a scatter plot using the data, and made a forecast of the quantity to be charged for each problem.
  • After correction: In response, the temperature and number of customers were measured for one month. The owner used this data to create a scatter diagram and try to predict the order quantity to address his problem.

 

The machine translation has a long, hard-to-read structure: "I investigated ..., created ..., and made ...." Therefore, during post-editing, I broke the original text down as follows.

 

"So, the owner investigated the temperature and the number of customers for a month. Using that data, a scatter plot was created to predict the quantity of items needed for preparation."

 

(👉Other post-editing points: In the machine translation, the subject is 'I', but contextually the subject is 'the owner', so I changed it to 'the owner'.)
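Long, comma-chained sentences like the machine translation in this example can be flagged for the post-editor before review starts. The sketch below is only an illustration under stated assumptions: the thresholds `MAX_WORDS` and `MAX_COMMAS` are arbitrary values chosen for this example, the function names are hypothetical, and the sentence splitter is naive (it would break on abbreviations such as "Ver. 3.2").

```python
import re

# Hypothetical thresholds; tune them for your own content and style guide.
MAX_WORDS = 25
MAX_COMMAS = 2


def split_sentences(text: str) -> list[str]:
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def needs_split_review(sentence: str) -> bool:
    """Flag a long or comma-heavy sentence as a candidate for splitting."""
    word_count = len(sentence.split())
    comma_count = sentence.count(",")
    return word_count > MAX_WORDS or comma_count > MAX_COMMAS


MT_SENTENCE = (
    "Therefore, I investigated the temperature and the number of visitors "
    "for one month, created a scatter plot using the data, and made a "
    "forecast of the quantity to be charged for each problem."
)
POST_EDITED = (
    "In response, the temperature and number of customers were measured "
    "for one month. The owner used this data to create a scatter diagram "
    "and try to predict the order quantity to address his problem."
)

# The single-sentence MT output is flagged; the two post-edited sentences are not.
print(needs_split_review(MT_SENTENCE))
print([needs_split_review(s) for s in split_sentences(POST_EDITED)])
```

A check like this only points to candidates. Deciding where to split, and fixing issues such as the wrong subject, still requires a human post-editor.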

4. Example of splitting one sentence into two ④

  • Original text: One of the fundamental concepts of DTC is that it is important to analyze data collected from the field, the actual items, and reality, to judge the merits and demerits of things, and to take action.
  • Machine translation: One of the basic ideas of DTC, it is important to analyze the data collected in the field, in kind, and in reality, judge the quality of things, and take action.
  • After correction: This is one of the DTC basic concepts. The key point of fact finding is to analyze data gathered from actual locations, actual objects, and actual conditions to make judgments about things before acting.

 

The original text has a complex structure ("...and, ...and, ...and, ...becomes") that packs multiple ideas into a single sentence. As a result, the machine translation is unnaturally constructed. I judged that separating the sentences as shown below would make the text easier to read, and corrected it accordingly.

 

"(Facing the facts) is one of the fundamental concepts of DTC. Analyzing data collected from the field, the actual items, and reality, judging the merits and demerits of things, and taking action is an important point."

 

(👉 Other post-editing points: 'the field, in kind, and in reality' was translated in a way that was unclear, so I corrected it to 'actual locations, actual objects, and actual conditions.')

 

5. Example of splitting one sentence into two ⑤

  • Original text: If the model of Device C is DR5600S and the software version (confirmed in step 8-9-4) is earlier than Ver.3.2, it is an unsupported version, and the switch number will be 4 (fixed value).
  • Machine translation (Google Translate): If the model of device C is DR5600S and the software version (confirmed in step 8-9-4) is earlier than Ver.3.2, the switch number is 4 (fixed value).
  • Machine translation (DeepL): If Device C is a DR5600S and the software version is earlier than Ver. 3.2 (as confirmed in Step 8-9-4), the switch number is 4 (fixed value) because the stack is not supported by this version.
  • After correction: If the model of device C is DR5600S and the software version (confirmed in step 8-9-4) is earlier than Ver. 3.2, the stack functionality is not supported. In this case, the switch number will be 4 (fixed value).
  • Further improvement: If the model of device C is DR5600S and the software version confirmed in step 8-9-4 is earlier than Ver. 3.2, the stack functionality is not supported. In this case, the switch number will remain fixed at 4.

 

For comparison, I ran the original text of this last example through both Google Translate and DeepL. The DeepL output was clearly of higher quality.

 

In the Google Translate output, the phrase "because it is a stack unsupported version" in the original text is omitted entirely. This omission is a critical error. In addition, there is an extraneous space after "value" in "(fixed value)".
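Omissions like this are easy to miss when reading fluent output, so a simple term check can help. The following is a minimal sketch that assumes the post-editor maintains a list of terms that must survive translation. The list `REQUIRED_TERMS` and the function `find_missing_terms` are hypothetical names made up for this illustration, and a plain substring match is only a rough approximation of a real terminology check.

```python
def find_missing_terms(translation: str, required_terms: list[str]) -> list[str]:
    """Return the required terms that do not appear in the translation.

    Matching is a case-insensitive substring test, so it can miss inflected
    forms and can match inside longer words.
    """
    lowered = translation.lower()
    return [term for term in required_terms if term.lower() not in lowered]


# Hypothetical list of terms that the source sentence in example 5 requires.
REQUIRED_TERMS = ["stack", "fixed value"]

GOOGLE_MT = (
    "If the model of device C is DR5600S and the software version "
    "(confirmed in step 8-9-4) is earlier than Ver.3.2, the switch number "
    "is 4 (fixed value)."
)
DEEPL_MT = (
    "If Device C is a DR5600S and the software version is earlier than "
    "Ver. 3.2 (as confirmed in Step 8-9-4), the switch number is 4 "
    "(fixed value) because the stack is not supported by this version."
)

print(find_missing_terms(GOOGLE_MT, REQUIRED_TERMS))  # prints ['stack']
print(find_missing_terms(DEEPL_MT, REQUIRED_TERMS))   # prints []
```

An empty result does not prove the translation is complete. It only shows that none of the listed terms were dropped.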

 

We believe that the DeepL output has fewer issues, which reduces the post-editing burden.

 

However, because the original text is a single sentence, the machine translation is also a single sentence. In this example, the sentence carries a relatively large amount of information, so we recommend splitting it into two sentences for readability. The result is the "After correction" version above.

 

To make the text even more readable, you may also consider working the parenthetical information from the original text ("confirmed in step 8-9-4" and "fixed value") into the sentences themselves. The "Further improvement" version above illustrates this.

 

Machine translation generally does not split a single sentence into two or more parts where doing so would improve readability. Therefore, I believe that addressing this issue is an important part of post-editing work.

 

 


