Some parts of this page may be machine-translated.

 

Trends in machine translation and common mistakes and tips for correction (Post-editing) Part 4 - Splitting sentences into two or more for readability

Machine translation tends to render one source sentence as one translated sentence. Japanese, moreover, tends to pack several ideas into a single sentence with complex grammar and structure. In some cases, therefore, the translated sentence needs to be divided into two or more sentences to make it easier to read.

 

This article introduces several such examples.

1. Example of dividing one sentence into two sentences①

  • Original text: However, just the deviation alone cannot accurately represent the degree of variation in the data, so both variance and standard deviation are necessary.
  • Machine Translation: However, deviation alone cannot accurately represent the degree of variability in the data, so variance and standard deviation are required.
  • After correction: However, the condition of data variation cannot be expressed correctly using only deviation. To do this, we also need variance and standard deviation.

 

Because the original text is a single sentence, the machine translation is also a single sentence. To make it easier to read, we divided the original text as follows during post-editing.

 

"However, deviation alone cannot accurately represent the degree of data dispersion. (To accurately represent it), variance and standard deviation are necessary."

 

Machine translation does sometimes divide a single sentence into two or more sentences. However, caution is necessary because the split may fall in an unnatural place. See the following example.

2. Example of dividing one sentence into two sentences②

  • Original text: Variance is an indicator that represents the size of dispersion, and it is the average of the squared deviations. The larger the variance, the larger the dispersion.
  • Machine Translation: The variance is an index showing the magnitude of the variation. It is the average value of the sum of squared deviations, and the larger the variance, the larger the variation.
  • After correction: Variance is an indicator of the extent of variation, and is the mean value of the sum of squared deviations. The greater the variance, the greater the variation.

 

The machine translation divides the text as follows.

 

"Variance is a measure of the size of dispersion. (Variance is) the average of the squared deviations, and the larger the variance, the larger the dispersion."

 

For readability, however, we recommend dividing the original text as follows.

 

"Variance is a measure of the size of dispersion, and it is the average of the squared deviations. The larger the variance, the greater the dispersion."

3. Example of dividing one sentence into two sentences③

  • Original text: So, we conducted a one-month survey of temperature and number of visitors, created a scatter plot using that data, and tried to make a prediction of the amount of preparation needed.
  • Machine Translation: Therefore, I investigated the temperature and the number of visitors for one month, created a scatter plot using the data, and made a forecast of the quantity to be charged for each problem.
  • After correction: In response, the temperature and number of customers were measured for one month. The owner used this data to create a scatter diagram and try to predict the order quantity to address his problem.

 

The machine translation has the structure "I investigated ..., created ..., and made ...", which is long and difficult to read. In post-editing, therefore, the original text was divided and translated as follows.

 

"So, the store owner conducted a one-month survey of temperature and number of visitors. Using that data, they created a scatter plot and tried to predict the amount of preparation needed for any issues."

 

(👉Other post-edit points: The machine translation uses "I" as the subject, but the context shows that the subject is "the owner", so I changed it to "the owner".)

4. Example of dividing one sentence into two sentences④

  • Original text: One of the basic ideas of DTC is to analyze data collected in the field, on site, and in reality, and make judgments about the quality of things and take action.
  • Machine Translation: One of the basic ideas of DTC is to analyze the data collected in the field, in kind, and in reality, judge the quality of things, and take action.
  • After correction: This is one of the DTC basic concepts. The key point of fact finding is to analyze data gathered from actual locations, actual objects, and actual conditions to make judgments about things before acting.

 

The original text has a complex "…, …, …, and …" structure that combines several ideas in one sentence. As a result, the machine translation output also has an unnatural sentence structure. We therefore judged that breaking up the sentence as shown below would make it easier to read, and revised it accordingly.

 

"Facing reality is one of the fundamental principles of DTC. It is important to analyze data collected from the field, physical objects, and reality, and make judgments on the quality of things and take action."

 

(👉Other post-edit points: "the field, in kind, and in reality" ⇒ "actual locations, actual objects, and actual conditions", because the machine translation was unclear.)

 

5. Example of dividing one sentence into two sentences⑤

  • Original: If the model of machine C is DR5600S and the software version (confirmed in step 8-9-4) is earlier than Ver.3.2, it is a non-stackable version, so the switch number will be 4 (fixed value).
  • Machine Translation (Google Translate): If the device model is DR5600S and the software version (confirmed in step 8-9-4) is earlier than Ver.3.2, the switch number is 4 (fixed value ).
  • Machine Translation (DeepL): If Device C is a DR5600S and the software version is earlier than Ver. 3.2 (as confirmed in Step 8-9-4), the switch number is 4 (fixed value) because the stack is not supported by this version.
  • After correction: If the model of device C is DR5600S and the software version (confirmed in step 8-9-4) is earlier than Ver. 3.2, the stack functionality is not supported. In this case, the switch number will be 4 (fixed value).
  • Further improvements: If the model of device C is DR5600S and the software version confirmed in step 8-9-4 is earlier than Ver. 3.2, the stack functionality is not supported. In this case, the switch number will remain fixed at 4.

 

For comparison, in this final example we processed the original text with both Google Translate and DeepL. The DeepL output was clearly of higher quality.

 

In the Google Translate output, the clause in the original text explaining that this is a non-stackable version has been omitted entirely. An omission like this is a critical translation error. There is also a stray space after "value" in "(fixed value )".

 

The DeepL output has no such problems, so it should reduce the post-editing burden.
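Mechanical defects such as the stray space inside "(fixed value )" can be caught automatically before a human reviewer reads the text. The following is a minimal sketch, not part of the workflow described in this article: the function name `find_stray_paren_spaces` and the 15-character context window are assumptions chosen for illustration.

```python
import re

# Hypothetical helper for illustration only.
# Matches whitespace directly after "(" or directly before ")",
# e.g. "(fixed value )" or "( fixed value)".
STRAY_SPACE = re.compile(r"\(\s+|\s+\)")

def find_stray_paren_spaces(text: str) -> list[tuple[int, str]]:
    """Return (offset, context) pairs for each space just inside a parenthesis."""
    hits = []
    for m in STRAY_SPACE.finditer(text):
        # Keep a little surrounding text so a reviewer can locate the hit.
        start = max(0, m.start() - 15)
        end = min(len(text), m.end() + 15)
        hits.append((m.start(), text[start:end]))
    return hits

mt_output = "the switch number is 4 (fixed value )."
for offset, context in find_stray_paren_spaces(mt_output):
    print(f"offset {offset}: ...{context}...")
```

A check like this only finds formatting slips. It cannot detect an omitted clause, which still requires comparing the translation against the original text.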

 

However, DeepL also renders the text as one sentence because the original text is one sentence. Because this sentence carries a relatively large amount of information, we recommend dividing it into two sentences for readability. The result is shown in "After correction" above.

 

To make it even easier to read, you can also consider integrating the information in parentheses (such as "confirmed in step 8-9-4" and "fixed value") into the text. This is shown in "Further improvements" above.

 

Machine translation generally does not divide a single source sentence into two or more sentences for the sake of readability. I therefore believe this is an important point to check in post-editing work.
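As a rough aid for this check, long sentences in machine translation output can be flagged as candidates for splitting. The sketch below is illustrative only: the 25-word threshold, the function names, and the naive sentence splitter are all assumptions, and the final decision on whether to split remains with the post-editor.

```python
import re

# Assumption: word count is a rough proxy for "hard to read".
# The threshold of 25 is illustrative and is not taken from this article.
MAX_WORDS = 25

def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    Limitation: abbreviations such as "Ver. 3.2" are split incorrectly.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def flag_long_sentences(text: str, max_words: int = MAX_WORDS) -> list[tuple[int, str]]:
    """Return (word_count, sentence) for every sentence longer than max_words."""
    flagged = []
    for sentence in split_sentences(text):
        word_count = len(sentence.split())
        if word_count > max_words:
            flagged.append((word_count, sentence))
    return flagged

# Machine translation output from example 3 above.
mt_output = (
    "Therefore, I investigated the temperature and the number of visitors "
    "for one month, created a scatter plot using the data, and made a "
    "forecast of the quantity to be charged for each problem."
)
for word_count, sentence in flag_long_sentences(mt_output):
    print(f"{word_count} words: {sentence}")
```

With this threshold, the single-sentence machine translation from example 3 is flagged, while the two sentences of its corrected version are not.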

 

 

Author Information

Andy Park, Multilingual Translation Group
Japanese-English Translation Reviewer

  • In my previous job, I worked as an IT engineer for about 4 years, and then I worked as an English conversation instructor for 8 years, where I was involved in developing educational programs and training instructors.
  • Translation experience of 11 years, specializing in IT and business fields.
  • Currently engaged in translation work and translation quality management, primarily focusing on FA-related products such as product manuals, help documents, and operation manuals.
  • Responsible for evaluating and verifying the translation quality of machine translation engines.