
 


How Machine Translation Works ~ Literal and Free Translation Types, and What Is Neural Translation? ~



1. What is Machine Translation?

Machine translation is the technology of automatically translating a text written in one language to another language using a computer program. It is also known as automatic translation, MT, or AI translation. It is now widely used not only in research and business, but also in everyday situations such as browsing foreign news or traveling. It is easily accessible through web browser translation features and translation apps for smartphones, and its popularity is increasing due to its convenience.

One of the main features of machine translation is its speed. Machine translation is much faster than human translators, making it possible to translate large amounts of text in a short amount of time. Additionally, it is more cost-effective compared to using a translation company. Furthermore, it supports multiple languages and some services can translate over 100 languages.

Machine translation has both advantages and disadvantages. The main advantage is that it makes information gathering and communication easier for a much wider range of people. With machine translation, it becomes easier to access information and content from all over the world and to communicate with people who speak other languages, which makes international exchange more accessible. It is also used in education and business: it is useful for studying abroad and overseas business trips, and it helps with information gathering in research and development and with communication with overseas partners.

On the other hand, there is an issue with accuracy. Machine translation is not perfect and can have difficulty accurately capturing cultural backgrounds and expressions. Therefore, it is recommended to outsource important documents and contracts to a specialized translation company.

2. History of Machine Translation

The first machine translation systems appeared in the 1950s and were rule-based. Statistical machine translation began to take over in the late 1980s, and neural machine translation emerged in the 2010s.

With rule-based translation, humans must write the translation rules by hand based on dictionaries and grammar, which makes the system complex and difficult to update. In addition, accuracy tends to be low for anything other than standard, formulaic phrases.

In statistical machine translation, by contrast, the computer learns the correspondences itself. It reads a large number of pairs of original and translated sentences (e.g. 1 million sentence pairs) and learns how words and phrases in this data (the corpus) correspond to each other, which makes it relatively easy to handle new terms. However, translation between languages with very different grammars, such as English and Japanese, is difficult, and the accuracy often falls short of practical use.

In addition, there are also hybrid translation methods that combine rule-based and statistical machine translation, as well as example-based translation techniques that extract similar parts from existing source and translated pairs for translation. These methods have improved translation accuracy compared to traditional rule-based machine translation.

In neural translation, a large number of pairs of original text and translated text are read in for learning, similar to statistical machine translation. However, by using neural networks and deep learning, a type of machine learning, more information can be utilized for translation. As a result, translation accuracy has greatly improved and natural and fluent translations can be obtained. With the emergence of neural machine translation, machine translation has gained attention and is now widely used in daily life and work.

3. What is the mechanism of machine translation?

We will provide a detailed explanation of machine translation methods and techniques.

Rule-based Machine Translation

Rule-Based Machine Translation (RBMT) requires specialized knowledge of both the source and target languages' linguistics and grammar. The translation process is mainly composed of the following three stages.

1. Morphological Analysis:
In this stage, the input sentence is divided into morphemes (the smallest units of meaning), and information such as part of speech and conjugation form is extracted. For example, the English sentence "I am eating a cake." can be divided into the morphemes "I", "am", "eat+ing", "a", and "cake".

2. Syntax and Semantic Analysis:
In syntax analysis, morphemes are further analyzed according to language structure and converted into a hierarchical structure called a syntax tree. A syntax tree is a tree structure that represents grammatical structure in natural language processing (NLP), showing how words and phrases combine to form a complete sentence. The syntax tree represents the components of a sentence, such as subject, verb, and object, as hierarchical nodes, and by explicitly indicating their relationships and functions, it enables more accurate translation. In semantic analysis, the meaning of words and sentences in the source text is extracted. The purpose of semantic analysis is to understand the meaning of the source language text and accurately convey it in the target language. Even for polysemous words, appropriate translation terms are selected based on context.

3. Generation:
At this stage, sentences in the target language are generated based on the syntax tree and semantic information. Grammar rules and dictionaries are applied to combine appropriate word order and morphemes to construct sentences in the target language.
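
As a rough illustration of the three stages, here is a minimal Python sketch that translates the example sentence into romanized Japanese. The lexicon, the suffix table, the flat subject/verb/object dictionary (a stand-in for a real syntax tree), and all function names are hypothetical and cover only this one sentence pattern. Real rule-based systems rely on far larger dictionaries and grammars.

```python
# Toy rule-based pipeline: English (subject-verb-object) to romanized
# Japanese (subject-object-verb). Every table below is a hypothetical example.

LEXICON = {  # source stem -> (part of speech, target word or stem)
    "i": ("PRON", "watashi"),
    "eat": ("VERB", "tabe"),
    "cake": ("NOUN", "keeki"),
}
FUNCTION_WORDS = {"am": "AUX", "a": "DET"}
SUFFIXES = {"ing": "PROGRESSIVE"}


def analyze_morphemes(sentence):
    """Stage 1: split the sentence into morphemes and tag each one."""
    morphemes = []
    for token in sentence.lower().rstrip(".").split():
        if token in FUNCTION_WORDS:
            morphemes.append((token, FUNCTION_WORDS[token]))
        elif token in LEXICON:
            morphemes.append((token, LEXICON[token][0]))
        else:
            # Try to split an inflected form such as "eating" into "eat" + "ing".
            for suffix, feature in SUFFIXES.items():
                stem = token[: -len(suffix)]
                if token.endswith(suffix) and stem in LEXICON:
                    morphemes.append((stem, LEXICON[stem][0]))
                    morphemes.append((suffix, feature))
                    break
            else:
                raise ValueError(f"unknown word: {token}")
    return morphemes


def parse(morphemes):
    """Stage 2: assign grammatical roles (a flat stand-in for a syntax tree)."""
    tree = {"subject": None, "verb": None, "object": None, "aspect": "SIMPLE"}
    for form, tag in morphemes:
        if tag == "PRON" and tree["subject"] is None:
            tree["subject"] = form
        elif tag == "VERB":
            tree["verb"] = form
        elif tag == "NOUN":
            tree["object"] = form
        elif tag == "PROGRESSIVE":
            tree["aspect"] = "PROGRESSIVE"
    return tree


def generate(tree):
    """Stage 3: look up target words and reorder them as subject-object-verb."""
    subject = LEXICON[tree["subject"]][1]
    obj = LEXICON[tree["object"]][1]
    verb_stem = LEXICON[tree["verb"]][1]
    ending = "te imasu" if tree["aspect"] == "PROGRESSIVE" else "masu"
    return f"{subject} wa {obj} o {verb_stem}{ending}"


def translate(sentence):
    return generate(parse(analyze_morphemes(sentence)))


print(translate("I am eating a cake."))  # watashi wa keeki o tabete imasu
```

Because every step is an explicit rule, a wrong output can be traced back to a specific dictionary entry or rule. The flip side is that any word or sentence pattern not covered by the tables raises an error.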

The advantages of rule-based machine translation are that the translation process is clear and it is easy to identify the cause of errors or problems that may occur. Additionally, by utilizing high-quality dictionaries and grammar rules created by experts, translations are generally grammatically correct.

The drawbacks are the large amount of expert labor needed to support new language pairs and domains, as well as the difficulty of handling diverse expressions and slang. Furthermore, because of grammatical and lexical differences between languages, the translation results often lack naturalness and fluency. For these reasons, modern translation systems mainly use data-driven approaches such as Neural Machine Translation (NMT).

Statistical Machine Translation

Statistical Machine Translation (SMT) is a machine translation method in which computers translate based on statistical patterns extracted from a large amount of bilingual text (parallel corpus consisting of original and translated sentences). Unlike rule-based machine translation that relies on linguistic knowledge and grammar rules, statistical machine translation learns the transformation between language pairs using machine learning algorithms and probability models.

The main approaches to statistical machine translation include the following:

1. Word-based Translation:
In a word-based approach, the probability of one word being translated to another is used. This allows for the selection of the most likely word combinations. However, this approach makes it difficult to address issues with word order and structure.

2. Phrase-based Translation:
In phrase-based translation, longer units (phrases made up of multiple words) are handled instead of individual words. This makes it easier to capture local word order and multi-word expressions that word-by-word translation would miss. Phrase pairs are extracted from parallel corpora, and the system then decides how to combine and order them in the translated sentence.

3. Syntax-based Translation:
In a syntax-based approach, the structure of a sentence is captured using the syntax trees of the source and target languages. This allows for accurate representation of grammatical relationships and semantic information. In syntax-based translation, the syntax tree of the source language is converted to the syntax tree of the target language based on syntax rules extracted from parallel corpora.
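
To make the word-based approach more concrete, the following Python sketch estimates word translation probabilities from a tiny parallel corpus, using expectation-maximization in the style of IBM Model 1 (simplified here: no NULL word and no alignment or word order model). The three-sentence English-German corpus and the function names are assumptions made for illustration only. Real systems train on millions of sentence pairs.

```python
from collections import defaultdict


def train_word_model(pairs, iterations=10):
    """Estimate t(target_word | source_word) from sentence pairs with EM."""
    target_vocab = {word for _, target in pairs for word in target}
    uniform = 1.0 / len(target_vocab)
    # t[(target_word, source_word)] starts out uniform for every pair.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word

        # E-step: spread each target word over the source words of its sentence,
        # in proportion to the current translation probabilities.
        for source, target in pairs:
            for tw in target:
                norm = sum(t[(tw, sw)] for sw in source)
                for sw in source:
                    fraction = t[(tw, sw)] / norm
                    count[(tw, sw)] += fraction
                    total[sw] += fraction

        # M-step: re-estimate the probabilities from the expected counts.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus: (source sentence, target sentence), already tokenized.
corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]

model = train_word_model(corpus)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(model, word))
```

Nothing in the code says that "house" corresponds to "Haus". The model works it out because "das" is already explained by "the" in the other sentences. Note that the model only scores word pairs, which is why the word-based approach struggles with word order and sentence structure.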

The advantage of statistical machine translation is that it can automatically acquire translation knowledge from a large amount of data. As long as enough parallel data is available, it can be adapted to various languages and domains with relatively little manual work, and it can also pick up new expressions and slang that appear in the data. However, performance may decrease for language pairs with insufficient parallel corpora or significantly different grammatical structures.

In recent years, the approach of neural machine translation has become mainstream, and statistical machine translation is gradually giving way to it.

Neural Translation

Neural Machine Translation (NMT) is the most recent of the three approaches and uses deep learning to translate from one language to another. It is also referred to as AI translation. Neural machine translation uses neural network architectures such as Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and Transformers to automatically learn the knowledge needed for translation from parallel corpora. Unlike rule-based and statistical machine translation, it represents words and sentences as dense numerical vectors, which allows it to capture the meaning and grammatical relationships of a sentence as a whole.

A typical neural machine translation system consists of two main components:

1. Encoder:
The encoder converts the input sentence in the source language into a continuous vector representation. Each word, subword, or character is first mapped to an embedding vector, and the encoder then combines these vectors so that the structure and meaning of the sentence are captured. The embeddings are typically learned together with the rest of the model, although pre-trained embeddings are sometimes used. RNNs and CNNs were commonly used for the encoder in earlier systems, and Transformer layers are common today.

2. Decoder:
The decoder uses the continuous vector representation obtained from the encoder to generate the translated text in the target language. This is usually done word by word: at each step the most probable next word is selected, or several candidate sentences are kept in parallel using a technique called beam search. Like the encoder, the decoder is built from RNNs, CNNs, or Transformer layers.
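
The word-by-word generation described above can be sketched in Python as a greedy decoding loop. The probability table below is a hand-written, hypothetical stand-in for a trained decoder, and it ignores the encoder output that a real model would condition on. Only the loop itself (ask for next-word probabilities, pick the most probable word, stop at the end-of-sentence marker) reflects how decoding works.

```python
# Hypothetical next-word probabilities, keyed by the previously generated word.
# A real decoder would compute these from the encoder's vectors and the
# words generated so far.
TOY_TABLE = {
    "<s>": {"ich": 0.8, "kuchen": 0.1, "</s>": 0.1},
    "ich": {"esse": 0.7, "kuchen": 0.2, "</s>": 0.1},
    "esse": {"kuchen": 0.6, "ich": 0.1, "</s>": 0.3},
    "kuchen": {"</s>": 0.9, "esse": 0.1},
}


def toy_next_word_probs(context, generated):
    """Stand-in for a trained decoder; `context` is accepted but unused here."""
    return TOY_TABLE[generated[-1]]


def greedy_decode(next_word_probs, context, max_len=10):
    """Generate a sentence by always choosing the most probable next word."""
    generated = ["<s>"]  # start-of-sentence marker
    while len(generated) <= max_len:
        probs = next_word_probs(context, generated)
        word = max(probs, key=probs.get)
        if word == "</s>":  # end-of-sentence marker: stop generating
            break
        generated.append(word)
    return generated[1:]


source_sentence = ["i", "eat", "cake"]  # placeholder for the encoder output
print(greedy_decode(toy_next_word_probs, source_sentence))  # ['ich', 'esse', 'kuchen']
```

Greedy selection is simple but can commit to a word that later turns out to be a poor choice, which is one reason practical systems often keep several candidates at each step.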

In recent years, the Transformer architecture has been very successful, surpassing earlier RNN- and CNN-based models in both translation quality and training speed. Transformers use attention mechanisms to efficiently capture the relationships between words within and across the source and target sentences.
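
The attention mechanism can be illustrated with a short Python sketch of scaled dot-product attention for a single query. The three-dimensional vectors are made-up values chosen so that the result is easy to read. In a trained Transformer the queries, keys, and values are learned, much higher-dimensional, and computed for many attention heads at once.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the weighted average of `values` and the attention weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


# Hypothetical vectors for the source words "I", "eat", "cake".
source_words = ["I", "eat", "cake"]
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
values = keys  # reuse the keys as values to keep the example small

# Hypothetical decoder state at the moment it is about to output "Kuchen".
query = [0.0, 0.0, 3.0]

output, weights = attention(query, keys, values)
for word, weight in zip(source_words, weights):
    print(f"{word}: {weight:.2f}")
```

The source word whose key is most similar to the query ("cake" in this made-up example) receives the largest weight, so its value dominates the vector passed on to the next layer.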

The advantages of neural machine translation include high translation accuracy and naturalness, as well as the ability to automatically acquire grammar and expressions from training data. The challenges are the large amount of training data and computational resources required and the complexity of the training process. In addition, the inner workings of the model are largely a black box, which makes it difficult to identify the cause of an error and fix it.

The quality of translation varies between machine translation services, even when they use the same neural machine translation method. Two representative services are Google Translate and DeepL. Roughly speaking, Google Translate can be considered a literal translation type, while DeepL can be considered a free translation type. Because Google Translate tends to translate literally, its output may sound unnatural. DeepL tends to translate more freely, so its output is usually more natural. However, a commonly reported issue with DeepL is that words or sentences from the original text are omitted in the translation. The smooth and natural wording can make this worse, because the omission may go unnoticed when only the translated text is read. For more information on the translation accuracy of DeepL and Google Translate, please refer to the following article.

Latest Trends in Machine Translation and Comparison of "DeepL" and "Google Translate"

What is Deep Learning?

Deep learning is a type of artificial intelligence (AI) technology, and more specifically a family of machine learning methods based on multi-layered neural networks. These "deep neural networks" are loosely modeled on the connections between neurons in the human brain. They learn from large amounts of data and can perform advanced recognition and decision-making. Their applications are not limited to machine translation.

Deep learning brings the following benefits to translation.

1. More natural translations:
Translation systems using deep learning can provide more natural and accurate translations than earlier translation methods. This is because neural networks can take sentence structure and context into account when translating.

2. Better handling of technical terminology:
Systems using deep learning can often translate words from specific fields, such as technical terminology and proper nouns, accurately. This is because translation services from companies like DeepL and Google primarily learn from large amounts of data on the internet, which includes technical terminology and proper nouns from many different fields.

4. Summary

Machine translation is a technology that uses computer programs to automatically translate text written in one language to another. Its advantages include the ability to translate large amounts of text quickly and at a low cost, as well as being able to handle multiple languages. There are three main types of machine translation: rule-based, statistical, and neural. In recent years, neural machine translation has become widely used. While the introduction of neural machine translation has greatly improved translation accuracy, issues such as mistranslations and omissions still exist.

We offer MTrans for Office, translation software that incorporates machine translation services from DeepL, Google, and Microsoft. With just one click, you can translate documents in Microsoft Office products (Word, Excel, PowerPoint, Outlook), which can also help reduce your workload. Try it free for 14 days and see the quality and ease of use for yourself.
