What are the strengths of AI translation using LLM (Large Language Models) such as GPT-4 and Gemini Pro?

Chat AIs such as ChatGPT, Copilot, and Bard are gaining attention. In this blog post, we explain large language models (LLMs), the technology that makes chat AIs possible, and AI translation using LLMs.

1. What is Translation Using LLM (Large Language Model)?

1-1. What is LLM?

A large language model (LLM) is a language model built from very large datasets using deep learning techniques. It is a type of generative AI, the family of technologies that can autonomously generate text, images, and audio, and it excels at natural language processing. LLMs are what make chat AIs such as ChatGPT possible. They handle advanced language processing across many tasks: in addition to generating text, they can explain, summarize, and translate.

Specific examples of LLMs include "Gemini Pro", which powers Google's chat AI "Bard"; "GPT-3.5 Turbo" and "GPT-4", which power OpenAI's "ChatGPT"; "Llama" from Meta; and "Dolly" from Databricks. These models are used in a variety of fields.

The Japanese version of Google's Bard had been using PaLM 2, but it switched to the latest-generation AI model, Gemini Pro, in February 2024.

1-2. What is Machine Translation by LLM?

The datasets that LLMs learn from consist mainly of text collected from the internet. Because the internet contains text in many languages, that training data is multilingual as well. As a result, LLMs learn not only English but also Asian languages such as Japanese, Chinese, Korean, Vietnamese, and Thai, and European languages such as French, Italian, German, Spanish, and Portuguese, which makes them capable of multilingual machine translation. Because the datasets contain an especially large amount of English, the output tends to be most natural when translating into English.

2. Benefits and Strengths of Translation Using LLM (Large Language Model)

Using LLMs for translation offers the following benefits and strengths.

2-1. High Precision

LLMs learn from a wide range of content, so they can translate accurately across many fields. In particular, when the translation instructions state the field and purpose of the document, the LLM can use terminology and style appropriate to that document. Traditional machine translation engines based on NMT (Neural Machine Translation) models could not be steered this way. Preparing an NMT model specialized for a particular field or purpose required a large amount of text data and training work, which cost both time and money. With an LLM, simply giving the right instructions is enough to generate a translation suited to the purpose.
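As a minimal sketch of what "specifying the field and purpose in the translation instructions" can look like, the snippet below assembles a prompt string. The helper name `build_translation_prompt`, its parameters, and the wording of the instructions are all assumptions made for illustration; it does not call any particular LLM service, and the resulting text would be sent to whichever chat AI you use.

```python
def build_translation_prompt(text, source_lang, target_lang, field=None, purpose=None):
    """Assemble a translation instruction for a chat-style LLM (hypothetical helper)."""
    lines = [f"Translate the following text from {source_lang} to {target_lang}."]
    if field:
        # Naming the field nudges the model toward that field's terminology.
        lines.append(f"Field: {field}. Use terminology that is standard in this field.")
    if purpose:
        # Naming the purpose nudges the model toward the matching style.
        lines.append(f"Purpose: {purpose}. Match the style such documents normally use.")
    lines.append("Return only the translation.")
    lines.append("")  # blank line between the instructions and the text
    lines.append(text)
    return "\n".join(lines)


prompt = build_translation_prompt(
    "本製品を使用する前に、必ず取扱説明書をお読みください。",
    source_lang="Japanese",
    target_lang="English",
    field="consumer electronics",
    purpose="user manual",
)
print(prompt)
```

Changing only the `field` and `purpose` arguments produces a differently steered request, with no retraining of the model involved.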

2-2. Handles subtle word usage

Thanks to their strong language comprehension, LLMs can pick up on even subtle word choices. This sensitivity to fine differences in language results in translations that convey nuance effectively. LLMs are particularly strong at English expression and produce fluent translations into English. They can also rephrase terminology and expressions to suit the intended audience and purpose: when the instructions state the target readers and the purpose of the document, the LLM adjusts its word choice accordingly.

2-3. Real-time speed

Although LLMs are trained on a massive amount of language data, estimated to be on the order of trillions of tokens, they respond almost in real time. They may take slightly longer than NMT models, but given the scale of the models they are surprisingly quick, and the speed does not cause problems in regular business operations.

3. LLM (Large Language Model) commonly used for translation

This section explains the architecture and models behind the LLMs commonly used for translation, along with their features.

3-1. Transformer

Transformer is a neural network architecture announced by Google in 2017. It captures long contexts more efficiently than traditional RNNs (Recurrent Neural Networks), and it trains faster because its computation can be parallelized. In machine translation, it improved translation accuracy over RNNs.

The defining feature of Transformer is a mechanism called Self-Attention, which lets the model interpret each word in the context of the entire sentence. This captures dependencies between words even in long sentences, making high-precision translation possible even for complex grammar and nuance. Transformer is applied to a variety of natural language processing tasks, such as language modeling and translation.
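To make Self-Attention a little more concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification under stated assumptions: the two-dimensional word vectors are made up for the example, and the learned query, key, and value projections that a real Transformer applies are omitted, so each vector serves as its own query, key, and value.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # The new representation is a weighted mix of all word vectors.
        outputs.append(
            [sum(w * vec[i] for w, vec in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs


tokens = ["the", "bank", "river"]
embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # made-up vectors, not from a real model
weights, outputs = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word attends to every word in the sentence. Because every row is computed independently of the others, the work can run in parallel, which is the property behind the faster training mentioned above.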

Among commercial services, Google's translation service uses Transformer. As of 2016 it relied on RNNs alone, but in 2017 the architecture was changed to combine Transformer and RNN.

3-2. GPT (Generative Pre-trained Transformer)

GPT is a large-scale pre-trained language model developed by OpenAI and based on the Transformer architecture. It is trained on a huge dataset to acquire general language knowledge, and the resulting base model can then be fine-tuned for specific tasks. By learning from a massive amount of text, GPT acquires the ability to understand the context of words and sentences and to generate text. It predicts and generates the text that follows a given context, which makes it suitable for a wide range of tasks such as high-quality text generation, translation, question answering, and text classification.
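The idea of "predicting the text that follows a given context" can be illustrated with a deliberately tiny toy. The sketch below counts which word follows which in a three-sentence corpus and then repeatedly picks the most frequent follower. This is not how GPT works internally (GPT uses a Transformer over tokens with learned probabilities and a much longer context); the corpus and the function names are assumptions chosen only to show the generate-one-word-at-a-time loop.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_words=3):
    """Extend `start` by repeatedly appending the most frequent next word."""
    words = [start]
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # nothing ever followed this word in the corpus
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


corpus = [
    "the model translates the text",
    "the model generates the answer",
    "the model translates the manual",
]
counts = train_bigram(corpus)
print(generate(counts, "the"))
```

The toy looks at only one previous word, so it quickly starts repeating itself. Conditioning each prediction on the whole preceding context is what allows a model like GPT to stay coherent over long passages.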

OpenAI announced GPT-1 in 2018, GPT-2 in 2019, GPT-3 in 2020, and GPT-4 in 2023. As a commercial service, GPT-3 was released as a Web API in 2020. In 2022, ChatGPT, a chat AI specialized in conversation with humans, was released. For more information on the Web API and the latest model, GPT-4 Turbo, please refer to the following articles.

"What is ChatGPT API? From capabilities to benefits, explained with examples!"
"GPT-4 Turbo and customizable GPTs for ChatGPT have been released."

4. How to Choose a Translation Service Using LLM (Large Language Model)

The following are the key points to consider when choosing a translation service using LLM (Large Language Model):

4-1. Check Translation Accuracy

The first point to check when choosing a translation service that uses an LLM is its translation accuracy. The main purpose of translation is to convey information accurately, so the higher the model's translation accuracy, the more reliable the service. Try the service with specific example sentences and industry-specific expressions, and refer to both automatic evaluation metrics and human evaluation.
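As a rough illustration of machine evaluation, the sketch below scores candidate translations against a human reference by counting overlapping word sequences (n-grams). It is a simplified relative of metrics such as BLEU, not a replacement for them; the engine names and example sentences are invented for the example, and a real comparison would use many sentences and should be paired with human review.

```python
from collections import Counter


def ngrams(tokens, n):
    """All runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Share of candidate n-grams that also occur in the reference.

    Counts are clipped so that repeating a word cannot inflate the score.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


reference = "please read the manual before using this product"
candidates = {  # hypothetical outputs from two services under trial
    "engine_a": "please read the manual before using this product",
    "engine_b": "read the instructions before you use the product",
}
for name, output in candidates.items():
    print(
        name,
        round(clipped_precision(output, reference, 1), 2),
        round(clipped_precision(output, reference, 2), 2),
    )
```

Note that the second candidate scores low even though it is an acceptable translation, because it is worded differently from the reference. Overlap scores like this can miss good paraphrases, which is one reason to refer to human evaluation as well.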

4-2. Is the tool taking proper information security measures?

Make sure that the tool you select has strong information security measures. The text you translate may be highly confidential, so look for SSL/TLS encryption and data protection, and review the provider's security policy to confirm that the service can be trusted. For more information on ChatGPT security, please see the following blog post.

"What is ChatGPT's translation ability? Thoroughly verified in each step of the translation process."

4-3. Can industry-specific and technical terms be customized?

When industry-specific and technical terms must be translated accurately, check whether the service supports glossaries or automatic post-editing of translated text. Such functions improve translation accuracy and can significantly reduce the need for manual editing.
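To show the kind of check a glossary function performs, here is a minimal sketch that flags glossary terms whose required translation is missing from the output. The function name, the glossary entries, and the simple substring matching are assumptions for illustration; a commercial tool would typically also handle inflected forms and apply corrections automatically.

```python
def find_glossary_violations(source, translation, glossary):
    """Return (source_term, required_target) pairs missing from the translation."""
    violations = []
    lowered = translation.lower()
    for src_term, tgt_term in glossary.items():
        # Only terms that actually appear in the source sentence are checked.
        if src_term in source and tgt_term.lower() not in lowered:
            violations.append((src_term, tgt_term))
    return violations


glossary = {"取扱説明書": "instruction manual", "本製品": "this product"}  # sample entries
source = "本製品を使用する前に、必ず取扱説明書をお読みください。"
translation = "Be sure to read the user guide before using this product."
print(find_glossary_violations(source, translation, glossary))
```

Here the check reports that "取扱説明書" was rendered as "user guide" rather than the glossary's "instruction manual", which is exactly the kind of item a post-editing step would correct.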

5. Summary

LLMs (Large Language Models) are built from very large datasets using deep learning technology, which makes them excellent at natural language processing. Examples include "Gemini Pro", which powers Google's chat AI Bard, and "GPT-3.5 Turbo" and "GPT-4", which power OpenAI's ChatGPT. Because LLMs are trained on such broad datasets, machine translation using LLMs can handle multiple languages and produce highly accurate, natural expressions, especially in English. When using an LLM-based service, confirm its translation accuracy and information security; glossaries and automatic post-editing functions are helpful when industry-specific and technical terms must be translated precisely.

At Human Science, we offer MTrans for Office and MTrans for Trados, automatic translation software that can use ChatGPT. You can also translate transcribed text with just one click. Depending on the prompt, ChatGPT can be used not only as a translation engine but also for transcribing, rewriting, and proofreading text. Both MTrans for Office and MTrans for Trados offer a 14-day free trial. Please feel free to contact us.

Features of MTrans for Office

① Unlimited files and glossaries for a flat fee
② Translate with one click from Office products!
③ Secure API connection for peace of mind
・For customers who want stronger security, SSO, IP restrictions, and other options are also available

④ Japanese-language support from a Japanese company
・We can respond to security check sheets
・Payment by bank transfer is available

MTrans for Office is an easy-to-use translation software for Office.

Features of MTrans for Trados

① Simultaneous translation using multiple machine translation engines such as DeepL and Google
② Automatically apply terminology to machine-translated text, and centrally manage glossaries regardless of machine translation engine
③ String replacement, regular expression replacement, and automatic correction of translation style, notation, and expression using ChatGPT
④ Automatic correction of fuzzy matches in translation memory
⑤ Maintain original formatting and tags during machine translation
 

What is MTrans for Trados, the machine translation solution for Trados?

For those who want to know more about translation

Tokyo: +81-3-5321-3111
Nagoya: +81-52-269-8016

Reception hours: 9:30 AM to 5:00 PM JST

Contact Us / Request for Materials