
- Table of Contents
- 1. PDF Translation
- 2. What types of PDF translation tools are available?
- 2-1. DeepL
- 2-2. Google Translate
- 2-3. MTrans Team
- 2-4. MTrans for Office
- 3. Compare the translation results of the PDF translation tool!
- 3-1. Translation Accuracy
- 3-2. OCR Accuracy of Scanned PDFs
- 4. Reasons and Disadvantages of Inadequate PDF Translation
- 4-1. The PDF file is protected and cannot be translated
- 4-2. Layout and table distortion occurs
- 4-3. Translations that become meaningless due to unintended line breaks
- 4-4. The processing capacity of machine translation is small, making it difficult to translate large files at once.
- 4-5. There is a risk of information leakage
- 5. What are the requirements for PDF translation tools?
- 5-1. High Translation Accuracy and Availability of OCR Functionality
- 5-2. Ability to translate while maintaining layout
- 5-3. Can it handle large PDF files?
- 5-4. Considerations for Security and Privacy
- 5-5. Post-Translation Editing Features
- 6. Summary
1. PDF Translation
There are two types of PDF files: text-based PDFs, where the text is stored as text data, and image-based PDFs, where the text is stored as image data. Image-based PDFs are often documents scanned with a copier, and are therefore also referred to as scanned PDFs.
When you open a text-based PDF with the Adobe Acrobat Reader app, you can select text using the mouse cursor.

In the case of scanned PDFs, when you try to select text, the entire page gets selected instead.

To translate a text-based PDF, select the text, copy it, and paste it into an automatic translation service. A scanned PDF has no selectable text, so you would have to retype the text manually while looking at the PDF, which is time-consuming and impractical when there is a large amount of text.
Therefore, a function called OCR, which recognizes characters and converts them into text data, is necessary. The Snipping Tool, a standard app in Windows 11, has OCR functionality built-in. By using the Snipping Tool to capture an image of any part of the screen and then clicking the "Text Actions" button in the toolbar, you can execute the OCR function.

Once the OCR is complete, you can either select any text with your mouse or click on "Copy All Text" to select and copy everything. After that, you can paste the copied text into an automatic translation service for translation.
However, in the case of a PDF that does not fit on one screen, it is necessary to repeat the steps of scrolling, taking screen captures, performing OCR, and copying and pasting, which can be very cumbersome.
Whether the PDF is text-based or scanned, copying the text and translating it does not preserve the layout of the PDF. To translate a PDF while keeping its layout, you need a PDF translation tool that translates the PDF file itself.
2. What types of PDF translation tools are available?
Representative automatic translation services include DeepL and Google Translate. Both can translate PDF files, but the level of support varies.
2-1. DeepL
DeepL has a feature called file translation. You can upload PDF files for translation. It supports both text-based PDFs and scanned PDFs. The translation is done while preserving the layout, and you can download the translated PDF.

The number of files and file sizes that can be translated varies depending on the DeepL subscription plan. The free version allows up to 3 files per month, with a file size limit of 5MB. The translated PDF files cannot be edited.
In the paid version, the number of files is limited to 5 to 100 per month depending on the plan, and the file size is limited to 10 to 30MB. The translated PDF files are editable. You can also download them as Word files.
If you use the free version, please be aware that the data you enter may be reused by DeepL. For details on security, please see the blog post below.
Confidentiality with DeepL Translations. Is It Secure?
Also, please see the blog post below for the differences between the free and paid versions.
2-2. Google Translate
Google Translate has a feature called Document Translation. You can upload PDF files for translation. It only supports text-based PDFs and does not support scanned PDFs. The translation is done while maintaining the layout, and you can download the translated PDF. The translated PDF cannot be edited. The maximum file size is 10MB. Additionally, since the input data may be reused for the improvement of Google services, it is not recommended for business use.

2-3. MTrans Team
Our automatic translation product, MTrans Team, also includes a PDF translation feature. You can upload PDF files for translation. It supports both text-based PDFs and scanned PDFs. You can choose from translation engines such as Google, Microsoft, NAVER Papago, OpenAI, and Claude for automatic translation. The translation is done while maintaining the layout, and the translated file can be downloaded as an editable Word file. The maximum file size limit is 45MB.

2-4. MTrans for Office
Our other automatic translation product, MTrans for Office, adds automatic translation features to Outlook, Word, Excel, and PowerPoint. Among these, the Windows version of MTrans for Office includes PDF translation capabilities using DeepL and Google, supporting both text-based PDFs and scanned PDFs. The translation is done while preserving the layout, and an editable PDF is saved. The file size limit is 30MB when using DeepL as the translation engine and 20MB when using Google.

If you choose Google as your translation engine, the Google Cloud Translation API will be used for OCR and automatic translation. This API was also introduced in "How to apply automatic translation to scanned PDF materials". Google's OCR functionality has been widely used in the Google Translate app on smartphones for many years, and thanks to this accumulated technical know-how, it recognizes characters with high accuracy.
Both DeepL and Google support PDF translation, but the accuracy of translation and OCR differs. MTrans for Office supports both services, allowing you to translate and compare PDFs using each service to choose the one with better accuracy.
3. Compare the translation results of the PDF translation tool!
3-1. Translation Accuracy
DeepL is characterized by fluent and natural translations that resemble those of a native speaker. However, one of the challenges with DeepL is that when translating long sentences, phrases may be omitted, and when translating multiple sentences together, entire sentences may be missing, making human verification essential. In particular, if you only read the translated text for verification, the fluency may cause you to overlook omissions in the translation. Be sure to compare the original text with the translated text for verification.
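As a rough first pass before the manual comparison described above, the number of sentences in the original and in the translation can be compared. The following is a minimal sketch, not a feature of any of the tools mentioned: the function names `count_sentences` and `possible_omission` are hypothetical, and the set of sentence-ending marks (English and Japanese) is an assumption. Abbreviations and decimal numbers will inflate the count, so a result of `False` does not replace human verification.

```python
import re

# Sentence-ending marks for English and Japanese text.
# This set is an assumption; extend it for other languages.
_SENTENCE_END = re.compile(r"[.!?。！？]+")


def count_sentences(text: str) -> int:
    """Count runs of sentence-ending punctuation as a rough sentence count."""
    return len(_SENTENCE_END.findall(text))


def possible_omission(source: str, translation: str) -> bool:
    """Return True when the translation has fewer sentences than the source."""
    return count_sentences(translation) < count_sentences(source)


source = "Open the file. Check the layout. Save it."
translation = "ファイルを開きます。保存します。"

# The translation has two sentences against three in the source,
# so it is flagged for a closer side-by-side check.
print(possible_omission(source, translation))
```

A flagged pair only tells you where to look first. Omitted phrases inside a long sentence do not change the sentence count, so they still have to be found by reading the original and the translation together.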
On the other hand, while Google Translate's translations tend to be slightly less fluent compared to DeepL, they are less likely to miss any parts. Generally, if you prioritize fluency, we recommend using DeepL, and if you prioritize accuracy, we recommend using Google. For a detailed comparison of DeepL and Google, please see the blog article below.
How Accurate Are DeepL Translations? Comparison with Google and Microsoft for Business Emails
3-2. OCR Accuracy of Scanned PDFs
We prepared a scanned PDF as shown below and compared the OCR accuracy of DeepL and Google. Note that for Google, since the "Google Translate" used from the web does not support scanned PDFs, we used MTrans for Office to perform PDF translation via the Google Cloud Translation API.

The image below shows the results translated by DeepL. The header section at the top was translated, but the rest remained in English. Additionally, many parts of the English text were missing. When the quality of the scanned PDF is poor or the text is small, there tends to be a decrease in OCR accuracy.

On the other hand, with Google, most of the English text was correctly recognized and translated. However, the English in the diagrams was incorrectly translated, and the layout of the vertical text translations was disrupted. As a result, there are still challenges with the translation and layout of the diagrams.

Translating scanned PDFs with OCR still has its challenges, but in general we recommend using Google via the API.
4. Reasons and Disadvantages of Inadequate PDF Translation
4-1. The PDF file is protected and cannot be translated
Password-protected PDF files cannot be translated directly. You need to remove the password and prepare a non-password-protected PDF file. If the password cannot be removed but printing is allowed, you can use Windows' "Microsoft Print to PDF" to save it as a new PDF file, which can then be translated.
4-2. Layout and table distortion occurs
When automatically translating PDFs, the layout and tables may become distorted. If you use the paid version of DeepL, MTrans Team, or MTrans for Office, you can load the output PDF or Word file into the Word application after translation, allowing for manual corrections.
4-3. Translations that become meaningless due to unintended line breaks
In some cases, a single sentence is split and recognized as multiple sentences. Whether the file is a text-based PDF or a scanned PDF, MTrans Team lets you edit the original text after automatic translation, so you can combine the fragments into one sentence and then translate it again automatically.
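If you are working with text copied out of a PDF yourself, the same idea of rejoining split sentences before translation can be sketched in a few lines. This is only an illustrative heuristic, not how MTrans Team works internally: the function name `merge_broken_lines` is hypothetical, the list of sentence-ending marks is an assumption, and fragments are joined with a space, which suits English but not languages written without spaces such as Japanese.

```python
def merge_broken_lines(lines):
    """Join consecutive lines until one ends with sentence-ending punctuation."""
    # Marks treated as the end of a sentence (an assumption; adjust as needed).
    terminators = (".", "!", "?", ":", "。", "！", "？")
    merged = []
    buffer = ""
    for line in lines:
        part = line.strip()
        if not part:
            # A blank line is treated as a paragraph break: flush any fragment.
            if buffer:
                merged.append(buffer)
                buffer = ""
            continue
        # Append the fragment to whatever has been collected so far.
        buffer = f"{buffer} {part}" if buffer else part
        if buffer.endswith(terminators):
            merged.append(buffer)
            buffer = ""
    if buffer:
        merged.append(buffer)  # keep a trailing fragment rather than dropping it
    return merged


copied = [
    "Remove the password and",
    "prepare a new PDF file.",
    "Then translate it.",
]
for sentence in merge_broken_lines(copied):
    print(sentence)
```

Headings and table cells often have no closing punctuation, so they merge into the following line unless a blank line separates them. Review the merged text before sending it to a translation engine.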
4-4. The processing capacity of machine translation is small, making it difficult to translate large files at once.
Since the file size of scanned PDFs often becomes large, the file size limit of automatic translation services can become an issue. If the limit is exceeded, you can either split the file or use the PDF compression service provided by Adobe (https://acrobat.adobe.com/link/acrobat/compress-pdf). When compressing a PDF, selecting a low compression level (highest quality) will help maintain the accuracy of text recognition and prevent text from becoming distorted.
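Checking a file against the limits before uploading can be sketched as follows. The limits are the fixed figures quoted in this article (plan-dependent tiers such as DeepL's paid plans are left out), and they may change over time. The names `LIMITS_MB`, `services_accepting`, and `min_parts` are hypothetical, and treating 1 MB as 1,048,576 bytes is an assumption, since each service may count megabytes differently.

```python
import math

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes

# Upload limits in MB as quoted in this article; plan-dependent tiers are omitted.
LIMITS_MB = {
    "DeepL (free)": 5,
    "Google Translate": 10,
    "MTrans for Office (Google)": 20,
    "MTrans for Office (DeepL)": 30,
    "MTrans Team": 45,
}


def services_accepting(size_bytes: int) -> list[str]:
    """Return the services whose quoted limit is at least the file size."""
    return [name for name, limit in LIMITS_MB.items() if size_bytes <= limit * MB]


def min_parts(size_bytes: int, limit_mb: int) -> int:
    """Lower bound on how many parts a file must be split into to fit the limit."""
    return max(1, math.ceil(size_bytes / (limit_mb * MB)))


size = 25 * MB  # for a real file, os.path.getsize(path) returns the size in bytes
print(services_accepting(size))
print(min_parts(size, LIMITS_MB["Google Translate"]))
```

`min_parts` is only a lower bound. Pages in a scanned PDF differ in size, so a split by page ranges may need more parts than this estimate.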
4-5. There is a risk of information leakage
Using the free versions of DeepL or Google Translate may result in the input data being reused for other purposes. When translating confidential information, it is necessary to use a paid service with guaranteed security instead of a free automatic translation service. For more information on DeepL's security, please see "Is confidentiality maintained with DeepL translation? What about security?".
5. What are the requirements for PDF translation tools?
5-1. High Translation Accuracy and Availability of OCR Functionality
To correctly translate PDF files, it is essential that the performance of the translation engine is high and that the OCR functionality for scanned PDFs is robust. Products that support both DeepL and Google translation engines are convenient as they allow users to switch engines freely according to the content and purpose of the document. Additionally, the higher the accuracy of the OCR, the more accurately the text in scanned PDFs is recognized, reducing omissions and misrecognitions in the translation.
5-2. Ability to translate while maintaining layout
When translating PDFs, it is very important to maintain the layout. If the layout is not preserved, there will be a lot of revisions needed after translation for materials that include charts and images, which will decrease work efficiency. Having the ability to output in formats like PDF or Word while maintaining the layout makes editing after translation much easier.
5-3. Can it handle large PDF files?
Large PDF files may exceed the limits of translation tools. Scanned PDFs, in particular, tend to have larger file sizes. Even if multiple translation engines are available, their limits on file size and page count vary, so check them in advance. Whether you can translate without the hassle of splitting or compressing the files is also an important point.
5-4. Considerations for Security and Privacy
When translating PDFs that contain confidential information, utmost care must be taken in handling the data. Free automatic translation services should be avoided for business use, as the uploaded information may be reused for service improvement. When implementing translation tools, be sure to check what security measures are in place and whether the privacy policy is clear.
5-5. Post-Translation Editing Features
Machine translation is convenient, but it does not always provide perfect translations. If there are omissions or mistranslations, having the ability to easily edit the translated Word or PDF files significantly improves work efficiency. In particular, tools that allow for side-by-side comparison and editing of the original text and the translation make it easier to identify omissions and enhance the accuracy of the translation.
6. Summary
For PDF translation, it is efficient to use a translation tool that supports both text-based PDFs and scanned PDFs, such as DeepL or MTrans. Note that the web version of Google Translate supports only text-based PDFs. These tools differ in translation accuracy, OCR accuracy, and file size limits. While DeepL tends to provide more natural translations, it is also more prone to omissions. Google, on the other hand, has fewer omissions but tends to be slightly less fluent. In terms of OCR accuracy, Google used via the API excels. Regarding security, be aware of the risk of secondary use of data when translating confidential information.
Human Science offers "MTrans for Office," an automatic translation software that supports both text-based PDFs and scanned PDFs, utilizing translation engines from DeepL, Google, Microsoft, and OpenAI. OpenAI can not only translate but also generate and rewrite text, as well as proofread, depending on the prompts, supporting business efficiency and multilingual capabilities. MTrans for Office also offers a 14-day free trial. Please feel free to contact us.
Features of MTrans for Office
- ① No limit on the number of files that can be translated or on the glossary, with a flat-rate system
- ② Translate with one click from Office products!
- ③ API connection ensures security
・For customers who want further enhancement, we also offer SSO, IP restrictions, and more.
- ④ Support in Japanese by a Japanese company
・Support for security check sheets is also available
・Payment via bank transfer is available
MTrans for Office is an easy-to-use translation software for Office.