Speech Recognition QA Testing


With the arrival of Amazon Echo, voice recognition technology has become a familiar part of everyday life, alongside Apple's Siri, Google Now, and Microsoft's Cortana.
As the technology attracts more attention, Human Science Co., Ltd. draws on its extensive experience as a localization vendor to offer QA, QA seminars, and QA training in support of clients' services that use voice recognition.

What is Speech Recognition QA?

Speech recognition is a technology that recognizes human speech and converts it into text on a computer. Siri and Google Now, mentioned above, are examples of services (devices) that apply this technology.
Speech recognition QA (Quality Assurance) verifies two things: that the computer correctly recognizes speech and converts it into text, and that the device performs the expected action on the resulting text.


Speech recognition software includes products such as Dragon (Nuance) and Speech Assistant (Lyric). The basic mechanism is the same: the software converts the sound waveform of the speech into a phonetic reading, then applies language information (words and grammar) to convert that reading into text. The speech recognition device then performs the action that corresponds to the converted text.


Figure: Basic Mechanism of Speech Recognition


However, factors such as environmental noise, the user's age and gender, and individual speech habits can cause the software to misrecognize speech and produce the wrong text, or cause the device to perform an action the user did not expect. Whether the computer correctly recognizes speech and converts it into text, and whether the device performs the expected action on that text, therefore directly affects user satisfaction.

Speech Recognition and Deep Learning

Speech recognition is also a very active area of deep learning. In the past, the system had to be trained on each individual user's speech, even when different users said the same thing. With deep learning, the system learns general speech patterns, which can significantly improve recognition accuracy across different users.


Figures: Previous Method / Deep Learning


Deep learning is one of the most effective ways to improve speech recognition accuracy today, and it is essential for improving user satisfaction with speech recognition services.

HS Voice Recognition QA

At Human Science, we draw on years of experience in IT translation to help improve the accuracy of voice recognition software and increase customer satisfaction through the following voice recognition QA services.


Speech Recognition Analysis and Evaluation

Training through deep learning is important for achieving high-precision speech recognition. It is equally important to analyze and evaluate the results after the system has learned speech patterns and specialized terminology. Checking whether the speech patterns were learned correctly and whether recognition accuracy has actually improved is an essential step in improving speech recognition.
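One common way to check whether recognition accuracy has improved is to compare the character error rate (CER) before and after training. The following is only a minimal sketch under stated assumptions: the transcripts are hypothetical examples, the function names are illustrative, and it does not describe the evaluation procedure actually used at Human Science.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(pairs) -> float:
    """CER over (reference, hypothesis) pairs: total edits / total reference characters."""
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("reference transcripts are empty")
    return total_edits / total_chars


# Hypothetical data: (what was actually said, what the recognizer produced)
before_training = [("turn on the light", "turn on the night"),
                   ("play some jazz", "play sum jazz")]
after_training = [("turn on the light", "turn on the light"),
                  ("play some jazz", "play sum jazz")]

cer_before = character_error_rate(before_training)
cer_after = character_error_rate(after_training)
print(f"CER before: {cer_before:.3f}, after: {cer_after:.3f}")
```

A lower CER after training suggests that the new speech patterns were learned, while utterances that still fail (the second pair here) are candidates for closer analysis. Character-level scoring is used in this sketch because Japanese text is not separated into words by spaces.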


Figures: Analysis and Evaluation 1 / Analysis and Evaluation 2


Human Science Co., Ltd. works with registered, experienced native Japanese speakers living in Japan. This allows us to accurately understand and analyze a wide range of Japanese speech across regions and age groups, including IT, medical, and social media terminology. We report the results of our analysis and evaluation as a quality improvement plan that covers the issues identified and our proposed solutions.

User Experience Testing

Once the user's speech has been correctly recognized, the next step is for the device to execute the appropriate processing. In general, the device executes the action assigned to each "keyword" contained in the speech.


For example, the phrase "I'm looking for a place where I can drink coffee" contains the keywords "coffee", "drink", "place", and "look for". The device selects and executes the actions that correspond to these keywords.
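The keyword lookup described above can be sketched roughly as follows. The keyword table and action names are hypothetical and only illustrate the idea; a real device would define its own keywords and would normally use much richer language processing than simple word matching.

```python
# Hypothetical keyword-to-action table (an assumption for illustration only).
ACTIONS = {
    "coffee": "search_cafes",
    "weather": "show_forecast",
    "music": "play_music",
}


def select_actions(recognized_text: str) -> list[str]:
    """Return the actions whose keyword appears in the text, in order of appearance."""
    # Lowercase the text and treat commas and periods as word separators.
    words = recognized_text.lower().replace(",", " ").replace(".", " ").split()
    actions = []
    for word in words:
        action = ACTIONS.get(word)
        if action and action not in actions:
            actions.append(action)
    return actions


print(select_actions("I'm looking for a place where I can drink coffee"))
```

With this table, only "coffee" maps to an action, so the device would run a cafe search. Which cafes it should then present is a separate question that keyword matching alone cannot answer.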


Figure: Device Operation Examples


Unfortunately, the device does not always perform the action the user expects.

Most users who say this probably want to take a break over a cup of coffee at a cafe. How would such a user feel if they were directed to a take-out-only coffee shop? And if the device displayed a coffee shop 100 km away, would the user actually go there?


Directing the user to a take-out-only shop or a shop 100 km away is not, strictly speaking, a wrong response to the request "looking for a place to drink coffee". However, meticulous testing by humans is necessary to examine the user's intention thoroughly and deliver results that actually benefit the user.
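Part of this kind of check can be supported by a simple script that flags results for a human tester to review. The sketch below is an assumption-laden illustration: the `Shop` fields, the sample shops, and the 2 km threshold are all hypothetical, and a rule like this can only surface suspicious results, not replace a tester's judgment of user intent.

```python
from dataclasses import dataclass


@dataclass
class Shop:
    name: str
    distance_km: float
    has_seating: bool


def meets_likely_intent(shop: Shop, max_distance_km: float = 2.0) -> bool:
    """True if the shop is close enough and lets the user sit down with a drink."""
    return shop.has_seating and shop.distance_km <= max_distance_km


def review_results(results, max_distance_km: float = 2.0):
    """Split results into names that fit the assumed intent and names to flag for review."""
    ok, flagged = [], []
    for shop in results:
        target = ok if meets_likely_intent(shop, max_distance_km) else flagged
        target.append(shop.name)
    return ok, flagged


# Hypothetical device output for "I'm looking for a place where I can drink coffee".
results = [
    Shop("Corner Cafe", 0.4, True),
    Shop("Takeout Stand", 0.3, False),   # take-out only
    Shop("Far Roastery", 100.0, True),   # 100 km away
]
ok, flagged = review_results(results)
print("OK:", ok)
print("Flagged for review:", flagged)
```

Here the take-out-only shop and the shop 100 km away are flagged even though both technically match the request.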


At Human Science, our experts have the language and cultural knowledge needed to interpret user speech, along with extensive expertise in evaluating whether devices recognize user intent accurately. Testing by these experts helps improve user satisfaction with devices.


We also have a Secured Access Room on our premises, so you can entrust us with testing that demands high security, such as testing of products immediately before release.

Voice QA Seminar/Voice QA Training

Drawing on the knowledge we have accumulated over the years as a localization vendor, we plan and propose QA seminars and training tailored to your service and background.

  • Quality management methods for services incorporating voice recognition
  • Quality improvement of existing speech recognition services
  • Efficient quality management methods
  • In-house development of voice recognition QA


For those who want to know more about Japanese translation and localization

Tokyo: 03-5321-3111 Nagoya: 052-269-8016

Reception hours: 9:30 AM to 5:00 PM JST