
 

[Spin-off] Training data starts with good teacher development - What communication is needed in the field?


2023.9.1


Spin-off Blog Project
――Annotation Supporting AI in the DX Era: The Reality of That Analog Field
Good Training Data Comes from Developing Good Teachers
~What Communication Is Needed in the Field~

Until now, our company has published various blogs on annotation and AI, focused on conveying general knowledge and know-how. Although annotation may sound simple when put into words, it is a task that cannot avoid human involvement and is full of ambiguity, so people are inevitably deeply engaged in it. As a result, the work is often messier than the neat theories found in the world would suggest, and ensuring quality and productivity actually requires a wide range of experience and know-how.
For this reason, we believe that understanding the specific problems that arise in actual annotation work, and how they are addressed, can offer hints for making annotation succeed.
We want to convey what actually happens in the field and what concrete responses and measures are taken there. Unlike our regular blogs, we would like to share the realities of the field, including our own particular characteristics and commitments, under the spin-off blog project titled "Annotation Supporting AI in the DX Era: The Reality of That Analog Field."

 

>>Previously Published Blogs (Selection)

7 Tips for Successfully Leading Annotations

What is training data? An explanation from its relationship with AI, machine learning, and annotation to how to create it.

Table of Contents

1. Can everything be conveyed in specifications and work descriptions?
2. Education and Support through Communication
3. Implementation of 1on1
4. Summary

1. Can everything be conveyed in specifications and work descriptions?

It is fair to say that the quality of the annotators' work determines the quality of the training data. Annotation is, of course, based on defined requirements, so it is important that those requirements are clearly established first. Yet even when the requirements are confirmed, a specification document is prepared from them, and proper work instructions are given, significant traps remain that cannot be avoided.

 

Annotation requires no special knowledge or qualifications, and, as mentioned at the beginning, it sounds simple when put into words, which easily leads to misunderstanding. Take distinguishing dog breeds, for example: people judge breeds unconsciously, drawing on past experience and intuition. Few people reason it out explicitly, thinking, "If it has these features, it must be a Chihuahua." It is an intuitive judgment, and annotation work inevitably depends on exactly this kind of experience- and intuition-based human judgment. Moreover, because large amounts of data are handled, many exceptions arise that the specifications alone cannot settle, and significant traps hide among them. (Checking all the data in advance and incorporating every exceptional case into the specifications is not realistic, and spelling out the characteristics of every dog breed would produce an impractically enormous document.)

 

No matter how carefully people work, discrepancies in judgment will inevitably occur. I have experienced such gaps in judgment and understanding myself while participating in various projects as an annotator. To produce high-quality training data, the annotators themselves must be good "teachers," and that requires properly managing the people involved.

 

It goes without saying that people come in many types. Some tend to prioritize speed, some become overly cautious, and some struggle with communication, such as asking questions; each person's temperament can affect annotation quality. In addition, annotation demands sustained attention to detail over long stretches of work. As time passes, workers can become desensitized, leading to inconsistent judgments and careless mistakes.

 

Beyond explaining the specifications and rules, the PM can prevent many judgment errors by conveying the key points and areas of emphasis. Still, it is essential to keep watching the workers' condition and output quality until the work is complete, so that annotators can maintain quality and proceed smoothly. In other words, education and support that develop each annotator into a good "teacher" are crucial.

2. Education and Support through Communication

As mentioned earlier, understanding can be improved by creating clear specifications and supplementary materials and maintaining them as needed. However, simply sharing documents is one-way communication and cannot guarantee mutual understanding. All too often, when the results come in, it turns out the message was poorly conveyed and not understood... and the work has to be redone from scratch. This inflates costs and time. Depending on the scale and difficulty of the annotation, we have therefore provided education and support centered on communication, tailored to the situation.

 

However, there are various forms of communication. Group meetings? Contact via chat tools? Email? Among these options, what is the best approach? Based on our experience, although it takes effort, the most effective form of communication is one-on-one meetings.

 

In annotation work, the situations that call for communication usually involve resolving the ambiguities peculiar to annotation or confirming a worker's understanding of the specifications. One-on-one meetings are particularly effective here. Complex nuances that cannot be conveyed in text can be communicated directly, and screen sharing makes things clearer still. Above all, talking face-to-face (or across a screen when remote) works best: individual topics that are hard to raise in group meetings become easier to address, and annotators can speak freely without worrying about those around them, making it easier for them to voice concerns and opinions.

3. Implementation of 1on1

This story comes from a certain natural language processing annotation project. In this project, the client reviewed each annotator individually, and anyone whose poor results continued would be removed from the project: quite a strict arrangement.

 

Annotator A had been barely passing for several months, but at one point finally fell below the passing mark. Without action to recover, we could end up with no options left. Continuing to scrape by suggested a lack of deep understanding of the specifications, and reviewing the client reviewer's feedback made it clear that A was annotating in ways that differed from what the specifications described. A needed to fully understand the specifications, but the Q&A we had exchanged over chat had clearly not been enough to close the gap. So I decided to hold a one-on-one meeting.

 

"Your score fell below the mark. Shall we have a private lesson?" Right after I sent this feedback to A, a personal message arrived in chat: "I messed up... Please help..."

 

"Strike while the iron is hot," so we held the 1-on-1 right away. We went through each piece of feedback one by one, comparing it against the specifications and explaining why the annotation was incorrect. "Oh! So that part of the specifications is interpreted that way! I've been misunderstanding it all this time..." ("...Seriously?" I thought, then collected myself and continued explaining.) We also reviewed some common mistakes, spending about an hour together going over the feedback to deepen understanding.

 

My advice to A: if anything feels uncertain while working, go back to past feedback or double-check the specifications. And if you are still unsure, ask in chat; if it is hard to explain in writing, we will meet directly.

 

The next day, perhaps because the advice had taken effect, the questions increased, but there seemed to be no mistakes in A's way of thinking. A few days later, the review results that came back to me as PM showed that A had passed. I sent A feedback: "That's great! The mistakes have dropped sharply, and you got a good score. You understand it better than I do now." I felt relieved, but no: reminding myself not to let my guard down, I sent a message to another annotator who had fallen below the passing mark. "Your score fell below the mark. Shall we have a private lesson...?"

4. Summary

This time, using a real example, I discussed the importance of developing annotators in order to ensure the quality of training data, focusing on one method: communication. In particular, I find one-on-one meetings highly effective, because they allow education, advice, and course correction targeted at a specific annotator. Meeting face-to-face lets you see from speech and gestures whether the message has landed, and also lets you grasp the person's character, which later makes it easier to judge how to give feedback that this particular person will understand.

 

To create good training data, we develop annotators into good teachers. Annotation is handcraft, and communication is an essential element that forms its foundation. Some may assume annotation is easy, that simply following the specifications is enough, but in practice things rarely go that smoothly. To overcome these issues, to create higher-quality training data, and to reduce the unnecessary costs that corrections incur, we are committed to management that focuses on people. This reflects our desire not only to ensure quality but also to create a comfortable working environment.

 

Depending on the scale and continuity of the annotation work, this method may not always be the right one. But annotation often involves unfamiliar data and rules, so appropriate education and support throughout the work period are important for ensuring quality. That means not just throwing text-based information at people, but building consensus and accumulating know-how through direct interaction with them (face-to-face, or via a screen when remote). Obvious as it may sound, this is what my past experience has taught me.

 

Such work is messy and may not be considered a smart way of doing things. From the field's perspective, however, I believe this is what annotation truly is. Our company is committed to diving into this messy work without hesitation, and we hope to continue supporting everyone in that spirit.

 

Author:

Manabu Kitada

Annotation Group Project Manager

 

Since the establishment of our Annotation Group, I have been broadly responsible for team building and project management on large-scale projects, drafting annotation specifications for PoC projects, and consulting for scaling. Currently, in addition to serving as project manager for image/video annotation and natural language annotation, I am engaged in promotional activities such as teaching annotation seminars and writing blogs.

 

 

 
