Measuring Inter-Annotator Agreement: Can You Trust Your Gold Standard?


Whether you’re developing a training data set for a supervised machine learning application or simply creating an evaluation data set for unsupervised learning, you need annotated data. How do you assess the quality of those annotations? How can you be confident they are reliable? This talk will cover methods for objectively evaluating manual annotations so you can be confident in your machine learning experiments.
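One widely used agreement measure for two annotators is Cohen's kappa, which corrects raw percent agreement for the agreement expected by chance. As a minimal sketch (the label sequences below are invented toy data, not from the talk):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items both annotators labeled identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement, from each annotator's marginal label distribution.
    ca, cb = Counter(a), Counter(b)
    p_e = sum((ca[label] / n) * (cb[label] / n) for label in ca.keys() | cb.keys())
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels from two annotators on the same six items.
ann1 = ["pos", "pos", "neg", "pos", "neg", "pos"]
ann2 = ["pos", "pos", "neg", "neg", "neg", "pos"]
print(round(cohens_kappa(ann1, ann2), 2))  # → 0.67
```

A kappa of 1.0 means perfect agreement and 0.0 means agreement no better than chance, which is why it is preferred over raw percent agreement when label distributions are skewed.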