IBM Debater Mention Detection Benchmark

By IBM Data Asset eXchange

The goal of Mention Detection is to map entities/concepts mentioned in text to the correct concept in a knowledge base. The dataset contains 3000 sentences that are annotated with Mentions.

Product type

Dataset

Update frequency

Historical

Updated

Mar 26, 2020

Delivery method

Download

A large, high-quality benchmark dataset for mention detection. The goal of Mention Detection is to map entities/concepts mentioned in text to the correct concept in a knowledge base. The benchmark contains annotations of both named entities and other types of entities, annotated on different types of text, ranging from clean text taken from Wikipedia to noisy spoken data. There are 3000 sentences, with a total of 6375 Mentions in the Wikipedia sentences and 6239 in the spoken sentences.
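To make the task concrete, the following is a minimal sketch of mention detection as a dictionary lookup: token n-grams in a sentence are matched, longest first, against a small knowledge base. Everything in it is an assumption made for illustration. The `Mention` record, the `TOY_KB` entries, and the `detect_mentions` function are hypothetical and do not reflect the benchmark's file format or the method used to annotate it.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A detected mention: character span, surface text, and linked concept."""
    start: int
    end: int
    surface: str
    concept: str


# Hypothetical toy knowledge base: lowercase surface form -> concept name.
TOY_KB = {
    "ibm": "IBM",
    "new york": "New York City",
    "climate change": "Climate change",
}


def detect_mentions(sentence, kb=TOY_KB, max_len=3):
    """Greedy longest-match lookup of token n-grams in the knowledge base."""
    # Each token is kept with its character offsets in the sentence.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", sentence)]
    mentions = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest n-gram starting at token i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            start = tokens[i][0]
            end = tokens[i + n - 1][1]
            surface = sentence[start:end]
            concept = kb.get(surface.lower())
            if concept is not None:
                mentions.append(Mention(start, end, surface, concept))
                i += n  # skip past the matched tokens
                matched = True
                break
        if not matched:
            i += 1
    return mentions


if __name__ == "__main__":
    text = "IBM opened a lab in New York to study climate change."
    for mention in detect_mentions(text):
        print(mention)
```

A lookup like this only finds exact surface forms. Noisy spoken text and ambiguous phrases are where it would be expected to fail, which is the kind of case a benchmark covering both clean and spoken data is meant to expose.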

Notices

Datasets offered on Red Hat Marketplace are provided on an "AS IS" basis, and IBM makes no warranties or conditions, express or implied, regarding the datasets or support for them. If you need support for the dataset, refer to the resources below or reach out directly to the source with any additional questions.

For instructions on accessing datasets on Red Hat Marketplace, please visit the documentation. If you need additional support downloading a dataset, please visit our Red Hat Marketplace Dataset FAQ on the support center.