NIST TAC Belief and Sentiment (BeSt) Track

Data Sets

There are three types of data files:
  1. The 2016 Belief and Sentiment training data. This is data in best format and includes discussion forum and newswire data. There are separate releases for the three languages (English, Chinese, Spanish). Each release includes separate source, ERE, and best files.
  2. The 2016 Belief and Sentiment test data. This is data in best format and includes discussion forum and newswire data. There is a single release for all three languages. The release includes separate source, ERE, and best files for each language.
  3. Data tagged for committed belief. This is a different type of belief annotation (target only; the source is always the author). It may be useful for training belief taggers. The content and format are explained in more detail in the data releases.
Note that participants in the 2017 eval are free to choose how they use the existing data; there is no need to treat the 2016 Belief and Sentiment test data differently from the 2016 Belief and Sentiment training data in preparing for the 2017 evaluation.

Details are as follows.

  • ALL LANGUAGES 2016 eval data: LDC2016E114 (TAC KBP 2016 Belief and Sentiment Evaluation Gold Standard Annotation V2)
    Committed belief annotation: LDC2014E125 (DEFT Committed Belief Annotation Self-Evaluation Package)
  • ENGLISH 2016 training: LDC2016E27 (DEFT English Belief and Sentiment Annotation V2)
      - 2016 training: 236 documents, 165k words
      - 2016 eval: 165 documents, 100k words
    Committed belief annotation: LDC2014E55 (DEFT Committed Belief Annotation R1 V1.1) and LDC2014E106 (DEFT Committed Belief Annotation R2)
  • CHINESE 2016 training: LDC2016E61 (DEFT Chinese Belief and Sentiment Annotation)
      - 2016 training: 180 documents
      - 2016 eval: 159 documents
    Committed belief annotation: LDC2015E99 (DEFT Chinese Committed Belief Annotation)
  • SPANISH 2016 training: LDC2016E62 (DEFT Spanish Belief and Sentiment Annotation)
      - 2016 training: 90 documents, 82k words
      - 2016 eval: 168 documents, 85k words
    Committed belief annotation: LDC2016E40 (DEFT Spanish Committed Belief Annotation)
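
Because participants may use the 2016 training and eval data together, it can help to see how many documents each language offers when the two are pooled. The following is only a rough sketch: the document counts are copied from the list above, while the names `BEST_2016` and `pooled_counts` are illustrative and not part of any LDC release. Word counts are left out because the Chinese entry does not report them.

```python
# Document counts per language and split, copied from the list above.
# BEST_2016 and pooled_counts are hypothetical names used for illustration.
BEST_2016 = {
    "English": {"training": 236, "eval": 165},
    "Chinese": {"training": 180, "eval": 159},
    "Spanish": {"training": 90, "eval": 168},
}


def pooled_counts(catalog):
    """Return the number of documents per language when all splits are pooled."""
    return {lang: sum(splits.values()) for lang, splits in catalog.items()}


totals = pooled_counts(BEST_2016)
for lang, n_docs in totals.items():
    print(f"{lang}: {n_docs} documents")
```

Pooling this way roughly doubles the annotated material for English and Chinese and nearly triples it for Spanish, where the eval set is larger than the training set.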