Welcome to the Workshop on Semantic Textual Similarity (STS)!
- How did it begin? DARPA proposal


Organizers: Mona Diab, Eneko Agirre, Weiwei Guo


This is a 2-day workshop on Semantic Textual Similarity (STS). We are seeking input from members of the NLP and Speech communities who contribute to the field of semantic processing or are interested in using it.


As part of the STS framework there are the following overarching research goals:

  • To create an interoperable STS pipeline that integrates different semantic components, ranging from simple word similarity to more nuanced components that handle complex semantic and pragmatic phenomena such as modality and lambda logic
  • To perform intrinsic evaluation of STS
  • To show the utility of STS to large NLP applications using extrinsic evaluations
  • To advance our understanding of the underpinning semantics of natural languages and how we can empirically exploit this knowledge
  • To foster stronger collaborations within the Semantic community and with other sub-communities within CL

We have a SEMEVAL task in 2012 that serves as a pilot version of STS (http://www.cs.york.ac.uk/semeval-2012/task6). SEMEVAL will be held in conjunction with the new Semantics conference *SEM (http://ixa2.si.ehu.es/starsem/), which will take place in Montreal on June 7-8, 2012.


The STS workshop will address different issues associated with STS in general (beyond the SEMEVAL task):
  1. What is STS?
    A. How to characterize STS quantitatively and qualitatively: to that end, we will examine together sample data sets that we have annotated
  2. How to create the STS blackbox?
    A. What are relevant semantic components? Negation detection, lexical substitution, MWE resolution, coreference resolution, WSD, etc.
    B. Design issues:
       i. How should contributing semantic components (both different components performing the same sub task, and components that complement each other) interface and interact?
       ii. What is the role of interoperability and web services (exploiting existing infrastructure such as UIMA, distributed computing, etc.)?
  3. How to illustrate the utility of STS?
    A. Discuss intrinsic and extrinsic evaluation of the components, of the STS pipeline, of STS integrated in an NLP application.
    B. Focus on two applications: MT evaluation and Summarization
  4. Future prospects
    A. Possibility of multilingual STS
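
To make the design question about how contributing semantic components could interface more concrete, here is a minimal sketch in Python. It is only one possible arrangement, not a design adopted by the workshop: the shared interface `SimilarityComponent`, the two toy components `WordOverlap` and `NegationAgreement`, the combiner `StsPipeline`, and the weighted-average combination are all hypothetical names and choices introduced for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class SimilarityComponent(Protocol):
    """Hypothetical shared interface: map a sentence pair to a score in [0, 1]."""

    def score(self, first: str, second: str) -> float: ...


class WordOverlap:
    """Toy lexical component: Jaccard overlap of lowercased whitespace tokens."""

    def score(self, first: str, second: str) -> float:
        a, b = set(first.lower().split()), set(second.lower().split())
        if not a and not b:
            return 1.0  # two empty sentences are treated as identical
        return len(a & b) / len(a | b)


class NegationAgreement:
    """Toy negation component: 1.0 if both or neither sentence has a negation cue."""

    CUES = frozenset({"not", "no", "never"})  # assumed, deliberately tiny cue list

    def score(self, first: str, second: str) -> float:
        has_first = bool(self.CUES & set(first.lower().split()))
        has_second = bool(self.CUES & set(second.lower().split()))
        return 1.0 if has_first == has_second else 0.0


@dataclass
class StsPipeline:
    """Combine component scores with a weighted average (an assumed strategy)."""

    components: Sequence[SimilarityComponent]
    weights: Sequence[float]

    def score(self, first: str, second: str) -> float:
        weighted = sum(
            w * c.score(first, second)
            for c, w in zip(self.components, self.weights)
        )
        return weighted / sum(self.weights)


pipeline = StsPipeline([WordOverlap(), NegationAgreement()], [0.5, 0.5])
print(pipeline.score("the cat sleeps", "the cat never sleeps"))
```

Because every component exposes the same `score` method, components performing the same sub task can be swapped for one another, and complementary components can be added to the list without changing the combiner.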