
Andreas Witt’s research focuses on annotation science and information modeling for language resources. An application-oriented aim of his research is to find mechanisms for the automatic extraction of information conveyed in digital texts. At a theoretical level, information modeling of text data allows researchers to make their abstract analyses of the textual content explicit.

Andreas Witt is also interested in developing methods and strategies to ensure the sustainability and reusability of linguistic resources. To that end, he is involved in the activities of standardization bodies: he is convener of the ISO working group "Linguistic Annotation" (TC37/SC4/WG6) and co-chair of the Special Interest Group "TEI for Linguists" within the Text Encoding Initiative (TEI).

Selected publications

  1. Andrea Horbach, Stefan Thater, Diana Steffen, Peter M. Fischer, Andreas Witt, and Manfred Pinkal (2015): “Internet Corpora: A Challenge for Linguistic Processing.” In: Datenbank-Spektrum, March 2015, Volume 15, Issue 1. pp. 41-47 - Berlin/Heidelberg: Springer.
  2. Laurent Romary and Andreas Witt (2014): “Méthodes pour la représentation informatisée de données lexicales / Methoden der Speicherung lexikalischer Daten.” In: Lexicographica 30. pp. 152-186 - Berlin/Boston: de Gruyter, 2014.
  3. Albrecht Plewnia and Andreas Witt (eds.) (2014): Sprachverfall? Dynamik - Wandel - Variation. Berlin/New York: de Gruyter, 2014.
  4. C. M. Sperberg-McQueen, Oliver Schonefeld, Marc Kupietz, Harald Lüngen, and Andreas Witt (2013): “Igel: Comparing document grammars using XQuery.” In: Proceedings of Balisage: The Markup Conference 2013. Electronic resource - Balisage, 2013. (Balisage Series on Markup Technologies 10)
  5. Alexander Mehler, Kai-Uwe Kühnberger, Henning Lobin, Harald Lüngen, Angelika Storrer, and Andreas Witt (eds.) (2011): Modeling, learning, and processing of text-technological data structures. Berlin/Heidelberg: Springer. (Studies in computational intelligence; 370).
  6. Andreas Witt and Dieter Metzing (eds.) (2010): Linguistic Modeling of Information and Markup Languages. Contributions to Language Technology. Springer Netherlands, 2010. (Text, Speech and Language Technology, Vol. 41).
  7. Andreas Witt, Georg Rehm, Erhard Hinrichs, Timm Lehmberg, and Jens Stegmann (2009): “SusTEInability of linguistic resources through feature structures.” In: Literary and Linguistic Computing 24, Issue 3. pp. 363-372 - Oxford: Oxford University Press, 2009.
  8. Georg Rehm, Oliver Schonefeld, Andreas Witt, Erhard Hinrichs, and Marga Reis (2009): “Sustainability of annotated resources in linguistics: A web-platform for exploring, querying, and distributing linguistic corpora and other resources.” In: Literary and Linguistic Computing 24, Issue 2. pp. 193-210 - Oxford: Oxford University Press, 2009.
  9. Timm Lehmberg, Georg Rehm, Andreas Witt, and Felix Zimmermann (2008): “Digital Text Collections, Linguistic Research Data, and Mashups: Notes on the Legal Situation.” In: Library Trends 57/1. pp. 52-71 - Urbana-Champaign: University of Illinois, 2008.
  10. Andreas Witt, Daniela Goecke, Felix Sasaki, and Harald Lüngen (2005): “Unification of XML Documents with Concurrent Markup.” In: Literary and Linguistic Computing 20, Issue 1. pp. 103-116 - Oxford: Oxford University Press, 2005.