This is a list of software projects I am working on, or have worked on in the past.
CorA (Corpus Annotator)
CorA is a web-based tool for word-level annotation of historical and other non-standard language data. It was originally developed to annotate historical texts for the Anselm corpus and the Reference Corpus of Early New High German, but has since been used in a variety of other projects, including the annotation of social media data.
Norma (Normalization Tool)
Norma is a tool for automatic spelling normalization of non-standard language data. It was originally developed for use with historical documents in the DFG-funded Anselm project. I first wrote it in Python; it was later ported to C++ (with optional bindings for Python 2.x) with the help of Florian Petran.
SimpleNLG for German
SimpleNLG for German is an adaptation of SimpleNLG, a Java library for natural language generation. I created it as part of my studies for my Master’s degree.
It badly needs to be updated to the current SimpleNLG v4 framework, and it also requires a lexical resource (not provided) to inflect words properly.
- Computational Historical Linguistics
- St. Anselmi Fragen an Maria
- Reference Corpus of Middle High German
- KONVENS2016 conference website
…and the one you’re viewing right now, of course!