Tags and Categories

Categories: dataset, genesis, infrastucture, milestone, mood, paperworking, reading-notes, story time

Tags: accuracy, cotutelle, courses, CREMMA, eScriptorium, evaluation, experiment, health insurance, house cleaning, HTR, init, kraken, Large Language Models, manifesto, metrics, OCR, software documentation, static website, synthetic data, tuition, visa, wikicremma