Bio

See my personal homepage.

Courses

Recent publications

  1. The GEM Benchmark: Natural Language Generation, its Evaluation and Me…

    Gehrmann, S., Adewumi, T., Aggarwal, K., Ammanamanchi, P. S., Aremu, A., Bosselut, A., Chandu, K. R., Clinciu, M-A., Das, D., Dhole, K., Du, W., Durmus, E., Dušek, O., Emezue, C. C., Gangal, V., Garbacea, C., Hashimoto, T., Hou, Y., Jernite, Y., ... Zhou, J. (2021). The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics. In Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021) (pp. 96-120). Association for Computational Linguistics. https://aclanthology.org/2021.gem-1.10
  2. Human evaluation of automatically generated text - Current trends and…

    van der Lee, C., Gatt, A., van Miltenburg, E., & Krahmer, E. (2021). Human evaluation of automatically generated text: Current trends and best practice guidelines. Computer Speech and Language: An official publication of the International Speech Communication Association (ISCA), 67, 1-24. [101151].
  3. Preregistering NLP research

    van Miltenburg, E., van der Lee, C., & Krahmer, E. (2021). Preregistering NLP research. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 613-623). Association for Computational Linguistics. https://www.aclweb.org/anthology/2021.naacl-main.51
  4. Underreporting of errors in NLG output, and what to do about it

    van Miltenburg, E., Clinciu, M., Dušek, O., Gkatzia, D., Inglis, S., Leppänen, L., Mahamood, S., Manning, E., Schoch, S., Thomson, C., & Wen, L. (2021). Underreporting of errors in NLG output, and what to do about it. In Proceedings of the 14th International Conference on Natural Language Generation (pp. 140-153). Association for Computational Linguistics. https://aclanthology.org/2021.inlg-1.14
  5. Twenty Years of Confusion in Human Evaluation - NLG Needs Evaluation …

    Howcroft, D. M., Belz, A., Clinciu, M-A., Gkatzia, D., Hasan, S. A., Mahamood, S., Mille, S., van Miltenburg, E., Santhanam, S., & Rieser, V. (2020). Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. In Proceedings of the 13th International Conference on Natural Language Generation (pp. 169-182). Association for Computational Linguistics. https://www.aclweb.org/anthology/2020.inlg-1.23
