Tracking Progress in Style Transfer: From Human to Automatic Evaluation

Abstract

While the field of style transfer (ST) has been growing rapidly, progress has been hampered by a lack of standardized practices for both human and automatic evaluation. In this talk, we will first summarize the human evaluation practices described in 97 style transfer papers with respect to three main evaluation aspects: style transfer, meaning preservation, and fluency. As we will see, protocols for human evaluation in ST are often underspecified and not standardized, which limits both the reproducibility of research in this field and progress toward better human and automatic evaluation methods. Then, we will switch gears and discuss issues in the automatic evaluation of ST. Concretely, taking formality as a case study, we will revisit several metrics for automatically evaluating each of the three ST aspects and finally outline best practices, built on metrics that correlate well with human judgments and are robust across languages.

Date
Location
Online
Eleftheria Briakou

I research Multilingual NLP and Machine Translation.