I am a fifth-year Ph.D. Candidate in the Department of Computer Science at the University of Maryland, College Park. I am a member of the CLIP lab working with Marine Carpuat. My research interests are broadly in Multilingual Natural Language Processing (NLP) and Machine Translation.
My recent work focuses on building better models across diverse languages by using humans and AI as joining forces. In my past work, I focused on revisiting common hypotheses adopted when modeling and evaluating multilingual content:
✔️ Improving Machine Translation Through Semantic Analysis: A source text and its translation are not always equivalent in meaning—a common hypothesis made when training machine translation systems. I have worked on building models and algorithms for detecting, analyzing, and mitigating the impact of small cross-lingual meaning differences on machine translation training.
✔️ Revisiting Style Transfer Beyond English: Progress recorded when modeling English does not always port to other languages—a common hypothesis made in Multilingual NLP. I have worked toward creating resources and evaluation models that lay the foundation for studying the task of controlling stylistic variations in natural language generation in a multilingual setting.
Google Research, Summer 2022
Research Intern
Meta AI Research, Summer 2021
Research Intern
Dataminr Research, Summer 2020
Research Intern
Ph.D. in Computer Science, 2018-now
University of Maryland, College Park, USA
M.Sc. in Computer Science, 2018-2020
University of Maryland, College Park, USA
B.Sc. & M.Eng. in Electrical and Computer Engineering, 2012-2018
National Technical University of Athens, Athens, Greece
Mined bitexts can contain imperfect translations that yield unreliable training signals for Neural Machine Translation. While filtering …
Quantifying fine-grained cross-lingual semantic divergences at scale, requires computational models that do not rely on human-labeled …
Parallel texts—a source paired with its (human) translation—are routinely used for training machine translation systems …
As a community, we have overfitted the characteristics of English-language data when modeling various tasks, does the same hold for our …
A dominant hypothesis in multilingual research is that models developed and optimized for English can be seamlessly transferred (and …
Detecting fine-grained semantic divergences—small meaning differences in segments that are treated as exact translation …