Recent progress in Natural Language Processing follows the scaling recipe: larger models, more data, and more compute. In this talk, I will argue that making further progress at this scale requires meaningfully incorporating human insights at different steps of the NLP development lifecycle. In the first part of the talk, I will present examples of how insights from human processes can inform the design of annotation protocols and unsupervised approaches that improve the detection of subtle meaning differences across languages, for both humans and machines. I will then present work showing how insights from human audits of large-scale datasets can help us understand and improve the translation capabilities of language models, by designing losses and models with inductive biases that account for those insights. Finally, I will summarize work on insights from human evaluation that help establish fair comparisons of NLP systems across languages.