Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models

Publication Date:

Abstract

Massively multilingual models subsuming tens or even hundreds of languages pose great challenges to multi-task optimization. While it is a common practice to apply a language-agnostic procedure optimizing a joint multilingual task objective, how to properly characterize and take advantage of its underlying problem structure for improving optimization efficiency remains under-explored...

Authors

Topics

https://openreview.net/pdf?id=F1vEjWK-lH_
