You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository was archived by the owner on Sep 10, 2025. It is now read-only.
Add support for all 8 remaining datasets (SST-2 is already supported) of the GLUE benchmark: CoLA, MRPC, QQP, STS-B, MNLI, QNLI, RTE, WNLI.
Motivation
In itself adding support for all GLUE datasets has a lot of value: GLUE is one of the most widely used benchmark in the NLP community. Furthermore, our planned effort to develop a cohesive API for Multi-Task Learning requires us to enhance our suite our datasets, starting with GLUE.
Additional context
We have already created a streamlined dataset API, with a consistent use of DataPipes for dataset download and load operations, as well as a testing methodology relying on mock data (see #1493). This feature will only require to add support for these 8 datasets following that methodology.