Unified pre-training for language understanding (NLU) and generation (NLG)
Update (June, 2020): "UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training" was accepted by ICML 2020.
UniLM v2 (February, 2020): the preprint "UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training".
UniLM v1 (September 30th, 2019): the code and pre-trained models for the NeurIPS 2019 paper "Unified Language Model Pre-training for Natural Language Understanding and Generation".
If you find UniLM useful in your work, you can cite the following paper:
@inproceedings{unilmv2,
  title={UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training},
  author={Bao, Hangbo and Dong, Li and Wei, Furu and Wang, Wenhui and Yang, Nan and Liu, Xiaodong and Wang, Yu and Piao, Songhao and Gao, Jianfeng and Zhou, Ming and Hon, Hsiao-Wuen},
  booktitle={Preprint},
  year={2020}
}
Our code is based on pytorch-transformers v0.4.0. We thank the authors for their wonderful open-source efforts.
This project is licensed under the license found in the LICENSE file in the root directory of this source tree. Portions of the source code are based on the pytorch-transformers v0.4.0 project.
Microsoft Open Source Code of Conduct
For help or issues using UniLM, please submit a GitHub issue.
For other communications related to UniLM, please contact Li Dong ([email protected]) or Furu Wei ([email protected]).