Line-by-line source code reading notes, originally written in Chinese.
Blog survey on cross-modal video moment retrieval: https://blog.csdn.net/qq_39388410/article/details/107316185
Original paper: Jiyang Gao, Chen Sun, Zhenheng Yang, and Ram Nevatia. 2017. TALL: Temporal Activity Localization via Language Query. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 5277–5285.