Linyi Li

Assistant Professor, CS@SFU. Director of TAI Lab @ SFU. Email: [firstnamelowercase]_[lastnamelowercase]@sfu.ca

I am Linyi LI, an assistant professor in the School of Computing Science at Simon Fraser University, where I direct the TAI Lab.

My research is in trustworthy deep learning, with a focus on certifiably trustworthy deep learning and trustworthy foundation models. My research spans machine learning and computer security. More concretely, I like to

enable certifiable and verifiable trustworthiness guarantees (such as robustness, fairness, and numerical reliability) for large-scale deep learning systems;
understand and analyze the mechanisms of deep learning and foundation models, especially the root causes of their trustworthiness issues;
and evaluate foundation models scientifically and comprehensively.

I have published over 30 papers in flagship machine learning and computer security conferences such as ICML, NeurIPS, ICLR, IEEE S&P, and ACM CCS. I have received the Rising Stars in Data Science award, the AdvML Rising Star Award, and the Wing Kai Cheng Fellowship. I co-led Team α,β-CROWN, which won the 4th International Verification of Neural Networks Competition (VNN-COMP'23) in 2023. I was a finalist for the 2022 Qualcomm Innovation Fellowship and the 2022 Two Sigma PhD Fellowship.

I received my PhD in Computer Science from the University of Illinois Urbana-Champaign in 2023, advised by the gorgeous Bo Li and the awesome Tao Xie. I received my bachelor's degree from the Department of Computer Science and Technology at Tsinghua University in 2018, where I did research on automated Web API testing, advised by Xiaoying Bai. I was a senior research scientist at ByteDance between 2023 and 2024. I interned at Microsoft twice (mentored by Adam Kalai in 2022 and Neel Sundaresan in 2019), at Fujitsu Research of America (mentored by Mukul Prasad) in 2021, and at Carnegie Mellon University (mentored by Matt Fredrikson) in 2017.


News

[Dec, 2024] I will give a talk in the AAAI 2025 New Faculty Highlights program on Certified Trustworthiness in the Era of Large Language Models.

[Dec, 2024] I will be serving as an Area Chair for ICML 2025.

[Nov, 2024] Our lab website is now available. We have multiple PhD openings for Fall 2025.

[Sept, 2024] Our InfiBench, a novel benchmark for open-world question-answering for code large language models, will appear at NeurIPS 2024! We summarize the empirical scaling laws and trends from over 100 open-source code LLMs.

[Aug, 2024] I joined the School of Computing Science at Simon Fraser University as a tenure-track assistant professor. Excited to embark on this new journey!

[July, 2023] Our α,β-CROWN wins the neural network verification competition again!

[Dec, 2022] Our RANUM framework for assuring the numerical reliability of deep neural networks has been accepted by ICSE 2023.

[Oct, 2022] Happy to be selected for Rising Stars in Data Science at DSI, University of Chicago!

[Sept, 2022] I am co-organizing the workshop on Trustworthy and Socially Responsible Machine Learning at NeurIPS 2022. We invite submissions on any aspect of trustworthy and socially responsible machine learning.

[Sept, 2022] Five papers accepted to NeurIPS 2022. My co-first-authored paper proposes a scalable method for certifying a model's distributional fairness.

[Aug, 2022] Happy to receive the 2022 AdvML Rising Star Award!

[Jun, 2022] We released a systematization of knowledge (SoK) paper (accepted by IEEE S&P 2023), along with a toolkit on GitHub for evaluating about 20 neural network verification approaches.

[May, 2022] Three papers accepted by ICML 2022. We provide a tighter certification against L2 perturbations (link), a tighter certification for point cloud models (link), and an out-of-domain generalization certification (link). Look forward to seeing you in Baltimore in July 2022.

[May, 2022] Started an internship at Microsoft Research New England on deep program synthesis. I am in the Boston area this summer.

[Apr, 2022] Selected as a finalist for the 2022 Qualcomm Innovation Fellowship.

[Jan, 2022] Selected as a finalist for the 2022 Two Sigma PhD Fellowship.

[Jan, 2022] We propose practical robustness certification approaches for RL against evasion attacks (CROP, accepted by ICLR 2022) and poisoning attacks (COPA, accepted by ICLR 2022).

[Jan, 2022] Motivated by theoretical analysis, we propose DRT, a training approach for randomized smoothing that diversifies submodels within an ensemble to achieve state-of-the-art certified robustness. Check out our paper at ICLR 2022!

[Sept, 2021] Regularizing gradient similarity and model smoothness is sufficient to diversify sub-models in an ensemble, leading to significant improvements in ensemble NN robustness. Details available in our paper at NeurIPS 2021.

[May, 2021] Simple downsampling combined with Progressive GAN can attack neural networks very efficiently. Details available in our paper at ICML 2021.

[May, 2021] We provide the first rigorous robustness certification on ImageNet against common image transformations including rotation and scaling! Paper will appear at CCS 2021.

[Jan, 2021] We will present a novel analysis of using non-linear projections for black-box attacks on neural networks at AISTATS 2021.

[Aug, 2020] Paper on clustering test steps leveraging NLP for automating software testing got accepted by ESEC/FSE'20 (Industry Track).

[Apr, 2020] Passed the Ph.D. Qualifying exam.

[Nov, 2019] Our team ranked 2nd in ICPC Mid-Central USA Regional Contest 2019.

[May, 2019] Paper on training provably robust NNs via reference adversarial space got accepted by IJCAI'19.

[July, 2018] Graduated from Tsinghua University with the Outstanding Undergraduate Award from the university and the Excellence Undergraduate Award from the department.

[Feb, 2018] Received CS Ph.D. admission offers from Carnegie Mellon University, University of Illinois at Urbana-Champaign, and University of Wisconsin-Madison. Many thanks to everyone who helped with my application!

[Sept, 2017] Finished a summer internship at Carnegie Mellon University on neural network explanation, advised by Prof. Matt Fredrikson.

[Mar, 2017] Paper on Cloud API Testing got accepted by COMPSAC'17.

[Nov, 2015] Started to work with Prof. Xiaoying Bai on software testing.

Selected Publications

The full publication list is available at TAI Lab - Publication and Google Scholar.

(* denotes equal contribution)

Linyi Li, Shijie Geng, Zhenwen Li, Yibo He, Hao Yu, Ziyue Hua, Guanghan Ning, Siwei Wang, Tao Xie, Hongxia Yang
InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models
38th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2024 D&B)
[Full Version] [Conference Version] [Code] [Project Website] [Slides] [BibTex]

@inproceedings{
li2024infibench,
title={InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models},
author={Linyi Li and Shijie Geng and Zhenwen Li and Yibo He and Hao Yu and Ziyue Hua and Guanghan Ning and Siwei Wang and Tao Xie and Hongxia Yang},
booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2024},
}

Topic: LLM benchmark, code

Summary: A comprehensive benchmark for code large language models (LLMs) that evaluates models' ability to answer free-form real-world questions in the code domain. From the evaluation of over 100 models, we summarize the empirical trends and scaling laws for existing open-source code LLMs.

Linyi Li
Certifiably Trustworthy Deep Learning Systems at Scale
Doctoral Thesis
[Full Version] [Official Version] [BibTex]

@phdthesis{li2023thesis,
title = {Certifiably Trustworthy Deep Learning Systems at Scale},
author = {Linyi Li},
year = 2023,
month = {Oct},
school = {University of Illinois Urbana-Champaign},
type = {PhD thesis}
}

Topic: certified ML

Summary: My PhD thesis. The thesis systematically summarizes the current research frontier of certified trustworthiness in deep learning. Compared to the SoK paper, the thesis extends beyond robustness and covers the technical details of representative methods.

Linyi Li, Tao Xie, Bo Li
SoK: Certified Robustness for Deep Neural Networks
44th IEEE Symposium on Security and Privacy (SP 2023)
[Full Version] [Conference Version] [Slides] [Code] [Leaderboard] [BibTex]

@inproceedings{li2023sok,
author={Linyi Li and Tao Xie and Bo Li},
title = {SoK: Certified Robustness for Deep Neural Networks},
booktitle = {44th {IEEE} Symposium on Security and Privacy, {SP} 2023, San Francisco, CA, USA, 22-26 May 2023},
publisher = {{IEEE}},
year = {2023},
}

Topic: certified ML

Summary: A comprehensive systematization of knowledge on DNN certified robustness, including discussion of practical and theoretical implications, findings, main challenges, and future directions, accompanied by an open-source unified platform to evaluate 20+ representative approaches.

Linyi Li, Yuhao Zhang, Luyao Ren, Yingfei Xiong, Tao Xie
Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects
45th IEEE/ACM International Conference on Software Engineering (ICSE 2023)
[Full Version] [Conference Version] [Slides] [Code] [BibTex]

@inproceedings{li2023reliability,
author={Linyi Li and Yuhao Zhang and Luyao Ren and Yingfei Xiong and Tao Xie},
title = {Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects},
booktitle = {45th International Conference on Software Engineering, {ICSE} 2023, Melbourne, Australia, 14-20 May 2023},
publisher = {{IEEE/ACM}},
year = {2023},
}

Topic: certified ML, numerical reliability

Summary: An effective and efficient white-box framework for generic DNN architectures, named RANUM, for certifying numerical reliability (e.g., never outputting NaN or INF), generating failure-exhibiting system tests, and suggesting fixes. RANUM is the first automated framework for the last two tasks.

Mintong Kang*, Linyi Li*, Maurice Weber, Yang Liu, Ce Zhang, Bo Li
Certifying Some Distributional Fairness with Subpopulation Decomposition
Advances in Neural Information Processing Systems (NeurIPS) 2022
[Full Version] [Conference Version] [Code] [Poster] [BibTex]

@inproceedings{kang2022certifying,
title = {Certifying Some Distributional Fairness with Subpopulation Decomposition},
author = {Mintong Kang and Linyi Li and Maurice Weber and Yang Liu and Ce Zhang and Bo Li},
booktitle = {Advances in Neural Information Processing Systems 35 (NeurIPS 2022)},
year = {2022}
}

Topic: certified ML, fairness

Summary: A practical and scalable certification approach, based on subpopulation decomposition, that provides a fairness bound for a given model when the distribution shifts away from the training distribution.

Linyi Li, Jiawei Zhang, Tao Xie, Bo Li
Double Sampling Randomized Smoothing
39th International Conference on Machine Learning (ICML 2022)
[Conference Version] [Full Version] [Code] [BibTex]

@inproceedings{
li2022double,
title={Double Sampling Randomized Smoothing},
author={Linyi Li and Jiawei Zhang and Tao Xie and Bo Li},
booktitle={39th International Conference on Machine Learning (ICML 2022)},
year={2022},
}

Topic: certified ML

Summary: A tighter certification approach for randomized smoothing that, for the first time, circumvents the well-known curse of dimensionality under mild conditions by leveraging statistics from two strategically chosen distributions.

Fan Wu*, Linyi Li*, Chejian Xu, Huan Zhang, Bhavya Kailkhura, Krishnaram Kenthapadi, Ding Zhao, Bo Li
COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks
10th International Conference on Learning Representations (ICLR 2022)
[Conference Version] [Full Version] [Leaderboard] [Code] [BibTex]

@inproceedings{
wu2022copa,
title={{COPA}: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks},
author={Fan Wu and Linyi Li and Chejian Xu and Huan Zhang and Bhavya Kailkhura and Krishnaram Kenthapadi and Ding Zhao and Bo Li},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=psh0oeMSBiF}
}

Topic: certified ML, deep reinforcement learning

Summary: The first approach for certifying deep RL robustness against offline training dataset perturbations, i.e., poisoning attacks, by aggregating over policies trained on partitioned datasets and policies for multiple time steps.

Zhuolin Yang*, Linyi Li*, Xiaojun Xu, Bhavya Kailkhura, Tao Xie, Bo Li
On the Certified Robustness for Ensemble Models and Beyond
10th International Conference on Learning Representations (ICLR 2022)
[Conference Version] [Full Version] [Code] [BibTex]

@inproceedings{
yang2022on,
title={On the Certified Robustness for Ensemble Models and Beyond},
author={Zhuolin Yang and Linyi Li and Xiaojun Xu and Bhavya Kailkhura and Tao Xie and Bo Li},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=tUa4REjGjTf}
}

Topic: certified ML

Summary: Based on a curvature bound for randomized-smoothing-based classifiers, we prove that a large confidence margin and gradient diversity are sufficient and necessary conditions for certifiably robust ensembles. By regularizing these two factors, we achieve SOTA L2 certified robustness.

Zhuolin Yang*, Linyi Li*, Xiaojun Xu*, Shiliang Zuo, Qian Chen, Pan Zhou, Benjamin I. P. Rubinstein, Ce Zhang, Bo Li
TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness
Advances in Neural Information Processing Systems (NeurIPS) 2021
[Conference Version] [Full Version] [Code] [BibTex]

@inproceedings{yangli2021trs,
title = {TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness},
author = {Zhuolin Yang and Linyi Li and Xiaojun Xu and Shiliang Zuo and Qian Chen and Pan Zhou and Benjamin I. P. Rubinstein and Ce Zhang and Bo Li},
booktitle = {Advances in Neural Information Processing Systems 34 (NeurIPS 2021)},
year = {2021}
}

Topic: robust ML

Summary: We prove the guaranteed correlation between model diversity and adversarial transferability given bounded model smoothness, which leads to a strong regularizer that achieves SOTA ensemble robustness against existing strong attacks.

Jiawei Zhang*, Linyi Li*, Huichen Li, Xiaolu Zhang, Shuang Yang, Bo Li
Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation
International Conference on Machine Learning (ICML) 2021
[Conference Version] [Full Version] [Code] [Slides] [BibTex]

@inproceedings{zhangli2021progressive,
title = {Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation},
author = {Zhang, Jiawei and Li, Linyi and Li, Huichen and Zhang, Xiaolu and Yang, Shuang and Li, Bo},
booktitle = {Proceedings of the 38th International Conference on Machine Learning (ICML 2021)},
pages = {12479--12490},
year = {2021},
editor = {Meila, Marina and Zhang, Tong},
volume = {139},
series = {Proceedings of Machine Learning Research},
month = {18--24 Jul},
publisher = {PMLR},
}

Topic: attacks for ML

Summary: We systematically analyze the gradient estimator that guides black-box attacks on DNNs, revealing several key factors that can lead to more accurate gradient estimation with fewer queries. One way to realize these key factors is to conduct the attack with gradient estimation on a particularly scaled version of the image, which leads to the PSBA black-box attack with SOTA query efficiency.

Linyi Li*, Maurice Weber*, Xiaojun Xu, Luka Rimanic, Bhavya Kailkhura, Tao Xie, Ce Zhang, Bo Li
TSS: Transformation-Specific Smoothing for Robustness Certification
ACM Conference on Computer and Communications Security (CCS) 2021
[Conference Version] [Full Version] [Code] [Slides] [BibTex]

@inproceedings{li2021tss,
title={TSS: Transformation-Specific Smoothing for Robustness Certification},
author={Linyi Li and Maurice Weber and Xiaojun Xu and Luka Rimanic and Bhavya Kailkhura and Tao Xie and Ce Zhang and Bo Li},
year={2021},
booktitle={ACM Conference on Computer and Communications Security (CCS 2021)}
}

Topic: certified ML

Summary: Natural transformations such as rotation and scaling are common in the physical world. We propose the first scalable certification approach against natural transformations based on randomized smoothing, rigorous Lipschitz analysis, and stratified sampling. For the first time, we certify non-trivial robustness (>30% certified robust accuracy) on the large-scale ImageNet dataset.

Huichen Li*, Linyi Li*, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li
Nonlinear Projection Based Gradient Estimation for Query Efficient Blackbox Attacks
International Conference on Artificial Intelligence and Statistics (AISTATS) 2021
[Conference Version] [Full Version] [Code] [BibTex]

@inproceedings{li2020nolinear,
title={Nonlinear Projection Based Gradient Estimation for Query Efficient Blackbox Attacks},
author={Huichen Li and Linyi Li and Xiaojun Xu and Xiaolu Zhang and Shuang Yang and Bo Li},
year={2021},
booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS 2021)},
series = {Proceedings of Machine Learning Research},
month = {13--15 Apr},
publisher = {PMLR},
}

Topic: attacks for ML

Summary: We analyze the outcome of using nonlinear projections for black-box gradient-estimation-based attacks, which shows that proper nonlinear projections can help to improve the attack efficiency.

Linyi Li, Zhenwen Li, Weijie Zhang, Jun Zhou, Pengcheng Wang, Jing Wu, Guanghua He, Xia Zeng, Yuetang Deng, Tao Xie
Clustering Test Steps in Natural Language toward Automating Test Automation
ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2020, Industry Track
[Paper] [Video] [BibTex]

@inproceedings{li2020clustep,
title = {Clustering Test Steps in Natural Language toward Automating Test Automation},
author = {Li, Linyi and Li, Zhenwen and Zhang, Weijie and Zhou, Jun and Wang, Pengcheng and Wu, Jing and He, Guanghua and Zeng, Xia and Deng, Yuetang and Xie, Tao},
booktitle = {Proceedings of the 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering {(ESEC/FSE 2020)}},
year = {2020},
doi = {10.1145/3368089.3417067},
url = {https://doi.org/10.1145/3368089.3417067}
}

Topic: ML for software testing

Summary: We provide an effective pipeline to cluster test steps in natural language and then synthesize executable test cases, deployed for WeChat testing.

Linyi Li*, Zexuan Zhong*, Bo Li, Tao Xie
Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space
International Joint Conference on Artificial Intelligence (IJCAI) 2019
[Paper] [Code] [BibTex]

@inproceedings{li2019robustra,
title = {Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space},
author = {Li, Linyi and Zhong, Zexuan and Li, Bo and Xie, Tao},
booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI 2019)},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
pages = {4711--4717},
year = {2019},
month = {7},
doi = {10.24963/ijcai.2019/654},
url = {https://doi.org/10.24963/ijcai.2019/654}
}

Topic: certified ML

Summary: We propose a training method for achieving certified robustness by regularizing only within the reference adversarial space from a jointly trained model to alleviate the optimization hardness and achieve higher certified robustness.

Miscellaneous

I love traveling, geography, and languages, especially Chinese phonology. I admire Yuen Ren Chao.

Sometimes I play programming contests for fun.

I love VERY VERY spicy 🌶 food :)

I was born and spent my childhood in Zhangjiajie, China. I lived in Changsha, China before college.

I am a Northern Tujia. In Tujia Language: Ngaf Bifzivkar.

I was on the faculty job market in the 2022-2023 cycle. Here are my research, teaching, and diversity statements.