INFERCEPT: Efficient Intercept Support for Augmented Large Language Model Inference

This repo contains the implementation of InferCept. Please refer to our paper for more details.

Instructions

To install InferCept into your environment:

# After cloning the repo
cd infercept/
pip install -e .

To let the serving system hook into augmentation calls, register your aug-stop token in vllm/utils.py. You can register multiple stop tokens at once:

from typing import List

def get_api_stop_strings() -> List[str]:
    return ["<stop token 1>", "<stop token 2>"]
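
As a rough illustration of what registering stop strings makes possible, the sketch below splits generated text at the earliest registered stop string, which is the point where a serving system could pause generation and hand off to an augmentation. This is not InferCept's actual interception logic. The helper names find_first_stop and split_at_stop, and the tokens "<api_call>" and "<tool>", are hypothetical and chosen only for this example.

```python
from typing import List, Optional, Tuple


def get_api_stop_strings() -> List[str]:
    # Hypothetical stop strings for illustration only;
    # register the ones your model actually emits.
    return ["<api_call>", "<tool>"]


def find_first_stop(text: str, stop_strings: List[str]) -> Optional[Tuple[int, str]]:
    """Return (index, stop_string) for the earliest match in text, or None."""
    best: Optional[Tuple[int, str]] = None
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1 and (best is None or idx < best[0]):
            best = (idx, stop)
    return best


def split_at_stop(text: str, stop_strings: List[str]) -> Tuple[str, Optional[str]]:
    """Return the text before the earliest stop string and the stop string hit.

    If no stop string occurs, the text is returned unchanged with None.
    """
    hit = find_first_stop(text, stop_strings)
    if hit is None:
        return text, None
    idx, stop = hit
    return text[:idx], stop


if __name__ == "__main__":
    generated = "The weather is <api_call>get_weather()"
    prefix, stop = split_at_stop(generated, get_api_stop_strings())
    print(repr(prefix), stop)
```

Registering several strings matters when a model can trigger more than one kind of augmentation; scanning for the earliest match keeps the behavior well defined when more than one of them appears in the same output.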

To reproduce the paper's results, see the exps folder.

Citation

If you use InferCept for your research, please cite our paper:

@inproceedings{
  abhyankar2024infer,
  title={INFERCEPT: Efficient Intercept Support for Augmented Large Language Model Inference},
  author={Reyna Abhyankar and Zijian He and Vikranth Srivatsa and Hao Zhang and Yiying Zhang},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024},
  month=Jul,
  address={Vienna, Austria},
}

