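Built on Golang, Docker and microservice principles, this project implements high-concurrency, high-performance and high-availability inference microservices for recommender systems. It includes multiple recall and ranking services, offers several access interfaces (REST, gRPC and Dubbo), and can handle more than ten million inference requests per day.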
Large-scale deep learning recommender system inference services / microservices based on TFServing, Faiss, Redis, Dubbo, Nacos, gRPC and Golang. The system provides real-time inference services (Dubbo API, gRPC API and REST API) and can withstand millions of inference requests per day.
The deep-learning-based model inference microservices mainly use the following components:
| Type | Component | Description |
|---|---|---|
| Data | Hive / Spark | Run ETL over the behavior data of millions of users, then build the feature data warehouse. |
| | Redis | Save the training samples in TFRecord format and store them in Redis Cluster. |
| Model | TensorFlow | Train deep learning recall / rank models. Other deep learning frameworks can also be used, but the models must be saved in *.pb format. |
| | TensorFlow Serving | Deploy models and provide a gRPC service. |
| | FAISS | Fast ANN search that retrieves thousands of items from millions of items. |
| Microservices | Nacos | Manage config files and services. |
| | Dubbo | Build Dubbo protocol RPC services and register them with Nacos. |
| | Hystrix | Distribute traffic during peak load (latency and fault tolerance). |
| | Skywalking | Record the time spent on each request. |
| Deploy | Docker | Deploy services as Docker containers. |
| | Kubernetes | Manage Docker containers and monitor the resource consumption of each service, such as memory and CPU. |
| | Nginx / Apisix | API gateway. |
The core components of model inference microservices are as follows:
| Type | Component | Description |
|---|---|---|
| Feature | feature engineering | User offline features, user real-time features, user sequence features and item features. |
| Sample | recall/rank samples | Create samples in TFRecord format. |
| Recall | cf recall | User CF, item CF and Swing. |
| | dssm recall | Recall from the DSSM model and the Faiss index. |
| | simple recall | Rule-based recall, such as hot-item recall. |
| | cold start | Cold start for new users and new items. |
| Rank | pre_ranking | Pre-rank the thousands of items returned by recall. |
| | ranking | Rank the hundreds of items left after pre-ranking. |
| | re_rank | Re-rank the hundreds of items after ranking. |
| Services | config loader | Parse the service's startup config from Nacos, such as gRPC info, Redis info and index info. |
| | dubbo service | Dubbo protocol service. |
| | gRPC service | gRPC protocol service. |
| | rest service | RESTful service. |
| APIs | dubbo api | Provides a Dubbo protocol API. |
| | gRPC api | Provides a gRPC protocol API. |
| | rest api | Provides an HTTP protocol API. |
| Web | web | Service management and service monitoring page. |
| Deploy | faiss | Faiss index service deployment. |
| | tfserving | TensorFlow model deployment. |
| | infer | Recommender system inference deployment. |
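
The recall and rank rows above describe a funnel: several recall channels produce candidates, pre-ranking cuts them down, and ranking picks the final list. A minimal sketch of that flow is shown here, assuming in-memory stubs in place of the real Faiss, Redis and TensorFlow Serving calls; the `Item`, `Recaller`, `Scorer`, `mergeRecall`, `topK` and `recommend` names are hypothetical, and the re-rank stage is left out for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a candidate with the score assigned by the latest stage.
type Item struct {
	ID    string
	Score float64
}

// Recaller is a hypothetical recall channel (CF, DSSM, hot items, ...).
type Recaller func(userID string) []Item

// Scorer is a hypothetical model call that returns the items with new scores.
type Scorer func(userID string, items []Item) []Item

// mergeRecall concatenates the channels in order and drops duplicate IDs,
// keeping the first occurrence of each item.
func mergeRecall(userID string, channels ...Recaller) []Item {
	seen := make(map[string]bool)
	var merged []Item
	for _, recall := range channels {
		for _, it := range recall(userID) {
			if !seen[it.ID] {
				seen[it.ID] = true
				merged = append(merged, it)
			}
		}
	}
	return merged
}

// topK returns at most k items sorted by descending score.
// It sorts a copy, so the caller's slice is left untouched.
func topK(items []Item, k int) []Item {
	sorted := append([]Item(nil), items...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Score > sorted[j].Score })
	if k < len(sorted) {
		sorted = sorted[:k]
	}
	return sorted
}

// recommend runs recall -> pre-ranking -> ranking and returns finalK items.
func recommend(userID string, channels []Recaller, preRank, rank Scorer, preK, finalK int) []Item {
	candidates := mergeRecall(userID, channels...)
	candidates = topK(preRank(userID, candidates), preK)
	return topK(rank(userID, candidates), finalK)
}

func main() {
	hot := func(string) []Item { return []Item{{"a", 0.1}, {"b", 0.9}} }
	cf := func(string) []Item { return []Item{{"b", 0.5}, {"c", 0.7}, {"d", 0.3}} }
	// Stub scorer that keeps the recall scores unchanged.
	identity := func(_ string, items []Item) []Item { return items }

	for _, it := range recommend("u42", []Recaller{hot, cf}, identity, identity, 3, 2) {
		fmt.Println(it.ID, it.Score)
	}
}
```

In a deployed service each `Scorer` would wrap a remote model call and the cut-offs would be in the thousands and hundreds, as in the table, rather than 3 and 2.
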
Docker
Kubernetes
Nginx
Apisix
ELK
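1. Recommender systems
Wang Shusen's open course on recommender systems: explains real-world industrial recommender systems using scenarios from Xiaohongshu.
● Recommender_System
2. YouTube recommender system ranking model
Based on the "DNN_for_YouTube_Recommendations" model and the MovieLens movie rating dataset (ml-1m), this walks through in detail how to implement a recommender system ranking model with TensorFlow 2.
● YouTube deep ranking model (multi-valued embeddings, multi-objective learning)
3. Introduction to machine learning with Sklearn
● Introductory machine learning tutorial with Sklearn
4. Introduction to deep learning with TensorFlow
● Introductory deep learning tutorial with TensorFlow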