prediction-flow is a Python package that provides modern deep-learning-based CTR models. The models are implemented in PyTorch.
- Install using pip.
pip install prediction-flow
All feature types take two parameters, name and column_flow. The name parameter selects the column's raw data from the input data frame. The column_flow parameter is either a single transformer or a list of transformers. The transformers pre-process the column data before the model is trained.
- dense number feature
Number('age', StandardScaler())
Number('ctr', None)
- sparse category feature
Category('movieId', CategoryEncoder(min_cnt=1))
- variable-length sequence feature
Sequence('genres', SequenceEncoder(sep='|', min_cnt=1))
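The idea that column_flow may be nothing, a single transformer, or a list of transformers applied in order can be illustrated with a minimal sketch. This is not prediction-flow's implementation: `apply_column_flow`, `fill_zero`, and `center` are hypothetical names, and plain functions stand in for the package's transformer objects.

```python
def apply_column_flow(values, column_flow=None):
    """Apply no transformer, one transformer, or a list of transformers in order.

    Hypothetical helper for illustration only; it is not part of prediction-flow.
    """
    if column_flow is None:
        steps = []                      # e.g. Number('ctr', None): keep raw values
    elif callable(column_flow):
        steps = [column_flow]           # a single transformer
    else:
        steps = list(column_flow)       # several transformers, applied left to right
    for step in steps:
        values = step(values)
    return values


def fill_zero(values):
    """Stand-in transformer: replace missing values with 0.0."""
    return [0.0 if v is None else v for v in values]


def center(values):
    """Stand-in transformer: subtract the column mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


raw = [1.0, None, 5.0]
unchanged = apply_column_flow(raw)                      # no pre-processing
filled = apply_column_flow(raw, fill_zero)              # single transformer
centered = apply_column_flow(raw, [fill_zero, center])  # list of transformers
print(unchanged, filled, centered)
```

Order matters in the list form: here the missing value is filled before the mean is computed, which mirrors the requirement that null values be handled before scaling.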
The following transformers are currently provided.
transformer | supported feature type | detail |
---|---|---|
StandardScaler | Number | Wrapper of scikit-learn's StandardScaler. Null value must be filled in advance. |
LogTransformer | Number | Log scaler. Null value must be filled in advance. |
CategoryEncoder | Category | Converting str value to int. Null value must be filled in advance using '__UNKNOWN__'. |
SequenceEncoder | Sequence | Converting sequence str value to int. Null value must be filled in advance using '__UNKNOWN__'. |
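To make the encoder rows more concrete, the sketch below shows one way null filling, a min_cnt threshold, and a sep-delimited sequence could map strings to integers. It is a simplified illustration, not the package's code: the function names, the choice of index 0 for '__UNKNOWN__', and the rule that tokens seen fewer than min_cnt times fall back to '__UNKNOWN__' are all assumptions.

```python
from collections import Counter

UNKNOWN = '__UNKNOWN__'


def fill_nulls(values):
    """Replace missing entries with the '__UNKNOWN__' marker before encoding."""
    return [UNKNOWN if v is None else v for v in values]


def build_vocab(tokens, min_cnt=1):
    """Map each token seen at least min_cnt times to an integer index.

    Assumption: index 0 is reserved for '__UNKNOWN__'.
    """
    counts = Counter(tokens)
    vocab = {UNKNOWN: 0}
    for token in sorted(counts):
        if token != UNKNOWN and counts[token] >= min_cnt:
            vocab[token] = len(vocab)
    return vocab


def encode_category(values, vocab):
    """Encode one string per row; rare or unseen values become '__UNKNOWN__'."""
    return [vocab.get(v, vocab[UNKNOWN]) for v in values]


def encode_sequence(values, vocab, sep='|'):
    """Encode a sep-delimited string per row as a list of integer indices."""
    return [[vocab.get(t, vocab[UNKNOWN]) for t in v.split(sep)] for v in values]


# Category column: 'm2' appears only once, so min_cnt=2 maps it to '__UNKNOWN__'.
movie_ids = fill_nulls(['m1', 'm2', None, 'm1'])
movie_vocab = build_vocab(movie_ids, min_cnt=2)
encoded_ids = encode_category(movie_ids, movie_vocab)

# Sequence column: each row is split on '|' before encoding.
genres = fill_nulls(['Action|Comedy', None, 'Comedy'])
genre_tokens = [t for g in genres for t in g.split('|')]
genre_vocab = build_vocab(genre_tokens, min_cnt=1)
encoded_genres = encode_sequence(genres, genre_vocab)

print(encoded_ids)
print(encoded_genres)
```

Filling nulls first keeps the encoders simple: every row is a string by the time it is encoded, and missing or rare values share a single index.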
model | reference |
---|---|
DNN | - |
Wide & Deep | [DLRS 2016]Wide & Deep Learning for Recommender Systems |
DeepFM | [IJCAI 2017]DeepFM: A Factorization-Machine based Neural Network for CTR Prediction |
DIN | [KDD 2018]Deep Interest Network for Click-Through Rate Prediction |
DNN + GRU + GRU + Attention | [AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction |
DNN + GRU + AIGRU | [AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction |
DNN + GRU + AGRU | [AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction |
DNN + GRU + AUGRU | [AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction |
DIEN | [AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction |
OTHER | TODO |
This dataset is used only to verify that the code runs; the resulting accuracy is not meaningful.
- Prepare the dataset. preprocess.ipynb
- Run the model. movielens-1m.ipynb
- Prepare the dataset. prepare_neg.ipynb
- Run the model. amazon.ipynb
- An example using pytorch-lightning. amazon-lightning.ipynb
accuracy
- Following the design of DeepCTR, features are divided into dense (class Number), sparse (class Category), and sequence (class Sequence) types.