The project consists of a Multi-Label Text Classifier project using a RNN from Keras.
The dataset consists of disaster messages that are classified into 36 different classes. The goal of the project is to classify an input message into these different classes.
The goal of the project is to explain how to deal with Multiple-Outputs and Multiple-Losses RNN in Keras. The project comes across this issue keras-team/keras#8011 and presents a validy solution.
There are two approches on the project:
- Creating the Embeddings using Keras Embedding Layer
- Using Glove pre-trained embedding
Check the [Text Classifier with Multiple Outputs and Multiple Losses in Keras] Medium Post (https://medium.com/@danieldacosta_75030/text-classifier-with-multiple-outputs-and-multiple-losses-in-keras-4b7a527eb858) for further details!
The dataset consists of disaster messages that are classified into 36 different classes. The dataset in highly imbalanced, having different distributions for each class. In order to reduce this problem a class weighted approach was used, where we make the classifier aware of the imbalanced data by incorporating the weights of classes into the cost function.
In the RNN model, a class_weight
paramater was set in order to reduce the imbalaced distributions problem.
- Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation.
- https://medium.com/towards-artificial-intelligence/keras-for-multi-label-text-classification-86d194311d0e
- https://stats.stackexchange.com/questions/323961/how-to-define-multiple-losses-in-machine-learning
- keras-team/keras#8011