ghostagan/NLP-Projects

NLP-Projects

Natural Language Processing projects, including concepts and scripts on:

DL best practices in NLP

1. Word embeddings

  • Use pre-trained embeddings if available.
  • The best embedding dimension is task-dependent:
    • A smaller dimensionality (e.g., 100) works well for syntactic tasks (e.g., NER, POS tagging).
    • A larger dimensionality (e.g., 300) is useful for semantic tasks (e.g., sentiment analysis).
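
As a rough illustration of the first point, the sketch below builds an embedding matrix for a task vocabulary from pre-trained vectors. It assumes the vectors come as plain-text lines of the form `word v1 v2 ...` (the layout used by GloVe text files); the helper names, the toy vectors, and the small uniform initialisation for out-of-vocabulary words are assumptions made for this example, not part of the repository.

```python
import random


def parse_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a {word: vector} dictionary."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return one row per vocabulary word, in vocabulary order."""
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        if word in pretrained:
            # Known word: copy the pre-trained vector.
            matrix.append(list(pretrained[word]))
        else:
            # Unknown word: small random initialisation (an assumption).
            matrix.append([rng.uniform(-0.05, 0.05) for _ in range(dim)])
    return matrix


# Toy 3-dimensional vectors, made up for illustration only.
toy_lines = ["cat 0.1 0.2 0.3", "dog 0.4 0.5 0.6"]
pretrained = parse_vectors(toy_lines)
embedding_matrix = build_embedding_matrix(["cat", "dog", "axolotl"], pretrained, dim=3)
```

The resulting list of rows could then be used to initialise an embedding variable in whichever framework the model is written in.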

2. Depth

  • 3- or 4-layer Bi-LSTMs (e.g., POS tagging, semantic role labelling).
  • 8 encoder and 8 decoder layers (e.g., Google's NMT).
  • In most cases, a shallower model (e.g., 2 layers) is good enough.

3. Layer connections (for avoiding vanishing gradients)

  • Highway layer
    • h = t * a(WX + b) + (1 - t) * X, where t = sigmoid(W_T X + b_T) is called the transform gate.
    • Applications: language modelling and speech recognition.
    • Implementation: tf.contrib.rnn.HighwayWrapper (TensorFlow 1.x)
  • Residual connection
    • h = a(WX + b) + X
    • Implementation: tf.contrib.rnn.ResidualWrapper (TensorFlow 1.x)
  • Dense connection
    • h_l = a(W[X_1, ..., X_l] + b), where [X_1, ..., X_l] is the concatenation of the outputs of all preceding layers.
    • Application: multi-task learning.
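
The three formulas above can be sketched in plain Python on small vectors. This is only a minimal illustration: the function names, the list-of-rows matrix layout, and the choice of tanh as the activation a are assumptions for this example, and a real model would use the framework wrappers named above.

```python
import math


def affine(W, x, b):
    """Compute W x + b for a list-of-rows matrix W and a vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def residual(x, W, b, act=math.tanh):
    """Residual connection: h = a(Wx + b) + x."""
    return [act(z) + x_i for z, x_i in zip(affine(W, x, b), x)]


def highway(x, W, b, W_T, b_T, act=math.tanh):
    """Highway layer: h = t * a(Wx + b) + (1 - t) * x, t = sigmoid(W_T x + b_T)."""
    t = [sigmoid(z) for z in affine(W_T, x, b_T)]
    a = [act(z) for z in affine(W, x, b)]
    return [t_i * a_i + (1.0 - t_i) * x_i for t_i, a_i, x_i in zip(t, a, x)]


def dense(inputs, W, b, act=math.tanh):
    """Dense connection: h_l = a(W [x_1, ..., x_l] + b)."""
    concat = [v for x in inputs for v in x]  # concatenate all earlier outputs
    return [act(z) for z in affine(W, concat, b)]


x = [1.0, -2.0]
zero_W = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]

# With zero weights the transformed path contributes nothing, so the residual
# layer passes x through unchanged and the highway layer keeps (1 - t) * x.
h_res = residual(x, zero_W, zero_b)
h_hw = highway(x, zero_W, zero_b, zero_W, zero_b)
h_dense = dense([[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
                zero_b)
```

Note that the residual and highway forms require the layer output to have the same size as its input, because X is added back elementwise.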

4. Dropout

5. LSTM tricks

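  • Treat the initial state as a trainable variable [2]
import tensorflow as tf  # TensorFlow 1.x API
# Note: using LSTMCell here raises a TypeError (Tensor object is not iterable), because
# LSTMCell expects its state as a (c, h) tuple rather than a single tensor; see
# https://stackoverflow.com/questions/42947351/tensorflow-dynamic-rnn-typeerror-tensor-object-is-not-iterable
cell = tf.nn.rnn_cell.GRUCell(state_size)
# Learn a single initial state of shape [1, state_size], initialised to zeros.
init_state = tf.get_variable('init_state', [1, state_size], initializer=tf.constant_initializer(0.0))
# Tile it across the batch so that every sequence starts from the same learned state; see
# https://stackoverflow.com/questions/44486523/should-the-variables-of-the-initial-state-of-a-dynamic-rnn-among-the-inputs-of
init_state = tf.tile(init_state, [batch_size, 1])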
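  • Gradient clipping
# Compute the gradients of the loss with respect to all trainable variables.
variables = tf.trainable_variables()
gradients = tf.gradients(ys=cost, xs=variables)
# Rescale all gradients jointly so that their global L2 norm is at most clip_norm.
clipped_gradients, _ = tf.clip_by_global_norm(gradients, clip_norm=self.clip_norm)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
# self.clip_norm and self.global_step are assumed to be attributes of the enclosing model class.
optimize = optimizer.apply_gradients(grads_and_vars=zip(clipped_gradients, variables), global_step=self.global_step)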
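
To make the clipping step above concrete, here is a small pure-Python sketch of the arithmetic behind global-norm clipping: every gradient is multiplied by clip_norm / max(global_norm, clip_norm), where global_norm is the L2 norm over all gradients taken together. The function below is a stand-in written for illustration on nested lists, not the TensorFlow implementation.

```python
import math


def clip_by_global_norm(gradients, clip_norm):
    """Jointly rescale a list of gradient vectors to a global L2 norm <= clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in gradients for g in grad))
    # The scale is 1.0 when the gradients are already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in gradients]
    return clipped, global_norm


# Two toy gradient vectors with a global norm of sqrt(3^2 + 4^2) = 5.
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for grad in clipped for g in grad))
```

Because all gradients share one scale factor, the update direction is preserved; only its length is capped, which is the property that distinguishes this from clipping each value independently.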

6. Attention

  • To do...

References:
[1] http://ruder.io/deep-learning-nlp-best-practices/
[2] https://r2rt.com/recurrent-neural-networks-in-tensorflow-iii-variable-length-sequences.html

Awesome packages

Chinese

English

About

Text preprocessing, word2vec, sentence embeddings for text similarity, text classification, Chinese word segmentation, Hidden Markov Models, CRFs, named entity recognition, knowledge graphs, dialog systems
