Two sentences are tokenized and encoded by a BERT model. The first describes two kids playing with a green crocodile float in a swimming pool; the second describes two kids pushing an inflatable crocodile around in a pool. BERT takes the tokenized sequences and outputs an encoded representation for every token.
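A minimal sketch of this encoding step using the Hugging Face transformers API (TensorFlow weights; assumes a transformers version recent enough that the tokenizer is directly callable, and the exact return type of the model varies by version):

import tensorflow as tf
from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertModel.from_pretrained('bert-base-uncased')

text1 = "Two kids are playing in a swimming pool with a green colored crocodile float."
text2 = "Two kids push an inflatable crocodile around in a pool."

# Encode both sentences as a single [CLS] A [SEP] B [SEP] sequence.
inputs = tokenizer(text1, text2, return_tensors='tf')
outputs = model(inputs)

# One vector per token: shape (1, n_seq, 768) for bert-base-uncased.
print(outputs[0].shape)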
9. The GLUE benchmark tasks and their metrics
• The Corpus of Linguistic Acceptability (CoLA): Matthews corr.
• The Stanford Sentiment Treebank (SST-2): Accuracy
• Microsoft Research Paraphrase Corpus (MRPC): F1 / Accuracy
• Semantic Textual Similarity Benchmark (STS-B): Pearson / Spearman corr.
• Quora Question Pairs (QQP): F1 / Accuracy
• MultiNLI Matched (MNLI-m): Accuracy. Premise/hypothesis pairs labeled {contradiction, entailment, neutral}; the test genres match the training genres.
• MultiNLI Mismatched (MNLI-mm): Accuracy. Same labels; the test genres differ from the training genres.
• Question NLI (QNLI): Accuracy
• Recognizing Textual Entailment (RTE): Accuracy
• Winograd NLI (WNLI): Accuracy
• Diagnostics Main: Matthews corr.
An MNLI-style example is shown below.
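For concreteness, both MNLI tasks classify a premise/hypothesis pair into one of the three labels. An illustrative example, hand-made from the sentences used throughout this deck (not taken from the actual dataset):

# A hand-made MNLI-style example, not from the real dataset.
example = {
    "premise": "Two kids are playing in a swimming pool with a green colored crocodile float.",
    "hypothesis": "Two kids push an inflatable crocodile around in a pool.",
    "label": "neutral",  # illustrative; one of: contradiction, entailment, neutral
}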
14. Tokenizing the two input sentences
text 1
position_id 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
text two kids are playing in a swimming pool with a green colored crocodile float .
ids 2048 4268 2024 2652 1999 1037 5742 4770 2007 1037 2665 6910 21843 14257 1012
text 2
position_id 0 1 2 3 4 5 6 7 8 9 10 11 12
text two kids push an in ##fl ##atable crocodile around in a pool .
ids 2048 4268 5245 2019 1999 10258 27892 21843 2105 1999 1037 4770 1012
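Note how WordPiece splits "inflatable", which is not in the vocabulary, into subword pieces. This can be checked directly with the same tokenizer (expected outputs shown as comments):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
print(tokenizer.tokenize("inflatable"))                              # ['in', '##fl', '##atable']
print(tokenizer.convert_tokens_to_ids(['in', '##fl', '##atable']))   # [1999, 10258, 27892]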
15. The tokenized sentence pair
['[CLS]', 'two', 'kids', 'are', 'playing', 'in', 'a', 'swimming', 'pool', 'with', 'a',
'green', 'colored', 'crocodile', 'float', '.', '[SEP]', 'two', 'kids', 'push', 'an',
'in', '##fl', '##atable', 'crocodile', 'around', 'in', 'a', 'pool', '.', '[SEP]']
16. Pretraining: Masked Language Model
[Figure: MLM pretraining. In the input sequence [CLS] two kids are playing ... [SEP] two kids ... [SEP], a random 15% of the token positions are selected. Each selected token is replaced with [MASK] or with a random word (e.g. "dog"), or left unchanged. The model outputs a prediction (pred1 ... pred4) at each selected position, e.g. "playing", "kids", "two", and each prediction is scored against the original token with a cross-entropy loss.]
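A minimal sketch of this masking rule (the 80% / 10% / 10% split from the BERT paper; the helper names are mine, not the authors' code):

import random

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """BERT-style MLM masking: returns (masked tokens, {position: original token})."""
    masked, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if tok in ('[CLS]', '[SEP]') or random.random() > mask_prob:
            continue
        targets[i] = tok                      # label to predict at position i
        r = random.random()
        if r < 0.8:
            masked[i] = '[MASK]'              # 80%: replace with [MASK]
        elif r < 0.9:
            masked[i] = random.choice(vocab)  # 10%: replace with a random word
        # remaining 10%: keep the original token unchanged
    return masked, targets

tokens = ['[CLS]', 'two', 'kids', 'are', 'playing', '[SEP]']
print(mask_tokens(tokens, vocab=['dog', 'pool', 'green']))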
17. Pretraining: Next Sentence Prediction
[Figure: NSP pretraining. The pair is packed as [CLS] two kids are playing ... [SEP] two kids ... [SEP]; the encoded [CLS] vector feeds a binary classifier (pred1) that predicts IsNext vs. NotNext, trained with a cross-entropy loss.]
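A sketch of how such training pairs could be drawn (an illustrative helper, not the authors' pipeline): half the time sentence B really follows A in the document, otherwise B is sampled at random from the corpus.

import random

def make_nsp_pair(doc_sentences, corpus_sentences):
    """Build one NSP example: (sentence_a, sentence_b, label)."""
    i = random.randrange(len(doc_sentences) - 1)
    a = doc_sentences[i]
    if random.random() < 0.5:
        return a, doc_sentences[i + 1], 'IsNext'          # the actual next sentence
    return a, random.choice(corpus_sentences), 'NotNext'  # a random sentence

doc = ["Two kids are playing in a pool.", "They push an inflatable crocodile around."]
corpus = ["The weather is nice today.", "BERT encodes token sequences."]
print(make_nsp_pair(doc, corpus))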
21. Running the sentence pair through TFBertForSequenceClassification
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
text1 = "[CLS] Two kids are playing in a swimming pool with a green colored crocodile float. [SEP]"
text2 = "Two kids push an inflatable crocodile around in a pool. [SEP]"
tokenized_text = tokenizer.tokenize(text1 + " " + text2)
print(tokenized_text)
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
pos_sep = tokenized_text.index("[SEP]") + 1  # position just after the 1st sentence's [SEP]
segments_ids = [0]*pos_sep + [1]*(len(indexed_tokens)-pos_sep)
tokens_tensor = tf.Variable([indexed_tokens])
segments_tensors = tf.Variable([segments_ids])
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased')
model.summary()
outputs = model(tokens_tensor, token_type_ids=segments_tensors)
print(outputs)
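Note that the sequence-classification head on top of 'bert-base-uncased' is freshly initialized here, so the returned logits are meaningless until the model is fine-tuned. To read them as class probabilities one would apply a softmax, roughly:

probs = tf.nn.softmax(outputs[0], axis=-1)  # outputs[0] holds the classification logits
print(probs)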
27. Inside the model: TFBertEmbeddings
TFBertForSequenceClassification
  TFBertMainLayer
    TFBertEmbeddings
    TFBertEncoder
      TFBertLayer (stacked 12x in bert-base)
        TFBertAttention
          TFBertSelfAttention
          TFBertSelfOutput
        TFBertIntermediate
        TFBertOutput
    TFBertPooler
Inputs: input_ids, position_ids, token_type_ids (each of shape (n_seq,)), or precomputed inputs_embeds of shape (n_seq, dim).
[Figure: TFBertEmbeddings. input_ids index a word_embeddings weight of shape (vocab_size, dim) via gather; position_ids index a position_embeddings Embedding (max_position_embeddings=512); token_type_ids index a token_type_embeddings Embedding of shape (type_vocab_size, dim). The three (n_seq, dim) results are summed, then LayerNormalization and Dropout yield hidden_states. Example sequence: [CLS] two kids are playing in a swimming pool ... [SEP] [PAD] ... [PAD], with hidden_size=768.]
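A simplified Keras sketch of this block (dimensions of bert-base-uncased; a re-implementation for illustration, not the transformers source):

import tensorflow as tf

vocab_size, type_vocab_size, max_pos, dim = 30522, 2, 512, 768

word_emb = tf.keras.layers.Embedding(vocab_size, dim)       # token lookup
pos_emb = tf.keras.layers.Embedding(max_pos, dim)           # position lookup
type_emb = tf.keras.layers.Embedding(type_vocab_size, dim)  # segment lookup
layer_norm = tf.keras.layers.LayerNormalization(epsilon=1e-12)
dropout = tf.keras.layers.Dropout(0.1)

def embed(input_ids, token_type_ids):
    position_ids = tf.range(tf.shape(input_ids)[-1])  # 0, 1, ..., n_seq-1
    x = (word_emb(input_ids)          # (n_seq, dim)
         + pos_emb(position_ids)      # (n_seq, dim)
         + type_emb(token_type_ids))  # (n_seq, dim)
    return dropout(layer_norm(x))     # hidden_states: (n_seq, dim)

ids = tf.constant([101, 2048, 4268, 102])    # [CLS] two kids [SEP]
print(embed(ids, tf.zeros_like(ids)).shape)  # (4, 768)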
30. Inside the model: TFBertSelfAttention
(The module hierarchy is the same as on slide 27; the focus here is TFBertSelfAttention.)
[Figure: multi-head self-attention. hidden_states of shape (n_seq, dim) pass through three Dense layers producing query (Q), key (K), and value (V), each reshaped from (n_seq, dim) to (n_head, n_seq, dim/n_head). The scaled dot product Q K^T / sqrt(dim/n_head) gives scores of shape (n_head, n_seq, n_seq); the attention_mask of shape (n_seq,) is added so that [PAD] positions receive large negative scores, then softmax and Dropout are applied. Multiplying by V yields (n_head, n_seq, dim/n_head), and a final Reshape restores (n_seq, dim).]
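A simplified sketch of one such attention block with these shapes (Dropout omitted; a re-implementation for illustration, not the transformers source):

import tensorflow as tf

dim, n_head = 768, 12
d_head = dim // n_head
wq = tf.keras.layers.Dense(dim)  # query projection
wk = tf.keras.layers.Dense(dim)  # key projection
wv = tf.keras.layers.Dense(dim)  # value projection

def split_heads(x):
    # (n_seq, dim) -> (n_head, n_seq, dim/n_head)
    return tf.transpose(tf.reshape(x, (-1, n_head, d_head)), (1, 0, 2))

def self_attention(hidden_states, attention_mask):
    q = split_heads(wq(hidden_states))
    k = split_heads(wk(hidden_states))
    v = split_heads(wv(hidden_states))
    # scaled dot-product scores: (n_head, n_seq, n_seq)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(float(d_head))
    # push [PAD] key positions to large negative values before the softmax
    scores += (1.0 - attention_mask) * -1e9
    context = tf.matmul(tf.nn.softmax(scores, axis=-1), v)  # (n_head, n_seq, dim/n_head)
    # merge the heads back: (n_seq, dim)
    return tf.reshape(tf.transpose(context, (1, 0, 2)), (-1, dim))

x = tf.random.normal((5, dim))            # 5 token vectors
mask = tf.constant([1., 1., 1., 1., 0.])  # last position is [PAD]
print(self_attention(x, mask).shape)      # (5, 768)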
35. References
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
https://arxiv.org/abs/1810.04805
Transformers
https://huggingface.co/transformers
Third-party pre-trained models
https://huggingface.co/models