These are my notes from trying to run BERT pretraining by following an article on vasteelab.com. As of December 2021 the steps described in that article no longer work as written, mainly because the tensorflow version has moved on. I am organizing the fixes I made here in order to share them.
- Environment
- Installing tensorflow
- Getting "Generating the file for running the fill-in-the-blank task" to work
- Getting "Running pretraining" to work
- References
Environment
I tried this on a Windows machine with Anaconda installed (the conda info output below reports Ubuntu 20.04 under WSL2).
Anaconda
$ conda info

     active environment : base
    active env location : /home/nakajo/anaconda3
            shell level : 1
       user config file : /home/nakajo/.condarc
 populated config files :
          conda version : 4.10.3
    conda-build version : 3.21.4
         python version : 3.8.8.final.0
       virtual packages : __linux=5.10.60.1=0
                          __glibc=2.31=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/nakajo/anaconda3  (writable)
      conda av data dir : /home/nakajo/anaconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/nakajo/anaconda3/pkgs
                          /home/nakajo/.conda/pkgs
       envs directories : /home/nakajo/anaconda3/envs
                          /home/nakajo/.conda/envs
               platform : linux-64
             user-agent : conda/4.10.3 requests/2.25.1 CPython/3.8.8 Linux/5.10.60.1-microsoft-standard-WSL2 ubuntu/20.04.2 glibc/2.31
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False
Installing tensorflow
This time I use the Tensorflow that was installed together with Anaconda.
Getting "Generating the file for running the fill-in-the-blank task" to work
Errors encountered
When I tried to run "Generating the file for running the fill-in-the-blank task", the following error occurred.
$ python create_pretraining_data.py --input_file=./sample_text.txt --output_file=/tmp/tf_examples.tfrecord --vocab_file=./vocab.txt --do_lower_case=True --max_seq_length=128 --max_predictions_per_seq=20 --masked_lm_prob=0.15 --random_seed=12345 --dupe_factor=5
Traceback (most recent call last):
  File "create_pretraining_data.py", line 26, in <module>
    flags = tf.flags
AttributeError: module 'tensorflow' has no attribute 'flags'
In addition, the following error occurred.
$ python create_pretraining_data.py --input_file=./sample_text.txt --output_file=/tmp/tf_examples.tfrecord --vocab_file=./vocab.txt --do_lower_case=True --max_seq_length=128 --max_predictions_per_seq=20 --masked_lm_prob=0.15 --random_seed=12345 --dupe_factor=5
Traceback (most recent call last):
  File "create_pretraining_data.py", line 469, in <module>
    tf.app.run()
  File "/home/nakajo/anaconda3/lib/python3.8/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/nakajo/anaconda3/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/nakajo/anaconda3/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "create_pretraining_data.py", line 439, in main
    tokenizer = tokenization.FullTokenizer(
  File "/home/nakajo/work/pythonProject/bert/tokenization.py", line 165, in __init__
    self.vocab = load_vocab(vocab_file)
  File "/home/nakajo/work/pythonProject/bert/tokenization.py", line 125, in load_vocab
    with tf.gfile.GFile(vocab_file, "r") as reader:
AttributeError: module 'tensorflow' has no attribute 'gfile'
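How to fix it
The cause is that the original BERT code was written for tensorflow 1.x, while tensorflow 2.x is what is installed here. Several attributes, including the tf.flags and tf.gfile seen in the tracebacks, were removed from the top-level module in the version upgrade.
tensorflow 2.x can apparently be switched back to the 1.x interface, so I tried that and it worked. Specifically, I changed the following two places.
The code after the change is shown below.
Line 24 of create_pretraining_data.py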
import tensorflow.compat.v1 as tf
Line 25 of tokenization.py
import tensorflow.compat.v1 as tf
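The same one-line change has to be made in every file that imports tensorflow, so it can also be scripted. The following is only a minimal sketch and not part of the BERT repository: the helper name `use_compat_v1` is hypothetical, and it assumes the unmodified files contain the line `import tensorflow as tf` exactly.

```python
import re

# Matches a plain TF1-style import line, with optional indentation.
# [ \t]* is used instead of \s* so that newlines are never consumed.
_TF1_IMPORT = re.compile(r"^([ \t]*)import tensorflow as tf[ \t]*$", re.MULTILINE)


def use_compat_v1(source: str) -> str:
    """Return `source` with `import tensorflow as tf` switched to compat.v1."""
    return _TF1_IMPORT.sub(r"\1import tensorflow.compat.v1 as tf", source)


if __name__ == "__main__":
    # Hypothetical excerpt standing in for the top of a BERT script.
    sample = "import collections\nimport tensorflow as tf\n\nflags = tf.flags\n"
    print(use_compat_v1(sample))
```

Applying the function a second time leaves the text unchanged, because the rewritten line no longer matches the pattern.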
Getting "Running pretraining" to work
Next I tried "Running pretraining", but errors kept occurring.
Errors encountered
$ python run_pretraining.py --input_file=/tmp/tf_examples.tfrecord --output_dir=./tmp/pretraining_output --do_train=True --do_eval=True --bert_config_file=./bert_config.json --init_checkpoint=./bert_model.ckpt --train_batch_size=32 --max_seq_length=128 --max_predictions_per_seq=20 --num_train_steps=20 --num_warmup_steps=10 --learning_rate=2e-5
Traceback (most recent call last):
  File "run_pretraining.py", line 26, in <module>
    flags = tf.flags
AttributeError: module 'tensorflow' has no attribute 'flags'
This appears to be the same kind of error as before.
After that, the following new error occurred.
$ python run_pretraining.py --input_file=/tmp/tf_examples.tfrecord --output_dir=./tmp/pretraining_output --do_train=True --do_eval=True --bert_config_file=./bert_config.json --init_checkpoint=./bert_model.ckpt --train_batch_size=32 --max_seq_length=128 --max_predictions_per_seq=20 --num_train_steps=20 --num_warmup_steps=10 --learning_rate=2e-5
INFO:tensorflow:*** Input Files ***
I1218 03:55:13.127805 140009368016704 run_pretraining.py:420] *** Input Files ***
INFO:tensorflow:  /tmp/tf_examples.tfrecord
I1218 03:55:13.128413 140009368016704 run_pretraining.py:422]   /tmp/tf_examples.tfrecord
Traceback (most recent call last):
  File "run_pretraining.py", line 493, in <module>
    tf.app.run()
  File "/home/nakajo/anaconda3/lib/python3.8/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/nakajo/anaconda3/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/nakajo/anaconda3/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "run_pretraining.py", line 429, in main
    is_per_host = tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2
AttributeError: module 'tensorflow.compat.v1' has no attribute 'contrib'
  File "/home/nakajo/work/pythonProject/bert/modeling.py", line 184, in __init__
    self.embedding_output = embedding_postprocessor(
  File "/home/nakajo/work/pythonProject/bert/modeling.py", line 520, in embedding_postprocessor
    output = layer_norm_and_dropout(output, dropout_prob)
  File "/home/nakajo/work/pythonProject/bert/modeling.py", line 370, in layer_norm_and_dropout
    output_tensor = layer_norm(input_tensor, name)
  File "/home/nakajo/work/pythonProject/bert/modeling.py", line 364, in layer_norm
    return tf.contrib.layers.layer_norm(
AttributeError: module 'tensorflow.compat.v1' has no attribute 'contrib'
How to fix it
The fixes basically follow the livedoor Blog post "研究開発:BERTオフィシャルコードをtensorflow2で実行する方法" ("Research and development: how to run the official BERT code on tensorflow2").
The changes I made are listed below.
First, as before, enable the tensorflow v1 interface.
In run_pretraining.py, modeling.py, and tokenization.py, change the line that imports tensorflow as follows.
import tensorflow.compat.v1 as tf
Next, deal with the "AttributeError: module 'tensorflow.compat.v1' has no attribute 'contrib'" error.
In run_pretraining.py, replace every occurrence of "tf.contrib." with "tf.estimator." in one pass.
Next, deal with "AttributeError: module 'tensorflow_estimator.python.estimator.api._v1.estimator' has no attribute 'data'".
Change lines 365 and 381 of run_pretraining.py as follows.
Line 365
tf.data.experimental.parallel_interleave(
Line 381
tf.data.experimental.map_and_batch(
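The two steps above interact: the blanket replacement also rewrites the dataset helpers into a tf.estimator.data namespace that does not exist, which is what the 'data' error reports, and the second step repairs them. The sketch below illustrates that chain on plain strings. It is an illustration under assumptions: the function name `port_contrib_calls` is hypothetical, the first sample line comes from the traceback above, and the other two are assumed to be the original form of lines 365 and 381.

```python
def port_contrib_calls(source: str) -> str:
    """Apply the two text substitutions described above, in order."""
    # Step 1: blanket replacement of the removed contrib namespace.
    source = source.replace("tf.contrib.", "tf.estimator.")
    # Step 2: the dataset helpers are not under tf.estimator,
    # so point them at tf.data.experimental instead.
    source = source.replace("tf.estimator.data.", "tf.data.experimental.")
    return source


if __name__ == "__main__":
    sample = (
        "is_per_host = tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2\n"
        "tf.contrib.data.parallel_interleave(\n"  # assumed original line 365
        "tf.contrib.data.map_and_batch(\n"        # assumed original line 381
    )
    print(port_contrib_calls(sample))
```

Doing the second substitution on the exact string "tf.estimator.data." keeps the other tf.estimator references, such as the tpu ones, untouched.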
Finally, deal with the following error.
  File "/home/nakajo/work/pythonProject/bert/modeling.py", line 364, in layer_norm
    return tf.contrib.layers.layer_norm(
AttributeError: module 'tensorflow.compat.v1' has no attribute 'contrib'
Change lines 362 to 365 of modeling.py as follows.
def layer_norm(input_tensor, name=None):
  """Run layer normalization on the last dimension of the tensor."""
  # tf.contrib is gone in tensorflow 2.x, so use the Keras layer instead.
  layer_norma = tf.keras.layers.LayerNormalization(axis=-1)
  return layer_norma(input_tensor)
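To make clear what "layer normalization on the last dimension" computes, here is a small pure-Python sketch. It is an illustration only, not the Keras implementation: the function name `layer_norm_last_axis` is hypothetical, the learned scale and offset are fixed at their initial values (gamma = 1, beta = 0), and `epsilon` is an assumed parameter whose default was picked for this example.

```python
import math
from typing import List


def layer_norm_last_axis(rows: List[List[float]], epsilon: float = 1e-3) -> List[List[float]]:
    """Normalize each row (the last axis) to zero mean and roughly unit variance."""
    normalized = []
    for row in rows:
        mean = sum(row) / len(row)
        variance = sum((x - mean) ** 2 for x in row) / len(row)
        # epsilon keeps the division stable when a row has zero variance.
        scale = 1.0 / math.sqrt(variance + epsilon)
        normalized.append([(x - mean) * scale for x in row])
    return normalized


if __name__ == "__main__":
    # Each inner list plays the role of one hidden vector.
    print(layer_norm_last_axis([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]))
```

Each row is normalized independently of the others, so a constant row maps to all zeros regardless of its value.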