- Introduction
- Execution environment
- Environment setup
- Installing WSL2
- Installing the CUDA toolkit
- Installing cuDNN
- Installing Anaconda
- Creating a conda environment
- Installing PyTorch
- Installing VSCode
- Placing the files
- Running the sample code
- Sample code execution error
- On the execution time of rinna_test.py
- References
Introduction
This is a continuation of the previous article.
Having given up on building the environment directly on Windows, I decided to set it up on WSL2 instead. This post describes that setup work and the results of running the rinna sample code.
Execution environment
- PC
- WSL-related components
- General-purpose language model: rinna/bilingual-gpt-neox-4b
Environment setup
The steps are described below in the order I performed them.
Installing WSL2
WSL2 had already been installed in the past, but for some reason the network was slow and running a single command took tens of seconds, so I uninstalled and reinstalled it.
I referred to the site below.
WSL2 のインストールとアンインストール #初心者 - Qiita
Installing the CUDA toolkit
I ran the install commands from the official site.
CUDA Toolkit 12.6 Update 1 Downloads | NVIDIA Developer
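For reference, the WSL-Ubuntu local-deb route follows the shape below. The exact URLs, file names, and version strings must be taken from the download page above, so treat this as a sketch rather than a copy-paste recipe:

```
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.6.1/local_installers/cuda-repo-wsl-ubuntu-12-6-local_12.6.1-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-6-local_12.6.1-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6
```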
Installing cuDNN
I ran the install commands from the official site.
cuDNN 9.3.0 Downloads | NVIDIA Developer
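The cuDNN install mirrors the CUDA flow: download the local repo package named on the page, register it with dpkg, copy the keyring, then install via apt. The file names below are only what the 9.3.0 Ubuntu page should produce, so verify them against the page above:

```
wget https://developer.download.nvidia.com/compute/cudnn/9.3.0/local_installers/cudnn-local-repo-ubuntu2204-9.3.0_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-9.3.0_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-9.3.0/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn
```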
Verifying the installation

```
nvidia-smi
```

Running the above command returned output, so this part is probably fine.
```
nvcc -V
```

Running this command produced a "Command not found" error, so I appended environment-variable settings to ~/.bashrc.
~/.bashrc

```
export PATH=/usr/local/cuda:/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```
After saving ~/.bashrc, I restarted WSL, which completed the environment-variable setup.
Running the nvcc command again then returned its output, so all good.
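Incidentally, a full WSL restart is not strictly required; reloading the file into the current shell also applies the new variables:

```
source ~/.bashrc
nvcc -V
```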
Installing Anaconda
I wanted to work inside a conda environment on WSL as well, so I installed Anaconda.
I followed this guide for the installation steps.
【備忘録】WSL2にanaconda3をインストールしてcondaを使えるようにするまで #Python - Qiita
Creating a conda environment
I created a conda virtual environment; the detailed steps are omitted here.
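For completeness, a minimal sketch of what that typically looks like (the environment name and Python version here are illustrative, not taken from the original setup):

```
conda create -n rinna python=3.10
conda activate rinna
```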
Installing PyTorch
I copied the install command from the official site and ran it.
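At the time, the selector on the PyTorch site produced a command along these lines for Linux + conda + CUDA 12.x; take the exact one from the site. The sample code additionally needs transformers, sentencepiece (for the slow T5 tokenizer), and accelerate (for the device_map option used later):

```
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers sentencepiece accelerate
```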
Installing VSCode
I wanted to consolidate writing source code, working in the terminal, pushing to GitHub, and so on into VSCode, so I set it up. That said, it appeared to be available on the Ubuntu side already: just running the `code` command launched VSCode.
After VSCode starts, selecting Connect to WSL from the icon at the bottom left of the window is all that's needed.
Placing the files
I stored the files in the directory layout below.
- Under bilingual-gpt-neox-4b
  - the full set of rinna model files, downloaded manually (one way to fetch them is sketched after the tree below)
- rinna_test.py
  - the Python file containing the sample code
```
.
├── bilingual-gpt-neox-4b
│   ├── README.md
│   ├── config.json
│   ├── gitattributes
│   ├── model.safetensors
│   ├── pytorch_model.bin
│   ├── rinna.png
│   ├── spiece.model
│   ├── spiece.vocab
│   └── tokenizer_config.json
└── rinna_test.py
```
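The model files themselves come from the Hugging Face repository linked in the next section. One way to fetch the full set is the huggingface_hub CLI (a git-LFS clone works too); the commands below are a sketch assuming the CLI is installed:

```
pip install "huggingface_hub[cli]"
huggingface-cli download rinna/bilingual-gpt-neox-4b --local-dir bilingual-gpt-neox-4b
```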
Running the sample code
I ran the sample code published at the page below, with one small change.
rinna/bilingual-gpt-neox-4b · Hugging Face
I changed the part that specifies the model path so that it references the model stored in the bilingual-gpt-neox-4b folder (the AutoModelForCausalLM.from_pretrained call receives the same local path, as the full script below shows).
The changed code

```python
tokenizer = AutoTokenizer.from_pretrained("./bilingual-gpt-neox-4b", use_fast=False)
```
Sample code execution error

```
Killed
```

This was an error I had never hit before: there is no information beyond the word Killed.
From what I could find, Killed is printed when the process runs out of memory. To confirm the details, I ran the dmesg command.
Command executed

```
dmesg -T
```
Command output

```
[Sun Sep 1 14:48:46 2024] Out of memory: Killed process 8833 (pt_main_thread) total-vm:52131812kB, anon-rss:6358652kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:23936kB oom_score_adj:0
```
Sure enough, right in the time window when I ran the sample code, the process had been killed due to Out of Memory. So with the sample code as it stood, my environment ran out of memory and the process got killed.
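When the dmesg log is long, filtering for the OOM killer's messages gets straight to the relevant lines:

```
dmesg -T | grep -i -E "out of memory|oom"
```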
Dealing with the OOM Killer, and the result
I modified the sample code so that it would no longer be killed. Specifically, I added the torch_dtype and device_map settings.
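The effect of these two settings is easy to estimate: in float32 every parameter takes 4 bytes, so a model with roughly 4 billion parameters needs on the order of 16 GB for the weights alone, while loading in float16 halves that to roughly 8 GB. device_map='auto' then places whatever still does not fit in VRAM into CPU RAM instead of aborting.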
rinna_test.py after the fix

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Record memory usage history (uncomment to profile)
# torch.cuda.memory._record_memory_history()

# Point at the locally stored rinna model
tokenizer = AutoTokenizer.from_pretrained("./bilingual-gpt-neox-4b", use_fast=False)

# Before the change -> Killed
# model = AutoModelForCausalLM.from_pretrained("./bilingual-gpt-neox-4b")
# After the change -> OK
model = AutoModelForCausalLM.from_pretrained("./bilingual-gpt-neox-4b", torch_dtype=torch.float16, device_map='auto')

# if torch.cuda.is_available():
#     model = model.to("cuda")

text = "西田幾多郎は、"
token_ids = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=100,
        min_new_tokens=100,
        do_sample=True,
        temperature=1.0,
        top_p=0.95,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

output = tokenizer.decode(output_ids.tolist()[0])
print(output)

# Dump a snapshot of memory usage (uncomment to profile)
# torch.cuda.memory._dump_snapshot("my_snapshot.pickle")
```
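Note that device_map='auto' relies on the accelerate package: transformers uses it to split the checkpoint across the available devices and offload the remainder to CPU RAM, which is what the "offloaded to the cpu" message in the output below is reporting.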
Result of running rinna_test.py
A message about T5Tokenizer is emitted, but the program terminates normally and prints the expected kind of output.

```
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Some parameters are on the meta device device because they were offloaded to the cpu.
西田幾多郎は、明治初期の物理学の第一人者であり、…
```
On the execution time of rinna_test.py
Given the Out of Memory kill, my machine is evidently not generously specced for running this sample code. For reference, here are the execution times.
I measured with time.perf_counter(). I ran the script 5 times, and every run took close to 30 seconds.
Run 1: 29.19523430600384
Run 2: 27.53580800799682
Run 3: 28.29200663699885
Run 4: 28.88365111400344
Run 5: 28.652095464000013
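A minimal sketch of how such a measurement can be wrapped around the generation step; the loop and print formatting are my assumptions rather than the original script, and the model objects are reused from rinna_test.py:

```python
import time

import torch

# `model`, `tokenizer`, and `token_ids` are assumed to be prepared
# exactly as in rinna_test.py above
for i in range(5):
    start = time.perf_counter()
    with torch.no_grad():
        output_ids = model.generate(
            token_ids.to(model.device),
            max_new_tokens=100,
            min_new_tokens=100,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
        )
    print(f"Run {i + 1}: {time.perf_counter() - start}")
```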
While the program runs, both memory and the GPU are heavily utilized. Perhaps a 16 GB GPU is what this really calls for...
References
WSL2 のインストールとアンインストール #初心者 - Qiita
WSL2 上の PyTorch に GPU を認識させて深層学習環境をつくる
anaconda - CUDA Toolkitインストール時に発生するnvcc missingエラーについて - pytorch
【備忘録】WSL2にanaconda3をインストールしてcondaを使えるようにするまで #Python - Qiita
linux - PyTorch code stops with message "Killed". What killed it? - Stack Overflow