Trying GPUDirect SQL on NFS-over-RDMA
The title pretty much gives the punchline away, but the other day I was reading the documentation for GPUDirect Storage 1.0, a new feature that NVIDIA released together with CUDA Toolkit 11.4, and came across an interesting passage.
It says that with the new MOFED driver 5.3 or later and a Mellanox Connect-X4/5, NFS-over-RDMA can be combined with GPUDirect Storage to transfer data directly from a remote NFS volume to a local GPU.
14.10. NFS Support with GPUDirect Storage
This section provides information about NFS support with GDS.
14.10.2. Install GPUDirect Storage Support for the NFS Client
Here is some information about installing GDS support for the NFS client.
To install a NFS client with GDS support complete the following steps:
Note: The client must have a Mellanox connect-X4/5 NIC with MLNX_OFED 5.3 or later installed.
:
That is welcome news.
Up to and including PG-Strom v3.0, GPUDirect SQL was supported only on a local NVME-SSD or a remote NVME-oF volume (experimental) formatted with the Ext4 filesystem, which caused two problems:
- Expanding the storage step by step was difficult.
- Because Ext4 is not a shared filesystem, multiple nodes could not write to it.
NFS itself is not a particularly fast filesystem. However, if storage can be separated from the DB/GPU server and written from multiple nodes, then, for example, log data collected by an IoT/M2M workload only needs to be placed on the NFS server, and the DB/GPU server can read it from there and analyze it at GPUDirect SQL processing speed.
Conclusion: it works pretty well
The setup procedure is long, so I will leave it for later. To summarize the SSBM (Star Schema Benchmark) results first: my impression is that this works pretty well.
The measurement environment is shown in the figure below. This time a 1U server, SYS-1019GP-TT, served as the NFS server. It has four NVME-SSDs (Intel DC P4510 [1.0TB; U.2]) attached through an enclosure, plus a Mellanox Connect-X5 100Gb NIC.
The GPU/DB server is a 4U SYS-4029GP-TRT. On this machine I set up one pair consisting of a GPU and the Connect-X5 under the same PCI-E switch, and another pair consisting of a GPU and four NVME-SSDs (also DC P4510). The latter pair is for performance comparison against local NVME-SSD.
The NFS server exports a volume striped across the four SSDs with md-raid0 to the NFS client, and the NFS client mounts it in NFS-over-RDMA mode over a directly connected 100Gb network*1.
On the GPU/DB server, the storage layout is as follows.
`/opt/nvme0` is the mount point of a volume striped across four local NVME-SSDs with md-raid0, and `/opt/nvme1` shows the NFS volume of the 1U server (192.168.80.106).
    [kaigai@kujira ~]$ df -h
    Filesystem                   Size  Used Avail Use% Mounted on
    devtmpfs                      94G     0   94G   0% /dev
    tmpfs                         94G  257M   94G   1% /dev/shm
    tmpfs                         94G   19M   94G   1% /run
    tmpfs                         94G     0   94G   0% /sys/fs/cgroup
    /dev/mapper/vg_disk-root     246G   15G  218G   7% /
    /dev/nvme0n1p1               1.8T   35G  1.7T   2% /opt
    /dev/md0p1                   3.6T  1.4T  2.1T  41% /opt/nvme0
    /dev/sda2                    976M  189M  721M  21% /boot
    /dev/mapper/vg_disk-home     393G   24G  349G   7% /home
    /dev/sda1                    599M  6.9M  592M   2% /boot/efi
    tmpfs                         19G     0   19G   0% /run/user/1000
    192.168.80.106:/mnt/nfsroot  2.0T  1.2T  697G  64% /opt/nvme1
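The execution speed of the SSBM queries, which include references to the lineorder table stored on each of these volumes, is shown below.
For readability, the results are expressed as query processing throughput, computed as (total DB size) ÷ (query response time).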
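As a small illustration of that metric, here is a minimal sketch in shell. The function name `query_throughput` and the figures passed to it are hypothetical and are not measurements from this article.

```shell
# Query throughput as (total DB size) / (query response time).
# Arguments: DB size in GB, elapsed time in seconds. Prints GB/s with two decimals.
query_throughput() {
    awk -v size_gb="$1" -v secs="$2" 'BEGIN {
        if (secs <= 0) { print "elapsed time must be positive" > "/dev/stderr"; exit 1 }
        printf "%.2f\n", size_gb / secs
    }'
}

# Hypothetical figures: a 100 GB database scanned in 12.5 seconds.
query_throughput 100 12.5
```

Because the DB size is fixed across runs, this number moves inversely with the response time, which makes it easy to compare against the raw read rate of the storage.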
As you can see, NFS-over-RDMA is roughly 10% slower*2 than local NVME-SSD. Put differently, for a penalty of only about 10% you gain storage scalability and remote access.
Looking at the read rate from storage during query execution, the 100Gb network delivers a little over 8.0GB/s, which is decent performance.
Note that with local NVME-SSD the read rate suddenly rises in the second half, up to about 10.0GB/s. At this point I have no explanation for that...
NFS-over-RDMA setup procedure
For the NFS-over-RDMA setup I referred to the following blog post... or rather, followed it almost verbatim.
https://community.mellanox.com/s/article/howto-configure-nfs-over-rdma--roce-x
The software stack is roughly as follows:
- CentOS 8.3 (kernel-4.18.0-240.22.1.el8_3.x86_64)
- CUDA Toolkit 11.4 (NVIDIA Driver R470.42.01)
- MOFED 5.3-1.0.0.1 (RHEL8.3; x86_64)
- PostgreSQL v13.3 (PG-Strom v3.0-3)
Installing the MOFED driver
First, download the latest MOFED driver from the Mellanox site.
Selecting [Version]->[OS Distribution]->[OS Distribution Version]->[Architecture] in turn displays both a tgz package containing the binary packages and a tgz package of the source code. Download both; the source code will in fact be needed later.
Once the tgz files are downloaded, install the driver as described in the GPUDirect Storage documentation.
If packages are missing along the way, the install script suggests a `dnf install ...` command; run it as suggested, and the MOFED driver installation should go through.
    $ sudo ./mlnxofedinstall --with-nvmf --with-nfsrdma --enable-gds --add-kernel-support
    Note: This program will create MLNX_OFED_LINUX TGZ for rhel8.3 under /tmp/MLNX_OFED_LINUX-5.3-1.0.0.1-4.18.0-240.22.1.el8_3.x86_64 directory.
    See log file /tmp/MLNX_OFED_LINUX-5.3-1.0.0.1-4.18.0-240.22.1.el8_3.x86_64/mlnx_iso.225746_logs/mlnx_ofed_iso.225746.log
    Checking if all needed packages are installed...
    Building MLNX_OFED_LINUX RPMS . Please wait...
      :
    <snip>
      :
    $ sudo dracut -f
    $ sudo shutdown -r now
Do this on both the NFS server and the NFS client, then reboot the systems.
NFS server configuration
On the 1U server (SYS-1019GP-TT), an md-raid0 volume that bundles four local NVME-SSDs is mounted at `/mnt/nfsroot`.
Export it as an NFS-over-RDMA volume with the following steps.
1. IP address and other network settings
I simply used 192.168.80.0/24 as the network for the direct connection.
I statically assigned 192.168.80.106/24 to the Connect-X5 device and brought the NIC up with MTU=9000.
2. Write /etc/exports. This configuration gives no thought to security.
    # cat /etc/exports
    /mnt/nfsroot *(rw,async,insecure,no_root_squash)
3. Load the RDMA transport kernel module. This module is provided by the MOFED driver.
    # modprobe svcrdma
    # modinfo svcrdma
    filename:       /lib/modules/4.18.0-240.22.1.el8_3.x86_64/extra/mlnx-nfsrdma/svcrdma.ko
    version:        2.0.1
    license:        Dual BSD/GPL
    description:    svcrdma dummy kernel module
    author:         Alaa Hleihel
    rhelversion:    8.3
    srcversion:     F7C50654667EBC6F832D608
    depends:        mlx_compat
    name:           svcrdma
    vermagic:       4.18.0-240.22.1.el8_3.x86_64 SMP mod_unload modversions
4. Start the NFS server
    # systemctl start nfs-server
5. Set the port number for RDMA transport. Any port number can be used, but 20049 is said to be the well-known default.
    # echo rdma 20049 > /proc/fs/nfsd/portlist
    # cat /proc/fs/nfsd/portlist
    rdma 20049
    rdma 20049
    tcp 2049
    tcp 2049
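If the server steps above are scripted, the final check can be automated. The following is only a sketch: the helper name `nfsd_has_rdma_port` is hypothetical, and it assumes the port list keeps the "transport port" layout shown in the output above.

```shell
# Succeeds if the nfsd port list on stdin has an RDMA listener on the given port.
# Usage on the NFS server: nfsd_has_rdma_port 20049 < /proc/fs/nfsd/portlist
nfsd_has_rdma_port() {
    awk -v port="$1" '$1 == "rdma" && $2 == port { found = 1 } END { exit !found }'
}
```

The function reads from stdin rather than opening `/proc/fs/nfsd/portlist` itself, so it can be tried against saved output on a machine that is not an NFS server.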
NFS client configuration
1. IP address and other network settings
As on the server side, I statically assigned 192.168.80.108/24 to the Connect-X5 device and brought the NIC up with MTU=9000.
Once the network is up, check connectivity with ping or similar.
    $ ping 192.168.80.106
    PING 192.168.80.106 (192.168.80.106) 56(84) bytes of data.
    64 bytes from 192.168.80.106: icmp_seq=1 ttl=64 time=0.178 ms
    64 bytes from 192.168.80.106: icmp_seq=2 ttl=64 time=0.197 ms
    ^C
2. Load the client-side RDMA transport kernel module. This one is also included in the MOFED driver.
    # modprobe rpcrdma
    # modinfo rpcrdma
    filename:       /lib/modules/4.18.0-240.22.1.el8_3.x86_64/extra/mlnx-nfsrdma/rpcrdma.ko
    alias:          xprtrdma
    alias:          svcrdma
    license:        Dual BSD/GPL
    description:    RPC/RDMA Transport
    author:         Open Grid Computing and Network Appliance, Inc.
    rhelversion:    8.3
    srcversion:     EFB4ED2B09C65AA7DA8D887
    depends:        ib_core,sunrpc,mlx_compat,rdma_cm
    name:           rpcrdma
    vermagic:       4.18.0-240.22.1.el8_3.x86_64 SMP mod_unload modversions
3. Mount the NFS volume exported in the previous section
    # mount -o rdma,port=20049 192.168.80.106:/mnt/nfsroot /opt/nvme1
    # df -h
    Filesystem                   Size  Used Avail Use% Mounted on
    devtmpfs                      94G     0   94G   0% /dev
    tmpfs                         94G  257M   94G   1% /dev/shm
    tmpfs                         94G   19M   94G   1% /run
    tmpfs                         94G     0   94G   0% /sys/fs/cgroup
    /dev/mapper/vg_disk-root     246G   15G  218G   7% /
    /dev/nvme0n1p1               1.8T   35G  1.7T   2% /opt
    /dev/md0p1                   3.6T  1.4T  2.1T  41% /opt/nvme0
    /dev/sda2                    976M  189M  721M  21% /boot
    /dev/mapper/vg_disk-home     393G   24G  349G   7% /home
    /dev/sda1                    599M  6.9M  592M   2% /boot/efi
    tmpfs                         19G     0   19G   0% /run/user/1000
    192.168.80.106:/mnt/nfsroot  2.0T  1.2T  697G  64% /opt/nvme1
Preparation is now complete.
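Since `df -h` does not show which transport a mount uses, it can be worth confirming that the volume really went over RDMA. This is a sketch under an assumption: that the kernel lists the transport as `proto=rdma` in the options column of `/proc/mounts`. The helper name `mounted_over_rdma` is hypothetical.

```shell
# Succeeds if the mount table on stdin lists the given mount point with proto=rdma.
# Usage on the NFS client: mounted_over_rdma /opt/nvme1 < /proc/mounts
mounted_over_rdma() {
    awk -v mp="$1" '$2 == mp && $4 ~ /(^|,)proto=rdma(,|$)/ { found = 1 } END { exit !found }'
}
```

If the check fails for a volume that was mounted with `-o rdma`, inspecting the options column by hand is the next step, since the exact option spelling may differ between kernel versions.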
As a combined connectivity check, let's transfer a huge file.
    # dd if=/opt/nvme1/100GB of=/dev/null iflag=direct bs=32M
    3106+1 records in
    3106+1 records out
    104230305696 bytes (104 GB, 97 GiB) copied, 11.8926 s, 8.8 GB/s
That is fast! It reaches 8.8GB/s.
Without NFS-over-RDMA, on the other hand:
    # mount 192.168.80.106:/mnt/nfsroot /mnt/
    # dd if=/mnt/100GB of=/dev/null iflag=direct bs=32M
    3106+1 records in
    3106+1 records out
    104230305696 bytes (104 GB, 97 GiB) copied, 32.6171 s, 3.2 GB/s
As expected.
Direct read from an NFS volume to the GPU with GPUDirect Storage
Now for the main event: using GPUDirect Storage to read directly from the remote NFS volume into the GPU.
Whether direct reads from an NFS volume via GPUDirect Storage are currently possible can be checked with the `gdscheck` command bundled with CUDA 11.4. But... wait, it says Unsupported.
    # /usr/local/cuda/gds/tools/gdscheck -p
    GDS release version: 1.0.0.82
    nvidia_fs version: 2.7 libcufile version: 2.4
    ============
    ENVIRONMENT:
    ============
    =====================
    DRIVER CONFIGURATION:
    =====================
    NVMe               : Supported
    NVMeOF             : Supported
    SCSI               : Unsupported
    ScaleFlux CSD      : Unsupported
    NVMesh             : Unsupported
    DDN EXAScaler      : Unsupported
    IBM Spectrum Scale : Unsupported
    NFS                : Unsupported
    WekaFS             : Unsupported
    Userspace RDMA     : Unsupported
    --Mellanox PeerDirect : Enabled
    --rdma library        : Not Loaded (libcufile_rdma.so)
    --rdma devices        : Not configured
    --rdma_device_status  : Up: 0 Down: 0
      :
After about two hours of investigation, it appears that the rpcrdma module distributed in binary form with the MOFED driver was built and shipped without the GPUDirect Storage support code enabled.
Looking at the MOFED driver source code, if the module had been built with `CONFIG_GPU_DIRECT_STORAGE=y`, a function pointer table named `nvfs_ops` should appear in `/proc/kallsyms`, but for rpcrdma it does not.
    # grep nvfs_ops /proc/kallsyms
    ffffffffc0c256c0 b nvfs_ops [nvme_rdma]
    ffffffffc00dc718 b nvfs_ops [nvme]
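The grep output is easy to read by eye, but the same check can be written so that a script can tell whether rpcrdma is in the list. This is only a sketch; the function name `modules_defining` is hypothetical, and it assumes the "address type symbol [module]" layout shown above.

```shell
# List the kernel modules that define a given symbol, reading
# /proc/kallsyms-style lines ("address type symbol [module]") from stdin.
modules_defining() {
    awk -v sym="$1" '$3 == sym && $4 ~ /^\[.*\]$/ {
        print substr($4, 2, length($4) - 2)   # strip the surrounding brackets
    }'
}
```

For example, `modules_defining nvfs_ops < /proc/kallsyms | grep -qx rpcrdma` would succeed only when the loaded rpcrdma module carries the `nvfs_ops` table.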
So I decided to build the module myself.
(I have already escalated this to the NVIDIA development team. They will presumably pass it on to Mellanox as well.)
The source code tgz contains SRPMs, so extract the SRPM of `mlnx-nfsrdma`, which contains the rpcrdma module, and build it with `CONFIG_GPU_DIRECT_STORAGE=y` added.
After loading it with insmod, you can see that the `nvfs_ops` symbol now appears for the rpcrdma module as well.
    $ wget http://www.mellanox.com/downloads/ofed/MLNX_OFED-5.3-1.0.0.1/MLNX_OFED_SRC-5.3-1.0.0.1.tgz
    $ tar zxvf MLNX_OFED_SRC-5.3-1.0.0.1.tgz
    $ cd MLNX_OFED_SRC-5.3-1.0.0.1
    $ rpm2cpio SRPMS/mlnx-nfsrdma-5.3-OFED.5.3.0.3.8.1.src.rpm | cpio -idu
    $ tar zxvf mlnx-nfsrdma-5.3.tgz
    $ cd mlnx-nfsrdma-5.3
    $ make CONFIG_GPU_DIRECT_STORAGE=y
    $ sudo insmod rpcrdma.ko
    $ sudo grep nvfs_ops /proc/kallsyms
    ffffffffc319ddc8 b nvfs_ops [rpcrdma]
    ffffffffc0c256c0 b nvfs_ops [nvme_rdma]
    ffffffffc00dc718 b nvfs_ops [nvme]
Running the gdscheck command again in this state:
    $ /usr/local/cuda/gds/tools/gdscheck -p
    GDS release version: 1.0.0.82
    nvidia_fs version: 2.7 libcufile version: 2.4
    ============
    ENVIRONMENT:
    ============
    =====================
    DRIVER CONFIGURATION:
    =====================
    NVMe               : Supported
    NVMeOF             : Supported
    SCSI               : Unsupported
    ScaleFlux CSD      : Unsupported
    NVMesh             : Unsupported
    DDN EXAScaler      : Unsupported
    IBM Spectrum Scale : Unsupported
    NFS                : Supported
    WekaFS             : Unsupported
    Userspace RDMA     : Unsupported
    --Mellanox PeerDirect : Enabled
    --rdma library        : Not Loaded (libcufile_rdma.so)
    --rdma devices        : Not configured
    --rdma_device_status  : Up: 0 Down: 0
      :
Woo-hoo!!!
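To watch for this flip from Unsupported to Supported in a script, the relevant row can be pulled out of the gdscheck output. A minimal sketch, assuming the "name : status" row layout shown above; the function name `gds_driver_status` is hypothetical.

```shell
# Print the status that "gdscheck -p" reports for one driver row, e.g. "NFS".
# Reads the gdscheck output from stdin; rows look like " NFS : Supported".
gds_driver_status() {
    awk -F':' -v name="$1" '{
        key = $1; val = $2
        gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the row name
        gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim padding around the status
        if (key == name) { print val; exit }
    }'
}
```

It would be used as `/usr/local/cuda/gds/tools/gdscheck -p | gds_driver_status NFS`, which prints `Supported` once the rebuilt rpcrdma module is loaded.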
Right away, let's measure the raw I/O performance of GPUDirect Storage.
    $ /usr/local/cuda/gds/tools/gdsio -x 0 -f /mnt/100GB -d 1 -s 96G -i 16M -w 6
    IoType: READ XferType: GPUD Threads: 6 DataSetSize: 63143936/100663296(KiB) IOSize: 16384(KiB) Throughput: 7.642794 GiB/sec, Avg_Latency: 12874.833807 usecs ops: 3854 total_time 7.879154 secs
Woo-hoo!!!
The servers are simply what I had on hand, so the transfer may be bottlenecked at the PCI-E controller built into Skylake-SP (the bandwidth figures do hint at that). Even so, this looks like a level of performance quite different from what the word "NFS" suggests.
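gdsio reports GiB/s while dd and the network figures above use GB/s, so a unit conversion helps when comparing them. The sketch below assumes that the first number in `DataSetSize` is the amount actually transferred, in KiB; the function name `kib_per_sec_summary` is hypothetical.

```shell
# Convert a data size in KiB and an elapsed time in seconds to GiB/s and GB/s.
kib_per_sec_summary() {
    awk -v kib="$1" -v secs="$2" 'BEGIN {
        gib_s = kib / (1024 * 1024) / secs      # 1 GiB = 1024 * 1024 KiB
        gb_s  = kib * 1024 / 1e9 / secs         # 1 GB  = 10^9 bytes
        printf "%.2f GiB/s = %.2f GB/s\n", gib_s, gb_s
    }'
}

# DataSetSize (transferred part) and total_time from the gdsio output above.
kib_per_sec_summary 63143936 7.879154
```

This reproduces the 7.64 GiB/s that gdsio reports, which is about 8.21 GB/s, or roughly two thirds of the 12.5 GB/s nominal line rate of a 100Gb link.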
Now, let's measure the most important thing: performance when GPUDirect SQL is used with PG-Strom.
(→ back to the top)
Addendum, 8/21: fixed in the 5.4-1.0.3.0 driver
Regarding the problem above, where the rpcrdma module was not built with GPUDirect Storage support: it seems that with the latest 5.4-1.0.3.0, rather than the MOFED driver that was current when this entry was written (5.3-1.0.0.1), the module is built with the GPUDirect Storage features enabled.
With nothing more than the driver's standard install script run:
    [root@magro ~]# modinfo rpcrdma
    filename:       /lib/modules/4.18.0-305.12.1.el8_4.x86_64/extra/mlnx-nfsrdma/rpcrdma.ko
    alias:          xprtrdma
    alias:          svcrdma
    license:        Dual BSD/GPL
    description:    RPC/RDMA Transport
    author:         Open Grid Computing and Network Appliance, Inc.
    rhelversion:    8.4
    srcversion:     6144CA5B71903B01293DD5F
    depends:        ib_core,sunrpc,mlx_compat,rdma_cm
    name:           rpcrdma
    vermagic:       4.18.0-305.12.1.el8_4.x86_64 SMP mod_unload modversions
    [root@magro ~]# modprobe rpcrdma
    [root@magro ~]# grep nvfs_ops /proc/kallsyms
    ffffffffc0f20dc8 b nvfs_ops [rpcrdma]
    ffffffffc0970700 b nvfs_ops [nvme_rdma]
    ffffffffc02ce718 b nvfs_ops [nvme]
    [root@magro ~]# /usr/local/cuda/gds/tools/gdscheck -p
    GDS release version: 1.0.1.3
    nvidia_fs version: 2.7 libcufile version: 2.4
    ============
    ENVIRONMENT:
    ============
    =====================
    DRIVER CONFIGURATION:
    =====================
    NVMe               : Supported
    NVMeOF             : Supported
    SCSI               : Unsupported
    ScaleFlux CSD      : Unsupported
    NVMesh             : Unsupported
    DDN EXAScaler      : Unsupported
    IBM Spectrum Scale : Unsupported
    NFS                : Supported
    WekaFS             : Unsupported
    Userspace RDMA     : Unsupported
    --Mellanox PeerDirect : Enabled
    --rdma library        : Not Loaded (libcufile_rdma.so)
    --rdma devices        : Not configured
    --rdma_device_status  : Up: 0 Down: 0
    =====================
    CUFILE CONFIGURATION:
    =====================
    properties.use_compat_mode : true
    properties.gds_rdma_write_support : true
    properties.use_poll_mode : false
    properties.poll_mode_max_size_kb : 4
    properties.max_batch_io_timeout_msecs : 5
    properties.max_direct_io_size_kb : 16384
    properties.max_device_cache_size_kb : 131072
    properties.max_device_pinned_mem_size_kb : 33554432
    properties.posix_pool_slab_size_kb : 4 1024 16384
    properties.posix_pool_slab_count : 128 64 32
    properties.rdma_peer_affinity_policy : RoundRobin
    properties.rdma_dynamic_routing : 0
    fs.generic.posix_unaligned_writes : false
    fs.lustre.posix_gds_min_kb: 0
    fs.weka.rdma_write_support: false
    profile.nvtx : false
    profile.cufile_stats : 0
    miscellaneous.api_check_aggressive : false
    =========
    GPU INFO:
    =========
    GPU index 0 Tesla V100-PCIE-16GB bar:1 bar size (MiB):16384 supports GDS
    ==============
    PLATFORM INFO:
    ==============
    IOMMU: disabled
    Platform verification succeeded