This is the day 17 article for both the Ruby Advent Calendar and the SmartHR Advent Calendar.
I gave a talk about characters at nagano.rb on 12/9, then gave the same talk as an LT inside SmartHR on 12/15.
The slides are here.
The same character?

Do these two characters look the same to you?
In fact, they are the same character displayed in two different fonts.

It is the same situation as a glyph looking different in a Gothic typeface and a Mincho typeface, so it is fair to call them the same character.
Every character a computer handles is assigned a number (a code point), and from a program's point of view two characters are the same if they have the same code point.
In Ruby, you can get the code point of a character with String#ord.
'直'.ord.to_s(16) #=> "76f4"
'ほげ'.chars.map{_1.ord.to_s(16)} #=> ["307b", "3052"]
String#unpack('U*') works too.
'ほげ'.unpack('U*').map{_1.to_s(16)} #=> ["307b", "3052"]
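As a small complement to the two approaches above, String#codepoints returns the code points directly as integers. The sketch below formats them in the usual U+XXXX notation; the helper name `code_points` is made up for this illustration and is not part of Ruby.

```ruby
# Hypothetical helper: list every code point of a string as "U+XXXX".
def code_points(str)
  str.codepoints.map { |cp| format('U+%04X', cp) }
end

code_points('ほげ') #=> ["U+307B", "U+3052"]
```

The uppercase U+XXXX form is convenient when you want to look a character up in the Unicode charts.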
Normalization

Are these two the same character? They look the same, but they have different code points.

The former belongs to the CJK Unified Ideographs block and the latter to the CJK Compatibility Ideographs block.
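Because the code points differ, a plain comparison reports a mismatch:
rei1 = '令'        # U+4EE4, CJK unified ideograph
rei2 = "\u{f9a8}" # the look-alike CJK compatibility ideograph
rei1.ord.to_s(16) #=> "4ee4"
rei2.ord.to_s(16) #=> "f9a8"
rei1 == rei2 #=> false
Normalizing a CJK compatibility ideograph with String#unicode_normalize converts it to the unified ideograph.
rei1 == rei2.unicode_normalize #=> true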
Unicode normalization is specified in UAX #15: Unicode Normalization Forms.
String#unicode_normalize defaults to NFC, but with NFKC you can also perform conversions like these.
'０'.unicode_normalize(:nfkc) #=> '0'
'①'.unicode_normalize(:nfkc) #=> '1'
'ｱ'.unicode_normalize(:nfkc) #=> 'ア'
'ﾊﾞ'.unicode_normalize(:nfkc) #=> 'バ'
'㌖'.unicode_normalize(:nfkc) #=> 'キロメートル'
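To make the difference between the two forms concrete, here is a minimal sketch that applies both NFC and NFKC to the same inputs. The characters are written as escapes so the code points are unambiguous, and `same_after_nfkc?` is a hypothetical helper name used only for this example.

```ruby
fullwidth_zero = "\u{ff10}" # FULLWIDTH DIGIT ZERO
compat_rei     = "\u{f9a8}" # CJK compatibility ideograph that looks like U+4EE4

fullwidth_zero.unicode_normalize(:nfc)  == '0'       #=> false (NFC leaves it alone)
fullwidth_zero.unicode_normalize(:nfkc) == '0'       #=> true
compat_rei.unicode_normalize(:nfc) == "\u{4ee4}"     #=> true (canonical mapping)

# Hypothetical helper: compare two strings after NFKC normalization.
def same_after_nfkc?(a, b)
  a.unicode_normalize(:nfkc) == b.unicode_normalize(:nfkc)
end

same_after_nfkc?("\u{ff71}", "\u{30a2}") #=> true (halfwidth vs. fullwidth katakana A)
```

NFKC folds away distinctions such as width, so it suits search and comparison better than text you intend to store and display unchanged.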
Variation selectors

Are these the same character?
If you know Japanese well, you can tell that they are the same character with different glyph shapes, just like the 「直」 at the beginning.
Here, though, the example uses a variation selector.

U+E0100 through U+E01EF are variation selectors. The example above uses U+E0102.
Appending a variation selector to a base character lets you specify how the character should look. It is a mechanism for choosing a glyph shape even in plain text.
For it to display properly, however, both the system and the font have to support it.
異体字セレクタセレクタ (a variation selector picker) is handy for finding out which variants exist.
For example, see https://747.github.io/vsselector/#!/ja/908a for the list for 「邊」. There are also variation selectors for the 「直」 shown at the beginning: https://747.github.io/vsselector/#!/ja/76f4
unicode_normalize does not remove variation selectors. If you want to remove them, use something like gsub.
str.gsub(/[\u{e0100}-\u{e01ef}]/, '')
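The following sketch puts the gsub approach into a runnable form and checks that normalization really does keep the selector. The constant and method names are hypothetical, and the sample string is simply U+76F4 followed by U+E0102, mirroring the example in the text.

```ruby
# U+E0100..U+E01EF is the variation selector range named in the text.
VARIATION_SELECTORS = /[\u{e0100}-\u{e01ef}]/

# Hypothetical helper: drop variation selectors so only base characters remain.
def strip_variation_selectors(str)
  str.gsub(VARIATION_SELECTORS, '')
end

with_vs = "\u{76f4}\u{e0102}" # base character followed by a variation selector

with_vs.size                                     #=> 2
with_vs.unicode_normalize == with_vs             #=> true (the selector survives)
strip_variation_selectors(with_vs) == "\u{76f4}" #=> true
```

Stripping selectors is lossy, so it probably belongs on the comparison or search path rather than on the data you store.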
「髙」
「髙」 is the character commonly called "hashigo-daka" (the "ladder" form of 高).
In Unicode, 「髙」 is not a variant of 「高」 but a separate character. Because it is a separate character, normalization does not touch it, and it is not available through a variation selector either.
The character also exists in SJIS (Windows-31J), so it can be converted.
'髙'.encode('Windows-31J')
#=> "\x{FBFC}"
'髙'.encode('SJIS') # SJIS is an alias for Windows-31J
#=> "\x{FBFC}"
In JIS, however, there is no separate character 「髙」; it is treated as a variant of 「高」. With no corresponding character, the conversion fails.
Note that in Ruby, SJIS and Shift_JIS are different encodings.
'髙'.encode('Shift_JIS') # Shift_JIS is not the same as SJIS
# `encode': U+9AD9 from UTF-8 to Shift_JIS
# (Encoding::UndefinedConversionError)
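If the exception is not what you want, String#encode accepts the `undef:` and `replace:` options, or you can rescue the error yourself. The sketch below shows both; `to_shift_jis` and its nil-on-failure policy are assumptions made for illustration, not something from the talk.

```ruby
hashigo_taka = "\u{9ad9}" # the character discussed above

# Option 1: substitute characters that have no Shift_JIS mapping.
replaced = hashigo_taka.encode('Shift_JIS', undef: :replace, replace: '?')
replaced.bytes #=> [63] (a single "?")

# Option 2: rescue the error and let the caller decide what to do.
def to_shift_jis(str)
  str.encode('Shift_JIS')
rescue Encoding::UndefinedConversionError
  nil # hypothetical policy: signal failure with nil
end

to_shift_jis(hashigo_taka)        #=> nil
to_shift_jis("\u{9ad8}").encoding #=> #<Encoding:Shift_JIS>
```

Replacement silently loses information, so rescuing and reporting is usually the safer default when the text is a person's name.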
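Treating 「髙」 as a character distinct from 「高」 causes no problems in itself, but in cases like searching people's names you may want to treat it as identical to 「高」, which is where things get tricky.
「﨑」
「﨑」 is the character commonly called "tachi-saki".
Like the U+F9A8 look-alike of 「令」, it belongs to CJK Compatibility Ideographs.
Unlike that character, though, unicode_normalize does not turn it into 「崎」.
'﨑'.unicode_normalize #=> "﨑"
U+FA1B, which is also in CJK Compatibility Ideographs, is properly converted to 「福」.
"\u{fa1b}".unicode_normalize #=> "福"
So what is different about 「﨑」? This is the explanation:
Note that U+FA11 (﨑) has U+5D0E (崎), U+FA14 (﨔) has U+6B05 (欅) and U+6989 (榉), and U+FA1F (﨟) has U+81C8 (臈) as variants in the unified ideographs block, but the differences in glyph shape are considered too large for them to fall within the scope of unification.
In other words, just as with 「髙」, normalizing it apparently has to be handled character by character.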
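One way to handle such characters individually is a small lookup table applied after normalization. This is only a sketch: the table contents, the constant names, and `fold_for_search` are hypothetical, and whether folding these characters is acceptable depends on the application, since for many people the distinction in their name matters.

```ruby
# Hypothetical per-character table; extend it with whatever your data needs.
NAME_FOLDING = {
  "\u{9ad9}" => "\u{9ad8}", # hashigo-daka -> U+9AD8
  "\u{fa11}" => "\u{5d0e}", # tachi-saki   -> U+5D0E
}.freeze

NAME_FOLDING_PATTERN = Regexp.union(NAME_FOLDING.keys)

# Normalize first, then fold the characters that normalization leaves alone.
def fold_for_search(str)
  str.unicode_normalize(:nfkc).gsub(NAME_FOLDING_PATTERN, NAME_FOLDING)
end

fold_for_search("\u{9ad9}\u{6a4b}") == "\u{9ad8}\u{6a4b}" #=> true
fold_for_search("\u{5c71}\u{fa11}") == "\u{5c71}\u{5d0e}" #=> true
```

Applying the fold to both the stored key and the query keeps the original spelling intact for display while letting either form match.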
Bonus

The fact that hiragana 「へ」 and katakana 「ヘ」 have exactly the same shape is a bug in the Japanese language, isn't it?
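Character count

How many characters are these emoji, each of which looks like a single character, really made of?
A flag consists of two characters.
'🇯🇵'.size #=> 2
Japan's country code is JP. Writing the regional indicator characters 「🇯」 and 「🇵」 next to each other gives 「🇯🇵」. Likewise, joining 「🇺」 and 「🇸」 gives 「🇺🇸」.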
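The mapping from a country code to its flag can be sketched in a few lines: each ASCII letter is shifted into the regional indicator range, which starts at U+1F1E6 for "A". The method name `flag_for` is made up for this example, and it does no validation of the country code.

```ruby
REGIONAL_INDICATOR_A = 0x1F1E6 # REGIONAL INDICATOR SYMBOL LETTER A

# Hypothetical helper: turn a two-letter country code into its flag sequence.
def flag_for(country_code)
  country_code.upcase.each_char.map { |c|
    (REGIONAL_INDICATOR_A + c.ord - 'A'.ord).chr(Encoding::UTF_8)
  }.join
end

flag_for('JP') == "\u{1f1ef}\u{1f1f5}" #=> true
flag_for('us').size                     #=> 2
```

Whether the pair is actually drawn as a flag is up to the font; the string itself is always just two code points.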
The emoji for a family of three is a single character, code point U+1F46A.
'👪'.size #=> 1
But add one more child to make a family of four, and it is made up of seven code points.
'👨‍👩‍👧‍👦'.size #=> 7
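Listing the code points shows where the seven come from: four person emoji joined by three ZERO WIDTH JOINER (U+200D) characters. The sketch assumes the family is the man, woman, girl, boy sequence and spells it with escapes so the joiners are visible in the source.

```ruby
# Assumed sequence: man, woman, girl, boy, each joined by U+200D (ZWJ).
family = "\u{1f468}\u{200d}\u{1f469}\u{200d}\u{1f467}\u{200d}\u{1f466}"

family.codepoints.map { |cp| format('U+%04X', cp) }
#=> ["U+1F468", "U+200D", "U+1F469", "U+200D", "U+1F467", "U+200D", "U+1F466"]

family.split("\u{200d}").size #=> 4 (the individual people)
```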

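Beyond emoji, kana with a voicing mark (dakuten or handakuten) behave the same way: 「ぱ」 can be a single precomposed character, or a character composed from two, 「は」 followed by a combining sound mark.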
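The two spellings can be converted into each other with normalization, which is a quick way to see that they are canonically equivalent. A minimal sketch using U+3071 and its decomposition U+306F U+309A, written as escapes so the two forms cannot be confused:

```ruby
precomposed = "\u{3071}"         # one code point
combined    = "\u{306f}\u{309a}" # base kana + combining semi-voiced sound mark

precomposed.size #=> 1
combined.size    #=> 2
precomposed == combined                          #=> false
combined.unicode_normalize(:nfc) == precomposed  #=> true (NFC composes)
precomposed.unicode_normalize(:nfd) == combined  #=> true (NFD decomposes)
```

Decomposed kana are said to turn up in file names on some platforms, so a comparison like this can be a useful sanity check.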

There are also emoji that combine a person with a skin tone or a hairstyle.

Graphemes
Counting code points is natural for a program but unnatural for people.
A unit of text that feels natural to people is the "grapheme".
From the Wikipedia article 書記素:
A grapheme (書記素, shokiso) is the smallest graphic unit that makes a distinction in meaning possible in a written language.
In Ruby, String#grapheme_clusters splits a string into graphemes.
'🇯🇵👪👨‍👩‍👧‍👦'.size #=> 10
'🇯🇵👪👨‍👩‍👧‍👦'.grapheme_clusters #=> ["🇯🇵", "👪", "👨‍👩‍👧‍👦"]
'🇯🇵👪👨‍👩‍👧‍👦'.grapheme_clusters.size #=> 3
In addition, \X in a regular expression matches a single grapheme.
'🇯🇵👪👨‍👩‍👧‍👦'.scan(/./)
#=> ["🇯", "🇵", "👪", "👨", "‍", "👩", "‍", "👧", "‍", "👦"]
'🇯🇵👪👨‍👩‍👧‍👦'.scan(/\X/)
#=> ["🇯🇵", "👪", "👨‍👩‍👧‍👦"]
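A place where this distinction matters in practice is truncation: slicing by code points can cut a flag or a family in half. The sketch below truncates by grapheme cluster instead; `truncate_graphemes` is a hypothetical helper, and the result depends on the Unicode version your Ruby was built with.

```ruby
# Hypothetical helper: keep at most `max` grapheme clusters.
def truncate_graphemes(str, max)
  str.grapheme_clusters.first(max).join
end

flag   = "\u{1f1ef}\u{1f1f5}"
family = "\u{1f468}\u{200d}\u{1f469}\u{200d}\u{1f467}\u{200d}\u{1f466}"
text   = flag + "\u{1f46a}" + family

text[0, 1] == "\u{1f1ef}"                        #=> true (half a flag)
truncate_graphemes(text, 1) == flag              #=> true
truncate_graphemes(text, 2) == flag + "\u{1f46a}" #=> true
```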
Summary
Unicode is pretty chaotic.
When comparing strings, it may be better to normalize them first.
When counting characters, it seems wise to think about whether you mean code points or graphemes.