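In the previous entry I wrote, "Why do applications out there mistakenly think that a UTF-8 character which fits in 1 byte is also OK as a 3-byte representation?" This time I tried to work out why.
It started with a comment
It started with this comment from id:kick123: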
About the part that asks "shouldn't C2 to DF be C0 to DF?": one byte can only express up to 7 bits, and expressing 80 takes 8 bits.
Fitting that into the 2-byte representation gives "1100 0010" and "1000 0000", so the first byte becomes C2.
Isn't that what it comes down to?
At first my reaction to this was "???".
To understand it, I started by making a "list of the characters UTF-8 can express in 1 byte".
utf-8_mapping_1byte posted by (C)ITOH Takashi
The program I used to produce it is in PHP:
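<?php
// List every character UTF-8 encodes in one byte (0x00-0x7F):
// decimal, 8-bit binary, hex, and the raw character.
for ($i = 0; $i <= bindec("01111111"); $i++) {
    $hex = sprintf("%02x", $i); // two hex digits = one whole byte
    printf("\"%d\" , \"%08b\" , \"%04s\" , \"%s\"\n", $i, $i, $hex, hex2bin($hex));
}
// hex2bin() is built in since PHP 5.4. On older versions define it as:
// function hex2bin($hex_str) { return pack("H*", $hex_str); }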
If you can read PHP you will understand this program, and I think you can follow the rest even if you can't.
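So, what comes next?
I now understand the range that 1 byte can express. Then what happens after that?
Since the last value is "01111111" in binary, you would naturally want to carry on with "10000000" and so on up to "11111111", but a single UTF-8 byte cannot take those forms.
So we use 2 bytes instead. As written in the previous entry, the 2-byte representation of UTF-8 works like this: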
- The first byte has the form 110xxxxx
- The next byte has the form 10xxxxxx
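The thing to notice here is that "only 6 digits of the next byte are free to use". Why does that matter?
Because the 1-byte representation from before already uses up 7 digits, so the first value that needs 2 bytes is 8 digits long, "10000000". Its low 6 digits go into the second byte and the remaining top 2 digits have to go into the first byte.
In other words, when building the 129th UTF-8 character with 2 bytes, you think of it like this: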
One 2-byte UTF-8 character posted by (C)ITOH Takashi
I see. Just as id:kick123 said, the first byte of a 2-byte UTF-8 representation has to start from "C2".
This is where I notice "why there are two ways to express the same thing"
And then it suddenly hit me.
A 2-byte UTF-8 character whose first byte is "11000000" or "11000001" is something that "must not exist", isn't it?
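As a rough check of that arithmetic, the sketch below packs a value into the 110xxxxx 10xxxxxx layout by hand. `encode_two_byte` is a hypothetical helper written only for this illustration, and it assumes the input is in the 2-byte range (0x80 to 0x7FF).

```php
<?php
// Hypothetical helper (not a PHP built-in): pack a value in the range
// 0x80-0x7FF into the two bytes 110xxxxx 10xxxxxx.
function encode_two_byte(int $cp): string
{
    $lead  = 0xC0 | ($cp >> 6);   // 110xxxxx: the top payload bits
    $trail = 0x80 | ($cp & 0x3F); // 10xxxxxx: the low 6 payload bits
    return chr($lead) . chr($trail);
}

echo bin2hex(encode_two_byte(0x80)), "\n"; // c280
```

The smallest value that needs 2 bytes, 0x80, comes out with C2 as its first byte, which matches the comment.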
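To make that suspicion concrete, here is a small sketch that prints which values each first byte can reach once the 110 and 10 prefixes are set aside. The function name `payload_range` is my own invention for illustration, not anything from PHP or the UTF-8 specification.

```php
<?php
// Hypothetical helper (name made up for this sketch): given the first
// byte of a 2-byte sequence, return the smallest and largest value its
// payload bits can express together with a 10xxxxxx second byte.
function payload_range(int $lead): array
{
    $min = ($lead & 0x1F) << 6; // second byte 10000000
    $max = $min | 0x3F;         // second byte 10111111
    return [$min, $max];
}

foreach ([0xC0, 0xC1, 0xC2] as $lead) {
    [$min, $max] = payload_range($lead);
    printf("lead %02X -> values %02X..%02X\n", $lead, $min, $max);
}
// lead C0 -> values 00..3F
// lead C1 -> values 40..7F
// lead C2 -> values 80..BF
```

With C0 or C1 as the first byte, every reachable value is 7F or below, which is exactly the range that 1 byte already covers.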
Must not exist... and yet.
That was the thought.
So I tried it out. In PHP.
The target can be anything whose 1-byte representation starts with 01..., but going by the table above I'll pick "z" for now. Its 1-byte representation is "01111010".
<?php
// 1-byte form of "z"
$n = base_convert("01111010", 2, 16);
echo hex2bin($n) . "\n";
This does indeed print "z". Of course it does.
Next, let's check its 2-byte representation, "11000001" "10111010"
<?php
// Overlong 2-byte form: bytes C1 BA
$n = base_convert("11000001" . "10111010", 2, 16);
echo hex2bin($n) . "\n";
Oh, just as I suspected, "z" comes out!!
Still, as in the previous article, I checked whether this really is "z"
<?php
// Emit both forms so the raw bytes can be compared
$n = base_convert("01111010", 2, 16);
echo hex2bin($n) . "\n";
$n = base_convert("11000001" . "10111010", 2, 16);
echo hex2bin($n) . "\n";
With EUC that is what I got, but with ShiftJIS the result was different.
This is probably what people call an overlong (redundant) UTF-8 encoding.
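What shows up on screen depends on how the terminal or browser interprets the bytes, so a hex dump is a safer way to compare the two forms. This is only a sketch reusing the same two bit patterns; the variable names `$short` and `$long` are mine.

```php
<?php
// Build both forms from the bit patterns used above, then dump the raw
// bytes as hex instead of trusting how they are displayed.
$short = hex2bin(base_convert("01111010", 2, 16));
$long  = hex2bin(base_convert("11000001" . "10111010", 2, 16));

echo bin2hex($short), "\n"; // 7a
echo bin2hex($long), "\n";  // c1ba
var_dump($short === $long); // bool(false)
```

The two outputs are different byte strings, so any software that shows both as "z" is treating two encodings as the same character.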
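So why do applications do this?
This is the point that was unclear in the previous entry: "Why do they end up making this kind of mistake?"
Having worked through the decoding by hand this far, I finally get it.
My guess is that these applications "strip out the bits that define the UTF-8 structure (the 110 and 10 prefixes), clunk the remaining meaningful bits together, and decode that".
That is what I think is going on.
Something like this. Apply the same idea and it works for any character.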
Clipboard05 posted by (C)ITOH Takashi
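Here is a minimal sketch of that guess. It is not taken from any real application: `naive_decode_two_byte` and `overlong_two_byte` are hypothetical helpers that only illustrate "strip the prefixes and join the rest" with no range check.

```php
<?php
// Hypothetical naive decoder: drop the 110 / 10 prefix bits of a 2-byte
// sequence and join the payload bits, without checking the result range.
function naive_decode_two_byte(string $bytes): int
{
    $high = ord($bytes[0]) & 0x1F; // 5 payload bits of 110xxxxx
    $low  = ord($bytes[1]) & 0x3F; // 6 payload bits of 10xxxxxx
    return ($high << 6) | $low;
}

// Hypothetical helper: build the overlong 2-byte form of one ASCII character.
function overlong_two_byte(string $ch): string
{
    $cp = ord($ch); // assumed to be 0x00-0x7F
    return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
}

$seq = overlong_two_byte("z");
echo bin2hex($seq), "\n";                    // c1ba
echo chr(naive_decode_two_byte($seq)), "\n"; // z
echo bin2hex(overlong_two_byte("<")), "\n";  // c0bc
```

A decoder written this way happily turns C1 BA back into "z", and the same trick produces a 2-byte form for any other 1-byte character, such as "<".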
As for whether this is "something the server side should check", personally I think it is fair to say that the browser, or whatever application it is, is interpreting the bytes wrongly. I would like browsers to display abnormal UTF-8 as abnormal, but even so the server side still has to deal with it, in PHP as well. On this one, server side, client side, and developers are all in the same boat.
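On the server side, one possible check for the 2-byte case discussed here is to refuse any input containing the bytes C0 or C1, since neither can start a legitimate 2-byte character. This is only a sketch: `has_overlong_lead` is a name I made up, and it does not cover overlong 3-byte or longer forms. PHP's mbstring extension also provides `mb_check_encoding()`, which I would expect to reject such input too, but the sketch avoids depending on that extension.

```php
<?php
// Hypothetical helper: report whether the input contains byte 0xC0 or
// 0xC1, the two first bytes that can only produce overlong 2-byte forms.
// Assumption: only the 2-byte case is checked here.
function has_overlong_lead(string $s): bool
{
    return strpbrk($s, "\xC0\xC1") !== false;
}

var_dump(has_overlong_lead("\xC1\xBA")); // bool(true)
var_dump(has_overlong_lead("z"));        // bool(false)
var_dump(has_overlong_lead("\xC2\x80")); // bool(false)
```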
Probably, when UTF-8 was defined, nobody imagined it had a hole like this.
But when someone like me tries to understand how the encoding is put together, it suddenly clicks and you think "wait a minute...". A beginner's point of view matters.