Slides presented at an in-house study session.
The low byte of the boy emoji is 0x41, which matches row 1 of the table in rule 2. Since 0x41 - 0x40 = 0x01, the emoji is the 1st character counting from the origin.
3. Find the Unicode code point
Adding the number found in step 2 to the Unicode origin found in step 1 gives the Unicode code point. For the boy emoji, step 1 yields U+E000 and step 2 yields 0x01, so U+E000 + 0x01 = U+E001. The boy emoji is therefore U+E001 in Unicode.
Java source code for converting Shift_JIS SoftBank emoji
The Java source code below converts a SoftBank emoji passed in Shift_JIS encoding into a Unicode code point. In Java, the encoding ...
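Steps 2 and 3 above are plain arithmetic, and a minimal Python sketch may make them concrete. The function names `offset_from_low_byte` and `to_code_point` are hypothetical, the origin U+E000 is simply taken from the worked example as an input, and only row 1 of the rule 2 table (subtract 0x40) is modeled, because the other rows do not appear in this excerpt.

```python
def offset_from_low_byte(low_byte: int) -> int:
    """Step 2: distance from the origin (row 1 of the rule 2 table only)."""
    # Assumption: only the "subtract 0x40" row quoted in the text is handled.
    return low_byte - 0x40


def to_code_point(origin: int, low_byte: int) -> int:
    """Step 3: add the step 2 offset to the Unicode origin from step 1."""
    return origin + offset_from_low_byte(low_byte)


# Worked example from the text: the boy emoji, origin U+E000, low byte 0x41.
boy = to_code_point(0xE000, 0x41)
print(f"U+{boy:04X}")  # U+E001
```

A full converter would also need step 1 (choosing the origin from the lead byte) and the remaining rows of the rule 2 table, neither of which is shown here.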
2009-06-15 07:00 Category: Lightweight Languages
perl - use utf8; # what is it?
id:otsune had put a "construction planned" flag on this topic, so here it is.
冬通りに消え行く制服ガールは、夢物語にリアルを求めない。 - subtech
I still do not understand Perl's utf8 handling at all. I cannot even tell what it is that I do not understand, so let me sort it out.
When does use utf8 set the flag?
Even with use utf8 in effect, the flag is sometimes not set ...
This is best shown with the following example.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8 ();
    sub check_flag{
      my $str = shift;
      print qq("$str" ), utf8::is_utf8($str) ? 'is' : 'IS NOT',
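I came across a rather bold article ("We held a workshop on character sets and character encodings"), so I decided to write up my objections. There is quite a lot to object to, but then most articles about character codes out there are like that.
The term "character code" is "correct"
Page 5 of the slides appears to say that the expression "character code" is wrong, but that is not really so.
The reason is that the world of character codes is a difficult one. Something that spans multiple layers, multiple countries and multiple vendors cannot possibly be simple. Moreover, because it is an indispensable component, implementations are often written without sufficient knowledge, or are driven by necessity before enough insight has accumulated. That in turn becomes "historical baggage" and makes character codes harder still. For example, the HTTP charset parameter is char...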
Emoji are characters that depict facial expressions and other symbols as pictures. They are especially popular among Japanese mobile phone users and are widely used. Last month emoji became available in Gmail as well. For details, see the Gmail team's blog post "Emoji are now available in Gmail".
Each mobile carrier created its own emoji independently, and they are used in mail, on the web and elsewhere. Emoji were originally designed on the assumption that they would be exchanged between users of the same carrier, but today emoji conversion tables are used to maintain a degree of compatibility between carriers.
Users expect to see the emoji they are used to, regardless of carrier or handset model: that an emoji sent by mail is displayed on the receiving side as the same or an equivalent emoji, that an emoji seen on the web looks the same to other mobile users, and that a search engine can ... emoji ...
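2006-11-23 22:00 Category: Lightweight Languages
perl, python & ruby - chr() vs. Unicode
Having read the Python & Ruby Cookbooks in one go for 404 Blog Not Found:There's more than one language to cook your problems, I have decided to write up the points that caught my attention a little at a time.
First, the handling of characters. Note that this means characters, not strings.
Treating a string as a byte sequence and converting between the two is now possible in most languages, not only lightweight ones. How a character is handled as a character, however, varies from language to language, which is quite a headache for a polyglot like me.
Here, taking Perl, the language I am most fluent in, as the baseline, I looked into how Ruby and Python behave.
Numeric values ...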
2005-12-20 11:45 Category: iTechLogos
Memo: Unicode, UCS, and UTF
The confusion does not seem to have settled yet, so I will summarize things here, partly as a memo to myself.
電脳社会の日本語 加藤 弘一
quinta essentia - del.icio.us acquired; Yonah on its way?
Character Set vs. Encoding
First, be clear that these two are different things. UCS is, as its name suggests, a character set (although, going by the Unicode.org Glossary, it can also look like a means of encoding). At this stage each character merely carries a "jersey number". "Unicode" in the narrow sense refers to this "jersey number".
Turning that into actual data is the job of the Encoding (...
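2006-11-24 12:30 Category: Lightweight Languages
Is Unicode a character set or an encoding scheme?
What follows is the most basic of basics for handling text on a computer, yet the article in question contains several serious errors.
Basics of character code standards: ITpro
  Before going into specifics, the following point needs to be made clear. What is generally called a "character code" consists of two elements, a set of characters and an encoding method, and it is important to keep the two apart. They are of course closely related, but leaving them jumbled together is a major reason the subject becomes hard to follow.
According to that article, Unicode is plainly an "encoding method", but this is wrong. What it describes is UCS-2, one of several "encoding methods" defined by Unicode, and one that UTF-16 has made obsolete.
First, Unic...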
-> Purpose and caveats -> UTF8 flag? -> The UTF8 flag and PerlIO layers -> Writing strings that carry the UTF8 flag -> Wide character in print ... -> Encode -> utf8::* -> use utf8; -> use encoding; -> use UTF8 and use encoding -> From Jcode to Encode -> Sources
Purpose and caveats
This is about Unicode support in Perl 5.8.x. To be honest, I had only ever used 5.8.x for throwaway things, so I never properly understood it. Once I started using it in earnest I got completely lost, so I have tried to summarize what I found. I may still not understand it very well, so the content comes with no guarantee. Corrections are welcome.
Dan, the maintainer of Jcode and Encode, has also pointed some things out, so ...
UCS (Universal Multiple-Octet Coded Character Set) assigns the characters of all languages to a single, unified code. The code table runs from 0 to 0x7FFFFFFF, room for about 2.1 billion characters. Because each character takes 4 bytes, it is also called UCS-4.
Unicode uses the portion of UCS from 0 to 0x10FFFF (about 1.11 million characters). "Uses" is a simplification: UCS and Unicode are in fact defined by different groups. Compatibility arose because the group defining UCS adopted the lower portion, and at present the two can be regarded as identical at the subset level.
Within the UCS-4 range there are two encodings, UCS-4 (also called UTF-32) and UTF-8 ...
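The sizes quoted above can be checked directly. The following Python sketch uses only built-ins; the sample character あ (U+3042) is an arbitrary choice for illustration and does not come from the text.

```python
# Size of the UCS code table (0 to 0x7FFFFFFF) and of the Unicode range (0 to 0x10FFFF).
ucs_size = 0x7FFFFFFF + 1
unicode_size = 0x10FFFF + 1
print(f"{ucs_size:,}")      # 2,147,483,648
print(f"{unicode_size:,}")  # 1,114,112

# Python's str covers exactly the Unicode range: chr() rejects anything above 0x10FFFF.
try:
    chr(0x110000)
    beyond_unicode_ok = True
except ValueError:
    beyond_unicode_ok = False

# One code point, two encodings: fixed 4 bytes in UTF-32, variable length in UTF-8.
ch = "\u3042"  # あ, an arbitrary sample character
utf32 = ch.encode("utf-32-be")
utf8 = ch.encode("utf-8")
print(utf32.hex(), len(utf32))  # 00003042 4
print(utf8.hex(), len(utf8))    # e38182 3
```

The big-endian codec name is used so that no byte order mark is prepended and the 4-byte width is visible as is.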
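A character encoding scheme (CES; Japanese: 文字符号化方式) is an encoding method that converts the non-negative integers which a coded character set assigns to characters into a data sequence that a computer can actually use, normally a byte sequence.
It is also called a character encoding system. When it is clear that characters are being discussed, it is referred to simply as an encoding scheme, or, in IBM terminology, as an encoding scheme (ES).
The term is used in Unicode and IETF standards, but not in ISO/IEC or JIS standards, which treat the subject as the structure of a coded character set, or as "character code structure and extension techniques". The definition of the term does not always match the world's character code standards.
To show the relationship between a coded character set and a CES, take JIS X 0208 as an example. To keep things simple, the annexes are ignored.
JIS X 020...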
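To make the split between a coded character set and a CES concrete, here is a hedged Python sketch. It assumes the sample character あ, which JIS X 0208 places at row 4, cell 2, and uses Python's standard `euc_jp`, `shift_jis` and `iso2022_jp` codecs as three encoding schemes for that same position. The choice of character and codecs is illustrative and not taken from the article.

```python
ch = "\u3042"  # あ: JIS X 0208 row 4, cell 2 (kuten 04-02)

# Three encoding schemes turn the same character-set position into different bytes.
euc = ch.encode("euc_jp")
sjis = ch.encode("shift_jis")
jis = ch.encode("iso2022_jp")

print(euc.hex())   # a4a2
print(sjis.hex())  # 82a0
print(jis.hex())   # 1b244224221b2842 (escape in, two bytes, escape out)

# EUC-JP stores row and cell each offset by 0xA0, so the position is recoverable.
kuten = (euc[0] - 0xA0, euc[1] - 0xA0)
print(kuten)       # (4, 2)
```

The position (4, 2) belongs to the character set; the three byte sequences belong to the encoding schemes.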
When text is handled on a computer, typically when two parties communicate in text, both endpoints need to agree in advance on which character set to use. In most cases a predefined coded character set (described later) is used.
The group of characters that a coded character set (described later) covers is called its repertoire. Terminology differs between standards and specifications, which define it as follows.
Unicode Character Encoding Model (UTR#17)
  Abstract Character Repertoire (ACR) - the unordered set of conceptual characters to be encoded.
Character Model for the World Wide Web 1.0: Fundamentals (W3C Recommendation, CharMod)
  repertoire - the set of identified characters to be encoded. ...
Section: Linux Programmer's Manual (7) Updated: 2001-05-11
NAME
UTF-8 - an ASCII-compatible multibyte Unicode encoding
DESCRIPTION
The Unicode 3.0 character set occupies a 16-bit code space. In the simplest Unicode encoding (UCS-2), text consists of a sequence of 16-bit words. Such a sequence can contain, as parts of 16-bit characters, bytes such as '\0' or '/' that have a special meaning in file names and in arguments to C library functions. In addition, most Unix tools expect ASCII files as input and cannot read 16-bit words as characters without major modifications ...
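The problem described above is easy to reproduce. In this Python sketch, `utf-16-be` stands in for UCS-2 (an assumption: Python ships no UCS-2 codec, and the two agree for the 16-bit characters used here), and the sample characters A (U+0041) and į (U+012F) are chosen only for illustration.

```python
NUL, SLASH = 0x00, 0x2F  # bytes with special meaning in C strings and file names

for ch in ("A", "\u012f"):  # U+0041 and U+012F, arbitrary sample characters
    wide = ch.encode("utf-16-be")  # stand-in for UCS-2: one 16-bit word each
    utf8 = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} 16-bit={wide.hex()} utf-8={utf8.hex()}")

a_wide = "A".encode("utf-16-be")       # 00 41: contains a NUL byte
i_wide = "\u012f".encode("utf-16-be")  # 01 2f: contains a '/' byte
i_utf8 = "\u012f".encode("utf-8")      # c4 af: neither byte appears

print(NUL in a_wide, SLASH in i_wide)    # True True
print(NUL in i_utf8 or SLASH in i_utf8)  # False
```

In UTF-8 every byte of a multibyte character has its high bit set, so the bytes 0x00 and 0x2F can only ever stand for the ASCII characters themselves.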
Last updated 2003-11-11
UCS and UTF
When creating a document in Unicode, you may notice that there are two, or even more, character code formats. For example, one may be labeled Unicode while another is labeled Unicode (UTF-8). What are these two, and how do they differ? (The former is a form of UTF-16 ...)
UCS-2 and UCS-4
ISO/IEC 10646-1, the multilingual character code set into which Unicode was adopted, represents each character with 16 bits (a 16-digit binary number). This is called UCS-2 (Universal Character Set coded in 2 octets). An octet is 8 bits (an 8-digit binary number) used as a unit of character length.
A particular Unicode character is, for example, ...
id:tomi-ru has written [http://e8y.net/mag/015-encode/:title], a very practical introduction to [http://search.cpan.org/perldoc?Encode:title=Encode], which made me want to write one from a different angle.
Some basics, for what they are worth (safe to skip)
Character set, charset: 文字集合 - Wikipedia
Encoding, encoding scheme: 文字符号化方式 - Wikipedia
These two are different things. You can read the rest of this document without knowing that, but understanding it will help. If you want to know more, please study on your own.
Examples of character sets
Unicode
JIS X 0208 (hiragana, katakana, kanji and so on)
ASCII characters
Examples of encodings
UTF-8
ISO-202