URL Encode/Decode : C++ : Explicit Memory Management
The URL encode/decode functions I wrote earlier in C++ used the std::string class to hold the converted string.
With std::string, the class takes care of memory management properly, which is convenient (and less bug-prone), but it comes with a corresponding overhead.
For URL encoding/decoding, the upper bound on the memory required is known in advance, so I took advantage of that, modified the code to manage memory*1 explicitly, and measured the processing time again.
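The upper bounds mentioned above can be written down directly. The following is a minimal sketch, assuming the same worst cases the listings in this post rely on (every input byte expands to "%XX" when encoding, and decoding never grows the data). The function names max_encoded_size and max_decoded_size are hypothetical and do not appear in the original code.

```cpp
#include <cstddef>

// Hypothetical helper (not in the original code).
// Worst case for encoding: every input byte becomes "%XX" (3 bytes),
// plus one byte for the terminating NUL.
constexpr std::size_t max_encoded_size(std::size_t input_len) {
    return input_len * 3 + 1;
}

// Hypothetical helper (not in the original code).
// Decoding never grows the data: "%XX" shrinks to one byte and every
// other byte maps one-to-one. One extra byte for the terminating NUL.
constexpr std::size_t max_decoded_size(std::size_t input_len) {
    return input_len + 1;
}

// Compile-time checks of the arithmetic.
static_assert(max_encoded_size(4) == 13, "4 bytes encode to at most 12 bytes + NUL");
static_assert(max_decoded_size(4) == 5, "4 bytes decode to at most 4 bytes + NUL");
```

These match the sizes passed to new char[] in the two main functions: mm.size*3+1 for encoding and mm.size+1 for decoding.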
See: mmap_t
////////////////////////////
// File: url_encode.cc
// g++ -O3 -ourl_encode url_encode.cc  /* Previously -O2 was used as the optimization option -> 0.002s faster */
// time url_encode <target file>
#include <cctype>
#include "mmap_t.h"

bool is_safe_char(char c) {
  // Cast first: passing a negative char value to isalnum is undefined behavior.
  return isalnum(static_cast<unsigned char>(c))||c=='.'||c=='-'||c=='_'||c=='*';
}

// Writes "%XX" for c at dist and returns a pointer to the last byte written.
char* encode_char_to_hex(char c, char* dist) {
  dist[0]='%';
  dist[1]="0123456789ABCDEF"[(c&0xF0)>>4];
  dist[2]="0123456789ABCDEF"[c&0x0F];
  return dist+2;
}

// dist must have room for (input length)*3+1 bytes. Returns the encoded length.
unsigned url_encode(const char* c, char* dist) {
  const char* head = dist;
  for(; *c!='\0'; c++,dist++) {
    if(is_safe_char(*c)) *dist = *c;
    else if (*c==' ') *dist = '+';
    else dist=encode_char_to_hex(*c,dist);
  }
  *dist='\0';
  return dist-head;
}

#include <iostream>

int main(int argc, char** argv) {
  mmap_t mm(argv[1]);
  char* dist = new char[mm.size*3+1];
  url_encode((const char*)mm.ptr,dist); // XXX: This use of mmap_t has a bug: nothing guarantees that the data ends with a NUL.
  //std::cout << dist;
  delete [] dist;
  return 0;
}

////////////////////////////
// File: url_decode.cc
// g++ -O3 -ourl_decode url_decode.cc  /* Previously -O3 was used as the optimization option -> no change for this one */
// time url_decode <target file>
#include "mmap_t.h"

// Maps a byte to its hex digit value: '0'-'9' -> 0-9, 'A'-'F' and 'a'-'f' -> 10-15, everything else -> 0.
static const char table[]={
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,1,2,3,4,5,6,7,8,9,0,0,0,0,0,0,
  0,10,11,12,13,14,15,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,10,11,12,13,14,15,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};

char decode_hex_to_char(const char* c) {
  // XXX: Every character other than [0-9a-fA-F] is treated as 0. (e.g. "%@X" => 0)
  return (table[static_cast<unsigned char>(c[1])]<<4)+
          table[static_cast<unsigned char>(c[2])];
}

// dist must have room for (input length)+1 bytes. Returns the decoded length.
unsigned url_decode(const char* c, char* dist) {
  const char* head = dist;
  for(; *c!='\0'; c++,dist++) {
    switch(*c) {
    case '%': *dist = decode_hex_to_char(c); c+=2; break; // XXX: Behavior is undefined if an invalid string ending in '%' is passed (e.g. "abc%")
    case '+': *dist = ' '; break;
    default:  *dist = *c; break;
    }
  }
  *dist='\0';
  return dist-head;
}

#include <iostream>

int main(int argc, char** argv) {
  mmap_t mm(argv[1]);
  char* dist = new char[mm.size+1];
  url_decode((const char*)mm.ptr,dist);
  //std::cout << dist;
  delete [] dist;
  return 0;
}
As before, I ran the compiled commands on a target file*2: URL encoding took 0.06 seconds and URL decoding took 0.04 seconds.
Compared with the std::string version*3, that makes encoding about 4 times faster and decoding about 2 times faster.
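The figures above come from the time command, so they cover the whole process, including startup and the file mapping. If only the conversion itself is of interest, something like the following could be used. This is a rough sketch rather than the method used in this post; the helper name elapsed_seconds is hypothetical.

```cpp
#include <chrono>

// Hypothetical helper (not in the original code): runs fn once and
// returns the elapsed wall-clock time in seconds.
template <typename Fn>
double elapsed_seconds(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

In the encoder's main, the call could be wrapped as elapsed_seconds([&]{ url_encode((const char*)mm.ptr, dist); }). With run times in the hundredths of a second, a single run is noisy, so repeating the call and taking the minimum or median is probably worthwhile.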
After all, in cases like this one where no sophisticated memory management is needed, being able to have the programmer manipulate pointers (memory regions) explicitly is a major advantage*4.
*1: All it does is allocate the required memory up front in one go, so calling it "memory management" is something of an overstatement.
*2: I have lost the Apache log.
*3: As noted in the code comment, for URL encoding the change of optimization option also affected the processing speed.
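*4: Incidentally, and this is purely my personal opinion: if C and C++ are these days faster than other compiled languages (Java, Common Lisp, ocaml, etc.), I suspect it is in cases like this one, where memory management by the programmer is both possible and effective (in other words, it comes down to whether pointer manipulation is available). Last time I used std::string (where the C++ library manages the memory) for URL encoding, and in that case the difference from other languages was not that large. (It should also be remembered that Java and sbcl need to convert strings between byte sequences and Unicode sequences. This is especially costly when decoding, which requires a byte-sequence-to-Unicode conversion.) Conversely, in areas that do need sophisticated memory management (something like GC, for example; what about smart pointers?), I think using a language with a well-tuned GC would be faster, and of course safer, than using C++ combined with something like a hand-written memory allocator.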