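A memo, since I had a really hard time with garbled Japanese text. The conclusion, I suppose, is to pass the XML (or HTML) through the unicode function before parsing it? As usual, I don't really understand character-encoding matters.

    from urllib import urlopen
    from lxml import etree

    html = urlopen("http://b.hatena.ne.jp")
    # Charset declared in the Content-Type response header
    charset = html.headers.getparam('charset')
    # Decode the raw bytes to unicode before handing them to the parser
    html_data = unicode(html.read(), charset)
    et = etree.fromstring(html_data, parser=etree.HTMLParser())
    title_element = et.xpath("./head/title")[0]
    title = title_element.text.e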
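For comparison, here is a rough Python 3 sketch of the same idea (decode the bytes with the declared charset first, then parse). It is not the code from this memo: it uses the standard library's `html.parser` instead of lxml so it runs without extra packages, and `TitleParser`, `extract_title`, and the EUC-JP sample page are hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(raw_bytes, charset):
    # Decode first, so the parser only ever sees Unicode text.
    # Falling back to UTF-8 when no charset is given is an assumption.
    text = raw_bytes.decode(charset or "utf-8", errors="replace")
    parser = TitleParser()
    parser.feed(text)
    parser.close()
    return parser.title


# Hypothetical sample page, encoded as EUC-JP to stand in for a real response body.
sample = "<html><head><title>はてなブックマーク</title></head><body></body></html>".encode("euc_jp")
title = extract_title(sample, "euc_jp")
```

With a real response in Python 3, the charset would probably come from `response.headers.get_content_charset()` on the object returned by `urllib.request.urlopen`, which plays the role of `headers.getparam('charset')` above.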