Peeking Inside PDFs (Tools Edition)
This article is day 9 of the Imaizumi Lab Advent Calendar.
(It was actually posted on the 13th.)
Update history
2020/12/13: After posting, I realized I had forgotten to cover peepdf's malware detection feature, so I added it.
Introduction
Continuing from last time, we keep looking at the internal structure of PDF files.
The previous article is here.
This time the topic is tools that are handy for analyzing PDFs.
The previous article walked through how a PDF is structured. Looking at the structure in detail is fun, but tracing the references between a PDF's indirect objects by hand, or counting the keywords that appear, is hard work. That is where tools come in.
All of the tools introduced here are written in Python and need no troublesome installation, so please try them on a PDF you have at hand.
(You need to be able to run both Python 2 and Python 3, because peepdf only runs on Python 2.)
PDFiD
DidierStevensSuite/pdfid.py at master · DidierStevens/DidierStevensSuite · GitHub
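A tool that extracts specific keywords from a PDF and displays how many times each one appears.
Malware that abuses PDF features tends to rely on particular features (such as /JavaScript and /OpenAction), so checking a file with this tool is an easy way to see whether those features are used.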
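As a rough illustration of the idea, keyword counting can be sketched in a few lines of Python. This is a simplified sketch, not PDFiD's actual implementation: the keyword list, the function name count_keywords, and the SAMPLE bytes are assumptions made for this example. It also ignores obfuscated names (for example hex-escaped ones) and compressed streams, which a real tool has to deal with.

```python
import re
from collections import Counter

# Hypothetical keyword list for this sketch; PDFiD's own list is longer.
KEYWORDS = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch"]

# Tiny hand-written PDF fragment used only for demonstration.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
)


def count_keywords(data: bytes, keywords=KEYWORDS) -> Counter:
    """Count how often each PDF name appears in the raw bytes."""
    counts = Counter()
    for keyword in keywords:
        # The negative lookahead keeps /JS from matching a longer name
        # such as /JSomething.
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9])"
        counts[keyword.decode()] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    for name, count in count_keywords(SAMPLE).items():
        print(f"{name:<12} {count}")
```

To try it on a real file, read the file in binary mode and pass the bytes to count_keywords.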
pdf-parser
DidierStevensSuite/pdf-parser.py at master · DidierStevens/DidierStevensSuite · GitHub
A very full-featured PDF parser.
It also has a statistics output feature similar to PDFiD, introduced above.*1
Running it
Run without options, it displays the following for each indirect object:
- the object number and generation number
- the type of the indirect object
- the objects it references
- the contents of its dictionary
To look at a single indirect object instead of the whole file, specify "-o n", where n is the object number.
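What "-o n" does can be approximated by scanning the raw file for "n g obj ... endobj". The following is a minimal sketch, assuming the object is stored uncompressed in the file body (not inside an object stream) and that the text "endobj" does not occur inside a stream. The name find_object and the SAMPLE data are hypothetical, made up for this example, and are not part of pdf-parser.

```python
import re

# "<number> <generation> obj ... endobj"
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
# "<number> <generation> R" is an indirect reference.
REF_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")

# Hand-written fragment used only for demonstration.
SAMPLE = (
    b"2 0 obj\n"
    b"<< /Type /Page /Parent 3 0 R /Resources 6 0 R /Contents 4 0 R >>\n"
    b"endobj\n"
    b"3 0 obj\n<< /Type /Pages /Kids [2 0 R] /Count 1 >>\nendobj\n"
)


def find_object(data: bytes, number: int):
    """Return details of the first indirect object with this number, or None."""
    for match in OBJ_RE.finditer(data):
        if int(match.group(1)) != number:
            continue
        body = match.group(3).strip()
        references = [(int(n), int(g)) for n, g in REF_RE.findall(body)]
        return {
            "number": number,
            "generation": int(match.group(2)),
            "body": body,
            "references": references,
        }
    return None


if __name__ == "__main__":
    print(find_object(SAMPLE, 2))
```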
Other things it can do
The -y option takes YARA rules as its argument and reports whether the PDF is malicious according to those rules.*2
The -s option, given as "-s hogehoge", extracts the objects that contain hogehoge.
One somewhat interesting feature is the -g option, which generates a Python program that in turn generates the PDF file being parsed. I am not quite sure where one would use it, though...
peepdf
Note that it only runs on Python 2.*3
It may complain that certain libraries are missing, but the options described here work without those libraries, so the warnings can be ignored.*4
Running it
Run without options, it displays statistics about the file.
Tree view
python peepdf.py hoge.pdf -f -C tree
The command above*5 displays a tree structure representing the references between objects. The number in parentheses is the object number. As a test, let's look at the indirect object with object number 2 both in the tree output and in the raw file data.
In the tree view, Page(2) appears to reference Pages(3), stream(4), and dictionary(6). Opening the file in a text editor shows the references written as "3 0 R 6 0 R 4 0 R", which matches the tree view.
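To make the link between the "n 0 R" references and the tree concrete, here is a small sketch that builds a reference graph from raw PDF bytes and prints it as an indented tree. It is not how peepdf is implemented. The function names, the SAMPLE data, and the "object" label used for untyped objects are all assumptions made for this example, and it only sees objects that are stored uncompressed.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+\d+\s+obj\b(.*?)endobj", re.DOTALL)
REF_RE = re.compile(rb"(\d+)\s+\d+\s+R\b")
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")

# Hand-written fragment mirroring the example in the text.
SAMPLE = b"""1 0 obj
<< /Type /Catalog /Pages 3 0 R >>
endobj
2 0 obj
<< /Type /Page /Parent 3 0 R /Resources 6 0 R /Contents 4 0 R >>
endobj
3 0 obj
<< /Type /Pages /Kids [2 0 R] /Count 1 >>
endobj
4 0 obj
<< /Length 0 >>
stream

endstream
endobj
6 0 obj
<< /ProcSet [/PDF /Text] >>
endobj
"""


def build_graph(data: bytes) -> dict:
    """Map object number -> (label, list of referenced object numbers)."""
    graph = {}
    for match in OBJ_RE.finditer(data):
        number, body = int(match.group(1)), match.group(2)
        type_match = TYPE_RE.search(body)
        label = type_match.group(1).decode() if type_match else "object"
        graph[number] = (label, [int(ref) for ref in REF_RE.findall(body)])
    return graph


def render_tree(graph: dict, root: int, depth: int = 0, seen=None) -> list:
    """Return indented lines; already visited nodes are shown but not expanded."""
    seen = set() if seen is None else seen
    label, refs = graph.get(root, ("missing", []))
    lines = ["    " * depth + f"{label} ({root})"]
    if root in seen:  # Page -> Parent -> Kids -> Page would loop forever
        return lines
    seen.add(root)
    for ref in refs:
        lines.extend(render_tree(graph, ref, depth + 1, seen))
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(build_graph(SAMPLE), root=1)))
```

The seen set matters because PDF references are cyclic: a page points to its parent through /Parent, and the parent points back through /Kids.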
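Changing the tree view output
peepdf is a very powerful tool for working with the hierarchical structure of a PDF, but when the output has many lines it stops partway through and waits for a key press.
This is inconvenient when you want to write the output to a text file, so let's fix it. It is a bit crude, but the following change resolves the problem.
- peepdf/PDFConsole.py, line 4293
  - limit = int(self.variables['output_limit'][0])
  + limit = 10000000  # just some very large number
This makes it somewhat easier to handle.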
Detecting suspicious files
peepdf can detect suspicious files without any troublesome setup. The statistics shown when it is run without options note any suspicious indirect objects that were found.
When I tried it on a PDF malware sample, it went as far as identifying the CVE.
Conclusion
Using tools like these should make it possible to analyze PDFs in more depth.
The tools introduced here have options I could not cover, so I hope to write about them at some point.
The next PDF-related article is still undecided, but I could probably write about reading pdf-parser's output into a program of my own to make it easier to work with. I don't know whether there is any demand for that, though.
References
マルウェア解析者向け: 疑わしい PDF を解析する Python ツール - 拡張頭蓋 | Extended Cranium
*1: The -a option.
*2: I have not tried this myself.
*3: There appears to be a pull request adding Python 3 support, but it has not been merged.
*4: I remember having trouble installing PyV8. I installed it by following https://github.com/brokenseal/PyV8-OS-X.
*5: The -f option ignores errors during parsing. With a malicious PDF, an error can occur unless this option is given. It is probably not needed this time.