The Many Benefits We Found After Moving Business Data Analysis from Excel to Python
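Three months have passed since, at the start of this year, I began working in earnest on something I had wanted to do for a long time: moving our data analysis to Python. In the management consulting industry, data analysis inevitably centers on Excel because everyone can use it. Our environment has gradually come together and things are finally on track, so I would like to summarize where we stand.
I originally started using Python because I wanted to do things Excel cannot do, such as machine learning, but other benefits have gradually become apparent. In this article I introduce the benefits of Python that I have come to understand through hands-on use.
All the tools we currently use are either free (open source) or can be set up on a cloud environment (AWS) with no upfront investment. I hope this helps anyone who wants to do data analysis in Python or R but has been asked by their boss, "Why isn't Excel good enough?"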
Moving data analysis work from Excel and on-premises to Python and the cloud
The move to Python is only one of several measures we have taken, so the current situation is summarized below. [1]
移è¡å
容 (移è¡ä¸ã®ãã®ãå«ã) |
Before å¾æ¥ã®ãã¼ã¿åæ |
After ç¾å¨ã®ãã¼ã¿åæ |
---|---|---|
ãã¼ã¿åæ(ã¯ã©ã¤ã¢ã³ã) | EXCEL, SQLServer Management Studio(SSMS) | Python(Jupyter Notebook), æ§ã ãªPythonããã±ã¼ã¸ |
ãã¼ã¿åæ(ãµã¼ã) | SQL Server | AWS EMR(Cloudera Impala) |
ãã¼ã¿ç®¡ç | å ±æãã©ã«ãã¯ãããã®ã®, åæä½æ¥ç¨ã®EXCELã¯éæå ±æ(ã¡ã¼ã«ãªã©) | ãã¹ã¦ã®åæçµæãGitã«ã¢ãããã¼ã |
ã¿ã¹ã¯ã»ç¥è管ç | (å£é ã»ã¡ã¼ã«ãªã©ã«ãã¨ã¥ãå人ã管ç) | Redmine |
Before
- Data amounting to hundreds of millions of rows was stored on a local data analysis server (MS SQL Server). Setting up the environment was delegated to the team's specialist engineer.
- The field team ran SQL queries from a client (SQL Server Management Studio) installed on company PCs (Windows 7), copied the output into Excel, and did the analysis in Excel. Results were pasted into PowerPoint to make documents. VBA was sometimes used when output was needed as a report.
- The SQL queries and Excel files used for analysis were managed by each person in charge. This made it hard to share analysis logic with other members, so know-how did not accumulate in the team and quality came to depend on the individual.
- No project documentation remained other than PowerPoint decks. Structurally, nothing beyond those decks was kept for the topics we worked on.
After
- Some of our big data had grown too large to handle in the local environment, so we started using the cloud (AWS).
- Now that this is on track, the data managed on the local analysis server is gradually being migrated as well. Building and managing the AWS platform is delegated to specialist nearshore engineers.
- The field team writes Python code and SQL in a Python environment (Jupyter Notebook) set up on company PCs and analyzes data retrieved from AWS, mainly with pandas.
- Where needed, analysis results are turned into charts (matplotlib) and attached to PowerPoint, or turned into reports (xlwings when the output is Excel).
- Files created during analysis (mainly Jupyter Notebook files) are managed in Git on Redmine, so analyses created by individuals are shared with the team.
- We have started using the Redmine wiki so that, in addition to the PowerPoint decks prepared for each meeting, members' tacit knowledge is gradually recorded as documentation. (We have only just begun this.)
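The "SQL plus Python in one notebook" flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `sqlite3` database stands in for the AWS-backed engine, the `sales` table and its rows are hypothetical, and the aggregation uses the standard library where the team would use pandas.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data; an in-memory SQLite database stands in
# for the real AWS-backed query engine.
SALES = [
    ("Jan", "east", 120.0),
    ("Jan", "west", 80.0),
    ("Feb", "east", 150.0),
    ("Feb", "west", 95.0),
]


def load_sales(conn):
    """Create and fill the hypothetical `sales` table."""
    conn.execute("CREATE TABLE sales (month TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)


def monthly_totals(conn):
    """Fetch rows with SQL, then finish the aggregation in Python."""
    rows = conn.execute("SELECT month, amount FROM sales").fetchall()
    totals = defaultdict(float)
    for month, amount in rows:
        totals[month] += amount
    return dict(totals)


conn = sqlite3.connect(":memory:")
load_sales(conn)
totals = monthly_totals(conn)
print(totals)
```

Because the query and the post-processing live in the same file, the whole analysis can be committed to Git as one unit.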
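Future
- As a big data platform, Redshift or Spark on AWS looks more promising for the future than Impala, so I would like to try those when the opportunity arises.
- There may still be parts of Redmine's task management that we are not using to the full.
- I would like to introduce a chat tool such as Slack to make communication between the teams (the field team and the nearshore infrastructure team) smoother. (This needs company approval in the first place.)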
Benefits of doing data analysis in Python
While carrying out the migration described above through trial and error, the benefits of doing data analysis in Python became clear. [2][3]
Benefit 1: Analyses that Excel cannot do at all become possible
This is obvious, but I find there are a great many things that Excel's features cannot do and Python can.
For example, Python can handle data with more than one million rows. Excel apparently has add-ins that allow larger data (see: PowerPivot), but when I used one in the past my impression was that it was too slow to be practical.
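As a rough illustration of working past the one-million-row mark, the sketch below aggregates 1.5 million synthetic records in a single pass. The record layout (`customer_id`, `amount`) and the generator are hypothetical, and the constant is the row limit of a single worksheet in current Excel versions; in practice the team would more likely use pandas or push the aggregation into SQL.

```python
EXCEL_MAX_ROWS = 1_048_576  # row limit of one worksheet in current Excel versions


def synthetic_rows(n):
    """Yield n hypothetical (customer_id, amount) records one at a time."""
    for i in range(n):
        yield i % 1000, float(i % 7)


def total_by_customer(rows):
    """Aggregate in one pass without holding every row in memory."""
    totals = {}
    count = 0
    for customer_id, amount in rows:
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
        count += 1
    return count, totals


row_count, totals = total_by_customer(synthetic_rows(1_500_000))
print(row_count > EXCEL_MAX_ROWS, len(totals))
```

Because the rows are consumed from a generator, memory use depends on the number of distinct customers, not on the number of rows.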
Python is widely used in academic research (especially in Europe and the United States), and add-on packages for data analysis are being developed as open source at tremendous speed. Well-known areas include the following.
- Machine learning
- Web scraping
The page is in English, but the Harvard University data science course referenced below is taught mainly around data analysis in Python. It appears to cover not only aggregating and transforming data but also machine learning with scikit-learn and web scraping with BeautifulSoup4.
Reference: Harvard University data science course: CS109 Data Science
Python itself is not dedicated to data analysis. It is a general-purpose programming language, so it can also be used for many other things, such as building web services.
Benefit 2: Human error from mis-operation dropped sharply
Excel has an intuitive interface that anyone can use. On the other hand, we frequently saw human errors such as "the formula's reference was off by one cell" or "I accidentally touched the keyboard while working and overwrote the data." In Python, editing the data directly takes deliberate steps, and the data passed as arguments is specified in code, not with the mouse, so mistakes of this kind dropped sharply.
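The two kinds of mistake mentioned above can be illustrated with a small sketch. The record and its field names are hypothetical, and wrapping the data in `MappingProxyType` is just one way to make accidental overwrites fail loudly; it is not something the original workflow is known to use.

```python
from types import MappingProxyType

# Hypothetical input record, wrapped in a read-only view.
raw = MappingProxyType({"unit_price": 250, "quantity": 4, "discount": 100})


def net_sales(record):
    """Refer to inputs by name, so a reference cannot drift by one cell."""
    return record["unit_price"] * record["quantity"] - record["discount"]


result = net_sales(raw)

# An accidental overwrite raises an error instead of silently changing the data.
try:
    raw["quantity"] = 40
    overwritten = True
except TypeError:
    overwritten = False

print(result, overwritten)
```

Inputs are addressed by name, and changing them requires an explicit, visible step in the code.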
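Benefit 3: Visible logic raises the level of the whole team and increases code reuse
I have listed several benefits, but this is the biggest one. In Excel, the calculation logic is inevitably buried in individual cells because of how the tool is structured. Understanding what operations are being applied to the data takes considerable effort, and I feel that people often rebuild from scratch even when the calculation is similar to an earlier one.
Because logic is hard to share in Excel, it is also hard to see how colleagues do their analysis, and a consultant's data analysis work tends to become a "one-person food stall" operation. With Python, everyone shares the code, which is the written form of the analysis logic, through Git. We are moving from the one-person stall toward a setup where everyone shares data analysis know-how.
Compared with system development, which is normally done by several people, data analysis is often done by one or two, so I think know-how in this area is still developing. I would like to bring the good parts of the tools and methodologies used in system development into data analysis.
For the same reason, because the logic can be separated from the data, there is less rework when the calculation criteria change or when you want to reuse logic you have already built. Even in cases where a single task finishes faster in Excel, I think the benefit of code reuse is extremely large when similar analyses are run several times. [4]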
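A minimal sketch of what separating logic from data can look like. The function name, the datasets, and the thresholds are all hypothetical; the point is only that the calculation criterion is an argument, so neither a change of criterion nor a new dataset requires rebuilding the logic.

```python
def count_high_value(amounts, threshold):
    """Count orders at or above `threshold`; the criterion is a parameter."""
    return sum(1 for a in amounts if a >= threshold)


# Hypothetical datasets from two separate projects.
project_a = [1200, 300, 5000, 800]
project_b = [100, 2500, 2600]

# Same logic, different data.
a_count = count_high_value(project_a, threshold=1000)
b_count = count_high_value(project_b, threshold=1000)

# The calculation criterion changes: only the argument is edited.
a_count_revised = count_high_value(project_a, threshold=3000)

print(a_count, b_count, a_count_revised)
```

In a spreadsheet, the equivalent change would typically mean editing the formula in every affected cell or workbook.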
Benefit 4: We can bring in more capable people
This owes a lot to the move to the cloud, but by using Python and sharing smoothly through Git and similar tools, data analysis with remote teams in regional Japan (nearshore) has become feasible. The consulting industry has recently been quite short of people, so being able to form teams more flexibly in this way is a major benefit.
Working on technically new things also builds skills, so I expect this to pay off when we try to attract more capable people from inside and outside the company.
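Closing
We have only just started operating in this way in earnest and every day is trial and error, but I already feel a solid response and expect further productivity gains. The original aim was what I described under Benefit 1, being able to do things Excel cannot, such as machine learning, but other benefits became apparent along the way. I want to keep using these tools heavily and become able to apply them in a wider range of cases.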
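- [1] Of course, we have not eliminated Excel completely. We still use it where it suits the purpose.
- [2] These are not necessarily benefits of Python alone. I think some of them can also be achieved with R, a language for data analysis, or with other languages.
- [3] The drawback is the high learning cost.
- [4] I am told the trick to avoiding system errors is to "reuse code instead of writing it wherever possible." If you create many Excel files, the probability of errors creeping in inevitably rises.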