Pursuing reliability never means demanding perfection. Rather than chasing perfect availability, what matters is balancing user satisfaction against limited resources. One tool for striking that balance is the SLO (Service Level Objective).
Introduction
"Implementing Service Level Objectives" is a comprehensive guide to implementing and operating SLOs (Service Level Objectives), one of the indispensable concepts in modern software service management. The book covers a broad range of topics, from the basic concepts of SLOs and practical ways to adopt them to embedding them in organizational culture. As I also wrote in my "These SRE Books Are Great! 2024 Edition" post, I think it is a must-read for SREs and infrastructure engineers alike.
Reading it as an SRE, I was reminded that an SLO is not just a technical metric but a powerful tool for tying user experience to business goals. The book reads like a bridge between technology and business. It points out the limits of traditional reliability management and explains in detail how an SLO-based approach addresses those problems and enables more effective system operations.
Experience matters, of course, but I believe the knowledge and insight gained from reading can be worth as much as hands-on experience. Reading is a kind of indirect experience, and at the same time we can interpret and learn from our real experience the way we read a book. In other words, reading is a kind of experience, and experience is a kind of reading. I believe this back-and-forth way of learning makes my knowledge and practice as an SRE deeper and richer.
I was particularly struck by how much attention the book pays to organizational culture and the human side, not only the technical side. It stresses that adopting SLOs is not merely adopting a tool; it is a major challenge that changes how the whole organization thinks and behaves. It is a bit like an entire organization trying to improve its constitution. It may hurt, but the results should be wonderful! There have been some excellent talks on SLOs both in Japan and abroad, so I will introduce them along the way.
In this review I briefly introduce the basic concepts of SLOs and then look back on the main body of the book in three parts. For each part I consider the key insights and lessons, and concrete approaches for putting them into practice. A Japanese edition has also been published, so please read it alongside the original. Several related events have been held in Japan as well, so those are worth checking out too.
I have some translation experience myself, and I think the Japanese translation of this book is excellent. It should clear away the "Wait, what is this saying?" haze you may have felt while reading the English edition!
Note that this review is based purely on my own reading experience, and I cannot guarantee that I have always interpreted the book correctly. People understand and interpret things only through their own cognitive frameworks and experience. The observations and interpretations here may therefore carry distortions that come from my own background, experience, and knowledge. I ask readers to keep that in mind, and I recommend reading the book itself.
abnoumaru also contributed enormously by reviewing this post. Thanks to abnoumaru's expertise and meticulous review, the quality and accuracy of the post improved greatly. I would like to take this opportunity to sincerely thank abnoumaru for that effort.
Part I. SLO Development
Part I explains the basic concepts of SLOs and how to develop them. My personal favorite was the concept of the Reliability Stack: three elements, SLIs (Service Level Indicators), SLOs, and error budgets, form a layered structure that provides a framework for managing service reliability comprehensively.
The author argues that aiming for perfect reliability is not realistic, and that you should instead target a level of reliability that keeps users satisfied. This realistic approach matters for balancing efficient resource allocation against user satisfaction.
From a technical standpoint, the concrete guidance on choosing SLIs and setting SLOs was useful. It contains a lot of practical knowledge, such as percentile-based SLOs and how to calculate SLOs for services with dependencies.
The most important lesson I took from this part is that developing SLOs is not simply setting numeric targets; it is a complex process of balancing user needs, technical constraints, and business goals. As an SRE I need to keep this perspective in mind while aiming for more effective SLO design and operation.
Chapter 1. The Reliability Stack
Chapter 1, "The Reliability Stack," presents the importance of reliability in modern software services and a framework for measuring and managing it. The chapter is organized around three main concepts, Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets, which together form the "Reliability Stack." For the individual terms, I recommend the Service Level Objectives chapter of the SRE book and The Art of SLOs.
First, the book points out that modern software environments have become complex and distributed. The opening sentence, "We live in a world of services," sums up the situation. The observation that Software as a Service (SaaS), Infrastructure as a Service (IaaS), and the spread of microservice architectures have blurred the very definition of a service accurately captures a challenge many software engineers face every day.
The book argues that, in such complex environments, traditional logs and stack traces are not enough to ensure reliability and that a new approach is needed. What it proposes is measuring and managing reliability from the user's point of view. The statement that the question "Is my service reliable?" is nearly synonymous with "Is my service doing what its users need it to do?" captures the essence of reliability, I think. Material such as the story of SLOs at Luup is also available and may be worth a look.
Observability Engineering also covers SLOs in Chapter 12, "Using Service-Level Objectives for Reliability," and Chapter 13, "Acting on and Debugging SLO-Based Alerts," so those are worth reading too.
That said, I would not really recommend reading this book before Practical Monitoring and Observability Engineering.
Service Level Indicators (SLIs), the foundation of the Reliability Stack, are metrics that measure service performance from the user's point of view. The book stresses that you should choose metrics based on users' actual experience rather than plain availability or error rate. A concrete example, such as treating a web page load as "good" if it completes within 2 seconds and "bad" otherwise, is helpful for understanding the concept of an SLI.
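The good/bad split in the page-load example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `sli_good_ratio`, the sample values, and the choice to treat exactly 2 seconds as "good" are hypothetical and do not come from the book.

```python
def sli_good_ratio(load_times_s, threshold_s=2.0):
    """Return the share of page loads that finished within the threshold.

    Assumption: a load that takes exactly `threshold_s` counts as good.
    """
    if not load_times_s:
        raise ValueError("need at least one measurement")
    good = sum(1 for t in load_times_s if t <= threshold_s)
    return good / len(load_times_s)


# Hypothetical measurements in seconds: three are good, two are bad.
samples = [0.8, 1.4, 2.0, 2.6, 3.1]
print(f"SLI = {sli_good_ratio(samples):.2f}")
```

Expressing the SLI as a ratio of good events to all events keeps it comparable across services with very different traffic volumes.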
The diagram of the Reliability Stack in Figure 1.1 shows the relationship between SLIs, SLOs, and error budgets visually and helps in understanding how these concepts are layered.
Service Level Objectives (SLOs) are target values set on top of SLIs. The book argues that aiming for perfect reliability is not realistic and that the target should instead be "reliable enough that users are satisfied." This idea matters for balancing efficient resource allocation against user satisfaction. As an SRE, this is a point I particularly agree with.
An error budget defines how far you are allowed to deviate from the SLO. The book explains it as a measure of how much a service can fail and still be acceptable. The use of error budgets shown in Figure 1.3 provides a useful framework for balancing new feature deployments and experiments against work to improve reliability.
The book touches on many types of services (web services, API services, data processing pipelines, batch jobs, databases, compute platforms, and hardware and networks) and outlines how to set SLIs and SLOs suited to each. This shows that the Reliability Stack applies to a wide range of services, which is useful in practice.
The part that stood out was the list of cautions the book gives about adopting an SLO approach. The five points, "SLOs Are Just Data," "SLOs Are a Process, Not a Project," "Iterate Over Everything," "The World Will Change," and "It's All About Humans," struck me as important guidelines to keep in mind whenever practicing an SLO-based approach. In particular, treating SLOs as an ongoing process rather than a project accurately identifies why many organizations fail at adopting them.
From a technical standpoint, the book's detailed explanations of how to measure SLIs and calculate error budgets were helpful. For example, its explanation that an error budget can be calculated with an event-based approach or a time-based approach gives concrete guidance for actually implementing one.
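As a rough sketch of the event-based variant, the budget can be counted in requests rather than in time. The function names and the sample numbers below are my own assumptions for illustration; the target is passed as a string so that `Decimal` keeps the arithmetic exact.

```python
from decimal import Decimal


def allowed_bad_events(total_events: int, slo_target: str) -> int:
    """Event-based error budget: how many events may fail in the window.

    `slo_target` is a fraction given as a string, for example "0.999".
    """
    budget_fraction = Decimal(1) - Decimal(slo_target)
    return int(total_events * budget_fraction)


def remaining_fraction(total_events: int, bad_events: int, slo_target: str) -> float:
    """Share of the event-based budget that is still unspent."""
    allowed = allowed_bad_events(total_events, slo_target)
    if allowed == 0:
        return 0.0
    return (allowed - bad_events) / allowed


# Hypothetical month: one million requests against a 99.9% target.
print(allowed_bad_events(1_000_000, "0.999"))       # failures the budget allows
print(remaining_fraction(1_000_000, 250, "0.999"))  # share of budget left
```

A time-based budget answers "how long may we be unreliable," while this event-based form answers "how many requests may fail," which tends to fit services with uneven traffic better.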
It is also important that the book clearly explains the difference between a Service Level Agreement (SLA) and an SLO. The distinction that an SLA is a contractual promise while an SLO is an internal target helps untangle concepts that many organizations tend to confuse.
The key lesson of this chapter is that managing reliability is one of the most important parts of running a service, and that it should be measured and managed from the user's point of view. The Reliability Stack provides a powerful framework for managing reliability systematically in increasingly complex service environments.
At the same time, the book stresses the human side as well as the technical side. The phrase "Service level objectives are ultimately about happier users, happier engineers, happier product teams, and a happier business." sums up the ultimate goal of the SLO approach and suggests how important it is to balance technology and people.
As an SRE, what I learned from this chapter is that managing reliability is not merely a technical problem; it requires a holistic approach that takes users, engineers, and the business as a whole into account. The concepts of SLIs, SLOs, and error budgets offer a systematic way to tackle that complex problem.
The book's stance of deliberately not aiming for perfection also matters for balancing efficient resource allocation against continuous improvement. Aiming for a level of reliability that meets users' needs, rather than 100%, seemed to me both realistic and effective.
It is also important that the book treats the SLO approach as an ongoing process rather than just data collection or target setting. It emphasizes building a culture in which the whole organization keeps thinking about and discussing reliability through setting and adjusting SLOs.
On the technical side, the chapter provides implementation-level information such as how to choose SLIs and calculate error budgets. This serves as a practical guideline when actually introducing the Reliability Stack. The example SLIs for different types of services are especially useful for dealing with diverse service environments.
Reading this chapter, I was reminded how important the Reliability Stack is in modern software service management. At the same time, I felt strongly that introducing it effectively requires organizational culture change as well as technical implementation. The SLO approach can bridge technical and business teams and has the potential to provide a common language for service reliability.
Finally, the importance of iteration that the book emphasizes stayed with me. Service environments and requirements are always changing, so SLIs, SLOs, and error budgets need to be reviewed and adjusted continuously. This flexibility and attitude of continuous improvement matters in a rapidly changing technology landscape.
To sum up, this chapter uses the Reliability Stack to present a new paradigm for reliability management in modern software services. I found this approach, which values a user-centered perspective, data-driven decision making, a continuous improvement process, and a balance between technology and people, to be effective in increasingly complex service environments. As an SRE, I am convinced that adopting these concepts in my own work and spreading them across the organization will lead to more reliable services.
Chapter 2. How to Think About Reliability
Chapter 2, "How to Think About Reliability," offers deep insight into the concept of reliability. The book clarifies the essence of reliability, which is frequently misunderstood in the tech industry, and gives readers a fresh perspective.
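At the start of the chapter, the book sounds a warning about the misuse of terminology in the tech industry. It points out that terms such as "DevOps" and "Site Reliability Engineering" have lost their original meaning and are now used as mere marketing terms or job titles. As someone with experience as an SRE, I could relate to this. I like the saying "The field comes first, practices come after, and principles are to be cherished." When technical terms are used without understanding their essence, the importance and meaning of the underlying concepts fade, which is a real waste.
The book takes issue with reliability so often being equated with availability, and sums up the misunderstanding in the sentence "Reliability has far too often come to mean only availability in the tech world." Working on reliability as an SRE, I have seen firsthand that many factors other than availability are involved. There is no fixed answer for reliability, so it is important to collect and analyze data in the field and to discuss what each system and situation needs in order to be reliable. This is a somewhat extreme way to put it, but I think much of what goes on could be called a movement that demands availability for no particular reason.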
The book's definition of reliability, "a system is doing what its users need it to do," is simple yet gets to the heart of the matter. It emphasizes putting the user's point of view at the center rather than technical metrics alone. As an SRE, this perspective is important. We tend to get caught up in technical metrics, but we must not forget that user satisfaction is ultimately the most important metric.
In the middle of the chapter, the book develops an interesting discussion of past performance and user expectations. The sentence "Past performance predicts future performance." sums up how user expectations are formed. This is an important factor to consider when setting SLOs. The observation that past performance becomes an implicit promise is something I felt SREs should pay particular attention to.
The movie streaming service example was effective for understanding how complex reliability is. It is not just that the video plays: buffering at an appropriate speed, delivering the right video, adequate picture quality, audio and video staying in sync, accurate subtitles, and many other factors make up reliability. The example is a reminder that SREs need a multifaceted view when thinking about system reliability.
The book's claim that "100% is impossible" is an important point. It notes that aiming for perfection can actually waste resources and increase the burden on people. This is a lesson SREs should always keep in mind. It reminded me of the importance of balancing reliability and efficiency.
The book also explains the cost of improving reliability in detail. Its concrete example, that going from 99.95% to 99.99% costs five times as much as going from 99.9% to 99.95%, was striking. This kind of mathematical grounding is important input when setting reliability targets.
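One way to see where a figure like that could come from is to compare how much the allowed share of failures shrinks at each step. This is my own reading, an assumption rather than a calculation quoted from the book, and the function name below is hypothetical.

```python
from fractions import Fraction


def unreliability_reduction(from_pct: str, to_pct: str) -> Fraction:
    """How many times smaller the allowed failure share becomes.

    Targets are percentages given as strings (for example "99.95") so that
    Fraction can represent them exactly.
    """
    before = 100 - Fraction(from_pct)
    after = 100 - Fraction(to_pct)
    return before / after


# 99.9% -> 99.95% halves the allowed failures,
# while 99.95% -> 99.99% cuts them to one fifth.
print(unreliability_reduction("99.9", "99.95"))
print(unreliability_reduction("99.95", "99.99"))
```

The two steps look similar when written as percentages, but the second one removes far more of the remaining room for failure, which is the non-linearity the passage is pointing at.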
Finally, the book sums up how to think about reliability. The sentence "Be as reliable as your users need you to be." expresses the essence of the approach. The book does not stop at that simple answer, though; it also covers more complex factors, such as user needs changing over time and part of your reliability depending on the reliability of upstream services.
The key lesson of this chapter is that reliability is not a simple numeric target but a continuous process of balancing user needs against the company's resources. As an SRE, this perspective is important. We are expected to keep pursuing the optimal level of reliability while weighing user expectations, technical constraints, cost, and other factors.
From a technical standpoint, the book's mathematical treatment of reliability was very interesting. Explanations with concrete numbers, such as the response time needed to achieve 99.99% reliability and the non-linear cost of improving reliability, are important guidance for SREs setting reliability targets.
The importance of handling a "black swan event," which the book emphasizes, is also something SREs should keep in mind. I was reminded that, for unpredictable large-scale failures, it is important to temporarily set aside the usual SLO and error budget thinking and respond flexibly.
Reading this chapter, I was reminded that improving reliability is not just raising system availability; it is pursuing a balanced approach that starts from a deep understanding of user needs. I also felt strongly how important close collaboration is, not only within technical teams but also with product teams and management.
What stayed with me most was the importance of the user's point of view, which the book stresses repeatedly. As SREs we tend to get caught up in technical metrics, but we must not forget that user satisfaction is ultimately the most important metric. I felt this perspective can be applied in many everyday situations, from setting SLOs to prioritizing incident response.
I also felt that the book's discussion of the cost of reliability has a large influence on decision making as an SRE. It was valuable to understand, with mathematical backing, the importance of balancing user needs against company resources instead of aiming for perfect reliability.
To put what I learned from this chapter into practice, I am considering the following concrete actions:
- When setting SLOs, include metrics that reflect users' actual experience, not only technical metrics.
- When setting reliability targets, quantitatively evaluate the cost of reaching the target and the benefit gained.
- Hold regular team discussions about what reliability means to users.
- Set more realistic SLOs that take the reliability of upstream services into account.
- Prepare flexible plans for responding to unexpected large-scale failures.
To sum up, this chapter offers a new perspective on reliability and redefines our role as SREs. It made me strongly aware that reliability is not merely a technical problem but a continuous process of balancing user needs, business demands, and technical constraints. I am convinced that approaching daily work with this perspective will let me contribute more effectively as an SRE.
Chapter 3. Developing Meaningful Service Level Indicators
Chapter 3, "Developing Meaningful Service Level Indicators," focuses on developing SLIs (Service Level Indicators), the most important element of an SLO (Service Level Objectives) based approach. The book emphasizes the importance of SLIs and explains in detail how to develop them.
At the start of the chapter, the book states that "SLIs are the most vital part of this entire process and system," emphasizing that SLIs are the foundation of the Reliability Stack. This resonated deeply with my experience as an SRE. To improve system reliability you first have to set appropriate metrics, and I understand that to be the role of SLIs.
In explaining why SLIs matter, the book uses the sentence "Your service isn't reliable if your users don't think it is." This perspective suggests the importance of valuing users' perception as well as technical metrics, and it had a large influence on how I think as an SRE.
The middle of the chapter explains how to develop SLIs using concrete examples. Starting from a simple request and response API and working up to the architecture of a more complex retail website, the book shows how to develop SLIs for services at various levels.
Figure 3.1 shows a simplified retail website architecture and makes the difficulty of developing SLIs in a complex system easy to see. The diagram clearly shows the components and interactions to consider when developing SLIs, and I found it useful.
For developing SLIs in complex systems, the book proposes the approach of "Measuring many things by measuring only a few." The idea is that by picking the most important metrics from the user's point of view and measuring those, you can evaluate the reliability of the whole system effectively. I found this useful for monitoring complex systems as an SRE.
Technically, I was especially interested in the explanation of how to measure SLIs. The book starts with basic metrics such as error rate and response time and gradually deepens the approach to measuring more complex user journeys. For example, when developing an SLI for a login feature, it points out that you need to consider the end-to-end flow from the load balancer to what is displayed in the user's browser, not just the error rate of the authentication service. I felt this was important. If you want to know what user journeys and customer journeys are, I recommend this book.
The book also covers how to write down an SLI. The example "The 95th percentile of requests to our service will be responded to with the correct data within 400 ms." includes technical detail while remaining understandable to non-engineers. I thought it was a useful model for communicating with other departments as an SRE.
The book also points out that developing SLIs contributes to business alignment. The observation that an SLI often refers to essentially the same thing as a product manager's "user journey," a business unit's "KPI," or a QA team's "interface test" was interesting. It suggests that SREs can play an important role in promoting alignment across the organization.
In the chapter's conclusion, the book acknowledges that developing SLIs is not always easy while still stressing its importance. I felt that "Thinking about your users first is never a bad idea." is an important guideline that applies to SRE work in general, not only to developing SLIs.
The key lesson of this chapter is that SLI development is the foundation of an SLO-based approach and that the user's point of view belongs at the center. For SREs like us, who tend to lean toward technical metrics, this perspective is important. The point that even in a complex system you can measure overall reliability effectively by choosing a small number of important metrics also struck me as practical and useful.
As an SRE, I can think of the following approaches for putting what I learned from this chapter into practice.
- Review current monitoring metrics and evaluate how well they reflect the user's point of view.
- Map the user journey for each service in detail and derive the important SLIs from it.
- Consider introducing infrastructure that makes end-to-end measurement possible (for example, a distributed tracing system).
- Discuss SLI definitions and measurement methods with other departments such as product, business, and QA to achieve alignment across the organization.
- Review SLI definitions regularly and update them as user needs and expectations change.
From a technical standpoint, the book's suggestions on measuring SLIs were useful. In particular, the importance of end-to-end measurement in complex systems seemed like a good opportunity to rethink our monitoring strategy. For example, in addition to the current monitoring at the level of individual components, it would be worth considering synthetic monitoring that simulates what users actually experience.
The book's approach of measuring many things with a few metrics is also likely to have a large effect on the design of monitoring systems. Instead of watching a large number of metrics individually as we do now, we will need dashboards and alerting designed around a small number of important metrics that are directly tied to user experience.
In addition, I felt that the suggestion to express an SLI definition as a sentence that is easy to explain is important for improving communication between technical and non-technical teams. It should also help when explaining the situation during an incident or reporting to management.
Reading this chapter reminded me that our responsibility is not only to monitor the technical side of systems and deal with problems, but also to design metrics that contribute directly to a better user experience and to improve reliability based on them.
The observation that SLI development leads to alignment across the organization also suggests that the SRE role can become more strategic. I felt that, as a bridge between technical and business departments, SREs can contribute directly to achieving business goals.
To sum up, this chapter uses the technical topic of SLI development to offer deep insight into the role and responsibilities of SREs. A user-centered perspective, an understanding of complex systems, and alignment with the whole organization are all important challenges for SREs. Developing and operating SLIs well can be a powerful tool for addressing them.
I intend to actively adopt the approaches from this chapter in my work. In particular, I want to focus on detailed analysis of user journeys and SLI design based on it, on building infrastructure that enables end-to-end measurement, and on continuously improving SLIs through close collaboration with other departments. I am convinced that these efforts will establish more effective SRE practices and ultimately contribute to higher user satisfaction and to achieving business goals.
Chapter 4. Choosing Good Service Level Objectives
Chapter 4, "Choosing Good Service Level Objectives," offers deep insight into setting SLOs (Service Level Objectives). The book explains in detail what an SLO really is and the methodology for choosing and setting one effectively.
The book defines SLOs in terms of user happiness. The sentence "If you are exceeding your SLO target, your users are happy with the state of your service." expresses the essence of an SLO. This perspective stresses putting user experience, not just technical metrics, at the center, and it made me rethink our role as SREs.
At the same time, the book points out the problems with being "too reliable": "Being too reliable all the time, you're also missing out on some of the fundamental features that SLO-based approaches give you: the freedom to do what you want." This may seem paradoxical at first, but it matters in real operations. The point that aiming for excessive reliability can cost you opportunities for innovation and experimentation is something SREs should always be aware of.
The book also warns against fixating on nines when setting SLOs. Numbers such as 99.9% and 99.99% look attractive, but they are not always appropriate for a real service. The book proposes a more flexible approach and notes that SLO targets such as 98.62% or 87% can be appropriate in some cases. I felt this flexibility matters in real service operations.
Technically, I was especially interested in the explanation of statistical approaches to setting SLOs. The book carefully explains basic statistical concepts such as mean, median, mode, range, and percentiles, and shows concretely how they can be used when setting SLOs. For example, setting an SLO using the 95th percentile of response time is useful in real operations.
The tables in Figures 4.1 and 4.2, "SLO targets composed of nines translated to time" and "SLO targets composed of not-just-nines translated to time," helped me see how different SLO targets translate into actual operating time. They serve as concrete guidance when setting SLO targets.
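The kind of translation those tables perform can be approximated with a short calculation. The sketch below is not a reproduction of the book's figures; the 30-day window and the function name are my own assumptions, and the sample targets are simply the ones mentioned in this review.

```python
# Assumed window: 30 days, expressed in seconds.
WINDOW_SECONDS = 30 * 24 * 60 * 60


def allowed_downtime_seconds(target_pct: float, window_s: int = WINDOW_SECONDS) -> float:
    """Seconds of unreliability a target permits within the window."""
    return (1 - target_pct / 100) * window_s


# Both "nines" and "not-just-nines" targets.
for target in (99.99, 99.9, 98.62, 87.0):
    minutes = allowed_downtime_seconds(target) / 60
    print(f"{target:>6}% -> {minutes:,.1f} minutes per 30 days")
```

Seeing the targets as time makes it easier to ask whether the gap between two candidate targets is something users would actually notice.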
The book emphasizes the importance of considering service dependencies and components. Especially for complex services involving multiple teams, how to set an SLO for each component and keep it consistent with the overall SLO is an important practical challenge. The "dependency math" the book proposes offers a concrete approach to this challenge, and I found it practical and useful.
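A common simplification of this kind of calculation multiplies the availabilities of components that must all work for a request to succeed. Whether this matches the book's "dependency math" in detail is an assumption on my part, and it also assumes that failures are independent; the function name is hypothetical.

```python
import math


def serial_availability(availabilities):
    """Upper bound on availability when every listed component is required.

    Assumption: components fail independently of each other.
    """
    return math.prod(availabilities)


# Three hard dependencies at 99.9% each cannot support a 99.9% overall target.
overall = serial_availability([0.999, 0.999, 0.999])
print(f"{overall:.6f}")
```

Even this rough estimate makes the practical point visible: a service cannot promise more than its required dependencies can jointly deliver.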
The book also explains metric attributes (resolution, quantity, and quality) in detail. Understanding how these affect SLO setting is important for choosing appropriate SLOs. The approaches for low-frequency events and low-quality data relate directly to problems faced in real SRE work and were useful.
The percentile thresholds the book proposes are very effective for distributions with a long tail, such as latency. For example, I felt that an SLO such as "The P95 of all requests will successfully complete within 2,000 ms 99.9% of the time." is a good approach both because it lets you keep monitoring and managing long-tail performance and because it lets you report on the SLO more intuitively.
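A minimal sketch of evaluating an SLO of that shape: compute the P95 for each measurement window, then look at the share of windows whose P95 is under the threshold. The nearest-rank percentile method, the function names, and the sample data are my own assumptions, not the book's implementation.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def windows_meeting_threshold(windows, pct=95, threshold_ms=2000):
    """Share of windows whose percentile latency is within the threshold."""
    if not windows:
        raise ValueError("need at least one window")
    ok = sum(1 for w in windows if percentile(w, pct) <= threshold_ms)
    return ok / len(windows)


# Two hypothetical windows of 20 latencies (ms): one slow request, then two.
window_a = [100] * 19 + [5000]
window_b = [100] * 18 + [5000] * 2
print(windows_meeting_threshold([window_a, window_b]))
```

The resulting share is what would be compared against the "99.9% of the time" part of the SLO statement.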
The key lesson of this chapter is that setting SLOs is not simply setting numeric targets; it is a complex process of balancing user expectations, technical constraints, and business goals. The book's statement that "SLOs are objectives, they're not formal agreements" stresses the flexibility of the SLO approach and the need for it to evolve, and I felt it was an important point.
To put this chapter into practice, I am considering the following approaches. As for further material, I recommend resources such as "Building SLOs from Scratch" as examples from Japan.
- Re-evaluate current SLOs and examine whether they reflect user experience appropriately.
- Use percentile thresholds to improve SLOs for metrics with a long tail.
- Map service dependencies in detail and recalculate the overall SLO using "dependency math."
- Analyze metric attributes (resolution, quantity, and quality) in detail and reflect them in SLO settings.
- Hold regular cross-team discussions about SLOs and pursue continuous improvement.
From a technical standpoint, the statistical approaches and percentile thresholds the book proposes are especially noteworthy. Implementing them will require capable monitoring systems and analysis tools, for example a system that calculates percentile values in real time and detects SLO violations. A tool that automates "dependency math" and simulates how changes in service dependencies affect SLOs would also be useful.
The book's concept of "operational underload" is also important for SRE practice. The idea that putting a moderate amount of stress on a system maintains the team's ability to respond and surfaces latent problems early overturns conventional thinking about system operations. Putting it into practice will require carefully designed load tests and chaos engineering techniques.
This chapter is full of suggestions that apply directly to SRE work. For example, when setting SLOs for a new service, you can adopt the book's "educated guess" approach: set an initial SLO, then adjust it quickly based on real data. For existing services, it is important to build the habit of regularly re-evaluating whether the current SLO really reflects user expectations.
I felt that the book's idea that "Nines don't matter if users aren't happy" redefines the role of SREs. An approach centered on actual user satisfaction, not just on hitting technical metrics, suggests that SREs can contribute more directly to creating business value. This could raise the strategic importance of SRE and change its position within the organization.
On the other hand, there are challenges in putting this chapter into practice. For example, in a microservice environment with complex dependencies, aligning the SLOs of individual services with the overall SLO is a major challenge. How to measure user satisfaction accurately and reflect it in SLOs is also an area that needs further study and experimentation.
The flexible SLO approach the book proposes may also require organizational culture change. Organizations used to traditional fixed SLAs in particular may resist moving to a more dynamic and adaptive SLO approach. Addressing this requires understanding and support from the whole organization, including management.
To sum up, this chapter provides a comprehensive and practical guide to setting SLOs. The book's realistic, user-centered approach offers many insights that apply directly to SRE work. In particular, its answer to the question of how to set appropriate reliability targets while accepting that 100% is impossible is thought-provoking.
I am convinced that practicing the approaches from this chapter will make SLO setting more effective and, as a result, lead to higher user satisfaction and to achieving business goals. At the same time, we must not forget that setting SLOs is a complex, context-dependent process that requires continuous learning and improvement.
Finally, this chapter reminded me that SRE is not merely a technical role but a strategic one that requires a deep understanding of the business and of user experience and then realizes it technically. Setting and managing SLOs appropriately is central to fulfilling that role, and I expect it to become even more important in SRE practice. "Monitoring user experience of Flutter apps with SLI/SLO" (in Japanese) was also good, so I recommend it.
Chapter 5. How to Use Error Budgets
Chapter 5, "How to Use Error Budgets," digs into the concept and practical use of error budgets, which could be called the core of SRE (Site Reliability Engineering). The book explains persuasively that an error budget is not just a numeric target but a powerful tool for guiding an organization's decisions and actions.
At the start of the chapter, the book states that "Error budgets are the final part of the Reliability Stack, and it takes a lot of effort and resources to use them properly," emphasizing both the importance of error budgets and the difficulty of implementing them. This resonated deeply with my experience as an SRE. I feel every day that introducing error budgets is a complex process that requires organizational culture change as well as solving technical problems.
The basic idea of error budgets is summed up in Figure 5.1: "If you have error budget remaining, ship new features and push to production as often as you'd like; once you run out of it, stop pushing feature changes and focus on reliability instead." This simple principle offers an excellent solution to the adversarial relationship between development and operations teams.
Technically, I was especially interested in the detailed explanation of how to calculate an error budget. The book introduces two approaches, event-based and time-based, and explains the advantages and drawbacks of each. For example, for a service with a 99.7% SLO target over a 30-day window, the error budget is calculated as follows:
(1 - 0.997) × 2592000 = 7776
These 7776 seconds (2 hours, 9 minutes, and 36 seconds) are the "unreliable time" allowed in 30 days. I felt that showing a concrete number like this makes the concept of an error budget easier to understand.
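The same time-based calculation can be written as a small helper. This is a sketch with hypothetical names; the target is passed as a string so that `Fraction` avoids the floating-point rounding that `(1 - 0.997)` would introduce.

```python
from datetime import timedelta
from fractions import Fraction


def time_error_budget(slo_target_pct: str, window: timedelta) -> timedelta:
    """Time-based error budget for a percentage target over a window."""
    budget_share = (100 - Fraction(slo_target_pct)) / 100
    budget_seconds = budget_share * int(window.total_seconds())
    return timedelta(seconds=float(budget_seconds))


budget = time_error_budget("99.7", timedelta(days=30))
print(budget.total_seconds())  # budget in seconds
print(budget)                  # formatted as hours:minutes:seconds
```

Returning a `timedelta` keeps the unit explicit, which helps when the same number has to be shown on a dashboard and quoted in a policy document.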
The book also discusses the choice of time window for error budgets in detail. The comparison of rolling windows and calendar-based windows was especially interesting. I felt the point that calendar-based windows make reporting and planning easier but may fail to reflect a large outage near the end of the month is important in practice.
The book's suggestions on writing an error budget policy were also useful. In particular, I felt that the advice to "Use words like must, may, should, and required in your written policies to help give people freedom to adapt certain parts of them." is important for balancing flexibility and strictness in a policy. The use of RFC 2119, which the book recommends, is useful guidance for writing technical documents.
What stayed with me most in this chapter was the view of the error budget as a decision-making tool rather than a mere numeric target: "Error budget status is just the end result of a bunch of data (SLIs, SLOs, and their targets) that exists to help you make decisions." This idea is important in SRE practice. An error budget can be a powerful tool for promoting communication between teams and supporting the prioritization of resources.
The staged responses the book proposes (for example, focusing 2 people on reliability work at 33% consumption and 4 people at 66%) were helpful as concrete ideas that apply directly to real work. At the same time, the book treats these responses as guidelines to apply flexibly depending on the situation rather than as fixed rules, which I felt appropriately reflects the complexity of real software development.
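The staged response in that example can be expressed as a small lookup. The tier values are taken from the example quoted above; the names `RESPONSE_TIERS` and `engineers_on_reliability` are hypothetical, and in line with the book's stance the result should be read as a suggestion to discuss, not a rule to enforce.

```python
# (share of budget consumed, engineers moved to reliability work),
# ordered from the highest threshold down.
RESPONSE_TIERS = [(0.66, 4), (0.33, 2)]


def engineers_on_reliability(budget_consumed: float) -> int:
    """Suggested number of engineers focused on reliability work."""
    for threshold, engineers in RESPONSE_TIERS:
        if budget_consumed >= threshold:
            return engineers
    return 0


for consumed in (0.10, 0.40, 0.70):
    print(f"{consumed:.0%} consumed -> {engineers_on_reliability(consumed)} engineers")
```

Keeping the tiers in a data structure rather than in code paths makes it easy to adjust them when the policy is reviewed.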
The technical details of calculating error budgets were also useful. The choice between a rolling window and a calendar-based window is an important decision in real system design. As the book points out, a calendar-based window makes reporting and planning easier but may fail to reflect a large outage near the end of the month. I felt this needs careful consideration when designing an error budget.
The book's concept of excluding time was also interesting. The idea that you can set a more realistic SLO by excluding planned maintenance windows or hours when the service is not important from the error budget calculation is useful and applies to many systems.
I felt the book's suggestions on error budget policies are essential for operating error budgets effectively within an organization. Clearly listing owners and stakeholders in the policy becomes more important as the organization grows. Setting staged responses according to the error budget burn rate is also effective for balancing efficient resource allocation with fast response to problems.
The book's emphasis on trust was also good. The idea that an error budget policy should be a guideline for decision making rather than a strict rule is important for responding flexibly to complex real-world situations. I also felt that its emphasis on regularly reviewing and updating the policy matters in a fast-changing technical environment.
The key lesson of this chapter is that an error budget is not just a numeric target but a powerful tool that guides decision making and communication across the organization. Using error budgets effectively requires organizational culture change and a continuous process, not only technical implementation.
As an SRE, I can think of the following approaches for putting what I learned from this chapter into practice:
- Carefully choose the error budget calculation method and time window based on the characteristics of the service and the needs of its users.
- Design staged responses to error budget consumption that fit the size and structure of the team.
- Write an error budget policy that clearly defines owners, stakeholders, responses, and a review schedule.
- Report error budget status regularly and use it for communication and decision making between teams.
- Run education programs to promote understanding of the concept and importance of error budgets across the organization.
From a technical standpoint, building systems that automate error budget calculation and monitoring becomes important. For example, you could develop a dashboard that continuously collects SLI data and calculates and visualizes error budget status in real time. A system that automatically raises alerts based on the error budget burn rate and notifies the right team members would also be useful.
The book's concept of the error budget "burn rate" is especially noteworthy. Implementing it will require effective use of time series databases and analytics tools. For example, a combination of Prometheus and Grafana could be used to visualize the error budget burn rate and do predictive analysis.
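Independent of any particular monitoring stack, the burn rate itself is a simple ratio. The sketch below uses the commonly used definition (observed error rate divided by the error rate the SLO allows), which I am assuming matches the book's usage; the function names and sample numbers are hypothetical.

```python
def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Observed error rate relative to the error rate the SLO allows.

    A value of 1.0 means the budget would be used up exactly at the end
    of the window; higher values mean it runs out sooner.
    """
    if total_events == 0:
        return 0.0
    error_rate = bad_events / total_events
    return error_rate / (1 - slo_target)


def days_until_exhausted(rate: float, window_days: float = 30,
                         budget_remaining: float = 1.0) -> float:
    """Days until the budget is gone if the burn rate stays constant."""
    if rate <= 0:
        return float("inf")
    return window_days * budget_remaining / rate


# Hypothetical hour of traffic: 20 failures in 10,000 requests, 99.9% target.
rate = burn_rate(20, 10_000, 0.999)
print(f"burn rate {rate:.2f}, about {days_until_exhausted(rate):.1f} days of budget left")
```

A projection like this is the kind of value a dashboard or alert rule would track over time.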
Integrating error budgets with a CI/CD (Continuous Integration/Continuous Deployment) pipeline is another effective approach. For example, you could build a system that automatically adjusts the frequency and size of deployments based on the current error budget status. This would embed the error budget concept more directly into the development process.
Reading this chapter, I felt that using error budgets effectively can be a powerful tool not only for improving system reliability but also for promoting alignment and communication across the organization. In particular, error budgets are effective for resolving the traditional conflict between development and operations teams and fostering a culture of working together toward shared goals.
A decision-making process based on error budgets can also raise the strategic importance of SRE. By allocating resources and setting priorities based on error budget status, SREs can contribute more directly to achieving business goals. This suggests that the SRE role may become more strategic and that dialogue with management may deepen.
To sum up, this chapter offers comprehensive and deep insight into the concept and practical use of error budgets. An error budget is not merely a technical metric but a powerful tool for changing an organization's culture and decision-making process. As an SRE, I am convinced that implementing and using error budgets effectively can contribute greatly not only to system reliability but also to the performance of the whole organization.
Future challenges include getting a wider range of stakeholders to understand error budgets and using them across the whole organization. Refining error budget calculation and monitoring, and building systems that support more accurate, real-time decision making, will also be important. Beyond that, integrating error budgets with other management metrics to build a more comprehensive system for evaluating organizational performance is a possible future direction.
Using error budgets effectively plays a central role in SRE practice. I am convinced that actively applying the concepts and techniques from this chapter in daily work will contribute to building more reliable systems and running the organization more effectively. At the same time, we must not forget that the concept of error budgets keeps evolving and that we need to keep learning and adapting as new technologies and methodologies appear.
Part II. SLO Implementation
Part II focuses on implementing and operating SLOs. The chapter on SLO monitoring and alerting was especially good. The author points out the problems with traditional threshold-based alerts and explains the advantages of SLO alerts based on the error budget burn rate. There are also initiatives out there such as OpenSLO.
From a technical standpoint, it contains a lot of practical knowledge, such as how to calculate error budgets and how to treat short-term problems (fast burn) separately from long-term problems (slow burn). These techniques enable more effective incident management and better allocation of resources.
The chapter on data reliability was also interesting. It stresses that the reliability of data services is fundamentally different in nature from that of other services, and explains the importance of setting SLOs that take into account multiple attributes such as data freshness, completeness, and consistency.
The most important lesson I took from this part is that implementing SLOs is closely tied to processes across the whole organization, not only to technical challenges. Operating SLOs effectively requires the cooperation of many stakeholders, including technical teams, product teams, and management. As SREs, we are expected to promote communication among these parties and build SLOs into the organization's decision-making process.
Chapter 6. Getting Buy-In
Chapter 6, "Getting Buy-In," explains in detail one of the most important and most difficult parts of adopting SLOs (Service Level Objectives): building consensus inside the organization. The chapter stresses that adopting SLOs is not merely a technical matter but a change that reaches deep into the organization's culture and decision-making process.
At the start of the chapter the book says, "SLIs, SLOs, and error budgets are really helpful mental tools to reason about the reliability needs of your systems. If you want them to be anything more than just interesting talking points, however, you will need to convince your company (or organization) to implement and live by them." This captures the fact that adopting SLOs means more than a technical implementation, and I strongly agree. Even a technically excellent solution cannot deliver its full effect without understanding and support from the whole organization.
The book walks through the consensus-building process step by step. What I liked most is that it spells out each stakeholder's concerns and an effective way to persuade them. For engineering teams, for example, it says the effective message is: "SLOs (and error budgets) increase both reliability and feature velocity over time. They also make for a better work environment because they align incentives among previously warring factions." I think this point is important for resolving the conflict that always seems to exist between development and operations teams.
From a technical standpoint, the proposal on error budget policy design was especially interesting. The book recommends adopting a single error budget policy for the first year and making it a feature freeze policy. Concretely, with a 99.9% availability SLO there are 43.2 minutes of error budget per 30 days (0.1% of 43,200 minutes); if that budget is exceeded, new feature development pauses and the team concentrates on reliability. I thought this clear, strict policy is an effective way to make the importance of SLOs sink in across the organization.
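The 43.2-minute figure is plain arithmetic, and it can help to keep it in a small helper when discussing other targets or windows. The following is a minimal sketch; the function name `error_budget_minutes` is my own and it assumes a purely time-based availability SLO.

```python
def error_budget_minutes(slo_percent: float, window_days: int) -> float:
    """Minutes of allowed unavailability for a time-based availability SLO."""
    total_minutes = window_days * 24 * 60  # 30 days -> 43,200 minutes
    return total_minutes * (100.0 - slo_percent) / 100.0


for target in (99.0, 99.9, 99.99):
    budget = error_budget_minutes(target, 30)
    print(f"SLO {target}% over 30 days -> {budget:.1f} minutes of budget")
```

Each extra nine cuts the budget by a factor of ten, which is a useful number to have at hand when negotiating a target with stakeholders.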
The book organizes consensus-building around five main stakeholder groups: engineering, operations, product, leadership, and legal. It explains each group's concerns and how to address them in detail, which I found useful as a reference for a real rollout.
The book also covers how to persuade executive leadership. Against the common executive reaction, "In our firm, we strive for 100% customer satisfaction and 100% uptime! Our customers will tolerate no less!", it proposes a realistic approach. Its explanation that perfect reliability is impossible, and that aiming for an appropriate level of reliability lets you balance innovation and stability, was persuasive.
The "thaw tax" the book proposes was also interesting. If a release is made as an exception during a feature freeze, 1.5 times that period is added to the freeze. I think this is an effective way to balance flexibility and strictness in the policy.
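As a rough illustration of the thaw tax arithmetic, here is a minimal sketch. It follows the rule as summarized above (1.5 times the thawed period is added to the freeze); the function name, the use of days as the unit, and the default of 1.5 as a parameter are my own assumptions rather than anything prescribed by the book.

```python
def freeze_after_thaw(remaining_freeze_days: float,
                      thaw_days: float,
                      tax_multiplier: float = 1.5) -> float:
    """Remaining freeze length after an exceptional release window.

    Assumption: the thawed period, multiplied by the tax, is added
    on top of whatever freeze time was still remaining.
    """
    if thaw_days < 0 or remaining_freeze_days < 0:
        raise ValueError("durations must be non-negative")
    return remaining_freeze_days + thaw_days * tax_multiplier


# A 2-day exception during a freeze with 10 days left extends it to 13 days.
print(freeze_after_thaw(10, 2))
```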
On the most important moment in an SLO rollout, the book says, "The most important moment is the first time you exhaust your error budget and need to enforce your policy. That will be the moment that teaches everyone whether or not you are serious about this journey." I agree with this. I think it matters that the book stresses both enforcing the policy strictly and the way doing so changes organizational culture.
The book also shares lessons learned from rollouts. The warning that stood out most was "Too much too soon." Rather than trying to change everything at once, it stresses starting with one part of one product, or one failure domain. I think this incremental approach is important for making a large organizational change succeed.
The advice "Be completely transparent" is important as well. The book stresses making the state of SLOs and error budgets visible and shared across the organization. I consider this essential for getting the most out of an SLO approach and for aligning the whole organization.
From a technical standpoint, I found it interesting that the book touches on visualizing SLOs and error budgets. It does not describe a concrete implementation, but one option is to use a monitoring stack such as Prometheus + Grafana to build dashboards that show SLO status in real time. This lets the whole organization share SLO status and make decisions quickly.
The book closes with common objections to adopting SLOs and rebuttals to them. Its answer to "But we're not Google!" was especially memorable: the SLO-based approach is not specific to Google and applies to organizations of any size.
The most important lesson from this chapter is that, in adopting SLOs, changing organizational culture and building consensus among the people involved matter even more than the technical work. The incremental approach and the stakeholder-specific persuasion techniques the book presents are useful guidelines for a real SLO rollout.
As an SRE, I can think of the following ways to put what I learned from this chapter into practice.
- Identify the key stakeholders in the organization and understand each one's concerns in advance.
- Prepare material that explains SLOs and error budgets in a way non-engineers can follow.
- Start with a small pilot project and produce a success story.
- Define the error budget policy clearly and get agreement across the organization.
- Build dashboards that show the state of SLOs and error budgets and share them across the organization.
- Review and adjust SLOs regularly to keep improving.
From a technical standpoint, it becomes important to build infrastructure for measuring and visualizing SLOs and error budgets. For example, a stack like the following could be used.
- Use Prometheus to collect SLI metrics.
- Use Grafana to visualize the state of SLOs and error budgets in real time.
- Configure Alertmanager to fire alerts according to the error budget burn rate.
- Develop a Kubernetes Operator that automatically controls deployments based on the state of the error budget.
With these pieces in place, the SLO approach can be built into the organization's day-to-day operations.
To achieve the transparency the book emphasizes, technical visualization is not enough; the organization's communication processes need attention too. For example, it may help to set up weekly or monthly SLO review meetings where engineering, product, and management come together to discuss SLO status and what to do next.
Adopting SLOs means more than introducing a technical metric. It can fundamentally change how the whole organization makes decisions and sets priorities. For example, the balance between building new features and improving stability can be decided from data rather than from gut feeling. That allows engineering resources to be allocated better, which in turn leads to higher user satisfaction and progress on business goals.
Adopting SLOs can also be an opportunity to move the organization from a culture of blaming people for failures to a culture of learning and improvement. By tolerating a certain level of failure, the error budget encourages more active experimentation and learning. In the long run this may improve the organization's capacity for innovation and its competitiveness.
Reading this chapter, I felt strongly that the SRE role is becoming more strategic. Through introducing and operating SLOs, it became clear that SREs are in a position to influence not only technical problem-solving but also the direction of the whole organization. I believe that adapting to this change, and sharpening both technical skills and business sense, will be essential for SREs from here on.
In summary, this chapter focuses on the organizational side that is essential to a successful SLO rollout and provides concrete, practical guidance. The book's message is important: the technical implementation is only one part of adopting SLOs, and real success lies in winning understanding and support from the whole organization. As an SRE, I am convinced that practicing the approaches in this chapter makes for a more effective rollout and, as a result, better performance across the organization.
At the same time, an SLO rollout is an opportunity for continuous learning and improvement. Even after the initial rollout, SLOs need to be reviewed regularly for validity and adjusted flexibly as the organization changes and new technologies appear. This continuous improvement process is the key to drawing out the real value of the SLO approach.
Remaining challenges include spreading the concept of SLOs to a wider range of people in the organization, establishing decision-making processes that use SLO data, and building a comprehensive performance evaluation system that integrates SLOs with other business metrics. Working on these will help the SLO approach take deeper root and produce long-term value.
Finally, the book's point that SLOs are not complicated science but a matter of basic arithmetic and discipline is important. It captures the fact that the essence of adopting SLOs lies not in technical complexity but in the organization's will and ability to follow through. As an SRE, I think it is important to keep this in mind while driving alignment and continuous improvement across the organization.
Adopting SLOs is a technical change and, at the same time, a change in organizational culture. By practicing the approaches in this chapter, I am convinced we can contribute both to building more reliable systems and to running organizations more effectively. At the same time, we should not forget that the concept of SLOs keeps evolving, and that we need to keep learning and adapting as new technologies and methodologies appear.
In the rollout process, the book's emphasis on winning the understanding and support of stakeholders deserves particular attention. Even a technically excellent solution cannot deliver its full effect without understanding and support from the whole organization. In that respect, the concrete persuasion techniques the book proposes for each stakeholder group (engineering, operations, product, leadership, legal) are practical and useful.
For engineering teams, for example, the emphasis is that SLOs and error budgets improve both reliability and feature velocity. This speaks directly to the concern many engineers have about a trade-off between reliability work and new features. For operations teams, the emphasis is that the SLO approach resolves conflict with engineering teams and fosters a culture of working together toward a shared goal.
For product teams, the explanation is that SLOs improve feature velocity in the long run. This helps dispel the worry many product managers have that SLOs will slow feature development. Stressing that SLOs are closely tied to user journeys also helps draw out the product team's interest.
For leadership, the emphasis is that the SLO approach promotes alignment across the organization and enables data-driven decisions. In particular, the realistic stance that 100% reliability is impossible and should not even be pursued is an important counterargument to the belief, held by many executives, that the organization should aim for perfection.
For legal teams, the explanation is that SLOs help quantify and manage legal risk. It is especially important to explain the difference between SLAs and SLOs clearly and to stress that SLOs are more realistic, manageable internal targets.
From a technical standpoint, building infrastructure for measuring and visualizing SLOs is an important task. Concretely, a stack like the following could be used.
- Use Prometheus to collect SLI metrics.
- Use Grafana to visualize the state of SLOs and error budgets in real time.
- Configure Alertmanager to fire alerts according to the error budget burn rate.
- Develop a custom Kubernetes Operator that automatically controls deployments based on the state of the error budget.
- Use OpenTelemetry to achieve end-to-end tracing across the distributed system.
Combining these technologies well makes it possible to automate SLO measurement and management and to support real-time decisions.
More important than the technical implementation, however, is building the SLO approach deeply into the organization's daily operations and business decision-making. For example, SLO status could be reviewed during quarterly business planning and used to decide development direction and resource allocation.
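Adopting SLOs is also an opportunity to move the organization from a culture of blaming people for failures to a culture of learning and improvement. By tolerating a certain level of failure, the error budget encourages more active experimentation and learning. To make this stick as organizational culture, the following practices are effective:
- Treat postmortems (post-incident analysis) as opportunities to learn, not as a venue for blame.
- Treat the response to an exhausted error budget as a period of focused system improvement, not as a penalty.
- Encourage innovation and risk-taking, and foster a culture that tolerates "failure" within the error budget.
- Use SLO attainment as an improvement metric for the whole organization, not as an evaluation metric for individuals or teams.
These cultural aspects are as important as the technical implementation, or more so, because the SLO approach cannot reach its full effect unless the organization's culture supports it.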
The book's emphasis on how the first error budget overrun is handled is also worth noting. That first case becomes an important signal of how seriously the organization takes the SLO approach. The response to it should therefore be planned carefully and shared across the organization.
For example, the following approach could be taken the first time the error budget is exceeded:
- Hold a company-wide meeting that includes the executive level and share the situation.
- Strictly carry out the response defined in the error budget policy (for example, a feature freeze).
- Record the improvement work done during this period and its results in detail, and share them across the organization.
- Based on what was learned, review the SLOs and the error budget policy and adjust them as needed.
Through these steps, you can show that the SLO approach is not just a technical metric but something built deeply into how the whole organization is run.
Finally, I want to stress the importance of the incremental approach and the regular reviews the book proposes. SLOs cannot be adopted overnight. Starting with a small pilot project and gradually widening the scope is effective in most cases. And because the initial SLOs are quite likely not to be optimal, it is important to review them regularly (for example, every quarter) and adjust them as needed.
Through this kind of continuous improvement, the SLO approach takes deep root in the organization and starts to produce long-term value. As SREs, leading this process, and supporting the success of the SLO approach from both the technical side and the cultural side, will be an important part of our role.
Adopting SLOs means more than a technical change. It can substantially change how the whole organization is run and how business decisions are made. Making that change succeed takes people who combine technical skill with business sense and who can lead the whole organization.
By practicing the approaches in this chapter and introducing SLOs while aligning the whole organization, I am convinced we can build more reliable systems and run organizations more effectively. This journey is also an opportunity for continuous learning and improvement. It is important to keep reviewing and adapting the approach as new technologies and methodologies appear and as the business environment changes.
Adopting SLOs is a technical challenge and, at the same time, the larger challenge of changing organizational culture. But by getting through it, the organization becomes more resilient and adaptable, and can succeed sustainably in rapidly changing technical and business environments. As SREs, standing at the front line of this change and showing leadership on both the technical and organizational sides is an important role we are being asked to play.
Chapter 7. Measuring SLIs and SLOs
Chapter 7, "Measuring SLIs and SLOs," offers deep insight into implementing and measuring Service Level Indicators (SLIs) and Service Level Objectives (SLOs). The chapter stresses that implementing SLOs is not merely technical work but an important effort closely tied to how the whole organization operates.
At the start of the chapter the book says, "It's one thing to understand the philosophy of what a good SLI for a service might be, but it's another thing entirely to actually understand how to implement and measure it." This captures the large gap between the theory and the practice of SLIs and SLOs, and it resonated strongly with my experience as an SRE. Defining an ideal SLI is relatively easy; measuring it accurately in a real system comes with many technical problems.
The book presents six design goals for an SLO implementation: flexible targets, testable targets, freshness, cost, reliability, and organizational constraints. These goals are important factors in whether an implementation succeeds, and I think SREs should always keep them in mind. The importance of "Flexible Targets" was especially memorable. The book stresses that SLOs need to evolve over time, and it points out how important it is that human operators can adjust SLO parameters without code changes or redeploying software. This flexibility matters in rapidly changing business and technical environments.
From a technical standpoint, the comparison between a Time Series Database (TSDB) and a structured event database (logging system) was especially interesting. The book analyzes the strengths and weaknesses of these approaches in detail. For example, a TSDB suits high-throughput systems but can lack flexibility, while a structured event database is highly flexible but tends to grow linearly in cost. This analysis is a useful guide when choosing technology for an SLO implementation.
The book's explanation of support for statistical distributions in a TSDB was particularly memorable. When implementing percentile-based SLOs, the distribution data types a TSDB provides become important. The book says, "Statistical distributions are incredibly important when implementing SLOs with a TSDB: per our design goals of flexible targets and testable targets, durably stored time series distributions allow us to measure P95 latency one day and P99 the next without changing code, changing configuration, or losing time series history." This flexibility matters for continuously improving and adjusting SLOs.
The book names three common SLO implementation patterns: latency-sensitive request processing, low-lag, high-throughput batch processing, and mobile and web clients. The explanations of these patterns were a useful reference for real system design. The discussion of measuring mobile and web client performance in particular reminded me of the importance of SLOs defined from the end user's point of view.
On the technical details, it was useful that the book explains histogram implementations in a TSDB at length. For example, its explanation of how to approximate percentiles from bucketed data is knowledge that can be applied directly in a real SLO implementation. The reference to the t-digest algorithm by Dunning and Ertl was also helpful for thinking about more advanced implementations.
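To make the idea of approximating percentiles from bucketed data concrete, here is a minimal sketch. It uses linear interpolation inside the bucket that contains the target rank, which is one common approach and not necessarily the exact method the book describes; the function name and the sample bucket boundaries are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def approx_percentile(upper_bounds, counts, q):
    """Approximate the q-quantile (0 < q <= 1) from histogram buckets.

    upper_bounds: sorted upper edge of each bucket; the first bucket starts at 0.
    counts: number of observations that fell into each bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    cumulative = list(accumulate(counts))
    total = cumulative[-1]
    if total == 0:
        raise ValueError("no observations")
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    i = bisect_left(cumulative, rank)
    lower = upper_bounds[i - 1] if i > 0 else 0.0
    below = cumulative[i - 1] if i > 0 else 0
    # Assume observations are spread evenly inside the bucket.
    fraction = (rank - below) / counts[i]
    return lower + (upper_bounds[i] - lower) * fraction


# Latency buckets in seconds: <=0.1, <=0.25, <=0.5, <=1.0
bounds = [0.1, 0.25, 0.5, 1.0]
counts = [600, 300, 80, 20]
print("P95 ~", approx_percentile(bounds, counts, 0.95))
print("P99 ~", approx_percentile(bounds, counts, 0.99))
```

Because the buckets are stored rather than a single precomputed percentile, the same data answers both the P95 and the P99 question, which is the flexibility the quoted passage describes. The accuracy of the estimate depends on how well the bucket boundaries match the latency range you care about.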
The book's discussion of integrating distributed tracing with SLOs was also interesting: "Historically, distributed tracing was thought of as its own "product" with valuable but highly specialized use cases, mostly around performance analysis and distributed debugging. Really, though, distributed traces are just a data source that can be applied to a variety of problems, and SLIs and SLOs are well qualified to benefit from distributed tracing data and technology." This view points to new possibilities for implementing SLOs in distributed systems, and I think it will be an important guide for SRE practice going forward.
The book stresses that implementing SLIs and SLOs is not just a technical problem but one that involves changing organizational culture. I was particularly struck by its point about the importance of the discoverability of SLIs and SLOs. To get the most out of an SLO approach, SLIs and SLOs must be easy to find and understand across the whole organization. This suggests that our role as SREs goes beyond technical implementation to promoting alignment across the organization.
An important lesson from this chapter is that implementing SLIs and SLOs involves complex, multistage computation and requires careful consideration of many trade-offs. The book says, "At the end of the day, most useful SLIs and SLO measurements are complex, multistage computations, and like any such computations, their implementation involves trade-offs and conflicting goals that must be held in tension." This captures both the difficulty of implementing SLOs and its importance.
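As an SRE, I can think of the following ways to put what I learned from this chapter into practice:
- Evaluate the organization's current infrastructure (TSDB, logging system) and check its fit against the six design goals for an SLO implementation.
- Using the common implementation patterns (latency-sensitive request processing, low-lag, high-throughput batch processing, mobile and web clients) as a reference, define an SLO implementation strategy that suits your own systems.
- When using a TSDB, give weight to support for statistical distributions and consider a custom implementation (for example, bucketed histograms) where needed.
- When using a structured event database, weigh cost against freshness carefully and be cautious about using it for high-throughput systems.
- Consider integrating the distributed tracing system with SLOs to enable more detailed performance analysis and problem-solving.
- Establish a process for documenting and sharing SLIs and SLOs within the organization to improve their discoverability.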
For integrating distributed tracing with SLOs, using an open source framework such as OpenTelemetry makes a more effective implementation possible. For example, OpenTelemetry span data can be used to implement SLOs that take dependencies between services into account.
To improve the discoverability of SLIs and SLOs, integrating with the organization's service catalog or wiki is effective. For example, each service's documentation page could state its SLI and SLO definitions and link to a dashboard showing the current state, combining technical implementation with organizational process.
Reading this chapter, I came to see that implementing SLIs and SLOs is not just a technical problem but the core of organization-wide alignment and continuous improvement. The flexibility and testability the book emphasizes are especially important in a rapidly changing business environment.
I was also reminded how hard it is to balance cost and reliability. When implementing SLOs for a high-throughput system, whether to choose a TSDB or a structured event database, or how to combine the two, needs careful thought. The choice depends on many factors, such as the size of the organization, its technology stack, and its budget.
The six design goals the book proposes (flexible targets, testable targets, freshness, cost, reliability, organizational constraints) are useful as an evaluation framework for an SLO implementation project. I am convinced that keeping them in mind throughout design and implementation leads to a more effective SLO system.
In summary, this chapter provides a comprehensive, practical guide to measuring SLIs and SLOs. The book's insights, grounded in extensive experience, make the complexity and importance of SLO implementation clear. As an SRE, I am convinced that practicing the approaches in this chapter makes for a more effective implementation and, as a result, better performance across the organization.
At the same time, we should not forget that implementing SLOs is a process of continuous learning and improvement. It is important to keep reviewing and adapting the approach as technology evolves and the organization changes. In particular, we need to watch for new possibilities such as integration with distributed tracing and AI-based SLO prediction.
Remaining challenges include implementing predictive analysis on SLO data, managing SLOs end to end in a microservice architecture, and integrating business KPIs with SLOs. Working on these will let SRE practice evolve further and create business value more effectively.
Finally, the book's point that implementing SLIs and SLOs is a complex, multistage computation involving many trade-offs is important. Keeping that in mind while pursuing the SLO implementation that best fits the organization is, I think, an important part of our role as SREs.
The chapter's figures show concretely how to implement SLOs with a TSDB and are useful for real system design. In particular, the histogram implementation example in Figure 7.3 is a concrete guide for implementing percentile-based SLOs.
Measuring SLIs and SLOs is central to SRE practice. By actively bringing the concepts and techniques from this chapter into daily work, I am convinced we can help build more reliable systems and run organizations more effectively. At the same time, we should not forget that the concepts of SLIs and SLOs keep evolving, and that we need to keep learning and adapting as new technologies and methodologies appear.
Chapter 8. SLO Monitoring and Alerting
Chapter 8, "SLO Monitoring and Alerting," digs into the practical implementation of monitoring and alerting based on SLOs (Service Level Objectives). The book points out the problems with conventional threshold-based alerts and explains in detail how SLO alerts solve them and enable more effective system operation. Tools such as pyrra help with managing SLOs, calculating error budgets, and creating recording and alerting rules. Grafana also has several features in this area, which I recommend.
At the start of the chapter the book says, "SLO alerting really is one of the most promising developments in the management of production systems today. It promises to get rid of a lot of the chaos, the noise, and the sheer uselessness of conventional alerting that teams experience, and replace them with something significantly more maintainable." This captures the importance and potential of SLO alerting, and it resonated strongly with my experience as an SRE. Feeling the limits of conventional alerting every day, I have high hopes for the new approach SLO alerting offers.
The book analyzes the problems with conventional threshold-based alerts in detail. The points that stood out were:
- Thresholds stop being appropriate as time passes
- Reliance on metrics that do not directly reflect user experience
- Loss of context
- Alert fatigue and the fog-of-war effect
These problems match my own experience as an SRE, and I could relate to them. The observation that "Human responses to alerts gradually decay in energy over time." is important. Alert fatigue is a major cause of lower efficiency and lower response quality in operations teams.
The SLO alerting approach the book proposes is an attractive answer to these problems. Alerts based on the rate at which the error budget is consumed use a metric tied directly to user experience, which makes the alerts more meaningful.
The book gives the following formula for the rate of error budget consumption, on which alerts are then based.
Rate of error budget consumption = (observed errors per [time period or event count]) / (allowable errors per [time period or event count])
The rolling window concept the book proposes is also useful. I thought the approach of treating short-term problems (fast burn) separately from long-term problems (slow burn) would be effective in real operations.
The book also gives concrete guidance on implementing SLO alerts. Its proposal to page when 2% of the error budget is consumed in one hour, and to open a ticket when 10% is consumed in three days, is a practical and useful guideline.
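The consumption-rate formula and the two thresholds above can be connected with a little arithmetic. The sketch below is an illustration under stated assumptions: it assumes a 30-day SLO window (the review does not state which window the thresholds refer to), and the function names are my own.

```python
def burn_rate(observed_error_ratio: float, slo_target: float) -> float:
    """Observed errors divided by allowable errors, both as ratios of events."""
    allowed_error_ratio = 1.0 - slo_target
    return observed_error_ratio / allowed_error_ratio


def threshold_burn_rate(budget_fraction: float,
                        alert_window_hours: float,
                        slo_window_hours: float) -> float:
    """Burn rate that consumes `budget_fraction` of the budget within the alert window."""
    return budget_fraction * slo_window_hours / alert_window_hours


SLO_WINDOW_HOURS = 30 * 24  # assumption: 30-day SLO window

page_at = threshold_burn_rate(0.02, 1, SLO_WINDOW_HOURS)         # 2% in 1 hour
ticket_at = threshold_burn_rate(0.10, 3 * 24, SLO_WINDOW_HOURS)  # 10% in 3 days
print(f"page when burn rate >= {page_at:.1f}")
print(f"ticket when burn rate >= {ticket_at:.1f}")

# Example: 1.5% of requests failing against a 99.9% SLO.
current = burn_rate(0.015, 0.999)
print(f"current burn rate {current:.1f} -> page: {current >= page_at}")
```

A burn rate of 1 means the budget would last exactly the SLO window, so under the 30-day assumption the fast-burn condition corresponds to spending budget more than fourteen times too quickly, while the slow-burn condition catches a steady drain at roughly the sustainable limit.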
As the book itself points out, however, implementing SLO alerts has its challenges. Introducing SLO alerts into existing (brownfield) systems comes with various difficulties. The following approaches the book proposes were a useful reference for dealing with this:
- Show the human impact of the status quo
- Review the existing outage footprint
- Run the old and new systems in parallel
I thought these would be effective strategies when pushing for SLO alerting inside an organization.
In the chapter's conclusion, the book makes the following important recommendations:
- Move away from alerts based on internal attributes (such as CPU utilization)
- Check the technical capabilities of your alerting system
- Recognize the importance of observability
- Take into account the effort and cost of setting SLOs
These recommendations are important guidelines for implementing and operating SLO alerts effectively.
As an SRE, I can think of the following ways to put what I learned from this chapter into practice:
- Review the current alert configuration and identify alerts based on internal attributes
- Define SLIs tied directly to user experience and set SLOs based on them
- Calculate the error budget burn rate and set alerts based on it
- Configure short-term (fast burn) and long-term (slow burn) alerts separately
- Evaluate the alerting system's capabilities and consider an upgrade if needed
- Introduce tools and practices that improve observability
- Discuss SLO settings, and the operational load they bring, with stakeholders
Reading this chapter, I understood that introducing SLO alerts is not just a technical change; it can fundamentally change how the whole organization operates and responds to incidents. An approach centered on user experience connects directly to the essential purpose of SRE, which is improving user satisfaction.
I was also reminded that introducing SLO alerts calls for a careful approach. I felt strongly how important it is to stay consistent with existing systems and organizational culture. As an SRE, I believe it is important to introduce SLO alerts in stages, attending not only to the technical implementation but also to alignment across the organization.
In summary, this chapter provides a comprehensive, practical guide to implementing SLO alerts. The book's insights, grounded in extensive experience, make both the potential and the challenges of SLO alerting clear. As an SRE, I am convinced that practicing the approaches in this chapter will produce a more effective alerting setup and, as a result, better system reliability and higher user satisfaction.
At the same time, we should not forget that introducing SLO alerts is a process of continuous learning and improvement. It is important to keep reviewing and adapting the approach as technology evolves and the organization changes.
Finally, I want to stress once more the chapter's statement that "SLO alerting promises to get rid of a lot of the chaos, the noise, and the sheer uselessness of conventional alerting that teams experience, and replace them with something significantly more maintainable." Keeping this in view while pursuing the SLO alerting implementation that best fits the organization is, I think, an important part of our role as SREs.
Chapter 9. Probability and Statistics for SLIs and SLOs
Chapter 9, "Probability and Statistics for SLIs and SLOs," digs into the importance of probability and statistics in designing and implementing SLOs (Service Level Objectives) and SLIs (Service Level Indicators). The chapter aims to give SREs more precise, mathematically grounded answers to the complex problems they face.
At the start of the chapter the book says: "Reliability is expensive, and figuring out the amount of reliability you need is crucial for making the most of your resources." This captures an essential problem of SRE. Deciding on the right level of reliability is extremely important for balancing resource use against customer satisfaction.
The chapter focuses mainly on two important questions: how to set SLOs appropriately and how to calculate SLIs accurately. These come up constantly when launching new services and improving existing ones. For example, the chapter takes up how to set an SLO for a service that has dependencies, and how to calculate an SLI for low-frequency events.
To address these questions, the book introduces techniques from probability theory and statistics. What stood out was its use of Bernoulli trials and the Poisson distribution. These concepts are useful for modeling service availability and reliability mathematically.
For example, the book takes a service with 99.99% availability and recasts it as a coin-flipping problem. The analogy is effective for building intuition about probability. The book says: "If you flipped a coin to decide some question, you'd probably expect the probability of heads or tails to be about 50%. Mathematically, we say that the coin has a bias of .5." This explains a complicated probabilistic idea clearly through a familiar example.
What I found interesting in this chapter was the concept of expected value and its applications. The book explains that the expected value is an important property of a probability distribution and the best estimate for predicting the output of a process. But it also mentions the limits of expected value: "Unfortunately, despite its name, the expected value of a distribution is not always a good description of the values that would come out of sampling from it." This points to something SREs need to watch for when applying mathematical models to real systems.
To complement expected value, the book introduces the median. For distributions that are easily affected by outliers, the median can be a more appropriate representative value. This matters when setting SLOs. For example, when setting a response-time SLO, using the median can be an effective way to remove the influence of extremely long response times.
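The difference between the expected value and the median is easy to see with a handful of numbers. This is a minimal sketch using Python's standard `statistics` module; the latency values are made up for illustration.

```python
import statistics

# Hypothetical response times in milliseconds; one request hit a long timeout.
latencies_ms = [100, 110, 95, 105, 98, 102, 5000]

mean_ms = statistics.mean(latencies_ms)
median_ms = statistics.median(latencies_ms)

print(f"mean   = {mean_ms:.1f} ms")   # pulled up by the single outlier
print(f"median = {median_ms:.1f} ms")  # still describes a typical request
```

The single slow request drags the mean far away from what most users experienced, while the median barely moves. Note that the outlier still matters to the user who hit it, which is why tail percentiles are usually tracked alongside the median rather than replaced by it.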
The chapter also explains statistical estimation techniques such as Maximum Likelihood Estimation (MLE) and Maximum a Posteriori (MAP) in detail. These techniques are important for making reliable estimates from limited data, and the book shows concretely how to apply them to SLI calculations.
It had been years since I last heard about Bayesian estimation, and its application here was very good. The book explains in detail how to model prior knowledge and beliefs mathematically and combine them with new observations to make an estimate. This is effective for calculating SLIs for low-frequency events or when data is limited.
The book says: "If we have good reason to think some values of p are more likely than others before we get any evidence, then this allows us to incorporate those prior beliefs into our calculations." This gives SREs a way to reflect past experience and expertise in SLI calculations, which is practical.
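A small worked example shows how a prior changes the estimate when data is scarce. The sketch below assumes a Beta prior over the success probability p, which is the standard conjugate choice for Bernoulli trials; the review does not say which prior the book uses, and the prior parameters and sample counts here are invented for illustration.

```python
def mle(successes: int, failures: int) -> float:
    """Maximum likelihood estimate of p: the observed success ratio."""
    return successes / (successes + failures)


def map_estimate(successes: int, failures: int,
                 prior_alpha: float, prior_beta: float) -> float:
    """Mode of the Beta posterior; valid when both posterior parameters exceed 1."""
    alpha = prior_alpha + successes
    beta = prior_beta + failures
    if alpha <= 1 or beta <= 1:
        raise ValueError("posterior mode formula needs alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)


# Low-traffic endpoint: only 50 requests observed, 1 of them failed.
successes, failures = 49, 1

# Assumed prior belief from past experience: roughly 99.9% success.
prior_alpha, prior_beta = 999, 2

print(f"MLE = {mle(successes, failures):.4f}")
print(f"MAP = {map_estimate(successes, failures, prior_alpha, prior_beta):.4f}")
```

With only 50 observations, one failure makes the MLE swing to 98%, while the MAP estimate stays close to the prior until more evidence accumulates. How strongly to weight the prior is a judgment call and should be revisited as real data comes in.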
The second half of the chapter explains queueing theory and its applications in detail. Using models such as the M/M/1 queue and the M/M/c queue, the book shows how to analyze system latency and throughput mathematically. These models are useful for predicting system performance and for capacity planning.
As an application of queueing theory, the book analyzes a batch processing system. It models the arrival pattern of requests with a Poisson distribution and analyzes its relationship to the system's processing capacity mathematically. This analysis offers useful insight for setting SLOs and planning how to scale a system.
In the chapter's conclusion the book says: "The power of thinking in a probabilistic and statistical manner is that it allows verification of the gut feel that most team members will have developed around the behavior of the system." This sums up the message of the whole chapter. Probabilistic and statistical techniques are a powerful tool for checking an SRE's experience and intuition mathematically and for making more reliable decisions.
Figure 9-11 presents a histogram showing the performance of a system with high availability (99.99%). It shows visually that even a system with very high reliability can still experience a small number of failures, and it makes clear that perfect reliability is unattainable.
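The point that a 99.99% system still fails can be checked numerically by treating each request as an independent Bernoulli trial. The sketch below is my own illustration and does not reproduce the figure; independence between requests is an assumption that real systems often violate, since failures tend to cluster.

```python
import math


def prob_at_least_one_failure(success_prob: float, n_requests: int) -> float:
    """Chance of seeing one or more failures in n independent requests."""
    return 1.0 - success_prob ** n_requests


def prob_k_failures(k: int, n_requests: int, success_prob: float) -> float:
    """Binomial probability of exactly k failures in n independent requests."""
    failure_prob = 1.0 - success_prob
    return (math.comb(n_requests, k)
            * failure_prob ** k
            * success_prob ** (n_requests - k))


p = 0.9999
n = 10_000
print(f"P(at least one failure in {n} requests) = "
      f"{prob_at_least_one_failure(p, n):.3f}")
for k in range(4):
    print(f"P(exactly {k} failures) = {prob_k_failures(k, n, p):.3f}")
```

Under these assumptions, a service meeting its 99.99% target sees at least one failed request in most batches of ten thousand, so a single failure is not by itself evidence that the SLO is being missed.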
Figures 9-29 and 9-30 are graphs of the relationship between system utilization and latency. They show clearly that latency rises nonlinearly as utilization increases. Understanding this relationship is extremely important for capacity planning and for setting SLOs.
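The nonlinear rise in latency can be reproduced with the simplest queueing model. The sketch below uses the standard M/M/1 result for mean time in system, W = 1 / (mu - lambda); I am assuming M/M/1 for illustration and do not know which exact model underlies the figures. The service rate of 100 requests per second is a made-up value.

```python
def mm1_mean_latency(service_rate: float, utilization: float) -> float:
    """Mean time in system (queueing + service) for an M/M/1 queue, in seconds.

    service_rate: requests the server can complete per second (mu).
    utilization: arrival rate divided by service rate (rho = lambda / mu).
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return 1.0 / (service_rate * (1.0 - utilization))


MU = 100.0  # hypothetical: 100 requests per second, i.e. 10 ms per request
for rho in (0.1, 0.5, 0.8, 0.9, 0.95, 0.99):
    latency_ms = mm1_mean_latency(MU, rho) * 1000
    print(f"utilization {rho:4.0%} -> mean latency {latency_ms:7.1f} ms")
```

Going from 50% to 90% utilization multiplies mean latency by five, and the last few percent before saturation cost far more than that. This is why a latency SLO effectively sets a ceiling on how hot a system can be run.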
An important lesson from this chapter is that probability and statistics are powerful tools for designing and implementing SLOs and SLIs. They back up intuitive judgment with mathematics and enable more precise, more reliable decisions.
The chapter is also clear about the limits of mathematical models. The book warns: "While models are helpful, they cannot be completely correct. This is exactly why they are models." SREs should always keep this in mind when applying mathematical models to real systems.
As an SRE, I can think of the following ways to put what I learned from this chapter into practice:
- When setting SLOs, consider the properties of the distribution (expected value, median, percentiles), not just a simple average.
- Collect system performance data and model it with an appropriate probability distribution (for example, the Poisson or exponential distribution).
- Use Bayesian estimation to reflect past experience and expertise in SLI calculations.
- Use queueing theory to give capacity planning and scaling strategy a mathematical basis.
- Use several of these techniques together to predict the behavior of complex systems and set appropriate SLOs.
- Use statistical techniques to calculate confidence intervals for SLIs and quantify measurement uncertainty.
Practicing these approaches makes it possible to design and implement SLOs and SLIs that are more precise and more reliable.
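As one way to quantify the uncertainty of a measured SLI, the sketch below computes a Wilson score interval for a success ratio. The Wilson interval is my choice for illustration, since it behaves well when the ratio is close to 1; the review does not say which interval, if any, the book uses. The function name and the sample counts are hypothetical.

```python
import math


def wilson_interval(good: int, total: int, z: float = 1.96):
    """Approximate 95% confidence interval for a success ratio (Wilson score)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = good / total
    denom = 1.0 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    half_width = (z * math.sqrt(p_hat * (1 - p_hat) / total
                                + z * z / (4 * total * total))) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)


for good, total in ((999, 1000), (9990, 10_000), (999_000, 1_000_000)):
    low, high = wilson_interval(good, total)
    print(f"{good}/{total}: SLI = {good / total:.4f}, "
          f"95% CI = [{low:.5f}, {high:.5f}]")
```

All three samples have the same observed ratio of 99.9%, but the interval narrows sharply as the number of events grows. For a low-traffic service, the interval can be wide enough that the data cannot tell you whether the SLO was met.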
From a technical standpoint, tools and libraries for implementing the mathematical techniques in this chapter are also important. For example, Python's SciPy library provides functions that make it easy to work with the probability distributions and statistical estimation techniques introduced here. Combined with a monitoring tool such as Prometheus, this makes it possible to calculate SLIs in real time and analyze them statistically.
It is also worth considering machine learning techniques for building more advanced SLO prediction models. For example, a time series forecasting model could predict where an SLI is heading so that the system can be adjusted proactively.
Reading this chapter, I felt strongly that the SRE role is becoming more mathematical and analytical. The shift from simple rule-based monitoring and alerting to precise analysis grounded in probability and statistics seems an inevitable consequence of systems growing in complexity and scale.
At the same time, introducing this mathematical approach has its challenges. Getting the whole organization to understand these concepts and put them into practice requires continuing education and cultural change. As SREs, explaining these concepts clearly to the whole organization, including non-engineers, and demonstrating their value is also an important part of the role.
In summary, this chapter provides concrete ways to apply probability and statistics to the design and implementation of SLOs and SLIs. These techniques have the potential to give SREs more precise, more reliable answers to the complex problems they face. The chapter also stresses recognizing the limits of mathematical models and balancing them against how real systems behave.
As an SRE, I am convinced that practicing the approaches in this chapter makes for more effective SLO and SLI design and, as a result, better system reliability and higher user satisfaction. At the same time, we should not forget that applying these techniques is a process of continuous learning and improvement. It is important to keep reviewing and adapting the approach as new technologies and methodologies appear.
Finally, I want to stress once more the chapter's statement that "The power of thinking in a probabilistic and statistical manner is that it allows verification of the gut feel that most team members will have developed around the behavior of the system." Keeping this in view, and balancing intuition with mathematical analysis while pursuing more precise and more reliable operations, is, I think, an important part of our role as SREs.
The probabilistic and statistical techniques in this chapter are central to SRE practice. By actively bringing them into daily work, I am convinced we can help build more reliable systems and run organizations more effectively. We should also remember that applying them requires care and continuous learning, and that the approach must keep being reviewed and adapted as new technologies, methodologies, and business requirements appear.
As SREs, it is important to spread the knowledge from this chapter across the organization and to foster a culture of data-driven decisions. An approach based on probability and statistics does not just improve system reliability; it improves the organization's decision-making as a whole and enables more efficient resource allocation.
Remaining challenges include turning these mathematical techniques into tools and frameworks that are easier to use and understand. As machine learning and artificial intelligence advance, I also look forward to next-generation SLO/SLI management systems that use more advanced prediction models and optimization algorithms.
The probabilistic and statistical techniques in this chapter are a powerful weapon in SRE practice. Used properly, I am convinced they enable more precise and more reliable operations and, as a result, higher user satisfaction and progress on business goals. As SREs, it is important to keep learning, to actively take in new knowledge and technology, and to keep contributing to the continuous improvement of both the organization and its systems.
Chapter 10. Architecting for Reliability
Chapter 10, "Architecting for Reliability," digs into why and how to design systems with SLOs (Service Level Objectives) in mind. It explains in detail how system architects can design highly reliable systems while taking SLOs into account. I also pointed to this chapter in my post introducing a systems engineering syllabus for people who want to become SREs.
At the start of the chapter the authors write: "This chapter focuses on designing systems from the ground up with SLOs in mind." The point is that SLOs belong in the earliest stage of system design. I think this matters. SLOs are usually retrofitted onto existing systems, but considering them from the design stage produces systems that are more effective and more reliable.
What struck me most was the discussion of user journeys. The authors write: "User journeys, which represent the same concept as SLIs (see Chapter 3), help us understand these interactions, as well as the implications for the user when the system does not meet its objectives." This puts user experience, not just technical metrics, at the center of system design. As SREs we tend to fixate on technical metrics, and we should not forget that user satisfaction is ultimately the most important indicator.
The chapter walks through many design considerations in detail, including an interesting analysis of how hardware choices affect a system's SLOs. For example, the authors write: "A system cannot offer an SLO greater than any of its dependencies' SLOs." This is essential when thinking about the SLO of a whole system. It reminded me how important it is to understand the SLOs of the components you depend on and to set the overall SLO with them in mind.
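To make the dependency ceiling concrete for myself, I sketched the arithmetic below. It is only a rough model, assuming every dependency sits in the request path and fails independently; the three SLO values and the function names are hypothetical and do not come from the book.

```python
from math import prod


def slo_ceiling(dependency_slos: list[float]) -> float:
    """Upper bound from the quote: no better than the weakest dependency."""
    return min(dependency_slos)


def serial_estimate(dependency_slos: list[float]) -> float:
    """Estimate when every dependency must succeed and failures are independent."""
    return prod(dependency_slos)


# Hypothetical dependency SLOs: load balancer, application tier, storage.
deps = [0.9999, 0.999, 0.9995]
print(f"ceiling (weakest dependency): {slo_ceiling(deps):.4%}")
print(f"serial estimate:              {serial_estimate(deps):.4%}")
```

The serial estimate always lands at or below the ceiling, which is why a chain of individually good dependencies can still leave little room for the service's own failures.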
As a concrete example of hardware choices, the chapter compares storage options. Figure 10-1 compares the read latency and IOPS of hard disks, SSDs, and RAM, making clear how each choice affects SLIs. This comparison is a valuable guide for the decisions made early in system design.
The chapter also touches on the monolith versus microservices debate. While stressing the benefits of service-oriented architecture (SOA), the authors write: "An open-ended system, one that allows for extension and change, is superior to a closed-ended system." This matters in a fast-changing business environment. As an SRE, I see extensibility and ease of change as indispensable for long-term operability and reliability.
Anticipating and responding to failure modes is covered in detail as well. The authors write: "When designing systems it's important to anticipate failure modes, that is, the problems that a system may realistically encounter and that it can respond to in order to maintain its SLOs." This goes to the core of SRE. Raising reliability means designing not only for normal operation but also for the abnormal states the system can foresee and respond to.
The chapter also discusses design considerations by request type (synchronous, asynchronous, batch). Handling each type appropriately improves the performance and reliability of the system as a whole. The following point about batch processing stood out: "Batch processing of requests typically happens because their results are not time-sensitive or in the critical path, yet SLIs still play an important role: they provide measurements for KPIs such as the duration of each batch process, meaning how long the process takes to execute, and the number of requests processed in each batch." This is a useful guide when setting SLOs for asynchronous and batch workloads.
The section on quantitative analysis was also interesting. The authors show that a system's availability can be expressed as a combination of the availability of its components, which is useful for understanding and improving the reliability of complex systems.
1 - SLO = ((MTTD + MTTM) / MTBF) × IMPACT
This formula ties system reliability to how quickly people detect and respond to failures, which makes it directly applicable to everyday SRE work.
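Here is a small worked example of the formula. I am reading MTTD as mean time to detect, MTTM as mean time to mitigate, MTBF as mean time between failures, and IMPACT as the fraction of users or requests affected by a typical failure; those expansions and all the input numbers are my own assumptions for illustration.

```python
def achievable_slo(mttd_min: float, mttm_min: float,
                   mtbf_min: float, impact: float) -> float:
    """Solve 1 - SLO = ((MTTD + MTTM) / MTBF) * IMPACT for SLO."""
    if mtbf_min <= 0:
        raise ValueError("MTBF must be positive")
    unreliability = (mttd_min + mttm_min) / mtbf_min * impact
    return 1 - unreliability


# Hypothetical inputs: 5 min to detect, 25 min to mitigate,
# one failure every 30 days, half of the users affected each time.
THIRTY_DAYS_MIN = 30 * 24 * 60
slo = achievable_slo(5, 25, THIRTY_DAYS_MIN, 0.5)
print(f"achievable SLO is roughly {slo:.4%}")
```

Halving detection plus mitigation time, or halving the blast radius, has the same effect on the result, which is a handy way to compare investments in alerting against investments in isolation.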
The second half of the chapter explains the importance of system dependencies. The authors write: "Once your product and engineering perspectives agree, you can develop SLOs, and we can turn back to 'the system.' Thus far we have designed a system that solves our problem as designed, without building any nonessential software." This stresses avoiding unnecessary complexity and focusing on the essential problem.
Figure 10-5 shows visually why it matters to understand system boundaries. It highlights the need to identify the "black boxes" in a system (third-party services, cloud-based systems, and so on) and to understand how they affect overall reliability.
The key lesson of the chapter is that SLO-aware design is not just a technical exercise but a process tightly bound to user experience. The authors write: "In order to have effective SLOs, we need to reflect the user experience, not only system performance." This is an important perspective in SRE practice.
From a technical standpoint, the design methodology presented here is practical and useful. Setting SLIs and SLOs from user journeys and then deriving the architecture from them is an approach that applies to many SRE projects.
The concept of graceful degradation is also emphasized. The authors write: "Given congestion on the internal network between application servers and the storage component, a conscious architectural decision will, for example, allow our image-serving system to degrade such that thumbnail pages continue to serve within 250 ms, even though loading the detail view might take longer." It is an important design principle: even when part of the system has a problem, the system keeps functioning as a whole and the impact on users is kept to a minimum.
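One way to picture graceful degradation in code is a latency budget with a cheaper fallback. The sketch below is my own interpretation, not the book's design: the function names, the simulated slow storage call, and the idea of falling back to a cached thumbnail are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def serve_detail_view(fetch_full_image, fetch_thumbnail, budget_s=0.25):
    """Return the full image if it arrives within the budget, else degrade."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fetch_full_image)
    try:
        return future.result(timeout=budget_s), "full"
    except FutureTimeout:
        # Storage path is congested: serve the cheaper result instead of failing.
        return fetch_thumbnail(), "degraded"
    finally:
        # Do not make the response wait for the slow call to finish.
        pool.shutdown(wait=False)


def slow_storage():
    time.sleep(0.3)  # hypothetical congested backend
    return b"full-resolution bytes"


def cached_thumbnail():
    return b"thumbnail bytes"


print(serve_detail_view(slow_storage, cached_thumbnail, budget_s=0.05))
```

The design choice worth noticing is that the degraded path is decided ahead of time, which is what the authors mean by a conscious architectural decision.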
Reading this chapter, I felt strongly that a system should be designed with the architect's view in hand and with user experience and business goals always in mind. It also reminded me that an SLO works well as a guide for system design, not merely as a numeric target.
To sum up, the chapter is a comprehensive and practical guide to SLO-aware system design. It covers the key aspects: the importance of user journeys, the impact of hardware choices, the benefits of microservice architecture, anticipating and responding to failure modes, handling different request types, quantitative analysis, and understanding dependencies.
As an SRE, I am convinced that practicing these approaches makes it possible to design systems that are more reliable and more attentive to user experience. Considering SLOs from the earliest design stage and deriving the architecture from user journeys should work well on many projects.
The chapter's emphasis on understanding and managing dependencies also matters in practice. Modern systems depend on cloud services and third-party APIs, so understanding how these "black boxes" affect the overall SLO, and managing them accordingly, is indispensable.
Looking ahead, we need to work out how the approaches in this chapter apply to rapidly evolving cloud technology and to newer architectural patterns such as serverless. How to define and manage SLOs for systems built on machine learning and AI is likely to become an important research topic as well.
Finally, I want to repeat the sentence from the section "SLOs as a Result of System SLIs": "The SLOs for a system follow from the SLIs we have identified, although not necessarily directly: in order to have effective SLOs, we need to reflect the user experience, not only system performance." Keeping this in view while balancing technical metrics against user experience and pursuing more effective designs is, I think, an important part of our role as SREs.
The SLO-aware design approach in this chapter plays a central role in SRE practice. I am convinced that adopting these concepts and methods in daily work contributes to more reliable systems and more effective organizations. At the same time, design methodology keeps evolving, so we have to keep learning and adapting as new technologies and methods appear.
As an SRE, it is important to spread these lessons across the organization and to foster a culture of SLO-centered system design. Designing flexible, extensible systems that prioritize user experience and balance reliability with performance will, I am convinced, lead to higher user satisfaction and to business goals being met over the long term.
Chapter 11. Data Reliability
Chapter 11, "Data Reliability," focuses on the reliability of data services and digs into how to set and operate SLOs (Service Level Objectives) and SLIs (Service Level Indicators) for them. It explains in detail how data reliability differs from the reliability of other services, and how SLOs specific to data services should be set and measured.
At the start of the chapter the authors write: "The goal of this chapter is to explore what makes SLOs for data services different from SLOs for other services." Data reliability involves not only system availability and performance but also the quality and integrity of the data itself, so it needs its own considerations.
What struck me most is that the chapter classifies data reliability into 13 properties and explains each in detail. They are divided into data properties (7) and data application properties (6).
The data properties are:
1. Freshness
2. Completeness
3. Consistency
4. Accuracy
5. Validity
6. Integrity
7. Durability
The data application properties are:
1. Security
2. Availability
3. Scalability
4. Performance
5. Resilience
6. Robustness
The detailed treatment of these properties is very useful for viewing data service reliability from multiple angles. I was also impressed that a concrete example SLO is given for each property. For Freshness, for instance, the example is:
"Example SLO: 97% of data is available in the dashboard tool within 15 minutes of an event occurring."
Concrete examples like this are a valuable guide when setting real SLOs.
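To see how such a freshness SLO might be measured, I wrote the following sketch. It assumes we can record, per event, when it occurred and when it became visible in the dashboard tool; the tuple layout, the function name, and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def freshness_sli(events, threshold=timedelta(minutes=15)):
    """Fraction of events visible within `threshold` of occurring.

    `events` is a list of (occurred_at, visible_at) pairs; visible_at is
    None when the event never reached the dashboard.
    """
    if not events:
        raise ValueError("no events to evaluate")
    fresh = sum(
        1 for occurred_at, visible_at in events
        if visible_at is not None and visible_at - occurred_at <= threshold
    )
    return fresh / len(events)


t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    (t0, t0 + timedelta(minutes=2)),
    (t0, t0 + timedelta(minutes=10)),
    (t0, t0 + timedelta(minutes=15)),
    (t0, t0 + timedelta(minutes=40)),  # too late to count as fresh
]
sli = freshness_sli(sample)
print(f"freshness SLI = {sli:.2%}, meets 97% SLO: {sli >= 0.97}")
```

Events that never arrive are counted as not fresh here; whether they belong to Freshness or to Completeness is a judgment call each team has to make.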
The chapter points out that these properties are interrelated and sometimes in tension. Figure 11-1 shows the relationships between them visually and makes clear how complex data service design is and why trade-offs are unavoidable.
Technically, the most interesting parts were how SLOs for each property are measured and how they influence system design. For Durability, the chapter cites the 99.999999999% (eleven nines) durability offered by cloud providers. If you store one million objects, that corresponds to losing one object roughly once every 100,000 years. On why such extreme reliability targets matter for data services, the authors write:
"Data-related properties have a different calculus of risk. Properties like durability, consistency, and integrity must be considered in a unique light, because once lost they are difficult to regain. Recovering from a true durability failure can be impossible, and the effects of these failures will persist forward indefinitely into your users' future."
This makes clear that the reliability of data services is fundamentally different in nature from that of other services. Given how hard it is to recover lost data and how long the consequences last, extremely high reliability targets for data services are justified.
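The eleven-nines figure is easy to check by hand. The snippet below assumes the durability number is an annual, per-object probability and that losses are independent, which is how I read the claim; the function name is my own.

```python
def years_per_lost_object(object_count: int, nines: int = 11) -> float:
    """Expected years between single-object losses at the given durability."""
    annual_loss_probability = 10.0 ** -nines      # 1 - 0.99999999999
    expected_losses_per_year = object_count * annual_loss_probability
    return 1 / expected_losses_per_year


print(f"{years_per_lost_object(1_000_000):,.0f} years per lost object")
```

With ten times as many objects the interval shrinks to a tenth, so the same durability target means something quite different at large scale.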
The chapter covers not only setting SLOs but also the system design needed to meet them. Figure 11-1 shows how each data property relates to system design concerns (time, access, redundancy, sampling, mutability, and distribution).
The authors stress the importance of data lineage when thinking about data service reliability. As data passes through multiple services, we need to understand how the reliability of each one affects the reliability of the whole.
"Data can flow through an application like a river, which is probably why there are so many water-related metaphors in the space (streams, pools, data lakes). As the process goes from one step to the next, we're moving downstream. Where in the process is our application's data? Who are the upstream producers/publishers? Do these sources have SLOs? Who are the downstream consumers/subscribers of this data? How will they use the data?"
This matters for data reliability in complex distributed systems. It reminded me that upstream SLOs directly affect downstream SLOs, and that reliability has to be designed for the system as a whole.
In the chapter's conclusion the authors write:
"Modern organizations are often obsessed with 'data quality.' They hire tons of engineers to think about it. But quality is ultimately subjective unless you can define and measure it, and it's inextricably intertwined with the systems that collect, store, process, and produce our data. We must reframe these conversations, and use SLOs to provide a supporting framework of quantitative measurement to help define the mechanisms by which we provide users with reliable data."
The passage stresses the need to turn data reliability from a subjective notion into something objectively measurable. By quantifying data reliability with SLOs, an organization can manage and improve data quality more effectively.
As an SRE, I can think of the following ways to put this chapter into practice:
- Set concrete SLOs for each property of a data service (Freshness, Completeness, Consistency, and so on) and build a mechanism to measure and evaluate them regularly.
- Map the data lineage clearly and analyze how upstream SLOs affect downstream SLOs.
- Understand the trade-offs among the data reliability properties and set SLOs that balance user needs against technical constraints.
- Pay particular attention to properties that are hard to recover, such as durability and integrity, and set very high reliability targets for them.
- Monitor SLO measurements continuously and feed them back into system design.
From a technical standpoint, tools and frameworks for implementing the per-property SLO measurements described in the chapter become important. Examples include a timestamp management system for measuring Freshness and automated validation tools for checking Completeness.
Managing data lineage, which the chapter emphasizes, is a particularly important technical challenge. Tracing the flow of data through a complex distributed system and managing SLOs at each stage will probably require sophisticated tracing and metadata management systems.
Reading this chapter made me keenly aware that data service reliability involves not just availability and performance but also the quality and integrity of the data itself. SREs need to be deeply involved in data quality management as well as operations, and to help build the mechanisms that deliver reliable data to users.
It also reminded me how hard it is to secure data reliability. Because lost data is difficult to recover, both preventive measures and recovery mechanisms for the worst case need to be designed and implemented carefully.
To sum up, the chapter is a comprehensive and practical guide to data service reliability. Its 13-property approach looks at data reliability from multiple angles and presents concrete ways to set and measure SLOs. By emphasizing data lineage, it also sheds light on the challenge of managing data reliability in complex distributed systems.
As an SRE, I am convinced that practicing these approaches makes it possible to design and operate more reliable data services. Concrete SLOs for each data property, together with system design that accounts for the trade-offs among them, should contribute greatly to data service quality.
At the same time, securing data reliability is an ongoing effort. It is important to keep revisiting and adapting the approach as technology evolves and new ways of using data emerge.
Finally, I want to repeat this sentence from the chapter: "We must reframe these conversations, and use SLOs to provide a supporting framework of quantitative measurement to help define the mechanisms by which we provide users with reliable data." Making data reliability objectively measurable and manageable, and delivering data services of real value to users, is an important part of our role as SREs.
The data reliability approach in this chapter plays a central role in SRE practice. I am convinced that adopting these concepts and methods in daily work contributes to more reliable data services and more effective organizations. Data reliability is also a moving target, so we have to keep learning and adapting as new technologies and methods appear.
As an SRE, it is important to spread these lessons across the organization and to foster a data-centered culture of reliability management. By managing data reliability quantitatively, an organization can improve data quality more effectively and deliver valuable services to its users.
Concrete challenges ahead include securing reliability in data services that use machine learning and AI, building a more comprehensive data reliability framework that also covers privacy and data ethics, and developing effective reliability management for increasingly complex data ecosystems. Working on these should further improve data service reliability and the value delivered to users.
Chapter 12. A Worked Example
Chapter 12, "A Worked Example," gives a concrete example of applying an SLO-based approach to a real service. Using the fictional Wiener Shirt-zel Clothing Company, it explains in detail how to apply SLOs to a complex, multi-tier service. Among publicly available material, I also recommend the talk on SLI design for IoT services and its practice at LUUP.
At the start of the chapter the authors write: "While the other chapters in this part of the book have given you lots of detailed insight into specific aspects of an SLO-based approach to reliability, and Part I outlined and defined all of the concepts you need to get started, what we really haven't talked about yet is how all this might actually work for a multicomponent service, or how it might apply to an entire company or organization." This makes clear that the chapter's purpose is to show how to move from theory to practice. Applying SLOs to a real service often means facing complications that theory alone does not resolve.
What struck me most was the explanation of how architecture changes as a service grows, and how that relates to SLOs. The authors describe how a service that began on a single programmer's laptop evolved into a complex distributed system. Figure 12-3 shows the architecture of The Wiener Shirt-zel Clothing Company after this growth, with the typical components of a modern web service: a web application, microservices, databases, caches, and a CDN (Content Delivery Network).
For this architecture, the authors focus on three types of users (external customers, internal services, and internal users) and explain how to set SLOs based on the needs of each. The approach demonstrates in practice the book's claim that an SLO should be tied directly to user experience rather than being a purely technical metric.
For example, for the front page of the customer-facing website, the authors give the following SLO:
"99.9% of responses to our website will return a 2xx, 3xx, or 4xx HTTP code within 2,000 ms."
This SLO neatly combines user experience (page load speed) with a technical metric (HTTP status codes). The authors explain that it allows roughly 43 minutes of downtime per month and show why that is a reasonable trade-off.
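The 43-minute figure follows from simple arithmetic. The sketch below assumes a 30-day month and evenly spread traffic, so that a share of bad responses can be read as a share of time; the function name is my own.

```python
def allowed_bad_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of total unavailability an SLO tolerates over the window."""
    window_minutes = window_days * 24 * 60
    return (1 - slo) * window_minutes


print(f"99.9% over 30 days allows {allowed_bad_minutes(0.999):.1f} minutes")
```

Because the SLO itself is defined per response, the real budget is a number of bad responses; the conversion to minutes is only a convenient way to talk about it.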
For SLOs that involve dependencies between internal services, the authors use the payment-processing microservice as an example and explain its relationship to the external payment vendor's SLA in detail. Table 12-1 shows the results of combining the vendor SLA with the internal service's SLO, making clear how the reliability of multiple services affects overall reliability. This analysis is a helpful reference for SREs setting SLOs for services with dependencies.
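Turning that relationship around gives a useful planning question: given a vendor SLA, how reliable does our own service have to be to reach a combined target? The sketch below assumes the two fail independently and that every request needs both; the numbers are hypothetical and are not the values in Table 12-1.

```python
def combined_reliability(vendor_sla: float, internal_slo: float) -> float:
    """Reliability seen by callers when both parts must succeed."""
    return vendor_sla * internal_slo


def required_internal_slo(target: float, vendor_sla: float) -> float:
    """Internal reliability needed to reach `target` on top of the vendor."""
    if target > vendor_sla:
        raise ValueError("target cannot exceed the vendor SLA")
    return target / vendor_sla


# Hypothetical figures for illustration only.
print(f"combined: {combined_reliability(0.9995, 0.999):.5%}")
print(f"needed internally for 99.9%: {required_internal_slo(0.999, 0.9995):.5%}")
```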
For services aimed at internal users, the examples are a desktop application and an internal wiki. The most striking part was how to set SLOs for something that is not a network service, such as a desktop application. The authors write:
"Remember, SLOs are about thinking about your users, and those users are not always millions of people on the internet. Sometimes they're three people in a marketing department."
This suggests that SLOs apply far more widely than one might expect, which is important for SRE practice.
Finally, the chapter explains how to set SLOs for a platform offered as a service, in this case a container platform. The authors propose a way of setting SLOs that accounts for the ephemeral nature of containers, which shows both how hard and how important SLO setting is in complex distributed systems.
From a technical standpoint, the example SLOs and the reasoning behind them are a useful reference. In particular, how to set SLOs that account for dependencies between services, and how to choose SLIs that directly reflect user experience, can be applied straight to real service operations.
The chapter also stresses that setting an SLO is not just picking a number but a complex process of balancing user needs, technical constraints, and business goals. The authors write:
"SLO-based approaches give you a way to find out whether users are happy or not, even if this example doesn't fit all of the traditional trappings of the general discussions about SLOs. Always remember that it's the philosophies behind these approaches that are the most important, not having the slickest technology to use to perform complicated math against statistically derived SLIs."
These words are a reminder that the essence of SLOs lies in improving user satisfaction, not in the technical metrics themselves.
Reading this chapter, I was reminded that setting SLOs is not merely about monitoring the technical side of a system but about managing the reliability of the whole service with user experience and business goals in mind. I also learned that SLOs apply more broadly than I had imagined. Knowing that they can be applied not only to network services but also to desktop applications, internal tools, and every other kind of service made the scope of SRE practice feel much wider.
To sum up, the chapter offers concrete methods for applying an SLO-based approach to real services. It is packed with practical knowledge: setting SLOs for complex multi-tier services, accounting for dependencies between services, and setting SLOs for different types of users.
As an SRE, I am convinced that practicing these approaches makes it possible to set and operate SLOs more effectively. Choosing SLIs centered on user journeys and setting SLOs from them should work well on many projects.
The need for SLOs to stay flexible and evolve, which the chapter emphasizes, also matters in practice. Architecture and customer needs change as a service grows, so SLOs have to be revisited and adapted continually.
Future challenges include end-to-end SLO management in more complex distributed systems, automatic SLO tuning that accounts for inter-service dependencies in microservice architectures, and more advanced SLO prediction models using AI and machine learning. Working on these should let SRE practice evolve further and create business value more effectively.
Finally, I want to repeat this passage from the chapter: "A lot of this book has been abstract, since SLO-based approaches are mostly philosophical. You might use a lot of math and numbers to help you gather data, but it's ultimately about using this data to engage humans to make decisions." Using SLOs not as mere numeric targets but as a strategic tool for user satisfaction and business success is an important part of our role as SREs.
The concrete SLO-setting approach in this chapter plays a central role in SRE practice. I am convinced that adopting these concepts and methods in daily work contributes to more reliable services and more effective organizations. Setting and operating SLOs is also a process that keeps evolving, so we have to keep learning and adapting as new technologies and methods appear.
As an SRE, it is important to spread these lessons across the organization and to foster a culture of SLO-centered reliability management. Continuously improving service reliability while prioritizing user experience and balancing technical metrics with business goals will, I am convinced, lead to higher user satisfaction and to business goals being met over the long term.
Part III. SLO Culture
Part III focuses on building and spreading an SLO culture. The chapter on the role of the SLO Advocate was especially memorable. The authors stress that successful SLO adoption takes more than technical implementation: it requires a change in organizational culture and a deep understanding of the organization.
The SLO Advocate is not just a technical expert but also an agent of organizational change. Through this role, SREs can take a more strategic position and contribute substantially to a culture of reliability across the organization.
The chapter on making SLOs understandable and discoverable was also useful. It explains in detail concrete ways to put SLOs to work across an organization: structuring SLO definition documents, managing documentation centrally, and designing effective dashboards.
The most important lesson I took from this part is that building an SLO culture is a continuous process that keeps evolving. The approach to SLOs has to adapt as technology advances and the organization changes. As SREs, it is important to lead this continuous improvement and to keep the whole organization aligned.
Chapter 13. Building an SLO Culture
Chapter 13, "Building an SLO Culture," presents a concrete methodology for embedding SLOs (Service Level Objectives) in an organization's culture. It goes beyond technical implementation and digs into the process by which an entire organization adopts and uses SLOs.
At the start of the chapter the authors write: "It's one thing to understand and live by these principles yourself, but it's another to spread these ideas throughout your organization and get others working alongside you." This captures the fact that adopting SLOs is not merely a technical task but a major challenge that involves changing organizational culture. However well an SLO is designed, its effect is limited if the organization as a whole does not understand and use it.
What struck me most is that the chapter treats building an SLO culture as a staged process. The authors present the following six steps:
- Get buy-in
- Prioritize SLO work
- Implement your SLOs
- Use your SLOs
- Iterate on your SLOs
- Advocate for others to use SLOs
This approach acknowledges how complex SLO adoption is while showing how to change organizational culture step by step. I was particularly struck by the emphasis on the first step, getting buy-in. The authors write: "Before anything can happen, people need to be in agreement about the value of SLOs. If your team doesn't value reliability, it's going to be hard for you to justify creating SLOs." Sharing the value of SLOs within the organization has to come before any technical implementation, and as an SRE I can relate to that.
On implementation, the authors present two approaches, "Do it yourself" and "Assign it." This says something important about the role of whoever is driving SLO adoption. Of the "Do it yourself" approach in particular, the authors write: "Having read this book, you will likely be the most knowledgeable on the subject and the most driven to make the move to an SLO culture. Leading by example and making the work your priority will signal to others that you're committed to making this change." I think this is the right mindset for an SRE promoting SLO adoption.
On using SLOs, the chapter covers alerting, error budget burn, and what to do with surplus error budget in detail. The following statement about burning error budget stood out: "If you find your applications are breaking SLOs and there's a lack of urgency to repair the situation, it might be a sign that you need to make some adjustments." The point is that an SLO is not just a numeric target but should reflect the organization's priorities, which is key to understanding what SLOs are for.
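To make the idea of burning error budget tangible, here is a small calculation of how much budget is left in a window. It assumes a simple event-based SLI (good events over total events); the function name and the request counts are hypothetical.

```python
def error_budget_remaining(slo: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget left; negative once the SLO is broken."""
    if not 0 < slo < 1:
        raise ValueError("slo must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1 - slo) * total_events
    return 1 - bad_events / allowed_bad


# Hypothetical month: one million requests against a 99.9% SLO.
print(f"{error_budget_remaining(0.999, 1_000_000, 250):.0%} of the budget left")
print(f"{error_budget_remaining(0.999, 1_000_000, 1_500):.0%} (overspent)")
```

A negative value is exactly the situation the authors describe: if nobody reacts to it, either the SLO or the team's priorities need adjusting.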
From a technical standpoint, the chapter presents concrete methods for implementing and operating SLOs. It offers practical advice on writing SLO documents, selecting SLIs and implementing monitoring, configuring alerts, and more. This is valuable information that can be used directly when introducing SLOs.
In the chapter's conclusion the authors write: "SLOs are a process, not a project. They won't stick overnight, but hopefully the content in this chapter has given you a better sense of how to circle back and iterate on these approaches until things begin to click." Building an SLO culture is an ongoing effort, and this reminded me how important a long-term view is for an SRE.
As an SRE, I can think of the following ways to put this chapter into practice:
- Hold regular workshops and study sessions to share the value of SLOs within the organization.
- Start introducing SLOs with a small project and create a success story.
- Document the process of implementing and operating SLOs and share it within the organization.
- Establish a cycle of regular SLO review and improvement.
- Advocate SLO adoption to other teams and departments, aiming for an organization-wide SLO culture.
From a technical standpoint, tools and frameworks that support SLO implementation and operation become important. Examples include a management system for SLO documents, a platform for collecting and analyzing SLI data, and dashboards for calculating and visualizing error budgets. Having these in place should help an SLO culture take root more effectively.
Reading this chapter, I was reminded that introducing SLOs is not just about setting technical metrics but about changing how the whole organization thinks about reliability. I felt strongly that SREs, as drivers of this cultural change, need not only technical knowledge but also communication and change-management skills.
To sum up, the chapter is a comprehensive and practical guide to building an SLO culture. It tackles head-on not only the technical side of SLOs but also the larger challenge of changing organizational culture, and it is full of insight that is valuable to SREs.
As an SRE, I am convinced that practicing these approaches makes it possible to build an SLO culture more effectively. The staged approach and the emphasis on continuous improvement in particular are important guides for making large-scale organizational change succeed.
At the same time, building an SLO culture is a long-term effort. It is important to keep revisiting and adapting the approach as technology advances and the organization changes.
Finally, I want to repeat this passage from the chapter: "This chapter should also remind you that at the end of the day, SLOs are about people. Creating a culture of SLOs is about making your users and your team happier." Building an SLO culture across the organization while balancing technical metrics with the human side is an important part of our role as SREs.
The approach to building an SLO culture in this chapter plays a central role in SRE practice. I am convinced that adopting these concepts and methods in daily work contributes to more reliable services and more effective organizations. Building an SLO culture is also a process that keeps evolving, so we have to keep learning and adapting as new technologies and methods appear.
As an SRE, it is important to spread these lessons across the organization and to foster a culture of SLO-centered reliability management. Continuously improving service reliability while prioritizing user experience and balancing technical metrics with business goals will, I am convinced, lead to higher user satisfaction and to business goals being met over the long term.
Chapter 14. SLO Evolution
Chapter 14, "SLO Evolution," digs into why SLOs (Service Level Objectives) must evolve and adapt. It stresses that SLOs are not static and have to keep changing along with the service.
At the start of the chapter the authors write: "Service level objectives work best when you're willing to let them change and grow as your service does." This captures the idea that the essence of SLOs lies in flexibility and adaptability. Adjusting SLOs as the service grows and changes is what keeps the reliability targets appropriate.
What struck me most is the detailed explanation of the factors that drive SLO evolution. The authors list the following:
- Usage Changes
- Dependency Changes
- Failure-Induced Changes
- User Expectation and Requirement Changes
- Tooling Changes
- Intuition-Based Changes
These factors show that evolving an SLO is not a simple numeric adjustment but a comprehensive process that takes the overall state of the service into account. The part on changing user expectations and requirements was especially memorable. The authors write: "The users that depend on your service may experience changes in their expectations over time." SLOs are closely tied to user experience rather than being purely technical metrics, and as an SRE I can relate to that.
From a technical standpoint, the chapter explains in detail how changes to measurement and calculation affect SLOs. It gives concrete examples of the impact of changing the metrics system, data resolution, or retention period. For example, the authors write: "If you're using Prometheus to scrape your metrics endpoint for new data every 30 seconds, you'll have to revisit how you're calculating things if you change this to every 10 seconds or every 3 seconds." SLO calculations are not just formulas; they depend heavily on how and how often data is collected, which is an important consideration when designing and operating SLOs.
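The effect of the scrape interval is easy to underestimate, so I wrote down the arithmetic. This is not Prometheus code or its API, just counting: the helper names and the "four bad samples" rule are hypothetical examples of a calculation that silently depends on the interval.

```python
def samples_in_window(window_s: int, scrape_interval_s: int) -> int:
    """How many samples one time series yields over a window."""
    return window_s // scrape_interval_s


def seconds_covered(bad_samples: int, scrape_interval_s: int) -> int:
    """Wall-clock time that a run of consecutive bad samples represents."""
    return bad_samples * scrape_interval_s


for interval in (30, 10, 3):
    print(
        f"{interval:>2}s interval: {samples_in_window(3600, interval):>4} samples/hour, "
        f"4 bad samples = {seconds_covered(4, interval)}s of trouble"
    )
```

A rule written as "four bad samples in a row" means two minutes of trouble at a 30-second interval but only 12 seconds at a 3-second interval, so thresholds expressed in samples have to be revisited whenever the interval changes.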
The chapter also explains the process of changing SLOs in detail. I was particularly struck by its emphasis on regular review. The authors write: "In addition to all of what we've covered so far, you also need to have scheduled revisits of your SLOs." SLOs are not static and need continuous verification and improvement, which I think is an important perspective in SRE practice.
The chapter also discusses identifying incorrect SLOs. The authors stress listening to users continuously and watching how SLOs behave during incidents. These points reminded me that an SLO is not just a numeric target but should reflect user experience and the actual behavior of the system.
In the chapter's conclusion the authors write: "Service level objectives are exactly what they sound like: they're objectives, not agreements. They should be malleable, and they should change over time." This reaffirms that the essence of SLOs lies in flexibility and adaptability. As an SRE, I find this important: rather than treating SLOs as fixed, we should treat them as always open to change and adjust them as the service evolves.
As an SRE, I can think of the following ways to put this chapter into practice:
- Establish a regular SLO review process and enforce it across the organization.
- Continuously monitor changes to the service (usage, dependencies, user expectations, and so on) and assess their impact on SLOs.
- Analyze SLO behavior in detail during incidents and revise SLO definitions and calculations when needed.
- Build a mechanism to collect user feedback continuously and reflect it in SLOs.
- Document the SLO change process and share it within the organization.
From a technical standpoint, tools and frameworks that support SLO evolution become important. For example:
- A system that tracks SLO history and records the reasons for and effects of each change
- A tool that analyzes the correlation between user feedback and SLOs
- A tool that simulates how dependency changes affect SLOs
- A system that predicts how an SLO change will affect other services
With tools like these in place, the SLO evolution process can be managed more effectively and can feed into better service reliability.
Reading this chapter made me keenly aware that managing SLOs is not just setting and monitoring numeric targets but continuously adapting SLOs as the service evolves. SREs are responsible for leading that evolution and for maintaining appropriate SLOs while weighing business requirements and user expectations alongside the technical side.
To sum up, the chapter is a comprehensive and practical guide to SLO evolution. It stresses that SLOs are not static and must keep evolving with the service, and it explains that process in detail. For SREs this knowledge is valuable and leads to more effective reliability management.
As an SRE, I am convinced that practicing these approaches makes SLO management more flexible and effective. Establishing a regular review process and continuously collecting and reflecting user feedback are useful practices that can be applied right away on many projects.
At the same time, SLO evolution is a continuous process. It is important to keep revisiting and adapting the approach as technology and the market change.
Finally, I want to repeat this passage from the chapter: "Services evolve over time, which means your SLOs should, too. Use the data they provide you to have better conversations and make better decisions." Using SLOs not as static targets but as tools that drive service evolution and support better decisions is an important part of our role as SREs.
The approach to SLO evolution in this chapter plays a central role in SRE practice. I am convinced that adopting these concepts and methods in daily work contributes to more reliable services and more effective organizations, and that we have to keep learning and adapting as new technologies and methods appear.
Chapter 15. Discoverable and Understandable SLOs
Chapter 15, "Discoverable and Understandable SLOs," focuses on making SLOs (Service Level Objectives) easy to understand and easy to find. It stresses that for SLOs to be used effectively across an organization, they must be readily understood and quickly found when needed.
At the start of the chapter the authors write: "An SLO-based approach to reliability works best when everyone is on the same page." This captures the fact that the success of SLOs depends on shared understanding across the organization. I think this matters, because reliability management becomes more effective when SLOs are understood and used by business stakeholders as well as technical teams.
What struck me most was the detailed explanation of the SLO definition document and its components. The authors list the following elements to include:
- Ownership
- Approvers
- Definition status
- Service overview
- SLO definitions and status
- Rationale
- Review schedule
- Error budget policy
- External links
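With these elements included, an SLO definition document becomes more than a record of technical metrics: it becomes a comprehensive source of information about the reliability of the service. Making ownership and approvers explicit is especially important, because it clarifies who is responsible for managing the SLO and promotes agreement across the organization.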
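If the definition document were kept as structured data, its elements could map onto fields like the ones below. This is only a sketch of one possible schema: the field names, the status values, and the example service are hypothetical, and the book does not prescribe a format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SLODefinition:
    service: str
    owner: str                 # ownership
    approvers: list[str]       # who signed off on the definition
    status: str                # e.g. "draft", "approved" (assumed values)
    service_overview: str
    slo: str                   # SLO definition in plain language
    rationale: str
    next_review: date          # review schedule
    error_budget_policy: str
    links: list[str] = field(default_factory=list)  # external links

    def review_overdue(self, today: date) -> bool:
        """True once the scheduled review date has passed."""
        return today > self.next_review


example = SLODefinition(
    service="checkout-api",
    owner="team-payments",
    approvers=["product-owner", "sre-lead"],
    status="approved",
    service_overview="Handles order submission for the web shop.",
    slo="99.9% of requests succeed within 2,000 ms over 30 days.",
    rationale="Matches the latency users tolerate at checkout.",
    next_review=date(2025, 1, 15),
    error_budget_policy="Freeze feature releases when the budget is exhausted.",
)
print(example.review_overdue(date(2025, 2, 1)))
```

Keeping the review date as real data, rather than prose, is what makes it possible for a script or dashboard to flag definitions that have gone stale.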
ã¾ããæ¬ç« ã§ã¯ SLO ã®çºè¦å¯è½æ§ãé«ããããã®æ¹æ³ã«ã¤ãã¦ã詳ãã解説ããã¦ãã¾ããèè ã¯ãä¸å¤®éä¸åã®ããã¥ã¡ã³ããªãã¸ããªã®éè¦æ§ã強調ããWikiã·ã¹ãã ãããã¥ã¡ã³ãã®ã³ã¼ãåï¼Documentation-as-codeï¼ãªã©ã®å ·ä½çãªæ¹æ³ãææ¡ãã¦ãã¾ããç¹å°è±¡çã ã£ãã®ã¯ãããã¥ã¡ã³ãã®èªåã¹ãã£ã³ã¨éç´ãè¡ãã«ã¹ã¿ã ãã¼ã«ã®éçºã«é¢ããææ¡ã§ããããã¯ãSLOå®ç¾©ã®ææ°æ§ãä¿ã¡ãçµç¹å ¨ä½ã§ã®å¯è¦æ§ãé«ããä¸ã§å¹æçãªã¢ããã¼ãã ã¨æãã¾ããã
æè¡çãªè¦³ç¹ããã¯ãæ¬ç« ã®ããã·ã¥ãã¼ãã«é¢ãã解説ãæç¨ã§ãããèè ã¯ãå¹æçãªSLOããã·ã¥ãã¼ãã«å«ããã¹ãè¦ç´ ã¨ãã¦ä»¥ä¸ãæãã¦ãã¾ãï¼
- Current status
- Graph of SLI violations
- Burndown graph
- Error budget status
- Link to the SLO definition document
In the chapter's conclusion, the author writes: "Reliability requires people to know what's going on, and SLOs provide a clear, customer-centric picture that speaks a thousand words." This underlines that SLOs are, at heart, not just technical metrics but a common language for reliability shared across the whole organization. As an SRE, I find this perspective important: positioning SLOs as a tool for the entire organization, rather than something that belongs only to technical teams, is what makes reliability management more effective.
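As an SRE, I can think of the following ways to put what I learned from this chapter into practice:
- Create and roll out an SLO definition document template that is standardized across the organization.
- Build a repository that manages SLO definition documents centrally and raises their visibility within the organization.
- Develop a tool that automatically scans and aggregates SLO definition documents, to keep definitions current and consistent.
- Design a dashboard that visualizes SLO status and share it across the organization.
- Raise awareness and understanding of SLOs by distributing SLO reports regularly and sharing them in meetings.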
From a technical standpoint, we need to work out concrete ways to implement the document management and dashboard approaches the chapter proposes. For example, we would likely need to tackle technical tasks such as:
- Building a toolchain for documentation-as-code
- Developing scripts or applications that automatically scan and aggregate SLO definition documents
- Designing and implementing dashboards that visualize SLO status in real time
- Developing APIs that connect SLO definition documents with the monitoring system
Through this technical work, we can make SLOs easier to understand and to find, and encourage their effective use across the organization.
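As a rough illustration of the "scan and aggregate" idea, the sketch below walks a directory tree, reads every SLO definition file, and flags the ones whose review date has passed. The `*.slo.json` naming convention and the `service`, `owner`, and `next_review` keys are assumptions I made up for this example; a real setup would follow whatever format the organization's documents use.

```python
import json
from datetime import date
from pathlib import Path


def collect_slo_documents(root: Path, today: date) -> list[dict]:
    """Scan `root` recursively for *.slo.json files and summarize each one.

    Files that cannot be parsed are reported with an error message instead
    of being skipped, so broken definitions stay visible.
    """
    summaries = []
    for path in sorted(root.rglob("*.slo.json")):
        rel = str(path.relative_to(root))
        try:
            doc = json.loads(path.read_text(encoding="utf-8"))
            next_review = date.fromisoformat(doc["next_review"])
            summaries.append({
                "path": rel,
                "service": doc["service"],
                "owner": doc["owner"],
                "review_overdue": next_review < today,
                "error": None,
            })
        except (ValueError, KeyError, TypeError) as exc:
            # ValueError also covers invalid JSON and malformed dates.
            summaries.append({
                "path": rel,
                "service": None,
                "owner": None,
                "review_overdue": None,
                "error": f"{type(exc).__name__}: {exc}",
            })
    return summaries
```

Calling `collect_slo_documents(Path("docs"), date.today())` on a schedule and publishing the result to a wiki page or dashboard would be one way to keep a central index without asking teams to update it by hand.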
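Reading this chapter reminded me that managing SLOs is not just about setting and monitoring technical metrics; it is about fostering shared understanding across the organization and supporting decisions about reliability. SREs play an important role in building and maintaining the infrastructure and processes that make SLOs easier to understand and to find.
The chapter offers a comprehensive and practical guide to making SLOs understandable and discoverable. Specifically:
- Structuring SLO definition documents
- Centralized document management
- Designing effective dashboards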
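These methods are explained in detail as concrete approaches to using SLOs across the organization.
As SREs, putting these approaches into practice lets us embed SLOs more deeply in the organization and use them effectively. In particular, standardizing SLO definition documents, managing them centrally, and providing real-time dashboards are useful measures that many organizations can apply right away.
At the same time, it is important to recognize that improving the understandability and discoverability of SLOs is a continuous process. As the organization grows and technology evolves, we need to keep revisiting and improving our approach.
The chapter's line "SLOs provide a clear, customer-centric picture that speaks a thousand words" captures the essence of SLOs. Positioning SLOs as a common language for reliability shared across the organization, and continuously improving them, is an important part of the SRE role.
By actively bringing these concepts and techniques into daily work, we can contribute to more effective reliability management and to a culture of reliability across the organization. Keeping the focus on user experience, balancing technical metrics against business goals, and steadily improving service reliability should, in the long run, lead to higher user satisfaction and to meeting business goals.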
Chapter 16. SLO Advocacy
Chapter 16, "SLO Advocacy," explains in detail how to promote the adoption and spread of SLOs (Service Level Objectives) across an organization. The chapter stresses that successful SLO adoption takes more than a technical implementation: it requires a change in organizational culture and a deep understanding of the approach.
The author defines the role of the SLO Advocate as helping the organization implement SLOs successfully, and points out that the role calls not only for deep technical knowledge but also for leadership skills and the ability to communicate with the whole organization. I was particularly struck by the passage saying that your interpersonal and leadership skills will be extremely important on this journey: you need to convince others of your vision, teach them what they need to know, inspire them with positive energy, and drive SLO adoption to success. It neatly conveys that introducing SLOs is not merely a technical problem but a major challenge that requires changing the culture and mindset of the entire organization.
The chapter divides the SLO adoption process into three phases: Crawl, Walk, and Run. This phased approach is an effective strategy for making large-scale organizational change succeed.
The Crawl phase focuses on laying the groundwork as an SLO Advocate. It includes educating yourself, creating supporting artifacts, connecting with leaders and teams in the organization, and running your first training sessions. Especially important is preparing your SLO "sales pitch." The author asks what you would say if you met your company's CEO in an elevator and had only a few seconds of their attention, and stresses the importance of being able to explain the value of SLOs concisely to different audiences.
From a technical standpoint, the chapter emphasizes the importance of documentation in the Crawl phase. The author recommends creating a range of documents: a one-page strategy document, a high-level definition of SLOs, an FAQ, a step-by-step guide to defining SLOs, an instrumentation guide for collecting SLIs, use cases, and so on. These documents play an important role in deepening understanding of SLOs across the organization and in encouraging implementation.
In the Walk phase, you widen the scope of SLO adoption and bring in more teams. Working with early adopters, sharing success stories, expanding the training program, and improving how you communicate all become important here. The author says: "Services evolve over time, which means your SLOs should, too. Use the data they provide you to have better conversations and make better decisions." The point is that SLOs are not static and should be adjusted continuously as the service evolves.
Technically, building a case study library during the Walk phase is important. The author notes that having SLO implementation examples for a variety of service types lets you support many teams. Concrete examples for different kinds of services (request/response, pipelines, continuous computation, and so on) give other teams something to refer to when they start adopting SLOs themselves.
The Run phase assumes that SLO implementation has spread across the organization and that every team has reached some level of SLO maturity. The main activities in this phase include sharing the case study library, creating a community of SLO experts, improving the platform, and improving the advocacy process. The author stresses that defining and implementing SLOs is only the first step toward better reliability; the real game changer is actually using SLOs as part of your engineering practice to drive service quality and operational excellence.
From a technical standpoint, the chapter emphasizes the importance of continuous improvement in the Run phase. The author proposes a range of improvement activities, including platform-level improvements, regular SLO reviews, service quality reviews, and feedback on the SLO implementation process. These activities are essential for maximizing the effect of SLOs and strengthening the reliability culture of the whole organization.
The author closes the chapter with the observation that progress is impossible without change, and that those who cannot change their minds cannot change anything. It is a reminder that the SLO Advocate's role is not simply to implement SLOs but to transform the culture of the entire organization.
The most important lesson I took from this chapter as an SRE is that successful SLO adoption takes more than a technical implementation. Cultural change, effective communication, and continuous learning and improvement are all essential. It also made me keenly aware that an SLO Advocate is both a technical expert and a leader of change.
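To put this chapter into practice, I can think of the following approaches:
- Hold regular workshops and study sessions to share the value of SLOs within the organization.
- Create comprehensive documentation for SLO adoption and share it across the organization.
- Work with early adopters to produce SLO implementation examples (case studies) for a variety of service types, and collect them into a library.
- Establish an SLO training program and train additional trainers to scale it up.
- Build a community of SLO experts and put a support structure in place for SLO adoption across the organization.
- Run regular SLO reviews and service quality reviews to drive continuous improvement.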
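From a technical standpoint, developing tools and frameworks that support SLO implementation also becomes important. Examples include:
- Tools that automate SLO definition and SLI collection
- Dashboards that visualize SLO status in real time
- Systems that make SLO-based alerting easy to configure
- Machine learning models that analyze SLO data and generate improvement suggestions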
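Having these tools in place should streamline the SLO adoption process and speed up adoption across the organization.
This chapter provides comprehensive and practical guidance on the role and responsibilities of the SLO Advocate. It makes clear that succeeding at SLO adoption and management requires not only technical knowledge but also the ability to involve the whole organization and change its culture.
As an SRE, I'm convinced that practicing the approaches from this chapter enables more effective SLO adoption and leads to better performance for the organization as a whole. The points I consider most important are: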
- A phased approach (Crawl, Walk, Run)
- Continuous improvement and adaptation
- Positioning SLOs as a common language for reliability shared across the organization
The line from Chapter 15, that SLOs provide a clear, customer-centric picture that speaks a thousand words, captures the essence of SLOs here as well.
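What makes the SLO Advocate role rewarding is that it combines being a technical expert with being an agent of change in the organization. Through this role, SREs can take a more strategic position and contribute substantially to building a reliability culture across the organization. The role demands not only technical skill but also stronger communication and leadership skills.
Finally, it is important to cultivate a culture of reliability management centered on SLOs. Keeping the focus on user experience, balancing technical metrics against business goals, and steadily improving service reliability leads, in the long run, to higher user satisfaction and to meeting business goals. Actively bringing these concepts and techniques into daily work, and continuing to learn and adapt, is an important part of our role as SREs.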
Chapter 17. Reliability Reporting
Chapter 17, "Reliability Reporting," digs into why and how to report on reliability using SLOs (Service Level Objectives). The chapter points out the problems with traditional reliability reporting and explains in detail how an SLO-based approach addresses them and enables more effective system operations.
The author opens the chapter by saying that SLOs are fundamentally a means of providing data for having better discussions and therefore (hopefully!) making better decisions. This neatly expresses that SLOs are, at heart, not just technical metrics but a tool for improving the decision-making process. I found this point important. Many organizations get so caught up in technical metrics that they lose sight of how those metrics relate to user experience and business goals. SLOs can be a powerful tool for bridging the gap between technology and business.
What impressed me most in this chapter was the detailed analysis of the problems with traditional reliability reporting. The author points out that traditional approaches such as counting incidents, assigning severity levels, and Mean Time to X (MTTX) do not accurately reflect what users actually experience. On MTTX, for example, the author notes that complex systems generally fail in unique ways, with different causes and contributing factors each time. The observation makes clear that uniform incident categories and averages cannot capture the complexity of real systems.
From a technical standpoint, I found the explanation of these limits using distributed denial-of-service (DDoS) attacks especially interesting. The author stresses that even attacks of the same type differ in the attacker's motive, the techniques used, the endpoints targeted, the traffic patterns, and more, so each incident is essentially unique. The example clearly shows the limits of simply counting incidents or sorting them into severity levels.
The chapter also explains in detail how an SLO-based approach solves these problems. The author shows that the error budget concept lets you accurately capture the loss of reliability that users actually experienced. For example, compare a quarter with twenty 20-minute incidents against a quarter with a single 3-hour incident. Under an MTTX approach the latter looks more serious, but the error budget makes it clear that the former actually had the larger impact on users: 400 minutes of unreliability in total versus 180.
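The arithmetic behind that comparison is easy to check. The sketch below is my own illustration, not code from the book, and it rests on loud assumptions: every incident is treated as a full outage for its whole duration, the SLI is time-based, and the 99.5% target over a 90-day quarter is an arbitrary example value.

```python
def quarter_summary(incident_minutes: list[float], slo_target: float,
                    window_days: int = 90) -> dict:
    """Summarize a quarter both the MTTX way and the error-budget way.

    Assumes each incident is a total outage for its full duration.
    """
    window_minutes = window_days * 24 * 60
    budget_minutes = window_minutes * (1 - slo_target)
    total_bad = sum(incident_minutes)
    count = len(incident_minutes)
    return {
        "incidents": count,
        # The "mean time" style of number that MTTX reporting focuses on.
        "mean_minutes_per_incident": total_bad / count if count else 0.0,
        # What users actually lived through.
        "total_bad_minutes": total_bad,
        "budget_consumed_fraction": total_bad / budget_minutes,
    }


many_short = quarter_summary([20] * 20, slo_target=0.995)  # twenty 20-minute incidents
one_long = quarter_summary([180], slo_target=0.995)        # one 3-hour incident

for label, summary in (("many short", many_short), ("one long", one_long)):
    print(label, summary)
```

With these example numbers the budget is 648 minutes, so the quarter with many short incidents burns roughly 62% of it while the single long incident burns roughly 28%, even though its mean duration per incident (180 minutes versus 20) looks far worse.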
Figure 17-1 shows an example dashboard that displays reliability "burndown." This kind of visualization is an effective way to grasp a service's current state and trend at a glance. The author notes that humans are good at spotting patterns in visual data, and stresses that a well-designed dashboard can help you discover problems before the alerting system detects them.
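The data behind a burndown graph is simple to derive. As a minimal sketch, assuming you already have the number of "bad" minutes per day and a budget expressed in minutes (both hypothetical inputs, not something the book prescribes), the remaining budget per day is just a running subtraction:

```python
def burndown(daily_bad_minutes: list[float], budget_minutes: float) -> list[float]:
    """Remaining error budget (in minutes) at the end of each day.

    Values may go negative once the budget is exhausted; plotting them
    as-is makes an overspent budget visible.
    """
    remaining = budget_minutes
    series = []
    for bad in daily_bad_minutes:
        remaining -= bad
        series.append(remaining)
    return series


print(burndown([0, 20, 0, 45, 5], budget_minutes=100.0))
# prints [100.0, 80.0, 80.0, 35.0, 30.0]
```

Feeding this series into whatever charting tool the team already uses gives the downward-sloping line that a burndown panel shows.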
In the chapter's conclusion, the author says that SLOs are a way to measure things from your users' perspective while also making your coworkers happier, and that being able to report your status properly is perhaps the most important part of having those discussions well. It is a reminder that SLOs are not just technical metrics but a tool for improving communication and decision-making across the organization.
The most important lesson I took from this chapter as an SRE is that reliability reporting should not be a mere report of numbers; it should provide meaningful information tied directly to user experience and business goals. Using SLOs and error budgets makes more constructive conversations about service reliability possible among technical teams, management, and customers.
To put this chapter into practice, I can think of the following approaches:
- Review existing reliability reporting and plan a migration to an SLO-based approach.
- Set SLIs and SLOs that accurately reflect user experience, and define error budgets based on them.
- Build dashboards that visualize SLO status and error budget consumption in real time.
- Review SLO and error budget status regularly and use it to prioritize service improvements.
- Produce reliability reports in a form suited to each audience (technical teams, management, customers) and share them regularly.
From a technical standpoint, building systems that automate the calculation and visualization of SLOs and error budgets becomes important. Examples include:
- A data pipeline that collects SLIs in real time and calculates SLO attainment
- A mechanism that monitors error budget consumption and raises alerts
- A tool that analyzes trends in past SLO attainment and error budget consumption
- A system that automatically generates reliability reports customized for each kind of stakeholder
Having these tools in place should make reliability reporting more efficient and effective, and lead to continuous improvement of the service.
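For the budget-consumption alerting item above, one commonly used idea is the burn rate: the observed error rate divided by the error rate the SLO allows. A burn rate of 1 means the budget would last exactly until the end of the window. The sketch below is my own illustration rather than the book's method; the function names and the alert threshold of 2.0 are hypothetical.

```python
def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 0.0  # no traffic, nothing burned
    error_rate = bad_events / total_events
    allowed_error_rate = 1 - slo_target
    return error_rate / allowed_error_rate


def should_alert(bad_events: int, total_events: int, slo_target: float,
                 threshold: float = 2.0) -> bool:
    """True when the budget is burning at or above `threshold` times the sustainable pace."""
    return burn_rate(bad_events, total_events, slo_target) >= threshold


# 50 failed requests out of 10,000 against a 99.9% target: about 5x too fast.
print(round(burn_rate(50, 10_000, 0.999), 2))
print(should_alert(50, 10_000, 0.999))
```

In practice the counts would come from the monitoring system for a specific lookback window, and the threshold would be tuned per service.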
This chapter provides a comprehensive and practical guide to SLO-based reliability reporting. It makes the limits of traditional reporting clear and explains concretely how an approach built on SLOs and error budgets addresses them. As SREs, practicing this approach, with SLOs set around user experience and error budgets managed accordingly, makes more effective reliability management and reporting possible.
Adopting it is a major challenge, though, because it involves changing organizational culture, and it takes time before every stakeholder understands its value and can make use of it. The chapter's message that you do not have to be perfect offers an important perspective for SREs: with that in mind, our role is to keep improving both service reliability and satisfaction across the organization through SLOs.
SLO-based reliability reporting is not just a report of technical metrics but a powerful tool for improving communication and decision-making across the organization. Implemented and used well, it becomes a common language among technical teams, management, and customers, and leads to more reliable systems and more value for users. I'm convinced that by actively bringing this approach into daily work, keeping the focus on user experience while balancing technical metrics against business goals, and continuing to learn and adapt, we can contribute to higher user satisfaction and to meeting business goals in the long run.
Conclusion
Reading "Implementing Service Level Objectives" gave me a deep appreciation that SLOs are not just technical metrics but a powerful tool for shaping a reliability culture across an entire organization. It's like planting the seed of SLOs in the garden that is your organization and growing the beautiful flower of reliability. The book pays close attention not only to the technical side of SLOs but also to organizational culture and the human side, and it reminded me how important our role as SREs is. We are not mere gatekeepers; we are the head gardeners!
What stayed with me most was the author's message that you don't have to be perfect. SLOs are a tool for balancing user satisfaction against the organization's resources, and they keep evolving. It's less like dieting toward a perfect weight and more like building healthy habits. With that perspective, I believe our important role as SREs is to keep improving service reliability while balancing technical metrics and business goals.
Putting what I learned from this book into practice takes not only technical skill but also better communication and leadership skills. It almost feels like being asked to evolve from super engineer to superhero. That's because embedding SLOs across the organization and using them effectively requires cooperation with many kinds of stakeholders: technical teams, product teams, management, and others. At times, it may even take negotiating with beings from another world.
The book left me with a strong sense that the SRE role is becoming more strategic. Through introducing and operating SLOs, it became clear that SREs are in a position not only to solve technical problems but also to influence the direction of the whole organization. It feels like stepping out from backstage into the lead role. Adapting to this change and sharpening both technical skills and business sense will, I think, be essential for SREs going forward.
Finally, I'd like to express my sincere respect and gratitude to the authors of this book and to everyone who has worked to advance SLOs. Without your dedicated efforts, SLOs would not be thriving as they are today. I pledge to keep walking the path you opened up.
I'd also like to thank the readers of this blog. I believe that each small insight and lesson we accumulate leads not only to our own growth but, in turn, to the growth of the industry as a whole. I hope we can keep learning about SLOs, putting them into practice, and deepening the discussion together.
Thank you so much, everyone, for reading to the end. I'm grateful that you stuck with me without giving up partway through.
If you become a subscriber, I'd be even more grateful. And if you go as far as following me on X, I might just cry.