Cloud-Native Monitoring with Mackerel / Mackerel Day#2
FUJIWARA Shunichiro
December 23, 2019
Transcript
Cloud-Native Monitoring with Mackerel
2019.12.23 Mackerel Day #2
@fujiwara

@fujiwara
Mackerel Ambassador
github.com/kayac/ecspresso: Amazon ECS deployment tool
github.com/fujiwara/lambroll: AWS Lambda deployment tool

Game & Community
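Cloud native?
CNCF Cloud Native Definition v1.0:
"Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach."
https://github.com/cncf/toc/blob/master/DEFINITION.md
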
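Containers, service meshes, microservices…
Using those technologies ≠ being cloud native.
"Design for Failure", the mechanisms and components that make it possible, and the ability to use them well: that is what cloud native means.
"Design so that the system recovers automatically when a failure occurs. Design the architecture so that failures have no user impact in the first place." [2]
[2] https://speakerdeck.com/toricls/design-for-failure-is-the-true-cloud-native

"Design for Failure"
Design the system on the assumption that failures will happen.
- Instances die.
- Managed services die (though they usually fail over).
- Whole data centers die (rarely, but it happens).
Decide how far to anticipate failures and where to give up.
Monitoring is designed with those assumptions built in → cloud-native monitoring.
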
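Monitoring targets grow and shrink dynamically
Containers are reborn on every deployment.
Even without containers:
- EC2 auto scaling with Spot instances
- Instances go down fairly often and others come up in their place
Managed services can now scale with load as well (Aurora Auto Scaling, for example).
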
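The first step toward cloud native with Mackerel
"Stop using connectivity detection"

Stop using connectivity detection
This alert can only be raised as Critical.
- Nobody should have to jump out of bed because a development environment went down at midnight.
- Build production so that no single host going down affects the service.

Designing on the premise that hosts die = cloud native
Critical alerts are reserved for things that affect service continuity.
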
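How cloud native is Mackerel?
- Pro: it follows hosts automatically as they are added and removed (mackerel-container-agent, AWS and Azure integrations).
- Con: once a host is retired, its host metrics are no longer visible. Only some metrics, such as CPU, remain. Custom metrics, which we want to keep long term for totals and averages, disappear.
- Con: billing is per host. Even at the micro host price (a monthly fee in yen), monitoring is a little expensive for targets such as SQS and Lambda that are cheap and expose few metrics… A plan billed by total metric count would be welcome.
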
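While hoping that managed services keep evolving,
to move cloud-native monitoring forward:
"sukima-kagu OSS" (gap-filling furniture OSS)
"Software developed to fill the gaps and achieve better operations when the features of managed services, or the integration between services, fall short for your own operations. The term refers in particular to OSS." [3]
[3] https://speakerdeck.com/fujiwara3/xi-jian-jia-ju-oss-false-susume
https://speakerdeck.com/fujiwara3/aws-dev-day-tokyo-2019

The gap-filling OSS introduced today
- maprobe
- mackerel-plugin-prometheus-query
https://github.com/fujiwara/mackerel-plugin-prometheus-query
https://github.com/fujiwara/maprobe
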
[Problem] We also want mackerel-plugin metrics for hosts registered through the AWS integration

Example: running mackerel-plugin-mysql against Amazon RDS (MySQL)

[Solution] Run the plugin from mackerel-agent on some host?
    [plugin.metrics.rds01]
    command = "mackerel-plugin-mysql -host='rds01.***.ap-northeast-1.rds.amazonaws.com' (omitted)"
    custom_identifier = "rds01.***.ap-northeast-1.rds.amazonaws.com"
    [plugin.metrics.rds02]
    command = "mackerel-plugin-mysql -host='rds02.***.ap-northeast-1.rds.amazonaws.com' (omitted)"
    custom_identifier = "rds02.***.ap-northeast-1.rds.amazonaws.com"
- If this host goes down, metric collection stops.
- Monitoring hosts that are added later requires a tedious configuration change.
- These days there may be no host on which to run mackerel-agent at all…
We want monitoring to follow the growth and shrinkage of targets automatically!

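maprobe
An agent that, for hosts registered in Mackerel:
- performs external monitoring (ping / tcp / http)
- runs mackerel-plugin commands and posts the results as host metrics
- aggregates already-registered host metrics and posts the results as service metrics
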
[Solution] Run the plugin from maprobe
    probes:
      - service: production
        role: RDS
        command:
          command:
            - 'mackerel-plugin-mysql'
            - '-host={{.Host.CustomIdentifier}}'
            - '-username=root'
            - '-password={{env "RDS_PASSWORD"}}'
Runs mackerel-plugin-mysql against every host in service "production", role "RDS".
Each result is sent to Mackerel as a host metric of the corresponding host.

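maprobe follows target hosts automatically
It periodically calls the Mackerel API to look up hosts.
- Host additions and removals are picked up automatically.
A Docker image is available:
    docker pull fujiwara/maprobe
A configuration file placed on S3 is reloaded automatically.
- Updating the file on S3 applies the configuration without rebuilding or redeploying the container.
    maprobe agent --config s3://example.com/config.yaml
https://hub.docker.com/r/fujiwara/maprobe
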
[Problem] If we stop using connectivity detection, how do we monitor whether hosts are alive?

[Solution] maprobe's health check feature
Built-in health checks are available for ping, TCP, and HTTP.
    probes:
      - service: production
        role: EC2
        ping:
          address: "{{ .Host.IPAddresses.eth0 }}"
      - service: production
        role: ElastiCacheRedis
        tcp:
          host: "{{ .Host.CustomIdentifier }}"
          port: 6379
          send: "PING\n"
          expect_pattern: "PONG"

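The tcp probe in the configuration above sends a string and matches the reply against a pattern. The following Python sketch illustrates that idea only; it is not maprobe's implementation (maprobe is a separate tool), and `tcp_probe`, `FakeRedis`, and `demo` are hypothetical names introduced here. The stand-in server exists so the example can run without a real Redis.

```python
import re
import socket
import socketserver
import threading


def tcp_probe(host, port, send, expect_pattern, timeout=2.0):
    """Return True if the peer's reply matches expect_pattern, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(send.encode())
            reply = conn.recv(1024).decode(errors="replace")
    except OSError:
        # Connection refused, timeout, reset: all count as a failed probe.
        return False
    return re.search(expect_pattern, reply) is not None


class FakeRedis(socketserver.BaseRequestHandler):
    """Hypothetical stand-in server: answers PING with +PONG, as Redis does."""

    def handle(self):
        if self.request.recv(1024).startswith(b"PING"):
            self.request.sendall(b"+PONG\r\n")


def demo():
    """Probe a live stand-in server, then probe the same port after it closes."""
    with socketserver.TCPServer(("127.0.0.1", 0), FakeRedis) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        port = server.server_address[1]
        up = tcp_probe("127.0.0.1", port, "PING\n", "PONG")
        server.shutdown()
    down = tcp_probe("127.0.0.1", port, "PING\n", "PONG", timeout=0.5)
    return up, down
```

Recording the boolean result as a numeric metric, rather than raising an alert directly, is what lets the evaluation happen later on the monitoring side.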
Health checks with maprobe
maprobe health check results become host metrics.
They are not check monitors.

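What is not so good about check monitoring (personal opinion)
Changing a setting means editing a file and deploying it.
- "Turn monitoring off just for now" or "change the threshold for a moment" is hard to do.
How thresholds are evaluated differs from plugin to plugin.
- Does --critical-under mean "less than or equal to" or "less than"?
Alerts tend to fire on many hosts at once.
- Check failures arrive from dozens of hosts even though there is a single cause: NTP clock drift, a mistake in deploying a daemon's configuration…
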
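Check monitoring = a special case of metric monitoring
Storing a metric and then evaluating it achieves the same thing.
Make use of metric monitoring and expression monitoring.
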
Example: liveness monitoring with ping
    sum(role(production:EC2, ping.count.failure))
Warn if ping fails for any host in production:EC2.
If something goes down but the service is still being provided, it is not Critical.

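Example: alerting on jobs piling up in a job queue
    sum(role(production:job-queue, custom.gearmand.queue.*.total))
Several hosts each run a job queue, and the number of waiting jobs is recorded as a metric.
When jobs pile up, they usually pile up in every queue.
Alerting per host with check monitoring tends to fire on all hosts.
Watching the total gives a picture of overall processing.
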
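The difference between per-host checks and a single evaluation of the role-wide sum can be sketched in a few lines of Python. This is an illustration only: the host names, queue depths, and thresholds are made-up values, and the two functions are hypothetical helpers, not part of Mackerel.

```python
def per_host_alerts(queued_jobs, threshold):
    """Check-style monitoring: one alert for every host above the threshold."""
    return [host for host, jobs in sorted(queued_jobs.items()) if jobs > threshold]


def total_alert(queued_jobs, threshold):
    """Expression-style monitoring: one decision on the role-wide sum."""
    return sum(queued_jobs.values()) > threshold


# Made-up queue depths for three hosts in the same role.
queued = {"job01": 120, "job02": 135, "job03": 110}

print(per_host_alerts(queued, 100))  # ['job01', 'job02', 'job03']
print(total_alert(queued, 300))      # True
```

With the same backlog, the per-host approach produces three notifications while the summed expression produces one.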
[Problem] We want to keep using host metrics long term, but they disappear when a host is retired

Custom metrics disappear when the host is retired.

[Solution] Aggregate and store host metrics with maprobe
    aggregates:
      - service: production
        role: push-server
        metrics:
          - name: custom.push.messages.sent
            outputs:
              - func: sum
                name: custom.push.messages.total_sent
For service "production", role "push-server", maprobe fetches the host metric custom.push.messages.sent from every host
→ and stores the computed result as a service metric.

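maprobe aggregate functions
sum, min, max, average, and count are currently supported.
percentile would probably be useful too, but it is not implemented yet.
If custom metrics did not disappear, expression graphs would be perfectly sufficient, so please consider it.
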
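As a rough illustration of what the aggregation step computes, the Python sketch below applies the five functions named on the slide to a list of per-host values. It mirrors the shape of the `outputs` entries in the configuration, but `AGGREGATES` and `aggregate` are hypothetical names and this is not maprobe's code.

```python
from statistics import mean

# The five aggregate functions listed on the slide.
AGGREGATES = {
    "sum": sum,
    "min": min,
    "max": max,
    "average": mean,
    "count": len,
}


def aggregate(values, outputs):
    """Map each output's metric name to the aggregate of the host values.

    values:  the latest value of one host metric, one entry per host
    outputs: list of {"func": ..., "name": ...} dicts, as in the config
    """
    if not values:
        return {}  # no hosts reported, so there is nothing to post
    return {out["name"]: AGGREGATES[out["func"]](values) for out in outputs}


# Made-up values reported by three push-server hosts.
sent_per_host = [10, 20, 30]
outputs = [
    {"func": "sum", "name": "custom.push.messages.total_sent"},
    {"func": "average", "name": "custom.push.messages.avg_sent"},
]
print(aggregate(sent_per_host, outputs))
```

Because the result is stored under a service rather than a host, it survives the retirement of the individual hosts that contributed to it.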
[Problem] Monitoring resources that are even more cloud native

Microservices! Service mesh!
We have started using Envoy, so we want to collect its metrics.
Hitting /stats on one of our Envoy instances returns…
    $ curl -s x.x.x.x:9901/stats
    ...
    cluster.web.default.total_match_count: 1
    cluster.web.external.upstream_rq_200: 988
    cluster.web.external.upstream_rq_2xx: 988
    cluster.web.external.upstream_rq_302: 13
    cluster.web.external.upstream_rq_3xx: 13
    cluster.web.external.upstream_rq_400: 3
    cluster.web.external.upstream_rq_403: 26
    cluster.web.external.upstream_rq_404: 5
    cluster.web.external.upstream_rq_4xx: 34
    ....
https://envoyproxy.io

Should we just send all of this to Mackerel?
    cluster.web.default.total_match_count: 1
    cluster.web.external.upstream_rq_200: 988
    cluster.web.external.upstream_rq_2xx: 988
    cluster.web.external.upstream_rq_302: 13
    cluster.web.external.upstream_rq_3xx: 13
    cluster.web.external.upstream_rq_400: 3
    cluster.web.external.upstream_rq_403: 26
    cluster.web.external.upstream_rq_404: 5
    cluster.web.external.upstream_rq_4xx: 34
    ...

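An inquiry
I casually asked through the feedback form:
Q: "We plan to introduce envoy (https://www.envoyproxy.io/) soon. Does Mackerel plan to release an official plugin for collecting envoy stats?"
A: "Regarding an envoy plugin, at this time we have no plans to release one officially."
Guess I'll build one……
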
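However, Envoy emits a large number of metrics
    $ curl -s x.x.x.x:9901/stats | wc -l
    337
Sending all of these to Mackerel means…
Given the number of metrics allowed per micro host, this counts as multiple hosts' worth.
That is an additional monthly charge (excluding tax) for every host running Envoy.
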
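Change of plan
Sending everything is too expensive.
That said, we do not yet know which values we actually want.
Our experience operating Envoy is limited, so we want to decide which values to watch as we operate it.
Writing a plugin that collects the useful metrics requires operational experience.
In other words, for now we want to collect everything.
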
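What about Prometheus?
A pull-based metric collection and monitoring tool.
Stored values can be flexibly processed and retrieved with PromQL queries.
Long-term metric storage is not much of a design goal.
Visualizing with Grafana or similar seems to be the norm?
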
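Envoy → Prometheus → Mackerel
We want to alert in Mackerel and view graphs there too.
Since we cannot yet tell which values to collect, we want to keep everything for the near term.
- Envoy → Prometheus: this already exists.
- Prometheus → query results emitted in metric plugin format: this is missing.
- Storing metric plugin output in Mackerel: this already exists.
So only the middle piece needs to be built!
Any value we turn out to need can be obtained by querying Prometheus.
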
I wrote mackerel-plugin-prometheus-query
github.com/fujiwara/mackerel-plugin-prometheus-query
    $ mackerel-plugin-prometheus-query \
        -query "up" \
        -metric-key-format "promq.{job}.{instance}"
    promq.web.10_1_129_175_9901 1 1575941187
    promq.web.10_1_130_170_9901 1 1575941187
    promq.web.10_1_131_53_9901 1 1575941187
    promq.prometheus.localhost_9090 1 1575941187

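Prometheus query example
Example: a query for the number of retries to the upstream over one minute
    sum(
      delta(
        envoy_cluster_upstream_rq_retry{envoy_cluster_name="web"}[1m]
      )
    )
Send Mackerel the overall value rather than per-host values.
The individual values are still available in Prometheus if needed.
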
Post the plugin output with mkr throw
    $ mackerel-plugin-prometheus-query \
        -query 'sum(delta(envoy_cluster_upstream_rq_retry_success{envoy_cluster_name="web"}[1m]))' \
        -metric-key-format "envoy.web.upstream.retry.success" \
        | mkr throw --service production

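The `-metric-key-format` option used in the commands above turns Prometheus labels into a Mackerel metric name. The Python sketch below shows one way such a template could be expanded into a metric plugin line (name, value, and epoch seconds separated by tabs). The sanitizing rule, replacing any character other than letters, digits, `_`, and `-` with `_`, is an assumption inferred from the sample output (`10.1.129.175:9901` becoming `10_1_129_175_9901`); the function names are hypothetical and this is not the plugin's actual code.

```python
import re


def format_metric_key(key_format, labels):
    """Expand {label} placeholders, sanitizing each substituted value."""

    def substitute(match):
        value = labels.get(match.group(1), "")
        # Assumed rule: keep only characters that are safe in a metric name.
        return re.sub(r"[^A-Za-z0-9_-]", "_", value)

    return re.sub(r"\{(\w+)\}", substitute, key_format)


def metric_line(key_format, labels, value, timestamp):
    """Build one tab-separated line in Mackerel's metric plugin format."""
    return f"{format_metric_key(key_format, labels)}\t{value}\t{timestamp}"


labels = {"job": "web", "instance": "10.1.129.175:9901"}
print(metric_line("promq.{job}.{instance}", labels, 1, 1575941187))
```

A format without placeholders, such as `envoy.web.upstream.retry.success`, passes through unchanged, which suits queries that already aggregate to a single series.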
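Summary
What cloud-native monitoring means:
a design that tolerates failure → watch the health of the whole.
Gap-filling OSS as the tools for it:
- maprobe: automatic host following, external monitoring, plugin-based monitoring, metric aggregation
- mackerel-plugin-prometheus-query: short term in Prometheus; long term and alerting in Mackerel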