LINE Developers
November 04, 2021
Technology
Building a Large-Scale Log Monitoring Platform with Grafana Loki / Grafana Loki Deep Dive
Slides from a talk given at CloudNative Days Tokyo 2021.
https://event.cloudnativedays.jp/cndt2021/talks/1252
Transcript
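1 Building a Large-Scale Log Monitoring Platform with Grafana Loki. CNDT2021. LINE Corporation, Hiroki Sakamoto / @taisho6339
2 About me - Title: Senior Software Engineer @ LINE Corp - Role: SRE in the Private Cloud development organization - Mission: improving reliability across the Private Cloud - Interests: Kubernetes, distributed systems, weight training, OSS activities - Twitter: @taisho6339
3 What is Loki? "like Prometheus but for logs" • Log storage and search • Log-based alerting • Log-based metric generation • Multi-tenancy supported by default
4 What is Loki?
5 What is Loki?
6 Can store large volumes of logs cheaply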
7 Private Cloud “Verda” in LINE is based on OpenStack.
since 2016~ FaaS PaaS IaaS NAT LB Bare metal
8 Private Cloud “Verda” in LINE Virtual Machines 74000+ 30000+
4000+ Physical Machines Hypervisors
9 20 TB / day application logs
10 Loki is suitable for us!
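11 What makes Loki difficult: Loki is a set of microservices • The mechanism and role of each component and each cache are unclear • Where, in what format, and for how long log data is held is unclear • How it behaves during a storage outage is unclear • What must be considered to run it in production is unclear
12 Goals of this session: we aim to give the most detailed explanation available in Japan • Understand the roles and mechanisms of Loki's components systematically • Be able to identify the cause quickly when trouble occurs • Be able to do capacity management and performance tuning on your own
13 Configuration assumed in this session. Loki Version: v2.3.0. Cache: Memcached. Chunk Storage: AWS S3. Index Storage: AWS S3 + BoltDB Shipper
14 Roadmap of this session 1) Understand the log write process 2) Understand the log read process 3) Understand behavior during failures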
15 1) The log write process
16 Overview
17 Components in the write process. Clients (Promtail, Fluentd); Loki: Distributor, Ingesters; Storage: Amazon S3, Chunk Cache
18 Write process overview. Clients (Promtail, Fluentd) • Log-shipping clients such as Fluentd and Promtail • They read logs and send them to the Loki endpoint
19 Write process overview. Distributor • Validates requests • Routes them to the appropriate Ingester
20 Write process overview. Ingesters: actually save logs to storage, after buffering them for a certain period
21 Write process overview. Storage • Logs are persisted to S3 • A cache of the logs is kept in Memcached (Chunk Cache)
22 Sending from Client to Distributor
23 Client -> Distributor. HTTP Headers: X-Scope-OrgID : <Tenant ID>. The tenant ID is set in a request header
24 Client -> Distributor. Structure of the log data sent to Loki: {service="keystone", hostname="host1"} 00:00:02 keystone log body; {service="keystone", hostname="host1"} 00:00:03 keystone log body; {service="keystone", hostname="host1"} 00:00:04 keystone log body
25 Client -> Distributor. {service="keystone", hostname="host1"} (Stream) 00:00:02 (TS) keystone log body (Log Body)
26 Client -> Distributor. {service="keystone", hostname="host1"} (Stream) 00:00:02 (TS) keystone log body (Log Body). A log has several labels. Each distinct combination of tenant ID and labels is called a "Stream"
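As an illustration of slides 23 to 26, the following is a minimal sketch of what a client might put on the wire. It assumes Loki's JSON push endpoint (`/loki/api/v1/push`) and builds the request without sending it. The host name and the helper `build_push_request` are hypothetical and not taken from the talk.

```python
import json
import urllib.request


def build_push_request(base_url, tenant_id, labels, entries):
    """Build (but do not send) a push request for a single stream.

    labels:  dict of label name -> value; with the tenant ID it identifies one stream
    entries: list of (unix_nanoseconds, log_line) tuples
    """
    payload = {
        "streams": [
            {
                "stream": labels,
                # Timestamps are sent as strings of nanoseconds since the epoch.
                "values": [[str(ts), line] for ts, line in entries],
            }
        ]
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/loki/api/v1/push",
        data=json.dumps(payload).encode("utf-8"),
        # The tenant ID travels in the X-Scope-OrgID header.
        headers={"Content-Type": "application/json", "X-Scope-OrgID": tenant_id},
        method="POST",
    )


req = build_push_request(
    "http://loki.example.internal:3100",  # hypothetical endpoint
    "Tenant1",
    {"service": "keystone", "hostname": "host1"},
    [(1635252722000000000, "keystone log body")],
)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a Loki instance.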
27 Validation at the Distributor • Is the label format correct? • Is the rate limit not exceeded?
28 Validation at the Distributor. Distributors form a cluster • They monitor each other's status • If ingestion_rate_strategy is global, the ingestion rate is controlled across the whole cluster
29 Validation at the Distributor. The clustering exists so that validation can be applied across all Distributors
30 Sending from Distributor to Ingester
31 Distributor -> Ingesters. For the sake of search (covered later), logs are replicated and sent to multiple Ingesters
32 Distributor -> Ingesters. Ingesters form a cluster • They monitor each other's status • They form a consistent hash ring
33 Distributor -> Ingesters. A hash is generated with 32-bit FNV-1 from tenantID + {service="keystone", hostname="host1"}, for example a6965cd7
34 Distributor -> Ingesters. Based on consistent hashing, the Distributor requests as many Ingesters as the replication factor for the computed hash (a6965cd7)
35 Distributor -> Ingesters. The same log is sent simultaneously to the multiple Ingesters returned
36 Distributor -> Ingesters. Responses: OK, OK, Fail
37 Distributor -> Ingesters. Responses: OK, OK, Fail. The write succeeds if a majority respond OK
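The routing described on slides 32 to 37 can be sketched roughly as follows. This is a toy ring under stated assumptions, not Loki's implementation: the token assignment (hashing "name-i"), the `Ring` class, and `quorum_ok` are hypothetical. Only the FNV-1 32-bit hash, the replication factor, and the majority rule come from the slides.

```python
import bisect

FNV_OFFSET_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1_32(data: bytes) -> int:
    """32-bit FNV-1: multiply by the prime, then XOR in each byte."""
    h = FNV_OFFSET_32
    for b in data:
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
        h ^= b
    return h


class Ring:
    """Toy consistent hash ring; each ingester owns several tokens."""

    def __init__(self, ingesters, tokens_per_ingester=8):
        # Hypothetical token assignment: hash "<name>-<i>" for each token.
        self.tokens = sorted(
            (fnv1_32(f"{name}-{i}".encode()), name)
            for name in ingesters
            for i in range(tokens_per_ingester)
        )
        self.keys = [token for token, _ in self.tokens]

    def get(self, key_hash, replication_factor):
        """Walk clockwise from key_hash and collect distinct ingesters."""
        start = bisect.bisect_left(self.keys, key_hash)
        chosen = []
        for i in range(len(self.tokens)):
            name = self.tokens[(start + i) % len(self.tokens)][1]
            if name not in chosen:
                chosen.append(name)
                if len(chosen) == replication_factor:
                    break
        return chosen


def quorum_ok(results):
    """A write succeeds when more than half of the replicas answered OK."""
    return sum(results) > len(results) // 2


ring = Ring(["ingester-0", "ingester-1", "ingester-2", "ingester-3"])
stream_hash = fnv1_32(b'Tenant1{hostname="host1", service="keystone"}')
targets = ring.get(stream_hash, replication_factor=3)
```

With three replicas, `quorum_ok([True, True, False])` is true, which matches the OK / OK / Fail case on slide 37.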
38 Request handling in the Ingester
39 Request handling in the Ingester. Ingester: Memory (Tenant1, Tenant2), Disk. Incoming entry: {service="keystone", hostname="host1"}
40 Request handling in the Ingester. Entries are grouped by Stream and appended in a format called a Chunk (Memory: Tenant1 > Stream1 > chunks)
41 Request handling in the Ingester. The entry is recorded in the WAL (Disk: WAL Segment)
42 Request handling in the Ingester. OK is returned
43 Request handling in the Ingester. If a log arrives whose timestamp is earlier than that of the last log received for the Stream, the request fails with an "Out of order entry" error
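The ordering rule on slide 43 amounts to a per-stream check against the last accepted timestamp. A minimal sketch, with hypothetical class and exception names:

```python
class OutOfOrderEntry(Exception):
    """Raised when an entry is older than the last one accepted for the stream."""


class Stream:
    def __init__(self):
        self.last_ts = None
        self.entries = []

    def push(self, ts, line):
        # Reject entries that go back in time relative to the last accepted entry.
        if self.last_ts is not None and ts < self.last_ts:
            raise OutOfOrderEntry(f"entry at {ts} is older than {self.last_ts}")
        self.entries.append((ts, line))
        self.last_ts = ts


stream = Stream()
stream.push(2, "keystone log body")
stream.push(3, "keystone log body")
try:
    stream.push(1, "late log")
    rejected = False
except OutOfOrderEntry:
    rejected = True
```

Because the check is per stream, two sources that share the same tenant and label set but have skewed clocks can trigger this error; the talk leaves countermeasures out of scope.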
44 Chunk buffering
45 Chunk buffering in the Ingester • A certain amount of logs is grouped into a "Chunk" per Stream • Chunks are kept in memory • A MemoryChunk holds Head and Blocks arrays (Blocks contains compressed blocks)
46 Chunk buffering in the Ingester. MemoryChunk for {service="keystone", hostname="host1"}: Head, Blocks (compressed blocks)
47 Chunk buffering in the Ingester. Log Append: received logs are first appended to the Head
48 Chunk buffering in the Ingester. Log Append: this repeats until one block size has accumulated
49 Chunk buffering in the Ingester. Log Append: this repeats until one block size has accumulated
50 Chunk buffering in the Ingester. Added to Head, then added to Blocks: once a certain size has accumulated, the Head is compressed in the configured format and added to Blocks
51 Chunk buffering in the Ingester. Once a certain number of Blocks have accumulated, the Chunk becomes read-only and goes to the Flush Queue (default 10 blocks; can also be specified via target chunk size)
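The Head / Blocks buffering on slides 45 to 51 can be sketched as below. This is a simplification under stated assumptions: gzip stands in for "the configured format", sizes are counted in bytes of UTF-8 text, and the `MemChunk` class with its parameters is hypothetical.

```python
import gzip


class MemChunk:
    def __init__(self, block_size=256, max_blocks=10):
        self.block_size = block_size  # bytes to accumulate in the head before cutting
        self.max_blocks = max_blocks  # blocks to accumulate before going read-only
        self.head = []                # uncompressed log lines
        self.head_bytes = 0
        self.blocks = []              # compressed blocks
        self.read_only = False

    def append(self, line: str) -> bool:
        """Append one line; return True once the chunk is ready for the flush queue."""
        if self.read_only:
            raise RuntimeError("chunk is read-only")
        self.head.append(line)
        self.head_bytes += len(line.encode("utf-8"))
        if self.head_bytes >= self.block_size:
            self._cut_block()
        return self.read_only

    def _cut_block(self):
        # Compress the head into one block and start a fresh head.
        raw = "\n".join(self.head).encode("utf-8")
        self.blocks.append(gzip.compress(raw))
        self.head = []
        self.head_bytes = 0
        if len(self.blocks) >= self.max_blocks:
            self.read_only = True


chunk = MemChunk(block_size=10, max_blocks=2)
first = chunk.append("0123456789")   # fills one block
second = chunk.append("abcdefghij")  # fills the second block, chunk becomes read-only
```

The point of the two-level layout is that only the small head stays uncompressed in memory, which is what the later capacity estimate relies on.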
52 Flush from Ingester to Storage
53 Flushing Chunks. Ingester: Memory (Tenant1, Tenant2 > Stream1, Stream2 > Chunk), Goroutines, Disk (inverted index). Storage: Amazon S3, Chunk Cache
54 Flushing Chunks. A goroutine detects Chunks that meet a flush condition • The specified size has been reached • The chunk idle period has elapsed since the last update • max_chunk_age has elapsed
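The three flush conditions on slide 54 can be written as a single predicate. A minimal sketch; the `ChunkState` fields and the `flush_reason` helper are hypothetical names, and times are plain seconds:

```python
from dataclasses import dataclass


@dataclass
class ChunkState:
    size_bytes: int
    created_at: float       # when the first entry was appended
    last_updated_at: float  # when the most recent entry was appended


def flush_reason(chunk, now, target_size, idle_period, max_age):
    """Return why the chunk should be flushed, or None if it should stay in memory."""
    if chunk.size_bytes >= target_size:
        return "full"
    if now - chunk.last_updated_at >= idle_period:
        return "idle"
    if now - chunk.created_at >= max_age:
        return "max_age"
    return None


busy = ChunkState(size_bytes=100, created_at=0, last_updated_at=95)
quiet = ChunkState(size_bytes=100, created_at=0, last_updated_at=10)
old = ChunkState(size_bytes=100, created_at=0, last_updated_at=7190)
```

A stream that stops logging is flushed by the idle rule, while a slow but steady stream is eventually flushed by the age rule.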
55 Flushing Chunks. The Chunk is enqueued to the Flush Queue
56 Flushing Chunks. Enqueued Chunks are flushed: 1. Save the Chunk to S3 2. Save it to the Chunk Cache (write-through) 3. Save the inverted index to the local BoltDB
57 Flushing Chunks. Because requests to Ingesters are replicated, writes to Storage are suppressed for Chunks that are already in the Cache
58 Cache load distribution. Ingesters -> Chunk Cache instances (Chunk1 Key, Chunk2 Key, Chunk3 Key). Depending on configuration, Chunks can be spread across the caches by consistent hashing on the Chunk key
59 Write Ahead Log
60 Flushing Chunks (recap)
61 How the Ingester holds Chunks. If the process stops before a flush, the Chunks in memory are lost
62 Purpose of the WAL: persist data to guard against the volatility of memory
63 Characteristics of Loki's WAL • On receiving a log, it is written to both memory and the WAL • A failed WAL write does not fail the request • Recovery from the WAL runs when the Ingester process starts • Unneeded WAL data is purged periodically
64 How the WAL works. A log entry is sent to the Ingester
65 How the WAL works. The entry is appended to a Segment file (Segment1) as raw data
66 How the WAL works. When a file grows large, a new Segment file is created (Segment2)
67 How the WAL works. Unneeded Segment files are purged periodically
68 How the WAL works. The Segment is advanced by one (Segment3 is created)
69 How the WAL works. The Ingester's unflushed Chunks are saved as a snapshot called a Checkpoint (Checkpoint1)
70 How the WAL works. Because the MemoryChunk structure is saved as-is, blocks are stored compressed and logs still in the head are stored uncompressed
71 How the WAL works. Everything except the latest Checkpoint and Segment is deleted (Segment3 and Checkpoint1 remain)
72 How the WAL works. Together they contain every log that has not been flushed at that point
73 How the WAL works. On startup, the Ingester reads the Segment and Checkpoint from disk
74 How the WAL works. They are restored into memory
75 How the WAL works. The process does not start until the restore completes
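The write-then-replay idea behind slides 64 to 75 can be sketched with a single segment file. This is heavily simplified and makes several assumptions: entries are stored as JSON lines rather than Loki's binary record format, there is one segment and no checkpoint or purge, and the `TinyWAL` class is hypothetical.

```python
import json
import os
import tempfile


class TinyWAL:
    def __init__(self, directory):
        self.path = os.path.join(directory, "segment-000001")

    def append(self, tenant, stream, ts, line):
        """Record the raw entry on disk before it is only held in memory."""
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps([tenant, stream, ts, line]) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        """Rebuild the in-memory tenant/stream -> entries map at startup."""
        memory = {}
        if not os.path.exists(self.path):
            return memory
        with open(self.path, encoding="utf-8") as f:
            for raw in f:
                tenant, stream, ts, line = json.loads(raw)
                memory.setdefault((tenant, stream), []).append((ts, line))
        return memory


with tempfile.TemporaryDirectory() as wal_dir:
    wal = TinyWAL(wal_dir)
    wal.append("Tenant1", "Stream1", 2, "keystone log body")
    wal.append("Tenant1", "Stream1", 3, "keystone log body")
    # Simulate a restart: a new object with empty memory replays the file.
    restored = TinyWAL(wal_dir).replay()
```

Note that replay puts everything back into memory, which is why the talk later treats the WAL as effectively a snapshot of memory when sizing Ingesters.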
76 Data on the Ingester and its encoding. Memory: per Chunk, raw data of at most one block size plus compressed blocks. Disk: Segments are raw; the Checkpoint has the same form as the Memory Chunk
77 2) The log read process
78 Overview
79 Components in the read process. Loki: Query Frontend, Queriers, Ingesters; Storage: Amazon S3, Chunk Cache, Index Read Cache; Query Result Cache
80 Components in the read process. Query Frontend: receives the request
81 Components in the read process. Query Frontend: splits the received query (by time, for example) and queues the pieces
82 Components in the read process. Multiple Queriers take queries from the queue and handle them
83 Components in the read process. Queriers ask all Ingesters for the data in their MemoryChunks that matches the query
84 Components in the read process. Queriers fetch the inverted index for the query, going through the Cache transparently
85 Components in the read process. If it is not in the Cache, the matching index is read from the local BoltDB and the result is saved to the Cache (snappy)
86 Components in the read process. The target Chunks are determined from the inverted index
87 Components in the read process. The Chunks are fetched; those not in the Cache are fetched from Storage and saved to the Cache
88 Components in the read process. The Query Frontend receives the results from all Queriers and aggregates, sorts, and deduplicates them
89 Components in the read process. The result is saved to the Query Result Cache
90 Components in the read process. The response is returned
91 Selecting target Chunks from the inverted index
92 Purpose of the inverted index: fetch Chunks from S3 with minimal effort
93 Narrowing down target Chunks with the inverted index 1. Get the Stream ID (Series ID) from the combination of label keys and values 2. Get the Chunk keys from the Series ID and the time range 3. Derive the path on S3 and in Memcached from the Chunk key and download the Chunk
94 SeriesID. Log: {service="keystone", hostname="host1"} 00:00:02 keystone log body. The SeriesID is the sha256 of the key/value combination: 9ac2adda49e899b312a9abb895656b1ab26c9858fd500f2ae3983d5309b39363/
95 Chunk Key. Log: {service="keystone", hostname="host1"} 00:00:02 keystone log body. The FingerPrint is the hash of the key/value combination (a6965cd7). Chunk Key: Tenant1/a6965cd7:<chunk start time>:<chunk end time>
96 Structure of the inverted index. Index for looking up SeriesIDs from the hash of a label key-value:
Hash Value | Range Value | Value
TenantID + LabelName | Hash(Label Value) + SeriesID | Label Value
97 Structure of the inverted index (same table as slide 96). Hash + Range together identify a unique row
98 Structure of the inverted index (same table as slide 96). The Range Value can be used for range scans and sorting
99 Structure of the inverted index, as a table:
Hash Value (TenantID + Label Name) | Range Value (Hash(Label Value) + SeriesID) | Value (Label Value)
Tenant1:service | abc680ab:c79abadeff | keystone
Tenant1:host | cfe960ab:bcfe12ea | hostname1
Tenant1:type | can860ab:c79abadeff | api
Tenant1:service | cdc680ab:c79abadeff | nova
Tenant1:host | bee960ab:bcfe12ea | hostname21
Tenant1:type | abd860ab:c79abadeff | scheduler
100 Structure of the inverted index. This index gets one entry per pattern of label key and value, so a high-cardinality label produces a large number of index entries
101 Structure of the inverted index. Index for looking up Chunk Keys from a SeriesID:
Hash Value | Range Value | Value
TenantID + SeriesID | Chunk start time + Chunk Key | nil
102 Deriving Chunk Keys from the index. LogQL: {service="keystone", hostname="host1"} |= "level=ERROR" (label matcher part, followed by the filter part)
103 Deriving Chunk Keys from the index. The index is used only to evaluate the label matcher part
104 Deriving Chunk Keys from the index. {service="keystone", hostname="host1"} is split into {service="keystone"} and {hostname="host1"}
105 Deriving Chunk Keys from the index. Inverted index entries are fetched for each matching condition
106 Deriving Chunk Keys from the index (same table as slide 99). For {service="keystone"}, the record that hits is Tenant1:service | abc680ab:c79abadeff | keystone
107 Deriving Chunk Keys from the index (same table as slide 99). For {service="keystone"}, the target SeriesID is c79abadeff
108 Deriving Chunk Keys from the index. SeriesIDs are extracted from the index, and only those common to both matchers are kept
109 Deriving Chunk Keys from the index. Chunk Keys are extracted from the SeriesIDs and the time range
110 Deriving Chunk Keys from the index. SeriesID = c79abadeff, range = 2021/10/26 21:52:00 + 5min
Hash Value (TenantID + SeriesID) | Range Value (Chunk start time + Chunk Key) | Value
Tenant1:c79abadeff | 1635252768:chunk1 | nil
Tenant1:c79abadeff | 1635256368:chunk2 | nil
Tenant1:c79abadeff | 1635260768:chunk3 | nil
111 Deriving Chunk Keys from the index (same table as slide 110). The record of the target Chunk is highlighted
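The two-step lookup on slides 93 and 102 to 111 can be sketched with two dictionaries. This is a toy model under stated assumptions: the label index is keyed directly by (tenant, label name, label value) instead of a hash/range key pair, the extra SeriesIDs are made up so that the intersection is visible, and a chunk is selected only by whether its start time falls inside the query range. The chunk rows reuse the values from slide 110.

```python
# Index 1: label key/value -> SeriesIDs (hypothetical data).
label_index = {
    ("Tenant1", "service", "keystone"): {"c79abadeff", "1111111111"},
    ("Tenant1", "hostname", "host1"): {"c79abadeff", "2222222222"},
}

# Index 2: SeriesID -> (chunk start time, chunk key), as on slide 110.
chunk_index = {
    ("Tenant1", "c79abadeff"): [
        (1635252768, "chunk1"),
        (1635256368, "chunk2"),
        (1635260768, "chunk3"),
    ],
}


def series_ids(tenant, matchers):
    """Look up each matcher separately, then keep SeriesIDs common to all."""
    sets = [label_index.get((tenant, k, v), set()) for k, v in matchers.items()]
    return set.intersection(*sets) if sets else set()


def chunk_keys(tenant, matchers, start, end):
    """Return chunk keys whose start time lies in [start, end)."""
    keys = []
    for sid in sorted(series_ids(tenant, matchers)):
        for chunk_start, key in chunk_index.get((tenant, sid), []):
            if start <= chunk_start < end:
                keys.append(key)
    return keys


# 2021/10/26 21:52:00 JST is 1635252720 in Unix seconds; the range is 5 minutes.
query_start = 1635252720
found = chunk_keys(
    "Tenant1",
    {"service": "keystone", "hostname": "host1"},
    query_start,
    query_start + 300,
)
```

The filter part of the query (`|= "level=ERROR"`) plays no role here; it is applied later, when the chunk contents are read.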
112 Building and returning the response
113 Building the response. Chunk Keys -> Lazy Chunks. Lazy Chunks are created: they hold no data initially and fetch the Chunk from storage (or cache) only when a read is requested
114 Building the response. The Lazy Chunks are combined with the Chunks from the Ingesters to create an Iterator
115 Building the response. The Iterator is read until the response reaches the configured number of entries. The log filter condition (|= "level=ERROR") is evaluated here
116 Building the response. When a Lazy Chunk is read, the Chunk is requested from the cache and, if it is not there, from Storage (Amazon S3)
117 Building the response. When a Chunk is read from Storage, it is saved to the Cache after it is fetched
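The lazy loading and filtering on slides 113 to 117 can be sketched as follows. Plain dictionaries stand in for the Chunk Cache and S3, chunk contents are newline-separated text, and the `LazyChunk` class and `iterate` function are hypothetical names.

```python
class LazyChunk:
    """Holds only a key until load() is called."""

    def __init__(self, key, cache, storage):
        self.key = key
        self.cache = cache
        self.storage = storage
        self._data = None

    def load(self):
        if self._data is None:
            data = self.cache.get(self.key)
            if data is None:
                # Cache miss: read from storage, then populate the cache.
                data = self.storage[self.key]
                self.cache[self.key] = data
            self._data = data
        return self._data


def iterate(lazy_chunks, line_filter, limit):
    """Read chunks in order, apply the line filter, stop at the entry limit."""
    out = []
    for chunk in lazy_chunks:
        for line in chunk.load().splitlines():
            if line_filter in line:
                out.append(line)
                if len(out) == limit:
                    return out
    return out


storage = {
    "chunk1": "level=INFO started\nlevel=ERROR failed to connect",
    "chunk2": "level=ERROR timeout",
}
cache = {}
chunks = [LazyChunk(key, cache, storage) for key in ("chunk1", "chunk2")]
result = iterate(chunks, "level=ERROR", limit=1)
```

Because the limit is reached inside the first chunk, the second chunk is never fetched, which is the benefit of keeping chunks lazy.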
118 Query Sharding
119 Query splitting strategies 1. Split by time • When searching one hour of logs with a 15-minute split setting, the query is broken into four queries and executed 2. Shard the inverted index • Shard numbers are embedded in the inverted index in advance; the Query Frontend splits the query, inserts a shard number into each piece, and hands the pieces to Queriers
120 Embedding shard numbers in the inverted index:
Hash Value | Range Value | Value
TenantID + LabelName | Shard Number + Hash(Label Value) + SeriesID | Label Value
Shard Number = SeriesID % shard count
121 Embedding shard numbers in the inverted index. Stream -> Shard Number: {service="keystone", hostname="host1"} -> 1; {service="keystone"} -> 12; {service="keystone",hostname="host2"} -> 16
122 Splitting a query by shard number. LogQL: {service="keystone"} |= "level=ERROR". The query is split by shard: Querier for shard 1, Querier for shard 12, Querier for shard 16
123 Splitting a query by shard number. The Chunk Keys each Querier obtains are partitioned by shard
124 Splitting a query by shard number. The |= "level=ERROR" filter processing can therefore be split across Queriers
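Both splitting strategies from slide 119 are small enough to sketch directly. The function names are hypothetical, times are in seconds, and SeriesIDs are treated as hexadecimal integers for the modulo on slide 120.

```python
def split_by_time(start, end, interval):
    """Cut [start, end) into consecutive sub-ranges of at most `interval` seconds."""
    pieces = []
    t = start
    while t < end:
        pieces.append((t, min(t + interval, end)))
        t += interval
    return pieces


def shard_of(series_id_hex, shard_count):
    """Shard Number = SeriesID % shard count."""
    return int(series_id_hex, 16) % shard_count


def shard_queries(logql, shard_count):
    """Produce one (shard number, query) pair per shard for the Queriers."""
    return [(shard, logql) for shard in range(shard_count)]


one_hour = split_by_time(0, 3600, 15 * 60)
sub_queries = shard_queries('{service="keystone"} |= "level=ERROR"', 16)
```

Each Querier that receives a (shard, query) pair only sees index rows carrying that shard number, so the chunk fetches and the line filtering are divided among them.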
125 Index management with BoltDB Shipper
126 BoltDB Shipper. Ingester (Shipper; Disk: Index 1, Index 2), Querier (Shipper; Disk: Index 1, Index 2), Amazon S3
127 BoltDB Shipper - Ingester side. Periodically uploads the index on the local disk to S3, and deletes it after upload
128 BoltDB Shipper - Querier side • Downloads the index in S3 at startup • Downloads any missing index from S3 at query time • Periodically deletes indexes for which the cache TTL has elapsed since last use
129 BoltDB Shipper. Ingesters and Queriers work with the index locally, and the Shipper syncs the index with Storage asynchronously
130 Summary of each component's role
131 Name | Role | Type | Data held | Clustering
Distributor | Validation and routing to Ingesters | Stateless | (none) | Yes
Ingester | Buffering and flushing data | Stateful | Memory: Chunks (raw + compressed); Disk: WAL (raw + compressed), inverted index (compressed) | Yes
Query Frontend | Query splitting, queue control | Stateless | (none) | No
Querier | Query execution | Stateful | Disk: cache of the inverted index (compressed) | No
Chunk Cache | Cache of Chunks | Stateful | Memory: Chunks (compressed) | Yes (client side)
Index Read Cache | Read cache for the index | Stateful | Memory: inverted index (Snappy) | Yes (client side)
Index Write Cache | Control cache that prevents the same index from being written multiple times (not needed with BoltDB Shipper) | Stateful | Memory: Chunk Key (raw) | Yes (client side)
Query Result Cache | Cache of query results | Stateful | Memory: Query Result (raw) | Yes (client side)
132 3) Behavior during failures
133 Failure design for writes
134 Failure design for writes: preparing for an S3 outage
135 Failure design for writes. The flush from the Ingester to Amazon S3 fails here
136 Failure design for writes. Because the flush is asynchronous, the write itself does not fail
137 Failure design for writes. While the flush keeps failing, data keeps accumulating in memory and on disk
138 Failure design for writes. Memory overflows first, resulting in OOM
139 How the WAL works (recap). On startup, the Ingester reads the Segment and Checkpoint from disk
140 How the WAL works (recap). They are restored into memory
141 Failure design for writes. The WAL is effectively a snapshot of memory • Even if the load succeeds, OOM will occur again soon • If the load fails, the process does not start at all
142 Failure design for writes. The bottleneck is the Ingester's memory
143 Data on the Ingester and its encoding (recap). Memory: per Chunk, raw data of at most one block size plus compressed blocks. Disk: Segments are raw; the Checkpoint has the same form as the Memory Chunk
144 Failure design for writes. Countermeasure 1: provision enough memory for the duration you want to withstand • log volume per hour / compression ratio * hours to withstand * replication factor. Note: in our environment, gzip compression gives a ratio of 10x to 18x
145 Failure design for writes. Withstanding one hour in an environment with 10 TB of traffic per day (one of our regions): 1000 / 24 = 41.6 GB / hour (the 1000 GB is presumably the 10 TB per day after roughly 10x compression) • Replication Factor = 1 • Ingester * 11 • Memory: 4GiB • Disk: 8GiB (twice the memory, to leave a margin) • Chunk Cache * 14 • Memory: 3GiB
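The sizing rule on slide 144 and the numbers on slide 145 can be checked with a few lines of arithmetic. This sketch assumes a compression ratio of 10 (the low end of the 10x to 18x range quoted), treats 10 TB as 10,000 GB, and does not distinguish GB from GiB; the function name is hypothetical.

```python
import math


def ingester_memory_gb(raw_gb_per_hour, compression_ratio, hours, replication_factor):
    """log volume per hour / compression ratio * hours to withstand * replication factor"""
    return raw_gb_per_hour / compression_ratio * hours * replication_factor


raw_gb_per_hour = 10_000 / 24  # 10 TB per day of raw logs
required_gb = ingester_memory_gb(
    raw_gb_per_hour, compression_ratio=10, hours=1, replication_factor=1
)

memory_per_ingester_gb = 4
ingester_count = math.ceil(required_gb / memory_per_ingester_gb)
```

Under these assumptions `required_gb` comes to about 41.7 and `ingester_count` to 11, in line with the slide. Raising the replication factor to 3 would triple the memory needed for the same one-hour tolerance.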
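146 Failure design for writes. Is a replication factor of 1 acceptable? • If an Ingester process goes down before a flush, the logs it held during that interval are lost; we accept this to some extent • Because of the WAL, recovery right after a restart is possible • We lean toward raising the chance of staying operational during an outage rather than avoiding small losses
147 Failure design for writes. Countermeasure 2: during an outage, temporarily start with the WAL disabled • Because no WAL load runs, the process can start even when more data has accumulated than fits in memory • Take the replication factor and update strategy into account so that logs are not lost when the WAL is re-enabled
148 Failure design for reads
149 Failure design for reads. Chunk Keys -> Lazy Chunks, combined with Chunks from the Ingesters -> Iterator -> Response. Storage: Amazon S3, Chunk Cache
150 Failure design for reads. Conditions for searching logs during a Storage outage • At least one Ingester is healthy • The search result can be covered by data in the Cache or in the Ingesters • The query does not specify a time range that is missing from the Cache
151 Summary
152 Summary. We covered Loki's core write and read processes in detail • Understanding how it works makes troubleshooting and tuning possible • Knowing where data is held and in which encoding makes capacity planning possible • Understanding behavior during failures makes it possible to plan appropriate preparations
153 Summary. Topics not covered • Metric generation and alerting from logs • Log retention management • Monitoring Loki itself • Capacity management for each component • Countermeasures for out of order entries
154 I will write a book separately
155 Questions: Twitter @taisho6339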
156 THANK YOU