2. Self Introduction
• Paul Traylor
• LINE Fukuoka Development Department
• Currently responsible for updating the monitoring environment at LINE Fukuoka
• https://github.com/line/promgen
• https://promcon.io/2017-munich/talks/prometheus-as-a-internal-service/
3. Operating Prometheus at LINE Fukuoka
• 4 HA Pairs
• ~2000 targets per machine
• ~800k samples per machine
• ~3.5 million samples
• ~7000 exporters
4. Scaling Prometheus ‒ HA
• Run multiple Prometheus instances with the same targets
• Alerts are de-duplicated by Alertmanager
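A minimal sketch of what the shared configuration for such an HA pair might look like; the job name, hostnames, and scrape interval are hypothetical, not the actual LINE Fukuoka setup:

  # prometheus.yml ‒ identical on both instances of the HA pair
  global:
    scrape_interval: 15s
  scrape_configs:
    - job_name: node
      static_configs:
        - targets: ['web01:9100', 'web02:9100']
  alerting:
    alertmanagers:
      - static_configs:
          # both instances fire identical alerts; Alertmanager de-duplicates them
          - targets: ['alertmanager:9093']

Because both instances scrape the same targets and evaluate the same rules, losing one of them does not lose alerting coverage.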
5. Scaling Prometheus ‒ Shard
• Split targets across multiple servers
• Alertmanager de-duplicates alerts
• Proxy or remote read to query across shards
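One common way to implement this kind of sharding is hashmod relabelling on each shard, with a separate server pulling from the shards over remote read for global queries. This is a generic sketch (hostnames and shard count are hypothetical), not necessarily the exact setup described in the talk:

  # On each shard ‒ identical apart from the regex: keep only targets that hash to this shard
  scrape_configs:
    - job_name: node
      static_configs:
        - targets: ['web01:9100', 'web02:9100', 'web03:9100', 'web04:9100']
      relabel_configs:
        - source_labels: [__address__]
          modulus: 2              # total number of shards
          target_label: __tmp_hash
          action: hashmod
        - source_labels: [__tmp_hash]
          regex: '0'              # '1' on the second shard
          action: keep

  # On a global query server ‒ read data back from the shards
  remote_read:
    - url: http://shard-0:9090/api/v1/read
    - url: http://shard-1:9090/api/v1/read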
6. Prometheus 1.8 ‒ Storage Format
https://promcon.io/2016-berlin/talks/the-prometheus-time-series-database/
http://labs.gree.jp/blog/2017/10/16614/
• One series per file
• Rewrites may have to touch millions of files
• Queries may also touch millions of files
• No easy way to back up
7. Prometheus 2.0 ‒ New Storage Format
https://promcon.io/2017-munich/slides/storing-16-bytes-at-scale.pdf
https://fabxc.org/blog/2017-04-10-writing-a-tsdb/
• Chunks stored in buckets by time
• Chunks past the retention setting are simply deleted
• Easier to back up
• Easier to compress
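Roughly, the 2.0 data directory is a set of per-time-window block directories plus a write-ahead log (block names below are illustrative), which is what makes retention and backups straightforward:

  data/
    01BKGV7JBM69T2G1BGBGM6KB12/   # one block covering a fixed time window
      chunks/                     # compressed sample data
      index                       # index of the series in this block
      meta.json                   # block time range and metadata
    01BKGTZQ1SYQJTR4PB43C8PD98/   # an older block; retention just drops whole blocks
    wal/                          # write-ahead log for the most recent data

A backup can be a copy of the block directories, or a snapshot taken through the 2.0 admin API (enabled with --web.enable-admin-api).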
9. Prometheus 2.0 ‒ Flag Changes
• Most flags move from single dash to double dash
• Many storage.local settings move to storage.tsdb settings
• -config.file -> --config.file
• -storage.local.path -> --storage.tsdb.path
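For example, a 1.x start-up command would change roughly like this in 2.0 (paths are hypothetical):

  # Prometheus 1.x
  prometheus -config.file=/etc/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus

  # Prometheus 2.0
  prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus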