The State of Open Source Monitoring Tools
Michael Richardson (@m_richo)
Energized Work
What tools are we currently using to
monitor and troubleshoot our systems?

• Nagios
• ssh + grep <something_bad> /some/random/log/file.log
• tail -f /some/random/log/file.log
• Others?
Nagios
Nagios – The lovers
Nagios Love-meter
Where are you on the scale?
0: Nagios shits me to tears
10: Sign me up to Nagios World Conference 2013!!!!
Alternatives?
Yep, there’s lots: some are better and some are worse.
Today let’s check out
• Graphite
• StatsD
• Logstash
• Sensu
Graphite
• Metric storage
• Complex graph creation
• http://graphite.wikidot.com
• Apache 2.0 license
• Send time-series data that you are interested in graphing
Graphite
Components
1. Web
2. Whisper
3. Carbon
Graphite
• Everything stored in Graphite has a path with components delimited by dots, e.g.

servers.HOSTNAME.METRIC
applications.APPNAME.METRIC

servers.database01.memfree
applications.trading.loginattempts
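
A minimal sketch of feeding such a path to Graphite, assuming Carbon's plaintext listener is reachable on localhost at its default port 2003. The helper names `format_metric` and `send_metric` are made up for illustration. The wire format is one line per data point: path, value, Unix timestamp.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: path, value, timestamp, newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Open a TCP connection to Carbon and send a single data point.

    host and port are assumptions; point them at your own Carbon instance.
    """
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

With Carbon running, `send_metric("servers.database01.memfree", 2048)` would record one data point under that path.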
Graphite
• No need to pre-define metric end-points
• Determine the granularity of data up front

/opt/graphite/conf/storage-schemas.conf
[stats]
pattern = ^stats.*
retentions = 10:2160,60:10080,600:262974
[catchall]
priority = 0
pattern = ^.*
retentions = 30:86400,300:525600
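
To see what those retention strings mean, here is a small sketch that does the arithmetic. It assumes the older `seconds_per_point:points` notation used in the config above; `describe_retentions` is a hypothetical helper, not part of Graphite.

```python
def describe_retentions(spec):
    """Parse 'seconds_per_point:points,...' into (step, points, total_seconds) tuples."""
    archives = []
    for archive in spec.split(","):
        step, points = (int(part) for part in archive.split(":"))
        archives.append((step, points, step * points))
    return archives


# The [stats] rule from the config above.
for step, points, total in describe_retentions("10:2160,60:10080,600:262974"):
    print(f"{step:>4}s per point x {points:>6} points = {total / 86400:.2f} days")
```

Under that reading, the `[stats]` rule keeps 10-second data for 6 hours, 1-minute data for 7 days, and 10-minute data for roughly 5 years.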
Graphite
What should I graph/trend?
1. Application Profiling Data
2. Operational Profiling Data
3. Regression Testing (releases)
Why should I graph/trend?
1. Trends can tell you when something is about to break.
2. …instead of hearing from your customers that it’s broken
3. Data can tell you when something is already broken but
you don’t yet know it (regression).

Source: Jason Dixon (@obfuscurity)
Graphite
Demo

Image source - http://joemiller.me/2011/11/05/correlating-puppet-changes-to-events-in-your-infrastructure/
StatsD
• Measure Anything, Measure Everything
• Created and released by Etsy
• Aggregates counters and timers
• http://github.com/etsy/statsd
StatsD
• Written in Node.js
• ~400 lines of JavaScript
• Listens for statistics (counters & timers) and sends aggregates to backend services (like Graphite)
• Simple
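
To make "sends aggregates" concrete, here is a rough sketch of what one flush could compute. This is not StatsD's own code: the `flush` function and its input shapes are invented for illustration, and the 90th percentile uses a simple nearest-rank method that may differ from StatsD's calculation. Counters are scaled by the sample rate a client may attach to them.

```python
def flush(counters, timers):
    """Aggregate raw samples collected during one flush interval.

    counters: {bucket: [(value, sample_rate), ...]}
    timers:   {bucket: [milliseconds, ...]}
    """
    stats = {}
    for bucket, samples in counters.items():
        # A sample sent at rate 0.1 stands in for roughly ten real events.
        stats[bucket] = sum(value / rate for value, rate in samples)
    for bucket, values in timers.items():
        if not values:
            continue
        ordered = sorted(values)
        count = len(ordered)
        rank = -(-90 * count // 100)  # ceil(0.9 * count), nearest-rank method
        stats[bucket] = {
            "count": count,
            "lower": ordered[0],
            "upper": ordered[-1],
            "mean": sum(ordered) / count,
            "upper_90": ordered[rank - 1],
        }
    return stats
```

After a flush, the real daemon resets its buckets and forwards the aggregates to the configured backend.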
StatsD
Don’t like JavaScript or Node.js?
Google “statsd alternatives”…
There are 20+ rewrites/clones, including Ruby, Python, Scala, Python + Twisted, Erlang, Clojure, C, and Groovy.
StatsD
Concepts
• Buckets (a name that translates to a Graphite end-point)
• Values
• Flush (default 10 seconds)
Counter metrics
successfullogins:1|c|@0.1
Timing metrics
apitimer:320|ms
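
The two metric lines above can be produced and sent with a few lines of code. This is a sketch that assumes a StatsD daemon listening for UDP on localhost at its default port 8125. The `counter`, `timing`, and `send` helpers are illustrative names, not an official client.

```python
import random
import socket


def counter(bucket, value=1, sample_rate=1.0):
    """Format a counter datagram, e.g. 'successfullogins:1|c|@0.1'."""
    line = f"{bucket}:{value}|c"
    if sample_rate < 1.0:
        line += f"|@{sample_rate}"
    return line


def timing(bucket, milliseconds):
    """Format a timer datagram, e.g. 'apitimer:320|ms'."""
    return f"{bucket}:{milliseconds}|ms"


def send(line, host="localhost", port=8125, sample_rate=1.0):
    """Send one datagram over UDP; with a sample rate, send only that fraction of the time."""
    if sample_rate < 1.0 and random.random() >= sample_rate:
        return False
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
    return True
```

For example, `send(counter("successfullogins", 1, 0.1), sample_rate=0.1)` sends about one packet in ten, and the `@0.1` suffix lets the daemon scale the count back up. Because the transport is UDP, the application does not block or fail if the daemon is down.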
StatsD
Counter examples
• Successful customer login attempts
• Failed customer login attempts
• Register a new customer
• Hit 3rd party API
StatsD
Timer examples
• How fast is our function blah()
• How fast is a database query
• How fast is our 3rd party API service
• How fast is our internet access
• How fast are our page response times.
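
One way to capture timings like these is to wrap the code being measured. In the sketch below, `timed` is a hypothetical helper, the bucket name is invented, and `emit` defaults to `print` so the example runs without a StatsD daemon. In practice it would hand the line to a UDP sender.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(bucket, emit=print):
    """Time the enclosed block and emit a StatsD timer line in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = int((time.perf_counter() - start) * 1000)
        emit(f"{bucket}:{elapsed_ms}|ms")


def blah():
    time.sleep(0.05)  # stand-in for real work


with timed("functions.blah"):
    blah()
```

The timer line is emitted even if the wrapped code raises, so slow failures show up in the graphs as well.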
StatsD

demo
LogStash
• Tool for managing events and logs
• http://logstash.net
• https://github.com/logstash/logstash
• Apache 2.0 license
• Created by Jordan Sissel (@jordansissel)
LogStash
• Written in Ruby.
• Built with JRuby and ships as a jar file.
LogStash
The LogStash agent is an event pipeline with 3 parts:
1. Inputs
2. Filters
3. Outputs
LogStash
1. Inputs – generate events
2. Filters – modify them
3. Outputs – ship them somewhere
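
The three stages can be pictured as chained generators. This is only a conceptual sketch in Python, not Logstash code or configuration. The function names echo the stdin, grep, and stdout plugins, and the event shape (a dict with a `message` key) is an assumption made for the example.

```python
import re


def stdin_input(lines):
    """Input: turn raw lines into events (plain dicts here)."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def grep_filter(events, pattern):
    """Filter: keep only events whose message matches, and tag them."""
    regex = re.compile(pattern)
    for event in events:
        if regex.search(event["message"]):
            event["tags"] = ["matched"]
            yield event


def stdout_output(events):
    """Output: ship each event somewhere; here, just print it."""
    shipped = 0
    for event in events:
        print(event)
        shipped += 1
    return shipped


log_lines = ["GET /index.html 200", "GET /missing 404", "POST /login 500"]
stdout_output(grep_filter(stdin_input(log_lines), r" (404|500)$"))
```

Each stage only knows about events, which is what lets inputs, filters, and outputs be mixed freely.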
LogStash
Inputs include:
amqp, drupal_dblog, eventlog, exec, file,
ganglia, gelf, gemfire, generator, heroku,
irc, log4j, lumberjack, pipe, redis, relp, sqs,
stdin, stomp, syslog, tcp, twitter, udp, xmpp,
zenoss, zeromq
LogStash
Filters include:
alter, anonymize, checksum, csv, date, dns,
environment, gelfify, geoip, grep, grok,
grokdiscovery, json, kv, metrics, multiline,
mutate, noop, split, syslog_pri, urldecode,
xml, zeromq
LogStash
Outputs include:
amqp, boundary, circonus, cloudwatch,
datadog, elasticsearch, elasticsearch_http,
elasticsearch_river, email, exec, file,
ganglia, gelf, gemfire, graphite, graphtastic,
http, internal, irc, juggernaut, librato, loggly,
lumberjack, metriccatcher, mongodb,
nagios, nagios_nsca, null, opentsdb,
pagerduty, pipe, redis, riak, riemann, sns,
sqs, statsd, stdout, stomp, syslog, tcp,
websocket, xmpp, zabbix, zeromq
LogStash
Typical setup
LogStash
Shipper alternatives?
• Syslog (rsyslog, syslog-ng)
• Lumberjack
https://github.com/jordansissel/lumberjack

• Beaver
https://github.com/josegonzalez/beaver

• Woodchuck
https://github.com/danryan/woodchuck
LogStash
Kibana
• Web interface for viewing Logstash records stored in Elasticsearch
• http://kibana.org/
• http://github.com/rashidkpc/Kibana
• Search for records
• Stream records (near realtime)
• Create RSS feeds based on search
results
• Score, trend data
LogStash
Kibana – search data

Image source - http://kibana.org/
LogStash
Kibana – trend data

Image source - http://kibana.org/
LogStash
Demo
(Syslog & Apache access logs)
LogStash
TIP – Go buy the Logstash Book –
http://logstashbook.com/
James Turnbull (@kartar)
It’s a great introduction to how to use
Logstash.
Sensu
• https://github.com/sensu/sensu
• Creator – Sean Porter (@portertech)
• Ruby, RabbitMQ, Redis
• <1200 lines of code
• Omnibus installation packages
Sensu
Components
• Sensu-server
• Sensu-client
• Sensu-api
• Sensu-dashboard
Sensu
• Message oriented architecture
(messages are JSON objects)
• Described as a monitoring router
• Connects “check” scripts on Sensu
Clients to “handler” scripts on Sensu
Servers
Sensu
Checks can
• Determine whether a service like Apache is up and running (check exit code)
• Collect metrics like page views or
database cache usage.
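
A check is just a script whose exit code carries the result. The sketch below follows the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), which Sensu checks also use. The `check_tcp` function, the host, and the port are assumptions chosen for illustration.

```python
import socket
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, timeout=3.0):
    """Return (exit_code, message) describing whether a TCP port accepts connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"CheckTCP OK: {host}:{port} is accepting connections"
    except OSError as error:
        return CRITICAL, f"CheckTCP CRITICAL: {host}:{port} unreachable ({error})"


def main():
    # Assumed target: a web server on the local machine.
    status, message = check_tcp("localhost", 80)
    print(message)
    return status

# As a standalone check script, finish with: sys.exit(main())
```

The printed line becomes the check output and the exit code becomes its status.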
Sensu
Output of checks is routed to one or more handlers, which determine what to do.
• Send alerts via email, PagerDuty, IRC, Twitter, Basecamp, XMPP, HipChat, Campfire, etc.
• Feed metrics to backend services like Graphite, Librato, OpenTSDB, etc.
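
A handler can be as small as a script that reads the event JSON from standard input and acts on it. The sketch below only formats a one-line alert. The field names (`client.name`, `check.name`, `check.status`, `check.output`) are assumed from the Sensu event format of that era and should be checked against your version. `summarize` is a hypothetical helper.

```python
import json
import sys

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def summarize(event):
    """Turn a Sensu event (already parsed from JSON) into a one-line alert."""
    client = event["client"]["name"]
    check = event["check"]
    status = STATUS_NAMES.get(check["status"], "UNKNOWN")
    return f"{status}: {client}/{check['name']}: {check['output'].strip()}"


def main(stream=sys.stdin):
    """Read one event from the stream and print its summary."""
    event = json.load(stream)
    print(summarize(event))  # a real handler would email, page, or post to chat here
```

Swapping the `print` for an email, chat, or Graphite call is what distinguishes one handler from another.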
Sensu
demo
Questions??
Thank you

Editor's Notes

  1. Anyone want a quick rundown of how it works? Fault detection, notifications, escalations, acknowledgements, adding new nodes, no Ajax.
  2. Graphite is a highly scalable real-time graphing system, written in Python, Apache 2.0 license.
  3. Web: Django. Whisper: metrics database format (similar to RRDtool); accepts out-of-order data and supports pipelining of data in a single operation. Carbon: storage engine (agent + cache + persister).
  4. Web: Django. Whisper: database for storing time-series data. Carbon: listening service for capturing data.
  5. Why graphing and trending: application profiling data, operational profiling data.
  6. Counter example: add 1 to the particular bucket. The count is sent at the flush interval and reset to 0. The @0.1 suffix tells StatsD that the counter is sampled 1/10th of the time. Timing example: the API service took 320 ms to complete. StatsD determines percentiles, average (mean), standard deviation, sum, and lower and upper bounds for the flush interval. It can store a histogram of values too (not by default).
  7. Mean, upper, lower, stddev, upper 90, lower 90, count.
  8. Embedded web server and embedded Elasticsearch. Lead in to shipper alternatives.
  9. Designed with CM (configuration management) in mind. Describe how the client registers with the server.
  10. Reuse Nagios plugins.