Fluentd meetup in Japan. I talked about "Dive into Fluent plugin".
Some contents are outdated. See this slide: http://www.slideshare.net/repeatedly/dive-into-fluentd-plugin-v012
8. Agenda
Yes, I talk about
- an example of Fluentd plugins
- Fluentd and libraries
- how to develop a Fluentd plugin
No, I don’t talk about
- the details of each plugin
- production experience
2012 2 4
9. Example
based on bit.ly/fluentd-with-mongo
10. Install
Plugins are named fluent-plugin-xxx,
and the fluent-gem command is included in the Fluentd gem.
24. Serialization:
A JSON-like, fast, and compact format.
RPC:
Async and parallelism for high performance.
IDL:
Easy to integrate and maintain the service.
25. Binary format,
Header + Body,
and
Variable length.
26. Note that
the Ruby version can’t handle a Time object.
27. So,
we use an Integer object instead of a Time.
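A minimal sketch of this workaround in plain Ruby, using only the standard library (it is not Fluentd or serializer code, and the variable names are illustrative). Time#to_i gives Unix epoch seconds, which any integer-capable format can carry, and Time.at rebuilds a Time on the receiving side. Sub-second precision is lost along the way.

```ruby
# Represent an event time as an Integer (Unix epoch seconds) instead of a Time.
original = Time.utc(2012, 2, 4, 12, 30, 45)

# Time#to_i drops sub-second precision and returns seconds since the epoch.
epoch_seconds = original.to_i

# The receiving side can rebuild a Time from the Integer.
restored = Time.at(epoch_seconds).utc

puts epoch_seconds.is_a?(Integer)
puts restored == original
```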
37. We can load the plugin configuration using
config_param and configure method.
config_param sets the config value to
@<config name> automatically.
38. fluentd.conf:
<source>
  type tail
  path /path/to/log
  ...
</source>
in_tail.rb:
class TailInput < Input
  Plugin.register_input('tail', self)
  config_param :path, :string
  ...
end
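To make the "@<config name>" behavior concrete, here is a toy imitation of the idea in plain Ruby. It is not Fluentd's implementation: ToyConfigurable, ToyTailInput, and the error message are assumptions made for illustration, and the only point is to show a declared parameter ending up in an instance variable of the same name.

```ruby
# Toy stand-in for the config_param idea. This is NOT Fluentd's code;
# ToyConfigurable and its behavior are assumptions made for illustration.
module ToyConfigurable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declared parameters: name => { type:, options: }
    def config_params
      @config_params ||= {}
    end

    def config_param(name, type, options = {})
      config_params[name] = { type: type, options: options }
    end
  end

  # Copy each declared parameter from the conf hash into @<name>.
  def configure(conf)
    self.class.config_params.each do |name, spec|
      key = name.to_s
      if conf.key?(key)
        value = conf[key]
      elsif spec[:options].key?(:default)
        value = spec[:options][:default]
      else
        raise ArgumentError, "'#{key}' parameter is required"
      end
      instance_variable_set("@#{name}", value)
    end
  end
end

class ToyTailInput
  include ToyConfigurable
  config_param :path, :string
  config_param :tag, :string, :default => nil
  attr_reader :path, :tag
end

input = ToyTailInput.new
input.configure("path" => "/path/to/log")
puts input.path
```

A parameter without a default is treated as required in this sketch, so configuring ToyTailInput with an empty hash raises ArgumentError.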
39. One trick is here:
Fluentd’s configuration module does not
verify the default value. So,
we can use nil like a Tribool :)
  config_param :tag, :string, :default => nil
  # Fluentd does not check the type of the default value
41. SetTagKeyMixin:
Provide ‘tag_key’ and ‘include_tag_key’.
SetTimeKeyMixin:
Provide ‘time_key’ and ‘include_time_key’.
DetachMultiProcessMixin:
Provide ‘detach_process’ and
execute an action in multiple processes.
42. Mixin usage
class MongoOutput < BufferedOutput
  ...
  include SetTagKeyMixin
  config_set_default :include_tag_key, false
  ...
end
Code Flow
MongoOutput -> super -> SetTagKeyMixin -> super -> BufferedOutput -> super
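The super chain in this flow follows from Ruby's method resolution order: an included module sits between the class and its superclass. The sketch below shows that in plain Ruby. The Toy* names are hypothetical stand-ins, not Fluentd's real classes, and the trace array exists only to record the call order.

```ruby
# Plain-Ruby sketch of the super chain; the Toy* names are hypothetical
# stand-ins, not Fluentd's real classes.
class ToyBufferedOutput
  attr_reader :trace

  def configure(conf)
    (@trace ||= []) << "ToyBufferedOutput"
  end
end

module ToySetTagKeyMixin
  def configure(conf)
    super                         # continues to ToyBufferedOutput#configure
    @trace << "ToySetTagKeyMixin"
    @include_tag_key = conf.fetch("include_tag_key", false)
  end
end

class ToyMongoOutput < ToyBufferedOutput
  include ToySetTagKeyMixin
  attr_reader :include_tag_key

  def configure(conf)
    super                         # continues to ToySetTagKeyMixin#configure
    @trace << "ToyMongoOutput"
  end
end

out = ToyMongoOutput.new
out.configure("include_tag_key" => true)
puts ToyMongoOutput.ancestors.first(3).inspect
puts out.trace.inspect
```

Because each configure calls super first, the base class finishes its work before the mixin and the plugin class do theirs.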
45.
class NewInput < Input
  ...
  def configure(conf)
    # parse a configuration manually
  end

  def start
    # invoke action
  end

  def shutdown
    # cleanup resources
  end
end
46. In the action method,
we use Engine.emit to input data.
Sample:
  tag = "app.tag"
  time = Engine.now
  record = {"key" => "value", ...}
  Engine.emit(tag, time, record)
47. How to read an input in an efficient way?
We use a thread and an event loop.
48. Thread
class ForwardInput < Fluent::Input
  ...
  def start
    ...
    @thread = Thread.new(&method(:run))
  end

  def run
    ...
  end
end
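The start/run split can be tried without Fluentd. The following is a sketch in plain Ruby under stated assumptions: ToyThreadInput, its Queue-based source, the :stop sentinel, and the emit callback are all hypothetical. A real plugin would read from a socket or file, call Engine.emit, and take the time from Engine.now.

```ruby
require "thread"

# Sketch of the thread pattern. ToyThreadInput and the emit callback are
# hypothetical; a real plugin would call Engine.emit with Engine.now.
class ToyThreadInput
  def initialize(source, &emit)
    @source = source   # a Queue standing in for a socket or file
    @emit = emit
  end

  def start
    # Run the blocking read loop off the caller's thread.
    @thread = Thread.new(&method(:run))
  end

  def shutdown
    @source << :stop   # wake the reader and ask it to finish
    @thread.join
  end

  private

  def run
    while (line = @source.pop) != :stop
      @emit.call("app.tag", Time.now.to_i, { "message" => line })
    end
  end
end

source = Queue.new
events = Queue.new
input = ToyThreadInput.new(source) { |tag, time, record| events << [tag, time, record] }
input.start
source << "first"
source << "second"
input.shutdown
puts events.size
```

Pushing the sentinel through the same queue means every line queued before shutdown is emitted before the thread exits.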
49. Event loop
class ForwardInput < Fluent::Input
  ...
  def start
    @loop = Coolio::Loop.new
    @lsock = listen
    @loop.attach(@lsock)
    ...
  end
  ...
end
50. Note that
we must use Engine.now instead of Time.now.
58.
class NewOutput < BufferedOutput
  # configure, start and shutdown
  # are same as input plugin

  def format(tag, time, record)
    # convert event to raw string
  end

  def write(chunk)
    # write chunk to target
    # chunk has multiple formatted data
  end
end
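The division of labor between format and write can be illustrated with a toy class in plain Ruby. ToyJsonOutput, its emit and flush helpers, and the choice of one JSON document per line are assumptions for this sketch. Fluentd's real buffer and chunk objects are not modeled, and the chunk here is just a String.

```ruby
require "json"

# Toy illustration of the format/write split. ToyJsonOutput is a hypothetical
# stand-in; the chunk here is just a String of formatted events.
class ToyJsonOutput
  attr_reader :written

  def initialize
    @chunk = String.new
    @written = []
  end

  # Convert one event into a raw string (one JSON document per line).
  def format(tag, time, record)
    JSON.generate([tag, time, record]) + "\n"
  end

  # Append the formatted event to the pending chunk.
  def emit(tag, time, record)
    @chunk << format(tag, time, record)
  end

  # Receive many formatted events at once and send them to the target,
  # which in this sketch is an in-memory array.
  def write(chunk)
    chunk.each_line { |line| @written << JSON.parse(line) }
  end

  def flush
    write(@chunk)
    @chunk = String.new
  end
end

out = ToyJsonOutput.new
out.emit("app.tag", 1328358645, { "key" => "value" })
out.emit("app.tag", 1328358646, { "key" => "other" })
out.flush
puts out.written.length
```

Formatting happens once per event, while write runs once per chunk, so the expensive call to the target is amortized over many events.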
59. Output has 3 buffering modes.
None
Buffered
Time sliced
60. Buffering type
Buffered / Time sliced
Buffer has an internal chunk map to manage a chunk.
A key is tag in Buffered, but a key is time slice in TimeSliced buffer.
(Diagram labels: from in, chunk, queue limit, chunk limit, go out)
def write(chunk)
  # chunk.key is time slice
end
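The difference between the two chunk maps comes down to the key used for grouping. This plain-Ruby sketch groups the same events both ways. It does not use Fluentd's buffer classes, and the daily "%Y%m%d" slice format and the sample events are assumptions chosen for illustration.

```ruby
# Toy chunk maps; not Fluentd's buffer classes. Events are [tag, time, record].
events = [
  ["app.web", Time.utc(2012, 2, 4, 10, 0, 0).to_i, { "n" => 1 }],
  ["app.db",  Time.utc(2012, 2, 4, 23, 59, 59).to_i, { "n" => 2 }],
  ["app.web", Time.utc(2012, 2, 5, 0, 0, 1).to_i, { "n" => 3 }]
]

# Buffered: the chunk key is the tag.
by_tag = events.group_by { |tag, _time, _record| tag }

# Time sliced: the chunk key is a time slice (daily slices, an assumed format).
by_slice = events.group_by do |_tag, time, _record|
  Time.at(time).utc.strftime("%Y%m%d")
end

puts by_tag.keys.inspect
puts by_slice.keys.inspect
```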
61. How to write an output in an efficient way?
We can use multiple processes (for input too).
See: DetachMultiProcessMixin
with detach_multi_process
64.
class MongoOutputTest < Test::Unit::TestCase
  def setup
    Fluent::Test.setup
    require 'fluent/plugin/out_mongo'
  end

  def create_driver(conf = CONFIG)
    Fluent::Test::BufferedOutputTestDriver.new(Fluent::MongoOutput) {
      def start # prevent external access
        super
      end
      ...
    }.configure(conf)
  end
65.
  ...
  def test_format
    # test format using emit and expect_format
  end

  def test_write
    d = create_driver
    t = emit_documents(d)
    # return a result of write method
    collection_name, documents = d.run
    assert_equal([{...}, {...}, ...], documents)
    assert_equal('test', collection_name)
  end
  ...
end
66. It’s a weak point in Fluentd... right?