fluent-plugin-azuresearch is a fluent plugin to output to Azure Search
fluent-plugin-azuresearch | fluentd | ruby |
---|---|---|
>= 0.2.0 | >= v0.14.15 | >= 2.1 |
< 0.2.0 | >= v0.12.0 | >= 1.9 |
$ gem install fluent-plugin-azuresearch
To use Microsoft Azure Search, you must create an Azure Search service in the Azure Portal. Also you must have an index, persisted storage of documents to which fluent-plugin-azuresearch writes event stream out. Here are instructions:
<match azuresearch.*>
@type azuresearch
@log_level info
endpoint https://AZURE_SEARCH_ACCOUNT.search.windows.net
api_key AZURE_SEARCH_API_KEY
search_index messages
column_names id,user_name,message,tag,created_at
key_names postid,user,content,tag,posttime
</match>
- endpoint (required) - Azure Search service endpoint URI
- api_key (required) - Azure Search API key
- search_index (required) - Azure Search Index name to insert records
- column_names (required) - Column names in a target Azure search index. Each column needs to be separated by a comma.
- key_names (optional) - Default:nil. Key names in incomming record to insert. Each key needs to be separated by a comma. ${time} is placeholder for Time.at(time).strftime("%Y-%m-%dT%H:%M:%SZ"), and ${tag} is placeholder for tag. By default, key_names is as same as column_names
[note] @log_level is a fluentd built-in parameter (optional) that controls verbosity of logging: fatal|error|warn|info|debug|trace (See also Logging of Fluentd)
Suppose you have the following fluent.conf and azure search index schema:
fluent.conf
<match azuresearch.*>
@type azuresearch
endpoint https://yoichidemo.search.windows.net
api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
search_index messages
column_names id,user_name,message,created_at
</match>
Azure Search Schema: messages
{
"name": "messages",
"fields": [
{ "name":"id", "type":"Edm.String", "key": true, "searchable": false },
{ "name":"user_name", "type":"Edm.String" },
{ "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
{ "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
]
}
The plugin will write event stream out to Azure Ssearch like this:
Input event stream
{ "id": "1", "user_name": "taylorswift13", "message":"post by taylorswift13", "created_at":"2016-01-29T00:00:00Z" },
{ "id": "2", "user_name": "katyperry", "message":"post by katyperry", "created_at":"2016-01-30T00:00:00Z" },
{ "id": "3", "user_name": "ladygaga", "message":"post by ladygaga", "created_at":"2016-01-31T00:00:00Z" }
Search results
"value": [
{ "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "created_at": "2016-01-29T00:00:00Z" },
{ "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "created_at": "2016-01-30T00:00:00Z" },
{ "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "created_at": "2016-01-31T00:00:00Z" }
]
Suppose you have the following fluent.conf and azure search index schema:
fluent.conf
<match azuresearch.*>
@type azuresearch
endpoint https://yoichidemo.search.windows.net
api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
search_index messages
column_names id,user_name,message,created_at
key_names postid,user,content,posttime
</match>
Azure Search Schema: messages
{
"name": "messages",
"fields": [
{ "name":"id", "type":"Edm.String", "key": true, "searchable": false },
{ "name":"user_name", "type":"Edm.String" },
{ "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
{ "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
]
}
The plugin will write event stream out to Azure Ssearch like this:
Input event stream
{ "postid": "1", "user": "taylorswift13", "content":"post by taylorswift13", "posttime":"2016-01-29T00:00:00Z" },
{ "postid": "2", "user": "katyperry", "content":"post by katyperry", "posttime":"2016-01-30T00:00:00Z" },
{ "postid": "3", "user": "ladygaga", "content":"post by ladygaga", "posttime":"2016-01-31T00:00:00Z" }
Search results
"value": [
{ "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "created_at": "2016-01-29T00:00:00Z" },
{ "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "created_at": "2016-01-30T00:00:00Z" },
{ "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "created_at": "2016-01-31T00:00:00Z" }
]
fluent.conf
<match azuresearch.*>
@type azuresearch
endpoint https://yoichidemo.search.windows.net
api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
search_index messages
column_names id,user_name,message,tag,created_at
key_names postid,user,content,${tag},${time}
</match>
Azure Search Schema: messages
{
"name": "messages",
"fields": [
{ "name":"id", "type":"Edm.String", "key": true, "searchable": false },
{ "name":"user_name", "type":"Edm.String" },
{ "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
{ "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
]
}
The plugin will write event stream out to Azure Ssearch like this:
Input event stream
{ "id": "1", "user_name": "taylorswift13", "message":"post by taylorswift13" },
{ "id": "2", "user_name": "katyperry", "message":"post by katyperry" },
{ "id": "3", "user_name": "ladygaga", "message":"post by ladygaga" }
Search results
"value": [
{ "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" },
{ "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" },
{ "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" }
]
[note] the value of created_at above is the time when fluentd actually recieves its corresponding input event.
$ git clone https://github.com/yokawasa/fluent-plugin-azuresearch.git
$ cd fluent-plugin-azuresearch
# edit CONFIG params of test/plugin/test_azuresearch.rb
$ vi test/plugin/test_azuresearch.rb
# run test
$ rake test
$ rake build
$ rake install:local
# running fluentd with your fluent.conf
$ fluentd -c fluent.conf -vv &
# send test input event to test plugin using fluent-cat
$ echo ' { "postid": "100", "user": "ladygaga", "content":"post by ladygaga"}' | fluent-cat azuresearch.msg
Please don't forget that you need forward input configuration to receive the message from fluent-cat
<source>
@type forward
</source>
- Input validation for Azure Search - check total size of columns to add
- http://yokawasa.github.io/fluent-plugin-azuresearch
- https://rubygems.org/gems/fluent-plugin-azuresearch
Bug reports and pull requests are welcome on GitHub at https://github.com/yokawasa/fluent-plugin-azuresearch.
Copyright | Copyright (c) 2016- Yoichi Kawasaki |
License | Apache License, Version 2.0 |