Scripting Embulk plugins makes plugin development drastically easier. You can develop, test, and productionize data integrations in any scripting language. It is the most suitable way to integrate data with SaaS using vendor-provided SDKs.
https://techplay.jp/event/781988
6. Who develops new SaaS integrations?
Java developers
Low code
Scripting with SDKs
Scripting
Embulk plugin API
SaaS users
Dev vs user
gap!
7. Who develops new SaaS integrations?
Java developers
Low code
Scripting with SDKs
Scripting
Embulk plugin API
Embulk scripting
SaaS users
Dev = user
8. Scripting on the powerful framework
Embulk scripting plugin
Embulk core framework
Your script
SDK / library
✓ High-performance
✓ Choices of output plugins
Embulk plugins
9. How does it work?
1. Run a script
2. Load rows
3. Write rows as a CSV file
4. Read the CSV file
named pipe
Embulk scripting plugin
Your script
SDK / library
A named pipe is like a file, but it is not a file.
• It doesn’t consume disk space.
• It doesn’t cause disk I/O (=fast).
• It transfers data as your script writes rows (=fast).
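The named-pipe behaviour described above can be tried out with a minimal Ruby sketch. It is not the plugin's own code: the slides do not show how the plugin creates its pipe, so this uses Ruby's standard `File.mkfifo` (POSIX systems only), and a thread stands in for "your script". The helper name `stream_rows_through_fifo` and the sample rows are hypothetical.

```ruby
require "csv"
require "tmpdir"

# Sends rows through a named pipe (FIFO) and returns what the reader received.
# Hypothetical helper for illustration; not part of embulk-input-script.
def stream_rows_through_fifo(rows)
  Dir.mktmpdir do |dir|
    fifo_path = File.join(dir, "rows.csv")
    File.mkfifo(fifo_path) # POSIX only; creates a pipe, not a regular file

    # Writer side: plays the role of "your script" writing CSV rows.
    writer = Thread.new do
      CSV.open(fifo_path, "w") do |csv|
        rows.each { |row| csv << row }
      end
    end

    # Reader side: plays the role of the plugin reading the CSV.
    # Reading ends when the writer closes its end of the pipe.
    received = File.open(fifo_path, "r") { |pipe| CSV.parse(pipe.read) }
    writer.join
    received
  end
end

rows = [["web-1", "up"], ["web-2", "down"]]
p stream_rows_through_fifo(rows)
```

Both sides open the same path as if it were a file, but the rows move through memory from writer to reader, so no CSV data is stored on disk.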
10. How does it work?
1. Run a script
2. Load rows
3. Write rows as a CSV file
4. Read the CSV file
5. Pass rows to an output plugin
named pipe
Embulk scripting plugin
Your script
SDK / library
output plugin
11. How to use embulk-input-script
1. Install
$ embulk gem install embulk-input-script
2. Create a config
in:
  type: script
  run: ruby your_script.rb #-- any executable
out:
  type: …
3. Run
$ embulk run config.yaml
12. How to develop a script: your script runs 3 times
require "csv"

if ARGV[0] == "setup"
  File.write(ARGV[2], "…")
elsif ARGV[0] == "run"
  CSV.open(ARGV[2], "w") do |file|
    file << row
    …
  end
elsif ARGV[0] == "finish"
  puts "Done!"
end
$ script.rb setup <config.yaml> <setup.yaml>
$ script.rb run <setup.yaml> <N> <output.csv>
$ script.rb finish <setup.yaml>
First, write a setup file. It should include column names, column types, and parallelism.
Second, load rows and write them to a CSV file. If the setup file sets parallelism greater than 1, this step runs multiple times with N = 0, 1, 2, 3, …
Finally, clean up if necessary.
$ script.rb setup <config.yaml> <setup.yaml>
$ script.rb run <setup.yaml> <output.csv> <N>
$ script.rb finish <config.yaml> <setup.yaml>
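The three phases can be put together in one script roughly as follows. This is a sketch under several assumptions, not the plugin's documented interface: it assumes the argument order `run <setup.yaml> <output.csv> <N>` (which matches the `ARGV[2]` usage in the snippet above), the setup-file keys `columns` and `tasks` are hypothetical names for "column names, column types and parallelism", and `fetch_rows` is a stand-in for a real SDK call.

```ruby
require "csv"
require "yaml"

# Hypothetical setup-file contents. The key names "columns" and "tasks" are
# assumptions; check the embulk-input-script README for the real schema.
SETUP = {
  "columns" => [
    { "name" => "host", "type" => "string" },
    { "name" => "cpu",  "type" => "double" },
  ],
  "tasks" => 1,
}.freeze

# Stand-in for loading rows through a vendor SDK (hypothetical data).
def fetch_rows(task_index)
  [["web-#{task_index}", 0.42]]
end

def main(argv)
  case argv[0]
  when "setup"
    # argv: setup <config.yaml> <setup.yaml>
    File.write(argv[2], YAML.dump(SETUP))
  when "run"
    # argv: run <setup.yaml> <output.csv> <N>  (assumed order)
    task_index = Integer(argv[3] || 0)
    CSV.open(argv[2], "w") do |csv|
      fetch_rows(task_index).each { |row| csv << row }
    end
  when "finish"
    puts "Done!"
  else
    warn "unknown phase: #{argv[0].inspect}"
  end
end

main(ARGV) unless ARGV.empty?
```

Keeping the phase dispatch in a `main(argv)` method makes it possible to exercise each phase from a test or an interactive session without going through Embulk.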
13. Examples
• Importing server status from DataDog
https://github.com/embulk/embulk-input-script/tree/master/examples/datadog_hosts
• Importing AWS EC2 server list
https://github.com/embulk/embulk-input-script/tree/master/examples/aws_ec2_instances
14. Wanted
• Output support
embulk-output-script is not available.
• Converter from a script to an Embulk plugin gem
When you create a script, you want to release it so that other people can reuse it.
To do that, we need a tool that packages the script together with embulk-input-script as a gem.