This document summarizes a microservices meetup hosted by @mosa_siru. Key points include:
1. @mosa_siru is an engineer at DeNA and CTO of Gunosy.
2. The meetup covered Gunosy's architecture with over 45 GitHub repositories, 30 stacks, 10 Go APIs, and 10 Python batch processes using AWS services like Kinesis, Lambda, SQS and API Gateway.
3. Challenges discussed were managing 30 microservices, ensuring API latency below 50ms across availability zones, and handling 10 requests per second with nginx load balancing across 20 servers.
This document summarizes Masahiro Nakagawa's presentation on Fluentd and Embulk. Fluentd is a data collector for unified logging that allows for streaming data transfer based on JSON. It is written in Ruby and uses plugins to collect, process, and output data. Embulk is a bulk loading tool that allows high performance parallel processing of data to load it into various databases and storage systems. Both tools use a pluggable architecture to provide flexibility in handling different data sources and targets.
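The collect, process, output flow described above can be sketched in a few lines. This is a minimal illustration in Python, not Fluentd's actual plugin API (Fluentd plugins are written in Ruby); the function names, the `host` field, and the sample records are all hypothetical.

```python
import json

def json_lines_source(lines):
    """Input stage: parse one JSON object per line."""
    for line in lines:
        yield json.loads(line)

def drop_debug(record):
    """Filter stage: returning None drops the record."""
    return None if record.get("level") == "debug" else record

def add_hostname(record):
    """Filter stage: enrich the record (the host value is made up)."""
    return {**record, "host": "web-1"}

def run_pipeline(source, filters, outputs):
    """Pass each record through every filter, then to every output."""
    for record in source:
        for apply_filter in filters:
            record = apply_filter(record)
            if record is None:
                break  # record was dropped by a filter
        else:
            for write in outputs:
                write(record)

raw = [
    '{"level": "info", "msg": "started"}',
    '{"level": "debug", "msg": "noise"}',
]
collected = []
run_pipeline(json_lines_source(raw), [drop_debug, add_hostname], [collected.append])
```

Each stage is an ordinary callable, so swapping a source or adding an output does not touch the others, which is the flexibility a pluggable design is meant to provide.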
This document introduces Hivemall, an open-source machine learning library built as Hive UDFs. It summarizes new features in version 0.4, including Random Forest and Factorization Machine algorithms. The speaker then outlines the development roadmap, with plans to add Gradient Tree Boosting, Field-aware Factorization Machines, Online LDA, and a Mix server in upcoming versions. Real-world use cases of Hivemall are also briefly mentioned.
Embulk, an open-source plugin-based parallel bulk data loader (Sadayuki Furuhashi)
The document discusses Embulk, an open-source parallel bulk data loader that uses plugins. Embulk loads records from various sources ("A") to various targets ("B") using plugins for different source and target types, with the aim of making data integration less painful. Embulk executes in parallel, validates data, handles errors, behaves deterministically, and allows for idempotent retries of bulk loads.
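To make "parallel" and "idempotent retries" concrete, here is a minimal sketch in Python. It is not Embulk's plugin interface; `ListInput`, `DictOutput`, `validate`, and `bulk_load` are hypothetical names, and the idempotency trick shown (each task overwrites its own slot in the target) is just one simple way to get that property.

```python
from concurrent.futures import ThreadPoolExecutor

class ListInput:
    """Hypothetical input plugin: deterministically splits records into tasks."""
    def __init__(self, records, task_count):
        self.records = records
        self.task_count = task_count

    def read(self, task_index):
        # The same task index always yields the same rows.
        return self.records[task_index::self.task_count]

class DictOutput:
    """Hypothetical output plugin: one slot per task, so a retry overwrites."""
    def __init__(self):
        self.committed = {}

    def write(self, task_index, rows):
        self.committed[task_index] = list(rows)

def validate(row):
    """Reject rows whose id is not an integer."""
    if not isinstance(row.get("id"), int):
        raise ValueError(f"invalid row: {row!r}")
    return row

def bulk_load(source, target, max_workers=4):
    """Run every task in a thread pool and return the number of rows loaded."""
    def run_task(task_index):
        rows = [validate(row) for row in source.read(task_index)]
        target.write(task_index, rows)
        return len(rows)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(run_task, range(source.task_count)))

source = ListInput([{"id": i} for i in range(5)], task_count=2)
target = DictOutput()
first = bulk_load(source, target)
second = bulk_load(source, target)  # retry: same result, no duplicate rows
total_rows = sum(len(rows) for rows in target.committed.values())
```

Because the task split is deterministic and each task writes to its own slot, running the load a second time leaves the target unchanged instead of appending duplicates.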
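PDF version: PlayFab, which has analyzed games around the world, has evolved dramatically! Let's look behind the scenes at its latest data exploration machinery (db tech showcase 2020), Daisuke Masubuchi
PlayFab has been analyzing online games and smartphone apps around the world. Recently, in addition to its existing event analytics, it gained cloud analytics capabilities that encompass a wide range of telemetry. This session introduces the architecture and mechanisms behind it, built on Azure Data Explorer a.k.a. Kusto. It shares mechanisms with the back end of Windows telemetry analysis and the Azure log analytics platform, so stay tuned! We hope it is also useful beyond the game industry, for large-scale SaaS and IoT businesses considering big data operations.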
at db tech showcase ONLINE 2020 https://db-tech-showcase.com/dbts/2020/online #dbts2020 #gamestackjp
*This deck is the material for the session of the same title presented at the DB Tech Showcase event held on November 11, 2020.
2. WHO AM I ?
• Toru Takahashi (@nora96o)
• Treasure Data, Inc.
• Support Engineering Manager
• Email, chat, writing blog posts, writing code, and so on
• http://qiita.com/toru-takahashi
• Before I knew it, I had entered my fourth year in the workforce...
47. What is Digdag?
Digdag is a simple tool that helps you to build, run, schedule, and monitor
complex pipelines of tasks. It handles dependency resolution so that tasks
run in order or in parallel.
Digdag fits simple replacement of cron, IT operations automation, data
analytics batch jobs, machine learning pipelines, and many more by using
Directed Acyclic Graphs (DAG) as the infrastructure.
http://www.digdag.io/index.html
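The dependency resolution described on this slide can be illustrated with a small sketch. This is plain Python using the standard-library `graphlib` module (Python 3.9+), not Digdag itself; the task names and the `execution_waves` helper are hypothetical.

```python
from graphlib import TopologicalSorter

def execution_waves(dependencies):
    """Group tasks into waves.

    Tasks in the same wave have all their dependencies satisfied,
    so they could run in parallel; waves run one after another.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract_users": set(),
    "extract_orders": set(),
    "join": {"extract_users", "extract_orders"},
    "report": {"join"},
}

waves = execution_waves(pipeline)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Here the two extract tasks land in the first wave and may run side by side, while `join` and `report` each wait for the wave before them. A cyclic definition is rejected up front, which is why the graph has to be acyclic.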