To load logs aggregated by fluentd into Redshift, I had been using fluent-plugin-redshift, but for various reasons I wrote a tool in Go to replace it.
Rin - Redshift data Importer by SQS messaging.
It has been running smoothly in our production environment for about two weeks, so here is a write-up.
Architecture and features
S3 has an event notification feature that sends a message to Amazon SNS or SQS whenever data is stored, and Rin takes advantage of it.
- (someone) stores data in S3 (fluent-plugin-s3, or any other means)
- (S3) sends SQS a message describing the S3 path and related details
- (Rin) receives the SQS message and issues COPY to Redshift to import the data
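The three steps above can be sketched in Go as follows. This is an illustrative sketch, not Rin's actual source: the function names `parseS3URLs` and `buildCopySQL` are hypothetical, the struct covers only the bucket name and object key fields of the S3 event notification JSON, and the COPY statement is assembled by plain string formatting without any identifier quoting.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// s3Event mirrors only the parts of the S3 event notification JSON needed here.
type s3Event struct {
	Records []struct {
		S3 struct {
			Bucket struct {
				Name string `json:"name"`
			} `json:"bucket"`
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// parseS3URLs (hypothetical name) extracts s3://bucket/key URLs from an SQS message body.
func parseS3URLs(body string) ([]string, error) {
	var ev s3Event
	if err := json.Unmarshal([]byte(body), &ev); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(ev.Records))
	for _, r := range ev.Records {
		// Object keys arrive URL-encoded in S3 event notifications.
		key, err := url.QueryUnescape(r.S3.Object.Key)
		if err != nil {
			return nil, err
		}
		urls = append(urls, fmt.Sprintf("s3://%s/%s", r.S3.Bucket.Name, key))
	}
	return urls, nil
}

// buildCopySQL (hypothetical name) assembles a Redshift COPY statement.
// A sketch only: a real tool would need to quote or validate every argument.
func buildCopySQL(schema, table, s3url, credentials, region, option string) string {
	return fmt.Sprintf("COPY %s.%s FROM '%s' CREDENTIALS '%s' REGION '%s' %s",
		schema, table, s3url, credentials, region, option)
}

func main() {
	body := `{"Records":[{"s3":{"bucket":{"name":"test.bucket.test"},"object":{"key":"test/foo/data.gz"}}}]}`
	urls, err := parseS3URLs(body)
	if err != nil {
		panic(err)
	}
	for _, u := range urls {
		fmt.Println(buildCopySQL("public", "foo", u,
			"aws_access_key_id=AAA;aws_secret_access_key=SSS",
			"ap-northeast-1", "JSON 'auto' GZIP"))
	}
}
```

Sending the resulting statement to Redshift and receiving messages from SQS are left out so that the sketch stays self-contained.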
Once S3 and SQS are set up, prepare a config like the one below and start Rin with rin -config config.yaml
That is all it takes to get it running.
A single process can handle loading into multiple Redshift tables (schemas), each mapped to an S3 path (bucket).
Because it is written in Go, downloading the binary is enough to run it.
queue_name: my_queue_name    # SQS queue name

credentials:
  aws_access_key_id: AAA
  aws_secret_access_key: SSS
  aws_region: ap-northeast-1

redshift:
  host: localhost
  port: 5439
  dbname: test
  user: test_user
  password: test_pass
  schema: public

s3:
  bucket: test.bucket.test
  region: ap-northeast-1

sql_option: "JSON 'auto' GZIP"    # COPY SQL option

# define import target mappings
targets:
  - redshift:
      table: foo
    s3:
      key_prefix: test/foo

  - redshift:
      schema: $1    # expand by key_regexp captured value.
      table: $2
    s3:
      key_regexp: test/([a-z]+)/([a-z]+)/
Motivation
While using fluent-plugin-redshift, I ran into the following problems.
Heavy at upload time
Data held in fluentd's buffer in msgpack format is converted into the import format at the moment it is uploaded to S3, and this consumes a considerable amount of fluentd's CPU. When loading a reasonably high volume of data (around several thousand msgs/sec) into Redshift, a single fluentd process cannot make effective use of multiple CPUs, so the work had to be split across multiple processes.
Troublesome during Redshift maintenance
When nodes are added to or removed from a Redshift cluster, data cannot be loaded while the cluster is being resized (reads are still possible).
If fluent-plugin-redshift tries to load data in this state, the file upload to S3 succeeds but the subsequent COPY fails, so fluentd retries starting from the "upload to S3" step.
Because of the retries, the data does get imported in the end. However, the files that could not be loaded remain in S3, and the files that were loaded and the files that were not end up partially containing the same log records.
Unless the files that failed to import are properly deleted, a later bulk re-import will read those logs twice.
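Dies from time to time
I never managed to identify the cause, but when many plugin-redshift definitions are configured, fluentd as a whole stops processing once every few days to few weeks, and at that point it cannot even be restarted without kill -KILL. Thanks to fluentd's excellent buffer mechanism, killing it does not seem to lose data, but I had to build a mechanism that detects the stall (logs stop flowing) and forces a restart, and kept things limping along that way.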
What Rin improves
Uploads to S3 become lighter
fluent-plugin-s3 stores data in the buffer directly in the upload format and simply sends it to S3 as is (compressed), so no CPU is spent rebuilding data from the buffer.
Retries during Redshift maintenance are easy
Everything up to the S3 upload is independent of Redshift, so the files uploaded to S3 never partially overlap. When Rin fails to load into Redshift, the SQS message is not deleted, and the import is attempted again after the visibility timeout has passed. Retries are guaranteed by the SQS message itself.
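The retry behavior described above comes down to one rule: delete the SQS message only after the import succeeds. Below is a minimal Go sketch of that rule, not Rin's actual implementation. The `queue` interface, `message` type, `processOne` function, and `memQueue` are all hypothetical stand-ins so the example runs without the AWS SDK. The in-memory queue does not simulate the visibility timeout; an undeleted message is simply returned again by the next receive.

```go
package main

import (
	"errors"
	"fmt"
)

// message and queue are hypothetical, minimal abstractions over SQS.
type message struct {
	ReceiptHandle string
	Body          string
}

type queue interface {
	Receive() (message, bool)
	Delete(receiptHandle string)
}

// errResizing simulates a COPY failure while the cluster is being resized.
var errResizing = errors.New("cluster is resizing")

// processOne handles at most one message and deletes it only on success.
// On failure the message stays in the queue, so it is delivered again later.
func processOne(q queue, importFn func(body string) error) error {
	msg, ok := q.Receive()
	if !ok {
		return nil // nothing to do
	}
	if err := importFn(msg.Body); err != nil {
		return err // do NOT delete: the queue will redeliver the message
	}
	q.Delete(msg.ReceiptHandle)
	return nil
}

// memQueue is an in-memory stand-in for SQS used only for this demonstration.
type memQueue struct {
	msgs []message
}

func (q *memQueue) Receive() (message, bool) {
	if len(q.msgs) == 0 {
		return message{}, false
	}
	return q.msgs[0], true
}

func (q *memQueue) Delete(receiptHandle string) {
	kept := q.msgs[:0]
	for _, m := range q.msgs {
		if m.ReceiptHandle != receiptHandle {
			kept = append(kept, m)
		}
	}
	q.msgs = kept
}

func main() {
	q := &memQueue{msgs: []message{{ReceiptHandle: "h1", Body: "s3://test.bucket.test/test/foo/data.gz"}}}
	attempts := 0
	importFn := func(body string) error {
		attempts++
		if attempts < 3 {
			return errResizing // first two attempts fail
		}
		return nil
	}
	for len(q.msgs) > 0 {
		if err := processOne(q, importFn); err != nil {
			fmt.Println("attempt", attempts, "failed:", err)
			continue
		}
		fmt.Println("attempt", attempts, "succeeded; message deleted")
	}
}
```

Because the uploaded file in S3 is untouched by a failed attempt, each retry re-reads the same object, which is why no duplicate files accumulate.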
Less likely to die
So far I have never seen fluentd hang because of fluent-plugin-s3.
Data can be loaded from sources other than fluentd
Imports into Redshift from services that store logs in S3, such as S3, ELB, and CloudFront, can be handled in a uniform way. (I have not tried this yet, but it should work...)
FAQ
Q1 If it is "Redshift data Importer by SQS messaging", shouldn't it be Ris rather than Rin?
A1 Originally I planned to trigger imports from SNS notifications rather than SQS. I settled on the name and wrote a fair amount of code before concluding that, considering retries and the response at execution time, SQS was the better fit.
Q2 Why not do this with Lambda?
A2 As of this writing, Lambda has still not arrived in the Tokyo region (though I have a strong feeling it will soon). Also, Lambda's retry handling is reportedly three attempts at 3-minute intervals, and since failures can continue for a relatively long time during a resize, that leaves me uneasy. Reference: Lambda retry handling for S3 and Kinesis/DynamoDB Streams