[B! spark] Hash's bookmarks

Hash id:Hash

Hash's bookmarks about spark (15)

Breaking the “curse of dimensionality” in Genomics using “wide” Random Forests
Unified governance for all data, analytics and AI assets
Hash 2017/08/01
VariantSpark RF, apparently

bioinformatics

spark
Link
Interactive Analysis of Genomic Datasets Using Amazon Athena | Amazon Web Services
AWS Big Data Blog Interactive Analysis of Genomic Datasets Using Amazon Athena Aaron Friedman is a Healthcare and Life Sciences Solutions Architect with Amazon Web Services The genomics industry is in the midst of a data explosion. Due to the rapid drop in the cost to sequence genomes, genomics is now central to many medical advances. When your genome is sequenced and analyzed, raw sequencing file
Hash 2016/12/08
Preprocesses the 1000 Genomes Project data with Spark, then queries it with Athena. Code => https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-athena-genomics

AWS

spark

bioinformatics

atheism
Link
Submitting User Applications with spark-submit | Amazon Web Services
AWS Big Data Blog Submitting User Applications with spark-submit Francisco Oliveira is a consultant with AWS Professional Services Customers starting their big data journey often ask for guidelines on how to submit user applications to Spark running on Amazon EMR. For example, customers ask for guidelines on how to size memory and compute resources available to their applications and the best reso
Hash 2016/10/04
spark

EMR
Link
Apache Spark @Scale: A 60 TB+ production use case
Facebook often uses analytics for data-driven decision making. Over the past few years, user and product growth has pushed our analytics engines to operate on data sets in the tens of terabytes for a single query. Some of our batch analytics is executed through the venerable Hive platform (contributed to Apache Hive by Facebook in 2009) and Corona, our custom MapReduce implementation. Facebook has
Hash 2016/09/06
facebook

spark

later
Link
AWS Solutions Architect Blog
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE Amazon's personalization uses neural networks to generate product recommendations for each customer. Because Amazon's product catalog is enormous compared with the number of products any one customer has purchased, the datasets end up extremely sparse. And because the numbers of customers and products run into the hundreds of millions, our neural network models cannot meet space and time constraints unless they are distributed across multiple GPUs. For that reason, we developed and open-sourced DSSTNE (the Deep Scalable Sparse Tensor Neural Engine), which runs on GPUs. We use DSSTNE to train neural networks and generate recommendations, and the e-commerce website
Hash 2016/07/11
A talk on the technology behind Amazon's product recommendations

spark

AWS

ECS

DSSTNE
Link
Midsummer! Spark + Python + Data Science Festival (2016/07/25, from 19:00)
[Added 2016/07/04] Due to popular demand, capacity has been increased from 80 to 100! Front-line engineers from DMM.com Labo, CyberAgent, and Cloudera present from their own perspectives! We are holding a meetup recommended for anyone who wants to learn about use cases, tools, and architectures for data science on big data and for machine-learning-powered products built with Spark and Python. Audience: people who use Spark and want to build data-driven products; people who do machine learning or data analysis but have not yet used Spark; people who want to analyze and make use of big data with Python. We are planning talks that these and similar attendees will enjoy. Overview: a gathering for sharing knowledge about big data analysis with Spark and Python and about developing products that make use of machine learning. What kind of architecture to use for large volumes of data
Hash 2016/06/29
Will sign up later (already over capacity)

spark

event

python

machine_learning
Link
Apache Spark as a Compiler: Joining a Billion Rows per Second on a Laptop
Unified governance for all data, analytics and AI assets
Hash 2016/05/31
Spark 2.0 gets even faster thanks to whole-stage code generation in the Tungsten engine

spark

performance
Link
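The whole-stage code generation mentioned in the note above can be pictured as collapsing a chain of per-row operators into one tight loop. The following is a minimal pure-Python sketch of that idea, not Spark's implementation: Spark generates and compiles JVM code at runtime, whereas here the "fused" version is simply written by hand. The class and function names (`Scan`, `Filter`, `Sum`, `volcano_sum_of_evens`, `fused_sum_of_evens`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Iterator


class Scan:
    """Leaf operator: yields input rows one at a time."""

    def __init__(self, rows: Iterable[int]) -> None:
        self.rows = rows

    def __iter__(self) -> Iterator[int]:
        return iter(self.rows)


class Filter:
    """Passes through only the rows for which the predicate holds."""

    def __init__(self, child: Iterable[int], predicate: Callable[[int], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def __iter__(self) -> Iterator[int]:
        for row in self.child:
            if self.predicate(row):
                yield row


class Sum:
    """Aggregates all rows produced by the child operator."""

    def __init__(self, child: Iterable[int]) -> None:
        self.child = child

    def result(self) -> int:
        total = 0
        for row in self.child:
            total += row
        return total


def volcano_sum_of_evens(rows: Iterable[int]) -> int:
    # Iterator-style plan: every row crosses each operator boundary
    # through a separate iterator call.
    return Sum(Filter(Scan(rows), lambda r: r % 2 == 0)).result()


def fused_sum_of_evens(rows: Iterable[int]) -> int:
    # Hand-fused equivalent of the same plan: scan, filter, and
    # aggregate happen inside a single loop with no operator objects.
    total = 0
    for row in rows:
        if row % 2 == 0:
            total += row
    return total


if __name__ == "__main__":
    data = range(10)
    print(volcano_sum_of_evens(data), fused_sum_of_evens(data))
```

Both functions compute the same result; the difference is only in how many layers of indirection each row passes through, which is the overhead that code generation aims to remove.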
Analyze Your Data on Amazon DynamoDB with Apache Spark | Amazon Web Services
AWS Big Data Blog Analyze Your Data on Amazon DynamoDB with Apache Spark Manjeet Chayel is a Solutions Architect with AWS Every day, tons of customer data is generated, such as website logs, gaming data, advertising data, and streaming videos. Many companies capture this information as it’s generated and process it in real time to understand their customers. Amazon DynamoDB is a fast and flexible
Hash 2016/05/20
Read later

AWS

DynamoDB

spark

later
Link
Exploring Geospatial Intelligence using SparkR on Amazon EMR | Amazon Web Services
Hash 2016/04/15
Spark

AWS

R

later
Link
Hadoop / Spark Conference Japan 2016
Hash 2016/02/02
Whoa, what is this, quite a distinguished lineup… I want to go, but it's right in the middle of the week

Spark

Hadoop

event
ãƒªãƒ³ã‚¯
ã€ŽSparkã«ã‚ˆã‚‹å®Ÿè·µãƒ‡ãƒ¼ã‚¿è§£æžã€ã¨ã„ã†æœ¬ã®ä»˜éŒ²ã‚’åŸ·ç†ã—ã¾ã—ãŸ - ã»ããç¬‘ã‚€
ãƒªã‚¯ãƒ«ãƒ¼ãƒˆã®é«˜æŸ³ã•ã‚“ã¨å…±åŒã§ã€ŽSparkã«ã‚ˆã‚‹å®Ÿè·µãƒ‡ãƒ¼ã‚¿è§£æžã€ã¨ã„ã†æœ¬ã®ä»˜éŒ²ã‚’åŸ·ç†ã—ã¾ã—ãŸã€‚ Sparkã«ã‚ˆã‚‹å®Ÿè·µãƒ‡ãƒ¼ã‚¿è§£æž â€•å¤§è¦æ¨¡ãƒ‡ãƒ¼ã‚¿ã®ãŸã‚ã®æ©Ÿæ¢°å¦ç¿’äº‹ä¾‹é›† ä½œè€…: Sandy Ryza,Uri Laserson,Sean Owen,Josh Wills,çŸ³å·æœ‰,Skyæ ªå¼ä¼šç¤¾çŽ‰å·ç«œå¸å‡ºç‰ˆç¤¾/ãƒ¡ãƒ¼ã‚«ãƒ¼: ã‚ªãƒ©ã‚¤ãƒªãƒ¼ã‚¸ãƒ£ãƒ‘ãƒ³ç™ºå£²æ—¥: 2016/01/23ãƒ¡ãƒ‡ã‚£ã‚¢: å¤§åž‹æœ¬ã“ã®å•†å“ã‚’å«ã‚€ãƒ–ãƒã‚° (4ä»¶) ã‚’è¦‹ã‚‹ åŸ·ç†ã—ãŸä»˜éŒ²ã®å†…å®¹ã¯ã€ŒSparkRã«ã¤ã„ã¦ã€ã§ã™ã€‚ SparkR ã¯ã€R è¨€èªžã‹ã‚‰ Spark ã‚’ä½¿ã†ãŸã‚ã®ãƒ‘ãƒƒã‚±ãƒ¼ã‚¸ã§ã€å…¬å¼ã‚µãƒãƒ¼ãƒˆã•ã‚Œã¦ã„ã¾ã™ã€‚ SparkR ã«ã¤ã„ã¦ã¯ã€ä»¥å‰ Spark Meetup ã§ç™ºè¡¨ã—ã¾ã—ãŸã€‚ Spark Meetup 2015 ã§ SparkR ã«ã¤ã„ã¦ç™ºè¡¨ã—ã¾ã—ãŸ #sparkjp - ã»ããç¬‘ã‚€ ã“ã®ã¨ãã¯ã¾ã ã€æ©Ÿèƒ½ã¨ã—ã¦ä¸ååˆ†ãªç‚¹ãŒç›®ç«‹ã¡ã¾
Hash 2016/01/14
spark

R

book
Link
Introducing Redshift Data Source for Spark
Unified governance for all data, analytics and AI assets
Hash 2015/12/29
spark

Redshift
Link
SparkR (R on Spark) - Spark 4.0.1 Documentation
SparkR (R on Spark) Overview SparkDataFrame Starting Up: SparkSession Starting Up from RStudio Creating SparkDataFrames From local data frames From Data Sources From Hive tables SparkDataFrame Operations Selecting rows, columns Grouping, Aggregation Operating on Columns Applying User-Defined Function Run a given function on a large dataset using dapply or dapplyCollect dapply dapplyCollect Run a g
Hash 2015/12/26
spark

R

tutorial
Link
AWS News Blog
AWS Week in Review – AWS Documentation Updates, Amazon EventBridge is Faster, and More – May 22, 2023 Here are your AWS updates from the previous 7 days. Last week I was in Turin, Italy for CloudConf, a conference I’ve had the pleasure to participate in for the last 10 years. AWS Hero Anahit Pogosova was also there sharing a few serverless tips in front of a full house. Here’s a picture I […] Amaz
Hash 2015/06/18
There's a simple sample, so I want to try it

EMR

spark

MapReduce

later
Link
GitHub - bigdatagenomics/adam: ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
ADAM is a library and command line tool that enables the use of Apache Spark to parallelize genomic data analysis across cluster/cloud computing environments. ADAM uses a set of schemas to describe genomic sequences, reads, variants/genotypes, and features, and can be used with data in legacy genomic file formats such as SAM/BAM/CRAM, BED/GFF3/GTF, and VCF, as well as data stored in the columnar A
Hash 2014/12/09
apache

github

scala

bioinformatics

spark
Link