Maybe a better implementation of ruby binding for Apache Spark #33

@chyh1990

Hi,

I have written a new prototype of a Ruby binding for Spark:

https://github.com/chyh1990/jruby-spark

Although this implementation only works on JRuby, I think this approach is more promising:

  • REAL closure/lambda serialization, with elegant syntax

https://github.com/chyh1990/jruby-spark/blob/master/examples/pagerank.rb

  • Uses the JVM infrastructure and runs on YARN with the standard job submission workflow
  • Reuses the Java/Scala API, so we get Streaming/SQL/GraphX support nearly for free

https://github.com/chyh1990/jruby-spark/blob/master/examples/sqltest.rb

  • Easier to maintain, even without merging into mainline Spark
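
For context on the first point, here is a minimal plain-Ruby sketch of why closure serialization is not trivial: Ruby's built-in Marshal round-trips ordinary data but refuses to dump a Proc. It does not show how jruby-spark serializes closures; the names `factor`, `scale`, `data_copy` and `closure_error` are made up for illustration.

```ruby
# Plain Ruby (MRI or JRuby), no Spark required.
# Hypothetical names, for illustration only.
factor = 3
scale = ->(x) { x * factor } # closure capturing the local variable `factor`

# Ordinary data survives a Marshal round trip.
data_copy = Marshal.load(Marshal.dump([1, 2, 3]))

# A Proc does not: Marshal.dump raises TypeError when given one.
closure_error =
  begin
    Marshal.dump(scale)
    nil
  rescue TypeError => e
    e
  end

puts data_copy.map(&scale).inspect # the closure works locally
puts closure_error.class           # but it cannot be shipped via Marshal
```

Any binding that sends blocks to remote workers has to work around this limitation in some way, which is what the first bullet refers to.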

The prototype is preliminary, but it proves the concept. I think Ruby would be a
more elegant binding language for Spark than Python. I'm looking forward to more
participants!
