Comments on SpeedGo Computing: GPU Computing with Ruby

xman (https://www.blogger.com/profile/05695636905017529897), 2011-04-25T23:15:15.820+08:00

Hi Chad, thanks for your kind feedback. When a system scales up, there will certainly be many optimizations that need to be done for good performance.

The use of a dynamic library in the slides is merely a workaround for the CUDA Runtime API, which doesn't provide a kernel load function. I have not seen any CUDA API wrapper supporting the CUDA Runtime API. This could be part of their ...

Chad Brewbaker (https://www.blogger.com/profile/10443154815748267611), 2011-04-25T22:56:23.639+08:00

A few suggestions from my work with Ruby and MPI on a 2048-processor computing system.

1) Dynamic libraries can be a scaling issue. If 10,000 nodes execute the code, you want to be able to broadcast the entire binary and not have ad hoc queries back to the networked file system to grab dynamic libraries. Static compiling is one solution; the other is to broadcast the dynamic stuff in ...