Note: Grappa is no longer under active development.
Grappa makes an entire cluster look like a single, powerful, shared-memory machine. By leveraging the massive amount of concurrency in large-scale data-intensive applications, Grappa can provide this useful abstraction with high performance. Unlike classic distributed shared memory (DSM) systems, Grappa does not require spatial locality or data reuse to perform well.
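To give a feel for this, here is a minimal sketch of a Grappa program (assuming API names from the tutorial and current headers, such as global_alloc, delegate::read, and delegate::write): any core can read or write any word of the global heap without explicitly partitioning data or writing message-passing code.

```cpp
#include <Grappa.hpp>
using namespace Grappa;

int main(int argc, char* argv[]) {
  init(&argc, &argv);
  run([]{
    // Allocate an array in the global heap; its storage is spread across
    // the memories of every node in the job, but it is addressed uniformly.
    GlobalAddress<int64_t> A = global_alloc<int64_t>(1 << 20);

    // Fine-grained access to any element, wherever it happens to live.
    // Each delegate operation executes on the core that owns the data.
    delegate::write(A + 12345, 42);
    int64_t x = delegate::read(A + 12345);
    LOG(INFO) << "A[12345] = " << x;

    global_free(A);
  });
  finalize();
}
```

Because remote accesses are expressed as ordinary reads and writes on global addresses, irregular, pointer-chasing workloads need no locality-aware partitioning to get started.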
Data-intensive, or “Big Data”, workloads are an important class of large-scale computations. However, the commodity clusters they run on are not well suited to these problems, forcing developers to carefully partition both data and computation. A diverse ecosystem of frameworks has arisen to tackle these problems, such as MapReduce, Spark, Dryad, and GraphLab, each of which eases development of large-scale applications by specializing to a particular algorithmic structure and behavior.
Grappa provides abstraction at a level high enough to subsume many of the performance optimizations common to these data-intensive platforms. At the same time, its relatively low-level interface makes it a convenient substrate on which to build such frameworks. Prototype implementations of (simplified) MapReduce, GraphLab, and a relational query engine have been built on Grappa and outperform the original systems.
Grappa’s runtime system consists of three key components: a distributed shared memory that provides fine-grained access to data anywhere in the cluster; a tasking system that supports lightweight multithreading and global distributed work-stealing; and a communication layer that aggregates small messages into larger network packets to sustain high bandwidth.
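In user code these components surface as cheap parallel loops and delegate operations. The rough sketch below (assuming the forall, make_global, and delegate::fetch_and_add calls described in the tutorial and API docs) runs its loop iterations as lightweight tasks spread over all cores, while the many small remote updates are aggregated by the communication layer.

```cpp
#include <Grappa.hpp>
using namespace Grappa;

// A file-scope variable exists on every core; here only core 0's copy is
// used, reached through a global address, as a simple shared counter.
int64_t zero_count = 0;

int main(int argc, char* argv[]) {
  init(&argc, &argv);
  run([]{
    const size_t N = 1 << 20;
    GlobalAddress<int64_t> A = global_alloc<int64_t>(N);

    // The tasking system turns these iterations into lightweight tasks
    // and load-balances them across all cores in the cluster.
    forall(A, N, [](int64_t i, int64_t& a){ a = i % 7; });

    // Count the zeros with fine-grained remote increments. Each
    // fetch_and_add is a small delegate message; the communication layer
    // batches many of them into larger network packets.
    GlobalAddress<int64_t> count = make_global(&zero_count, 0);
    forall(A, N, [count](int64_t& a){
      if (a == 0) delegate::fetch_and_add(count, 1);
    });

    LOG(INFO) << "zeros: " << delegate::read(count);
    global_free(A);
  });
  finalize();
}
```

A single counter on core 0 is a deliberate oversimplification for illustration; real Grappa code would more likely accumulate per-core partial counts and combine them at the end.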
Grappa is freely available on GitHub under a BSD license. Anyone interested in seeing Grappa at work can follow the quick-start directions in the README to build and run it on their own cluster. To learn how to write your own Grappa applications, check out the Tutorial.
Grappa is still quite young, so please don’t hesitate to ask for help if you run into problems. To find answers to questions or ask new ones, please use GitHub Issues. The developers hang out in the #grappa.io IRC channel on freenode; you can join with your favorite IRC client or this web interface. Finally, to stay up-to-date on the latest releases and information about the project, you can subscribe to the mailing list below.
Latency-Tolerant Software Distributed Shared Memory.
Jacob Nelson, Brandon Holt, Brandon Myers, Preston Briggs, Luis Ceze, Simon Kahan, and Mark Oskin
USENIX Annual Technical Conference (USENIX ATC), July 2015 (Best Paper Award)
Alembic: Automatic Locality Extraction via Migration.
Brandon Holt, Preston Briggs, Luis Ceze, and Mark Oskin
OOPSLA 2014
Radish: Compiling Efficient Query Plans for Distributed Shared Memory.
Brandon Myers, Daniel Halperin, Jacob Nelson, Mark Oskin, Luis Ceze, and Bill Howe
Tech report, October 2014
Grappa: A Latency-Tolerant Runtime for Large-Scale Irregular Applications. (Expanded tech report)
Jacob Nelson, Brandon Holt, Brandon Myers, Preston Briggs, Luis Ceze, Simon Kahan, and Mark Oskin
International Workshop on Rack-Scale Computing (WRSC w/EuroSys), April 2014
Flat Combining Synchronized Global Data Structures.
Brandon Holt, Jacob Nelson, Brandon Myers, Preston Briggs, Luis Ceze, Simon Kahan, and Mark Oskin
International Conference on PGAS Programming Models (PGAS), October 2013
Compiled Plans for In-Memory Path-Counting Queries.
Brandon Myers, Jeremy Hyrkas, Daniel Halperin, and Bill Howe
International Workshop on In-Memory Data Management and Analytics (IMDM w/ VLDB), August 2013
Crunching Large Graphs With Commodity Processors.
Jacob Nelson, Brandon Myers, A. H. Hunter, Preston Briggs, Luis Ceze, Carl Ebeling, Dan Grossman, Simon Kahan, and Mark Oskin
USENIX Workshop on Hot Topics in Parallelism (HOTPAR), June 2011
Autogenerated API documentation
Grappa is a project of the Sampa Group at the University of Washington.