@avibryant At least in that context: MR was already all about sequential IO.
@stuhood yes, I don't mean to say that's the only value of *caching* as a concept. Just that it's the main value of reusing/caching RDDs.
@avibryant My point is: "currently". Spark's ancestry is in seq IO. In the presence of caching, bias should change.
@stuhood @avibryant read @Frankmcsherry on the topic.
@avibryant more or less same conclusion in http://www.eecs.berkeley.edu/~keo/publications/nsdi15-final147.pdf …
@avibryant I'm curious what these benchmarks would look like with the Project Tungsten improvements in Spark 1.4 and 1.5.
@avibryant I feel like there's an IO/CPU equivalence related to compression/parsing costs.