Are there any rules of thumb for dataset size that would justify a Hadoop cluster? Or a Spark cluster? Or just a big single box?
-
You just totally confused all the people at Twitter waiting for MapReduce Scalding jobs to finish. What, a cluster of clusters?
-
No, no. If your Hadoop job is taking too long, the answer is to run it on a single box.
End of conversation
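The quip has a kernel of truth: for datasets that fit on one machine, plain Unix tools often beat a small cluster, since there is no JVM startup, job scheduling, or network shuffle. A minimal sketch of the canonical MapReduce example (word count) done on a single box; `input.txt` is a hypothetical input file:

```shell
# Single-box "word count": split words onto lines, group, count,
# and rank by frequency. Equivalent in spirit to a MapReduce job.
tr -s ' ' '\n' < input.txt | sort | uniq -c | sort -rn | head
```

`tr` plays the role of the mapper (one word per line), `sort | uniq -c` the shuffle-and-reduce, and the final `sort -rn` ranks the counts.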