Are there any rules of thumb for dataset size that would justify a Hadoop cluster? Or spark cluster? Or just a big single box?
Replying to @bmdhacks
I might frame it as time, not space: is your batch process on a single box taking upwards of 10-15 minutes? A cluster might help.
Replying to @avibryant
you just totally confused all the people at twitter waiting for map/reduce scalding jobs to finish. What, a cluster of clusters?
Replying to @bmdhacks
no, no. If your Hadoop job is taking too long, the answer is to run it on a single box.
9:34 AM - 4 Jul 2016
from Capital G, British Columbia
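The rule of thumb in the thread is to decide by wall-clock time on one machine, not by dataset size. A minimal sketch of that heuristic, assuming a hypothetical `suggest_platform` helper and an illustrative 15-minute threshold (neither is from the thread):

```python
# Sketch of the thread's heuristic: time the batch job on this box and
# only reach for a cluster if it runs long. Threshold is hypothetical,
# loosely based on the "10-15 minutes" figure in the thread.
import time

SINGLE_BOX_THRESHOLD_MIN = 15  # minutes; illustrative cutoff

def suggest_platform(batch_fn, *args, **kwargs):
    """Run the batch job once locally and report a recommendation."""
    start = time.monotonic()
    batch_fn(*args, **kwargs)
    elapsed_min = (time.monotonic() - start) / 60
    if elapsed_min > SINGLE_BOX_THRESHOLD_MIN:
        return "consider a cluster"
    return "a single box is fine"
```

The joke in the thread cuts the other way, of course: a slow Hadoop job is often faster rewritten as a single-box job, because cluster scheduling and shuffle overhead can dominate at modest data sizes.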