dumb #hadoop Q: when running many jobs in parallel, is it still a best practice to set your # of reduces to the total slots available?
@sorenmacbeth ... otherwise reducers spend a lot of time waiting for the map-side to finish, which is waiting to get mappers from other jobs
@avibryant Yeah, still remember when I first learned this tip. forehead smacking.
@avibryant @sorenmacbeth I suspect this doesn't matter so much on a shared cluster. Reducers can't work until mappers finish...
@avibryant @sorenmacbeth … ; the two task types are allocated from different resource pools.