Are any tweeps good at parallelising w/ #multiprocessing #Python? I'm doing something 100% wrong cos it's slower. 
https://github.com/oliviaguest/pairwise_distance/blob/master/pairwise_distance.py
I tried 32 because I have 32 cores. Does that make sense? And yes, it is a lot faster, but still not faster than the other two... hmm.
-
Also super thanks!

-
I think the operation is so simple that you always see a slowdown due to the overhead of sending data to other processes and back.
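A toy sketch of that overhead (illustrative code, not the repo's): when the per-item work is as cheap as squaring a number, pickling each argument, shipping it to a worker process, and pickling the result back costs more than the computation itself, so the parallel version can end up slower than a plain loop.

```python
import time
from multiprocessing import Pool


def tiny_op(x):
    # Trivial work: the per-task IPC cost (pickling the argument, sending it
    # to a worker, pickling the result back) dwarfs the computation itself.
    return x * x


if __name__ == "__main__":
    data = list(range(100_000))

    t0 = time.perf_counter()
    serial = [tiny_op(x) for x in data]
    t_serial = time.perf_counter() - t0

    with Pool(4) as pool:
        t0 = time.perf_counter()
        parallel = pool.map(tiny_op, data)
        t_parallel = time.perf_counter() - t0

    # Same answers, but the parallel run pays communication overhead.
    print(f"serial: {t_serial:.3f}s  parallel: {t_parallel:.3f}s")
```

Raising `chunksize` in `pool.map` amortises the per-task overhead, but for work this cheap the serial loop usually still wins.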
-
The thing is, though, it will run out of memory the other way, so this way has to be better for big data — right?
-
There's another issue. I'll try to fix it because it will be easier than explaining it on Twitter, then we can talk about it :)
-
I knew there was more & thank you so much!!

-
Here's a faster version: https://gist.github.com/jfsantos/8184653991558e30a9eab8613a6ea20f. The trick is to create a list with all combinations at once and use Pool.map.
-
It's still not faster than the others, but not far from the loop. You'll see more performance gains when the function is more complicated.
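A minimal sketch of that trick (not the code in the gist; Euclidean distance via `math.dist` and the sample points are illustrative assumptions): build the full list of point pairs with `itertools.combinations`, then let `Pool.map` split it into chunks for the workers.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+
from multiprocessing import Pool


def pair_distance(pair):
    # Each worker receives one (point_a, point_b) tuple.
    a, b = pair
    return dist(a, b)


if __name__ == "__main__":
    points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
    # Materialise every pair up front, then let Pool.map chunk the list
    # and farm the chunks out to worker processes.
    pairs = list(combinations(points, 2))
    with Pool() as pool:
        distances = pool.map(pair_distance, pairs)
    print(distances)  # [5.0, 10.0, 5.0] — one distance per pair, in combination order
```

The downside, as the thread notes next, is that `list(combinations(...))` holds all of the roughly n²/2 pairs in memory at once.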
-
Maybe a better question to ask is this — how do I get this function, or something similar, to actually run in a reasonable amount of time without stealing all of my 64GB of memory?
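One memory-friendlier variant (a sketch under assumptions, not something from the thread): feed `Pool.imap` the lazy `itertools.combinations` generator directly and consume the results as a stream, so neither the O(n²) list of pairs nor the O(n²) list of distances is ever held in memory at once.

```python
from itertools import combinations
from math import dist
from multiprocessing import Pool


def pair_distance(pair):
    a, b = pair
    return dist(a, b)


if __name__ == "__main__":
    points = [(float(i), 0.0) for i in range(500)]
    n_pairs = len(points) * (len(points) - 1) // 2

    # combinations() is lazy and imap consumes it incrementally; chunksize
    # batches tasks to amortise the per-task IPC overhead.
    total = 0.0
    with Pool() as pool:
        for d in pool.imap(pair_distance, combinations(points, 2), chunksize=256):
            total += d  # stream results (e.g. accumulate) instead of collecting a list

    print(f"mean pairwise distance: {total / n_pairs:.2f}")
```

If you need all the distances rather than an aggregate, writing each result to disk as it arrives keeps the same constant-memory profile.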