1. is legit. Though I would not say "fail" so much as "have significant challenges". This is also a problem for distributed computing, if some of your memory lives on another machine. I don't get point 2 at all though. Loads of GC'd languages have coroutines, or so I thought?
Loads of languages might have coroutines, but at a huge cost! The way virtually all GCs work is by walking the stack to find live references... with "stackful" coroutines you potentially have millions of stacks to scan on every collection!
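The scanning cost above can be sketched with a toy mark phase: each live stackful coroutine contributes a whole stack of frames the collector must walk, even if the coroutine is idle. This is purely illustrative Python; none of the names come from a real runtime.

```python
# Toy sketch (not a real collector): the mark phase of a tracing GC treats
# every coroutine stack as a source of roots, so the work per collection
# grows with the number of live stackful coroutines, even sleeping ones.

def mark_reachable(stacks):
    """Scan every frame of every coroutine stack for root references."""
    reachable = set()
    frames_scanned = 0
    for stack in stacks:             # one entry per stackful coroutine
        for frame in stack:          # every frame must be walked
            frames_scanned += 1
            reachable.update(frame)  # collect the references it holds
    return reachable, frames_scanned

# One main stack vs. 100,000 sleeping coroutines with tiny stacks:
one_stack = [[{1, 2}, {3}]]
many_stacks = [[{i}] for i in range(100_000)]

_, frames_one = mark_reachable(one_stack)
_, frames_many = mark_reachable(many_stacks)
print(frames_one, frames_many)  # 2 100000
```

Stackless coroutines (generators, async/await state machines) avoid this because their state lives in ordinary heap objects the GC already tracks.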
New conversation
Sorry for the beginner question, the most I know about GPU programming is http://x.to ('cuda'). Does that mean Nim needs to check for memory to be freed outside its GC, or does that mean there is no way for Nim to check for memory to be freed? Also, great work with @NimTorch
Well, in a nutshell, GCs are full of thresholds that trigger collections and such. Any GPU tensor will appear to the GC as a small structure in terms of size; the GC doesn't know it potentially has a lot of GPU memory behind it.
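The "small proxy" problem described above can be sketched in any GC'd language. In this hedged Python illustration, the host-side handle the collector can measure is tiny, while the device memory it owns is invisible to any size-based heuristic (the `Tensor` class and fake device pointer are made up for the example):

```python
# Sketch of the "small proxy" problem: the host object the GC sees is a few
# dozen bytes, while the GPU buffer it owns never enters the collector's
# size accounting, so allocation-size-based GC triggers never fire.
import sys

class Tensor:
    """Host-side handle; a real binding would hold a device pointer here."""
    __slots__ = ("device_ptr", "nbytes")
    def __init__(self, nbytes):
        self.device_ptr = 0xDEADBEEF  # fake device address
        self.nbytes = nbytes          # bytes "on the GPU"

t = Tensor(1 << 30)      # pretends to own 1 GiB of device memory
print(sys.getsizeof(t))  # tiny -- this is all the GC's heuristics ever see
```

So a program can hold gigabytes of device memory behind a heap footprint far too small to ever push the GC past a collection threshold.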
New conversation
That's where GC.AddMemoryPressure can be useful: https://docs.microsoft.com/en-us/dotnet/api/system.gc.addmemorypressure
True, I always wanted to experiment with it back in the @xenko3d days :) Here we would have to compute full tensor sizes for every allocation (which, hey, might be OK; a shame that internally so many calls would do it too). .NET's GC is an over-engineered GC in my book though :(
New conversation
we had to deal with this in Lua as well. Our fix for (1) was to call the GC if CUDA went OOM, and try to re-allocate :D https://github.com/torch/cutorch/blob/5e9d86cb982a6048d3077aeb0e0cee19847b4c08/lib/THC/THCGeneral.c#L848-L852 Not ideal, but it worked reasonably well in practice.
In our case @junjihashimoto3 implemented the FFI in a way that approximates ARC, plus instrumentation to keep a close eye on this: https://user-images.githubusercontent.com/2469618/68619964-d9814f00-050f-11ea-9685-f8a21837d8a9.png But it's definitely one of the trickiest things to get right, and most people are blissfully unaware.
New conversation
As the author of the alternative deep learning framework in @nim_lang (and also of a low-level multithreading runtime to avoid the limits of OpenMP): 1. Don't use the Nim GC, or use destructors/finalizers to plug CUDA/OpenCL memory into the GC. 1b. You need a caching allocator anyway. 2. I haven't tried coroutines/fibers yet, but I don't see what would prevent ucontext or setjmp/longjmp, especially as they are used in std/coro (arguably unmaintained), and this works: https://github.com/treeform/greenlet