https://arxiv.org/pdf/1803.04014.pdf … claims “half precision data and instruction in Tensor Cores are the main contributors to high throughput”, but that doesn’t explain why they don’t just add those instructions to the regular GPU execution units
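For concreteness, “half precision data and instruction in Tensor Cores” refers to warp-level matrix multiply-accumulate operations on __half operands. A minimal CUDA sketch using the WMMA API from mma.h (one of the supported tile shapes, 16x16x16, with half inputs and a float accumulator; illustrative only, needs sm_70 or newer):

// One warp cooperatively computes a single 16x16 tile: C = A * B,
// with A and B in half precision and the accumulation done in float.
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

__global__ void wmma_16x16x16(const __half *a, const __half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, __half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, __half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // zero the accumulator tile
    wmma::load_matrix_sync(a_frag, a, 16);           // load 16x16 A tile (row major)
    wmma::load_matrix_sync(b_frag, b, 16);           // load 16x16 B tile (col major)
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // the Tensor Core instruction
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

Launched with one 32-thread warp per 16x16 output tile, the mma_sync call is what executes on the Tensor Core units rather than on the regular FP32 ALUs.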
-
Yeah, I’m more asking why NVIDIA and Apple, which make GPUs, decided to go with separate blocks alongside their GPUs. Neither company sells its “TPUs” separately from a GPU.
- 2 more replies
New conversation -
A few things. The ideal memory topology for graphics is very different. Programmability takes silicon that isn’t needed. Supporting high precision at all takes a bunch of silicon. Convolution and LUT logic is generally not that useful for graphics.
-
Sure, I’m more asking about companies that won’t sell you a TPU without an attached GPU though (Apple, NVIDIA). Seems strange that they have totally separate blocks instead of reusing silicon.
- 3 more replies
New conversation -
8-bit floats aren’t too useful
-
Yeah, make that 16-bit floats.
End of conversation
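To put numbers on the 8-bit vs 16-bit exchange above: fp16 (1 sign, 5 exponent, 10 mantissa bits) already loses integer precision above 2048, and an 8-bit float leaves only 2-3 mantissa bits. A small host-side sketch, assuming a CUDA toolkit recent enough that the cuda_fp16.h conversion helpers (__float2half / __half2float) are callable from host code:

#include <cstdio>
#include <cuda_fp16.h>

int main() {
    // fp16 has a 10-bit mantissa (11 significant bits with the implicit one),
    // so consecutive integers are exactly representable only up to 2048.
    const float samples[] = {2047.0f, 2048.0f, 2049.0f, 4097.0f};
    for (float x : samples) {
        float back = __half2float(__float2half(x));  // round-trip through fp16
        std::printf("fp32 %6.1f -> fp16 -> %6.1f\n", x, back);
    }
    // 2047 and 2048 round-trip exactly; 2049 comes back as 2048 and 4097 as 4096.
    // An 8-bit float, with only 2-3 mantissa bits, hits this kind of rounding
    // almost immediately.
    return 0;
}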
New conversation -
So they can make new proprietary APIs and try to lock customers into them
-
I once believed it was just half-precision support, but some of the designs use dataflow architectures. While GPUs can emulate dataflow architectures, and they have some elements of them, they are not true dataflow architectures. An FPGA is closer.
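For what “dataflow” means here, a toy token-driven evaluator, sketched in plain C++17 (the graph, node layout, and firing loop are made up for this example, not any vendor's design): each node fires as soon as its operand tokens arrive, instead of waiting for a program counter to reach it.

// Toy dataflow evaluation of (a + b) * (c - d): nodes fire when their
// operands are present, in whatever order that happens.
#include <cstdio>
#include <functional>
#include <optional>
#include <vector>

struct Node {
    std::function<float(float, float)> op;
    std::optional<float> in0, in1;   // operand slots waiting for tokens
    int dest_node, dest_port;        // where the result token goes (-1 = output)
    bool fired = false;
};

int main() {
    // node 0 = add, node 1 = sub, node 2 = mul
    std::vector<Node> g = {
        {[](float x, float y) { return x + y; }, {}, {}, 2, 0},
        {[](float x, float y) { return x - y; }, {}, {}, 2, 1},
        {[](float x, float y) { return x * y; }, {}, {}, -1, 0},
    };
    // Inject the input tokens a=1, b=2, c=10, d=4.
    g[0].in0 = 1.0f;  g[0].in1 = 2.0f;
    g[1].in0 = 10.0f; g[1].in1 = 4.0f;

    // Keep firing any node whose operands are all present.
    bool progress = true;
    while (progress) {
        progress = false;
        for (Node &n : g) {
            if (n.fired || !n.in0 || !n.in1) continue;
            float r = n.op(*n.in0, *n.in1);
            n.fired = true;
            progress = true;
            if (n.dest_node < 0)
                std::printf("result token: %g\n", r);  // (1+2)*(10-4) = 18
            else if (n.dest_port == 0)
                g[n.dest_node].in0 = r;
            else
                g[n.dest_node].in1 = r;
        }
    }
    return 0;
}

A dataflow or systolic design hard-wires that firing rule into silicon; a GPU's dependency tracking captures some of it, which is presumably the “some elements of dataflow” point above.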
-
Marketing, hype, $$$.