@FioraAeterna 3² × 128² coefficients, yes.
@FioraAeterna It computes the same thing, but I don't know how it divides up the big mesh (input-major, output-major, etc.)
@marcan42 is there any organization that would reduce the number of duplicate memory reads?
@FioraAeterna The problem is it's a mesh, so you get to pick what to inner loop on and reread everything else, or use a bit of everything.
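A rough sketch of that tradeoff, not how the hardware actually does it. It assumes a single output pixel, with the 3² taps folded into the input side so the mesh is 9 × 128 inputs by 128 outputs. The function name `count_traffic`, the tile sizes, and the cost model (tiles held in registers, everything else reread from memory) are all made up for illustration.

```python
N_IN = 3 * 3 * 128   # 3x3 taps times 128 input channels, per output pixel
N_OUT = 128          # output channels


def count_traffic(n_in, n_out, tile_in, tile_out, outer):
    """Count memory traffic for one pass over an n_in x n_out mesh.

    Hypothetical cost model: one tile of inputs and one tile of
    accumulators fit in registers; anything else is reread from memory.
    Returns (input_reads, accumulator_traffic, coefficient_reads).
    """
    if n_in % tile_in or n_out % tile_out:
        raise ValueError("tile sizes must divide the mesh dimensions")
    in_blocks = n_in // tile_in
    out_blocks = n_out // tile_out
    input_reads = acc_traffic = coeff_reads = 0

    if outer == "output":
        # Output-major: accumulators stay in registers, inputs get reread.
        for _ in range(out_blocks):
            acc_traffic += tile_out              # written once when done
            for _ in range(in_blocks):
                input_reads += tile_in           # reread per output block
                coeff_reads += tile_in * tile_out
    elif outer == "input":
        # Input-major: inputs are read once, partial sums get reread.
        for _ in range(in_blocks):
            input_reads += tile_in
            for _ in range(out_blocks):
                acc_traffic += 2 * tile_out      # read partial sum, write back
                coeff_reads += tile_in * tile_out
    else:
        raise ValueError("outer must be 'input' or 'output'")
    return input_reads, acc_traffic, coeff_reads


output_major = count_traffic(N_IN, N_OUT, 1, 1, "output")
input_major = count_traffic(N_IN, N_OUT, 1, 1, "input")
bit_of_both = count_traffic(N_IN, N_OUT, 1, 16, "output")
```

Under this model the coefficient reads are the same 3² × 128² whichever way you loop. Output-major rereads every input once per output channel, input-major rereads every partial sum once per input, and keeping a block of 16 accumulators in registers (the "bit of everything" case) cuts the input rereads by 16×.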