Q: A producer/consumer ring is a common pattern for high-performance communication between two CPU cores, or between a CPU core and a device. So I expected Intel to have a non-temporal store instruction that writes to the LLC without polluting L1/L2. It would also be useful with device DDIO. But MOVNT* also bypasses the LLC. Why? (1/3)
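To make the question concrete, here is a minimal sketch of a single-producer/single-consumer ring whose producer fills slots with a MOVNTI streaming store. It shows the behavior available today (the store bypasses the whole cache hierarchy, LLC included), not the LLC-targeted store being asked for. The names `ring_t`, `ring_push`, `ring_pop` and `RING_SLOTS` are hypothetical, and the 8-byte payload per slot is only for illustration; real code would usually stream whole 64-byte lines.

```c
#include <stdatomic.h>
#include <stdint.h>
#if defined(__x86_64__)
#include <immintrin.h>   /* _mm_stream_si64, _mm_sfence */
#endif

#define RING_SLOTS 8u    /* hypothetical size; must be a power of two */

typedef struct {
    _Alignas(64) long long slot[RING_SLOTS]; /* payload, one value per slot */
    _Alignas(64) _Atomic uint32_t head;      /* advanced only by the producer */
    _Alignas(64) _Atomic uint32_t tail;      /* advanced only by the consumer */
} ring_t;

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int ring_push(ring_t *r, long long v)
{
    uint32_t h = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t t = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (h - t == RING_SLOTS)
        return 0;
#if defined(__x86_64__)
    /* Non-temporal store (MOVNTI): does not allocate the line in the
       producer's L1/L2, but it does not leave it in the LLC either. */
    _mm_stream_si64(&r->slot[h % RING_SLOTS], v);
    /* Streaming stores are weakly ordered; fence before publishing head. */
    _mm_sfence();
#else
    r->slot[h % RING_SLOTS] = v;  /* fallback: ordinary store */
#endif
    atomic_store_explicit(&r->head, h + 1, memory_order_release);
    return 1;
}

/* Consumer side: returns 1 and writes *out on success, 0 if empty. */
static int ring_pop(ring_t *r, long long *out)
{
    uint32_t t = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t h = atomic_load_explicit(&r->head, memory_order_acquire);
    if (h == t)
        return 0;
    *out = r->slot[t % RING_SLOTS];
    atomic_store_explicit(&r->tail, t + 1, memory_order_release);
    return 1;
}
```

With this code the consumer's read of `slot[]` has to come from memory, which is exactly the cost the question would like to avoid by landing the data in the LLC.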
Sorry for the late reply. I'm not from the silicon/design team. It would violate the cache inclusion principle that Intel CPUs follow (data in the MLC/L1 implies that the data is also in the LLC). It would require redesigning the cache coherency policies (the LLC and main memory have to stay coherent as well, plus the inclusion rules).
Since Skylake Server, the LLC is no longer inclusive. See section 2.2.1.2, "Non-Inclusive Last Level Cache", in the Intel Optimization Manual. Regardless, the mechanism I propose in this thread (be sure to read all the comments) also works with an inclusive LLC, while still avoiding polluting the producer's L1/L2.
(1/2) Right, thanks for pointing that out. It's non-temporal, so inclusiveness isn't part of the equation anyway. I believe the motivation was that the backing memory would be WC and memory-mapped I/O (mostly graphics). But I do see the motivation for your ask, especially for DDIO scenarios.