An int64 i containing 8 bytes of flags, each 0 or 1, can be converted to an 8-bit bitmask with i*0x0102040810204080>>56. Neat trick for atomic-free garbage-collector card table traversal.
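A minimal sketch of the multiply-and-shift above, written with unsigned 64-bit arithmetic so that the multiply wraps and the shift is logical. The names `pack_flags` and `load_cards` are hypothetical, and the byte-to-bit mapping of `load_cards` assumes a little-endian machine.

```c
#include <stdint.h>
#include <string.h>

/* Pack eight byte flags (each 0 or 1) into one 8-bit mask.
   Byte k of the input (byte 0 = least significant) becomes bit k of the result.
   The constant has one bit set in each byte (bits 7, 14, 21, ..., 56), so every
   flag is shifted to a distinct bit position and no carries occur. */
static inline uint8_t pack_flags(uint64_t flags)
{
    return (uint8_t)((flags * UINT64_C(0x0102040810204080)) >> 56);
}

/* Hypothetical helper: read eight consecutive card bytes as one 64-bit word.
   Assumption: little-endian layout, so cards[0] ends up in byte 0. */
static inline uint64_t load_cards(const uint8_t *cards)
{
    uint64_t word;
    memcpy(&word, cards, sizeof word);
    return word;
}
```

A traversal could then test `pack_flags(load_cards(p))` against zero to skip eight clean cards at a time. The result is only meaningful if every byte is 0 or 1; larger byte values would carry into neighbouring bits.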
Replying to @tyoc213
In this case, I use byte flags with values 0-1 so that lots of threads can update flags concurrently without requiring slow atomics. Maintaining bit flags shared between threads requires atomics, which are ~72 times slower than uncontended byte writes.
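To illustrate the difference, here is a rough sketch of the two table layouts, not the author's actual code. The function names are hypothetical. It assumes C11 `<stdatomic.h>`; a relaxed atomic store is used for the byte version because it avoids a formal data race in C while typically compiling to an ordinary byte store on mainstream hardware.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-per-card table: marking is a single store to the card's own byte.
   No read-modify-write is needed because no other card shares that byte. */
static inline void mark_card_byte(_Atomic uint8_t *cards, size_t card)
{
    atomic_store_explicit(&cards[card], 1, memory_order_relaxed);
}

/* Bit-per-card table: eight cards share a byte, so setting one bit must be
   an atomic read-modify-write or concurrent marks could be lost. */
static inline void mark_card_bit(_Atomic uint8_t *bits, size_t card)
{
    atomic_fetch_or_explicit(&bits[card / 8],
                             (uint8_t)(1u << (card % 8)),
                             memory_order_relaxed);
}
```

The byte layout costs eight times the memory of the bit layout, which is the trade made for cheaper marking.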
Replying to @TimSweeneyEpic @tyoc213
yeah my question was going to be about cache line contention - on multicore and even more so on multi-socket
These shared memory updates will be slow (hundreds of cycles) if there’s high write-contention. We can reduce write-contention by reading first and skipping the write if the value would be unchanged, which is a common case for garbage collector card marking.
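The read-before-write idea can be sketched as follows. This is an illustration under assumptions rather than the author's implementation: `mark_card_checked` is a hypothetical name, the table holds one byte per card, and C11 relaxed atomics stand in for plain loads and stores.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mark a card dirty, but only write if it is not already marked.
   When the card is already dirty, the cache line is only read, so it can
   stay shared between cores instead of being pulled exclusive for a store.
   Returns true if a store was performed. */
static inline bool mark_card_checked(_Atomic uint8_t *cards, size_t card)
{
    if (atomic_load_explicit(&cards[card], memory_order_relaxed) != 0)
        return false;  /* already dirty: skip the write */
    atomic_store_explicit(&cards[card], 1, memory_order_relaxed);
    return true;
}
```

Two threads may both see 0 and both store 1. That is harmless here because the stores are idempotent.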