btw something that's increasingly obvious as I dissect this pile of VLIW-looking instructions: two decades ago, a GUI could accelerate its rendering on something like Rage128 or its pre-TeraScale successors, by uploading custom microcode with its common register write sequences
it wouldn't be by much, compared to already-present gain from being able to use a ring buffer the GPU reads itself, instead of individual MMIO accesses... but ATI clearly believed it mattered, otherwise they wouldn't have included most of the special-cased ring buffer packets
Replying to @eddyb_r @whitequark
nvidia since Fermi actually has a user-programmable macro-expander sorta-microcontroller in the path right before the main command distributor; seems they also consider it very useful
Replying to @mwk4 @whitequark
wait, they don't make you load an undocumented firmware blob into it, like AMD does?!
ftr, I was looking at:
36-bit:
- Rage128 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/r128/r128_cce.bin …
40-bit:
- R100 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon/R100_cp.bin …
- R200 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon/R200_cp.bin …
- R300 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon/R300_cp.bin …
- R420 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon/R420_cp.bin …
- R500 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon/R520_cp.bin …
tbd: TeraScale's 76-bit https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon/R600_me.bin …
(technically the R600_me.bin appears to have 75-bit words but I think that's because that part of the word always contains a word address, and there are 0x700 words in the blob, so the words are really 64-bit + the address of another word, possibly VLIW-style control-flow slice?)
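The 75-bit figure follows from simple arithmetic, sketched here. The 0x700 word count and the 64-bit payload guess come from the thread; the helper name `address_bits` is made up for illustration.

```python
def address_bits(word_count: int) -> int:
    """Bits needed to hold any word index in range(word_count)."""
    return (word_count - 1).bit_length()


WORD_COUNT = 0x700   # words in the blob, per the thread
PAYLOAD_BITS = 64    # the thread's guess for the non-address part of a word

addr_bits = address_bits(WORD_COUNT)   # highest index is 0x6ff
total_bits = PAYLOAD_BITS + addr_bits

print(f"address field: {addr_bits} bits, whole word: {total_bits} bits")
```

So an 11-bit "address of another word" field next to a 64-bit payload would account for the observed width, if that reading of the layout is right.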
it's really fun to look at words in a hex editor and realize that any kind of embedded data (such as a jump table) is pretty much guaranteed to occupy a saner subset of the weird word size (32-bit for Rage128-R5xx, 64-bit for TeraScale)
oh and I'm pretty sure they're big-endian
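A rough sketch of how that observation could be turned into a heuristic for the 36/40-bit blobs. It assumes each microcode word is stored on disk as two big-endian 32-bit values, high part first; that layout is an assumption here, not something the thread states, and the blob below is synthetic. `split_words` and `likely_data_words` are hypothetical helper names.

```python
import struct


def split_words(blob: bytes):
    """Yield (high, low) u32 pairs.

    ASSUMPTION: every word occupies 8 bytes as two big-endian u32s,
    high part first. Trailing bytes that do not fill a word are ignored.
    """
    usable = len(blob) - len(blob) % 8
    for offset in range(0, usable, 8):
        yield struct.unpack_from(">II", blob, offset)


def likely_data_words(blob: bytes):
    """Indices of words whose bits above the low 32 are all zero.

    Heuristic only: an instruction could also happen to have a zero
    high part, so treat hits as candidates, not proof.
    """
    return [i for i, (high, _low) in enumerate(split_words(blob)) if high == 0]


# Synthetic three-word blob: "instruction", "data", "instruction".
blob = struct.pack(">IIIIII", 0x0A, 0x12345678, 0x00, 0x00000100, 0x1F, 0xDEADBEEF)
print(likely_data_words(blob))
```

Runs of consecutive hits are the interesting part, since a jump table would show up as a block of such words.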
lmao good thing I googled around, looks like GCN's "microcode format" has already been reverse engineered (I swear I saw it in a talk, too!), and r1/r2's descriptions perfectly match what I can see in as far back as Rage128 https://github.com/fail0verflow/radeon-tools/tree/master/f32 … (*waves at @marcan42*)
Ha. So they've been doing this since Rage128, but with weirdo VLIW style cores? Geez, I'm glad I got the sane(ish) RISC design!
the thing that confirmed my suspicions and let me know a few things, like how the register offsets are encoded (shifted right by 2, makes sense, they're all 32-bit anyway) was this: the MPEG_INDEX handler in R520_cp.bin
packet docs: http://developer.amd.com/wordpress/media/2013/10/R5xx_Acceleration_v1.5.pdf#page=54 …
register bonanza! pic.twitter.com/SaepbxhcSJ
The way I started reversing this was that I immediately saw register offsets (>>2, but that's a common code convention for AMD too) in the hex dump, so I knew those were load/stores. Then I ran one- or two-opcode firmwares, using those load/stores to get data in and out.
I wasted a lot of time because I didn't know the >>2 thing, and they're like 11 bits, so e.g. in "3d bc", the >>2 register offset is actually "05 bc", i.e. 0x16f0
also I don't even own a Rage128, the closest thing I have is an R300-ish in an AMD HP SFF desktop circa 2006 D:
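The "3d bc" example works out like this, as a minimal sketch. The 11-bit width is the thread's "like 11 bits" estimate, and where the field sits inside a full microcode word is not covered; `decode_reg_offset` is a made-up helper that just takes the two bytes as seen in the hex dump.

```python
REG_FIELD_BITS = 11  # approximate width, per the thread


def decode_reg_offset(raw: int, bits: int = REG_FIELD_BITS) -> int:
    """Mask off the register field and undo the >>2 to get a byte offset."""
    dword_index = raw & ((1 << bits) - 1)   # 0x3dbc -> 0x05bc
    return dword_index << 2                 # registers are 32-bit, so *4


raw = 0x3DBC  # the bytes "3d bc" from the hex dump
print(hex(raw & 0x7FF), hex(decode_reg_offset(raw)))
```

The upper bits of "3d" belong to whatever else is packed next to the field, which is why the raw bytes do not look like a register offset at first glance.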
and it's, like, integrated in the chipset and I don't know how weird it is compared to cards, and I wanted to start at Rage128 anyway
speaking of, Rage128 cards are cheap, but... they're all AGP, and the oldest PC I have (the aforementioned HP SFF desktop) only has PCIe and PCI