This is an unfortunate side effect of using EXT_mesh_shader instead of NV_mesh_shader - in the NV version, the dispatches are one-dimensional, which doesn't match other places in the API but honestly makes more sense and is easier to deal with on the app side. Ah well.
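To illustrate the app-side bookkeeping this implies, here is a minimal CPU-side sketch of folding a one-dimensional group count into a two-dimensional dispatch. The names `GroupCounts`, `splitDispatch` and `flatIndex` are hypothetical, and the 65535 per-dimension limit is an assumption for illustration; real code should query the device limits instead. This is not necessarily how niagara does it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: group counts for a multi-dimensional dispatch.
struct GroupCounts
{
    uint32_t x, y, z;
};

// Cover `total` groups with an x*y grid where neither dimension exceeds
// maxPerDim. The 65535 default is an assumed limit for illustration.
GroupCounts splitDispatch(uint32_t total, uint32_t maxPerDim = 65535)
{
    assert(maxPerDim > 0);
    if (total == 0)
        return {0, 1, 1};

    uint64_t rows = (uint64_t(total) + maxPerDim - 1) / maxPerDim; // ceil(total / maxPerDim)
    assert(rows <= maxPerDim); // otherwise a third dimension would be needed
    uint64_t cols = (total + rows - 1) / rows; // ceil(total / rows), never exceeds maxPerDim

    return {uint32_t(cols), uint32_t(rows), 1};
}

// What the shader side would compute to recover the 1D index; groups with
// flatIndex >= total are padding and should exit early.
uint32_t flatIndex(uint32_t gx, uint32_t gy, const GroupCounts& counts)
{
    return gy * counts.x + gx;
}
```

The grid can overshoot the requested count slightly, so the consumer has to bounds-check the recovered index, which is exactly the kind of extra handling a one-dimensional dispatch avoids.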
Arseny Kapoulkine 
@zeuxcg
Technical Fellow at Roblox.
pugixml, meshoptimizer, volk, niagara, Luau.
github.com/zeux
cohost.org/zeux
mastodon.gamedev.place/@zeux
Arseny Kapoulkine 🇺🇦’s Tweets
It just requires a tiny separate dispatch to convert the number of commands to the number of groups and to fill the command tail with no-op commands.
The extra dispatch results in a small GPU bubble, but the net effect on frame time is pretty minimal (~0.02ms per two dispatches)
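As a rough CPU-side model of what such a tiny dispatch might do (on the GPU this would be a small compute shader writing indirect arguments), the sketch below rounds the command count up to whole groups and overwrites the unused tail with no-ops. The `Command` layout, the group size of 64 and the "zero task count means no-op" convention are all assumptions for illustration, not the actual niagara code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical command layout; taskCount == 0 is treated as a no-op here.
struct Command
{
    uint32_t drawId;
    uint32_t taskCount;
};

// Assumed number of commands consumed by one group.
constexpr uint32_t kCommandsPerGroup = 64;

// Converts a command count into a group count and pads the tail of the last
// group with no-op commands so every thread in it reads valid data.
uint32_t finalizeCommands(std::vector<Command>& commands, uint32_t commandCount)
{
    uint32_t groupCount = (commandCount + kCommandsPerGroup - 1) / kCommandsPerGroup;
    uint32_t paddedCount = groupCount * kCommandsPerGroup;

    // The buffer must have room for the padded tail.
    assert(paddedCount <= commands.size());

    for (uint32_t i = commandCount; i < paddedCount; ++i)
        commands[i] = Command{0, 0};

    return groupCount;
}
```

With this padding the consuming shader does not need a per-thread bounds check against the command count; it only needs to skip commands that do no work.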
During the last stream youtu.be/eYvGruGHhUE I forgot to check limits on dispatch counts, and of course they are relatively small.
I'd normally just fix it on the next stream but I'm not sure when that's going to be, so I've fixed this off stream:
New* blog post: Meshlet size tradeoffs (ok, I've already posted this on cohost, but I've cleaned up the text a little and merged both cohost posts into one as it made more sense that way!)
Stream video is up!
In it we've successfully implemented a new draw call submission pipeline for task/mesh shading that eliminates the per-draw submission overhead of MDI, allowing the cluster culling optimizations implemented previously to shine.
Starting in 10 minutes! There's some interesting things in the plan so tune in and watch as the plan likely disintegrates as we go :D
The next niagara stream will be tomorrow at 10 AM PST; we'll continue optimizing occlusion culling and maybe do some cleanup work!
New blog post, "Approximate projected bounds", looks at computing screen-space projection bounds of a sphere and a box.
(this is not on cohost because cohost doesn't support syntax highlighting :)
Starting in 30 minutes!
Next niagara stream is going to be this Saturday, Jan 7, at 10 AM PST. We will work on optimizing and cleaning up various forms of culling that we've implemented recently.
"Meshlet sizing efficiency" explores the theoretical "larger meshlets are better" statement from a few angles other than vertex reuse and concludes that in fact smaller meshlets are better!
This is not the correct conclusion yet, as it ignores some HW details :)
niagara: Meshlet occlusion culling
youtube.com/watch?v=5sBpo5
Good news: we've successfully implemented this whole thing, it works and culls a lot of meshlets that we'd otherwise render!
Bad news: the code is impossible to follow now.
The next stream will happen early next week.
Starting in 30 minutes!
Next niagara stream will be on Tuesday, Jan 3, at 10 AM PST; we will work on meshlet occlusion culling and maybe other culling improvements:
I will probably start reposting these on Twitter as well - but Mastodon will remain my main app for now. I've written a little bit on this beyond "omg elon" here
A few things you may have missed if you don't follow me on Mastodon:
- On greedy algorithms cohost.org/zeux/post/4281
- Condition variables + atomics = trap cohost.org/zeux/post/5201
- Meshlet sizing theory cohost.org/zeux/post/6596
- niagara: Triangle culling
As an experiment, this is the last tweet I’m tweeting this year. I’ll reevaluate based on state of platform next year. I’ll still be reachable via DMs.
During this time I will keep writing on cohost and tooting (??) on Mastodon:
mastodon.gamedev.place/@zeux
cohost.org/zeux
__attribute__((target_clones)) seemed like a fantastic idea until it didn't
I happily paid for cohost plus despite the fact that it offers no features of significance, because I'm happy to help a tiny team of engineers build a product I want to use.
I don't want to pay the new Twitter $8 for no features of significance to sponsor soul crushing work mode
In the foreseeable future, more and more of my tweets or toots are probably going to be links to cohost and here's why.
Just in case this site explodes after all, I've cleaned up my Mastodon accounts so you can find me here mastodon.gamedev.place/@zeux
The nice thing about both Mastodon and cohost is I can get a nicer shorter name that I prefer :D Was too late for that at Twitter!
A few weeks before my vacation I started working on an ARM assembler and I realized I had misconceptions about ARM that just weren't true.
Not sure what the lesson here is, but I installed Google Maps app now which maybe was the point.
When opening a web link to google maps on iPhone, view doesn’t focus on location and it’s off screen - very hard to find.
No problem, I thought! I’ll just type the place name into Apple Maps!
Turns out there’s four places with that name in Paris, something I learned too late.
I’m going to be in Europe in November!
If you live in Paris, Lisbon, Barcelona or near the southern coast of Spain and would be up for meeting and talking tech over tea, shoot me a DM? Unsure yet re exact route/schedule/availability.
2022 is the year of replacing sprintf with snprintf everywhere
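For anyone joining in, a minimal sketch of the usual pattern: pass the buffer size to `snprintf` and check the return value to detect truncation. The `formatLabel` helper and its format string are hypothetical, made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats "name#id" into a fixed-size buffer.
// Returns true if the whole string fit, false if it was truncated or failed.
bool formatLabel(char* buf, size_t size, const char* name, int id)
{
    assert(buf != nullptr && size > 0);

    // Unlike sprintf, snprintf never writes more than `size` bytes and always
    // null-terminates when size > 0. It returns the length the full string
    // would have had, so a result >= size means the output was cut short.
    int written = std::snprintf(buf, size, "%s#%d", name, id);

    return written >= 0 && static_cast<size_t>(written) < size;
}
```

The same call with `sprintf` would write past the end of the buffer whenever the name is too long, whereas here the caller gets a truncated but valid string and a way to notice.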
In the last month I:
- Upgraded to iOS 16 and downgraded back to iOS 15 (battery drain)
- Upgraded to Ubuntu 22.10 and downgraded back to Ubuntu 22.04 (UI perf issues, audio bugs, broken Zoom)
... maybe I shouldn't try macOS Ventura quite yet.
Rust programmers, please return the crabs wherever you took them from.
It's kinda fascinating that we're in 2022 and the state of auto-vectorization is such that this loop does vectorize but the result is barely better than scalar, whereas a careful SSE2 implementation is 16x better as it should be.
Should I write a blog post about this loop? :)
Introducing the ScriptProfiler (Beta), a new sampling profiler available within the Developer Console. This profiler records the entire call stack of all executing scripts with a sampling frequency of 1000 times per second.
Learn more: devforum.roblox.com/t/scriptprofil
#Roblox #RobloxDev
Update: the combination of a downgrade to iOS 15 and a battery replacement has done wonders for the battery life.
I have a lot of travel coming up so also looking at portable batteries as backup but overall I don’t have to get the new iPhone which is nice.
And while we’re bashing on Apple, my two year old iPhone 12 Pro again has unacceptable battery consumption, not lasting an entire day after the iOS 16 update.
This happened to me for iPhone 8, and then for iPhone X.
The planning is exceptional if nothing else, solid 2 year cadence!
I’m sure Apple likes the button -> email -> website purchase -> app flow because they’re really concerned about the safety of my financial information. 🤦♂️
Today I profiled our profiler UI with our profiler, and our profiler implementation with a C++ profiler.
I saw the total triangle count as very low in traditional pipeline but I assumed it was a bug of some sort - nope, not a bug!
Once I disable triangle culling in the traditional pipeline, mesh shading path is ~10% faster.
Next up: adjusting my task/mesh shaders to cull better :)


