Unpopular (?) Opinion: Matrices are a great data structure. But using linear algebra ops purely for the vectorization speedup is often an obfuscating optimization. A well-commented loop over clearly named elements is often better. (Plus, ya know -- numba).
Like, I have some code in my dissertation that I *reliably* need to think about for a few minutes every time I come back to it. It's barely more than a series of matrix multiplications. But I keep forgetting how the data is oriented in each matrix (what the rows are, what the columns are), and that's what makes it complicated.
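To make that concrete, here's a toy sketch (not the dissertation code; the `scores` and `points` arrays and the `total_points_loop` helper are made up for illustration). Both versions compute the same thing, but only the loop tells you which axis is which:

```python
import numpy as np

# Hypothetical data: rows are students, columns are exam questions
# (1.0 = answered correctly, 0.0 = not).
scores = np.array([[1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0]])
# Hypothetical point value of each question.
points = np.array([2.0, 3.0, 5.0])

# Vectorized version: short, but you have to remember that rows are
# students and columns are questions for this to be the right product.
totals_vectorized = scores @ points


def total_points_loop(scores, points):
    """Same computation, with the orientation spelled out by the index names."""
    n_students, n_questions = scores.shape
    totals = np.zeros(n_students)
    for student in range(n_students):
        for question in range(n_questions):
            totals[student] += scores[student, question] * points[question]
    return totals


totals_loop = total_points_loop(scores, points)
```

The loop is slower in plain Python, which is where something like numba's `njit` decorator would likely come in; it's left off here so the sketch only needs NumPy.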