That's Goodhart's law in action. Make the measure a target and it ceases to be a good measure: "students good at some arbitrary g-loaded task" -> "students practice that task". Skill at the task used to predict good outcomes because of the correlation, but making it the official test broke the correlation.
Being good at solving problems was what they were looking for, so they posed novel problems to find that ability. But the problems are not that hard to solve given time and attention, so the sorting changed to how well one could absorb and apply someone else's solutions. Different.