Why does so much of the literature focus on *recognizing* a programming language's grammar, and so little on actually *parsing* the input into useful data? Recognizing a programming language (using any parser) is incredibly simple. I hacked up a recognizer for the C++ grammar in a day.
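To make the recognizer/parser distinction concrete, here is a minimal sketch in Python for a toy grammar (`expr := term ('+' term)*`, `term := NUMBER | '(' expr ')'`). The grammar and every function name are made up for illustration; this is nowhere near C++. The point is only that `recognize` answers yes or no, while `parse` hands back a tree you can do something with.

```python
import re

# Toy tokenizer: integers, parentheses, and '+'. Purely illustrative.
TOKEN = re.compile(r"\s*(\d+|[()+])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_expr(tokens, i):
    # expr := term ('+' term)*   (left-associative)
    node, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] == "+":
        right, i = parse_term(tokens, i + 1)
        node = ("+", node, right)
    return node, i

def parse_term(tokens, i):
    # term := NUMBER | '(' expr ')'
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok.isdigit():
        return int(tok), i + 1
    if tok == "(":
        node, i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return node, i + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    """Parser: returns the syntax tree as nested tuples."""
    tokens = tokenize(text)
    node, i = parse_expr(tokens, 0)
    if i != len(tokens):
        raise SyntaxError("trailing input")
    return node

def recognize(text):
    """Recognizer: only says whether the text is in the language."""
    try:
        parse(text)
        return True
    except SyntaxError:
        return False

print(recognize("1 + (2 + 3)"))  # True
print(parse("1 + (2 + 3)"))      # ('+', 1, ('+', 2, 3))
```

In this sketch the recognizer is just the parser with the tree thrown away. Everything downstream (type checking, code generation, refactoring tools) needs the tree, which is the part the question is asking about.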
Nitpicking, of course, but is a C++ program whose template metaprogram does not halt a valid C++ program? Do you need to solve the halting problem in order to distinguish a C++ program from other text?