Trick is to rely on cheaper to measure, but genetically informative phenotypes with tons of variables that allow for effective use of machine learning methods. A kind of middle way. Presumably works well because fewer variables than full marker set of genotypes.
And that the useful predictors are less rare among the chemical or spectrometer variables than among the genotypes, for which only like 1 in 100k is relevant or something which requires a lot of penalization to avoid overfitting.