Replying to @whitequark
I had a look at the GENTRL paper recently, wondering how they got from three-dimensional chemical structure space to a vector that could be fed into a machine learning algorithm... and even harder, how they made that mapping reversible. Couldn't figure it out.
Replying to @NYarvin @whitequark
And I don't think that's my fault. They say approximately one word about how the transformation starts: SMILES. https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system …
Replying to @NYarvin @whitequark
But from there, how they get to "a continuous space of 50 dimensions" (i.e. a 50-element vector) is left almost completely undescribed, even in the supplementary information.
Replying to @NYarvin @whitequark
And the real trick is making all this reversible. I was expecting about 20 pages of describing how they did that, and found bupkis.
Replying to @NYarvin @whitequark
I checked again, and found a short section in the supplemental that I hadn't noticed, which is enough to decipher it. Turns out the 50 dimensions are known chemical and medicinal properties, so it's not really "auto-encoding"; instead they're giving it a ton and a half of data.
Replying to @NYarvin @whitequark
As regards reversibility, they just train a forward network to map one way, and a reverse network (with very different shape) to map the other way. Not perfect reversibility, but they don't care; they discard the outputs that make no sense.
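To make the "discard the outputs that make no sense" step concrete, here is a minimal sketch of filtering decoded strings. It is not the paper's code: the function names and the sample strings are hypothetical, and the check is a toy (balanced parentheses, paired ring-closure digits) rather than a real SMILES parser, which would also have to handle bracket atoms, charges and `%nn` ring closures.

```python
def looks_like_valid_smiles(s: str) -> bool:
    """Toy sanity check, NOT a real SMILES parser (assumption for illustration).

    Accepts a string only if its parentheses balance and every
    single-digit ring-closure label appears an even number of times.
    """
    if not s:
        return False
    depth = 0
    open_rings = set()
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing a branch that was never opened
                return False
        elif ch.isdigit():
            # first occurrence opens a ring, second occurrence closes it
            open_rings ^= {ch}
    return depth == 0 and not open_rings


def keep_sensible(candidates):
    """Drop decoder outputs that fail the toy check, keeping the rest in order."""
    return [s for s in candidates if looks_like_valid_smiles(s)]


# Hypothetical decoder outputs: two well-formed strings and three broken ones.
decoded = ["c1ccccc1", "CC(=O)O", "C1CC(", "CC)O", "C1CCC"]
print(keep_sensible(decoded))
```

The point is only that the reverse mapping does not need to be exact: anything the decoder emits that fails a validity check gets thrown away, and the survivors are what you look at.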
One of the 50 properties is "this is a drug that does exactly what we want", so to have the end result be an almost identical chemical is unsurprising. The corollary of garbage in, garbage out: solved problem in, solved problem out.