Agreed. Also, for the software I have a hard time seeing what's new here? It's all pre-existing tools wrapped into a website (that I can't access). E.g. Taxonomer from IDbyDNA has been available for years (and is very fast with immediate results): https://www.taxonomer.com/
Yeah I note this in the piece. They argue that the differences are in how intuitive the front end is and how robust the back end is. But I think the key diff is: for better or worse, this is the one that funders are backing.
No particular issue with IDseq conceptually but it would be a terrible mistake to focus on any particular software package or method in such a rapidly developing and unvalidated field ...
Yeah. As for how intuitive it is, I'll wait until I gain access to pass judgment. The main issue on the front end though is *interpretation*, not presentation. Mainly because of misclassifications and contaminants upstream in the pipeline - far from trivial to do.
Right. The interpretation gap is huge right now.
Yup - accurate classification of reads, contaminant detection, and scoring of likely causality are by far the most critical needs - and here new tools are needed. To me IDseq appears to be an online wrapper for preexisting tools, e.g. Gsnap, rapsearch, bowtie2, etc. pic.twitter.com/JA2a9z3eUC
In reply to @K_G_Andersen @pathogenomenick and others
Compare that overview figure to the overview of the pipeline I developed >5 years ago. Not a lot has happened since then to be honest - but of course, having a website is nice (although I think Taxonomer has that totally covered). pic.twitter.com/Py43JghElP
In reply to @K_G_Andersen @pathogenomenick and others
Agreed that contamination quantification, spurious mappings (from eg bad db), and scoring are the hardest parts. Even with large datasets (400 px from a few hospitals, similar context and processing), we've had trouble fitting reasonable models.
@jackkamm
In reply to @thebasepoint @K_G_Andersen and others
sadly for my statistical soul, the best perf (in terms of false positives) has been thresholds on rpm, raw number of reads, abun rel to water controls, and excluding a list of known contaminants.
In reply to @thebasepoint @K_G_Andersen and others
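For readers who want to see what that kind of rule-based filtering looks like, here is a minimal sketch. It is not the pipeline discussed in this thread: the class name, function names, and every threshold value are hypothetical, and it assumes per-taxon read counts and a water-control rpm are already available.

```python
from dataclasses import dataclass


@dataclass
class TaxonHit:
    """One taxon reported for one sample (hypothetical structure)."""
    name: str
    reads: int          # raw reads assigned to this taxon in the sample
    total_reads: int    # total reads sequenced for the sample
    control_rpm: float  # rpm of the same taxon in the water control


def rpm(reads: int, total_reads: int) -> float:
    """Reads per million sequenced reads."""
    return reads / total_reads * 1_000_000


def passes_filters(
    hit: TaxonHit,
    known_contaminants: set,
    min_rpm: float = 10.0,          # assumed threshold, tune per assay
    min_reads: int = 5,             # assumed threshold, tune per assay
    min_fold_over_control: float = 10.0,  # assumed threshold
) -> bool:
    """Apply the four heuristics named in the tweet, in order."""
    # 1. Drop anything on the known-contaminant list.
    if hit.name in known_contaminants:
        return False
    # 2. Require a minimum raw read count.
    if hit.reads < min_reads:
        return False
    # 3. Require a minimum depth-normalised abundance.
    sample_rpm = rpm(hit.reads, hit.total_reads)
    if sample_rpm < min_rpm:
        return False
    # 4. Require enrichment relative to the water control.
    #    A taxon absent from the control passes this step.
    if hit.control_rpm > 0 and sample_rpm / hit.control_rpm < min_fold_over_control:
        return False
    return True


# Made-up example data, for illustration only.
contaminants = {"Ralstonia pickettii"}
hits = [
    TaxonHit("Taxon A", reads=500, total_reads=10_000_000, control_rpm=1.0),
    TaxonHit("Ralstonia pickettii", reads=900, total_reads=10_000_000, control_rpm=0.0),
    TaxonHit("Taxon C", reads=3, total_reads=10_000_000, control_rpm=0.0),
    TaxonHit("Taxon D", reads=200, total_reads=10_000_000, control_rpm=5.0),
]
kept = [h.name for h in hits if passes_filters(h, contaminants)]
```

Hard cutoffs like these are easy to audit, which is part of why they hold up on false positives, but each one has to be tuned to the sample type, sequencing depth, and lab.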
some assembly is necessary to get over db bias (not every bird is a chicken, but 5% of bird sequence in NT is, and their diversity dominates in some genome regions), but god then you're doing metagenomic assembly as a pipeline, good luck.
Yeah, nobody said any of this was easy. Hence this is where we truly need new innovation - which is far from trivial.