CS101 teaches us binary search is the best way to search through large datasets, but with a bit of data modeling we can substantially improve performance
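To make the "bit of data modeling" idea concrete, here is a minimal sketch of a learned-index lookup. It is not the code from the thread: the function names `fit_linear_model` and `learned_search` are hypothetical, and a single straight line stands in for whatever model the authors actually use. The model predicts where a key should sit in the sorted array, and a binary search then runs only inside the window allowed by the model's worst-case error.

```python
import bisect


def fit_linear_model(keys):
    """Fit position ~ slope * key + intercept on sorted, unique, non-empty integer keys."""
    n = len(keys)
    first, last = keys[0], keys[-1]
    slope = (n - 1) / (last - first) if last > first else 0.0
    intercept = -slope * first
    # Worst-case gap between the predicted and the true position.
    max_err = max(abs(i - round(slope * k + intercept)) for i, k in enumerate(keys))
    return slope, intercept, max_err


def learned_search(keys, model, query):
    """Return the index of query in keys, or -1 if it is absent."""
    slope, intercept, max_err = model
    n = len(keys)
    guess = round(slope * query + intercept)
    # Clamp the search window to the array; any present key lies inside it.
    lo = min(max(guess - max_err, 0), n)
    hi = max(min(guess + max_err + 1, n), lo)
    i = bisect.bisect_left(keys, query, lo, hi)
    return i if i < n and keys[i] == query else -1
```

The more predictable the key distribution, the smaller `max_err` becomes, and the fewer comparisons the final bounded search needs compared with a binary search over the whole array.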
Real genomes are shockingly predictable with respect to the ordering of k-mers/suffixes. pic.twitter.com/6ansR8eex1
We were deeply inspired by The Case for Learned Index Structures by @tim_kraska et al. (plus conversations with @JeffDean last year): https://arxiv.org/abs/1712.01208. I see huge potential for these ideas in the future.
It is a classic CS space-time tradeoff: with a tiny amount of space overhead (0.1%) we can double the performance over @TheGeneMyers' optimal suffix array search algorithm. If we allow for more space, we can push it even faster. pic.twitter.com/TOYEFxuh5T
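For readers who want to see what is being accelerated, the following is a rough sketch of pattern search over a suffix array using plain binary search. It is a baseline illustration only: it is neither the optimized algorithm credited to @TheGeneMyers in the tweet nor the learned variant from the thread, and the names `build_suffix_array` and `find_occurrences` are made up for this example. The naive construction is only suitable for short strings.

```python
def build_suffix_array(text):
    """Return suffix start positions in lexicographic order (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of pattern in text, in increasing order."""
    m = len(pattern)

    # First suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

Each lookup costs a logarithmic number of string comparisons over the whole array. A model that predicts roughly where a query lands would let both loops start from a much narrower interval, at the cost of storing the model, which is the space-time tradeoff the tweet describes.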
-
Are there any methodological differences between this and LISA: Towards Learned DNA Sequence Search (Ho et al. 2019)? All that's mentioned in the paper is that LISA's code wasn't released publicly and was only tested on one dataset.
-
Similar conceptual framework. If you check our GitHub repo, you will see we had been working on this for more than a year before that was posted.
-
This is an amazing graphic. Helpful.
-
This is super interesting. Does k-mer size affect the error bound? And for the PWL, how does right-shifting by b bits give the bin a query belongs to?
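One plausible reading of the right-shift question, offered as an assumption rather than as the authors' confirmed design: if each k-mer is packed into an integer at 2 bits per base, and the key space is cut into equal-width bins whose width is a power of two, then dropping the low b bits leaves exactly the bin number. The encoding table and the function names below are hypothetical.

```python
# Assumed 2-bit encoding; chosen so that integer order matches lexicographic order.
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer):
    """Pack a k-mer into an integer using 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | ENCODE[base]
    return value


def bin_index(key, b):
    """Bin number when every bin spans 2**b consecutive keys."""
    return key >> b  # same as key // 2**b for non-negative keys


# With k = 4 (8-bit keys) and b = 4 there are 16 bins,
# and each bin corresponds to one 2-base prefix.
key = encode_kmer("GATT")
print(bin_index(key, 4), encode_kmer("GA"))
```

Under this assumption the shift replaces a search over segment boundaries with a single instruction, and a smaller b means more bins, hence more space and a tighter fit per bin.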