I have a large repository of PDFs that I've wanted to analyze by constructing a graph over the similarity of the *questions* they ask, instead of over bibliographic citations. NLP folks, how difficult do you think this task is?
Replying to @generativist
I want to sit down and do the same. Should be a relatively straightforward task. The Python data stack has way too many libs that can help. For starters, search for term vectorization, tf-idf, and document clustering.
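As a rough illustration of the tf-idf suggestion, here is a minimal sketch that uses only the Python standard library. It assumes the text has already been extracted from the PDFs; the `docs` contents, the function names, the smoothed idf formula, and the `threshold` value are all hypothetical choices for this example, not anything given in the thread. A real pipeline would more likely use an established library and would first need a way to isolate the questions each document asks.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical stand-ins for text already extracted from the PDFs.
docs = {
    "paper_a": "How do neural networks generalize to unseen data?",
    "paper_b": "Why do deep neural networks generalize beyond training data?",
    "paper_c": "What drives citation counts in economics journals?",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return a sparse tf-idf vector (term -> weight) for each document."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        # Smoothed idf (an assumption of this sketch): ln((1 + N) / (1 + df)) + 1.
        vectors[name] = {
            term: (count / len(toks)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(docs, threshold=0.2):
    """Return weighted edges between documents whose similarity meets the threshold."""
    vectors = tfidf_vectors(docs)
    edges = {}
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            edges[(a, b)] = score
    return edges


graph = similarity_graph(docs)
for (a, b), score in graph.items():
    print(f"{a} -- {b}: {score:.2f}")
```

The resulting edge dictionary could then be handed to a graph library or a clustering step; the threshold controls how dense the graph gets and would need tuning on real data.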
Replying to @preslavrachev
(I'm flying and doing a bunch of nonsense this week, but feel free to DM me next week if I forget.)
7:24 AM - 22 Apr 2019