I have a large repository of PDFs that I've wanted to analyze by constructing a graph over the similarity of the *questions* they ask, instead of over bibliographical citations. NLP folks, how difficult do you think this task is?
Replying to @generativist
I don't do OCR, so I can't speak to that portion, but once you get these in computer-readable format, it doesn't seem like too difficult a task...it depends on how fine-grained you need "similarity of questions" to be, though.
Replying to @Shugars
Yea, the second part is where I have very little experience.
Replying to @generativist @Shugars
- Linguistic comparison
- Document position / semantic weighting
- Frequency of use (albeit with different wordings)
6:48 AM - 22 Apr 2019
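A rough sketch of how the ideas in this thread might fit together, assuming the PDFs have already been converted to plain text (OCR is out of scope here). Every name below (`extract_questions`, `question_vector`, `similarity_graph`), the `1 / (1 + i)` position weight, and the `0.3` threshold is a made-up illustration, not anything proposed in the thread. The comparison is purely lexical (bag-of-words cosine), so it will not catch the same question asked in different wordings; that would likely need sentence embeddings instead.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt


def extract_questions(text):
    """Return the sentences that end in '?', in document order."""
    sentences = re.split(r"(?<=[.?!])\s+", text)
    return [s.strip() for s in sentences if s.strip().endswith("?")]


def question_vector(text):
    """Bag-of-words over a document's questions.

    Earlier questions count more (assumed weighting: 1 / (1 + position)).
    """
    vec = Counter()
    for i, question in enumerate(extract_questions(text)):
        weight = 1.0 / (1 + i)
        for token in re.findall(r"[a-z']+", question.lower()):
            vec[token] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_graph(docs, threshold=0.3):
    """Map each document pair at or above the threshold to its similarity.

    docs: dict of document name -> extracted plain text.
    """
    vecs = {name: question_vector(text) for name, text in docs.items()}
    edges = {}
    for a, b in combinations(sorted(vecs), 2):
        score = cosine(vecs[a], vecs[b])
        if score >= threshold:
            edges[(a, b)] = score
    return edges


docs = {
    "a": "We study birds. How do birds migrate?",
    "b": "Background. How do birds migrate in winter?",
    "c": "What is the price of oil?",
}
print(similarity_graph(docs))
```

In this toy input only documents `a` and `b` share question vocabulary, so they are the only pair that gets an edge. On real papers, common words like "how" and "do" would inflate scores, so stop-word removal or TF-IDF weighting would probably be a sensible first refinement.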