An unintuitive secret of reading books on computers: reading PDFs with original typesetting is much better than reading ebooks, which treat text like a 4chan shitposter and have impoverished reading software. But… where to get the PDFs?! A survey & suggestions for future work:
Google Play:
👍 ~smooth workflow; clean pages
👎 PDFs lack text layer, so they're not searchable or selectable; only recent books available in PDF

archive.org:
👍 has many older books Play lacks; includes OCR'd text layer
👎 OCR errors; photo noise; clunkier workflow
Maybe you could improve the EPUB reading experience by extracting text block layout parameters from the PDFs through computer vision: i.e. try to estimate the text block width/height, line height, and font size in the original typesetting. A similar technique could map page numbers.
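
A rough sketch of what that estimation might look like, using plain projection profiles on an already-binarized page image (1 = ink, 0 = background). Everything here is an assumption for illustration: the function name `estimate_layout` is hypothetical, the input is assumed to be deskewed and noise-free, and the median glyph height is only a crude stand-in for font size. Real scans would need thresholding, deskewing, and a noise floor before this works.

```python
from statistics import median


def estimate_layout(page):
    """Estimate text block and line metrics from a binarized page.

    `page` is a list of equal-length rows; 1 marks an ink pixel, 0 background.
    Returns None for a blank page, otherwise a dict of pixel measurements.
    """
    # Projection profiles: how much ink falls on each row and each column.
    row_ink = [sum(row) for row in page]
    col_ink = [sum(col) for col in zip(*page)]
    rows = [i for i, n in enumerate(row_ink) if n > 0]
    cols = [j for j, n in enumerate(col_ink) if n > 0]
    if not rows:
        return None

    # Group consecutive inked rows into text lines as (top, bottom) pairs.
    lines = []
    start = prev = rows[0]
    for r in rows[1:]:
        if r != prev + 1:
            lines.append((start, prev))
            start = r
        prev = r
    lines.append((start, prev))

    # Distance between the tops of neighbouring lines approximates line height.
    pitches = [b[0] - a[0] for a, b in zip(lines, lines[1:])]
    return {
        "block_width": cols[-1] - cols[0] + 1,
        "block_height": rows[-1] - rows[0] + 1,
        "line_count": len(lines),
        "line_pitch": median(pitches) if pitches else None,
        # Assumption: inked height of a line is a rough proxy for font size.
        "glyph_height": median(b - a + 1 for a, b in lines),
    }
```

The ratios (line pitch to glyph height, block width to glyph height) are probably more useful than the raw pixel values, since they could carry over to an EPUB stylesheet regardless of scan resolution.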
Related: while e-book reading software is truly impoverished, PDF software is also almost universally unimaginative and unserious for the task of reading. Would love to see more work there…