An unintuitive secret of reading books on computers: reading PDFs with original typesetting is much better than reading ebooks, which treat text like a 4chan shitposter and have impoverished reading software.
But… where to get the PDFs?! A survey & suggestions for future work:
Google Play:
👍 ~smooth workflow; clean pages
👎 PDFs lack text layer, so they're not searchable or selectable; only recent books available in PDF
archive.org:
👍 has many older books Play lacks; includes OCR'd text layer
👎 OCR errors; photo noise; clunkier workflow
Z-Library:
👍 occasionally has clean PDFs for books which others lack
👎 PDFs are often EPUB->PDF conversions (the worst!); more illegal
One fun project idea: maybe you could improve upon the poor text layers in Play / archive.org's PDFs by building a tool which combines EPUBs and PDFs by aligning the EPUB's original text onto the PDF pages via OCR.
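A rough sketch of what that alignment step might look like, assuming some OCR pass has already produced a list of `(text, bbox)` word tuples for a page and the EPUB text has been split into words. The function names and the data shapes here are hypothetical; only `difflib.SequenceMatcher` comes from the Python standard library, and a real tool would need to handle hyphenation, running headers, and whole-book alignment.

```python
from difflib import SequenceMatcher


def normalize(word):
    """Lowercase and drop punctuation so minor differences do not block a match."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def align_epub_to_ocr(ocr_words, epub_words):
    """Return (text, bbox) pairs: EPUB spelling where alignment is unambiguous.

    ocr_words:  list of (text, bbox) tuples from an OCR pass (assumed input).
    epub_words: list of words extracted from the EPUB (assumed input).
    """
    ocr_keys = [normalize(text) for text, _ in ocr_words]
    epub_keys = [normalize(word) for word in epub_words]
    matcher = SequenceMatcher(None, ocr_keys, epub_keys, autojunk=False)

    corrected = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        one_to_one = (i2 - i1) == (j2 - j1)
        if tag == "equal" or (tag == "replace" and one_to_one):
            # Same number of words on both sides: trust the EPUB's text,
            # keep the OCR's position on the page.
            for (_, bbox), word in zip(ocr_words[i1:i2], epub_words[j1:j2]):
                corrected.append((word, bbox))
        elif tag in ("replace", "delete"):
            # Ambiguous or OCR-only words (e.g. page numbers): leave as-is.
            corrected.extend(ocr_words[i1:i2])
        # "insert": EPUB words with no OCR box have nowhere to go; skipped.
    return corrected


ocr_page = [
    ("Tbe", (0, 0, 10, 5)),
    ("quick", (12, 0, 30, 5)),
    ("brovvn", (32, 0, 50, 5)),
    ("fox", (52, 0, 60, 5)),
]
epub_text = ["The", "quick", "brown", "fox"]
fixed_page = align_epub_to_ocr(ocr_page, epub_text)
```

The design choice is to keep the OCR's geometry and only borrow the EPUB's spelling, so the PDF page image is untouched and only the text layer changes.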
Maybe you could improve the EPUB reading experience by extracting text block layout parameters from the PDFs through computer vision: i.e., try to estimate the text block width/height, line height, and font size in the original typesetting. A similar technique could map page numbers.
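As a minimal sketch of the estimation step, assume an earlier vision or OCR pass has already produced one bounding box `(x0, y0, x1, y1)` per text line on a page, with y increasing downward. The function name and input format are hypothetical, and the median box height is only a crude proxy for font size, not a measurement of it.

```python
from statistics import median


def estimate_layout(line_boxes):
    """Estimate text block size and line height from per-line bounding boxes.

    line_boxes: list of (x0, y0, x1, y1) tuples, one per text line (assumed
    input from some earlier detection step), y increasing down the page.
    """
    boxes = sorted(line_boxes, key=lambda box: box[1])  # top to bottom

    block_width = max(b[2] for b in boxes) - min(b[0] for b in boxes)
    block_height = max(b[3] for b in boxes) - min(b[1] for b in boxes)

    # Distance between consecutive line tops; the median resists the
    # occasional larger gap between paragraphs or around headings.
    pitches = [lower[1] - upper[1] for upper, lower in zip(boxes, boxes[1:])]
    line_height = median(pitches) if pitches else None

    # Height of a line's box: a rough stand-in for font size.
    glyph_height = median(b[3] - b[1] for b in boxes)

    return {
        "block_width": block_width,
        "block_height": block_height,
        "line_height": line_height,
        "glyph_height": glyph_height,
    }


# Five synthetic lines: 300 units wide, 10 tall, spaced 14 apart.
sample_lines = [(72, 100 + 14 * i, 372, 110 + 14 * i) for i in range(5)]
layout = estimate_layout(sample_lines)
```

An EPUB reader could then, in principle, set its column width and line spacing from these numbers, though mapping box heights to an actual font size would need more care.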
Related: while e-book reading software is truly impoverished, PDF software is also almost universally unimaginative and unserious for the task of reading. Would love to see more work there…
Replying to
this is related to your recent tweet.
Quote Tweet
I'm thinking about moving my non-fiction reading from the Kindle to the desktop.
On the Kindle, I'm constrained to basic reading and highlighting.
But on the desktop I can read, highlight, take notes, look up facts, mix in videos etc.
Isn't this much better for learning?
Replying to
This is such a fascinating point for me, as often my goal is to get a PDF into something that can flow so that I can actually read it on a smaller device. PDFs are so static, and they also don't work well with text-to-speech systems.
I guess it depends on what sort of reading...
Right, I guess! I don't read on smaller devices. I use a reMarkable when away from my computer to read in the original page layout.
Replying to
Sorry, I'm not seeing the problem with the epub. Both versions are quite legible. I don't think the PDF's page layout is adding much.
Suspect you’d feel differently if you tried to read a dozen pages like this. The epub’s lines are too long, slightly too loose, and have terrible justification leading to irregular word spacing rhythms.