It would be nice, however, if the PDFs also included the corresponding OCR text in addition to the scan.
-
@SteveBellovin Redaction, the whole PDF/OCR thing, is so old. Older than you and I combined, and older than the "As the world turns" shitstorm on NANOG. Maybe it's time for a new format/approach?
-
Do the scanned PDFs have the printer tracking yellow dots embedded? If so, that’s a different type of metadata leak...
-
Fair question. I suspect that the answer is no. The motivation for the tracking dots was to detect counterfeit bills produced by high-quality color printers. That motivation doesn't exist here. It's also very easy to detect—look for any yellow in what should be a monochrome scan.
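The "look for any yellow" check could be sketched roughly as below. This is only an illustration: the function names, the thresholds, and the sample pixel data are all made up for the example, and real tracking dots are faint enough that the thresholds would need tuning against an actual scan.

```python
def is_yellowish(r, g, b, min_level=150, min_gap=40):
    """Treat a pixel as yellow if red and green are bright and blue is clearly lower.

    min_level and min_gap are assumed values, not calibrated against real scans.
    """
    return r >= min_level and g >= min_level and min(r, g) - b >= min_gap


def yellow_fraction(pixels):
    """Return the share of (r, g, b) pixels that look yellow."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    hits = sum(1 for r, g, b in pixels if is_yellowish(r, g, b))
    return hits / len(pixels)


# Hypothetical data: a pure black-and-white page, then the same page
# with two pale yellow pixels standing in for tracking dots.
gray_page = [(255, 255, 255)] * 98 + [(0, 0, 0)] * 2
dotted_page = gray_page[:96] + [(255, 255, 150)] * 2 + gray_page[98:]

print(yellow_fraction(gray_page))
print(yellow_fraction(dotted_page))
```

For a real file, the pixel list would come from whatever image library decodes the scanned page; any nonzero fraction in a supposedly monochrome scan is worth a closer look.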
-
It seems like it would be better to have a program that could create PDFs but was really dumb, only knowing how to include text and images, than to try to sanitize after the fact.
-
Although I guess that would require reimplementing all the layout logic of Word. Maybe MS should offer an "export to dumb PDF" option
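To make the "dumb PDF" idea concrete, here is a minimal sketch of a writer that knows nothing except how to put lines of text on one page. It is an assumption-heavy toy, not what Word or any real exporter does: `dumb_pdf` is a hypothetical name, it handles ASCII text only, uses the built-in Helvetica font, does no line wrapping, and leaves out images entirely. The point is that the output contains no metadata or document-info objects to sanitize, because the writer never emits any.

```python
def dumb_pdf(lines, font_size=12):
    """Build a one-page, text-only PDF as bytes (hypothetical sketch, ASCII only)."""

    def esc(s):
        # Backslash and parentheses must be escaped inside PDF string literals.
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Content stream: start near the top-left of a US Letter page, one line per Tj.
    ops = ["BT", f"/F1 {font_size} Tf", f"{font_size + 2} TL", "72 720 Td"]
    for line in lines:
        ops.append(f"({esc(line)}) Tj T*")
    ops.append("ET")
    stream = "\n".join(ops).encode("ascii")

    # The only five objects this writer knows about.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, needed by the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


pdf_bytes = dumb_pdf(["Hello", "redacted (safely)"])
```

Writing `pdf_bytes` to a file opened in binary mode should give something a PDF viewer can open. The hard part, as noted above, is everything this sketch skips: layout, fonts, and images.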