
Where is .djvu?


Do you need an option for that? You can convert to PDF and then `pdf2djvu` it.
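If you want to script that second step, something like the following might do. It is only a sketch: it assumes `pdf2djvu` is installed and on your PATH, and the helper names `pdf2djvu_command` and `convert` are made up for illustration.

```python
import shutil
import subprocess


def pdf2djvu_command(pdf_path: str, djvu_path: str) -> list[str]:
    """Build the argument list for pdf2djvu (hypothetical helper).

    `-o` names the single output DjVu file.
    """
    return ["pdf2djvu", "-o", djvu_path, pdf_path]


def convert(pdf_path: str, djvu_path: str) -> None:
    """Run pdf2djvu, failing early if the tool is not installed."""
    if shutil.which("pdf2djvu") is None:
        raise FileNotFoundError("pdf2djvu is not on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(pdf2djvu_command(pdf_path, djvu_path), check=True)
```

Calling `convert("book.pdf", "book.djvu")` would then write `book.djvu` next to wherever you run it.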


I believe the best you could do is extract the raw OCR'd text from the document (with some other tool). The OCR process preserves no formatting or text hierarchy, only the physical location and size of the text on the page. From that text, you can convert to Markdown or whatever and then edit it by hand to give the OCR text some structure.
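A rough sketch of that workflow, in case it helps. It assumes the "other tool" is `djvutxt` from DjVuLibre, that it prints the text layer to stdout when no output file is given, and that pages are separated by form feed characters. The function names and the paragraph heuristics are mine, not anything the tool gives you.

```python
import re
import subprocess


def extract_text(djvu_path: str) -> str:
    """Return the OCR text layer of a DjVu file via djvutxt (assumed installed)."""
    result = subprocess.run(
        ["djvutxt", djvu_path], capture_output=True, text=True, check=True
    )
    return result.stdout


def text_to_markdown(raw: str) -> str:
    """Turn raw OCR text into very plain Markdown.

    Heuristics only: form feeds are treated as page breaks, blank lines as
    paragraph breaks. No real structure is recovered.
    """
    out = []
    for number, page in enumerate(raw.split("\f"), start=1):
        paragraphs = []
        for block in re.split(r"\n\s*\n", page):
            # Rejoin words hyphenated across a line break. This will also
            # merge genuine hyphenated compounds that happen to wrap.
            block = re.sub(r"(\w)-\n(\w)", r"\1\2", block)
            text = " ".join(
                line.strip() for line in block.splitlines() if line.strip()
            )
            if text:
                paragraphs.append(text)
        if paragraphs:
            out.append(f"## Page {number}")
            out.extend(paragraphs)
    return "\n\n".join(out) + "\n"
```

Something like `text_to_markdown(extract_text("book.djvu"))` gives you a starting point. Headings, lists, and emphasis still have to be added by hand, since none of that survives OCR.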



