Hacker News

I am confused. What makes it not scannable? It's just a Word doc exported to PDF.


Please try this: https://www.freeconvert.com/pdf-to-txt (it uses the same library I use).

You will probably see only some header/footer content from your PDF.

In general, Word has some caveats when it generates a PDF. Not everything is retrievable when you try to extract the plain text content.


Nope. It got the whole thing. Might be an issue with your app.



