Ask HN: CLI tool to download webpages and convert to Markdown?
12 points by raytracer on June 12, 2020 | 9 comments
Is there a CLI tool that downloads webpages and converts them to Markdown?

I've just started using Obsidian https://obsidian.md/ and would like a way to save interesting blog posts and articles.



  curl --silent https://example.com/foo.html | pandoc --from html --to markdown_strict -o foo.md
From "Converting HTML to Markdown using Pandoc": http://www.cantoni.org/2019/01/27/converting-html-markdown-u...


I'd like to add that Calibre (ebook-convert on the CLI) has a mode for outputting a .txt file with Markdown formatting. It can take HTML as input.


I tried that on a Hacker News page and the content of the output file foo.md was HTML.


Sorry, I had only tested it on my own site, where it worked as expected:

  curl --silent https://tinyapps.org/ | pandoc --from html --to markdown_strict -o index.md
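
A likely reason for the HTML output on Hacker News is that the page is laid out with tables, which markdown_strict cannot express, so pandoc passes them through as raw HTML. Another output format such as gfm or markdown may give different results, though that is untested here. Below is a minimal sketch that wraps the pipeline above in a shell function with the output format as an optional argument. The names md_name and url2md are hypothetical helpers, not part of curl or pandoc.

```shell
#!/bin/sh

# Hypothetical helper: derive a Markdown file name from a URL.
#   https://example.com/foo.html -> foo.md
#   https://tinyapps.org/        -> index.md
md_name() {
  url=${1%%[?#]*}          # drop any query string or fragment
  case $url in
    */) base=index ;;      # directory-style URL: fall back to "index"
    *)  base=${url##*/}    # keep only the last path segment
        base=${base%.*}    # strip a trailing extension such as .html
        ;;
  esac
  printf '%s.md\n' "$base"
}

# Hypothetical wrapper around the curl | pandoc pipeline.
# Usage: url2md URL [PANDOC_OUTPUT_FORMAT]
# The output format defaults to markdown_strict, as in the commands above.
url2md() {
  out=$(md_name "$1")
  # Fetch first, so a failed download does not leave an empty .md file behind.
  html=$(curl --silent --fail --location "$1") || return 1
  printf '%s\n' "$html" |
    pandoc --from html --to "${2:-markdown_strict}" -o "$out" &&
    printf 'wrote %s\n' "$out"
}
```

With these functions sourced into a shell, url2md https://tinyapps.org/ should write index.md, and url2md https://example.com/foo.html gfm should write foo.md using pandoc's GitHub-flavored Markdown writer.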


I find Joplin https://joplinapp.org does a good job of producing Markdown from web pages, and it already has sync capability built in. Looks like it would be ideal for working with Obsidian.


Exported my Joplin Markdown and opened it up in Obsidian. Works like a dream, and the best part is that Joplin already has its own web clipper. Seems like a superb match.




Thank you, Anand. Despite being 8 years old, Aaron's html2text.py worked perfectly to convert the HN homepage to Markdown. His memory (and code) continues to be a blessing!



