Show HN: Lurnby, a tool for better learning, is now open source (github.com/roznoshchik)
162 points by roznoshchik on Feb 11, 2022 | 68 comments
I've been working on Lurnby for 2 years. It's kind of like a mix of pocket + kindle + anki.

It lets you => add epubs, pdfs, and web articles to the app => highlight and add comments => tag and organize highlights => review them with a spaced repetition system

Today I made the decision to open source the project. I'm passionate about helping other people learn to learn better and hope that this will allow a lot more innovation in the tool and the space.

I'm very new to open source and development in general really, but looking forward to receiving the guidance of the community.



This sounds cool. I would suggest putting a UI screenshot on the github page. At least for me personally, as someone who would use something like this, I don’t really have a great concept for the tool without seeing it. For Notes & Tools like this the UI is very important as a differentiator because a lot of similar things exist. To get buy in, people want to know exactly what they are getting.

That's just my $0.02


Thanks. I didn't realize I could add images to the github page. I'll see what I can do there.


Added now. Thanks for the heads up!


Also a short "Quick start guide" on the README would be nice... I was unable to find an INSTALL.md either.


I added an install.md now for setting it up locally. Does that work for you?


Nice. Tip: format it as a shell code block on markdown

Also consider adding a Dockerfile for super-rapid deployment


Thanks. I'll probably look into a dockerfile option.

Does format as a shell code block mean write it in a way that it can just be copied and pasted easily?


GitHub has pretty comprehensive documentation on their flavor of Markdown used for formatting. Here's a direct link to the section on fenced code blocks: https://github.github.com/gfm/#fenced-code-blocks


It'll format it as monospace, and typically in a format that's easier to recognize as being code.

IIRC you can do that by gating code like this:

  ```
  code goes here
  ```
in markdown, but it's been a while.

(The spaces are necessary for HN markup, but not the markdown formatting).


Hey, by quickstart do you mean for installing it locally? Or for using it on the website?


I was about to ask the same thing, didn't realize until your answer, that it is web-based? A local offline application would be great!

(also maybe.. is making an account necessary?)

And one more question: can it also pull multi-page documentation, or just single URLs? I assume it downloads websites and makes them available offline, or is that a wrong assumption?


It is web based. That way you can read on the computer and continue reading on the phone or any other device.

A version that runs just locally on the computer would also work, but that wasn't the main focus of this particular iteration.

I haven't built a robust offline functionality yet. That's something I'm thinking of doing in the near future. Was thinking of having your X most recent articles cached.

It parses the content of articles and stores the text in a db. Then as you highlight and add notes, the text gets updated with the annotations.

It doesn't pull multipage documentation. Each url should be added manually.

At least for now.
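
For readers curious what "stores the text in a db" and "the text gets updated with the annotations" might look like, here is a minimal sketch using Python's built-in sqlite3. The table layout, the function names, and the choice of a `<mark>` tag are all assumptions made for illustration; this is not Lurnby's actual schema or code.

```python
import sqlite3


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with a single hypothetical articles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, url TEXT, content TEXT)"
    )
    return conn


def add_article(conn: sqlite3.Connection, url: str, content: str) -> int:
    """Store the parsed article text and return its row id."""
    cur = conn.execute(
        "INSERT INTO articles (url, content) VALUES (?, ?)", (url, content)
    )
    conn.commit()
    return cur.lastrowid


def add_highlight(conn: sqlite3.Connection, article_id: int, passage: str) -> str:
    """Wrap the first occurrence of `passage` in a <mark> tag and save the result."""
    row = conn.execute(
        "SELECT content FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"no article with id {article_id}")
    content = row[0]
    if passage not in content:
        raise ValueError("passage not found in article text")
    # Only the first match is wrapped; a real tool would need position info.
    updated = content.replace(passage, f"<mark>{passage}</mark>", 1)
    conn.execute(
        "UPDATE articles SET content = ? WHERE id = ?", (updated, article_id)
    )
    conn.commit()
    return updated
```

Storing the annotated text in place keeps rendering simple, at the cost of making it harder to separate highlights from the article later.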


I would get a lot of value out of a tool like this. I have a strong preference for a portable executable and file-based storage. Performance can be a problem with those, but I want to use my existing file management and synchronization tools as much as possible.

If that's not possible, then my second preference is a docker-based deployment on Unraid. That's the easiest way for me to self-host. Nice to be able to point to an existing DB for that too.


Flask is pretty lightweight and will run under anything. Including compiling to a portable executable.

I mean, potentially.

I'm 100% with you on file-based storage and flexible storage interchange APIs. I have a bunch of stuff I'd like this kind of tool to interoperate with.


Thanks for the detailed answer, starred anyway! :)


This looks really useful, as it solves a frequent concern I have: I read lots of articles but gradually important details drift from my recall.

Great to see the suggestions people have had (letting people see the product, giving install details etc) and you've been super responsive.

You mention in replies below about integrating with other tools as a way for it to work. I don't know how feasible this might be, but a tool I've been keen to use that it might fit well with is Archive Box: https://github.com/ArchiveBox/ArchiveBox

If you could integrate with something like that, it would focus on managing the content and this would be about the learning/recall testing. Is that the kind of approach you'd be aiming for?


Thanks for the kind words. I would need to look over their tool in detail (it looks great btw) to really think through the details of how an integration might work.

One of the things I want to add to the application soon is some sort of "Post" function. Wherein you write something that synthesizes data from multiple sources and you can add in your highlights as quotes.

So really just a dedicated writing feature. This is another thing that's proven to increase comprehension and retention, synthesizing some learning into a blogpost or something like that.

So with integrations, that's the kind of thing that is perfect for integration. Because although I want to add it myself, there's no way that I would make something as dedicated and enjoyable as a tool that's only concentrating on the writing and notetaking experience.

As is allowing people to read elsewhere and import their highlights.

As is allowing people to review in some other place.

So I guess my perspective is that I want to offer a total experience, that's completely non binding and allows you to get the data out at any point to move it to your app of choice.

sorry, I rambled. Not sure that made a whole lot of sense :D


I'm working on a similar project for a similar concept that involves spacing reading material called Incremental Reading[1] so it was a pleasant surprise seeing your project, great work.

For exporting to Anki, I suggest you look at Anki-Connect[2] if you haven't already.

[1]: http://super-memory.com/help/read.htm

[2]: https://foosoft.net/projects/anki-connect/
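
As a rough idea of what an export through Anki-Connect could look like: the add-on listens on a local HTTP port and accepts JSON requests such as `addNote`. The sketch below assumes the add-on's default address, a deck called "Default", and the stock "Basic" note type with "Front" and "Back" fields; the helper names and the "lurnby" tag are made up for illustration.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # Anki-Connect's default address


def build_add_note(highlight: str, source: str, deck: str = "Default") -> dict:
    """Build an Anki-Connect `addNote` request for one highlight."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": "Basic",  # assumes the stock note type exists
                "fields": {"Front": source, "Back": highlight},
                "tags": ["lurnby"],  # hypothetical tag
            }
        },
    }


def send(payload: dict, url: str = ANKI_CONNECT_URL) -> dict:
    """POST the payload to a running Anki instance with Anki-Connect installed."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_add_note(...))` only works while Anki is open on the same machine, so this suits a local script more than a hosted web app.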


Do you use Supermemo or Anki? Is your project related to the Incremental Reading Anki plugin?

https://ankiweb.net/shared/info/935264945


I have been using Anki for the past 6 months (2021-Jul to present); before that I used SuperMemo for 9 months (2020-Oct to 2021-Jul). The reasons I switched to Anki are plenty (Linux/MathJax support being the top ones).

My project is not related to the Incremental Reading Anki plugin.


Cool! I'd love to know more about your project when it's ready to share :)


sweeet. Thank you! Will look at those. And I'm really happy that this whole space is finally getting attention.


You are welcome, I'm happy about it too, especially with others mentioning similar projects to check out.


Fantastic! I'll have a deeper look and see if there's any opportunities for integrating this into https://learnawesome.org (which is also open-source).

One of the most frustrating things about the current generation of note-taking apps has been that either they are available as an offline app, or as a VS Code plugin. Many of them have turned into VC-funded startups, keeping the web version as a proprietary product and blindly following Roam.


Learnawesome is very nice. Curating learning paths is a real pain these days. It's amazing that there's so much information available, but it's now almost impossible to make decisions. So I think there's a lot of important work to be done there to make it easier for people to reskill and keep learning.

Do you mean that they aren't available as offline or VS Code plugins? I personally use Bear locally. I find its minimalism refreshing.


Hey Rostislav,

Author of https://fluentcards.com and https://germanreader.com here. I really like some of the ideas you implemented in the app.

Your landing page and the new features vlog are amazing! You’re clearly very dedicated to the project and it has a cute personal touch. Also kudos for open-sourcing it!

Looking forward to trying out your app more deeply. Cheers!


Thanks for the kind words!

I didn't realize kindle kept track of your dictionary lookups. That's a cool thing and makes a lot of sense to review those.


https://hypothes.is/ is a similar open source and free tool that doesn't get enough mention. It also has an API that I used to create a job that sends me an email every morning with highlights and notes I should revise.
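
A job like that could be sketched roughly as follows. This is not the commenter's script; it assumes the public search endpoint, a personal API token, and the response fields (`rows`, `uri`, `text`, and a `TextQuoteSelector` with an `exact` quote) as I understand the Hypothesis API. The function names are hypothetical, and sending the email itself is left out.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.hypothes.is/api/search"


def fetch_annotations(username: str, token: str, limit: int = 20) -> list:
    """Fetch a user's recent annotations (requires a Hypothesis API token)."""
    query = urllib.parse.urlencode(
        {"user": f"acct:{username}@hypothes.is", "limit": limit}
    )
    request = urllib.request.Request(
        f"{API_URL}?{query}", headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["rows"]


def quoted_text(annotation: dict) -> str:
    """Return the highlighted passage from the TextQuoteSelector, if there is one."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def format_digest(annotations: list) -> str:
    """Render annotations as a plain-text email body."""
    entries = []
    for annotation in annotations:
        lines = [annotation.get("uri", "")]
        quote = quoted_text(annotation)
        if quote:
            lines.append(f'  "{quote}"')
        note = annotation.get("text", "")
        if note:
            lines.append(f"  Note: {note}")
        entries.append("\n".join(lines))
    return "\n\n".join(entries)
```

The digest string could then be handed to `smtplib` or any mail service from a daily cron job.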


I remember looking at that. They are indeed really cool. What do you mean with notes and highlights you should revise? What kind of revisions are you doing?


> add epubs, pdfs, and web articles to the app => highlight and add comments => tag and organize highlights => review them with a spaced repetition system

This would make way more sense if it was an add-on to existing information/knowledge management software like Zotero. The base software would take care of the "add documents to the app" workflow, and the "highlight excerpts and use them to build cloze-testing SRS cards" thing could be an added feature. Having everything be a single "app" can be too limiting at times.


Thanks!

I agree with you. My goal isn't to actually make it a closed system like this. I ultimately want to make it as easy as possible for you ppl to get content in and out of the tool.

Lurnby doesn't have the NICEST reading experience possible. It's better in some aspects, but struggles in others. The plan was always to figure out how to allow people to read wherever they are comfortable and still make use of the tool to facilitate memory and retention.

First step is to decouple highlights from in-app articles, and allow them to be linked to external sources instead.

Then it's just a matter of importing them into Lurnby. I actually set the foundation for that already with another script I wrote recently. https://github.com/Roznoshchik/spreadsheet-importer

This would also allow the web extensions to be more useful and allow sending just highlights instead of articles.

My goal is to make it flexible to integrate with any existing workflow, but also be self sufficient for people who don't use any existing tools.

But it's a long road to get there.


Zotero is probably not the best example here. It's not information/knowledge management software. https://en.wikipedia.org/wiki/Zotero


Add any issues you know about, or todos you have, to the project's Issues on GitHub. Add "help wanted" and "good first issue" labels to these issues, as appropriate. Add a "Contributing" section to the README to help new contributors get started, pointing to https://github.com/Roznoshchik/Lurnby/issues?q=is%3Aopen+is%...

What's your plan with lurnby.com?


Thanks! I actually haven't contributed to any open source projects myself. So this would be my first time figuring out what's involved.

This helps give some structure for what I should do to make it easier for ppl to engage. Although I have to get organized enough to outline all of those things first. :D

Plan for Lurnby.com is to keep it up, assuming the hosting costs don't spiral out of control.


Hah, i am working on a near identical app. How did you deal with pulling content from the web without reinventing wheels?

I was thinking about reusing an existing store of scraping instructions, InstantView by Telegram (iirc that's the name). It looks fairly straightforward to write a parser for that spec. However, I wanted to be able to store some in a repo of my own as well, but I fear DMCA strikes on a repo that stores instructions to scrape pages.

How did you solve this?


I used mozilla's Readability.js as the base. So didn't reinvent anything really :D


Since I'm working on a similar project, this is how I am planning to pull content from the web: utilizing percollate[1] to get the HTML content. I haven't written any implementation for this in Python yet.

If you don't mind me asking, how were you going to implement spaced repetition? Since the Incremental Reading algorithm has never been published as far as I know.

[1]: https://github.com/danburzo/percollate


I wanted to dogfood and experiment on most of the things, from design to SR algo.

My primary goal was to make something I wanted, which means lots of experimentation across the board. I also want to make a very general purpose/flexible system where you can tweak the underlying SR algo based on the type of knowledge. I.e., I want to store music scores, as well as units of fact, and the SR algo or algorithms should support either.

Pulling content (Reader) seemed the hard part to me, mostly because of legality concerns.


The idea looks amazing. I'm still looking for something that allows me to highlight stuff on web pages, preferably on my e-reader. My current way of accomplishing this is by running a page through Mercury Parser, turning it into an epub, syncing it to my reader and reading it with KOReader.

It would be nice if the whole process was easier, especially getting the highlights back.


This was roughly what I wanted as well. I had an Onyx Boox e ink tablet. I was adding sites to lurnby and then reading them on the Boox to use the Lurnby features. But the Boox was just too slow and unresponsive for this type of thing, I was really unsatisfied at some point.

How do you get the highlights back currently?


> But the Boox was just too slow and unresponsive

Consider me surprised! I find mine (Nova 2) very fast. The e-ink page turns are kind of uncomfortably fast, even (recently realized I prefer my old Kindle's slower blending). Admittedly the UI can be a little quirky sometimes, which I suppose is what bothers you.

> How do you get the highlights back currently?

I use Syncthing for two-way sync, so that's how the highlights land back on my computer. Buuuuut ... I don't do anything with them. I recently discovered that the KOReader highlight data format isn't that great anyway, so I still wonder if there's some kind of standard format for processing them further.


Yeah, I think it's just about starting the thing up. It takes like 30 seconds to wake up and become responsive, and then indeed the UI was a bit unintuitive sometimes, so overall it leaves a lot to be desired. I think it's decent as digital paper, and was decent to read on through the native pdf viewer and stuff, but using the android functions just wasn't great.

Re the highlights

How would you want to process them? Are you turning them into flashcards or just adding to a doc somewhere?


Not the person you were replying to, but I've been after something like this for quite awhile too.

I use Logseq as my external brain for everything else, but am having a lot of trouble finding a good way to get my highlights and annotations from epubs into it. It has a great cloze feature with flashcards for reviewing things. It can also do highlights and notes from PDFs natively, which is awesome, but I'm a voracious epub reader on my Onyx Boox Poke 2 Color.

KOReader works great on it, but I really need something that would export my notes in md. I don't need it to do so on a daily basis. It'd be fine for it to be at the end of reading the book. But if it COULD export the notes on a daily basis with a block reference to the book title, that'd be even better.
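
To make the "export my notes in md" idea concrete, here is a small sketch that turns a list of highlights into a nested bullet outline, with the book title as a `[[page reference]]` in the style Logseq uses. The input shape (dicts with `text` and an optional `note`) is an assumption for illustration, not KOReader's or Lurnby's actual format.

```python
def highlights_to_markdown(book_title: str, highlights: list) -> str:
    """Render highlights as a nested bullet outline under a [[page reference]]."""
    lines = [f"- [[{book_title}]]"]
    for item in highlights:
        # Each highlight becomes a child block of the book reference.
        lines.append(f"  - {item['text']}")
        note = item.get("note")
        if note:
            # A note, when present, is nested under its highlight.
            lines.append(f"    - {note}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a dated file would cover the daily-export case; running it once per finished book would cover the other.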


Hey, Logseq looks really cool. I really love that there are so many options nowadays for knowledge management.

It's interesting to hear from more people that their onyx device works great, I've really struggled with making mine a part of my workflow. Also, I had no idea they now had color options. That's kinda cool!

Re the export to MD. I don't see that as a problem, the question really is about file formatting.

1. Do you have an example md file that's formatted in a way that would work for importing to logseq?

2. Export on a daily basis - would this go via api to somewhere? Or manually send you a download file on a daily basis? What's the idea here?


Nice! Archivy [0] has some of the same goals with integration of web content, also using Readability. Instead of going the highlight route, each article becomes a note in your knowledge base you can edit / add to.

It's really exciting to open source a project, Lurnby looks cool! What do you use the spaced repetition for?

[0]: https://archivy.github.io


Woah. Very cool. You seem to make it really easy for people to get started and tick a lot of the boxes around privacy.

Today's been a bit of a nervous day for me actually haha, I didn't really expect this attention. Open sourcing just felt like the right thing to do for this app and because I just don't have all the skills :D

The spaced repetition at the moment is for the highlights. All highlights get marked at an initial level 0 and then move up or down depending on you reviewing them.

Eventually gets to the point where a highlight is shown to you yearly.

In the future I was thinking of doing some spacing around finished books or articles. Things like - you finished reading X 10 days ago, what do you remember?

But don't think I've thought through the details around that enough.
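
The level scheme described above (start at level 0, move up or down on review, top out at a yearly interval) can be sketched in a few lines. The interval table and function names below are assumptions for illustration; the actual intervals Lurnby uses are not stated in this thread.

```python
from datetime import date, timedelta

# Hypothetical interval table (days per level); the last level is yearly.
INTERVALS = [1, 3, 7, 14, 30, 90, 180, 365]


def next_level(level: int, remembered: bool) -> int:
    """Move up one level on success, down one on failure, within bounds."""
    step = 1 if remembered else -1
    return max(0, min(len(INTERVALS) - 1, level + step))


def next_review(level: int, reviewed_on: date) -> date:
    """Return the date a highlight at `level` is due again."""
    return reviewed_on + timedelta(days=INTERVALS[level])
```

A fixed table like this is closer to a Leitner box than to an adaptive algorithm such as SM-2, which adjusts intervals per item.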


https://getpolarized.io/ seems like it's in the same space - it's a product I wanted to love, but was a bit clunky to use and didn't end up sticking in my workflow.


Woah. Didn't know about this. I'm honestly amazed at how many similar things I've discovered that I didn't know about when I started this. I did look for similar tools, but didn't really surface that many of them. Not sure I would have ever started had I known the space was actually not as empty as it seemed.

Would love to know what about the flow was clunky and why it didn't fit your workflow. Might be a lot that I can learn from that.


This was a while ago, but IIRC there were some fairly major software-reliability rough edges (frequent crashing/bugs/etc). Beyond that the major shortcoming was that it was constrained to pdf's only, and that's only one part of my reading workflow.

For a while I was trying to sync a Zotero library (since I need citation management when writing papers), and do all my academic reading (which is mostly pdf's) in Polar, but it was just a bit too much overhead to stick with.


Oh interesting. PDFs are what's hardest for me. I need a completely different approach to handling them than I have been using for web content and epubs.

Thanks for sharing your story, it helps to put things in perspective. I wonder how much they've improved since you used them though. The website looks very sleek.


Awesome! I've been thinking about how to do it. Like if I highlight from Kindle or Calibre, how do I get them to Anki? However, as an avid Anki user for language studying, one issue I found is that the most difficult part is keeping up with the sheer amount of review from Anki. I could have hundreds of reviews, but not enough time to study them. And I would imagine reading would generate a lot of highlights users want to remember, which would make it much more difficult. So now I am thinking the filtering and condensing parts of such a method should be emphasized.


Yeah, I feel this pain. I don't have any really great ideas on this front. I've started to have different review sessions using the filter function in Lurnby, but it doesn't really solve the fact that the amount to review just continues to grow.

On that end, having been using this tool and process for about a year now, I've come to the feeling that I shouldn't worry about that and just think of it as a part of my daily process.

Rather than being goal oriented about remembering something, I've just started to trust that the information is in the system and the system works. Things naturally and serendipitously appear in my review feed and my memory and retention of them improve, but in a less planned way than it might be with a more focused process.


What is the neuroscience research upon which the techniques are based? Do you have any paper or book references? It could be insightful for a visitor to the site to be given more context.


Hey, I don't have any specific papers. And I think neuroscience makes it sound fancier than it maybe is in reality. But the primary concepts are spaced-repetition, elaborative encoding, and active recall. And the main focus is really on reducing friction with putting these practices into play.


I think the reality of neuroscience and behavioral research at this point in time is that it's just really difficult to translate neuro -> behavior and thought processes. There are some decently well researched learning methods, such as spaced repetition and enhancing depth of knowledge but we don't have a totally clear picture on why these things work well.

There are suspicions of course, such as how deeper knowledge of a subject is able to integrate the information into more parts of the network, but afaik the actual biochemical mechanisms and how those translate into network dynamics and recall for a lot of memory and learning functions are still fairly unknown.

If anyone knows of some solid studies (preferably using humans) I would be more than happy to read them, learning and memory is a fascinating area of neuroscience.


Hopefully, with relatively new equipment that allows more precise realtime brain sensing, we should have more insightful research in the years to come.[0] But the issue of defining learning itself will remain thorny.

[0] https://www.kernel.com/


Here's a reasonable overview: https://www.pnas.org/content/116/10/3988

I'm not sure that one touches on neuroscience; it's more psych-oriented if I remember correctly.


You know, the original Readability was in Python. You could eliminate your Node dependency by moving to it, I'd think.


I didn't know. I thought it was based on Arc90's original code from 2010, but I imagined that was js.

I initially was using the python port by ReadabiliPy, but it wasn't working as well as the js code in most of my tests, so I just switched.

But yeah, open to exploring different options to make it less complex.


Huh, you are correct. I guess a better way to put this is "the original Readability I encountered was in Python"! The first version I saw was in Aaron Swartz's 2012 read2text tool, but a check of the URL I found that through says, yup, it's a Python port of Arc90's original code, which was a browser extension.

And you're right. It was in JavaScript. I finally tracked a copy down (the original is long evaporated): https://github.com/masukomi/arc90-readability/blob/master/js...


Looks interesting, some docker setup instructions would help for quick testing


Hey, there's no docker setup just yet. I'll try to get to that this week but for now it's just the standard python install.


Thanks. Could you please add a dockerfile for faster installation ?


Will try to get it done in the next few days. :D


That sounds nice! I'll try it.

Is it possible to then export the cards to Anki?


You can export yes, not entirely sure if you can export to Anki.

You can export in either txt or json.

And you can export 3 ways.

1. All highlights & notes from an article (article = epub / web article)

2. All highlights & notes from a particular topic/tag

3. All highlights & notes from your entire account organized by article.

I think that's how it works. Wrote the export code a while ago so the details escape me at the moment.
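
As an illustration of the third option (everything organized by article, as JSON) plus the tag filter from the second, here is a minimal sketch. The record fields (`article`, `text`, `note`, `tags`) and function names are assumptions, not the app's real export code.

```python
import json
from collections import defaultdict


def filter_by_tag(highlights: list, tag: str) -> list:
    """Keep only the highlights that carry the given tag."""
    return [item for item in highlights if tag in item.get("tags", [])]


def export_by_article(highlights: list) -> str:
    """Group highlights by article title and serialize them as JSON."""
    grouped = defaultdict(list)
    for item in highlights:
        grouped[item["article"]].append(
            {
                "text": item["text"],
                "note": item.get("note", ""),
                "tags": item.get("tags", []),
            }
        )
    return json.dumps(grouped, indent=2)
```

A JSON file in this shape would be easy to post-process into a tab-separated text file, which is one of the formats Anki can import.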


I installed the firefox addon, and was hoping that it can deal with paywalled content. It seems to me that lurnby is accessing the content from the server side and got nothing.

If lurnby could access the current DOM in the current browser tab via the browser addon and extract the content via readability.js, I am willing to be a paying customer. I believe this is technically possible.


Hey, thanks. I'll look into that. The way I've been personally dealing with paywalled content is just copy and paste. I use the "manual" option from "add content" and then save it that way.

Alternatively, I also email things to the app - the subject becomes the title, the body becomes the article.

But yeah, considering that the goal is to reduce friction, the manual work isn't ideal, but it gets the job done.

I'll try to take a look at what's possible with getting the current dom.
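
The email route ("the subject becomes the title, the body becomes the article") maps neatly onto Python's standard email package. The sketch below is a guess at the general shape, not Lurnby's implementation; the function name, the returned dict keys, and the "Untitled" fallback are assumptions.

```python
from email import message_from_string, policy


def email_to_article(raw_email: str) -> dict:
    """Turn a raw email into an article: subject -> title, plain-text body -> content."""
    msg = message_from_string(raw_email, policy=policy.default)
    # Prefer the plain-text part; HTML-only mail would need separate handling.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {
        "title": str(msg["Subject"] or "Untitled"),
        "content": body.strip(),
    }
```

The same function works for multipart messages, since `get_body` picks out the plain-text part when one exists.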



