- home assistant (just a few currently, more soon)
- paperless: absolutely awesome document management system
- immich: image management with automatic sync of the photos taken on my phone (plus ML features)
- tailscale
- StirlingPDF: simple tools for all things PDF
I really like your idea and the clean, simple execution! I'm looking forward to using it more in the coming days.
I'm curious about the technical side of things—how did you build the app, and how do its inner workings function? Is it open source? If possible, could you share more technical details?
We source the book cover from the publisher, create author entries, etc.
Eventually, we started paying Nielsen a lot of money to use their Book metadata API. It is "ok," though they don't update it often. But it helps us automatically pull in an author's name, book title, genres, and age group.
We still manually source the book covers as they only have super small and blurry covers. And we screen every book we add to make sure the data is correct.
What is especially frustrating?
- Author names are text and not linked in any way. So we have to decide what is a slightly different name but the same author and what is a different author.
- The BISAC genre standard is full of errors and abuse by publishers. For example, they might tag Dune as "AI", which also marks it as nonfiction, because they don't know how the BISAC standard they created works :).
- It lists book editions and has no concept that all book editions belong to one book.
- No real concept of a book series.
- Terrible book descriptions where publishers put in all kinds of reviews and nonsense that we need to figure out how to scrub eventually. They also abuse weird symbols to make the descriptions stand out.
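To make the author-name and edition problems above concrete, here is a rough sketch of the kind of matching involved. It is not how Shepherd actually does it; the function names, the record fields (`author`, `title`, `isbn`), and the ISBN values are all made up for illustration. It only catches the easy cases (name order, punctuation, accents), and two different authors with the same name would still be merged, so a human check is still needed.

```python
import re
import unicodedata
from collections import defaultdict


def _simplify(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def author_key(name: str) -> str:
    """Crude comparison key for an author name (hypothetical helper)."""
    # Turn "Herbert, Frank" into "Frank Herbert" before simplifying.
    if "," in name:
        last, _, first = name.partition(",")
        name = f"{first} {last}"
    return _simplify(name)


def group_editions(records):
    """Group edition records under one (author, title) 'book' key."""
    books = defaultdict(list)
    for rec in records:
        key = (author_key(rec["author"]), _simplify(rec["title"]))
        books[key].append(rec["isbn"])
    return dict(books)


# Placeholder records; the ISBN strings are not real identifiers.
editions = [
    {"author": "Herbert, Frank", "title": "Dune", "isbn": "ISBN-A"},
    {"author": "Frank Herbert", "title": "DUNE", "isbn": "ISBN-B"},
    {"author": "J.R.R. Tolkien", "title": "The Hobbit", "isbn": "ISBN-C"},
]
books = group_editions(editions)
```

Here both Dune records end up under the same key, while The Hobbit stays separate. Subtitles, translations, and pen names would all need extra rules on top of this.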
It requires a lot of work to fix and manage all of this.
I am about to redo the entire topic/genre system due to some of these problems (this winter).
I am hoping to build a database of all books to use with new features in 2025. I don't know what we are going to do here. We might license a full DB of books from Ingram (expensive) or Bowker. I liked Ingram, but Bowker didn't email me back for months, which makes me worry about working with them in the future. I might just do the best I can or break down and use Amazon's API (lots of stipulations in using it).
When I asked you, I already suspected that obtaining the data would be a real challenge. It's actually unbelievable that you have to put in so much (manual) effort to get clean data, and then pay a lot of money for it, too. It's a bit unfortunate.
I do have one more question: Shepherd.com seems to be specifically tailored for the American book market, right? Are you planning anything in the future to serve an international audience as well?
Since I’m from Germany, I looked into how things are here regarding book data. It seems like we might have it a bit easier when it comes to accessing book information. We have the VLB (https://vlb.de/en/), which provides book data—I’ll look into it more closely. The German National Library (probably similar to the Library of Congress) archives every book published in Germany! Their data might be quite useful too.
By the way, I didn’t realize it was so expensive to get an ISBN...
Wishing you continued success with Shepherd! I think it’s wonderful when someone pours so much passion into an idea—especially one focused on books!
It is frustrating, as this info should be free and accurate. Publishers only benefit if devs can play with it and build cool stuff. Google Books does have an API, but it comes with crazy rules; it can be used for private projects.
Ya, I am focused on the global English market, so USA, UK, Australia, Ireland, India, and English-speaking readers globally (a lot in Europe).
I would love to do other languages, but I just don't have the resources yet to handle that level of complexity. My hope is one day I can. I'd probably start with Spanish, French, or German.