
Let's say I have a content website with about 20k content pages. I want to automatically cluster the pages so that each page links to its related content. Right now I'm using a hacked-together tf–idf setup built with sklearn and Python 2, and it just works. The downsides are that I have to recompute everything offline whenever I add new content, and that it's one more thing to maintain and upgrade.

I'm wondering if anyone has a suggestion of a SaaS or another alternative for my use case? Thanks!
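
For readers unfamiliar with the setup, here is a minimal sketch of the kind of tf–idf related-page lookup described above. It is not the actual code: it uses only the Python 3 standard library instead of sklearn, and the function names, page ids, and sample texts are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on whitespace,
    # keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """docs maps page_id -> text. Returns page_id -> {term: weight}, L2-normalized."""
    tokens = {pid: tokenize(text) for pid, text in docs.items()}
    n_docs = len(docs)

    # Document frequency: in how many pages each term appears.
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))

    vectors = {}
    for pid, toks in tokens.items():
        tf = Counter(toks)
        # Raw term count times a smoothed idf.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors[pid] = {term: w / norm for term, w in vec.items()}
    return vectors


def related(vectors, page_id, k=3):
    """Return the k page ids most similar to page_id by cosine similarity."""
    target = vectors[page_id]
    scores = []
    for other, vec in vectors.items():
        if other == page_id:
            continue
        # Vectors are unit length, so the dot product is the cosine similarity.
        sim = sum(w * vec.get(term, 0.0) for term, w in target.items())
        scores.append((sim, other))
    scores.sort(reverse=True)
    return [other for _, other in scores[:k]]


# Hypothetical sample pages.
pages = {
    "a": "postgres vector search index",
    "b": "postgres index tuning",
    "c": "baking sourdough bread",
    "d": "sourdough bread starter",
}
vectors = tfidf_vectors(pages)
```

With these sample pages, `related(vectors, "a", k=1)` picks page `"b"`, since those two share the terms "postgres" and "index". The idf weights depend on the whole corpus, which is why adding a page means recomputing everything.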



I think PostgreSQL is your friend in this case, with the pgvector extension: https://github.com/ankane/pgvector


Thank you, will have a look!


Python2?


Yes, it's been running for about 10 years.



