
Probably more likely inefficient data access patterns, such as N+1 queries, inefficient algorithms, or a lack of parallelism.


It is super interesting! I was at a talk where a genetics company used the Smith-Waterman algorithm (1) to perform sequence alignment. Still, I believe it took a long time (24 hours?) to run the calculations. They were working on optimizing the code, since doing so could mean a faster turnaround.

1. https://en.m.wikipedia.org/wiki/Smith–Waterman_algorithm
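
For the curious, the core of Smith-Waterman is a small dynamic program. A minimal Python sketch (toy scoring values and a linear gap penalty; real aligners use affine gaps and heavy SIMD optimization):

    def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
        # H[i][j] = best local-alignment score for prefixes a[:i], b[:j];
        # the zero floor is what makes the alignment "local".
        H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        best = 0
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
                H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
                best = max(best, H[i][j])
        return best  # score of the best local alignment

    print(smith_waterman("TGTTACGG", "GGTTGACTA"))

It's O(n*m) in both time and memory, which is why aligning long sequences this way takes so long without heuristics or hardware tricks.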


I did an experiment switching an app from Postgres to SQLite. I ran into issues where one process held a long-running write transaction while another process was doing reads and writes. As soon as the second process tried to do any writes, its queries would often hit locks and abort. Otherwise it worked great for reads and sped the app up a lot, since there were lots of N+1 query issues.

This was also after playing with the WAL mode settings.
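
For context, what I was trying looked roughly like this (a minimal sketch; the items table is made up):

    import sqlite3

    # Autocommit mode so we control transactions explicitly.
    conn = sqlite3.connect("app.db", isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")   # readers no longer block the writer
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5s instead of failing instantly

    # BEGIN IMMEDIATE takes the write lock up front, so the transaction
    # waits (or fails fast) here rather than aborting midway through.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("UPDATE items SET views = views + 1 WHERE id = ?", (42,))
        conn.execute("COMMIT")
    except sqlite3.OperationalError:
        conn.execute("ROLLBACK")
        raise

Even with all of that, only one writer can be active at a time, which is exactly where BEGIN CONCURRENT or hctree would help.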

Splitting these tables into separate databases to avoid the concurrency issue would not be viable. Also, most of the issues involve writes to the same tables.

Is there honestly any way to get this working well with current SQLite, or do we need to wait for hctree or BEGIN CONCURRENT to mature?

I would be concerned that getting this working in SQLite's current state would require a large refactoring of the app, which would take a lot of effort and introduce limitations that make coding more difficult.


One way to get the best of both worlds, when you have complicated read queries, is to create database views for them. You can then define each view as an unmanaged model in Django (this requires some boilerplate, sketched below) and query it as simply as if it were a normal ORM model.
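
A minimal sketch of that boilerplate (view name and columns are made up; the view itself is created separately, in a migration or in plain SQL):

    from django.db import models

    # Assumes a view created elsewhere, e.g.:
    #   CREATE VIEW order_summary AS
    #   SELECT o.id, o.customer_id, SUM(li.amount) AS total
    #   FROM orders o JOIN line_items li ON li.order_id = o.id
    #   GROUP BY o.id, o.customer_id;
    class OrderSummary(models.Model):
        id = models.IntegerField(primary_key=True)  # the view must expose a unique column
        customer_id = models.IntegerField()
        total = models.DecimalField(max_digits=12, decimal_places=2)

        class Meta:
            managed = False               # Django won't create or migrate this "table"
            db_table = "order_summary"    # points at the view

    # Then query it like any other model:
    # OrderSummary.objects.filter(total__gt=100)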


Similarly, we have amazingly powerful and reliable computer hardware, yet we continue to have software that is extremely slow, buggy, and hard to use.

I theorize that because software is so invisible and intangible, many people don't appreciate these complexities.


Has anyone tried any of the alternative full-text search extensions mentioned? ZomboDB sounds good, except I'd prefer that it didn't work synchronously and fail transactions when there's a network error talking to Elasticsearch. I think most people will use Postgres' built-in FTS up to a point, or go straight to an application-level integration with Elasticsearch.


> That's where things got a bit hairy. Lila is built on Play Framework which is not yet ported to Scala 3.

> So I forked it and butchered it to remove everything we don't need - which is actually most of the framework.

I guess there is hope that the Play framework itself will be migrated to Scala 3 so the dependency on the fork can be removed, but this is taking on a risk: what if there are security updates to the upstream in the meantime?


I think the plan might be to get rid of the Play framework altogether and instead use a couple of small, independent libraries to achieve the same thing.

The likelihood of future updates delivering security fixes is about the same as the likelihood of them introducing new security holes. And butchering away everything they didn't need certainly didn't hurt security.


Isn't Lightbend more or less falling apart? I thought they announced they would no longer publish changes to Play.


Play was donated to the community a year ago, and the community has been reasonably active lately. They secured some funding, and development has picked up a good pace.


About 15 years ago, my brother used to run a SHOUTcast (Internet radio) server. He wasn't getting many listeners, since the list of stations people browse was sorted by popularity, i.e. number of listeners. So I disassembled and then hex-edited the SHOUTcast server binary so that the initial listener count was 60-something (which meant the count could never drop below that). Then he actually started getting a few listeners :)


You fixed the cold start problem! Not a bad idea :)


I was just curious whether I could improve our PHP-based site's performance. So I attached strace to an Apache process, followed the log of syscalls, and counted the milliseconds between them. Sure enough, I discovered that 20 ms was being spent each time on a DNS lookup for a statsd metrics-collector service reached over UDP (I remember being told this was lightweight, since it was UDP). PHP didn't cache DNS lookups, and this sometimes happened many times per request. I added a static entry in /etc/hosts, and overall latency improved by 30% across all endpoints.
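
(The fix itself was just one line in /etc/hosts; the IP and hostname here are made up.)

    10.0.12.34   statsd.internal.example.com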

Another hack: I was once consulting for a client who was running Drupal and was going to launch their new site the next day, but suddenly it started crashing on some of the pages. I found out that you can take a core dump of Apache and load it into gdb. Then, by running some gdb macros, you can see the PHP stack trace at the time the crash occurred. Turns out it was some module (tokens?) they had recently enabled which was recursively calling into itself. I'm not sure why it didn't hit a stack limit, though. We disabled the module, which fixed it, and the client was super happy. If I had known more about Drupal, I probably would have disabled modules in a binary-search fashion as a first troubleshooting step. But I did know a little about gdb, so that came in handy.


Haha, that made my day! (It’s a reference to That Mitchell and Webb Look).

