Quick tip when working with Claude Code and Git: when you're happy with an intermediate result, stage the changes by running `git add` (no commit). That way you can always get back to the staged state when Claude messes up: just discard the unstaged changes instead of rolling back to the latest commit.
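In commands (assuming a reasonably recent Git; `git checkout -- .` does the same on older versions):

```shell
git add -A        # snapshot the good intermediate state into the index (no commit)
# ... let Claude keep working ...
git diff          # the unstaged diff is everything Claude changed since the snapshot
git restore .     # discard the unstaged changes and return to the staged state
```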
This type of service is becoming increasingly prevalent among European energy suppliers for their residential customers. Beyond providing a revenue stream for consumers, this model aggregates distributed energy resources (home batteries, EVs, PV systems) into a single virtual power plant. That makes it possible to store surplus energy generated during solar peaks and dispatch it back to the grid during periods of high demand. I find it a fascinating domain to work in!
Totally unrelated to the subject, but the screen recording in the article caught my attention: what tool does one use to create a screen recording with zoom and mouse tracking like that?
OTel seems complicated because different observability vendors make implementing observability super easy with their proprietary SDKs, agents and APIs. This is what OTel wants to solve, and I think the people behind it are doing a great job.
Also, kudos to Grafana for adopting OpenTelemetry as a first-class citizen of their ecosystem.
I've been pushing the use of Datadog for years, but their pricing is out of control for anyone between a mid-size company and a large enterprise. So as the years passed and the OpenTelemetry APIs and SDKs stabilized, it became our standard for application observability.
To be honest the documentation could be better overall and the onboarding docs differ per programming language, which is not ideal.
My current team is on a Node.js/TypeScript stack and we've created a set of packages and an example Grafana stack to get started with OpenTelemetry really quickly.
Maybe it’s useful to anyone here: https://github.com/zonneplan/open-telemetry-js
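For anyone who hasn't touched OpenTelemetry yet, a plain setup with the upstream Node SDK looks roughly like this (this is the vanilla `@opentelemetry/sdk-node` API rather than our packages; the service name and endpoint are placeholders for whatever your collector / Grafana stack exposes):

```typescript
// tracing.ts -- make sure this runs before the rest of the app is loaded
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

const sdk = new NodeSDK({
  serviceName: 'my-service',                         // ends up as service.name in Grafana/Tempo
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces',          // default OTLP/HTTP endpoint of a local collector
  }),
  instrumentations: [getNodeAutoInstrumentations()], // auto-instruments http, express, pg, etc.
});

sdk.start();
```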
> OTel seems complicated because different observability vendors make implementing observability super easy with their proprietary SDKs, agents and APIs. This is what OTel wants to solve, and I think the people behind it are doing a great job.
Wait... so, the problem is that everyone makes it super easy, and so this product solves that by being complicated? ;P
The problem is that they make it super easy in very hacky ways and it becomes painful to improve things without startup money.
Also, per the hackiness, it tends to have a visible perf impact. I know that with the Dynatrace agent we had 0-1 ms metrics pop up to 5-10 ms (this service had a lot of traffic, so it added up), and I'm pretty sure on the .NET side there are issues around the general performance of OTel. I also know some of the work/'fun' colleagues have had to endure to make OTel performant for their libs, despite the fact that it was a message-passing framework where that should be fairly simple...
Well, let's be fair: you can't get the type of telemetry Dynatrace provides "for free". You have to pay for it somewhere. Pretty sure you can exclude the agent from instrumenting performance-critical parts of the code, if that is your concern.
> I've been pushing the use of Datadog for years, but their pricing is out of control for anyone between a mid-size company and a large enterprise
Not a fan of Datadog vs. just plain good metric collection. OTOH, I do see the value of OTel vs. what I prefer to do... in theory.
My biggest problem with all of the APM vendors is that once you have kernel hooks via your magical agent, all sorts of fun things come up that developers can't explain.
My favorite example: At another shop we eventually adopted Dynatrace. Thankfully our app already had enough built-in metrics that a lead SRE considered it a 'model' for how to do instrumentation... I say that because, as soon as Dynatrace agents got installed on the app hosts, we started having various 'heisenbugs' requiring node restarts as well as a directly measured drop in performance. [0]
Ironically, the metrics saved us from grief, yet nobody had an idea how to fix it. ;_;
[0] - Curiously, the 'worst' one was MSSQL failovers on update somehow polluting our ADO.NET connection pools in a bad way...
> I say that because, as soon as Dynatrace agents got installed on the app hosts, we started having various 'heisenbugs' requiring node restarts
Our containers regularly fail due to vague LD_PRELOAD errors. Nobody has invested the time to figure out what the issue is because it usually goes away after restarting; the issue is intermittent and non-blocking, yet constant.
This seems like a great tool! I like how it "simply" works by putting a view on top of the actual table. This concept is widely used in the data engineering world (in dbt).
There must be an easier way to write migrations for pgroll though. I mean, JSON, really?
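To illustrate the view idea in plain SQL (just the general pattern, not pgroll's actual generated DDL): during e.g. a column rename, old and new application versions can keep working against the same physical table through versioned views:

```sql
-- the physical table already carries the new column name
ALTER TABLE customers RENAME COLUMN name TO full_name;

-- the old application version keeps seeing the old name via a view
CREATE SCHEMA app_v1;
CREATE VIEW app_v1.customers AS
  SELECT id, full_name AS name FROM customers;

-- the new application version sees the new name
CREATE SCHEMA app_v2;
CREATE VIEW app_v2.customers AS
  SELECT id, full_name FROM customers;
```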
What a nice writeup. I've listened to jungle and related styles on and off over the years, but I wasn't aware the music had played such a big role in '90s game music!
It's really one of those things you don't know applies to your Postgres DB until you run into a situation where it does.
The author explains this very well; it's a good read! I learned about this padding a little over a year ago, while designing a data-intensive application with a colleague. I was skeptical about the advantage at first, but for our specific design, where we have 100 to 480+ columns in one table, it makes a huge difference in table storage size. Not so much on the indexes, though.
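A toy example of the effect (not our actual schema): a `bigint` needs 8-byte alignment, so column order alone changes the physical row size:

```sql
-- bool, bigint, bool, bigint: each bigint is padded out to an 8-byte boundary
CREATE TABLE padded (a boolean, b bigint, c boolean, d bigint);
-- same columns, 8-byte types first: no padding between them
CREATE TABLE packed (b bigint, d bigint, a boolean, c boolean);

-- compare the size of an identical row in both layouts
SELECT pg_column_size(ROW(true, 1::bigint, true, 1::bigint)) AS padded_row,
       pg_column_size(ROW(1::bigint, 1::bigint, true, true)) AS packed_row;
-- padded_row comes out larger purely because of alignment padding
```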
Not OP, but in a similar situation. The main reason is that the primary tables represent various official declarations, and they have a lot of fields. In addition, our customers are required by law to keep the data readily available for 5-10 years. The law also states that our customers are responsible for the data they send through our systems, so they want to see what they're sending.
So just to give an example of how the field count "blows up": there can be up to 8 different parties related to a single declaration, and for each we want to have, for display/print purposes, our internal id, the id in the customer's system, organization number, name, address, postal code, location, country code, contact person, and the contact person's phone and email.
So we're talking about almost 100 fields just to store parties.
We want to store all that separately from, say, our "companies" table, as a company might change its name, location etc., and we're required to show/print the details as they are today, 5+ years down the line.
We could put that in a separate table, but that means 8 joins to fetch the data for a single record, just for the parties. And that's just one piece of the declarations; our customers want to see at least 3-4 of these parties in overview grids and be able to filter on them there.
Now tack on 20+ years of doing this, with various official systems coming and going with different technical requirements, and it adds up.
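To make that concrete, the table ends up shaped roughly like this (column names simplified, only one of the up-to-8 parties spelled out):

```sql
CREATE TABLE declarations (
  id                    bigint PRIMARY KEY,
  -- ... the declaration fields themselves ...
  -- then this block repeats for each of the up to 8 parties:
  party1_internal_id    bigint,
  party1_external_id    text,   -- id in the customer's system
  party1_org_number     text,
  party1_name           text,
  party1_address        text,
  party1_postal_code    text,
  party1_location       text,
  party1_country_code   text,
  party1_contact_person text,
  party1_contact_phone  text,
  party1_contact_email  text
  -- ... party2_*, party3_*, ..., party8_* ...
);
```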
Without seeing your schema, that sounds like a bit much, but maybe not. That is kind of the point of a relational DB.
I always attempt to normalize as far as possible first, and then denormalize IFF it's necessary for performance. Postgres does limit exhaustive join planning (join_collapse_limit defaults to 8, and above geqo_threshold, 12 by default, the planner shifts to the genetic algorithm, which may not find the optimal order), but I've still seen it do fine far above that.
If we normalized as much as possible, we'd be looking at 20-30 joins just for the main view of a declaration. The main overview grid would probably have 10+ joins. And keep in mind almost all of these would have data, so instead of inserting one row you're now inserting 20-30 rows plus updating the corresponding foreign key indexes.
I think performance would be pretty crap, and the developer experience as well.
You do not want to do this, I assure you. The TOAST / DE-TOAST overhead alone would wreck performance, plus OP said it was update-heavy; Postgres can’t update a JSON object on disk, it just rewrites the entire blob.
I'm not sure; there are also HOT updates (heap-only tuples). It's an optimization where the new row version is written to the same page without touching the indexes, which applies as long as none of the modified columns are indexed and there's enough free space on that page.
I strongly disagree. This is an excellent application for JSONB. The user outlined that the records are for historical compliance purposes and therefore aren't frequently updated after the initial phase.
A schema of `party1 JSONB, party2 JSONB, ...`, or even `parties JSONB` would likely be far more maintainable and sensible than the "table-as-spreadsheet" approach.
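A rough sketch of what I mean (hypothetical names; whether it holds up depends on the grid-filtering requirements mentioned upthread):

```sql
CREATE TABLE declarations (
  id      bigint PRIMARY KEY,
  -- ... scalar declaration fields ...
  parties jsonb NOT NULL DEFAULT '{}'  -- e.g. {"party1": {"name": ..., "country_code": ...}, ...}
);

-- overview grid: pull display fields straight out of the document
SELECT id,
       parties #>> '{party1,name}'         AS party1_name,
       parties #>> '{party1,country_code}' AS party1_country
FROM declarations;

-- containment filters can be served by a GIN index
CREATE INDEX ON declarations USING gin (parties jsonb_path_ops);
-- ... WHERE parties @> '{"party1": {"country_code": "NL"}}'
```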
If we could limit ourselves to PostgreSQL only, we could perhaps do that for new modules. Sadly we have some customers who want to use their own MSSQL database.
Though we'll see what the future holds. PostgreSQL is interesting due to cost and features, and many of the large customers are shifting their point of view on how they want to run things, almost doing 180s in some cases.
This system contains measurements and state of physical devices (time series). It's designed for both heavy writes and reads, with a slight emphasis on writes.
Each table is one type of device and contains 1 to 5 different measurements/states. But here's the trick: because the data is queried with a minimum bucket size of 15 minutes, I figured we could just create a column for each measurement + quarter of the day (i.e. measure0000, measure0015, ...), so that's 100 columns per measurement (96 quarters + 4 for DST). Include the date in the key, et voilà: excellent write performance (because it's mainly UPDATE queries) and good read performance.
Okay, the queries to make sense of the data aren’t pretty, but can be generated.
I find it really cool how effective this is for time-series data without Postgres extensions (we’re on RDS).
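Stripped down (names simplified, a single measurement shown), the layout is roughly:

```sql
CREATE TABLE device_power (
  device_id   bigint NOT NULL,
  day         date   NOT NULL,
  measure0000 numeric,   -- 00:00-00:15
  measure0015 numeric,   -- 00:15-00:30
  -- ... one column per quarter: 96 of them, plus 4 spare for DST ...
  measure2345 numeric,
  PRIMARY KEY (device_id, day)
);

-- each incoming reading touches exactly one column of one row
INSERT INTO device_power (device_id, day, measure0015)
VALUES (42, '2024-06-01', 1.234)
ON CONFLICT (device_id, day)
DO UPDATE SET measure0015 = EXCLUDED.measure0015;
```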
One interesting thing your team may want to look into (if you haven't already) is compression.
Of course there are a multitude of variables we don't have access to from the outside, but Postgres only compresses data that is TOASTed, and based on your description of the table, the data is not being TOASTed (and therefore not being compressed).
Instead, if you could somehow pack your timeseries entries into an array, you would get the benefits of compression automatically.
Given your write performance requirements, using an array may be out of the question (and you may get too much overhead from dead tuples) -- but who knows? Always a good idea to benchmark.
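A sketch of that packed alternative (hypothetical names); once the array datum grows past the ~2 kB TOAST threshold it is compressed automatically:

```sql
CREATE TABLE device_power_packed (
  device_id bigint    NOT NULL,
  day       date      NOT NULL,
  measures  numeric[] NOT NULL,  -- elements 1..100 = quarters of the day
  PRIMARY KEY (device_id, day)
);

-- updating a single quarter now rewrites the whole array datum, though
UPDATE device_power_packed
SET    measures[2] = 1.234       -- 00:15-00:30
WHERE  device_id = 42 AND day = '2024-06-01';
```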
I actually considered mentioning this in the post but figured it was too long already and could be material for a future one :)
The updates wouldn't be in place anyway because of Postgres's MVCC approach, right?
I can see the advantage in terms of just needing a single tuple for reads. So a timestamp + value model would likely take twice as much heap space as your approach?
Given that you're probably always just inserting new data, you could use a BRIN index to get fast reads on the date ranges. It would be interesting to see it in action and play around to see the tradeoffs. The model you've settled on sounds like it would be a pain to query.
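For reference, the BRIN variant I mean, assuming a classic narrow one-row-per-reading layout (hypothetical names):

```sql
CREATE TABLE readings (
  device_id bigint      NOT NULL,
  ts        timestamptz NOT NULL,
  value     numeric     NOT NULL
);

-- tiny index; effective because rows are appended roughly in timestamp order
CREATE INDEX readings_ts_brin ON readings USING brin (ts);

-- date-range scans can then skip whole block ranges
SELECT * FROM readings
WHERE ts >= '2024-06-01' AND ts < '2024-07-01';
```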
Fascinating. How reliable is the measurement apparatus? Or to put it another way, how tolerant of missed measurements is this scheme? I’ve been in a similar situation in a past life and we took a different approach. I was never satisfied with how we ended up doing it though.
I see your point. Some of our measurements are counters, and we interpolate missing data via certain rules. We store these interpolations in a separate table with exactly the same format and join the two tables directly when querying the data. Since the real value always takes precedence over the interpolated value, this is just a COALESCE(realmeasure0000, interpolation0000) statement.
This is super fast when taking advantage of Postgres' partition-wise joins.
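Sketched out (table names made up here), the query is roughly the following; with both tables partitioned the same way and enable_partitionwise_join on, the join runs per partition pair:

```sql
SELECT m.device_id,
       m.day,
       COALESCE(m.measure0000, i.measure0000) AS measure0000,
       COALESCE(m.measure0015, i.measure0015) AS measure0015
       -- ... and so on for the remaining quarters ...
FROM measurements AS m
LEFT JOIN interpolations AS i
       ON i.device_id = m.device_id AND i.day = m.day
WHERE m.day >= '2024-06-01' AND m.day < '2024-07-01';
```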
I assume you looked at the various Postgres functions, like width_bucket()?
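For example, width_bucket() can map a timestamp's second-of-day onto one of 96 quarter-hour buckets:

```sql
-- which quarter of the day (1..96) a timestamp falls into
SELECT width_bucket(
         EXTRACT(epoch FROM ts) - EXTRACT(epoch FROM date_trunc('day', ts)),
         0, 86400, 96) AS quarter_of_day
FROM (VALUES (timestamp '2024-06-01 00:20:00')) AS t(ts);  -- -> 2
```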
I've never run a PG-based TSDB, so I'm reluctant to pass judgment one way or the other, but as I hope you can understand, “we have hundreds of columns in a table” is alarming at first glance.
The schema is organized by how we need to query it, which is mostly fetching all measurements for an entire day or month. Querying statistics on those metrics is not needed often, maybe a few times per week or day, so we do that in Snowflake if necessary, or create a specialized streaming data flow when the stats are queried more often.
I don't know what they're doing, but highly-denormalized tables are very common, and sometimes even the norm, depending on how things are set up, especially in OLAP contexts.
Highly denormalized tables are often the norm simply because the tables weren’t properly normalized to begin with, and the data model wasn’t properly done, such that reasonable joins are overly difficult.
OLAP is of course its own problem, and most of the best practices for OLTP do not apply.
I programmed things for fun for months on end without having an internet connection. I never had the need to share. Occasionally I needed a connection to download papers on the subject.