georgewfraser · 2025-06-18T04:04:21 1750219461

I talked to the timescale CTO at pg conf a few years ago and asked him what timescale does differently than a standard columnar database that makes it better suited for time oriented data. He said a bunch of things and I said “but columnar databases do those things.” Then he got mad at me.

I guess it’s just another columnar dbms after all?

dangoodmanUT · 2025-06-18T04:41:47 1750221707

They don't do well on benchmarks https://benchmark.clickhouse.com/

jascha_eng · 2025-06-18T11:13:39 1750245219

I'd argue we do okay, but of course it's ClickHouse's own benchmark, so it's hard to outperform them there. It's also not apples to apples. Clickhouse has much less transactional guarantees and isn't postgres SQL compatible. The great thing about Timescale is that you only need one DB for all your analytics and transactional needs. Combined with pgvector, postgres also handles search quite well.

In a way Timescale is just postgres on steroids. Sure, if you really know your use-case well, are fine with giving up some postgres niceties, are willing to learn a new query language, and are fine with using and syncing multiple data stores, you'll outperform timescale. But I think it is still really cool to see how close you can get with essentially just a better postgres.

hodgesrm · 2025-06-19T17:28:33 1750354113

> Clickhouse has much less transactional guarantees ...

Is this relevant? The benchmark is just reads.

jascha_eng · 2025-06-20T11:00:18 1750417218

Depends on your workload? If you don't care about ACID compliance in your use-case, and query speed is all that is relevant to you probably not.

You might still be better off with Timescale/TigerData if your query pattern uses a lot of joins, as we do much better there than Clickhouse does. We have our own benchmark too and perform better than Clickhouse on those kinds of queries: https://rtabench.com/

But also transactions often make your life as a dev easier in my experience, and being able to use a single DB and stick with 100% postgres compatible SQL without having to change your application is often worth more than squeezing out the last few bits of query performance.

I'm just saying that single-benchmark comparisons rarely tell the full story when evaluating database technologies. ClickHouse is undoubtedly impressive engineering, and it excels in many scenarios. Ultimately the optimal choice depends on your specific use case.

hodgesrm · 2025-06-20T23:47:02 1750463222

I'm glad you answered as my comment was not very complete. It left out mentioning that transactions should not be expensive on reads unless you are doing something wrong. In general MVCC--which PostgreSQL uses--does not have a lot of performance overhead for this case. ClickHouse also maintains snapshots when reading, so to a certain extent they do the same work. Transactions don't seem like a very strong argument here, since you would be conceding that your implementation is inefficient.

I agree with the other points. ClickHouse is not strong on joins [yet]. It's also nice to have a single database for everything. Yet so far nobody has been able to achieve one that delivers high concurrency, fast updates, and petabyte-level scaling. Mike Stonebraker et al. called this problem out in 2007. [0] It appears they called it right and we'll continue to see 2-3 major categories of databases for the foreseeable future.

[0] https://www.vldb.org/conf/2007/papers/industrial/p1150-stone...

akulkarni · 2025-06-18T11:28:16 1750246096

It depends on which benchmarks you use.

"ClickBench evaluates databases using a single table of clickstream data, representative of workloads like web analytics, BI, and log aggregation. It also favors full-table large scans and large-scale aggregations on denormalized data.

Real-time analytics inside applications is different and needs a new benchmark." [0]

This is why we published RTABench. [1]

We believe that it is more representative of real-time analytical workloads.

[0] https://www.tigerdata.com/blog/benchmarking-databases-for-re...

[1] https://rtabench.com/

dengolius · 2025-06-18T09:08:37 1750237717

Yes, TigerData aka Timescale tried to make a fuss a few years ago comparing Clickhouse and TimescaleDB, but they failed.

freilanzer · 2025-06-18T13:26:57 1750253217

DuckDB seems to be the most interesting there.

qoega · 2025-06-18T19:22:22 1750274542

It is meant for a single reader/writer workload, so it's not meant to be used as a service.

viccis · 2025-06-18T05:09:47 1750223387

Do you think all time series databases (like InfluxDB for example) are useless compared to "columnar databases" that "do those things" or just Timescale?

victorbjorklund · 2025-06-18T06:55:52 1750229752

It does sound like a pretty dumb question. Many things do similar things. That is like asking what postgres does that other sql databases don't.

yuretz · 2025-06-18T04:34:20 1750221260

A few years have passed and your guess is still wrong.

fellatio · 2025-06-18T06:14:08 1750227248

Unfair anecdote as you don't mention what he said before and after "he got mad" (whatever that means).

georgewfraser · 2025-05-28T01:33:38 1748396018

They make a really good criticism of Iceberg: if we have a database anyway, why are we bothering to store metadata in files?

I don’t think DuckLake itself will succeed in getting adopted beyond DuckDB, but I would not be surprised if over time the catalog just absorbs the metadata, and the original Iceberg format fades into history as a transitional form.

georgewfraser · 2025-05-01T18:53:39 1746125619

This is exactly right. We even went so far as to build a proof of concept internally, and the technical challenges are just very different. The simplest way to explain it is that Fivetran connects a skinny pipe (APIs) to a fat pipe (databases) while Census connects a fat pipe to a skinny pipe.

georgewfraser · 2025-04-14T00:41:28 1744591288

I am generally a huge vertical sharding skeptic but there are special cases where it is beneficial. If you have a simple query pattern on one table that represents a big fraction of your entire workload you can put it into its own instance and it becomes much easier to monitor. It’s easy to see why vertical sharding is sometimes the right answer by inverting the decision: should we put two unrelated large applications on the same instance? Obviously not, there is no benefit and ops becomes more difficult.

georgewfraser · 2025-01-04T18:41:56 1736016116

Like so many things from Google engineering this will be toxic to your startup. SREs read stuff like this, they get main character syndrome and start redoing the technical designs of all the other teams, and not in a good way.

This phenomenon can occur in all “overlay” functions, for example the legal department will try to run the entire company if you don’t have a good leader who keeps the team in their lane.

physhster · 2025-01-04T19:26:17 1736018777

In my experience, SREs are usually "enforcers of maintainability". If your engineers don't want to be oncall, they need to produce applications and services that are documented and maintainable. It's an amazing forcing function. SRE doesn't often redo technical designs, there's plenty enough reliability and scalability work to do...

jshen · 2025-01-04T19:50:06 1736020206

Your engineers should be on call.

physhster · 2025-01-04T20:32:47 1736022767

At a 200-person company, sure. But when you're in the tens or hundreds of thousands, that's a hard no. Especially when dealing with out-of-scope dependencies.

otterley · 2025-01-04T20:53:14 1736023994

I work for a company with millions of employees. Our SDEs and their managers carry and are responsible for answering the pagers. We don’t have SREs.

strken · 2025-01-05T01:25:36 1736040336

What? Engineers should own the code they write, including being on call to maintain it. Out-of-scope dependencies should be irrelevant, and if they're not, get some of those tens or hundreds of thousands of employees to work on better observability.

I agree that if you own the blahblah service then you shouldn't get alerts for a broken dependency foobaz if that team is already aware, but if blahblah itself breaks, not being around to fix it is pretty dangerous.

motorest · 2025-01-04T23:22:32 1736032952

> But when you're in the tens or hundreds of thousands, that's a hard no.

What? No, not at all. I worked in such a company, and oncall was indeed a thing and it was tremendously easy to deal with upstream and downstream dependencies. You have dashboards monitoring metrics emitted in calls to and from third-party services, and runbooks that made it quite clear who to call when any of the dependencies misbehaved. If anything happened, everyone was briefed and even on a call.

This boils down to ownership and accountability. It means nothing if the company had 10 or 100k employees.

la64710 · 2025-01-04T21:11:58 1736025118

Since the 90s, the whole DNS on which the internet stands today was run successfully, with minimal error, by a bunch of folks who used to call themselves sysadmins. Developers seem to have run out of things to develop and have been reinventing themselves as devops and SREs. They have been pushing out pure sysadmins, but at the same time this trend shows how demand for developers or SWEs falls far short of the supply of developers in the market.

tsss · 2025-01-04T22:30:42 1736029842

Take one look at the Kubernetes source code and it becomes clear that you can make successful software with zero clue about good software engineering.

hbogert · 2025-01-11T17:03:12 1736614992

What is objectively bad code in the code base of k8s? Is it really worse than any other system?

tsss · 2025-01-13T22:18:58 1736806738

Yes it is worse, much worse. A large part of the reason for that is that it's written in Go. The other part is that it's written by Googlers and sysadmin people; two groups not particularly known for their great software engineering skills. My personal experience here is mostly with cAdvisor (which I guess is not strictly part of Kubernetes but comes from the same ecosystem). It is chock full of horrible error handling (if there is any), uninitialized structures and a dozen layers of indirection.

georgewfraser · 2024-12-16T19:32:16 1734377536

I have some insight into this because this claim is about my company Fivetran:

“…relies on the data source being able to seek backwards on its changelog. But Postgres throws changelogs away once they're consumed, so the Postgres data source can't support this operation”

Dan’s understanding is incorrect, Postgres logical replication allows each consumer to maintain a bookmark in the WAL, and it will retain the WAL until you acknowledge receipt of a portion and advance the bookmark. Evidently, he tried our product briefly, had an issue or thought he had an issue, investigated the issue briefly and came to the conclusion that he understood the technology better than people who have spent years working on it.
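
The bookmark behavior described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not Postgres's actual implementation: the `ToyWal` class and its method names are hypothetical, positions are plain integers rather than real WAL locations, and a single log stands in for the WAL. It shows the one property at issue: reading does not consume changes, and nothing is discarded until every consumer has acknowledged it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Toy append-only log with per-consumer bookmarks (hypothetical model)."""
    records: dict = field(default_factory=dict)    # position -> payload
    next_pos: int = 0                              # position of the next append
    bookmarks: dict = field(default_factory=dict)  # consumer -> first unacknowledged position

    def create_slot(self, name):
        # A new consumer starts at the current end of the log.
        self.bookmarks[name] = self.next_pos

    def append(self, payload):
        pos = self.next_pos
        self.records[pos] = payload
        self.next_pos += 1
        return pos

    def read(self, name):
        # Reading does NOT advance the bookmark: the same changes come
        # back on every read until the consumer acknowledges them.
        start = self.bookmarks[name]
        return [(pos, self.records[pos]) for pos in range(start, self.next_pos)]

    def acknowledge(self, name, upto):
        # The consumer confirms receipt of everything up to and including `upto`.
        self.bookmarks[name] = max(self.bookmarks[name], upto + 1)
        self._trim()

    def _trim(self):
        # Discard only records that every consumer has acknowledged.
        oldest_needed = min(self.bookmarks.values(), default=self.next_pos)
        for pos in [p for p in self.records if p < oldest_needed]:
            del self.records[pos]


wal = ToyWal()
wal.create_slot("sync")
wal.append("insert a")
wal.append("insert b")

first = wal.read("sync")    # both changes
again = wal.read("sync")    # still both changes: nothing was acknowledged
wal.acknowledge("sync", 0)  # confirm receipt of position 0 only
after = wal.read("sync")    # only the unacknowledged change remains
```

In this model a consumer that crashes after `read` but before `acknowledge` simply reads the same changes again on restart, which is the "seek backwards" case the quoted claim says is impossible.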

Don’t get me wrong, it is absolutely possible for the experts to be wrong and one smart guy to be right. But at least part of what’s going on in this post is an arrogant guy who thinks he knows better than everyone, coming to snap conclusions that other people’s work is broken.

gizmo · 2024-12-16T20:50:33 1734382233

> When their product attempts to do this and the operation fails, we end up with the sync getting "stuck", needing manual intervention from the vendor's operator and/or data loss. Since our data is still on Postgres, it's possible to recover from this by doing a full resync, but the data sync product tops out at 5MB/s for reasons that appear to be unknown to them, so a full resync can take days even on databases that aren't all that large. Resyncs will also silently drop and corrupt data

I don't know, but it sounds like you skipped over most of the reasons why the author was annoyed by Fivetran. You advertise "Connect data sources to PostgreSQL in minutes using Fivetran" but if Dan Luu -- who is certainly an intelligent and capable engineer -- and his coworkers can't figure out how to use your product correctly, and if your customer support also can't figure out why the sync breaks, then maybe this isn't mere customer 'arrogance'.

lmm · 2024-12-17T01:17:24 1734398244

> if Dan Luu -- who is certainly an intelligent and capable engineer -- and his coworkers can't figure out how to use your product correctly, and if your customer support also can't figure out why the sync breaks, then maybe this isn't mere customer 'arrogance'.

Dan Luu claims, among other things, to experience hundreds of software bugs per week. If you believe the things he writes then he's not at all representative of a normal customer.

malfist · 2024-12-17T02:18:54 1734401934

Hundreds seems attainable.

Today, for me: * Firefox mobile didn't load the keyboard on a textbox, making me kill and restart the app to get it to work. * Firefox mobile pull to refresh triggered while scrolling up. * I asked Alexa to turn on some lights and she dinged like she did it but nothing happened. * I turned on my office light that has a routine to turn on a space heater and another light. Only the first light turned on. * The Roomba got lost, ignored its keep-out zone, and ran into the Christmas tree skirt. * When I ran out to get groceries, Android Auto wouldn't connect until I restarted the car. * On another errand, Apple CarPlay refused to play music even though it said it was. * A website told me I had unsaved changes and wanted to know if I wanted to navigate away from the page without saving.... While clicking the save button. * I got a letter in the mail from Amex telling me that they couldn't reach me by email and I needed to log into my account and pay a zero dollar balance. This is after I closed all my accounts months ago; I get two letters each month to sign into an account that was deleted to pay a bill that doesn't exist. * OctoPi said its webserver wasn't running; a refresh fixed it. * Build tools at work linked the wrong binary for some tools and I had to manually correct the symlinks. * Insert 10 or more bugs with PipeWire and PulseAudio on Ubuntu.

I'm sure there's more, every day is like this. Yesterday I had a plethora of bugs trying to get screenshare and webcam streaming to work for a video conference despite working for a dry run a few minutes prior.

And right now, line breaks aren't working in this reply

MichaelZuo · 2024-12-17T03:17:24 1734405444

This sounds like a plausible list.

Oftentimes on HN it can be hard to distinguish between scoundrels/trolls/losers/etc… with ulterior motives and genuine people with accurate information, but in this case, since Dan Luu has done a lot of credible things, he should be given the benefit of the doubt.

lmm · 2024-12-17T03:53:16 1734407596

Many of those things may not be software bugs as I would normally understand the term, but rather software behaving as specified/intended, where that specification/intention has unexpected and/or undesirable consequences. (The line break thing certainly is, for example). I find it unhelpful to conflate the two.

eterm · 2024-12-17T11:35:12 1734435312

> software behaving as specified/intended, where that specification/intention has unexpected and/or undesirable consequences

Unexpected behaviour is how I choose to define "bug".

It doesn't matter if the programmer intended it. It's still a bug if it behaves contrary to the user expectation.

It might be that the best resolution is better documentation / training, but it's still worthy of a bug being raised to fix.

homebrewer · 2024-12-17T15:17:20 1734448640

You've never had contradictory user requirements thrown at you, with the expectation that you somehow implement them both? By this definition all software with more than one user is buggy, and it's impossible to do otherwise until we get AGI to do everything.

c22 · 2024-12-17T13:19:00 1734441540

It's a bug if it behaves contrary to the programmer's expectation. Full stop. There is too much diversity in users to go the other way on this.

If a product doesn't meet a user's expectations it may be a poor fit, improper usage, or even a braindead terrible design, but these are not bugs.

DangitBobby · 2024-12-17T04:34:05 1734410045

I personally don't mind when people lump intentionally bad behavior along with the bugs.

Dylan16807 · 2024-12-17T05:34:30 1734413670

Which ones? I could see the pull to refresh maybe not being a bug, but every single other one sounds like a bug to me.

lmm · 2024-12-17T08:09:28 1734422968

I mentioned the line breaks are by design. Roomba could be a hardware error or sensor noise. Text box not bringing up the keyboard it's hard to be sure whether that's a bug or intended. The auto thing not connecting could easily be hardware or deliberate. Letters from Amex might well be as specified by their processes. Webserver not running might have been deliberate maintenance.

Many of them could be software bugs, sure, but without actually figuring out what's going on and what the root cause is it's hard to tell.

Dylan16807 · 2024-12-17T10:12:50 1734430370

Oh, I wasn't sure which direction the line break "certainly is" went in. But they have line breaks elsewhere in their post...

I would say failing that badly at dealing with a hardware error is a bug. Keyboard I'm pretty sure is a bug; I've had plenty of situations where the keyboard code locks up and needs app restarts. Auto not being auto would be a weird thing to lie about, otherwise it's a bug. "Specified by their processes": a process is an algorithm. Sending incorrect messages for $0 could be an algorithm bug or an implementation bug, but either way it's a bug in the software; it's not doing that because someone decided it actually should do that. It said the webserver wasn't running but it was; that's a bug unless they hit some exactly unlucky timing.

Retric · 2024-12-17T01:30:34 1734399034

I think most people experience hundreds of bugs per week; they may not notice them and/or realize they are bugs.

grumple · 2024-12-17T01:56:00 1734400560

I experience hundreds of bugs per week. Some in my own product. Most in others. Many are small and solved with a refresh. Most go unnoticed or are easily ignored.

jodrellblank · 2024-12-17T06:17:26 1734416246

That wouldn't really surprise me. Off the top of my head things I've experienced in the past week or two and remembered:

- Four involving a Vim upgrade being pushed by one tool and blocked by another; a broken shortcut causing gVim to load with an error message and no working menus - not fixed by uninstall/reinstall; unable to delete one of the Vim DLLs; Windows Explorer window lockup after right click (possibly it had the DLL loaded for Vim context menu icon?).

- FireFox regularly stops loading pages until I visit Help -> About FireFox and see that it's silently done an upgrade and is waiting to restart.

- Cloud service with SSO login working for months, stopped working with mysterious 'error processing your credentials' type message.

- and it has a broken 'send password reset email' feature, no email gets to me.

- checking for blocked emails in SaaS email filter, when I put my email in the 'Recipient' box of the search, Edge browser autofill puts my email address in the 'Sender' search field as well, incorrectly / uselessly.

- Edge browser autofill dropdowns routinely cover SaaS product's HTML/JS dropdown menus.

- New SaaS product which has a username/password login instead of SSO, but login goes to a prompt with only 'use SSO' (which we don't) or 'forgot my password' (after logging in successfully). Clicking 'use SSO' gets past that screen, I think it's internal between-different sub-sites SSO abstraction leaking to the customer.

- SaaS product offers 2-factor auth using Google Authenticator, if using the signup key instead of QR code it appears in a difficult-to-select text label. Entering an expired TOTP code shows an incorrect error saying the code can only be used once (Time Based OTP codes are valid for the time duration, often ~30 seconds for any number of uses).

- Saving details for that site in a SaaS password vault from a different vendor, an error like 'no saved password' popped up despite having a password saved.

- At least four different debugger errors - set breakpoint either not setting, or not breaking, or debugger crashing, or code crashing without error message until upgrading the runtime.

- Windows 'restart and update' over and over until 'check for updates' and it decided there weren't any, then the option disappeared.

- Dell laptop firmware upgrade which rebooted to upgrade firmwares, then said it failed.

- Always on VPN for work which stops passing traffic until the service is restarted, a couple of times a month.

- Backup software which runs for a week then fails, until rebooted. Vendor support blames the network.

- Monitoring software which reported the wrong datastore sizes from a VMware host until the host management services were restarted, one or both of them are bugged.

- Theater website's date range selection on mobile, I dropped down the start, slowly worked through year/month and tapped a day, dropped down end, again went through year/month and tapped a day after the start, 'apply' button was greyed out, tapped outside the calendar controls, and they hadn't registered the days. Retried and it worked second time.

- Multiple websites where the infinite scroll or 'load more' links stop working, including big ones like YouTube.

- SaaS product search finding an item for one coworker, not finding it for another coworker.

- SaaS ticketing system not finding the ticket ID typed into the search until the search is retried a couple of times, multiple times per day.

- SaaS ticketing system not handling session timeout properly and presenting a normal interface where everything looks fine until attempting to interact with it and then presenting an 'oops something went wrong' style error message instead of going to the login prompt.

- Switching back to VMware web client in a browser tab, finding it showing the "your session will expire in 20 seconds" countdown box, knowing that the session expired an hour or three ago, and the 'extend session' button doesn't work.

- SaaS password vault, VMware web client, not handling session timeout properly and presenting a normal interface where everything looks fine until attempting to interact with it and only then whisking away to the logon prompt.

- Secure Jumpbox tool doesn't save its settings between sessions, even though it explicitly has a 'make this the default value' type option.

- Secure jumpbox tool presents multiple screen resolution options which don't work, only some of them work.

- Database engine sync reporting different sync status in different views.

- Reddit discussion from 1 year ago with URL to the CDC website which is now a 404.

- iOS keyboard unreliable at opening the select-all/copy/paste popup dialog, often needs prompting and reprompting.

- iOS keyboard unreliable at starting swiping.

- After powering off phone and on again, torch GUI would switch 'on' but torch LED was not lit up, persisted for maybe 10 seconds after unlocking, as if there were internals still booting and the GUI was disconnected from control of the hardware.

- pfSense (FreeBSD-based) firewall upgrade process from old versions is broken.

- Computer game where the character turned invisible and then never turned visible again; glitchy bounds checks; glitch-teleported back to a starting point in the middle of a fight; unreliable animation on characters being enchanted; glitchy behaviour happening for an attack which doesn't have that behaviour. 'Disable background service' setting doesn't disable the background service. Network glitched and dropped back to the main menu not counting any of the progress made. Showed loading screen then didn't load. Network glitch and the fire button became the wrong way round - gun stopped firing when clicked, fired continuously when not clicked. Audio stopped working after RDP connection until game restarted.

- Proprietary software content to fill the disk with logfiles that aren't tidied up or limited by default.

- Search engines which don't search for what I type in and show something else instead. Try Google for "spirtling": the top result is a Wiktionary page for it as an archaic spelling of using a Scottish porridge stirrer, followed by other relevant results. DuckDuckGo changes it to "sporting" and shows horse racing tips, then spirling and spiraling, then spirtling.

- New York Times' Connections game, on small screen phone the "one away" message appears in the wrong Z-order behind the top bar, unreadable unless tapped on to bring it to the front. After a game the 'register an account to track progress' popup partially blocks the 'close summary screen and look at the final game state' X button.

- Runaway memory use in Amazon.co.uk and YouTube.com tabs left open too long until FireFox restarted that 'about:processes' can't help with.

- Teams continuing ringing on cellphone after I've answered the call on laptop.

- Google StreetView glitches where it sometimes gets stuck at junctions and can't move forwards but can double-click-jump there instead.

- and if we count glitchy / unexpected / poor user interface or user experience as well.. Python's IDLE laggy keyboard input until latest version installed. Vendor renewal email which doesn't say what is being renewed in a useful way. Etsy online shop which encouraged me to click on 'related' links then gave me a captcha because they couldn't tell if I was human or robot. Inconsistent YouTube keyboard shortcuts. AWS's chat LLM gave incorrect instructions for how to achieve firewall rule change on AWS server. VoIP phone app which switches back to the signing-in animation every few minutes.

- The https://www.skyscanner.net/ date selection is poor. It's not clear that clicking 'Depart' also sets the 'Return' date until you try to get rid of the calendar so you can set the return date and find you cannot. You can (X) to clear the return date and the depart date stays configured so you can set a new return date, but you can't X out of the depart date to set a different one without clearing the return date as well and having to reset that. Once both dates are set, you can't adjust them; the 'Return' box looks like it sets a return date, but when you click on a day it clears both dates and sets the day you clicked as the new departure date, getting in the way. The depart and return boxes have the same visual look as the From/To boxes which you can type airport codes in, but this is misleading as you cannot type a date into the date boxes.

- https://calculator-online.net/sohcahtoa-calculator/ says "This calculator uses the SOH.CAH.TOA mnemonic method to solve the sides and angles of a right triangle. It provides step-by-step calculations using the SOHCAHTOA formula" but put in leg a=1 and sin ⍺=30 and it calculates that the hypotenuse is length 2, showing the step-by-step solution using Pythagoras' Theorem using the length of sides b and c that we haven't given and don't know.
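
For comparison, the SOH step a reader would expect for that input is a one-liner: with the side opposite the angle known, hypotenuse = opposite / sin(angle). A minimal sketch, assuming leg a is the side opposite ⍺ and that 30 means 30 degrees; the function name is hypothetical and not taken from the calculator's site.

```python
import math


def hypotenuse_from_opposite(opposite, angle_degrees):
    """SOH: sin(angle) = opposite / hypotenuse, solved for the hypotenuse."""
    return opposite / math.sin(math.radians(angle_degrees))


# Leg a = 1 opposite a 30 degree angle; the result is 2 up to floating-point rounding.
hypotenuse = hypotenuse_from_opposite(1, 30)
```

No second side length is needed for this step, which is why a Pythagoras-based derivation from sides b and c does not fit the stated inputs.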

And this isn't counting design I find confusing, documentation which is unclear, layout bugs where things don't visually align or the spacing is misleading or the organization is unhelpful, bugs in tabstop ordering, or not-really-bugs like Notepad++ lagged with a single line of 10MB of data, or EMACS lagged with a long line, etc.

jodrellblank · 2024-12-19T05:03:23 1734584603

Just calling this one out: "Inconsistent YouTube keyboard shortcuts." because it's so annoying and wouldn't it have been easier if they were all hooked to the same code, than to implement them differently? Assume YouTube.com in FireFox on Windows.

- 'space' will play/pause the video if the main video has focus, or if most other controls are selected (e.g. volume), but if the settings are open then 'space' will change the setting and play/pause the video.

- holding 'space' when the main video has focus will 2x speed the video for the duration of the hold. Holding space when the 'play' button has focus will not do anything. Not double-speed, not repeatedly play/pause.

- 'k' will play/pause the video whatever control has the focus. Holding 'k' will stutter the video repeatedly switching play/pause. (Why?). Holding 'k' will never 2x speed the video. Space and K are both play/pause but implemented differently and inconsistently.

- Left/Right arrows rewind/fastforward by 5 seconds when most controls have focus, but when the volume slider has focus then left/right arrows change the volume. The Auto-play enabled/disabled control is also a left/right sliding one but the left/right arrows do not move that when it has focus.

- Up/Down arrows control the volume. Even when the volume slider has focus and left/right arrows are also controlling the volume. But wait, the up/down arrows don't control the volume when the settings are open; then they only move the selection in the settings. (Contrast with spacebar which activates settings and play/pauses at the same time, contrast with Return which sometimes does play/pause and does activate settings but does not do both at the same time).

- Page Up/Page Down with the video focused is browser page scroll. Click on the video position bar, the thin red one, and Page Up/Down do rewind/ffwd by 1 minute! With this focused, up/down arrows stop being volume control and now do the same rewind/ffwd as left/right arrows.

- j/l do rewind/ffwd like the left/right arrow keys, but they jump 10 seconds instead of 5 seconds.

- 'Return' activates a focused control so with play/pause focused it will do play/pause; it behaves almost like spacebar and it stutters the video like 'k' when held - but with the main video having the focus, Enter does not play/pause, it does nothing.

WHY are they so inconsistent?

oxfordmale · 2024-12-16T23:13:41 1734390821

Fivetran works perfectly fine for syncing Postgres databases into Snowflake. My company syncs dozens of them without problems. I can only assume their Postgres database has a non-standard setup.

bcrosby95 · 2024-12-16T23:18:26 1734391106

Considering the comment chain involved here chiming in with "I can only assume non standard setup" is pretty hilarious.

i_am_a_squirrel · 2024-12-17T04:38:35 1734410315

Yeah, if you have 1 million dollars to spend every time you run a data migration or anything else that touches many rows.

I've seen some new libraries crop up for writing your own replication slot clients. I wouldn't use fivetran for PG.

Either you have a lot of data and fivetran will be too expensive or you don't, and you're better off just using a postgres OLAP plugin/extension.

Maybe it was because it was in beta, but I had a nightmare of a time with Fivetran's API trying to coordinate connectors and destinations and git access.

compiler-guy · 2024-12-16T19:58:46 1734379126

I have no idea who is right or wrong as to the capabilities. But I believe his story that he couldn't get it working. And I believe your statement that it can be made to work.

When very smart people can't get your product to work as advertised, that's a problem with either the advertising, or the documentation, or maybe the default settings. Or maybe it needs the source data set up in a very specific way.

That kind of plays into the larger point of the essay that outsourcing this sort of thing still requires significant internal knowledge, and therefore may not be as cheap as it looks at first glance.

georgewfraser · 2024-12-16T21:12:42 1734383562

In general, I absolutely agree with you. It’s basically an instance of “the customer is always right”: if a smart customer can’t get our product working, there is a problem with the product. But this post made a much bolder (and wrong) claim: “the product has a number of major design flaws that mean that it literally cannot work”.

Dylan16807 · 2024-12-16T23:38:48 1734392328

You went too far in your pushback though.

"part of what’s going on in this post is an arrogant guy who thinks he knows better than everyone, coming to snap conclusions that other people’s work is broken"

He's probably wrong about why it was broken, but it was broken.

And it's not exactly "arrogance" to give the best explanation you have, in a blog post about something else, while not mentioning the company name.

lmm · 2024-12-17T01:23:10 1734398590

> He's probably wrong about why it was broken, but it was broken.

That's going too far. If the customer misunderstands the product or misreads the documentation, that's something worth addressing, but "broken" is not an informative way to describe it.

hipadev23 · 2024-12-17T05:07:32 1734412052

Dan never called out Fivetran, he wrote a couple sentences about problems he experienced with an anonymous ETL provider. That was it. Hell we don't even know that he's actually talking about Fivetran.

George, however, should not be allowed to come to HN and start talking shit about random people who had the professional courtesy to never even mention the provider in question. The fact that his post isn't flagged, is highly upvoted, and dang hasn't swooped in to chastise him is a prime example of why HN is so fucking ridiculous and hypocritical.

Dylan16807 · 2024-12-17T03:01:15 1734404475

If, sure. Do we have any reason to think the problem was a misunderstanding or a misreading?

lmm · 2024-12-17T03:56:40 1734407800

We know that Luu was misunderstanding at least some things, since he gave an inaccurate description of what was happening. Given that context, I find it more plausible that he was also misunderstanding other things, than that the thing was broken for him despite other people seeing it working. Even if you weigh the likelihood differently from me, you must admit that that's a possibility, so concluding that it was broken because Luu thought it was broken is at a minimum premature.

Dylan16807 · 2024-12-17T05:20:52 1734412852

Given the other comments talking about problems here, we don't know for sure if it was inaccurate. But even if it was, "plausible" that he was misunderstanding other things is a hell of a lot weaker than the way your comment above treated it as the truth of the matter.

Overall I think it's pretty unlikely that it wasn't broken.

theonething · 2024-12-17T02:20:19 1734402019

> Evidently, he tried our product briefly

> investigated the issue briefly

> coming to snap conclusions

Where exactly is the evidence that he tried your product only briefly and that he investigated briefly? I've read through it and don't see that anywhere.

After reading your comment, I lean towards you being the arrogant, thin-skinned one about your product and coming to snap conclusions about your customers who are paying for your product and having trouble with it and calling them arrogant instead of looking into why they are experiencing frustration with your product.

Perhaps Dan's conclusion was wrong, but the tone and wording of your response is just off putting and devoid of tact, empathy and teachability.

Something like "No, I don't believe it's broken because x, y and z. But I do see how the developer experience here is left wanting. Maybe we can improve it" would have been so much better.

immibis · 2024-12-17T04:22:54 1734409374

The evidence is that he didn't read the postgres manual section on log-based replication[1] which would have told him how to configure a postgres master server so that it doesn't delete logs until all consumers have processed them.

It's not a five minute setup, but Dan doesn't write that the setup takes longer than five minutes - he writes that the design is fundamentally broken. Which it isn't, if you read the postgres manual. We're not even talking about the manual of the product he tried out for five minutes - we're talking about the manual of the database he's responsible for administering!

The overall point of the article is fine though. Original Commenter was nitpicking.

[1] https://www.postgresql.org/docs/current/warm-standby.html
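
To make the retention behaviour described above concrete, here is a toy sketch of a changelog that only discards records once every registered consumer has confirmed them. It is a simplified model, not Postgres internals: `ToyWal`, its methods, and the slot names are all hypothetical, and real Postgres has additional retention settings that this ignores.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Toy changelog with replication slots (an illustration, not Postgres code)."""

    records: dict = field(default_factory=dict)  # position -> payload
    slots: dict = field(default_factory=dict)    # slot name -> last confirmed position
    next_pos: int = 1

    def append(self, payload: str) -> int:
        """Write one change record and return its position."""
        pos = self.next_pos
        self.records[pos] = payload
        self.next_pos += 1
        return pos

    def create_slot(self, name: str) -> None:
        """Register a consumer; it starts at the current end of the log."""
        self.slots[name] = self.next_pos - 1

    def confirm(self, name: str, pos: int) -> None:
        """Consumer reports it has durably processed everything up to pos."""
        self.slots[name] = max(self.slots[name], pos)

    def recycle(self) -> None:
        """Discard only what every slot has confirmed.

        Assumption of this toy: with no slots at all, nothing is retained.
        """
        if not self.slots:
            self.records.clear()
            return
        keep_after = min(self.slots.values())
        self.records = {p: r for p, r in self.records.items() if p > keep_after}

    def read_from(self, name: str) -> list:
        """Return the unconfirmed records for a slot, oldest first."""
        start = self.slots[name]
        return sorted((p, r) for p, r in self.records.items() if p > start)


wal = ToyWal()
wal.create_slot("sync")
for i in range(5):
    wal.append(f"row {i}")
wal.confirm("sync", 2)  # the consumer has processed positions 1 and 2
wal.recycle()           # positions 3..5 must survive for the consumer
print(wal.read_from("sync"))
```

In this model a consumer that crashes after confirming position 2 can resume from position 3 without any backwards seek past its own confirmation point. The cost is that a lagging consumer holds back recycling for everyone.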

theonething · 2024-12-17T06:19:35 1734416375

> how to configure a postgres master server so that it doesn't delete logs until all consumers have processed them.

Do you think it'd be reasonable for FiveTran to include this little tidbit in their setup documentation? I'm not talking about repeating the Postgres docs, but just a blurb about the need to do this kind of Postgres config?

That's an example of what I mean when I'm calling out georgewfraser to be humble and use Dan's feedback to improve his product (in this case by improving the docs) instead of name-calling his customers.

Ok, so Dan came to the wrong conclusion and was wrong to say the product was broken, but he had the professional courtesy to not name the company/product. George just attacks his character. Like another commenter mentioned, we don't even know for sure it was FiveTran. Yet, George just jumps in head first with guns blazing.

Dylan16807 · 2024-12-17T05:30:52 1734413452

And "manual intervention from the vendor" didn't involve fixing that, if it was such a trivial configuration issue?

intended · 2024-12-17T07:15:38 1734419738

The article is also several years old. No idea if that has an impact on the issue.

hklgny · 2024-12-16T21:46:51 1734385611

Weird approach, chastising your customer's lack of expertise in something they’re actively trying to pay you to solve for them. He shouldn’t have to be an expert in it.

I was a longtime customer of Fivetran who hit these sync issues constantly. Forced resyncs every other month. Was so thankful when our contract ended.

legerdemain · 2024-12-16T20:50:20 1734382220

In 2021, my employer was a major customer of Fivetran. Our Postgres syncs routinely broke and required time-consuming resyncs from scratch.

Dan's essay is dated 2022. It is now 2024, so maybe something has changed since then on the code path between Postgres and Fivetran to allow backtracking.

dgemm · 2024-12-16T19:56:34 1734378994

You would be the best one to evaluate whether this applies in your case, but in many cases where my users say "it's not possible" I end up finding a gap that's more related to usability than technical. I often still find there's something worth learning from this kind of feedback even if it's "wrong".

rawgabbit · 2024-12-16T22:57:33 1734389853

I assume Dan Luu was using the old “XMIN” method and not Logical Replication.

https://fivetran.com/docs/connectors/databases/postgresql/tr...

zanellato19 · 2024-12-17T12:15:30 1734437730

> Evidently, he tried our product briefly, had an issue or thought he had an issue, investigated the issue briefly and came to the conclusion that he understood the technology better than people who have spent years working on it..

This doesn't match this:

> Syncing from Postgres is the main offering (as in the offering with the most customers) from a leading data sync company, and we found that it would lose data, duplicate data, and corrupt data. After digging into it, it turns out that the product has a design that, among other issues, relies on the data source being able to seek backwards on its changelog. But Postgres throws changelogs away once they're consumed, so the Postgres data source can't support this operation. When their product attempts to do this and the operation fails, we end up with the sync getting "stuck", needing manual intervention from the vendor's operator and/or data loss. Since our data is still on Postgres, it's possible to recover from this by doing a full resync, but the data sync product tops out at 5MB/s for reasons that appear to be unknown to them, so a full resync can take days even on databases that aren't all that large. Resyncs will also silently drop and corrupt data, so multiple cycles of full resyncs followed by data integrity checks are sometimes necessary to recover from data corruption, which can take weeks. Despite being widely recommended and the leading product in the space, the product has a number of major design flaws that mean that it literally cannot work.

That description doesn't sound like _he_ briefly used your product, but that company he was working for used your product, found bugs and despite contacting support couldn't make it work. This doesn't read at all as a minor experiment that he didn't put in the time.

avarun · 2024-12-17T00:50:45 1734396645

The kind of arrogance this comment displays has ensured that I’ll try my best to never use Fivetran anywhere I work ever again.

Aeolun · 2024-12-17T01:10:49 1734397849

But did you ever in the first place?

nick0garvey · 2024-12-16T19:49:01 1734378541

I'm not quite following. His argument appears to be: The replication system requires a backwards seek, Postgres does not support that operation, things break when that operation is attempted.

I don't understand why replication would need a backwards seek - are you saying it doesn't and he is mistaken on that?

Aeolun · 2024-12-17T01:08:18 1734397698

This is unavoidable if you are at least a bit smarter than the average person, since in many cases their work just is broken.

It’s taken me far too long to internalize that the chances of someone making an (egregious) mistake in something I rely on to be correct are very much nonzero.

Carrok · 2024-12-17T00:10:26 1734394226

> an arrogant guy who thinks he knows better than everyone, coming to snap conclusions that other people’s work is broken

I see you’ve met my boss.

joatmon-snoo · 2024-12-16T21:38:23 1734385103

...how is _this_ the insight that you come away from this post with?

This post is a commentary on product quality issues, the underlying cost models (both goods and services), and the interplay with American culture. There's like 20+ company/product anecdotes in there - a mistake about one technical detail of one product is wildly uninteresting.

more_corn · 2024-12-17T03:01:40 1734404500

This is the case where you buy from experts instead of doing it yourself. You tried, thought it was impossible, someone else figured it out.

georgewfraser · on May 7, 2024

You can partition your BigQuery table however you like and Fivetran will leave it in place. I don’t think there’s any benefit to partitioning the staging table.

georgewfraser · on June 20, 2023

I wonder if it would be easier to create a C virtual machine that emulates all the OS interaction, then recompile Postgres and the extensions to run on this. Perhaps TruffleC would work?

https://dl.acm.org/doi/10.1145/2647508.2647528

anarazel · on June 20, 2023

Hard to believe that would provide any benefit without also causing massive slowdowns.

georgewfraser · on Jan 6, 2023

I have an S with a yoke and prefer it to a round steering wheel. 95% of the time it’s better: you can see the entire dashboard and your body can feel the angle of the wheel instinctively. 5% of the time it’s worse, during low speed maneuvers.

pengaru · on Jan 6, 2023

It's a very natural thing to unwind a wheel with a dragged hand when entering traffic from a stop on a perpendicular side street. And I'd argue that's far more than 5% of the substantial steering a driver does.

They could have just flattened the circle on the top like a D wheel to give some clearance on the instrumentation. Interrupting the continuous shape with a yoke is the problem, not that it isn't a circle.

It's as if the decision makers behind the yoke don't drive themselves, or were actively trying to make the driving experience worse to compel FSD adoption.

scottLobster · on Jan 6, 2023

Depends on where you drive. I live in SE PA. Roads here tend to be winding, even in the suburbs. Like "slow down to 20 mph so you don't slide off the road" winding. Never mind going into Philly with its weird intersections and tight turns.

Maybe 5% of your driving it's worse, it would be more like 70% of mine. I don't need to see the entire dashboard, I can get all the info I need just fine through the wheel, which adjusts. Also, cars with HUDs are a thing if you go for top level trims.

But when muscle memory matters, I want a wheel to grab. One double-take because the yoke isn't where my instincts expect could be one too many, it's like taking the shoulder-harness off your seat belt so that you look better while driving. Just serves no good purpose beyond vanity.

asdff · on Jan 6, 2023

You should be able to see the entire dashboard with any steering wheel.

_9omd · on Jan 6, 2023

It really depends on your height and preferred wheel height setting. I'm tall and in some cars the wheel obscures a good portion of the instruments.

Hellion · on Jan 6, 2023

There are significant and effective federal regulations around this. If you are telling the truth, your steering wheel is way too high, and you need to lower it.

aidenn0 · on Jan 6, 2023

Not who you are responding to, but I have a tall sitting height and I suspect that the steering wheel is too low. Consider this diagram of a properly adjusted steering wheel (instrumentation is where the "II" is):

     /----\
    /  II  \
    |------|
    \      /
     \----/

If you are tall what can happen is:

     /-II-\
    /      \
    |------|
    \      /
     \----/

A telescoping steering wheel helps quite a bit because you can adjust your seat back/forth more to get a good angle, but in cars without telescoping steering wheels, I have the second sight-picture at any position in which it is comfortable to hold the steering wheel.

If you are suggesting:

       II
     /----\
    /      \
    |------|
    \      /
     \----/

then this will render the airbags ineffective.

kj34kj · on Jan 6, 2023

Exactly. This happens to me because I'm tall. If I adjust the wheel to not obscure the cluster, then it's no longer comfortable for my back/shoulders.

My single favorite feature of the 3/Y is that there is no instrument cluster.

asdff · on Jan 6, 2023

Some cars, like the Prius, just center the cluster.

frosted-flakes · on Jan 6, 2023

There are better solutions than chopping off half the wheel though. For example, Volkswagen ditched the binnacle and put the gauge cluster screen on the steering column instead of on the dashboard. This is so that it can move up and down with the steering wheel to whatever height it's set at. Unfortunately VW also used capacitive buttons, but apparently they will go back to real buttons in the future.

int_19h · on Jan 6, 2023

You know what's even better? A proper HUD.

svnpenn · on Jan 6, 2023

> low speed maneuvers

AKA maneuvers. Driving on the highway in a straight line is not a maneuver.

georgewfraser · on Dec 12, 2022

The world would be a better place if database drivers were completely abandoned as a way for clients to connect to databases. A standard API, implemented by multiple vendors, is a vastly preferable solution. Arrow Flight is an example of this.

https://arrow.apache.org/blog/2019/10/13/introducing-arrow-f...

lidavidm · on Dec 12, 2022

Even within the Arrow project, there's still room for drivers just because not every vendor is going to implement the same wire protocol (at least on a feasible timeline). Hence both "ADBC" [1] and Flight SQL [2] (note: NOT a SQL dialect, it is a wire protocol) coexist in complementary niches.

[1]: https://arrow.apache.org/docs/format/ADBC.html [2]: https://arrow.apache.org/docs/format/FlightSql.html

jeff-davis · on Dec 12, 2022

I generally think that's a good idea, but be aware that the protocols are more interesting than you might first imagine, and that leads to a lot of the differences between drivers for different databases.

For instance, when setting a user's password in Postgres, you can do the hashing on the client side, even for non-trivial schemes like SCRAM. This means that the password itself never needs to move over the network, and that's very desirable. Speaking of authentication methods, that also opens up a big topic.
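
As a rough illustration of the client-side hashing described above, the sketch below derives a SCRAM-SHA-256 verifier using only the Python standard library. The function name `scram_sha256_verifier` is hypothetical, the key derivation follows RFC 5802, and the output layout is the one Postgres documents for stored SCRAM secrets. It assumes an ASCII password and skips the SASLprep normalization a real client would apply.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def scram_sha256_verifier(password: str,
                          salt: Optional[bytes] = None,
                          iterations: int = 4096) -> str:
    """Build a SCRAM-SHA-256 verifier string on the client.

    Assumption: the password is plain ASCII, so SASLprep is skipped here.
    """
    if salt is None:
        salt = os.urandom(16)  # random per-password salt

    # SaltedPassword := PBKDF2-HMAC-SHA256(password, salt, iterations)
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

    # ClientKey and ServerKey are HMACs of fixed labels keyed by SaltedPassword
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    # Only the hash of ClientKey is stored, never ClientKey itself
    stored_key = hashlib.sha256(client_key).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return (f"SCRAM-SHA-256${iterations}:{b64(salt)}"
            f"${b64(stored_key)}:{b64(server_key)}")


verifier = scram_sha256_verifier("correct horse battery staple")
print(verifier.split("$")[0])  # the scheme prefix; the rest varies with the salt
```

Only the resulting string would need to travel to the server (for example inside an `ALTER ROLE ... PASSWORD` statement), so the cleartext password stays on the client. In practice you would rely on the driver's own helper rather than hand-rolling this.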

There are also important modes. For instance, the client encoding controls how strings are transcoded when they get to the server. That allows the client to not know/care what the encoding of the database is. You could demand that everything is UTF-8, and that's one philosophy, but not everyone agrees.

In practice, I think it'll be a while before there is consensus on all these points. And even when there is, the standard will need to evolve to handle new auth methods, etc.

If we invent a standard protocol, it will probably be more of a fallback for simple cases when the language framework doesn't offer a driver yet. Still helpful, though.

EthicalSimilar · on Dec 12, 2022

> For instance, when setting a user's password in Postgres, you can do the hashing on the client side, even for non-trivial schemes like SCRAM. This means that the password itself never needs to move over the network, and that's very desirable.

Off-topic, but I’m surprised more online apps don’t employ something similar.

It would all but eliminate accidental leaks that occur from logs being incorrectly stored / misconfigured, not to mention worries about MITM attacks (useful for corporate networks, or public networks).

Given how many people share usernames, emails, and passwords across sites I find it quite important to mitigate those issues as much as possible.

kardianos · on Dec 12, 2022

Nice.

This [1] appears to be the SQL-specific layer on top of Arrow Flight. It seems a bit chatty, where two network requests are required for each query if I read it correctly.

[1] https://arrow.apache.org/docs/format/FlightSql.html

lidavidm · on Dec 12, 2022

Yup. The chattiness is to account for distributed databases, so you can spread the result set across multiple instances.

That said, there is a proposal for base Flight RPC to help allow embedding small results directly into the first response; it mostly needs someone to draft a prototype and push it through. (That doesn't help the case of a large-ish response from a single backend, though; that may also need some work, if we want to get rid of the second request.)
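
The two-request pattern discussed above can be sketched with a toy in-memory model. The method names mirror the Flight RPC calls `GetFlightInfo` and `DoGet`, but `ToyCluster`, `Endpoint`, and the node names are hypothetical stand-ins, and nothing here uses the real Arrow libraries or goes over a network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Where one partition of a result lives, plus the ticket to redeem there."""
    location: str
    ticket: str


class ToyCluster:
    """Toy stand-in for a distributed database speaking a Flight-like protocol."""

    def __init__(self, shards: dict):
        self.shards = shards  # location -> rows held by that node

    def get_flight_info(self, query: str) -> list:
        # Request 1: plan the query and report which nodes hold result partitions.
        return [Endpoint(location=loc, ticket=f"{query}@{loc}")
                for loc in sorted(self.shards)]

    def do_get(self, endpoint: Endpoint) -> list:
        # Request 2 (one per endpoint): redeem the ticket for that partition's rows.
        if not endpoint.ticket.endswith("@" + endpoint.location):
            raise ValueError("ticket was not issued for this location")
        return list(self.shards[endpoint.location])


cluster = ToyCluster({"node-a": [1, 2], "node-b": [3]})
endpoints = cluster.get_flight_info("SELECT x FROM t")
rows = [row for ep in endpoints for row in cluster.do_get(ep)]
print(len(endpoints), rows)
```

With a single backend the first round trip buys nothing, which is the chattiness complaint. With several backends the client can issue the second-stage fetches in parallel, one per endpoint.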