Imho the biggest challenge is that testing prioritizes the zero-side-effects fully-isolated-to-one-class stateless unit test as king (which is good! That's the gold standard for a reason), which is the polar opposite of testing a giant ball of hyper-interconnected state that is an RDBMS.
I've had good luck writing scripts where the first one sets up preconditions, then it runs a million interconnected tests with no isolation between them, and then it tears down. It means you can't run a single test in isolation and you have to be aware of each test's postconditions, but it's a good "worse is better" solution for "I need to start testing now, not after I rearchitect my whole project to be isolatable and write a crapload of scaffolding and boilerplate". But every testing framework fights against this pattern.
I haven't had any problem testing with RDBMS, so I'm wondering if there's something I'm not understanding.
For me, a testing script will load a clean schema+procedures with no data (empty tables), and the schema never changes. Each test inserts a handful of rows as needed in relevant tables, runs the query, and then checks the query result and/or changed table state. Then DELETE all table contents, and move on to the next test. Finally delete the entire test database once you're done.
If you have queries that alter tables, then each test starts with a brand-new schema and deletes the whole database once done.
The only kinds of tests this doesn't scale to are performance tests, e.g. is this query performant on a table with 500 million rows? But to me that's not the domain of unit tests, but rather full-scale testing environments that periodically copy the entire prod database.
So I'm curious why you consider an RDBMS to be hyper-interconnected state with interconnected tests?
My group is working on getting testing set up on the DB of a highly configurable SaaS product that's heavily normalized, so configuration for every feature is smeared across dozens of tables. It dates back to the "use procs for everything" era of development.
Thus, the code to set up preconditions would be as complicated as the entire service tier of the admin screens. So instead of per-test inserts, it makes sense to write the test scripts as:
1. "set up known good mostly-blank DB"
2. "Test the CREATE methods"
3. "Test the UPDATE methods"
4. "Test the DELETE methods"
5. "teardown"
* obviously it's not simple CRUD, this is just a simplification of how it goes.
It's not that this is an ideal workflow, it's just that "worse is better" here. This lets us get testing ASAP and move forward with confidence.
Of course, schemas change, and it's the migration of live data with those changes that is singularly most likely to break things. Will all your db access code still work? If you're routing everything through stored procedures, it gets somewhat harder still.
It's not just testing a single build... but builds and migrations from deployed, working versions. Some products have different versions of an application/database at different clients, which will update to "approved" versions as work progresses. (Especially in govt work)
Ah, got it. That workflow makes a lot of sense in that case -- not perfect but the most pragmatic.
I think the concepts of "application-owned RDBMS" vs "3rd party product that uses its own RDBMS" are the source of a lot of confusion sometimes.
If you manage your own database, testing shouldn't usually be particularly difficult. But when you're integrating with a third-party product, you generally can't effectively do unit tests. Just end-to-end integration tests of the type you're describing.
I'm in a similar boat with my current project, but am probably faring better than you (as described). I work in the ERP space... so not unlike what you describe: 100s of highly normalized tables (the biggest of these systems I worked with had just over 1000 tables).
I'm currently working on a greenfield system in this class, and the database development workflow is specifically designed to make things sanely testable. First, the application is designed to be very granularly modular, and each module handles its own database tables... and each module is fully testable by itself. Yes, there can be/are dependencies between modules, but only the direct dependencies ever have to be dealt with; when unit testing within a module, it's usually a pretty small slice of the database that needs to be handled. So any one component's test suite only needs to set the initial state for itself and its direct dependencies.

What I do then is have each individual functional test take that starting DB state, do its work in a transaction, and then roll the transaction back at the end of the test. This way only the initial state created at the start of the run ever has to be dealt with. My "integration tests" walk the business process statefully, using the initial DB state only for dependency setup, and exercising the business process of creating all the data the module itself handles.

Finally, unit test starting data is only ever defined in the component that defines the tables being loaded. That means when dependencies have their own tables, references are made to load that component's test data rather than redefining a fresh set every time it shows up in the dependency tree.
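The rollback-per-test trick described above can be sketched like this (sqlite3 as a stand-in RDBMS; the seed table and test body are invented, but the begin/work/rollback shape is the point):

```python
import sqlite3
from contextlib import contextmanager

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly

# Initial state: the module's own tables plus direct dependencies only.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('seed customer')")

@contextmanager
def rollback_test():
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        # Undo the test's work; the seed state survives for the next test.
        conn.execute("ROLLBACK")

with rollback_test() as c:
    c.execute("UPDATE customers SET name = 'changed' WHERE id = 1")
    assert c.execute(
        "SELECT name FROM customers WHERE id = 1").fetchone()[0] == "changed"

# After rollback the seed data is untouched.
assert conn.execute(
    "SELECT name FROM customers WHERE id = 1").fetchone()[0] == "seed customer"
```

The appeal is that the expensive part (building the initial state) happens once, and each test still sees a pristine database.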
Anyway, as said earlier, this kind of development testing discipline isn't common amongst DBA types (the ones best placed to figure out a good methodology for it), while a lot of application developers who do have the testing discipline avoid the DB like the plague. So it never gets done until it absolutely has to be done, and then you end up ad hoc'ing it together.