Does anyone have hands-on experience with Aurora? Although I signed up for the preview I do not have access yet, I'm eager to try it.
Our experience with RDS hasn't been that smooth: lots of sudden performance issues, table locks and whatnot. DB reboot or even full restore is a once a month thing.
I would also like to know about actual Aurora experiences as well. But I haven't heard of anybody using it.
However, I'm also interested in your negative experiences with AWS RDS. What kind of engine are you running on? Did you investigate the outages or contact AWS support regarding these issues?
For example, just today we needed to run a migration to rename a non-indexed column in a table in our production db. The table got locked while running the migration, all app servers down :). Rebooting the RDS instance was the only thing that got us back up.
A while back we also migrated a column, the encoding was set to ISO-8859-10 (nordic countries? seriously?), even though our encoding is UTF-8 everywhere! Had to take a DB snapshot, launch a new instance and re-run the migration (there were performance issues with the instance as well, that's why we re-created it).
These are just a few of the stories I've experienced. I didn't contact support after seeing that similar issues were completely ignored in the AWS forums.
Full setup:
- 2x m3.large app servers
- 1x m3.large RDS instance running MySQL 5.6.22, InnoDB, no read replica, no Multi AZ. Other issues were seen in an m3.medium with Multi AZ. We preferred a non-crashing m3.large Single AZ than a hanging m3.medium with Multi AZ. Seems that there are problems regardless of the instance's size.
- Rails 4.0.11 backend, ActiveRecord migration
- The table had about 1000 entries, but heavily used (users table). To give you some context ours is a B2B service, most of the users (90%) use our site from 9 to 5, never leave the page. So accessing the users database is fairly common.
The only thing about implicit full-table locking in the alter-table docs you linked says:
The exception referred to earlier is that ALTER TABLE blocks reads (not just writes) at the point where it is ready to install a new version of the table .frm file, discard the old file, and clear outdated table structures from the table and table definition caches. At this point, it must acquire an exclusive lock. To do so, it waits for current readers to finish, and blocks new reads (and writes).
There are no strategies to work around this using just the database. Even if you were to manually create a new table with the new format, migrate all the data manually, and then manually rename, the final rename operation would require a short-duration lock to perform its work. You could, of course, modify your application code to work with two tables during a manual migration.
Sure, but the migration was running for 15 minutes after we decided to reboot it. It was most probably trapped in a deadlock. Keep in mind it was just a column rename, no indexes involved. Should be a simple operation:
> Alterations that modify only table metadata and not table data are immediate because the server only needs to alter the table .frm file, not touch table contents. The following changes are fast alterations that can be made this way:
> - Renaming a column, except for the InnoDB storage engine before MySQL 5.6.6.
I'm in the preview and preparing to test some new systems on it. No production workloads or experience yet, though. If you send me an email (pthomas at opendns.com) I can get you in contact with an AWS account manager who might be able to fast-track you for access.
Our experience with RDS hasn't been that smooth: lots of sudden performance issues, table locks and whatnot. DB reboot or even full restore is a once a month thing.