SQL Server Forum / DB Engine / SQL Server / July 2008
architecture for peak OLTP
|
|
Thread rating:  |
JXStern - 30 Jun 2008 17:24 GMT Say you need to put up a database for a Big Deal website, that's going to get hit with a big peak load maybe once a month, like 50,000 transactions as near as simultaneously as you can handle. And the rest of the month, it may be ticking over a few dozen per second. Assume the transactions are comparatively simple, the database modest in size, say less than 10gb.
Of course it's mission-critical, want to run hot back-ups and all. I know this is all a big vague, but is there a best practice for such a thing?
Let's assume the peak can be handled on a dedicated (and maxed out hardware!) server (more like a set of three or four with dedicated functions), my question is really what do I do for redundancy - transactional replication, log shipping, some third-party mirroring tool? Would anyone recommend trying to split the load and run peer-to-peer transactional replication?
I'm guessing the "best practice" for this has changed, since the big 64-bit servers have come along. Assume also a nice, dedicated SAN, freshly formatted and carefully configured.
Thanks for any wisdom and/or pointers.
Josh
Andrew J. Kelly - 30 Jun 2008 18:23 GMT Josh,
There is no way to accurate answer that with so little information. A decision as to how to properly scale and do HA / DR requires a lot of details and lots of planning. There are simply too many aspects to consider to come up with a solution that is practical given your business requirements, available resources and budget. Any of the technologies you mentioned may be viable to some degree but it all depends on the situation. If anyone tries to give you a solution without knowing all the proper details you will most likely have issues when you actually go to implement. You know I am not trying to skate out of giving a proper answer but this is something that you don't want to do via a newsgroup post. Just to get 50K trans per second is a huge goal in itself, never mind the rest on top of that.
 Signature Andrew J. Kelly SQL MVP Solid Quality Mentors
> Say you need to put up a database for a Big Deal website, that's going > to get hit with a big peak load maybe once a month, like 50,000 [quoted text clipped - 21 lines] > > Josh JXStern - 01 Jul 2008 02:11 GMT >Josh, > >There is no way to accurate answer that with so little information. I meant this more along the lines of, "where is the technology sweet-spot today", than "Help I've got to get this particular requirement designed and operating in ten days".
Or, "does anyone have any good (and hopefully recent) war stories about choosing an HA/DR configuration?"
Thanks.
Josh
Linchi Shea - 01 Jul 2008 13:11 GMT > Or, "does anyone have any good (and hopefully recent) war stories > about choosing an HA/DR configuration?" A key piece of info in choosing a HA/DR configuration is can you afford to lose some data and how much can you afford to lose. When you are talking about a heavy OLTP system that can't afford to lose any data, your options are seriously limited compared to the scenarios where you can afford to lose some data.
Linchi
> >Josh, > > [quoted text clipped - 10 lines] > > Josh TheSQLGuru - 01 Jul 2008 14:33 GMT Another consideration is how long can you afford to be offline after a 'holy-sh.t' event? As the answer to each particular piece of the HA/DR pie go towards zero the cost rises exponentially. If you want serious uptime and fault tolerance in the wintel world, check here or at other similar platforms: http://www.stratus.com/. Note that their servers are currently reported to be running at SIX NINES of uptime worldwide.
To address some particular questions each of the processes you mention to move transactional information from one server to the other (transactional replication, log-shipping, third-party, etc) have their plusses and minuses (cost, peformance hit, possible loss, rollover duration, mechanism of rollover, etc) and there truly is no generic best practice. You have to determine your needs (as Linchi said) and then evaluate each solution that will address all needs on a variety of parameters.
 Signature Kevin G. Boles Indicium Resources, Inc. SQL Server MVP kgboles a earthlink dt net
>> Or, "does anyone have any good (and hopefully recent) war stories >> about choosing an HA/DR configuration?" [quoted text clipped - 22 lines] >> >> Josh JXStern - 01 Jul 2008 17:32 GMT >Another consideration is how long can you afford to be offline after a >'holy-sh.t' event? As the answer to each particular piece of the HA/DR pie [quoted text clipped - 10 lines] >determine your needs (as Linchi said) and then evaluate each solution that >will address all needs on a variety of parameters. Thanks all, even these black-label warnings are informative.
Regarding Linchi's excellent point, you take a standard commercial setup. You really don't want to lose any data, but say nobody actually dies if data is lost, and no huge amounts of money, either, but a lot of goodwill. Heck, back in the day, I used to make sure we had immediate HARDCOPY backup of all critical data. Haven't done that in a while! Log shipping seems to handle that situation, as long as we're sure the receiver is receiving, the joint probability of two well-hosted systems failing at the same time, is pretty tiny. Though the comm links may fail at a higher rate, putting the plan in the yellow zone.
Seems to me the SQL 2005 "mirroring" feature, log-shipping with immediate application, with a frequent update, is pretty near that HA/DR sweet spot I was wondering about. If you have the money, a local backup *and* a remote one, and I'm guessing that's right around a best-practice. Too bad it's not easy to get them to share the load, but that's just another quantum step in difficulty, still. Is Oracle RAC really any better at this?
I guess if you want to get some good out of the standby systems, you'd consider using transactional replication instead of log-shipping? More overhead (?), complications of having a distributor, and generally much more finicky. Peer-to-peer replication looks good for load-sharing, but doesn't seem appropriate for very peaky loads, because you have to do every transaction twice on a production system under that architeture. And of course it requires a lot of planning and design just to get it going.
(just thinking out loud here!)
Josh
Andrew J. Kelly - 01 Jul 2008 14:49 GMT The solutions and options haven't changed a whole lot in the last few years in terms of DR and I venture to say there is no "Sweet spot" per say. Each solution has pros & cons and can work well for a particular requirement and not so well for something else. Everyone has their own SLA's, requirements, conditions etc. so someone else's horror story may not be applicable to you. The same is true for their success story. I have been involved in enough HA / DR planning with high end systems to know it is not that straight forward to actually implement. A big aspect of HA / DR for a high end system is the amount of money you are willing to spend on the solution. I recommend making a list of each of the DR options along with their pros & cons and see how each one fits into your SLA as a start. That way when you get closer to what you feel you will need for a solution you can ask more specifics on that particular technology in relation to your requirements. This really isn't a subject where you want to get general feedback because for one most people don't implement DR properly and thus have horror stories. But that does not mean it wasn't the correct solution in the first place.
 Signature Andrew J. Kelly SQL MVP Solid Quality Mentors
>>Josh, >> [quoted text clipped - 10 lines] > > Josh JXStern - 01 Jul 2008 17:12 GMT >The solutions and options haven't changed a whole lot in the last few years >in terms of DR and I venture to say there is no "Sweet spot" per say. Actually, I'm not sure how recent the change is, but does anybody even attempt to back up to tape anymore? Between RAID5 on the main storage and a hot/warm backup, I've seen places even with small databases, that never seem to cut a copy (would probably be CD/DVD not tape anyway) and send it off to Iron Mountain, or wherever.
> A big aspect of HA / DR for a high end system is the >amount of money you are willing to spend on the solution. True, but even there, hardware is so relatively cheap now, that I think best/common practices have changed. Still, does anybody ever consider hijacking a staging/dev server in the HA/DR plan? Used to do that, haven't seen it for a while. Probably the fact that you have to configure the connectivity for switchover is something of a factor, can't easily change domain on the fly ... or can you?
> This really isn't a >subject where you want to get general feedback because for one most people >don't implement DR properly and thus have horror stories. But that does not >mean it wasn't the correct solution in the first place. Been there, done that! SO, why does it still make me laugh?
Josh
TheSQLGuru - 01 Jul 2008 22:51 GMT 1) reset your brain to move away from RAID5 and on to RAID10 or similar as the standard redundant storage these days.
2) I agree that most clients (at least mine and ones I hear about) are doing disk-to-disk backups (or san snapshots, etc) and maybe then taking tape (usually doing so for larger entities).
3) Routine hardware may be relatively cheap, but that is a small piece of the HA/DR pie. The devil is in the details. And 'certified' hardware is definitely more expensive.
 Signature Kevin G. Boles Indicium Resources, Inc. SQL Server MVP kgboles a earthlink dt net
>>The solutions and options haven't changed a whole lot in the last few >>years [quoted text clipped - 25 lines] > > Josh JXStern - 02 Jul 2008 14:24 GMT >1) reset your brain to move away from RAID5 and on to RAID10 or similar as >the standard redundant storage these days. Really? Isn't RAID5 still a bit faster for the common read-mostly databases, and a bit cheaper in most configurations?
I know I've had occassion to define RAID10 classes on the SAN for big ETL jobs that are write-once/read-once, but was still using RAID5 for default databases and filegroups.
Josh
>2) I agree that most clients (at least mine and ones I hear about) are doing >disk-to-disk backups (or san snapshots, etc) and maybe then taking tape [quoted text clipped - 3 lines] >the HA/DR pie. The devil is in the details. And 'certified' hardware is >definitely more expensive. TheSQLGuru - 02 Jul 2008 16:25 GMT I don't believe the difference is that great since raid10 can do reads down both sides of the disk tree concurrently to enhance throughput. And very few databases are purely read only, and the raid5 write penalty is significant as you know.
Cost is increased, althought the disks themselves are pretty inexpensive. Higher cost will come when you need more drawers/cabinets to house sufficient drives to get the raid10 space you need.
 Signature Kevin G. Boles Indicium Resources, Inc. SQL Server MVP kgboles a earthlink dt net
>>1) reset your brain to move away from RAID5 and on to RAID10 or similar as >>the standard redundant storage these days. [quoted text clipped - 16 lines] >>the HA/DR pie. The devil is in the details. And 'certified' hardware is >>definitely more expensive.
|
|
|