SQL Server Forum / Other Technologies / Replication / September 2006
IS replication REALLY the way to go?
|
|
Thread rating:  |
Methodology - 22 Sep 2006 17:25 GMT so
Im coming to end of my replication testing and im really wondering if I want to do this. some basic problems that im noticing:
if i create a new table / SP I have to re-run the snapshot agent to create and apply a new snapshot to the subscriber. But my main dabse is 200GB, and my subscriber is on the end of a 2Mb sdsl in Paris, so unless im being completely stupid, its gonna take me..like...17 weeks to get the new snap over, correct?
I COULD create a publication per table, so that when I create a new table, I just create a new pub for it, but doesnt each pub have its own agent? so im going to need a CRAY computer to handle all the overhead, correct?
I admit that I can create a separate pub for my tables and my SP's so that I dont have to snap / rep everything again when I want to create a new SP, but surely theres an easier way than this to do things?
I reindex on a Sunday - what the hell is that going to do with my replication?
Im creating a disater recovery site for my London office in Paris. We use big fat hairy SQL databases that need an exact copy running in France - is replication REALLY what I need? I havenmt looked at mirroring yet - should I?
Alastair Jones.
Hilary Cotter - 23 Sep 2006 15:21 GMT answers inline.
 Signature Hilary Cotter Looking for a SQL Server replication book? http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS http://www.indexserverfaq.com
> so > [quoted text clipped - 8 lines] > completely stupid, its gonna take me..like...17 weeks to get the new snap > over, correct? If you add an individual article through the stored procedures (as opposed to through the wizard) the snapshot will only be for the new article and related metadata. It shouldn't take 17 weeks - I take it you are exagerating. You could zip it up send it to the subscriber and then pull the subscription and point to an alternate snapshot download folder.
> I COULD create a publication per table, so that when I create a new table, > I > just create a new pub for it, but doesnt each pub have its own agent? so > im > going to need a CRAY computer to handle all the overhead, correct? No, by default all publications will replicate to the same subscription databse through a single shared distribution agent.
> I admit that I can create a separate pub for my tables and my SP's so that > I [quoted text clipped - 4 lines] > I reindex on a Sunday - what the hell is that going to do with my > replication? It puts added load on the log reader agent. YOu should be fine though.
> Im creating a disater recovery site for my London office in Paris. We use > big fat hairy SQL databases that need an exact copy running in France - is > replication REALLY what I need? I havenmt looked at mirroring yet - should > I? Replication is the cheap way to do it. Log shipping it the traditional way to do it (and cheap), but you exposure to data loss is larger - 15 minutes + and it doesn't work well with reinidexing. If you want to do this right you need EMC's SRDF or a similar solution like that which costs millions. The cost of downtime can make a choice like SRDF economical.
> Alastair Jones. daveberm - 28 Sep 2006 15:04 GMT Data replication does not need to cost millions. Host based replication such as SteelEye Data Replication (SDR) and LifeKeeper HA can be had for <$10,000.
The maximum throughput of a 2Mbps line is 900 MB/hour minus Tcp overhead. It is less if the protocol being used is not optimized for WAN. SteelEye is optimized for WAN and also includes data compression, so depending on how much your data can be compressed (~4X), I would assume you might get about 3.5 GB per hour across the pipe. So the initial sync would take 2+ days. After that, you would only be sending the changes.
If you are based in the UK and would like more info, I suggest you contact Open Minds at http://www.openminds.co.uk/
David Bermingham, MCSE, MCSA:Messaging Senior Systems Engineer www.steeleye.com
> answers inline. > [quoted text clipped - 57 lines] > > > > Alastair Jones. Hilary Cotter - 28 Sep 2006 15:32 GMT It depends on your ability to tolerate data loss. Do you have a no data loss guarantee? Hitachi does, you do have to pay for it though.
Another point to keep in mind, with Hitachi or SRDF, if the source goes down the destination/dr site is up immediately with no data loss. Does your product provide such functionality? Failback is similarly uncomplex.
Your product is hostbased, it consumes resources on the host. SRDF, Hitachi and other hardware mirroring systems are not host based.
 Signature Hilary Cotter Director of Text Mining and Database Strategy RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's positions, strategies or opinions.
Looking for a SQL Server replication book? http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS http://www.indexserverfaq.com
> Data replication does not need to cost millions. Host based > replication such as SteelEye Data Replication (SDR) and LifeKeeper HA [quoted text clipped - 90 lines] >> > >> > Alastair Jones. dbermingham@gmail.com - 28 Sep 2006 21:25 GMT We support both Asyncronous or Syncronous mirroring. WIth syncronous mirroring the data is written to the target first, so the source will have to wait for the target write to complete before the write is committed on the source. This guarantees no data loss. Obviously there is a performance penalty for syncronous replication and it is not recommended across WANs. Even storage based syncronous replication will have some pretty hefty requirements when it comes to the type of connection they require and the latency it can tolerate.
The alternative is Asyncronous replication, which takes a copy of the write and sends it across the wire (keeping write order integrity), however it lets the original write complete without delay. This is the preferred method of replication across a WAN, although in the event of an unexpected failure, you could lose whatever is in queue, which is generally at most the last write. The benefit is that the source server does not receive a performance penalty and this option works just fine over a T1 with high latency.
Failover to the target site and failback for SQL server is automatic and generally in the 30-45 second range, and this includes not only data recovery, but starting SQL, recovering an IP address or doing DNS update, and starting any third party app as well. The mirroring is automatically reversed once the original host comes back online and just the data that has changed since it was offline is then sent back to the original server. Once the mirror is back in a mirroring state, it is as simple as clicking on a button in the GUI to bring the original primary back online.
When you say that with Hitachi or SRDF that the DR site is up immediately, you mean the data, not the applications, correct? Storage based replication is only concerned about the data, not the applications. Once the data is recovered, you are still left with the job of recovering the data. One of my customers was trying to do DR for Exchange with storage based replication, and was having a very hard time with it (http://www.itreseller.com/pr/6434). When he learned about SteelEye LifeKeeper, he didn't believe that something so inexpensive would work so much better than the solution he spent 100's of thousands of dollars on.
SteelEye Data Replication runs as a Windows filter driver, below the file system. Being at that layer means we take very little system resources, maybe 1% CPU and minimal RAM. Disk activity is only increased during the initial sync when the data needs to be read from disk. After a mirroring state is reached, the disk on the source is not impacted. The only time you will see any additional load on the system is if you turn on data compression. Even at the highest level of compression, on a single processor pentium 4 system you will see at most a 15% CPU utilization increase. Compression is only required if your rate of change exceeds the capacity of the WAN link. A T1 can handle a rate of change of about 650 MB per hour with no compression. Beyond that, you will need compression.
> It depends on your ability to tolerate data loss. Do you have a no data loss > guarantee? Hitachi does, you do have to pay for it though. [quoted text clipped - 114 lines] > >> > > >> > Alastair Jones. Hilary Cotter - 29 Sep 2006 02:35 GMT I am familiar with your class of products. They do serve a niche, but they don't do filtering, can't massage the data in flight, can't do bi-directional replication, can't replicate from multiple subscriber to the same table. The destination database is offline until failed over.
Its all done on a file basis - but such products do have their place and their customers are happy with them. I am sure you have many success stories.
Note that this customer has a problem when he adds a single article to a publication. Your product will solve his problem, but there is a cost associated with it and the destination database will be offline. I am not sure if this will be ok for him.
Also, don't you require your staff to be on site to set the product up?
Business must be bad that you have to come here to drum it up.
 Signature Hilary Cotter Director of Text Mining and Database Strategy RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's positions, strategies or opinions.
Looking for a SQL Server replication book? http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS http://www.indexserverfaq.com
> We support both Asyncronous or Syncronous mirroring. WIth syncronous > mirroring the data is written to the target first, so the source will [quoted text clipped - 4 lines] > will have some pretty hefty requirements when it comes to the type of > connection they require and the latency it can tolerate. Can you quantify this latency? For example what would it be from London to Paris over a 2Mb sdsl in the case of this client? SRDF adds millisecond latency across an 88 mile distance for a large financial client here in New Jersey.
> The alternative is Asyncronous replication, which takes a copy of the > write and sends it across the wire (keeping write order integrity), [quoted text clipped - 25 lines] > inexpensive would work so much better than the solution he spent > 100's of thousands of dollars on. Yes, I mean data, the application is handled on a different layer. The LUNs that the SQL Server's hang off are SRDF enabled which enables geo-spatial clustering.
> SteelEye Data Replication runs as a Windows filter driver, below the > file system. Being at that layer means we take very little system [quoted text clipped - 142 lines] >> >> > >> >> > Alastair Jones. daveberm - 29 Sep 2006 14:23 GMT I would say 95% of our installations are self installations. Professional services are generally done in smaller offices where IT staff is limited, or in larger companies where budget is not a concern. A typical SQL cluster take about 90 minutes to set up, with the majority of that time just installing and licensing the software. Of course a novice will want to read the manual and become familiar with all of the options, so I would plan on the better part of a day to read the manual and do the installation.
You are correct in that the target database is locked during replication, as this is an HA solution, not a load balanced solution or multi-master solution. However, our replication can support "one to many" and "many to one (different volumes)" replication.
Thanks for your comment about our business, we are doing fine. How are the book sales coming along?
> I am familiar with your class of products. They do serve a niche, but they > don't do filtering, can't massage the data in flight, can't do [quoted text clipped - 222 lines] > >> >> > > >> >> > Alastair Jones. daveberm - 29 Sep 2006 14:47 GMT Sorry, didn't see your last question. Latency added depends on the latency of the line. If round trip time is 5ms, then 5ms is the added delay in Syncronous mirroring. Hardware based replication is bound to the same laws of physics as well.
> I would say 95% of our installations are self installations. > Professional services are generally done in smaller offices where IT [quoted text clipped - 239 lines] > > >> >> > > > >> >> > Alastair Jones. Hilary Cotter - 29 Sep 2006 16:16 GMT Most of the host based solutions I am aware of in the States (Neverfail, and Doubletake) do not permit self installations and monitor the topology once put in place. Failover times and failback times are significant up to 15 minutes IIRC.
My comment about multiple publishers replicating to a single table is where you have a table on 5 servers which all replicate to a single table - its called a central subscriber topology. While you can replicate from one database/array/file to one or more servers products like yours can't do this as they replicate on a file or array level. Once again I want to stress that products like yours have a place, and I know some folks using Doubletake who are very happy with them. I don't think it would be a good fit in a situation like this with the available bandwidth and distance.
Book sales are going ok - thanks for asking. They have slowed down since SQL 2005 has been released.
 Signature Hilary Cotter Director of Text Mining and Database Strategy RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's positions, strategies or opinions.
Looking for a SQL Server replication book? http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS http://www.indexserverfaq.com
>I would say 95% of our installations are self installations. > Professional services are generally done in smaller offices where IT [quoted text clipped - 265 lines] >> >> >> > >> >> >> > Alastair Jones.
|
|
|