I just set up a cluster attached to a SAN. On one of the nodes, the cluster
service doesn't start up right away. I have checked the service to make sure
it is set to Automatic, which it is. Both nodes are current with the latest
patches and security updates, so I'm a little clueless as to why this is
happening. Here is the really weird part: after 1 minute, the cluster service
starts on the node that is giving me trouble. The other node is perfectly
fine.
Hi
If you do a failover using Cluster Administrator, do any resources (that SQL
Server depends on) take a long time to come online?
I have seen similar issues when the devices take long to come online due to
high SAN activity.
Does the SQL Server resource come online, but take a long time until it has
done its recovery steps?
This can occur when the other node was deporting its devices and still had
I/O pending. As a result, not all pages are flushed to the SAN, so SQL
Server has to do more recovery on database start-up.
The best gauge of how quickly a resource comes online is to watch Cluster
Administrator during the failover.
Regards
Mike
Thomas - 25 Jan 2005 16:27 GMT
We haven't installed SQL Server yet. I should've posted that first. But I
know from past installs that SQL does take some time to come online. The
SAN doesn't have much activity on it right now.
Mike Epprecht (SQL MVP) - 25 Jan 2005 16:49 GMT
Hi
Have a look in your event logs and check the time differences between when
Node A shuts down and Node B notices it and starts up. There will be at least
15 event messages during this process. Post the information here so that I
can compare it to our big clusters.
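
To make the time differences easy to read off, here is a rough Python sketch that takes event entries copied by hand from the logs and prints the gap between each consecutive pair. The timestamps, node names, and messages below are made up for illustration; they are not real cluster event text, and the `gaps` and `slowest` helpers are just names picked for this example.

```python
from datetime import datetime

# Hypothetical entries copied by hand from the event logs.
# Timestamps, node names, and messages are invented for illustration.
events = [
    ("2005-01-25 16:02:10", "NodeA", "shutdown begins"),
    ("2005-01-25 16:02:25", "NodeB", "notices Node A is gone"),
    ("2005-01-25 16:03:27", "NodeB", "cluster service started"),
]


def gaps(entries):
    """Return (seconds, earlier event, later event) for each consecutive pair."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), node, msg)
        for ts, node, msg in entries
    )
    return [
        ((b[0] - a[0]).total_seconds(), f"{a[1]}: {a[2]}", f"{b[1]}: {b[2]}")
        for a, b in zip(parsed, parsed[1:])
    ]


def slowest(entries):
    """Return the consecutive pair with the largest time gap."""
    return max(gaps(entries), key=lambda g: g[0])


for seconds, before, after in gaps(events):
    print(f"{seconds:6.0f}s  {before}  ->  {after}")
```

The largest gap is usually the step worth looking at first, which is what `slowest(events)` returns.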
Regards
Mike
> We haven't installed SQL server yet. I should've posted that first. But I
> know from passed installs that SQL does take some time to come online. The
[quoted text clipped - 28 lines]
> > > minute the cluster service starts on the node that is giving me troubles.
> > > The other node is perfectly fine.
That may be somewhat normal on a simultaneous startup. The first node grabs
the quorum device and owns the cluster but isn't talking on the network yet.
The second node tries to get the device but times out. Eventually the
service comes online and talks to the other node and agrees on who is in
charge. This is especially prevalent on SCSI-based clusters.
Check the System and Application event logs on both systems to see if there
are any unusual startup errors. Also, check what happens when the second
node is rebooted. If the cluster service does come online quickly, it is
just a device contention issue. I try to avoid powering up more than one
cluster node at a time.

Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com
I support the Professional Association for SQL Server
www.sqlpass.org
Thomas - 25 Jan 2005 17:41 GMT
Node 2, which I haven't seen the problem with, has ownership of the cluster.
The problem shows up when I reboot node 1: it takes 1 minute to start the
cluster service.
Geoff N. Hiten - 25 Jan 2005 17:51 GMT
Node order is arbitrary in a cluster. We could use Node X and Node Y
instead of Node 1 and Node 2.
Try manually stopping and starting the cluster service on Node 1. If it
restarts quickly, then the problem likely is one of the services that the
cluster service depends on. Time service is a usual suspect for that, but
you will have to check the entire list. Again, the Application and System
event logs are your friends here.
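
One way to put a number on "restarts quickly" is to time the stop and the start separately. The sketch below is only an illustration: it assumes a Windows node, an elevated prompt, and that the Cluster service is registered under the service name ClusSvc. `timed_run` and `restart_cluster_service` are names made up for this example.

```python
import subprocess
import time


def timed_run(command):
    """Run a command, wait for it to exit, and return (exit code, seconds)."""
    start = time.monotonic()
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, time.monotonic() - start


def restart_cluster_service(service="ClusSvc"):
    """Stop and then start the service with `net`, timing each step.

    Assumption: the service name is ClusSvc and the script runs with
    administrative rights on the cluster node itself.
    """
    results = {}
    for action in ("stop", "start"):
        results[action] = timed_run(["net", action, service])
    return results
```

Calling `restart_cluster_service()` on Node 1 returns a dictionary with an exit code and elapsed seconds for the stop and the start. If the manual start comes back in a few seconds while the start at boot takes a minute, that points toward something the service waits on at startup rather than the service itself.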
Now is the time to deal with this issue, not after you load SQL and get this
baby into production.
