Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / DB Engine / SQL Server / July 2008

Tip: Looking for answers? Try searching our database.

Does the optimizer take into account server load and available memory?

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Ian Boyd - 15 Jul 2008 22:08 GMT
Does the optimizer take into account server load and available memory - and
any other resources it needs?

We have a database moved to SQL Server 2005, and like most people it
suddenly runs slow.  For one query in particular the logical I/O incresed by
a factor of 10. My question is not about tuning this query, or tuning any
other queries on this new 2005 production server that a customer has.

SQL Server's optimizer uses heuristics to determine the best execution plan.
In this case, the statistics on the "old" SQL 2000 and the "new" SQL 2005
machines were updated (with fullscan). The database on 2005 is a restore of
the database that was on 2000.

So if the optimizer is choosing different execution plans, it isn't because
of the data distribution - it was the same data with all the rows resampled.

So i'm thinking of what else could cause the optimizer to perform a query
differently, if not based on DDL and data distribution statistics, and i
think of the resources on the server. Perhaps the new 2005 machine has more
simultaneous users, or more I/O requests, or it knows it has a slower (or
faster) network connection. Perhaps it is in 2-phase commit with transaction
log shipping. Maybe there's not enough RAM to handle all 20 live databases.

Maybe if there isn't enough available memory, then the optimizer will decide
to run the query and have to use a worktable, rather than being able to do
everything in memory. Maybe if the network is too slow, it knows it can't
dump the rows out the wire fast enough to free up it's own memory. Blah blah
blah etc etc.

My question is: would an SQL 2005 database engine run a query than another
SQL 20005 engine (with identical DDL and statistics), based on availabe
resources?

Could a busier machine, with hundreds of simultaneous users, and less free
RAM (because of hundreds of simultaneous users) choose to run a query
differently? Or does the optimizer stick just to the DDL and statistics, and
that's it?
Russell Fields - 15 Jul 2008 23:12 GMT
Ian,

Well, one thing that the optimizer can take into account is availability of
extra processors.  On a server with more processors available at the time
the execution plan is created, it might choose a parallel plan.  But if the
machine is very busy, it probably will not choose a parallel plan.

This seems like a good idea, but many performance problems have been traced
to parallelism issues.  This is one issue discussed here:
http://support.microsoft.com/kb/243589

FWIW,
RLF

> Does the optimizer take into account server load and available memory -
> and any other resources it needs?
[quoted text clipped - 35 lines]
> differently? Or does the optimizer stick just to the DDL and statistics,
> and that's it?
Ian Boyd - 16 Jul 2008 12:49 GMT
> This seems like a good idea, but many performance problems have been
> traced to parallelism issues.  This is one issue discussed here:
> http://support.microsoft.com/kb/243589

Are you sure that's the right kb article? That's an article about generally
how to speed up your queries (make sure you have indexes, don't have join
hints).

They say if it was a parallel plan, for the hell of it try forcing it to
single path with OPTION (MAXDOP 1)

Not really a discussion of people having issues on multi-cpu/core machines
having to either force on or off parallel executions.
Russell Fields - 16 Jul 2008 13:56 GMT
Ian,

I figured that was enough to give you an idea since you asked what might
change and parallelism is one answer.  If you want more details, there are
many interesting articles:

http://technet.microsoft.com/en-us/library/ms178065.aspx BOL on parallelism
http://support.microsoft.com/kb/329204
http://bytes.com/forum/thread144731.html
http://www.sql-server-pro.com/max-degree-of-parallelism.html
http://blogs.msdn.com/psssql/archive/2008/02/13/how-it-works-sql-server-per-quer
y-degree-of-parallelism-worker-count-s.aspx


In some articles you will see that forcing parallelism off is not the only
option.  Better indexes, re-examining your code, and so forth may be better
answers.

All the best,
RLF

>> This seems like a good idea, but many performance problems have been
>> traced to parallelism issues.  This is one issue discussed here:
[quoted text clipped - 9 lines]
> Not really a discussion of people having issues on multi-cpu/core machines
> having to either force on or off parallel executions.
Ian Boyd - 16 Jul 2008 14:18 GMT
> I figured that was enough to give you an idea since you asked what might
> change and parallelism is one answer.  If you want more details, there are
> many interesting articles:

i thought it would have talked about someone's real world problem, or talk
about parallism and how it can affect the optimizer.

> http://technet.microsoft.com/en-us/library/ms178065.aspx BOL on
> parallelism
[quoted text clipped - 6 lines]
> option.  Better indexes, re-examining your code, and so forth may be
> better answers.

i, like many people, are trying to understand why identical data with
identical statistics and identical DDL can run one way on SQL 2000 engines,
but vastly different by 2005 engines.

i was hoping that it has to be a problem with the optimizer heuristics on
the machine it's running, not a change in the optimizer heuristics
themselves in 2005. If it's the latter: then everyone is faced with
potentially re-writing every query in every application in order to trick
the 2005 optimizer to run it like the 2000 optimizer did.
Ian Boyd - 16 Jul 2008 14:35 GMT
> i, like many people, are trying to understand why identical data with
> identical statistics and identical DDL can run one way on SQL 2000
> engines, but vastly different by 2005 engines.

For example, just yesterday a guy named PVG posted a question in
.programming.

A query of the form:
   IF (
       SELECT COUNT(*) ...) = 0
   BEGIN
       ...
   END
runs in 10 seconds on 2000. On 2005 it takes 120 seconds. That's an order of
magnitude slower.

If he changed it to:
   DECLARE @TheCount int
   SELECT @TheCount=COUNT(*) ...
   IF @TheCount = 0
   BEGIN
       ...
   END

it ran fast again on 2005. Now everyone can see that the two forms are
identical, but 2005 chose to run it very very poorly. But, as Roy Harvey
says,
   "The optimizer will never be perfect, and there will always be cases we
have to deal with.  Such issues come up when upgrading on a fairly
regular basis."

Well why is that? You would think that a more advanced of the SQL Server
engine would be better and determining identical queries, and running them
identically. At the very least 2005 should be able to run a query as good as
2000. Roy goes on to say,
   "We would never think twice about them when they come up while writing
an application because we would simply write the query a bit differently
until performance was acceptable, and not even remember there was an issue."

Which is simply not true. i come across problems all the time where a query
runs poorly. When i rearrange it to an identical form, i post a question in
the .programming newsgroup asking,
   "Why are these two queries not identical? If the optimizer is running
them differently it MUST be because they have the potential to return
different results. If they were truly identical, they would have identical
plans."

And the reponse always is, "The optimizer sometimes gets it wrong. If you
don't like it, don't upgrade."
Roy Harvey (SQL Server MVP) - 16 Jul 2008 16:58 GMT
Sorry if you don't like the answers.

The optimizer chooses a plan based on statistics.  The statistics are
incomplete by their very nature, a description of the data rather than
the data itself.  This imprecision is one reason the optimizer makes
mistakes.  Another is that the optimizer makes assumptions about
hardware performance, particularly the relative performance of
processor and disk, but that is only a general estimate not the
specific hardware being used.

I can only see two ways for the optimizer can guarantee never to
regress and give a worse plan than an earlier version.  One would be
if the new version is perfect.  The second is if the new version is
unchanged from the old version.

But as long as they keep improving the optimizer without making it
perfect, some queries will perform worse.  Not many, but enough to
raise these questions every time a new release (or sometimes just a
service pack) comes out.  I suppose I've gone through it enough times
to be jaded, so I just rewrite the query and move on.

Roy Harvey
Beacon Falls, CT

>> i, like many people, are trying to understand why identical data with
>> identical statistics and identical DDL can run one way on SQL 2000
[quoted text clipped - 45 lines]
>And the reponse always is, "The optimizer sometimes gets it wrong. If you
>don't like it, don't upgrade."
Ian Boyd - 16 Jul 2008 19:00 GMT
> some queries will perform worse.  Not many, but enough to
> raise these questions every time a new release (or sometimes just a
> service pack) comes out.  I suppose I've gone through it enough times
> to be jaded, so I just rewrite the query and move on.

i would love to re-write the queries and move on. i mean it's obvious that
the query can run better, and was probably already sub-optimal.

The real problem here is that we have a customer who is demanding that we
fix the entire system, without having to go though the system to find slow
queries. "You fixed that query, who's to say that we won't find some other
poorly performing query tomorrow? We want it to all be fixed, and we don't
want to have to keep coming back to fix query after query."

It's not me that doesn't like the answers, it's the salesmen and customers.
Andrew J. Kelly - 16 Jul 2008 19:30 GMT
> The real problem here is that we have a customer who is demanding that we
> fix the entire system, without having to go though the system to find slow
> queries. "You fixed that query, who's to say that we won't find some other
> poorly performing query tomorrow? We want it to all be fixed, and we don't
> want to have to keep coming back to fix query after query."

In reality there are probably only a handful of queries that are an issue
and need to be re-written. Finding which ones these are and prioritizing
them should take you less than an hour. If you fix the one that is called
the most often and is inefficient first you will make the biggest bang for
the buck. Usually just a few queries are 90% of the problems in a system. I
know this because that is what I do for a living and see it day in and day
out from client to client. Yes it is true that when ever you fix one
bottleneck another will appear but usually at a rate much less than the
previous one. If you tackle just a few of the top ones you can make a world
of difference in the overall performance.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

Russell Fields - 16 Jul 2008 14:40 GMT
Ian,

The optimizer code changes with every release of SQL Server and often
changes to some degree with service packs as well.  Overall in my experience
the improvement is significant.

However, with change to the optimizer some plans will not be as good as
before.  Naturally, we notice the pain points more than the improvements.
(At least I do.  Improvements put no burden on me, but degraded plans and
performance mean a lot of effort to figure out what is happening and how to
fix it.)

I think that, using parallelism as an example, techniques such as better
analysis and design of indexes is probably a safe fix for most releases. It
exploits a fundamental control on the optimizer's decisions.  Hints (whether
MAXDOP or something else) either reflect a failing in the optimizer that we
have to work around (it happens) or else a failure on our part to fully
understand what we should have done.

Sorry, that I do not have the information to give you a more detailed
answer.  You might find some interest in reading some of the Microsoft SQL
Server Engine and Query blogs.  Links can be found here:
http://searchsqlserver.techtarget.com/generic/0,295582,sid87_gci1132175,00.html

RLF

>> I figured that was enough to give you an idea since you asked what might
>> change and parallelism is one answer.  If you want more details, there
[quoted text clipped - 23 lines]
> potentially re-writing every query in every application in order to trick
> the 2005 optimizer to run it like the 2000 optimizer did.
Ian Boyd - 16 Jul 2008 16:14 GMT
> However, with change to the optimizer some plans will not be as good as
> before.  Naturally, we notice the pain points more than the improvements.
> (At least I do.  Improvements put no burden on me, but degraded plans and
> performance mean a lot of effort to figure out what is happening and how
> to fix it.)

But you understand the confusion right?

Even if it query was poorly written, we know that SQL Server has the ability
to run it lightening fast. But instead it says, "*tsk* *tsk* *tsk* You
shouldn't have done that. Look at this, this is all wrong. You're doing a
correlated sub-query? No, no, no. As punishment i'm going to run this slow.
Don't write your query so poorly next time."

The real problem is a screaming customer, SCREAMING, who wants us to fix
2005 so everything runs like it did on 2000. AND doesn't want us tuning
individual queries. "You fixed that slow query, but who's to say we won't
find another slow query next week, or next month. i want you to fix it so
that EVERYTHING runs as good as it did on 2000."

My own reaction is, "F*ck you.", but then i'm now allowed to talk to
customers.
TheSQLGuru - 16 Jul 2008 21:57 GMT
1) Probably a good thing you aren't allowed to talk to customers.  :-))

2) Why don't you simply roll back to 2000 for the product and/or client in
question?

3) Better yet, why didn't your internal testing discover this (and other??)
performance issues before deciding to migrate to 2005 and fix them during
that pre-migration cycle - or decide to not migrate at all due to issues
noted?

4) "No, no, no. As punishment i'm going to run this slow.   Don't write your
query so poorly next time."  What a silly statement ... ;)

5) "The real problem is a screaming customer, SCREAMING, who wants us to fix
2005 so everything runs like it did on 2000. "  Seems to me that the real
problem is not doing a good job on number 3 above.

Signature

Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net

>> However, with change to the optimizer some plans will not be as good as
>> before.  Naturally, we notice the pain points more than the improvements.
[quoted text clipped - 18 lines]
> My own reaction is, "F*ck you.", but then i'm now allowed to talk to
> customers.
Ian Boyd - 17 Jul 2008 14:21 GMT
> 2) Why don't you simply roll back to 2000 for the product and/or client in
> question?

New information. It runs run on 2000. It runs fine on our development 2005.
In runs fine on a crappy test 2005 of theirs. It runs fine on another 2005
machine they setup. It runs poorly on their live 2005.

> 3) Better yet, why didn't your internal testing discover this (and
> other??) performance issues before deciding to migrate to 2005 and fix
> them during that pre-migration cycle - or decide to not migrate at all due
> to issues noted?

There were no performance issues.

i'll start a new thread with a fresh set of questions.
TheSQLGuru - 16 Jul 2008 21:48 GMT
> i, like many people, are trying to understand why identical data with
> identical statistics and identical DDL can run one way on SQL 2000
> engines, but vastly different by 2005 engines.

How about because the 2005 optimizer was almost entirely rewritten?  There
were MANY instances of performance regression, including one just fixed
recently in CU7.  You may have stumbled upon another.  I will note that in
total the optimzier and engine for 2005 is MUCH better than 2000, but if you
get hit by a 'hole' it can be painful.

Signature

Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net

>> I figured that was enough to give you an idea since you asked what might
>> change and parallelism is one answer.  If you want more details, there
[quoted text clipped - 23 lines]
> potentially re-writing every query in every application in order to trick
> the 2005 optimizer to run it like the 2000 optimizer did.
Andrew J. Kelly - 16 Jul 2008 15:32 GMT
> My question is: would an SQL 2005 database engine run a query than another
> SQL 20005 engine (with identical DDL and statistics), based on availabe
> resources?

Yes this is possible but mainly in the area of when to use parallel
processing and how many.  So a query can easily run faster or slower, use
more or less I/O etc. depending on the number of threads it chooses to
generate. To some degree this may also be true of the type of joins or
processing it does. For instance hashing & sorting are very memory
intensive. If one machine has more available memory it may perform better
with some of these.

> Could a busier machine, with hundreds of simultaneous users, and less free
> RAM (because of hundreds of simultaneous users) choose to run a query
> differently? Or does the optimizer stick just to the DDL and statistics,
> and that's it?

Absolutely as explained above. Again this is especially true with parallel
queries. A very busy machine will be much less likely to even use
parallelism over one with just a few users given the same hardware.

> SQL Server's optimizer uses heuristics to determine the best execution
> plan. In this case, the statistics on the "old" SQL 2000 and the "new" SQL
> 2005 machines were updated (with fullscan). The database on 2005 is a
> restore of the database that was on 2000.

While it is true the statistics have changed between 2000 and 2005 that is
not the only determining factor in how a query plans gets generated. The
optimizer is dramatically different in many areas and may very well choose a
different plan on 2005 than 2000 given everything else the same. Every
upgrade I have seen from 2000 to 2005 has had an increase in overall
performance but there are definitely some individual plans that are worse
that people have run across. This is due to the changes we mentioned. Most
of the plans are the same or better but a few are not. Some you may just
need to tweak. One prime example was a client who had a nasty query in which
they used some sub selects with NOT IN etc. They came to me and said this
runs much slower on 2005. When I looked at the query I immediately indicated
that is not how I would write the query in the first place and in this case
it should have been a Left Outer Join. When they changed it to that it ran
considerably faster than in 2000. This is a classic example of when a query
is written poorly the optimizer favors the correct syntax instead. I for one
would rather have the optimizer do a better job on correctly written queries
than poorly written one at the expense of the others. You can not get 100%
of the queries optimized to 100% no matter how hard you try so they have to
do what is best overall and some fall thru the cracks. This is true of any
product like this. You may simply have to tweak some queries to get the best
plan when upgrading but overall most should be fine.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

> Does the optimizer take into account server load and available memory -
> and any other resources it needs?
[quoted text clipped - 35 lines]
> differently? Or does the optimizer stick just to the DDL and statistics,
> and that's it?
Linchi Shea - 17 Jul 2008 00:26 GMT
Andy;

I've been wondering about whether teh SQL Server query optimizer may choose
a different plan depending on the current load level. Put parallelism aside
for now, I'm not sure it does, and even if it does, I'm not sure it's a good
idea as fundamentally that would destroy plan stability.

But of course it depends on what it exactly means by the optimizer choosing
a different plan depending on the current load level. First, it could mean
that teh optimizer actively checks out the current load level. I highly doubt
that would be the case. For one thing, it can be expensive for the query
optimizer to correctly assess the current load level. And it could be really
bad if it gets the load level wrong. And then, what happens if the current
load level is very different from the load level when the query starts to
execute, i.e. if the load level is rather dynamic?

Second, it could mean that the current load level may cause the optimizer to
do less or more work in evaluating plans, therefore indirectly impacting the
plan generation. That sounds more sensible in terms of the impact the current
load may have on plan generation. Still, I'm not sure it's a good idea, for
the same reason of wanting to have plan stability.

I don't recall any explicit discussions by the MS folks on how the current
load level may influence the query optimizer. If anybody has a link to any MS
official documentation on this, I'd appreciate it if you could post it.

If the load level is an important factor into optimizer's decisions, we'd
better know exactly how it may change a query plan.

Linchi

> > My question is: would an SQL 2005 database engine run a query than another
> > SQL 20005 engine (with identical DDL and statistics), based on availabe
[quoted text clipped - 83 lines]
> > differently? Or does the optimizer stick just to the DDL and statistics,
> > and that's it?
Andrew J. Kelly - 17 Jul 2008 01:53 GMT
Linchi,

Well the focus was definitely on parallelism and it most definitely checks
the current load level before running the query each and every time. If a
query has a parallel plan (it always has a single threaded one) it will
evaluate the current processor load, available memory, number of users etc.
to determine if and how many threads it should create and run with. This
time it may be 1 and next it may be 32 or anywhere in between.  As for other
non parallel plans I will have to go back and read Craig Freedman's chapter
of Inside SQL Server to see where it affects things. But off hand I seem to
remember from the 2000 days that some parts of the overall plan may deviate
from one plan to the next based on available memory and such. That may not
be true in 2005 anymore. But the performance of the operation can certainly
deviate up or down depending on conditions., especially memory. For instance
there is a certain amount of memory that a query needs before it will even
start to process. If there is insufficient memory it will wait. Or if the
amount of memory for a hash or sort operation was underestimated by the
optimizer it will have to spill to disk and cause slower performance. So
technically a query running with the same query plan can still run
dramatically different from iteration to iteration apart from the normal
blocking and I/O issues in a performance standpoint.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

> Andy;
>
[quoted text clipped - 156 lines]
>> > statistics,
>> > and that's it?
Linchi Shea - 17 Jul 2008 04:31 GMT
For non-parallel plans, the amount of memory used or required, spill of a
hash to disk, and so on are all resource usage issues. None of these involves
a different query plan or causes the optimizer to come up with a different
plan. It's the same plan whose execution responds to the conditions of the
system. Consider the situation when a page is not found in the buffer pool,
the page is retrieved from disk. You wouldn't classify the changes in the
number of pages read from disk or the changes in the number of pages flushed
to disk as deviations from its original plan.

Again, for non-parallel queries/plans, I would think it's a bad idea for the
optimizer to include the current load level as an input parameter into its
plan selection decision for the reasons stated previously.

Linchi

> Linchi,
>
[quoted text clipped - 178 lines]
> >> > statistics,
> >> > and that's it?
Andrew J. Kelly - 17 Jul 2008 20:42 GMT
Linchi,

I just re-read the original question and I see that he was probably
concerned with a different plan altogether. I was trying to emphasis the
fact two executions of the same plan can have dramatically different results
performance wise just from available memory, procs etc. So OK on that part.
As for the parallel plans I disagree totally.  If it did not evaluate the
conditions each time you might have a situation in which the original plan
called for using 8 procs on an 8 proc machine. If the machine had low usage
at the time that might be fine. But if later on you now have 300 users and
this one query wants to use all 8 procs you have a problem. That is why it
reevaluates each time and will not use more procs than the current work load
or resources can effectively handle. Remember the execution contexts for
parallel plans are never kept and reused as are single plan ones. This is
the main reason why.  It still keeps the plan just not the part that states
how many pros it will use. It is either a parallel plan or a single plan
period. The execution context will say how many procs it actually uses. On
an OLTP system parallel operations are not that common overall compared to
other activity. And when it does it should be because it is resource
intensive. That said resource & processor intensive don't match well with
concurrency.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

> For non-parallel plans, the amount of memory used or required, spill of a
> hash to disk, and so on are all resource usage issues. None of these
[quoted text clipped - 246 lines]
>> >> > statistics,
>> >> > and that's it?
Linchi Shea - 17 Jul 2008 23:06 GMT
Andy;

We are not in disagreement w.r.t. query parallelism.

Linchi

> Linchi,
>
[quoted text clipped - 268 lines]
> >> >> > statistics,
> >> >> > and that's it?
Ian Boyd - 18 Jul 2008 18:36 GMT
> It still keeps the plan just not the part that states how many pros it
> will use. It is either a parallel plan or a single plan period.

Does that mean that if on first run of a stored procedure it was decided to
be single proc, it would forever be single proc until sp_recompile'd? Or
does it have two plans, Single proc, and n-proc?
Andrew J. Kelly - 18 Jul 2008 20:00 GMT
> Does that mean that if on first run of a stored procedure it was decided
> to be single proc, it would forever be single proc until sp_recompile'd?
> Or does it have two plans, Single proc, and n-proc?

Yes if when it was compiled or recompiled the optimizer decided there was
not enough work etc. to warrant a parallel plan it will always run as a
single proc operation. The next time it gets recompiled the optimizer will
evaluate everything all over again and decide if the threshold is above the
parallel cost limit and if so it can generate a parallel plan. If not it
will be single again.

But if it does generate a parallel plan it always generates a single plan as
well. And each time it is run the current conditions are evaluated and it
will decide how many to use at that time. Even though it ran with multiple
procs last time it might only use 1 this time but if the plan stays intact
it always has the option to go up or down on the procs.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

Ian Boyd - 21 Jul 2008 15:16 GMT
> Yes if when it was compiled or recompiled the optimizer decided there was
> not enough work etc.

My hope is that the decision whether or not to run it in parallel is not
determined by server load, but by the data.

i would hate for a stored procedure to be stuck running serially because the
server happened to be busy those 2 milliseconds when the stored procedure
was first run all those years ago.
Ian Boyd - 21 Jul 2008 15:25 GMT
> My hope is that the decision whether or not to run it in parallel is not
> determined by server load, but by the data.

nm. Just saw Gert-Jan's response.
Andrew J. Kelly - 21 Jul 2008 19:04 GMT
You didn't read my posts well enough. It is things such as the amount of
data, the query type etc that will dictated whether it will bother to
generate a parallel plan in ADDITION to the serial one. The decision of
which to run each time it is executed is then based on the current workload.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

>> Yes if when it was compiled or recompiled the optimizer decided there was
>> not enough work etc.
[quoted text clipped - 5 lines]
> the server happened to be busy those 2 milliseconds when the stored
> procedure was first run all those years ago.
Ian Boyd - 21 Jul 2008 21:25 GMT
> You didn't read my posts well enough. It is things such as the amount of
> data, the query type etc that will dictated whether it will bother to
> generate a parallel plan in ADDITION to the serial one.

Withtout the emphasis even that line am ambigious. Changing the emphasis
(which is in the readers head) changes the meaning:

   "It is things such as the amount of data, the query type etc that will
dictate WHETHER it will bother to generate a parallel plan in addition to
the serial one."

...and i just wanted to be clear.
Gert-Jan Strik - 20 Jul 2008 18:31 GMT
Ian,

There are things the optimizer does at compile time, and things the
storage engine does at runtime. They have different options and
responsibilities.

When the optimizer generates the query plan, it uses statistics and the
server configuration. It will only use the information that would cause
an invalidation of the query plan when changed. So it will use
parameters like the amount of memory SQL Server has to its disposal, the
number of CPUs, etc.

However, it will not use information about the current server load,
because the query plan is not build just for that point in time. It is
expected to run fast whenever it is run. For example, it (unfortunately)
does not take the Buffer Cache Hit ratio into consideration.

The optimizer will generate one or two query plans. If it creates two
query plans, the second query plan will be a query plan that includes
parallellism.

The storage engine will execute the query plan. The current server state
can influence the execution in a lot of different areas. I know quite a
few (which I will mention below), but I am sure there are more. And this
is definitely an area where changes in a fix or service pack do not
easily show up. The storage engine will make all these decisions every
time it executes(the same) query plan.

As mentioned by Andrew and Linchi, the first choice of all is whether to
use the parallel query plan (when available). If there is sufficient
memory and CPU capacity available, then it will, otherwise it will stick
to the serial query plan.

If it has to do a table scan, and the "conditions" are right, it will
piggyback on the table scan of another execution plan that is already
running. If the conditions are not right, and the required page is not
in memory, it has to be read from disk.

Depending on conditions unknown to me (except for the knowledge that
Enterprise Edition is more aggresive than Standard Edition), the storage
engine decides whether or not to read-ahead, and how much. Read aheads
undoubtedly have their influence on the outstanding locks.

Depending on the number of outstanding locks (server wide), it will
determine whether to use standard row locks or page locks. Depending on
the number of outstanding locks on the table for the running execution
plan, it will determine whether to keep issuing row/page locks for new
rows/pages or to escalate to a table lock.

Depending on the available amount of memory, hashes and work tables are
entirely (done) in memory, or spill over to tempDB.

So there are a lot of things that are decided at runtime, and that can
greatly affect performance.

In principle, I think it is impossible to improve the optimizer without
sacrificing something. IMO, it is always a tradeoff, and the balance
should be good overall (for the customer group at which the RDBMS is
targetted?). You cannot expect each and every (fine tuned) query on SQL
Server 2000 to run just as fast (or faster) on SQL Server 2005, when you
know that there have been a lot of performance improvements and added
features in 2005. This is the same for all improved products. Not
everything on Vista runs as fast as it did on Windows XP, and not
everything on Windows XP runs as fast as it did on Windows 98, however,
overall they are better products than their predecessors (at least
Windows XP was :-))

Signature

Gert-Jan
SQL Server MVP

> Does the optimizer take into account server load and available memory - and
> any other resources it needs?
[quoted text clipped - 33 lines]
> differently? Or does the optimizer stick just to the DDL and statistics, and
> that's it?
Ian Boyd - 21 Jul 2008 15:20 GMT
ty for all that.

> However, it will not use information about the current server load,
> because the query plan is not build just for that point in time. It is
> expected to run fast whenever it is run.
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.