Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / DB Engine / SQL Server / July 2008

Tip: Looking for answers? Try searching our database.

Need help explaining slow function

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
MDoyle - 14 Jul 2008 18:21 GMT
I have a function written below:  I think what the tables are is
irrelevant but if you need to hear an explanation Ill be more than
happy to provide you with an explanation:

ALTER FUNCTION fn_v33_IsGroupApproved(@GRP_nID as int)
RETURNS bit AS BEGIN

DECLARE @isGroupApproved as bit--, @GRP_nID as int
SET @isGroupApproved = 1;

SET @GRP_nID = (SELECT GRP_nID FROM ppNominations WHERE NOM_nID =
@NOM_nID);

IF (@GRP_nID) IS NULL OR (@GRP_nID) = 0
    SET @isGroupApproved = 1;
ELSE
BEGIN
    IF (   SELECT COUNT(*)
          FROM ppNominations
          WHERE GRP_nID = @GRP_nID AND NOM_lManagerApproved = 0
        ) = 0
        SET @isGroupApproved = 1;
    ELSE
        SET @isGroupApproved = 0;
END

RETURN @isGroupApproved;
END

Why would this query on a table of about 5000 records take 8 seconds
to run?  if I change the AND keyword in the the line: WHERE GRP_nID =
@GRP_nID AND NOM_lManagerApproved = 0 to OR, it's sub-second.

I do not really need a solution, since I came up with a workaround to
it, but what would cause just a simple switch from AND to OR on that
WHERE Clause to cause such a drastic change in speed?
Eric Isaacs - 14 Jul 2008 20:00 GMT
> SET @GRP_nID = (SELECT GRP_nID FROM ppNominations WHERE NOM_nID =
> @NOM_nID);

You're not declaring @NOM_nID in your example above, so we don't have
the complete picture.

It's likely that the indexes on ppNominations are not covering both
GRP_nID AND NOM_lManagerApproved.

This is not the type of function you would want to execute from a view
or a procedure that returns multiple rows.  This function should be
called only when you want to process it for one row.  It sounds like
that is exactly what you're trying to do.  SQL Functions don't scale
well when used in this way.  You're likely scanning that 5000 row
table 5000 times to produce the desired results.

Instead of having a function for this, you should consider adding this
table to your view or other SQL that is calling this function 5000
times.  Then there would be one join and one scan to get the results
you want.

SQL Functions are convenient from a development standpoint and help
breakout complex logic, but they should not be used in views when
there is more than one table involved or there is no covering index
involved.  You'll get much better performance if you keep the joins in
the main SQL rather than split them out into a separate function.

You should also consider adding WITH SCHEMABINDING to your functions
for better performance in SQL 2005.

-Eric Isaacs
JXStern - 15 Jul 2008 00:10 GMT
>You should also consider adding WITH SCHEMABINDING to your functions
>for better performance in SQL 2005.

What's that all about?

Josh
Eric Isaacs - 15 Jul 2008 01:39 GMT
> What's that all about?

SQL Server 2005 by default assumes all functions are not compiled
against the latest versions of the underlying tables (even if there
are no tables.)  Therefore it always optimizes the function processing
to the end after the results have been limited by all other criteria
and joins, thereby reducing the number of times the function calls are
executed.  If you use WITH SCHEMABINDING the optimizer trusts that the
tables have not been changed since the function was compiled, so it
can then bump up the priority of the function execution to earlier in
the processing.  SQL 2000 didn't make this assumption.  This
assumption generally improves performance in 2005, but sometimes this
can cause performance issues compared with SQL 2000.

If you have a function that doesn't include any data from any tables
(ex: formatting functions, calculation functions that just use the
input parameters), always create it with SCHEMABINDING enabled.  If
you have functions that include tables, just realize that they will
execute last unless you schemabind them.

References for using WITH SCHEMABINDING with UDFs:
http://sqltips.wordpress.com/category/schemabinding/
http://blogs.msdn.com/queryoptteam/archive/2006/03/24/560089.aspx
JXStern - 15 Jul 2008 03:52 GMT
Major wowser, thanks!

But, does it lock the schema?

"Non-deterministic expressions are not persistable..."

I can see I need to read up some more on this, because at first
glance, I totally fail to see why the code in the UDF is any more
fragile or robust than any other code.

But again, thanks!

J.

>> What's that all about?
>
[quoted text clipped - 19 lines]
>http://sqltips.wordpress.com/category/schemabinding/
>http://blogs.msdn.com/queryoptteam/archive/2006/03/24/560089.aspx
Eric Isaacs - 16 Jul 2008 08:27 GMT
> Major wowser, thanks!
>
> But, does it lock the schema?

Yes it locks the schema for the tables involved.  But for functions
with NO tables, (ex: functions that just format input parameters or
make a calculation on the input parameters directly) it makes sense
given this information to just use WITH SCHEMABINDING by default.
There's no downside in that case.

If you're using functions that select from tables, there's an obvious
trade-off for using WITH SCHEMABINDING.  If you want to change the
schema of the tables involved, you have to first drop the function (or
at least alter it to remove the WITH SCHEMABINDING statement.)

In my function templates, I have WITH SCHEMABINDING in there by
default now and I choose when to remove it.  I don't recommend
creating functions that hit tables, unless you're using them
independently from views or other SQL.  So it's not as crucial to have
the WITH SCHEMABINDING if you're only executing them as a single SQL
statement.  It doesn't matter if the compiler doesn't trust them if
that's all the compiler has to execute.

Here's a sample of a simple function that uses WITH SCHEMABINDING, but
doesn't actually bind to any tables...

CREATE FUNCTION dbo.fn_GetDateOnly
   (
    @InputDateTime DATETIME
   )
RETURNS DATETIME
   WITH SCHEMABINDING
AS
BEGIN --Function
   RETURN @InputDateTime - CAST(CAST(@InputDateTime AS BINARY(4)) AS
DATETIME)
END --Function

Intuitively it shouldn't make a difference if this type of function is
created WITH SCHEMABINDING, because there's no schema to bind to, but
as this article shows, even these types of functions benefit from WITH
SCHEMABINDING...

http://blogs.msdn.com/queryoptteam/archive/2006/03/24/560089.aspx

-Eric Isaacs
JXStern - 18 Jul 2008 00:25 GMT
>Intuitively it shouldn't make a difference if this type of function is
>created WITH SCHEMABINDING, because there's no schema to bind to, but
>as this article shows, even these types of functions benefit from WITH
>SCHEMABINDING...

I can see that the idea is to treat UDFs via static analysis, as they
might be written in some language other than TSQL and the optimizer
wouldn't have a clue about them.  And, just to simplify life.

However, when they *are* in TSQL, the compiler/optimizer *could* be a
bit more intelligent about things, it seems to me!  Like, inline
table-valued UDFs.  Maybe they could support an "inline" keyword for
all TSQL UDFs?  Well, but I guess I shouldn't hold my breath waiting
for that.

Josh
Eric Isaacs - 18 Jul 2008 01:08 GMT
> I can see that the idea is to treat UDFs via static analysis, as they
> might be written in some language other than TSQL and the optimizer
> wouldn't have a clue about them.  And, just to simplify life.

My sense is that they did this because the tables could be changed
without recompiling the functions, so the functions may not be
optimized against the new table design unless they are schemabound.

I bet this was a performance hit in SQL 2000, and this was the
solution they came up with for 2005 to address those issues from
2000.  For a majority of cases where tables are accessed from
functions, this is a good assumption, but for some cases, it's a bad
assumption.  I don't understand why they couldn't just trust these
functions that have no binding what so ever (and are plain t-sql.)
From what I understand, this was the case in SQL 2000, it's just this
false assumption that all unbound functions aren't trustworthy that
causes the problem in 2005.

Fortunately there's a work around for this issue in 2005 (by using
WITH SCHAMABINDING) but that's just not obvious to most function
developers, especially those with experience from 2000.

-Eric Isaacs
JXStern - 19 Jul 2008 04:40 GMT
>> I can see that the idea is to treat UDFs via static analysis, as they
>> might be written in some language other than TSQL and the optimizer
[quoted text clipped - 17 lines]
>WITH SCHAMABINDING) but that's just not obvious to most function
>developers, especially those with experience from 2000.

Still don't know why the UDF should be any better off table-bound than
any other snippet of SQL in a stored procedure, except that somebody
has accepted the idea that they want to treat UDFs differently.

Josh
Eric Isaacs - 21 Jul 2008 04:40 GMT
> Still don't know why the UDF should be any better off table-bound than
> any other snippet of SQL in a stored procedure, except that somebody
> has accepted the idea that they want to treat UDFs differently.
>
> Josh- Hide quoted text -

A sproc is typically a linear set of sql calls.  Do this, then do
this, then this.  There's little optimization required.  Execute this
SQL statement, then do this one.  Where as with a function, there's an
execution for the main SQL statement that calls the function, then
another for each row returned by the SQL executing the function.  It's
similar to a sub-procedure.

When a sproc starts performing poorly, a recompile can typically
help.  So they're not too far apart.  The problem with functions is
that they are used inefficiently, and there really isn't much that the
optimizer can do to make them more efficient, except doing them LAST
after the row set has been minimized as much as possible.  When you
recompile a sproc that has a function call within it, without the
schema binding, there's really no way for it to know if it too needs a
recompile.

Really a sproc can get out of sync with the database the same as a
function, but a function being slow can dramatically reduce the
performance of a SQL call that used that function.
JXStern - 21 Jul 2008 19:46 GMT
>> Still don't know why the UDF should be any better off table-bound than
>> any other snippet of SQL in a stored procedure, except that somebody
[quoted text clipped - 21 lines]
>function, but a function being slow can dramatically reduce the
>performance of a SQL call that used that function.

I really have to disagree with your analysis.  A single SQL statement
can call for quite a bit of optimization, much less all the statements
in a sproc.  And a function is logically the same as an operation, you
often have simple arithmetic in any statement, applied to each row,
that's really not the issue.

Josh
Eric Isaacs - 22 Jul 2008 01:47 GMT
You're probably right.  Views can be Schema bound, but sprocs can't.
It's probably just related to the overall complexity and hastle of
schemabinding a sproc that make difficult to maintain.  Also, in order
to schema bind a view or function, all the object they use also need
to be schema bound.  This would be harder to determine and manage
within a sproc.

The WITH RECOMPILE option within sprocs is probably sufficient for
cases where you might want to "schemabind" a stored procedure for
efficiency.

-Eric Isaacs
Eric Isaacs - 22 Jul 2008 03:01 GMT
On second thought, you don't call a sproc from other SQL like you do
with a view or function.  Schemabinding is more important from the
standpoint of optimizing those individual SQL calls that use the views
or functions, determining what should be done first within those
calls.  Schemabinding just assists the optimization when it's used.
When would you call a sproc besides when you call it on it's own from
SQL?  There's no optimization at play that would benefit from
Schemabinding a sproc vs. schemabinding a function or view.
JXStern - 15 Jul 2008 00:15 GMT
How many rows in ppNominations?

What indexes on ppNominations?

And how is the udf called?

What do you see if you set statistics io on and look at the scans?

Josh

>I have a function written below:  I think what the tables are is
>irrelevant but if you need to hear an explanation Ill be more than
[quoted text clipped - 32 lines]
>it, but what would cause just a simple switch from AND to OR on that
>WHERE Clause to cause such a drastic change in speed?
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.