SQL Server Forum / DB Engine / SQL Server / July 2008
Need help explaining slow function
|
|
Thread rating:  |
MDoyle - 14 Jul 2008 18:21 GMT I have a function written below: I think what the tables are is irrelevant but if you need to hear an explanation Ill be more than happy to provide you with an explanation:
ALTER FUNCTION fn_v33_IsGroupApproved(@GRP_nID as int) RETURNS bit AS BEGIN
DECLARE @isGroupApproved as bit--, @GRP_nID as int SET @isGroupApproved = 1;
SET @GRP_nID = (SELECT GRP_nID FROM ppNominations WHERE NOM_nID = @NOM_nID);
IF (@GRP_nID) IS NULL OR (@GRP_nID) = 0 SET @isGroupApproved = 1; ELSE BEGIN IF ( SELECT COUNT(*) FROM ppNominations WHERE GRP_nID = @GRP_nID AND NOM_lManagerApproved = 0 ) = 0 SET @isGroupApproved = 1; ELSE SET @isGroupApproved = 0; END
RETURN @isGroupApproved; END
Why would this query on a table of about 5000 records take 8 seconds to run? if I change the AND keyword in the the line: WHERE GRP_nID = @GRP_nID AND NOM_lManagerApproved = 0 to OR, it's sub-second.
I do not really need a solution, since I came up with a workaround to it, but what would cause just a simple switch from AND to OR on that WHERE Clause to cause such a drastic change in speed?
Eric Isaacs - 14 Jul 2008 20:00 GMT > SET @GRP_nID = (SELECT GRP_nID FROM ppNominations WHERE NOM_nID = > @NOM_nID); You're not declaring @NOM_nID in your example above, so we don't have the complete picture.
It's likely that the indexes on ppNominations are not covering both GRP_nID AND NOM_lManagerApproved.
This is not the type of function you would want to execute from a view or a procedure that returns multiple rows. This function should be called only when you want to process it for one row. It sounds like that is exactly what you're trying to do. SQL Functions don't scale well when used in this way. You're likely scanning that 5000 row table 5000 times to produce the desired results.
Instead of having a function for this, you should consider adding this table to your view or other SQL that is calling this function 5000 times. Then there would be one join and one scan to get the results you want.
SQL Functions are convenient from a development standpoint and help breakout complex logic, but they should not be used in views when there is more than one table involved or there is no covering index involved. You'll get much better performance if you keep the joins in the main SQL rather than split them out into a separate function.
You should also consider adding WITH SCHEMABINDING to your functions for better performance in SQL 2005.
-Eric Isaacs
JXStern - 15 Jul 2008 00:10 GMT >You should also consider adding WITH SCHEMABINDING to your functions >for better performance in SQL 2005. What's that all about?
Josh
Eric Isaacs - 15 Jul 2008 01:39 GMT > What's that all about? SQL Server 2005 by default assumes all functions are not compiled against the latest versions of the underlying tables (even if there are no tables.) Therefore it always optimizes the function processing to the end after the results have been limited by all other criteria and joins, thereby reducing the number of times the function calls are executed. If you use WITH SCHEMABINDING the optimizer trusts that the tables have not been changed since the function was compiled, so it can then bump up the priority of the function execution to earlier in the processing. SQL 2000 didn't make this assumption. This assumption generally improves performance in 2005, but sometimes this can cause performance issues compared with SQL 2000.
If you have a function that doesn't include any data from any tables (ex: formatting functions, calculation functions that just use the input parameters), always create it with SCHEMABINDING enabled. If you have functions that include tables, just realize that they will execute last unless you schemabind them.
References for using WITH SCHEMABINDING with UDFs: http://sqltips.wordpress.com/category/schemabinding/ http://blogs.msdn.com/queryoptteam/archive/2006/03/24/560089.aspx
JXStern - 15 Jul 2008 03:52 GMT Major wowser, thanks!
But, does it lock the schema?
"Non-deterministic expressions are not persistable..."
I can see I need to read up some more on this, because at first glance, I totally fail to see why the code in the UDF is any more fragile or robust than any other code.
But again, thanks!
J.
>> What's that all about? > [quoted text clipped - 19 lines] >http://sqltips.wordpress.com/category/schemabinding/ >http://blogs.msdn.com/queryoptteam/archive/2006/03/24/560089.aspx Eric Isaacs - 16 Jul 2008 08:27 GMT > Major wowser, thanks! > > But, does it lock the schema? Yes it locks the schema for the tables involved. But for functions with NO tables, (ex: functions that just format input parameters or make a calculation on the input parameters directly) it makes sense given this information to just use WITH SCHEMABINDING by default. There's no downside in that case.
If you're using functions that select from tables, there's an obvious trade-off for using WITH SCHEMABINDING. If you want to change the schema of the tables involved, you have to first drop the function (or at least alter it to remove the WITH SCHEMABINDING statement.)
In my function templates, I have WITH SCHEMABINDING in there by default now and I choose when to remove it. I don't recommend creating functions that hit tables, unless you're using them independently from views or other SQL. So it's not as crucial to have the WITH SCHEMABINDING if you're only executing them as a single SQL statement. It doesn't matter if the compiler doesn't trust them if that's all the compiler has to execute.
Here's a sample of a simple function that uses WITH SCHEMABINDING, but doesn't actually bind to any tables...
CREATE FUNCTION dbo.fn_GetDateOnly ( @InputDateTime DATETIME ) RETURNS DATETIME WITH SCHEMABINDING AS BEGIN --Function RETURN @InputDateTime - CAST(CAST(@InputDateTime AS BINARY(4)) AS DATETIME) END --Function
Intuitively it shouldn't make a difference if this type of function is created WITH SCHEMABINDING, because there's no schema to bind to, but as this article shows, even these types of functions benefit from WITH SCHEMABINDING...
http://blogs.msdn.com/queryoptteam/archive/2006/03/24/560089.aspx
-Eric Isaacs
JXStern - 18 Jul 2008 00:25 GMT >Intuitively it shouldn't make a difference if this type of function is >created WITH SCHEMABINDING, because there's no schema to bind to, but >as this article shows, even these types of functions benefit from WITH >SCHEMABINDING... I can see that the idea is to treat UDFs via static analysis, as they might be written in some language other than TSQL and the optimizer wouldn't have a clue about them. And, just to simplify life.
However, when they *are* in TSQL, the compiler/optimizer *could* be a bit more intelligent about things, it seems to me! Like, inline table-valued UDFs. Maybe they could support an "inline" keyword for all TSQL UDFs? Well, but I guess I shouldn't hold my breath waiting for that.
Josh
Eric Isaacs - 18 Jul 2008 01:08 GMT > I can see that the idea is to treat UDFs via static analysis, as they > might be written in some language other than TSQL and the optimizer > wouldn't have a clue about them. And, just to simplify life. My sense is that they did this because the tables could be changed without recompiling the functions, so the functions may not be optimized against the new table design unless they are schemabound.
I bet this was a performance hit in SQL 2000, and this was the solution they came up with for 2005 to address those issues from 2000. For a majority of cases where tables are accessed from functions, this is a good assumption, but for some cases, it's a bad assumption. I don't understand why they couldn't just trust these functions that have no binding what so ever (and are plain t-sql.) From what I understand, this was the case in SQL 2000, it's just this false assumption that all unbound functions aren't trustworthy that causes the problem in 2005.
Fortunately there's a work around for this issue in 2005 (by using WITH SCHAMABINDING) but that's just not obvious to most function developers, especially those with experience from 2000.
-Eric Isaacs
JXStern - 19 Jul 2008 04:40 GMT >> I can see that the idea is to treat UDFs via static analysis, as they >> might be written in some language other than TSQL and the optimizer [quoted text clipped - 17 lines] >WITH SCHAMABINDING) but that's just not obvious to most function >developers, especially those with experience from 2000. Still don't know why the UDF should be any better off table-bound than any other snippet of SQL in a stored procedure, except that somebody has accepted the idea that they want to treat UDFs differently.
Josh
Eric Isaacs - 21 Jul 2008 04:40 GMT > Still don't know why the UDF should be any better off table-bound than > any other snippet of SQL in a stored procedure, except that somebody > has accepted the idea that they want to treat UDFs differently. > > Josh- Hide quoted text - A sproc is typically a linear set of sql calls. Do this, then do this, then this. There's little optimization required. Execute this SQL statement, then do this one. Where as with a function, there's an execution for the main SQL statement that calls the function, then another for each row returned by the SQL executing the function. It's similar to a sub-procedure.
When a sproc starts performing poorly, a recompile can typically help. So they're not too far apart. The problem with functions is that they are used inefficiently, and there really isn't much that the optimizer can do to make them more efficient, except doing them LAST after the row set has been minimized as much as possible. When you recompile a sproc that has a function call within it, without the schema binding, there's really no way for it to know if it too needs a recompile.
Really a sproc can get out of sync with the database the same as a function, but a function being slow can dramatically reduce the performance of a SQL call that used that function.
JXStern - 21 Jul 2008 19:46 GMT >> Still don't know why the UDF should be any better off table-bound than >> any other snippet of SQL in a stored procedure, except that somebody [quoted text clipped - 21 lines] >function, but a function being slow can dramatically reduce the >performance of a SQL call that used that function. I really have to disagree with your analysis. A single SQL statement can call for quite a bit of optimization, much less all the statements in a sproc. And a function is logically the same as an operation, you often have simple arithmetic in any statement, applied to each row, that's really not the issue.
Josh
Eric Isaacs - 22 Jul 2008 01:47 GMT You're probably right. Views can be Schema bound, but sprocs can't. It's probably just related to the overall complexity and hastle of schemabinding a sproc that make difficult to maintain. Also, in order to schema bind a view or function, all the object they use also need to be schema bound. This would be harder to determine and manage within a sproc.
The WITH RECOMPILE option within sprocs is probably sufficient for cases where you might want to "schemabind" a stored procedure for efficiency.
-Eric Isaacs
Eric Isaacs - 22 Jul 2008 03:01 GMT On second thought, you don't call a sproc from other SQL like you do with a view or function. Schemabinding is more important from the standpoint of optimizing those individual SQL calls that use the views or functions, determining what should be done first within those calls. Schemabinding just assists the optimization when it's used. When would you call a sproc besides when you call it on it's own from SQL? There's no optimization at play that would benefit from Schemabinding a sproc vs. schemabinding a function or view.
JXStern - 15 Jul 2008 00:15 GMT How many rows in ppNominations?
What indexes on ppNominations?
And how is the udf called?
What do you see if you set statistics io on and look at the scans?
Josh
>I have a function written below: I think what the tables are is >irrelevant but if you need to hear an explanation Ill be more than [quoted text clipped - 32 lines] >it, but what would cause just a simple switch from AND to OR on that >WHERE Clause to cause such a drastic change in speed?
|
|
|