SQL Server Forum / Programming / SQL / November 2008
SQL Query Using incorrect index
|
|
Thread rating:  |
SQLMEG - 03 Nov 2008 01:36 GMT We have a problem with this query –
declare @startDate as datetime, @endDate as dateTime select @startDate = '2008-10-14 00:00:00.000', @endDate = '2008-10-14 23:59:59';
select aceTran.id from CallExternalAceTransaction aceTran, --with (index=pk_CallExternalAceTransaction), CallTransaction callTransaction --with ( index =IX_CallTransaction_dateCreated) where (callTransaction.id = aceTran.id) and callTransaction.dateCreated >= @startDate and callTransaction.dateCreated <= @endDate
It is using the wrong index – for the table CallExternalAcTransaction. I can see this in the query plan. It is using a non-clustered index IX_openTransNumber – which only has the field openTransNumber in it. Why it is using this index – I do not know – as this field is not even mentioned in this query.
To try and fix the problem –
1. I have dropped and re-created the Primary key on this table and the IX_openTransNumber index. 2. And I have updated the Statistics on this database. 3. And I have re-written the query several alternative ways – and there is no difference – it is still using the wrong index. (it should be using the Primary key clustered index - pk_CallExternalAceTransaction which is on id). 4. I have also cleared the procedure cache – by doing DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE.
Also – I have not used an index hint – to make it use the right index – as we are restricted – in that we cannot do this. This is because the query comes from EJB-QL – and it is translated into a SQL query – as per the above. In EJB-QL (from Java) – you cannot specify an index hint.
(fyi - Also if I do specify the index hint on the pk_CallExternalAceTransaction index – the query does an index scan instead of an index seek – and I cannot seem to make it do an index seek either – using parameters as we need it to use).
(fyi - it is using the correct index on the other table - CallTransaction. It is a non-clustered index which has the datecreated field in it. The primary key on this table is also on its id. And it is doing an index seek on this table.)
Please can I have some assistance with this problem.
Tom Cooper - 03 Nov 2008 05:15 GMT What I suspect is going on is the following.
First, every nonclustered index includes the clustered index key columns as well as any key columns in that nonclustered index. So CallExternalAceTransaction.id is included the IX_openTransNumber index because it is in the clustered index.
Second, your select list does not need any columns from CallExternalAceTransaction except the id column.
So the query optimizer has decided that the fastest way to do this query is to first do an index seek on the IX_CallTransaction_DateCreated index from which it gets a number of CallTransaction.id values. It needs to match this list against id values in the CallExternalAceTransaction table. The optimizer could do this in a number of ways. One way would be to use the clustered index. But it can also just scan the nonclustered index IX_openTransNumber. And nonclustered indexes are often times a LOT smaller than the clustered index (since the clustered index effectively is the whole table). So, appearently the query optimizer has estimated that the fastest way to find the matching ids is just to scan the nonclustered index. That way, it never has to go look the data in the CallExternalActTransaction table at all.
My guess would be that the optimizer is correct in this case. Sometimes, of course, the optimizer chooses plans that are not optimal, my guess would be that here it is doing the correct thing.
Tom
> We have a problem with this query - > [quoted text clipped - 50 lines] > > Please can I have some assistance with this problem. Uri Dimant - 03 Nov 2008 05:31 GMT Hi I have re-wrtitten your query, so I'm not sure is IX_CallTransaction_dateCreated clustered otr non-clustered. Also how many rows does the query return? Hav you tried IX_CallTransaction_dateCreated to make as clustered? Aslo if you do it within a strored procedure and supply different (range) dates , it can lead to parametyer sniffing, in SQL Server there are some hints to recompile per statement like OPTION (RECOMPILE) for more details see BOL
select aceTran.id from CallExternalAceTransaction aceTran JOIN CallTransaction callTransaction ON (callTransaction.id = aceTran.id) WHERE callTransaction.dateCreated >= @startDate and callTransaction.dateCreated <= @endDate
> We have a problem with this query – > [quoted text clipped - 50 lines] > > Please can I have some assistance with this problem. Gert-Jan Strik - 04 Nov 2008 17:45 GMT Hi SQLMEG,
I think you have some misconceptions of the optimizer. I get the impression that you think that a "scan" is always bad and a "seek" is always good. First of all, this is not always the case (it depends).
Second, a seek is used to find the first matching row. For example, when you are saying that it is seeking the nonclustered index on callTransaction to find the desired dateCreated, the seek will find the first matching row. Then a (partial) scan will take place, which will go through all the matching rows until it reaches the end of the matching rows
I agree with Tom Cooper's analysis of what is probably happening. You are probably seeing a nonclustered index scan that is then Hash Matched to the result of callTransaction (which is processed first).
Let's make up some numbers on the fly. Let's say that the average row in CallExternalAceTransaction is 64 bytes (plus overhead). And let's say that column id is an int (4 bytes) and openTransNumber is also an int (4 bytes). This makes an entry in the nonclustered index on openTransNumber 8 bytes (plus overhead).
If the rows that are found in callTransaction date range are scattered all over the CallExternalAceTransaction table, then from an I/O point of view, scanning the nonclustered index would be beneficial if 1/8th (or more) of the table rows are accessed.
So an important question is: how many rows are there in CallExternalAceTransaction, how many are selected, and how wide is the average row?
Now there can be factors why the most obvious query plan turns out not to be the most efficient. Have you cleared the buffer cache and tested which query plan runs fastest?
BTW: for testing purposes, you might be able to force index seeks by rewriting to this:
SELECT callTransaction.id FROM CallTransaction callTransaction INNER LOOP JOIN CallExternalAceTransaction aceTran ON aceTran.id = callTransaction.id WHERE callTransaction.dateCreated >= @startDate AND callTransaction.dateCreated <= @endDate OPTION (force order)
BTW: if the relation with CallExternalAceTransaction does not eleminate rows, then you should define a foreign key constraint from CallTransaction to CallExternalAceTransaction. Then the optimizer doesn't even have to consider accessing table CallExternalAceTransaction.
 Signature Gert-Jan SQL Server MVP
> We have a problem with this query – > [quoted text clipped - 43 lines] > > Please can I have some assistance with this problem. SQLMEG1 - 12 Nov 2008 22:09 GMT Hi Gert-Jan Strik
Thanks for your reply (and found Tom Cooper's one helpful too). The actual query I am trying to optimize is a bit more complex and has more tables involved than the one that I have put as an example, and it also takes a lot longer to run. However it does contain the two tables mentioned - and I have tried your 2 suggestions of the INNER LOOP JOIN and the Foreign key. The INNER LOOP JOIN seems to have made a significant difference to the performance of the query in my test database. After clearing the buffer cache, the old query was taking 3 minutes and 35 secs to run, and the one rewritten with the INNER LOOP JOIN takes 55 seconds! The foreign key on CallExternalAceTransaction referencing CallTransaction seems to have made no difference to performance.
Both of these times are not good - but I think the old one is taking about 25 seconds to run in Production which I understand is not really acceptable. So the INNER LOOP JOIN may offer some benefit to us - by the looks. It is now doing an Index seek on the pk on the CallExternalAceTransaction table (taking 56% of time) - and an Index Scan on the pk on the CallTransaction table (taking 28% of time).
To answer your questions - the number of rows in the CallExternalAceTransaction table is 2 million rows, how many are selected in the query is 3,800 and the size of the average row is about 86 bytes.
Just fyi - here is the rewritten complex query -
SELECT DISTINCT callTransaction.id FROM CallTransaction callTransaction INNER LOOP JOIN CallExternalAceTransaction aceTran ON aceTran.id = callTransaction.id INNER JOIN CallAccount callAcc ON callTransaction.appliesToAccount=callAcc.id INNER JOIN CallAccountProduct caProduct with (xlock) ON callAcc.definedByProduct=caProduct.id INNER JOIN Product product ON caProduct.id = product.id INNER JOIN FxCurrency ccy ON product.denominatedInCurrencyId = ccy.id INNER JOIN InterestPlan iplan ON product.usesInterestPlan=iplan.id WHERE callTransaction.dateCreated >= @startDate AND callTransaction.dateCreated <= @endDate AND iplan.businessUnit = 'CASHMAN' AND ccy.isoCode = 'NZD'
There is actually a non-clustered index on the CallTransaction table on appliesToAccount, dateCreated fields (together). In the old query - it was doing an Index seek using this index. Now rewritten - it does an Index scan on the pk (id).
Also, I don't really understand your example that you illustrated. The column id on the table is bigint (8 bytes), and the openTransNumber is also a bigint (8 bytes). This makes an entry in the nonclustered index on openTransNumber to be 16 bytes. Because the clustered index (pk) is the id - then this makes an entry in the clustered index on id to be 8 bytes???
And I have looked at how many unique entries there are of openTransNumber - and there are about 2 million. So the size of the openTransNumber index would be more than the size of the clustered index??
I see what Tom Cooper was saying about that the nonclustered index may be a lot smaller than the clustered index - but not in this case?? I don't really understand when he says that the clustered index is effectively the whole table?? Isn't it just the size of the actual id fields?
Thanks SQLMEG1
> Hi SQLMEG, > [quoted text clipped - 96 lines] > > > > Please can I have some assistance with this problem. Alex Kuznetsov - 12 Nov 2008 22:23 GMT > Hi Gert-Jan Strik > [quoted text clipped - 54 lines] > openTransNumber to be 16 bytes. Because the clustered index (pk) is the id - > then this makes an entry in the clustered index on id to be 8 bytes??? Actually the clustered index is the table itself - it contains all the columns of the table.
Erland Sommarskog - 12 Nov 2008 23:27 GMT > Just fyi - here is the rewritten complex query - > [quoted text clipped - 23 lines] > was doing an Index seek using this index. Now rewritten - it does an > Index scan on the pk (id). Giving optimisation suggestions on a distance is usually difficult, and with only partial information about table definitions and indexes, it's about impossible. It would really help if you posted the CREATE TABLE and CREATE INDEX statements for the tables, as well as the rough number of rows.
But when I look at the query it does not seem very well-formed to me. A query with DISTINCT of one single and multiple joins are at least logically best written with EXISTS.
And when I dissected they query, it appears that the query falls into two "legs". One where you go out to CallExternalAceTransaction, and where you go to CallAccount and continue from where. If I understand it correctly, it makes two EXISTS subqueries.
Another interesting thing is that the table CallAccountProduct does not seem to be needed at all. I think you should be able to shortcut directly from CallAccount to Product.
A final note is that I think you have modelled currencies incorrectly. While in many cases, an artificial key is to recommend, I think currencies have a very good natural key, so Product.denominatedInCurrencyId should be denominatedInCurrency and have the value of NZD for this query.
Here is my rewritten query. Whether it actually performs better, I have no idea. At first, you should verify it for correctness.
SELECT CT.id FROM CallTransaction CT WHERE EXISTS (SELECT * FROM CallExternalAceTransaction EAT WHERE EAT.id = CT.id) AND EXISTS (SELECT * FROM CallAccount CA JOIN Product P ON CA.definedByProduct=P.id JOIN FxCurrency ccy ON P.denominatedInCurrencyId = ccy.id JOIN InterestPlan IP ON P.usesInterestPlan=IP.id WHERE CT.appliesToAccount=CA.id AND IP.businessUnit = 'CASHMAN' AND ccy.isoCode = 'NZD') WHERE CT.dateCreated >= @startDate AND CT.dateCreated <= @endDate
And, yeah, I've shortened the aliases. I usuaally do that to be able to read the query.
 Signature Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
SQLMEG - 13 Nov 2008 22:07 GMT Thanks for that information Erland.
You are right about the join to the CallAccountProduct table - that it is not needed.
I have used the query that you wrote - and it is running faster than the original query in my test environment (takes 1 minute 14 seconds) - but not quite as fast as the one which I rewrote to use the Inner loop join which was suggested to me (about 55 seconds). The original one was taking about 3 minutes 35 seconds to run!!
Unfortunately I don't think I'll be able to use the inner loop join idea or your idea though to use the EXISTS statements (I didn't know this before and now I have been told this!!!!!) - as this SQL code is written in EJB-QL (from Java) - and my understanding is that I cannot change the actual query as this is the translated query that is made from the EJB-QL.
So the question that my boss wants to know - is why is the query the way it is currently written - performing so badly?
(BTW - The query you have supplied does appear to work correctly and give the right number of rows.)
Here is the original query -
declare @startDate as datetime, @endDate as dateTime select @startDate = '2008-10-14 00:00:00.000', @endDate = '2008-10-14 23:59:59';
SELECT DISTINCT aceTran.id FROM CallExternalAceTransaction aceTran, CallTransaction callTransaction, Product product, FxCurrency ccy, InterestPlan iplan, CallAccount callAcc, CallAccountProduct caProduct with (xlock) WHERE (callTransaction.id = aceTran.id) AND ( ((callTransaction.dateCreated > @startDate) OR (callTransaction.dateCreated = @startDate)) AND ((callTransaction.dateCreated < @endDate) OR (callTransaction.dateCreated = @endDate)) ) AND caProduct.id = product.id AND iplan.businessUnit = 'CASHMAN' AND product.denominatedInCurrencyId = ccy.id AND ccy.isoCode = 'NZD' AND product.usesInterestPlan=iplan.id AND callTransaction.appliesToAccount=callAcc.id AND callAcc.definedByProduct=caProduct.id
Anyway as you requested - here are the table definitions and the indexes -
SET ANSI_NULLS OFF GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING OFF GO CREATE TABLE [dbo].[CallExternalAceTransaction]( [id] [bigint] NOT NULL, [cashSettleReference] [varchar](1024) NULL, [contractRef] [varchar](1024) NULL, [transactionNumber] [bigint] NULL, [batchNumber] [bigint] NULL, [description] [varchar](1024) NULL, [openTransNumber] [bigint] NULL, [ol_version] [int] NOT NULL CONSTRAINT [DF_CallExternalAceTransaction_ol_version] DEFAULT (1), CONSTRAINT [pk_CallExternalAceTransaction] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
GO SET ANSI_PADDING OFF GO ALTER TABLE [dbo].[CallExternalAceTransaction] WITH CHECK ADD CONSTRAINT [FK_CallExternalAceTransaction_id] FOREIGN KEY([id]) REFERENCES [dbo].[CallTransaction] ([id]) GO ALTER TABLE [dbo].[CallExternalAceTransaction] CHECK CONSTRAINT [FK_CallExternalAceTransaction_id]
1. Create nonclustered Index IX_openTransNumber On [dbo].[CallExternalAceTransaction](openTransNumber)
Number of rows = 2 million
CREATE TABLE [dbo].[CallTransaction]( [id] [bigint] NOT NULL, [dateEffective] [datetime] NULL, [transSequence] [bigint] NULL, [balanceAmount] [float] NULL, [dateCreated] [datetime] NULL, [dateReversed] [datetime] NULL, [callTransactionType] [bigint] NULL, [ol_version] [int] NOT NULL CONSTRAINT [DF_CallTransaction_ol_version] DEFAULT (1), [appliesToAccount] [bigint] NULL, [dateFundsCleared] [datetime] NULL, [auditId] [bigint] NULL, [dateSuppressed] [datetime] NULL, CONSTRAINT [pk_CallTransaction] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
1. Create nonclustered Index CallTransactionduplicate_ind On [dbo].[CallTransaction](appliesToAccount, dateCreated)
2. Create nonclustered Index IX_CallTransaction_dateCreated On [dbo].[CallTransaction](dateCreated)
Number of rows = 13 million
CREATE TABLE [dbo].[CallAccount]( [id] [bigint] NOT NULL, [accountNumber] [bigint] NULL, [interestPaidTo] [datetime] NULL, [interestNextDue] [datetime] NULL, [dateCreated] [datetime] NULL, [ol_version] [int] NOT NULL CONSTRAINT [DF_CallAccount_ol_version] DEFAULT (1), [definedByProduct] [bigint] NULL, [ownedByClient] [bigint] NULL, [dateClosed] [datetime] NULL, [overdraftLimit] [float] NULL, [noOverdraftLimit] [bit] NULL, CONSTRAINT [pk_CallAccount] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
1. Create nonclustered Index CallAccount_definedByProduct_ind On [dbo].[CallAccount](definedByProduct)
Number of rows = 70,000
CREATE TABLE [dbo].[Product]( [id] [bigint] NOT NULL, [name] [varchar](1024) NULL, [dateAvailableFrom] [datetime] NULL, [dateNotAvailableAfter] [datetime] NULL, [daysInYear] [bigint] NULL, [denominatedInCurrencyId] [bigint] NULL, [interestPaymentFrequencyId] [bigint] NULL, [defaultCompanyBankAccountId] [bigint] NULL, [productType] [bigint] NULL, [ol_version] [int] NOT NULL CONSTRAINT [DF_Product_ol_version] DEFAULT (1), [usesInterestPlan] [bigint] NULL, [overdraft] [bit] NULL, [noTax] [bit] NULL, [noInterest] [bit] NULL, CONSTRAINT [pk_Product] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
1. Create nonclustered Index Product_denominatedInCurrency_ind On [dbo].[Product](denominatedInCurrencyId)
2. Create nonclustered Index Product_usesInterestPlan_ind On [dbo].[Product](usesInterestPlan)
Number of rows = 32
CREATE TABLE [dbo].[InterestPlan]( [id] [bigint] NOT NULL, [name] [varchar](1024) NULL, [fxCurrencyId] [bigint] NULL, [businessUnit] [varchar](1024) NULL, [ol_version] [int] NOT NULL, CONSTRAINT [pk_InterestPlan] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
Number of rows = 38
CREATE TABLE [dbo].[FxCurrency]( [id] [bigint] NOT NULL, [isoCode] [varchar](1024) NULL, [name] [varchar](1024) NULL, [subUnitDecimalPlaces] [bigint] NULL, [ol_version] [int] NOT NULL CONSTRAINT [DF_FxCurrency_ol_version] DEFAULT (1), CONSTRAINT [pk_FxCurrency] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
1. Create nonclustered Index FxCurrency_isoCode_ind On [dbo].[FxCurrency](isoCode)
Number of rows = 9
CREATE TABLE [dbo].[CallAccountProduct]( [id] [bigint] NOT NULL, [suppressZeroInterestTransactions] [bit] NULL, [ol_version] [int] NOT NULL CONSTRAINT [DF_CallAccountProduct_ol_version] DEFAULT (1), CONSTRAINT [pk_CallAccountProduct] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
Number of rows = 26
> > Just fyi - here is the rewritten complex query - > > [quoted text clipped - 69 lines] > And, yeah, I've shortened the aliases. I usuaally do that to be able > to read the query. Erland Sommarskog - 16 Nov 2008 20:14 GMT > Unfortunately I don't think I'll be able to use the inner loop join idea > or your idea though to use the EXISTS statements (I didn't know this > before and now I have been told this!!!!!) - as this SQL code is written > in EJB-QL (from Java) - and my understanding is that I cannot change > the actual query as this is the translated query that is made from the > EJB-QL. That may be prove to be a roadblock. Having "higher-level" languages like LINQ (I'm entirely unfamiliar with EJB-QL) to generate queries, decreases your possibilities to fine-tune queries.
> Here is the original query - > [quoted text clipped - 4 lines] > SELECT DISTINCT aceTran.id >... Is that the *exact* query text? That is, does EJB-QL generate the query with the DECLARE and SELECT statements? Or does it in fact pass the dates as parameters to thhe query? Exactly what do you see in SQL Server?
This is a very important question. Because if the dates are passed like this, the optimizer will not know the values, but will make a blind assumption of, I think, a 20% hit rate. That efffectively rules out the index on DateCreated.
Now, I need to add a "it depends" here. If the query is always for a single day, then most likely the index on dateCreated is probably the best. But if the date interval can vary considerably, the index on (appliesToAccount, dateCreated) is a better pick. (And may always be, from a distance I have to make some guesses.)
In your original post you said: "It is using a non-clustered index IX_openTransNumber – which only has the field openTransNumber in it. Why it is using this index – I do not know – as this field is not even mentioned in this query."
In a NC-index, the clustered index keys serves as row locators. SQL Server opts to implement the join as a merge join or a hash join, and to this end it read all id values once, and the quickest way to do that is to scan the non-clustered index.
It may help to add a non-clustered index on CallExternalAceTransaction.id, if nothing else you could get a merge join rather than a hash join.
Another idea worth investigating is to change the index on callTransaction.dateCreated to include the column AppliesToAccount. That would make this query cover the query. The index on (appliestoAccount, dateCreated) already covers the query, but it may not be equally selective. Presumably quite a few callAccounts are selected by the criterias on CASHMAN and NZD.
If you are not able to change the query text, there is exists a possibility: plan guides. With a plan guide you can add query hints to a query which matches a certain template. In the most extremee form, you can add an explicit query plan, but I think you should first try to add a RECOMPILE hint this way. (I've never used plan guides myself, so I refer you to Books Online for details.)
 Signature Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
SQLMEG - 19 Nov 2008 01:58 GMT Thanks Erland.
I am trying to get some answers to your questions that you asked.
Also - what I have found out - is if I add your ideas 1 and 2 below that I mentioned, and I put a OPTION(RECOMPILE) hint on the query - I get an amazing improvement on performance. The query when it runs with the OPTION(RECOMPILE) does use both of the indexes that you suggested in ideas 1 and 2. The problem is of course - that my understanding is that I cannot add this query hint in EJB-QL.
Anyway I have used all your ideas. I have
1. added a non-clustered index on CallExternalAceTransaction.id 2. changed the index callTransaction.dateCreated to include the column AppliesToAccount. 3. Tried to use Plan Guides. I did split the parameters up into 4 parameters - just in case the plan guide got confused - and of course changed my query to have 4 parameters - startDate1, startDate2, endDate1 and endDate2.
I have created 3 plan guides in the hope of one of them being used by the query - but it is still not using them.
These are the 3 of them -
1. sp_create_plan_guide @name = N'Guide1', @stmt = N'SELECT DISTINCT aceTran.id FROM CallExternalAceTransaction aceTran, CallTransaction callTransaction, Product product, FxCurrency ccy, InterestPlan iplan, CallAccount callAcc, CallAccountProduct caProduct with (xlock) WHERE (callTransaction.id = aceTran.id) AND ( ((callTransaction.dateCreated > @startDate1) OR (callTransaction.dateCreated = @startDate2)) AND ((callTransaction.dateCreated < @endDate1) OR (callTransaction.dateCreated = @endDate2)) ) AND caProduct.id = product.id AND iplan.businessUnit = ''CASHMAN'' AND product.denominatedInCurrencyId = ccy.id AND ccy.isoCode = ''NZD'' AND product.usesInterestPlan=iplan.id AND callTransaction.appliesToAccount=callAcc.id AND callAcc.definedByProduct=caProduct.id', @type = N'SQL', @module_or_batch = NULL, @params = N'@startDate1 as datetime,@startDate2 as datetime,@endDate1 as dateTime,@endDate2 as dateTime', @hints = N'option (RECOMPILE)'
2. SET ANSI_PADDING ON
SET ANSI_NULLS ON
DECLARE @stmt nvarchar(max); DECLARE @params nvarchar(max); EXEC sp_get_query_template N'SELECT DISTINCT aceTran.id FROM CallExternalAceTransaction aceTran, CallTransaction callTransaction, Product product, FxCurrency ccy, InterestPlan iplan, CallAccount callAcc, CallAccountProduct caProduct with (xlock) WHERE (callTransaction.id = aceTran.id) AND ( ((callTransaction.dateCreated > ''2008-08-14 00:00:00.000'') OR (callTransaction.dateCreated = ''2008-08-14 00:00:00.000'')) AND ((callTransaction.dateCreated < ''2008-08-14 23:59:59.000'') OR (callTransaction.dateCreated = ''2008-08-14 23:59:59.000'')) ) AND caProduct.id = product.id AND iplan.businessUnit = ''CASHMAN'' AND product.denominatedInCurrencyId = ccy.id AND ccy.isoCode = ''NZD'' AND product.usesInterestPlan=iplan.id AND callTransaction.appliesToAccount=callAcc.id AND callAcc.definedByProduct=caProduct.id', @stmt OUTPUT, @params OUTPUT; EXEC sp_create_plan_guide N'TemplateGuide1', @stmt, N'TEMPLATE', NULL, @params, N'OPTION(PARAMETERIZATION FORCED)';
3. Next plan guide - split into 2 parts - 3a. And I ran this to get the SHOWPLAN XML EXEC sp_executesql @stmt = N'SELECT DISTINCT aceTran.id FROM CallExternalAceTransaction aceTran, CallTransaction callTransaction, Product product, FxCurrency ccy, InterestPlan iplan, CallAccount callAcc, CallAccountProduct caProduct with (xlock) WHERE (callTransaction.id = aceTran.id) AND ( ((callTransaction.dateCreated > @startDate1) OR (callTransaction.dateCreated = @startDate2)) AND ((callTransaction.dateCreated < @endDate1) OR (callTransaction.dateCreated = @endDate2)) ) AND caProduct.id = product.id AND iplan.businessUnit = ''CASHMAN'' AND product.denominatedInCurrencyId = ccy.id AND ccy.isoCode = ''NZD'' AND product.usesInterestPlan=iplan.id AND callTransaction.appliesToAccount=callAcc.id AND callAcc.definedByProduct=caProduct.id OPTION (RECOMPILE)', -- force RECOMPILE @params = N'@startDate1 as datetime, @startDate2 as datetime, @endDate1 as datetime, @endDate2 as datetime', @startDate1 = N'2008-08-14 00:00:00.000', @startDate2 = N'2008-08-14 00:00:00.000', @endDate1 = N'2008-08-14 23:59:59.000', @endDate2 = N'2008-08-14 23:59:59.000'; GO
3b. EXEC sp_create_plan_guide @name = N'ForceRecompileGuide', @stmt = N'SELECT DISTINCT aceTran.id FROM CallExternalAceTransaction aceTran, CallTransaction callTransaction, Product product, FxCurrency ccy, InterestPlan iplan, CallAccount callAcc, CallAccountProduct caProduct with (xlock) WHERE (callTransaction.id = aceTran.id) AND ( ((callTransaction.dateCreated > @startDate1) OR (callTransaction.dateCreated = @startDate2)) AND ((callTransaction.dateCreated < @endDate1) OR (callTransaction.dateCreated = @endDate2)) ) AND caProduct.id = product.id AND iplan.businessUnit = ''CASHMAN'' AND product.denominatedInCurrencyId = ccy.id AND ccy.isoCode = ''NZD'' AND product.usesInterestPlan=iplan.id AND callTransaction.appliesToAccount=callAcc.id AND callAcc.definedByProduct=caProduct.id', @type = N'SQL', @module_or_batch = NULL, @params = N'@startDate1 as datetime,@startDate2 as datetime,@endDate1 as dateTime,@endDate2 as dateTime', @hints = N'OPTION (USE PLAN N''<ShowPlanXML xmlns=" -- and included the SHOWPLAN XML here.
But I cannot get the query to use the plan guides.
Can anyone assist me here ?
Thanks SQLMEG
> > Unfortunately I don't think I'll be able to use the inner loop join idea > > or your idea though to use the EXISTS statements (I didn't know this [quoted text clipped - 57 lines] > hint this way. (I've never used plan guides myself, so I refer you to > Books Online for details.) Erland Sommarskog - 19 Nov 2008 06:31 GMT > Also - what I have found out - is if I add your ideas 1 and 2 below that > I mentioned, and I put a OPTION(RECOMPILE) hint on the query - I get an > amazing improvement on performance. The query when it runs with the > OPTION(RECOMPILE) does use both of the indexes that you suggested in > ideas 1 and 2. The problem is of course - that my understanding is that > I cannot add this query hint in EJB-QL. I see from the other post, that the query is in fact parameterised. In such case it's "only" a parameter sniffing issue. This could be tested with doing "sp_recompile CallTransaction". This will flush all plans that refers to CallTransaction.
Then again, if the nature of the application is such that the "first" query will always be with a long date interval, you are back on square 1.
> 1. added a non-clustered index on CallExternalAceTransaction.id > 2. changed the index callTransaction.dateCreated to include the column > AppliesToAccount. And none of these had any effect? In fact, adding an index on a table flushes all plans for it, I believe, so you would get the correct sniffing. But! It's important that you test the query, just like it's submitted with the application. If you submit it with the DECLARE you posted here previously, it's not the same thing. Using sp_prepare can be a little difficult from SSMS, but you can do this:
sp_executesql @query, N'@startTime datetime, @endTime datetime', '2008-08-14 00:00:00.000','2008-08-14 23:59:59.000'
> 3. Tried to use Plan Guides. I did split the parameters up into 4 > parameters - just in case the plan guide got confused - and of course [quoted text clipped - 3 lines] > I have created 3 plan guides in the hope of one of them being used by the > query - but it is still not using them. Unfortunately, I have never used plan guides myself, so I can't say what you may be doing wrong. But I suggest that you first to try to get it to work with a simpler query. I guess it takes some time to get a knack with it.
 Signature Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
SQLMEG - 19 Nov 2008 03:12 GMT Hi again,
In answer to Erland's question about what is the 'exact' query text in SQL Server - I have found out is that the statement is actually being run on SQL Server like this - (ie. using a sp_prepare and a sp_execute).
declare @p1 int
--set @p1=7
exec sp_prepare @p1 output, N'@startDate as datetime,@endDate as dateTime', N'SELECT DISTINCT aceTran.id FROM CallExternalAceTransaction aceTran, CallTransaction callTransaction, Product product, FxCurrency ccy, InterestPlan iplan, CallAccount callAcc, CallAccountProduct caProduct with (xlock) WHERE (callTransaction.id = aceTran.id) AND ( ((callTransaction.dateCreated > @startDate) OR (callTransaction.dateCreated = @startDate)) AND ((callTransaction.dateCreated < @endDate) OR (callTransaction.dateCreated = @endDate)) ) AND caProduct.id = product.id AND iplan.businessUnit = ''CASHMAN'' AND product.denominatedInCurrencyId = ccy.id AND ccy.isoCode = ''NZD'' AND product.usesInterestPlan=iplan.id AND callTransaction.appliesToAccount=callAcc.id AND callAcc.definedByProduct=caProduct.id',1
select @p1
exec sp_execute @p1,'2008-08-14 00:00:00.000','2008-08-14 23:59:59.000'
NB: I did manage to get the query to use the Guide1 SQL plan guide (posted in other post) - by doing an sp_executesql !! Actually in reading the SQL Books online - it does say that the 'statement_text is submitted using sp_executesql and a value for the @params parameter is specified or SQL Server internally submits a statement after parameterizing it' . I have tried to parameterize the query by using a TEMPLATE plan guide (Parameterization forced).
Thanks SQLMEG
> > Unfortunately I don't think I'll be able to use the inner loop join idea > > or your idea though to use the EXISTS statements (I didn't know this [quoted text clipped - 57 lines] > hint this way. (I've never used plan guides myself, so I refer you to > Books Online for details.) Gert-Jan Strik - 18 Nov 2008 22:27 GMT Sorry for the late response, I have been away for a short while.
See response inline
> Hi Gert-Jan Strik > > Thanks for your reply (and found Tom Cooper's one helpful too). The actual > query I am trying to optimize is a bit more complex and has more tables > involved than the one that I have put as an example, and it also takes a lot > longer to run. When tuning performance, it is essential to have the exact query, and exact table definition and indexes. Additional joins can make a difference, and in your case they do (they basically disqualify the indexes on CallTransaction from being useful).
> However it does contain the two tables mentioned - and I have > tried your 2 suggestions of the INNER LOOP JOIN and the Foreign key. [quoted text clipped - 4 lines] > CallExternalAceTransaction referencing CallTransaction seems to have made no > difference to performance. So apparently the relation is the other way around. Apparently CallExternalAceTransaction is the referencing table. It probably would make a difference if CallTransaction were the referencing table.
> Both of these times are not good - but I think the old one is taking about > 25 seconds to run in Production which I understand is not really acceptable. [quoted text clipped - 6 lines] > CallExternalAceTransaction table is 2 million rows, how many are selected in > the query is 3,800 and the size of the average row is about 86 bytes. That is a relatively low number, so the optimizer's choice is surprising.
As mentioned by Erland, it would be best to rewrite this query, because it should be easy to eliminate the DISTINCT and to minimize access to the joined tables by rewriting to EXISTS. But I understand that is not possible in your case.
If you cannot add the foreign key as suggested earlier (from CallTransaction referencing CallExternalAceTransaction) then you could add a nonclustered index on CallExternalAceTransaction(id), so on top of the already existing clustered index. Then remove the join hint from your query when testing.
> Just fyi - here is the rewritten complex query - > [quoted text clipped - 23 lines] > doing an Index seek using this index. Now rewritten - it does an Index scan > on the pk (id). Yes, I have seen you other post (with the DDL), and with all these extra joins, you don't have a useful index on CallTransaction, because there is no longer a useful covering index (the index on (appliesToAccount,dateCreated) is covering, but not useful). You could add an index on CallTransaction(dateCreated) INCLUDE (appliesToAccount), or (preferably) change the existing index on dateCreated to this definition. If you are running SQL Server 2000 or earlier, then simply create the index on CallTransaction(dateCreated, appliesToAccount). This index will cover the query and has the date range column as the leading column, and therefore the optimizer does not have to fall back to the clustered index.
> Also, I don't really understand your example that you illustrated. The > column id on the table is bigint (8 bytes), and the openTransNumber is also a [quoted text clipped - 10 lines] > understand when he says that the clustered index is effectively the whole > table?? Isn't it just the size of the actual id fields? As explained by others, the clustered index contains all table columns. So based on the numbers you mentioned above, a clustered index entry is 86 bytes (on average), the nonclustered index entry only 16.
HTH
 Signature Gert-Jan SQL Server MVP
> Thanks > SQLMEG1 [quoted text clipped - 103 lines] > > > > > > Please can I have some assistance with this problem.
|
|
|