Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / General / Data Warehousing / May 2005

Tip: Looking for answers? Try searching our database.

SQL Server 2000 Query tuning question

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
MattODA - 11 May 2005 04:46 GMT
I have a star schema where the fact table is about 1.2M rows, and about 40
measures wide (a lot, I know). I've been tuning and studying Showplan and
have performance almost where I need it. But I'm encountering something I
don't understand. When I run this query:

select REGION_NAME,
 Sum(case when CalendarMonth = 1 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 2 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 3 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 4 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 5 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 6 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 7 then TOTAL_SALES else null end),
 Sum(case when 8 = 8 then TOTAL_SALES else null end),
 Sum(case when 9 = 9 then TOTAL_SALES else null end),
 Sum(case when 10 = 10 then TOTAL_SALES else null end),
 Sum(case when 11 = 11 then TOTAL_SALES else null end),
 Sum(case when 12 = 12 then TOTAL_SALES else null end),
 SUM((TOTAL_SALES))  
from DateDim DD JOIN StoreDayFact SDF
   on SDF.DateDimKey = DD.DateDimKey
    INNER JOIN StoreDim SD  
   on  SDF.StoreDimKey = SD.StoreDimKey
where DD.CalendarYear = 2005
group by REGION_NAME

it returns in 6 seconds. When I modify it slightly:

select REGION_NAME,
 Sum(case when CalendarMonth = 1 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 2 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 3 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 4 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 5 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 6 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 7 then TOTAL_SALES else null end),
--  Sum(case when 8 = 8 then TOTAL_SALES else null end),
 Sum(case when CalendarMonth = 8 then TOTAL_SALES else null end),
 Sum(case when 9 = 9 then TOTAL_SALES else null end),
 Sum(case when 10 = 10 then TOTAL_SALES else null end),
 Sum(case when 11 = 11 then TOTAL_SALES else null end),
 Sum(case when 12 = 12 then TOTAL_SALES else null end),
 SUM((TOTAL_SALES))  
from ODA_DateDim DD JOIN ODA_StoreDayFact SDF
   on SDF.DateDimKey = DD.DateDimKey
    INNER JOIN ODA_StoreDim SD  
   on  SDF.StoreDimKey = SD.StoreDimKey
where DD.CalendarYear = 2005
group by REGION_NAME

it takes 15 seconds. The "strange" thing to me is, the Execution plan for
both versions of the query are identical, as are the Stats IO output (i.e.
same number of logical reads for both queries). Yet, one takes more than
twice the other to return.

What is happening? More important, is there anything I can do to further
tune Query #2?

Thanks!!
Adam Machanic - 12 May 2005 17:54 GMT
> it takes 15 seconds. The "strange" thing to me is, the Execution plan for
> both versions of the query are identical, as are the Stats IO output (i.e.
[quoted text clipped - 3 lines]
> What is happening? More important, is there anything I can do to further
> tune Query #2?

   Those queries are using different tables.  The first query uses tables
DateDim and StoreFactDay, whereas the second uses ODA_DateDim and
ODA_StoreFactDay.  Do these tables have the same indexes?  Same row counts?
Same data distribution?  Are statistics updated for the latter set of
tables?

Signature

Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--

MattODA - 12 May 2005 18:41 GMT
Yeah, my bad. That was a typo. The two queries do hit the same tables. I did
a DBCC CheckDB and Shrink DB just for fun this morning, and it changed the
times a little bit. But there's still a pretty dramatic "step" in performance
of this query.

Here is that query again. I started by commenting out the lines for months 2
- 12. Then I uncommented one at a time and ran the query a few times. Next to
each is the fastest time I could achieve. The Execution Plan and Logical
Reads were identical every time.

select REGION_NAME,
 Sum(case when CalendarMonth = 1 then TOTAL_SALES else null end), -- 3 secs
 Sum(case when CalendarMonth = 2 then TOTAL_SALES else null end), -- 3 secs
 Sum(case when CalendarMonth = 3 then TOTAL_SALES else null end), -- 4 secs  
 Sum(case when CalendarMonth = 4 then TOTAL_SALES else null end), -- 4 secs
 Sum(case when CalendarMonth = 5 then TOTAL_SALES else null end), -- 5 secs
 Sum(case when CalendarMonth = 6 then TOTAL_SALES else null end), -- 5 secs
 Sum(case when CalendarMonth = 7 then TOTAL_SALES else null end), -- 6 secs
 Sum(case when CalendarMonth = 8 then TOTAL_SALES else null end), -- 6 secs
 Sum(case when CalendarMonth = 9 then TOTAL_SALES else null end), -- 7 secs
 Sum(case when CalendarMonth = 10 then TOTAL_SALES else null end), -- 7 secs
 Sum(case when CalendarMonth = 11 then TOTAL_SALES else null end), -- 11
secs
 Sum(case when CalendarMonth = 12 then TOTAL_SALES else null end), -- 17
secs
 SUM((TOTAL_SALES))  
from DateDim DD JOIN StoreDayFact SDF
   on SDF.DateDimKey = DD.DateDimKey
    INNER JOIN StoreDim SD  
   on  SDF.StoreDimKey = SD.StoreDimKey
where DD.CalendarYear = 2005
group by REGION_NAME

What do you think could be going on?
Thanks,
-Matt

> > it takes 15 seconds. The "strange" thing to me is, the Execution plan for
> > both versions of the query are identical, as are the Stats IO output (i.e.
[quoted text clipped - 9 lines]
> Same data distribution?  Are statistics updated for the latter set of
> tables?
MattODA - 12 May 2005 18:48 GMT
Yeah, my bad, that was a typo. Those two queries do indeed hit the same
tables. I ran a DBCC CheckDB and Shrink Files this morning for fun, and it
changed the overall response times a little bit. But there is still a pretty
dramatic "step" in performance. I also ran Update Stats, with no effect.

Here's that query again. I started by commenting out the lines for months
2-12, then uncommented one line at a time and ran the query a few times. Next
to each line is the fastest response time I was able to get with that line
uncommented. The Execution Plan and Logical Reads were identical every time.

select REGION_NAME,
 Sum(case when CalendarMonth = 1 then TOTAL_SALES else null end), -- 3 secs
 Sum(case when CalendarMonth = 2 then TOTAL_SALES else null end), -- 3 secs
 Sum(case when CalendarMonth = 3 then TOTAL_SALES else null end), -- 4 secs  
 Sum(case when CalendarMonth = 4 then TOTAL_SALES else null end), -- 4 secs
 Sum(case when CalendarMonth = 5 then TOTAL_SALES else null end), -- 5 secs
 Sum(case when CalendarMonth = 6 then TOTAL_SALES else null end), -- 5 secs
 Sum(case when CalendarMonth = 7 then TOTAL_SALES else null end), -- 6 secs
 Sum(case when CalendarMonth = 8 then TOTAL_SALES else null end), -- 6 secs
 Sum(case when CalendarMonth = 9 then TOTAL_SALES else null end), -- 7 secs
 Sum(case when CalendarMonth = 10 then TOTAL_SALES else null end), -- 7 secs
 Sum(case when CalendarMonth = 11 then TOTAL_SALES else null end), -- 11
secs
 Sum(case when CalendarMonth = 12 then TOTAL_SALES else null end), -- 17
secs
 SUM((TOTAL_SALES))  
from DateDim DD JOIN StoreDayFact SDF
   on SDF.DateDimKey = DD.DateDimKey
    INNER JOIN StoreDim SD  
   on  SDF.StoreDimKey = SD.StoreDimKey
where DD.CalendarYear = 2005
group by REGION_NAME

StoreDayFact: 1,279,713 rows
DateDim: 845 rows
StoreDim: 2816 rows

What do you think could be going on?
Thanks,
-Matt

> > it takes 15 seconds. The "strange" thing to me is, the Execution plan for
> > both versions of the query are identical, as are the Stats IO output (i.e.
[quoted text clipped - 9 lines]
> Same data distribution?  Are statistics updated for the latter set of
> tables?
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.