Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / Programming / SQL / December 2008

Tip: Looking for answers? Try searching our database.

About Index design

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Savvoulidis Iordanis - 27 Dec 2008 13:15 GMT
I have a table COUPON with a clustered index on two columns (username,
date_inserted).  COUPON table references USERS table on username.

Q1. Should I create the clustered index on (username, date_inserted) or
(date_inserted, username)?

Q2. Should I create another non-clustered index on username column to assist
the joined queries on both tables?
Uri Dimant - 27 Dec 2008 16:01 GMT
Hi
It depends on what kind of query you are running. Can you show us your
typical query , where condition, joins and etc..?

>I have a table COUPON with a clustered index on two columns (username,
> date_inserted).  COUPON table references USERS table on username.
[quoted text clipped - 5 lines]
> assist
> the joined queries on both tables?
Erland Sommarskog - 27 Dec 2008 16:02 GMT
> I have a table COUPON with a clustered index on two columns (username,
> date_inserted).  COUPON table references USERS table on username.
>
> Q1. Should I create the clustered index on (username, date_inserted) or
> (date_inserted, username)?

You posted a very similar question yesterday, and I repeat my answer
here:

That depends on the queries performed against the table. For instance, if
this is a very common query:
 
   SELECT TOP 200 ...
   FROM   coupons
   ORDER BY inserted_date

Then it's better to have inserted_date first.

But if there are no queries on inserted_date at all, then it's useless
to put it first in the index.


> Q2. Should I create another non-clustered index on username column to
> assist the joined queries on both tables?

You have rephrased this question, and the in light I answer differently
this time.

If your query is:

 SELECT ...
 FROM   coupons c
 JOIN   users u ON c.username = u.username
 WHERE  d.date_inserted BETWEEN ...

It's useful to have the clustered index with date_inserted first, but
you have no benefit from an index on coupons.username, because you
will read the rows in the coupons table, and the look up the users in
the other table.

On the other hand if the query goes:

 SELECT ...
 FROM   coupons c
 JOIN   users u ON c.username = u.username
 WHERE  u.city = 'Stockholm'

You certainly have a benefit of an index on coupons.username, but you
already have one.

If you run both both types of queries, you will need two indexes, one
with date_inserted as the first column and one with username.

Finally, when it comes to performance, there are rarely any quick
questions with quick answers, but there is a lot of "it depends".
So open-ended questions like your easily get long-winding answers.
And they still might now make you any wiser.

Signature

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Savvoulidis Iordanis - 30 Dec 2008 01:28 GMT
Sorry for reposting. But it seems that there is over 12 hours delay, until my
post shows up, which may seem to me something went wrong.
Savvoulidis Iordanis - 30 Dec 2008 01:56 GMT
Thanks Erland.
Well all my queries on COUPON table, are dependent on the username and
date_inserted. I always use both username and inserted_date in the WHERE
clause, because each user sees only his own records in the timespan he
chooses.

But, since the number of COUPON records are much much more than the number
of users inserting them each date, shouldn't the COUPON records be clustered
indexed first on username, so after a certain user is found in COUPON, all
his COUPON records are side by side? That way less records have to be
examined, because then a date search would be applied on contiguous records.
I believe, if the clustered index is (date_inserted, username) then the
whole index would have to be examined each time, first locating the starting
date up to end_date, and then filtering out all other user's records.

I don't know if i made myself clear, but I look forward to reading a reply
anyway.
TIA
Alex Kuznetsov - 30 Dec 2008 18:56 GMT
On Dec 29, 7:56 pm, Savvoulidis Iordanis
<SavvoulidisIorda...@discussions.microsoft.com> wrote:
> Thanks Erland.
> Well all my queries on COUPON table, are dependent on the username and
[quoted text clipped - 14 lines]
> anyway.
> TIA

Instead of choosing a clustered index, you can just create a covering
one to speed up your queries.
Erland Sommarskog - 30 Dec 2008 23:15 GMT
> Well all my queries on COUPON table, are dependent on the username and
> date_inserted. I always use both username and inserted_date in the WHERE
[quoted text clipped - 10 lines]
> each time, first locating the starting date up to end_date, and then
> filtering out all other user's records.

If I understand this correctly, the query you have in mind goes something
like:

   SELECT ...
   FROM   coupons
   WHERE  username = @username
     AND  inserted_date BETWEEN ....

Yes, in this case an index with (username, inserted_date) is better
than an index on (inseted_date, username).

Signature

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Savvoulidis Iordanis - 31 Dec 2008 16:02 GMT
What about record fragmentation caused by the column order in the clustered
index, that 'TheSQLGuru' mentions? See below. Is my fear a real fear or I
shouldn't worry about it?
Erland Sommarskog - 31 Dec 2008 16:54 GMT
> What about record fragmentation caused by the column order in the
> clustered index, that 'TheSQLGuru' mentions? See below. Is my fear a
> real fear or I shouldn't worry about it?

With a clustered index on username first, it is likely that you will
experience page splits, and thus gets fragmentation. You need to one
way or another run ALTER INDEX <name> ON coupons REBUILD everyonce in a
while.

Presumably you have more tables in your database, and there are probably
more that are likely to be prone to fragmentation. The easist way to address
this is to set up a maintenance plan. The downside is that an MP will
not defragment only when needed, but as long as you have cycles to spare,
this is not much of an issue.

Signature

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

TheSQLGuru - 27 Dec 2008 16:32 GMT
In addition to the queries performed against the table you also need to take
inserts/deletes/updates into account when deciding what to use for your
clustered index.  You can get a lot of fragmentation which can affect
performance in a number of ways.

Also don't forget that whatever columns you put in the clustered index get
carried along as the pointer on all non-clustered indexes.

Signature

Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net

>I have a table COUPON with a clustered index on two columns (username,
> date_inserted).  COUPON table references USERS table on username.
[quoted text clipped - 5 lines]
> assist
> the joined queries on both tables?
Savvoulidis Iordanis - 30 Dec 2008 02:22 GMT
> In addition to the queries performed against the table you also need to take
> inserts/deletes/updates into account when deciding what to use for your
> clustered index.  You can get a lot of fragmentation which can affect
> performance in a number of ways.

Well that's true (and scaring too). My DB is rarely changed after the
records are inserted. I use it for statistical information. So as you
mention, if I have the clustered index based on the date_inserted first,
there would be no fragmentation. The users will only have contention on the
last DB page inserted. But I don't know if the queries would be as fast.

Anyway, would you care reading my post to Erland above and make a comment?
I'm a bit confused here.

TIA
TheSQLGuru - 30 Dec 2008 18:42 GMT
I will be pretty much unavailable for a review of the thread for several
days.  You can send me an email at address below with necessary details of
what you have / are looking for and I will try to get to it this weekend
perhaps.

Signature

Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net

>> In addition to the queries performed against the table you also need to
>> take
[quoted text clipped - 13 lines]
>
> TIA
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2010 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.