unknown unique key count variance

PhantomRick - 07 Feb 2008 07:20 GMT
I am pretty new to FTS but found I am stuck with a problem that
doesn't seem common.  In all my searching, I have only come across one
post that had the same issue and that was from 2002.

http://groups.google.com.au/group/microsoft.public.sqlserver.fulltext/browse_thread/thread/64e0e6b739e7835f/4164233bb5cf813c?hl=en&lnk=gst&q=rebuild+catalog+different+unique+key+count#4164233bb5cf813c


The only suggestion from that post (answered by Hilary Cotter) was that
the problem could be related to using change tracking with background
indexing, or to very high CPU activity on the server.  Neither is the
case here.

To relate my problem:

I am trying to use FTS to perform content searching on only the pdf
files stored in my database.

While trying to discover why I was getting inaccuracies, I found that
if I rebuild the catalog then do a full population, the unique key
count is different each time.

I tested it 12 times in a row, and the unique key counts were:

Min    10,530
Max    42,731
Average    24,888
Median    21,455

I am running SQL Server 2000 and Adobe iFilter 6 on XP SP2.  This is a
development machine, so there is no external interference to alter my
testing.  I was also able to duplicate the issue on a Windows 2000
server running SQL Server 2000 and iFilter 6.

I am indexing only one table with a single catalog.  I used the wizard
to create this and checked on Hilary's site
(http://www.indexserverfaq.com/SQLFTIWizard.htm) to ensure I was doing
it correctly.

The table holding the files contains a total of 1261 files of which
747 are pdfs.  The rest are a variety
of .wpd, .doc, .zip, .tif, .exe, .jpeg, & .dwg files.

The table has the following layout.

3    gVfsId             uniqueidentifier    16     0
0    sExt               varchar             10     1
0    imgTarget          image               16     1
0    sCrc               varchar             100    1
0    iFilesize          int                 4      1
0    bEncrypted         bit                 1      1
0    bRebuild           bit                 1      1
0    dteCatIncrement    timestamp           8      1

If someone could at least point me in a direction to rectify this
issue, I will be eternally grateful.

Cheers,
Rick.
Hilary Cotter - 07 Feb 2008 16:48 GMT
There was a problem in the RTM release and the earlier service packs
of SQL Server 2000 which behaved like this. It was solved in one of
the later SPs - SP4 IIRC.

Are you doing a full population each time?

PhantomRick - 08 Feb 2008 01:53 GMT
Although I had applied SP4 originally, I re-applied it this morning.

I have then re-tested the catalog, each time rebuilding the catalog
then running a full population through the enterprise manager.  The
results were:

Item count / unique word count
1260    46997
1260    36005
1260    38896
1261    32451
1260    31806
1261    33381

Oh, it's now 1262 files - I added another this morning from other
testing, but before running this test.
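
For comparison with the first test, the six unique word counts listed above can be summarised in the same min/max/average/median form. This is only a minimal Python sketch of that arithmetic; the variable names are illustrative and not part of the original post, and the spread figure is an extra convenience measure rather than anything SQL Server reports.

```python
import statistics

# Unique word counts from the six rebuild + full population runs listed above.
unique_word_counts = [46997, 36005, 38896, 32451, 31806, 33381]

# Same summary figures as quoted for the first 12-run test.
summary = {
    "min": min(unique_word_counts),
    "max": max(unique_word_counts),
    "average": round(statistics.mean(unique_word_counts)),
    "median": round(statistics.median(unique_word_counts)),
}

# How far the largest run is above the smallest, as a percentage of the smallest.
spread_pct = round(
    100 * (summary["max"] - summary["min"]) / summary["min"], 1
)

for name, value in summary.items():
    print(f"{name:8}{value:>8,}")
print(f"spread  {spread_pct:>7}%")
```

For these six runs that gives a minimum of 31,806, a maximum of 46,997, an average of 36,589 and a median of 34,693, so the largest count is still roughly 48% above the smallest even after re-applying SP4.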

I also noticed while checking through the pdf files that I seem to get
a lot of false positives using CONTAINS.  I am presuming this is a
related matter.

Do you have any suggestions as to what else I could try?

Hilary Cotter - 08 Feb 2008 14:17 GMT
This seems abnormal. The way it works is that there are temporary
memory-resident indexes which are merged into shadow indexes and then
into a single master catalog.

You might be getting this discrepancy from there.

Also, are you getting these same false positives when you use a
CONTAINS/CONTAINSTABLE query?

 