
Can't put duplicate words in different expansion sets?

spencer - 19 Apr 2007 05:08 GMT
Can't you have duplicate words with different meanings in different
expansion sets?  Is this a bug?

Please try this repro below.  It uses a thesaurus with 2 expansion sets.
Each set contains the word kind.  The sets are like this:
1. kind, sort, class
2. kind, caring, considerate

When I set up a thesaurus with the above two sets (and restart the FTS
service), I get only one row from this query:

SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');

I expected five rows (each row holds a single word): kind, sort, class,
caring, considerate.

------------------------- THESAURUS REPRO ----------------------------------

CREATE TABLE [dbo].[fts_bug](
[id] [int] IDENTITY(1,1) NOT NULL,
[txt] [varchar](50) NULL,
CONSTRAINT [PK_fts_bug] PRIMARY KEY CLUSTERED ([id] ASC)
)

-- catalog and index
create fulltext catalog testCat as default;
create fulltext index on dbo.fts_bug(txt) key index PK_fts_bug;

-- populate
insert into fts_bug(txt) values ('kind')

insert into fts_bug(txt) values ('sort')
insert into fts_bug(txt) values ('class')

insert into fts_bug(txt) values ('caring')
insert into fts_bug(txt) values ('considerate')

-- see data
select * from fts_bug
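
-- (Added sketch, not part of the original repro.) Full-text population runs
-- in the background, so the CONTAINS queries below can come back empty if
-- they run too soon. Assuming the catalog name testCat from above, this
-- reports 0 once the catalog is idle:
select FULLTEXTCATALOGPROPERTY('testCat', 'PopulateStatus') as populate_status;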

-- Use this thesaurus (restart the FTS service!):
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
 <diacritics_sensitive>0</diacritics_sensitive>
 <expansion>
  <sub>kind</sub>
  <sub>sort</sub>
  <sub>class</sub>
 </expansion>
 <expansion>
  <sub>kind</sub>
  <sub>caring</sub>
  <sub>considerate</sub>
 </expansion>
</thesaurus>
</XML>

-- I would EXPECT this query to return "kind", "sort", "class", "caring" and
-- "considerate",
-- but it only returns one row: "kind"
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');

-- Now use this one (I only changed the first "kind" to "kinds"; restart the
-- FTS service):
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
 <diacritics_sensitive>0</diacritics_sensitive>
 <expansion>
  <sub>kinds</sub>
  <sub>sort</sub>
  <sub>class</sub>
 </expansion>
 <expansion>
  <sub>kind</sub>
  <sub>caring</sub>
  <sub>considerate</sub>
 </expansion>
</thesaurus>
</XML>

-- Running the same query now returns 3 rows: "kind", "caring",
-- "considerate", which you would expect
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');

-- clean up
drop fulltext index on dbo.fts_bug;
drop fulltext catalog testCat;
delete  from fts_bug
drop table fts_bug

------------------------- END THESAURUS REPRO ----------------------------------

Hilary Cotter - 19 Apr 2007 12:54 GMT
This is SQL 2005 correct?

I can repro this on a SQL 2005 box. It looks like you must have distinct
words in your thesaurus file.

I suspect this is by design as opposed to an actual bug. Use Connect to
raise it as a bug, and Microsoft will confirm whether it is by design or a bug.

spencer - 20 Apr 2007 14:37 GMT
> This is SQL 2005 correct?

I'm using SQL 2005 Express

> I can repro this on a SQL 2005 box. It looks like you must have distinct
> words in your thesaurus file.
>
> I suspect this is by design as opposed to an actual bug. Use Connect to
> raise it as a bug, and Microsoft will confirm whether it is by design or a bug.

Assuming it's not a bug, how do you get around it?  There are plenty of
words that have dual meanings.

I wasn't familiar with "connect" to raise the bug.  I found it here:
http://connect.microsoft.com/SQLServer

[This is beside the point: a 1x1 inch image on that page took a while to
load, so I looked at its size. It was almost half a megabyte, at 3239x2432
pixels! My 22 inch monitor couldn't even display the whole image width at
100%! How could that happen? Some of us don't have T1s at home, you know
;-)   ]
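
A possible way around the dual-meaning question above, offered as an untested
sketch rather than a confirmed fix: leave the ambiguous word out of the
thesaurus and list the synonyms in the CONTAINS condition itself. The table
and column come from the repro; which synonyms belong to which meaning is an
assumption made for illustration.

------------------------- WORKAROUND SKETCH ----------------------------------

-- "kind" in the sense of "type": spell out the first set by hand
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, '"kind" OR "sort" OR "class"');

-- both meanings at once: spell out the union of the two sets
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, '"kind" OR "sort" OR "class" OR "caring" OR "considerate"');

-- Alternative, assuming the second thesaurus from the repro is installed
-- and that "kinds" expands the same way "kind" was observed to:
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS, "kind") OR FORMSOF(THESAURUS, "kinds")');

------------------------- END WORKAROUND SKETCH ------------------------------

The explicit OR lists do not depend on the thesaurus file at all, so they
avoid the FTS service restart, at the cost of keeping the synonym lists in
the queries rather than in one place.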

 