I am reposting this under this subject area as it is more appropriate here.
----------------------------------------------
My understanding of the fulltext search using the thesaurus function is that
it will not work until the TSxxx.XML file is populated (in my case the
tsENU.xml file.)
Is this true?
We are migrating from SQL 2000 and I could have sworn this was working
there. It was my impression that you did not have the option to override or
enhance the Thesaurus on SQL 2000 but whatever dictionary file was being used
there does not appear to be available in the SQL 2005 install.
I do see where the MS Office products use the MSTH3AM.LEX & MSTH3BR.LEX
dictionary files to support the thesaurus lookup there. Can these files
somehow be migrated over to SQL Server 2005 for use there or somehow be read
and converted to an XML thesaurus?
--------------------------------
Here's a thought........I believe MS Word was installed on the machine that
had SQL 2000 on it. I know that when I set up a new desktop, spellcheck &
thesaurus are not available to other applications (i.e. Outlook Express)
until after MS Word is installed. Could this be the reason the Thesaurus
function worked on the previous machine and not the machine running SQL 2005?
Our SQL 2005 machine is running the 64 bit version of this software.
--------------------------------
I have opened a trouble ticket on this issue and will continue to post
updates as this progresses (so you don't have to.) What I have found thus far
is that the standard advice of removing the commented xml code in the
tsENU.xml file does not cut it. Once you do this, the xml tries to reference
the schema (tsSchema.xml) which is missing from the SQL Server 2005 install.
I located a copy of the file from an earlier version (SQL 2000 SP3) and that
solved my missing schema problem but still the Thesaurus function would not
work (even after adding the substitution parameters in tsENU.xml.)
I burned one of my support incidents from my MSDN subscription and contacted
the support center. They checked several of the machines there and could not
find the missing schema file amongst the installed files either.
Bottom line, it sounds like this might be an install problem (or an
undocumented feature.) I will post the fix here when I find out more.
answers inline.

Signature
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
>I am reposting this under this subject area as it is more appropriate here.
> ----------------------------------------------
[quoted text clipped - 3 lines]
> tsENU.xml file.)
> Is this true?
Yes, this is true. Only SQL FTS predicates which are thesaurus aware will
use the thesaurus functions, ie formsof, or Freetext if your search phrase
is not wrapped in quotation marks.
> We are migrating from SQL 2000 and I could have sworn this was working
> there. It was my impression that you did not have the option to override
[quoted text clipped - 8 lines]
> read
> and converted to an XML thesaurus?
The thesaurus functions of MS Office do not provide services to SQL FTS.
This was not the case in SQL 2000 and is not the case in SQL 2005.
> --------------------------------
> Here's a thought........I believe MS Word was installed on the machine
[quoted text clipped - 5 lines]
> 2005?
> Our SQL 2005 machine is running the 64 bit version of this software.
No. Run regmon and you will see that it does not use these files.
> --------------------------------
> I have opened a trouble ticket on this issue and will continue to post
[quoted text clipped - 10 lines]
> not
> work (even after adding the substitution parameters in tsENU.xml.)
It works fine for me and I do not have this file present on my SQL 2005
installations. Could you possible post your thesaurus file here? Note that
tsSchema.xml is a name space. As a namespace the file may or may not exist
on your system or anywhere else. Its just used to prevent name collisions.
here is a code sample which works for me:
--open up C:\Program files\Microsoft SQL
Server\MSSQL.1\MSSQL\Binn\FTEREF\TSENU.XML
--save it as C:\Program files\Microsoft SQL
Server\MSSQL.1\MSSQL\Binn\FTEREF\TSENU.bck
--save it again as as C:\Program files\Microsoft SQL
Server\MSSQL.1\MSSQL\Binn\FTEREF\TSENU.XML
--Replace the contents of it with this
/*
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics = false/>
<expansion>
<sub>Internet Explorer</sub>
<sub>IE</sub>
<sub>IE5</sub>
<sub>IE6</sub>
</expansion>
<expansion>
<sub>Bombay</sub>
<sub>Mumbai</sub>
</expansion>
<expansion>
<sub>yogurt</sub>
<sub>yoghurt</sub>
<sub>yohghurt</sub>
<sub>yoghourt</sub>
</expansion>
<replacement>
<pat>NT5</pat>
<pat>W2K</pat>
<sub>Windows 2000</sub>
</replacement>
<replacement>
<pat>sex</pat>
<sub>gender</sub>
</replacement>
<expansion>
<sub>run</sub>
<sub>jog</sub>
</expansion>
</thesaurus>
</XML>
*/
CREATE DATABASE CODE_SAMPLE3
GO
USE CODE_SAMPLE3
GO
CREATE TABLE CODE_SAMPLE3
(pk INT NOT NULL IDENTITY CONSTRAINT PRIMARYKEY PRIMARY KEY,
CHARCOL VARCHAR(200))
GO
CREATE FULLTEXT CATALOG TEST AS DEFAULT
GO
CREATE FULLTEXT INDEX ON CODE_SAMPLE3 (CHARCOL)KEY INDEX PRIMARYKEY
GO
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('sex')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('gender')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('Bombay')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('Mumbai')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('jog')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('jogged')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('jogs')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('jogging')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('Internet Explorer')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('IE')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('IE5')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('IE6')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('yogurt')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('yoghurt')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('yohghurt')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('yoghourt')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('NT5')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('w2k')
INSERT INTO CODE_SAMPLE3 (CHARCOL) VALUES('windows 2000')
GO
SELECT * FROM CODE_SAMPLE3 WHERE CONTAINS(*,'sex')
-- 1 row returned - the row with sex in it
SELECT * FROM CODE_SAMPLE3 WHERE FREETEXT(*,'sex')
-- 1 row returned - the row with gender in it, notice how in a freetext
search a search on sex is replaced by a search on gender
SELECT * FROM CODE_SAMPLE3 WHERE CONTAINS(*,'formsof(thesaurus,sex)')
-- 1 row returned - the row with gender in it
SELECT * FROM CODE_SAMPLE3 WHERE FREETEXT(*,'run')
-- 4 rows returned, all stemmed forms of the verb job
SELECT * FROM CODE_SAMPLE3 WHERE
CONTAINS(*,'formsof(thesaurus,yogurt)')
-- 4 rows returned all spellings of yogurt
SELECT * FROM CODE_SAMPLE3 WHERE CONTAINS(*,'formsof(thesaurus,IE)')
-- 4 rows returned all variants of IE
SELECT * FROM CODE_SAMPLE3 WHERE CONTAINS(*,'"windows 2000"')
-- 1 rows returned --windows 2000
SELECT * FROM CODE_SAMPLE3 WHERE CONTAINS(*,'"nt5"')
-- 1 rows returned --NT 5
SELECT * FROM CODE_SAMPLE3 WHERE CONTAINS(*,'"w2k"')
-- 1 rows returned --w2k
SELECT * FROM CODE_SAMPLE3 WHERE FREETEXT(*,'"w2k"')
-- 1 rows returned --w2k
SELECT * FROM CODE_SAMPLE3 WHERE contains(*,'formsof(thesaurus,"windows
2000")')
-- 1 rows returned --windows 2000
SELECT * FROM CODE_SAMPLE3 WHERE contains(*,'formsof(thesaurus,"nt5")')
-- 1 rows returned --windows 2000
SELECT * FROM CODE_SAMPLE3 WHERE contains(*,'formsof(thesaurus,"w2k")')
-- 1 rows returned --windows 2000
> I burned one of my support incidents from my MSDN subscription and
> contacted
[quoted text clipped - 4 lines]
> Bottom line, it sounds like this might be an install problem (or an
> undocumented feature.) I will post the fix here when I find out more.
T Mann - 13 Apr 2006 16:39 GMT
Hello Hilary,
I loaded your example.
I ran this query:
SELECT * FROM CODE_SAMPLE3 WHERE contains(*,'formsof(thesaurus,"yogurt")')
-- 1 rows returned --yogurt
Now by my count, there are 4 different spellings of yogurt in the table
13 yogurt
14 yoghurt
15 yohghurt
16 yoghourt
and my thesaurus has these defined:
<expansion>
<sub>yogurt</sub>
<sub>yoghurt</sub>
<sub>yohghurt</sub>
<sub>yoghourt</sub>
</expansion>
So I would expect there to be 4 rows returned.
My result set is this:
13 yogurt
Does this work on your installation of SQL 2005?

Signature
Thomas Mann
Hilary Cotter - 13 Apr 2006 18:36 GMT
yes, it does, but you have to bounce MSFTESQL for it to pick up the changes.
Also that code sample seems to have been from one of the early CTP's - you
may have noticed that the location has changed. Also my example is for US
English.

Signature
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
> Hello Hilary,
> I loaded your example.
[quoted text clipped - 21 lines]
>
> Does this work on your installation of SQL 2005?
T Mann - 13 Apr 2006 21:02 GMT
SUCCESS!
The restart of MSFTESQL was the key here.
I did not find that in my searches at all.
Here are my results from the same query:
PK CHARCOL
13 yogurt
14 yoghurt
15 yohghurt
16 yoghourt
I hope somebody finds this useful as well.
You don't know of a way to load the tsENU.xml (or somehow extract the
Thesaurus) from the MSTH3AM.lex or MSTH3BR.lex files?
And by the way.........thanks a ton for the help!

Signature
Thomas Mann
wheeldo - 27 Apr 2006 12:21 GMT
Hilary, you're a star... I've spent absolutely ages trying to get this
working. It seems the diacritics settings in the comments section of
the sample files is incorrect on my machine! (In fact the MSDN article
about configuring thesaurus files still looks wrong??
http://msdn2.microsoft.com/en-US/library/ms345186.aspx
On another note: I'm having real difficulty in stopping and starting
the MSFTESQL service. I've tried it via Administrative Tools>Services
and via the Surface Area Config tool just in case that made a
differance, but to no avail.
It just says it's attempting to stop it, but then never manages it.
Most annoying as it means every time I change the XML file I have to
reboot !! It'll be even more annoying when it goes on the live server
;)
Has anybody else experienced this, or have any ideas?
Thanks.
Hilary Cotter - 27 Apr 2006 13:08 GMT
My understanding is that the diacritics setting is leftover from the
SharePoint Thesaurus files that SQL Server also uses. Note they don't share
them, its just that SQL Server seems to use SharePoint's ones as a template.
I have never heard of your particular problem with the server, and I have
never experienced it. What SKU and version are you running? Did you apply
SP1?

Signature
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
> Hilary, you're a star... I've spent absolutely ages trying to get this
> working. It seems the diacritics settings in the comments section of
[quoted text clipped - 15 lines]
>
> Thanks.
wheeldo - 27 Apr 2006 17:16 GMT
Hi Hilary,
I'm using SQL Server 2005 Developer Edition:
Select @@version gives:
Microsoft SQL Server 2005 - 9.00.1399.06 (Intel X86) Oct 14 2005
00:33:37 Copyright (c) 1988-2005 Microsoft Corporation Developer
Edition on Windows NT 5.1 (Build 2600: Service Pack 2)
I'll have a dig around in the log files... hopefully find some clues. I
shall let you know if I find out what the issue is.
Wheeldo.
wheeldo - 27 Apr 2006 19:05 GMT
Hi,
I figured out what the problem is here: If you stop the full-text index
service with the SQL database engine itself running, SQL automatically
restarts the full-text index service the instant it stops (Makes sense
I suppose ;) ). I found this out by looking in XPs application event
log - it showed the MSFTE successfully shut down - then a couple of
seconds later successfully restarted - despite the fact I hadn't
requested it to do so.
However... This "feature" upsets SQL Server Configuration Manager and
it hangs. Then a few minutes later states "Connection to target machine
could not be made in timely fasion" (yes - it says fasion on the
messagebox and not fashion!).
Trying to stop the service from the surface area tool does the same and
hangs this also.
If you stop the service from within control panel > admin tools >
services - it at least doesn't hang anything...
Hilary Cotter - 28 Apr 2006 01:43 GMT
damn ur good!

Signature
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
> Hi,
>
[quoted text clipped - 16 lines]
> If you stop the service from within control panel > admin tools >
> services - it at least doesn't hang anything...