Hi,
We are using FTS in our search to get a performance boost over a
traditional LIKE search.
It is working really well, but there is one issue.
We would like to index and search for text that looks like
"319.109.092". The search itself works fine now - after removing
individual numbers from the noise.dat file and rebuilding the index.
But we would like the search to be a fully exact-match search. A search
for "92" will not find "319.109.092" (which is good), but a search for
"092" will find "319.109.092" (which we would like it not to do).
I suppose this is due to the word-breaking feature. Can it be made to
stop word-breaking on "." (period)?
We are running SQL Server 2000 on Win2003 server.
Best regards,
Thomas Mortensen
Hilary Cotter - 19 Oct 2006 16:28 GMT
Unfortunately, the word breaker will, for the most part, break the text at
white space and non-alphanumeric characters. So 319.109.092 will be
broken into three words/tokens: 319, 109, and 092. This is why that value
shows up in your searches for 092.
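
To make the effect concrete, here is a minimal sketch of the two searches described in this thread. The table dbo.Parts and the full-text indexed column PartNo are hypothetical names chosen for illustration, and the comments restate the behavior reported above rather than anything verified independently.

```sql
-- Assumption: dbo.Parts is a hypothetical table whose PartNo column is
-- full-text indexed and holds values such as '319.109.092'.

-- Reported to return the row: the word breaker splits the value at the
-- periods, so 092 is indexed as a token of its own.
SELECT PartNo
FROM dbo.Parts
WHERE CONTAINS(PartNo, '"092"')

-- Reported not to return the row: 92 is not one of the indexed tokens
-- (319, 109, 092).
SELECT PartNo
FROM dbo.Parts
WHERE CONTAINS(PartNo, '"92"')
```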

Daniel Crichton - 23 Oct 2006 09:59 GMT
(In reply to thomas@oxygensoftware.dk, 19 Oct 2006 02:29:57 -0700)
What you could do is change your indexed words to get around this. For
example, instead of storing 319.109.092, store 319DOT109DOT092. This
way it'll be kept as a single word and not broken up. If you put this into
your actual data column then you would need to convert back when displaying
the results (e.g. REPLACE(mycol,'DOT','.')), or you could use a second column
to store the indexed words (which is what I do in my own setup).
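
A rough sketch of the second-column variant for SQL Server 2000 might look like the following. All object names (dbo.Parts, PartNo, PartNoSearch, PartsCatalog) are hypothetical, the table is assumed to have a full-text index already, and the sketch relies on the word breaker keeping 319DOT109DOT092 as one token, as described above.

```sql
-- Assumption: dbo.Parts, PartNo, PartNoSearch and PartsCatalog are
-- hypothetical names; dbo.Parts already has a full-text index.

-- 1. Add a second column that holds the encoded, searchable form.
ALTER TABLE dbo.Parts ADD PartNoSearch varchar(100) NULL
GO

-- 2. Fill it by replacing each period with the literal text DOT.
UPDATE dbo.Parts
SET PartNoSearch = REPLACE(PartNo, '.', 'DOT')
GO

-- 3. Add the new column to the full-text index and repopulate the catalog.
EXEC sp_fulltext_column 'dbo.Parts', 'PartNoSearch', 'add'
EXEC sp_fulltext_catalog 'PartsCatalog', 'start_full'
GO

-- 4. Encode the search term the same way, then search the encoded column
--    and return the original, unencoded value for display.
DECLARE @term varchar(50)
DECLARE @search varchar(120)
SET @term = '319.109.092'
SET @search = '"' + REPLACE(@term, '.', 'DOT') + '"'

SELECT PartNo
FROM dbo.Parts
WHERE CONTAINS(PartNoSearch, @search)
```

Because the original PartNo column is left untouched, nothing has to be converted back for display. The encoded column does have to be kept in sync whenever PartNo changes, for example in the code that inserts or updates rows.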
Dan