search for words with punctuation - IE 319.109.092

thomas@oxygensoftware.dk - 19 Oct 2006 10:29 GMT
Hi,

We are using FTS in our search to get a performance boost over a
traditional LIKE search.

It is working really well, but there is one issue.

We would like to index and search for text that looks like
"319.109.092". The search itself works now that we have removed the
individual numbers from the noise.dat file and rebuilt the index.
But we would like the search to be a fully exact match. A search
for "92" does not find "319.109.092" (which is good), but a search for
"092" does find "319.109.092" (which we would like it not to do).

I suppose this is due to the word-breaking feature. Can it be made to
stop word-breaking on "." (period)?

We are running SQL Server 2000 on Win2003 server.

Best regards,
Thomas Mortensen

Hilary Cotter - 19 Oct 2006 16:28 GMT
Unfortunately, the word breaker breaks tokens at white space and at most
non-alphanumeric characters. So 319.109.092 is broken into three
words/tokens: 319, 109, and 092. This is why it shows up in your searches
for 092.
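
To make the effect concrete, here is a rough sketch in Python of the behaviour described above. It is only a simplified model and not the actual SQL Server word breaker: the function names simple_word_break and matches are made up for illustration, and the real word breaker has language-specific rules this ignores.

```python
import re

def simple_word_break(text):
    """Split text on whitespace and non-alphanumeric characters.

    Assumption: this is a crude stand-in for the word breaker, used only
    to show why "092" ends up as a separate token.
    """
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]

def matches(term, text):
    # A term matches when it equals one of the indexed tokens exactly.
    return term in simple_word_break(text)

print(simple_word_break("319.109.092"))   # the three separate tokens
print(matches("092", "319.109.092"))      # "092" is a token of its own
print(matches("92", "319.109.092"))       # "92" is not a token
```

Under this model "092" matches because it is indexed as its own token, while "92" does not because no token is exactly "92".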

Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.

This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.

Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

Daniel Crichton - 23 Oct 2006 09:59 GMT
What you could do is change your indexed words to get around this. For
example, instead of storing 319.109.092, store 319DOT109DOT092. This
way it'll be kept as a single word and not broken up. If you put this into
your actual data column then you would need to convert back when displaying
the results (e.g. REPLACE(mycol,'DOT','.')), or you could use a second column
to store the indexed words (which is what I do in my own setup).
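
As a rough illustration of the idea, the following Python sketch encodes the value before indexing and decodes it for display. The helper names (to_indexed_form, to_display_form, simple_word_break) are hypothetical, and the tokenizer is only a simplified model of a word breaker, not the one SQL Server uses.

```python
import re

SEPARATOR = "DOT"  # placeholder for ".", as in the suggestion above

def to_indexed_form(value):
    # Replace every period so the value contains only alphanumerics.
    return value.replace(".", SEPARATOR)

def to_display_form(indexed):
    # Python counterpart of REPLACE(mycol, 'DOT', '.').
    return indexed.replace(SEPARATOR, ".")

def simple_word_break(text):
    # Assumption: split on whitespace and non-alphanumeric characters.
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]

indexed = to_indexed_form("319.109.092")
tokens = simple_word_break(indexed)

print(tokens)                                  # one token for the whole value
print("092" in tokens)                         # partial value no longer matches
print(to_indexed_form("319.109.092") in tokens)  # encoded search term matches
print(to_display_form(indexed))                # original text restored
```

Note that the search term has to be encoded the same way before it is passed to the query. Also, converting back with a plain replace would alter any genuine text that happens to contain "DOT", which is one reason a separate indexed column is the safer of the two options.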

Dan
 