
SQL Server Forum / Other Technologies / Full-Text Search / July 2004


searching for apostrophes

Andy Fish - 12 Jul 2004 17:46 GMT
Hi,

using SQL server 2000, I have a ft indexed column that contains the name
O'Donnell (for the sake of argument).

using CONTAINSTABLE, I would like to match this against any of the following
search criteria:

1    O'Donnell
2    ODonnell
3    O Donnell
4    Donnell

I have changed the noise list so that O is a word, but from what I can tell
at the moment, the apostrophe is not counted as a word break token, so only
#1 matches.

#2 is not a vital requirement, so if there was a simple way of breaking a
word at an apostrophe, I could catch the other 3 cases. But I can't see any
way of tweaking the word breaking algorithm

any ideas please?

Andy
Hilary Cotter - 12 Jul 2004 23:03 GMT
With the English word breaker (US and Queen's), a search on O'Donnell,
correctly escaped (i.e. O''Donnell), works.

ie select * from TableName where contains(*,'o''donnell')
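
For anyone building this query from application code, here is a minimal Python sketch of the escaping step. The helper name `sql_literal` is made up for illustration, and where the client library supports parameters, passing the search string as a parameter is generally the safer route.

```python
def sql_literal(text: str) -> str:
    """Wrap text in single quotes for T-SQL, doubling any embedded apostrophe."""
    return "'" + text.replace("'", "''") + "'"


# Rebuild the query quoted above; TableName is the placeholder from the post.
query = "select * from TableName where contains(*," + sql_literal("o'donnell") + ")"
print(query)
```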

However to do the form of matching you require you would have to use some
form of thesaurus expansion, where a search on O'Donnell would be matched
with ODonnell, O Donnell, Donnell. The thesaurus option a
supported option in SQL FTS 7, or 2000.

You also have the option of doing a like for this form of a search.
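
A rough sketch of how the LIKE route could cover all four search forms: strip apostrophes and spaces from both the stored value and the search term, then do a substring match. This is an assumption about one way to do it, not something tested against the original poster's data. `NameColumn` is a hypothetical column and the helper names are invented here. The Python below builds the T-SQL predicate and mimics the comparison locally.

```python
def normalize(text: str) -> str:
    """Drop apostrophes and spaces so O'Donnell, O Donnell and ODonnell coincide."""
    return "".join(ch for ch in text if ch not in "' ")


def like_predicate(column: str, term: str) -> str:
    """Build a T-SQL predicate comparing the normalized column to the normalized term."""
    needle = normalize(term)
    # Bracket LIKE's special characters so they match literally ("[" must go first).
    for special in "[%_":
        needle = needle.replace(special, "[" + special + "]")
    return f"REPLACE(REPLACE({column}, '''', ''), ' ', '') LIKE '%{needle}%'"


def matches(stored: str, term: str) -> bool:
    """Local stand-in for the predicate, assuming a case-insensitive collation."""
    return normalize(term).lower() in normalize(stored).lower()


# The four search forms from the original question, checked against O'Donnell.
for term in ["O'Donnell", "ODonnell", "O Donnell", "Donnell"]:
    print(term, "->", like_predicate("NameColumn", term), matches("O'Donnell", term))
```

The leading % means no index can be used, so this scans the table, and a substring match will also pick up names such as McDonnell.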

John Kane - 13 Jul 2004 01:47 GMT
Andy,
Hilary, I think the following sentence is incomplete... "The thesaurus
option a supported option in SQL FTS 7, or 2000."  Did you mean to say that
this option is supported or not supported in SQL FTS 7.0 or 2000?  For the
latter, while it is not directly supported by Microsoft, the XML Thesaurus
option does work with FREETEXT or FREETEXTTABLE in SQL Server 2000, all SP
levels shipped to-date (SP1 to SP3a) and the forthcoming SP4. It does not
work with CONTAINS or CONTAINSTABLE, so in order to use this option you will
need to switch to FREETEXT*. Note, I have confirmed all of this with
Microsoft.

Hilary, in your view will FORMSOF(Inflectional) work with proper names,
such as O'Donnell or O'Leary? The latter can be easily tested with the
authors table in the Pubs database.

Regards,
John

Andy Fish - 13 Jul 2004 07:13 GMT
> Andy,
> Hilary, I think the following sentence is incomplete... "The thesaurus
[quoted text clipped - 10 lines]
> such as O'Donnell or O'Leary? The latter can be easily tested with the Pubs
> database table authors.

Thanks to both for your replies. Unfortunately using an unsupported option
would be out of the question for this customer. In any case, a switch to
freetext would not be an option as the text queries are fairly structured
and use wildcards. I also tried formsof(inflectional) and that did not help.

There is definitely some processing going on with apostrophes though. A
search for
   contains(*,'john''s')
returns all occurrences of john as well as john's but does not return johns.

Whilst the engine generally works, I must admit I find the lack of
documentation about the exact rules it applies (and the lack of ability to
modify those rules) somewhat frustrating (Unless I am missing some useful
load of documentation). Also I get the impression MS is not committed to
supporting the search engine as a separate product - there seem to be
slightly different versions and variants embedded in different places.

Andy
Hilary Cotter - 13 Jul 2004 14:53 GMT
Some people will have two columns in their table: one with the content as it
is written and one with the content reversed. Then they do a wildcard
search, i.e.

select * from tablename where contains(reversecolumn,'"llennod*"')

to achieve this sort of functionality. It works best when the content is
small, ie containing few words/tokens per column.
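
To illustrate the reversed-column trick, a minimal Python sketch of the two pieces involved: the value stored in the reversed column and the prefix term used to search it. The function names are hypothetical, and the search term is wrapped in double quotes on the assumption that the documented prefix-term syntax is wanted.

```python
def reverse_text(text: str) -> str:
    """Value to store in the reversed column (characters in reverse order)."""
    return text[::-1]


def suffix_search_term(ending: str) -> str:
    """Full-text prefix term that finds words ending in `ending` via the reversed column."""
    return '"' + reverse_text(ending) + '*"'


print(reverse_text("O'Donnell"))      # llennoD'O
print(suffix_search_term("donnell"))  # "llennod*"
```

The reversed token llennoD'O begins with llennod (ignoring case), which is why a prefix search on the reversed column can find names ending in Donnell.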

Hilary Cotter - 14 Jul 2004 12:00 GMT
yikes! I meant to say the thesaurus option is NOT supported in SQL 7 and SQL
2000.

Signature

Hilary Cotter
Looking for a book on SQL Server replication?
http://www.nwsu.com/0974973602.html

John Kane - 13 Jul 2004 15:33 GMT
Agreed, not supported by Microsoft, but will still be functional in all
versions of SQL Server 2000, including SP4 using FREETEXT or FREETEXTTABLE.
Additionally, it will be supported by Microsoft in SQL Server 2005 (Yukon).

Regards,
John
