Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / Other Technologies / Full-Text Search / August 2007

Tip: Looking for answers? Try searching our database.

Free-Text Search "AND NOT" Wrong results - HELP, please!

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
joel paula - 09 Aug 2007 15:13 GMT
Hi,

I have a Photo Database, with a Photo table. Each photo has several
Varchar fields for storing caption, description, keywords, etc. I have
a Full-Text index/catalog for these fields.

This is the problem:
I want to find all photos that are related to "Simon" and "Jane", but
no photos taken when they entered or exited the "clinic".

So, my Query looks like:
SELECT *
FROM Fotos LEFT JOIN CONTAINSTABLE(Fotos,*, 'ISABOUT("Simon") AND
ISABOUT("Jane") AND NOT ISABOUT("clinic")') AS KEY_TBL ON Fotos.Cod =
KEY_TBL.[KEY]
WHERE KEY_TBL.Rank>0
ORDER BY ISNULL(KEY_TBL.Rank,0) DESC, ISNULL(Fotos.FotoDate,
Fotos.CreatedDate) DESC

The problem is that it returns photos with descriptions that include
the word "Clinic" (in  this particular query, the first row in the
resultset has the word clinic in it!).

I noticed "Simon", "Jane" and "clinic" could reside on different
columns, so I rewrote the query:
SELECT *
FROM Fotos LEFT JOIN CONTAINSTABLE(Fotos,*, 'Simon AND NOT clinic') AS
KEY_TBL0 ON Fotos.Cod = KEY_TBL0.[KEY]
LEFT JOIN CONTAINSTABLE(Fotos,*, Jane AND NOT clinic') AS KEY_TBL1 ON
Fotos.Cod = KEY_TBL1.[KEY]
WHERE KEY_TBL0.Rank+KEY_TBL1.Rank>0
ORDER BY KEY_TBL0.Rank+KEY_TBL1.Rank DESC, ISNULL(Fotos.FotoDate,
Fotos.CreatedDate) DESC

The same problem occurred.

What Am I doing wrong?

Thanks for your help.

JOEL
Daniel Crichton - 09 Aug 2007 15:36 GMT
joel wrote  on Thu, 09 Aug 2007 07:13:00 -0700:

> Hi,

> I have a Photo Database, with a Photo table. Each photo has several
> Varchar fields for storing caption, description, keywords, etc. I have
> a Full-Text index/catalog for these fields.

> This is the problem:
> I want to find all photos that are related to "Simon" and "Jane", but
> no photos taken when they entered or exited the "clinic".

> So, my Query looks like:
> SELECT *
[quoted text clipped - 4 lines]
> ORDER BY ISNULL(KEY_TBL.Rank,0) DESC, ISNULL(Fotos.FotoDate,
> Fotos.CreatedDate) DESC

> The problem is that it returns photos with descriptions that include
> the word "Clinic" (in  this particular query, the first row in the
> resultset has the word clinic in it!).

> I noticed "Simon", "Jane" and "clinic" could reside on different
> columns, so I rewrote the query:
[quoted text clipped - 6 lines]
> ORDER BY KEY_TBL0.Rank+KEY_TBL1.Rank DESC, ISNULL(Fotos.FotoDate,
> Fotos.CreatedDate) DESC

> The same problem occurred.

> What Am I doing wrong?

> Thanks for your help.

> JOEL

By using a LEFT JOIN you've told the processor to include all the rows where
'Simon AND NOT clinic' is true, whether or not there is a match to them in
'Jane AND NOT clinic'. Assuming you have only two indexed columns (because
you're only doing two CONTAINSTABLE clauses), this means that if a row has
"Simon" in col1, and "Jane clinic" in col2, then you'll still get a result
because the first clause finds a row and you're not filtering out failures
in the second clause.

If you use an INNER JOIN you'll only get the rows where both conditions are
true. However, if there is a possibility of "clinic" occurring in a 3rd
column then you'll need to add another CONTAINSTABLE clause to filter those
out.

An easier solution might to be create a new column that has the contents of
all of the indexed varchar columns concatentated together and index that -
they you need only one clause. I do this myself for my own sites where I
want to create a simple query over a number of indexed columns, so I have an
extra column with all the words from those others dumped into it. Makes life
a bit simpler :)

Dan
joel paula - 09 Aug 2007 16:06 GMT
Thanks for your answer Dan.

I modified the query as follows:
SELECT *
FROM Fotos INNER JOIN CONTAINSTABLE(Fotos,*, 'Simon AND NOT clinic')
AS KEY_TBL0 ON Fotos.Cod = KEY_TBL0.[KEY]
  INNER JOIN CONTAINSTABLE(Fotos,*, 'Jane AND NOT clinic') AS
KEY_TBL1 ON Fotos.Cod = KEY_TBL1.[KEY]
WHERE KEY_TBL0.Rank>0 AND KEY_TBL1.Rank>0
ORDER BY KEY_TBL0.Rank+KEY_TBL1.Rank DESC, ISNULL(Fotos.FotoDate,
Fotos.CreatedDate) DESC

I changed it to use Inner Join, but also ensured that any
Containstable returns significant photos, by having separate
Containstabel.Rank>0 in the WHERE clause. But it returns a row where a
field has something like "Simon and Jane entering the magnetic
resonance clinic...". I don't see any error on this new query...

I ended up solving this with and aditional NOT CONTAINS in the WHERE
clause:
SELECT *
FROM Fotos INNER JOIN CONTAINSTABLE(Fotos,*, 'Simon AND NOT clinic')
AS KEY_TBL0 ON Fotos.Cod = KEY_TBL0.[KEY]
  INNER JOIN CONTAINSTABLE(Fotos,*, 'Jane AND NOT clinic') AS
KEY_TBL1 ON Fotos.Cod = KEY_TBL1.[KEY]
WHERE KEY_TBL0.Rank>0 AND KEY_TBL1.Rank>0 AND NOT
CONTAINS(Fotos.*,'clinic')
ORDER BY KEY_TBL0.Rank+KEY_TBL1.Rank DESC, ISNULL(Fotos.FotoDate,
Fotos.CreatedDate) DESC

There is no other way to , but I am concerned with performance.

Thanks for your help.

All the best,
JOEL

> joel wrote  on Thu, 09 Aug 2007 07:13:00 -0700:
>
[quoted text clipped - 54 lines]
>
> - Show quoted text -
Daniel Crichton - 10 Aug 2007 11:18 GMT
joel wrote  on Thu, 09 Aug 2007 08:06:01 -0700:

> Thanks for your answer Dan.

> I modified the query as follows:
> SELECT *
[quoted text clipped - 5 lines]
> ORDER BY KEY_TBL0.Rank+KEY_TBL1.Rank DESC, ISNULL(Fotos.FotoDate,
> Fotos.CreatedDate) DESC

> I changed it to use Inner Join, but also ensured that any
> Containstable returns significant photos, by having separate
> Containstabel.Rank>0 in the WHERE clause. But it returns a row where a
> field has something like "Simon and Jane entering the magnetic
> resonance clinic...". I don't see any error on this new query...

> I ended up solving this with and aditional NOT CONTAINS in the WHERE
> clause:
[quoted text clipped - 7 lines]
> ORDER BY KEY_TBL0.Rank+KEY_TBL1.Rank DESC, ISNULL(Fotos.FotoDate,
> Fotos.CreatedDate) DESC

> There is no other way to , but I am concerned with performance.

> Thanks for your help.

> All the best,
> JOEL

If you break your query up into just one clause for testing, does it still
pull up the wrong row?

SELECT *
FROM Fotos INNER JOIN CONTAINSTABLE(Fotos,*, 'Simon AND NOT clinic')
AS KEY_TBL0 ON Fotos.Cod = KEY_TBL0.[KEY]

If that still pulls up the row that has the word "clinic" in it, I can think
of only two reasons

(1) clinic isn't indexed - either it was added to the description after the
index was populated and the index has been repopulated, or maybe you have
the word clinic in your noise words file. However, your second query looking
for rows that don't have clinic in shows that this cannot be the case

(2) the AND NOT isn't working Can you try:

SELECT *
FROM Fotos INNER JOIN CONTAINSTABLE(Fotos,*, '"Simon" AND NOT "clinic"')
AS KEY_TBL0 ON Fotos.Cod = KEY_TBL0.[KEY]

If that works as expected, then go back to your original query and put all
your words inside double quotes. The documentation states that AND NOT is
valid, but maybe the FTS query processor is treating them as words.

Dan
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.