Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / Other Technologies / Full-Text Search / February 2005

Tip: Looking for answers? Try searching our database.

Which search happen first ? (Yukon full text search)

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Xin Chen - 25 Feb 2005 07:06 GMT
In Yukon, if I do a fulltext search and also put let's say "userid = 5" in
the where clause, for example, [Where contains("description","yukon") and
userid = 5]
Does Yukon search for all the "yukon" among all the datarows first or search
for userid =5 first?

Obviously search for userid = 5 first then full text search will be faster,
How does Yukon handle this ?

--Xin Chen
Hilary Cotter - 25 Feb 2005 15:27 GMT
It could be done first, it depends on parallelism. However, results are
returned from the full text catalogs and then merge with the results set
returned by userid=5 to whittle down the results to matches of both
conditions.
Signature

Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com

> In Yukon, if I do a fulltext search and also put let's say "userid = 5" in
> the where clause, for example, [Where contains("description","yukon") and
[quoted text clipped - 6 lines]
>
> --Xin Chen
John Kane - 25 Feb 2005 16:32 GMT
Xin Chen,
I believe you are concerned about the Yukon FTS service (MSFTESQL) reading
all of a large FT Catalog when only results for a subset of rows (userid =
5) is required and therefore taking a long time to return results. Correct?

If so, then Yukon FTS does understand this restriction in the where clause
and will only read the *rows* in the FT Catalog based upon the where clause
restriction and will return results faster. If not, could you further
explain your question?

Thanks,
John
Signature

SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/

> In Yukon, if I do a fulltext search and also put let's say "userid = 5" in
> the where clause, for example, [Where contains("description","yukon") and
[quoted text clipped - 6 lines]
>
> --Xin Chen
Xin Chen - 26 Feb 2005 01:12 GMT
John, That is exactly what I asked.  Thank you very much for your answer. By
the way, I enjoy you blog a lot.
Answer question for you.  Sometime we divide a large database into identical
smaller ones(let's say 5 databases), but keep all the data structure the
same.  If I fulltext index on each of the 5 databases, then is there a way I
can merge the full text search results from 5 database server to provide a
complete search result?   Does Yukon offer this? I think it is critical
feature when you have distributed data.  If Yukon doesn't offer this, do you
have any idea on how best handle this?  Thank you very much !

--Xin Chen

> Xin Chen,
> I believe you are concerned about the Yukon FTS service (MSFTESQL) reading
[quoted text clipped - 20 lines]
> >
> > --Xin Chen
John Kane - 26 Feb 2005 04:16 GMT
You're welcome, Xin Chin!
Yes, I believe you can do something similar as you're hinting at via Linked
Servers between each database or server. The following is quoted from the
SQL Server 2005 Dec '04 CTP BOL "Support for CONTAINS full-text predicate
against linked servers
SQL Server full-text search functionality can be applied on text data in
remote servers when you used the CONTAINS predicate with SELECT queries
against linked servers. In Microsoft? SQL ServerT 2005, you can use
four-part names in CONTAINS or FREETEXT full-text predicates to execute
queries against linked servers."

As I've not yet tested this, and there is no exact syntax sample in the
Yukon BOL on how to do this between databases on the same server (vs. on
different servers), but I believe the syntax would look something like this
after establishing the Linked Servers:

select * from tableA where CONTAINS(serverA.databaseA..tableA.columnA, '
"searchA" ')
union all
select * from tableA where CONTAINS(serverA.databaseB..tableA.columnA, '
"searchB" ')
union all
select * from tableA where CONTAINS(serverA.databaseC..tableA.columnA, '
"searchC" ')
-- etc...

However, Yukon FTS does not support Distributed Partition Views, so having
distributed or propagated search servers, will have to wait till the next
version of SQL Server after Yukon...

Hope that helps & keep reading my blog as I'll be posting more on Yukon in
the very near future!
Thanks,
John

Signature

SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/

> John, That is exactly what I asked.  Thank you very much for your answer. By
> the way, I enjoy you blog a lot.
[quoted text clipped - 36 lines]
> > >
> > > --Xin Chen
Xin Chen - 26 Feb 2005 15:00 GMT
Thanks again, John, I will try it out on Yukon.  I am not sure if the Union
All will still make the rank relevent.  I heard Microsft said that rank from
one database will be irrelvent to that of other database on full text
search.

Another thing I worried is the performance. If the search research is large
data set, merge result and union them on a single server will mean large
network traffic.  I wish I know how google does this.  I am sure they must
use some type of distributed search.  How could they do it so FAST??  Any
opinions?

--Xin Chen

> You're welcome, Xin Chin!
> Yes, I believe you can do something similar as you're hinting at via Linked
[quoted text clipped - 79 lines]
> > > >
> > > > --Xin Chen
John Kane - 26 Feb 2005 18:02 GMT
You're welcome, Xin Chen,
Keep in mind that the Yukon Dec '04 CTP BOL is still a working effort, i.e.,
not complete and that the entry for "Querying Linked Servers"
[ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/fulltxt9/html/0e0eaf74-442e-41ed-8198
-ba2c6337e54b.htm] only references CONTAINS and FREETEXT that can be used
with linked servers. Furthermore, FREETEXTTABLE
[ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/4523ae15-4260-40a7-a53c
-8df15e1fee79.htm] states that "table cannot specify a server name and
cannot be used in queries against linked servers", while CONTAINSTABLE has
no such restriction. The latter might be a "DOC" bug, i.e., CONTAINSTABLE
may also have the same restriction as FREETEXTTABLE, but not yet documented.

If CONTAINSTABLE and FREETEXTTABLE cannot be used with Linked Servers and as
they are the only commands that return RANK, then returning RANK from a
cross-database or cross-server join via linked servers may not be possible.

Performance is always a concern in situations such as a cross-database join
(FTS or non-FTS) and like they say, the proof is in the pudding and testing
would need to be done to confirm that such a multi-database FTS join would
perform well. Still, its an interesting feature that is not available in SQL
Server 2000 that will have to be more explored under SQL Server 2005!

Yes, Google does use their own-brand of distributed search. Here is a paper
that describes the Google Cluster Architecture -
http://www.computer.org/micro/mi2003/m2022.pdf. Note, that their "15,000
commodity class PC's with fault-tolerant software" is now reported to be
over 100,000 today. Comparing a new database-specific search feature to an
Internet-scale Search Engine, is not an apple-to-apple comparison, but still
Microsoft SQL FTS Dev could take some pointers from Google for the next
version of SQL Sever after Yukon, IMHO...

Regards,
John
Signature

SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/

> Thanks again, John, I will try it out on Yukon.  I am not sure if the Union
> All will still make the rank relevent.  I heard Microsft said that rank from
[quoted text clipped - 100 lines]
> > > > >
> > > > > --Xin Chen
Xin Chen - 27 Feb 2005 03:33 GMT
John, thanks again for your reply and the great article you mentioned. I
will be reading it very carefully to see if I can find something I can use.

I had once written a crawler written in C# to collect the web pages and
store them in Yukon.  I think Yukon (beta 2) corrupted the database when I
got to about 1 million page in it.  I broke the text content of 1 page into
10 row, so the database(FT) is about 10 million rows.  I had hope to see
Yukon to index 10 million web page with adequate performance.  I think in
term of full text indexing and search, its fundamental techniques are pretty
common.  It just matter of who can implement it better and get the best
scalability(or distributed search I believe) which is badly needed when
dealing with billion's web pages.

--Xin Chen

> You're welcome, Xin Chen,
> Keep in mind that the Yukon Dec '04 CTP BOL is still a working effort, i.e.,
> not complete and that the entry for "Querying Linked Servers"

[ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/fulltxt9/html/0e0eaf74-442e-41ed-8198
> -ba2c6337e54b.htm] only references CONTAINS and FREETEXT that can be used
> with linked servers. Furthermore, FREETEXTTABLE

[ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/4523ae15-4260-40a7-a53c
> -8df15e1fee79.htm] states that "table cannot specify a server name and
> cannot be used in queries against linked servers", while CONTAINSTABLE has
[quoted text clipped - 141 lines]
> > > > > >
> > > > > > --Xin Chen
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.