Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / Other Technologies / Full-Text Search / June 2008

Tip: Looking for answers? Try searching our database.

Benchmark for Full Text Search

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Ankur Rawat - 15 Apr 2008 22:16 GMT
Hi,
We are currently using a 3rd party Search Engine to search our application
content in SQL server.

Could someone please point me to Benchmark data for the FTS service? I am
interested in switching to FTS even if the Performance numbers are comparable
or better.

thanks for any help!

regards,
Ankur.
Hilary Cotter - 17 Apr 2008 13:39 GMT
There are no published benchmarks as they would vary on your content (text,
image, char), what is being indexed (a lot of the same words will perform
differently than a lot of unique words), your machines and table design.

However I was working on a system where we sql fts multi-terabytes and
performance was subsecond for most queryies (over a billion rows IIRC),
however queries which were complex or for which there were a large number of
rows could be very lengthy.

> Hi,
> We are currently using a 3rd party Search Engine to search our application
[quoted text clipped - 9 lines]
> regards,
> Ankur.
sql411@nospam.com - 26 Jun 2008 18:57 GMT
Hillary, I am interested in how you optimized the fts for a multi-terabyte
catalog.  I am in the same situation, a 7TB table with 350+ million rows.  
The FT index for this table is over 1.5 TB.  Some searches with very unique
words (hence, small # of rows coming back) finish fast – but very common
words can take as long as 40 minutes before we decide to time things out.  
What kind of optimizations have you done outside the norm of creating a
catalog/fulltext index of the table?  With billions of rows – your key must
be a bigint – did that affect performance on building/searching?  Did you add
a unique value to help further filter the data (equivalent to a batch ID or
something that minimizes hits)?  Is the underlying data partitioned with SQL
2005, or old school partitioning with an updateable view?

I am sure SQL 2008 will help quite a bit being able to utilize the regular
table filters to the full-text query, but we still have ~3 months before it
releases.

I’ve considered breaking out the table to multiple ones across multiple
servers, and creating multiple full-text catalogs to go against.  Anyone who
has done this before on a large scale?  I am considering breaking the large
table up to possibly several thousand tables – say 5,000.  If I did this and
created a full-text index on each table – what is the optimal relationship to
catalog/full-text index?  For example, how many full-text indexes per
full-text catalog are OK before you start having performance issues (my guess
is it probably has more to do with record size/full-text index size than
actual full-text index counts).  Has anyone created several thousand
full-text catalogs or full-text indexes on one server in production and it
went well?

Anyone who has helped SQL FT search scale well on many terabytes – I’d
greatly appreciate your hard earned lessons.
Many thanks,
Robert Towne

> There are no published benchmarks as they would vary on your content (text,
> image, char), what is being indexed (a lot of the same words will perform
[quoted text clipped - 18 lines]
> > regards,
> > Ankur.
Hilary Cotter - 27 Jun 2008 13:27 GMT
We ended up breaking the database into smaller databases of about 50 million
rows. Eventually the databases became monthly (again holding about 50
million rows).

Searches were directed against each database - each db had a single catalog
and a single sql fts index.

Our queries did not have a real time component to them so we were more
tolerant of the search latencies. The long and short of it is we eventually
moved to Lucene:)

Contact me offline if you want more info.

Hilary

> Hillary, I am interested in how you optimized the fts for a multi-terabyte
> catalog.  I am in the same situation, a 7TB table with 350+ million rows.
[quoted text clipped - 66 lines]
>> > regards,
>> > Ankur.
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.