Lucene is great, but it has a few problems:
1) You have to write your own indexer that extracts each row and stores its
primary key (pk) in the Lucene document object.
2) You can't store a lot of data in the index itself without it getting very
large very quickly, and both querying and indexing slow down as the index
grows.
3) The ideal configuration is many indexes of about 300 MB each, with 3-4
indexes per machine.
4) So you need a lot of machines with fast disk subsystems. I think
Technorati has over 100 machines for their index.
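
To make point 1 concrete, here is a minimal sketch of the pattern in Python. It is not Lucene and does not use Lucene's API: a plain in-memory inverted index stands in for the search engine so the example runs with the standard library only. The table name `posts`, its columns, and the function names `build_index` and `search` are all made up for illustration. The idea it shows is that the index holds only tokens and the pk, and the full row is fetched from the database after the search.

```python
import sqlite3
from collections import defaultdict


def build_index(conn):
    """Walk every row and map each lowercased token to the pks of rows containing it."""
    index = defaultdict(set)
    for pk, body in conn.execute("SELECT id, body FROM posts"):
        for token in body.lower().split():
            # Only the pk goes into the index, not the row data itself.
            index[token].add(pk)
    return index


def search(conn, index, term):
    """Look the term up in the index, then fetch the matching rows by pk."""
    pks = sorted(index.get(term.lower(), ()))
    if not pks:
        return []
    placeholders = ",".join("?" * len(pks))
    rows = conn.execute(
        f"SELECT id, body FROM posts WHERE id IN ({placeholders}) ORDER BY id",
        pks,
    )
    return rows.fetchall()


# Hypothetical sample data, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO posts (id, body) VALUES (?, ?)",
    [
        (1, "Lucene indexes text quickly"),
        (2, "SQL Server full-text search"),
        (3, "Sharding a Lucene index across machines"),
    ],
)

index = build_index(conn)
results = search(conn, index, "lucene")
print(results)
```

Keeping only the pk in the index is also what keeps it small (point 2): the bulky row data stays in the database and is pulled back only for the hits.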

Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.