Yes, "." is a word or token boundary (although there are some rules about
when this applies). "#" is not treated as a noise word when it is prefaced
by a c, j or f (C#, for example).
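
To make the two rules above concrete, here is a toy sketch in Python. It is not the actual word breaker: the function name toy_tokenize, the HASH_PREFIXES set, and the choice to treat "#" as a plain separator everywhere else are assumptions made for illustration only, and the extra rules about "." are not modelled.

```python
import re

# Assumption: the letters that keep a trailing "#" attached (c, j or f).
HASH_PREFIXES = {"c", "j", "f"}

def toy_tokenize(text):
    """Split on "." and whitespace; keep "#" only directly after c, j or f."""
    tokens = []
    for chunk in re.split(r"[.\s]+", text):
        if not chunk:
            continue
        if chunk.endswith("#") and chunk[:-1].lower() in HASH_PREFIXES:
            # A term such as "C#" survives as a single token.
            tokens.append(chunk)
        else:
            # Assumption: anywhere else, "#" just separates tokens.
            tokens.extend(chunk.replace("#", " ").split())
    return tokens

print(toy_tokenize("C# and www.example.com x#"))
```

In this sketch "C#" stays whole, "www.example.com" breaks into three tokens, and the "#" in "x#" is dropped.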

Signature
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
> Aha! It's because of the "." being a word-breaker - is that correct?
> Shall I remove it from the noise_eng file? or the enu...
>
> Many thanks
Jack - 22 Feb 2006 07:31 GMT
I actually need to understand where I could see a complete list of
noise words. # is not in the noise word list. Is it a token boundary?
Where is the complete list of these?
Currently we are fighting fires as these errors occur.
Hilary Cotter - 23 Feb 2006 02:09 GMT
The noise words are in the noise word lists. Some characters have special
significance in particular terms, for example the # in C# and the ++ in C++.
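
As a rough way to check whether a given term is in a noise word list, the lookup can be sketched in Python as below. The helper names load_noise_words and is_noise are hypothetical, and the sample text is made up for illustration rather than copied from a real noise file. It also assumes the list is plain text with whitespace-separated entries.

```python
def load_noise_words(noise_text):
    """Return the whitespace-separated entries as a lowercase set."""
    return {word.lower() for word in noise_text.split()}

def is_noise(term, noise_words):
    """Case-insensitive membership test against the loaded list."""
    return term.lower() in noise_words

# Made-up sample content, not a real noise word list.
sample = "a about after all also\n1 2 3 $"
noise = load_noise_words(sample)

print(is_noise("About", noise))
print(is_noise("#", noise))
```

With a real list, the text would be read from the noise file for the language in use and passed to load_noise_words in place of the sample string.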

> I actually need to understand where I could see a complete list of
> noise words. # is not in the noise word list. Is it a token boundary?
> Where is the complete list of these?
> Currently we are fighting fires as these errors occur.