SQL Server Forum / Other Technologies / Full-Text Search / March 2005
FullText Search on Multiple Language
|
|
Thread rating:  |
Royston - 15 Mar 2005 11:09 GMT Hi All,
I am trying to implement a FullText search for a table which contains translations for many different languages. However, different languages utilized different word breaker for the FullText search to work properly for that language but I can only assign a single word breaker to a table (for example, traditional chinese uses Chinese(Taiwan) word breaker, simplified chinese uses Chinese(PRC), etc).
To overcome this problem, I used horizontal partitioning to split the table into smaller tables according to its various languages; each assigned with the word breader for that language. And I try to insert, update, delete the data in all the tables thru a view which union all the smaller tables. This seems to work well.
Next, I created a storeprocedure to do a full-text search by searching each smaller table and appending the result.to a temp table
Example
CREATE TABLE #Temp ( searchresult ntext )
INSERT INTO #Temp (searchresult) SELECT translation FROM translation_german WHERE CONTAINS(*, @searchstring)
INSERT INTO #Temp (searchresult) SELECT * FROM translation_japanese WHERE CONTAINS(*, @searchstring)
INSERT INTO #Temp (searchresult) SELECT * FROM translation_chinese WHERE CONTAINS(*, @searchstring) : The problem is when I do a full-text search for a chinese string in the table with german word break, it will give the error#7619 ('A clause of the query contained only ignored words.') and terminate the storeprocedure immediately. There doesn't seem to be anyway to ignore the error and continue to the search in the next table(s).
1. Does anyone has anyway to handle the error#7619 in such a way that it will not terminate the Storeprocedure but continue to search the next table? OR 2. Is way any other method to implement a full-text search for a table containing multiple languages (with word breaker correctly implemented) other then the one described above?
Many thanks in advance Royston
John Kane - 15 Mar 2005 17:51 GMT Royston, Could you post the full output of -- SELECT @@version -- as this would be most helpful in understanding your environment and providing you with answers.
First of all, and assuming (for now) that you're using SQL Server 2000, you do not need to split your table into multiple smaller tables, (one for each language), as SQL Server 2000 supports multiple collation per column and SQL Full Text Indexing supports different "Languages for Word Breaker" per column. Therefore, you can have one column per language and set the collation and "Language for Word Breaker" at a per column level in one table. This is one reason I'm requesting the @@version info.
In regards to error 7619, when FT Searching for a Chinese string in the table with German word break, you will need to capture the language before you issue the FTS query and then run it against the appropriate table (or column) to avoid this error. Note, that in SQL Server 2005 (currently, still in beta), you will be able to store multiple languages in one column, and then issue language specific queries based upon the LCID of the language. Just curious, why are you running a FT Search query for a Chinese string in the table with German?
Hope that helps! John
 Signature SQL Full Text Search Blog http://spaces.msn.com/members/jtkane/
> Hi All, > [quoted text clipped - 45 lines] > Many thanks in advance > Royston Royston - 16 Mar 2005 03:29 GMT Hi John,
Yes, I am using SQL Server 2000. I am currently using a single column to store all the various translations, therefore I need to do a FT search for a Chinese string in the German table (actually, all the smaller tables). The logic for insertion, update, and delete is simpler with this method (normal query thru a view will do). Adding of new language requires only addition of a new smaller table and a quick update to the view. There's no change to the existing storeprocedures as the storeprocedures access the translation thru a view. Here's my simplified sample schema,
Create Table [English_Table] ( string_id int primary key, english_text ntext, lastupdated_by int, lastupdated_on datetime, search_id uniqueidentifier not null Unqiue )
Create Table [Translation_German_Table] ( string_id int, culture varchar(5), translation ntext, lastupdated_by int, lastupdated_on datetime, search_id uniqueidentifier not null Unqiue Constraint pk_german Primary Key (string_id., culture) Check (culture = 'de-DE') )
Create Table [Translation_Chinese_Table] ( string_id int, lang_id varchar(5), translation ntext, lastupdated_by int, lastupdated_on datetime, search_id uniqueidentifier not null Unqiue Constraint pk_chinese Primary Key (string_id, culture) Check (culture = 'zh-CN') )
Create View [Translation] ( Select * From [Translation_German_Table] Union All Select * From [Translation_Chinese_Table] )
Storing each language per column will require more complex logic on the Client side to determine the current language of the translation and to insert into/ update the appropriate column (addition of new language will require adding a new column and may also require changes to existing logic). I will still have to check the language of the search string to implement the FT search on the correct column (this may pose a problem if the search string is user input and I don't know what language the user is inputing).
Thanks, Royston
> Royston, > Could you post the full output of -- SELECT @@version -- as this would be [quoted text clipped - 77 lines] > > Many thanks in advance > > Royston Xin Chen - 16 Mar 2005 08:22 GMT Royston, I am not sure what you suggest would work. First, I don't think you can do Full text search on view, in other word, do select * from View1 where contains (data,'some search string')
Second, you would still have to choose ONE word breaker on the indexed data column, so if you have Germany and Chinese in the same column, What word breaker can you choose?
I am running the similar problem, in my case, I want to full text search on chinese and english,but struggling to find a solution for it.
Any suggest are welcome.
--Xin Chen
> Hi John, > [quoted text clipped - 152 lines] > > > Many thanks in advance > > > Royston Royston - 16 Mar 2005 12:09 GMT Hi Xin Chen,
No, I am not doing the FT search on the view. My solution is to utilize horizontal partitioning to split my translation table into multiple child tables each storing only a particular language, and then assign a different word breaker to each of the child table according to the language it is storing. I am using a view to union all the child tables so that I can access all the child tables thru the view as though it is a single table (that is, I can insert, update, delete via the view instead of referencing the child tables). Note that I implemented a check on the primary key column of each of the child table, this will allow the view to know where to insert/ update or delete the row referenced in the view. I have tried this and it seems to work.
However, to implement FT search, I created a temporary table and for each child table I will do a FT search with the search string and append the result to the temporay table (like the example below). The problem I encountered is if the search string contain a particular language string like chinese string, and the storeprocedure is doing a FT search on a child table with german word breaker, the storeprocedure will generate an error #7619 ('A clause of the query contained only ignored words.') and terminate execution. There doesn't seem to be any way I can catch the exception in the storeprocedure so as to ignore the error and continue the search on the next child table (that is, the one with the correct word breaker). As suggested by John, it seems the only way to resolve this is to check the language of the search string beforehand somehow and direct the FT search to the respective child table. This shall work, but I am hoping if anyone know how to catch the error#7619 exception and prevent the storeprocedure from terminating since it may be difficult to determine the language of the search string without restricting the user input.
Simplified Sample Code Example,
CREATE TABLE #Temp ( searchresult ntext )
INSERT INTO #Temp (searchresult) SELECT translation FROM translation_german WHERE CONTAINS(*,@searchstring)
INSERT INTO #Temp (searchresult) SELECT * FROM translation_japanese WHERE CONTAINS(*, @searchstring)
INSERT INTO #Temp (searchresult) SELECT * FROM translation_chinese WHERE CONTAINS(*, @searchstring)
Regards, Royston
> Royston, I am not sure what you suggest would work. > First, I don't think you can do Full text search on view, in other word, do [quoted text clipped - 187 lines] > > > > Many thanks in advance > > > > Royston
|
|
|