Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / Programming / SQL / July 2008

Tip: Looking for answers? Try searching our database.

Joining on a text comparison

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Mariano Gomez - 31 Jul 2008 17:04 GMT
I have 2 databases. In the 1st database foo1, I have a table - tfoo1 - which
contains item numbers in the following format, e.g. 12345XYZ345

The second database foo2, item numbers are in the format e.g., XYZ

I need to create a sql script that joins the two tables in separate
databases - having the an inner join on foo2.tfoo2.itemnmbr =
foo1.tfoo1.(itemnmbr, without the unmatching characters), in the above
example 12345345 = (12345345 without the XYZ)

Since XYZ can be within any part of the string in tfoo1, I am not able to
use substring. Also XZY is variable, which means it can be ABC, DEFG, etc. I
just need to extract the unmatching portion that makes the strings different.

Am I asking for something that difficult? Does this question even makes sense?

Best regards,
--
MG.-
Mariano Gomez, MIS, MCP, PMP
Maximum Global Business, LLC
http://www.maximumglobalbusiness.com
Andrew J. Kelly - 31 Jul 2008 18:59 GMT
While it can be done it will never be able to use an index properly and will
always scan.  But you can use REPLACE() and specify the string to be
replaced as a parameter to the sp and replace it with an empty string.

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

>I have 2 databases. In the 1st database foo1, I have a table - tfoo1 -
>which
[quoted text clipped - 22 lines]
> Maximum Global Business, LLC
> http://www.maximumglobalbusiness.com
Mariano Gomez - 31 Jul 2008 19:45 GMT
Andrew,

Thanks for your input! It just seems this equates to a jumble, in the sense
that the only two occurrences that favor a substring and charindex are
XYZ12345345 and 12345345XYZ. It means that apparently I would have to
reorganize the string prior to even attempting to match something in it.

Best regards,
--
MG.-
Mariano Gomez, MIS, MCP, PMP
Maximum Global Business, LLC
http://www.maximumglobalbusiness.com

> While it can be done it will never be able to use an index properly and will
> always scan.  But you can use REPLACE() and specify the string to be
[quoted text clipped - 26 lines]
> > Maximum Global Business, LLC
> > http://www.maximumglobalbusiness.com 
Andrew J. Kelly - 31 Jul 2008 20:50 GMT
I am confused then. Are you trying to match on the XYZ or the numbers?

Signature

Andrew J. Kelly    SQL MVP
Solid Quality Mentors

> Andrew,
>
[quoted text clipped - 44 lines]
>> > Maximum Global Business, LLC
>> > http://www.maximumglobalbusiness.com
--CELKO-- - 31 Jul 2008 19:36 GMT
You can drop characters from a string with nested REPLACE()
functions:

SET my_col  -- removes digits
  = REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(my_col, '0', ''),
     '1', ''),
    '2', ''),
   '3', ''),
  '4', ''),
 '5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '');

The same idea can be used for the alphabet.
Mariano Gomez - 31 Jul 2008 19:56 GMT
Celco,

Thanks very much! I believe this will cut it!
Signature

Best regards,
--
MG.-
Mariano Gomez, MIS, MCP, PMP
Maximum Global Business, LLC
http://www.maximumglobalbusiness.com

> You can drop characters from a string with nested REPLACE()
> functions:
[quoted text clipped - 21 lines]
>
> The same idea can be used for the alphabet.
Plamen Ratchev - 31 Jul 2008 19:58 GMT
If the case is with removing non-numeric characters, then you can use
something like the code below (requires SQL Server 2005). If the
non-matching portion can be a mix of numeric and non-numeric characters,
then it is a lot more difficult. In the code below you can replace the
reference to the system table master..spt_values with auxiliary table with
numbers.

CREATE FUNCTION dbo.CleanNonNumeric (@string NVARCHAR(100))
RETURNS NVARCHAR(100)
BEGIN
RETURN
 (SELECT CAST(
         (SELECT CASE
                  WHEN SUBSTRING(@string, n, 1) LIKE '[0-9]'
                  THEN SUBSTRING(@string, n, 1)
                  ELSE ''
                 END
         FROM (SELECT number
               FROM master..spt_values
               WHERE type = 'P'
                 AND number BETWEEN 1 AND 100) AS Nums(n)
         WHERE n <= LEN(@string)
         FOR XML PATH('')) AS NVARCHAR(100)));
END
GO

CREATE TABLE Foo1(
keycol NVARCHAR(100) PRIMARY KEY,
datacol CHAR(1));

CREATE TABLE Foo2(
keycol NVARCHAR(100) PRIMARY KEY,
datacol CHAR(1));

INSERT INTO Foo1 VALUES (N'12345345', 'a');
INSERT INTO Foo2 VALUES (N'1234XYZ5345', 'b');

SELECT A.keycol, B.keycol, A.datacol, B.datacol
FROM Foo1 AS A
JOIN Foo2 AS B
 ON A.keycol = dbo.CleanNonNumeric(B.keycol);

Plamen Ratchev
http://www.SQLStudio.com
Mariano Gomez - 31 Jul 2008 20:41 GMT
Plamen,

Thanks very much! It works like a charm!

Best regards,
--
MG.-
Mariano Gomez, MIS, MCP, PMP
Maximum Global Business, LLC
http://www.maximumglobalbusiness.com

> If the case is with removing non-numeric characters, then you can use
> something like the code below (requires SQL Server 2005). If the
[quoted text clipped - 40 lines]
> Plamen Ratchev
> http://www.SQLStudio.com 
Mariano Gomez - 31 Jul 2008 21:03 GMT
Plamen,

What if I wanted to remove the numeric characters and leave only the string
ones, to join by:

CREATE TABLE Foo1(
keycol NVARCHAR(100) PRIMARY KEY,
datacol CHAR(1));

CREATE TABLE Foo2(
keycol NVARCHAR(100) PRIMARY KEY,
datacol CHAR(1));

INSERT INTO Foo1 VALUES (N'XYZ', 'a');
INSERT INTO Foo2 VALUES (N'1234XYZ5345', 'b');

How would this change your function?

Best regards,
--
MG.-
Mariano Gomez, MIS, MCP, PMP
Maximum Global Business, LLC
http://www.maximumglobalbusiness.com

> If the case is with removing non-numeric characters, then you can use
> something like the code below (requires SQL Server 2005). If the
[quoted text clipped - 40 lines]
> Plamen Ratchev
> http://www.SQLStudio.com 
Plamen Ratchev - 31 Jul 2008 21:16 GMT
Just change the LIKE pattern adding '^':

CREATE FUNCTION dbo.CleanNonNumeric (@string NVARCHAR(100))
RETURNS NVARCHAR(100)
BEGIN
RETURN
 (SELECT CAST(
         (SELECT CASE
                  WHEN SUBSTRING(@string, n, 1) LIKE '[^0-9]'
                  THEN SUBSTRING(@string, n, 1)
                  ELSE ''
                 END        
         FROM (SELECT number
               FROM master..spt_values
               WHERE type = 'P'
                 AND number BETWEEN 1 AND 100) AS Nums(n)
         WHERE n <= LEN(@string)
         FOR XML PATH('')) AS NVARCHAR(100)));
END

Plamen Ratchev
http://www.SQLStudio.com
Vern Rabe - 31 Jul 2008 20:39 GMT
CREATE TABLE tfoo1(itemnumber varchar(20));
CREATE TABLE tfoo2(itemnumber varchar(20));

INSERT INTO tfoo1 VALUES ('12345XYZ345');
INSERT INTO tfoo1 VALUES ('12XYZ345345');
INSERT INTO tfoo1 VALUES ('12345345XYZ');
INSERT INTO tfoo1 VALUES ('12345ABCD345');
INSERT INTO tfoo1 VALUES ('12ABCD345345');
INSERT INTO tfoo1 VALUES ('12345345ABCD');

INSERT INTO tfoo2 VALUES ('XYZ');
INSERT INTO tfoo2 VALUES ('ABCD');

SELECT *
    FROM tfoo1
    JOIN tfoo2
        ON CHARINDEX(tfoo2.itemnumber, tfoo1.itemnumber) > 0;

DROP TABLE tfoo1;
DROP TABLE tfoo2;

Vern Rabe

Mariano:

The CHARINDEX function could also be used:

> I have 2 databases. In the 1st database foo1, I have a table - tfoo1 - which
> contains item numbers in the following format, e.g. 12345XYZ345
[quoted text clipped - 18 lines]
> Maximum Global Business, LLC
> http://www.maximumglobalbusiness.com
Vern Rabe - 31 Jul 2008 21:01 GMT
Sorry - my text got dropped.

I just wanted to say that you can also use the CHARINDEX function. One
possible problem, however, is that a tfoo1 itemnumber of '123ABCD456' will
match tfoo2 itemnumber of both 'ABCD' and 'ABC'

> CREATE TABLE tfoo1(itemnumber varchar(20));
> CREATE TABLE tfoo2(itemnumber varchar(20));
[quoted text clipped - 45 lines]
> > Maximum Global Business, LLC
> > http://www.maximumglobalbusiness.com
Mariano Gomez - 31 Jul 2008 21:05 GMT
Thanks Vern,

I believe this is what I was asking for. As I said before, not sure the
question made much sense.

Best regards,
--
MG.-
Mariano Gomez, MIS, MCP, PMP
Maximum Global Business, LLC
http://www.maximumglobalbusiness.com

> CREATE TABLE tfoo1(itemnumber varchar(20));
> CREATE TABLE tfoo2(itemnumber varchar(20));
[quoted text clipped - 45 lines]
> > Maximum Global Business, LLC
> > http://www.maximumglobalbusiness.com
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.