Hi. I've got a select statement that uses an aggregate clause nested inside
of a subquery. For my own edification (as well as taking advantage of any
available performance boosts that come with it), I'd like to know how to
rewrite it so it doesn't have a subquery. Here's the statement:
---------------
SELECT * FROM firsttable ft
LEFT OUTER JOIN [secondtable] st ON
Ft.[pkid] = st.[fkid] AND
st.[pkid] in (SELECT MAX(pkid) FROM secondtable WHERE fkid = st.fkid)
---------------
The database schema that I'm working with comes from running these statements:
-------
CREATE TABLE firstTable (
pkid INT PRIMARY KEY IDENTITY,
description VARCHAR(10)
)
GO
CREATE TABLE secondtable (
pkid INT PRIMARY KEY IDENTITY,
fkid INT REFERENCES firsttable
)
GO
INSERT firsttable (description)
VALUES ('test1')
GO
INSERT firsttable (description)
VALUES ('test2')
GO
INSERT secondtable (fkid) VALUES (1) -- inserts (1,1) into table
GO
INSERT secondtable (fkid) VALUES (1) -- inserts (2,1) into table
GO
In the end, the result set should look like this:
pkid description pkid fkid
----------- ----------- ----------- -----------
1 test1 3 1
2 test2 NULL NULL
Basically, its a left outer join where I want only One record to be included
in the result set even if there are multiple records in the [secondTable]
table. I filter out the second qualifying record in the secondTable query
by getting on the max(pkid) from a subquery... but I have no idea how to do
that inside a join clause.
Any help would be appreciated.
Thanks,
JRF
Aaron Bertrand [SQL Server MVP] - 24 Jul 2008 19:59 GMT
SELECT *
FROM firsttable ft
LEFT OUTER JOIN
(
SELECT
fkid, pkid = MAX(pkid)
FROM secondtable
GROUP BY fkid
) st
ON ft.pkid = st.fkid;
FWIW, I think either query will be treated the same by the optimizer. In
other words, don't expect radical performance differences between the two.
On 7/24/08 2:51 PM, in article
F5044236-9DFE-4F2F-B50E-B6B12D77DED0@microsoft.com, "Eigh"
> Hi. I've got a select statement that uses an aggregate clause nested inside
> of a subquery. For my own edification (as well as taking advantage of any
[quoted text clipped - 48 lines]
>
> JRF
Eigh - 24 Jul 2008 20:20 GMT
Whooah... I've never seen join syntax like that... Has that particular
join syntax (of specifying a select statement in place of the tablename for
the joined table) always been supported?
I didnt think there would be much of a performance difference between the
two either, but they DO have different execution plans.
*shrug*
Thanks a lot for enlightening me.
JRF
> SELECT *
> FROM firsttable ft
[quoted text clipped - 65 lines]
> >
> > JRF
Roy Harvey (SQL Server MVP) - 25 Jul 2008 00:50 GMT
>Whooah... I've never seen join syntax like that... Has that particular
>join syntax (of specifying a select statement in place of the tablename for
>the joined table) always been supported?
It is called a Derived Table, and it has been supported for a very
long time - I don't remember it ever not being part of the language
(but I must admit some of those memories are getting vague!) It is
based on the simple idea that the result set from a SELECT is itself a
table, and so a SELECT can be used in place of a table.
Roy Harvey
Beacon Falls, CT
Plamen Ratchev - 24 Jul 2008 20:07 GMT
One way to rewrite it:
SELECT *
FROM FirstTable AS F
LEFT OUTER JOIN (SELECT fkid, MAX(pkid) AS pkid
FROM SecondTable
GROUP BY fkid) AS S
ON F.pkid = S.fkid;
HTH,
Plamen Ratchev
http://www.SQLStudio.com