I need to select the top 3 rows per each group (xID -- order by xID). No
problem here. But I need to select the count of rows in each group before a
null value is encountered in the stat column -- ordering by xID
CREATE TABLE #tmp1(rowID int Identity(1,1), xID int, x1 int, stat varchar(1))
INSERT INTO #tmp1
SELECT 1, 1, 'f'
UNION ALL SELECT 1, 2, 'f'
UNION ALL SELECT 1, 3, 'f'
UNION ALL SELECT 1, 4, 'f'
UNION ALL SELECT 2, 1, 'f'
UNION ALL SELECT 2, 2, null
UNION ALL SELECT 2, 3, 'f'
UNION ALL SELECT 2, 4, 'f'
UNION ALL SELECT 3, 1, 'f'
UNION ALL SELECT 3, 2, 'f'
UNION ALL SELECT 3, 3, null
Here is what I have for getting the top 3 rows per group:
SELECT * FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2 WHERE t2.xId = t1.xId ORDER BY
t2.xid) t3 WHERE t3.rowID = t1.rowID). This returns:
rowID xID x1 stat
1 1 1 f
2 1 2 f
3 1 3 f
5 2 1 f
6 2 2 NULL
7 2 3 f
9 3 1 f
10 3 2 f
11 3 3 NULL
From here I need to get this output
xID rowcount stat
1 3 f
2 1 f
3 2 f
xID = 1 had 3 rows (no null values encountered)
xID = 2 had 1 row before encountering a null
xID = 3 had 2 rows before encountering a null
Is it possible to get this output from the source data without looping?
Thanks,
Rich
Roy Harvey (SQL Server MVP) - 22 Jul 2008 17:25 GMT
SELECT A.xID, MAX(A.rownum), A.stat
FROM (SELECT *,
row_number() over (PARTITION BY xID
ORDER BY xID, rowID) as rownum
FROM #tmp1) as A
JOIN (SELECT *,
row_number() over (PARTITION BY xID
ORDER BY xID, rowID) as rownum
FROM #tmp1
WHERE stat IS NOT NULL) as B
ON A.rowID = B.rowID
AND A.rownum = B.rownum
WHERE A.rownum <= 3
GROUP BY A.xid, A.stat
xID stat
----------- -------------------- ----
1 3 f
2 1 f
3 2 f
The idea is that we number the rows once including the NULLs, and once
without the NULLs, and only take the ones where the numbers match.
Roy Harvey
Beacon Falls, CT
>I need to select the top 3 rows per each group (xID -- order by xID). No
>problem here. But I need to select the count of rows in each group before a
[quoted text clipped - 47 lines]
>Thanks,
>Rich
Rich - 22 Jul 2008 17:48 GMT
Thank you for your reply. I forgot to mention that we are still on sql 2000
over at my place (man, we need to upgrade!). I was just reading an article
about the Row_Number function for sql 2005. I will try to translate that
back to sql 2000 - which I understand will require the use of a #tmp table
(or 2). In the meantime, can I translate the query below in one continuous
statement? Or will I have to actually create a separate #tmp table first to
make it useable for sql 2000?
> SELECT A.xID, MAX(A.rownum), A.stat
> FROM (SELECT *,
[quoted text clipped - 74 lines]
> >Thanks,
> >Rich
Tom Cooper - 22 Jul 2008 17:53 GMT
One way to do it in SQL 2000,
SELECT t1.xId, Count(*) As 'rowcount', t1.stat FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2
WHERE t2.xId = t1.xId ORDER BY
t2.xid) t3 WHERE t3.rowID = t1.rowID
AND NOT EXISTS (SELECT * FROM #tmp1 t4 WHERE t3.xID = t4.xID
And t3.RowId >= t4.RowId And t4.stat Is Null))
Group By t1.xId, t1.stat
Order By t1.xId, t1.stat
Note that if you have a large amount of data, a cursor might execute faster.
Tom
> Thank you for your reply. I forgot to mention that we are still on sql
> 2000
[quoted text clipped - 90 lines]
>> >Thanks,
>> >Rich
Rich - 22 Jul 2008 18:20 GMT
Thank you for your reply. Question: I am sure a cursor would be faster for
large data than to query large data using Exists - But would the query using
the Row_Number function (sql 2005 query) be more efficient than the cursor
(and using Exists) ? Or for this case (using large data) would the cursor be
the most efficient solution?
> One way to do it in SQL 2000,
>
[quoted text clipped - 105 lines]
> >> >Thanks,
> >> >Rich
Roy Harvey (SQL Server MVP) - 22 Jul 2008 18:30 GMT
> I am sure a cursor would be faster for
>large data than to query large data using Exists
Don't be. With good indexing the answer will usually go the other
way.
Roy Harvey
Beacon Falls, CT