SQL Server Forum / DB Engine / SQL Server / July 2008
Trying to get the distinct counts per group
|
|
Thread rating:  |
Scott C - 16 Jul 2008 15:05 GMT Greetings,
This situation is a little complex. Imagine a table called grptest that looks something like this:
VisitID CaseID ProcedureID VisitProcedureID 2219 106319 20123 948 2219 106319 20123 1075 2219 106319 20123 1110 2219 106319 20123 1172 2219 106319 20123 2862 2219 106319 20123 2865 2219 106319 20123 2867 2219 106319 20123 3223 2219 106319 20123 4654 2219 106638 20123 948 2219 106638 20123 1075 2219 106638 20123 1110 2219 106638 20123 1172 2219 106638 20123 2862 2219 106638 20123 2865 2219 106638 20123 2867 2219 106638 20123 3223 2219 106638 20123 4654 2219 106733 22075 948 2219 106733 22075 1075 2219 106733 22075 1110 2219 106733 22075 1172 2219 106733 22075 2862 2219 106733 22075 2865 2219 106733 22075 2867 2219 106733 22075 3223 2219 106733 22075 4654 2219 107881 21299 948 2219 107881 21299 1075 2219 107881 21299 1110 2219 107881 21299 1172 2219 107881 21299 2862 2219 107881 21299 2865 2219 107881 21299 2867 2219 107881 21299 3223 2219 107881 21299 4654
There are 36 rows in this table but in my real world example there would be multiple VisitIDs and new values for each field. What I would like to do is get a distinct count for each field for each VisitID, CaseID, ProcedureID, and VisitProcedureID. Take ProcedureID for example. If I were to do a distinct count on this field what I would want returned is 4 not 3 because ProcedureID 20123 is in two seperate cases. This situation could apply for all fields. VisitProcedureID should be always be the count of all rows for each VisitID. But the other fields are trickier. Here is the query I have started but as you can see it is not giving me the correct results:
select count(distinct VisitID) as VisitCount, count(distinct CaseID) as CaseCount, count(distinct ProcedureID) as ProcCount, count(VisitProcedureID) as VisitProcCount from ( select Visitid, CaseID, ProcedureID, VisitProcedureID from dbo.grptest group by visitid,CaseID,ProcedureID,VisitProcedureID ) t1 group by visitid
Results ---------- 1 4 3 36
Any suggestions on an elegant way of doing this?
Scott C - 16 Jul 2008 15:13 GMT I actually did come up with a concatenation scheme that gives me what I want:
select count(distinct VisitID) as VisitCount, count(distinct CaseID) as CaseCount, count(distinct convert(varchar,CaseID)+convert(varchar,ProcedureID)) as ProcCount, count(VisitProcedureID) as VisitProcCount from ( select Visitid, CaseID, ProcedureID, VisitProcedureID from dbo.grptest group by visitid,CaseID,ProcedureID,VisitProcedureID ) t1 group by visitid
Does anyone have a better way of doing this?
Scott
> Greetings, > [quoted text clipped - 72 lines] > > Any suggestions on an elegant way of doing this? Uri Dimant - 16 Jul 2008 15:16 GMT Scott What version of SQL Server are you using?
> Greetings, > [quoted text clipped - 72 lines] > > Any suggestions on an elegant way of doing this? Scott C - 16 Jul 2008 15:19 GMT SQL Server 2005
> Scott > What version of SQL Server are you using? [quoted text clipped - 77 lines] > > - Show quoted text - Uri Dimant - 16 Jul 2008 15:43 GMT Scott So, you may take a look at ROW_NUMBER() OVER (PARTITION BY...............ORDER BY ) function If you want me to test it to give you more accurate solution please post sample data + an expected result
By sample data I meant INSERT INTO...... statetments
SQL Server 2005
On Jul 16, 9:16 am, "Uri Dimant" <u...@iscar.co.il> wrote:
> Scott > What version of SQL Server are you using? [quoted text clipped - 81 lines] > > - Show quoted text - Scott C - 16 Jul 2008 16:10 GMT Thanks Uri. I would appreciate it if you could look into it. Here is the DDL for the example I gave earlier:
SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[grptest]( [Visitid] [int] NULL, [CaseID] [int] NULL, [ProcedureID] [int] NULL, [VisitProcedureID] [int] NULL ) ON [PRIMARY] GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,4654) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,4654) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,4654) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,4654) GO
> Scott > So, you may take a look at ROW_NUMBER() OVER (PARTITION [quoted text clipped - 95 lines] > > - Show quoted text - Uri Dimant - 17 Jul 2008 07:46 GMT Scott This one is better in terms of performance with cte
as
(
select dense_rank()
over (partition by VisitID order by VisitID)as VisitCount,
dense_rank()
over (order by CaseID )as CaseIDCount,
dense_rank()
over (order by CaseID,ProcedureID )as ProcCount,
row_number()
over (order by VisitProcedureID )as VisitProcedureID
from [grptest]
)
select max(VisitCount),max(CaseIDCount),max(ProcCount),
max(VisitProcedureID)
from cte
Thanks Uri. I would appreciate it if you could look into it. Here is the DDL for the example I gave earlier:
SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[grptest]( [Visitid] [int] NULL, [CaseID] [int] NULL, [ProcedureID] [int] NULL, [VisitProcedureID] [int] NULL ) ON [PRIMARY] GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106319,20123,4654) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106638,20123,4654) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,106733,22075,4654) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,948) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,1075) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,1110) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,1172) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,2862) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,2865) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,2867) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,3223) GO
INSERT INTO [dbo].[grptest] ([Visitid],[CaseID],[ProcedureID], [VisitProcedureID]) VALUES (2219,107881,21299,4654) GO
On Jul 16, 9:43 am, "Uri Dimant" <u...@iscar.co.il> wrote:
> Scott > So, you may take a look at ROW_NUMBER() OVER (PARTITION [quoted text clipped - 96 lines] > > - Show quoted text - Scott C - 17 Jul 2008 15:19 GMT > Scott > This one is better in terms of performance [quoted text clipped - 329 lines] > > - Show quoted text - Thanks Uri. I will give it a try
|
|
|