

ORDER BY AND GROUP BY CLAUSE

bwalton_707@yahoo.com - 27 Nov 2007 17:40 GMT
I'm completely lost why a trivial task in VFP is a lengthy drawn out
process in SQL Server

For Example

A simple select statement where I want to return the most current date
from a table along with the unique identifier for the row selected is
a single select statement in VFP

SELECT TOP 1 date, id
   FROM anytable
   ORDER BY id, date desc
   GROUP BY id

OR Another example

SELECT invoice.number, customer.name, customer.address, invoice.id,
customer.id
FROM invoice
INNER JOIN customer
ON invoice.customerid = customer.id
ORDER BY customer.id, invoice.date DESC
GROUP BY customer.id

This will return the most recent order for a customer

Neither select statement is supported in SQL Server 2005... Is
there a logical reason WHY? Other than ANSI standards, which I'm not
buying as m$ft rarely follows any standards but their own 100% of the
time anyway.

Also could someone please post the most efficient SQL syntax to accomplish the above as I would like to determine if I am missing something here.
bwalton_707@yahoo.com - 27 Nov 2007 17:42 GMT
On Nov 27, 12:40 pm, bwalton_...@yahoo.com wrote:
> I'm completely lost why a trivial task in VFP is a lengthy drawn out
> process in SQL Server
[quoted text clipped - 28 lines]
>
> Also could someone please post the most efficient SQL syntax to accomplish the above as I would like to determine if I am missing something here.

Thanks In Advance
Bryan
--CELKO-- - 27 Nov 2007 20:32 GMT
You use reserved words for data elements.   You have multiple names
for the same data element.  You used vague data element names,
including the magical, universal "id" that applies to all things in
creation.

You don't know that the ORDER BY clause is part of a cursor in
Standard SQL and that it comes at the end of the SELECT statement.
You seem to have both an invoice number (the usual term) and an
invoice identifier (I am scared that you used IDENTITY as a fake
pointer and have assumed that the data is stored in physical order,
like a mag tape).

I think that you are trying to use ordering because you do not know
what a table is -- no ordering!  You want a sequential file.  That is
not how RDBMS works at all!  Totally wrong mindset.

This is the usual template for finding the latest invoice:

SELECT I1.invoice_nbr, C.customer_name, C.shopping_addr, C.acct_nbr
 FROM Invoices AS I1,
      Customers AS C
WHERE I1.acct_nbr = C.acct_nbr
  AND I1.posting_date
      = (SELECT MAX(I2.posting_date)
           FROM Invoices AS I2
          WHERE I2.acct_nbr = C.acct_nbr);

Notice how I cleaned up the names.  You might want to read ISO-11179
sometime soon.

>> Neither SELECT statement is supported in SQL Server 2005... Is there a logical reason WHY? Other than ANSI Standards which I'm not buying as m$ft rarely follows any standards but their own 100% of the time anyway. <<

Both logic and Standards.  And Microsoft has been moving very strongly
to ANSI Standards; look at the new stuff in SQL-2005 and SQL-2008
which is pure ANSI.  The TOP syntax is proprietary syntax; the ANSI
approach would use ROW_NUMBER() OVER() instead.

What sense would it make to sort a table (a contradiction by
definition) then group it?  You have no idea what a SELECT does and
want it to act like a READ() in a procedural language.

Here is how a SELECT works in SQL ... at least in theory.  Real
products will optimize things, but the code has to produce the same
results.

a) Start in the FROM clause and build a working table from all of the
joins, unions, intersections, and whatever other table constructors
are there.  The <table expression> AS <correlation name> option allows
you to give a name to this working table which you then have to use for
the rest of the containing query.

b) Go to the WHERE clause and remove rows that do not pass criteria;
that is, that do not test to TRUE (i.e. reject UNKNOWN and FALSE).
The WHERE clause is applied to the working set in the FROM clause.

c) Go to the optional GROUP BY clause, partition the original table
into groups and reduce each grouping to a *single* row, replacing the
original working table with the new grouped table. The rows of a
grouped table must be only group characteristics: (1) a grouping
column (2) a statistic about the group (i.e. aggregate functions) (3)
a function or constant (4) an expression made up of only those three
items.  The original table no longer exists and you cannot reference
anything in it (this was an error in early Sybase products).

d) Go to the optional HAVING clause and apply it against the grouped
working table; if there was no GROUP BY clause, treat the entire table
as one group.

e) Go to the SELECT clause and construct the expressions in the list.
This means that the scalar subqueries, function calls and expressions
in the SELECT are done after all the other clauses are done.  The AS
operator can also give names to expressions in the SELECT list.  These
new names come into existence all at once, but after the WHERE clause,
GROUP BY clause and HAVING clause have been executed; you cannot use
them in the SELECT list or the WHERE clause for that reason.

If there is a SELECT DISTINCT, then redundant duplicate rows are
removed.  For purposes of defining a duplicate row, NULLs are treated
as matching (just like in the GROUP BY).

f) Nested query expressions follow the usual scoping rules you would
expect from a block structured language like C, Pascal, Algol, etc.
Namely, the innermost queries can reference columns and tables in the
queries in which they are contained.

g) The ORDER BY clause is part of a cursor, not a query. The result
set is passed to the cursor, which can only see the names in the
SELECT clause list, and the sorting is done there.  The ORDER BY
clause cannot have expressions in it, or references to other columns
because the result set has been converted into a sequential file
structure and that is what is being sorted.

As you can see, things happen "all at once" in SQL, not "from left to
right" as they would in a sequential file/procedural language model.
In those languages, these two statements produce different results:
 READ (a, b, c) FROM File_X;
 READ (c, a, b) FROM File_X;

while these two statements return the same data:

SELECT a, b, c FROM Table_X;
SELECT c, a, b FROM Table_X;

Think about what a confused mess this statement is in the SQL model.

SELECT f(c2) AS c1, f(c1) AS c2 FROM Foobar;

That is why such nonsense is illegal syntax.
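
The clause order in steps (a) through (g) can be sketched as a toy pipeline in Python. This is only an illustration of the logical model, not of how SQL Server actually executes a query; the sample rows, the thresholds, and the name `run_query` are made up for the example. It mimics `SELECT acct_nbr, SUM(amount) AS total FROM Invoices WHERE amount >= 10 GROUP BY acct_nbr HAVING SUM(amount) >= 30 ORDER BY total DESC`.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows standing in for an Invoices table.
invoices = [
    {"acct_nbr": 1, "amount": 50},
    {"acct_nbr": 1, "amount": 70},
    {"acct_nbr": 2, "amount": 20},
    {"acct_nbr": 3, "amount": 5},
    {"acct_nbr": 3, "amount": 40},
]


def run_query(rows):
    # a) FROM: build the working table.
    working = list(rows)

    # b) WHERE: keep only rows for which the predicate is TRUE.
    working = [r for r in working if r["amount"] >= 10]

    # c) GROUP BY acct_nbr: partition the working table into groups.
    #    (Sorting here is only what itertools.groupby needs; it is not
    #    the query's ORDER BY.)
    by_acct = itemgetter("acct_nbr")
    groups = [
        (acct, list(members))
        for acct, members in groupby(sorted(working, key=by_acct), key=by_acct)
    ]

    # d) HAVING: filter whole groups, using group-level values only.
    groups = [
        (acct, members)
        for acct, members in groups
        if sum(m["amount"] for m in members) >= 30
    ]

    # e) SELECT: one output row per group; only the grouping column and
    #    aggregates are available at this point.
    result = [
        {"acct_nbr": acct, "total": sum(m["amount"] for m in members)}
        for acct, members in groups
    ]

    # g) ORDER BY: sort the finished result set by a name in the SELECT list.
    result.sort(key=itemgetter("total"), reverse=True)
    return result


print(run_query(invoices))
```

Note that the individual invoice rows are gone after step (c), which is why a query cannot ask for "the" invoice row of a group once it has grouped by account.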
bwalton_707@yahoo.com - 27 Nov 2007 22:33 GMT
> You use reserved words for data elements.   You have multiple names
> for the same data element.  You used vague data element names,
[quoted text clipped - 104 lines]
>
> That is why such nonsense is illegal syntax.

Thanks for the reply; unfortunately, your solution is not completely
accurate because it will return multiple records
if they have the same datetime (posted) date, which is not the desired
result.

I did not develop that nonsense syntax; it is supported in Visual
FoxPro and returns the correct results in a single select statement.
VFP was developed by Dave Fulton, then acquired by Microsoft, so I
assumed they developed it ... Moreover, it works!

Bryan
Ed Murphy - 28 Nov 2007 05:38 GMT
>> This is the usual template for finding the latest invoice is:
>>
[quoted text clipped - 6 lines]
>>             FROM Invoices AS I2
>>            WHERE I2.acct_nbr = C.acct_nbr);

(Side note:  Please trim quotes down to just the part that's
immediately relevant to your reply, like I've done here.)

> Thanks for the reply, unfortunately your solution is not completely
> accurate because it will return multiple records
> if they have the same datetime (posted) date which is not the desired
> result.

Assuming that you want the acct_nbr's lowest invoice_nbr with the most
recent posting_date:

SELECT I1.invoice_nbr, C.customer_name, C.shopping_addr, C.acct_nbr
  FROM Invoices AS I1,
       Customers AS C
 WHERE I1.acct_nbr = C.acct_nbr
   AND I1.invoice_nbr
       = (SELECT MIN(I2.invoice_nbr)
            FROM Invoices AS I2
           WHERE I2.acct_nbr = C.acct_nbr
             AND I2.posting_date
                 = (SELECT MAX(I3.posting_date)
                      FROM Invoices AS I3
                     WHERE I3.acct_nbr = C.acct_nbr));
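
The difference between the two templates shows up as soon as two invoices share the latest posting date. The sketch below runs both queries against an in-memory SQLite database through Python's `sqlite3` module; SQLite stands in for SQL Server here, the sample data is invented, and the tables are cut down to the columns the queries need, so treat it as an illustration rather than a test against the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (acct_nbr INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE Invoices (
    invoice_nbr  INTEGER PRIMARY KEY,
    acct_nbr     INTEGER NOT NULL REFERENCES Customers (acct_nbr),
    posting_date TEXT NOT NULL
);
INSERT INTO Customers VALUES (1, 'Acme'), (2, 'Bolt');
-- Invoices 101 and 102 tie on the latest posting date for account 1.
INSERT INTO Invoices VALUES
    (100, 1, '2007-11-01'),
    (101, 1, '2007-11-20'),
    (102, 1, '2007-11-20'),
    (200, 2, '2007-10-05');
""")

# Template 1: match on MAX(posting_date) only.
LATEST_BY_DATE = """
SELECT I1.invoice_nbr, C.acct_nbr
  FROM Invoices AS I1, Customers AS C
 WHERE I1.acct_nbr = C.acct_nbr
   AND I1.posting_date
       = (SELECT MAX(I2.posting_date)
            FROM Invoices AS I2
           WHERE I2.acct_nbr = C.acct_nbr)
 ORDER BY C.acct_nbr, I1.invoice_nbr
"""

# Template 2: break ties by taking the lowest invoice_nbr on that date.
LATEST_WITH_TIE_BREAK = """
SELECT I1.invoice_nbr, C.acct_nbr
  FROM Invoices AS I1, Customers AS C
 WHERE I1.acct_nbr = C.acct_nbr
   AND I1.invoice_nbr
       = (SELECT MIN(I2.invoice_nbr)
            FROM Invoices AS I2
           WHERE I2.acct_nbr = C.acct_nbr
             AND I2.posting_date
                 = (SELECT MAX(I3.posting_date)
                      FROM Invoices AS I3
                     WHERE I3.acct_nbr = C.acct_nbr))
 ORDER BY C.acct_nbr
"""

by_date = conn.execute(LATEST_BY_DATE).fetchall()
tie_broken = conn.execute(LATEST_WITH_TIE_BREAK).fetchall()

print(by_date)     # account 1 appears twice because of the tie
print(tie_broken)  # exactly one row per account
```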
--CELKO-- - 28 Nov 2007 19:26 GMT
>> I did not develop that nonsense syntax it is supported in visual FoxPro and returns the correct results in a single select statement. <<

FoxPro came from the xBase family and not from any RDBMS, much less
SQL.  Later, they tried to copy the keywords from SQL to make it look
more familiar when xBase lost out.  MDX does the same kind of thing,
but with truly horrible irregular syntax.

The results are not correct in the SQL model, as I verbosely
explained in my last posting.

>> VFP was developed by Dave Fulton then acquired by Microsoft so I assumed they developed it ... Moreover it works! <<

Hey, I like Dr. Dave!  I was at the ComDex when FoxPro was unveiled
with ACCESS by Bill Gates.  Dr. Dave presented FoxPro; Gates did
ACCESS.

FoxPro ran great and Dr. Dave was super smooth as a presenter -- It
was like Mr. Wizard teaches DB and explains Rushmore technology.  His
nice relaxed voice and flawless demo were the kind of thing I want to
be able to do on stage.

ACCESS was not ready for prime time -- or even Beta.  It sorted dates
alphabetically, had no UNION (the GUI people could not think of a good
graphic so the DB people were told to kill it), then crashed and blue-
screened.
Tony Rogerson - 27 Nov 2007 22:34 GMT
Both are very different products.

Try this...

SELECT id, max( date )
   FROM anytable
   GROUP BY id
   ORDER BY id

The jibe about ansi standards and MS not following them is a bit cheap, in
SQL Server you can use the ansi standard if you want - well, a lot of the 92
implementation anyway.

I'm not familiar with vfp - does it follow the ansi standard - the syntax
you posted doesn't look familiar.

Signature

Tony Rogerson, SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson
[Ramblings from the field from a SQL consultant]
http://sqlserverfaq.com
[UK SQL User Community]

bwalton_707@yahoo.com - 27 Nov 2007 23:27 GMT
> Both are very different products.
>
[quoted text clipped - 16 lines]
> [Ramblings from the field from a SQL consultant]http://sqlserverfaq.com
> [UK SQL User Community]

Hi Tony,

VFP supports multiple standards including their own sql standards.
There is an engine behavior flag that governs which variation FoxPro
uses.

If you are interested here is a link to another forum about it...
http://fox.wikis.com/wc.dll?Wiki~Enginebehavior~VFP

The syntax I ended up using was a variant of this, with part of it
wrapped in a CTE for readability.
Just seemed like a lot of code for such a simple task ... Moreover, my
last post was incorrect; it should
not have said duplicate records.

SELECT I1.invoice_nbr, C.customer_name, C.shopping_addr, C.acct_nbr
 FROM Invoices AS I1,
      Customers AS C
WHERE I1.acct_nbr = C.acct_nbr
  AND I1.posting_date
      = (SELECT MAX(I2.posting_date)
           FROM Invoices AS I2
          WHERE I2.acct_nbr = C.acct_nbr);

I have only been coding in SQL Server 2005 for about 6 months... I have been
using FoxBase then VFP forever; however, with its end of life all new
projects are in SQL
and there just seems to be A LOT more coding involved to accomplish the
same task, especially when doing anything on a row by row level ....
SQL also lacks a built in debugger; going into .NET is a pain . . .

But what can you do :) ....

I will try your suggestion and thanks everyone for the help...

Bryan
bwalton_707@yahoo.com - 27 Nov 2007 23:43 GMT
On Nov 27, 6:27 pm, bwalton_...@yahoo.com wrote:

> > Both are very different products.
>
[quoted text clipped - 54 lines]
>

Erland,

Thanks for the post

That is exactly the feature I was referring to, thanks for
clarifying ...

Bryan
bwalton_707@yahoo.com - 28 Nov 2007 00:04 GMT
On Nov 27, 6:43 pm, bwalton_...@yahoo.com wrote:
> On Nov 27, 6:27 pm, bwalton_...@yahoo.com wrote:
>
[quoted text clipped - 67 lines]
>

SELECT i.number, c.name, c.address, i.id, c.address
  FROM   customer c
  JOIN   (SELECT number, id, customerid,
                 rowno = row_number() OVER(PARTITION BY customerid
                                           ORDER BY date DESC)
          FROM   invoices) AS i ON i.customerid = c.id
  WHERE  i.rowno = 1
  ORDER  BY c.id

This is what I ended up using ... Worked perfectly and performs better
than what I had ...

Thanks Much

Bryan
Erland Sommarskog - 27 Nov 2007 22:57 GMT
> A simple select statement where I want to return the most current date
> from a table along with the unique identifier for the row selected is
[quoted text clipped - 21 lines]
> buying as m$ft rarely follows any standards but there own 100% of the
> time anyway.

Any logical reason? Well, in 4.x of SQL Server you were permitted to
have columns in the SELECT list that were not in the GROUP BY clause,
and I hated the feature. Every time I made that error, I got a long output
of complete garbage, instead of a useful error message.

Simply, GROUP BY stands for aggregation, so if you group by A, B and
C and say that you also want D in the list - what does that mean? It can
make sense if D is dependent on one of A, B or C. For instance this
applies to customer.name in your example. But what is going to
happen if there are multiple values of invoice.id for the same
customer.id? I'm afraid that it plainly doesn't make any sense, whatever
you prefer to read into it.

To get the most recent invoice for each customer, this is the best
way to do it in my opinion:

  SELECT i.number, c.name, c.address, i.id, c.address
  FROM   customer c
  JOIN   (SELECT number, id, customerid,
                 rowno = row_number() OVER(PARTITION BY customerid
                                           ORDER BY date DESC)
          FROM   invoices) AS i ON i.customerid = c.id
  WHERE  i.rowno = 1
  ORDER  BY c.id

It's nice for several reasons:
1) It's fully ANSI-compatible.
2) It's easy to extend to "show the three latest invoices".
3) It is likely to be very efficient.
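
Point 2 is easy to try out. The sketch below is not the SQL Server query itself: it runs the same ROW_NUMBER() pattern against an in-memory SQLite database from Python's `sqlite3` module (this assumes the bundled SQLite is 3.25 or later, which is when window functions were added). The sample rows, the helper name `latest_invoices`, and the extra `id DESC` tie-breaker are assumptions made for the illustration, and the alias is written with `AS rowno` because SQLite does not accept the `rowno = ...` form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE invoices (
    id         INTEGER PRIMARY KEY,
    customerid INTEGER NOT NULL REFERENCES customer (id),
    date       TEXT NOT NULL
);
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO invoices VALUES
    (10, 1, '2007-11-01'),
    (11, 1, '2007-11-20'),
    (12, 1, '2007-11-25'),
    (20, 2, '2007-10-05');
""")

# Number each customer's invoices from newest to oldest, then keep the
# first n of them. The id DESC tie-breaker (an addition for this sketch)
# makes the numbering deterministic when two invoices share a date.
LATEST_N = """
SELECT c.id, i.id, i.date
  FROM customer AS c
  JOIN (SELECT id, customerid, date,
               ROW_NUMBER() OVER (PARTITION BY customerid
                                  ORDER BY date DESC, id DESC) AS rowno
          FROM invoices) AS i
    ON i.customerid = c.id
 WHERE i.rowno <= ?
 ORDER BY c.id, i.rowno
"""


def latest_invoices(n):
    """Return (customer id, invoice id, date) for each customer's n newest invoices."""
    return conn.execute(LATEST_N, (n,)).fetchall()


print(latest_invoices(1))  # the most recent invoice per customer
print(latest_invoices(2))  # the two most recent invoices per customer
```

Going from "latest" to "latest three" is just a change to the `rowno` comparison, whereas the nested MAX/MIN subquery templates would have to be rewritten.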

Signature

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

 