Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
DB Engine
SQL ServerMSDESQL Server CE
Services
Analysis (Data Mining)Analysis (OLAP)DTSIntegration ServicesNotification ServicesReporting Services
Programming
CLRConnectivitySQLXML
Other Technologies
ClusteringEnglish QueryFull-Text SearchReplicationService Broker
General
Data WarehousingPerformanceSecuritySetupSQL Server ToolsOther SQL Server Topics
DirectoryUser Groups
Related Topics
MS AccessOther DB ProductsMS Server Products.NET DevelopmentVB DevelopmentJava DevelopmentMore Topics ...

SQL Server Forum / General / Other SQL Server Topics / May 2007

Tip: Looking for answers? Try searching our database.

What do you think about the checksum function ?

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Kurt - 29 May 2007 13:39 GMT
Hi,

I'd like advices about an idea I add to resolve a problem. thanks to
you in advance for yours answers.
I have a database with tables that I load with flat file. The size of
each table is 600 Mb. The flat file are the image of an application
and there is no updated date or created date on any table. So my
tables are just a copy of the data from the flat file.

Now I'd like to create an History Table. So I have to determine which
lines changed and which one did'nt.
As I don't have any date on my row the only answer I had unil know was
to check each column on each row to see if any data changed. If the
data changed I add a new line in my history date.

My idea is to add a checksum column in both table on all columns. To
know if any data change I just have to check my PK  + my checksum
column.
Do you think that is a good idea ? Is checksum a quick function or
not ?.

Thanks.

--
K
Erland Sommarskog - 29 May 2007 22:31 GMT
> I'd like advices about an idea I add to resolve a problem. thanks to
> you in advance for yours answers.
[quoted text clipped - 14 lines]
> Do you think that is a good idea ? Is checksum a quick function or
> not ?.

Neither checksum() nor binary_checksum() are very useful. I think they
based on XOR, and they would too often say a row is unchanged when it
has not. It would be a lot safer to compare all columns?

Exactly how do update your tables? To blow all existing data away and
reload, or do you INSERT new, update existing ones etc? In such case a
timestamp column could work for you. (Timestamp here has nothing to
do with date and time.)

Signature

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Kurt - 30 May 2007 07:45 GMT
> > I'd like advices about an idea I add to resolve a problem. thanks to
> > you in advance for yours answers.
[quoted text clipped - 18 lines]
> based on XOR, and they would too often say a row is unchanged when it
> has not. It would be a lot safer to compare all columns?

> Exactly how do update your tables? To blow all existing data away and
> reload, or do you INSERT new, update existing ones etc? In such case a
[quoted text clipped - 8 lines]
>
> - Afficher le texte des messages précédents -

Hello,

I read on the msdn website that BINARY_CHECKSUM can be used to detect
changes to a row of a table.
To update my current tables, I truncate and after I insert the new
datas.
What do you call a timestamp ? How timestamp could work for me ?

Thanks,

--
K
Erland Sommarskog - 30 May 2007 22:57 GMT
> I read on the msdn website that BINARY_CHECKSUM can be used to detect
> changes to a row of a table.
> To update my current tables, I truncate and after I insert the new
> datas.
> What do you call a timestamp ? How timestamp could work for me ?

If you truncate and insert, timestamp is not going help you. A timestamp
column is automatically updated when a column is updated with a database-
unique monotonically increasing value. So if you had updated only rows that
had changed, and inserted new ones, you could have saved the current
timestamp value when you started, and then all rows with a higher timestamp
value would have been new.

But since you truncate and re-insert, all timestamp values will be higher
when you started.

As I recall binary_checksum() returns a 32-bit integer. If your tables
are wide, this means that collisions will not at all be unlikely. Note
also that binary_checksum ignores some data types entirely.

While more boring to code, I would recommend that you make a comparison
column by column.

Signature

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.