SQL Server Forum / General / Data Warehousing / May 2005

Incremental Load to DataWarehouse table

marcmc - 06 Apr 2005 17:29 GMT
I have never done it before.

By the term incremental, I mean that I have a fact table and each day I want
to add records to it that do not already exist in the table. I do not want to
truncate and do a full load each day. There are millions of rows. I am
already thinking about Last_RunDate_ID fields etc

What is the best approach to thinking about/coding this?  
Any methodologies out there?
J?j? - 10 Apr 2005 16:16 GMT
There is no single solution; you have to choose the one that fits what you can do.

For example, if you only ever add new rows that follow the last one in your table, a
simple select MAX(DateKey) from MyTable can be used to find where the previous load stopped.
This works well for log-like data or sales transactions:
each transaction has a new date, there is nothing to delete, and there are only new rows to add.
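
A minimal sketch of that MAX(DateKey) approach, written in Python with the standard sqlite3 module so it runs anywhere. MyTable and DateKey come from the post; the Staging table, the Amount column, and the integer yyyymmdd key format are assumptions made only for illustration.

```python
import sqlite3

def load_new_rows(conn):
    """Append only staging rows newer than the newest row already in MyTable."""
    cur = conn.cursor()
    # High-water mark: latest DateKey already loaded (None if the table is empty).
    (last_key,) = cur.execute("SELECT MAX(DateKey) FROM MyTable").fetchone()
    if last_key is None:
        # First run: nothing loaded yet, so take everything from staging.
        cur.execute(
            "INSERT INTO MyTable (DateKey, Amount) "
            "SELECT DateKey, Amount FROM Staging"
        )
    else:
        cur.execute(
            "INSERT INTO MyTable (DateKey, Amount) "
            "SELECT DateKey, Amount FROM Staging WHERE DateKey > ?",
            (last_key,),
        )
    inserted = cur.rowcount  # rows added by the INSERT ... SELECT
    conn.commit()
    return inserted

# Hypothetical demo data: two days already loaded, staging holds one old and one new day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (DateKey INTEGER, Amount REAL)")
conn.execute("CREATE TABLE Staging (DateKey INTEGER, Amount REAL)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)",
                 [(20050404, 10.0), (20050405, 12.5)])
conn.executemany("INSERT INTO Staging VALUES (?, ?)",
                 [(20050405, 12.5), (20050406, 7.0)])
added = load_new_rows(conn)
```

Note that the strict `>` comparison skips any late-arriving row that carries the same DateKey as the current maximum, so this only suits sources where a day is complete before it is loaded.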

If you know that you sometimes have to reload part of your data, you can cut
your big table into smaller ones (partitions) and truncate/reload only the
partition you need to reload.
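
The partition idea can be sketched the same way. SQLite has no table partitioning, so in this sketch a DELETE on a hypothetical MonthKey column stands in for truncating one partition; the Sales table and its columns are assumptions, not something from the thread. On a real partitioned table the DELETE would be replaced by whatever truncate or switch mechanism the database offers.

```python
import sqlite3

def reload_partition(conn, month_key, fresh_rows):
    """Replace every row of one logical partition (one MonthKey) with fresh rows."""
    with conn:  # one transaction: commit on success, roll back on any error
        # Stand-in for "truncate the partition".
        conn.execute("DELETE FROM Sales WHERE MonthKey = ?", (month_key,))
        # Reload only that partition; other months are never touched.
        conn.executemany(
            "INSERT INTO Sales (MonthKey, DateKey, Amount) VALUES (?, ?, ?)",
            fresh_rows,
        )

# Hypothetical demo: April needs a reload, March must stay as it is.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (MonthKey INTEGER, DateKey INTEGER, Amount REAL)")
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", [
    (200503, 20050330, 5.0),
    (200504, 20050401, 1.0),
    (200504, 20050402, 2.0),
])
reload_partition(conn, 200504, [
    (200504, 20050401, 1.5),
    (200504, 20050402, 2.0),
    (200504, 20050403, 3.0),
])
april = conn.execute(
    "SELECT DateKey, Amount FROM Sales WHERE MonthKey = 200504 ORDER BY DateKey"
).fetchall()
march_count = conn.execute(
    "SELECT COUNT(*) FROM Sales WHERE MonthKey = 200503"
).fetchone()[0]
```

Doing the delete and the reload in one transaction means a failed load leaves the old partition contents in place.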

Peter Nolan - 16 May 2005 18:57 GMT
Hi marcmc,

I published COBOL and C code that achieves what you are asking for, and much
more.

The reason I published it is that MSFT are dropping embedded SQL in C in
2005, so it cannot be taken forward, but all the ideas in the code work. I
have implemented them in many places.

http://www.peternolan.com/Default.aspx?tabid=57

You can always use what is there and write it in another language.

If what you really need is the ability to get the data out of a source
system incrementally (which is a different problem), and the source system
does not allow you to get incremental extracts, my company provides free
utilities on win2000 (plus source code). One of them is a 'delta generation
utility', which compares two files and generates the deltas. You can
read about it at http://www.instantbi.com/Default.aspx?tabid=30 under
IDW utilities. This way you can detect changes to files and pass just the
changes through to your ETL subsystem. Another utility detects whether a
row already exists in the target table and deletes it, so that when the row is
loaded the loader does not crash with a constraint violation.
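
To illustrate the general idea of generating deltas from two extract files, here is a minimal Python sketch. It is not the utility mentioned above and makes no claim about how that utility works; the CSV layout, the `id` key column, and the function names are assumptions chosen for the example.

```python
import csv
import io

def read_keyed(text, key):
    """Parse CSV text into a dict of rows indexed by the key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}

def generate_delta(old_text, new_text, key="id"):
    """Compare two extracts and return (inserts, updates, deletes)."""
    old = read_keyed(old_text, key)
    new = read_keyed(new_text, key)
    # Rows whose key appears only in the new extract.
    inserts = [new[k] for k in new if k not in old]
    # Rows present in both extracts whose contents differ.
    updates = [new[k] for k in new if k in old and new[k] != old[k]]
    # Rows whose key disappeared from the new extract.
    deletes = [old[k] for k in old if k not in new]
    return inserts, updates, deletes

# Hypothetical extracts from two consecutive days.
yesterday = "id,amount\n1,10\n2,20\n3,30\n"
today = "id,amount\n1,10\n2,25\n4,40\n"
inserts, updates, deletes = generate_delta(yesterday, today)
```

Only the inserts and updates would then be passed on to the load step, which keeps the daily volume small even when the source can only produce full extracts. This version holds both files in memory, so very large extracts would need a sorted, streaming comparison instead.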

Best Regards

Peter Nolan
www.peternolan.com

 