Conceptual overview of importing into a relational system.
There are three main problems to solve when importing into a relational system.
1. We must value the destination fields correctly.
2. We must value the pointers between tables for report purposes.
3. We should handle inserts and updates with the same code.
Value Destination Fields
Valuing destination fields is relatively simple. Anyone who has ever done an import has had to map source data to destination fields. Examples include text based sources with fixed width fields, delimited text, and field based imports from Excel or some other 'odbc' source that tags the data with names for each column.
Solving the fixed width problem involves character start and length settings in a datamap. The delimited problem can be solved if you know the delimiter and the column number of each source field. Sources like Excel can be imported if you know the field NAME and the odbc source (why not use this solution provided by MS?).
We're pretty well covered if our datamap inclues: 'column_num' (character column number for fixed width and field column number for delimited), 'length', 'column_name' (odbc), 'destination_table' and 'destination_field'. The 'delimiter' and 'odbc_source' information can be stored in some type of datamap setup table.
Handling Inserts And Updates
The third problem rises out of the need to know if the data coming in is new or if it's an update for data already imported. You don't always know and it would be nice if one solution handled both. To do this you sql select according to some key (refers us to the second problem above) to see if the record has already been laid down. If it has you update with the new values. If it hasn't you insert, remember the key, and update with the 'new' values.
Value Table Pointers
The second problem is the kicker. Relational systems can be very complex with pointers pointing all over the place. Also, the destination system and source relational systems might have different structures. So how do you set pointers in your tables to enable accurate reporting?
A pathing system works for this. In a pathing system each table has an entry point and an exit point. Entry points and exit points are field names. The exit point for one table becomes the entry point for the next. A sequence of entry_point, exit_point and table_names takes you from some starting point in the source data to a destination table with the fields we must value (problem 1).
entry_point, exit_point, table_name
The select statement for problem 3 looks like this from within the context of the solution for problem 2:
SELECT exit_point FROM table_name WHERE entry_point = prior_exit_point
Our program must remember the prior exit point as we go from one record to another in our pathing system. When doing an insert our program must also adjust to the demands of a relational system with primary keys that must be unique.
We can put all of this into three major program steps:
1. Collect data
2. Resolve paths
3. Update data
Special options include conversions for format differences, always inserting, never inserting, allowing empty destination records or not, certain fields that must be filled before the data is written at all... Joe Kuhn
( Topic last updated: 2002.03.10 08:57:22 PM )