(Updated: 2007.03.23 07:48:47 AM)
| |
A primary key is a field or group of fields that unambigously identifies a specific record within a table. Primary keys can be either a
Natural Key or a
Surrogate Key.
VFP provides a primary index that enforces the uniqueness of a primary key.
All records, even records marked for deletion, must have a unique value for the primary index's key expression for all records in the table (including those marked for deletion). VFP's primary index can be filtered with a FOR NOT DELETED() to filter out the marked records, however, this may cause trouble if a record is ever recalled.
Specifically, an attempt to recall a record whose primary key is used by another causes error #1884 "Uniqueness of index /tag/ is violated.", and the recall fails. To me, trouble would be if the recall succeeded; you would be left with duplicate primary keys.
Personally, I think choosing to use
for not deleted() for primary and/or candidate indexes is driven by your business rules. If your business rules allow items to be deleted, then you may as well allow the ID of a deleted item to be reused as soon as the item is deleted, not force the system to wait until the table is packed.
If you need to recall a record that would cause a collision, you can change the value of the key before doing the recall.
I see a few problems with deleted record reuse while not maintaining unqiue keys across all records in a table.
For Example, since FOR NOT DELETED() is not optimizable, you would need to create another index that is optimized if you want to use the primary key to JOIN tables and get optimization too. Since indexes get pulled across the wire when you open a table, this can cause performance issues.
Another problem is if the primary key is used as the foreign key in child records. In this case, if you RECALL a record to reuse it, you may end up re-associating child records that shouldn't be there.
Given the problems, is it really that hard to always create a new
unique primary key across all records (deleted and not deleted)? --BenCreighton
Indexing Effectively for Rushmore: (from Chapter 15 of the Programmer's Guide)
"Rushmore cannot take advantage of all indexes. If you use a FOR clause in the INDEX command, Rushmore cannot use the index for optimization. For example, because it contains a FOR clause, the following statement cannot be optimized: INDEX ON ORDNUM FOR DISCOUNT > 10 TAG ORDDISC."
This means that to take advantage of Rushmore, AND allow for deleted records, you would need to have 2 indexes on the same field. --
Cindy Winegarden
If people think they need to re-use records, it can be done without re-using keys. Replace the key with -1*PK and replace the rest of the fields with blanks. Then look for a record with PK < 0 or add a new record if there are none. IMHO people stand on their heads trying to re-use keys without errors, when they shouldn't be doing it at all.
The root of an issue of abstraction.
A data table is an abstraction of some class. Each record represents an instance of the class and each field some property or attribute of the class instance, right? However, the table itself has a level of abstraction as well -- i.e., a table could be a part of a class called
DataStorageUnit with properties such as record and field[n,x] (where n = field number and x = some kind of description). So, here is the issue as I see it.
When we create a PK (or CK for that matter), what are we attempting to uniquely identify?
If we are attempting to "fingerprint" the table and record then a SYS(2015) or SYS(3) or other unique key alone is sufficient to the purpose. Even if we are attempting to uniquely ID the actual subject of the table abstraction (i.e., customer or invoice), these routines are plently. However, if the PK or CK pulls double-duty and contains property or attribute information about a particular instance then we have another problem completely.
Input from a user always holds some opportunity for duplication. Even if the source of that information has a carefully designed and managed unique key within it. The user can always have typos or misreadings or other potential violations of the key. So, we have to ask ourselves something:
Given that user information can be duplicated (either purposefully or otherwise), how do we want to deal with such a duplication?
There are a number of factors (in my mind) to consider. It is obvious that if we determine that a primary key is needed we MUST know when the key is violated. The central question is who or what handles the trapping of the error?
If we let
FoxPro handle this error then we lose flexibility. Fox only has limited error trapping and handling capacity at a class level of abstraction. In my mind the best and most flexible solution is this (remember we are talking about primary keys that contain attribute information of a class abstraction instance):
1. Create your instance PK with whatever information available as you choose.
2. Prefix or suffix the key with a computer generated key (this will prevent Fox from generating a system level PK violation).
3. Make a base class method (in your framework) that checks for duplicate PK/CK information (less the computer generated key prefix or suffix).
4. Create a PK/CK violation handling class that will handle common issues and (as a class) can be expanded to include system specific issues as you have need.
I can tell you that one thing I dislike is the use of PK's pulling double duty as both a record level abstraction PK and a user (or class function) readable serial number style PK used informationally in a system. Invariably, the two abstract concepts end up causing problems in one way or another for each other. If you want serial numbers for your class abstractions, then create serial numbers. If you want each record easily and uniquely identified, then do so at a level where the user is completely unaware that this is happening. However, do NOT use class abstraction information (i.e., first name+last name+middle initial+date of birth) to function in the role of uniquely IDing a table record. Even the record number is a poor way to do this. Instead, make you PK field your record ID stamp and then create CK's as you need them to ID the class instance.
Purest OOP is violated a number of ways by having class properties that pull double-duty. One, is the concept of self-documentation. When subsequent programmers look at your work they must decide in each usage of the PK whether the property is being used to ID the record or an instance of the table class. Perhaps it is obvious to them, perhaps not. If they guess wrong then they may create problems there and elsewhere. If you have your PK's as ALWAYS being a "fingerprint" of the record with CK's being gen'd for specific purposes and document this clearly in the DBC comments section, it is likely that other programmers (or even yourself) months later will always know what you intended and how you implemented the use of your table structures in your programs.
It violates Direct Mapping, which states that classes and class properties have only one representation and use.
I think you get the idea.
Bottom line -- use the table PK as a record level abstraction. Create CK's as you require them to ID entity class instances and tag them with computer generated keys to keep them 100% unique. Never use one field (class property) to pull double abstraction duty in your data sets. Use your own PK and CK error handling at a class level. Finally, let each class know about each of its keys, but do not make the record level key available to the user (only the system -- and even then, don't spread the knowledge far and wide).
LarryRix
Reusing deleted records is not such a good idea if dealing with a Buffered table.
Suppose two users access the same data entry form. They both "add" a record but in effect they get the same recovered record. This leads to a conflict situation WHICH SHOULDN'T HAVE HAPPENED.
Of course, if there was a way to know, when SEEKing the deleted record, that the record found has been RECALLed by another process, things would be different, but right now, GETFLDSTATE(0) only knows about the data session within which it is invoked.
There are of course a number of ways to get over this, but if you're like me, you want that any chnges made by a user are only made final by committing the changes (including the recovery of the deleted record). If the user cancels his data entry session, you just do Tablerevert or maybe even nothing.
Furio Filoseta

Can Larry and Furio take a quick look at Using Sys2015 As A Primary Key ?
http://fox.wikis.com/wc.dll?Wiki~UsingSYS(2015)AsAPrimaryKey~VFP
What if you have tables
that have
PrimKey fields
where the
PrimKey field is setup as
Primary Key in the Modify Structure and
the tables are in a DBC and
the user never has any input as to that value - it is in the default value field in the Modi Stru and
you have 10-20 users on at any one time and
about 10 at any one time will be doing data entry NOT at a high speed rate?
Contributors:
Nancy Folsom,
Jim Booth
,
Carl Karsten,
Cindy Winegarden
Category Data Modeling Category Key Fields