(Updated: 2004.07.26 07:35:25 PM)
| |

Here's a summary of the situation: When you lock a record, the actual physical lock is not placed on the record. This is because if a lock were placed on the record, it would not be possible to read from that record until the lock is released. Supposedly, this lock is placed at (2GB + Recno()), ensuring that a lock does not conflict with reading from a record. This is not the actual case. The lock is actually placed at approximately (2GB - recno()).
So if you lock a record in a table that's at or close to the 2GB limit, one of the other records gets a lock placed on it, preventing other instances of VFP from reading that file. This can result in unusual behavior, and, possibly, a
Dead Lock scenario.
Please note that this does not affect a "single-user" table, as a process can read through it's own locks.
The formula for finding the amount of potencially affected bytes is roughly:
((Header() + (
RecSize() *
RecCount() + 1)) - ((2^31) -
RecCount()))
Also note that adding a CDX to the table does
not cause this bug to go away.
--
Peter Crabtree
This is interesting... but I'm still not sure how the 'wrong record' gets locked.
I had understood that a "lock" was more of a logical thing, where
some number was used to represent the record and as long as all systems involved derived that number in the same way then there would be no problem. Since RECNO appears to be a factor in the derivation, uniqueness to a specific record seems to be assured, doesn't it? In other words, it shouldn't matter if the derivation results in a 'real' record number as long as the derived number is unique for the ACTUAL record number intended to be locked. What am I missing here? --
Jim Nelson
From the FPD 2.6 help file ...
* The actual file size (in bytes) cannot exceed 2 gigabytes for
single-user or exclusively opened multi-user .DBF files.
Shared .DBF files with no indexes or .IDX indexes cannot
exceed 1 gigabyte. Shared .DBF files with structural
.CDX indexes cannot exceed 2 gigabytes.
I don't know if this still applies, if it does you need a cdx.
07/13/2003 I did run the code below and observed the behavior reported. I attempted to build an index on Deleted() but ran out of temp file space. I then created a table with 40 character field width as suggested by Ted, which didn't have the "Record not available..." problem (52,377,641 rows) with or without a cdx.
Scott Finegan
There is apparently a problem with VFP's locking scheme, at least on Windows XP. I've always thought that VFP locked at (2GB + record number), which was one of the reasons that tables are limited to 2GB.
This is wrong. VFP doesn't lock at
(2147483648+recno()), it locks at
(2147483646-recno()).
This is very odd, as it causes locking records in a very large table to intersect other records. Example:
(test table: test C(4))
If you lock record A (#357,913,953) (in VFP session #1),
it locks offset 1,789,569,693,
which prevents ANY access (read or otherwise) to record B #357,913,
873) (in VFP session #2),
though you can still lock it.
In actual fact, VFP session #2 will lock up until the other session unlocks the record.
I have only access to VFP7, so I don't know if there's some other scheme present in other versions (which would be even worse!), but this is quite a big bug, if you're dealing with tables large enough for this to matter!
Anyone have VFP6/8 to test this out?

I suspect that the conclusion above may be erroneous (not that I know the answer).
While I get similar results using VFP8SP1 and using 2 different systems for the test (access same table from both systems), I feel that something else is at work here.
On systemA I locked record 357913953 and an attempt to access record 359913873 from systemB resulted in the 'record not available...' message. UNLOCKing in systemA, then RLOCK() in systemB succeeded in locking the record requested on systemB.
However, I was then able immediately to re-RLOCK() the
original record on systemA.
Equally important,
File Mon does show 2 DIFFERENT offsets for the two different record numbers locked. On systemA the offset is 1789569693 for record 357913953. On systemB the offset for record 359913873 is 1789569773.
I believe that the "offset" is actually the lock key'.
Finally, it is not just a single record that is affected by the lock on systemA. Attempts to lock records around record 359913873 on systemB provide the same result... the dread message until systemA unlocks, yet systemA can still immediately relock the original record. Surprisingly, the same is true in the reverse... if a record is locked on systemB then attempting to access adjacent records on systemA result in the dread message.
Note, too, that the dread message appeared regardless of settings for SET REPROCESS and SET REFRESH on both systems.
My conclusion is that any intersection of "offset" values with real records is IMMATERIAL. But something is afoot that certainly is surprising. And it sure would be good to know what the hell it is!
For completeness, the simple tests were done between a XP-pro system and a
Win2K-pro system, with separate tests done for the referenced table being resident on each system.
I believe that you have misunderstood the problem. SystemB is not supposed to RLock() record *773, because it doesn't matter if it is RLock()ed. It is simply supposed to read from it. Locking record *953 results in a lock being placed close enough in the physical record space of *773 that when *773 is being read, the lock is hit by the O/S.
To clarify, here is the entire sequence of events, with a table to show the relevant offsets:
| Record Number |
Physical Offset Start |
Physical Offset End |
Physical Offset When VFP Does a READ (VFP's minimum READ size is 512 bytes) |
RLock() Offset from Filemon Consistantly ((2^31) - 2 - Recno()) |
| 357913953 |
1789570088 |
1789570092 |
1789570600 |
1789569693 |
| 357913873 |
1789569688 |
1789569692 |
1789570200 |
1789569773 |
- VFP Session 1:
RLock("357913953")
- VFP issues a LOCK on offset 1786569693, length 1
- VFP Session 2:
GO 357913873
- VFP issues a READ on offset 1789569688, length 512
- VFP Session 2 is now locked, because the O/S cannot go through an exclusive LOCK, and VFP is waiting for data from the O/S.
--
Peter Crabtree
Hi Peter, I *think* I understood the problem but could not use that conclusion to rationalize that after session 1 UNLOCKed the record, letting Session 2 access it, then RLOCK() it, session 1 could then immediately RLOCK() its same original (different from session 2's) record. I would have thought, under the original conclusion, that session 1 would have been prevented from locking by the same "bug". To me that eliminated the READ length as being causal. What am I missing? --
Jim Nelson
I have VFP 8, and I'll try to take some time over the weekend to verify this.
However, I'm not sure how large a concern it is. In most modern VFP systems, locks are very brief, milliseconds or centiseconds or deciseconds at worst, while a record (or set of records in a transaction) is being updated. Since more records are being locked, and records actually unrelated to the transaction, this increases the chance of a deadlock situation. But I think that proper use of SET REPROCESS should minimize this problem.
How big is the threat here? The lock takes place in the tail end of the table, at 2^31-1-RECNO(). The example of C(4) records is pretty small record in which to hold meaningful data, never mind 350 million of them. More realistic might be a row of width 40, in which case the table size is limited to 52,377,649 records and while the contested area is limited to the last 50 megabytes of the 2 gig limit (2%) - cutting it pretty close! --
Ted Roche
[2003.07.13] Tested in VFP8 with similar results. A couple key differences: Rather than running the entire program, I created the table and locked a record in one instance, then opened a second instance, used the table shared, and issued GO 357913873. Observed that there was a funky little scrolling message in the status bar: "Record not available... please wait." The only way to release that session was to go back to the first and issue UNLOCK ALL. Changing the setting in the second to SET REPROCESS TO AUTOMATIC (after re-locking in session #1), returned an immediate RLOCK() = .T., but trying to EDIT the record brought up the scrolling message again.
Again, I think the exposure can be minimized with the properly locking strategy, and being cautious at using files near their maximum size. While the example shows the danger for 20% of a four-byte field-record, the exposure is only for the last 2.5% of a record with 39 bytes (+1 for deleted).

Ted, did you give the table a CDX as suggested at the top by Scott Finegan?... I think he hit upon something there.

This is, a relatively small concern, but definately blows away the argument that the locking scheme would be the prime reason that VFP tables can't go above 2 GB.
My main concern about it is it could easily cause a
Dead Lock condition, with the smallest useful table (two Integer fields, for a multi-to-multi linking table (4+4+1)) means that about 26,500,000 of the records (of a max of about 238,500,000) could cause a
Dead Lock (11%). Granted, the chance for lock is small, but still there.
Also, many supposed "instability" problems with large tables could be directly related to this, as the second session just
locks, apparantly stuck waiting for the
_fRead() function to return.
Keep in mind, by the way, that changing the record size will change the calculations to find a pair of records that will deadlock.
On yet another note, the CDX shouldn't have anything to do with this, as the locking is clearly observable with a tool like
File Mon. A CDX isn't going to change the record locking scheme.
As for the RLock() not locking the second session, of course it won't! It's going to lock some other location ((2^31)-1-Recno()), not the record. The problem lies in trying to read or write the record, as any locked area in a file cannot be read or written. --
Peter Crabtree
Reproduction code:
CLOSE DATABASES ALL
IF !File(".\test.dbf")
IF MessageBox("Must make testing table - this WILL take a while. Continue?",4) != 6 &&IDYES
RETURN .F.
ENDIF
CREATE DBF test(test C(4)) && This makes 5 byte records - the 4 byte field, and the delete marker
FOR I = 1 TO 357913953
INSERT INTO test VALUES("test")
ENDFOR
USE IN TEST
MessageBox("Finished creating table")
ENDIF
*To test with a CDX:
*USE TEST EXCLUSIVE
*INDEX ON Deleted() TAG Deleted
*USE IN TEST
USE TEST SHARED
Rlock("357913953","test")
LOCAL lcLockBugProgram
TEXT TO lcLockBugProgram NOSHOW
USE test SHARED
?"See? Frozen!"
GO 357913873
QUIT && will get here only AFTER the first session unlocks the record
ENDTEXT
SET SAFETY OFF
StrToFile(lcLockBugProgram,"lockbugx.prg",0)
COMPILE lockbugX
StrToFile("SCREEN = ON"+Chr(13)+"COMMAND = DO lockbugx","lockbugx.fpw",0)
DECLARE INTEGER ShellExecute IN SHELL32.dll INTEGER,STRING,STRING,STRING,STRING,INTEGER
ShellExecute(0,"open",_VFP.FullName,"-Clockbugx.fpw -T",Curdir(),1)
MessageBox("Click OK to unlock the file and unfreeze the other session")
USE IN test
Inkey(.5)
DELETE FILE lockbugx.*
Contributers:
Peter Crabtree
Category VFP Bugs