Wiki Home

VFP Basic Data Concepts


Namespace: WIN_COM_API
This is a draft section of the Learn VFP In 21 Days project.

Category 3 Star Topics


What's in this Section

Basic Movement
BOF and EOF
Creating New Records
Cursors
Data Structure
Deleting Records
Designing and Creating Tables
Editing Existing Records
Field Order
Indexes
Opening and Closing Tables
Overview
Record Pointer
Searching and Finding
Sort Order
Work Areas and Data Sessions


Overview

This being a practical manual for beginners, I'm deliberately leaving out a lot of details in order to focus on the basics of how to get the data up and running. Especially, I'm not including full syntax diagrams for commands; those are easily found in the VFP online Help. Suffice it to say that this system is extremely flexible, powerful, and fast, and if you're looking for more information on what you can do with it, it's readily available in the Help files or via several excellent books and websites.

This section describes how VFP data is organized and stored, and some basic ways to create, find, modify and delete records. Since VFP's native data engine was born in an xbase world, we'll focus on the xbase methods for handling data. However, some time after that birth, VFP "adopted" SQL into the family, so we'll touch on ways to use SQL syntax to do some basic data manipulation. SQL has become a critically important feature of VFP in recent years, but a full discussion of VFP's SQL implementation is beyond the scope of this section. It can be useful to know that when you use SQL commands with VFP tables, VFP translates those commands into xbase terms behind the scenes and manipulates the tables in xbase.

VFP is a relational database management system (RDMS), and as such, it has a lot of built-in ways to quickly establish relationships between tables and to automatically manage referential integrity issues.

*** NOTE TO THE WIKI ***
I don't use the visual database relation management or RI tools, or the SET RELATION command, and I don't feel qualified to write about them. Can somebody else chip in on this?
*********************

While an understanding of good data normalization and database design principles is very important for building quality applications with VFP, these topics, too, are beyond the scope of this document.

VFP provides both visual IDE tools for creating tables and databases, and programmatic commands for doing so. We'll briefly describe each of these in this section.

And finally, I'll note in passing that various VFP GUI controls let you "bind" data that's sitting in tables directly to them, so that users can instantly see the data as soon as the control appears, and changes they make to that data are instantly stored in the tables. While that may or may not be a good idea, depending on your application requirements, a discussion of that concept definitely belongs in the sections on GUI controls, not here.

Data Structure

In Visual Foxpro, as in other xbase languages, data is stored in files called tables. In VFP, tables can be "free" or associated with a database. I'm not going into the details of databases here; that's in another section. I just want to clear up some potential confusion about VFP's implementation of the "database" concept. When the association between a table and a database is made, the table is said to be "in" the database, and the main database .dbc file is often referred to as the database "container", but this is really inaccurate. The database does not physically "contain" tables. The database "container" file contains "metadata" that establish and define each association between a table and a database. In fact, the database container file can be in a completely separate file system path from the tables it "contains", and it will work just fine as long as the paths were properly specified when the associations were created.

The structure of a .dbf table is important to understanding the xbase data navigation commands.

The .dbf has a header section that contains information about the structure of the table (field names, data types, and sizes), a count of the number of records currently in the table, and other information. (When a table becomes corrupted, it is usually because some piece of information in the header no longer "agrees" with what's in the table. With the proper tools and techniques, it is usually possible to dig into the header and repair table corruption--but those tools and techniques aren't really a part of the VFP feature set. "Corruption" in VFP data, when it happens, is much more often a case of an index file getting out of "synch" with table data, and this is usually fixed pretty easily by rebuilding the indexes.)

Data in the table is stored in rows and columns, of course. Columns contain fields, and a row represents one complete record. .Dbf tables are "record based". That is, each record has a distinct location in the table, referred to by number, and can be referenced by that location alone. Unlike "set based" tables, it is not necessary to know anything about the data contained in a record in order to refer to it or locate it. In my opinion, this is one of the major strengths of the .dbf design. It is easier to find a specific record, and much easier to let a user navigate through a set of records in some kind of logical sequence (such as alphabetical order by last name), with the .dbf system than it is with "set based" database systems.

Memo fields are a special case. When you add a memo field to a table, a tiny little "pointer" record for it is stored in the .dbf file, but the actual contents of the memo fields are stored in a separate memo table with the same name as the .dbf file and a .fpt extension. Some commands that you execute on all the fields in a table may leave out any memo fields unless you specify that they be included. Others include the memo fields by default but you can exclude them if you want. Watch out for those.

Work Areas and Data Sessions

One of the more confusing aspects of VFP's native data engine is "work areas". "Work areas" are common to all xbase languages, but VFP has introduced some new twists over the years. A work area is a space in memory (either in RAM or "virtual" disk "memory") in which a table is opened and its data can be accessed or modified. Only one table can be open in any work area at a time, but that's not really a problem because VFP can have up to 32,000 work areas in use at once. And if, by some chance, you manage to use up all of these, that's not really a problem either, because those 32,000 work areas are contained in one "data session", and you can have up to 32,000 data sessions, each with its own 32,000 work areas, all functioning at the same time. But that's not the confusing part.

The confusing part is that some things about work areas are separate and distinct, but others are not. For example, you can open the same table in two different work areas and sort each one on a different field. However, if you add a new record to that table in one work area, it will also be immediately added to the table in the second work area.

Data sessions may be even more confusing than work areas. If the two work areas described above are in separate data sessions, the record added in Data Session 1, Work Area 1, will not be found in the same table in Data Session 2, Work Area 1.

Data sessions can't be established in straight code; they must be created by objects such as forms or the nonvisual Session object. They can only be referenced by number, which is one of those things about VFP that really drive beginners crazy.

People get confused by the concepts of "default data session" and data session numbers. When you start the VFP IDE, or a stand-alone application built with VFP, you are in the "global data session". This is data session 1, as you will see if you

? _SCREEN.DataSessionID

By default, a form's Data Session property is also set to 1 - "default data session". But these numbers have no relation to each other. The value of a form's Data Session property is a code for a setting. 1 means "default data session"; 2 means "private data session". Data session number, represented by a form's DataSessionID property, is a different animal.

If this is the first form you open after starting a VFP session, the default data session will be the global data session, but that's just a coincidence. The default data session is whatever data session you are in when you open the form.

Suppose you have a form called MyFirstForm, and you set its Data Session property to 2 - private data session. When you open MyFirstForm, it will not be in the global data session. It will be in a new private data session that it establishes. This new private data session becomes the default data session for everything you do within MyFirstForm's scope. It will be Data Session 2, and 2 will be the value of MyFirstForm's DataSessionID property. Now suppose MyFirstForm has a method called "OpenAnotherForm", which runs a form called MySecondForm, and MySecondForm's Data Session property is set to 1 - default data session. When you run MyFirstForm.OpenAnotherForm(), MySecondForm opens in Data Session 2, and MySecondForm.DataSessionID will be 2, which is the default data session for MyFirstForm. If, instead, you set MySecondForm.DataSession to 2 - private data session, before opening it, it will open with its DataSessionID equal to 3 - a new, third datasession. Clear as mud?

Fortunately, applications involving more than 2 or 3 open data sessions at once are pretty rare. Once two or more data sessions have been established, you can switch between them with

SET DATASESSION TO [some number]

You can also change an object's DataSessionID number on the fly, with some perhaps startling results, but I'm not going to get into that can of worms here.

Designing and Creating Tables

The VFP Table Designer is extremely user friendly; it's virtually self-explanatory for users who have a basic understanding of tables and indexes. It also lets you apply a lot of fancy concepts like default values and triggers if your table is part of a database. I won't get into all that here.

Essentially, you open the New... dialog from the File Menu, choose Table, and use either the Wizard or New File options to create the table. You'll be prompted for a name, and then can easily enter field names and set their data types and sizes. You can also create simple indexes on fields. Any indexes you create here will be stored in a structural compound index file. That's important to know, and it's explained later in this section.

Another good thing about the Table Designer is that you can change the structure of an existing table (add or delete fields, add or delete indexes) and when you save the table, your changes can be propagated to all the records. The data won't be disturbed any more than absolutely necessary.

You can also create tables programmatically. It's quite easy. Here's the basic form:

CREATE TABLE foo (Firstname C(10))

That line creates a new table called foo with a field called Firstname of Character type and a field width of 10 characters. To add more than one field at once, just separate them with commas:

CREATE TABLE foo (Firstname C(10), Lastname C(20), Birthdate D)

Notice the last field, Birthdate. The D means it's a Date field. Date fields have a hard-coded default length of 8, so I don't need to specify a length for it. Several other data types also have hard-coded lengths.

After you execute a CREATE TABLE command, the table is left open in the current work area, and you can immediately put records into it.

Creating a table, whether in the Table Designer or programmatically, only establishes the structure of the table. It does not create any records, That's a separate process, described below.

The AFIELDS() function returns a wealth of information about the structure of a table.

Field Order

In VFP tables, each field in the table structure has a number that corresponds to the order in which the field was added to the table. For example, if I

CREATE TABLE foo (Firstname C(10), Lastname C(20), Birthdate D)

then Firstname is field number 1, Lastname is field number 2, and Birthdate is field number 3. I can refer directly to these fields by their numbers with the FIELD() function.

? FIELD(1) && Returns "Firstname"

REPLACE FIELD(1) WITH "Joe"

? EVALUATE(FIELD(1)) && Returns "Joe"

This can be extremely useful when constructing generic routines for getting data into and out of fields, especially if you use a scheme for designing tables that puts all fields of a particular type in the same field order position in every table.

Cursors

No, it's not the little flashing line at the spot where you type in your word processor. In VFP a cursor is a very much like a table, but it only exists in memory (sometimes in RAM, more often in "virtual" disk "memory"). Some people say it's an abbreviation of "CURrent Set Of Records", but that may be apochryphal.

I'm not going to talk about all of the capabilities and purposes of cursors here. Most of them are not applicable to the "basics" of VFP data, and there is a separate section on them elsewhere in this guide.

The point I want to make here is that whenever you look at the contents of a VFP table, VFP pulls data off the disk and puts it into a cursor. That's what you are actually looking at, not the real table. This explains some things, such as why a "table" doesn't stay sorted the way you left it after you close it. It also allows VFP to implement its various data buffering schemes.

Virtually all that has been written about VFP data handling makes it seem as though you can directly manipulate the data in tables. You can't. All of the data-handling commands and functions actually work only on cursors, and when you're done, VFP saves the data back to the actual table.

(Sort of. Actually, VFP relies on the OS to write data to the disk, and the OS does that in its own sweet time. So there are situations where changes to data don't actually get put into the table when you expect them to. This can happen whether you are using buffering or not, and there's very little you can do about it. Fortunately, it only rarely causes significant problems.)

There are low-level file manipulation commands that do let you directly mess with table files. Don't do it unless you really know what you're doing, though.

Opening and Closing Tables

In the simplest example, you open a table in VFP like this:

USE foo

*** Note to Mike Yearwood

I just deleted my less-clear paragraph and kept the one you substituted here. -- Ken Dibble

***

There are a couple of things to notice about this. The first is that this command causes VFP to open the table in the current work area. Work areas are numbered, but it gets very tricky to keep track of all those numbers after a while. Fortunately, VFP helps you out by also giving the work area a name, which in this case will be the name of the table. -- Mike Yearwood

The other point is that the USE command takes a name expression. If you want to have a generic method for opening tables, you can't just put table names into a variable and feed the naked variable to the USE command. The command will expect a name expression, and it will throw an error. So you need either to surround the variable with delimiters to indicate it's a name expression, like so:

Mytable = "foo"

USE (Mytable)

Or you can use macro expansion on the variable:

USE &Mytable

But, from here on in, until you close the table, when you talk about foo, you'll actually be talking about the work area where you opened the foo table. Sometimes you need a string value for this, sometimes not. I can't make head nor tail of why it's one way sometimes and why it's different other times. All I can do is tell you to watch out.

You can tell VFP what to call the work area where you open a table, like so:

USE foo ALIAS bar

If I have two tables, one called "foo" and the other called "bar", I can

USE foo ALIAS bar

and right after that, I can

USE bar ALIAS foo

VFP won't complain, and when you look at the foo work area, you'll see the bar table, while the bar work area will contain the foo table. If you do this kind of thing too much, your data will definitely be FUBAR. My advice: don't mess around specifying aliases. Just let VFP do its automatic thing. (There are always exceptions, of course, and I'll describe one of them in a little while.)

You'll recall that only one table can be open in a work area at a time. Assuming I remain in the same Data Session, if I:

USE bar

The bar table will be opened in a work area called "bar".

If I then

USE foo

The bar table will be closed, the foo table will be opened in the same work area, and that work area will be renamed "foo". I mentioned that work areas are tracked by number by VFP, although names are easier for humans to use. There's an exception to that. In VFP, work area 0 actually means "the next unopened work area available in numerical order". So if you want to have more than one table open at once, tell VFP to open tables in the 0 work area:

SELECT 0

USE foo

or

USE foo IN 0

Note that this is NOT the same as the SQL "SELECT" command. The xbase SELECT means "choose a work area"; the SQL SELECT means "fetch me some records". This is just one of those things that happens when a language has been around for 20 years and keeps changing with the times.

I mentioned having the same table open in two different work areas with different sort orders. You might think this would work:

USE foo IN 0

SELECT 0

USE foo

However, there are a couple of problems here. The second USE will throw a "File is in use" error. And even if it worked, you wouldn't have an easy way to switch back and forth between the two work areas. Oh yes, they'd each have separate numbers, but you won't know what they are, and it would be a pain in the neck to find out. So here's the exception where specifying an ALIAS is useful:

USE foo IN 0 ALIAS foo1

SELECT 0

USE foo AGAIN ALIAS foo2

Now that you've got more than one table open at once, you need to be sure you're in the right work area before you try to access one of them. Do this with the SELECT command too:

SELECT foo1
* do something

SELECT bar
* do something else

SELECT foo2
* And so on

Good defensive programming means you should never assume that anything is going to work, of course. If you try to SELECT a work area that doesn't exist, you'll get an error. Three of the most common problems xbase beginners face are 1. trying to access tables that aren't open, 2. trying to open a table that's already open, and 3. scrambling data because they thought they were in work area foo but they were really in work area bar. So always check before you act:

IF NOT USED("bar")
SELECT 0
USE bar
ELSE
SELECT bar
ENDIF
or
IF ALIAS() == "bar"
* Modify some data
ELSE
IF USED("bar")
SELECT bar
ELSE
USE bar IN 0
ENDIF
* Modify some data
ENDIF
Note here that the ALIAS() function returns the name of the work area you're currently "in".

There's always more than one way to skin a fox, of course, and many of VFP's xbase commands let you specify a work area in line. For example, to modify the data in the Names field:

IF USED("foo")
REPLACE Names WITH "Alex" IN foo
ENDIF
ensures you've put things where they belong even when you haven't SELECTed the foo work area.

Before moving on, I should point out that you can use several SQL commands in VFP without explicitly opening a table first. When you do this, VFP performs an implicit USE foo IN 0 before executing the SQL SELECT, and it leaves that work area open, but not necessarily SELECTed. Thus, if I start from scratch:

? USED("foo") && Returns .F.

If I now do:

SELECT * FROM foo INTO CURSOR temp

? USED("foo") && Returns .T.

And if I don't check and do:

USE foo

I'll get a "File is in use" error. At this point,

? ALIAS() && Returns "TEMP"

This little gotcha fools a lot of people.

Closing a single table is simple. Just issue:

USE

in the appropriate work area, or:

USE IN bar

to close a table in a different work area.

The command

CLOSE TABLES ALL

will close all tables open anywhere.

CLOSE DATABASES ALL

will close all tables associated with any database along with the databases.

There are more variations on this theme of CLOSE; see the Help for details.

Indexes

I talked about sorting a table in different field orders. In order to do this, you need to create indexes on the fields you want to sort by.

Indexing is a pretty complex subject; fortunately, VFP makes creating and using indexes pretty easy.

Brief historical note: Early on in xbase development, index data was stored in individual index files, one each for each field in each table you wanted to index. VFP can still do this, and the result will be a file with a .idx extension. There's no reason to do this today unless you're working with a legacy application.

Today the standard is to use compound index files. These store all the indexes you create on fields in a particular table in one .cdx file. Each of these indexes is a "tag", and the name of the tag is a name expression. There's an important distinction between structural and nonstructural compound index files. If a structural index file gets lost or damaged, you won't be able to open the table (at least, not without some repair work). A lost or damaged nonstructural compound index file won't affect the table directly. Structural index files are automatically opened and updated when you open and modify a table's structure or data. You have to manually open nonstructural files to make them update when data is modified, and you have to REINDEX them after any change to the table structure. It would appear that you can only do one or the other with a particular table; you can't have both kinds of compound index files on the same table even if you give them different names.

The basic syntax to create a structural compound index tag is:

USE [table name]

or

SELECT [table name]

INDEX ON [field name] TAG [tag name]

This automatically creates a structural compound index file with the same name as the table if none exists. If the index file already exists, it adds your new tag to it. Notice that the table must be open and SELECTed before you can create an index. However, the table doesn't have to contain any records in order to be indexed.

Notice also that you can give the tag a different name from the field. Unless you have a good reason to do so, you probably shouldn't because it will be harder to remember. However, there are very good reasons to do this. So this is a good point to talk about VFP's Rushmore optimization.

"Rushmore" is a proprietary technique to speed up data access on indexed tables. It was created early in the history of Foxpro, long before there was a Visual Foxpro, and long before Microsoft bought the Fox Software company. Supposedly, the name comes from the fact that the system is used to "rush more" data to the user faster. It's been alleged that the main reason Microsoft made this purchase was to get its hands on the Rushmore technology. Certainly, Microsoft has applied that technology to other products, including Access and, as alleged by some, to SQL Server and perhaps to its upcoming Longhorn OS's database-based file system. In other words, Rushmore pretty much remains unsurpassed in the entire database world as a data access speed enhancer.

In order to take full advantage of it, though, you need to have an index not only on the field you want to search for, but in many cases, on a massaged version of the data in that field. A common example would be searching for records for people whose last name is "VanDamme". You can't count on users to capitalize names properly, and it's really not possible to produce a foolproof procedure for correcting capitalization of names like this one. The standard answer is to convert the entire name to upper or lower case and match it like this:

? UPPER(namevalue1) == UPPER(namevalue2)

However, if you have a simple index tag on the lastname field and you look for this kind of a match, Rushmore won't fully kick in. So instead, do:

INDEX ON UPPER(lastname) TAG ULastName

Now your search for VanDamme will run much faster.

You can also use expressions when creating index tags for specialized purposes, like:

INDEX ON (pricepaid > 1000) TAG BigSpenders

or

INDEX ON UPPER(lastname) + TRANSFORM(birthdate) + phonenumber TAG CompoundKey

if you think you're likely to need that kind of thing. The VFP Help cautions that having a whole bunch of index tags in a .cdx, or having tags with a lot of embedded functions, will slow down some table operations signficantly, so don't do too much of this.

I need to emphasize that sometimes Rushmore can actually slow things down instead of speeding them up. This will be the case, for example, if you have an index on DELETED() in a table with a very large number of records, but only a few deleted records. As I said, indexing is complex, and I don't want to confuse the issue in a beginner-level document.

But most definitely, create indexes for fields you search on often. When used properly, Rushmore speeds up searches by a factor of 20 or more on big tables.

When "corruption" occurs, the most likely culprit is an index that's "out of synch" with its table. VFP has a REINDEX command that is supposed to fix these problems, but it rarely works. Instead, you should wipe out the problem index file with

DELETE TAG ALL

and then rebuild it, either in the Table Designer or programmatically. This will fix just about any index corruption issue. VFP has various functions and commands to let you create a generic routine to delete and recreate indexes on any table or set of tables. Writing such a routine will be well worth your time.

Sort Order

Once you have an index on a field, it's easy to sort the table using it. With the table open in the current work area, just:

SET ORDER TO lastname

Note that you're calling the tag name, not the field name. So

SET ORDER TO BigSpenders

will put all the rich customers at the top.

The SET ORDER persists only as long as the table is open. If you close the table, it reverts to its natural order.

SET ORDER is not equivalent to the SQL SELECT ORDER BY clause. The only way to sort a VFP table on more than one field at a time is to create a compound index tag on those fields like so:

INDEX ON UPPER(lastname) + UPPER(firstname) + UPPER(midinitial) + TRANSFORM(birthdate) TAG PeopleOrder

SET ORDER TO PeopleOrder

In contrast, you can sort a recordset returned by SQL SELECT simply by specifying field names:

SELECT...
...
ORDER BY lastname, firstname, midinitial, birthdate

You don't have to have indexes on any of these fields to sort the SQL SELECT recordset on them. But you do need indexes if you want the SQL SELECT engine to use Rushmore to speed up its tasks. And note that this only sorts the recordset you retrieved, not the underlying table.

Record Pointer

The .dbf table model has a concept called the "record pointer". Imagine you have a hand-written grocery list, and you're going up and down the aisles at the store, but you forgot to bring a pencil with you. You know the store well enough that you were able to write the list in the order in which the groceries are arranged along the aisles. Even so, without a pencil to cross items off, you have a hard time keeping track of what you've already gotten on the list. You try to keep your fingertip next to the last item you put into your cart, but you keep getting distracted and moving your finger. Finally, inspiration strikes and you dig a paperclip out of your pocket and slide it onto the paper right over the list item. This is how the .dbf record pointer works.

The pointer starts out on the first record in the table, in the table's natural order (the order in which records were added), when you open it. It stays there until a command is issued that moves it, and it stays wherever it's moved to until it's moved again. Note that it's "clipped" to the record itself, not the record's position in the table, which can change when the sort order is changed or when a record is added or deleted. For example:

TABLE FOO
________________________________________________
Names Dates Numbers
________________________________________________
-> Bob 12/23/56 666
Joe 04/01/04 22
Alice 11/22/63 430
________________________________________________

The table above is sorted in natural order, and I've just opened it. The -> is the record pointer. After I sort the table on the Names field, it looks like this:

TABLE FOO
________________________________________________
Names Dates Numbers
________________________________________________
Alice 11/22/63 430
-> Bob 12/23/56 666
Joe 04/01/04 22
________________________________________________

I didn't move the record pointer; in fact, it didn't move at all. It's still on record 1 in natural order, but the table is no longer sorted in natural order. If you have the same table open in more than one work area, the record pointer can be on different records in each one. Making assumptions about where the record pointer is at any given time is risky.

You'll recall I said I can locate any record in a .dbf table without knowing anything about its data. That's true. I do need to know the record's position in the table's natural order, however. For example, if I issue:

GOTO 1

in the table as sorted above, the record pointer doesn't move. That's because it's already on "Bob", which was the first record I put into the table. If I say

GOTO 3

in the table sorted on Names, I get this:

TABLE FOO
________________________________________________
Names Dates Numbers
________________________________________________
-> Alice 11/22/63 430
Bob 12/23/56 666
Joe 04/01/04 22
________________________________________________

The xbase function RECNO() tells me which record the pointer is on. In the above case, RECNO() = 3. Since I've only got one work area open (the one called foo), RECNO() gives the correct result for the foo table. If I had more than one work area open, RECNO() with no argument would tell me where the pointer was in the last work area in which I opened a table that is still open. In this case, I might want to ask specifically for RECNO("foo") or RECNO("bar").

RECNO() can be very useful in situations where you need to handle a particular record, move the record pointer to some other record in the same work area, and then go back to the record you were on, and you know that nothing's going to change in the meantime. HOWEVER, remember that a record's RECNO() value can change when records are added or deleted, so it's not a reliable permanent marker for a record. You should use primary keys for that.

Basic Movement

GOTO is one way to move the record pointer in a VFP table. But there's an even more elemental way than that: the SKIP command. SKIP, used alone, moves the record pointer to the next record "down" in the table in the current sort order. Very important: SKIP respects the current sort order, NOT the natural order. There are variations:

SKIP -1

moves the pointer "up" one record.

SKIP 3

moves the pointer to the third record after the record we started on.

The SKIP command should not be confused with the SET SKIP command (used with temporary relations between tables) or the SKIP FOR command (used with menu systems).

BOF and EOF

VFP table files have "beginnings" and "ends", but the concept is a bit squishy.

When you're at the beginning of a file that contains at least one record (even if the record contains no data), the record pointer will always be on the first record in natural order. At this time,

? BOF() && "Beginning of File"

returns .T.

If you try to

SKIP -1

at this point, you'll get an error message.

However, the last record in a table is NOT the end of the file. There is always a "phantom" empty record after the last record. This record really exists in some scenarios, but not in others. I suspect the idea here was to make it easy to use a very basic table browsing screen, circa 1985, to add a new record manually into a table by simply hitting the down arrow when you're at the bottom record. This still works in the dialog VFP provides to "Enter data now?" right after you create a new table in the Table Designer (but not in the more modern grid-based browse screen). However, it won't work programmatically. You can't put data into a new record with xbase commands until you APPEND BLANK. (INSERT INTO..., by definition, creates a new record; VFP does an implicit APPEND BLANK here, so you won't run into this issue). Either way, though, if you succeed in putting data into a new record, you'll find there's still a phantom blank record right after it. You can't ever "fill it up". So when you're on the last real record, you can still

SKIP

without error. After you do that,

? EOF() && "End of File"

returns .T.

If you issue SKIP again now, you'll get an error message.

In an empty table (one that contains no records that you intentionally created), the record pointer will be on the phantom record. At this point, both BOF() and EOF() will return .T.

Creating New Records

We've already touched on this topic above, and it's pretty straightforward, so I'll just review briefly:

To create a new record with xbase, the table must be open already. Then, either:

SELECT foo

APPEND BLANK

or

APPEND BLANK IN foo

You can use the SQL INSERT syntax also, as follows:

INSERT INTO foo (field1, field2, field3) VALUES ("value1", 3, CTOD("12/23/56"))

You can leave out the VALUES clause entirely to create a blank record. If you use the VALUES clause, you must provide a value for every field in the table. It's not enough to just use an extra comma to mark the missing value, you must supply an actual value for each field, even if it's an empty value. You might think this is because VFP works in natural field order and has no way of knowing which field gets what value if you skip one. If that were all there was to it, though, then using a comma as a placeholder should work. I guess it's just one of those things....

You can use INSERT INTO..., without first opening the table. As it does with SQL SELECT, VFP will open the table for you behind the scenes and leave it open after the INSERT, with the record pointer left on the new record. But the currently SELECTed work area will be whatever it was before you did the INSERT.

Editing Existing Records

You can use xbase commands to edit one record at a time or in a batch process. The basic command is:

REPLACE myfield WITH "Mike"

If you just do this, there must be an open table in the current work area that has a field called myfield. The replacement will occur in the record the record pointer is sitting on. Note that if you have two different tables that both have a "myfield" field, VFP doesn't care which one your data goes into; it goes into whichever one is in the current work area. So you can also

REPLACE myfield WITH "Mike" IN foo

The batch process is pretty cool. There are several options, and they are cumulative. To handle several fields at once:

REPLACE myfield WITH "Mike", yourfield WITH "You", hisfield WITH "him"

To handle several records at once, use the FOR "scope" operator:

REPLACE myfield WITH "Mike" FOR birthdate < DATE()

This will put "Mike" into the myfield field in every record for a person who wasn't born today.

To make sure you do the replacement on all records, you can do something like:

REPLACE myfield WITH "Mike" FOR UPPER(FIELD(1)) = "MYFIELD"

You can combine the several records at once concept with the FOR clause to do a whole bunch of edits with one command. There are more variations on this, but you get the idea. When all REPLACEs are done, the record pointer will be on the last record where a REPLACE occurred in the current sort order, or at EOF if no REPLACEs were successful.

You can also use SQL syntax to edit existing records like so:

UPDATE foo ;
SET myfield = "Mike" ;
SET yourfield = "You" ;
WHERE so and so

Deleting Records

In .dbf data systems, the "DELETE" command is a misnomer. DELETE doesn't remove records from a table, it just marks them as "deleted". You can unmark them with the RECALL command.

This has implications for data searches. If you don't SET DELETED ON for the current data session, or you don't use NOT DELETED() in a FOR or WHILE clause when searching or selecting records, then records marked for deletion will still be included in your results.

To actually remove the marked records, you have to open the table EXCLUSIVEly (that is, only one user) and issue the PACK command. Once a table is PACKed, you cannot retrieve the deleted records by any means. BACK UP your data.

As far as using the DELETE command, it doesn't get much simpler than this. MAKE SURE you have the proper table open in the current work area, move the record pointer to the record you want to delete, and say:

DELETE

In fact, deleting a record is such an important step that it's probably a good idea to do:

DELETE IN foo

whether you've previously selected the proper work area or not.

The record will be marked as deleted and the record pointer will remain on the deleted record. This has implications for user interfaces. You don't usually want the user to be able to see a record s/he has deleted, but if you don't do something to move the pointer and refresh the display, s/he will.

DELETE also uses scope operators, so you can:

DELETE NEXT 3

to delete the record the pointer is on and the two records after it in the current sort order. The pointer will end up on the last deleted record.

You can also

DELETE FOR lastname = "Hitler"

All records whose lastname fields contain "Hitler" will be marked for deletion, and the record pointer will be left on the last deleted record in the current sort order, or at EOF if no matching records were found.

Searching and Finding

*** Note to Mike Yearwood -

Since mostly what I wanted to do here was remove the errors you pointed out, and to reorganize what you added to fit the flow of the section without modifying your text, I couldn't think of a reasonable way to actually document those changes so you could see how it was before. I just did them. I've used bold + italics or the [NEW] tag to emphasize some additions I've made to improve accuracy as well. As always, this is a work in progress. Please do add/correct whatever you like. -- Ken Dibble

***

VFP provides several ways to find particular records. All of the xbase methods involve moving the record pointer at some point in the process, which has implications for data buffering, though that's a whole other kettle of fish.

The LOCATE and SQL SELECT commands don't require indexes on the fields you're searching for but usually work faster if you have them; the rest of the commands require indexes.

We'll talk about the xbase methods first.

LOCATE is the easiest one to use, because it doesn't require an index. It is also the most flexible. Many developers avoid using it because it's relatively slow. However, I ran several tests on a table containing 10,000 records and found that with an index on the search field, LOCATE isn't really much slower than the other xbase options. It's certainly a lot less finicky in its syntax, and it can be used when you don't know if there's an index on the search field or not.

LOCATE starts its search with the first record in the current index order. To start searching from the current record, include the REST clause. If you issue LOCATE with no other keywords, the pointer goes to the top record. -- Mike Yearwood

You can also do:

GO TOP

to accomplish the same thing. Some people say that's a bit slower than a naked LOCATE. With all but the biggest tables, the speed difference is not detectable by humans.

There are some situations, usually involving buffering, where you may need to move the pointer like this just to get the data in the cursor -- remember the cursor, the memory "image" of the table? -- written back to the table (though "written" is a somewhat squirrely concept in VFP owing to its dependence on the Windows OS to actually write to the disk). In any case, that's an intermediate-level topic.

The basic command is

LOCATE FOR myfield = "Bob"

If there's a record that matches the FOR condition, the record pointer will be positioned on it. If there's not, the pointer will be at EOF.

To find out if LOCATE actually found a match, use

? FOUND() && Returns .T. if the match was found

It's always wise to test FOUND() and provide some kind of alternate response if it's .F., ESPECIALLY if you're planning to modify or delete the record you expected to find.

You can search backwards with LOCATE by first changing the current index order to descending like this:

SET ORDER TO TAG lastname DESCENDING

and then doing

LOCATE FOR UPPER(lastname) = "VAN DAMME"
-- Mike Yearwood

Unlike the other xbase search commands, you can use LOCATE to create extremely specific search criteria:

LOCATE FOR NOT DELETED() AND ALLTRIM(Firstname) == "Bob" AND LEFT(Lastname,3) == "Smi" ;
AND LEN(ALLTRIM(Lastname)) >= 5 AND NOT UPPER(SUBSTR(Lastname,4,2)) == "TH"

That'll return Bob Smirk and Bob Smidgen but not Bob Smith or Bobbi Smirk.

Unlike other VFP data manipulation commands, LOCATE does not support an IN [work area] clause. You must have the table open in the current work area where you want to search.

You can find a batch of records and process them with LOCATE, like so:

DO WHILE NOT EOF()
LOCATE FOR birthdate < DATE()

IF FOUND()
* Do something here that changes the value of birthdate, or you'll be in a permanent continuous loop.
ENDIF
ENDDO

But this is pretty archaic code; it's easier and faster to use VFP's SCAN....ENDSCAN construct.

SEEK is the fastest way to find a particular record in VFP, but it's limited in how it can be used. You can only search for matches in fields that have indexes, and you can only work with one index tag at a time. It doesn't accept any scope operators. So the only way to do a specific search like the one for Bob Smi... above is to first create an index on all those conditions. It's not the sort of thing you're likely to plan in advance, is it? SEEK starts searching from the first record in the current index order, but unlike LOCATE, you can tell it to search "up" instead of "down" right in the command with the DESCENDING clause. SEEK will leave the pointer either on the found record or at EOF.

SELECT foo

SEEK "Mike" ORDER firstname DESCENDING

or

SELECT foo

SET ORDER TO firstname

SEEK "Mike"

or

SEEK "Mike" ORDER firstname IN "foo"

These will all set FOUND() to .T. when the first match is found.

SEEK always confuses me. The index tag must be preceded by "ORDER". The VFP Help says you can use "TAG" instead, but I get an error message whenever I try it. The tag name must be a name expression, but if you specifiy the work area, it must be as a character expression.

SEEK() is much the same as SEEK. However, it's a function that returns .T. if the record is found, and .F. if it isn't. So, you don't have to check FOUND() after using it.

SELECT foo

SET ORDER TO firstname

? SEEK("Mike")

or

? SEEK("Mike","foo","firstname")

The VFP help implies that the second and third arguments are both optional, but in fact it won't work if you try to specify the tag without also specifying the work area, even if you do put in the usual comma placeholders.

INDEXSEEK() is interesting. Nearly identical in use to SEEK(), it does not move the record pointer to EOF if it doesn't find a match, and it can be set to not move the pointer even if it does find one. The Help says that it doesn't move the record pointer at all if it doesn't find a match, but that's not true, as you will find out if you use Row Buffering and expect INDEXSEEK() to prevent a forced update to the record you started on. The truth seems to be that VFP moves the record pointer while checking each record for a match but returns it to the record where you started if it doesn't find one or if you told it not to move the pointer. Some people have also reported that INDEXSEEK() will not always find newly-added records in a table buffering scenario unless the pointer is moved before calling the function--which obviates the whole point. For these reasons, I consider INDEXSEEK() to be a good idea that doesn't work well, and I'd leave it alone if I were you.

SQL SELECT assumes a "set based" system even though VFP is "record based". While you can use it to do some very complex stuff, arguably it's not nearly as easy to do so as with xbase commands.

The basic form is

SELECT * FROM foo

If any records are found that match your request, the VFP system variable _TALLY will contain a count of those records.

If you just go this far, you'll select all the fields in all the records in the table foo into a recordset which will be immediately displayed in a VFP BROWSE window. When you close the BROWSE, your recordset is gone.

To prevent the BROWSE and keep the recordset for processing, put it into something.

SELECT * FROM foo INTO CURSOR temp

or

SELECT * FROM foo INTO ARRAY aMyArray

Now there will be no BROWSE. If you use a cursor the recordset will persist until you close the cursor. If you use an array, it persists until the array goes out of scope.

Getting all the fields in all the records isn't always useful, so you can narrow things down:

SELECT firstname, lastname FROM foo INTO CURSOR temp WHERE UPPER(firstname) == "BOB"

As you'd expect, you now have a cursor that contains only first names and last names in records where the first name is "Bob".

As I mentioned earlier, you can also use the ORDER BY clause to sort the resulting dataset.

SQL SELECT does not move the record pointer. It does open the table if it's not already open and leaves it open, but not necessarily SELECTed, when it's done.

SQL SELECT is much more complex and powerful than I've described. In fact, it's practically a whole language in itself. Becoming proficient in it is not a trivial task, but it is a very valuable skill that can be carried forward into all areas of database development in all languages with all systems.


Contributors: Ken Dibble Mike Yearwood
Category Learning VFP
( Topic last updated: 2004.03.27 01:29:33 PM )