Wiki Home

Int Vs Chr Speed


Namespace: VFP
Here is a program to compare integer and character based keys using VFP. First, a few tests just to be sure that VFP can compare numbers quicker than strings. Next, compare the performance of populating tables and running a query that joins two tables using integer key fields and then character key fields.

Comments:

It doesn't matter if the numbers and strings being compared are the same or different, the times are about the same. I'm sure that larger strings take longer, but the times were the same for 1 character vs 10.

The queries focus on the performance difference between integers and characters. If more supplemental data was included, it would increase both times by a constant, which would reduce the relative difference; but the tests support that integers are twice as fast as characters, so what ever portion of a query is spent on the join, that is the part that is affected by data type.

I tried using tables that were considerably larger that the amount of ram, but the times were well over 3 hours a piece, with integer still coming out ahead.

Inside the DBF an integer is stored as a normal Intel memory architecture 4 byte value. When you create an index on an integer field, the value stored in the CDX file is actually a 4 byte BINTOC() character value.

Please note: the BINTOC() 4 byte string is not a 'standard' C/C++ signed or unsigned 4 byte integer - it is in fact a bizarre variant on this. To prove this to yourself:

? BINTOC(0) = REPLICATE(CHR(0),4)  && UINT/INT compare
? BINTOC(0) = CHR(128) + REPLICATE(CHR(0),3)  && whut BINTOC() returns
? BINTOC(1) = CHR(1) + REPLICATE(CHR(0),3)  && again, UINT or INT
? BINTOC(1) = CHR(128) + REPLICATE(CHR(0),2) + CHR(1)  && what BINTOC() says
? BINTOC(-1) = REPLICATE(CHR(255),4)
? BINTOC(-1) = CHR(127) + REPLICATE(CHR(255),3)


Ed Rauh


Please run this on your own machine and post the results. (The program puts the text onto the clipboard). I would hate it if this was just a peculiar side effect of my systems. It will also give us an idea of the effect different components of a system have on performance.


#DEFINE NUMCMPS 1000000
#DEFINE PNTRECS 1000
#DEFINE CLDRECS 6000000
#DEFINE CHRKEYSIZE 10

do linit

? "Comparing Numbers"
lTime( "lCmp(0, 0)")
?
? "Comparing Strings"
lTime( "lCmp('1234567890', '1234567890')" )
?
? "Using Integer Keys"
lTime( "lMkDataIntKeys()" )
lTime( "lTestQuery()" )
?
? "Using Char Keys"
lTime( "lMkDataCharKeys()" )
lTime( "lTestQuery()" )

do lCleanUp

return

function lCmp( tnA, tnB )
local ;
	lnCtr, ;
	llX
	
for lnCtr = 1 to NUMCMPS
	llX = tnA = tnB
endfor

return	


function lMkDataIntKeys()

local ;
	lnCtr

create database Test

create table TblA ;
	( pkTbla i primary key )
	
create table TblB ;
	( pkTblb i primary key, ;
	fkTbla i references Tbla )
	
for lnCtr = 1 to PNTRECS
	insert into Tbla values ( lnCtr )
endfor

for lnCtr = 1 to CLDRECS
	insert into Tblb values ( lnCtr, lnCtr%PNTRECS+1 )
endfor

close databases

return
	
function lMkDataCharKeys()

create database Test
local ;
	lnCtr
	
create table TblA ;
	( pkTbla c(CHRKEYSIZE) primary key )
	
create table TblB ;
	( pkTblb c(CHRKEYSIZE) primary key, ;
	fkTbla c(CHRKEYSIZE) references Tbla )
	
for lnCtr = 1 to PNTRECS
	insert into Tbla values ;
		( str( lnCtr, CHRKEYSIZE ) )
endfor

for lnCtr = 1 to CLDRECS
	insert into Tblb values ;
		( str( lnCtr, CHRKEYSIZE ), str( lnCtr%PNTRECS+1, CHRKEYSIZE) )
endfor

close databases

return

function lFileSize()
local ;
	laDir(1), ;
	lnCtr
	
adir( laDir, "Tbl?.*" )
for lnCtr= 1 to alen( laDir, 1 )
	? laDir(lnCtr,1), trans( laDir(lnCtr,2), "999,999,999,999" )
endfor
return
function lTestQuery

open database test

select * ;
	from TblB;
		join TblA on fkTbla = pkTbla ;
	into cursor Temp

close databases
return

function lInit
clear
set escape on
set safety off
close data all
close database

set alternate to temp
set alternate on

return


function lCleanUp
set alternate to
_clipText = FileToStr( "temp.txt" )

close data all
open database test
drop table TblB
drop table TblA
close database


function lTime( tcFunc )
local ;
	lnSrt, ;
	lnEnd, ;
	lnDlt
	
? tcFunc
lnSrt = seconds()
&tcFunc
lnEnd = seconds()
lnDlt = lnEnd - lnSrt
?? lShowTim( lnDlt )
return lnDlt

function lShowTim( tcSec )
local ;
	lcRet

do case
	case tcSec < 60
		lcRet = " " + trans( round( tcSec, 2 ) ) + " Secconds"
	case tcSec < 60^2
		lcRet = " " + trans( round( tcSec/60, 2 ) ) + " Minutes"
	otherwise
		lcRet = " " + trans( round( tcSec/60^2, 2 ) ) + " Hours"
endcase

return lcRet

Results

File Sizes:

Using Integer Keys
TBLA.DBF 5,329
TBLA.CDX 7,680
TBLB.DBF 54,000,361
TBLB.CDX 84,470,272

Using Char Keys
TBLA.DBF 11,329
TBLA.CDX 8,192
TBLB.DBF 126,000,361
TBLB.CDX 89,162,240

PII-450, 128Mb, Local files, IDE drive.

Comparing Numbers
lCmp(0, 0) 2.3 Secconds

Comparing Strings...
lCmp('1234567890', '1234567890') 2.94 Secconds

Using Integer Keys
lMkDataIntKeys() 16.64 Minutes
lTestQuery() 4.12 Minutes

Using Char Keys
lMkDataCharKeys() 16.02 Minutes
lTestQuery() 5.03 Minutes

PI-266, 96Mb, 10Mb ethernet, Server: NT, 2xPPro-150, scsiIII drive.

Comparing Numbers, big and small, equal and not
lCmp(0, 0) 4.82 Secconds

Comparing Strings...
lCmp('1234567890', '1234567890') 5.92 Secconds

Using Integer Keys
lMkDataIntKeys() 35.81 Minutes
lTestQuery() 14.98 Minutes

Using Char Keys
lMkDataCharKeys() 42.64 Minutes
lTestQuery() 28.63 Minutes

I suppose you have to compare similar types - i.e. same size fields - so change (10) to C(4) NOCPTRANS and fill it with binary data (BINTOC() for example) - but don't "convert" them while compare - of course extra function call will reduce performance
-- Igor Korolyov

Something you may need to bear in mind is that Fox does not work with big integers. I recently populated a table with (up to) 18 digit integer keys from an external system. Unfortunately for any operations on integers, any digits after the 16th were forced to 0. Hence data was lost, primary keys were violated, records couldn't be found etc.
I changed it to store character data with leading zeroes which doesn't have this problem.

contributors: Carl Karsten
Category Performance Category Code Samples
( Topic last updated: 2006.01.31 05:12:23 AM )