Investigate the overheads of function calls
Author: Arthur Pool (see EMail Addresses)
(Statements in green color [like this] are comments from Bernd Schemmer)
(Use the included REXX program
to run the test described in this text on your PC)
In the OS/2 operating system, there are several different ways a REXX program
can call a function coded in REXX, each of which has implications for both
performance and maintainability.
These options are discussed (briefly) in the WARP
REXX on-line information (at least, in WARP Connect V3), which says
(inter alia):
"... internal labels take first precedence, then built-in functions, and
finally external functions. External functions and subroutines have a system-defined
search order. REXX searches for external functions in the following order:
- Functions that have been loaded into the macrospace
for pre-order execution
- Functions that are part of a function package
- REXX functions in the current directory, with
the current extension
- REXX functions along environment PATH, with the
current extension
- REXX functions in the current directory, with
the default extension
- REXX functions along environment PATH, with the
default extension
- Functions that have been loaded into the macrospace
for post-order execution.
"
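The first part of that rule (internal labels before built-in functions) can be seen in a few lines. This is an illustrative sketch only: the label name is chosen for the demonstration, and it relies on the standard REXX rule that a function name written as a quoted string skips the search for internal labels.

```rexx
/* Sketch: an internal label takes precedence over a built-in function. */
say length('abc')      /* calls the internal label LENGTH below          */
say 'LENGTH'('abc')    /* quoted name: internal labels are skipped, so   */
                       /* the built-in LENGTH function is called         */
exit 0

length:                /* internal routine that shadows the built-in     */
  return 'internal'
```

The same quoting technique can be used to force a call to an external or MacroSpace function when the calling program happens to contain a label with the same name.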
In practice then, the options, and their implications, are:
[1] Include the source code for the function in the primary source file.
This should provide the best performance. However, if the function is to
be called by more than one primary REXX program, this approach
is undesirable in that it requires duplication of code, with consequent
maintenance problems.
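A minimal sketch of approach [1], with the kind of timing loop that could produce the measurements shown later in this text. The routine name Power_Of_Self and the loop layout are assumptions made for illustration; the included test program may be organized differently.

```rexx
/* Sketch: time COUNT calls to a function held in the same source file. */
parse arg count .
if count = '' then count = 255          /* same call count as the tests  */

call time 'R'                           /* reset the elapsed-time clock  */
do i = 1 to count
  x = Power_Of_Self(i)                  /* resolved as an internal label */
end
say '[1] function in the source program:' format(time('E'), , 2)
exit 0

Power_Of_Self: procedure                /* hypothetical routine name     */
  return arg(1)**arg(1)
```

Because the label is found before any built-in or external search takes place, no disk or MacroSpace lookup is involved in this case.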
[2] Load the function into the REXX MacroSpace. This can
be achieved using the RexxAddMacro call from C,
or using the RxAddMacro function provided in the RXU
utility package. (I understand that WARP 4 (Merlin) provides a similar
function in the REXXUTIL package, but I have no experience with that. For
information about the REXXUTIL DLL included in Object REXX, see
"New REXXUTIL functions in Object REXX", "The functions to work
on the macro space", and LoadMac.cmd.) This
approach is more or less equivalent to the EXECLOAD facility
in VM/CMS. It should provide performance somewhere between the
preceding and following approaches. It does, however, have some management
costs:
- you have to explicitly load the function into
the macrospace - if you don't, you'll simply execute the copy from disk,
which will be much slower;
- you have to also ensure that the macrospace copy
is unloaded and reloaded whenever the function's source file is modified
- if you don't you'll be executing an out-of-date copy.
As noted above, functions can be loaded in either of two ways:
- Loaded into the MacroSpace for pre-order execution
(executed before disk-based files) - this should produce better performance
than using disk-based functions.
- Loaded into the MacroSpace for post-order execution
(executed after disk-based files) - this should produce worse performance
than using disk-based functions (below).
However, it is worthwhile to consider the meaning of current extension
and default extension. When a function is loaded into
the MacroSpace, it can be loaded with an extension (eg, .CMD)
or without an extension. In these tests, as the
primary REXX file had an extension of .CMD, it appears that
the current extension is .CMD. We therefore
measured the performance of functions loaded into the MacroSpace (and called)
with an extension of .CMD and also without an extension.
Note however that in general one would prefer to load functions without extension
so that they are equally accessible to REXX programs with any extension
- whether called from REXX command files (.CMD), THE macros (.THE), or from
other environments. It's also cumbersome to have to specify the extension
when invoking the macro.
We measured 4 sub-cases:
- [2a] MacroSpace function, pre-order, .CMD extension;
- [2b] MacroSpace function, pre-order, no extension;
- [2c] MacroSpace function, post-order, .CMD extension;
- [2d] MacroSpace function, post-order, no extension.
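The load, call and unload cycle described for approach [2] might look like the sketch below. It assumes the REXXUTIL DLL shipped with Object REXX, which provides SysAddRexxMacro and SysDropRexxMacro; with classic REXX the RxAddMacro function from the RXU package plays the same role but is called differently. The names MYFUNC and MyFunc.cmd are hypothetical.

```rexx
/* Sketch: load a function into the MacroSpace, call it, then drop it.  */
/* Assumes the Object REXX version of REXXUTIL; MYFUNC and MyFunc.cmd   */
/* are hypothetical names used only for this illustration.              */
call RxFuncAdd 'SysLoadFuncs', 'REXXUTIL', 'SysLoadFuncs'
call SysLoadFuncs

/* 'B' = search before disk files (pre-order), 'A' = after (post-order) */
addRc = SysAddRexxMacro('MYFUNC', 'MyFunc.cmd', 'B')
if addRc <> 0 then do
  say 'SysAddRexxMacro failed, return code' addRc
  exit 1
end

say MyFunc(3)                    /* found in the MacroSpace, not on disk */

/* Drop the macro again so that a later change to MyFunc.cmd is not     */
/* hidden by a stale copy in the MacroSpace.                            */
call SysDropRexxMacro 'MYFUNC'
exit 0
```

Loading the same file under the name MYFUNC.CMD instead would correspond to the ".CMD extension" sub-cases; the call then has to use that name as well.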
[3] Leave the source code as a separate file, which is invoked anew
for each call to the function. Although ideal in terms of maintenance, this
approach will not produce good performance. In practice, disk caching will
reduce the impact of all function calls after the first. Also, the performance
will be affected by whether the source file is in the current directory,
or how far the system has to search along the PATH string before it finds
the source file. We therefore test these sub-categories:
- [3a] source file in the current directory;
- [3b] source file in a directory at the start of the PATH string;
- [3c] source file in a directory at the end of the PATH string;
- [3d] source file in a directory at the end of the PATH string,
without EAs. (See below for the rationale for test [3d].)
Following are some measurements of elapsed time (in seconds) for 255 function
calls using these various approaches (use the included test
program to run these tests on your PC).
[C:\Usr\AFP\SW\Testing]REXX_Function_Call_Performance 255
[1] function in the source program: 0.88
[2a] MacroSpace function, pre-order, .CMD extension: 2.06
[2b] MacroSpace function, pre-order, no extension: 2.13
[2c] MacroSpace function, post-order, .CMD extension: 91.09
[2d] MacroSpace function, post-order, no extension: 180.87
[3a] function in an external source file - CURRENT directory: 10.28
[3b] function in an external source file - START of PATH: 12.66
[3c] function in an external source file - END of PATH: 55.25
[3d] function in an external source file - END of PATH, no EAs: 42.12
[C:\Usr\AFP\SW\Testing]
Notes:
1) The function used was:
REXX_Function_Call_Performance_1:
return arg(1)**arg(1)
2) These measurements were made on an 80486-DX4 running OS/2 Warp Connect (Blue Box)
with no service applied, using (obviously) HPFS.
(Results on a P133 with 32 MB RAM and OS/2 WARP 4 with Fixpack #7 and Object
REXX with HPFS:
D:\...\DEVELOP\REXX\FWTOOLS\REXXTT\Test>REXX_Function_Call_Performance 255
[1] function in the source program: 0.15
[2a] MacroSpace function, pre-order, .CMD extension: 0.48
[2b] MacroSpace function, pre-order, no extension: 0.49
[2c] MacroSpace function, post-order, .CMD extension: 4.70
[2d] MacroSpace function, post-order, no extension: 11.77
[3a] function in an external source file - CURRENT directory: 2.12
[3b] function in an external source file - START of PATH: 2.53
[3c] function in an external source file - END of PATH: 5.82
[3d] function in an external source file - END of PATH, no EAs: 7.75
)
(Results on a P266 with 160 MB RAM and OS/2 WARP 4 with Fixpack #12 and
Object REXX with HPFS:
D:\...\DEVELOP\REXX\FWTOOLS\REXXTT\Test>REXX_Function_Call_Performance.CMD 255
[1] function in the source program: 0.04
[2a] MacroSpace function, pre-order, .CMD extension: 0.17
[2b] MacroSpace function, pre-order, no extension: 0.17
[2c] MacroSpace function, post-order, .CMD extension: 2.46
[2d] MacroSpace function, post-order, no extension: 6.61
[3a] function in an external source file - CURRENT directory: 0.81
[3b] function in an external source file - START of PATH: 1.08
[3c] function in an external source file - END of PATH: 2.98
[3d] function in an external source file - END of PATH, no EAs: 3.46
)
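To compare the approaches independently of absolute machine speed, the totals can be converted into a cost per call and a ratio relative to test [1]. The sketch below does this for the 80486-DX4 figures; only the numbers are taken from the measurements above, and the variable names are illustrative.

```rexx
/* Sketch: per-call cost and slowdown relative to test [1], using the   */
/* 80486-DX4 figures quoted above (elapsed seconds for 255 calls).      */
calls  = 255
labels = '1 2a 2b 2c 2d 3a 3b 3c 3d'
times  = '0.88 2.06 2.13 91.09 180.87 10.28 12.66 55.25 42.12'
base   = word(times, 1)                 /* test [1] is the reference    */

do i = 1 to words(labels)
  t = word(times, i)
  msPerCall = format(t / calls * 1000, 4, 2)   /* milliseconds per call */
  ratio     = format(t / base, 4, 1)           /* multiple of test [1]  */
  say left('[' || word(labels, i) || ']', 5) msPerCall 'ms/call' ratio 'times [1]'
end
exit 0
```

Substituting the P133 or P266 figures in the times string gives the corresponding comparison for those machines.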
Conclusions:
- As expected, including the function in the primary
source file was fastest.
- Loading the function in the MacroSpace for pre-order
execution is slower than including the function in the primary program,
but much faster than invoking from a disk file.
- Loading the function in the MacroSpace with an
extension of .CMD is faster than loading without any extension,
most notably when loaded for post-order execution, presumably because the
first search is for macros with a .CMD extension. Note however
that to achieve this performance, the extension must be specified both when
the file is loaded and when the function is invoked - somewhat cumbersome,
and probably of marginal benefit in the case of pre-order execution except
in the most extreme cases.
- When loading automatically from a disk file,
the position of the function's source file in the PATH string can have a
significant effect on performance - and the presence of network drives in
the PATH string is likely to exacerbate the effect.
- Loading in the MacroSpace for post-order execution
is very slow. This slow performance can be partly explained because the
system searches every directory in the search path (twice if the function
is loaded without any extension), fails to find the function as a separate
external file, and then finally looks in the MacroSpace (where it finds
the function).
- Even allowing for the preceding item, note that
the performance of functions loaded in the MacroSpace for post-order execution
is much slower in all cases than those loaded from disk (even when the source
file's directory is at the end of the PATH string).
Why is this?
I thought that perhaps it was because the version loaded into the MacroSpace
does not include the semi-compiled version which REXX normally stores in
the Extended Attributes (EAs), but test [3d] shows that this alone does
not explain the poor performance of tests [2c] and [2d].
(The macro space contains only the tokenized code.)
- Test [3d] (see above) strips the EAs from the
function's source file and makes it read-only (which prevents REXX from
attaching the semi-compiled form to the source file) to compare the performance
of an external source file without the benefit of the semi-compiled form.
Surprisingly, this version is faster than test [3c]! Unlikely though it
seems, the semi-compiled form appears to be of no benefit in this test!
Perhaps because the external function is quite small and relatively simple,
the additional overhead of accessing and loading
the EAs outweighs the benefit of the semi-compiled form?
(I think the answer to this question is Yes. In addition, this also depends
on the processor and hard disk used; see the results for the P133 above.)
In any case, the poor performance of tests [2c] and [2d] remains a puzzle.
- Loading in the MacroSpace for pre-order execution
therefore appears to be generally a good compromise - fairly good performance,
easy to maintain, but has some management overheads.
History:
- Sent to THElist 1998-03-16
- Updated copy sent to THElist 1998-08-03