Berkeley Multimedia Research Center
Published: September 1994
Berkeley, CA
USA
http://www.bmrc.berkeley.edu 

Building Common Lisp Applications with Reasonable Performance

Abstract
1. Introduction
2. Baseline Application
3. Vendor Independent Optimizations
3.1 Time and Space Profiling and Type Declarations
3.2 Attach/Detach
3.3 CLOS method dispatch compilation
3.4 Optimization perfromance tests

4. Vendor Specific Optimizations
4.1 Allegro Common Lisp
4.2 Lucid Common Lisp

5. Discussion
6. Conclusions
Acknowledgements
References 

Building Common Lisp Applications
with Reasonable Performance*

John Boreczky
Lawrence A. Rowe
Computer Science Division-EECS
University of California
Berkeley, CA 94720
 

Abstract

1. Introduction
Back to Top

 This paper describes our experience with a number of methods that can be used to improve the performance of Common Lisp applications. These methods include vendor-independent optimizations such as recoding high traffic functions and vendor-dependent optimizations such as incremental loading. The results of these optimizations are quantified and compared.

 


1. Introduction

Abstract
2. Baseline Application
Back to Top

 

Over the past several years we have developed several applications in Common Lisp (CL) [Ste90] including a graphical user interface toolkit [RKS91], an end-user browser for a CIM data- base [SmR92], and a hypermedia system and courseware [BeR91,ScR92a]. It is well-known that CL is an excellent prototyping environment, however, the CPU and memory resources required by CL applications have prevented them from being used in a production environment [Gab91, LaR91]. CL vendors have invested significant resources to produce compiler improvements and tools to reduce the resources required.
This paper describes the use of tools produced by two CL vendors, namely Franz Inc. (Allegro CL [FrI91a]) and Lucid Inc. (Lucid CL [LuI90]), to productize a moderately complex semiconduc- tor manufacturing application. A variety of different optimizations, including vendor-independent and vendor-dependent optimizations, were applied to improve resource utilization. The results of these improvements are quantified and compared.
The sample application used in this experiment is CIMTOOL [SmR92] which provides a graph- ical user interface (GUI) to a semiconductor manufacturing CIM database that contains data about the fabrication facility (e.g., floor plan, equipment, utility lines, etc.), equipment (e.g., status, trou- ble reports, processing history, etc.), and work-in-progress (e.g., lot status and history). It also dis- plays data captured from real-time sensors and images and videos that describe the facility, equipment, and its operation. Figure 1 shows a screen dump with CIMTOOL running that displays the floorplan, an equipment image, a query selection window, and status windows for equipment and utility lines.

 

 Figure 1: CIMTOOL screendump
CIMTOOL is written almost entirely in CL using the Picasso GUI development environment also produced at Berkeley [RKS91, KoR91]. Picasso provides an application framework and an X Window System [Sch91] interface toolkit. CIMTOOL and Picasso make extensive use of the Com- mon Lisp Object System (CLOS) [Kee88].
The vendor-independent optimizations that we applied to the code included: 1) recoding high traffic functions, 2) compiling the CLOS method dispatch code, and 3) implementing a dump system to reduce interface and application object creation. The vendor-specific optimizations that we applied included: 1) dynamic loading of compiled functions supported by Allegro CL and 2) reducing the working set size supported by Lucid CL. Applying these optimizations reduced startup time by 50% and execution time by 25% on a representative task. Memory requirements increased slightly for Lucid CL because we were not able to use their Treeshaker technology to remove unused code for reasons discussed below. Memory requirements were reduced 20% for Allegro CL. However, in the best case, CIMTOOL required 19 MB of memory which is impractical for deployment to a user community that might run 10 concurrent copies of the application on X terminals connected to a shared server.
The remainder of this paper describes the experiment in more detail. It is organized as follows. Section 2 describes CIMTOOL, the Lisp systems, and a baseline benchmark before optimization. Section 3 describes the vendor independent optimizations, and Section 4 describes the vendor specific optimizations. Sections 3 and 4 also describe the effects of performing these optimizations. Section 5 discusses our experiences performing this experiment. Section 6 concludes the paper.


2. Baseline Application

1. Introduction
3. Vendor Independent Optimizations
Back to Top

 

Picasso is a GUI development system that provides an application framework and an interface toolkit. The application framework provides high-level abstractions such as dialog boxes, panels and forms for interacting with data, frames for presenting the top level application interface (a form and a menu bar), and tools for collecting the various interface abstractions into a coherent application. The interface toolkit provides an extensive collection of widgets including: buttons, scrollbars, text editors, graphic browsers, tables and video widgets.
Picasso includes a simple constraint system to bind variables to other variables or interface objects, several parameter passing/variable passing options, a large assortment of geometry managers, and interfaces to Postgres and Ingres relational databases.
The design and implementation of Picasso required over 10 man-years of work over a 5 year period by experienced programmers who had limited experience with Lisp when they started. The framework and toolkit contain roughly 35000 lines of code, while a typical Picasso application requires a few thousand lines of code (e.g., CIMTOOL is about 3000). While writing Picasso, the emphasis was on rapid development and high functionality with a hope that execution speed would be reasonable. The desire to use Picasso to deploy multimedia and semiconductor manufacturing applications in the Berkeley Microfabrication Laboratory [VoK89] required us to improve performance which led to the work on attach/detach and the other optimizations described in this paper.
A typical Picasso application is composed of several Lisp files that describe the interface objects (e.g., non-modal panels and frames and modal dialog boxes) and files that contain support functions for the application. The support functions are usually compiled and loaded with the application. Each interface description file contains one or more macro calls that create Picasso interface objects which are represented by CLOS objects. Creation of Picasso interface objects is slow because they require a large number of "call-next-method" calls and CLX creation calls to initialize the object and all of the widgets it uses. The constraint mechanism is used to propagate changes to a program variable to other variables or to the display. The Lisp compiler is used to compile to handle the constraint code.
CIMTOOL provides a GUI interface to specify ad hoc queries to a database. The main memory data structure for the facility floorplan including walls and icons to represent equipment requires about 40 KB of memory. Picasso interface objects are created as needed for windows to display information.
The benchmark task used for performance measurement is composed of 35 simple actions that involved selecting data items using a mouse, performing queries on those items, and viewing data graphically and textually. This task was a representative usage of the application, although images and video were not displayed.
Two commercial Common Lisp implementations were used: Allegro CL 4.0.1 and Lucid CL 4.0.1+. All tests were run on a SPARCstation 1+ with 40 megabytes of physical memory and 55 megabytes of swap space running Sun OS 4.1.1. Relatively little paging was done because of the physical memory. Attempts to run any version of CIMTOOL, including those heavily optimized, on machines with less physical memory (e.g., 16 MB or 24 MB) were much too slow; startup time was greater than 5 minutes and execution time was 3-4 minutes for the benchmark. We originally used CLX 11R4 under both Lisp implementations, but CLX R5 fixed a number of bugs under Lucid CL and had no appreciable benefit under Allegro CL, so our tests were run with CLX 11R4 in Alle- gro CL and CLX 11R5 in Lucid CL. All Lisp source files were compiled with optimization settings of safety 1, space 1, debug 1, speed 2, and compilation-speed 0. All executable images and the CIM- TOOL files were NFS mounted across a local Ethernet.
Allegro CL partitions dynamic memory into old space and new space. Text space is a third area of memory used to hold static data. New space is divided into two equal-sized pieces, called half- or semi-spaces, to allow stop-and-copy garbage collection. New space is used to store newly created objects. When an object in new space has survived a fixed number of garbage collections, called generations, it is placed in old space (i.e., tenured). The sizes of these areas are set when the base Lisp image is created, although additional old spaces can be created if needed. The time Allegro CL spends garbage collecting can be reduced by allocating more memory. In contrast, allocating more memory has little effect on the performance of the Lucid CL garbage collector. We created our Allegro images with 14 MB of old space and 11 MB of new space (5.5 MB per semi-space). This space is in the virtual address space for the image but not necessarily in the working set when an application is run. This amount of memory is more than is necessary, but it reduces the number of garbage collections significantly. Since an object in new space is processed multiple times before it is tenured, reducing the number of garbage collections improves performance, even though each garbage collection takes longer. The 14 megabytes of old space is more than sufficient for any objects tenured during execution (much less is needed when using the vendor-specific optimizations as described in Section 4.1).
The Lucid memory organization is more complicated and more flexible. Generational garbage collection is implemented with a hierarchy of dynamic spaces called ephemeral levels. Objects that survive a garbage collection are moved into the next ephemeral level. There are separate areas for read-only information (i.e., code), static data, temporary data, and foreign data in addition to the ephemeral levels and the two semi-spaces for dynamic objects. Ephemeral level 0 is where new objects are created. When it is filled, a garbage collection is done and any surviving objects are moved to ephemeral level 1. In the default setup, level 0 is 512 KB, and levels 1 and 2 are 640 KB. Objects that survive level 2 move to the current dynamic semi-space, which is expanded as needed. Consequently, Lucid CL garbage collection results in shorter but more frequent garbage collections than Allegro CL. It is possible to select the size and number of ephemeral levels in addition to the size of the dynamic semi-spaces, which allows the memory setup to be tuned for the needs of an application. We used an image with 24 MB of total space for Lucid CL.
Given the substantial differences between the memory organization of these two implementations and the ability to add memory to an image, it is difficult to determine the size of a running program. The method used in this paper is to determine the process size from the UNIX process status command (ps) and to subtract the unused portion of old space for Allegro CL and dynamic space for Lucid CL. The new/ephemeral spaces were untouched because changing their sizes would affect performance. Thus, our reported execution size is the run time size of the smallest images that could run with the reported performance. It would be unrealistic to run these images because there is no room to tenure objects, but they serve as a useful means of comparison. We investigated and rejected other possible measures of process size, such as number of bytes consed, number of page faults, and resident set size, because they did not reflect the real memory required to run the application.
The load times reported are the elapsed time required to load the application from disk and initialize it. Execution time is the time elapsed from invoking CIMTOOL until the Lisp prompt returns after completing the test and quitting the CIMTOOL application. All times reported are elapsed times to the nearest second. We report elapsed times rather than CPU times because we want to measure user's perceived performance of the application. In addition, the application is not CPU bound, it is memory and I/O bound, and CPUs are getting faster more rapidly than memory and I/O bandwidth are increasing.
To provide a sense of perspective and establish a baseline performance, tables 1 and 2 show the measurements of the base CL implementations and the effect of adding Picasso and CIMTOOL code. The load times for CIMTOOL are large because of the time needed to build the main memory database. We report both disk size and execution size of the application although users typically care more about execution size. Although specific details are different between the two implementations, overall performance is quite similar.
Optimization  Load Time (sec)  Disk Size (KB)  Execute Size (KB)
Allegro CL + CLX + CLOS  16  5672  16347
+ Picasso  49  11616  22349
+ CIMTOOL  158  11602  24132
Table 1: Allegro Common Lisp Baseline
Optimization  Load Time (sec)  Disk Size (KB)  Execute Size (KB)
Lucid CL + CLX + CLOS  19  10048  15432
+ Picasso  40  14656  22096
+ CIMTOOL  136  14648  26340
Table 2: Lucid Common Lisp Baseline

3. Vendor Independent Optimizations

2. Baseline Application
4. Vendor Specific Optimizations
Back to Top

 

Many optimizations can be applied to improve the performance of a CL application. Some optimizations involve making improvements to the source code of an application and are thus implementation independent, even though vendors might provide tools to aid in the process. This section describes three such optimizations and the resulting performance improvements.

3.1 Time and Space Profiling and Type Declarations

3. Vendor Independent Optimizations

 

Profiling is used to provide a rough estimate of the time and space used by individual functions. Functions that require more time and space than seem necessary or functions that are called very often can be rewritten using a better implementation or algorithm. Both Lisp implementations provide facilities to count function calls and estimate time and space usage.
The Lucid profiler provides data in a number of formats. We examined profile data for both initialization and execution. General internal functions such as LRUN were the largest time consumers, but some standard functions were also large time consumers. We then used the Performance Monitor to gather more detailed and accurate statistics on these standard functions.
We examined all functions that accounted for more than 0.5% of the initialization time or more than 0.5% of the execution time, which included more than 50 functions. The percentage of time spent in the standard functions during loading/initialization and execution are shown in Tables 3 and 4, respectively. The sequence and arithmetic functions indicate that type declarations should improve performance. For the other functions, it is easy to see and accept why a large amount of time is spent there. This information would have been more useful during Picasso development, but after the fact it is difficult to isolate which code to rewrite knowing that the MEMBER function is called often and takes a good amount of time. The information did tell us that we might achieve a 5 to 10% performance improvement from careful type declarations, but it did not suggest a specific course of action.
Function  % of load time
NTHCDR  3.88
GETHASH  3.52
READ-CHAR  1.70
MEMBER  1.21
LIST-NREVERSE  1.17
DIRECTORY-NAMESTRING  1.14
SUBTYPEP  0.82
TYPEP  0.60
STRING=  0.55
ASSOC  0.51
Table 3: Percentage of Load Time Spent in Standard CL Functions
Function  % of execution time
GETHASH  3.96
READ-CHAR  1.14
LIST-NREVERSE  1.06
EQUAL  0.89
DIRECTORY-NAMESTRING  0.81
TYPEP  0.71
LISTEN  0.63
SUBTYPEP  0.60
POSITION  0.60
STRING=  0.57
REPLACE  0.54
Table 4: Percentage of Execution Time Spent in Standard CL Functions.
We did not collect time and space profiles using Allegro CL. The profilers in version 4.0.1 and 4.1 Beta didn't work with our code, although the official 4.1 release fixed many of the problems. We have not pursued this analysis further, mainly because we doubt that significant improvement is possible since we already profiled with the Lucid CL tools.
Type declarations allow a Lisp compiler to produce more efficient code by reducing the need for run-time type checks and by reducing the overhead to dispatch generic functions (i.e., to call CLOS methods). Both Lisp compilers can report the type information that is assumed or inferred while optimizing. When these assumptions are insufficient, type declarations allow the compiler to do a much better job of code optimization. Both compilers also provide finer control than the required space, speed, safety and debug proclamations that influence the types of optimizations done while compiling (e.g., tail merging can be turned off even if speed is given a high value).
There are few type declarations in Picasso. A few time-critical methods declared some variables to be integers. The other use of declarations was to reduce the number of compiler warnings by indicating that some variables should be ignored. We began using declarations early in Picasso development expecting that the compilers and run-time environments would report type errors. Type checking for program reliability is a well-respected feature of most programming systems for production applications. Unfortunately, Lisp compilers at the time (late `86 and early `87) did not do type checking or make much use of type declarations. This deficiency discouraged us from using declarations at the time since there was no immediate benefit. In retrospect we should have disciplined ourselves to use type declarations but we naively believed the Lisp religion that faster machines would solve all performance problems.
We asked the compilers to provide suggestions for type declarations in Picasso using the explain declaration in Allegro and the show-optimizations compiler option in Lucid. Both compilers produced a lot of output. Trimming away the useless information left us with two main types of declarations: declare numerical arguments and/or results to be fixnums and declare sequences to be simple arrays (the same recommendations we inferred from the profiling). The results of adding type declarations to the code are shown in Section 3.4.

3.2 Attach/Detach

3. Vendor Independent Optimizations

 

Attaching is the process of creating the CLX data structures that correspond to a Picasso inter- face object, such as a window, and establishing a link between the object and the data structures. This ensures that the X server knows about and allocates the proper resources for the object and its children. Normally, Picasso objects are created, initialized, and attached when they are needed, and they are implicitly detached when an application is exited. Detach methods allow Picasso objects (and their children) to be severed from the X server once they are initialized which means that Pic- asso applications can be initialized, detached, and dumped to disk for later use. Upon loading, these dumped images require the Picasso objects to be attached, and the application is ready to run.
Using attach/detach resulted in substantial savings in startup time because the creation and initialization of Picasso objects is quite slow. This slowness is due to the large number of slots that need to be initialized for some objects, the large number of "call-next-method" calls to support inheritance for Picasso objects, and the naive use of CLOS by inexperienced programmers. Because the top-level Picasso tool has all of the other Picasso objects in an application as descendants and is itself a descendent of the display (a CLX data structure that encapsulates the X server, screens and input devices), it is possible to perform attaches or detaches with one top-level method call.

3.3 CLOS method dispatch compilation

3. Vendor Independent Optimizations

 

The final vendor independent optimization that we investigated was compilation of CLOS method dispatching. When a generic function is invoked for the first time or after a new method is defined, method dispatch code is created and compiled. This compilation requires the compiler to be present in the image, makes loading of files with many method definitions slow, and causes poor performance for the first call to a generic function. Both Lucid CL and Allegro CL provide facilities to compile the dispatch code from a running application into a file so that it can be loaded at startup the next time that application is run. Lucid CL has a few simple functions that can compile the dispatch code for later use by a running application or save this information in a file. Allegro CL 4.0.1 had undocumented (and thus rather clumsy) functions for saving this type of information in files which could be compiled for later use, but occasionally produced unreadable objects (i.e., those printed with #<...>) in the files. This deficiency was remedied in Allegro CL 4.1.

3.4 Optimization performance tests

3. Vendor Independent Optimizations

 

The results of applying the optimizations described above alone and in combination to both Allegro CL and Lucid CL are shown in Tables 5 and 6, respectively.
As discussed above, we expected only a small performance improvement from the use of type declarations. Most sequences in Picasso are lists, and the few arrays we use are of variable size, so the sequence declarations provide no benefit. We found that declaring a sequence to be a list had no effect on performance. The fixnum declarations do help, but many functions where such declara- tions would have the greatest effect, such as geometry managers, use ratios or floats. We made the recommended declarations when possible, erring on the side of caution to avoid the introduction of new bugs. In general, type declarations resulted in a 15 to 25% reduction in size in the compiled code for functions that had a large number of declarations. Averaged across all of Picasso, however, the space reduction was small. The effect on execution time was also small. Elapsed time was reduced by 10 to 15%.
Attach/detach provided tremendous improvements in load time because most of the slow initialization of Picasso objects is precomputed. Much of the load time is due to disk I/O. The effect is especially dramatic for CIMTOOL because the facility database is loaded when the program is executed. Using attach/detach eliminates data loading. Attach/detach allows a user to stop a Picasso application in progress and restart it at a later time. When using Lucid Common Lisp, it is a good idea to garbage collect dynamic space before doing a dump, because the image dump process places all tenured objects into static space.
The effect of loading the compiled dispatch code upon startup is small. The extra overhead of loading in the binary containing the compiled dispatch code is repaid by slightly faster initialization and execution, except when used with attach/detach, where there is not enough initialization performed to recover the cost. 

4. Vendor Specific Optimizations

3. Vendor Independent Optimizations
5. Discussion
Back to Top

 

This section describes the tools provided by Franz and Lucid to productize CL applications. 

4.1 Allegro Common Lisp

4. Vendor specific Optimizations

 

Allegro Presto [FrI91b] allows a Lisp image to be built that contains only the functions necessary for an application with the option of loading in other functions if needed. It does this dynamic loading by extending the concept of autoloading, which allows an entire compiled file to be loaded if one of its functions is called. This approach allows individual functions to be loaded as needed.
Allegro Presto allows any compiled binary to be libfasl loaded by creating a stub function in the image for each function in the compiled file. The code vectors and other information, such as argument lists, are loaded from the proper binary files on disk only when those functions are called. Thus, the compiled files serve as libraries of functions that can be loaded if and when they are needed. Consequently, it is possible to create a small Lisp system that has the standard Common Lisp functions libfasl loaded and build your application on this foundation. It is also possible to load a binary file normally if it is known that all functions in the image will be needed eventually.
Once an image is built with Allegro Presto, libfasl loaded binary files cannot by moved or modified, since the stubs refer to the code vectors and other information by absolute position within an absolute filename. Allowing these files to be referenced by relative pathname is necessary for delivering applications and it is reportedly under development.
Allegro Presto also allows you to specify the functions that will always be needed in an application and to place them in text space. A process similar to performance monitoring is used to record the functions that are loaded from stubs while the user exercises the application. A new Lisp image can then be built that contains these functions in text space. Placing the required functions in text space helps to improve locality of reference and allows code to be shared by multiple running images (for those who like to run more than one Lisp image at a time).
The results of applying Allegro Presto alone and in combination with some vendor independent optimizations to CIMTOOL are shown in Table 7. The "All Optimizations" case is a libfasl loaded image with code in text space, attach/detach, method compilation, and type declarations. Loading functions dynamically, done with libfasl and code in text space, reduces the memory required at the expense of longer load time. The disk size for these images is small because we dynamically load the 7 MB Lisp bundle file that contains the compiled code for the standard CL functions and optional packages.
Many Lisp images must be created when using Allegro Presto. Starting from scratch, placing code in text space requires three images to be built: one in which to compile your application, the second to make a libfasl image of your application to determine which functions are loaded, and the third to place those functions in text space. Allegro Presto is also of limited use during application development, since it requires rebuilding the image when any of the libfasl loaded binaries change. The interface to Allegro Presto in Allegro CL 4.0.1 requires the user to be involved with the low level details of how images are built. The interface is slightly improved in Allegro CL 4.1.

4.2 Lucid Common Lisp

4. Vendor Specific Optimizations

 

The Lucid Delivery Toolkit [LuI91] includes the Backtrace Logging Facility and the Performance Monitor. In addition, it includes the code Reorganizer and Treeshaker. We used the Beta version of the Treeshaker provided with Lucid CL 4.0.1+.
The Reorganizer rearranges the memory space of a Lucid CL application image to reduce disk paging. If items that are used together are located near each other, locality of reference is increased and paging is reduced. The Reorganizer operates like a profiler in that monitoring code is added to the image that records which functions and data items are accessed during execution. You can then dump a fresh image of your application using the data produced by the Reorganizer.
The Treeshaker removes unused functions and other data (e.g., data types and printing formats) from an application to reduce the image size. Unlike Allegro Presto, items that are removed from an image are no longer available, so the Treeshaker is really creating a subset of Lucid Common Lisp that is sufficient to execute the application. The Treeshaker is not intended to be used on applications that use the compiler while running because including the compiler and all code it needs severely limits the amount of code and data that can be removed from the image.
The Reorganizer requires a lot of memory to run. It requires roughly 3 times the amount used when running the application normally. In addition, the image runs very slowly due to the code that is tracking data and code references. The results of applying the Reorganizer alone and in combination with the vendor independent optimizations to CIMTOOL are shown in Table 8. As promised, the Reorganized image is slightly larger and about 18% faster. The "All Optimizations" image includes applying the Reorganizer with attach/detach, method compilation and type declarations.
The Treeshaker requires even more memory to run - closer to 4 times the application's normal usage. We tried to use Treeshaker even though Picasso does use the compiler in certain situations. Unfortunately, the Beta version of the Treeshaker was not able to handle the dynamic generation and loading of code performed by Picasso. The Lucid support staff was helpful in our getting Treeshaker to run on small Lisp test cases, but we never managed to get it to run on a Picasso image, even with a simpler Picasso application or no application at all. When Treeshaker is officially released, it can be used to reduce Lucid CL image sizes. However, in our experience it will have a minimal effect on execution time unless you are running on a severely memory-limited system. 

5. Discussion

4. Vendor Specific Optimizations
6. Conclusions
Back to Top

 

This section discusses the overall effect of the optimizations and the tradeoff between the effort needed and the benefit gained.
Using a combination of Lucid Reorganizer, attach/detach, method compilation and type declarations on the original CIMTOOL application, we were able to realize a 50% reduction in load time and a 23% reduction in execution time at the expense of a 22% increase in disk image size and a 4% increase in run-time image size.
(!!libfasl+text+attach is better!!)The best results with Allegro Common Lisp came from using attach/detach on the original application. This resulted in a 48% reduction in load time and a 21% in execution time at the cost of a 21% increase in disk image size and a 2% increase in run-time image size.
Looking over the effects of the optimizations, it is clear that attach/detach is the big winner by providing a 20% execution time reduction and a 50% reduction in load time. The load time improvement is not surprising, because attach/detach allows most initialization to be done before the image is made. The increased performance during execution occurs because some initialization is done when a Picasso object is first displayed, and attach/detach reduces or eliminates this time.
The other optimizations produced 15% or less reductions in elapsed execution time, although the Reorganizer did a bit better. Unfortunately, many of the optimizations conflict, that is, when two or more are combined, performance is often worse than the best of the included optimizations. This result means that we might be able do a few percent better than our best for Lucid CL by not using method compilation. We certainly do not want to try every combination to see which is best, but the given results provide some ideas as to what might work well together.
Attach/detach was not difficult to implement because much of Picasso was designed with the feature in mind. Although the exact mechanism does not translate to other applications, the idea of doing as much computation as possible in advance can be useful. Attach/detach might be a useful feature for CLOS to support. (!! find synonym for USEFUL)
CLOS method compilation is easy to apply to an application, unfortunately it tends to cause worse performance when combined with other optimizations. The overhead of loading the compiled dispatch code must be offset by enough initialization and execution to avoid a net loss. Given the utility of attach/detach, which removes most of the initialization, it does not seem appropriate to use method compilation.
Allegro Presto provided substantial reductions in the image size on disk, given that the bundle file needs to be present on disk anyway. For a small increase in load time we get a 10 to 15% reduction in elapsed execution time. It took considerable effort and understanding to produce Lisp images using all the features of Allegro Presto, and there are limitations that make it less useful during development. Nevertheless, it produced good results.
The Lucid Reorganizer provided a reduction in execution speed at the expense of a slight increase in disk image size. Creating the Reorganizer data is slow but not very complicated. And, using the Reorganizer seems to produce few conflicts with the other optimizations. When the Treeshaker is available it should help to reduce load time and disk image size as well, although permanently removing code from an image can cause problems.


6. Conclusions

5. Discussions
Acknowledgements
Back to Top

 

In the end, the best improvements came from modifying and tuning our code. It is good to see major Common Lisp vendors addressing the performance and resource problems of Lisp, and their tools do provide benefit. As expected, these general optimization tools are not a substitute for careful design, good coding, and a bit of after-the-fact tuning.
The Franz and Lucid products both provide excellent development environments. In comparison, each product has strengths. The Allegro dynamic loading of functions facility (Presto) is easy to use and does a good job of reducing image sizes. The Lucid compiler provides considerable help in detailing which declarations will best improve performance. The bottom line is that for the versions we tested, Lucid is better for development and Allegro is better for deployment.
(!! from Baseline App. section) It is difficult to make comparisons between Lisp implementations because each implementation has its strong and weak points. In our experience, Lucid CL executes faster and the Lucid compiler produces smaller binaries than Allegro CL. Allegro CL, on the other hand, compiles faster, loads compiled binaries faster, and the base system is smaller. However, as discussed below, the differences between these systems are insignificant compared to the differences between CL and other programming languages.
The downside of this experiment was the effort required. Man-months of extra work to tune the performance of an application is excessive and rarely available. Although the situation is improving, it is still difficult to justify the use of Common Lisp for the development of a large application that needs to be delivered to users. In our case, the performance and space gains were not sufficient to justify continuing the development of Picasso in Lisp.


Acknowledgements

6. Conclusions
References
Back to Top

 

We want to thank Jon L. White and Sayuri Nishimura at Lucid Inc. and Charles Cox at Franz Inc. for their help. 

References

Acknowledgements
Back to Top

 [BeR90] B. S. Becker and L. A. Rowe, "A Hypermedia Extension of the PICASSO Appli- cation Framework", Proc. NIST Advanced Information Interfaces: Making Data Accessible 1991, June 1991.

[FrI91a] Franz Incorporated, Allegro Common Lisp User Guide, Release 4.0, Berkeley, CA. January 1991.

[FrI91b] Franz Incorporated, "Technical Memorandum #14: Allegro Presto", Berkeley, CA, January 1991.

[Gab91] R. P. Gabriel, "Lisp: Good News, Bad News, How to Win Big", AI Expert, June 1991, pp. 31-39.

[Kee88] S. E. Keene, Object-Oriented Programming in Common Lisp, Addison-Wesley, Reading, MA, 1988.

[KoR91] J. A. Konstan, L. A. Rowe, "Developing a GUIDE Using Object-Oriented Pro- gramming", Proceedings of 1991 Conference on Object-Oriented Programming Systems, Languages and Applications.

[LaR91] D. K. Layer and C. Richardson, "Lisp Systems in the 1990s", Communications ACM, Vol. 34, No. 9, September 1991, pp. 48-57.

[LuI90] Lucid Incorporated, Sun Common Lisp Version 4.0 User's Guide, Second Ed., Menlo Park, CA. September 1990.

[LuI91] Lucid Incorporated, Sun Common Lisp Version 4.0: The Delivery Toolkit (Beta Draft), Menlo Park, CA, March 1991.

[RKS91] L. Rowe, J. Konstan, B. Smith, S. Seitz and C. Liu, "The PICASSO Application Framework", Proceedings of User Interface and Software Technology 1991 Con- ference.

[Sch91] R. W. Scheifler, et al., CLX - Common LISP X Interface, Release 5, August 1991.

[ScR92a] P. K. Schank and L. A. Rowe, "An Introduction to Semiconductor Manufacturing and Markets", (ERL Report No. UCB/ERL M92/35), Berkeley, CA: University of California, Electrical Engineering and Computer Sciences.

[ScR92b] P. K. Schank and L. A. Rowe, "Navigation and Learning in a Multimedia Course on Semiconductor Manufacturing", (ERL Report No. UCB/ERL M92/??), Berke- ley, CA: University of California, Electrical Engineering and Computer Sciences.

[SmR92] B. C. Smith, L. A. Rowe, "An Application-Specific Ad Hoc Query Interface ", IEEE Transactions on Semiconductor Manuf., Vol. 5, No. 4, November 1992, pp. 281-289.

[Ste90] G. L. Steele Jr., Common Lisp, The Language, Second Ed., Digital Press, 1990.

[VoK89] K. Voros and P. K. Ko, "Evolution of the Microfabrication Facility at Berkeley", Electronics Research Lab. Memorandum M89/109, Sept. 1989


* This research was supported by the National Science Foundation (Grant MIP-90-14940) and the Semiconductor Research Corporation, Texas Instruments, and National Semiconductor, with a matching grant from the State of California's MICRO program. This material is based in part upon work supported by the National Science Foundation under infrastructure Grant No. CDA-87722788