Go to the first, previous, next, last section, table of contents.

Records and Object Orientation

Most programming languages have a standard facility for defining new types of records or structures. A record is an object with named fields. For example, we might define a point record type, to represent geometric points in a plane. Each point object might have a x field and a y field, giving the horizontal and vertical coordinates of a point relative to the origin. Once this point class is defined, we can create instances of it, i.e., actual objects of type point, to represent particular points in space.

Scheme is an unusual language in that there is not a standard facility for defining new types. We can build a type-definition facility, however, using macros.

In this chapter, I'll show a simple record definition facility written in Scheme. Then I'll describe a simple object-oriented programming system for Scheme, and show how it can be implemented in Scheme, too. (Both of these systems rely on Lisp-style macros, which are not standard Scheme, but are available in almost every Scheme implementation.)

Records (Structures)

Using Procedural Abstraction to Implement Data Abstraction

Scheme's main abstraction mechanism is procedural abstraction. We can define procedures that represent higher-level operations, i.e., operations not built into the language, but which are useful for our purposes. We can construct abstract data types, which are data types that represent higher-level concepts (such as points in a coordinate space), and use procedures to implement the operations.

For example, we can fake a point data type by hand, by writing a set of procedures that will construct point objects and access their fields. We can choose a representation of points in terms of preexisting Scheme types, and write our procedures accordingly.

For example, we can use Scheme vectors to represent points, with each point represented as a small vector, with a slot for the x field and a slot for the y field. We can write a handful of procedures to create and operate on instances of our point data type, which will really allocate Scheme vectors and operate on them in ways that are consistent with our higher-level point abstraction.

We start with a constructor procedure make-point, which will create ("construct") a point object and initialize its x and y fields. It really allocates a Scheme vector. The zeroth slot of the vector holds the symbol point, so that we can tell it represents a point object.

; a point is represented as a three-element vector, with the 0th
; slot holding the symbol point, the 1st slot representing
; the x field,, and the 2nd slot representing the y field.

(define (make-point x y)
   (vector 'point x y))

We also define a predicate for testing whether an object is a point record. It checks to see if the object is actuall a Scheme vector and its zeroth slot holds the symbol point. This isn't perfect--we might mistake another vector that happens to hold that symbol in its zeroth slot for a point, but we'll ignore that for now. (It's easy to fix, and we'll fix it later when we build a more sophisticated object system.)

; check to see if something is a point by checking to see if it's
; a vector whose 0th slot holds the symbol point.
(define (point? obj)
   (and (vector? obj)
        (eq? (vector-ref obj 0) 'point)))

Now we define accessor procedures to get and set the x and y fields of our points--the 1st and 2nd slots of the vector we use to represent a point.

; accessors to get and set the value of a point's x field.
(define (point-x obj)
   (vector-ref obj 1))
   
(define (point-x-set! obj value)
   (vector-set obj 1 value))

; accessors to get and set the value of a point's y field.
(define (point-y obj)
   (vector-ref obj 2))
   
(define (point-y-set! obj)
   (vector-set! obj 2 value))

This isn't perfect, either--we should probably test to make sure an object is a point before operating on it as a point. For example, point-x should be more like this:

(define (point-x obj)
   (if (point? obj)
       (vector-ref obj 1)
       (error "attempt to apply point-x to a non-point)))

Once we've defined the procedures that represent operations on an abstract data type, we can ignore how it's implemented--we no longer have to worry about how points are represented.

We can also change the implementation of an abstract data type by redefining the procedures that create and operate on instances of that type.

For example, we could decide to represent points as lists, rather than vectors, and redefine the constructor, predicate, and accessors to use that representation.

We could also change the representation to polar form, rather than Cartesian, storing a direction and distance from the origin rather than x and y distances. With a polar representation, we could still support the operations that return or set x coordinates, using trigonometric functions to compute them from the direction and distance.

Automating the Construction of Abstract Data Types with Macros

As I just showed, it's easy to define an abstract data type in Scheme, by hand, using procedural abstraction. Doing this for every abstract data type is very tedious, however, so it would be good to automate the process and provide a declarative interface to it.

We'd like to be able to write something like this:

(define-structure point
   x
   y)

and have Scheme automatically construct the constructor, type predicate, and accessor procedures for us. In most languages, this is done by the compiler, but we can tell Scheme how to do it by defining define-structure as a macro. Whenever the interpreter or compiler encounters a define-structure form, our macro transformation procedure will be called and will generate the relevant procedures, which will then be interpreted or compiled in place of the define-structure form.

We'll use a define-macro (Lisp-style) macro for this. this macro will intercept each define-structure form, analyze it, and produce an s-expression that is a sequence of procedure definitions to be interpreted or compiled. Each define-structure form will be translated into a begin form containing a series of procedure definitions.

; define-struct is a macro that takes a struct name and any number of field
; names, all of which should be symbols.  Then it generates a begin expression
; to be compiled, where the begin expression contains the constructor for this
; structure type, a predicate to identify instances of this structure type,
; and all of the accessor definitions for its fields.

(define-macro (define-struct struct-name . field-names)

   ; analyze the macro call expression and construct some handy symbols
   ; and an s-expression that will define and record the accessor methods.

   (let* ((maker-name (string->symbol
                       (string-append "make-"
                                      (symbol->string struct-name))))
          (pred-name (string->symbol
                      (string-append (symbol->string struct-name) "?")))
          (accessor-defns (generate-accessor-defns struct-name field-names)))
      ; return an s-expression that's a series of definitions to be
      ; interpreted or compiled.
      `(begin (define (,maker-name ,@field-names)
                 (vector ',struct-name ,@field-names))
              (define (,pred-name obj)        
                 (and (vector? obj)
                      (eq? (vector-ref obj 0) ,struct-name)))
              ,@accessor-defns)))

To generate all of the accessor definitions, we call a special helper routine, generate-accessor-defns, and splice the result into the sequence of definitions using unquote-splicing (,@). generate-accessor-definitions simply iterates over the list of slot names tail-recursively (using named let), consing two definitions onto the definitions for the rest of the slots:

; generate-accessor-defns generates a list of s-expressions that
; define the accessors (getters and setters) for a structure.

(define (generate-accessor-defns structname fnames)

   (let ((structname-string (symbol->string structname)))

      ; loop over the fieldnames, and for each fieldname, generate two
      ; s-expressions:  one that is a definition of a getter, and one that's
      ; a definition of a setter.
      ; As we loop, increment a counter i so that we can use it as the index
      ; for each slot we're generating accessors for

      (let loop ((fieldnames fnames)
                 (i 1))
         (if (null? fieldnames)
             '()
             ; take a fieldname symbol, convert to string, append it to the
             ; struct name string with a hyphen in the middle, and convert
             ; that to a symbol...
             (let* ((fieldname-string (symbol->string (car fieldnames)))
                    (getter-name (string->symbol
                                  (string-append structname-string
                                                 "-"
                                                 fieldname-string)))
                    (setter-name (string->symbol
                                  (string-append structname-string
                                                 "-"
                                                 fieldname-string
                                                 "-set!"))))

                 ; now construct the define forms and cons them onto the
                 ; front of the list of the remaining define forms, generated
                 ; iteratively (tail-recursively)

                 (cons `(define (,getter-name obj)
                           (vector-ref obj ,i))
                       (cons `(define (,setter-name obj value)
                                 (vector-set! obj ,i value))
                             (loop (cdr fieldnames)
                                   (+ i 1)))))))))

Simple Uses of OOP Objects

In this section, I'll discuss a simple object system and how it it used. This object system is not part of Standard Scheme, but can be [almost entirely ?] implemented in portable Scheme, and used in any Scheme system with a reasonably powerful macro system.

The object system is based on classes and generic procedures. It is a subset of the RScheme object system, and its basic functionality is similar to a subset of CLOS object system for Common Lisp, the Dylan object system, Meroon, TinyCLOS, and STkLOS.

Late Binding

One of the major features of object-based and object-oriented programming is late binding of methods, which means that we can write code in terms of abstract operations without knowing exactly which concrete operations will be executed at run time.

For example, consider a graphical program that maintains a list of objects whose graphical reprsentations are visiblle on the user's screen, and periodically redraws those objects. It might iterate over this "display list" of objects, applying a drawing routine to each object to display it on the screen. In most interesting applications, there would be a variety of graphical object types, each of which is drawn in a different ways.

If our graphical objects are represented as traditional records, such as C structs or Pascal records, the drawing routine must be modified each time a new graphical type is added to the program. For example, suppose we have a routine draw which can draw any kind of object on the screen. draw might be written with a case expression, like this:

 (define (draw obj)
    (cond ((triangle? obj)
           (draw-triangle obj))
          ((square? obj)
           (draw-square obj))
          ((circle? obj)
           (draw-circle obj))
          ; more branches...
          ;    .
          ;    .
          ;    .
          ((regular-pentagon? obj)
           (draw-regular-pentagon obj))

Each time we define a new kind of record that represents a graphical object, we must add a branch to this cond to check for that kind of object, and call the appropriate drawing routine.

In large, sophisticated programs that deal with many kinds of objects, the code may be littered with cond or case statements like this, which represent abstract operations, and map them onto concrete operations for specific types. (This example maps the abstract operation "draw an object" onto concrete operations like draw-triangle, draw-square, and so on.)

Such code is very difficult to maintain and extend. Whenever a new type is added to the system, all of the cond or case expressions that could be affected must be located and modified.

What we would like is a way of specifying how an abstract operation is implemented for a particular kind of object, and having the system keep track of the details. For example, we'd like to say at one point in the program, "here's how you draw a regular pentagon," and then be able to use regular pentagons freely. We can then use the abstract operation draw, and rely on the system to automatically check what kind of object is being drawn, find the appropriate drawing routine for that type, and call it to draw that particular object.

For example, the routine that draws all of the visible objects might just look like this:

(map draw display-list)

When we later add a new type, such as irregular-hexagon, we can just define a method for drawing irregular hexagons, and the system will automatically make the draw operation work for irregular hexagons. We don't have to go find all of the code that might encounter irregular hexagons and modify it.

This feature is called late binding of methods. When we write code that uses an abstract operation, we don't have to specify exactly what concrete operation should be performed.

(Note: here we're using a fairly general sense of the word "binding," which is more general than the notion of variable binding. We're making an association between a piece of code and the operation it represents, rather than between a name and a piece of storage. In this general sense, "binding" means to associate something with something else, and in this example, we associating the abstract operation draw with the particular procedure needed to draw a particular object at run time.)

As we'll see a little later, we can define a generic procedure that reprsents the abstract draw operation, and rely on an object system to bind that abstract operation to the appropriate drawing procedure for a particular type at run time. When we later define new types and methods for drawing them, the generic procedure will be automatically updated to handle them. This lets us write most of our code at a higher level of abstraction, in terms of operations that "just work" for all of the relevant types. (E.g., we might have abstract operations that can draw, move, and hide any kind of graphical object, so that we don't need to worry about the differences between the different kinds of graphical objects if those differences don't matter for what we're trying to do.)

Class Definitions and Slot Specifications

A class is an object that describes a particular kind of object. A class definition is an expression like a record or structure definition, which defines the structure of that kind of object. Classes can also have associated behavior or methods, which are routines for performing particular operations on instances of a class.

For example, suppose we would like to have a class of objects that can be used to represent points in two-dimensional space. Each point object will have an x slot and a y slot, which hold the object's position in the x and y dimensions.

(A slot is a field of an object, which in other languages may be known as an instance variable, a data member, an attribute, or a feature.)

We can define our point class like this:

(define-class <point> (<object>)
   (x init-value: 0)
   (y init-value: 0))

Here we have chosen to name the class <point>. By convention, we use angle brackets to begin and end the names of classes, so that it's clear that they are class names, not names of normal objects.

The parenthesized expression after the class name <point> is a sequence of superclass names, which will be explained later.(12) (When in doubt, it is a good idea to use <object> as the sole superclass, so use (<object>) after the class name in the class definition.)

The two remaining fields after the superclasses are the slot specifications, which say what kinds of fields an instance of <point> will have. A slot specification is written in parentheses, and the first thing is the name of the slot. After that come keyword/value pairs. Here we use the keyword init-value: followed by the value 0.

The specification (x init-value: 0) says that each instance of <point> will have a slot (field) named x, and that the initial value of the field is 0. That is, when we create a <point> instance to represent a 2-d point, the initial x value will be zero. Likewise, the slot specification (y init-value 0 says that each point will also have a y slot whose initial value is 0.

We can create an instance of an object by using the special form make, which is actually implemented as a macro. The make operation takes a class as its first argument, and returns a new object that is an instance of that class.

To make a <point>, we might use the make expression

(make <point>)

This expression returns a new point whose x and y slots are initialized to zero.

If we want the slots of an object to be initialized to a requested value at the time the object is initialized--rather than always being initialized the to the same value for every object, we can omit the initial value specification in the class definition, and provide it to the make call that creates an object.

(define-class <point> (<object>)
   (x)
   (y))

Given this class definition, we can use make to create a <point> instance with particular x and y values:

(define my-point
        (make <point> x: 10 y: 20))

Here we've created a point object with an x value of 10 and a y value of 20. Note that the x value is labeled by a keyword x:. As in a class definition, a keyword argument to make looks sort of like an argument, but it really isn't: it's the name of the following argument.

Keyword arguments to define-class and make let you write the arguments in any order, by giving the name before the value. We could have written the above call to make with the values in the opposite order:

(define my-point (make <point> y: 20 x: 10))

The result of this definition is exactly the same as the earlier one. The make macro will sort out the arguments, looking at the keyword to figure out what the following arguments are for.

By default, when we define a class with slots x and y, we implicitly define operations on those fields of those objects.

For each field, two routines are defined, a getter, which fetches the value of the field, and a setter, which sets the value of the field. The name of the getter is just the name of the field. The name of the setter starts with set-, followed by the name of the field, followed by an exclamation point to indicate that the operation is destructive (i.e., modifies the state of the object by replacing an old value with a new one.)

Given the point we created, we can ask the value of its x field by evaluating the expression (x my-point), which will return 10. We can change teh value to 50 by evaluating the expression (set-x! my-point 50). We can increment it by 1 with the expression

(set-x! my-point
        (+ 1 (x my-point)))

Different kinds of objects can have fields with the same name, and the getters and setters will operate on the appropriate field of whatever kind of object they are applied to. (Accessors are actually generic procedures, which will be explained later.)

Generic Procedures and Methods

A generic procedure is a procedure that does a certain operation, but may do it in different ways depending on what kind of argument it is given. A generic procedure can be specialized, telling it how to perform a particular kind of operation for a particular kind of argument.

A method definition specifies how a generic operation should be done for an object of a particular type. Conceptually, a generic function keeps track of all of the methods which perform a particular operation on different kinds of objects. A generic procedure is called just like any other function, but the first thing it does is to look up the appropriate method for the kind of object being operated on. Then it applies that method. A generic procedure is therefore a kind of dispatcher, which maps abstract operations onto the actual procedures for performing them.

For example, suppose we would like to define several classes, @code<stack>, @code<queue>, and @code<d-e-queue>, to represent stacks, queues, and double-ended queues, respectively.

We could define stack this way:

(define-class <stack> (<object>)
   (items init-value: '())   ; list of items in the stack

An instance of <stack> has one field, items, which points to a list of items in the stack. We can push items onto the stack by consing them onto the front of its list of items, or pop items off of the stack by cdring the list.

To define the behavior of <stack>---and things like stacks--we need some generic procedures, insert-first! and remove-first!. These will add an item to the front (top) of a stack, or remove and return the item from the front (top) of a stack, respectively.

(define-generic (insert-first! obj item))

(define-generic (remove-first! obj))

These two generic procedures define "generic operations" which may be supported by different classes, but do semantically "the same thing." That is, the generic procedures don't represent how to do a particular kind of operation on a particular kind of object, but instead represent a general kind of operation that we can define for different kinds of objects.

This pair of generic procedures therefore acts as an abstract data type, which represents object that can behave as stacks. The don't say how any particular implementation of stacks works.

To make the generic operations work for the particular class <stack>, we need to define methods that say how to perform the insert-first! and remove-first! operations on objects that are instances of class <stack>.

For this, we use the macro define-method. Here's the definition of the insert-first! operation for the class <stack>:

(define-method (insert-first! (self <stack>) item)
   (set-items! self
               (cons item (items self))))

This method definition is very much like a procedure definition. Here we're defining a method that takes two arguments, named self and item. The calling form (insert-first! (<stack> self) item) says that this is the particular procedure to use for the generic procedure insert-first! operation when it's given two arguments, and the first argument is an instance of class <stack>.

That is, we're defining a procedure of two arguments, self and item, but we're also saying that this procedure is to be used by the generic procedure insert-first! only when its first argument is a stack. (The names self and item were chosen for convenience--as with a normal procedure, we can name arguments anything we want.)

Given this definition, when insert-first! is called with two arguments, and the first is a stack, this procedure will be executed to perform the operation in the appropriate way for stacks. We say that we are specializing the generic procedure insert-first! for instances of the class <stack>.

The body of this method definition refers to the stack being operated on as self, the name given as the first argument name; it refers to the second argument, which is being pushed on the stack, as item. The body of the method is

(set-items! self
            (cons item (items self)))

which relies on the getter and setter implicitly defined for the items slot by the class definition. It fetches the value of the head slot of self using head, conses the argument item onto that list, and assigns the result to the head slot using set-head!.

The method for the generic procedure remove-first! when applied to stacks could be defined like this:

(define-method (remove-first! (self <stack>))
   (let ((first-item (car (items self))))
      (set-items! (cdr (items self)))))

Now let's implement a queue data type. Like a stack, a queue data type allows you to push an item on the front of an ordered sequence of items--it supports the insert-first! operation.

However, a queue doesn't let you add items to the front--it only lets you add items to the rear. So our <queue> class should support remove-first!, like <stack>, but insert-last! instead of insert-first!.

This means that we can define a method for <queue> on the remove-first! generic procedures, but we need a new generic procedure insert-last!, which represents the abstract operation of removing the last item from an ordered sequence.

(define-generic insert-last!)

The pair of generic operations insert-last! and remove-first! represent the abstract datatype of queues and things that can behave like queues.

To actually implement queues, we need a class definition and some method definitions, to say how a queue should be represented, and how the queue operations should be done on it.

For a queue, it's good for accesses to be fast at either end, so we'll want a doubly-linked list, rather than a simple list of pairs. Here's a class definition for <queue>:

(define-class <queue> (<object>)
   (front '())
   (rear '()))

Each <queue>s keep a pointer to the beginning of the linked list and a pointer to the end of the linked list. The queue itself is structured as a doubly-linked list of queue nodes, each of which has a pointer to an item that's conceptually in the queue, plus a next pointer to the next doubly-linked list node, and a prev pointer to the previous one.

To implement the doubly-linked list, we'll use a helper class to implement the list nodes, called <d-l-list-node>.

(define-class <d-l-list-node> (<object>)
   (item)
   (next)
   (prev))

This definition will implicitly define setters and getters for the fields, e.g., set-next! and set-next! for the next field of a <d-l-list-node>.

Now we can define the methods for the remove-first! and insert-last! operations on instances of <queue>.

(define (insert-last! (self <queue>) item)
   (let ((new-node (make <d-l-list-node> item: item
                                         prev: (rear self))
                                         next: '())))
      (cond ((null? (front self))          ; nothing in queue yet?
             (set-front! self new-node)              ; this will be first
            (else                              ; otherwise
             (set-next! (rear self) new-node))) ; append to rear of list
      (set-rear! self new-node))))   ; update rear pointer

(define (remove-first! (self <queue>))
   (let ((first-node (front self)))
      (if (null? first-node)
          (error "attempt to remove-first! from an empty queue:" self)
          (let* ((first-item (item first-node))
                 (rest (next first-node)))
             (cond((null? rest)    ; no nodes left in queue?
                   (set-front! self '()) 
                   (set-rear! self '()))
                  (else
                   (set-prev! rest '())
                   (set-front! self rest)))))))

Note that what stacks and queues both support the abstract operation of removing the first item, but each does it in a different way--the same operation (generic procedure) is implemented by different code (methods).

Generic Procedures and Classes are First-Class

A generic procedure is a procedure, like any other--it is a first-class object that happens to be callable as a procedure. You can therefore use store generic procedures in data structures, pass them as arguments to other procedures, and so on.

For example, in a graphical program, we may have a generic draw procedure to display any kind of graphical object, and each class of graphical object may have its own draw method. By mapping the generic procedure draw over a list of graphical objects, like this,

(map draw list-of-objects-to-be-drawn)

we can invoke the appropriate draw method for each kind of object.

In our system, classes are also first class. When we use define-class to define a class named <point>, we are actually doing two things: we are creating a special kind of object to represent the class, and we are defining a variable named <point> initialized with a pointer to the class object.

Implementing the Simple Object System

In this section, I'll present a simple implementation of the simple object system described so far. Our object system is based on metaobjects, i.e., objects which represent or manipulate other objects such as class instances and methods. (The meta- is Greek for "about," "beyond," or "after".

In programming language terminology, metaobjects are objects that are "about" other objects or procedures. The two most important kinds of metaobjects are class objects and generic procedure objects. A class object represents instances of a particular class, and a generic procedure object represents a generic operation.

Metaobjects control how other objects behave. For example, a class object controls how instances of the class are constructed, and a generic procedure object controls when and how the particular methods on that generic procedure are invoked to do the right thing for particular kinds of objects.

A big advantage of the metaobject approach is that since metaobjects are just objects in the language, we can implement most or all of the object system in the language--in this section, we'll show how to implement a simple object system for Scheme, in portable Scheme. (We will rely on macros, which some versions of Scheme don't support yet, however.) An advantage of writing a Scheme object system in Scheme is that we can modify and extend the object system without having to change the compiler.

We will use macros to translate class, generic procedure, and method definitions into standard Scheme data structures and procedures. A class object in our system is just a data structure, for which we'll use a vector (one-dimensional array) as the main representation. A class object will record all of the information necessary to create instances of that class.

Instances of a class will also be represented as Scheme vectors. Each slot of an object will be represented as a field of a vector, and we'll translate slot names into vector indexes.

Generic procedures will be represented as Scheme procedures, constructed in a way that lets us define methods--each generic procedure will maintain a table of methods indexed by which classes they work for. When a generic procedure is called in the normal way, it check the class of the object it's being applied to, and will search its table of methods for the appropriate method, and call that method, passing along the same arguments. Methods will also be represented as Scheme procedures.

Implementing `define-class`

define-class is a macro which accepts the users's description of a class, massages it a little, and passes it on to the procedure create-class to construct a class object.

The reason that define-class is written as a macro and not a procedure is so that the arguments to the macro won't be evaluated immediately. For example, the class name (e.g., <point> or <queue passed to define-class isn't a variable to be evaluated--it's a symbol to be used as the name of the class.

When a call to define-macro is compiled (or interpreted), the transformation procedure for the macro does two things. First, it constructs the class object and adds it to a special data structure by calling register-class. Then it generates code to define a variable whose name is the name of the class, and initialize that with a pointer to the class. The generated code (the variable definition) is returned by the transformer, and that's what's interpreted or compiled at the point where the macro was called.

For example, consider a call to create a <point> class:

(define-class <point> (<object>)
   (x)
   (y))

This should be translated by macro processing into a variable definition for <point>, which will hold a pointer to the class object, like this:

(define <point> complicated_expression)

where complicated_expression has the side-effect of constructing the class object, registering its existence with related objects (virtual procedures for the accessors), and so on. complicated_expression should look something like this, for our <point> definition:

; construct an association list describing the slots of this kind of object,
; indexed by slot name and holding the routines to get and set the slot
; values.

(let ((slots-alist (generate-slots-alist '((x) (y)))))

   ; create the class object, implemented as a Scheme vector
   (let ((class-object (vector <<class>>       ; pointer to class of class
                               '<point>        ; name symbol for this class
                               (list <object>) ; list of superclass objects
                               slots-alist     ; slot names/getters/setters
                               '*dummy*)))     ; placeholder
                               
      ; create and install the instance allocation routine, which will create
      ; and initialize an instance of this class, implemented as a vector
      (vector-set! class-object 4 (lambda (x y)
                                     (vector class-object x y)))
                                     
      
      ; register accessor methods with appropriate generic procedures
      (register-accessor-methods class-object slots-alist)
      
      ; and return the class object we constructed
      class-object))

In more detail, what this generated code does is:

builds an association list, indexed by slot name, holding getter and setter procedures for each slot of a <point> object. It creates procedures that will get and set the values of the slots x and y, which have been mapped to indexed fields 1 and 2 of the vector used to represent an instance. (These are the methods for the generic procedures x, set-x!, y, and set-y!, which will be registered with those generic procedures.)
creates a vector that will be the class object. Its 0th slot is initialized with a pointer to the special object <class>, which identifies this object as a class object.(13) The 1st slot holds a pointer to the name symbol that is this class's class name. (This is just for documentation purposes.) The 2nd slot holds a list of pointers to this object's immediate superclasses. (Note that this is a list of pointers to actual class objects, not name symbols.(14)
creates a procedure that will allocate and initialize an instance, given the initial values of the slots. A pointer to this object is stored in the class object. (This side-effect is needed because the class object must be created before this procedure, so that the class pointer is available to it.) This procedure takes the slot values in the same order the slots are laid out. (The make macro will ensure that arguments are passed in the right order from calls to make using keywords.)
registers the accessor methods for the slots with the appropriate generic procedures. For now, we'll assume that the generic procedure objects already exist--they must be defined explicitly, like any other generic procedures. Later we'll show how the necessary generic procedure definitions can be automatically generated as needed.
returns the class object.

Since this is all done in the initial value expression of the definition of the variable <point>, the returned class object becomes the initial value of that variable binding.

Once all this is done, we could create an instance of class point by extracting the allocator procedure from the class object and calling it with the initial values in the proper order. For example,

((vector-ref <point> 4) 20 30)

would extract the point-allocator procedure from the <point> class object, and call it to create a <point> instance with an x value of 20 and a y value of 30. (The make macro will provide a friendlier interface.)

Now we'll show a simplified version of the procedure generate-class-code, which generates the kind of class-creating s-expression shown above.

Now let's look at the macro to produce code like this from a simple class definition.

For now, we'll assume that the body of the class definition consists of nothing but slot declarations with no keword options--initial value specifiers or other options--i.e., they're one-element lists holding just a symbol that names a slot. Ignoring inheritance and assuming that a class includes only the slots declared in this class definition, we'll simply assign slots index numbers in the order they're declared.

We'll also continue to ignore issues of inheritance and automatic generation of generic procedures for slot accessor methods. When we implement inheritance, described later, we'll need to do something with the list of superclasses.)

(define-macro (define-class class-name superclass-list . slot-decls)
   `(define ,class-name
            (let ((slots-slist (generate-slots-alist ',slot-decls 1)))

               ; create the class object, implemented as a Scheme vector
               (let ((class-object (vector <<class>>            ; metaclass
                                           ',class-name         ; name
                                           (list ,@superclass-list) ; supers
                                           slots-alist          ; slots
                                           '*dummy*)))          ; creator
 
                   ; install a routine to create instances                             
                   (vector-set! class-object
                                4
                                ; creation routine takes slot values
                                ; as args, creates a vector w/class
                                ; pointer for this class followed by
                                ; slot values in place.
                                (lambda ,(map car slot-decls)
                                   (vector class-object
                                           ,@(map car slot-decls))))
                                     
      
                   ; register accessor methods with appropriate generic procs 
                   (register-accessor-methods class-object slots-alist)
      
                   class-object))

Two important helper routines are used by this macro: generate-slots-alist and register-accessor-methods.

The initial value expression for slots-alist is a call to generate-slots-alist, with an argument that is a quoted version of the argument declarations passed to the macro. Notice that we're using unquote inside a quoted expression, and this works. The value of slot-decls will be substituted inside the quote expression during macro processing.

For the <point> definition, the expression (generate-slots-alist ',slot-decls 1) will translate to (generate-slots-alist '((x) (y)) 1).(15) Several other expressions in the macro work this way, as well: For the <point> example, ',class-name will translate into '<point>, a literal referring to the name symbol for the particular class we're defining.

Likewise, (list ,@superclass-list), which uses unquote-splicing, will be translated to (list <object>); when that expression is evaluated, the value of the variable <object> will be fetched and put in a list. (Notice that this makes a list with the actual class object in it, not the symbol <object>.) The lambda expression that generates an instance creating procedure uses both unquote and unquote-splicing:

(lambda ,(map car slot-decls)
   (vector class-name ,@(map car slot-decls))

It will translate to

(lambda (x y)
   (vector class-name x y))

generate-slots-alist just traverses the list of slot declarations recursively, inrementing an index of which slot number is next, and constructs list of associations, one per slot. Each association is a list hose car (i.e., the key) is the name of the slot, and its second and third elements are procedures to access the slot. The actual accessor procedures are generated by calls to slot-n-getter and slot-n-setter, which return procedures to get or set the nth slot of a vector.

(define (generate-slots-alist slot-decls slot-num)
   (if (null? slot-decls)
       '()
       (cons `(,(caar slot-decls)
               ,(slot-n-getter slot-num)
               ,(slot-n-setter slot-num))
             (generate-slots-alist (cdr slot-decls)
                                   (+ 1 slot-num)))))

(This procedure is initially called with a slot-num of 1, reserving the zeroth slot for the class pointer.)

Here are simple versions of slot-n-getter and slot-n-setter. Each one simply makes a closure of an accessor procedure, capturing the environment where n is bound, to specialize the accessor to access a particular slot. (If we handle keyword options, we'll have to make the code a little more complicated.)

(define (slot-n-getter offset)
   (lambda (obj)                        ; return a procedure to read
      (vector-ref obj offset)))         ; slot n of an object
(define (slot-n-setter offset)
   (lambda (obj value)                  ; return a procedure to update
      (vector-set! obj offset value)))  ; slot n of an object

We construct a new closure for each slot accessor, but that really isn't necessary. We could cache the closures, and always return the same closure when we need an accessor for a particular slot offset.

class `<<class>>`

Our simple object system implementation assumes that every instance is represented as a Scheme vector whose 0th slot holds a pointer to a class object, which is also an object in the system. This implies that a class object must also have a class pointer in its zeroth slot. A question naturally arises as to what the class of a class object is, and what its class pointer points to.

This is actually a deep philosophical question, and for advanced and powerful object system, it has practical consequences. For our little object system, we'll settle the question in a simple way. All class objects have a class pointer that points to a special object, the class of all classes. We call this object <<class>>, where the doubled angle brackets suggest that it is not only a class, but the class of other class objects. This is known as a metaclass, because it's a class that's about classes.

It doesn't do very much--it just gives a special object we can use as the class value for other class objects, so that we can tell that they're classes.

In our simple system, the unique object <<class>> has a class pointer that points to itself---that is, it describes itself in the same sense that it describes other classes. This circularity isn't harmful, and allows us to terminate the possibly infinite regression of classes, metaclasses, meta-metaclasses, and so on.

We construct this one special object "by hand." Like other class objects in our system, it is represented as a Scheme vector whose first element points to itself, and which has a few other standard fields. Most of the standard fields will be empty, because class <<class>> has no superclasses, no slots, and no allocator--because we create the one instance specially.

The following definition suffices to create the class <<class>>:

(define <<class>>
        (let ((the-object (vector '*dummy*   ; placeholder for class ptr
                                  '<<class>> ; name of this class
                                  '()        ; superclasses (none)
                                  '()        ; slots (none)
                                  #f         ; allocator (none)
                                  '())))     ; prop. list (initially empty)
           ; set class pointer to refer to itself
           (vector-set! the-object 0 the-object)
           ; and return the class object as initial value for define
           the-object))

Once this is done, we can define a few other routines that will come in handy in implementing the rest of the object system:

instance? is a predicate that checks whether an object is an instance of a class in our class system, as opposed to a plain old Scheme object like a pair or a number. (In a better object system, like RScheme's, all Scheme objects would also be instances of classes, but we'll ignore that for now.)

; An object is an instance of a class if it's represented as a
; Scheme vector whose 0th slot holds a class object.
; Note: we assume that we never shove class objects into other
;       vectors.  We could relax this assumption, but our code
;       would be slower.
(define (instance? obj)
   (and (vector? obj)
        (class? (vector-ref 0 obj)))

; An object is a class (meta)object if it's represented as a Scheme
; vector whose 0th slot holds a pointer to the class <<class>>.
; Note: we assume that we never shove the <<class>> object into
;       other vectors.  We could relax this, at a speed cost.
(define (class? obj)
   (and (vector? obj)
        (eq? (vector-ref 0 obj) <<class>>)))

; We can fetch the class of an instance by extracting the value
; in its zeroth slot.  Note that we don't check that the argument
; obj *is* an instance, so applying this to a non-instance is an error.
(define (class-of-instance obj)
   (vector-ref obj 0))

Implementing `define-generic`

Each generic procedure maintains a table of methods that are defined on it, indexed by the classes they are applicable to. In our simple object system implementation, this table will be implemented as an association list, keyed by class pointer. That is, the association list is a list of lists, and each of those lists holds a class object and a procedure. The class object represents the class on which the method is defined, and the procedure is the method itself.

When the generic procedure is called on a particular instance, it will extract the class pointer from the zeroth slot of the instance, and use it as a key to probe its own association list. It will then extract the procedure that's the second element of the resulting list, and call it. When calling the method, it will pass along the same arguments it received.

This scheme can be rather slow--a linear search of all methods may be slow if there are many methods defined on a generic procedure, and especially if the frequently-called ones are not near the front of the list. We could speed this up considerably by using caching tricks, e.g., reorganizing the list to put recently-used elements at the front. A more aggressive system could figure out how to avoid looking up methods at runtime in most cases, but that's considerably more complicated. We won't bother with any of that for now, to keep our example system simple.

(Understanding this simple system will be a good start toward understanding more sophisticated systems that perform much better, and even this simple system is fast enough for many real-world uses, such as most scripting and GUI programming, or coarse-grained object-oriented programming where most of the real work is done in non-object-oriented code.)

When we evaluate an expression such as

(define-generic (insert-first! obj item))

we would like the macro to be translated into code that will do several things:

create an association list to store methods later defined on this generic procedure
create the generic procedure itself
provide a means for adding methods to the association list.
bind the name (e.g., insert-first! and initialize the binding with a pointer to the generic procedure.

The first two and the last are easy, and we'll ignore the third for now. define-generic can generate code like this:

(define insert-first!
        ;  create an environment that only the generic procedure will
        ;  be able to see.
        (let ((method-alist '()))
           ;  create and return the generic procedure that can see that
           ;  method a-list.
           (lambda (obj item)
              (let* ((class (class-of-instance obj))
                     (method (cadr (assq class method-alist))))
                 (if method
                     (method obj item)
                     (error "method not found"))))))

Here we use let to create a local variable binding to hold the association list, and capture it by using lambda to create the generic procedure in its scope. Once the procedure is returned from the let, only that procedure will ever be able to operate on that association list.

The procedure returned by lambda will take the two arguments specified by the generic procedure declaration, extract the class pointer from the first argument object, probe the association list to get the appropriate method for that class, and (tail-)call that method, passing along the original arguments. If it fails to find a method for the class of the instance it's being applied to, it signals an error.

Keeping in mind that this code doesn't quite work because we can't actually add methods to the method association list, we could define define-generic as a macro this way:

(define-macro (define-generic name . args)
   `(define ,name
            (let ((method-alist '()))
               (lambda (,@args)
                 (let* ((class (class-of-instance ,(car args))))
                        (method (cadr (assq class method-alist))))
                    (if method
                        (method obj item)
                        (error "method not found"))))))

To allow methods to be added to the method-alist, we'll change the macro to create another procedure, along with the generic procedure, in the environment where method-list is visible. This procedure can be used to add a new method to the method association lists. This table will be an association list stored in the global variable *generic-procedures*.

We'll also maintain a table of generic procedures and the corresponding procedures that add methods to their association lists. While we're at it, we'll modify define-generic record the name of a generic procedure when it's defined, so that it can print out a more helpful error message when a lookup fails. The inital value expression will be a letrec which lets us define four variables, two of which are procedure-valued, and then returns one of those procedures, the actual generic procedure

         
(define *generic-procedures* '())

(define-macro (define-generic name . args)
   `(define ,name
            (letrec ((gp-name ,name)
                     (method-alist '())
                     (method-adder
                      (lambda (generic-proc method)
                            (set! method-alist
                                  (cons (list generic-proc method)
                                        method-alist)))) 
                     (generic-proc
                      (lambda (,@args)
                         (let* ((class (class-of-instance ,(car args))))
                                (method (cadr (assq class
                                                    method-alist))))
                            (if method
                                (method obj item)
                                (error "method not found for "
                                       gp-name))))))
                                       
               ; add the generic procedure and its method-adding
               ; routine to the association list of generic procedures
                      
               (set! *generic-procedures*
                     (cons (list generic-proc method-adder)
                           *generic-procedures*))
                           
               generic-procedure)))

Implementing `define-method`

Now that each generic procedure is associated with a method-adding procedure that can add to its list of methods, we can define the define-method macro. define-method will create a method using lambda, and add it to the generic procedure's method association list, indexed by the class that it is to be used for.

In this simple system, where only the first argument is dispatched on (used in selecting the appropriate method), we only need to treat the first argument declaration specially.

Consider an example the example of defining an insert-first! method for class stack.

(define-method (insert-first! (self <stack>) item)
   (set-items! self
               (cons item (items self))))

We'd like this to be translated by macro processing into the equivalent

(add-method-to-generic-proc insert-first!
                            <stack>
                            (lambda (self item)
                               (set-items! self
                                           (cons item (items self)))))

The real work is done by the procedure add-method-to-generic-procedure, which we can write as

(define (add-method-to-generic-procedure generic-proc class method)
   (let ((method-adder! (cadr (assq *generic-procedures* generic-proc))))
      (method-adder! class method)))

This procedure expects three arguments--a generic procedure object, a class object, and a closure that implments the corresponding method. It searces the association list The calling pattern for the define-method macro will ensure that the actual calling expression is destructured into three parts, giving us the first argument's name and the name and its class.

(define-macro (define-method (gp (arg1 class) . args) . body)
   `(add-method-to-generic-proc ,gp
                                ,class
                                (lambda (arg1 ,@args)
                                   ,@body)))

Installing Accessor Methods

Given the code we've seen so far, we've almost got a working object system, but we left out a detail when we defined define-class. Recall that the accessor routines for a class's slots are supposed to be used as methods on generic procedures such as x. define-class generates code that calls register-accessor-methods, to install the accessor routines for the slots of a class as methods on generic procedures.

register-accessor-methods iterates over the slots association list of the class, looking at each slot name and its corresponding accessors, and adding the accessor procedures to the appropriate generic procedure. For a given slot name, the appropriate generic procedure name is automatically constructed using the accessor naming conventions.

[ OOPS--theres a hitch here. We didn't index the generic procedures by name... it's also awkward that Scheme doesn't provide a standard bound? procedure so that we can tell if the generic procedure already exists. Is it even possible to automatically define the generic procedures in absolutely portable Scheme, without doing something painful? I suppose that if we can search the list of generic procedures by name, the macro transformer for define-class can look to see which accessor names don't have corresponding generic functions, BEFORE actually generating the transformed code. It could then add a (define-generic ...) to its output for each accessor that doesn't already have an existing generic procedure to add it to. Tedious, and annoying to have to explain. ]

Keyword options

Inheritance

So far we've described a simple object-based programming system and shown how it can be implemented. A fully object-oriented system requires another feature---inheritance.

Inheritance allows you to define a new class in terms of another class. For example, we might have a class <point>, and want to define a similar class, <colored-point>, which records the color to be used to display a point when it is drawn on the user's screen.

Given our simple object-based system so far, we would have to define colored-point from scratch, defining its x and y fields as well as its color field. This definition would be mostly redundant with the definition of <point>, making the code harder to understand and maintain.

Inheritance lets us define new classes by describing its differences from another class. For example, we could define colored-point like this:

(define-class <colored-point> (<point>)
   (color))

This definition says that instances of <colored-point> have all of the slots of <points> (i.e., x and y), as well as another slot, color. We say that <colored-point> inherits the slotss defined for <point>.

Inheritance applies to methods as well as slots. The definition above tells our object system that the methods defined for the superclass <point> should also be used for <colored-point>s, unless we specifically define new methods just for <colored-point>s on the same generic procedures.

This gives us a concise declarative way of defining classes--we can declare that a <colored-point> is like a <point>, except for the particular differences we specify. The object system then infers what slots a <colored-point> must have from this declaration (and methods we define for this class) plus the declarations for <point> and its methods.

Note that inheritance is transitive. If we define a subclass of <colored-point>, say <flashing-colored-point>, it will inherit the slots and methods of <colored-point>, and also the slots and methods of <point>.

Overriding and Refining Inherited Methods

By default, a class inherits all of the methods defined for its superclasses. We can override an inherited definition, though, by defining a method definition explicitly. For example, we might have a draw method for class <point> which simply draws a black pixel on the screen at the point's coordinates. (This might be through a call to an underlying graphics library provided by the operating system.) For <colored-point>, we would probably want to define a new draw method so that the point would be drawn in color.

Sometimes, we don't want to completely redefine an inherited method for a new class, but we would like to refine it--we may want to define the new method in terms of the inherited method.

For example, suppose we have a class <queue>, which maintains a queue as we saw earlier, and we woulto refine it to create a new kind of queue that keeps track of the size of the queue--i.e., the number of items in the queue.

We might define <counted-queue> as a subclass of <queue>, but with a size slot, like this:

(define <counted-queue> (<queue>)
   (size initial-value: 0))

Then we can define the get-first and put-last methods for counted-queue in terms of the corresponding methods for <queue>. We do this by using a special pseudo-procedure called next-method. Inside a method definition, the name next-method refers to an inherited procedure by the same name. This allows us to call the inherited version of a method even though we're overriding that definition.

(define-method (get-first! (self <counted-queue>))
   (count-set! self (- (count self) 1))  ; update count of items, and
   (next-method self))                       ; call inherited get-first

(define-method (put-last! (self <counted-queue>) item)
   (next-method self item)
   (count-set! self (+ (count self) 1))

next-method's name comes from the fact that it represents the next most specific method for this operation applied to this class, according to the inheritance graph. The method we're defining is the most specific method, because it's defined for this class exactly, and the inherited one is the next most specific. (The inherited one may in turn call a method that was inherited earlier, which will in turn be the next most specific method, and so on.)