An Introduction to Scheme and its Implementation - Variables vs. Bindings vs. Values

Go to the first, previous, next, last section, table of contents.

Variables vs. Bindings vs. Values

The distinction between variables, bindings, and values is particularly important in Scheme, and important for understanding interpreters and compilers, so I'll take a little time to discuss it with examples.

What is this distinction? Why not just say that the variable holds a value, i.e., why not call the unit of storage a variable? Because that's not right. Consider the following short program.

(define (double x)        ; define a procedure that doubles its argument
   (+ x x))            

(define (quadruple x)     ; define a procedure that quadruples
   (+ (double x)          ; its argument.
      (double x)))

(define (foo x)           ; define a recursive procedure that calls
   (if (> x 0)            ; itself if its argument is more than 0,
       (foo (- x 1)))     ; handing the recursive call an argument
                          ; that's one less.

Notice that we've defined three procedures, double, quadruple, and foo, each of which has a local (argument) variable x. (An argument variable is just a local variable that gets its initial value in a special way, passed from the caller of the procedure.)

There are therefore three different variables named x in this code. In each of the procedures, it means something different. Each procedure defines a different meaning for the name x, and each separate meaning is a different variable.

This becomes obvious when we talk about "the variable x defined in the procedure double" versus "the variable x in the procedure foo," and so on.

We could change the names of variables, so that every variable has a different name, without changing the meaning of any of the procedure definitions. Wherever two different variables have the same name, we can change one of them to something else, as long as we change all references in the scope of that variable to the new name.

In the example above, we might change each variable x to x1, x2, or x3, and change all uses within its scope to use the new name:

(define (double x1)        ; define a procedure that doubles its argument
   (+ x2 x3))            

(define (quadruple x2)     ; define a procedure that quadruples
   (+ (double x2)          ; its argument.
      (double x2)))

(define (foo x3)           ; define a recursive procedure that calls
   (if (> x3 0)            ; itself if its argument is more than 0,
       (foo (- x3 1)))     ; handing the recursive call an argument
                          ; that's one less.

This makes it clearer that each of the variables is a different thing, but it doesn't change what the procedures do, because the normal scope rules of the language have the same effect.

Notice also that when we define the procedures, there is no storage allocated for their local variables; the variables named x in the procedures are just definitions--no space will be allocated for them until the procedures are actually called. That's when binding happens--some storage is allocated at run time to hold the value.

(Bear in mind that this happens in other languages too, even if people don't discuss it clearly--for example, a C argument variable is bound when you enter the procedure, because suddenly space is allocated for it and the name refers to that space.)

When we call something a "variable," that's not because we can assign to it and change its value. None of the above variables has a value that varies in that sense; none of these procedures happens to modify the values they're given as arguments. In some languages, such as pure functional languages, you can't do assignment at all, but those languages still have variables.

In programming language terminology, the term "variable" means pretty much what it means in mathematics--at different times we invoke the same procedure and the variable will refer to something different. For example, I may call double with the argument 10, and while executing in that call to double, the value of x will be 10. Later, I may call double with the value 500, and while executing in that call the value of x will be 500.

Consider how similar this is to variables in logic. I may have a logical statement that "for all x, if x is a person then x is mortal". (Forall x, person(x) -> mortal(x)). I can use the same logical rule (statement) and apply it to lots of things.

If Socrates is a person then Socrates is mortal, and if Bill Clinton is a person then Bill Clinton is mortal, and so on. (Or even, if my car is human then my car is mortal.)

Each time I use it, x may refer to a different thing, and that's why it's called a variable.

Just because it's a variable doesn't mean that when I use it I change the state of the thing I use it to refer to--for example, Bill Clinton is probably not modified much by the fact that I'm inferring something about him, and I'm pretty sure Socrates isn't changed much at all by the experience.

It also doesn't mean that the meaning of a variable changes from instant to instant. If I use the rule above, and apply it to Socrates, saying "if Socrates is a person then Socrates is mortal", x consistently refers to Socrates--that's the point. But I can also say that "if Bill Clinton is a person then Bill Clinton is mortal." In that case x refers consistently to Bill Clinton. In logic, we say that in one case x is bound to Socrates, but then used consistently within the rule; and in the other, we say it's bound to Bill Clinton, and then used consistently within the rule.

The point here is that the same variable can refer to different things at different times. These different things are called bindings, because the variable is associated with ("bound to") something different each time.

Consider the recursive procedure foo, above. In a recursive procedure, the same variable may be bound to different things at the same time. Suppose I call foo with the argument 15, and it binds its argument x and gives the binding the initial value 15. Then it examines that value, and calls itself with the argument 14. The recursive call binds its argument x with the value 14, then examines that and calls itself with the value 13, and so on.

At each recursive call, a new binding of x is created, even if the old bindings still exist, because the earlier calls haven't finished yet--they're suspended for the duration of the recursion.

When there are multiple bindings in existence at the same time, only one one is "visible" as a procedure executes. For example, in a recursive set of calls to a procedure, only one binding is "in scope," that is, visible) to an executing procedure--the binding for that call. We call this the current binding of the variable. When a call returns, an older binding becomes visible again, and becomes the current binding.

But what is a variable bound to, i.e., to what does a variable refer? In Scheme, it refers to a piece of storage. When you call a procedure, for example, each argument variable is bound to a piece of storage that can hold the argument value you pass. Inside that call to that procedure, that variable name will refer to that piece of memory.

A single binding of a Scheme variable may hold different values over time, because of assignments, as in most procedural languages. So not only may the same variable be bound to different pieces of storage, but each piece of storage may hold different values over time.(5)

Sometimes people talk about binding a variable to a value, but in Scheme (and other languages with assignment) this is not correct, and speaking in this sloppy way causes confusion. If you don't distinguish between storage and values, you can't talk clearly about assignment.

Always remember that there are three "one-to-many mappings" here:

a single name (identifier) can be used for different variables at different places in a program, or for a symbol
a given variable may be bound to different pieces of storage, e.g., at different calls to the same procedure,
a given binding may hold different values if you assign to it and change what's stored there.

To keep these terms straight, it's usually best to think about local variables; top-level or global variables are a special case, because they only have one binding each.

Top-level defines can be a little confusing in terms of the variable/binding/value distinction, because they do three different things. They declare a variable that will be visible in a scope (the top level scope), they bind they variable to new storage (creating the top-level binding), and they initialize that storage with an initial value.

==================================================================
This is the end of Hunk M.

TIME TO TRY IT OUT

At this point, you should go read Hunk N of the next chapter
and work through the examples using a running Scheme system.
Then return here and resume this chapter.
==================================================================

(Go to Hunk N, which starts at section Interactively Changing a Program (Hunk N).)

Go to the first, previous, next, last section, table of contents.