This is r5rs.info, produced by makeinfo version 4.2 from r5rs.texi.

INFO-DIR-SECTION The Algorithmic Language Scheme
START-INFO-DIR-ENTRY
* R5RS: (r5rs).                 The Revised(5) Report on Scheme.
END-INFO-DIR-ENTRY

20 February 1998


File: r5rs.info,  Node: Top,  Next: Introduction,  Prev: (dir),  Up: (dir)

Revised(5) Report on the Algorithmic Language   Scheme
******************************************************


     RICHARD KELSEY, WILLIAM CLINGER, AND JONATHAN REES (Editors)                                    
     H. ABELSON        R. K. DYBVIG      C. T. HAYNES      G. J. ROZAS
     N. I. ADAMS IV    D. P. FRIEDMAN    E. KOHLBECKER     G. L. STEELE JR.
     D. H. BARTLEY     R. HALSTEAD       D. OXLEY          G. J. SUSSMAN
     G. BROOKS         C. HANSON         K. M. PITMAN      M. WAND


Dedicated to the Memory of Robert Hieb


Summary
*******

The report gives a defining description of the programming language
Scheme.  Scheme is a statically scoped and properly tail-recursive
dialect of the Lisp programming language invented by Guy Lewis Steele
Jr. and Gerald Jay Sussman.  It was designed to have an exceptionally
clear and simple semantics and few different ways to form expressions.
A wide variety of programming paradigms, including imperative,
functional, and message passing styles, find convenient expression in
Scheme.

The introduction offers a brief history of the language and of the
report.

The first three chapters present the fundamental ideas of the language
and describe the notational conventions used for describing the
language and for writing programs in the language.

Chapters *Note Expressions:: and *Note Program structure:: describe the
syntax and semantics of expressions, programs, and definitions.

Chapter *Note Standard procedures:: describes Scheme's built-in
procedures, which include all of the language's data manipulation and
input/output primitives.

Chapter *Note Formal syntax and semantics:: provides a formal syntax
for Scheme written in extended BNF, along with a formal denotational
semantics.  An example of the use of the language follows the formal
syntax and semantics.

The report concludes with a list of references and an alphabetic index.

Contents
********

* Menu:

* Introduction::
* Overview of Scheme::
* Lexical conventions::
* Basic concepts::
* Expressions::
* Program structure::
* Standard procedures::
* Formal syntax and semantics::
* Notes::
* Additional material::
* Example::
* Bibliography::
* Index::


File: r5rs.info,  Node: Introduction,  Next: Overview of Scheme,  Prev: Top,  Up: Top

Introduction
************

* Menu:

* Background::
* Acknowledgements::

Programming languages should be designed not by piling feature on top of
feature, but by removing the weaknesses and restrictions that make
additional features appear necessary.  Scheme demonstrates that a very
small number of rules for forming expressions, with no restrictions on
how they are composed, suffice to form a practical and efficient
programming language that is flexible enough to support most of the
major programming paradigms in use today.

Scheme was one of the first programming languages to incorporate first
class procedures as in the lambda calculus, thereby proving the
usefulness of static scope rules and block structure in a dynamically
typed language.  Scheme was the first major dialect of Lisp to
distinguish procedures from lambda expressions and symbols, to use a
single lexical environment for all variables, and to evaluate the
operator position of a procedure call in the same way as an operand
position.  By relying entirely on procedure calls to express iteration,
Scheme emphasized the fact that tail-recursive procedure calls are
essentially goto's that pass arguments.  Scheme was the first widely
used programming language to embrace first class escape procedures,
from which all previously known sequential control structures can be
synthesized.  A subsequent version of Scheme introduced the concept of
exact and inexact numbers, an extension of Common Lisp's generic
arithmetic.  More recently, Scheme became the first programming
language to support hygienic macros, which permit the syntax of a
block-structured language to be extended in a consistent and reliable
manner.


File: r5rs.info,  Node: Background,  Next: Acknowledgements,  Prev: Introduction,  Up: Introduction

Background
==========

The first description of Scheme was written in 1975 [Scheme75].  A
revised report [Scheme78]  appeared in 1978, which described the
evolution of the language as its MIT implementation was upgraded to
support an innovative compiler [Rabbit].  Three distinct projects began
in 1981 and 1982 to use variants of Scheme for courses at MIT, Yale, and
Indiana University [Rees82], [MITScheme], [Scheme311].  An introductory
computer science textbook using Scheme was published in 1984 [SICP].

As Scheme became more widespread, local dialects began to diverge until
students and researchers occasionally found it difficult to understand
code written at other sites.  Fifteen representatives of the major
implementations of Scheme therefore met in October 1984 to work toward
a better and more widely accepted standard for Scheme.

Their report [RRRS] was published at MIT and Indiana University in the
summer of 1985.  Further revision took place in the spring of 1986
[R3RS], and in the spring of 1988 [R4RS].  The present report reflects
further revisions agreed upon in a meeting at Xerox PARC in June 1992.


We intend this report to belong to the entire Scheme community, and so
we grant permission to copy it in whole or in part without fee.  In
particular, we encourage implementors of Scheme to use this report as a
starting point for manuals and other documentation, modifying it as
necessary.


File: r5rs.info,  Node: Acknowledgements,  Prev: Background,  Up: Introduction

Acknowledgements
================

We would like to thank the following people for their help: Alan
Bawden, Michael Blair, George Carrette, Andy Cromarty, Pavel Curtis,
Jeff Dalton, Olivier Danvy, Ken Dickey, Bruce Duba, Marc Feeley, Andy
Freeman, Richard Gabriel, Yekta G"ursel, Ken Haase, Robert Hieb, Paul
Hudak, Morry Katz, Chris Lindblad, Mark Meyer, Jim Miller, Jim Philbin,
John Ramsdell, Mike Shaff, Jonathan Shapiro, Julie Sussman, Perry
Wagle, Daniel Weise, Henry Wu, and Ozan Yigit.  We thank Carol
Fessenden, Daniel Friedman, and Christopher Haynes for permission to
use text from the Scheme 311 version 4 reference manual.  We thank
Texas Instruments, Inc. for permission to use text from the _TI Scheme
Language Reference Manual_[TImanual85].  We gladly acknowledge the
influence of manuals for MIT Scheme[MITScheme], T[Rees84], Scheme
84[Scheme84],Common Lisp[CLtL], and Algol 60[Naur63].

We also thank Betty Dexter for the extreme effort she put into setting
this report in TeX, and Donald Knuth for designing the program that
caused her troubles.

The Artificial Intelligence Laboratory of the Massachusetts Institute
of Technology, the Computer Science Department of Indiana University,
the Computer and Information Sciences Department of the University of
Oregon, and the NEC Research Institute supported the preparation of
this report.  Support for the MIT work was provided in part by the
Advanced Research Projects Agency of the Department of Defense under
Office of Naval Research contract N00014-80-C-0505.  Support for the
Indiana University work was provided by NSF grants NCS 83-04567 and NCS
83-03325.



File: r5rs.info,  Node: Overview of Scheme,  Next: Lexical conventions,  Prev: Introduction,  Up: Top

Overview of Scheme
******************

* Menu:

* Semantics::
* Syntax::
* Notation and terminology::


File: r5rs.info,  Node: Semantics,  Next: Syntax,  Prev: Overview of Scheme,  Up: Overview of Scheme

Semantics
=========

This section gives an overview of Scheme's semantics.  A detailed
informal semantics is the subject of chapters *Note Basic concepts::
through *Note Standard procedures::.  For reference purposes, section
*Note Formal semantics:: provides a formal semantics of Scheme.

Following Algol, Scheme is a statically scoped programming language.
Each use of a variable is associated with a lexically apparent binding
of that variable.

Scheme has latent as opposed to manifest types.  Types are associated
with values (also called objects) rather than with variables.  (Some
authors refer to languages with latent types as weakly typed or
dynamically typed languages.)  Other languages with latent types are
APL, Snobol, and other dialects of Lisp.  Languages with manifest types
(sometimes referred to as strongly typed or statically typed languages)
include Algol 60, Pascal, and C.

All objects created in the course of a Scheme computation, including
procedures and continuations, have unlimited extent.  No Scheme object
is ever destroyed.  The reason that implementations of Scheme do not
(usually!) run out of storage is that they are permitted to reclaim the
storage occupied by an object if they can prove that the object cannot
possibly matter to any future computation.  Other languages in which
most objects have unlimited extent include APL and other Lisp dialects.

Implementations of Scheme are required to be properly tail-recursive.
This allows the execution of an iterative computation in constant space,
even if the iterative computation is described by a syntactically
recursive procedure.  Thus with a properly tail-recursive
implementation, iteration can be expressed using the ordinary
procedure-call mechanics, so that special iteration constructs are
useful only as syntactic sugar.  See section *Note Proper tail
recursion::.

Scheme procedures are objects in their own right.  Procedures can be
created dynamically, stored in data structures, returned as results of
procedures, and so on.  Other languages with these properties include
Common Lisp and ML.

One distinguishing feature of Scheme is that continuations, which in
most other languages only operate behind the scenes, also have
"first-class" status.  Continuations are useful for implementing a wide
variety of advanced control constructs, including non-local exits,
backtracking, and coroutines.  See section *Note Control features::.

Arguments to Scheme procedures are always passed by value, which means
that the actual argument expressions are evaluated before the procedure
gains control, whether the procedure needs the result of the evaluation
or not.  ML, C, and APL are three other languages that always pass
arguments by value.  This is distinct from the lazy-evaluation
semantics of Haskell, or the call-by-name semantics of Algol 60, where
an argument expression is not evaluated unless its value is needed by
the procedure.

Scheme's model of arithmetic is designed to remain as independent as
possible of the particular ways in which numbers are represented within
a computer. In Scheme, every integer is a rational number, every
rational is a real, and every real is a complex number.  Thus the
distinction between integer and real arithmetic, so important to many
programming languages, does not appear in Scheme.  In its place is a
distinction between exact arithmetic, which corresponds to the
mathematical ideal, and inexact arithmetic on approximations.  As in
Common Lisp, exact arithmetic is not limited to integers.


File: r5rs.info,  Node: Syntax,  Next: Notation and terminology,  Prev: Semantics,  Up: Overview of Scheme

Syntax
======

Scheme, like most dialects of Lisp, employs a fully parenthesized prefix
notation for programs and (other) data; the grammar of Scheme generates
a sublanguage of the language used for data.  An important consequence
of this simple, uniform representation is the susceptibility of Scheme
programs and data to uniform treatment by other Scheme programs.  For
example, the `eval' procedure evaluates a Scheme program expressed as
data.

The `read' procedure performs syntactic as well as lexical
decomposition of the data it reads.  The `read' procedure parses its
input as data (section *note External representation::), not as program.

The formal syntax of Scheme is described in section *Note Formal
syntax::.


File: r5rs.info,  Node: Notation and terminology,  Prev: Syntax,  Up: Overview of Scheme

Notation and terminology
========================

* Menu:

* Primitive; library; and optional features::
* Error situations and unspecified behavior::
* Entry format::
* Evaluation examples::
* Naming conventions::


File: r5rs.info,  Node: Primitive; library; and optional features,  Next: Error situations and unspecified behavior,  Prev: Notation and terminology,  Up: Notation and terminology

Primitive; library; and optional features
-----------------------------------------

It is required that every implementation of Scheme support all features
that are not marked as being "optional".  Implementations are free to
omit optional features of Scheme or to add extensions, provided the
extensions are not in conflict with the language reported here.  In
particular, implementations must support portable code by providing a
syntactic mode that preempts no lexical conventions of this report.

To aid in understanding and implementing Scheme, some features are
marked as "library". These can be easily implemented in terms of the
other, primitive, features.  They are redundant in the strict sense of
the word, but they capture common patterns of usage, and are therefore
provided as convenient abbreviations.


File: r5rs.info,  Node: Error situations and unspecified behavior,  Next: Entry format,  Prev: Primitive; library; and optional features,  Up: Notation and terminology

Error situations and unspecified behavior
-----------------------------------------

When speaking of an error situation, this report uses the phrase "an
error is signalled" to indicate that implementations must detect and
report the error.  If such wording does not appear in the discussion of
an error, then implementations are not required to detect or report the
error, though they are encouraged to do so.  An error situation that
implementations are not required to detect is usually referred to simply
as "an error."

For example, it is an error for a procedure to be passed an argument
that the procedure is not explicitly specified to handle, even though
such domain errors are seldom mentioned in this report.
Implementations may extend a procedure's domain of definition to
include such arguments.

This report uses the phrase "may report a violation of an
implementation restriction" to indicate circumstances under which an
implementation is permitted to report that it is unable to continue
execution of a correct program because of some restriction imposed by
the implementation.  Implementation restrictions are of course
discouraged, but implementations are encouraged to report violations of
implementation restrictions.

For example, an implementation may report a violation of an
implementation restriction if it does not have enough storage to run a
program.

If the value of an expression is said to be "unspecified," then the
expression must evaluate to some object without signalling an error,
but the value depends on the implementation; this report explicitly does
not say what value should be returned.


File: r5rs.info,  Node: Entry format,  Next: Evaluation examples,  Prev: Error situations and unspecified behavior,  Up: Notation and terminology

Entry format
------------

Chapters *Note Expressions:: and *Note Standard procedures:: are
organized into entries.  Each entry describes one language feature or a
group of related features, where a feature is either a syntactic
construct or a built-in procedure.  An entry begins with one or more
header lines of the form

 - CATEGORY: TEMPLATE

for required, primitive features, or

 - QUALIFIER CATEGORY: TEMPLATE

where QUALIFIER is either "library" or "optional" as defined  in
section *Note Primitive; library; and optional features::.

If CATEGORY is "syntax", the entry describes an expression type, and
the template gives the syntax of the expression type.  Components of
expressions are designated by syntactic variables, which are written
using angle brackets, for example, <expression>, <variable>.  Syntactic
variables should be understood to denote segments of program text; for
example, <expression> stands for any string of characters which is a
syntactically valid expression.  The notation

 <thing1> ...

indicates zero or more occurrences of a <thing>, and

 <thing1> <thing2> ...

indicates one or more occurrences of a <thing>.

If CATEGORY is "procedure", then the entry describes a procedure, and
the header line gives a template for a call to the procedure.  Argument
names in the template are ITALICIZED.  Thus the header line

 - procedure: vector-ref VECTOR K

indicates that the built-in procedure vector-ref takes two arguments, a
vector VECTOR and an exact non-negative integer K (see below).  The
header lines

 - procedure: make-vector K
 - procedure: make-vector K FILL

indicate that the make-vector procedure must be defined to take either
one or two arguments.

It is an error for an operation to be presented with an argument that it
is not specified to handle.  For succinctness, we follow the convention
that if an argument name is also the name of a type listed in section
*Note Disjointness of types::, then that argument must be of the named
type.  For example, the header line for vector-ref given above dictates
that the first argument to vector-ref must be a vector.  The following
naming conventions also imply type restrictions:

                                   
    OBJ
          any object

    LIST, LIST1, ... LISTJ, ...
          list (see section *note Pairs and lists::)

    Z, Z1, ... ZJ, ...
          complex number

    X, X1, ... XJ, ...
          real number

    Y, Y1, ... YJ, ...
          real number

    Q, Q1, ... QJ, ...
          rational number

    N, N1, ... NJ, ...
          integer

    K, K1, ... KJ, ...
          exact non-negative integer



File: r5rs.info,  Node: Evaluation examples,  Next: Naming conventions,  Prev: Entry format,  Up: Notation and terminology

Evaluation examples
-------------------

The symbol "=>" used in program examples should be read "evaluates to."
For example,


     (* 5 8)                                ==>  40

means that the expression (* 5 8) evaluates to the object 40.  Or, more
precisely:  the expression given by the sequence of characters "(* 5
8)" evaluates, in the initial environment, to an object that may be
represented externally by the sequence of characters "40".  See section
*Note External representations:: for a discussion of external
representations of objects.


File: r5rs.info,  Node: Naming conventions,  Prev: Evaluation examples,  Up: Notation and terminology

Naming conventions
------------------

By convention, the names of procedures that always return a boolean
value usually end in "`?'".  Such procedures are called predicates.

By convention, the names of procedures that store values into previously
allocated locations (see section *note Storage model::) usually end in
"`!'".  Such procedures are called mutation procedures.  By convention,
the value returned by a mutation procedure is unspecified.

By convention, "`->'" appears within the names of procedures that take
an object of one type and return an analogous object of another type.
For example, `list->vector' takes a list and returns a vector whose
elements are the same as those of the list.


File: r5rs.info,  Node: Lexical conventions,  Next: Basic concepts,  Prev: Overview of Scheme,  Up: Top

Lexical conventions
*******************

* Menu:

* Identifiers::
* Whitespace and comments::
* Other notations::

This section gives an informal account of some of the lexical
conventions used in writing Scheme programs.  For a formal syntax of
Scheme, see section *Note Formal syntax::.

Upper and lower case forms of a letter are never distinguished except
within character and string constants.  For example, `Foo' is the same
identifier as `FOO', and #x1AB is the same number as #X1ab.


File: r5rs.info,  Node: Identifiers,  Next: Whitespace and comments,  Prev: Lexical conventions,  Up: Lexical conventions

Identifiers
===========

Most identifiers allowed by other programming languages are also
acceptable to Scheme.  The precise rules for forming identifiers vary
among implementations of Scheme, but in all implementations a sequence
of letters, digits, and "extended alphabetic characters" that begins
with a character that cannot begin a number is an identifier.  In
addition, `+', `-', and `...' are identifiers.  Here are some examples
of identifiers:


     lambda                   q
     list->vector             soup
     +                        V17a
     <=?                      a34kTMNs
     the-word-recursion-has-many-meanings

Extended alphabetic characters may be used within identifiers as if
they were letters.  The following are extended alphabetic characters:


     ! $ % & * + - . / : < = > ? @ ^ _ ~

See section *Note Lexical structure:: for a formal syntax of
identifiers.

Identifiers have two uses within Scheme programs:

   * Any identifier may be used as a variable or as a syntactic keyword
     (see sections *note Variables; syntactic keywords; and regions::
     and *note Macros::).

   * When an identifier appears as a literal or within a literal (see
     section *note Literal expressions::), it is being used to denote a
     _symbol_ (see section *note Symbols::).



File: r5rs.info,  Node: Whitespace and comments,  Next: Other notations,  Prev: Identifiers,  Up: Lexical conventions

Whitespace and comments
=======================

"Whitespace" characters are spaces and newlines.  (Implementations
typically provide additional whitespace characters such as tab or page
break.)  Whitespace is used for improved readability and as necessary
to separate tokens from each other, a token being an indivisible
lexical unit such as an identifier or number, but is otherwise
insignificant.  Whitespace may occur between any two tokens, but not
within a token.  Whitespace may also occur inside a string, where it is
significant.

A semicolon (;) indicates the start of a comment.  The comment
continues to the end of the line on which the semicolon appears.
Comments are invisible to Scheme, but the end of the line is visible as
whitespace.  This prevents a comment from appearing in the middle of an
identifier or number.


     ;;; The FACT procedure computes the factorial
     ;;; of a non-negative integer.
     (define fact
       (lambda (n)
         (if (= n 0)
             1        ;Base case: return 1
             (* n (fact (- n 1))))))


File: r5rs.info,  Node: Other notations,  Prev: Whitespace and comments,  Up: Lexical conventions

Other notations
===============

For a description of the notations used for numbers, see section *Note
Numbers::.

. + -
     These are used in numbers, and may also occur anywhere in an
     identifier except as the first character.  A delimited plus or
     minus sign by itself is also an identifier.  A delimited period
     (not occurring within a number or identifier) is used in the
     notation for pairs (section *note Pairs and lists::), and to
     indicate a rest-parameter in a  formal parameter list (section
     *note Procedures::).  A delimited sequence of three successive
     periods is also an identifier.

( )
     Parentheses are used for grouping and to notate lists (section
     *note Pairs and lists::).

'
     The single quote character is used to indicate literal data
     (section *note Literal expressions::).

`
     The backquote character is used to indicate almost-constant data
     (section *note Quasiquotation::).

, ,@
     The character comma and the sequence comma at-sign are used in
     conjunction with backquote (section *note Quasiquotation::).

"
     The double quote character is used to delimit strings (section
     *note Strings::).

\
     Backslash is used in the syntax for character constants (section
     *note Characters::) and as an escape character within string
     constants (section *note Strings::).

[ ] { } |
     Left and right square brackets and curly braces and vertical bar
     are reserved for possible future extensions to the language.

#
     Sharp sign is used for a variety of purposes depending on the
     character that immediately follows it:

#t #f
     These are the boolean constants (section *note Booleans::).

#\
     This introduces a character constant (section *note Characters::).

#(
     This introduces a vector constant (section *note Vectors::).
     Vector constants are terminated by ) .

#e #i #b #o #d #x
     These are used in the notation for numbers (section *note Syntax
     of numerical constants::).


File: r5rs.info,  Node: Basic concepts,  Next: Expressions,  Prev: Lexical conventions,  Up: Top

Basic concepts
**************

* Menu:

* Variables; syntactic keywords; and regions::
* Disjointness of types::
* External representations::
* Storage model::
* Proper tail recursion::


File: r5rs.info,  Node: Variables; syntactic keywords; and regions,  Next: Disjointness of types,  Prev: Basic concepts,  Up: Basic concepts

Variables; syntactic keywords; and regions
==========================================

An identifier may name a type of syntax, or it may name a location
where a value can be stored.  An identifier that names a type of syntax
is called a _syntactic keyword_ and is said to be _bound_ to that
syntax.  An identifier that names a location is called a _variable_ and
is said to be _bound_ to that location.  The set of all visible
bindings in effect at some point in a program is known as the
_environment_ in effect at that point.  The value stored in the
location to which a variable is bound is called the variable's value.
By abuse of terminology, the variable is sometimes said to name the
value or to be bound to the value.  This is not quite accurate, but
confusion rarely results from this practice.

Certain expression types are used to create new kinds of syntax and
bind syntactic keywords to those new syntaxes, while other expression
types create new locations and bind variables to those locations.
These expression types are called _binding constructs_.

Those that bind syntactic keywords are listed in section *Note Macros::.
The most fundamental of the variable binding constructs is the `lambda'
expression, because all other variable binding constructs can be
explained in terms of `lambda' expressions.  The other variable binding
constructs are `let', `let*', `letrec', and `do' expressions (see
sections *note Procedures::, *note Binding constructs::, and *note
Iteration::).

Like Algol and Pascal, and unlike most other dialects of Lisp except
for Common Lisp, Scheme is a statically scoped language with block
structure.  To each place where an identifier is bound in a program
there corresponds a "region" of the program text within which the
binding is visible.  The region is determined by the particular binding
construct that establishes the binding; if the binding is established
by a `lambda' expression, for example, then its region is the entire
`lambda' expression.  Every mention of an identifier refers to the
binding of the identifier that established the innermost of the regions
containing the use.  If there is no binding of the identifier whose
region contains the use, then the use refers to the binding for the
variable in the top level environment, if any (chapters *note
Expressions:: and *note Standard procedures::); if there is no binding
for the identifier, it is said to be "unbound".


File: r5rs.info,  Node: Disjointness of types,  Next: External representations,  Prev: Variables; syntactic keywords; and regions,  Up: Basic concepts

Disjointness of types
=====================

No object satisfies more than one of the following predicates:


     boolean?          pair?
     symbol?           number?
     char?             string?
     vector?           port?
     procedure?

These predicates define the types _boolean_, _pair_, _symbol_,
_number_, _char_ (or _character_), _string_, _vector_, _port_, and
_procedure_.  The empty list is a special object of its own type; it
satisfies none of the above predicates.

Although there is a separate boolean type, any Scheme value can be used
as a boolean value for the purpose of a conditional test.  As explained
in section *Note Booleans::, all values count as true in such a test
except for #f.  This report uses the word "true" to refer to any Scheme
value except #f, and the word "false" to refer to #f.


File: r5rs.info,  Node: External representations,  Next: Storage model,  Prev: Disjointness of types,  Up: Basic concepts

External representations
========================

An important concept in Scheme (and Lisp) is that of the _external
representation_ of an object as a sequence of characters.  For example,
an external representation of the integer 28 is the sequence of
characters "28", and an external representation of a list consisting of
the integers 8 and 13 is the sequence of characters "(8 13)".

The external representation of an object is not necessarily unique.  The
integer 28 also has representations "#e28.000" and "#x1c", and the list
in the previous paragraph also has the representations "( 08 13 )" and
"(8 . (13 . ()))" (see section *note Pairs and lists::).

Many objects have standard external representations, but some, such as
procedures, do not have standard representations (although particular
implementations may define representations for them).

An external representation may be written in a program to obtain the
corresponding object (see `quote', section *note Literal expressions::).

External representations can also be used for input and output.  The
procedure `read' (section *note Input::) parses external
representations, and the procedure `write' (section *note Output::)
generates them.  Together, they provide an elegant and powerful
input/output facility.

Note that the sequence of characters "(+ 2 6)" is _not_ an external
representation of the integer 8, even though it _is_ an expression
evaluating to the integer 8; rather, it is an external representation
of a three-element list, the elements of which are the symbol + and the
integers 2 and 6.  Scheme's syntax has the property that any sequence
of characters that is an expression is also the external representation
of some object.  This can lead to confusion, since it may not be
obvious out of context whether a given sequence of characters is
intended to denote data or program, but it is also a source of power,
since it facilitates writing programs such as interpreters and
compilers that treat programs as data (or vice versa).

The syntax of external representations of various kinds of objects
accompanies the description of the primitives for manipulating the
objects in the appropriate sections of chapter *Note Standard
procedures::.


File: r5rs.info,  Node: Storage model,  Next: Proper tail recursion,  Prev: External representations,  Up: Basic concepts

Storage model
=============

Variables and objects such as pairs, vectors, and strings implicitly
denote locations or sequences of locations.  A string, for example,
denotes as many locations as there are characters in the string.
(These locations need not correspond to a full machine word.) A new
value may be stored into one of these locations using the string-set!
procedure, but the string continues to denote the same locations as
before.

An object fetched from a location, by a variable reference or by a
procedure such as `car', `vector-ref', or `string-ref', is equivalent
in the sense of `eqv?' (section *note Equivalence predicates::) to the
object last stored in the location before the fetch.

Every location is marked to show whether it is in use.  No variable or
object ever refers to a location that is not in use.  Whenever this
report speaks of storage being allocated for a variable or object, what
is meant is that an appropriate number of locations are chosen from the
set of locations that are not in use, and the chosen locations are
marked to indicate that they are now in use before the variable or
object is made to denote them.

In many systems it is desirable for constants (i.e. the values of
literal expressions) to reside in read-only-memory.  To express this,
it is convenient to imagine that every object that denotes locations is
associated with a flag telling whether that object is mutable or
immutable.  In such systems literal constants and the strings returned
by `symbol->string' are immutable objects, while all objects created by
the other procedures listed in this report are mutable.  It is an error
to attempt to store a new value into a location that is denoted by an
immutable object.


File: r5rs.info,  Node: Proper tail recursion,  Prev: Storage model,  Up: Basic concepts

Proper tail recursion
=====================

Implementations of Scheme are required to be _properly tail-recursive_.
Procedure calls that occur in certain syntactic contexts defined below
are `tail calls'.  A Scheme implementation is properly tail-recursive
if it supports an unbounded number of active tail calls.  A call is
_active_ if the called procedure may still return.  Note that this
includes calls that may be returned from either by the current
continuation or by continuations captured earlier by
`call-with-current-continuation' that are later invoked.  In the
absence of captured continuations, calls could return at most once and
the active calls would be those that had not yet returned.  A formal
definition of proper tail recursion can be found in
[propertailrecursion].

     _Rationale:_

     Intuitively, no space is needed for an active tail call because the
     continuation that is used in the tail call has the same semantics
     as the continuation passed to the procedure containing the call.
     Although an improper implementation might use a new continuation
     in the call, a return to this new continuation would be followed
     immediately by a return to the continuation passed to the
     procedure.  A properly tail-recursive implementation returns to
     that continuation directly.

     Proper tail recursion was one of the central ideas in Steele and
     Sussman's original version of Scheme.  Their first Scheme
     interpreter implemented both functions and actors.  Control flow
     was expressed using actors, which differed from functions in that
     they passed their results on to another actor instead of returning
     to a caller.  In the terminology of this section, each actor
     finished with a tail call to another actor.

     Steele and Sussman later observed that in their interpreter the
     code for dealing with actors was identical to that for functions
     and thus there was no need to include both in the language.


A _tail call_ is a procedure call that occurs in a _tail context_.
Tail contexts are defined inductively.  Note that a tail context is
always determined with respect to a particular lambda expression.

   * The last expression within the body of a lambda expression, shown
     as <tail expression> below, occurs in a tail context.

     (lambda <formals>
       <definition>* <expression>* <tail expression>)

   * If one of the following expressions is in a tail context, then the
     subexpressions shown as <tail expression> are in a tail context.
     These were derived from rules in the grammar given in chapter
     *Note Formal syntax and semantics:: by replacing some occurrences
     of <expression> with <tail expression>.  Only those rules that
     contain tail contexts are shown here.

     (if <expression> <tail expression> <tail expression>)
     (if <expression> <tail expression>)
     
     (cond <cond clause>+)
     (cond <cond clause>* (else <tail sequence>))
     
     (case <expression>
       <case clause>+)
     (case <expression>
       <case clause>*
       (else <tail sequence>))
     
     (and <expression>* <tail expression>)
     (or <expression>* <tail expression>)
     
     (let (<binding spec>*) <tail body>)
     (let <variable> (<binding spec>*) <tail body>)
     (let* (<binding spec>*) <tail body>)
     (letrec (<binding spec>*) <tail body>)
     
     (let-syntax (<syntax spec>*) <tail body>)
     (letrec-syntax (<syntax spec>*) <tail body>)
     
     (begin <tail sequence>)
     
     (do (<iteration spec>*)
         (<test> <tail sequence>)
       <expression>*)
     
     where
     
     <cond clause> --> (<test> <tail sequence>)
     <case clause> --> ((<datum>*) <tail sequence>)
     
     <tail body> --> <definition>* <tail sequence>
     <tail sequence> --> <expression>* <tail expression>

   * If a `cond' expression is in a tail context, and has a clause of
     the form `(<expression1> => <expression2>)' then the (implied)
     call to the procedure that results from the evaluation of
     <expression2> is in a tail context.  <expression2> itself is not
     in a tail context.


Certain built-in procedures are also required to perform tail calls.
The first argument passed to `apply' and to
`call-with-current-continuation', and the second argument passed to
`call-with-values', must be called via a tail call.  Similarly, `eval'
must evaluate its argument as if it were in tail position within the
`eval' procedure.

In the following example the only tail call is the call to `f'.  None
of the calls to `g' or `h' are tail calls.  The reference to `x' is in
a tail context, but it is not a call and thus is not a tail call.


     (lambda ()
       (if (g)
           (let ((x (h)))
             x)
           (and (g) (f))))

     _Note:_ Implementations are allowed, but not required, to
     recognize that some non-tail calls, such as the call to `h' above,
     can be evaluated as though they were tail calls.  In the example
     above, the `let' expression could be compiled as a tail call to
     `h'. (The possibility of `h' returning an unexpected number of
     values can be ignored, because in that case the effect of the
     `let' is explicitly unspecified and implementation-dependent.)


File: r5rs.info,  Node: Expressions,  Next: Program structure,  Prev: Basic concepts,  Up: Top

Expressions
***********

* Menu:

* Primitive expression types::
* Derived expression types::
* Macros::

Expression types are categorized as _primitive_ or _derived_.
Primitive expression types include variables and procedure calls.
Derived expression types are not semantically primitive, but can instead
be defined as macros.  With the exception of `quasiquote', whose macro
definition is complex, the derived expressions are classified as
library features.  Suitable definitions are given in section *Note
Derived expression type::.


File: r5rs.info,  Node: Primitive expression types,  Next: Derived expression types,  Prev: Expressions,  Up: Expressions

Primitive expression types
==========================

* Menu:

* Variable references::
* Literal expressions::
* Procedure calls::
* Procedures::
* Conditionals::
* Assignments::


File: r5rs.info,  Node: Variable references,  Next: Literal expressions,  Prev: Primitive expression types,  Up: Primitive expression types

Variable references
-------------------

 - syntax: <variable>
     An expression consisting of a variable (section *note Variables;
     syntactic keywords; and regions::) is a variable reference.  The
     value of the variable reference is the value stored in the
     location to which the variable is bound.  It is an error to
     reference an unbound variable.

     (define x 28)
     x                                      ==>  28



File: r5rs.info,  Node: Literal expressions,  Next: Procedure calls,  Prev: Variable references,  Up: Primitive expression types

Literal expressions
-------------------

 - syntax: quote <datum>
 - syntax: '<datum>
 - syntax: <constant>
     `(quote <datum>)' evaluates to <datum>.  <Datum> may be any
     external representation of a Scheme object (see section *note
     External representations::).  This notation is used to include
     literal constants in Scheme code.


     (quote a)                              ==>  a
     (quote #(a b c))                       ==>  #(a b c)
     (quote (+ 1 2))                        ==>  (+ 1 2)

     `(quote <datum>)' may be abbreviated as '<datum>.  The two
     notations are equivalent in all respects.

     'a                                     ==>  a
     '#(a b c)                              ==>  #(a b c)
     '()                                    ==>  ()
     '(+ 1 2)                               ==>  (+ 1 2)
     '(quote a)                             ==>  (quote a)
     ''a                                    ==>  (quote a)

     Numerical constants, string constants, character constants, and
     boolean constants evaluate "to themselves"; they need not be
     quoted.

     '"abc"                                 ==>  "abc"
     "abc"                                  ==>  "abc"
     '145932                                ==>  145932
     145932                                 ==>  145932
     '#t                                    ==>  #t
     #t                                     ==>  #t

     As noted in section *Note Storage model::, it is an error to alter
     a constant (i.e. the value of a literal expression) using a
     mutation procedure like `set-car!' or `string-set!'.



File: r5rs.info,  Node: Procedure calls,  Next: Procedures,  Prev: Literal expressions,  Up: Primitive expression types

Procedure calls
---------------

 - syntax: <operator> <operand1> ...,
     A procedure call is written by simply enclosing in parentheses
     expressions for the procedure to be called and the arguments to be
     passed to it.  The operator and operand expressions are evaluated
     (in an unspecified order) and the resulting procedure is passed
     the resulting arguments.


     (+ 3 4)                                ==>  7
     ((if #f + *) 3 4)                      ==>  12

     A number of procedures are available as the values of variables in
     the initial environment; for example, the addition and
     multiplication procedures in the above examples are the values of
     the variables `+' and `*'.  New procedures are created by
     evaluating lambda expressions (see section *note Procedures::).

     Procedure calls may return any number of values (see `values' in
     section *note Control features::).  With the exception of `values'
     the procedures available in the initial environment return one
     value or, for procedures such as `apply', pass on the values
     returned by a call to one of their arguments.

     Procedure calls are also called _combinations_.

          _Note:_ In contrast to other dialects of Lisp, the order of
          evaluation is unspecified, and the operator expression and
          the operand expressions are always evaluated with the same
          evaluation rules.

          _Note:_ Although the order of evaluation is otherwise
          unspecified, the effect of any concurrent evaluation of the
          operator and operand expressions is constrained to be
          consistent with some sequential order of evaluation.  The
          order of evaluation may be chosen differently for each
          procedure call.

          _Note:_ In many dialects of Lisp, the empty combination, (),
          is a legitimate expression.  In Scheme, combinations must
          have at least one subexpression, so () is not a syntactically
          valid expression.



File: r5rs.info,  Node: Procedures,  Next: Conditionals,  Prev: Procedure calls,  Up: Primitive expression types

Procedures
----------

 - syntax: lambda <formals> <body>
     _Syntax:_ <Formals> should be a formal arguments list as described
     below, and <body> should be a sequence of one or more expressions.

     _Semantics:_ A lambda expression evaluates to a procedure.  The
     environment in effect when the lambda expression was evaluated is
     remembered as part of the procedure.  When the procedure is later
     called with some actual arguments, the environment in which the
     lambda expression was evaluated will be extended by binding the
     variables in the formal argument list to fresh locations, the
     corresponding actual argument values will be stored in those
     locations, and the expressions in the body of the lambda expression
     will be evaluated sequentially in the extended environment.  The
     result(s) of the last expression in the body will be returned as
     the result(s) of the procedure call.

     (lambda (x) (+ x x))                   ==>  __a procedure
     ((lambda (x) (+ x x)) 4)               ==>  8
     
     (define reverse-subtract
       (lambda (x y) (- y x)))
     (reverse-subtract 7 10)                ==>  3
     
     (define add4
       (let ((x 4))
         (lambda (y) (+ x y))))
     (add4 6)                               ==>  10

     <Formals> should have one of the following forms:

        * (<variable1> ...,): The procedure takes a fixed number of
          arguments; when the procedure is called, the arguments will
          be stored in the bindings of the corresponding variables.

        * <variable>: The procedure takes any number of arguments; when
          the procedure is called, the sequence of actual arguments is
          converted into a newly allocated list, and the list is stored
          in the binding of the <variable>.

        * (<variable1> ..., <variable_n> .  <variable_n+1>): If a
          space-delimited period precedes the last variable, then the
          procedure takes n or more arguments, where n is the number of
          formal arguments before the period (there must be at least
          one).  The value stored in the binding of the last variable
          will be a newly allocated list of the actual arguments left
          over after all the other actual arguments have been matched
          up against the other formal arguments.


     It is an error for a <variable> to appear more than once in
     <formals>.

     ((lambda x x) 3 4 5 6)                 ==>  (3 4 5 6)
     ((lambda (x y . z) z)
      3 4 5 6)                              ==>  (5 6)

     Each procedure created as the result of evaluating a lambda
     expression is (conceptually) tagged with a storage location, in
     order to make `eqv?' and `eq?' work on procedures (see section
     *note Equivalence predicates::).



File: r5rs.info,  Node: Conditionals,  Next: Assignments,  Prev: Procedures,  Up: Primitive expression types

Conditionals
------------

 - syntax: if <test> <consequent> <alternate>
 - syntax: if <test> <consequent>
     _Syntax:_ <Test>, <consequent>, and <alternate> may be arbitrary
     expressions.

     _Semantics:_ An `if' expression is evaluated as follows: first,
     <test> is evaluated.  If it yields a true value (see section *note
     Booleans::), then <consequent> is evaluated and its value(s)
     is(are) returned.  Otherwise <alternate> is evaluated and its
     value(s) is(are) returned.  If <test> yields a false value and no
     <alternate> is specified, then the result of the expression is
     unspecified.

     (if (> 3 2) 'yes 'no)                  ==>  yes
     (if (> 2 3) 'yes 'no)                  ==>  no
     (if (> 3 2)
         (- 3 2)
         (+ 3 2))                           ==>  1



File: r5rs.info,  Node: Assignments,  Prev: Conditionals,  Up: Primitive expression types

Assignments
-----------

 - syntax: set! <variable> <expression>
     <Expression> is evaluated, and the resulting value is stored in
     the location to which <variable> is bound.  <Variable> must be
     bound either in some region enclosing the `set!' expression or at
     top level.  The result of the `set!' expression is unspecified.

     (define x 2)
     (+ x 1)                                ==>  3
     (set! x 4)                             ==>  _unspecified_
     (+ x 1)                                ==>  5



File: r5rs.info,  Node: Derived expression types,  Next: Macros,  Prev: Primitive expression types,  Up: Expressions

Derived expression types
========================

* Menu:

* Conditional::
* Binding constructs::
* Sequencing::
* Iteration::
* Delayed evaluation::
* Quasiquotation::

The constructs in this section are hygienic, as discussed in section
*Note Macros::.  For reference purposes, section *Note Derived
expression type:: gives macro definitions that will convert most of the
constructs described in this section into the primitive constructs
described in the previous section.