Pseudo Code Pdf 183855

Partial capture of text on file.

By Jon Bentley
WRITING CORRECT PROGRAMS
In the late 1960s people were talking about the promise of I’ve given this problem as an in-class assignment in courses
programs that verify the correctness of other programs. Unfor- at Bell Labs and IBM. The professional programmers had one
tunately, it is now the middle of the 198Os, and, with precious hour (sometimes more) to convert the above description into a
few exceptions, there is still little more than talk about auto- program in the language of their choice; a high-level pseudo-
mated verification systems. Despite unrealized expectations, code was fine. At the end of the specified time, almost all the
however, the research on program verification has given us programmers reported that they had correct code for the task.
something far more valuable than a black box that gobbles We would then take 30 minutes to examine their code,
programs and flashes “good” or “bad’‘-we now have a funda- which the programmers did with test cases. In many different
mental understanding of computer programming. classes and with over a hundred programmers, the results
The purpose of this column is to show how that fundamen- varied little: 90 percent of the programmers found bugs in
tal understanding can help programmers write correct pro- their code (and I wasn’t always convinced of the correctness
grams. But before we get to the subject itself, we must keep it of the code in which no bugs were found).
in perspective. Coding skill is just one small part of writing I found this amazing: only about 10 percent of professional
correct programs. The majority of the task is the subject of the programmers were able to get this small program right. But
three previous columns: problem definition, algorithm design, they aren’t the only ones to find this task difficult. In the
and data structure selection. If you perform those tasks well, history in Section 62.1 of his Sorting and Searching, Knuth
then writing correct code is usually easy. points out that while the first binary search was published in
1946, the first published binary search without bugs did not
The Challenge of Binary Search appear until 1962.
Even with the best of designs, every now and then a program-
mer has to write subtle code. This column is about one prob- Writing The Program
lem that requires particularly careful code: binary search. The key idea of binary search is that we always know that if
After defining the problem and sketching an algorithm to T is anywhere in X[l . . N], then it must be in a certain range
solve it, we’ll use principles of program verification in several of X. We’ll use the shorthand MustBe(range) to mean that if T
stages as we develop the program. is anywhere in the array, then it must be in range. With this
The problem is to determine whether the sorted array notation, it’s easy to convert the above description of binary
X[l . , N] contains the element 7’. Precisely, we know that N search into a program sketch.
> 0 and that X[l] 5 X[Z] 5 . . l 5 X[N]. The types of T and initialize range to designate X[l..N]
the elements of X are the same; the pseudocode should work loop
equally well for integers, reals or strings. The answer is stored ( invariant: MustBe(range) )
in the integer P (for position); when P is zero 7’ is not in if range is empty,
X[l . . IV], otherwise 1 5 P 5 N and T = X[P]. return that T is nowhere in the
Binary search solves the problem by keeping track of a array
range within the array in which T must be if it is anywhere compute M,the middle of the range
in the array. Initially, the range is the entire array. The range use M as a probe to shrink the range
is diminished by comparing its middle element to T and if T is found during the
discarding half the range. This process continues until T is shrinking process, return its
discovered in the array or until the range in which it must lie position
is known to be empty. The process makes roughly log, N endloop
comparisons.
Most programmers think that with the above description in The crucial part of this program is the loop invariant, which
hand, writing the code is easy; they’re wrong. The only way is enclosed in { j’s, This is an assertion about the program state
you’ll believe this is by putting down this column right now, that is invariantly true at the beginning and end of each
and writing the code yourself. Try it. iteration of the loop (hence its name); it formalizes the intui-
Permission to copy without fee all or part of this material is granted tive notion we had above.
provided that the copies are not made or distributed for direct We’ll now refine the program, making sure that all our
commercial advantage, the ACM copyright notice and the title of the actions respect the invariant. The first issue we must face is
publication and its date appear, and notice is given that copying is by the representation of range: we’ll use two indices L and UzyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA (for
permission of the Association for Computing Machinery. To copy “lower” and “upper”) to represent the range L . . U. (There are
otherwise, or to republish, requires a fee and/or specific permission. other possible representations for a range, such as its begin-
0 1983 ACM OOOl-0782/83/1200-104075C
Communications of the ACM December 1983 Volume 26 Number 12
assertion. The basic techniques of program verification-stat-
ning position and its length.) The next step is the initializa- ing the invariant precisely and keeping eye towards maintain-
tion; what values should L and U have so that MustBe(L, U) is ing the invariant as we wrote each line of code-helped us
true? The obvious choice is 1 and N: MustBe(1, N) says that if greatly as we converted the algorithm sketch into pseudocode.
T is anywhere in X, then it is in X[l . . N], which is precisely This development gives us some confidence that the program
what we know at the beginning of the program. Thus initiali-
zation consists of the assignments L : = 1 and u : = N. is correct, but we are by no means certain of its correctness.
The next tasks are to check for an empty range and to Spend a few minutes convincing yourself that the code is
compute the new midpoint, M. The range L . . U is empty if correct before reading on.
L > U, in which case we store the special value o in P and
terminate the loop, which gives Understanding the Program
if L>U then When I face a subtle problem, I try to derive a program at
P:= 0; break about the level of detail we just saw. I then use verification
methods to increase my confidence that it is correct. In this
The break statement terminates the loop denoted by the loop- section we’ll study such an argument for the above binary
endloop delimeters. This statement computes M, the midpoint search at a picky level of detail-in practice I’d go through a
of the range: much less formal analysis. The program in Figure 1 is (too)
M := (L + U) div 2 heavily annotated with assertions that formalize the intuitive
notions we used as we developed the code.
The div operatorimplements integer division, so 6 div 2 is 3, While the development of the code was top-down (starting
as is 7 div 2. Theprogram is now with the general idea and refining it to individual lines of
L :=l; U:=N code), this analysis of correctness will be bottom-up: we’ll
loop start with the individual lines of code, and show how they
( invariant: MustBe(L,U) 1 work together to solve the problem.
if L>U then
P:= 0; break Warning-Boring Material Ahead
M := (L+U) div 2 Skip to Next Section When Drowsiness Strikes
use M as a probe to shrink the range
L. .u. If T is found during the It’s easy to verify lines 1 through 3. The assertion in line 1
shrinking process, note its is true by the definition of MustBe (if T is anywhere in the
position and break array, then it must be in X[l . . IV]). The assignment in line 2
endloop therefore gives the assertion in line 3.
Refining the last three lines in the loop body will involve We come now to the heart of the program: the loop in lines
comparing T and X[M] and taking appropriate action to main- 4 through 28. There will be three parts to our argument for its
tain the invariant. Thus the code will have the form correctness, each of which is closely related to the loop invar-
case iant:
X[M] < T: Action a Initialization. The loop invariant is true when execution of
X[M] = T: Action b the loop begins.
X[M] > T: Action c Preservation. If the invariant holds at the beginning of an
Action b is easy: we know that T is in position M, so we set P iteration and the loop body is executed again, then the
to M and break the loop. Because the other two cases are invariant will still be true.
symmetric, we’ll focus on the first and trust that the last will Termination. Upon termination of the loop, the desired re-
follow by symmetry (this is part of the reason we’ll verify the sult will hold (in this case, the desired result is that P has
code precisely in the next section). the correct value). Showing this will use the facts estab-
If X[v < T, then we know that X[l] I X[2] 5 l . l 5 X[M] lished by the invariant.
< T, so T can’t be anywhere in X[l . . M]. Combining this Initialization is easy: the assertion in line 3 is the same as that
with the knowledge that T must be in X[L . . U], we know in line 5. To establish the other two properties, we will reason
that if it is anywhere, then it must be in X[M + 1 . . U], which from line 5 through to line 27. When we discuss lines 9 and
we write as MustBe(M + 1, U). Given this, how can we 21 (the break statements) we will establish termination prop-
reestablish the invariant MustBe(L, U)? The answer is obvious: erties, and if we make it all the way to line 27, we will have
set L to M + 1. Putting these cases together with the previous established preservation, because line 27 is the same as line 5.
pseudocode gives If the test in line 6 is successful, we know the assertion of
L := 1; U:=N line 7: if T is anywhere in the array then it must be between
loop I, and U, and the success of the test means that L > U.
{ MustBe(L,U) ) Reasoning from that gives the assertion in line 8: T is no-
if L>U then where in the array. Thus we correctly terminate the loop in
P :=O; break line 9 after setting P to zero.
M := (L+U) div 2 On the other hand, if the test in line 6 is unsuccessful, we
case come to line 10. The invariant still holds [we’ve done nothing
X[M] < T: L:=M+l to change it), and because the test failed we also know that
X[M] = T: P:=M; break L, 5 U. Line 11 sets M to the average of L and U, truncated
X[M] > T: U:=M-1 down to the nearest integer. Because the average is always
endloop between the two values, we have the assertion of line 12.
This brings us to the case statement in lines 12 through 27;
It’s a short program: ten lines of code and one invariant the analysis of the statement will involve considering each of
December 1983 Volume 26 Number 12 Communications of the ACM
1. ( MustBe(l,N) ) be between L and U and can’t be at or above M, then it must
2. L := 1; U := N be between L and M - 1 (if it is anywhere in X); hence line
3. { MustBe(L,U) ) 24. Execution of line 25 with line 24 true leaves line 26
4.loop true-that is the definition of assignment. Thus this choice of
5. ( MustBe(L,U) ) the case statement re-establishes the invariant in line 27.
6. if L>U then The argument for lines 14 through 18 has exactly the same
7. ( L>U and MustBe(L,U) 1 form. Line 15 follows from line 14 and the sortedness of the
8. ( T is nowhere in the array 1 array; it in turn implies line 16. The assignment in line 17 re-
9. P:= 0; break establishes the invariant in line 18. We’ve thus analyzed all
10. { MustBe(L,U) and L<=U } three choices of the case statement. One correctly terminates
11. M := (L+U) div 2 the loop, and the other two maintain the invariant.
12. ( MustBe(L,U) and Lc=M<=U ) This completes one part of the analysis of the code: we’ve
case
13. X[M] < T: shown that if the loop terminates, then it does so with the
14. { MustBe(L,U) and CantBe(l,M) ) correct value in P. It may still, however, have a bug: it might
15. { MustBe(M+l,U) ) never halt. Indeed, this was the most common error in the
16. L := M+1 programs written by the professional programmers.
17. MustBe(L,U) ) Our halting proof will use a different aspect of the range
18. L . . U. That range is initially a certain finite size (N), and lines
19. X[M] = T: 6 through 9 ensure that the loop terminates when the range
20. { XPI = T 1 contains less than one element. Thus to prove termination we
21. P := M; break show that the range shrinks during each iteration of the loop.
22. X[M] > T: But that is easy by combining lines l&l7 and 25. Line 12 tells
23. ( MustBe(L,U) and CantBe(M,N) 1 us that M is indeed within the current range. If lines 14
24. { MustBe(L,M-1) ) through 18 are executed, then line 17 ensures that the bottom.
25. U := M-l of the range will be increased by at least one. Likewise, if
26. ( MustBe(L,U)j lines 22 through 26 are executed, then line 25 will decrease
27. ( MustBe(L,U) ) the top of the range by at least one. In both cases in which
28. endloop the loop continues, the range is decreased by at least one; the
FIGURE 1. A binary search program annotated with assertions. program must therefore halt.
Implementing the Program
its three possible choices. The easiest choice to analyze is the So far we’ve worked with the program in a high-level pseudo-
second alternative, in line 19. In that case we know the language; our willingness to invent a new control structure
assertion in line 20, so we are correct in setting P to M and allowed us to ignore the details of any particular implementa-
terminating the loop. This is the second of two places where tion language and to focus on the heart of the problem. Even-
the loop is terminated, and both end it correctly, so we have tually, however, we have to write the program in a real
established the termination correctness of the loop. language. Just so you don’t think that I chose the language to
We come now to the two symmetric branches of the case make the task easy, I implemented binary search in BASIC.
statement; because we concentrated on the first branch as we Although that language is fine for some tasks, its pallcity of
developed the code, we’ll turn our attention now to lines zz control structures and its global (and typically short) variable
through 26. Consider the assertion in line 23; the first clause names are substantial barriers to building real programs.
is the invariant, which the program has not altered. The Even with these problems, it was easy to translate the
second clause is true because T < X[M] s X[M + l] 5 l l l 5 above pseudocode into the subroutine in a BASIC dialect
X[N], so we know that T can’t be anywhere in the array shown in Figure 2. Because 3 translated this program from the
above position M -1; this is expressed in the assertion with carefully verified pseudocode, I had good reason to believe
the shorthand CantBe(M, N). But logic tells us that if ‘I’ must that it is correct. Before I would use it in an application,
1000 ’
1010 ' BINARY SEARCH FOR T IN X(l..N)
1020 ’ PRE: X(1..N) IS SORTED IN NONDECREASING ORDER
1030 ’ POST: P=O => T IS NOT IN X(l..N)
1040 ’ P>O => PU THEN P=O: RETURN
1100
1110 M=CINT((L+U)h)
1120 IF X(M)T THEN U=M-1: GOT0 1070
1140 I X(M)=T
1150 P=M: RETURN
FIGURE 2. Binary search in a BASIC dialect
1042 Communications of the ACM &cember 1983 Volume 26
Number 32
however, I would test it on sample data. I therefore wrote a Subroutines. To verify a subroutine, we first state its pur-
test program in about 25 lines of BASIC. After declaring pose by two assertions. Its precondition is the state that must
an (N + 2)-element array (indexed from 0 to N + l), the be true before it is called, and its postcondition is what the
program initialized X[Il to’ I. It then performed N searches for routine will guarantee on termination (see the BASIC binary
the elements 1 through N and checked that each returned search for examples). These conditions are more a contract
P = I. After that, it performed N + 1 unsuccessful searches than a statement of fact: they say that if the routine is called
for 0.5, 1.5,. . . , N + 0.5 and checked that each returned with the preconditions satisfied, then the routine will assume
P = 0. Finally, it searched for 0 and N + 1 -and checked that the burden of establishing the postcondition. After I prove
they returned P = 0. When I ran these tests for values of once that the body of the routine satisfies the conditions, I can
N from 0 to 10 (inclusive), the first version of the program use the stated relations between the pre- and post-conditions
passed them all. without ever again considering its implementation.
These tests poke around most of the program. They test
every possible position for successful and unsuccessful The Role of Program Verification
searches, and the case that an element is in the array but In teaching verification techniques to professionals, I’ve ob-
outside the search bounds. Testing N from 0 to 10 covers the served that when one programmer tries to convince another
empty array, common sizes for bugs (one, two and three), that a piece of code is correct, the primary tool is the test case:
several powers of two, and many numbers one away from a execute the program by hand on a certain input. That’s a
power of two. That testing would have been dreadfully boring powerful tool: it’s good for detecting bugs, easy to use, and
(and therefore probably erroneous) by hand, but it used an well understood. It is clear, however, that programmers have
insignificant amount of computer time. a deeper understanding of programs-if they didn’t, they
Many factors contribute to my opinion that the BASIC pro- could never write them in the first place. One of the major
gram is correct: I used sound principles to derive the pseudo- benefits of program verification is that it gives programmers a
code; I used analytic techniques to “verify” its correctness; language in which they can express that understanding.
and then I let a computer do what it’s good at and bombard The language is first used as code is developed. The pro-
the program with test cases. grammer should be able to explain every line of code as it is
written. The important explanations end up in the program
Principles of Program Verification text as assertions; deciding which assertions to include is an
This exercise displays many strengths of program verification: art that comes only with practice. Some languages provide an
the problem is important and requires careful code, the devel- assert statement that allows the programmer to write the
opment of the program is guided by verification ideas, and the assertions as logical expressions that are tested at run time; if
analysis of correctness employs general tools. The primary a false assertion is encountered, then it is reported and the
weakness of this exercise is its level of detail; in practice we run is terminated (most systems allow assertion checking to
could argue at a more informal level. Fortunately, the details be turned off if it is too costly in run time).
illustrate a number of general principles, including the follow- The language of verification is also used often after the code
ing: has been written. During debugging, violations of the assert
statements lead us to bugs, and examining the form of a
Assertions. The relations among input, program variables, violation shows us how to remove the bug without intmduc-
and output describe the “size” of a program; assertions allow a ing another. The verification techniques formalize code walk-
programmer to enunciate those relations precisely. Their inte- through procedures. Assertions are crucial during mainte-
gral role throughout a program’s life is discussed in the next nance of a program; when you pick up code that you’ve
section. never seen before, and no one else has looked at for years,
Sequential Control Structures. The simplest structure to assertions about the program state can give invaluable insight.
control a program is of the form “do this statement then that I mentioned before that these techniques are only a small
statement.” We understand such structures by placing asser- part of writing correct programs; keeping the code simple is
tions between them and analyzing each step of the program’s usually the key to correctness. On the other hand, several
progress individually. professional programmers familiar with these techniques have
Selection Control Structures. These structures include i f related to me an experience that is too common in my own
and case statements of various forms; during execution, one programming: when they construct a program, the “hard”
of many choices is selected. We show the correctness of such parts work the first time, while the bugs are in the “easy”
a structure by considering each of the several choices individ- parts. When they came to a hard part, they hunkered down
ually. The fact that a certain choice is selected allows us to and successfully used powerful formal techniques. In the easy
make an assertion in the proof; if we execute the statement parts, though, they returned to their old ways of program-
following if I > J, for instance, then we can use the fact that ming, with the old results. I wouldn’t have believed this phe-
I > J to derive the next relevant assertion. nomenon until it happened to me; it’s good motivation to use
Iteration Control Structures. Arguing the correctness of the techniques frequently.
loops has three phases: initialization, preservation, and termi- Problems
nation. We first argue that the loop invariant is established by As laborious as our proof of binary search was, it is still
initialization, and then show that each iteration maintains its unfinished by some standards. How would you prove that
truth. These two steps show by mathematical induction that the program is free of runtime errors (such as division by
the invariant is true before and after each iteration of the zero, word overflow, or array indices out of bounds)? If you
loop. The third step is to argue that whenever execution of have a background in discrete mathematics, can you for-
the loop terminates, the desired result is true. These together malize the proof in a logical system?
establish that if the loop ever halts, then it does so correctly;
we must prove termination by other means (the halting proof If the original binary search was too easy for you, try the
of binary search used a typical argument). variant that returns in P the first occurrence of T in the
December 2 983 Volume 26 Number 22 Communications of the ACM 1043

The words contained in this file might help you see if this file matches what you are looking for:

...By jon bentley writing correct programs in the late s people were talking about promise of i ve given this problem as an class assignment courses that verify correctness other unfor at bell labs and ibm professional programmers had one tunately it is now middle os with precious hour sometimes more to convert above description into a few exceptions there still little than talk auto program language their choice high level pseudo mated verification systems despite unrealized expectations code was fine end specified time almost all however research on has us reported they for task something far valuable black box gobbles we would then take minutes examine flashes good or bad have funda which did test cases many different mental understanding computer programming classes over hundred results purpose column show how fundamen varied percent found bugs tal can help write pro wasn t always convinced grams but before get subject itself must keep no perspective coding skill just small part amazi...

Related files

Share

Help

Related files

Share

Share to social media

Help

Login Area