Recursion

A function that makes a call to itself is said to be recursive. Many problems in mathematics and computer science are naturally recursive. This just means that a problem can be broken down into smaller instances of itself, solved, then put back together.

For example, consider the factorial function. The factorial of a non-negative integer n (written n!) is the product n*(n-1)*(n-2)*...*3*2*1. It represents the number of different ways n items can be ordered, i.e., the number of permutations of n items. The empty set can only be ordered one way, so 0! is equal to 1. A recursive formulation of factorial is this:

n! =

1, if n = 0,
n((n-1)!) otherwise.

Note that, even though we define the function in terms of itself, this definition contains all we need to know to compute n!.

The Fibonacci numbers are another naturally recursive function of the positive integers. The first and second Fibonacci numbers, F₁ and F₂, are both equal to 1. For n > 2, the nth Fibonacci number, F_n, is the sum of the previous two, F_(n-1) and F_(n-2).

A recursive function must have two parts:

One or more base cases. These solve "trivial" instances of the problem, like finding the factorial of zero.
The recursive case. This is where the function breaks up the problem, solves the parts recursively, then comes up with an answer to the whole problem.

Recursion in C

C supports recursive functions. Any function may call itself. This often leads to code that is shorter and clearer. For example, here is an iterative version of factorial ("iterative" in this context means "not recursive," i.e., using an explicit loop to repeat work):

int factorial (int n) {
	int	product, i;

	product = 1;
	for (i=2; i<=n; i++)
		product = product * i;
	return product;
}

Here is a recursive version of factorial:

int factorial (int n) {
	if (n == 0) return 1;
	return n * factorial (n-1);
}

The recursive version is shorter and mirrors the mathematical definition of factorial directly.
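The Fibonacci definition given earlier translates to C in the same way. The following is a minimal sketch, not a definitive implementation: the name fib and the long return type are assumptions made here, and the function assumes n is at least 1.

```c
/* Recursive Fibonacci, following the definition given earlier:
   F(1) = F(2) = 1, and F(n) = F(n-1) + F(n-2) for n > 2.
   The name fib is a hypothetical choice for this sketch;
   it assumes n >= 1. */
long fib (int n) {
	/* base cases: the first two Fibonacci numbers */
	if (n <= 2) return 1;
	/* recursive case: sum of the previous two */
	return fib (n-1) + fib (n-2);
}
```

Note that this function has two base cases (n = 1 and n = 2) and makes two recursive calls. It is easy to read, but it recomputes the same values many times, so it gets slow quickly as n grows.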

How It Works

At this point, you might say "waitaminute; n is changing for each invocation of factorial. Why does this work?" The answer is that C keeps a separate instance of the parameter n (and of any local variables, although the recursive factorial has none) for each call. It keeps track of their values on the runtime stack, the same stack we talked about before for keeping track of return addresses.

This bookkeeping on the stack is invisible to you as the C programmer, but seeing what happens there can help you understand how recursion works.

The parameters, return value, and local variables of a function hold (almost) all the information the function needs to do its work. The compiler sets up activation records on the stack (sometimes called stack frames) to handle this information. An activation record is pushed onto the stack whenever a function is called and popped when that call returns, and the function works from whichever activation record is on top of the stack.

The activation records for factorial would contain the parameter n and the return value (as well as information about what and how to return and any temporary variables needed). Suppose we call factorial with n=4. An activation record containing 4 would be pushed onto the stack, then control would be handed over to factorial. factorial would call itself with 4-1, causing an activation record containing 3 to be pushed and control handed to the beginning of factorial, etc., until 0 is reached. Then the sequence of recursive calls would unwind itself, doing all the multiplications in reverse order.
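One way to watch the pushing and unwinding described above is to have factorial print a line when it is entered and another just before it returns. This is only a sketch for illustration: the name factorial_trace and the extra depth parameter are additions made here, and the indentation merely imitates how many activation records are on the stack.

```c
#include <stdio.h>

/* factorial with tracing; the depth parameter is an addition for
   illustration only and is not part of the original function */
int factorial_trace (int n, int depth) {
	int	result;

	/* printed when this call's activation record has just been pushed */
	printf ("%*scall factorial(%d)\n", depth * 2, "", n);
	if (n == 0)
		result = 1;
	else
		result = n * factorial_trace (n-1, depth+1);
	/* printed just before this call's activation record is popped */
	printf ("%*sfactorial(%d) returns %d\n", depth * 2, "", n, result);
	return result;
}
```

Calling factorial_trace (4, 0) prints the "call" lines for 4, 3, 2, 1, 0 with growing indentation, then the "returns" lines for 0, 1, 2, 3, 4 with shrinking indentation. The multiplications happen in that second half, as each call gets its answer back from the one below it.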

More Examples

Suppose we're not interested in computing a return value; we just want to get something done. Recursion can still be used. For example, let's say we want to read in and print out some numbers from standard input to standard output. We can do this iteratively:

void read_stuff (void) {
	int	n;

	for (;;) {
		/* stop at end of input, or at anything that isn't a number */
		if (scanf ("%d", &n) != 1) break;
		printf ("%d\n", n);
	}
}

or recursively:

void read_stuff (void) {
	int	n;

	/* keep going only if a number was actually read */
	if (scanf ("%d", &n) == 1) {
		printf ("%d\n", n);
		read_stuff ();
	}
}

The second function does the same thing as the first, but with no loops! Not impressed? OK, reverse the order of the calls to printf and read_stuff. What happens? The numbers are read in, but printed out in reverse order, because each call holds on to its own n on the stack until the calls after it have finished. Try doing that with a simple for loop.
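Here is a sketch of that reversed version. To make it easy to try on something other than the keyboard, this variant takes the input and output streams as parameters; those parameters and the name read_stuff_reversed are assumptions of this example, not part of read_stuff above. It stops when fscanf fails to read a number.

```c
#include <stdio.h>

/* read ints from in and print them to out in reverse order.
   The FILE * parameters and the function name are assumptions of
   this sketch; read_stuff uses stdin and stdout directly. */
void read_stuff_reversed (FILE *in, FILE *out) {
	int	n;

	/* base case: no more numbers to read */
	if (fscanf (in, "%d", &n) != 1) return;
	/* recursive case: handle the rest of the input first... */
	read_stuff_reversed (in, out);
	/* ...then print this number as the calls unwind */
	fprintf (out, "%d\n", n);
}
```

Calling read_stuff_reversed (stdin, stdout) behaves like read_stuff with the two calls swapped. Every number read stays in its own activation record until the end of input is reached, so a very long input needs a correspondingly deep stack.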

Recursive Linked List Implementation

Linked lists can be thought of as recursive data structures. Recall the typedef/struct for a simple integer linked list:

typedef struct _node {
	int		k;
	struct _node 	*next;
} node, *list;

We can think of a linked list recursively as:

The empty list, represented by the NULL pointer, or
A node (the head) whose value is its k field and whose next field (the tail) is a linked list.

See the base case and recursive case? We can write many list processing functions based on this idea (indeed, the computer language LISP is based almost entirely on this idea). We can simplify our code and emphasize the recursive nature using two #defines in C:

#define head(x) ((x)->k)
#define tail(x) ((x)->next)

So head of a pointer is the k field, and tail of a pointer is the next field.

Here are some recursive linked list functions (they need stdio.h for printf and stdlib.h for malloc):

/* insert an int at the end of the list */

void insert_at_tail (list *L, int k) {
	node	*p;

	/* base case: list is empty */

	if (*L == NULL) {
		p = (node *) malloc (sizeof (node));
		tail (p) = NULL;
		head (p) = k;
		*L = p;
	}
	else
	/* recursive case: insert into tail of tail */
		insert_at_tail (&tail (*L), k);
}

/* print a list in order */

void print_list (list L) {
	if (L) {
		printf ("%d\n", head (L));
		print_list (tail (L));
	}
}

/* print a list in reverse order */

void print_list_backwards (list L) {
	if (L) {
		print_list_backwards (tail (L));
		printf ("%d\n", head (L));
	}
}

/* insert into a list in order */

void insert_ordered (list *L, int k) {
	node	*p;

	if (*L && head (*L) < k) {
		insert_ordered (&tail (*L), k);
	} else {
		p = (node *) malloc (sizeof (node));
		tail (p) = *L;
		head (p) = k;
		*L = p;	
	}
}

/* return the length of a list */

int length_list (list L) {
	if (L) return 1 + length_list (tail (L));
	return 0;
}

/* search a list for an int */

int *search_list (list L, int k) {
	/* list empty? then k isn't there. */
	if (!L) return NULL;
	/* found k at head? return pointer */
	if (head (L) == k) return &head (L);
	/* not head?  search for k in tail */
	return search_list (tail (L), k);
}

/* delete a node from a list */

void delete_node (list *L, int k) {
	/* let's do this one in class, after the homeworks are turned in :-) */
}

/* return a new list with all elements common to L1 and L2 */

list intersect (list L1, list L2) {
	node    *p;
 
	if (!L1 || !L2) return NULL;
	if (search_list (L1, head(L2))) {
		p = (node *) malloc (sizeof (node));
		head(p) = head(L2);
		tail(p) = intersect (L1, tail(L2));
		return p;
	} else
		return intersect (L1, tail (L2));
}
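None of the functions above give memory back, and the same base case / recursive case pattern handles that too. The following self-contained sketch repeats insert_ordered and length_list so it compiles on its own, and adds free_list, which is a hypothetical helper introduced here. The check of malloc's result is also an addition of this sketch.

```c
#include <stdlib.h>

typedef struct _node {
	int		k;
	struct _node 	*next;
} node, *list;

#define head(x) ((x)->k)
#define tail(x) ((x)->next)

/* insert into a list in order, as above; the malloc check
   is an addition of this sketch */
void insert_ordered (list *L, int k) {
	node	*p;

	if (*L && head (*L) < k) {
		insert_ordered (&tail (*L), k);
	} else {
		p = (node *) malloc (sizeof (node));
		if (p == NULL) return;	/* out of memory: leave list unchanged */
		tail (p) = *L;
		head (p) = k;
		*L = p;
	}
}

/* return the length of a list, as above */
int length_list (list L) {
	if (L) return 1 + length_list (tail (L));
	return 0;
}

/* free_list is a hypothetical helper, not one of the functions above:
   free every node and leave *L as the empty list */
void free_list (list *L) {
	/* base case: the empty list has nothing to free */
	if (*L == NULL) return;
	/* recursive case: free the tail first, then the head node */
	free_list (&tail (*L));
	free (*L);
	*L = NULL;
}
```

Starting from list L = NULL, calling insert_ordered (&L, 3), then insert_ordered (&L, 1), then insert_ordered (&L, 2) builds the list 1, 2, 3, so length_list (L) is 3. Calling free_list (&L) afterwards releases all three nodes and sets L back to NULL. The tail is freed before the head because the next field cannot be read once its node has been freed.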