The lambda calculus was constructed by Alonzo Church as one of the responses to the crisis in the foundations of mathematics created by the discovery of Russell's paradox (among others). The goal was to provide a very simple calculus of functions, and to use it to formalize mathematics in a way that would avoid the paradoxes. This explains why mathematicians were interested in the somewhat bizarre-looking definitions of concepts from basic arithmetic as lambda terms; no one would come up with such things in order to use them for actual computing. As with so much else, you have to understand the social and cultural context in order to make sense of what was done: they wanted to show that at least arithmetic really was sound, by reducing it to the lambda calculus.
In retrospect, it seems a bit of a miracle that the obsession with producing detailed formal foundations for mathematics would also provide foundations for what later turned out to be computer science. Nevertheless, the lambda calculus inspired large parts of Lisp, ML, Haskell, and indirectly OBJ. It even influenced ALGOL 60, since beta reduction inspired the call-by-name parameter passing semantics of that language; the ALGOL 60 copy rule is just the substitution mechanism of beta reduction.
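The connection between call-by-name and substitution can be illustrated with a small sketch. This is not ALGOL 60 and not a lambda calculus interpreter; it only simulates call-by-name in Python by passing the argument as an unevaluated zero-argument function (a "thunk"), so that each use of the parameter re-evaluates the argument expression, just as it would if the expression had been copied into the body. All of the names (noisy_argument, double_by_name, ignore_by_name) are made up for this illustration.

```python
# Counts how many times the argument expression is actually evaluated.
evaluations = 0

def noisy_argument():
    """Stands in for an argument expression; records each evaluation."""
    global evaluations
    evaluations += 1
    return 21

def double_by_name(x):
    # x is a thunk: every use of the parameter re-evaluates the argument,
    # as if the argument expression had been substituted for x twice.
    return x() + x()

def ignore_by_name(x):
    # The parameter is never used, so the argument is never evaluated.
    return 0

result = double_by_name(noisy_argument)
ignored = ignore_by_name(noisy_argument)

print(result)       # 42
print(evaluations)  # 2: evaluated once per use in double_by_name, never in ignore_by_name
```

Under call-by-value, by contrast, the argument would have been evaluated exactly once per call, whether or not the body used it.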
In the late 19th century, the logician Gottlob Frege criticized the
mathematical notation for functions then in common use (and still in common
use) for being ambiguous: when we write f(x), it is not clear
whether we mean the value of f on a particular value
x, or whether we mean the function named f,
considered as varying with the variable x. The lambda notation
resolves this ambiguity by making explicit which variables (if any) are
involved. Thus if we mean f as a function of the variable
x, we write .\ x. f (where .\ is as
close to a lambda as I can get in ASCII), and if we mean the value of the
function f at x, we write (f x).
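Most modern programming languages keep this distinction, which may make it easier to see. The following Python sketch is only an illustration: the function f is an arbitrary example chosen here, and because f is a named Python function rather than an expression in x, the abstraction is written lambda x: f(x).

```python
# A hypothetical example function; any function of x would do.
def f(x):
    return x * x + 1

# f considered as varying with x: the analogue of an abstraction over x.
as_function = lambda x: f(x)

# The value of f at one particular argument: the analogue of (f 3).
as_value = f(3)

print(as_value)                # 10
print(as_function(3))          # 10
print(callable(as_function))   # True: it is a function, awaiting an argument
print(callable(as_value))      # False: it is just the number 10
```

The first is an object that can still be applied to arguments; the second is a single number, with no trace of the variable x left in it.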
I would like to think that the webpage on the lambda calculus in OBJ gives a more precise and easier to understand exposition than that in Stansifer. The best way to learn from it is to write some terms and run them, perhaps with trace on; you can also play with the code (see below). In any case, it gives a fully formal definition of the syntax and operational semantics of the lambda calculus, along with numerous examples, including (among other things) the following: