I was working on two papers on statistics when I recalled a study I’d read some time ago: “On ‘Rethinking Rigor in Calculus...,’ or Why We Don’t Do Calculus on the Rational Numbers.” The answer is obviously trivial, and the paper was really a response to another suggesting that we eliminate certain theorems and their proofs from elementary collegiate calculus courses. But I started to wonder (initially just as a thought exercise) whether one could “do calculus” on the rationals, and if so, whether the benefits could outweigh the restrictions. Measure theory already allows us to construct countably infinite sample spaces. However, many researchers who regularly use statistics haven’t even taken undergraduate probability courses, let alone courses on (or that include) rigorous probability. Also, even students like engineers who take several calculus courses frequently don’t really understand the real number line, because they’ve never taken a course in real analysis.
The rationals are the only set we learn about early on that has so many of the properties the reals do, in particular density. So, for example, textbook explanations of why integration isn’t appropriate for the pdfs of countably infinite sets typically invoke the binomial or Bernoulli distributions, but such examples are clearly discrete. Other objections to defining the rationals to be continuous include:
1) The irrational numbers were discovered over 2,000 years ago, and the attempts to make calculus rigorous since have (almost) always taken as desirable the inclusion of numbers like pi or sqrt(2). Yet we know from measure theory that the line between discrete and continuous can be fuzzy and that we can construct abstract probability spaces that handle both countable and uncountable sets.
2) We already have a perfectly good way to deal with countably infinite sets using measure theory (not to mention both discrete calculus and discretized calculus). But the majority of those who regularly use statistics and therefore probability aren’t familiar with measure theory.
The third and most important reason is actually the question I’m asking: nobody has bothered to rigorously define the rationals to be continuous to allow a more limited application of differential and integral calculi because there are so many applications which require the reals and (as noted) we already have superior ways for dealing with any arbitrary set.
Yet most of the reasons we can’t, e.g., integrate over the rationals in the interval [0,1] have to do with the intuitive notion that the set contains “gaps” where we know irrational numbers exist, even though the rationals are infinitely dense. It is, in fact, possible to construct functions that are continuous on the rationals and discontinuous on the reals. Moreover, we frequently use statistical methods that assume continuity even though the outcomes can’t ever be irrational-valued. Further, the Riemann integral is defined in elementary calculus (and often elsewhere) via a countable, integer-indexed set of summed "terms": a function that is Riemann integrable over [a,b] is integrated as the limit, as n tends to infinity, of sums \sum_{i=1}^{n} f(x_i^*) \Delta x, and whatever values the function may take, the terms/partitions are by definition indexed by the integers i. As for the gaps, work since Cantor in particular (e.g., the Cantor set) has demonstrated how slippery the notion of “filling” the unit interval is: one can, e.g., recursively remove from it infinitely many middle thirds whose lengths sum to 1 yet be left with infinitely many remaining numbers. In addition to objections (mostly from philosophers) to the claim that even the reals are continuous, we know the real number line has "gaps" in some sense anyway; how many "gaps" depends on whether one thinks that, in addition to sqrt(-1), the number line should include the hyperreals or other extensions of R1. Finally, in practice (or at least in application) we never deal with real numbers anyway; we can only approximate their values.
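To make that last point concrete, here is a minimal sketch (the helper name left_sum is mine, purely for illustration) of a Riemann sum computed entirely in exact rational arithmetic; no irrational point is ever touched:

```python
from fractions import Fraction

def left_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f over [a, b] with n equal subintervals.

    Every partition point a + i*dx is rational (when a and b are), so each
    sum is an exact rational number: the limiting process never has to
    evaluate f at an irrational point.
    """
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

# f(x) = x on [0, 1]: the sums are (n - 1)/(2n) and approach 1/2
# through rationals alone.
for n in (10, 100, 1000):
    print(n, left_sum(lambda x: x, 0, 1, n))
```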
Another potential use is educational: students who take calculus (including multivariable calculus and differential equations) never gain an appreciable understanding of the reals, because they never take courses in which the reals are constructed. Initial use of derivatives and integrals defined on the rationals, and only then on the reals, would at least make clear that the reals have extremely nuanced, conceptually difficult properties, even if these were never fully elucidated.
However, I’ve been sick recently and my head has been in a perpetual fog from cold medicines, so the time I have available to answer my own question is temporarily too short. I start thinking about, e.g., the relevance of the differences between uncountable and countable sets, compact spaces and topological considerations, or the fact that were we to assume there are no “gaps” where real numbers would be, we’d encounter issues with, e.g., least upper bounds; but I can’t think clearly and I get nowhere: the medication-induced fog won’t clear. So I am trying to take the lazy, cowardly way out and ask somebody else to do my thinking for me rather than wait until I am not taking cough suppressants and similar meds.
David: We actually do integrate over the rational numbers. Probably the most essential integration formula is that for the integral of x^n over the interval [0,1]. The value of this can be established entirely over the rationals. You can have a look at my Famous Math Problems 10 video at my channel (user njwildberger).
https://www.youtube.com/watch?v=vo-ItaB28f8&index=16&list=PLIljB45xT85Bfc-S4WHvTIM7E-ir3nAOf
There are some of us who don't believe in the infinite-precision dream which supports the `real numbers'. If you are interested in why, my recent seminar `A Socratic look at logical weaknesses in modern pure mathematics' gives some reasons. It is also at my YouTube channel.
Yes, I know; hence the comment about measure theory's capacity, particularly in one of the two potential examples of possible utility I gave (probability). The problem, though, is that the number of researchers who rely on calculus and linear algebra through the statistical methods they learned in undergraduate and graduate statistics classes far exceeds the number of researchers who have actually taken either calculus or linear algebra. Both are undergraduate subjects, while this is not true in general of rigorous probability theory (i.e., measure-theoretic probability, where sample spaces are defined in terms of a triple (Ω, F, P)). Undergraduate probability usually consists of a rigid dichotomy between continuous and discrete, where the former is handled using calculus. It doesn't involve measures, Lebesgue or Lebesgue-Stieltjes integrals, or sigma-algebras. Thus such methods are far beyond the scope of researchers who haven't taken much more than intro stats and multivariate stats. And frequently the "much more" is limited, at least in terms of calculus/analysis, to calculus I/single-variable calculus. Unfortunately, such a course is largely an informal introduction to 19th-century calculus restricted to functions of a single variable. That's another way of saying "basically useless".
My idea (which is probably less an idea than a side effect of Nyquil/Dayquil) was to develop something that is CLEARLY far less powerful but much simpler, and which could handle datasets that are clearly discrete but are nonetheless routinely treated as continuous by researchers across the sciences (although the problem abounds in some sciences while being virtually or wholly absent in others). Or, alternatively, something I could introduce quickly into a single-variable calculus course, not as a simplification but as a means to get students to realize how little they actually know about the real number line and Cartesian plane they've been working with since pre-college algebra.
Then, of course, there's just the fun of seeing how it might be done and how not. The literature is replete with extensions to the real number line (e.g., Conway's surreals), an alternative but rigorous approach to analysis (Robinson's non-standard analysis, in which infinitesimals are reintroduced via the extension of R1 to include the hyperreals), and discrete calculus (not discretized; see Discrete Calculus: Applied Analysis on Graphs for Computational Science). Meanwhile, philosophers continue to debate continuity from a metaphysical or philosophy-of-mathematics perspective, and physicists (especially in particle physics and related fields) have developed methods such as "discrete exterior calculus" and other modernizations of the calculus of differences to deal with the increasingly "discretized" world of QFT and the standard model (in which discretization serves as one analogue to second quantization).
Yet despite both extensions of the reals and finite/discrete/etc. formulations of calculus, there isn't really anything (that I know of) on anybody going "backwards" (even discrete calculus, which is "intrinsically discrete" as the authors of the above book put it, is developed closely alongside calculus and quite independently of simpler discrete sets). By backwards I mean seeing what an admittedly far less powerful calculus could do (if anything) by defining the rationals as continuous and removing the "gaps" by assumption.
I'm not proposing something that can do anything we can't already do now (and do better now), just something as simple as the elementary calculus from undergraduate probability that doesn't involve social scientists treating e.g., a 5-point scale as uncountably infinite, or which could be used to sneak the difference between countable and uncountable infinities and other nuances of the reals into elementary calculus courses.
Oh, I have no ambitions there. Far better and more comprehensive attempts to reform calculus than mine, supported by much more influential individuals than I, have gained little to no traction. These include (but are not limited to):
1) The DUMP-THE-RIEMANN-INTEGRAL PROJECT (D.R.I.P.) (http://classicalrealanalysis.info/resources-drip.php), which includes free textbooks.
2) Various attempts to reintroduce infinitesimals from MAA papers to Keisler's attempts to broaden the use of non-standard analysis by writing a few textbooks aimed at elementary as well as more advanced students of calculus.
3) Gilbert Strang's use of MIT OCW to promote his "too much calculus" view via a series of condensed lectures on calculus (Big Picture Calculus or something).
4) Several attempts by various authors to teach multivariable calculus and linear algebra together (Hubbard & Hubbard's Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach (4th ed.), Shifrin's Multivariable Mathematics, Kaplan & Lewis' two-volume Calculus and Linear Algebra, Apostol's One-Variable Calculus with an Introduction to Linear Algebra and the follow-up text Multivariable Calculus and Linear Algebra, etc.).
...and so forth.
I don't even have a say in what textbooks I can use, let alone how anybody else teaches. But I do have control over how I teach the material and what supplementary resources I use (even more so when tutoring on the side). Also, this started mainly as a way of demonstrating the various seriously problematic yet standard methods for analyzing Likert-type response data; so if all I can do is include something that helps illustrate the nuances in the distributions underlying the statistical models/tests in countless studies which were reduced to checking a table to see if some p-value was reached, that's something. And even if I can't do that, as I said, in the end we are always dealing with rational numbers, so playing around with them in this way can't hurt. Finally, it may be that when I can think straight again I will realize that this whole notion was clearly idiotic for X set of reasons (although only to a point, as discrete calculus does exist in more than one form, and there are other things I am sure of even in this state).
Totally off-topic, but there is an unpublished book by Charles Geyer which teaches undergraduate probability (and beyond) from non-standard analysis and looks like a thing you might enjoy having a look at, Andrew.
http://www.stat.umn.edu/geyer/nsa/o.pdf
Dear Pedro:
Thanks so much! I have already started looking it over and very much appreciate the link! I am going to have to add it to the next version of my bibliography of sources for graduate researchers.
-Andrew
Norman Wildberger is doing some very interesting work in constructing the foundations of mathematics without the notion of infinity, extending the rationals algebraically with objects like sqrt(2) which are not "numbers" but algebraic objects. I'd look up his Math Foundations series on YouTube.
I suspect that a theory using rational numbers only would be more complicated than one using reals. Of course we know about a few irrational reals such as pi and sqrt(2), so we can't ignore them. (The normal distribution is normalised by a factor of sqrt(2 pi), for example.) However, there are only countably many constructible numbers, so there should be a theory which eliminates anything that is not constructible. There is no way we can talk about any such numbers explicitly anyway.
In fact there are theories of constructible numbers. Intuitionism is one, and Bishop's constructive analysis is another. These theories are more complicated than standard theories, so probably not suitable for your intended audience.
On the other hand, the pragmatic way is just to sweep the problems under the carpet, as is done in elementary courses. Long before I knew about the Lebesgue integral (or even the Riemann integral, or uncountable sets), I thought I understood calculus, which in the UK is taught in the final two (or sometimes three) years of high school. Once limits of sequences and series are understood, it is not too hard to motivate limits of sums of f(x)dx as the dx's tend to zero. Although this only works easily for (piecewise) continuous functions, it is almost enough for elementary work. Add in limits at discontinuities and over infinite intervals and we have elementary calculus.
Dear George:
Thanks, but I'm really not looking for the standard discrete calculus, discrete exterior calculus, mimetic discretization, etc. After all, we both know that there already exists a far superior approach to dealing with finite or countably infinite sets in probability theory than constructing a rigorous foundation for analysis based upon treating the rationals as continuous. And, feeling better, I'm much less optimistic about this endeavor (one need only look at the difference between the integral test for convergence and the values of convergent series to see a fundamental issue, not to mention the fact that the rationals are to the reals what electrons are to the nucleus). It's still actually proving to be interesting in a number of ways. I was hoping to get some feedback from a friend from Brown, only to find an interest in participation; so even though I've now twice taken a back seat on my own DXM-induced notion, I managed to produce something that, while probably useless for most of the reasons I had thought, is useful for several others.
Dear Peter T. Breuer:
"I've been looking at this without getting the foggiest clue as to how people intend to stop the measure of all the rationals being zero, when that's a countable sum of the measures of singleton rationals, which are all zero, hence the sum is zero overall. It seems that you are all aware of this problem, but won't say how to solve it!"
Yet, somehow, virtually all the developments in calculus were made before measure theory existed, and so esteemed a mathematician as Bartle stated that it's "time to discard the Lebesgue integral as the primary integral," while measure theory has, in general, become increasingly less important. To quote Hamming: "all of measure theory seems to be playing a smaller and smaller role in other fields of mathematics, and none at all in fields that merely use mathematics."
"Measure 0" only makes sense if one assumes the nature of the number line is of a certain nature (and not the surreals, hyperreals, or other extensions). This "problem" has been solved; the problem is the solutions (such as non-standard analysis) are too complicated to bother with: I would much prefer to use measure theory and teach it. The problem is trying to provide a tool for students and researchers whose mathematical literacy is somewhat limited.
Also, there exist uncountably infinite sets with measure 0 (the Cantor set, for example).
And the point you make is one of the reasons I continue to hope for something less powerful but far simpler for those whose use of analysis/calculus is limited to the assumptions made when using statistics that rely on topics in math they never learned. Most of the real number line is composed of irrational numbers. In fact, it is so vastly "filled" with irrational numbers that the rationals are an "infinitesimal" amount by comparison. Yet researchers in the medical, social, psychological, cognitive, and neurosciences continue to think that linguistic responses they code as 5 or 7 integers somehow "approximate" uncountably infinite sets. More generally, a continuous variable (already problematic, as there is no strict discrete/continuous dichotomy) is supposed to be able to take on any value in the continuous interval over which it is defined; but most variables outside the physical sciences can't take on irrational values, and thus most of the values they are supposed to be able to take, they can't.
Put another way, if it is of such significance that any interval composed merely of rationals is measure 0, then why do the vast majority of peer-reviewed scientific papers treat variables which can at best be defined over countably many values as non-0?
Dear Peter T Breuer:
I would drop measure theory altogether, as with it I already have a far more powerful tool for determining probabilities for any given set. I would probably build off of previous work on differentiation over the rationals, something akin to this:
Analysis in a Rational World (http://www.hull.ac.uk/php/479885/analysis_in_a_rational_world.pdf)
Rather than this:
Discrete Calculus (http://homepages.math.uic.edu/~kauffman/DCalc.pdf)
Although working with the rational numbers is attractive (if it can be made simple enough), because the rationals are dense in the reals, it suffers from the same problem: the myth of precise measurement. The ancient Greeks were disturbed by the fact that the diagonals of a unit square are irrational. Eudoxus eventually solved the problem by a construction similar to that used later by Dedekind.
But they could have solved the problem another way. The dimensions of a 1 m square table top are only accurate to about the nearest 1 mm, and the corners could be up to 0.5 degrees from a right angle. The sides won't even be perfectly straight. The best machine-shop precision is not even 1 micrometre. So why should we pretend that measurements are exact?
An alternative, then, is to use interval arithmetic. First choose a precision, \delta. Every measurement is only accurate to within \delta. Intervals could overlap, but for simplicity we may divide the real line into disjoint intervals. To be realistic, we should also choose K, the greatest possible measurement, but this could be flexible--the largest number that actually occurs in the calculations. Now we don't need integrals, only sums. Integrals are still useful as approximations but all functions will be Riemann integrable if the domain is finite.
This idea is just a more extreme version of using just the rationals: e.g., if \delta = 10^-k, use just rationals with denominator 10^k and numerator between +-K x 10^k. In binary, we could use IEEE double precision so that everything can be computed except for the occasional +-inf or NaN.
I can see some problems with this approach; it might not be as simple as it seems, but it might be simpler than working with the whole set of rationals.
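Here is a minimal computational sketch of the proposal (the grid convention and the names snap and grid_sum are my own illustration, not an established construction):

```python
from fractions import Fraction

DELTA = Fraction(1, 10 ** 3)  # the chosen precision, delta = 10^-k with k = 3
K = Fraction(10 ** 6)         # the greatest possible measurement

def snap(x):
    """Round a measurement to the nearest grid rational n * DELTA, clamped to [-K, K]."""
    n = round(Fraction(x) / DELTA)
    return max(-K, min(K, n * DELTA))

def grid_sum(f, a, b):
    """Replace the integral of f over [a, b] by a finite sum over grid points.

    Every quantity is an exact rational; there is no limit process at all.
    """
    a, b = snap(a), snap(b)
    steps = int((b - a) / DELTA)
    return sum(f(a + i * DELTA) for i in range(steps)) * DELTA

# f(x) = x^2 on [0, 1]: an exact rational answer within O(DELTA) of 1/3.
print(grid_sum(lambda x: x * x, 0, 1))
```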
Another idea is to base the course on the generalised Riemann (Henstock) integral. This is like the Riemann integral but with countable sums and is equivalent to the Denjoy integral. The latter is incredibly complicated as it uses transfinite induction to generalise the Lebesgue integral to cope with conditionally convergent integrals.
Lebesgue chose about a dozen axioms for an integral, but the crucial one is that the monotone convergence theorem should be satisfied. He then showed that for functions that only take values 0 or 1, his axioms reduced to the three axioms of Lebesgue measure. He then showed how to define an integral in terms of this measure. A modern approach to the latter step is to define a measurable function to be the limit of a sequence of simple functions (see Classical and Modern Integration Theories by Ivan N. Pesin).
The Lebesgue integral is more powerful than the Riemann integral because limit theorems are simpler: we don't have to check that the resulting integral converges, as the dominated and monotone convergence theorems take care of that. Additionally, the Fubini-Tonelli theorems say that we can change the order of repeated integrals so long as the multiple integral exists.
For absolutely convergent Henstock integrals you can use the power of these theorems. As probability theory seems to stick to the absolutely convergent case, nothing is lost by using the Henstock integral, and you don't have to do any measure theory.
However, consider the following modification of the St. Petersburg paradox. Toss a coin until a head occurs. If this happens on an odd number, n, of trials, receive $2^n; if n is even, pay $2^n. The expected return does not exist; it is 1 - 1 + 1 - 1 + ... . Exercise: modify the payoffs so that the sum is conditionally convergent (to ln(2)) but not absolutely convergent.
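One payoff scheme that answers the exercise (a sketch of my own, so do check it): pay (-1)^{n+1} 2^n / n dollars when the first head occurs at trial n. Since P(first head at trial n) = 2^{-n}, the expected return is \sum_{n=1}^{\infty} ((-1)^{n+1} 2^n / n) 2^{-n} = \sum_{n=1}^{\infty} (-1)^{n+1}/n = ln(2), the alternating harmonic series, which converges conditionally but not absolutely, since \sum 1/n diverges.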
Dear Andrew Messing: it seems to me that some easy book, such as the work by Apostol or even Bostock and Chandler (which I could find at Sussex, Brighton), would surely be pertinent for students starting to study calculus at university. Whether it will rain in August is impossible to say, but I could possibly join you and Peter T. Breuer in trying some easy introduction to integration for the students... Obviously no promise.
I think this problem has already been circumvented by meromorphic and holomorphic function theory, where a properly placed Jordan curve can root out those pesky irrational points or poles, etc. I vote to retain measure theory, for it is the basis upon which all else is erected.
Let I=[0,1] and Q=the rationals in [0,1].
I suggest that you consider the following questions:
1. Can I be covered by the union of a sequence of intervals whose lengths-sum is strictly less than 1?
2. The same as question 1 with I replaced by Q ?
My main difficulty with this approach is that you lose so many basic and important results: for example, the intermediate value theorem, so important early in integral calculus, fails miserably when considering functions on the rationals. For example, define f(x) to be 0 if x^2 < 2 and 1 if x^2 > 2: on the rationals, f is continuous everywhere it is defined, yet it jumps from 0 to 1 without ever taking a value in between. I am not sure that I know of a nice fix for this.
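A quick sketch of that example in exact rational arithmetic (the code is only an illustration of the function just described):

```python
from fractions import Fraction

def f(x):
    """0 if x^2 < 2, and 1 if x^2 > 2.

    No rational x satisfies x^2 == 2, so f is defined at every rational
    and is locally constant (hence continuous) there, yet it never takes
    any value in (0, 1): the intermediate value theorem fails on Q.
    """
    return 0 if x * x < 2 else 1

# Rational points closing in on sqrt(2) from both sides: f jumps from 0
# to 1 without ever crossing 1/2.
for q in (Fraction(7, 5), Fraction(141, 100), Fraction(1414, 1000),
          Fraction(1415, 1000), Fraction(142, 100), Fraction(3, 2)):
    print(q, f(q))
```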
Even with just rational intervals (rational endpoints, subsets of Q)?
I wrote long ago (in 1992) a paper (in French) entitled "Why Analysis is not rational". The aim of this paper was to justify --- through counter-examples --- the necessity of proofs for elementary results of analysis, results which are false if you restrict to rational numbers. The intermediate value theorem is of course an example. There are others: "any infinite decimal sequence of digits defines a number", "a continuous bounded function on a bounded closed interval attains its bounds", "any continuous function on a closed bounded interval is uniformly continuous", etc. All these results depend on the continuity of the real numbers (not only density). And the fundamental reason is that the only axiom of the real numbers not true for the rationals guarantees not only that the reals are complete but also that the real line is locally compact. (You can give many equivalent forms of that axiom; Borel-Lebesgue is one of them --- complicated. I prefer "Every increasing bounded sequence of numbers has a limit".) So your question is just "do I need that local compactness?" As you say, for a mathematician the answer is trivial! But it is not that clear for secondary school pupils and even teachers. Just in case, I include my old paper.
Hi Neil, I may be wrong, but I think the intermediate value theorem works in interval arithmetic with a suitable definition of continuous. Imagine the graph of a continuous function (in the usual sense) covered with squares of side \epsilon. Any interval of length \epsilon on the y axis will correspond to such an interval on the x axis. I might have to give up on making the intervals disjoint, but that was just a convenience. Perhaps I can use two sets of disjoint intervals overlapping each other. We only know values to the nearest \epsilon, so functions continuous on the above definition could be discontinuous on the usual definition.
Somehow I'm reminded of fuzzy sets.
Terry,
Can you show, using interval arithmetic, that the function I gave above (f is 0 if x^2 < 2 and 1 if x^2 > 2) is discontinuous?
Neil, you are correct, I can't prove that function is discontinuous. I must give up on the intermediate value theorem, but not on interval arithmetic which seems to me to be more consistent with the nature of real measurement.
Dear David Gilat and followers,
Your comments are interesting. Recall David's question.
Let I=[0,1] and Q=the rationals in [0,1].
1. Can I be covered by the union of a sequence of intervals whose lengths-sum is strictly less than 1?
2. The same as question 1 with I replaced by Q ?
As far as I understand the question, I can say the answer to question 1 is negative and to question 2 is positive. I can give more details in further communication. David's question is related to Lebesgue measure theory.
If I is covered by the union of a sequence of intervals I_n, then by sub-additivity of Lebesgue measure the lengths-sum is greater than or equal to 1; i.e., 1 = |I| \leq \sum |I_n|.
For every $\varepsilon >0$ there is a sequence of intervals $I_n$ such that $I_n$ cover $Q$ and
\sum |I_n|< \varepsilon.
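For instance, by the standard construction: enumerate the rationals of [0,1] as q_1, q_2, q_3, ... and put I_n = (q_n - \varepsilon 2^{-n-2}, q_n + \varepsilon 2^{-n-2}); then the I_n cover Q and \sum |I_n| = \sum_{n \geq 1} \varepsilon 2^{-n-1} = \varepsilon/2 < \varepsilon.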
best,MM
To Miodrag Mateljevic:
The whole point of my question 1 was to argue the impossibility of coverage from first principles, without the use of Lebesgue's theory. The argument may go as follows:
WLOG assume that the intervals are open (otherwise, each interval can be slightly extended to an open interval so that the lengths-sum of the extended intervals remains strictly less than 1); if the union of the original intervals covers I, then clearly so does the union of the extended open intervals. If so, using compactness of I (the Heine-Borel theorem), extract a finite sub-cover and prove (non-trivial, but easy) that the union of a finite sequence of intervals with lengths-sum strictly less than 1 cannot cover I (nor Q).
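A sketch of that last, "non-trivial, but easy" step (my wording, so check it): let (a_1,b_1), ..., (a_n,b_n) be a finite cover of [0,1] by open intervals, and pick the interval containing 0, say (a_1,b_1), so that a_1 < 0. If b_1 > 1 we are done, since then b_1 - a_1 > 1. Otherwise b_1 \in [0,1] lies in another interval, say (a_2,b_2) with a_2 < b_1 < b_2, and (b_1 - a_1) + (b_2 - a_2) > (b_1 - a_1) + (b_2 - b_1) = b_2 - a_1 > b_2. Iterating, within at most n steps some b_k exceeds 1 while the accumulated lengths-sum exceeds b_k - a_1 > 1.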
This argument illustrates the enormous difference between finite and countable unions of intervals, or if you will - between the algebra of finite unions of intervals and the sigma-algebra (of Borel sets) generated by intervals. These ideas are at the root of the difference between Jordan content and Lebesgue measure, or if you will - between Riemann and Lebesgue integration.
Dear followers,
I will try to illustrate our discussion with an example related to the movement of a material point along a coordinate line.
The mean value theorem states, roughly, that given a planar arc between two endpoints, there is at least one point at which the tangent to the arc is parallel to the secant through its endpoints. If we measure velocity along a path only approximately, we need a version of the mean value theorem for this setting.
Mechanical meaning of the derivative. Consider the simplest case: the movement of a material point along a coordinate line, where the motion law is given, i.e., the coordinate x of the moving point is a known function x(t) of time t. During the time interval from t_0 to t_0 + \Delta t the point's displacement is x(t_0 + \Delta t) - x(t_0) = \Delta x, and its average velocity is v_a = \Delta x / \Delta t. As \Delta t approaches 0, the average velocity approaches a certain value, which is called the instantaneous velocity v(t_0) of the material point at the moment t_0. By the definition of the derivative, v(t_0) = x'(t_0). In a similar way we can define the acceleration as a(t_0) = v'(t_0). Suppose that x(0) = 0 and that we can measure velocities v_1, v_2, ..., v_n = v_0 at times t_1, t_2, ..., t_n = t_0 (with errors).
In this setting we can try to formulate a mean value theorem of the form: there are \epsilon > 0 and t_k in (t_1, t_0) such that v_k lies in (v_0 - \epsilon, v_0 + \epsilon).
We can also define v_a(t_k, t_{k+1}) = (x_{k+1} - x_k)/(t_{k+1} - t_k) and observe that x_0 = \sum (x_{k+1} - x_k) = \sum v_a(t_k, t_{k+1}) (t_{k+1} - t_k).
If we measure positive acceleration a(t_k) at all points t_1, t_2, ..., t_n = t_0, the graph of the path-time function is convex up to error (it lies between two convex functions).
Roughly speaking, in this way we can develop the notion of an integral over finite sets, along with other mathematical notions. In practice this makes sense, because we can only ever make a finite number of measurements, each with error.
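A minimal computational sketch of this finite "integral" of measured velocities (the variable names and the error model are my own, purely for illustration):

```python
import random

# True path x(t) = t^2 sampled at finitely many times, with small
# measurement errors -- all we ever actually have in practice.
times = [k / 10 for k in range(11)]                     # t_0, ..., t_10 in [0, 1]
positions = [t * t + random.uniform(-0.001, 0.001) for t in times]

# Average velocities v_a(t_k, t_{k+1}) = (x_{k+1} - x_k) / (t_{k+1} - t_k)
v_avg = [(positions[k + 1] - positions[k]) / (times[k + 1] - times[k])
         for k in range(len(times) - 1)]

# The finite-sum 'integral' of the measured velocities telescopes back to
# the measured displacement x_n - x_0 (up to floating-point rounding).
displacement = sum(v_avg[k] * (times[k + 1] - times[k])
                   for k in range(len(v_avg)))
print(displacement, positions[-1] - positions[0])
```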
No measurement is perfectly accurate or exact. Many instrumental, physical and human limitations cause measurements to deviate from the "true" values of the quantities being measured. These deviations are called "experimental uncertainties," but more commonly the shorter word "error" is used.
I add only a few comments
1. David Gilat explained that the whole point of his question 1 was to argue the impossibility of coverage from first principles, without the use of Lebesgue's theory. It seems that there is interesting mathematics here.
2. As Andrew Messing says: "The irrational numbers were discovered over 2,000 years ago and the attempts to make calculus rigorous since have (almost) always taken as desirable the inclusion of numbers like pi or sqrt(2)." Here we can also ask what sqrt(2) is in reality.
The answers to Andrew's questions here are depressing. I thought at first about keeping quiet, since I don't feel like fighting another flame war. Differentiation and integration can be understood quite rigorously on a very elementary level, without the heavy machinery of real analysis such as limits, completeness, and compactness, if we deal with smooth enough functions. The ideas were presented at MathOverflow several years ago, where they were met with violent opposition from some members. Follow the link below and read the answer with rating -3. I would be glad to answer any questions and provide additional references if anybody cares.
I also had a discussion about this approach on the Russian site dxdy, where I could convince some participants that the approach is viable, despite the initial skepticism.
And Andrew, if you still work at Harvard and want to discuss this matter in person, we could meet conveniently, since I live very nearby.
http://mathoverflow.net/questions/40082/why-do-we-teach-calculus-students-the-derivative-as-a-limit?page=2&tab=votes#tab-top
Michael,
When I learned calculus first, in high school in the UK, there was very little about limits, and the tools, integration and differentiation, were taught first for polynomials; then it was "shown" how to extend the ideas to other functions. Little rigor was introduced at that stage: the material was very easy to understand and to use for the students who would be going on to physics or engineering and needed the tools there.
I taught myself some elementary real analysis from Hardy's Pure Mathematics at the same time, because *I* was interested --- not because I needed it to understand how to use the tools.
I sometimes think that calculus courses suffer because they are designed for too broad a range of students: mathematicians can benefit from proofs, whether it be via limits, or Lipschitz style arguments: students taking a course in order to learn how to use a toolbox benefit much less (and some of them are harmed). When I went to university in the UK, physicists taught physics students calculus, and engineers taught engineering students calculus, and I'm sure that their classes were *very* very different from the real analysis course I took! But in the US, calculus courses are usually taught by mathematicians, regardless of the audience, and we're often a little too keen to sneak in some extra proofs, some extra "real mathematics".
Back to Andrew's original question: both in the baby calculus course I took in high school, and in Hardy, the reals were essential: the rationals are not rich enough. I think that Andrew's original question was not "can we teach calculus without limits" but rather "can we teach calculus without the reals". I'm skeptical about the prospect of *that*. But I'd be happy to see him try, and happier still to be proven wrong in my skepticism!
Neil
George,
A very interesting paper --- and a cursory read seems to suggest that completions are important, as a partial answer to Andrew's original question.
Thanks for pointing it out!
Neil
Neil, I am glad that once upon a time, when you were a secondary school and university student, some people tried to do it right. Polynomials are the natural way to start with calculus, since all of its working formulas follow from the factoring of x^k - a^k through x - a and the triangle inequality. Polynomials satisfy the local Lipschitz estimates that are key to understanding why polynomials with positive derivatives increase (the monotonicity theorem). All the familiar properties of integration, such as positivity of the integral and the fundamental theorem, follow from the monotonicity theorem.
To go beyond polynomials and algebraic functions, we can adopt these estimates as the definition of differentiability and arrive at locally Lipschitz differentiable functions. Differentiability can be viewed as a generalization of the little Bezout theorem for polynomials, which says that x - a divides p(x) - p(a). This is all very elementary and in no way depends on the real numbers and their deeper properties, like completeness and local compactness. This theory is good enough to handle piecewise-analytic functions, which is about all we encounter in elementary calculus. It is all explained in my article (see the link below).
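For readers following along, the factoring in question is x^k - a^k = (x - a)(x^{k-1} + x^{k-2} a + ... + a^{k-1}). From it one gets, for any polynomial p and any x, a in [-K, K], an estimate of the form |p(x) - p(a) - p'(a)(x - a)| \leq L (x - a)^2 with L depending only on p and K, because p(x) - p(a) - p'(a)(x - a) is divisible by (x - a)^2 and the quotient polynomial is bounded on [-K, K]. Estimates of this Lipschitz form are, as I understand the article, what it adopts as the definition of the derivative.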
Can you explain, please, how the real numbers were essential to your baby-calculus in high school? It doesn't sound right to me.
It also looks strange to consider all the proofs as essentially the same, regardless of how difficult or easy they are. Adopting stronger definitions results in simpler proofs and makes the ideas more accessible to less mathematically minded audiences. It gives them some reasonable explanations of how things work instead of the usual handwaving and vague appeals to intuition that leave the students confused and mystified.
http://www.mathfoolery.com/Article/simpcalc-v1.pdf
I forgot I asked this question (I don't really remember thinking about it much, but cold/flu medicine will do that to you). I do recall working on the best ways to illustrate to students in elementary calculus, intro stats, or even some research methods courses the nuanced nature of the real number line and infinities, but I have been developing that idea via methods like the Cantor set. This provides an intuitive (albeit inaccurate) method for demonstrating some such nuances (I wrote a blog post presenting the inaccurate but intuitive method here: https://legiononomamoi.wordpress.com/2015/05/09/double-and-nothing-the-return-of-the-cantor-set/), as well as the standard use we find in textbooks on measure theory, real analysis, etc. Other simple methods I've tried range from the counterintuitive results we get from considering the collection of rationals in an interval of R (specifically, that the rationals are dense yet the interval is somehow full of gaps), to the easy proof that there must be a rational number between any two rational numbers (and therefore infinitely many), to ways of getting at the negligibility of countably infinite sets without requiring knowledge of or introducing measure theory.
But it looks like there has been an interesting discussion nonetheless. Perhaps I should get sick more often! Thanks to all and my apologies.
Andrew, I don't understand why being dense and having gaps would seem counterintuitive. Consider the set of rationals whose denominator is a power of 2. They are dense but there are other rationals between any two of them.
It might, however, seem counterintuitive that the number of gaps is uncountable. But maybe not if the students have seen the above example.
Andrew, I guess you can create something new if your system is well defined and has internal coherence. Your axioms and objects must be defined so as not to produce contradictions with another set of axioms of standard math. Euclid's postulate of parallels proved to be arbitrary, so other geometries could be devised without conflicting with it. But you can't just use regular math while dropping linked properties. If you build a system that gets in trouble, there could be a way to refine the objects' definitions. Anyway, this is a nice way to get in touch with the walls you will have to deal with. Non-standard analysis could be an example. Give a basic example of what a sample of your theory should be able to do. Maybe you would want to consider each new kind of point as having a given infinitesimal non-zero value ε or something like that (covering the holes). The rationals are enumerable like the naturals, so of measure zero, even if you take them all.
Indeed the set of rational numbers Q is countable, hence of measure zero if you insist on countable additivity of measure. Countable additivity, however, is just a mathematically convenient requirement of measure. If you relax countable additivity and require only finite additivity, which is of course an intrinsic property of any reasonable concept of "measure", you can have a translation-invariant measure on Q which assigns the length as the measure of every interval (of rationals). Of course, in doing so you lose uniqueness of extension from the algebra generated by intervals (on Q) and other nice properties, such as continuity with respect to unions of monotone sequences of sets (equivalent to countable additivity), and hence monotone convergence when you go on to develop integration.
Dear Terry Moore:
Neither do I (anymore, anyway, and not for a long time). However, research in the mathematics education literature, not only on students' understanding but also on teachers' understanding of the rational numbers, indicates certain deficiencies regarding the rationals and reals that (for the most part) come down to, or are related to, issues of non-denumerability and "dense" countably infinite sets: in short, the main distinctions between the reals and rationals. Hence the use, for pedagogical purposes, of functions from and to Q with respect to differentiation and continuity. Introducing students in a beginning calculus or analysis class to measure theory isn't generally an option (at least not a good one), and I have found, consistent with the research, that students are typically unable to define what real numbers are, or to answer questions like "how can there be infinitely many rational numbers between any two rational numbers s.t. there are infinitely many irrational numbers in any interval?", or similar questions regarding "gaps" in the real number line that can't be explicitly demonstrated to exist (and which thus, as single irrational numbers, provide no understanding of the nature of the real number line).
Hi Andrew, my point was that there are infinitely many rationals between any two rationals. This isn't difficult for students to see and demystifies the similar statement about irrational numbers. Uncountability of the irrationals is harder to understand, but the existence of irrationals is simple enough once you prove that rationals have terminating or recurring decimal expansions. You can then show them numbers like 0.10100100010000... that don't recur. There is no need to prove uncountability.
Why not stick to the Riemann integral, doing it informally? You could point out that there is a theory for highly discontinuous functions, but so long as you stick to piecewise continuity, you have all you need for most purposes. The disadvantage is that limits of sequences of such functions can be highly discontinuous, but you could use the old "it can be shown that" argument to say that there is a generalisation of the integral for which this is not a problem.
Is there really any need to define and prove everything? Especially for students who do not intend to specialise in mathematics. Even Euclidean geometry is not completely rigorous because new unstated assumptions sneak in from time to time, especially incidence and continuity axioms.
Hi Terry:
I haven't found the idea that the rationals are "dense" to be in and of itself a conceptual difficulty (nor have I read any research indicating so). It is only when one considers infinitely many rationals between any two in conjunction with the "gaps" for the irrational numbers. For example, consider Thomson, Bruckner, & Bruckner's Elementary Real Analysis: "The rational numbers are dense. They make an appearance in every interval; there are no gaps..." (Sect. 1.9). This is what, I have found, most students come to conclude when pressed to consider how any "gap" can exist when, no matter how close rational number A is to rational number B, there will always be a rational number C that is closer: there are no gaps, and therefore no room for any irrational numbers (let alone so much "space" that the rational numbers are negligible and can basically be ignored in modern integration and measure theory).
This is why I find examples like the Cantor ternary set so useful: presented informally as just a means of successively removing middle thirds from the unit interval, one can show that the amount removed is equal to 1; yet intuitively, as every time a "middle" is removed the end "pieces" clearly remain, there must be two "pieces" left for every "piece" removed, meaning that we can remove "length" 1 from a unit of "length" 1 and have infinitely many "pieces" left over. At more advanced levels, Cantor's set provides us with an example of an uncountably infinite set with measure 0.
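For the record, the "amount removed equals 1" claim is just a geometric series: at stage n one removes 2^{n-1} open intervals of length 3^{-n} each, so the total length removed is \sum_{n=1}^{\infty} 2^{n-1}/3^n = (1/3) \sum_{n=0}^{\infty} (2/3)^n = (1/3)(3) = 1.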
Nor is it so much a matter of proof. I was speaking not long ago with a graduate researcher in biological and chemical engineering who had taken the typical single- and multivariable calculus plus at least one course on differential equations, in addition to other mathematics courses, just as an undergrad, yet was unable to answer whether there are infinitely many rational numbers between any two rational numbers, or to understand how there could be. I have found this to be more the norm. As researchers in multiple sciences typically treat variables that can take on only finitely many (let alone countably infinitely many) values as if they were uncountably infinite, and textbooks in statistics define continuous variables s.t. between any two values there is another one, it seems important to introduce more rigor into mathematics education than currently exists.
DISCRETE OR CONTINUOUS?
I would say that outside of pure math, all data depend on the precision of the instruments of measure (data collected in experimental science and statistical data). So this kind of data will always be a decimal approximation; maybe with the best instruments we can add a few more decimals, at best. This holds even when we deal with continuous phenomena, like time or motion. No wonder a continuous graph is made from these sets of data even if it should not be; the graph is then a fitting curve. Even trickier is a flow of water. It looks continuous, but if you film it with a high-speed camera and view it at a slower rate, you will see individual drops. So is it a discrete phenomenon? Is there a drop between any two drops?
Speaking of gaps, it is also difficult to see whether there is a rational between any two irrational or transcendental numbers, or vice versa.
TRIVIA
The Koch snowflake is created with a variation of the Cantor algorithm. It creates a bounded curve with infinite perimeter from a triangle, but what is left of the original triangle is a Cantor set on each of its sides.
An interesting fact is that the n-th iteration of the Cantor construction gives the optimal solution to the (n+1)-disk Tower of Hanoi problem.
Dear Christian Boissinotte:
In measure-theoretic probability (rigorous probability theory), there is no distinction between discrete and continuous probability distributions/variables. Unfortunately, most researchers never take courses in measure theory or rigorous probability theory. Of course we cannot measure irrational values, merely approximate them. And of course one can define integration over the rationals (just as derivatives and continuous functions have been for decades). The question, as I understand it having reread my original question, is whether there is a useful construction of a calculus on incomplete "dense" sets like the rationals. I am quite sure that the answer is no (thanks to having a clear mind relative to the one on Nyquil I had when I apparently asked this question). To the extent that a discrete calculus using the rationals is useful for the purposes I think I wanted, a few examples of derivatives and continuous functions using the rationals demonstrate the limitations as well as any such construct would.
What do you mean by "calculus" in "construction of a calculus of incomplete "dense" sets like the rationals?"
Yeah, but, as with the Cantor set, even though the cut removes most of the numbers, the residual is dense and spans the interval; hence, you can integrate using just the rationals, skipping the irrationals, and still span the interval completely, or rather, cover the space adequately.
Dear Luisiana Cundin:
The "residual" is the Cantor set (the intersection/summation of removed intervals/amounts is its complement). The residual is nowhere dense yet uncountably infinite. The complement is (obviously, given its algorithmic/sequential construction) countable and has measure 1 (granted the typical ternary version) because it is the sum of interval "lengths" equal to that of the unit interval.
Also, the removed intervals contain infinitely many irrational numbers; i.e., the complement of the Cantor set (the set with positive measure) contains uncountably many irrational numbers. To see this, simply recognize that the first interval removed, (1/3, 2/3), contains uncountably many irrational numbers. The Cantor set doesn't span the unit interval (it has measure 0), and its complement of measure 1 contains uncountably infinitely many irrational numbers.
All right, all right. So, construct a new set, one where all the irrationals have been judiciously cut out. The residual would be everywhere dense and contain rationals alone.
Actually, it appears the set of rationals alone does not constitute a dense set and is not suitable for integration.
http://home.iitk.ac.in/~tmk/courses/mth404/main.pdf
Huh? In what sense does "the set of rationals alone not constitute a dense set"? Can you clarify? Aren't the rationals dense in the reals in the standard metric d(x,y) = |x-y|? Andrew, why are you so obsessed with the Cantor set?
What is meant: a given set of ordered rationals is punctured by an infinite set of irrationals; hence, that set would not allow Riemann integration. This brings me back to Cauchy integration, where one evaluates a function at each pole along some interval, hence enabling integration in cases not usually allowed.
You are wrong. If the function we want to integrate is defined on the rationals and is uniformly continuous, its Riemann integral can be calculated with no trouble, and the value of this integral will be the same as the Riemann integral of the (unique) continuous continuation of this function to the reals.
Technically speaking, not so. If I take a set and remove the irrationals, x in [0,1]\Q, that would render a discontinuous set; the Riemann sum becomes non-integrable, for there would be confusion over each sub-interval, their endpoints, and whether or not there are overlaps. Usually, Lebesgue measures are introduced to get around this... Once again, it seems to me that one must "jump" over various discontinuities in order to integrate; hence, one is really summing up a sequence of residues.
Nonsense! For uniformly continuous functions on rationals Riemann integral works fine and all the formulas are the same.
See example 1.3 on page 3 of the attached paper, attached to my previous comment made above...
Irrelevant. The example concerns the Dirichlet function, which is 1 on the rationals and 0 on the irrationals. I am talking about uniformly continuous functions on the rationals that a priori are not defined on the irrationals, but of course can be extended there by continuity. For such functions the Riemann integral, as the limit of Riemann sums, is well defined and works just fine. So you are wrong again.
OK, fine. How do you define a continuous function mapped to an interval from which all the irrationals have been removed? Moreover, the constant function is certainly continuous throughout any interval. And I can certainly define a new function h(x) by multiplying g(x) by f(x), where the latter is unity for all rationals and zero for irrationals.
I was talking about uniform continuity, i.e., for every e > 0 there is a d > 0 such that |f(x) - f(u)| < e whenever |x - u| < d (for rational x and u).
But the whole point is that the Riemann sum would need to avoid all the irrationals; hence, you must cut out these values. One way would be to simply multiply any function g(x) by f(x), where f(x) is defined to be unity for all numbers except the irrationals. That would ostensibly mean NO function is Riemann integrable if the irrationals are removed!
I hear ya, the function f(x) is defined in the pdf file attached to my earlier comment. Page 3, example 1.3, which is considered a classic result, by the way...
Andrew's question leads to a real conundrum...
George, you can read my article, a link attached.
http://www.mathfoolery.com/Article/simpcalc-v1.pdf
This example 1.3 has no relevance to what I say. Riemann integration works just fine for uniformly continuous functions defined on rationals. There are no irrationals in the definition domain, O.K? I see no conundrum here.
I hear ya, Mike, but listen: the constant function 1 is continuous across the unit interval, and its Riemann integral would be defined. Now perform the Riemann integral without any of the irrationals. That would mean you have to construct sub-intervals that do not contain any irrationals. This brings up Andrew's original problem, like the Cantor set. This raises a bit of a conundrum.
The Riemann integral is defined just on the unit interval of THE RATIONALS, as the limit of the Riemann sums; the irrationals can be totally ignored for uniformly continuous functions on the RATIONALS. When I talk about intervals of rationals, I mean the sets of rationals satisfying the corresponding inequalities, not the sets of reals. Everything works just fine. Do you hear now?
Dear Michael Livshits:
Most of the texts I've chosen or been required to use to teach real analysis, measure theory, or (measure-theoretic) probability theory use the Cantor set as a pedagogical tool. However, most of these texts are
1) Used in courses in elementary real analysis/real analysis, which many scientists never take
2) Used in graduate level mathematics courses (e.g., measure theory, probability theory, integration theory, etc.) that again most scientists never take.
Using the Cantor set to illustrate fundamental concepts in relevant topics in mathematics that many/most researchers employ is hardly my innovation. Where I differ from basically standard approaches is in the belief that a more informal introduction to the Cantor set can be very useful in elementary courses in mathematics of the sort researchers take.
Dear George Stoica:
I initially wrote a paper (actually 3, but only one of which I sent to an MAA editor), only to discover that I was mostly reinventing the wheel (so to speak). I believe I asked this question in order to help shape a novel approach so as to be able to submit a new paper. Basically, this question was (or at least has turned out to be) an attempt to solidify certain aspects of a current project to create the paper you desire (and that I wish I had prepared to give you). In other words, while I would love to provide you with a coherent, clear document detailing my perspectives here, I seem to have asked my initial question and have more recently read and responded to answers in order to refine my thoughts to enable me to produce just the sort of document I would like to provide you with.
I seriously doubt that "the Cantor set can be very useful in elementary courses in mathematics of the sort researchers take," since it describes rather pathological phenomena, such as singular measures and singular functions that never appear in the more elementary mathematics the researchers need. Can you give an example of Cantor set being useful in elementary statistics?
Well, I think it might. Given that one only has rational probabilities, then when integrating, one is actually faced with a reduced set of numbers. In other words, there is no such thing as an irrational probability.
From a practical perspective, does it matter whether a certain probability is rational or not? In practice all we need is a good enough rational approximation.
Fair enough, from a practical point of view; but if we are technically only allowed to integrate functions on the rationals, then our entire calculus collapses! Keep in mind that, practically speaking, it still works; but it works only because of our sloppiness, hand-waving, and overlooking certain realities.
In other words, in the case of a Dirichlet function, the Riemann sum is either zero or unity; hence, the sum of probabilities amounts to nothing short of a Bernoulli trial. That makes for a rather large margin of error.
It does not collapse if we deal with locally uniformly continuous functions. The Dirichlet function is not the same as its continuation by continuity from the rationals to the reals. Get over it: Riemann integration of uniformly continuous functions over the rationals works just fine, and not because of our sloppiness, but because of our judicious choice of the functions we integrate.
I don't think you get it, Mike. A constant function IS uniformly continuous on any interval. It's not the function that is under scrutiny, but it is the field of numbers allowed to generate subintervals for summation. Once again, I can multiply any function by the Dirichlet function and the result is the same: not Riemann integrable.
Oh, thanks for the bloody down vote, Mike!
No, it is you who don't get it. The Dirichlet function on rationals is a constant, and no irrationals are involved in Riemann integral of uniformly continuous function over rationals. You sound more and more like a broken record, and it is getting hard to keep up a conversation.
Dear Mr Michael Livshits,
Maybe the problem in this discussion is because people have different views of what an integral is.
My conception of an integral is that it is an area under a curve, which can be a limit. (No need for a limit if the function is constant or first degree, for example, as we can get the area otherwise).
It is the sum of the products of infinitesimal non-zero measures of intervals that form a partition of the definition set, each of these being multiplied by a value of the function on that interval (given by the mean value theorem).
The result should be independent of the choice of this partition. This works with Riemann and Lebesgue.
Now I also have a problem with integration on the rationals only, because this will not give an area (the definition set is not compact), and the x-values for the mean value theorem on each interval of the partition could be irrational.
Can you please tell me if you use a different definition of an integral, and an explicit (judicious!) example of a simple function you choose to integrate?
It's one thing to voice your displeasure, but to shell out down votes like they're skittles just ain't cool! Mike...
Touché, Christian. So, the question is, "How do we skirt this issue?" Is there a way to recover, and not by Lebesgue's method, but by something altogether different?
Thanks Mr Cundin.
My goal was not to say that one is right and the other is wrong. I gave my point of view and asked help to understand the other point of view. Is that wrong?
I'd like to hear from Stoica or Andrew on this point: How do you get around the fact that a rational cut would constitute an infinite number of discontinuities? Also, how exactly, in a word, does Lebesgue's treatment get around it? If I take a bin the width of one rational, I would still find it hard to believe the resulting integral is possible...
I explained it before, but it looks like you guys managed not to notice it. The Riemann integral of a uniformly continuous function over an interval of rationals is calculated by the same procedure as for the reals: it is the limit of the Riemann sums as the maximal length of the subintervals involved in the subdivisions goes to 0. The existence of this limit is simple to check. The limit may be irrational, but the domain of the function lies in Q. This also works for any set dense in R instead of Q. The value of this integral is the same as the value of the conventional Riemann integral of the continuous continuation of the function to the corresponding real interval. This continuation is unique because the function we start with is uniformly continuous. Is it clear? Example: the integral from 0 to 1 of f(x)=x^2. And by the way, "the mean value theorem" is totally irrelevant here.
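To make the procedure concrete, here is a minimal sketch in Python (my own illustration, not anyone's official code) of the Riemann sums for f(x) = x^2 over [0,1] using only rational partition points and tags; each sum is an exact rational number, and the sums converge to 1/3:

```python
# Riemann sums of f(x) = x^2 on [0,1] using only rational points.
# Every sum is an exact rational number; the limit is 1/3.
from fractions import Fraction

def riemann_sum_x_squared(n):
    """Left-endpoint Riemann sum with n equal rational subintervals."""
    dx = Fraction(1, n)
    return sum(Fraction(k, n) ** 2 * dx for k in range(n))

for n in (10, 100, 1000):
    s = riemann_sum_x_squared(n)
    print(n, s, float(s))  # s is a Fraction approaching 1/3
```

No irrational number appears anywhere in the computation; the limit 1/3 happens to be rational as well.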
So, perusing the paper "Return to the Riemann Integral" by Robert G. Bartle (MAA), on page 627, theorem 3.2, the author essentially takes the lower bound of the Riemann sum to arrive at \int{f dx} equal to zero, for x in [0,1], with f the Dirichlet function. But that is biased, and not a universal proof that the Riemann sum is always zero for the Dirichlet function. I find that most proofs attempting to get out from underneath Riemann's problem, for intervals riddled with discontinuities, apply the concept of limits, which is not really applicable to the problem, unless, after applying the limit, one arrives at a similar result, i.e. non-Riemann-integrable. Simply applying a gauge or limit definition is not really satisfactory at all, at least not in my opinion; it's a quick way out of a cul-de-sac. By definition, the lower and upper sums of a Riemann sum must be equal, otherwise the function is non-integrable on that interval.
Moreover, the sum of two rationals is another rational number; hence, how does one explain that the limit of a Riemann sum of nothing but rationals could ever converge to an irrational number, unless one defines an irrational number to be an infinite sum of rational numbers? Additionally, how can one honestly claim, and believe, that a sum of rational numbers would somehow amount to nothing?
Consider 1/(1+x^2): its antiderivative is arctan. The function is rational, but its integral from 0 to 1 (namely pi/4) is not. All the irrationals are limits of rationals; what is there to explain?
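As an illustration (a sketch of mine, using Python's exact rational arithmetic): the Riemann sums of 1/(1+x^2) over [0,1] with rational nodes are all rational, yet they converge to the irrational value arctan(1) = pi/4:

```python
# Rational Riemann sums of f(x) = 1/(1+x^2) on [0,1]: every partial sum is
# rational, but the limit is arctan(1) = pi/4, which is irrational.
from fractions import Fraction
from math import pi

def riemann_sum(n):
    dx = Fraction(1, n)
    return sum(dx / (1 + Fraction(k, n) ** 2) for k in range(n))

for n in (10, 100, 1000):
    s = riemann_sum(n)
    print(n, float(s), pi / 4)  # the rational sums approach pi/4 = 0.785398...
```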
All right, not to be argumentative; then let's just say, "It's integrable, even though we cannot prove it rigorously." In other words, many of the anti-derivatives you think you know may be completely wrong, yet you wouldn't know it, because you have not rigorously proven the Riemann sum; that is to say, you have not adequately accounted for a rational cut as the field for constructing your subintervals.
Stoica and all, same here... The upshot is that one cannot make the move out of the Riemann cul-de-sac without employing limits, e.g. irrationals represented as infinite sums of rationals, etc. In other words, this raises an entirely different problem: how to move from discretized space to continuum space. One last point: we are confined to using only rational numbers, for no one knows an irrational number accurately; hence, all integration is effectively over a field of rationals alone, and thus the conundrum remains.
I am not stuck; it is the other two guys who are stuck, because they do not understand anything. And Louisiana, yes, it is integrable, and I can prove it rigorously, and so can you; consider it homework in introductory analysis.
Dear George, the paper you have suggested strikes me as yet another example of rather pointless purely mathematical sophistry which is of no value to people interested in applications.
I hesitate to add more confusion to the debate, but here goes anyway.
Integration should not be taught as the area under a curve (and the derivative should not be taught as the gradient of a curve). These are good at helping with visualisation, but as definitions they just confuse. For example, the volume of a cone may be calculated by integrating over discs: the integral of pi h^2 tan^2(alpha) dh. Under what curve is this the area? It's the graph of the function f(h) = pi h^2 tan^2(alpha)--this just confuses. The real concept, and the one the better students will infer for themselves, is summing approximate volumes and taking limits, as in the sketch below. (Instead of taking limits one can use the sup of lower sums or the inf of upper sums. Strictly speaking we should show that these are equal, but I suspect that this doesn't help with understanding the basic idea.)
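A sketch of the "summing approximate volumes" idea (illustrative code of mine, with the half-angle alpha and height H as assumed parameters): slice the cone into n thin discs and sum their volumes; the sums approach pi*H^3*tan^2(alpha)/3.

```python
# Volume of a cone (apex half-angle alpha, height H) by summing thin discs:
# the disc at height h has radius h*tan(alpha), so its volume is roughly
# pi*(h*tan(alpha))**2 * dh.  The sums approach pi * H**3 * tan(alpha)**2 / 3.
from math import pi, tan

def cone_volume_by_discs(alpha, H, n):
    dh = H / n
    return sum(pi * (k * dh * tan(alpha)) ** 2 * dh for k in range(n))

alpha, H = 0.5, 2.0  # arbitrary illustrative values
exact = pi * H ** 3 * tan(alpha) ** 2 / 3
for n in (10, 100, 10000):
    print(n, cone_volume_by_discs(alpha, H, n), exact)
```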
I would think of some practical examples. For science students one could generalise work = force times distance to cases when the force is not constant, for example. I'm sure there are good examples in biology or economics too. Perhaps for any students we can use distance = speed times time, and work this backwards for derivatives.
Double integration is often taught as repeated integrals. This is also inadvisable as it doesn't intuitively allow for change of coordinates such as to polar coordinates. For integration in any number of dimensions, or even when dimension is undefined, the underlying concept is breaking the domain into small subsets for which the function is almost constant and summing and taking limits (or sup or inf).
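For instance (a sketch under my own choice of integrand): approximating the integral of exp(-(x^2+y^2)) over the unit disc by summing over small polar cells, each of area roughly r*dr*dtheta, directly mirrors "breaking the domain into small subsets and summing":

```python
# Double integral of exp(-(x^2 + y^2)) over the unit disc via small polar
# cells: each cell [r, r+dr] x [theta, theta+dtheta] has area ~ r*dr*dtheta.
from math import pi, exp

def polar_integral(n_r, n_theta):
    dr = 1.0 / n_r
    dtheta = 2 * pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr            # midpoint radius of the annular cell
        for _ in range(n_theta):
            total += exp(-r * r) * r * dr * dtheta
    return total

print(polar_integral(200, 200), pi * (1 - exp(-1)))  # closed form for comparison
```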
It is only after these ideas become second nature that formal definitions should be attempted.
The advantage of the Lebesgue formulation is that it leads to much simpler results, e.g. monotone and dominated convergence, and interchange of the order of repeated integrals if the double integral exists. The disadvantage is the amount of work needed to get there. So why not use the Daniell integral? This was, in fact, Lebesgue's approach: he chose axioms that he would like for an integral, and then defined an integral that satisfied them. Daniell generalised by merely dropping the axioms that lead to Lebesgue measure.
I haven't addressed the original question about restriction to the rationals. I think their incompleteness makes this inadvisable (we can't do without pi, e, and irrational radicals). But that doesn't mean that we have to work with the whole real line. The subset of all numbers that can be constructed with a finite process will do. (By "constructed" I mean that any given degree of approximation can be calculated--even if it takes the lifetime of a quintillion universes.) However, constructive analysis seems rather complicated, so perhaps it is better to stick to more traditional approaches.
For those only interested in applications, I wonder if formal proofs are needed at all. The main theorems of Lebesgue theory (monotone convergence etc.) should be taught, and the interested student should be referred to the literature for proofs. Go the whole hog for maths students, but consider the Daniell approach, which still allows for measure (defined as an integral).
For maths students we need to give a convincing argument for the inadequacy of naive or Riemann approaches. Lebesgue worried that we can't always take monotone limits of sequences of integrals. A nice example whose limit is the characteristic function of the rationals would do. This also shows that Peano-Jordan measure cannot cope with all sets. Therefore we try to extend the measure. But we can't assume that the new theory can cope with any set either (in fact it can't, but it is optional to give an example of a non-measurable set).
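Concretely (the standard construction): enumerate the rationals in $[0,1]$ as $q_1, q_2, \dots$ and set

$$ f_n(x) = \begin{cases} 1, & x \in \{q_1,\dots,q_n\}, \\ 0, & \text{otherwise}. \end{cases} $$

Each $f_n$ is Riemann integrable with integral $0$, but the monotone pointwise limit is the characteristic function of the rationals, which is not Riemann integrable.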
My approach would not be to start with a ring of subsets of a set S, but a class that contains differences of nested sets and unions of disjoint sets. These are all we need for finite additivity. For countable additivity we also need unions of sequences of disjoint sets. This constitutes (according to Heinz Bauer) a Dynkin system (and this leads to a simpler approach than using monotone classes).
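Spelled out (following Bauer's definition), a Dynkin system $\mathcal{D}$ over $S$ is required to satisfy

$$ S \in \mathcal{D}; \qquad A, B \in \mathcal{D},\ A \subseteq B \;\Rightarrow\; B \setminus A \in \mathcal{D}; \qquad A_1, A_2, \dots \in \mathcal{D} \text{ pairwise disjoint} \;\Rightarrow\; \textstyle\bigcup_n A_n \in \mathcal{D}. $$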
Adding intersections of pairs of sets leads to a sigma ring. With only finite unions we get a ring. Naturally we would like to measure the whole of S; if S is also a member we get an algebra (or, with countable unions, a sigma algebra). (Most probability literature uses the original word "field", but the analogy with a field in algebra is weak, whereas a ring of sets really is a ring, with operations symmetric difference and intersection.)
I think this is a reasonably intuitive approach. Once we realise that we can't assume that all sets are measurable, we ask what absolute minimum properties we require of the class of sets we can measure. For this purpose we don't need intersections, but without them many applications would be hamstrung.
I'll add one more thing (sorry this is so long). When extending measures, Carathéodory's approach is better than using inner measures, which only apply when S has finite measure.
To Terry Moore: I generally agree with your approach. For brevity, I would just make one comment: It should be pointed out at some stage that if one does not insist on countable additivity, all sets can be made measurable.
David: I agree with that, but I haven't read enough about finite additivity to know if there are any problems. I think there might be some complications. From a practical point of view, even finite additivity is going too far. There are fewer than 10^100 atoms in the universe on which to store information, so bounded additivity is more realistic (i.e. we only consider sequences with length less than some universal bound, which need not be specified). However, that makes simple models (such as the Poisson distribution) unnecessarily complicated.
I once believed that it is a theorem that all sets that can be described with a finite sentence are measurable. But apparently that is still a conjecture. If true, then from a constructive point of view, all sets are measurable.
David: I was thinking of the non-conglomerability of finitely additive measures. I don't know if there have been any developments in the 29 years since the following was published:
http://www.hss.cmu.edu/philosophy/seidenfeld/relating%20to%20finite%20additivity/stat%20implications%20fa.pdf
David: We actually do integrate over the rational numbers. Probably the most essential integration formula is that for the integral of x^n over the interval [0,1]. The value of this can be established entirely over the rationals; see the outline after the link below. You can have a look at my Famous Math Problems 10 video at my channel (user njwildberger).
https://www.youtube.com/watch?v=vo-ItaB28f8&index=16&list=PLIljB45xT85Bfc-S4WHvTIM7E-ir3nAOf
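In outline (my paraphrase of the idea, not necessarily how the video develops it), the computation never leaves the rationals:

$$ \int_0^1 x^n\,dx \;\approx\; \frac{1}{N}\sum_{k=1}^{N}\Bigl(\frac{k}{N}\Bigr)^{\!n} \;=\; \frac{1}{N^{\,n+1}}\sum_{k=1}^{N}k^{n} \;=\; \frac{1}{n+1} + O\!\Bigl(\frac{1}{N}\Bigr), $$

since $\sum_{k=1}^{N} k^n$ is a polynomial in $N$ with rational coefficients and leading term $N^{n+1}/(n+1)$ (Faulhaber); every quantity on the right is rational, and the limit $1/(n+1)$ is rational too.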
There are some of us who don't believe in the infinite-precision dream which supports the `real numbers'. If you are interested in why, my recent seminar, `A Socratic look at logical weaknesses in modern pure mathematics', gives some reasons. It is also at my YouTube channel.
Totally agree with you, Norman: the real numbers in their full generality belong to mathematical mythology more than to practical mathematics. But they can be understood without much difficulty by using rounding or truncation for the infinite decimals, and by avoiding the philosophical question of who gives us the next digit, or more generally of how these infinite decimals are generated. And of course in practice some finite accuracy is always enough. From the practical perspective, the whole subject of analysis is about how to deal with approximations, and the real numbers are just convenient mathematical fictions that we use to avoid cumbersome language. By the way, I looked through your book on rational trigonometry shortly after it appeared -- very much fun. :-)