(Previous post on John Gabriel: Calculus 102 (Cauchy Sequences and the Real Numbers))

Alright, now that we’ve tackled the basic definitions and theorems regarding calculus, we can start looking at Gabriel’s first video on calculus.

2. Cauchy’s Kludge

Before we start, let’s play a little game. The above image is a screenshot of Gabriel’s video, and there are two incredibly annoying formal errors in it. They might be nitpicky, and the first one might be either due to the software he uses (is that Geogebra? No idea…) or might just be due to convenience/laziness – but the second one I think definitely points to Gabriel not knowing how to even use logical notation properly. Or at least in a way that makes any sense. Can you find the two errors?

*ding* – time’s up! So here’s the first one:

\[\color{green}{ f'(0.5)=\lim_{1.72\to0.5}\frac{4.08-6.25}{1.72-0.5}=-1.78 }\]

What the hell is \(\lim_{1.72\to0.5}\) supposed to mean? “As \(1.72\) approaches \(0.5\)”? Fixed numbers can’t approach anything! Only variables can. And if it’s not the actual limit for an actual variable, then the \(f'(0.5)=\ldots\) in front is just plain wrong!

But okay, it might be that this just annoys me because I’m a logician and I’ve been somewhat conditioned to have an eye out for formalities like that – after all, he’s just trying to demonstrate how the derivative results from taking the limit of secant lines. And he’s actually not doing a bad job at it – or rather, I assume, his software isn’t. So let’s assume that this weird \(\lim_{1.72\to0.5}\) is also due to the software. The second error, however…

Here’s Gabriel’s definition of a function limit:

\[\forall \varepsilon\color{red}{(\delta)}>0\;\exists\delta>0:\;\forall x\;(0<\mid x-a\mid <\delta \;\Longrightarrow\; \mid f(x)-L\mid <\varepsilon\color{red}{(\delta)})\]

Compare that to my definition of a function limit:

\[\forall\varepsilon>0\;\exists\delta_{\color{red}{\varepsilon}}>0\;\forall x\in A\; (\mid x-p_0\mid<\delta_{\color{red}{\varepsilon}} \;\Longrightarrow\; \mid f(x)-L\mid<\varepsilon)\]

Now, why did I put a lower index \(\varepsilon\) after the \(\delta\)? Well, a formula \(\forall \varepsilon\exists\delta\) translates to “For all \(\varepsilon\), there exists some \(\delta\), such that…” – meaning that the exact value of the \(\delta\) whose existence is posited by the formula may depend on the specific value of \(\varepsilon\). You can think of it as a function \(f_\delta\), that maps each \(\varepsilon\) to some specific value \(\delta\) (in fact: Functions like that, that are the “result of” a \(\forall x\exists y\ldots\)-Formula are called Skolem functions). Point being: the value of \(\delta_\varepsilon\) may depend on your choice of \(\varepsilon\). This makes sense if you think back to the “game interpretation” of the formula: You give me some arbitrarily small number \(\varepsilon\), and in response I choose a value \(\delta\), depending on your \(\varepsilon\), such that the formula holds.

Example: The formula \(\forall x\in\mathbb Z\;\exists y\in\mathbb Z\; x+y=0\) says: “For all integers \(x\) there exists some integer \(y\) such that \(x+y=0\)”. The \(y\) being alluded to here is, of course, just \(-x\), and obviously the value of \(-x\) depends on \(x\) (duh…). In fact, the (in this case unique) Skolem function for this formula is the function \(f_y(x):=-x\) (however, Skolem functions are rarely unique).
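To make the Skolem function idea concrete, here is a minimal Python sketch. The names `skolem_y` and `formula_holds_on` are made up for illustration, and checking a finite sample of integers is of course only a sanity check, not a proof of the universally quantified formula:

```python
# Formula under consideration: for all x in Z there exists y in Z with x + y = 0.

def skolem_y(x: int) -> int:
    """Skolem function f_y: maps each x to the witness y whose existence is claimed."""
    return -x

def formula_holds_on(sample) -> bool:
    """Check x + f_y(x) = 0 for every x in a finite sample of integers."""
    return all(x + skolem_y(x) == 0 for x in sample)

sample = range(-1000, 1001)
all_ok = formula_holds_on(sample)
print(all_ok)
```

The point to take away: the witness is computed from `x`, never the other way around, which is exactly why the dependency is written on the existentially quantified variable.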

So, what does Gabriel want to tell us with \(\forall \varepsilon(\delta)\)? I have no idea. The only thing that I can think of is, that he wants to tell us that the value of \(\varepsilon(\delta)\) may depend on the specific value of \(\delta\) – but that makes no sense whatsoever, given that

  1. the \(\exists \delta\) comes after the \(\forall\varepsilon\) – at the point of the \(\forall \varepsilon(\delta)\), the \(\delta\) that this \(\varepsilon(\delta)\) refers to hasn’t even been introduced,
  1. the \(\varepsilon\) is universally quantified anyway – meaning: The \(\forall\varepsilon\) already says “For all \(\varepsilon\)”, hence I may freely choose any value for \(\varepsilon\) anyway. To note that some value may depend on some other value only makes sense in the case of existential quantifiers, i.e. \(\exists\), and
  1. the \(\delta\) may depend on the \(\varepsilon\), as I explained – hence \(\varepsilon\) can’t also depend on \(\delta\), otherwise you get a circular dependency. How is that game going to work? You choose an \(\varepsilon\) based on my not-yet-made choice of \(\delta\), and then I pick my \(\delta\) based on your \(\varepsilon\) and you choose an \(\varepsilon\) based on my \(\delta\) and… huh?

So, what’s going on here? If you have any idea (or potentially, where he got this specific formula from), let me know, because I don’t. All I can conclude from this is that Gabriel doesn’t know what he’s talking about. Not surprising, but definitely annoying.

Now that we got the formal nitpicking errors out of the way, let’s go for the actual… errm… “content” of that screenshot. Bullshit is colored red:

In order to use the \(\color{red}{\text{ill-defined}}\) limit definition:

\[\forall \varepsilon(\delta)>0\;\exists\delta>0:\;\forall x\;(0<\mid x-a\mid <\delta \;\Longrightarrow\; \mid f(x)-L\mid <\varepsilon(\delta))\]

One \(\color{red}{\text{ must know the value}}\) of \(L\). However, in order to find \(L\), \(\color{red}{\text{one must use}}\) the \(\color{red}{\text{ill-defined}}\) limit definition:


Now you see \(h\):


and \(\color{red}{\text{now you don’t}}\)!


(insert stupid magician-potato-thingy (what the hell is that thing?…) here)

Cauchy’s Kludge (3:57)

Holy shit, that guy 😀 Okay, one by one:

1. I love how he calls the limit definition “ill-defined” and then gives an (apart from the error above) rigorous, formal definition in first-order logic. Seriously, you can’t make something more well-defined than by using first-order logic. I mean, that’s kind-of what first-order logic is for! You can even input this definition in an automated theorem prover and see that it checks out. Just for fun, I did exactly that in our own MMT system:

Tadaa, the formula type checks. Awesome 🙂

2. No, in order to find \(L\) one does not have to use the “ill-defined” (which is in fact perfectly well-defined, see previous post for details) “limit definition”. In fact, you can use whatever method you want to find \(L\). You can conjure it up from the entrails of a chicken, for all I care. It doesn’t matter one bit, where you get your \(L\) from, as long as you can afterwards use the above definition to prove, that the \(L\) you found is in fact the limit.

Actually: One of the things that annoy many (especially undergrad) math students is that often, the limits for the more complicated sequences, series etc. seem to just fall from the sky. The professor announces the limit to be some seemingly arbitrary value, and then he proves that that value really is the limit! And as a student, you’re just left pondering how anyone came up with that value. Probably by reading the entrails of a chicken, who knows. (Of course, because some incredibly smart guy took an incredibly educated guess which happened to work out, or by deriving it by some incredibly elaborate method way too complicated to demonstrate in class.)

And if you prefer an example specifically for derivatives: The derivative of the natural logarithm \(f(x)=\ln(x)\) is \(f'(x)=\frac1x\). I could prove it to you; it’s not even that hard if you know the trick (although you need partial integration, which is why I’m skipping it here). But my point is: I have no idea how to derive that. Seriously. I haven’t tried in a while, it might actually be rather easy, but in class we only proved that it is in fact the derivative, not how anyone realized that it is.
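One cheap way to “conjure up” a candidate for \(L\) is to simply evaluate the difference quotient for a small \(h\) and look at the number. The following sketch does that for \(\ln\); the helper names `difference_quotient` and `guess_derivative` and the step size `1e-6` are arbitrary choices of mine, and floating point output like this is only a hint at what \(L\) might be, which then still has to be proven with the actual definition:

```python
import math

def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h)); needs h != 0."""
    return (f(x + h) - f(x)) / h

def guess_derivative(f, x, h=1e-6):
    """Numerical guess for f'(x): a candidate for L, not a proof."""
    return difference_quotient(f, x, h)

points = [0.5, 1.0, 2.0, 10.0]
guesses = {x: guess_derivative(math.log, x) for x in points}

for x, guess in guesses.items():
    # Print the numerical guess next to the conjectured value 1/x.
    print(x, guess, 1 / x)
```

The guesses land suspiciously close to \(\frac1x\), which is as good a source for a conjecture as chicken entrails.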

3. “Now you see \(h\), now you don’t” – errrm, yeah, because you substituted the thing-containing-\(h\) by the thing-that-by-definition-is-the-previous-thing-containing-\(h\). Are you seriously surprised that, once you define a symbol as a specific term and you replace the term by the symbol that by definition is the same thing as that term, the term is gone? That’s the whole point of definitions, you daft moron! The \(h\) didn’t disappear, it’s still there, you just need to expand the \(f'\) by its definition!

But also: \(h\) is not, like, a specific value or something; it’s a bound variable of the limit. It is a purely syntactic element of the definition; splitting the body \(\frac{f(a+h)-f(a)}h\) from the binder \(\lim_{h\to0}\) in front gives the whole thing a very different semantics. It’s like the difference between \(\forall x\;\varphi(x)\) and \(\varphi(x)\) (not surprisingly – if you expand the definition you get a \(\forall h\neq0\ldots\)) – the latter needs some specific assignment for the variable \(x\) to be a well-formed statement in the first place, while the former is already a well-formed statement. Just sayin’.

4. Apparently definitions are magic tricks for Gabriel. Awesome, apparently I’m a magician then 🙂

But still – does Gabriel maybe have a point here? I mean – the definition of a function limit “requires” some limit to exist, so if we use the definition of the derivative, which contains a function limit, to compute the limit – isn’t that circular reasoning? Well, let’s just expand all the definitions to see whether that works out:

We have \(f'(x):=\lim_{h\to0}\frac{f(x+h)-f(x)}{h}\). Now, expanding the definition, we get: \(L=f'(a)\) if and only if:
\[\forall\varepsilon>0\;\exists\delta_\varepsilon>0\;\forall h\neq0\; \left(\mid h-0 \mid < \delta_\varepsilon\;\Longrightarrow\;\color{green}{\left| \frac{f(a+h)-f(a)}{h}-L \right|}<\varepsilon\right)\]

(Note: the \(h\neq0\) is not cheating; I used an extra \(A\subset\mathbb Q\) for the domain of the function in my definition with the intention that \(A\) does not include the approached value (in this case \(0\)) itself; Gabriel instead uses a \(0<\mid x-a\mid \). Both are fine and amount to “the same thing” in this context (since the fraction is in fact not defined for \(h=0\), hence \(0\notin A\)), even if they’re not strictly equivalent in a logical sense.)

Okay, now note that in the term \(\left| \frac{f(a+h)-f(a)}{h}-L \right|\), we have \(h\neq0\) as a prerequisite; hence having \(h\) in the divisor of the fraction is unproblematic. Meaning, we can easily “compute” this fraction, without changing the truth value of the formula – after all, “computing” just means: We’re changing the expression in such a way that equality is preserved. Let’s do this with the function \(f(x)=x^2\) as an example:

\[\begin{aligned}
\left| \frac{f(a+h)-f(a)}{h}-L \right|&=\left| \frac{(a+h)^2-a^2}{h}-L \right|=\left| \frac{\color{red}{a^2}+2a\color{cyan}{h}+\color{cyan}{h}^2\color{red}{-a^2}}{\color{cyan}{h}}-L \right|\\
&=\color{green}{\left| 2a+h-L \right|}
\end{aligned}\]

Now we can (since they’re equal) substitute the resulting term \(\left| 2a+h-L \right|\) in our original formula, yielding:

\[\forall\varepsilon>0\;\exists\delta_\varepsilon>0\;\forall h\neq0\; \left(\mid h \mid < \delta_\varepsilon\;\Longrightarrow\;\color{green}{\left| 2a+h-L \right|}<\varepsilon\right)\]

Now remember, this formula says: “For all \(\varepsilon>0\), there is some \(\delta_\varepsilon>0\) such that for any \(h\neq0\) with \(\mid h\mid<\delta_\varepsilon\) we have \(\left| 2a+h-L \right|<\varepsilon\)”. Our goal is to find an \(L\) that makes this formula true. And now it’s obvious that if such an \(L\) exists, i.e. if the formula is supposed to be true, it has to be \(L=2a\), because then we have \(\left| 2a+h-L \right|=\mid h\mid\) (for any other \(L\), the value \(\left| 2a+h-L \right|\) stays close to \(\mid 2a-L\mid>0\) for small \(h\), so it can’t drop below every \(\varepsilon\)). To show that this indeed works out, we just need to be able to give a \(\delta_\varepsilon\) such that whenever \(\mid h\mid <\delta_\varepsilon\), then \(\mid h\mid<\varepsilon\). This is trivially true for \(\delta_\varepsilon=\varepsilon\). Hence we have proven that

\[f'(a)=\lim_{h\to0}\frac{(a+h)^2-a^2}{h}=2a\]

and since we didn’t make any assumptions about \(a\) (i.e. the proof works for any \(a\)), the derivative \(f'(x):=2x\) is indeed a well-defined function. Hooray!
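If you don’t trust the choice \(\delta_\varepsilon=\varepsilon\), you can spot-check it mechanically. The following is a minimal sketch using exact rational arithmetic, so no floating point rounding gets in the way. The function names (`delta_for`, `check`) and the sampled values of \(a\), \(\varepsilon\) and \(h\) are my own choices for illustration, and sampling finitely many \(h\) is a sanity check, not a replacement for the proof above:

```python
from fractions import Fraction

def f(x):
    return x * x

def difference_quotient(a, h):
    """(f(a+h) - f(a)) / h, only defined for h != 0, exactly as in the formula."""
    return (f(a + h) - f(a)) / h

def delta_for(eps):
    """The Skolem function from the proof: delta_eps = eps."""
    return eps

def check(a, eps, steps=200):
    """Sample nonzero h with |h| < delta_eps and test |quotient - L| < eps for L = 2a."""
    delta = delta_for(eps)
    L = 2 * a
    for k in range(1, steps + 1):
        h = delta * Fraction(k, steps + 1)  # 0 < h < delta
        for signed_h in (h, -h):
            if abs(difference_quotient(a, signed_h) - L) >= eps:
                return False
    return True

results = [
    check(Fraction(a), Fraction(1, 10 ** n))
    for a in (-3, 0, 1, 7)
    for n in range(1, 6)
]
print(all(results))
```

Note that \(\delta\) is computed from \(\varepsilon\) in `delta_for`, and never the other way around.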

And now note, that (as people do in practice) this whole convoluted reasoning can be done way shorter by simply doing the following:

\[f'(a)=\lim_{h\to0}\frac{f(a+h)-f(a)}h=\lim_{h\to0} (2a+h)=2a\]

This seems to be what Gabriel is annoyed about and what he claims is “ill-defined” – but note, that what we’re doing by computing \(\lim_{h\to0}\frac{f(a+h)-f(a)}h\) is really just a shorthand, convenient way to derive a term in such a way, that we can easily extract a proof that the result of our derivation is in fact the limit we’re looking for according to the formal, rigorous definition of a function limit.

Nothing is ill-defined here! We can, if in doubt, always convert any such derivation to an actual formal proof. Except, of course, if we made a mistake. But then try getting through peer-review…

Now, the question remains why Gabriel would think the definition of a function limit / derivative would be ill-defined. As often with Gabriel, it’s hard to find out what exactly his problem is, but the following might be a clue:

There are a lot of problems with mainstream calculus; it’s flawed for several reasons, as I’ll explain shortly, but one of the main reasons is that in order for this function here to be differentiable at this point here – essentially what Cauchy’s definition is saying is, that it needs to have a derivative at every point in this short interval here.

Cauchy’s Kludge (2:13)

And Gabriel is half-right here (as it turns out, not even that). He’s “wrong” (or at least somewhat inaccurate) in that “this short interval” he’s pointing at is in no way significant. But he’s also wrong in that, for a function to be differentiable at some point \(p\), it does not need to be differentiable at every point in some interval around \(p\). It just turns out that it often is, provided the function under consideration is sufficiently nice (see Umer’s comment below). Either way, that’s not a problem, and it makes sense in light of continuity:

All these limit-definitions and \(\varepsilon\)-\(\delta\)-style definitions have one thing in common: They always talk about what happens when we get arbitrarily close to some value of interest – i.e. they define a property of some point by how this point relates to its neighborhood. It’s not a property that can be established for a point in isolation – only in relation to surrounding points. Which makes sense, if you think about it: If you want to “approach” a limit, the very word “approach” implies that you’re in some sense covering the surrounding neighborhood of that limit, and that often happens to work because of the same property holding for those points you’re covering.

This is basically the very thing that one expects from continuous (i.e. “drawable without lifting the pen”) functions. In fact, one way to define “continuous at some point \(p\)” is by demanding that the function limit at \(p\) exists and is exactly \(f(p)\).

Anyway, what I’m trying to say is: To say “the function \(f\) is differentiable at point \(p\)” often amounts to, but is not actually the same as, “the function \(f\) is differentiable in some interval surrounding \(p\)” – it’s just that the former is easy to express formally, and hence is convenient as a definition. I don’t know why this is supposed to be a problem; that’s just how continuity works!

One last thing: The bottom of the screenshot shows the following quote:

Cauchy had stated in his Cours d’analyse that irrational numbers are to be regarded as the limits of sequences of rational numbers. Since a limit is defined as a number to which the terms of the sequence approach in such a way that ultimately the difference between this number and the terms of the sequence can be made less than any given number, the existence of the irrational number depends, in the definition of limit, upon the known existence, and hence the prior definition, of the very quantity whose definition is being attempted.
That is, one cannot define the number ‘square root of 2’ as the limit of the sequence 1, 1.4, 1.41, 1.414,… because to prove that this sequence has a limit one must assume, in view of the definitions of limits and convergence, the existence of this number as previously demonstrated or defined. Cauchy appears not to have noticed the circularity of the reasoning in this connection, but tacitly assumed that every sequence converging within itself has a limit.

The History of the Calculus and its Conceptual Development (p. 281), Carl B. Boyer

There are two things to say about this:

  1. “are to be regarded as” is not the same thing as “are defined as”. I don’t know if Cauchy actually meant “Irrational numbers are defined as the limits of sequences” here. If he did, he indeed used circular reasoning. In that case, Cauchy was just wrong. People tend to be that quite often. However, I’m actually fine with saying “irrational numbers can be regarded as limits of sequences”. They can be defined in such a way that this makes sense and is not circular (see my last post, where I give three different but equivalent definitions of the real numbers).
  1. Who cares? Cauchy is not the last word on anything to do with Calculus. Nowadays, we don’t define real numbers as “limits” of Cauchy sequences – we can rather just define them as (equivalence classes of) Cauchy sequences directly. There’s no circularity there, and if you don’t like that, define them as decimal expansions, or as Dedekind cuts, or axiomatically, or…
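To see that there is nothing circular about this, here is a small sketch of the sequence 1, 1.4, 1.41, 1.414, … from the quote, built purely from integers and rationals. The names `truncation` and `cauchy_index` are hypothetical helpers of mine. The Cauchy property only ever compares terms of the sequence with each other, so no “square root of 2” has to exist beforehand for the check to make sense:

```python
from fractions import Fraction
from math import isqrt

def truncation(n):
    """n-th term 1, 1.4, 1.41, ... as an exact rational, using integer arithmetic only."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

def cauchy_index(eps):
    """Smallest N with 10**-N < eps; all terms from index N on share their first N decimals."""
    N = 0
    while Fraction(1, 10 ** N) >= eps:
        N += 1
    return N

eps = Fraction(1, 1000)
N = cauchy_index(eps)

# Compare terms pairwise: the definition never mentions a limit.
terms = [truncation(n) for n in range(N, N + 20)]
is_cauchy_on_sample = all(abs(p - q) < eps for p in terms for q in terms)
print(N, is_cauchy_on_sample)
```

Again, checking twenty terms is only a sanity check; the actual argument is that all terms from index \(N\) on agree in their first \(N\) decimal places and hence differ by less than \(10^{-N}\).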

To use this quote to cast doubt on modern mathematics (and it seems clear, that this is what Gabriel wants to do) is typical creationist logic: “Hey, I found something in a book by Cauchy that’s flawed, hence Calculus is wrong” is exactly like saying “I found an error in Darwin’s book, hence the modern theory of Evolution is wrong”. Why do cranks always think, that one single book is the infallible foundation of a whole modern scientific field?

Well okay, in the case of creationists that’s just projection, I assume. They have an infallible book, hence the opposition must have one as well. Gabriel, it seems, regards Plato’s works (as we’ll see) as his infallible bible, hence… modern mathematicians must use Cauchy’s works as their infallible bible? Is that the reasoning here? Who knows…