## Online Dating: Quarantine Edition

You did it. You really did it.

You finally succumbed to quarantine fever and used your credit card to pay the membership fee for Match.com.

Of course, you’re skeptical about the whole “online dating” thing, but you’re optimistic and hopeful—and probably a little excited by the idea that true love might be (literally) just a mouse click away. Really, though, you’re just hoping the opportunity cost of spending $23.99/month will be worth it when you [sigh] *finally* meet that twenty-something, super-rich [insert famous celebrity here] look-alike who is (1) still single (no chance), (2) also a member of Match.com (never happen), and (3) of the opinion you’re more attractive than anyone else they could ever meet (highly unlikely). But you still play the lottery, so you believe anything is possible. (It’s an unfortunate consequence of our sociosexual evolution that callipygian gifts require payment in kind. I think Aristotle said that.)

Anyway, what now? Endless scrolling? Interminable thanks-but-no-thanks messages followed by hasty profile blocking? Wasted Friday and Saturday nights trying to get not-quite [said famous celebrity] 2.0 with the slightly unattractive aquiline profile to pay more attention to you than the comments on their Instagram selfies? There’s a better way. Say “no” to bad lighting and undercooked meat, and say “yes” to ~~the dress~~ the mathematics of probability optimization.

Proba…what?

Right.

What if I told you we can use the power and beauty of mathematics to give you the best chance of finding the ~~lust~~ love of your life? It’s true. Let’s say you’re willing to look at a total of *t* profile pictures sent to you by Match.com’s ostensibly preternatural algorithms. By rejecting the first *r* pictures, you will maximize the probability of finding your “ideal match.” (Call this person *x*.) I know: you don’t believe it, but it really is *that* simple. So, what’s the value of *r* for a given *t*? Technically, the answer is $r \approx t/e$, but we’ll get to that.

First, some ground rules:

- Once you pass on a profile pic, you can’t go back. That person is gone *forever.* [insert crying emoji]
- Once you choose the value for *r*, you must reject every person from the first to the *r*th.
- You must choose the first profile (*x* > *r*th) that’s better than all the others you’ve seen.
- You must choose the last profile you see if you haven’t chosen anyone to that point.
- Once you choose someone, they are guaranteed to accept.

So, what do we know? Well, unfortunately, if your ideal match happens to show up within the first *r* profiles, you’re sunk. Because of rules 1 and 2, the probability of picking *x* (assuming *x* happens to arrive within the first *r* profiles) is, well, zero. To optimize your chances of picking *x*, we need to pick the optimized value for *r*. And to do *that,* we need to calculate the probability of *x*’s location as Match.com sends profile pics to your inbox.

Okay, you can’t pick anyone within the first *r* profiles, but what if the (*r*+1)st profile is your dream date *x*? You’ll pick that person for sure, right? So, the probability is 1. But the probability of the (*r*+1)st profile being your dream date is (gulp) the worst it could be: 1/*t* (assuming a uniform distribution of profile pics). We take the product of these values; that is, “the probability you’d choose the (*r*+1)st profile assuming that person is better than the other *r* profiles” multiplied by “the probability this person happens to be located at the (*r*+1)st position in Match.com’s algorithm.” That happens to be [drum roll, please] (1)(1/*t*) = 1/*t*. For large *t*, that’s not so good. It gets better, though.

What if *x* is the (*r*+2)nd profile in your inbox? Well, you wouldn’t pick *x* in this scenario *unless* the (*r*+1)st profile *wasn’t better than all the previous r profiles*; in other words, the highest-rated profile to that point (i.e., the moment you received the (*r*+1)st profile) was one of the previous *r* profiles (otherwise, you would’ve picked the (*r*+1)st profile). The probability that the highest-rated profile of the first (*r*+1) profiles arrived within the first *r* profiles you rejected is very high, *r*/(*r*+1), and the probability that *x* happens to arrive in the (*r*+2)nd position remains 1/*t*. This is really the tricky part of the whole concept, so do yourself a favor and make sure you get it straight.

So, the total probability of the (*r*+2)nd profile being your dream date is the product of the two probability values we already calculated, that is, $\frac{r}{r+1}\cdot\frac{1}{t}$. Probabilities for the (*r*+3)rd, (*r*+4)th, . . . , (*t*-2)th, (*t*-1)th, and *t*th profiles are calculated similarly, and we simply sum the individual probability values:

$$P(r) = \frac{1}{t} + \frac{r}{r+1}\cdot\frac{1}{t} + \frac{r}{r+2}\cdot\frac{1}{t} + \cdots + \frac{r}{t-1}\cdot\frac{1}{t},$$

and after factoring out *r*/*t*, this simplifies to

$$P(r) = \frac{r}{t}\left[\frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}\right] = \frac{r}{t}\sum_{j=r}^{t-1}\frac{1}{j},$$

which is simply an *r*-multiple of the average of the individual probabilities the *j*th candidate will be *x* with $r \le j \le t-1$.

Now, if we’re going to give you the best chance of finding your ideal *x*, we need to optimize the value for *r*. In other words, giving you the best odds requires knowing how many *r* profiles you must be committed to rejecting given an arbitrary number of profiles *t*. This means we need to optimize $P(r)$, and that requires $P(r-1) \le P(r) \ge P(r+1)$ for arbitrary *t*. The trick is to substitute *r*-1 and *r*+1 into the above equation and solve the inequalities.

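Taking the first case, we have $P(r) \ge P(r-1)$. After substitution, we have

$$\frac{r}{t}\left[\frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}\right] \ge \frac{r-1}{t}\left[\frac{1}{r-1} + \frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}\right].$$

Multiplying by *t* and distributing *r*-1 into the first term on the RHS gives us **(1)**

$$r\left[\frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}\right] \ge 1 + (r-1)\left[\frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}\right].$$

Notice the bracketed expressions in both the LHS/RHS are equal! Now, we only have to deal with coefficients. Let’s do that. Subtracting the LHS from the RHS leaves us with

$$0 \ge 1 - \left[\frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}\right],$$

which, after rearranging, becomes

$$\frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1} \ge 1.$$

If we instead compare $P(r)$ with $P(r+1)$ and follow a similar calculation, we arrive at the other inequality we need:

$$\frac{1}{r+1} + \frac{1}{r+2} + \cdots + \frac{1}{t-1} \le 1,$$

yielding the final result **(2)**:

$$\frac{1}{r+1} + \frac{1}{r+2} + \cdots + \frac{1}{t-1} \le 1 \le \frac{1}{r} + \frac{1}{r+1} + \cdots + \frac{1}{t-1}.$$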

At this point, we can find the optimized value for *r* given an arbitrary *t*. Just make sure the above inequality still holds. For example, imagine Match.com sends you, say, seven profile pics. Then, we have $\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} < 1 < \frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6}$ and *r* = 2 because 19/20 < 1 < 29/20 holds.

What does this mean? Well, given a total pool of seven profiles from which to choose (adhering to the aforementioned restrictions), you would automatically reject the first *two* profiles—no matter who they were—and then choose the very next profile that was *better than the first two you rejected*. Note: To calculate *r*, we use the smallest number of terms (based on the value of *t* we chose) to satisfy both sides of the inequality in **(2)**.

Here are some calculations using the above machinery:

So, if you decide to consider a pool of 50 Match.com profiles, you’d automatically reject the first 18 and choose the first profile better than any of those 18 you rejected. That will give you a 37.43% chance of finding your ideal love match. Sure, you have about a 63% chance of missing *x* using this strategy, but *any* other strategy you choose will decrease your odds (assuming you follow the rules).[1]
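Figures like these can be reproduced with a few lines of Python. The sketch below is only an illustration, not the method used to build the original calculations: the function names `success_probability` and `optimal_cutoff` are made up for this example, and the formula is the factored sum *r*/*t* times (1/*r* + ... + 1/(*t*-1)) derived above. Exact fractions are used so that rounding cannot tip the comparison between neighboring values of *r*.

```python
from fractions import Fraction


def success_probability(r: int, t: int) -> Fraction:
    """(r/t) * (1/r + 1/(r+1) + ... + 1/(t-1)), valid for 1 <= r < t."""
    return Fraction(r, t) * sum(Fraction(1, j) for j in range(r, t))


def optimal_cutoff(t: int) -> int:
    """Brute-force search for the r in 1..t-1 with the highest success probability."""
    return max(range(1, t), key=lambda r: success_probability(r, t))


# Pool sizes mentioned in the text: seven profiles and fifty profiles.
for t in (7, 50):
    r = optimal_cutoff(t)
    print(f"t={t}: reject the first {r}, success probability {float(success_probability(r, t)):.4f}")
```

The brute-force search is deliberately naive. For large *t*, checking inequality **(2)** directly would be faster, but the point here is only to confirm the numbers.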

One thing we can’t fail to notice is that as *t* gets larger, our probability begins to settle around 37%.[2] That’s not a coincidence. In fact, the above inequalities relate to (a bounded subset of) the harmonic series: $\sum_{n=1}^{\infty} \frac{1}{n}$. All the denominators in our rational terms are consecutive positive integers that begin from *r* or *r*+1 and terminate at *t*-1. (Check this.) So, our work thus far can be recast as a function of our two variables (defined by the RHS inequality) based on how we interpret the (upper and lower) Riemann sums of the integral $\int \frac{1}{x}\,dx$, the meshes of which are simply the areas given by our chosen subseries of $\sum \frac{1}{n}$.

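Taking the RHS, we have

$$1 \le \sum_{j=r}^{t-1}\frac{1}{j} < \int_{r-1}^{t-1}\frac{dx}{x} = \ln\frac{t-1}{r-1},$$

and the LHS is

$$\ln\frac{t}{r+1} = \int_{r+1}^{t}\frac{dx}{x} < \sum_{j=r+1}^{t-1}\frac{1}{j} \le 1.$$

Putting it all together, we have

$$\ln\frac{t}{r+1} < 1 < \ln\frac{t-1}{r-1}.$$

As *t* and *r* grow larger, however, we find that

$$\ln\frac{t}{r+1} \approx \ln\frac{t}{r} \approx \ln\frac{t-1}{r-1},$$

and the last inequality above suggests we’re “squeezing” the value of 1 from both sides, yielding the final calculation:

$$\ln\frac{t}{r} \approx 1 \quad\Longrightarrow\quad r \approx \frac{t}{e},$$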

as claimed.

What’s great about this is that your chances don’t decrease the larger *t* is. Whether you’re willing to look through 15 profiles or 15,000, your optimized probability remains the same—about 37%. Also, the variable *t* doesn’t have to deal with profiles; it could involve, say, time. If you’ve allocated five months to find the best venue for your wedding, you’ll reject everything you see until (about) day 56 (= 150/*e*), at which point you’ll choose the first venue that’s better than every venue you’ve seen so far. Of course, venues aren’t like jilted dates: you can always circle back and choose a venue you’ve rejected, but the math works the same way given the assumptions.
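If you’d rather see the 37% show up than take the algebra on faith, a quick Monte Carlo run can play the game for you. This is a rough sketch under the ground rules listed earlier: the function name `simulate`, the pool sizes, the trial count, and the seed are arbitrary choices for illustration, and the cutoff is simply *t*/*e* rounded to the nearest integer (an approximation, so it can be off by one from the exact optimum for small *t*).

```python
import math
import random


def simulate(t: int, r: int, trials: int, rng: random.Random) -> float:
    """Fraction of trials in which the reject-the-first-r rule picks the best of t profiles."""
    wins = 0
    for _ in range(trials):
        ranks = list(range(t))  # higher is better; t - 1 is the ideal match x
        rng.shuffle(ranks)      # profiles arrive in uniformly random order
        benchmark = max(ranks[:r], default=-1)
        # Take the first later profile that beats everyone rejected so far;
        # if none does, we are stuck with the last profile.
        pick = next((x for x in ranks[r:] if x > benchmark), ranks[-1])
        wins += (pick == t - 1)
    return wins / trials


rng = random.Random(0)
for t in (15, 150, 1000):
    r = round(t / math.e)
    print(f"t={t}, r={r}: estimated success rate {simulate(t, r, 2000, rng):.3f}")
```

With only a couple of thousand trials per pool size, the estimates wobble by a percentage point or so, but they should cluster in the high 30s regardless of *t*.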

So, there it is. You have nearly 2/5 probability of finding your ideal date using the above approach. That’s pretty good, actually. It’s not as good as the probability the sun will rise tomorrow, but it’s a lot better than getting even a short-term run in the stock market.

Math even cares about your love life.

Footnotes:

[1] Some strategies alter your chances by modifying our restrictions (e.g., selecting merely *one of the best* candidates, allowing proposal rejections, having full information about the candidates, incurring costs to passing, etc.). (As an example, one prominent design suggests stopping atas opposed to 1/*e*.) Many people consider the classic optimization problem to be unrealistic for these reasons. Of course, this is silliness: The math just gives you the best chance of finding *x*; it doesn’t say anything about the chances of being accepted by *x* or living happily ever after with *x *or whether you would consider less optimal candidates *y* and *z* to be desirable replacements. That’s up to you, butseems worth the rejection risk!

[2] This is why “no-information” optimal stopping is often referred to as “the 37% rule,” an algorithm originating in 1949 as Flood’s “fiancee problem” (recontextualized as “the secretary problem”) and later popularized by Martin Gardner in the February 1960 issue of *Scientific American*.

References:

[1] B. Christian and T. Griffiths, *Algorithms to Live By: The Computer Science of Human Decisions*, New York: Henry Holt and Company, 2016.

[2] J. Billingham, *Kissing the frog: A Mathematician’s Guide to Mating*, *Plus Magazine* **9** (2008) 1-3.

## Bon Appétit: Thanksgiving Edition

Informed readers living in the United States might be aware of the fact that the FDA publishes a regulatory guide, which enumerates the (ignominiously) acceptable limits of various “defects” found in domestic food products. “Defect,” of course, is a government-sponsored euphemism for a variety of food-based atrociousness—including fecal matter, insects and insect parts, insoluble organic material, mold, dirt, maggots, larvae, and rodent hairs.

One particularly egregious example involves that ubiquitous and delectable herb very often used during Thanksgiving feasts throughout North America: sage. The FDA allows *an average* of “200 or more insect fragments per 10 grams” of ground sage (about 14 teaspoons). At a(n average) rate of 20 fragments/gram, what’s the probability you consumed at least TEN insect parts had you prepared Martha Stewart’s traditional bread-stuffing recipe, which only uses about 0.7 grams of sage?

Fortunately, mathematics saves us from the uncertainty surrounding the number of insect parts we’re likely to consume, though we might not like the answer. Let *X* be the number of insect parts found in 0.7 grams of ground sage based on the average-rate limit set by the FDA. We assume a Poisson distribution with mean $\lambda = 20 \times 0.7 = 14$ and calculate the complement of the probability of eating at most nine insect parts:

$$P(X \ge 10) = 1 - P(X \le 9) = 1 - \sum_{k=0}^{9}\frac{e^{-14}\,14^{k}}{k!},$$

which gives us

$$P(X \ge 10) \approx 0.8906.$$

So, the probability of eating at least ten insect parts in a recipe that uses 0.7g of ground sage is a whopping 89%!

And just in case you assume you always avoid the added protein by lucking out as a member of the 11% minority, I have bad news for you: You’re all but guaranteed (with probability above 99.99%) to eat at least two insect parts in 0.7g no matter what you do.
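For anyone who wants to check the arithmetic before dinner, here is a minimal sketch of the Poisson calculation. The helper name `poisson_tail` is invented for this example, and the only inputs are the figures already quoted above: 20 fragments per gram and 0.7 grams of sage.

```python
import math


def poisson_tail(lam: float, k_min: int) -> float:
    """P(X >= k_min) for X ~ Poisson(lam), computed as 1 minus the sum of P(X = k) for k < k_min."""
    below = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_min))
    return 1.0 - below


lam = 20 * 0.7  # average fragments per gram times grams of sage in the recipe

print(f"P(at least 10 fragments) = {poisson_tail(lam, 10):.4f}")
print(f"P(at least 2 fragments)  = {poisson_tail(lam, 2):.6f}")
```

Changing `lam` lets you rerun the numbers for a recipe with more (or, mercifully, less) sage.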

Anybody want seconds?

## Physicists vs. Mathematicians

A physicist who meets the harmonic series for the first time

$$\sum_{k=1}^{\infty}\frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$$

might approach it by summing *n* terms in an attempt to get a feel for what’s happening as $n \to \infty$. Basic logic suggests the series will converge: Each term is getting smaller, and as $n \to \infty$, the individual terms we’re adding to the sum approach zero.

Think about it intuitively for a second.

Imagine I emptied an Olympic-sized swimming pool, handed you a one-liter bucket, and asked you to refill the pool by filling and emptying the bucket according to the terms of the harmonic series. That is, fill the bucket and dump it into the pool, then fill half the bucket and dump it into the pool, fill it a third of the way and dump it, and so forth.

Do you think you’ll ever fill the entire pool using this method?

It seems *unfathomably* unlikely you’d even fill a small fraction of the pool, let alone refill it to its full capacity. But this is why we cannot rely upon intuition to solve problems, and it’s why, in the end, I believe mathematics is superior to physics. The latter relies upon experimentation, the scientific process, estimation, probability, and some amount of intuitive guesswork. The former requires the kind of logical rigor in (dis)proving conjectures that leaves no doubt about its conclusions. The physicist will assiduously dump the first *n* = 250,000,000 buckets into the pool, only to realize she still has more than 2,499,980 liters to go. Forour physicist will need to dump

buckets into the pool. Forshe would need to addbuckets according to the successive terms of the series. Of course, while we must applaud the independently wealthy scientist who possesses the patience and dedication required to engage in such an arduous process, it’s not a practical way to solve problems (or fill pools).
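To get a feel for the scale without hauling any water, we can lean on the standard approximation $H_n \approx \ln n + \gamma + \frac{1}{2n} - \frac{1}{12n^2}$ for the *n*th partial sum, where $\gamma$ is the Euler-Mascheroni constant. The sketch below is an illustration only: the helper names are made up, and the pool volume of 2,500,000 liters (a 50 m by 25 m by 2 m pool) is an assumption chosen to match the figures in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329
POOL_LITERS = 2_500_000  # assumption: 50 m x 25 m x 2 m pool


def harmonic(n: int) -> float:
    """Approximate H_n = 1 + 1/2 + ... + 1/n via ln(n) + gamma + 1/(2n) - 1/(12n^2)."""
    return math.log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n * n)


def log10_buckets_needed(target_liters: float) -> float:
    """log10 of the rough n at which H_n reaches target_liters, using H_n ~ ln(n) + gamma."""
    return (target_liters - EULER_GAMMA) / math.log(10)


poured = harmonic(250_000_000)
print(f"liters poured after 250,000,000 buckets: {poured:.3f}")
print(f"liters still to go: {POOL_LITERS - poured:,.3f}")
print(f"buckets needed to fill the pool: about 10^{log10_buckets_needed(POOL_LITERS):,.0f}")
```

The bucket count is reported as a power of ten because the number itself is far too large to store as a floating-point value.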

Mathematicians, however, have proven that the infinite sum of the harmonic series actually *diverges*; the proof given by Nicole d’Oresme (1323-1382) is one of my all-time favorite proofs, and its utter simplicity and logical beauty deserve repeating here.

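**Proof.** Grouping the terms, we have

$$\sum_{k=1}^{\infty}\frac{1}{k} = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \cdots,$$

which is greater than

$$S = 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) + \cdots,$$

a sum that simplifies to

$$S = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots.$$

Clearly, if the series *S* is smaller than $\sum_{k=1}^{\infty}\frac{1}{k}$ and *S* is divergent, then the harmonic series must be divergent.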
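The grouping argument says that the first $2^m$ terms of the harmonic series always add up to at least $1 + m/2$. A finite check is no substitute for the proof, but it is reassuring to watch the bound hold. In this small sketch, the name `harmonic_partial` and the range of *m* are arbitrary choices; exact fractions keep rounding out of the comparison.

```python
from fractions import Fraction


def harmonic_partial(n: int) -> Fraction:
    """Exact partial sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for m in range(0, 9):
    n = 2**m
    total = harmonic_partial(n)
    bound = 1 + Fraction(m, 2)  # one extra 1/2 for every completed group
    print(f"n = 2^{m} = {n:>3}: H_n = {float(total):.4f} >= {float(bound):.1f} is {total >= bound}")
```

Because the lower bound grows without limit as *m* increases, so do the partial sums, which is exactly the divergence the proof establishes.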

This means that, eventually, you’ll not only be able to fill the pool, but you’ll be able to fill an infinite number of pools an infinite number of times…and it only took a few seconds to figure it out.

No buckets involved.

## The Life of Pi

I still think it’s pretty cool that

$$\int_{-\infty}^{\infty} e^{-x^{2}}\,dx = \sqrt{\pi},$$

where the area of an *infinitely* wide region equals a single, real-valued number. Of course, $\pi$ is irrational, which means, among other things, that we can never actually “write down” an exact value for it, and in that sense, it’s an intuitive equality: $\pi$ is “infinite” in a way that models the unbounded region under the Gaussian curve.

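**Proof.** Let *G* be the Gaussian integral. Then,

$$G^{2} = \left(\int_{-\infty}^{\infty} e^{-x^{2}}\,dx\right)\left(\int_{-\infty}^{\infty} e^{-y^{2}}\,dy\right) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^{2}+y^{2})}\,dx\,dy.$$

This can be transformed into polar coordinates:

$$G^{2} = \int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^{2}}\,r\,dr\,d\theta$$

because $x^{2} + y^{2} = r^{2}$ and $dx\,dy = r\,dr\,d\theta$. Thus, we have

$$G^{2} = 2\pi\int_{0}^{\infty} r e^{-r^{2}}\,dr = 2\pi\left[-\tfrac{1}{2}e^{-r^{2}}\right]_{0}^{\infty} = \pi.$$

And because $G > 0$, we know $G = \sqrt{\pi}$, as desired.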
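A numerical sanity check agrees with the proof. The sketch below is only an illustration: it applies the trapezoid rule on the interval from -10 to 10, a truncation chosen by assumption because the integrand is already smaller than $e^{-100}$ at the endpoints, and the function name and step count are arbitrary.

```python
import math


def gaussian_integral(half_width: float = 10.0, steps: int = 20_000) -> float:
    """Trapezoid-rule estimate of the integral of exp(-x^2) over [-half_width, half_width]."""
    h = 2 * half_width / steps
    # The two endpoints each carry half weight in the trapezoid rule.
    total = 0.5 * (math.exp(-half_width**2) + math.exp(-half_width**2))
    for i in range(1, steps):
        x = -half_width + i * h
        total += math.exp(-x * x)
    return total * h


print(f"numerical estimate: {gaussian_integral():.12f}")
print(f"sqrt(pi):           {math.sqrt(math.pi):.12f}")
```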

Think about this for a second.

With only a few simple techniques, we’ve explicitly evaluated the area of an *infinitely* wide region (i.e., the area under the Gaussian curve) without spending an infinite amount of time calculating an infinite number of definite integrals as the curve gets closer and closer to the *x*-axis.

Of course, the Gaussian integral isn’t the only instance where $\pi$ makes an appearance. There are plenty of other examples involving infinite sums and products where $\pi$ plays an indispensable role. Here’s a sample (involving only a fraction of Euler’s discoveries alone!):

Discovered or invented, that’s pretty awesome.

## Sleepless in Seattle

**Conjecture 1.1.** As a sequence of eventswhere *b* is bedtime, then

Ifis sufficiently small andis the event at which point I take my DHA supplements, then there exists a mapfrom real-world events to memesappropriately defined, such that I lose sleep.

## Where Have All the (Good) Lawsuits Gone?

Question: What do Ed Sheeran, Led Zeppelin, Lady Gaga, and Katy Perry have in common?

Answer: They have all been accused of copyright infringement.[1]

It’s no surprise our particular brand of postmodernism has led us to embrace a significant amount of litigiousness (socioeconomic and otherwise), but what’s the impetus behind the sudden onslaught of *copyright-infringement* cases we see in the music industry? Is it simply a matter of finding a legal opportunity to make a quick buck at an artist’s expense, a calculated financial score within a tech- and service-heavy labor market that no longer has any real use for musicians (or their musical ruminations)? Avarice certainly cannot be dismissed, but maybe there are other possibilities.

“Gen-ed” music education has suffered a precipitous decline in recent decades, which means it’s not unreasonable to assume some artists likely don’t have the requisite compositional awareness to avoid crossing the legal boundaries prescribed by copyright law. (And some artists [read: singers] don’t even write their own material.) It’s also possible many musical artists and casual music fans (at least within the millennial and Gen-Z cohorts) are uninterested in any discographies prior to the year 2000. Should we expect twenty-somethings to recall sufficiently the melodic contours of “Eleanor Rigby,” “A Farewell to Kings,” or “Kashmir” in an attempt to avoid even the slightest melodic or harmonic evocations when in the throes of creative expression?

Perhaps we must resign ourselves to the frightening possibility that modern pop/rock music has reached a saturation point, a terminus where we’ve exhausted a sufficient number of the finite intelligible combinations (using, say, three or four diatonic chords and the most basic time signatures) such that it should now be considered inevitable one artist will sound like another within the same musical space.

Have we crossed some sort of musical Rubicon?

If we limit our compositional options to four diatonic triads—say, I, vi, IV and V—and their chord tones, there are only 3^4 = 81 possible melodies you could write.[2] That’s hardly a deep well of melodic variety. And if analysis (e.g., prolongational/Schenkerian theory, etc.) “reduces” melodic structures to these kinds of skeletal designs, there seems to be little hope of avoiding an infringement charge when expert witnesses are summoned to the scene of the crime. Yes, how one decorates the basic melodic structure *inter alia *with non-chord tones, suspensions, and (consonant support of) passing tones greatly increases the number of note-to-note possibilities—and this is significant when evaluating a legal threshold for infringement—but there are only so many ways to descend from c² to f¹ in the key of F major and still get Spotify streams. Someone needs to say it: Within a sufficiently simplified harmonic structure, your melody will sound a lot like something that’s already been written.
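The count of 81 is easy to confirm by brute force. The sketch below makes a few assumptions for the sake of illustration: the key of C major, one chord tone per chord with the chords played in the order I, vi, IV, V, and note names only (octave and rhythm are ignored).

```python
from itertools import product

# Chord tones for I, vi, IV, and V in C major (an assumed key for illustration).
chords = {
    "I": ("C", "E", "G"),
    "vi": ("A", "C", "E"),
    "IV": ("F", "A", "C"),
    "V": ("G", "B", "D"),
}

# One melody = one chord tone chosen over each of the four chords, in order.
melodies = list(product(*chords.values()))

print(len(melodies))
print(melodies[:3])
```

Three choices over each of four chords gives 3 × 3 × 3 × 3 skeletal melodies, which is the 3^4 in the text.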

So, what about the lawsuits?

In his entertaining and surprisingly-detailed-for-YouTube discussion of the Ed Sheeran lawsuit (see the first link in [1]), Adam Neely focuses on the harmonic similarities between the two songs in question: “Thinking Out Loud” and “Let’s Get it On.” There, he suggests the “iii” (mediant) chord in “Let’s Get it On” enjoys “relative dominant” function based on the text *Harmony Simplified* by Hugo Riemann (REE-mahn), whom Neely mistakenly refers to as Hugo “Reimann” (RYE-mahn).

This is incorrect.

An orthodox Riemannian functional analysis would understand the mediant in this context as having tonic function, prolonging the Eb-major tonic as the *Leittonwechselklang* (or “leading-tone change chord”) of I. Even the voice-leading taken from the lawsuit’s relevant musical example suggests such a hearing (Figure 1.1). A dominant-function analysis (i.e., *T Dp*) would suggest dominant prolongation beginning at chord 2, which, under other circumstances, might be a plausible hearing, but it’s simply an unreasonable claim in this case. So, a proper functional analysis of the I-iii-IV-V progression in “Let’s Get it On” would bewhich (barely) distinguishes itself from the D :* T –> S D* structure of “Thinking Out Loud.”[3] The progressions are as similar as they can be without being exactly the same.[4]

But why should this matter? After all, you can’t copyright a chord progression any more than you can copyright the color blue or the Poisson distribution.[5] Ideas, musical or otherwise, need room to breathe, and harmonic progressions have long been considered the canvas upon which musicians paint their artistic visions.[6] So, even if we admit Sheeran’s backing tracks to the verses are essentially identical to “Let’s Get it On,” is that enough to justify a *100-million-dollar* judgment for the plaintiff?[7] As we consider other musical parameters when we listen to both excerpts (e.g., rhythm, meter, timbre, instrumentation, form, tempo), we must conclude that a reasonable listener would consider the musical expressions to be nearly identical, and the lawsuit’s language certainly makes that argument:

‘Thinking Out Loud’ copies various elements of ‘Let’s Get it On,’ including but not limited to the melody, rhythms, harmonies, drums, bass line, backing chorus, tempo, syncopation and looping. (2)

That’s quite a damning claim—and with respect to the backing tracks to the verses, the accusation seems quite justified—but in the process of dismissing the lawsuit as frivolous and overreaching, both Neely and Beato argue the songs have “very different melodies,” almost suggesting melodic content should be paramount when assessing these kinds of infringement claims. I’ve already suggested melodic content might be more limited than we think, and engaging in a complete analysis of both melodic structures is beyond the scope of this (already-too-long) blog post, but it seems imprudent to dismiss any and all similarities (and, perhaps, err in adjudicating a legitimate lawsuit in the process) based solely on one’s inability to make deep connections beyond the foreground on a cursory hearing. The lawsuit suggests a number of structural similarities between melodies, and the reader can decide if those arguments have any merit.

Is it really the case, though, that melody is always the defining factor in identifying musical plagiarism? Let’s try an experiment. I used a PRNG to alter each note of the following melody (and its “harmonic” support) such that it (i.e., the melody) is completely unrecognizable (and would be wholly immune to accusations of copyright infringement), but I kept the rhythmic and metric structures *exactly the same *as the original. Can you guess the song based solely on these details? (Comment with your guesses.)

I hope this convinces you that, in some cases, melody isn’t everything. (And that’s why I think Spirit might have a claim against Led Zeppelin.)

What really bothers me about these lawsuits, though, is not their (ostensible?) frivolousness; it’s what they say about the current state of the arts and humanities in the United States. The NEA and NEH clamor for funding and spend a considerable amount of time trying to convince taxpayers that lionizing and privileging STEM knowledge will eventually sterilize us, leaving in its wake an eviscerated, money-centric existence without Whitman scholars to provide those elusive reasons to continue living. But, here, in high-profile copyright cases like these, the NEA, reified by the academic music-theory community, has a wonderful opportunity—really, an obligation—to step onto the big stage and justify its existence by properly adjudicating these legal disputes. Music theorists *finally* have a chance to transform their compendium of useless arcana into a definitive, practical application that can benefit society (think the birth of econometrics in the 1950s), yet, to this point, they have failed to do so. Theorists who care at all about their profession should be embarrassed by the fact that the outcome of this lawsuit (and others like it) will probably be determined by the kind of music-theory-for-dummies, paint-by-numbers analysis one finds in the legal complaint.

Too bad Ed Sheeran might have to pay the 100-million-dollar tab as a result.

Footnotes:

[1] Rather than wade through all the dirty details of each lawsuit, you can familiarize yourself by watching this, this, this, and this. For the sake of expediency, I will refrain from taking exception to many of the details (historical or otherwise) that are communicated in the videos (e.g., Beato’s “line clichés,” etc.).

[2] That is, one chord tone for each harmony with each chord played successively—e.g., E/(C-E-G), C/(A-C-E), B/(G-B-D), and C/(C-E-G). One only need compare the opening of Leslie Bricusse’s “Candy Man” and Stephen Sondheim’s “No One is Alone” from the musical *Into the Woods *to be convinced of the mathematical limitations of melodic design.

[3] The neo-Riemannian theorist would analyze this as **L**(Eb+) = G-, where the “LPR group” recasts the *Schritt-Wechsel* group, which is isomorphic to the non-commutative dihedral group D12.

[4] Key choice is irrelevant when analyzing the functional or prolongational relationships between two harmonic progressions.

[5] Would it matter, from a legal perspective, if a chord progression was so unusual that the probability of two different artists composing it was well beyond chance?

[6] We can’t, as Neely does, dismiss Bach’s use of the contested harmonic progression while invoking the historicity of sixteenth-century imitation masses. Either music history is relevant or it’s not, and we can’t jettison Bach on anachronistic grounds while embracing Palestrina’s received practice of using preexisting material.

[7] Requirements for academic plagiarism would likely have already been met. Do we have any more control over the words we type into ~~Microsoft Word~~ LaTeX than the notes we scribble on our staff paper?

Postscript: Bonus points to those who made the connection between the title of the post and VH’s 1982 track “Where Have All the Good Times Gone?” on *Diver Down*.

## Dear Backstreet Boys

You should have used the ballad I wrote for you guys in 1999.

It would have been a hit.

Sincerely,

∎

## Is God a Mathematician?

This is part *deux* to my post “Amazon’s Primes,” where I explore prime numbers in an effort to proselytize my belief that (most of) mathematics is discovered rather than invented. I’ve since stumbled upon Barry Mazur’s interesting 2008 paper “Mathematical Platonism and its Opposites,” in which he addresses this intellectual bifurcation—what he calls “The Question”—and I thought I’d briefly respond to his comments.

Mazur begins by framing the Platonic position as follows:

If we adopt the Platonic view that mathematics is discovered, we are suddenly in surprising territory, for this is a full-fledged theistic position. Not that it necessarily posits a god, but rather that its stance is such that the only way one can adequately express one’s faith in it, the only way one can hope to persuade others of its truth, is by abandoning the arsenal of rationality, and relying on the resources of the prophets.

At best, this is incomplete; at worst, it’s dismissive. Why can’t the universe possess an incredibly high degree of order/structure without the inevitable recourse to theism? Theists certainly believe God is the locus of such “rationality” (through design), as I do, but there are a number of mathematical concepts (like the organizing principle behind prime numbers) that don’t at all assume the existence of a supernal being. Either you can arrange a set of objects with a given cardinality *c* into subsets with equal cardinalities (less than *c*) or you can’t. That will be the case in any universe with discrete objects, which was a critical point of my initial post.[1]

Mazur continues with his “Do’s and Dont’s for future writers promoting the Platonic…persuasions”:

One crucial consequence of the Platonic position is that it views mathematics as a project akin to physics, Platonic mathematicians being—as physicists certainly are—describers or possibly predictors—not, of course, of the physical world, but of some other more noetic entity. Mathematics—from the Platonic perspective—aims…to come up with the most faithful description of that entity. This attitude has the curious effect of reducing some of the urgency of…rigorous proof. Some mathematicians think of mathematical proof as the certificate guaranteeing trustworthiness of, and formulating the nature of, the building-blocks of the edifices that comprise our constructions.

Without proof: no building-blocks, no edifice. Our step-by-step articulated arguments are the devices that some mathematicians feel are responsible for bringing into being the theories we work in. This can’t quite be so for the ardent Platonist, or at least it can’t be so in the same way that it might be for the non-Platonist. Mathematicians often wonder about…the laxity of proof in the physics literature. But I believe this kind of lamentation is based on a misconception, namely the misunderstanding of the fundamental function of proof in physics. Proof has principally…a rhetorical role: to convince others that your description holds together, that your model is a faithful re-production, and possibly to persuade yourself of that as well.

I’m not sure why Mazur thinks the platonist position would vitiate an understanding of, and urgency for, rigorous proof. In fact, I’d argue he has it reversed. Rigorous proof (as opposed to the mere descriptions of physics, which are entirely motivated by, and subject to, the scientific method) is precisely the thing that bolsters the Platonic view of mathematics as being “out there” and not “in us.” I don’t believe there is an infinite number of primes because there exists some computer that’s been spitting out ever-larger primes for the last decade.[2] That would constitute the kind of “rhetorical proof” Mazur attributes to physics. No, we mathematicians believe there’s an infinite number of primes *because we have a rigorous proof* of that claim, and we can confidently power down the computer because it will never, ever find a “largest prime.”[3]

Mazur concludes by arguing that:

in the hands of a mathematician who is a determined Platonist, proof could very well serve primarily this kind of rhetorical function…and not…have the rigorous theory-building function it is often conceived as fulfilling. My feeling, when I read a Platonist’s account of his or her view of mathematics, is that unless such issues regarding the nature of proof are addressed and conscientiously examined, I am getting a superficial account of the philosophical position, and I lose interest in what I am reading. But the main task of the Platonist who wishes to persuade non-believers is to…communicate an experience that transcends the language available to describe it. If all you are going to do is to chant credos synonymous with “the mathematical forms are out there”—which some proud essays about mathematical Platonism content themselves to do—well, that will not persuade.

This is a fair point and one, I believe, I successfully sidestepped in my initial post. There, I tried to communicate the notion of primality in a way that transcends the kind of man-made hieroglyphics and symbolic logic that inevitably emerge with formal definitions of mathematical concepts.[4] But the larger question remains: Why should the platonist position view formal proof as a “rhetorical function” at all? Can’t we pursue “rigorous theory-building functions” while also realizing such a pursuit is synonymous with the process of uncovering (and not merely describing) ~~God’s design~~ nature’s inherent structure? We can, but, more than that, I believe an edifice of theory-building that rests upon an immanent Platonic framework does stake a stronger claim. If theory-building was nothing more than the progeny of (human-)contrived logic, I think we would feel much less secure in what we know; for example, could we ever be certainis *genuinely* “irrational,” however we chose to define that term, or does our subjective and myopic system of logic create a mirage of knowledge that wouldn’t necessarily be true in every (or any) other universe?

Perhaps the best argument for mathematical platonism (MP) involves what might follow from the argument against it: If mathematics is, in fact, invented, can we objectively *prove* anything? Can we imagine a universe where, say, Peano’s axioms or ZF(C) set theory didn’t apply? Can we imagine a universe where something isn’t equal to itself? Where equality isn’t transitive with respect to the “natural numbers” (i.e., if *x* = *y* and *y* = *z*, then *x* = *z* where *x*, *y*, *z* are elements of the natural numbers), or where the union of two non-empty sets *A* and *B* doesn’t equal some set *C* containing the elements $\{x : x \in A \text{ or } x \in B\}$? What’s the likelihood we’ve designed the most perfect logical infrastructure for mathematics that also makes it impossible for us to differentiate between objective Truth and self-referential consistency? How do we explain that “‘the enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious’ where ‘there is no rational explanation for it’”?[5] The platonic atheist seems to have two choices: (1) MP exists because an infinite multiverse demands the existence of at least one universe with such objective logical consistency and (2) MP exists because the universe is a mathematical construct. An infinite multiverse bears the burden of being far too permissive (everything that might exist *must* exist), and the idea of living in a mathematical construct, which is Max Tegmark’s controversial proposition, just seems impossible to embrace with too much enthusiasm.

But what *about* the primes? I said we have a rigorous proof for the infinitude of primes, but what does that proof look like? Can we see anything in its mechanics that might suggest MP is a false assertion, that it’s nothing more than some sleight-of-hand constructed entirely from man-made logical principles?

You be the judge.

**Theorem:** There exists an infinite number of primes.

**Proof (Euclid):** Assume the set of primes $P = \{2, 3, 5, \ldots, p\}$ is finite with $p$ the largest prime. Define $Q = p! + 1$. Because every positive integer greater than 1 has a prime factor, *Q* must have a prime factor. Call it *q*. (Note that because the ring of integers is closed under multiplication, $p! \in \mathbb{Z}$, and so $Q \in \mathbb{Z}$.) Thus, we have $Q = qk$ for some positive integer *k*. If $q \le p$ (because *p* is assumed to be the largest prime), then it must be the case that $q \mid p!$ (because $p!$ contains *all the primes* as factors), which means, of course, $p! = qr$ for some positive integer *r*. From the equality above and substituting appropriately for *Q* and *p*!, we have $qk = qr + 1$, or $q(k - r) = 1$. This means $q = k - r = 1$ if their product equals 1, and this contradicts the claim that *q* is prime. So, *q* > *p*, and because *p* was arbitrarily chosen to be the largest prime in a finite set of primes $P$, it follows that there is no largest prime.
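For readers who like to watch the gears turn, here is a minimal Python sketch of the construction, assuming the standard choice *Q* = *p*! + 1. The helper names (`smallest_prime_factor`, `euclid_witness`) are my own and purely illustrative; the only point is that the smallest prime factor of *Q* always lands beyond *p*.

```python
from math import factorial


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def euclid_witness(p: int) -> int:
    """Return the smallest prime factor q of Q = p! + 1."""
    return smallest_prime_factor(factorial(p) + 1)


for p in (2, 3, 5, 7, 11):
    q = euclid_witness(p)
    print(f"p = {p:2d}, Q = {factorial(p) + 1}, smallest prime factor q = {q}")
```

For *p* = 5, for instance, *Q* = 121 = 11 × 11, so the prime the proof promises is 11. Trial division is used here for transparency, not speed.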

“*It is the glory of God to conceal a thing: but the honor of kings [is] to search out a matter*.” ~ Proverbs 25:2 (KJV)

Footnotes:

[1] The following quote from Max Tegmark is quite *apropos*: “Think of mathematical symbols as mere labels without intrinsic meaning. It doesn’t matter whether you write, ‘Two plus two equals four,’ ‘2 + 2 = 4,’ or ‘Dos más dos es igual a cuatro.’ The notation used to denote the entities and the relations is irrelevant; the only properties of integers are those embodied by the relations between them. That is, we don’t invent mathematical structures—we discover them, and invent only the notation for describing them.” (*Our Mathematical Universe*)

[2] This is precisely what the GIMPS project does—for Mersenne primes, at least.

[3] Actually, there are many proofs of the infinitude of primes.

[4] I later realized my discussion of prime numbers might have suggested we build a bridge between mathematical Truth and physical processes (e.g., arranging physical objects), and that’s not my position at all. (That *is *the project of physics.) Sometimes, however, it’s convenient to reference the physical world as a way to access certain abstract mathematical ideas (e.g., rigid *n*-gon rotations in group theory, etc.).

[5] Max Tegmark quoting Wigner in *Our Mathematical Universe* (355).

## The Harlem Disktrotters (Est. 2019)

I met a flat-earther (FE) for the first time about a year ago during an informal holiday gathering, and I really didn’t give it much thought at the time. I simply dismissed it as the biased reasoning of someone who seemed predisposed toward conspiracy theories. I have only recently, however, realized the flat-earth position (FEP) has, at some surreptitious point, morphed into a fully fledged global *movement* (pun intended). In service of understanding that movement, then, I spent some time watching a variety of YouTube debates and reading various material on the Internet. What I’ve discovered is that the FEP has a ready-made (if insufficient) defense for almost every conceivable pro-sphere counterpoint, some of which involve the following topics:

- curvature at the horizon
- rotation of the constellations
- lunar eclipses
- changing lengths and angles of shadows
- satellite and telescopic photographs
- issues of perspective
- cosmological consistency
- momentum and inertia
- non-Euclidean geometry

But there’s one thing FEs can’t really refute: gravity. And a significant part of their inability to do so involves our subjective, corporeal interaction with it. It’s easy to dismiss photographs of a spherical earth as dissembling “composites” that evince a global conspiracy because—once we decide everyone is in on the cover-up—it’s a claim that’s impossible to disprove. Think what you want about the principle of Popperian falsification, but it’s still the best game in town for differentiating between legitimate scientific inquiry and an intellectual hatchet job.

So, what about gravity? Imagine we take an infinitely thin cross-section of the spherical earth so we can operate in two dimensions. Recall from differential geometry that the unit normal vector **N** = **N**(*t*) at some point *P* = *P*(*t*) on a smooth curve at parameter *t* is defined as

$$\mathbf{N} = \rho\,\frac{d\mathbf{T}}{ds},$$

where $\rho$ is the radius of curvature and *s* is arc length. If **T** is defined as $\mathbf{T} = d\mathbf{r}/ds$ (the unit tangent vector of the curve **r**), then

$$\mathbf{N} = \rho\,\frac{d^2\mathbf{r}}{ds^2},$$

and **N** is the orthogonal unit vector to **T** (i.e., $\mathbf{T} \cdot \mathbf{N} = 0$) that points to the concave side of the curve. Look at figure 1. The vector **T** (red line) is tangent at point *D* to our imagined cross-section of the earth centered at *S*, and the vector **N** is the normal vector at *D*.[1]

The normal vector **N** models how gravity acts upon objects resting on the earth’s surface; that is, the mass of the earth “pulls” objects to the center of mass/gravity *along the vector that is orthogonal to a given tangency point *(for some **T**).[2] This is why we experience gravity exactly the same way *no matter where we are on the planet*. Any point we occupy at a given time—say, standing in front of the Barnes & Noble in Bayside, Queens (40.7805° N, 73.7764° W)—will always be a point of tangency such that there will exist a normal vector that represents the direction of the orthogonal pull of gravity toward the center of the earth.

This would not be the scenario if the earth were a flat disk. On a disk, we would only experience gravity as we usually do if we were located at the disk’s center. Why? Because that is the only place (on a disk) where the gravitational pull would involve a vector orthogonal to the tangent plane (i.e., the earth’s surface).[3] As soon as we begin moving away from the center, toward the edge of the disk, the center of gravity begins pulling at an angle that is not orthogonal to the tangent plane, and the further we get from the center, the greater the angle. See figure 2. As we traveled to the edge of the disk, gravity—at least based upon our usual expectations—would become almost nonexistent.
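The tilt described above can be sketched numerically. This is only a toy model: it assumes, as the argument in this post does, that the disk pulls toward a single center of mass, and both the disk radius and the depth of that center below the surface are hypothetical numbers picked for illustration. (A real extended disk would require integrating over its mass.)

```python
import math

DISK_RADIUS_KM = 20_000.0  # hypothetical disk radius
COM_DEPTH_KM = 1_000.0     # hypothetical depth of the center of mass below the surface


def pull_angle_from_vertical(distance_km: float, depth_km: float = COM_DEPTH_KM) -> float:
    """Angle (degrees) between the pull toward the center of mass and the
    local vertical, at a horizontal distance distance_km from the disk's center."""
    return math.degrees(math.atan2(distance_km, depth_km))


for d in (0.0, 1_000.0, 5_000.0, DISK_RADIUS_KM):
    angle = pull_angle_from_vertical(d)
    print(f"{d:8.0f} km from center: pull tilts {angle:5.1f} degrees from vertical")
```

Under these assumptions the pull is vertical only at the center, and the tilt grows steadily toward the edge.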

There are many examples.

Here are a few: Balls thrown vertically into the air would travel toward the center of the disk—not up and down—and they would travel further using the same force the closer we got to the edge of the disk. Trees growing beyond the center of the disk would grow on angles (negative gravitropism) based on the deviation from orthogonality.[4] Even walking to the edge of the disk would involve an angular pull of gravity on our bodies, as if one were walking up a hill whose incline kept increasing; eventually, if we could traverse the edge of the disk and stand on its side, we’d experience gravity as we normally do because, at that point,and the gravitational pull would, again, be orthogonal to our position. Of course, none of this is at all what we experience—at baseball games, hiking through the forest, or walking around on the coast of the Bering Sea. We always experience gravity as a (normal-vector) force that’s “pulling us” (essentially) straight down into the earth. Why?

Because the earth is a sphere.

FEs must have realized gravity is a significant problem for their worldview because they already fabricated a solution: eliminate it altogether. As explained by “6or1/2Dozen” on the Flat Earth Society website:

In most flat earth models, the force perceived as gravity is the disc of the planet being accelerated upwards by [a] force know [sic] as the Universal Accelerator (UA). The Universal Accelerator, as the name might imply, accelerates universally, so that the Sun, Moon and planets accelerate at the same rate as the disc of the earth. [There is also an infinite plane model in which gravity is gravity, but requires the Earth to be an infinite plane with infinite mass]. Please note that although it is called the Universal Accelerator, it does not actually accelerate things universally, objects like people, trees, small rocks and cheese are exempt.

Thus, to circumvent the issue of orthogonality we’ve been discussing, FE proponents simply jettison the inconvenience of gravity—Newton and Einstein be damned!—and replace it with a nebulous and problematic concept of a “Universal Accelerator” (UA) where the entire solar system (i.e., the firmament under the dome that covers the disk) is “accelerating” at the same rate.[5]

There are, of course, many objections to the UA theory. As mentioned in note 5, increasing acceleration means the earth will surpass the speed of light, which is impossible to reconcile with any form of reality. And if the velocity of the firmament is constant (i.e., *dv*/*dt* = 0), then the force that mimics gravity would be equal at all points on the disk and objects would always fall at the same rate. Unfortunately, this is not what we experience. Balls dropped from the same height (say, at the top of a mountain) will at different times during their journey enjoy different velocities. For example, the equation for the velocity of an object in freefall is *v* = *gt*, where *t* is time (in seconds). If Ball A falls for three seconds, it will have a velocity of (9.8(3)=) 29.4 m/sec. If Ball B is then dropped two seconds after Ball A, then Ball B (at *t* = 3) has a velocity of (9.8(1)=) 9.8 m/sec. So, either acceleration is zero (i.e., velocity is constant), which means objects in freefall should fall at the same rate (when they don’t), or the disk is traveling faster than the speed of light (yet our clocks are still moving).[6]
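The arithmetic here is easy to check. The sketch below assumes the same *g* = 9.8 m/sec² used in this post, a rounded speed of light, and the naive (non-relativistic) formula *v* = *gt*; the function name is my own.

```python
G = 9.8                 # m/s^2, the value used in the post
SPEED_OF_LIGHT = 3.0e8  # m/s, rounded

def freefall_velocity(seconds_falling: float, g: float = G) -> float:
    """Velocity (m/s) after falling from rest for seconds_falling seconds."""
    return g * seconds_falling

ball_a = freefall_velocity(3.0)        # dropped at t = 0, observed at t = 3
ball_b = freefall_velocity(3.0 - 2.0)  # dropped at t = 2, observed at t = 3
print(f"Ball A: {ball_a:.1f} m/s, Ball B: {ball_b:.1f} m/s")

# Naive time for a disk accelerating at g from rest to reach light speed.
days_to_light_speed = SPEED_OF_LIGHT / G / 86_400
print(f"v = g*t reaches c after about {days_to_light_speed:.0f} days")
```

On these assumptions, a disk accelerating steadily at *g* would hit light speed in under a year, nowhere near a century.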

But even if we found a way to dismiss orthodox notions of gravity without abandoning experimental observations involving objects in freefall, we would be unable to reconcile UA with another incontrovertible fact: Gravity is inversely proportional to the square of the distance from the earth’s center, so gravity is stronger (i.e., weight increases) the closer we get to sea level. You weigh more in front of the Barnes & Noble in Bayside than you do on the top of Mt. Everest, and you weigh more shopping in Alert, Nunavut than you would sipping an Americano in Fortaleza, Brazil. If UA were true, even if FEP could account for differing freefall velocities, weight measurements would be uniform everywhere on the disk, but, again, this is completely belied by real-world experiments.
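To put a rough number on the Everest claim, here is a small sketch of the inverse-square law. The earth radius and summit elevation are approximate values I am supplying, and the model ignores rotation and the earth's oblateness (which is what drives the Alert versus Fortaleza difference), so treat it as an order-of-magnitude check only.

```python
EARTH_RADIUS_M = 6.371e6    # approximate mean radius
EVEREST_HEIGHT_M = 8_849.0  # approximate summit elevation

def relative_weight(height_m: float, radius_m: float = EARTH_RADIUS_M) -> float:
    """Weight at height_m above sea level as a fraction of sea-level weight,
    using only the inverse-square law."""
    return (radius_m / (radius_m + height_m)) ** 2

ratio = relative_weight(EVEREST_HEIGHT_M)
print(f"Weight on Everest relative to sea level: {ratio:.4f}")
```

The difference is a fraction of a percent: small, but measurable, and not something a uniform accelerator would produce.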

Footnotes:

[1] The illustration shows the normal vector pointing to the convex side of the curve, but we can easily see it points inward—to the concave side—as well.

[2] It is true the earth is not *completely* spherical, but the deviations from Platonic sphericality (and the density of other matter that influences the direction of the pull of gravity, etc.) do not produce an effect great enough to appreciate the change.

[3] There would exist a normal vector if we could stand “on the edge” of the disk (e.g., the side of a coin), but we exclude that possibility as FE proponents claim we can never travel beyond the “edge” of the (tangent plane of the) earth.

[4] The equation for the angle of deviation might look something likewhere *d* is the distance traveled from the center of the disk to its edge and *r* is the radius of the disk.

[5] It’s unclear what the author means by “acceleration,” but one must logically assume the derivative of the disk’s velocity is a constant (*dv* = *a dt* | *a* > 0). Otherwise, a constant(ly increasing) acceleration would mean the disk would eventually surpass the speed of light, and, according to special relativity, our clocks would eventually stop and we would (theoretically) travel back in time. No FEs can tell us what is causing this acceleration.

[6] If there are roughly $3.15 \times 10^9$ seconds in 100 years, then the disk would be traveling at a velocity of (roughly) $3.09 \times 10^{10}$ meters per second (assuming the disk began accelerating from rest on 25 March 1919 at 1:09 PST), and we know the speed of light is (roughly) $3 \times 10^8$ meters per second.

*It is he that sitteth upon the sphere *[חוּג]* of the earth, and the inhabitants thereof are as grasshoppers; that stretcheth out the heavens as a curtain, and spreadeth them out as a tent to dwell in: That bringeth the princes to nothing; he maketh the judges of the earth as vanity.* ~ Isaiah 40:22-23 (KJV)

## Let Them Eat Pseudoscience

In a now-(in)famous paper published in the 313th volume of the prestigious magazine *Science*, Dimitri Tymoczko (DT) makes the startling claim that the Möbius strip (MS) represents the topology (i.e., the “fundamental shape”) of representatives of dyad set-classes (i.e., all the types of two-note “chords” you can play on the piano). Unfortunately, he goes one step further and suggests the MS represents a sort of Platonic mathematical truth about dyadic structures in general.

This is absurd.

From page 2 of DT’s paper in *Science*:

I now describe the geometry of musical chords. An ordered sequence of *n* pitches can be represented as a point in $\mathbb{R}^n$. Directed line segments in this space represent voice leadings. A measure of voice-leading size assigns lengths to these line segments….To model an ordered sequence of *n* pitch classes, form the quotient space $(\mathbb{R}/12\mathbb{Z})^n$, also known as the *n*-torus $\mathbb{T}^n$. To model unordered *n*-note chords of pitch classes, identify all points $(x_1, x_2, \ldots, x_n)$ and $(x_{s(1)}, x_{s(2)}, \ldots, x_{s(n)})$, where *s* is any permutation. The result is the global-quotient orbifold $\mathbb{T}^n/S_n$, the *n*-torus $\mathbb{T}^n$ modulo the symmetric group $S_n$.

It should be clear, even by a cursory reading of the above passage, that the geometry of the quotient orbifold is induced by a predetermined precondition of (maximal) parsimony (as well as octave equivalence and tunings that privilege an equal division of the octave), a feature reified by the directed edges whose “lengths” represent voice-leading distances in $\mathbb{R}^n$.[1] “Points” of unordered sets of pitch classes will perforce be proximate to other “points” of unordered sets of pitch classes whose distances involve minimal voice-leading perturbations. The MS (fig. 1) emerges from the decision to privilege parsimonious voice-leading principles (as a function of log-frequency) in organizing the point lattice in Euclidean space.

It is this predetermined requirement of parsimony in constructing the quotient orbifold to which I object because it represents *inter alia *something of a Texas Sharpshooter fallacy, which leads ineluctably to a spurious intimation of Platonic design that simply does not exist. The MS is as much the fundamental topology for dyads as the dictionary is the “fundamental design” of the English language. We don’t get to marvel *ex post facto* at the unadulterated “linearity” of the dictionary after we’ve decided to arrange the words according to the organizing principle that engenders such linearity. The fact that modern theorists have historically privileged parsimonious relations—e.g., the conformist Tonnetz, Power Towers, Chicken-Wire Torus, Weitzmann regions, Cube Dance, etc.—is an insufficient defense to the general indictment. The parsimonious-MS relationship is merely one reification of a number of possible topologies for dyads. Privileging T6 relations, for example, generates the topology of a (ring) torus, suggesting there’s nothing objectively “fundamental” at all about dyadic space.

The appeal to Platonic discovery galvanizes general interest in the paper and, in my opinion, explains its publication in *Science*. This is a problem not only because the paper fails to uncover anything approaching Platonic “Truths” about musical space but also because it is symptomatic of a certain level of self-consciousness within the subdiscipline of mathematical music theory, an attraction toward hijacking mathematical hieroglyphics (and in some cases, real mathematics) in an effort to legitimize the study of music theory and portray music-theoretic ideas as a more substantive (read: “less artsy”) intellectual pursuit. But self-consciousness transmogrifies into unshirted intellectual crisis when mathematics is banefully misappropriated to bolster subjective claims about musical objects under investigation. Such is the case here.

Musicologists would do much better to avoid such blatant *non sequiturs*.

Footnotes:

[1] Constructing the quotient orbifold as an *n*-torus modulo the symmetric group of *n* elements allows us to eliminate identical unordered sets with permuted elements. For example, in DT’s MS model in figure 1, we see that $(0,3) \sim (3,0)$, which allows us to choose either {03} or {30} as the “minor third” representative. If we were modeling triads, we might have $(0,3,7) \sim (3,7,0) \sim (7,0,3)$, etc., allowing us to choose, say, {037} among the 3! orderings as the “minor triad” representative in $\mathbb{T}^3/S_3$ space.
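The identification described in this footnote can be sketched in a few lines of Python. This is my own illustration, not anything from DT’s paper: it restricts the (continuous) space to the twelve equal-tempered pitch classes, and `dyad_class` is a hypothetical helper that picks the sorted pair as the representative.

```python
from itertools import product

def dyad_class(a: int, b: int, modulus: int = 12) -> tuple:
    """Canonical representative of an unordered dyad of pitch classes:
    reduce each note mod 12 (octave equivalence), then sort (drop the ordering)."""
    return tuple(sorted((a % modulus, b % modulus)))

ordered = list(product(range(12), repeat=2))        # lattice points on the 2-torus
unordered = {dyad_class(a, b) for a, b in ordered}  # the same points modulo S_2

print(len(ordered), len(unordered))          # ordered pairs vs. unordered dyads
print(dyad_class(3, 0) == dyad_class(0, 3))  # permutations are identified
print(dyad_class(15, 12))                    # octave transpositions are identified
```

Nothing in this bookkeeping mentions voice leading; the notion of "distance" between classes is an extra choice layered on top, which is the point of the objection above.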

## The (Half-)Life of Han van Meegeren

December 30th was the 71st anniversary of the death of Han van Meegeren, and I thought this would be an opportune time to whip up a not-so-quick post lauding the power and beauty of differential equations. (As if we need an excuse for such honorifics!) For the uninitiated, Han van Meegeren was a talented Dutch painter who suffered from a combustible fusion of realities: an insatiable desire for fame and a star on the decline; it was this desire that ultimately led him to perpetrate what some consider to be “the most dramatic art scam of the twentieth century.”

He almost got away with it.

**The Pledge.** The plan was simple: Forge a number of “early” Vermeer paintings that, as a collective, would serve as an organic confirmation of the more substantial and “mature” Vermeer forgeries to follow. It worked brilliantly. Abraham Bredius, the preeminent art historian of his day, adjudicated the surreptitious Han van Meegeren forgeries as Vermeer originals for a Dutch estate. He proudly published his analysis, which incidentally confirmed his pet theory that the Italians influenced Vermeer’s artistic oeuvre:

It is a wonderful moment…when [one] finds himself suddenly confronted with a hitherto unknown painting by a great master, untouched, on the original canvas, and without any restoration, just as it left the painter’s studio….Neither the beautiful signature [nor] the pointillés on the bread…is necessary to convince us that we have…the masterpiece of Johannes Vermeer of Delft….In no other picture by the great master…do we find…such a profound understanding of the Bible story—a sentiment so nobly human expressed through the medium of the highest art.

The deception, now substantiated in print by the ultimate academic authority, was complete. Van Meegeren was back on top, raking in the cash for his fraudulent paintings, and fooling the art world he now despised for failing to recognize his genius.

**The Turn.** Van Meegeren’s scam began to unravel after the end of WWII when authorities were tracking Nazi collaborators. An investigation discovered Van Meegeren had, through an intermediary, unwittingly sold a “Vermeer” to Goering, and as a result, he was accused of, and arrested for, treason. His defense was as simple as his scam: Admit the paintings sold to the Germans were fake. Van Meegeren’s claim was dismissed as a desperate attempt to mitigate the severer charge of treason. To prove his accusers wrong, however, van Meegeren began forging “Jesus Amongst the Doctors” in prison as proof of his skill set. But it was to no avail: van Meegeren was charged with collaborating with the enemy, and a panel of experts was introduced to examine the paintings.

What no one realized, however, was that van Meegeren had prepared for forensic scrutiny. He scratched the paint off worthless paintings from the period in order to use age-appropriate canvases and defeat X-ray analysis. He used color schemes and materials Vermeer would likely have used. He even employed phenol formaldehyde in an attempt to mimic the rigid texture of paint that had been fossilizing since the seventeenth century. Van Meegeren was assiduous but ultimately imperfect; experts detected both the phenol formaldehyde and some trace evidence of the color “cobalt blue,” a coeval pigment of the 1940s but wholly unknown and unavailable to Vermeer. On the basis of that evidence, van Meegeren was sentenced to one year in prison for forgery, and roughly two months later, he died of a heart attack.

The evidence presented was quite convincing in and of itself, but van Meegeren still had his doubters; some continued to believe the paintings were simply too good to be forgeries, a testament to van Meegeren’s skill. There was, however, one physical detail van Meegeren could never have addressed, a fact that would prove without question his “Vermeers” were forgeries: the rate of radioactive decay of the lead-210 and radium-226 in the paint he used. If we assume $dN/dt = -\lambda N$ is the change in the number of atoms $N(t)$ for some unit of time *t* with decay constant $\lambda$ and $N(t_0) = N_0$ (i.e., $t_0$ is the time at which an element begins decaying), we have the following general-solution equation

$$N(t) = N_0\, e^{-\lambda (t - t_0)},$$

which we can simplify as $N/N_0 = e^{-\lambda (t - t_0)}$. Taking natural logs, defining $N/N_0 = 1/2$ (i.e., the half-life for radioactive decay), and solving for $t - t_0$ yields $t - t_0 = \ln 2 / \lambda$. That is, we can calculate an element’s half-life by dividing the natural log of 2 by its decay constant. Unfortunately, there’s never a way to determine precisely $N_0$ when trying to date an object, so the above equation cannot do much to help our cause.
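As a quick sanity check on the half-life relationship, here is a minimal Python sketch. It assumes the half-lives quoted in this post (22 years for lead-210 and 1600 years for radium-226); the function names are mine.

```python
import math

def decay_constant_from_half_life(half_life_years: float) -> float:
    """Decay constant (per year): lambda = ln(2) / half-life."""
    return math.log(2) / half_life_years

def fraction_remaining(elapsed_years: float, half_life_years: float) -> float:
    """Fraction N/N0 of the original atoms left after elapsed_years."""
    lam = decay_constant_from_half_life(half_life_years)
    return math.exp(-lam * elapsed_years)

for name, half_life in (("lead-210", 22.0), ("radium-226", 1600.0)):
    left = fraction_remaining(300.0, half_life)
    print(f"{name}: fraction remaining after 300 years = {left:.6f}")
```

After 300 years almost none of the original lead-210 survives, while most of the radium-226 is still there, which is why the two behave so differently in an old painting.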

But all is not lost. We can use half-life values of the specific elements in question to estimate an amount of decay based on specific time frames we wish to investigate. Due to various facts about chemistry we won’t address here, we know the amount of lead-210 (half-life = 22 years) and radium-226 (half-life = 1600 years) for an authentic 300-year-old Vermeer would stand in “radioactive equilibrium,” from which we can deduce that a modern forgery will have a much higher level of lead-210 radioactivity in relation to its radium-226 content.

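**The Prestige.** Suppose $y(t)$ is the amount of lead-210 per gram of white lead at *t* years, with $y(t_0) = y_0$ at production, and $r(t)$ is the function that gives the disintegration rate of radium-226 (disintegrations per minute per gram of white lead) at *t*. Then we have the following differential equation: $\frac{dy}{dt} = -\lambda y + r(t)$. Because the half-life of radium-226 is 1600 years, our estimates for a 300-year-old Vermeer will involve a pretty consistent value for *r*(*t*), which means we can replace *r*(*t*) with the constant *r*. Upon multiplying our differential equation on both sides by the integrating factor $e^{\lambda t}$, we’re left with $\frac{d}{dt}\left(e^{\lambda t} y\right) = r e^{\lambda t}$ because

$$\frac{d}{dt}\left(e^{\lambda t} y\right) = e^{\lambda t}\left(\frac{dy}{dt} + \lambda y\right),$$

which is what we need when $\frac{dy}{dt} + \lambda y = r$. A straightforward calculation (integrating both sides from $t_0$ to *t*) then gives us

$$e^{\lambda t} y(t) - e^{\lambda t_0} y_0 = \frac{r}{\lambda}\left(e^{\lambda t} - e^{\lambda t_0}\right),$$

and, solving for *y*(*t*), we have

$$y(t) = \frac{r}{\lambda}\left(1 - e^{-\lambda (t - t_0)}\right) + y_0\, e^{-\lambda (t - t_0)},$$

recalling that $y(t_0) = y_0$ and $r(t) = r$. But our goal is to estimate the amount of lead-210 at the time of production in order to detect a forgery. This means we need to solve for $\lambda y_0$. Setting $t - t_0 = 300$ in the above equation and doing some rearranging, we finally reach the form of the equation we desire:

$$\lambda y_0 = \lambda y(t)\, e^{300\lambda} - r\left(e^{300\lambda} - 1\right).$$

We know the half-life of lead-210 is 22 years; thus, $\lambda = \ln 2 / 22$, and we can now calculate the exponential term:

$$e^{300\lambda} = e^{(300/22)\ln 2} = 2^{150/11}.$$

All that remains is substituting appropriate values for $\lambda y(t)$, the disintegration rate of lead-210, and the (constant) disintegration rate *r* of radium-226.[1] Due to varying uranium concentrations throughout the world, a very conservative estimate for an upper bound on the original disintegration rate of lead-210 in an authentic painting was determined to be 30,000 disintegrations per minute per gram of white lead. Values for van Meegeren’s “Disciples at Emmaus” were determined to be $\lambda y(t) = 8.5$ and $r = 0.8$, yielding the following calculation:

$$\lambda y_0 = 2^{150/11}(8.5) - 0.8\left(2^{150/11} - 1\right) \approx 98{,}050,$$

more than *three times* the allowable limit for an authentic painting of the seventeenth century. Clearly, van Meegeren’s “Disciples at Emmaus” was a forgery.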

Differential equations 1, Abraham Bredius 0.
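If you would like to rerun the final computation yourself, here is a minimal Python sketch. It assumes the measurements reported in Braun for “Disciples at Emmaus” (8.5 disintegrations per minute per gram of white lead for polonium-210, standing in for lead-210, and 0.8 for radium-226), a 300-year age, and a 22-year half-life for lead-210. The function and constant names are my own.

```python
import math

HALF_LIFE_PB210_YEARS = 22.0
ASSUMED_AGE_YEARS = 300.0
UPPER_BOUND = 30_000.0  # conservative ceiling for an authentic painting

def original_disintegration_rate(pb210_rate_now: float, ra226_rate: float,
                                 age_years: float = ASSUMED_AGE_YEARS,
                                 half_life_years: float = HALF_LIFE_PB210_YEARS) -> float:
    """lambda*y0 = lambda*y(t)*e^(lambda*age) - r*(e^(lambda*age) - 1)."""
    lam = math.log(2) / half_life_years
    growth = math.exp(lam * age_years)  # equals 2**(150/11) for the defaults
    return growth * pb210_rate_now - ra226_rate * (growth - 1.0)

rate = original_disintegration_rate(8.5, 0.8)
print(f"Implied original rate: about {rate:,.0f} disintegrations/min/gram")
print("Forgery" if rate > UPPER_BOUND else "Consistent with a 300-year-old painting")
```

The implied original rate lands near 98,000, far above the 30,000 ceiling, so the 300-year assumption cannot be right.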

Footnotes:

[1] The investigation used the disintegration rate of polonium-210 in place of lead-210 for convenience without any real loss of precision. The rates of $^{210}\text{Po}$ and $^{210}\text{Pb}$ equalize after only a few years.

References:

Braun, Martin. *Differential Equations and Their Applications*, Texts in Applied Mathematics, Springer-Verlag: New York, 1992.

## Dinner with the Three Musketeers

During dinner one evening with friends, the topic turned to probability. My friend Jeremy thought it’s more difficult to pick one card (say, the 5♦) from a thoroughly shuffled 52-card deck on the first attempt than to pick *all the other cards without picking* the 5♦. Taking an informal poll didn’t help: Some agreed with him while others said leaving the 5♦ unpicked was more difficult. Could they be equal?

What do you think?

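**Proof.** Because the cards are drawn without replacement, each draw is conditioned on the ones before it, and the probability *P* of choosing *n* – 1 items (without choosing the target item) is the telescoping product

$$P = \prod_{i=k}^{n-1} \frac{n-i}{n-i+1} = \frac{n-1}{n} \cdot \frac{n-2}{n-1} \cdots \frac{1}{2} = \frac{1}{n},$$

where *k* = 1. So, with *n* = 52, the probability of leaving the 5♦ as the last card is equal to the 1/52 probability of picking it first.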
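For the skeptics at the table, the product can be checked exactly. The sketch below assumes the cards are drawn one at a time without replacement and uses Python’s `fractions` module to avoid rounding; `prob_target_survives` is a name I made up.

```python
from fractions import Fraction

def prob_target_survives(n: int) -> Fraction:
    """Probability that n - 1 draws without replacement all miss one target card."""
    p = Fraction(1)
    for remaining in range(n, 1, -1):
        # With 'remaining' cards left, remaining - 1 of them are not the target.
        p *= Fraction(remaining - 1, remaining)
    return p

print(prob_target_survives(52))  # prints 1/52
```

Every numerator cancels the previous denominator, which is all the "proof" amounts to.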

“One for all, and all for one,” as they say.

## The Hall of Monty-zuma

By now, most of you probably know about the famous Monty Hall Problem, but in case you’ve been living under a rock for the last 35 years, I will provide a brief description of the situation:

Assume that a room is equipped with three doors. Behind two are goats, and behind the third is a shiny new car. You are asked to pick a door, and [you] will win whatever is behind it. Let’s say you pick door 1. Before the door is opened, however, someone who knows what’s behind the doors (Monty Hall) opens one of the other two doors, revealing a goat and asks you if you wish to change your selection to the third door (i.e., the door which neither you picked nor he opened). The Monty Hall problem is deciding whether you do.

I know. Your instincts might tell you it doesn’t matter if you switch: Either (a) your initial 1/3 probability of picking the car doesn’t change when a goat is revealed or (b) the probability of picking the car increases to 1/2 with only two unopened doors remaining. Both (flawed) logical approaches will convince you to stick with your initial selection, and you wouldn’t be alone in that opinion. But the truth is that you *should* switch to the last door when given the opportunity.

Why?

Well, when you pick door #1, the probability of picking the car is 1/3, leaving 2/3 probability the car is behind one of the other doors. When Monty reveals a goat behind, say, door #3 and asks if you’d like to switch your pick (to door #2), he’s really giving you a chance, in a sense, to travel back in time and pick *both* doors 2 and 3!

Consider this: Imagine Monty calls you to the stage and asks you to pick *two* doors. You pick doors 2 and 3, which means you have a 2/3 chance of winning the car. If he then revealed a goat behind door #3, would you believe the probability for winning the car would decrease to 1/3 or 1/2? Of course not. The new information doesn’t change anything because the probability that there was a goat behind *at least one* of the doors you picked is unity. But you’re now in the same exact situation as if you’d switched your pick in the initial scenario: Choosing door #2 with a goat revealed behind door #3 and a 2/3 probability of winning the car.

Perhaps an even better way to understand it was given by Charles Wheelan in his 2013 book *Naked Statistics*: Suppose Monty shows you 100 doors and tells you there are goats behind 99 of them and a car behind one. He asks you to pick a door. You pick door #1. He then reveals the goats behind doors 2-99 and asks if you’d like to switch your pick to door #100. It’s painfully obvious you’d switch because doing so increases your probability of winning the car from 0.01 (i.e., getting the right door on your first guess) to 0.99, which is the same mathematical logic we were following when we only had three doors from which to choose. We’ll skip the mathematical proof, but if you’re still not convinced, try it yourself!
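If you do want to try it yourself, a simulation is the quickest route. Here is a minimal sketch, assuming Monty always knows where the car is and opens every door except your pick and one other, never revealing the car. The function names, trial count, and seed are arbitrary choices of mine.

```python
import random

def play(switch: bool, rng: random.Random, doors: int = 3) -> bool:
    """Play one game; return True if the contestant wins the car."""
    car = rng.randrange(doors)
    pick = rng.randrange(doors)
    if pick == car:
        # Monty leaves one randomly chosen goat door closed.
        remaining = rng.choice([d for d in range(doors) if d != pick])
    else:
        # Monty must leave the car's door closed.
        remaining = car
    final = remaining if switch else pick
    return final == car

def win_rate(switch: bool, trials: int = 100_000, seed: int = 0, doors: int = 3) -> float:
    rng = random.Random(seed)
    return sum(play(switch, rng, doors) for _ in range(trials)) / trials

print(f"3 doors, stay:     {win_rate(False):.3f}")
print(f"3 doors, switch:   {win_rate(True):.3f}")
print(f"100 doors, switch: {win_rate(True, doors=100):.3f}")
```

Staying should hover near 1/3, switching near 2/3, and switching with 100 doors near 0.99.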

But here’s something crazy.

Suppose someone *else* from the audience (not Monty) randomly chooses a door and happens to reveal a goat—someone who *didn’t know that door was hiding a goat—*then the probability of picking the car remains static at 1/2 with *no benefit* to switching! This is completely counterintuitive (at least to me and Paul Erdös!) and even more confusing if you’ve finally accepted the probabilistic increase to 2/3 when switching in the original scenario, but we can prove it mathematically using Bayesian probability. For a primer on Bayesian probability, look here, here, and here.

Suppose Alice, a U.S. Marine, selects door #1 in the hope of revealing the car, after which a second contestant, Bob, is called to the stage to select one of the remaining doors (#2 or #3). Let Pr(O), Pr(T), Pr(R) be the probabilities of the car being behind door #1, #2, and #3, respectively, and define G to be the information there is a goat behind door #3. Further, define Pr(G|O), Pr(G|T) as the conditional probabilities that a goat is behind door #3 given the car is behind doors #1 and #2, respectively. As Bob comes to the stage to make his selection, here’s what we know:

- Pr(O) = Pr(T) = Pr(R) = 1/3. Unlike the scenario with Monty, neither Alice nor Bob knows where the car is located, so there’s an equiprobable value for finding the car.
- Pr(G|O) = 1/2. If the car is behind door #1, Bob is guaranteed to reveal a goat behind door #3 *if he picks it*. (Remember, Alice chose door #1, so Bob must choose between doors #2 and #3.)
- Pr(G|T) = 1/2. If the car is behind door #2, Bob has a 50-50 shot at revealing a goat behind door #3 because he might pick door #2 and win the car.
- Pr(G|R) = 0. If the car is behind door #3, there’s no way Bob can reveal a goat there.

From Bayesian probability theory,

Pr(G) := (Pr(O) x Pr(G|O)) + (Pr(T) x Pr(G|T)) + (Pr(R) x Pr(G|R)) = (1/3)(1/2) + (1/3)(1/2) + (1/3)(0) = 1/3,

and the posterior probability is calculated as

Pr(O|G) := Pr(O) x Pr(G|O)/Pr(G) = (1/3)(3/2) = 1/2.

The reader is invited to confirm the same value for Pr(T|G). It’s interesting to note the only difference between scenarios lies with the calculation for Pr(G|T). Monty *knows* where the car is, and that means he would be forced to pick door #3 and reveal a goat to keep the game going—assuming, as we have, that you picked door #1. That’s not the case, however, when Bob comes to the stage in the alternate scenario. He’s not forced to choose door #3 because he’s not sure where the car is. He might choose door #2 and end the game by winning the car. That probabilistic reduction—from certainty to even odds for Pr(G|T)—is the linchpin to the transformation of the entire problem. It’s also incredibly cool that mathematics can “discern” a seismic shift in the probabilistic outcome based on whether or not the person choosing the remaining doors holds any information about what’s behind them.
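Both scenarios can be checked side by side with exact fractions. In this sketch the priors and likelihoods are the ones given above; for Monty I assume Pr(G|O) = 1/2 (he picks between doors #2 and #3 at random when you already hold the car) and Pr(G|T) = 1, as just described. The `posterior` helper is my own.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' rule: posterior_i = prior_i * likelihood_i / sum_j(prior_j * likelihood_j)."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

third = Fraction(1, 3)
priors = [third, third, third]  # car behind door #1 (O), #2 (T), #3 (R)

# Likelihood of G (a goat revealed behind door #3) under each hypothesis.
bob = posterior(priors, [Fraction(1, 2), Fraction(1, 2), Fraction(0)])
monty = posterior(priors, [Fraction(1, 2), Fraction(1), Fraction(0)])

print("Bob:  ", [str(x) for x in bob])    # Pr(O|G), Pr(T|G), Pr(R|G)
print("Monty:", [str(x) for x in monty])
```

The only input that differs between the two runs is the middle likelihood, Pr(G|T), and that alone moves the answer from even odds to 1/3 versus 2/3.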

The frustrating thing about Bayesian probability, however, is that it can be insensitive to different interpretations of the conditional probabilities. For example, if we understood Pr(G|O) and Pr(G|T) to mean strictly “the probability of a goat behind door #3 (whether it’s opened or not!) given the car is behind door #1 and door #2, respectively,” then Pr(G|O) = Pr(G|T) = 1 and we get the same answer: Pr(G) = 2/3 and Pr(O|G) = (1/3)(1/(2/3)) = 1/2. Unfortunately, that approach mirrors the faulty (i.e., *ex post facto*) logic that leads to the erroneous answer in the first scenario: If we calculate our chances *after* Monty shows us the goat behind door #3, which is what I think most people do when they’re asked to consider the problem in its original form, then we can confidently claim Pr(G|O) = Pr(G|T) = 1, leading to a posterior probability of 0.5 and no need to switch doors. Of course, the benefit to this sort of mathematical flexibility is that we can design and investigate a number of different scenarios using alternative probability values that lead ineluctably to the same (possibly counterintuitive) posterior probability.

## Solution to the Raven’s IQ Problem in Gladwell’s “Outliers”

I can’t overstate how much I enjoyed Malcolm Gladwell’s book *Outliers*, and I highly recommend it to anyone who might be interested in delving deeper into an eclectic investigation of various “outlier” events. It’s an incredibly illuminating and engaging read, where the alchemy of sudden and colorful success is replaced with the muted quotidian tones of fortuitous circumstance, hard work, and a Pollock-sized splash of serendipity. One pair of chapters, however, interested me more than the others: “The Trouble with Geniuses—Parts 1 and 2.” Here, Gladwell discusses the fascinating—if surprisingly underwhelming—trajectory of Chris Langan, a former Long Island bouncer who possesses one of the world’s highest IQs (195-210). I’m not going to elaborate on Langan’s history or Gladwell’s treatment of him in the book, but I would like to discuss one of the IQ questions Gladwell includes in Part 1 as a way to underscore Langan’s transcendent intelligence. He states:

One of the most widely used intelligence tests is something called Raven’s Progressive Matrices. It requires no language skills or specific body of acquired knowledge. It’s a measure of abstract reasoning skills. A typical Raven’s test consists of [48] items, each one harder than the one before it, and IQ is calculated based on how many items are answered correctly.

After giving the reader an extremely easy (i.e., early) example from the RPM test, Gladwell submits “the kind of really hard question that comes at the end of the Raven’s” as a challenge to readers:

I’ve taken quite a few IQ tests in my lifetime, and this is one of the most challenging questions I’ve encountered. For some time, the pattern eluded me. (After praying and confessing Philippians 4:13—“I can do all things through Christ who strengthens me!”—the Lord immediately blessed me with the solution.) In the book, Gladwell provides RPM’s expected answer—matrix (A)—but he’s unable to provide a logical rationale or pattern for the correct choice. As we know, getting the right answer means nothing if you don’t know *why* it’s the correct answer. Inflated IQ scores are repositories for random guesses and good test-taking skills.

(Stop reading here if you want to attempt to find the pattern yourself.)

We approach the solution in precisely the same way we would have solved it had Gladwell not given us the answer. Encode each suit’s positions (1-9) within each 3 x 3 “matrix” as partitions of a set *S* = {145789236} with the partitions ordered as follows: diamonds ♦, hearts ♥, then clubs ♣. (Use left-to-right orthography. Think of the positions as a telephone keypad—without the zero.) Matrix positions likewise move horizontally then vertically (i.e., top left –> top middle –> top right –> middle left, etc.). At this point, our situation looks like this:

- Matrix 1: 145 | 789 | 236
- Matrix 2: 678 | 125 | 349
- Matrix 3: 149 | 238 | 567
- Matrix 4: 378 | 146 | 259
- Matrix 5: 359 | 148 | 267
- Matrix 6: 379 | 156 | 248
- Matrix 7: 139 | 257 | 468
- Matrix 8: 146 | 357 | 289

Viewed through the lens of the underlying positional structure rather than the hopeless clutter of the RPM matrices, the global design emerges. Do you see it? We will briefly postpone an exploration of the technical details in order to begin with an arresting visual aid:

- Matrix 1: **145 | 789 | 236**
- Matrix 2: **678 | 125 | 349**
- Matrix 3: **149 | 238 | 567**

—————————-

- Matrix 4: **378 | 146 | 259**
- Matrix 5: **359 | 148 | 267**
- Matrix 6: **379 | 156 | 248**

—————————-

- Matrix 7: **139 | 257 | 468**
- Matrix 8: **146 | 357 | 289**
- Matrix 9: **?**

We clearly see what’s happening. If Matrix 1 represents the set *S* = {145789236}, then the elements of *S* in each successive matrix (i.e., the order positions of the respective suits) are reduced by one and permuted such that the partitions are arranged according to the colored pattern. This generates the meta-design for each larger group of three matrices. (One must assume the test designer broke the global pattern here, after matrices 3 and 6, in order to highlight the horizontal continuity of the sequence within each row; this makes it more difficult to deduce the pattern because it does not carry through to the next row of matrices.)[1] The mappings can be more formally understood as iterative permutations (in this case, rotations) on the elements of the cyclic group ℤ/9ℤ. Sounds complicated, but it isn’t: Subtract 1 from the row immediately above it and permute the result according to the colored pattern—with all arithmetic calculated mod 9 (and 0 = 9).

That’s it.
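For readers who prefer to see the rule executed, here is a minimal sketch. It assumes the colored pattern amounts to moving each three-element partition one slot to the left after the subtraction, which is my reading of the listings above; the function names are illustrative only.

```python
def next_matrix(partitions):
    """Subtract 1 from every position (mod 9, with 0 written as 9),
    then rotate the three partitions one slot to the left."""
    lowered = [[(p - 2) % 9 + 1 for p in part] for part in partitions]
    return lowered[1:] + lowered[:1]

def canonical(partitions):
    """Sort inside each partition; the order within a suit is not visible."""
    return [sorted(part) for part in partitions]

matrix_1 = [[1, 4, 5], [7, 8, 9], [2, 3, 6]]
matrix_7 = [[1, 3, 9], [2, 5, 7], [4, 6, 8]]

matrix_2 = canonical(next_matrix(matrix_1))
matrix_3 = canonical(next_matrix(matrix_2))
matrix_8 = canonical(next_matrix(matrix_7))
matrix_9 = canonical(next_matrix(matrix_8))

print(matrix_2, matrix_3, matrix_8, matrix_9)
```

The rule is applied only within a row of three matrices, so each row is seeded with its own first matrix.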

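Here, the permutation for each matrix is a rotation of the elements with a constant index (6): once 1 has been subtracted, every element moves six positions to the right, wrapping around the end of the nine-element sequence, which shifts each three-element partition one slot to the left. Applied to Matrix 1, for example, 145 | 789 | 236 becomes 934 | 678 | 125 after the subtraction and 678 | 125 | 934 after the rotation.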

To the observer, <349> and <934> are indistinguishable; thus, we can ignore canonical orderings of the partitions (i.e., smallest to largest, etc.) if we wish. The rest of the problem proceeds in a similar fashion, and decoding the pattern makes it a trivial exercise to predict what comes next: {246 178 935}—answer (A). One can confirm the blue, red, and green pattern continues into the ninth matrix. What’s wonderful about viewing the problem in this way is that you’re better prepared should you ever encounter a much more difficult mapping. Imagine, for example, I gave you the following design:

- Matrix 1: 145 | 789 | 236
- Matrix 2: 257 | 489 | 136
- Matrix 3: 178 | 249 | 356
- Matrix 4: 458 | 129 | 367
- Matrix 5: 247 | 159 | 368
- Matrix 6: ?

What’s the global pattern in this case? This is more difficult than the RPM problem, so take your time.

Without decoding the suits into *S* and its partitions, this becomes a much more difficult problem to solve. If you’re familiar with automorphic maps, you might realize this matrix sequence involves a series of group automorphisms from *G*, the group, to itself.[2] Everyone will have seen the invariance of 9 in the middle partitions and likely the consistent presence of the 3/6 pairs in the last ones, but a more discriminating mathematical eye will translate these features into a clue to the design. Because (5,9) = 1 (i.e., 5 and 9 are coprime to each other), the map *f* : *G* –> *G* (under mod 9 multiplication by 5) will permute the elements of the group in a predictable way, eventually returning to the initial permutation. In other words, if, then. As we’ve seen, *s* and *t* need not be unique elements of *G*.[3]
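The multiplication map is easy to try by machine. The sketch below takes *G* to be ℤ/9ℤ with 0 written as 9 (as in the RPM problem) and sorts inside each partition, since the order within a suit is not visible; `times_five` is an illustrative name, not anything from the test itself.

```python
def times_five(partitions):
    """Multiply every position by 5 modulo 9, writing 0 as 9."""
    return [sorted((5 * p - 1) % 9 + 1 for p in part) for part in partitions]

matrix_1 = [[1, 4, 5], [7, 8, 9], [2, 3, 6]]

# Build the sequence Matrix 1, Matrix 2, ... by repeated application.
sequence = [matrix_1]
for _ in range(6):
    sequence.append(times_five(sequence[-1]))

for number, matrix in enumerate(sequence, start=1):
    print(number, matrix)
```

Because 5 has multiplicative order 6 modulo 9, six applications bring every element back to where it started.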

Footnotes:

[1] Someone pointed out the transformation from matrices 3 to 4 and 6 to 7 involves a counterclockwise rotation, mapping {369} –> {123}, {258} –> {456}, and {147} –> {789}.

[2] The entire collection of automorphisms from a group *G* to itself is called the automorphism group *Aut*(*G*), which is isomorphic to a subgroup of the symmetric group on the *n* elements of *G*.

[3] For interested readers, Matrix 6 should read as follows: 281 | 579 | 463. A hypothetical Matrix 7 would return to the initial permutation of Matrix 1.

## Amazon’s Primes

There’s an ongoing debate concerning the origin of mathematics: Is it the creation of mankind or does mathematics somehow transcend our existence? Would mathematics cease to exist if not for the specific descriptions, prescriptions, and labels developed over centuries of thought, or do the hieroglyphics we’ve adopted point to immanent, Platonic features of the fabric of the universe, the specifics of which are simply waiting to be uncovered? Many prominent authors have offered a range of answers to the question—from Mario Livio’s excellent (if equivocal) investigation that underscores the complexity of the debate to Max Tegmark’s claim that reality is, itself, a mathematical construct—yet the debate continues.

One thing, however, seems clear: If mathematics is, in its totality, nothing more than a man-made construction that doesn’t point to anything abstract and outside itself, we didn’t do a very good job designing it. Problems range from the serious to the annoying: Gödel proved that any consistent axiomatic system rich enough to express arithmetic contains true statements it cannot prove and cannot demonstrate its own consistency, requiring exogenous confirmation outside its domain (serious), and we can find exact solutions to differential equations of the form

*only* when there is a functionsuch that appropriately defined functionsand satisfy the following criteria:,, andwhere

(annoying). It seems intuitive—to me, at least—that we would have avoided such terminal flaws when constructing an intellectual edifice as imposing and efficacious as mathematics, an argument that strongly suggests the limitations and deficiencies we encounter result from our failure to address sufficiently the details of an immanent Platonic structure with man-made techniques and notation.

But if, like me, you believe in God, there really is no debate; a universe created by an omniscient Creator is necessarily imbued with supernatural logic and order, a very small measure of which we created beings can access through our limited and fallible modes of inquiry, and we should expect to meet obstacles in our attempts to investigate such a supernal design. But can we say anything about the origin of mathematics outside the context of personal faith? I think we can. We certainly won’t settle the matter in a single blog post, but I hope to offer a few thoughts in an attempt to convince you that mathematics isn’t completely and irrevocably a man-made construct.

Think about the prime numbers. A prime number *p* is a positive integer greater than 1 whose factors are *only* 1 and itself. In other words, there’s no other (positive) integer *n* with 1 < *n* < *p* such that *n* | *p*. One could make the very convincing argument that a prime number *p* is nothing more than one of man’s creations. After all, *we* invented all the terms that describe prime numbers, right? We developed the concepts and symbols for “number,” “factor,” “integer,” “set,” “positive,” “less than,” “set membership,” and “divide.” It seems we’ve merely designed the rules of the game and then created a character that will obsequiously follow our carefully prescribed logical narrative.

But is it as simple as that?

Those close to me know I’m a voracious reader, so if you’re going to buy me a gift for my birthday or Christmas, I will always prefer an Amazon gift card so I can buy more books. My wish list is really long—in fact, I have two wish lists on two different Amazon sites—and when I received a gift card from my wife for Christmas, I was able to buy 13 books from my two lists. Always in a hurry to get a new book I’ve been wanting to read, I realized the worst-case shipping scenario would be the trivial partition of the prime 13—receiving 13 separate shipments with each package containing a single book—but I also knew (as do you) that it would be impossible to receive fewer than 13 (but more than one) UPS packages with the same number of books in each package. One of many possible mathematical descriptions of this shipping situation can be described as follows: There exists no partition (*R*) of a prime *p* wheresuch that all of the following hold: (1), (2) and (3). A much less jargon-intensive description would be that given a set *S* containing *p* elements, there is no way to construct proper subsetsof *S* such that *every* subset has the same cardinality (i.e., contains the same number of elements) where. (This assumes the elements of each subset are unique:). Okay, so maybe that wasn’t any less jargon intensive! Anyway, it should be clear how our two definitions of a prime number—multiplicative factors and partitions—are related. (In fact, there are many ways to describe the mathematical notion of primality!) If we *could* construct such a collection of subsets, then the total number of subsets would be a divisor (or factor) of *p*. The following graphic represents some possible book-package groupings based on the prime number 13, and all of them fail the partition requirements given above:

13/2: BB | BB | BB | BB | BB | BB | B

13/3: BBB | BBB | BBB | BBB | B

13/4: BBBB | BBBB | BBBB | B

13/5: BBBBB | BBBBB | BBB

13/7: BBBBBBB | BBBBBB

13/9: BBBBBBBBB | BBBB

Okay, but what’s the point? The point is that *we’re limited in how we can organize the physical space around us*. Whether we’re discussing books or computers or people or planets, a set with *p* elements cannot be organized into subsets containing an equal number of objects no matter how hard you try. (Again, the trivial exception exists when the number of subsets equals *p* because *p* | *p*.) That’s not an abstraction. It’s not the product of a man-made system, and it doesn’t require the idea of a supernal Creator. It’s simply a physical impossibility. Primality *emerges* from the *res extensa* of grouping objects, but the quality of *primeness*, from which primality receives its import, is a Platonic quality. It would be true in any universe, it would be true if there were no objects at all to arrange, and it’s true even if we want to imagine grouping objects that don’t exist (e.g., unicorns, wizards, leprechauns, etc.).
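The shipping claim is simple to test by brute force. The following sketch (the function name `equal_groupings` is made up for illustration) lists every way to split *n* books into more than one but fewer than *n* packages of equal size.

```python
def equal_groupings(n):
    """Return (packages, books_per_package) pairs that split n books into
    more than one but fewer than n equally sized packages."""
    return [(k, n // k) for k in range(2, n) if n % k == 0]

print(equal_groupings(13))  # a prime order leaves nothing to list
print(equal_groupings(12))  # a composite order has several options
```

An empty list is exactly the partition definition of primality given above: the only equal groupings of 13 books are the two trivial ones.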

The locutions and symbols and formalism we use to reify this quality of primeness (when defining primality, as we did above) must be distinguished from the quality itself. It is this transcendental primeness that represents a superordinate and abstract mathematical framework that exists (and has always existed!) apart from the existence of any object. Primeness is the *very reason we are able to develop* a formal definition and/or proof for, and useful descriptions of, primality. The process of formalizing primeness (i.e., primality) emerges from the Platonic quality of primeness, and various applications (e.g., cryptosystems, automorphic maps of cyclic groups, etc.) then emerge from primality: Abstract truth (e.g., primeness) —> rigorous definition (e.g., primality) —> functional use (e.g., applications). The causal chain begins (at least) prior to formalization, so the notion of “what it means to be a prime number” cannot be attributed to man.

## Does the “Free Lunch” Come with Fries?

So, I’ve been thinking about getting into the consulting business (primarily in the U.S.), and a few high-profile wealth-management firms have advised me to move to the States in order to take advantage of its favorable tax plan. Saving money on taxes always sounds like a pretty good strategy, but is moving to a different country the best option?

We’ll get to that—but, first, we require a brief digression (or two).

Some of you might have heard of something called “Purchasing Power Parity” (PPP). It’s a concept from economic theory that “compares different…currencies through a…’basket of goods’ approach. [T]wo currencies are in equilibrium…when a…basket of goods (taking into account the exchange rate) is priced the same in both countries” (Investopedia). PPP basically compares the exchange rate with the quotient of prices (*E*) for an identical item (or a basket of items) sold in both countries. It’s a speculative measure designed to “predict” the movement of the value of one currency against another. So, if a particular Einstein bobblehead costs $9.99 USD and $15.99 CAD, then *E* = 15.99/9.99 ≈ $1.60 CAD per USD, which is significantly higher than the then-current exchange rate of (about) $1.29 CAD per USD.

This means Canadians will prefer to purchase the bobblehead in the U.S. (@ $12.89) and use the other three-plus dollars for something else. In short, the Canadian price is too high, suggesting the USD will appreciate (over some time frame) until *E* equals the exchange rate. When *E* differs from the exchange rate, we speculate the possible existence of an *arbitrage* opportunity. The concept of arbitrage is hardly new, yet it continues to drive the boldest (and, at times, most reckless) investment strategies on Wall Street and beyond. It’s often said, colloquially, that “there ain’t no such thing as a free lunch” (TANSTAAFL); the proven concept of arbitrage, however, completely belies that claim. I offer a quick and tailored primer for skeptical readers.

“TISATAAFL” or “How to Guarantee a Gambling Profit Using Mathematics”

In very basic terms, a betting-arbitrage opportunity arises when the sum of the *bettors’* odds of a successful outcome derived from the gaming odds is less than one.[1] In a two-system bet, this is very simply calculated aswhere theodds of Bettor 1 to *win *are. The second bet placed with Bettor 2 follows similarly. An example: Suppose James desires to place a wager on the 2018 college football national title game between Alabama and Clemson. His brother, Joshua, is giving 2/1 odds on Alabama, and Michael is giving 1.25/1 odds on Clemson.

James wants to determine if this represents a genuine arbitrage opportunity. The 2/1 line means Joshua believes Alabama has a 2/3 probability of winning (i.e., 2/(2+1)), and the 1.25 (= 5/4)/1 line means Michael believes Clemson has a 5/9 probability of winning (i.e., (5/4)/[(5/4)+1]). Calculating (their belief of) the probability that *James* will win each bet is (1 – 2/3) + (1 – 5/9) = (1/3) + (4/9) = 7/9 < 1. It is, in fact, an arbitrage opportunity.

How does arbitrage work? Let’s say James’s gambling allowance happens to be 100 dollars in total, and he’ll bet *x* dollars on Alabama and 100 – *x* dollars on Clemson. Let’s imagine Alabama wins. An Alabama victory means James loses *x* dollars to Joshua and wins 5(100 – *x*)/4 dollars from Michael. This yields the first linear profit curve (red line, below): -9x/4 + 125 = 0. Of course, we want the profit to be greater than zero, so we set the LHS of the equation accordingly and solve, yielding x < 500/9.

Now, imagine Clemson wins. This would mean James loses (100 – *x*) dollars to Michael and wins 2*x* dollars from Joshua, the sum of which yields the second linear profit curve (blue line, below): 3x – 100 > 0 such that x > 100/3. Thus, the amount of money James needs to wager with Josh (*x*) and Michael (100 – *x*) to *guarantee* a profit no matter which team wins the game lies within the global inequality derived from both equations: 100/3 < x < 500/9. This is a range from (roughly) 33.34 dollars to 55.56 dollars.

Curious readers will want to know two things: (1) the *maximum* possible profit based on the gambling allowance and (2) the optimum wager James should place on Alabama to generate that amount. We’ve jumped the gun a bit by providing the above graphic, but the answer involves solving the system of linear equations in the previous paragraph; the linear profit curves cross each other at an equilibrium point—recall the supply-and-demand curves of elementary economics—and it is this intersection that represents the Cartesian coordinate that reveals (a) the maximum possible profit and (b) the optimum bet that guarantees that maximum profit amount.

Fortunately, as a general principle, we don’t need to graph these functions. Setting both equations equal to each other and solving for *x* is sufficient: -9x/4 + 125 = 3x – 100. Solving gives us an optimum bet value of x = 300/7 ≈ 42.86 dollars on Alabama and 100 – 42.86 = 57.14 dollars on Clemson. This betting profile generates a maximum guaranteed profit (*p*) of 200/7 ≈ 28.57 dollars based on a 100-dollar total wager—*no matter which team wins the game*. The above graphic provides the relevant visual representation.
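The same intersection can be computed for any pair of lines. The sketch below is only an illustration under the assumptions of the James example: `odds_joshua` and `odds_michael` are the payouts per dollar that James collects when the respective bet succeeds, and the function names are hypothetical.

```python
from fractions import Fraction

def guaranteed_profit(x, stake, odds_joshua, odds_michael):
    """Worst-case profit when x is wagered with Joshua and the rest with Michael."""
    if_alabama_wins = -x + odds_michael * (stake - x)  # lose x, win from Michael
    if_clemson_wins = odds_joshua * x - (stake - x)    # win from Joshua, lose the rest
    return min(if_alabama_wins, if_clemson_wins)

def optimal_bet(stake, odds_joshua, odds_michael):
    """Wager with Joshua at which the two profit lines intersect."""
    return stake * (1 + odds_michael) / (2 + odds_joshua + odds_michael)

stake = Fraction(100)
joshua, michael = Fraction(2), Fraction(5, 4)

x_best = optimal_bet(stake, joshua, michael)
p_best = guaranteed_profit(x_best, stake, joshua, michael)

print(float(x_best), float(p_best))
```

Exact fractions are used so that the intersection is not blurred by rounding; any wager strictly between 100/3 and 500/9 still returns a positive worst-case profit.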

This should give you a general sense of the power and seductiveness arbitrage offers and why it’s essentially the Holy Grail of any investment strategy. (For some readers, it might be cool enough to know arbitrage exists, and you may want to make a few bets with your friends. But do the math first!) To put it simply, there’s no better option available to you than the one that generates a financial profit no matter the outcome. (Sorry, Milton Friedman!) One might ask what this has to do with PPP or moving to a foreign country. It involves the notion of currency arbitrage as a “free lunch.”

Recall the Canadian who was interested in the Einstein bobblehead. She essentially earns three dollars by making the (online) purchase from the States. It’s as if she bought the bobblehead in Canada and the government deposited three dollars into her account. Unfortunately, PPP doesn’t tell us when to make the purchase (we need the exchange rate for that), but we use that information to make inferences, like whether we’ll save money if we buy a book from Spokane rather than Vancouver. PPP, however, only involves “tradeable” commodities. “Immobile goods” like real estate and services are inaccessible to PPP calculations.

One such “inaccessible” item is tax liability. The professional advice I received was simple: Move to the United States in order to avail yourself of the more attractive federal tax rates. But can I get a “free lunch” by staying in my country of residence? PPP tells me whether there exists a currency imbalance, not whether I should move. An approach that *does* help me make this determination involves what I will call the “Net-Purchasing-Power Index” (NPPI). NPPI simply calculates the exchange ratethat represents the equilibrium point between two baskets of post-tax income portfolios and compares it to the current exchange rate. We begin with first principles—the technical definition of *net income*—and derive the NPPI from there:

Here, *a* is the total value of the income portfolio *in the U.S.*,is (again) the equilibrium rate,is the relevant federal tax rate in the target country, andis the relevant federal tax rate in the U.S. Our goal is to calculate the NPPI by calculating the quotient ofand. When NPPI < 1, the exchange rate is greater than the equilibrium rate, and we have the *potential* for an arbitrage opportunity—but not yet a *guarantee*. For that, we need to do a bit more work. As the NPPI tends to zero (i.e., as the exchange rate gets larger), the portions of our potential free lunch grow significantly.

Let’s walk through an example.

Suppose an analysis suggests my U.S. corporation will generate $500,000 USD in consulting fees in 2018. Conventional wisdom, as we’ve seen, suggests relocating to the U.S. That is, a $500,000 portfolio at a U.S. federal tax rate of 39% leaves me with $500,000(1 – 0.39) = $305,000 if I move to the States. If I bring that money into a target country with a federal tax rate of, say, 47%, I’ll only have an after-tax amount of $265,000. It seems as if I’m losing money by choosing not to move. But what about the exchange rate? Let’s imagine the exchange rate between the U.S. and the target country is 1.10 units of the target currency per USD. Is an arbitrage opportunity possible? Using the equation above, the equilibrium rate is 0.61/0.53 ≈ 1.1509, which is the rate that “equalizes” the post-exchange purchasing power between both countries. Because the exchange rate (1.10) is lower than the equilibrium rate (NPPI ≈ 1.046 > 1), I would (really) be losing money.

Clearly, the after-tax, after-conversion portfolio of $291,500 is less than the $305,000 I’d be able to spend on goods and services in the U.S. If I think the extra $13,500 I’d save by moving to the U.S. is worth the time and effort, I should relocate. But what if the exchange rate is 1.2532? Then NPPI ≈ 0.918 < 1, and I have a real chance to make some free money by staying put: In this case, I create $332,098 by bringing my U.S. income into the target country, and I enjoy a net-purchasing power of +$27,098. (Assume I’ve taken advantage of the legal means to minimize double-taxation issues.)
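The arithmetic of this example fits in a few lines. The sketch assumes the equilibrium rate is the ratio of the two after-tax fractions and that NPPI is the equilibrium rate divided by the exchange rate, which is what the figures above imply; the parameter names are my own labels rather than the post's notation.

```python
def equilibrium_rate(tax_us, tax_target):
    """Exchange rate at which after-tax income buys the same in both countries."""
    return (1 - tax_us) / (1 - tax_target)

def nppi(exchange_rate, tax_us, tax_target):
    """Net-Purchasing-Power Index: equilibrium rate over the actual rate."""
    return equilibrium_rate(tax_us, tax_target) / exchange_rate

def after_tax_converted(income_usd, exchange_rate, tax_target):
    """Income brought into the target country, converted and taxed there."""
    return income_usd * (1 - tax_target) * exchange_rate

for rate in (1.10, 1.2532):
    print(rate,
          round(nppi(rate, 0.39, 0.47), 3),
          round(after_tax_converted(500_000, rate, 0.47)))
```

An index above 1 says relocating wins on taxes alone; an index below 1 says the exchange rate more than covers the tax gap.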

But isn’t this a guaranteed arbitrage opportunity? No, for two reasons: (1) we haven’t accounted for price differentials and (2) arbitrage also depends on how much income we’re bringing into the target country. What if, for example, prices are much higher in the target country? That is, what if PPP is severely unbalanced, as in the bobblehead example? In that case, the increased prices eat away at any NPPI surplus, though if we’re dealing with tradeable commodities, as we saw earlier, one would simply purchase those items from the States. Unfortunately, importing goods isn’t always a guarantor of profits. Let’s say, for argument’s sake, the amount of income we’re dealing with is $100. If the exchange rate is again 1.2532, then NPPI < 1 and I’m left with $61 living in the U.S. and $66.41 in the target country. After buying the bobblehead at $9.99 USD, I have $51.01 in the U.S. and $50.42 if I buy it in the target country ($66.41 – $15.99). NPPI < 1, but I’ve still lost money. (For simplicity’s sake, assume the sales tax is equal.)

This means I either have to (a) import the bobblehead from the States to have a chance at maintaining my NPPI advantage or (b) increase the amount of money I’m converting from USD into CAD. If I choose to import the bobblehead, I do still come out ahead: $66.41 – 9.99(1.2532) = $53.89, which means staying in the target country is still $2.88 better than if I’d moved to the U.S. and paid for the bobblehead in USD. It’s a very slight advantage, but that’s only because the amounts we’re dealing with are small. As the portfolio (*a*) grows, so does the advantage. (This ineluctably leads to the notion of leverage as an investment strategy, but we won’t address that here.) Unfortunately, as the price grows, the advantage decreases, and if the price is high enough, choosing not to move becomes a *disadvantage*.

The question, then, becomes this: Is there any way to evaluate an arbitrage opportunity given a specific constellation of values for the variables we’ve been discussing? Yes, there is. Such an evaluation involves solving a linear optimization problem that accounts for price levels. I will call this the Currency Arbitrage Price (CAP), and it utilizes both NPPI and PPP values. In what follows, however, we assume NPPI < 1. (Recall that if NPPI > 1, then no arbitrage opportunity is possible.) So, what do we need to know? We need to determine the maximum price level of a specific item in the target country that guarantees a post-purchase profit. We can calculate this by adding our price variables to the calculation of . Solving the necessary inequality for, we have:

Notice the cancellation that occurs when. This last inequality tells us how much an (identical) item needs to cost in the target country in order to guarantee a profit given the other variables. We can visualize this inequality by graphing the linear CAP functiondefined by

,

and we guarantee arbitrage when the price in the target country falls below the value this function returns. Armed with this information, let’s revisit the $100/bobblehead example. Solving the above inequality gives us an upper bound of about $15.41. This means we are guaranteed an arbitrage opportunity when the bobblehead price is $9.99 in the U.S. and less than $15.41 in the target country. Let’s imagine it’s priced in the clearance bin (in the target country) at $11.99. In the U.S., paying in USD, we’d be left with the usual after-tax, after-purchase amount of $51.01, but in the target country, we’d now have an after-tax, after-purchase balance of $54.42. Despite the disparity in currency valuations and the higher tax rate, we enjoy an overall profit, which is an increase from the earlier amount of $53.89 we gained from importing.

Free lunch.

We can do a bit more. Imagine the target country decides all bobbleheads should be $11.99, and the U.S. decides it must reduce bobblehead prices to stay competitive. We love this Einstein bobblehead so much that we want to send it to all our friends. But the U.S. price keeps falling. How long can we purchase the bobblehead at $11.99 in the target country until we lose our arbitrage advantage? In other words, at what U.S. price does our profit reach zero? To solve this problem, we simply solve the above inequality forThis gives us

The function forfollows similarly. As long as the U.S. price is greater than, we retain our arbitrage advantage. So, if bobblehead prices remain fixed at $11.99 in the target country, the U.S. price can fall to $6.57 per unit and we’ll still earn a profit (as small as it might be at that price). You don’t need any extra information to calculate. PPI gives us the prices we need, and the values for all the other variables—exchange and tax rates—are easily accessible to the public. We’re simply doing some basic algebraic shuffling.
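Both price bounds from the bobblehead example can be reproduced with a short sketch. It assumes the profit condition is simply "balance after tax and purchase in the target country exceeds the same balance in the U.S.," with no sales tax; that reading matches the $15.41 and $6.57 figures in the text, but the function and parameter names are my own.

```python
def max_target_price(income, exchange_rate, tax_us, tax_target, price_us):
    """Highest target-country price that still beats buying in the U.S."""
    converted = income * exchange_rate * (1 - tax_target)
    return converted - income * (1 - tax_us) + price_us

def min_us_price(income, exchange_rate, tax_us, tax_target, price_target):
    """Lowest U.S. price at which buying in the target country still wins."""
    converted = income * exchange_rate * (1 - tax_target)
    return price_target - converted + income * (1 - tax_us)

upper = max_target_price(100, 1.2532, 0.39, 0.47, price_us=9.99)
lower = min_us_price(100, 1.2532, 0.39, 0.47, price_target=11.99)

print(round(upper, 2), round(lower, 2))
```

When the exchange rate equals the equilibrium rate, the two income terms cancel and each bound collapses to the other country's price.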

If we factor sales tax into the price differential, we add a layer of complexity to the problem of quantifying arbitrage. Ifandare the sales-tax rates in the U.S. and Canada, respectively, then our profit functions becomedefined by

for the price in Canada and

for the price in the U.S., respectively. In this more complex case, imagine we import $5000 USD at an exchange rate of with federal income-tax rates ofandand sales-tax rates ofandin the U.S. and Canada, respectively. We note that NPPI < 1, and we want to purchase a new computer whereUSD andCAD. Do we have an arbitrage opportunity? Unfortunately, we don’t—not until the Canadian price is reduced to less than $3,165.

The function for the Canadian price (above) reveals this upper bound when(solid green). As you can see from the graph, we lose money at the current price ratio ($1407.22 – $1781.25 = -$374.03). This is shown by the gap between the red- and blue-dashed lines transversed by the vertical line that represents the U.S. price; this is the difference betweenvalues of the *individual* profit curves (and not the above functions that arise from setting those equations equal to each other and solving forand, which are represented by the solid lines on the graphs). Though we lose money if we purchase the computer in Canada at the current price of $3,499, we do gain a profit of $137.71 by importing it from the U.S. But let’s imagine we choose to wait for a local sale, and the Best Buy in Vancouver reduces the price to $2,999. Now, we *do* have an arbitrage opportunity:

We’ve now earned a better-than-importing profit of $1967.22 – $1781.25 = $185.97, despite the higher federal- and sales-tax rates, by bringing in the USD-based income and paying for the computer in CAD. The graph also reveals that we’ll continue to generate a profit until the U.S. price—in response to Canada’s competitiveness—drops to about $2,329 (purple), at which point the individual (dashed) profit curves intersect with each other at the equilibrium price point and the total profit drops to zero: Both the Canadian and U.S. consumers, at that point, would be left with a remaining balance of $1967.22 after purchasing the computer.

Below is a list of some real-world examples (between Canada and the U.S.) at the time of publication with a variety of values for some of the variables under consideration (NY and BC sales taxes were used for the calculations):

So, that’s it. Currency arbitrage in a nutshell. Perhaps something like this exists somewhere in the literature—I imagine it might, even though I’ve never seen it explicitly during my study of economics—but we offer it here in the event it will pique general interest. It might be beneficial to review the basic process involved in calculating the CAP for a given portfolio combined with a certain collection of data points:

(1) Calculate the equilibrium rate

(2) Confirm NPPI < 1

(3) Determineand

(4) Determineandif necessary

(5) Purchase “identical items” in the target country if the price is less than

(6) Purchase (4)’s items freely until its price in the host country reaches(assumeremains constant)

Anticipated objections to the CAP model:

(I) *Availability of identical goods*

If a tradeable good in the target country is truly unique, you couldn’t have purchased it anywhere else; the notion of PPP is simply unimportant in those cases. Of course, you will have to decide whether you wish to (or, for some reason, must) pay for that uniqueness or if you’d prefer to choose an item that closely (but not precisely) matches the one you’re considering, assuming such an item is available. As far as price modeling is concerned, very closely matched items can be (and probably should be) considered “identical.” Variations among packages of Bic pens, for example, probably don’t mean very much with respect to the sticker price.

(II) *PPP applicability*

PPP only accounts for so-called tradeable goods, but it is possible to compare “immobile” goods using a number of objective metrics. For homes and real estate, for example, we could use price/sq.ft., location, year of construction, amenities, projected repairs, and many other measures of objective value. Much like the issue of identical goods, then, we can gain a pretty good comparison between immobile goods between countries that will allow us to use a generalized approach to PPP. Value is in the eye of the beholder, which means an eye toward equality of value between such goods is achievable.

So, what about the big question: Should I move to the U.S. based on my fanciful financial projections or remain in the target country and bring the money here? Well, if NPPI < 1, which means the exchange rate outstrips the taxation gap, then I’m guaranteed a free lunch (or two) as long as I purchase (near-)identical goods that fall below the upper bound. If I can do that through importing goods with a favorable exchange rate or by taking advantage of cheaper relative prices in the target country given a certain sales-tax profile, then it’s in my interest to eschew the idea of relocating, even though the tax rates are more favorable in the States.

Footnotes:

[1] In wagers like these, everyone *must hold their money* until *after* the event is completed. In this way, an arbitrageur can cover her losses with her winnings and keep the remaining profit. Online betting sites require you to front the money as you make the wager, which is why this arbitrage strategy won’t work in those cases.

## Can We Quantify Certain Kinds of Ethical Choices?

In his book *Ethics in the Real World*, the renowned philosopher Peter Singer proposes a metric for ethical risk informed by his (generally held) worldview of *consequentialism* (i.e., the idea that the consequence of an act determines the ethical value of that act). Singer states that, generally speaking, “we can measure how bad a particular risk is by multiplying the probability of the bad outcome by how bad the outcome would be” (183). Thus, an act is considered more ethical if it offers less general risk (for death, for torture, for financial waste, for suffering, for climate change, etc.) than an alternative act. We can model Singer’s non-mathematical comments by the very simple product $R_n = \sigma_n p_n$, where $R_n$ is the “Singer risk” for the *n*th event and $\sigma_n$ and $p_n$ are the outcome and probability for the *n*th event, respectively. Note that $R_n$ is really just an area calculation in $\mathbb{R}^2$ with the “sides” of the rectangle defined as the two variables in question; the larger the area, the greater the risk. Simple, right? We will return to the concept of area later.

But is this a viable model for risk? Forget for a moment about other kinds of ethical choices we make that have less definitive outcomes—a decision to break a friend’s confidentiality, defending a colleague from a false accusation that risks alienation among one’s coworkers, telling the truth despite hurting someone’s feelings, etc. Limited to the probability of “bad outcomes” we can quantify, however, does Singer’s product capture a quantification of ethical risk in a real and intuitive way? Is the ethical value of an act, in general, determined by the consequence(s) of that act? At a first glance, it seems we shouldn’t take Singer’s metric too seriously—and, perhaps, he doesn’t either—because it immediately strikes the reader as an inadequate method to quantify ethical risk in any meaningful way. How can we, to imagine one easy example, compare the loss of life between, and among, different demographics? Is it more ethical to prevent the death of a child if that preventive measure causes the death of, say, an elderly person? Five elderly people? What if it caused the death of a young, female professional at the height of her earning and reproductive powers? Is it even possible to balance those scales when making a risk assessment?

Even if we could achieve some sort of balance involving what I will call *congruent* cases (i.e., outcomes that involve a single parameter: the number of people harmed, the tonnage of CO2 released into the atmosphere, etc.), we’re still left with the much more difficult problem of quantifying *incongruent* outcomes: Is the ethical risk for blindness and malnutrition in third-world countries equal to that of domestic homelessness and drug addiction? Is rolling back the pursuit of nuclear energy (and the problems associated with managing its toxic, immutable waste) on par with diminishing our carbon footprint by reducing CO2 levels? If it is, how can we model that risk relationship? If not, why not, and how do we build into Singer’s model an objective and unbiased evaluation of those disparities? Assuming we accept Singer’s basic design, it would be an extraordinarily difficult task to “nondimensionalize,” as it were, the innumerable combinations of outcomes that would necessarily inform our decision-making process. If Singer’s model—and the philosophical platform of consequentialism, in general—has any hope of offering even a partial solution to the important kinds of ethical dilemmas he raises in his book, it *must* be able to handle the complexities involved in comparing these kinds of incongruities. But for the sake of argument, let’s set aside those additional complexities—as well as general critiques of consequentialism—and address the model in its most simplified form: a risk metric as a simple product limited to congruent outcomes.

Singer’s basic approach isn’t entirely without some precedent. Financial risk models, for example, involve (the sum of the) products of probabilities and returns, but they are couched within much larger mathematical and statistical machinery and require several additional calculations (e.g., expected rate of return, variance, etc.). The expected value *E*(*X*) of a continuous random variable involves the integration of the product of the random variable and the PDF, which has attached to it certain conditions (only positive values, total integration equals 1). There are other examples. Singer, however, argues that a quantification of risk could be limited to the product of an outcome and the probability that outcome occurs, and it is the validity of this basic approach we will challenge.

In light of this very narrow definition of ethical risk, then, consider the following thought experiment, couched in the form of a poll question, that was posted on three different Facebook groups:

*Which of the following options would you consider to be the ethically superior choice?*

*(1) Ten people are killed if you roll a three with a ten-sided die.*

*(2) One person is killed if you fail to roll a three with (a different) ten-sided die.*

Here, we set two independent, stochastic events (very nearly) equal to each other, though it seems clear they’re not equal ethically; in doing so, we hoped to investigate whether people would respond to the quantification of risk, as defined by Singer, or, perhaps, something else. (I’ve reasonably defined “how bad the outcome would be” simply by the number of people who would be killed.) Contrary to predictions based on Singer’s metric, a sizable majority of people (33/44 = 0.75) selected option 1 as the more ethical choice, even though (a) option 2 actually offers slightly LESS risk, which makes it the preferred choice according to Singer’s model, and (b) the number of people at risk for harm in option 1 is ten times greater.
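To make the comparison concrete, here is a minimal sketch of Singer's product applied to the two poll options. The function name `singer_risk` is ours, and "how bad the outcome would be" is taken, as in the text, to be the number of people killed.

```python
def singer_risk(outcome, probability):
    """Singer's product: how bad the outcome is times how likely it is."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return outcome * probability


# Option 1: ten people die on a 1-in-10 roll.
option_1 = singer_risk(outcome=10, probability=0.1)
# Option 2: one person dies on a 9-in-10 roll.
option_2 = singer_risk(outcome=1, probability=0.9)

print(round(option_1, 10), round(option_2, 10))  # 1.0 0.9
```

On this metric alone, option 2 carries slightly less risk, which is exactly the ranking most respondents rejected.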

So, what *happened*?

Most people seem to have responded **not** to Singer’s risk metric but to the *probability* of the outcomes. The risk of ten people dying ($\sigma_1 p_1 = 10 \times 0.1 = 1.0$) is very much mitigated by the fact that there’s a 90-percent chance nothing happens and the ten people at risk will remain unharmed. This stands in sharp contrast to option 2 ($\sigma_2 p_2 = 1 \times 0.9 = 0.9$), where there exists a 90-percent chance the person at risk will be killed, despite the fact that the total number of people at risk is one-tenth that of option 1. It seems the respondents simplified the ethical dilemma by focusing on the probabilities of the outcomes, as if the poll options were as follows:

(1) There’s a 90% chance no one dies.

(2) There’s a 10% chance no one dies.

Notice the sigma values have vanished. Risk has now been reduced to reflect the *p*-values for harm, as if participants (subconsciously) treated Singer’s metric like a functiondefined byand evaluated the poll options as. (Because, we can’t simply cancel the outcomes.) This result is not particularly surprising. Most participants seemed to follow a probabilistic risk-aversion strategy rather than an outcome-averse one, but it’s an evaluation process that’s clearly not linked to Singer’s consequentialism, which demands a deference to the fact that. That is, the poll results reify the notion that ethical preferences might very well engender greater risk according to Singer’s model.

One might imagine what the polling would have looked like if it followed Singer’s metric. Perhaps everyone would have picked option 2, the result of privileging $\sigma$ regardless of the associated probabilities and/or recognizing it offers less overall risk (0.9 < 1.0). In another scenario, the polling might have been split almost equally between both options, reflecting the (near) equality of risk between the two options. The next question seems inevitable: How could we equate these outcomes in the minds of respondents? How much would we have to increase the value of $\sigma_1$ such that people felt the *objective* evaluation of both risks was, in fact, very much the same, where it made essentially no difference (in terms of ethical risk) whether we chose option 1 or 2? Perhaps the relatively low p-value for option 1 would overwhelm any value we could assign to $\sigma_1$. Perhaps there’s more structure to perceptions involving probability-outcome relationships; for example, they might be inversely proportional to each other: $p \propto 1/\sigma$. It’s difficult to speculate. If Singer’s model were more robust, we could simply solve the equation for the appropriate variable and calculate the perfect balance of risk, much like we attempt to do in finance or economics. Unfortunately, like so many mathematical models in other fields, things are never quite so simple.

So, what do we do when mathematical equality doesn’t transpose to psychological or ethical “equality”? How can we make sense of two poll options with essentially equal risk that engender such a divergent response? Fortunately, we can use some tools from linear algebra to help us explore the degree to which two risk values—as Singer products with congruent outcomes—are (dis)similar. Assume a nonsingular risk matrix *A* is a 2 x 2 matrix whose entries are defined as follows:,,, andwhere **u** and **v**. The length of the cross product of (these risk) vectors **u**, **v** is equal to the absolute value of det *A*:

.

The value of omega reveals a relationship between risk vectors. The determinant of a 2 x 2 matrix, if it exists, can be thought of as the area of a projected parallelogram indelimited by its vectors—in this case, **u **and** v**. The greater thevalue, the greater the area of the projected parallelogram and the larger the dissimilarity between Singer risks. Thoughaccording to the Singer metric,, revealing the relationship is not nearly as close as the risk products suggest. This result might also proffer a partial explanation for the poll results, which, despite near equality in risk values, are heavily skewed toward option 1. Perhapsresponds in some way to the pollsters’ decision to privilege likelihood over outcome. For a quick comparison, considerwhen,,, and. Here, the Singer risks are equal (3), yet even while comparing two events with identical risk products, the sensitivity of is able to differentiate between them. That may be a helpful and quick initial guide when comparing the risk of two congruent ethical choices.
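The determinant comparison is easy to try by hand. The sketch below assumes each risk vector is the pair (probability, outcome) and that the two vectors form the rows of *A*; that layout is our assumption, though the absolute value of the determinant is the same if the vectors are placed in columns instead. The second pair of values is a hypothetical example with equal Singer risks of 3, not necessarily the values used above.

```python
def omega(u, v):
    """|det A| for the 2 x 2 matrix whose rows are the risk vectors u and v.

    Assumption: each vector is (probability, outcome).
    """
    (p1, s1), (p2, s2) = u, v
    return abs(p1 * s2 - p2 * s1)


# Poll options: (0.1, 10) and (0.9, 1); Singer risks are 1.0 and 0.9.
poll = omega((0.1, 10), (0.9, 1))
print(round(poll, 10))  # 8.9

# Hypothetical pair with identical Singer risks: 0.3 * 10 = 0.6 * 5 = 3.
equal_risk = omega((0.3, 10), (0.6, 5))
print(round(equal_risk, 10))  # 4.5
```

Two risks can share the same product and still span a parallelogram of nonzero area, which is the sense in which the determinant distinguishes them.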

### Preference Rules

At this point, we might be inclined to consider the feasibility of certain kinds of “preference rules” (PR) with respect to ethical risk; that is, are there any ways to make an objectively unequivocal decision between Singer risks given certain values? The short answer: Yes, there are, and we list three such rules (PR1-3) that will *always* hold in any Singer-risk comparison. We also include two “derived preference rules” (DPR) that similarly hold in any situation:

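PR1: $(p_1 = p_2) \land (\sigma_1 < \sigma_2) \implies \text{prefer } \sigma_1 p_1$

PR2: $(\sigma_1 = \sigma_2) \land (p_1 < p_2) \implies \text{prefer } \sigma_1 p_1$

PR3: $(p_1 < p_2) \land (\sigma_1 < \sigma_2) \implies \text{prefer } \sigma_1 p_1$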

DPR1:

DPR2:

PR1-3 are almost insultingly obvious, and for those unfamiliar with the symbols of formal logic, I offer an informal exposition. PR1 states that if the probabilities between two Singer risks are equal, we will prefer the smaller outcome, which is equivalent to preferring the smaller Singer-risk value. (Remember, we’re limiting our investigation to risk products with congruent outcomes.) PR2 simply reverses the issue addressed in PR1: If the outcomes of two Singer-risk values are the same, we will prefer the smaller probability, where, again, we’re preferring the smaller Singer-risk value. PR3 formalizes the concept inherent in PR1-2: If both the probability and the outcome of a Singer-risk value are smaller than those of a second Singer-risk value, we will prefer, as we should expect, the smaller Singer-risk value. These rules are inviolable and will obviously hold in all cases. (The universal quantifier $\forall$ is implied in each of the above cases.)
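One way to read PR1-3 together is as a dominance check: a risk is preferred when it is no worse in either coordinate and strictly better in at least one. The sketch below takes that reading; the `Risk` type and the `preferred` function are our own names, not part of the formal rules.

```python
from typing import NamedTuple, Optional


class Risk(NamedTuple):
    outcome: float      # how bad the outcome is (e.g., number of deaths)
    probability: float  # chance that the outcome occurs


def dominates(a: Risk, b: Risk) -> bool:
    """True when a is no worse than b in both coordinates and better in one."""
    no_worse = a.outcome <= b.outcome and a.probability <= b.probability
    better = a.outcome < b.outcome or a.probability < b.probability
    return no_worse and better


def preferred(a: Risk, b: Risk) -> Optional[Risk]:
    """Return the risk favored by PR1-PR3, or None when no rule applies."""
    if dominates(a, b):
        return a
    if dominates(b, a):
        return b
    return None


print(preferred(Risk(5, 0.2), Risk(8, 0.2)))   # PR1: equal p, smaller outcome
print(preferred(Risk(5, 0.2), Risk(5, 0.6)))   # PR2: equal outcome, smaller p
print(preferred(Risk(10, 0.1), Risk(1, 0.9)))  # poll options: None
```

The poll options return `None` because each one is better in one coordinate and worse in the other, so none of the three rules can settle the choice.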

The DPRs are only slightly less obvious, and we only construct them because they relate to our earlier exploration of omega. DPR1 says that ifis the smaller risk-matrix value andis greater than 1, prefer. This is a convenient rule if you’re given det *A* products and the associated probabilities. The proof for this is trivial.

*Proof* (direct): Suppose *A* is a 2 x 2 nonsingular (i.e.,) risk matrix such that. Then,. If, then. But, which meansand.

Thus, we will prefer the smaller Singer-risk value as prescribed by PR3. Unfortunately, a proof for DPR2 must use a different approach, but we can at least state it as follows: Suppose *A* is a 2 x 2 nonsingular risk matrix such thatand the ratio of probabilities,, is less than 1/*k*, then prefer Singer risk. The same definitions from DPR1 apply here as well. One might have already asked the obvious question: Whence *k*? It arises in the process of transforming the principal inequality to an equality:

We know, so and we’re now in a position to formalize a proof for DPR2.

*Proof* (direct): We need to show that if, thengiven the det *A* inequality. By transforming the former inequality into one we can use, we see thatbecomesby the properties of logarithms, and it’s no coincidence this latter inequality involves the last two (RHS) terms of the final equality displayed above. It is the case that if because of the signs of the terms. Simplifying, we have

which only holds when, as desired.

Because, we will prefer as prescribed by PR3. How do these DPRs relate to the poll options? We have 0.1(1) < 0.9(10) and 0.1 < 0.9, so we need to determine if 0.1/0.9 < 1/*k*. In this case, 1/*k* is *smaller*, which confirms our intuition that PR3 can’t be invoked in the poll-question case:with respect to *p*-values, andin terms of outcomes. Unfortunately, we cannot establish a similarly inviolable preference rule for these kinds of mismatched inequalities, for it is this mismatched relationship that makes difficult the process of determining the ethical preference between single products involving congruent outcomes.

### Area as a Quantification for Risk

What about Singer’s implied use of “area” as a metric? We’ve already mentioned its very limited scope prevents it from being a comprehensive model, but the fact that his model quantifies risk as an area calculation is not, *ipso facto*, a problem. There exists a long and storied tradition in mathematics, for example, in which area calculations are the very calculations we want: probability densities, work, distance, center-of-mass problems, kinetic energy, average value of a function, and arc length are only a few examples. We’ll see a few more shortly. Within the context of that rich tradition, then, we can imagine a number of other area-based models in an effort to uncover a metric that (at least) betrays the poll results. Part of what complicates matters is that the probability values of “Singer risks” aren’t built upon the same mathematical infrastructure. In other words, we cannot directly compare probabilities within the same distribution. We’d like to be able to do so, but if we view $\sigma$ as a continuous random variable, the associated PDFs cannot be equal. This can be seen with even a quick glance at the poll options: What probability distribution, for example, *decreases* probability values as we *increase* the area under the distribution curve? A cohesive PDF in the case of the Singer metric would have to yield *both* 0.9 at *x* = 1 and 0.1 at *x* = 10. If there is such a distribution, I’m not aware of it. Of course, the lack of a distinct PDF is mitigated by the reality that our ethical dilemma isn’t entirely random. Yes, there is a stochastic process attached to an impending if-then action, but that’s not the same thing as having a truly random variable.

We consider three area calculations as integrations of functions whose definitions are implicitly defined below:

1. $\int_0^{\sigma} p\,x\,dx = \tfrac{1}{2}p\sigma^2$

The quantification of risk is now the area under the above linear function, where $p$ is simply the slope of the function. Option 2 is *far* less risky (0.45 << 5) because risk now involves the quadratic growth of the outcome. A mental visualization of the (respective areas under the) graphs of these Singer metrics will be enough to convince anyone of the inadmissibility of this approach as a viable model. As we’ve seen, most people seemed to respond to the *likelihood* of the event and not the value of $\sigma$, as if the outcomes were largely irrelevant to the decision-making process, yet the linearity of an outcome-based design dramatically increases sensitivity to $\sigma$; we’re simply privileging the wrong variable.

2. $\int_0^{p}\!\int_0^{\sigma} xy\,dx\,dy = \tfrac{1}{4}p^2\sigma^2$

A volume calculation in $\mathbb{R}^3$ does a better job of approximating the Singer-risk values, but it also fails to model the poll results. The integration is straightforward and simplifies to $\tfrac{1}{4}p^2\sigma^2$. This approach only slightly reduces the value of the quadratic growth in the previous example by squaring the *p*-value, but it’s not enough of a reduction in most cases. (Recall from analysis that if $0 < p < 1$, then $p^n \to 0$ as $n \to \infty$.) Even though this new model tightens the risk difference between both options (+0.0475 vs. +0.1), it still suggests option 2 offers slightly less risk: 0.25 for option 1 and 0.2025 for option 2.

3.

Here, the surface-area calculation inalso fails to model the poll results. We (somewhat arbitrarily) choose the planein the hope of striking a better balance between probabilities and outcomes. Simplifying and solving leaves us with the following product:

Here, like the other two approaches, option 1 remains the riskier option:and. The risk gap between options, however, has now widened compared to the volume calculation, and we still have the quadratic growth ofbuilt into the model. An alternative iterated integral—namely,—shrinks this gap (and), but it still fails to track the majority decision to treat option 1 as the more ethical choice. Thus, in every case, the models we’ve explored produce a risk value for option 2 that is less than option 1. This is disappointing, but we only intend to offer a brief investigation into the possibility of an alternative model. A fully realized and robust design is well beyond the scope of a blog post, so I will leave it to interested readers to pursue a viable solution, including a better motivation for *z*, concerning the kinds of ethical problems we’re investigating here.
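The first two models can be checked numerically. The sketch below assumes, based on the quoted values (0.45 vs. 5 for the area, and a gap of 0.0475 for the volume), that the first model is the area under y = p·x on [0, σ] and the second is the volume under z = x·y over the rectangle [0, σ] × [0, p]. The function names are ours, and the surface-area model is left out of the sketch.

```python
def linear_area(outcome, probability):
    """Area under y = p * x on [0, outcome], i.e. p * outcome**2 / 2."""
    return probability * outcome**2 / 2


def product_volume(outcome, probability):
    """Volume under z = x * y over [0, outcome] x [0, probability]."""
    return (probability**2) * (outcome**2) / 4


def midpoint_area(outcome, probability, steps=10_000):
    """Midpoint-rule estimate of the same area, as an independent check."""
    width = outcome / steps
    return sum(probability * (i + 0.5) * width * width for i in range(steps))


options = {"option 1": (10, 0.1), "option 2": (1, 0.9)}
for name, (outcome, probability) in options.items():
    print(
        name,
        round(linear_area(outcome, probability), 4),
        round(midpoint_area(outcome, probability), 4),
        round(product_volume(outcome, probability), 4),
    )
# option 1 -> 5.0, 5.0, 0.25
# option 2 -> 0.45, 0.45, 0.2025
```

Under both models option 2 comes out as the less risky choice, so neither tracks the majority preference for option 1.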

### Conclusion

After all this, though, we’ve neglected to ask, perhaps, the most crucial question: Is the concept of risk a vitally important and pervasive consideration? To this, we must offer a full-throated “yes!” We need only remind the reader that notions of risk aren’t, as this post might suggest to some, mere fodder for a tiresome intellectual and mathematical exercise; we as a society make many decisions based on quantifications of risk—from actuaries calculating life expectancy for insurance policies and the beta risk of financial investments to disaster management and the cost-benefit analysis involved with safety recalls. And though pure notions of ethical risk are absent in most of these examples, we still very much engage in just the kinds of stochastic events reified in the poll options—where lives can, and often do, (literally) hang in the balance; there are probabilities associated with dying an unnatural death by being hit by a drunk driver, with the collapse of hedge funds holding severely over-leveraged arbitrage portfolios, with the flooding and destruction of Florida’s coastal cities during hurricane season, and with how many people might be killed by driving Company X’s new car. But that’s not all: We use these quantifications to draft legislation, evaluate legal settlements, decide how maintenance funds are allocated, and design the constellation of food and products your children will put in their mouths. Sometimes, such decisions can lead to subversive, and even illegal, acts.

Yet despite the vertiginous ubiquity of risk assessments that swirl around us, many people simply refused to choose a poll option on the grounds of some misguided moral indignation (“The only ethical choice is not to choose!”). Perhaps their reluctance involves the potential dread that comes with an increased awareness of self, that adjudicating a tough ethical decision requires a prism through which some are afraid to see themselves. It takes the willingness of an honest and restless soul to subject oneself to such psychic refractions.

That we should all have such courage.

## A Proposed Proof for the Existence of God

Assume it is impossible to prove God does not exist. Then the probability that God exists, denoted *p*(*G*), is greater than zero: $p(G) > 0$ and $p(G) + p(\neg G) = 1$. (This forces $p(\neg G) < 1$.) Also assume, as many important physicists and cosmologists do, that (1) the multiverse exists and is composed of an infinite number of independent universes and (2) our current universe is but one of those infinite universes existing in the multiverse.[1]

If the probability of the non-existence of God, denoted *p*(¬*G*), in some universe is defined as

$$p(\neg G) = 1 - p(G),$$

then as the number of universes (*n*) approaches infinity,

$$\lim_{n \to \infty} \left[p(\neg G)\right]^n = 0.$$

That is, the sequence $\left[p(\neg G)\right]^n \to 0$ as $n \to \infty$. Any event that *can* happen will ineluctably happen given enough trials. This means God must exist in at least one universe within the multiverse, and if He does, then He must exist in *all* universes, including our universe, because omnipresence is a necessary condition for God to exist.
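The limit is just geometric decay, and a few lines of code show how quickly it sets in. This sketch assumes the universes are independent trials that share the same per-universe probability; the function names and the sample value of one in a million are hypothetical choices for illustration.

```python
import math


def prob_absent_everywhere(p_exists, n_universes):
    """Probability the event occurs in none of n independent universes."""
    return (1.0 - p_exists) ** n_universes


def universes_needed(p_exists, epsilon):
    """Smallest n for which prob_absent_everywhere(p_exists, n) <= epsilon."""
    if not (0.0 < p_exists < 1.0 and 0.0 < epsilon < 1.0):
        raise ValueError("p_exists and epsilon must lie strictly between 0 and 1")
    return math.ceil(math.log(epsilon) / math.log1p(-p_exists))


p = 1e-6  # hypothetical per-universe probability
for n in (1, 1_000, 1_000_000, 100_000_000):
    print(n, prob_absent_everywhere(p, n))

print(universes_needed(p, 0.01))
```

However small the per-universe probability, as long as it is positive the chance of absence everywhere can be pushed below any threshold by taking enough universes.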

Footnotes:

[1] This is certainly a reasonable, if not ubiquitously held, concept that follows from the mathematics of inflationary theory. In *Our Mathematical Universe*, for example, Max Tegmark suggests if “inflation…made our space infinite[, t]hen there are infinitely many Level I parallel universes” (121). If this still seems unreasonable, consider the fact that a random walk on a lattice in “has unity probability of reaching any point (including the starting point) as the number of steps approaches infinity.”

## The Myth of Altruism?

The American Heritage Dictionary (2011) defines “altruism” as “selflessness.” If one accepts that standard definition, then it seems reasonable to view an “altruistic act” as one that fails to produce a net gain in personal benefit for the actor subsequent to its completion. (Here, we privilege *psychological altruism* as opposed to *biological altruism*, which is often dismissed by the “selfish gene” theory of Darwinian selection and notions of reproductive fitness.) Most people, however, assume psychologically-based altruistic acts exist because they believe an act that does not demand or expect overt reciprocity or recognition by the recipient (or others) is so defined. But is this view sufficiently comprehensive, and is it really possible to behave toward others in a way that is completely devoid of self? Is self-interest an ineluctable process with respect to volitional acts of kindness? Here, we explore the likelihood of engaging in an authentically selfless act and capturing true altruism, in general. (Note: For those averse to mathematical jargon, feel free to skip to the paragraph that begins with “[A]t this stage” to get a basic understanding of orthogonality and then move to the next section, “Semantic States as ‘Intrinsic Desires’,” without losing much traction.)

### The Model

Imagine for a moment every potential (positive) outcome that could emerge as a result of performing some act—say, holding the door for an elderly person. You might receive a “thank you,” a smile from an approving onlooker, someone reciprocating in kind, a feeling you’ve done what your parents (or your religious upbringing) might have expected you to do, perhaps even a monetary reward—whatever. (Note: We assume there will never be an eager desire or expectation for negative consequences, so we require all outcomes to be positive, beneficial events. Of course, a comprehensive model would also include the desire to *avoid* negative consequences—the ignominy of failing to return a wallet or aiding a helpless animal (an example we will revisit later)—but these can be transformed into positive statements that avoid the unnecessary complications associated with the contrapositive form.)

We suppose there are *n* outcomes, and we can imagine each outcome enjoys a certain probability of occurring. We will call this the *potential vector* , the components of which are simply the probabilities that each outcome (ordered 1 through *n*) will occur:

andwheredoes not have to equal 1 because events are independent and more than a single outcome is possible. (You might, for example, receive both a “thank you” and a dollar bill for holding the door for an elderly woman.) So, the vectorrepresents the agglomeration of the discrete probabilities of every positive thing that could occur to one’s benefit by engaging in the act.

Consider, now, another vector, , that represents the constellation of desires and expectations for the possible outcomes enumerated in. That is, if, thencatalogs the interest and desire in outcome. (It might be convenient to imagineas a binary vector of length *n *and an element of, but we will be better to treatvectors as a subset of the parent vector spaceto whichbelongs.) In other words,: either you desire the outcome (whose probability is denoted by)or you don’t. (There are no “probabilities of expectation or desire” in our model.) We will soon see how these vectors address our larger problem of quantifying acts of altruism.

The pointinis determined by, and we want to establish a plane parallel to (and including)with normal vector. Define a point X generated by a vectorwhere the scalarand. Ifis a normal vector of, then the normal-form equation of the plane is given by, and its general equation is

.

We now have a foundation upon which to establish a basic, quantifiable metric for altruism. If we assume, as we did above, that an altruistic act benefits the recipient and fails to generate *any* positive benefits for the actor, then such an act must involve potential and expectation vectors whose scalar product equals zero, which means they stand in an orthogonal (i.e., right-angle) relationship to each other. It is interesting to note there are only *two *possible avenues for–orthogonality within our model: (a) the actor desires and/or expects absolutely no rewards (i.e.,), which is the singular and generally understood notion of altruism, and (b) the actor only desires and/or expects rewards that are simply impossible (i.e.,where). (We will assume.) In all other cases, the scalar product will be greater than zero, violating the altruism requirement that there be no benefit to the actor. Framed another way, (the vector of) an altruistic act forms part of a basis for a subspace in.

At this stage, it might be beneficial to pause and walk through a very easy example. Imagine there are only three possible outcomes for buying someone their morning coffee at Starbucks: (1) the recipient says “thank you,” (2) someone buys your coffee for you (“paying it forward”), and (3) the person offers to pay your mortgage. A reasonable potential vector might be [0.9, 0.5, 0]—i.e., there’s a 90% chance you’ll get a “thank you,” a 50% chance someone else will buy your coffee for you, and a zero-percent chance this person will pay your mortgage. Now, assume your expectation vector for those outcomes is [1, 0, 0]—you expect people to say “thank you” when someone does something nice for them, but you don’t expect someone to buy your coffee or pay your mortgage as a result. The scalar product is greater than zero ($[0.9, 0.5, 0] \cdot [1, 0, 0] = 0.9$), which means the act of buying the coffee fails to meet the requirement for altruism (i.e., the potential vector is not orthogonal to the plane that includes Q and X = *t***q**). In this example, as we’ve seen in the general case, the only way buying the coffee could have been an altruistic act is if (a) the actor expects or desires no outcome at all or (b) the actor expected or desired her mortgage to be paid (and nothing else). We will discuss later the reasonableness of the former scenario. (It might also be interesting to note the model can quantify the *degree* to which an act is altruistic.)
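The coffee example reduces to a single dot product, so it is easy to play with other expectation vectors. In this minimal sketch the function names and the tolerance are our own choices; the vectors are the ones from the example.

```python
def scalar_product(potential, expectation):
    """Dot product of a potential vector and an expectation vector."""
    if len(potential) != len(expectation):
        raise ValueError("vectors must list the same outcomes")
    return sum(p * e for p, e in zip(potential, expectation))


def is_altruistic(potential, expectation, tol=1e-12):
    """The model's test: the two vectors must be orthogonal."""
    return abs(scalar_product(potential, expectation)) <= tol


potential = [0.9, 0.5, 0.0]  # thank-you, coffee bought for you, mortgage paid

print(scalar_product(potential, [1, 0, 0]))  # 0.9
print(is_altruistic(potential, [1, 0, 0]))   # False: a thank-you is expected
print(is_altruistic(potential, [0, 0, 0]))   # True: nothing is expected
print(is_altruistic(potential, [0, 0, 1]))   # True: only the impossible is expected
```

The last two calls correspond to the two avenues for orthogonality described above: expecting nothing at all, or expecting only outcomes whose probability is zero.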

The above formalism will work in every case where there is a single, fixed potential vector and a specified constellation of expectations; curious readers, however, might be interested in cases where there exists a non-scalar-multiple range of expectations (i.e., when Xfor some scalar *t*), and we can dispatch the formalism fairly quickly. In these cases, orthogonality would involve a specific potential vector and a plane involving the displacement of expectation vectors. The vector form of this plane is, and direction vectors, are defined as follows:

withdefined similarly for points Q and R;are scalars (possibly understood as time per some unit of measurement for a transition vector), and points S and R of the direction vectors are necessarily located on the plane in question. Unpacking the vector form of the equation yields the following matrix equation:

whose parametric equations are

It’s not at all clear how one might interpret “altruistic orthogonality” between a potential vector and a transition or range (i.e., subtraction) vector of expectations within this alternate plane, but it will be enough for now to consider its normal vectors—one at Q and, if we wish, one at X (through the appropriate mathematical adjustments)—as secondary altruistic events orthogonal to the relevant plane intersections:

### Semantic States as ‘Intrinsic Desires’

To this point, we’ve established a very simple mathematical model that allows us to quantify a notion of altruism, but even this model hinges on the likelihood that one’s expectation vector equals zero: an actor neither expects nor desires any outcome or benefit from engaging in the act. This seems plausible for events we can recognize and catalog (e.g., reciprocal acts of kindness, expressions of affirmation, etc.), but what about the internal motivations—philosophers refer to these as *intrinsic desires*—that very often drive our decision-making process? What can we say about acts that resonate with these subjective, internal motivations like religious upbringing, a generic sense of rectitude, cultural conditioning, or the Golden Rule? These intrinsic desires must also be included in the collection of benefits we might expect to gain from engaging in an act and, thus, must be included in the set of components of potential outcomes. If you’ve been following the above mathematical discussion, such internal states guarantee non-orthogonality; that is, they secure a scalar forbecausefor some internal state *k*. This means internal states belie a genuine act of altruism. It is important to note, too, these acts are closely associated with notions of social exchange theory, where (1) “assets” and “liabilities” are not necessarily objective, quantifiable things (e.g., wealth, beauty, education, etc.) and (2) one’s decisions often work toward shrinking the gap between the *perceived* self and *ideal* self. (See, particularly, Murstein, 1971.) In considering the context of altruism, internal states combine these exchange features: An act that aligns with some intrinsic desire will bring the actor closer to the vision of his or her *ideal* self, which, in turn, will be subjectively perceived and experienced as an asset. Altruism is perforce banished in the process.

So, the question then becomes: Is it possible to act in a way that is completely devoid of both a desire for external rewards *and* any motivation involving intrinsic desires, internal states that provide (what we will conveniently call) *semantic assets*? As I hope I’ve shown, yes, it is (mathematically) possible (and in light of that, I might have been better served placing quotes around the word *myth* in the title), but we must also ask ourselves the following question: How *likely* is it that an act would be genuinely altruistic given our model? If we imagine secondary (non-scalar) planes composed of expectation vectors from arbitrary points, parallel to the x-axis, as described above, then it is easy to see there are a countably infinite number of planes orthogonal to the relevant potential vector. (Assume **q** is nonzero because if **q** is the zero vector, it is orthogonal to every plane.) But there are an (uncountably) infinite number of rotation angles, which means there exists a far greater number of planes that are non-orthogonal to a given potential vector, and this only considers rotations within a two-dimensional slice of our outcome space. As you might be able to visualize, the number of non-orthogonal planes grows considerably if we include rotations in three dimensions. Within the context of three dimensions, and to get a general sense of the unlikelihood of acquiring random orthogonality, suppose there exists a secondary plane, as described above, for every integer-based value of the rotation angle; then the probability of a potential vector being orthogonal to a randomly chosen plane of independent expectation vectors is very low: *p* = 1/178 ≈ 0.00561797753, a value given to eleven decimal places. If we include additional rotations beyond those already permitted, the *p*-value for random orthogonality decreases to 0.00001564896, a value so small as to be essentially negligible.

So, although altruism is theoretically possible because our model admits the potential for orthogonality, our model also suggests such acts are quite unlikely, especially for large *n*. For philosophically sophisticated readers, the model supports the theory of *psychological altruism* (henceforth ‘PA’) that informs the vast majority of decisions we make in response to others, but based on *p*-values associated with the prescribed model, I would argue we’re probably closer to Thomas Hobbes’s understanding of *psychological egoism* (henceforth ‘PE’), even though the admission of orthogonality subverts the totalitarianism and inflexibility inherent within PE.
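For readers who want to check the arithmetic, here is a minimal sketch in Python. It is not the derivation itself. It assumes the first value comes from 178 equally likely integer-angle planes, exactly one of which is orthogonal. It also infers that the second value corresponds to 359 additional integer rotations, because 178 × 359 = 63,902 and 1/63,902 matches the quoted figure. Both counts, and the function name, are assumptions made for illustration.

```python
from fractions import Fraction


def random_orthogonality_probability(*orientation_counts: int) -> Fraction:
    """Chance of randomly picking the single orthogonal orientation.

    Each argument is the ASSUMED number of equally likely integer-valued
    orientations along one rotation. Exactly one combination of
    orientations is taken to be orthogonal to the potential vector.
    """
    total = 1
    for count in orientation_counts:
        if count <= 0:
            raise ValueError("each orientation count must be positive")
        total *= count
    return Fraction(1, total)


# One rotation with 178 integer orientations (count assumed from p = 1/178).
p_single = random_orthogonality_probability(178)

# A second rotation with 359 integer orientations (count inferred, not stated).
p_double = random_orthogonality_probability(178, 359)

print(f"{float(p_single):.11f}")  # 0.00561797753
print(f"{float(p_double):.11f}")  # 0.00001564896
```

Using exact fractions keeps the two probabilities free of rounding until they are printed, which makes it easy to see that the second value is simply the first divided by the number of added orientations.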

One final thought explicates the obvious problem with our discussion to this point: There isn’t any way to quantify probabilities of potential outcomes based on events that haven’t yet happened, *even though* we know intuitively such probabilities, outcomes, and expectations exist. To be sure, the concept of altruism is palpably more philosophical or psychological or Darwinian than mathematical, but our model is successful in its attempt to provide a skeletal structure to a set of disembodied, intrinsic desires—to posit our choices are, far more often than they are not, means to ends (whether external or internal) rather than selfless, other-directed ends in themselves.

### Some Philosophical Criticisms

Philosophical inquiry concerning altruism is rich and varied. Aristotle believed the concept of altruism (the specific word was not coined until 1851 by Auguste Comte) was an outward-directed moral good that benefited oneself, the benefits accruing in proportion to the number of acts committed. Epicurus argued that selfless acts should be directed toward friends, yet he viewed friendship as the “greatest means of attaining pleasure.” Kant held for acts that belied self-interest but argued, curiously, they could also emerge from a sense of duty and obligation. Thomas Hobbes rejected the notion of altruism altogether; for him, every act is pregnant with self-interest, and the notion of selflessness is an unnatural one. Nietzsche felt altruistic acts were degrading to the self and sabotaged each person’s obligation to pursue self-improvement and enlightenment. Emmanuel Levinas argued individuals are not ends in themselves and that our priority should be (and can only be!) acting benevolently and selflessly towards others, an argument that fails to address the conflict inherent in engaging with a social contract where each individual is also a receiving “other.” (This is the problem with utilitarian-based approaches to altruism, in general.) Despite the varied historical analyses, nearly every modern philosopher (according to most accounts) rejects the notion of *psychological egoism* (the view that *every* act is driven by benefits to self) and accepts, as our model admits, that altruism does motivate a certain number of volitional acts. But because our model suggests very low *p*-values for PA, it seems prudent to address some of the specific arguments against a prevalent, if not unshirted, egoism.

1. Taking the blue pill: Testing for ‘I-desires’

Consider the following story:

Mr. Lincoln once remarked to a fellow passenger…that all men were prompted by selfishness in doing good. His [companion] was antagonizing this position when they were passing over a corduroy bridge that spanned a slough. As they crossed this bridge they espied an old razor-backed sow on the bank making a terrible noise because her pigs had got into the slough and were in danger of drowning. [M]r. Lincoln called out, ‘Driver can’t you stop just a moment?’ Then Mr. Lincoln jumped out, ran back and lifted the little pigs out of the mud….When he returned, his companion remarked: ‘Now Abe, where does selfishness come in on this little episode?’ ‘Why, bless your soul, Ed, that was the very essence of selfishness. I should have had no peace of mind all day had I gone on and left that suffering old sow worrying over those pigs.’ [Feinberg, Psychological Altruism]

The author continues:

What is the content of his desire? Feinberg thinks he must really desire the well-being of the pigs; it is incoherent to think otherwise. But that doesn’t seem right. Feinberg says that he is not indifferent to them, and of course, that is right, since he is moved by their plight. But it could be that he desires to help them simply because their suffering causes him to feel uncomfortable (there is a brute causal connection) and the only way he has to relieve this discomfort is to help them. Then he would, at bottom be moved by an I-desire (‘I desire that I no longer feel uncomfortable’), and the desire would be egoistic. Here is a test to see whether the desire is basically an I-desire. Suppose that he could simply have taken a pill that quietened the worry, and so stopped him being uncomfortable, and taking the pill would have been easier than helping the pigs. Would he have taken the pill and left the pigs to their fate? If so, the desire is indeed an I-desire. There is nothing incoherent about this….We can apply similar tests generally. Whenever it is suggested that an apparently altruistic motivation is really egoistic, since it [is] underpinned by an I-desire, imagine a way in which the I-desire could be satisfied without the apparently altruistic desire being satisfied. Would the agent be happy with this? If they would, then it is indeed an egoistic desire. if not, it isn’t.

This is a powerful argument. If one could take a pill (say, a tranquilizer) that would relieve the actor from the discomfort of engaging the pigs’ distress, which is the assumed motivation for saving the pigs according to the (apocryphal?) anecdote, and the actor would still choose to get out of the coach and save the pigs, then that volitional act must be considered genuinely altruistic because it is directed toward the welfare of the pigs and is, by definition, not driven by an “I-desire.” But this analysis makes two very large assumptions: (1) there is a singular motivation behind an act and (2) we can whisk away a proposed motivation by some physical or mystical means. To be sure, there could be more than one operative motivation for an action (say, avoiding discomfort *and* receiving a psychosocial reward), and the thought-experiment of a pill removing the impetus to act does not apply in all cases.

Suppose, for example, one only desires to avoid the pigs’ death and not the precursor of their suffering. Is it meaningful to imagine the possibility of a magical pill that could avoid the pigs’ death? If by the “pill test” we intend to eviscerate any and all possible motivations by some fantastic means, then we really haven’t said much at all. We’ve only argued the obvious tautology: that things would be different if things were different. (Note: the conditional **A** → **A** is *always* true, which means **A** ↔ **A** is, too.) Could we, for example, apply this test to our earlier coffee experiment? Imagine our protagonist could take a pill that would, by acting on neurochemical transmitters, magically satisfy her expectation and desire for being thanked for purchasing the coffee. Can we really say her motivation is now altruistic, presumably because the pill has rendered an objective “thank you” from the recipient unnecessary? In terms of our mathematical model, does the pill create a zero expectation vector? It’s quite difficult to imagine this is the case; the motivation (that is, the expectation of, and desire for, a “thank you”) is not eliminated merely because it is fulfilled by a different mechanism.

2. Primary object vs. Secondary possessor

As a doctor who desires to cure my patient, I do not desire pleasure; I desire that my patient be made better. In other words, as a doctor, not all my particular desires have as their object some facet of myself; my desire for the well-being of my patient does not aim at alteration in myself but in another. My desire is other-regarding; its object is external to myself. Of course, pleasure may arise from my satisfied desire in such cases, though equally it may not; but my desire is not aimed at my own pleasure. The same is true of happiness or interest: my satisfied desire may make me happy or further my interest, but these are not the objects of my desire. Here, [Joseph] Butler simply notices that desires have possessors – those whose desires they are – and if satisfied desires produce happiness, their possessors experience it. The object of a desire can thus be distinguished from the possessor of the desire: if, as a doctor, my desire is satisfied, I may be made happy as a result; but neither happiness nor any other state of myself is the object of my desire. That object is other-regarding, my patient’s well-being. Without some more sophisticated account, psychological egoism is false. [See Butler, J. (1726) Fifteen Sermons Preached at the Rolls Chapel, London]

Here, the author errs not in assuming pleasure can be a residual feature of helping his patients—it can be—but in presuming his desire for the well-being of others is a first cause. It is likely that such a desire originates from a desire to fulfill the Hippocratic oath, to avoid imposing harm, which demands professional and moral commitments from a good physician. The desire to be (seen as) a good physician, which requires a (“contrapositive”) desire to avoid harming patients, is clearly a motivation directed toward self. Receiving a “thank you” for buying someone’s coffee might create a feeling of pleasure within the actor (in response to the pleasure felt and/or exhibited by the recipient), but the pleasure of the recipient is not necessarily (and is unlikely to be) a first cause. If it were a first (and only) cause, then all the components of the expectation vector would be zero and the act would be considered altruistic.

Notice we must qualify that if-then statement with the word “only” because our model treats such secondary “I-desires” as unique components of the expectation vector. (“Do I desire the feeling of pleasure that will result from pleasing someone else when I buy him or her coffee?”) We will set aside the notion that an expectation of a residual pleasurable feeling in response to another’s pleasure is not necessarily an intrinsic desire. I can *expect* to feel good in response to doing X without desiring, or being motivated by, that feeling (this is the heart of the author’s argument), but if any part of the motivation for buying the coffee involves a desire to receive pleasure, even if the first cause involves a desire for the pleasure of others, then the act cannot truly be cataloged as altruistic because, as mentioned above, that desire must occupy a component within **q**. The issue of desire, then, requires an investigation into first-cause (i.e., “ultimate”) motivations, and the logical fallacy of Joseph Butler’s argument (against what is actually *psychological hedonism*) demands it.

3. Sacrifice or pain

Also taken from the above link:

A simple argument against psychological egoism is that it seems obviously false….Hume rhetorically asks, ‘What interest can a fond mother have in view, who loses her health by assiduous attendance on her sick child, and afterwards [sic] languishes and dies of grief, when freed, by its death, from the slavery of that attendance?’ Building on this observation, Hume takes the ‘most obvious objection’ to psychological egoism.…[A]s it is contrary to common feeling and our most unprejudiced notions, there is required the highest stretch of philosophy to establish so extraordinary a paradox. To the most careless observer there appear to be such dispositions as benevolence and generosity; such affections as love, friendship, compassion, gratitude. […] And as this is the obvious appearance of things, it must be admitted, till some hypothesis be discovered, which by penetrating deeper into human nature, may prove the former affections to be nothing but modifications of the latter. Here Hume is offering a burden-shifting argument. The idea is that psychological egoism is implausible on its face, offering strained accounts of apparently altruistic actions. So the burden of proof is on the egoist to show us why we should believe the view.

Sociologist Emile Durkheim argued that altruism involves voluntary acts of “self-destruction for no personal benefit,” and like Levinas, Durkheim believed selflessness was informed by a utilitarian morality despite his belief that duty, obligation, and obedience to authority were also counted among selfless acts. The notion of sacrifice is perhaps the most convincing counterpoint to overriding claims to egoism. It is difficult to imagine a scenario, all things being equal, where sacrifice (and especially pain) would be a desired outcome. It would seem that a decision to act in the face of personal sacrifice, loss, or physical pain would almost certainly guarantee a genuine expression of altruism, yet we must again confront the issue of first causes. In the case of the assiduous mother, sacrifice might service an intrinsic (and “ultimate”) desire to be considered a good mother. In the context of social-exchange theory, the asset of being (perceived as) a good mother outweighs the liability inherent within self-sacrifice. Sacrifice, after all, is what good mothers do, and being a good mother resonates more closely with the ideal self, as well as society’s coeval definition of what it means to be a “good mother.” In a desire to “do the right thing” and “be a good mother,” then, she chooses sacrifice. It is the desire for rectitude (perceived or real) and the positive perception of one’s approach to motherhood, not solely the sacrifice itself, that becomes the galvanizing force behind the act. First causes very often answer the following question: “What would a good [*insert category or group to which membership is desired*] do?”

What of pain? We can imagine a scenario in which a captured soldier is being tortured in the hope he or she will reveal critical military secrets. Is the soldier acting altruistically by enduring intense pain rather than revealing the desired secrets? We can’t say it is impossible, but, here, the aegis of a first cause likely revolves around pride or honor; to use our interrogative test for first causes: “Remaining true to a superordinate code is what [respected and honorable soldiers] do.” They certainly don’t dishonor themselves by betraying others, even when it’s in one’s best interest to do so. Recalling Durkheim’s definition, obedience (as distinct from the obligatory notion of duty) also plays an active role here: Honorable soldiers are required to obey the established military code of conduct, so the choice to endure pain might be motivated by a desire to be (seen as) an obedient and compliant soldier who respects the code rather than (merely) an honorable person, though these two things are nearly inextricably enmeshed. To highlight a relevant religious example, Jesus’ sacrifice on the cross might not be considered a truly altruistic act if the then-operative value metric privileged a desire to be viewed by the Father as a good, obedient Son, who was willing to sacrifice Himself for humanity, above the sacrifice (and pain) associated with the crucifixion. (This is an example where the general criticism of Durkheim’s “utilitarian” altruism fails; Jesus did not receive from His utilitarian sacrifice in the way mankind did.) These are complex motivations that require careful parsing, but there’s one thing we do know: If neither sacrifice nor pain can be related to any sort of intrinsic desire that satisfies the above interrogative test, then it probably should be classified as altruistic, even though, as our model suggests, this is not likely to be the case.

4. Self-awareness

Given the arguments, it is still unclear why we should consider psychological egoism to be obviously untrue. One might appeal to introspection or common sense, but neither is particularly powerful. First, the consensus among psychologists is that a great number of our mental states, even our motives, are not accessible to consciousness or cannot reliably be reported…through the use of introspection. While introspection, to some extent, may be a decent source of knowledge of our own minds, it is fairly suspect to reject an empirical claim about potentially unconscious motivations….Second, shifting the burden of proof based on common sense is rather limited. Sober and Wilson…go so far as to say that we have ‘no business taking common sense at face value’ in the context of an empirical hypothesis. Even if we disagree with their claim and allow a larger role for shifting burdens of proof via common sense, it still may have limited use, especially when the common sense view might be reasonably cast as supporting either position in the egoism-altruism debate. Here, instead of appeals to common sense, it would be of greater use to employ more secure philosophical arguments and rigorous empirical evidence.

In other words, we cannot trust thought processes in evaluating our motivations to act. We might think we’re acting altruistically—without any expectations or desires—but we are often mistaken because, as our earlier examples have shown, we fail to appreciate the locus of first causes. (It is also probably true, for better or worse, that most people prefer to think of themselves more highly than they ought—a process that better approaches exchange ideas of the ideal self in choosing how and when to act.) Jeff Schloss, the T.B. Walker Chair of Natural and Behavioral Sciences at Westmont College, suggests precisely this when he states that “people can really intend to act without conscious expectation of return, but that [things like intrinsic desires] could still be motivating certain actions.” The interrogative test seems like one easy way to clarify our subjective intuitions surrounding what motivates our actions, but we need more tools. Our model seems to argue that the burden of proof for altruism rests with the actor—“proving,” without resorting to introspection, one’s expectation vector really is zero—rather than “proving” the opposite, that egoism is the standard construct. Our proposed *p*-values based on the mathematics of our model strongly suggest the unlikelihood of a genuine altruism for a random act (especially for large *n*), but despite the highly suggestive nature of the probability values, it is unlikely they rise to the level of “empirical evidence.”

### Conclusion

Though I’ve done a little work in a fun attempt to convince you genuine altruism is a rather rare occurrence, generally speaking, it should be said that even if my basic conceit is accurate, *this is not a bad thing*! The “intrinsic desires” and (internal) social exchanges that often motivate our decision-making process (1) lead to an increase in the number of desirable behaviors and (2) afford us an opportunity to better align our actions (and ourselves) with a subjective vision of an “ideal self.” We should note, too, the “subjective ideal self” is frequently a reflection of an “objective ideal ([of] self)” constructed and maintained by coeval social constructs. This is a positive outcome, for if we only acted in accordance with genuine altruism, there would be a tragic contraction of good (acts) in the world. Choosing to act kindly toward others based on a private desire that references and reinforces self in a highly abstract way stands as a testament to the evolutionary psychosocial sophistication of humans, and it evinces the kind of higher-order thinking required to assimilate into, and function within, the complex interpersonal dynamic demanded by modern society. We should consider such sophistication to be a moral and ethical victory rather than the evidence of some degenerate social contract surreptitiously pursued by selfish persons.

References:

Bernard Murstein (Ed.). (1971). *Theories of Attraction and Love*. New York, NY: Springer Publishing Company, Inc.

## Does God Sneeze?

The well-known philosopher Colin McGinn writes a fascinating article in *Skeptic* that purports to disprove the existence of God—an ironic project considering I’ve already proven the existence of God mathematically in an earlier post. Nevertheless, in the spirit of inquiry, I offer a “Reader’s Digest” version of McGinn’s engaging (if unoriginal) argument followed by my analysis:

*If God exists, He must be omnipotent, and if He is omnipotent, He must perforce possess every power to engage in all actions and activities without reservation, even those trespasses that would prevent Him from being God (e.g., taking pleasure in the suffering of others, becoming ill, digesting food, etc.). So, either *(1)* God possesses the power to do everything—including the challenge of existing in any (fallen or sinful) state—or *(2)* God is not omnipotent because He can’t do everything. The former demands He must be (at least) morally imperfect, which means He cannot exist, while the latter suggests He cannot exist because omnipotence is a necessary condition for His existence. Therefore, God cannot exist.*

It seems to be a tidy, hermetically sealed argument, and it uses the simplest of logical structures; in fact, McGinn implements nothing more than a simple chain of *modus tollens* constructions:

P1: God exists

P2: God is omnipotent

P3: God is not limited in His actions or behavior

Contrapositive proof for ¬P1:

(1) P2 → P3

(2) ¬P3 (A)

(3) ¬P2 [(1),(2) MT]

(4) P1 → P2

(5) ¬P2 [(3)]

(6) ¬P1 [(4),(5) MT]

The assumption at line (2) is the linchpin of McGinn’s argument: Either God’s omnipotence will, by definition, allow Him to sin (or become schizophrenic or get cancer or forget Euclid’s proof for the infinitude of primes), or He’s not omnipotent at all. And because we can think of things God cannot do, we are forced to admit ¬P3, which then feeds the logical chain. Thus, McGinn argues, God cannot exist in either state—either as a limited or morally imperfect being—so we must reject any claim to His existence beyond *any* doubt.
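The validity of this chain (as distinct from its soundness, which rests entirely on the assumption at line (2)) can be checked by brute force over every truth assignment. The following is a minimal sketch in Python, not anything taken from McGinn's article; the helper names `implies` and `entails` and the lambda names are hypothetical, chosen only for illustration.

```python
from itertools import product
from typing import Callable, Sequence

# A formula maps truth values for (P1, P2, P3) to a single truth value.
Formula = Callable[[bool, bool, bool], bool]


def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def entails(premises: Sequence[Formula], conclusion: Formula) -> bool:
    """True if no assignment makes every premise true and the conclusion false."""
    for values in product([False, True], repeat=3):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False
    return True


line_1 = lambda p1, p2, p3: implies(p2, p3)  # (1) P2 -> P3
line_2 = lambda p1, p2, p3: not p3           # (2) not P3, the assumption
line_4 = lambda p1, p2, p3: implies(p1, p2)  # (4) P1 -> P2
not_p1 = lambda p1, p2, p3: not p1           # conclusion: not P1

print(entails([line_1, line_2, line_4], not_p1))  # True
print(entails([line_1, line_4], not_p1))          # False
```

The second call drops the assumption at line (2), and the conclusion no longer follows: the assignment where P1, P2, and P3 are all true satisfies both conditionals. That is one way to see why everything in the argument turns on whether ¬P3 should be granted.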

What’s most important about McGinn’s conceit is his definition of identity: Everything that exists must necessarily abdicate a finite number of features, attributes, and powers in order to assume a definitive state of being. It’s what we *don’t* have (and *can’t* do) that affirms us, which leads precisely to the internecine conflicts among omnipotence, human failing, and inviolable rectitude that galvanize the above formalism. God must be capable of sneezing if He’s omnipotent, but if He *can’t* sneeze (because God doesn’t get sick), then He can’t exist because there’s something He can’t do.

There are at least two standard counterarguments to McGinn’s analysis:

(A) A perfect God must necessarily act (and exist) according to His (perfect) nature

(B) An omnipotent God simply chooses to act in a morally perfect way

The former means God is not truly free to act, which allows McGinn to spring the trap of (2) above, and the latter is dismissed as an ontological error:

Nor can it be that God merely has the potential to do these things while never actually doing them: for first, to have even the potential is already to place God in the wrong ontological category; and second, if he were to exercise these powers that would immediately deprive him of his godlike status—he would become at best a godhuman hybrid (like Jesus). If God were to pick his nose one day, he would thereby cease to be God. So having that power [to act in such a way] is no part of his nature.

For McGinn, God cannot merely have the *potential* to act in violation of His nature; He *must* act in violation of His nature if He is truly omnipotent, yet any such violation perforce nullifies His existence.

McGinn commits an ontological error of his own, however, by placing God under an umbrella of binary anthropomorphism: God chooses between good and evil, perfection and imperfection, and righteousness and immorality as if He were subject to the strictures of the human condition. By doing so, McGinn makes the rest of his argument easy: establish a finite set of logic-based delimiters for God’s behavior and simply allow the necessary attribute of omnipotence to negate His existence when ineluctable conflicts arise. What McGinn *really* proves is that God cannot exist under the conditions he establishes for omnipotent beings, which is, of course, distinct from offering genuine proof of the nonexistence of God.

But what should replace McGinn’s binary approach to characterizing God? I’m not a philosopher, but it seems clear that a better way to understand God’s existence involves transcending the kind of limited, binary approach described in his article. If we accept McGinn’s general understanding of identity, then God’s existence is defined by what He is not. That is, God doesn’t ever arrive at a binary choice between, say, love and hate because hate is defined *as the absence of God*. (See 1 John 4:16 as an example.) Thus, such a binary choice is illusory because God’s nature transcends things that emerge from His absence.

This view suggests a more comprehensive understanding of God’s nature as *perfect*, as distinct from being *merely* perfect (i.e., God always makes the correct decision given the limits of a binary choice because He acts, or must act, perfectly). The ostensible dichotomy between light and darkness, for example, offers a similar illusion: Darkness is the *absence* of light; it cannot exist where light is present (beyond the trivial byproduct of shadows). This is why a flame casts no shadow when placed near a wall (see photo above): it exists, essentially, in its own ontological category. The nature of light transcends that of darkness.

If we can imagine a supernal being who operates “in the ontological nature of light,” then it will be much easier to see how a “choice” between light and darkness *within that nature* would simply not exist. If we extend this concept to God’s other attributes (e.g., goodness, faithfulness, etc.), we get closer to a thick description of God’s genuine nature that throws into sharp relief McGinn’s humanistic approach, an argument that simply demotes God’s ontological status in an effort to apply human logic to questions of deism.

“*Beware lest any man spoil you through philosophy and vain deceit, after the tradition of men, after the rudiments of the world, and not after Christ.*” ~ Colossians 2:8 (KJV)