The Critical Strip

"…in the pursuit of Wisdom, to the glory of God."

Book Review: God’s Grand Game


Philosopher Steven Colborne’s self-published book God’s Grand Game: Divine Sovereignty and the Cosmic Playground seeks to reconcile God’s sovereignty (e.g., omniscience, omnipotence, omnipresence, etc.) with human agency in the service of proposing a “Church of the Future.” (Apparently, each of the world’s major religions suffers from terminal flaws, and Colborne proposes what he believes to be the singular path forward.) It’s a Herculean task birthed from a seductive theological problem that has attracted many of history’s greatest minds, yet it doesn’t take long for readers to realize Colborne won’t provide an intellectual return on their investment. His nearly half-serious attempt fails at everything but contradiction, which makes it extraordinarily difficult to appreciate God’s Grand Game (GGG) as anything approaching a serious work of philosophical and theological exegesis. Most of the 45 chapters rehash the same basic concepts, and Colborne’s quotidian writing style feels less like an attempt to simplify complexities for the reader than a betrayal of some alarming vacuity. Think “self-published diary” rather than Summa Theologica. Still, I promised to write a blog review in return for a free copy, and I’m here to deliver.[1]

Colborne claims he is neither a Calvinist nor a pantheist {40}. I’ll take him at his word, though his worldview, as expressed in the book, aligns with those positions in incredibly intimate and multivalent ways. What he can’t deny, however, is his unshirted embrace of an uncompromisingly hard-determinist position that makes absolutely no room for anything else—not human agency, not mental creativity [2], not physical processes, and not even basic notions of perception {37, 49, 53}—and this belief motivates every one of his epistemological and ontological moves. The most charitable characterization of Colborne’s argument might be that we’re living in a supernal, spirit-based version of the Matrix, where an impersonal “God” informs and controls and imparts all experience and action, from our physical senses to the most nuanced mental states.

If defending fatalism were the entire project, however, then one might envision an alternate universe where GGG wasn’t so bad. Bring in Nick Bostrom, Daniel Dennett, Sam Harris, and John Searle, layered above a comprehensive overview of relevant historical thought (e.g., Aquinas, Augustine, etc.), and Colborne could have uncovered a substantive discussion concerning the intersection of free will, consciousness, and the potential for (either God- or computer-based) simulation. Instead, he fumbles for a prescription of theology that leads to many kinds of embarrassing errors, including faulty hermeneutics, straw-man attacks, confirmation bias, and circular reasoning. I won’t do a deep dive into every single non sequitur, but I hope readers will be largely (if not completely) persuaded by the following examples:

  1. God is omniscient, yet the future is uncertain until He decides what comes next {35}.
  2. Our “modes of mind” can influence the “modes” of others {55}, yet causation is a fallacy {168} and all mental states—including psychoses—are a function of God’s design {37}.
  3. The proof of randomness (i.e., the fact that we’re not in control of our own thoughts) is randomness (e.g., of choice, of word association, etc.) {59}.

Of course, there are straightforward (theo)logical problems as well. To wit:

  • God is the reification of both love and hate {25}
  • God controls Satan’s actions {87}, but Satan doesn’t exist
  • God causes us to sin and imparts to us sinful desires {51 ff., 74}
  • God created all the religions {79 f.}
  • God possibly forgets things {237, fn 2}

. . . and so forth. These are obviously absurd claims, yet Colborne has no choice but to admit them because his initial conditions demand their inclusion. It may be God’s “game,” but Colborne makes the rules.  

Perhaps the most significant deficit of GGG, aside from the disqualifying problems of circular reasoning and self-contradiction, is Colborne’s failure to engage in real theological debate when opportunities do arise. For example, he asserts the predestination doctrine of orthodox Calvinism simply because it aligns with his first principle of hard determinism. This means Colborne must ignore the myriad biblical scriptures that qualify salvific theology as an engagement with (and of) subjective will, while also failing to address the obvious fact that “preordainment” inexorably leads to the nullification of God’s existence.[3] This was a wonderful opportunity for Colborne to conjoin his theological and philosophical positions in a powerful and convincing way using a “real world” problem, and he punted. Of course, considering the above examples, one would expect Colborne to be unfazed by the problem of a God who makes people sin and then destroys them for sinning, and the reader is not disappointed. Examples like these are littered throughout the text.

One should certainly applaud Colborne for conquering the difficult process of writing and self-publishing a book, but in the end, GGG is internally inconsistent and commits the most fundamental errors in logic; unfortunately, that’s reason enough to dismiss it as a spasmodic, pseudointellectual exercise. There are other problems too (e.g., the complete lack of academic research and references, etc.) that give the reader the distinct impression this should be a very first (and very rough) draft rather than a pristine product ready for publication—though, even then, there should be cause for serious concern—but those issues would have been overlooked had Colborne crafted a logically consistent and compelling narrative.

In God’s Grand Game, we all lose.                 


[1] Numbers in square brackets indicate footnotes; those in curly brackets denote page numbers within the text.

[2] This is known as occasionalism, whose endgame is pantheism.

[3] According to Colborne, God chooses “one of the innumerable worlds” that “veils” preselected minds from knowing Him {75}. So, if God causes everyone to sin—because all human behavior is really God’s invasive action operating through us without our consent—and then punishes a preordained subset of those fully determined persons (i.e., the non-elect) with eternal damnation for ostensibly rejecting Christ (when they possessed no free will to do so), then God must be unjust. But if God exists, He must exist as a perfect, and perfectly moral, being, a state that requires He adjudicate justly and righteously. Therefore, God cannot exist.

Online Dating: Quarantine Edition


You did it. You really did it.

You finally succumbed to quarantine fever and used your credit card to pay the membership fee for

Of course, you’re skeptical about the whole “online dating” thing, but you’re optimistic and hopeful—and probably a little excited by the idea that true love might be (literally) just a mouse click away. Really, though, you’re just hoping the opportunity cost of spending $23.99/month will be worth it when you [sigh] finally meet that twenty-something, super-rich [insert famous celebrity here] look-alike who is (1) still single (no chance), (2) also a member of (never happen), and (3) of the opinion you’re more attractive than anyone else they could ever meet (highly unlikely). But you still play the lottery, so you believe anything is possible. (It’s an unfortunate consequence of our sociosexual evolution that callipygian gifts require payment in kind. I think Aristotle said that.)

Anyway, what now? Endless scrolling? Interminable thanks-but-no-thanks messages followed by hasty profile blocking? Wasted Friday and Saturday nights trying to get not-quite [said famous celebrity] 2.0 with the slightly unattractive aquiline profile to pay more attention to you than the comments on their Instagram selfies? There’s a better way. Say “no” to bad lighting and undercooked meat, and say “yes” to the dress… er, to the mathematics of probability optimization.



What if I told you we can use the power and beauty of mathematics to give you the best chance of finding the lust (er, love) of your life? It’s true. Let’s say you’re willing to look at a total of t profile pictures sent to you by’s ostensibly preternatural algorithms. By rejecting the first r pictures, you will maximize the probability of finding your “ideal match.” (Call this person x.) I know: you don’t believe it, but it really is that simple. So, what’s the value of r for a given t? Technically, the answer is r = te^{-1}, but we’ll get to that.

First, some ground rules:

  1. Once you pass on a profile pic, you can’t go back. That person is gone forever. [insert crying emoji]
  2. Once you choose the value for r, you must reject every person from the first to the rth.
  3. You must choose the first profile (after the rth) that’s better than all the others you’ve seen.
  4. You must choose the last profile you see if you haven’t chosen anyone to that point.
  5. Once you choose someone, they are guaranteed to accept.

So, what do we know? Well, unfortunately, if your ideal match happens to show up within the first r profiles, you’re sunk. Because of rules 1 and 2, the probability of picking x—assuming x happens to arrive within the first r profiles—is, well, zero. To optimize your chances of picking x, we need to find the optimal value of r. And to do that, we need to calculate the probability of x’s location as sends profile pics to your inbox.

Okay, you can’t pick anyone within the first r profiles, but what if the (r+1)st profile is your dream date x? You’ll pick that person for sure, right? So, the probability is 1. But the probability of the (r+1)st profile being your dream date is (gulp) the worst it could be: 1/t (assuming a uniform distribution of profile pics). We take the product of these values; that is, “the probability you’d choose the (r+1)st profile assuming that person is better than the other r profiles” multiplied by “the probability this person happens to be located at the (r+1)st position in’s algorithm.” That happens to be [drum roll, please] (1)(1/t) = 1/t. For large t, that’s not so good. It gets better, though.

What if x is the (r+2)nd profile in your inbox? Well, you wouldn’t pick x in this scenario unless the (r+1)st profile wasn’t better than all the previous r profiles; in other words, the highest-rated profile to that point (i.e., the moment you received the (r+1)st profile) was one of the previous r profiles (otherwise, you would’ve picked the (r+1)st profile). The probability that the highest-rated profile of the first (r+1) profiles arrived within the first r profiles you rejected is very high, r/(r+1), and the probability that x happens to arrive in the (r+2)nd position remains 1/t. This is really the tricky part of the whole concept, so do yourself a favor and make sure you get it straight.

So, the total probability of the (r+2)nd profile being your dream date is the product of the two probability values we already calculated, that is, t^{-1}\cdot r(r+1)^{-1}=r[t(r+1)]^{-1}. Probabilities for the (r+3)rd, (r+4)th, . . . , (t-1)st, and tth profiles are calculated similarly, and we simply sum the individual probability values:

\displaystyle p(r,t)=\frac{1}{t}+\frac{r}{t(r+1)}+\frac{r}{t(r+2)}+\cdots+\frac{r}{t(t-1)},

and after factoring out r/t, this simplifies to

\displaystyle p(r,t)=rt^{-1} \sum_{k=r}^{t-1} 1/k,

which is simply an r-multiple of the average of the individual probabilities that the jth candidate will be x, with j = r+1, r+2, \cdots, t. Now, if we’re going to give you the best chance of finding your ideal x, we need to optimize the value of r. In other words, giving you the best odds requires knowing how many profiles r you must be committed to rejecting given an arbitrary number of profiles t. This means we need to maximize p(r,t), and that requires p(r,t)>p(r-1,t) and p(r,t)>p(r+1,t) for arbitrary t. The trick is to substitute r-1 and r+1 into the above equation and solve the inequalities.

Taking the first case, we have p(r,t)>p(r-1,t). After substitution, we have

\displaystyle rt^{-1}\Big[r^{-1}+(r+1)^{-1}+\cdots + (t-1)^{-1}\Big]>
\displaystyle (r-1)t^{-1}\Big[(r-1)^{-1}+r^{-1}+(r+1)^{-1}+\cdots+(t-1)^{-1}\Big].

Multiplying by t and distributing r-1 into the first term on the RHS gives us (1)

\displaystyle r\Big[r^{-1}+(r+1)^{-1}+\cdots + (t-1)^{-1}\Big]>
\displaystyle 1+(r-1)\Big[r^{-1}+(r+1)^{-1}+\cdots + (t-1)^{-1}\Big].

Notice the bracketed expressions on the LHS and RHS are equal! Now, we only have to deal with the coefficients. Let’s do that. Subtracting the LHS from both sides leaves us with

\displaystyle 0>1-\Big[r^{-1}+(r+1)^{-1}+\cdots + (t-1)^{-1}\Big],

which, after rearranging, becomes

\displaystyle r^{-1}+(r+1)^{-1}+\cdots + (t-1)^{-1}>1.

If we substitute r+1 into inequality (1) above and follow a similar calculation, we arrive at the other inequality we need:

\displaystyle (r+1)^{-1}+(r+2)^{-1}+\cdots+(t-1)^{-1}<1,

yielding the final result (2):

\displaystyle \sum_{k=r+1}^{t-1} 1/k < 1 < \sum_{k=r}^{t-1} 1/k.

At this point, we can find the optimized value for r given an arbitrary t. Just make sure the above inequality still holds. For example, imagine sends you, say, seven profile pics. Then, we have \sum_{i=3}^{6} i^{-1} < 1 < \sum_{i=2}^{6} i^{-1}, and r = 2 because 19/20 < 1 < 29/20 holds.

What does this mean? Well, given a total pool of seven profiles from which to choose (adhering to the aforementioned restrictions), you would automatically reject the first two profiles—no matter who they were—and then choose the very next profile that was better than the first two you rejected. Note: To calculate r, we use the smallest number of terms (based on the value of t we chose) to satisfy both sides of the inequality in (2).

Here are some calculations using the above machinery:

t = 3,\,\, r = 1,\,\, p\approx 0.5
t= 8,\,\, r = 3,\,\, p\approx 0.41
t=10,\,\, r = 3,\,\, p\approx 0.40
t = 30,\,\, r = 11,\,\, p\approx 0.38
t= 50,\,\, r = 18,\,\, p\approx 0.374
t= 100,\,\, r = 37,\,\, p\approx 0.37
t= 1000,\,\, r = 368,\,\, p\approx 0.368
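For the curious, the table above can be reproduced with a short script built directly on inequality (2) and the formula for p(r, t). (This is a sketch; the function names are mine.)

```python
def optimal_r(t):
    # Smallest r satisfying inequality (2): sum_{k=r+1}^{t-1} 1/k < 1.
    r = 1
    while sum(1.0 / k for k in range(r + 1, t)) >= 1:
        r += 1
    return r

def success_probability(t, r):
    # p(r, t) = (r/t) * sum_{k=r}^{t-1} 1/k
    return (r / t) * sum(1.0 / k for k in range(r, t))

for t in (3, 8, 10, 30, 50, 100, 1000):
    r = optimal_r(t)
    print(t, r, round(success_probability(t, r), 3))
```

Note that the t = 7 case returns r = 2, matching the worked example above.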

So, if you decide to consider a pool of 50 profiles, you’d automatically reject the first 18 and choose the first profile better than any of those 18 you rejected. That will give you a 37.43% chance of finding your ideal love match. Sure, you have about a 63% chance of missing x using this strategy, but any other strategy you choose will decrease your odds (assuming you follow the rules).[1]
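And if you don’t trust the algebra, a Monte Carlo simulation of the t = 50, r = 18 strategy lands right around 37%. (A sketch: profile “quality” is modeled as a random permutation, with rules 1–4 applied literally.)

```python
import random

def trial(t, r, rng):
    # Profile quality is a random permutation of 0..t-1; t-1 is x, the ideal match.
    ranks = list(range(t))
    rng.shuffle(ranks)
    best_rejected = max(ranks[:r])       # best of the r automatic rejections
    for k in range(r, t):
        if ranks[k] > best_rejected:     # first profile better than everything before it
            return ranks[k] == t - 1     # success iff we landed on x
    return ranks[-1] == t - 1            # rule 4: stuck with the last profile

rng = random.Random(1)
trials = 100_000
wins = sum(trial(50, 18, rng) for _ in range(trials))
print(wins / trials)   # hovers around 0.374
```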

One thing we can’t fail to notice is that as t gets larger, our probability p begins to settle around 37%.[2] That’s not a coincidence. In fact, the above inequalities relate to (a bounded subset of) the harmonic series: \sum_{i=1}^{\infty} 1/i. All the denominators in our rational terms are consecutive positive integers that begin from r or r+1 and terminate at t-1. (Check this.) So, our work thus far can be recast as a function f:\mathbb{Z}^+\rightarrow\mathbb{Q} of our two variables r, t—defined by the RHS inequality—based on how we interpret the (upper and lower) Riemann sums of the integral of x^{-1}, the meshes of which are simply the areas given by our chosen subseries of H_n.

Taking the RHS, we have

\displaystyle \int_{r}^{t}\!\!x^{-1}\,\mathrm{d}x=\Big[\ln x\,\Big]_{r}^{t}=\,\ln\Big(\frac{t}{r}\Big)\,<\,\sum_{i=r}^{t-1} 1/i,

and the LHS is

\displaystyle \sum_{i=r+1}^{t-1} 1/i < \int_{r}^{t-1}\!\!x^{-1}\,\mathrm{d}x=\Big[\ln x\Big]_{r}^{t-1}=\ln\Big(\frac{t-1}{r}\Big).

Putting it all together, we have

\displaystyle \sum_{i=r+1}^{t-1} 1/i < \ln\Big(\frac{t-1}{r}\Big)<1<\ln\Big(\frac{t}{r}\Big)<\sum_{i=r}^{t-1} 1/i.

As t and r grow larger, however, we find that

\ln\,(t-1)r^{-1}\approx\ln\,tr^{-1}\approx 1,

and the last inequality above suggests we’re “squeezing” the value of 1 from both sides, yielding the final calculation:

\displaystyle \ln tr^{-1}\approx 1\,\,\longrightarrow\,\,tr^{-1}\approx e\,\,\longrightarrow\,\,r\approx te^{-1},

as claimed.

What’s great about this is that your chances don’t decrease the larger t is. Whether you’re willing to look through 15 profiles or 15,000, your optimized probability remains the same—about 37%. Also, the variable t doesn’t have to count profiles; it could involve, say, time. If you’ve allocated five months to find the best venue for your wedding, you’ll reject everything you see until (about) day 55 (≈ 150/e), at which point you’ll choose the first venue that’s better than every venue you’ve seen so far. Of course, venues aren’t like jilted dates: you can always circle back and choose a venue you’ve rejected, but the math works the same way given the assumptions.

So, there it is. You have a nearly 2/5 probability of finding your ideal date using the above approach. That’s pretty good, actually. It’s not as good as the probability the sun will rise tomorrow, but it’s a lot better than getting even a short-term run in the stock market.

Math even cares about your love life.



[1] Some strategies alter your chances by modifying our restrictions (e.g., selecting merely one of the best candidates, allowing proposal rejections, having full information about the candidates, incurring costs to passing, etc.). (As an example, one prominent design suggests stopping at e^{-1/2}, as opposed to 1/e.) Many people consider the classic optimization problem to be unrealistic for these reasons. Of course, this is silliness: The math just gives you the best chance of finding x; it doesn’t say anything about the chances of being accepted by x or living happily ever after with x or whether you would consider less optimal candidates y and z to be desirable replacements. That’s up to you, but p = 1/2e seems worth the rejection risk!

[2] This is why “no-information” optimal stopping is often referred to as “the 37% rule,” an algorithm originating in 1949 as Flood’s “fiancée problem”—recontextualized as “the secretary problem”—and later popularized by Martin Gardner in the February 1960 issue of Scientific American.


[1] B. Christian and T. Griffiths, Algorithms to Live By: The Computer Science of Human Decisions, New York: Henry Holt and Company, 2016.

[2] J. Billingham, Kissing the frog: A Mathematician’s Guide to Mating, Plus Magazine, 9 (2008) 1-3.

Physicists vs. Mathematicians (Part II)


Written by u220e

2020-03-26 at 11:07 am


Bon Appétit: Thanksgiving Edition


Informed readers living in the United States might be aware of the fact that the FDA publishes a regulatory guide, which enumerates the (ignominiously) acceptable limits of various “defects” found in domestic food products. “Defect,” of course, is a government-sponsored euphemism for a variety of food-based atrociousness—including fecal matter, insects and insect parts, insoluble organic material, mold, dirt, maggots, larvae, and rodent hairs.

One particularly egregious example involves that ubiquitous and delectable herb very often used during Thanksgiving feasts throughout North America: sage. The FDA allows an average of “200 or more insect fragments per 10 grams” of ground sage (about 14 teaspoons). At a(n average) rate of 20 fragments/gram, what’s the probability you consumed at least TEN insect parts had you prepared Martha Stewart’s traditional bread-stuffing recipe, which only uses about 0.7 grams of sage?

Fortunately, mathematics saves us from the uncertainty surrounding the number of insect parts we’re likely to consume, though we might not like the answer. Let X be the number of insect parts found in 0.7 grams of ground sage based on the average-rate limit set by the FDA. We assume a Poisson distribution with \lambda = (20)(0.7) = 14 and calculate the complement of the probability of eating at most nine insect parts: P(X\geq 10) = 1 - P(X\leq 9), which gives us

\displaystyle P(X\geq 10)=1 - \sum_{k=0}^9 \frac{e^{-14}14^k}{k!}\approx 1-0.109 = 0.891.

So, the probability of eating at least ten insect parts in a recipe that uses 0.7g of ground sage is a whopping 89%!

And just in case you assume you always avoid the added protein by lucking out as a member of the 11% minority, I have bad news for you: You’re all but guaranteed to eat at least three insect parts in 0.7g no matter what you do.

\displaystyle P(X\geq 3)=1 - \sum_{k=0}^2 \frac{e^{-14}14^k}{k!}\approx 1
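Both numbers are easy to check with a few lines of code. (A sketch using only the standard library; the 20 fragments/gram rate and 0.7g serving are the figures from above.)

```python
from math import exp, factorial

def poisson_cdf(k_max, lam):
    # P(X <= k_max) for X ~ Poisson(lam)
    return sum(exp(-lam) * lam ** k / factorial(k) for k in range(k_max + 1))

lam = 20 * 0.7                     # 20 fragments/gram * 0.7 g of sage = 14
p_at_least_10 = 1 - poisson_cdf(9, lam)
p_at_least_3 = 1 - poisson_cdf(2, lam)
print(round(p_at_least_10, 3))     # 0.891
print(round(p_at_least_3, 4))      # 0.9999
```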

Anybody want seconds?

Written by u220e

2019-11-30 at 7:48 pm

Physicists vs. Mathematicians


A physicist who meets the harmonic series for the first time

\displaystyle H_{\infty}=\sum_{i=1}^{\infty}i^{-1}=1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\cdots

might approach it by summing n terms in an attempt to get a feel for what’s happening as n\rightarrow\infty. Basic logic suggests the series will converge: Each term is getting smaller, and as n\rightarrow\infty, the individual terms we’re adding to the sum approach zero.

Think about it intuitively for a second.

Imagine I emptied an Olympic-sized swimming pool, handed you a one-liter bucket, and asked you to refill the pool by filling and emptying the bucket according to the terms of the harmonic series. That is, fill the bucket and dump it into the pool, then fill half the bucket and dump it into the pool, fill it a third of the way and dump it, and so forth.

Do you think you’ll ever fill the entire pool using this method?

It seems unfathomably unlikely you’d even fill a small fraction of the pool, let alone refill it to its full capacity. But this is why we cannot rely upon intuition to solve problems, and it’s why, in the end, I believe mathematics is superior to physics. The latter relies upon experimentation, the scientific process, estimation, probability, and some amount of intuitive guesswork. The former requires the kind of logical rigor in (dis)proving conjectures that leaves no doubt about its conclusions. The physicist will assiduously dump the first n = 250,000,000 buckets into the pool, only to realize she still has more than 2,499,980 liters to go. For H_n>100, our physicist will need to dump

\displaystyle n\approx e^{100-\gamma}\approx 1.509\times 10^{43}

buckets into the pool (here \gamma\approx 0.5772 is the Euler–Mascheroni constant). For H_n>10^3, she would need to add 1.10611511...\times 10^{434} buckets according to the successive terms of the series. Of course, while we must applaud the independently wealthy scientist who possesses the patience and dedication required to engage in such an arduous process, it’s not a practical way to solve problems (or fill pools).
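If you want to see just how hopeless the bucket brigade is, a few lines of code tell the story. (A sketch: the brute-force loop finds the first n with H_n > 10, and exp(100 − γ) inverts the approximation H_n ≈ ln n + γ.)

```python
from math import exp

GAMMA = 0.5772156649015329   # Euler–Mascheroni constant; H_n ~ ln(n) + GAMMA

# Brute force: smallest n with H_n > 10.
h, n = 0.0, 0
while h <= 10:
    n += 1
    h += 1.0 / n
print(n)                     # 12367 buckets just to get past H_n = 10

# For H_n > 100, direct summation is hopeless; invert the approximation instead.
print(exp(100 - GAMMA))      # ~1.5e43 buckets
```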


Mathematicians, however, have proven that the infinite sum of the harmonic series actually diverges; the proof given by Nicole Oresme (c. 1323–1382) is one of my all-time favorite proofs, and its utter simplicity and logical beauty deserve repeating here.

Proof. Grouping the terms, we have

\displaystyle H_{\infty}=\sum_{i=1}^{\infty}i^{-1}=1+\frac{1}{2}+\Big(\frac{1}{3}+\frac{1}{4}\Big)+\Big(\frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8}\Big)+\cdots,

which is greater than

\displaystyle S = 1+\frac{1}{2}+\Big(\frac{1}{4}+\frac{1}{4}\Big)+\Big(\frac{1}{8}+\frac{1}{8}+\frac{1}{8}+\frac{1}{8}\Big)+\cdots,

a sum that simplifies to

\displaystyle 1+\frac{1}{2}+\frac{1}{2}+\frac{1}{2}+\cdots = \infty.

Clearly, if the series S is smaller than H_{\infty} and S is divergent, then the harmonic series must be divergent.  \square
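Oresme’s grouping actually yields the explicit lower bound H_{2^k} ≥ 1 + k/2, which we can check numerically. (A minimal sketch.)

```python
def H(n):
    # nth harmonic number by direct summation
    return sum(1.0 / i for i in range(1, n + 1))

# Oresme's grouping implies H_{2^k} >= 1 + k/2 for every k >= 0.
for k in range(11):
    assert H(2 ** k) >= 1 + k / 2
print("Oresme's bound holds for k = 0, ..., 10")
```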

This means that, eventually, you’ll not only be able to fill the pool, but you’ll be able to fill an infinite number of pools an infinite number of times…and it only took a few seconds to figure it out.

No buckets involved.

Written by u220e

2019-11-11 at 3:23 pm

The Life of Pi


I still think it’s pretty cool that

\displaystyle \int_{-\infty}^{\,\infty}e^{-x^2}\,\,\mathrm{d}x = \sqrt{\pi},

where the area of an infinitely extended region equals a single, real-valued number. Of course, \sqrt{\pi} is irrational, which means, among other things, that we can never actually “write down” an exact value for it, and in that sense, it’s an intuitive equality: \sqrt{\pi} is “infinite” in a way that models the unbounded region under the Gaussian curve.

The Gaussian integral: f(x) = exp[-x^2]

Proof. Let G be the Gaussian integral. Then,

\displaystyle G^2 = \Big[\int_{-\infty}^{\,\infty}e^{-x^2}\,\mathrm{d}x\Big]^2\,\equiv\,\int_{-\infty}^{\,\infty}\int_{-\infty}^{\,\infty}e^{-(x^2+y^2)}\,\,\mathrm{d}x\,\mathrm{d}y.

This can be transformed into polar coordinates:

\displaystyle \int_{-\infty}^{\,\infty}\int_{-\infty}^{\,\infty}e^{-(x^2+y^2)}\,\,\mathrm{d}x\,\mathrm{d}y\,\,\equiv\,\,\!\!\int_0^{\infty}\!\!\int_0^{2\pi}\,e^{-r^2}\,r\,\mathrm{d}\theta\,\,\mathrm{d}r\,

because x^2 + y^2 = r^2. Thus, we have

\displaystyle 2\pi\!\int_0^{\infty}\!\!e^{-r^2}\,r\,\mathrm{d}r\,=\,-\pi\Big[e^{-r^2}\Big]_0^{\infty}=-\pi(0-1)=\pi.

And because G^2 = \pi, we know G=\sqrt{\pi}, as desired. \blacksquare
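Skeptical? A crude midpoint-rule approximation agrees with \sqrt{\pi} to many decimal places. (A sketch; the interval [−10, 10] and step count are arbitrary choices of mine, since the tails beyond that range are negligible.)

```python
from math import exp, pi, sqrt

# Midpoint rule on [-10, 10]; the tails beyond |x| = 10 are smaller than e^{-100}.
n = 200_000
a, b = -10.0, 10.0
h = (b - a) / n
total = h * sum(exp(-((a + (i + 0.5) * h) ** 2)) for i in range(n))
print(total)        # ~1.77245385
print(sqrt(pi))     # 1.7724538509055159
```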

Think about this for a second.

With only a few simple techniques, we’ve explicitly evaluated the area of an infinitely extended region (i.e., the region under G) without spending an infinite amount of time calculating an infinite number of definite integrals as the curve gets closer and closer to the x-axis.

Of course, the Gaussian integral isn’t the only instance where \pi makes an appearance. There are plenty of other examples involving infinite sums and products where \pi plays an indispensable role. Here’s a sample (involving only a fraction of Euler’s discoveries alone!):

\displaystyle \frac{\pi}{\sin s\pi}=\frac{1}{s} +\sum_{n=1}^{\infty}(-1)^n\Big(\frac{1}{n+s}-\frac{1}{n-s}\Big)

\displaystyle \pi\cot s\pi = \frac{1}{s}+\sum_{n=1}^{\infty}\Big(\frac{1}{n+s}-\frac{1}{n-s}\Big)




\displaystyle\frac{\sin x}{x}=\prod_{n=1}^{\infty}\Big(1-\frac{x^2}{n^2\pi^2}\Big)

\displaystyle 1+\frac{1}{2^{2k}}+\frac{1}{3^{2k}}+\cdots+\frac{1}{n^{2k}}+\cdots=\frac{(-1)^{k-1}2^{2k-1}B_{2k}}{(2k)!}\,\pi^{2k}
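The last identity is Euler’s closed form for \zeta(2k); for k = 1 and k = 2 it can be checked against partial sums directly. (A sketch; the Bernoulli numbers B_2 and B_4 are hard-coded.)

```python
from math import pi, factorial

B = {2: 1 / 6, 4: -1 / 30}   # Bernoulli numbers B_2, B_4 (hard-coded)

def zeta_even(k):
    # Euler: zeta(2k) = (-1)^(k-1) * 2^(2k-1) * B_{2k} * pi^(2k) / (2k)!
    return (-1) ** (k - 1) * 2 ** (2 * k - 1) * B[2 * k] * pi ** (2 * k) / factorial(2 * k)

# Compare against direct partial sums of 1 + 1/2^{2k} + 1/3^{2k} + ...
for k in (1, 2):
    partial = sum(1.0 / n ** (2 * k) for n in range(1, 100_001))
    print(k, zeta_even(k), partial)
```

For k = 1 this reproduces the famous \zeta(2) = \pi^2/6.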

Discovered or invented, that’s pretty awesome.

Sleepless in Seattle



Conjecture 1.1.  Let (a_n)\rightarrow b be a sequence of events, where b is bedtime. Then

\displaystyle\forall\epsilon >0\,\,\exists N\in\mathbb{N} : n\geq N\rightarrow |a_n - b| <\epsilon.

If \epsilon is sufficiently small and n\geq N is the event at which point I take my DHA supplements, then there exists a map f:\Re\rightarrow\textbf{M} from real-world events to memes M\in\textbf{M}, appropriately defined, such that I lose sleep.

Written by u220e

2019-10-04 at 8:42 am

Where Have All the (Good) Lawsuits Gone?


Question: What do Ed Sheeran, Led Zeppelin, Lady Gaga, and Katy Perry have in common?

Answer: They have all been accused of copyright infringement.[1]

It’s no surprise our particular brand of postmodernism has led us to embrace a significant amount of litigiousness (socioeconomic and otherwise), but what’s the impetus behind the sudden onslaught of copyright-infringement cases we see in the music industry? Is it simply a matter of finding a legal opportunity to make a quick buck at an artist’s expense, a calculated financial score within a tech- and service-heavy labor market that no longer has any real use for musicians (or their musical ruminations)? Avarice certainly cannot be dismissed, but maybe there are other possibilities.

“Gen-ed” music education has suffered a precipitous decline in recent decades, which means it’s not unreasonable to assume some artists likely don’t have the requisite compositional awareness to avoid crossing the legal boundaries prescribed by copyright law. (And some artists [read: singers] don’t even write their own material.) It’s also possible many musical artists and casual music fans (at least within the millennial and Gen-Z cohorts) are uninterested in any discographies prior to the year 2000. Should we expect twenty-somethings to recall sufficiently the melodic contours of “Eleanor Rigby,” “A Farewell to Kings,” or “Kashmir” in an attempt to avoid even the slightest melodic or harmonic evocations when in the throes of creative expression?

Perhaps we must resign ourselves to the frightening possibility that modern pop/rock music has reached a saturation point, a terminus where we’ve exhausted a sufficient number of the finite intelligible combinations (using, say, three or four diatonic chords and the most basic time signatures) such that it should now be considered inevitable one artist will sound like another within the same musical space.

Have we crossed some sort of musical Rubicon?

If we limit our compositional options to four diatonic triads—say, I, vi, IV and V—and their chord tones, there are only 3^4 = 81 possible melodies you could write.[2] That’s hardly a deep well of melodic variety. And if analysis (e.g., prolongational/Schenkerian theory, etc.) “reduces” melodic structures to these kinds of skeletal designs, there seems to be little hope of avoiding an infringement charge when expert witnesses are summoned to the scene of the crime. Yes, how one decorates the basic melodic structure inter alia with non-chord tones, suspensions, and (consonant support of) passing tones greatly increases the number of note-to-note possibilities—and this is significant when evaluating a legal threshold for infringement—but there are only so many ways to descend from c² to f¹ in the key of F major and still get Spotify streams. Someone needs to say it: Within a sufficiently simplified harmonic structure, your melody will sound a lot like something that’s already been written.
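For what it’s worth, the 3^4 = 81 count is easy to enumerate. (A sketch; the chord spellings in F major are my own illustrative choices, not taken from any analysis in the lawsuit.)

```python
from itertools import product

# Chord tones for I, vi, IV, V in F major (illustrative spellings)
chords = {
    "I":  ("F", "A", "C"),
    "vi": ("D", "F", "A"),
    "IV": ("Bb", "D", "F"),
    "V":  ("C", "E", "G"),
}

# One melody note per chord over a fixed I-vi-IV-V progression
melodies = list(product(*(chords[c] for c in ("I", "vi", "IV", "V"))))
print(len(melodies))   # 3^4 = 81 skeletal melodies
```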

So, what about the lawsuits?

In his entertaining and surprisingly-detailed-for-YouTube discussion of the Ed Sheeran lawsuit (see the first link in [1]), Adam Neely focuses on the harmonic similarities between the two songs in question: “Thinking Out Loud” and “Let’s Get it On.” There, he suggests the “iii” (mediant) chord in “Let’s Get it On” enjoys “relative dominant” function based on the text Harmony Simplified by Hugo Riemann (REE-mahn), whom Neely mistakenly refers to as Hugo “Reimann” (RYE-mahn).

This is incorrect.

An orthodox Riemannian functional analysis would understand the mediant in this context as having tonic function, prolonging the Eb-major tonic as the Leittonwechselklang (or “leading-tone change chord”) of I. Even the voice-leading taken from the lawsuit’s relevant musical example suggests such a hearing (Figure 1.1). A dominant-function analysis (i.e., T Dp) would suggest dominant prolongation beginning at chord 2, which, under other circumstances, might be a plausible hearing, but it’s simply an unreasonable claim in this case. So, a proper functional analysis of the I-iii-IV-V progression in “Let’s Get it On” would be\text{Eb}: T\,\,T\!\!\!\!\!\!\!<S \,D,which (barely) distinguishes itself from the D : T –> S D structure of “Thinking Out Loud.”[3] The progressions are as similar as they can be without being exactly the same.[4]

Figure 1.1: Roman-numeral analysis of both songs taken from the lawsuit

But why should this matter? After all, you can’t copyright a chord progression any more than you can copyright the color blue or the Poisson distribution.[5] Ideas, musical or otherwise, need room to breathe, and harmonic progressions have long been considered the canvas upon which musicians paint their artistic visions.[6] So, even if we admit Sheeran’s backing tracks to the verses are essentially identical to “Let’s Get it On,” is that enough to justify a 100-million-dollar judgment for the plaintiff?[7] As we consider other musical parameters when we listen to both excerpts (e.g., rhythm, meter, timbre, instrumentation, form, tempo), we must conclude that a reasonable listener would consider the musical expressions to be nearly identical, and the lawsuit’s language certainly makes that argument:

‘Thinking Out Loud’ copies various elements of ‘Let’s Get it On,’ including but not limited to the melody, rhythms, harmonies, drums, bass line, backing chorus, tempo, syncopation and looping. (2)

That’s quite a damning claim—and with respect to the backing tracks to the verses, the accusation seems quite justified—but in the process of dismissing the lawsuit as frivolous and overreaching, both Neely and Rick Beato argue the songs have “very different melodies,” almost suggesting melodic content should be paramount when assessing these kinds of infringement claims. I’ve already suggested melodic content might be more limited than we think, and engaging in a complete analysis of both melodic structures is beyond the scope of this (already-too-long) blog post, but it seems imprudent to dismiss any and all similarities (and, perhaps, err in adjudicating a legitimate lawsuit in the process) based solely on one’s inability to make deep connections beyond the foreground on a cursory hearing. The lawsuit suggests a number of structural similarities between melodies, and the reader can decide if those arguments have any merit.

Is it really the case, though, that melody is always the defining factor in identifying musical plagiarism? Let’s try an experiment. I used a PRNG to alter each note of the following melody (and its “harmonic” support) so that the melody itself is completely unrecognizable (and would be wholly immune to accusations of copyright infringement), but I kept the rhythmic and metric structures exactly the same as the original. Can you guess the song based solely on those details? (Comment with your guesses.)
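For readers who want to replicate the experiment, the scrambling step is a few lines of Python. (The melody below is a made-up placeholder, not the mystery song; the MIDI range and the seed are my own choices.)

```python
import random

def scramble_pitches(melody, seed=0):
    """Return a copy of `melody` with every pitch replaced by a random
    chromatic pitch (MIDI note numbers 48-84) while the rhythmic values
    are left untouched."""
    rng = random.Random(seed)  # a PRNG, seeded for reproducibility
    return [(rng.randint(48, 84), duration) for _, duration in melody]

# A hypothetical melody encoded as (MIDI pitch, duration-in-quarter-notes) pairs.
original = [(64, 1.0), (64, 1.0), (65, 0.5), (67, 1.5)]
scrambled = scramble_pitches(original)

# The rhythmic/metric skeleton survives; the pitch content does not.
assert [d for _, d in original] == [d for _, d in scrambled]
```

Swap in any (pitch, duration) list and the same trick applies: the rhythmic fingerprint is preserved exactly while the melody is obliterated.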

I hope this convinces you that, in some cases, melody isn’t everything. (And that’s why I think Spirit might have a claim against Led Zeppelin.)

What really bothers me about these lawsuits, though, is not their (ostensible?) frivolousness; it’s what they say about the current state of the arts and humanities in the United States. The NEA and NEH clamor for funding and spend a considerable amount of time trying to convince taxpayers that lionizing and privileging STEM knowledge will eventually sterilize us, leaving in its wake an eviscerated, money-centric existence without Whitman scholars to provide those elusive reasons to continue living. But, here, in high-profile copyright cases like these, the NEA, reified by the academic music-theory community, has a wonderful opportunity—really, an obligation—to step onto the big stage and justify its existence by properly adjudicating these legal disputes. Music theorists finally have a chance to transform their compendium of useless arcana into a definitive, practical application that can benefit society (think the birth of econometrics in the 1950s), yet, to this point, they have failed to do so. Theorists who care at all about their profession should be embarrassed by the fact that the outcome of this lawsuit (and others like it) will probably be determined by the kind of music-theory-for-dummies, paint-by-numbers analysis one finds in the legal complaint.

Too bad Ed Sheeran might have to pay the 100-million-dollar tab as a result.


[1] Rather than wade through all the dirty details of each lawsuit, you can familiarize yourself by watching this, this, this, and this. For the sake of expediency, I will refrain from taking exception to many of the details (historical or otherwise) that are communicated in the videos (e.g., Beato’s “line clichés,” etc.).

[2] That is, one chord tone for each harmony with each chord played successively—e.g., E/(C-E-G), C/(A-C-E), B/(G-B-D), and C/(C-E-G). One only need compare the opening of Leslie Bricusse’s “Candy Man” and Stephen Sondheim’s “No One is Alone” from the musical Into the Woods to be convinced of the mathematical limitations of melodic design.

[3] The neo-Riemannian theorist would analyze this as L(Eb+) = G-, where the “LPR group” recasts the Schritt-Wechsel group, which is isomorphic to the non-commutative dihedral group D12.

[4] Key choice is irrelevant when analyzing the functional or prolongational relationships between two harmonic progressions.

[5] Would it matter, from a legal perspective, if a chord progression was so unusual that the probability of two different artists composing it was well beyond chance?

[6] We can’t, as Neely does, dismiss Bach’s use of the contested harmonic progression while invoking the historicity of sixteenth-century imitation masses. Either music history is relevant or it’s not, and we can’t jettison Bach on anachronistic grounds while embracing Palestrina’s received practice of using preexisting material.

[7] Requirements for academic plagiarism would likely have already been met. Do we have any more control over the words we type into Microsoft Word (or LaTeX) than the notes we scribble on our staff paper?

Postscript: Bonus points to those who made the connection between the title of the post and VH’s 1982 track “Where Have All the Good Times Gone?” on Diver Down.

Dear Backstreet Boys


Written by u220e

2019-08-17 at 5:53 pm

Is God a Mathematician?


This is part deux to my post “Amazon’s Primes,” where I explore prime numbers in an effort to proselytize my belief that (most of) mathematics is discovered rather than invented. I’ve since stumbled upon Barry Mazur’s interesting 2008 paper “Mathematical Platonism and its Opposites,” in which he addresses this intellectual bifurcation—what he calls “The Question”—and I thought I’d briefly respond to his comments.

Mazur begins by framing the Platonic position as follows:

If we adopt the Platonic view that mathematics is discovered, we are suddenly in surprising territory, for this is a full-fledged theistic position. Not that it necessarily posits a god, but rather that its stance is such that the only way one can adequately express one’s faith in it, the only way one can hope to persuade others of its truth, is by abandoning the arsenal of rationality, and relying on the resources of the prophets.

At best, this is incomplete; at worst, it’s dismissive. Why can’t the universe possess an incredibly high degree of order/structure without the inevitable recourse to theism? Theists certainly believe God is the locus of such “rationality” (through design), as I do, but there are a number of mathematical concepts (like the organizing principle behind prime numbers) that don’t at all assume the existence of a supernal being. Either you can arrange a set of objects with a given cardinality c into subsets with equal cardinalities (less than c) or you can’t. That will be the case in any universe with discrete objects, which was a critical point of my initial post.[1]
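The arrangement criterion in the previous paragraph is, word for word, a primality test. A minimal Python sketch (the function names are mine):

```python
def divides_evenly(c, k):
    """Can c objects be arranged into groups of exactly k objects each?"""
    return c % k == 0

def is_prime(c):
    """c is prime iff no group size strictly between 1 and c works."""
    return c > 1 and not any(divides_evenly(c, k) for k in range(2, c))

# The primes below 20, recovered purely from the grouping criterion:
assert [n for n in range(2, 20) if is_prime(n)] == [2, 3, 5, 7, 11, 13, 17, 19]
```

Nothing in the criterion appeals to notation or to a particular axiom system—only to whether equal-sized groupings of discrete objects exist.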

Mazur continues with his “Do’s and Don’ts for future writers promoting the Platonic…persuasions”:

One crucial consequence of the Platonic position is that it views mathematics as a project akin to physics, Platonic mathematicians being—as physicists certainly are—describers or possibly predictors—not, of course, of the physical world, but of some other more noetic entity. Mathematics—from the Platonic perspective—aims…to come up with the most faithful description of that entity. This attitude has the curious effect of reducing some of the urgency of…rigorous proof. Some mathematicians think of mathematical proof as the certificate guaranteeing trustworthiness of, and formulating the nature of, the building-blocks of the edifices that comprise our constructions.

Without proof: no building-blocks, no edifice. Our step-by-step articulated arguments are the devices that some mathematicians feel are responsible for bringing into being the theories we work in. This can’t quite be so for the ardent Platonist, or at least it can’t be so in the same way that it might be for the non-Platonist. Mathematicians often wonder about…the laxity of proof in the physics literature. But I believe this kind of lamentation is based on a misconception, namely the misunderstanding of the fundamental function of proof in physics. Proof has principally…a rhetorical role: to convince others that your description holds together, that your model is a faithful re-production, and possibly to persuade yourself of that as well.

I’m not sure why Mazur thinks the platonist position would vitiate an understanding of, and urgency for, rigorous proof. In fact, I’d argue he has it reversed. Rigorous proof (as opposed to the mere descriptions of physics, which are entirely motivated by, and subject to, the scientific method) is precisely the thing that bolsters the Platonic view of mathematics as being “out there” and not “in us.” I don’t believe there is an infinite number of primes because there exists some computer that’s been spitting out ever-larger primes for the last decade.[2] That would constitute the kind of “rhetorical proof” Mazur attributes to physics. No, we mathematicians believe there’s an infinite number of primes because we have a rigorous proof of that claim, and we can confidently power down the computer because it will never, ever find a “largest prime.”[3]

Mazur concludes by arguing that:

in the hands of a mathematician who is a determined Platonist, proof could very well serve primarily this kind of rhetorical function…and not…have the rigorous theory-building function it is often conceived as fulfilling. My feeling, when I read a Platonist’s account of his or her view of mathematics, is that unless such issues regarding the nature of proof are addressed and conscientiously examined, I am getting a superficial account of the philosophical position, and I lose interest in what I am reading. But the main task of the Platonist who wishes to persuade non-believers is to…communicate an experience that transcends the language available to describe it. If all you are going to do is to chant credos synonymous with “the mathematical forms are out there”—which some proud essays about mathematical Platonism content themselves to do—well, that will not persuade.

This is a fair point and one, I believe, I successfully sidestepped in my initial post. There, I tried to communicate the notion of primality in a way that transcends the kind of man-made hieroglyphics and symbolic logic that inevitably emerge with formal definitions of mathematical concepts.[4] But the larger question remains: Why should the Platonist position view formal proof as a “rhetorical function” at all? Can’t we pursue “rigorous theory-building functions” while also realizing such a pursuit is synonymous with the process of uncovering (and not merely describing) God’s design, nature’s inherent structure? We can, but, more than that, I believe an edifice of theory-building that rests upon an immanent Platonic framework does stake a stronger claim. If theory-building were nothing more than the progeny of (human-)contrived logic, I think we would feel much less secure in what we know; for example, could we ever be certain \sqrt{2} is genuinely “irrational,” however we chose to define that term, or does our subjective and myopic system of logic create a mirage of knowledge that wouldn’t necessarily be true in every (or any) other universe?

Perhaps the best argument for mathematical Platonism (MP) involves what might follow from the argument against it: If mathematics is, in fact, invented, can we objectively prove anything? Can we imagine a universe where, say, Peano’s axioms or ZF(C) set theory didn’t apply? Can we imagine a universe where something isn’t equal to itself? Where equality isn’t transitive with respect to the natural numbers—i.e., if x = y and y = z, then x = z for natural numbers x, y, z—or where the union of two non-empty sets A and B doesn’t equal some set C containing the elements A\cup B? What’s the likelihood we’ve designed the most perfect logical infrastructure for mathematics that also makes it impossible for us to differentiate between objective Truth and self-referential consistency? How do we explain that “‘the enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious’ where ‘there is no rational explanation for it'”?[5] The Platonic atheist seems to have two choices: (1) MP exists because an infinite multiverse demands the existence of at least one universe with such objective logical consistency, or (2) MP exists because the universe is a mathematical construct. An infinite multiverse bears the burden of being far too permissive—everything that might exist must exist—and the idea of living in a mathematical construct, which is Max Tegmark’s controversial proposition, seems impossible to embrace with much enthusiasm.

But what about the primes? I said we have a rigorous proof for the infinitude of primes, but what does that proof look like? Can we see anything in its mechanics that might suggest MP is a false assertion, that it’s nothing more than some sleight-of-hand constructed entirely from man-made logical principles?

You be the judge.

Theorem: There exists an infinite number of primes.

Proof (Euclid): Assume the set of primes \mathbb{P} is finite with p\in\mathbb{P} the largest prime. Define Q = p! + 1. Because every positive integer greater than 1 has a prime factor, Q must have a prime factor; call it q. (Note that Q\in\mathbb{Z}^+ because the positive integers are closed under multiplication, so p!\in\mathbb{Z}^+.) Thus, we have Q = qk for some positive integer k. If q\leq p (as must be the case if p is the largest prime), then q\,|\,p! (because p! contains all the primes up to p as factors), which means p! = qr for some positive integer r. Substituting for Q and p!, we have 1 = Q - p! = q(k - r). Because q and (k - r) are positive integers whose product equals 1, we must have q = k - r = 1, and q = 1 contradicts the claim that q is prime. So q > p, and because p was chosen to be the largest prime in the finite set of primes \mathbb{P}, it follows that there is no largest prime. \blacksquare
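As a numerical sanity check on the proof’s central step—every prime factor of p! + 1 must exceed p—we can verify it directly for small primes:

```python
import math

def smallest_prime_factor(n):
    """Trial division: return the smallest prime factor of n > 1."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n  # n itself is prime

# Euclid's argument predicts q > p for every prime factor q of p! + 1.
for p in [2, 3, 5, 7, 11]:
    q = smallest_prime_factor(math.factorial(p) + 1)
    assert q > p  # e.g., 5! + 1 = 121 = 11 * 11, and 11 > 5
```

The check is rhetorical in Mazur’s sense; the proof above is what guarantees it never fails.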

It is the glory of God to conceal a thing: but the honor of kings [is] to search out a matter.” ~ Proverbs 25:2 (KJV)


[1] The following quote from Max Tegmark is quite apropos: “Think of mathematical symbols as mere labels without intrinsic meaning. It doesn’t matter whether you write, ‘Two plus two equals four,’ ‘2 + 2 = 4,’ or ‘Dos más dos es igual a cuatro.’ The notation used to denote the entities and the relations is irrelevant; the only properties of integers are those embodied by the relations between them. That is, we don’t invent mathematical structures—we discover them, and invent only the notation for describing them.” (Our Mathematical Universe)

[2] This is precisely what the GIMPS project does—for Mersenne primes, at least.

[3] Actually, there are many proofs of the infinitude of primes.

[4] I later realized my discussion of prime numbers might have suggested we build a bridge between mathematical Truth and physical processes (e.g., arranging physical objects), and that’s not my position at all. (That is the project of physics.) Sometimes, however, it’s convenient to reference the physical world as a way to access certain abstract mathematical ideas (e.g., rigid n-gon rotations in group theory, etc.).

[5] Max Tegmark quoting Wigner in Our Mathematical Universe (355).


The Harlem Disktrotters (Est. 2019)


I met a flat-earther (FE) for the first time about a year ago during an informal holiday gathering, and I really didn’t give it much thought at the time. I simply dismissed it as the biased reasoning of someone who seemed predisposed toward conspiracy theories. I have only recently, however, realized the flat-earth position (FEP) has, at some surreptitious point, morphed into a fully fledged global movement (pun intended). In service of understanding that movement, then, I spent some time watching a variety of YouTube debates and reading various material on the Internet. What I’ve discovered is that the FEP has a ready-made (if insufficient) defense for almost every conceivable pro-sphere counterpoint, some of which involve the following topics:

  • curvature at the horizon
  • rotation of the constellations
  • lunar eclipses
  • changing lengths and angles of shadows
  • satellite and telescopic photographs
  • issues of perspective
  • cosmological consistency
  • momentum and inertia
  • non-Euclidean geometry

But there’s one thing FEs can’t really refute: gravity. And a significant part of their inability to do so involves our subjective, corporeal interaction with it. It’s easy to dismiss photographs of a spherical earth as dissembling “composites” that evince a global conspiracy because—once we decide everyone is in on the cover-up—it’s a claim that’s impossible to disprove. Think what you want about the principle of Popperian falsification, but it’s still the best game in town for differentiating between legitimate scientific inquiry and an intellectual hatchet job.

So, what about gravity? Imagine we take an infinitely thin cross-section of the spherical earth so we can operate in two dimensions. Recall from differential geometry that the unit normal vector N = N(t) at some point P = P(t) on a smooth curve at parameter t is defined as

\displaystyle \text{\bf{N}} = \frac{d\text{\bf{T}}/ds}{|d\text{\bf{T}}/ds|}=\frac{1}{\kappa}\frac{d\text{\bf{T}}}{ds}\,\,\,\Longrightarrow\,\,\, d\text{\bf{T}}=\kappa\text{\bf{N}}ds

where \kappa is the curvature (whose reciprocal is the radius of curvature). If T is defined as \cos\phi\,\text{\bf{i}} + \sin\phi\,\text{\bf{j}}, then

\displaystyle \frac{d\text{\bf{T}}}{ds}=\frac{d\text{\bf{T}}}{d\phi}\frac{d\phi}{ds}=\Big(-\sin\phi\text{\bf{i}}+\cos\phi\text{\bf{j}}\Big)\frac{d\phi}{ds},

and N is the unit vector orthogonal to T (i.e., \text{\bf{T}}\cdot\text{\bf{N}}=0) that points to the concave side of the curve. Look at figure 1. The vector \vec{s_t} (red line) is tangent at point D to our imagined cross-section of the earth centered at S, and the vector \vec{s_n} is the normal vector at D.[1]

Figure 1: Tangency and the normal vector

The normal vector N models how gravity acts upon objects resting on the earth’s surface; that is, the mass of the earth “pulls” objects to the center of mass/gravity along the vector that is orthogonal to a given tangency point (for some T).[2] This is why we experience gravity exactly the same way no matter where we are on the planet. Any point we occupy at a given time—say, standing in front of the Barnes & Noble in Bayside, Queens (40.7805° N, 73.7764° W)—will always be a point of tangency such that there will exist a normal vector that represents the direction of the orthogonal pull of gravity toward the center of the earth.
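We can verify numerically that the normal vector produced by the formula above always points from a surface point straight back at the center, which is exactly the behavior the gravitational model requires. A small Python sketch (the radius value is merely illustrative):

```python
import math

R, S = 6371.0, (0.0, 0.0)   # an idealized spherical-earth cross-section (km)

def unit_normal(phi):
    """N for the circle P(phi) = S + R(cos phi, sin phi).
    With T = (-sin phi, cos phi), dT/dphi = (-cos phi, -sin phi) and
    ds = R dphi, so kappa = 1/R and N = (-cos phi, -sin phi)."""
    return (-math.cos(phi), -math.sin(phi))

# At any surface point, N points from P straight back at the center S:
for phi in [0.0, 1.0, 2.5]:
    P = (S[0] + R * math.cos(phi), S[1] + R * math.sin(phi))
    N = unit_normal(phi)
    # P + R*N lands on S (up to floating-point error)
    assert abs(P[0] + R * N[0] - S[0]) < 1e-9
    assert abs(P[1] + R * N[1] - S[1]) < 1e-9
```

Any point of tangency you pick—Bayside, Queens included—yields a normal vector aimed at the center of mass.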

Figure 2: Gravity acting upon a disk

This would not be the scenario if the earth were a flat disk. On a disk, we would only experience gravity as we usually do if we were located at the disk’s center. Why? Because that is the only place (on a disk) where the gravitational pull would involve a vector orthogonal to the tangent plane (i.e., the earth’s surface).[3] As soon as we begin moving away from the center, toward the edge of the disk, the center of gravity begins pulling at an angle that is not orthogonal to the tangent plane, and the further we get from the center, the greater the angle. See figure 2. As we traveled to the edge of the disk, gravity—at least based upon our usual expectations—would become almost nonexistent.

There are many examples; here are a few. Balls thrown vertically into the air would travel toward the center of the disk—not up and down—and, using the same force, they would travel further the closer we got to the edge of the disk. Trees growing beyond the center of the disk would grow at angles (negative gravitropism) based on the deviation from orthogonality.[4] Even walking toward the edge of the disk would involve an angular pull of gravity on our bodies, as if one were walking up a hill whose incline kept increasing; eventually, if we could traverse the edge of the disk and stand on its side, we’d experience gravity as we normally do because, at that point, \theta=0 and the gravitational pull would, again, be orthogonal to our position. Of course, none of this is at all what we experience—at baseball games, hiking through the forest, or walking around on the coast of the Bering Sea. We always experience gravity as a (normal-vector) force that’s “pulling us” (essentially) straight down into the earth. Why?
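The linear model from note 4 makes the thought experiment concrete. A minimal Python sketch (the disk radius is invented purely for illustration; I read the note’s \theta as the angle between the gravitational pull and the disk’s top surface):

```python
import math

def pull_angle(d, r):
    """Angle (radians) between the gravitational pull and the disk's top
    surface, per the linear model of note [4]: pi/2 at the center
    (straight down), 0 at the rim (pull parallel to the surface)."""
    return math.pi / 2 - math.pi * d / (2 * r)

r = 20_000.0  # a made-up disk radius in km, purely for illustration

assert math.isclose(pull_angle(0, r), math.pi / 2)          # center: normal pull
assert math.isclose(pull_angle(r, r), 0.0, abs_tol=1e-12)   # rim: no downward pull
# Halfway out, gravity already pulls at 45 degrees to the surface:
assert math.isclose(pull_angle(r / 2, r), math.pi / 4)
```

On a disk, in other words, “down” would be a function of your distance from the center—which is not anyone’s experience.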

Because the earth is a sphere.

FEs must have realized gravity is a significant problem for their worldview because they already fabricated a solution: eliminate it altogether. As explained by “6or1/2Dozen” on the Flat Earth Society website:

In most flat earth models, the force perceived as gravity is the disc of the planet being accelerated upwards by [a] force know [sic] as the Universal Accelerator (UA). The Universal Accelerator, as the name might imply, accelerates universally, so that the Sun, Moon and planets accelerate at the same rate as the disc of the earth. [There is also an infinite plane model in which gravity is gravity, but requires the Earth to be an infinite plane with infinite mass]. Please note that although it is called the Universal Accelerator, it does not actually accelerate things universally, objects like people, trees, small rocks and cheese are exempt.

Thus, to circumvent the issue of orthogonality we’ve been discussing, FE proponents simply jettison the inconvenience of gravity—Newton and Einstein be damned!—and replace it with a nebulous and problematic concept of a “Universal Accelerator” (UA) where the entire solar system (i.e., the firmament under the dome that covers the disk) is “accelerating” at the same rate.[5]

There are, of course, many objections to the UA theory. As mentioned in note 5, a constantly increasing acceleration means the earth will surpass the speed of light, which is impossible to reconcile with any form of reality. And if the velocity of the firmament is constant (i.e., dv/dt = 0), then the force that mimics gravity would be equal at all points on the disk (indeed, with zero acceleration, there would be no apparent gravity at all), and objects would always fall at the same rate. Unfortunately, this is not what we experience. Balls dropped from the same height—say, at the top of a mountain—will, at any given moment, enjoy different velocities depending on when they were released. For example, the equation for the velocity of an object in freefall is v = gt, where t is time (in seconds). If Ball A falls for three seconds, it will have a velocity of (9.8(3) =) 29.4 m/sec. If Ball B is dropped two seconds after Ball A, then at that same moment (t = 3 for Ball A, t = 1 for Ball B), Ball B has a velocity of only (9.8(1) =) 9.8 m/sec. So, either acceleration is zero (i.e., velocity is constant), which means objects in freefall should fall at the same rate (when they don’t), or the disk is traveling faster than the speed of light (yet our clocks are still moving).[6]
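The freefall arithmetic is worth making explicit:

```python
g = 9.8  # m/s^2

def freefall_velocity(t):
    """Velocity after t seconds of freefall (v = g*t), ignoring drag."""
    return g * t

# Ball A has fallen for 3 s; Ball B, dropped 2 s later, for only 1 s.
v_a, v_b = freefall_velocity(3), freefall_velocity(1)
assert abs(v_a - 29.4) < 1e-9   # 29.4 m/s, as in the text
assert abs(v_b - 9.8) < 1e-9    # 9.8 m/s
assert v_a != v_b  # incompatible with a constant-velocity "accelerator"
```

Two simultaneous measurements, two different velocities—exactly what a constant-velocity firmament cannot produce.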

But even if we found a way to dismiss orthodox notions of gravity without abandoning experimental observations involving objects in freefall, we would be unable to reconcile UA with another incontrovertible fact: gravity is inversely proportional to the square of the distance from the center of mass, so gravity is stronger (i.e., weight increases) the closer we get to sea level. You weigh more in front of the Barnes & Noble in Bayside than you do on the top of Mt. Everest, and you weigh more shopping in Alert, Nunavut than you would sipping an Americano in Fortaleza, Brazil. If UA were true, even if the FEP could account for differing freefall velocities, weight measurements would be uniform everywhere on the disk—but, again, this is completely belied by real-world experiments.


[1] The illustration shows the normal vector pointing to the convex side of the curve, but we can easily see it points inward—to the concave side—as well.

[2] It is true the earth is not completely spherical, but the deviations from Platonic sphericality (and the density of other matter that influences the direction of the pull of gravity, etc.) do not produce an effect great enough to appreciate the change.

[3] There would exist a normal vector if we could stand “on the edge” of the disk (e.g., the side of a coin), but we exclude that possibility as FE proponents claim we can never travel beyond the “edge” of the (tangent plane of the) earth.

[4] The equation for the angle of deviation might look something like\theta=\pi /2 - \pi d(2r)^{-1},where d is the distance traveled from the center of the disk to its edge and r is the radius of the disk.

[5] It’s unclear what the author means by “acceleration,” but one must logically assume the derivative of the disk’s velocity is a constant (dv/dt = a for some a > 0). Otherwise, a constant(ly increasing) acceleration would mean the disk would eventually surpass the speed of light, and, according to special relativity, our clocks would eventually stop and we would (theoretically) travel back in time. No FEs can tell us what is causing this acceleration.

[6] If there are roughly 10^{9.4988} seconds in 100 years, then the disk would be traveling at a velocity of (roughly) 10^{10.49} meters per second (assuming the disk began accelerating from rest on 25 March 1919 at 1:09 PST), and we know the speed of light is (roughly) 10^{8.47682} meters per second.
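The figures in this note are easy to verify (assuming 365-day years and a constant acceleration of g from rest, as the note does):

```python
import math

g = 9.8                                        # assumed constant acceleration (m/s^2)
seconds_per_century = 100 * 365 * 24 * 3600    # ~10^9.4988 s (365-day years)
c = 2.998e8                                    # speed of light (m/s), ~10^8.4768

v = g * seconds_per_century                    # v = a*t, starting from rest

assert math.isclose(math.log10(seconds_per_century), 9.4988, abs_tol=1e-3)
assert math.isclose(math.log10(v), 10.49, abs_tol=0.01)
assert math.isclose(math.log10(c), 8.4768, abs_tol=1e-3)
assert v > c   # under UA, the disk would overtake light in under a year (t = c/g)
```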

It is he that sitteth upon the sphere [חוּג] of the earth, and the inhabitants thereof are as grasshoppers; that stretcheth out the heavens as a curtain, and spreadeth them out as a tent to dwell in: That bringeth the princes to nothing; he maketh the judges of the earth as vanity. ~ Isaiah 40:22-23 (KJV)


Let Them Eat Pseudoscience


In a now-(in)famous paper published in the 313th volume of the prestigious journal Science, Dmitri Tymoczko (DT) makes the startling claim that the Möbius strip (MS) represents the topology (i.e., the “fundamental shape”) of representatives of dyad set-classes (i.e., all the types of two-note “chords” you can play on the piano). Unfortunately, he goes one step further and suggests the MS represents a sort of Platonic mathematical truth about dyadic structures in general.

This is absurd.

From page 2 of DT’s paper in Science:

I now describe the geometry of musical chords. An ordered sequence of n pitches can be represented as a point in R^n. Directed line segments in this space represent voice leadings. A measure of voice-leading size assigns lengths to these line segments….To model an ordered sequence of n pitch classes, form the quotient space (R/12Z)^n, also known as the n-torus T^n. To model unordered n-note chords of pitch classes, identify all points (x_1, x_2,…,x_n) and (x_σ(1), x_σ(2),…,x_σ(n)), where σ is any permutation. The result is the global-quotient orbifold T^n/S_n, the n-torus T^n modulo the symmetric group S_n.

It should be clear, even by a cursory reading of the above passage, that the geometry of the quotient orbifold is induced by a predetermined precondition of (maximal) parsimony—as well as octave equivalence and tunings that privilege an equal division of the octave—a feature reified by the directed edges whose “lengths” represent voice-leading distances in \mathbb{R}^n.[1] “Points” of unordered sets of pitch classes will perforce be proximate to other “points” of unordered sets of pitch classes whose distances involve minimal voice-leading perturbations. The MS (fig. 1) emerges from the decision to privilege parsimonious voice-leading principles (as a function of log-frequency) in organizing the point lattice in Euclidean space.

Figure 1: A Möbius-strip topology built from representatives of dyad classes in R^2
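The S_2 identification behind figure 1 is easy to compute directly. Here’s a minimal Python sketch (my own illustration, not DT’s code) that builds the 12×12 lattice of ordered pitch-class dyads and collapses it modulo S_2; it counts points only and says nothing about the Möbius twist, which comes from the geometry of the edge identifications:

```python
from itertools import product

# Ordered dyads live on the 12x12 torus T^2 (pitch classes mod 12).
ordered = list(product(range(12), repeat=2))      # 144 points

# Passing to T^2/S_2 identifies (x, y) with (y, x): keep one representative.
unordered = {tuple(sorted(p)) for p in ordered}   # 78 points

assert len(ordered) == 144
assert len(unordered) == 12 * 13 // 2  # 78 = 66 distinct dyads + 12 "unisons"
```

The 78 surviving points are the unordered dyads: 66 genuine two-note “chords” plus the 12 doubled unisons that end up on the strip’s boundary.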

It is this predetermined requirement of parsimony in constructing the quotient orbifold to which I object because it represents inter alia something of a Texas Sharpshooter fallacy, which leads ineluctably to a spurious intimation of Platonic design that simply does not exist. The MS is as much the fundamental topology for dyads as the dictionary is the “fundamental design” of the English language. We don’t get to marvel ex post facto at the unadulterated “linearity” of the dictionary after we’ve decided to arrange the words according to the organizing principle that engenders such linearity. The fact that modern theorists have historically privileged parsimonious relations—e.g., the conformist Tonnetz, Power Towers, Chicken-Wire Torus, Weitzmann regions, Cube Dance, etc.—is an insufficient defense to the general indictment. The parsimonious-MS relationship is merely one reification of a number of possible topologies for dyads. Privileging T6 relations, for example, generates the topology of a (ring) torus, suggesting there’s nothing objectively “fundamental” at all about dyadic space.

The appeal to Platonic discovery galvanizes general interest in the paper and, in my opinion, explains its publication in Science. This is a problem not only because the paper fails to uncover anything approaching Platonic “Truths” about musical space but also because it is symptomatic of a certain level of self-consciousness within the subdiscipline of mathematical music theory, an attraction toward hijacking mathematical hieroglyphics (and in some cases, real mathematics) in an effort to legitimize the study of music theory and portray music-theoretic ideas as a more substantive (read: “less artsy”) intellectual pursuit. But self-consciousness transmogrifies into unshirted intellectual crisis when mathematics is banefully misappropriated to bolster subjective claims about musical objects under investigation. Such is the case here.

Musicologists would do much better to avoid such blatant non sequiturs.


[1] Constructing the quotient orbifold as an n-torus modulo the symmetric group of n elements allows us to eliminate identical unordered sets with permuted elements. For example, in DT’s MS model in figure 1, we see that \{03\}\equiv\{30\}\mod S_{2}, which allows us to choose either {03} or {30} as the “minor third” representative. If we were modeling triads, we might have \{037\}\equiv\{307\}\equiv\{730\}\mod S_{3}, etc., allowing us to choose, say, {037} among the 3! orderings as the “minor triad” representative in \mathbb{T}^3/S_3 space.

The (Half-)Life of Han van Meegeren


December 30th was the 71st anniversary of the death of Han van Meegeren, and I thought this would be an opportune time to whip up a not-so-quick post lauding the power and beauty of differential equations. (As if we need an excuse for such honorifics!) For the uninitiated, Han van Meegeren was a talented Dutch painter who suffered from a combustible fusion of realities: an insatiable desire for fame and a star on the decline; it was this desire that ultimately led him to perpetrate what some consider to be “the most dramatic art scam of the twentieth century.”

He almost got away with it.

The Pledge. The plan was simple: Forge a number of “early” Vermeer paintings that, as a collective, would serve as an organic confirmation of the more substantial and “mature” Vermeer forgeries to follow. It worked brilliantly. Abraham Bredius, the preeminent art historian of his day, adjudicated the surreptitious Han van Meegeren forgeries as Vermeer originals for a Dutch estate. He proudly published his analysis, which incidentally confirmed his pet theory that the Italians influenced Vermeer’s artistic oeuvre:

It is a wonderful moment…when [one] finds himself suddenly confronted with a hitherto unknown painting by a great master, untouched, on the original canvas, and without any restoration, just as it left the painter’s studio….Neither the beautiful signature [nor] the pointillés on the bread…is necessary to convince us that we have…the masterpiece of Johannes Vermeer of Delft….In no other picture by the great master…do we find…such a profound understanding of the Bible story—a sentiment so nobly human expressed through the medium of the highest art.

The deception, now substantiated in print by the ultimate academic authority, was complete. Van Meegeren was back on top, raking in the cash for his fraudulent paintings, and fooling the art world he now despised for failing to recognize his genius.

The Turn. Van Meegeren’s scam began to unravel after the end of WWII, when authorities were tracking Nazi collaborators. An investigation discovered van Meegeren had, through an intermediary, unwittingly sold a “Vermeer” to Goering, and, as a result, he was accused of, and arrested for, treason. His defense was as simple as his scam: admit the paintings sold to the Germans were fake. Van Meegeren’s claim was dismissed as a desperate attempt to mitigate the more severe charge of treason. To prove his accusers wrong, however, van Meegeren began painting “Jesus Amongst the Doctors” in prison as proof of his skill set. But it was to no avail: van Meegeren was charged with collaborating with the enemy, and a panel of experts was introduced to examine the paintings.

What no one realized, however, was that van Meegeren had prepared for forensic scrutiny. He scratched the paint off worthless paintings from the period in order to use age-appropriate canvases and defeat X-ray analysis. He used color schemes and materials Vermeer would likely have used. He even employed phenol formaldehyde in an attempt to mimic the rigid texture of paint that had been hardening since the seventeenth century. Van Meegeren was assiduous but ultimately imperfect; experts detected both the phenol formaldehyde and trace evidence of cobalt blue, a pigment readily available in the 1940s but wholly unknown and unavailable to Vermeer. On the basis of that evidence, van Meegeren was sentenced to one year in prison for forgery, and roughly two months later, he died of a heart attack.

The evidence presented was quite convincing in and of itself, but van Meegeren still had his doubters; some continued to believe the paintings were simply too good to be forgeries—a testament to van Meegeren’s skill. There was, however, one physical detail van Meegeren could never have addressed, a fact that would prove without question his “Vermeers” were forgeries: the rate of radioactive decay of the lead-210 and radium-226 in the paint he used. If we assume dN/dt = -\lambda N gives the rate of change in the number N of undecayed atoms at time t, with decay constant \lambda and initial condition N(t_0)=N_0 (where t_0 is the time at which the element begins decaying), we have the following general-solution equation

\displaystyle N(t)=N_0\exp\Big(\!\!-\!\lambda\!\int_{t_0}^t ds\Big)=N_0\,e^{-\lambda(t-t_0)}\, ,

which we can simplify as N/N_0=\exp(-\lambda(t-t_0)). Taking natural logs, defining N/N_0=1/2 (i.e., the half-life for radioactive decay), and solving for (t-t_0) yields (t-t_0)=(\ln 2)\lambda^{-1}. That is, we can calculate an element’s half-life by dividing the natural log of 2 by its decay constant. Unfortunately, there’s never a way to determine N_0 precisely when trying to date an object, so the above equation cannot do much to help our cause.

But all is not lost. We can use half-life values of the specific elements in question to estimate an amount of decay based on specific time frames we wish to investigate. Due to various facts about chemistry we won’t address here, we know the amount of lead-210 (half-life = 22 years) and radium-226 (half-life = 1600 years) for an authentic 300-year-old Vermeer would stand in “radioactive equilibrium,” from which we can deduce that a modern forgery will have a much higher level of lead-210 radioactivity in relation to its radium-226 content.

The Prestige. Suppose y(t) is the amount of lead-210 (per gram of white lead) at t years, with y(t_0)=y_0 at t_0, the time of production, and r(t) is the function that gives the disintegration rate of radium-226 (disintegrations per minute per gram of white lead) at time t. Then we have the following differential equation: dy/dt = -\lambda y+r(t). Because the half-life of radium-226 is 1600 years, our estimates for a 300-year-old Vermeer will involve a pretty consistent value for r(t), which means we can replace r(t) with the constant r. Upon multiplying both sides of our differential equation by the integrating factor e^{\lambda t}, we’re left with d/dt\,\,e^{\lambda t}y=re^{\lambda t} because

\displaystyle \frac{d}{dt}\,e^{\lambda t}y=\Big(e^{\lambda t}\Big)'y+y'e^{\lambda t}=\lambda e^{\lambda t}y+\frac{dy}{dt} e^{\lambda t}\,,

which is what we need when dy/dt + \lambda y=r(t). A straightforward calculation then gives us

\displaystyle e^{\lambda t}y(t) - e^{\lambda t_0}y_0 = r(e^{\lambda t} - e^{\lambda t_0})\lambda^{-1}\, ,

and, solving for y(t), we have

\displaystyle y(t)=\frac{r}{\lambda}\Big(1-e^{-\lambda(t-t_0)}\Big)+y_0\,e^{-\lambda(t-t_0)}

recalling that y(t_0)=y_0 and \exp(\lambda t_0-\lambda t)=\exp(-\lambda(t-t_0)). But our goal is to estimate the amount of lead-210 at the time of production in order to detect a forgery. This means we need to solve for \lambda y_0\,. Setting (t-t_0)=300 in the above equation and doing some rearranging, we finally reach the form of the equation we desire:

\displaystyle \lambda y_0 = \lambda y(t) e^{300\lambda} - r(e^{300\lambda} -1)\,.

We know the half-life of lead-210 is 22 years; thus, \lambda =\ln 2 /22 and we can now evaluate the exponential:

\displaystyle e^{300\lambda}=e^{(300/22)\ln 2}=(\exp(\ln 2))^{300/22}=2^{150/11}.

All that remains is substituting appropriate values for \lambda y(t), the disintegration rate of lead-210, and r, the (constant) disintegration rate of radium-226.[1] Due to varying uranium concentrations throughout the world, a very conservative estimate for an upper bound on the disintegration rate of a modern painting was determined to be 30,000 disintegrations per minute per gram of white lead. Values for van Meegeren’s “Disciples at Emmaus” were determined to be \lambda y(t)=8.5 and r=0.8, yielding the following calculation:

\displaystyle \lambda y_0=(8.5)2^{150/11} - 0.8(2^{150/11}-1)=98,050\,\,,

more than three times the allowable limit for an authentic painting of the seventeenth century. Clearly, van Meegeren’s “Disciples at Emmaus” was a forgery.
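The arithmetic is easy to check. A minimal sketch (variable names are mine; the values are the ones quoted above):

```python
import math

# Decay constant for lead-210 (half-life: 22 years)
lam = math.log(2) / 22

# e^{300*lambda} = 2^(150/11), the growth factor over 300 years
growth = math.exp(300 * lam)

# Measured rates for "Disciples at Emmaus" (disintegrations/min/g of white lead)
lead210_rate, radium_rate = 8.5, 0.8

# lambda * y_0: the implied lead-210 disintegration rate at production
lam_y0 = lead210_rate * growth - radium_rate * (growth - 1)
# lam_y0 lands near 98,050 -- far above the ~30,000 bound for new paint
```

Running this reproduces the number in the text to within rounding.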

Differential equations 1, Abraham Bredius 0.


[1] The investigation used the disintegration rate of polonium-210 in place of lead-210 for convenience without any real loss of precision. The rates of ^{210}\text{Po} and ^{210}\text{Pb} equalize after only a few years.


Braun, Martin. Differential Equations and Their Applications. Texts in Applied Mathematics. New York: Springer-Verlag, 1992.

Dinner with the Three Musketeers

leave a comment »

During dinner one evening with friends, the topic turned to probability. My friend Jeremy thought it was more difficult to pick one card—say, the 5♦—from a thoroughly shuffled 52-card deck on the first attempt than to pick all the other cards without ever picking the 5♦. Taking an informal poll didn’t help: Some agreed with him while others said leaving the 5♦ unpicked was more difficult. Could they be equal?

What do you think?

Proof. Assuming each card is drawn uniformly at random without replacement, the probability P of choosing n − 1 items (without choosing the target item) is

\displaystyle P = \Big(\frac{n-1}{n}\Big)\Big(\frac{n-2}{n-1}\Big)\cdots\Big(\frac{1}{2}\Big)\,=\,\prod_{i=2}^n \big(1-i^{-1}\big) \,={n\choose k}^{-1}\!\!=\,\frac{(n-1)!}{n!}\,\,=\,\,\frac{1}{n}\,\,\,,

where k = 1. So, the probability of leaving the 5♦ as the last card is equal to the 1/52 probability of picking it first.  \Box
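Exact arithmetic confirms the telescoping product. A minimal sketch (my own check, not part of the dinner argument):

```python
from fractions import Fraction

n = 52
p_first = Fraction(1, n)  # probability the 5♦ is the very first card drawn

# Probability of drawing n - 1 cards in a row without ever hitting the 5♦:
p_avoid = Fraction(1)
for i in range(2, n + 1):  # telescoping factors (1 - 1/i) for i = 2..n
    p_avoid *= 1 - Fraction(1, i)

assert p_first == p_avoid == Fraction(1, 52)
```

Using `Fraction` keeps the product exact, so the equality is verified rather than approximated.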

“One for all, and all for one,” as they say.

Written by u220e

2019-01-14 at 12:03 pm

The Reader is Left to the Proof as a Comic

leave a comment »

Written by u220e

2019-01-14 at 6:00 am


Anyone have a K1 Adaptor?

leave a comment »

In a world not too far removed from our own [Black Mirror, “Nosedive”], where one’s entire socioeconomic future is predicated upon one’s present “star” rating, even the smallest downvote from a complete stranger can mean the difference between booking a same-day flight to your “best friend’s” wedding and renting an obsolete jalopy for the nine-hour drive, which—oh, yeah—means you’ll miss the rehearsal dinner.[1]

But did Lacie (4.2) really need to sign up for MOH duties at Naomi’s wedding, replete with throngs of “primes” (i.e., individuals with a minimum of 4.5 stars), to secure the additional 0.3 rating points needed to lock down the luxury pad at Pelican Cove?

Of course not.

Let R_d be the desired rating, R_c the current rating, r the rating given by each upvoter, and x ≥ 0 the number of upvoters with r > R_d. A little middle-school math gives us the simple equation:

\displaystyle R_d = \frac{R_c +rx}{x+1}\,\, ,

where 4.5(x + 1) = 4.6x + 4.2 with the appropriate substitutions. In this case, x = 3, which means Lacie could have paid three people $5 each to give her an upvote of 4.6 to reach her rating goal and be able to sign the lease with the 20% discount.[2] The reason she did it, of course, was to maximize her rating. But invoking a bit of mathematics beyond middle school, we see that \lim_{x\rightarrow\infty} (R_c + rx)(x+1)^{-1}=r.

Proof. Multiplying R_d by (1/x)(1/x)^{-1}, we have

\displaystyle (R_c\, x^{-1}+r)(1+1/x)^{-1}\rightarrow r \text{ as } x\rightarrow\infty.   \Box

Lacie could only ever approach the limiting value for her rating—no matter how many guests attended the wedding—and her success would depend entirely upon her desired rating. (We assume no rounding effects.) If R_d is too high, it’s obvious she’s not likely to find enough people with higher ratings to get the requisite upvotes, even at the reception of a rating-obsessed alpha female like “Nay Nay.” (A proportional rating system would only make things worse.)

How many upvotes would Lacie need based on the variables we’ve defined? Here’s a partial list of worst-case scenarios, reified as the quadruple (R_d, R_c, r, x):

(4.5, 4.20, 4.60, 3)
(4.8, 4.20, 4.81, 60)
(4.5, 4.18, 4.51, 32)
(4.5, 4.18, 4.70, 2)
(4.5, 3.20, 4.70, 7)
(4.5, 3.20, 4.51, 130)
(4.5, 0.80, 4.51, 370)
(4.5, 0.80, 4.75, 15)
(4.8, 0.80, 4.90, 40)

One thing is obvious: The wedding was rating overkill. Even after the abject debacle at the reception hall, Lacie could still sign the lease—with the discount—for fewer than 16 upvotes of at least a rating of 4.75. That’s not too bad as far as from-the-brink-of-disaster recoveries go. Heck, it’s a lot easier than rehabbing your Experian credit score.
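Solving the rating equation for x makes such scenarios easy to generate. A small sketch (the function name is mine; exact decimal arithmetic avoids floating-point surprises at the ceiling step):

```python
from fractions import Fraction
import math

def upvotes_needed(R_d, R_c, r):
    """Minimum x with (R_c + r*x) / (x + 1) >= R_d, assuming r > R_d."""
    R_d, R_c, r = (Fraction(str(v)) for v in (R_d, R_c, r))
    if r <= R_d:
        raise ValueError("upvoter rating must exceed the desired rating")
    return math.ceil((R_d - R_c) / (r - R_d))

# A few of the quadruples listed above:
assert upvotes_needed(4.5, 4.20, 4.60) == 3
assert upvotes_needed(4.5, 0.80, 4.75) == 15
assert upvotes_needed(4.8, 0.80, 4.90) == 40
```

Each quadruple in the list above checks out against this formula.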

Perhaps China is onto something.


[1] Lacie couldn’t schedule the last-minute flight because the airline required at least a 4.2 rating to do so, and she had slipped to 4.18 (see photo) due to a few unfortunate interactions on the way to the airport. Her car rental didn’t include the necessary K1 adapter to charge the vehicle en route, a fact that had serious consequences.

[2] Suppose rating power is proportional to the rating (of the one giving the rating). Let \omega := a/5 be the requisite weight, where a is the rating of the upvoter. Now we have R_d = (R_c +\omega rx)(x+1)^{-1}, which means Lacie would need the paid upvoters (rated 4.6) to give her a perfect 5.0 to reach the desired rating of 4.5. One might assume there exists a proportional relationship to the rating system based on Lacie’s belief that the ratings of highly rated persons at the reception were important to her plan—and, in that case, the limit would be \omega r—but it’s more likely she was simply counting on the sheer number of upvotes that would be available.

Written by u220e

2019-01-12 at 5:10 pm

The Hall of Monty-zuma

leave a comment »

By now, most of you probably know about the famous Monty Hall Problem, but in case you’ve been living under a rock for the last 35 years, I will provide a brief description of the situation:

Assume that a room is equipped with three doors. Behind two are goats, and behind the third is a shiny new car. You are asked to pick a door, and [you] will win whatever is behind it. Let’s say you pick door 1. Before the door is opened, however, someone who knows what’s behind the doors (Monty Hall) opens one of the other two doors, revealing a goat and asks you if you wish to change your selection to the third door (i.e., the door which neither you picked nor he opened). The Monty Hall problem is deciding whether you do.

I know. Your instincts might tell you it doesn’t matter if you switch: Either (a) your initial 1/3 probability of picking the car doesn’t change when a goat is revealed or (b) the probability of picking the car increases to 1/2 with only two unopened doors remaining. Both (flawed) logical approaches will convince you to stick with your initial selection, and you wouldn’t be alone in that opinion. But the truth is that you should switch to the last door when given the opportunity.


Well, when you pick door #1, the probability of picking the car is 1/3, leaving 2/3 probability the car is behind one of the other doors. When Monty reveals a goat behind, say, door #3 and asks if you’d like to switch your pick (to door #2), he’s really giving you a chance, in a sense, to travel back in time and pick both doors 2 and 3!

Consider this: Imagine Monty calls you to the stage and asks you to pick two doors. You pick doors 2 and 3, which means you have a 2/3 chance of winning the car. If he then revealed a goat behind door #3, would you believe the probability for winning the car would decrease to 1/3 or 1/2? Of course not. The new information doesn’t change anything because the probability that there was a goat behind at least one of the doors you picked is unity. But you’re now in the same exact situation as if you’d switched your pick in the initial scenario: Choosing door #2 with a goat revealed behind door #3 and a 2/3 probability of winning the car.

Perhaps an even better way to understand it was given by Charles Wheelan in his 2013 book Naked Statistics: Suppose Monty shows you 100 doors and tells you there are goats behind 99 of them and a car behind one. He asks you to pick a door. You pick door #1. He then reveals the goats behind doors 2-99 and asks if you’d like to switch your pick to door #100. It’s painfully obvious you’d switch because doing so increases your probability of winning the car from 0.01 (i.e., getting the right door on your first guess) to 0.99, which is the same mathematical logic we were following when we only had three doors from which to choose. We’ll skip the mathematical proof, but if you’re still not convinced, try it yourself!

But here’s something crazy.

Suppose someone else from the audience (not Monty) randomly chooses a door and happens to reveal a goat—someone who didn’t know that door was hiding a goat—then the probability the car is behind either unopened door is 1/2, and there’s no benefit to switching! This is completely counterintuitive (at least to me and Paul Erdős!) and even more confusing if you’ve finally accepted the probabilistic increase to 2/3 when switching in the original scenario, but we can prove it mathematically using Bayesian probability. For a primer on Bayesian probability, look here, here, and here.

Suppose Alice, a U.S. Marine, selects door #1 in the hope of revealing the car, after which a second contestant, Bob, is called to the stage to select one of the remaining doors (#2 or #3). Let Pr(O), Pr(T), Pr(R) be the probabilities of the car being behind door #1, #2, and #3, respectively, and define G to be the information there is a goat behind door #3. Further, define Pr(G|O), Pr(G|T) as the conditional probabilities that a goat is behind door #3 given the car is behind doors #1 and #2, respectively. As Bob comes to the stage to make his selection, here’s what we know:

  1. Pr(O) = Pr(T) = Pr(R) = 1/3. Unlike the scenario with Monty, neither Alice nor Bob knows where the car is located, so there’s an equiprobable value for finding the car.
  2. Pr(G|O) = 1/2. If the car is behind door #1, both remaining doors hide goats, so Bob reveals a goat behind door #3 exactly when he happens to pick it—a 50-50 proposition. (Remember, Alice chose door #1, so Bob must choose between doors #2 and #3.)
  3. Pr(G|T) = 1/2. If the car is behind door #2, Bob has a 50-50 shot at revealing a goat behind door #3 because he might pick door #2 and win the car.
  4. Pr(G|R) = 0. If the car is behind door #3, there’s no way Bob can reveal a goat there.

From Bayesian probability theory,

Pr(G) := (Pr(O) x Pr(G|O)) + (Pr(T) x Pr(G|T)) +
(Pr(R) x Pr(G|R)) = (1/3)(1/2) + (1/3)(1/2) + (1/3)(0) = 1/3,

and the posterior probability is calculated as

Pr(O|G) := Pr(O) x Pr(G|O)/Pr(G) = (1/3)(3/2) = 1/2.    \Box

The reader is invited to confirm the same value for Pr(T|G). It’s interesting to note the only difference between scenarios lies with the calculation for Pr(G|T). Monty knows where the car is, and that means he would be forced to pick door #3 and reveal a goat to keep the game going—assuming, as we have, that you picked door #1. That’s not the case, however, when Bob comes to the stage in the alternate scenario. He’s not forced to choose door #3 because he’s not sure where the car is. He might choose door #2 and end the game by winning the car. That probabilistic reduction—from certainty to even odds for Pr(G|T)—is the linchpin to the transformation of the entire problem. It’s also incredibly cool that mathematics can “discern” a seismic shift in the probabilistic outcome based on whether or not the person choosing the remaining doors holds any information about what’s behind them.

The frustrating thing about Bayesian probability, however, is that it can appear immune to different interpretations of the conditional probabilities. For example, if we understood Pr(G|O) and Pr(G|T) to mean strictly “the probability of a goat behind door #3 (whether it’s opened or not!) given the car is behind door #1 and door #2, respectively,” then Pr(G|O) = Pr(G|T) = 1 and we get the same answer: Pr(G) = 2/3 and Pr(O|G) = (1/3)(1/(2/3)) = 1/2. Unfortunately, that approach mirrors the faulty (i.e., ex post facto) logic that leads to the erroneous answer in the first scenario: If we calculate our chances after Monty shows us the goat behind door #3, which is what I think most people do when they’re asked to consider the problem in its original form, then we can confidently claim Pr(G|O) = Pr(G|T) = 1, leading to a posterior probability of 0.5 and no need to switch doors. Of course, the benefit to this sort of mathematical flexibility is that we can design and investigate a number of different scenarios using alternate probability assignments that lead ineluctably to the same (possibly counterintuitive) posterior probability.
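A quick simulation separates the two scenarios. The sketch below is my own (not from the original post); it plays both versions and, crucially, conditions on a goat actually being revealed:

```python
import random

def play(informed_host, rng):
    """One round: you hold door 0. Returns (goat_revealed, switch_wins)."""
    car = rng.randrange(3)
    if informed_host:
        # Monty knowingly opens a goat door among doors 1 and 2.
        opened = rng.choice([1, 2]) if car == 0 else 3 - car
    else:
        # An ignorant contestant opens door 1 or 2 at random (may hit the car).
        opened = rng.choice([1, 2])
    return opened != car, (3 - opened) == car

def switch_win_rate(informed_host, trials=200_000, seed=7):
    rng = random.Random(seed)
    wins = shown = 0
    for _ in range(trials):
        goat, win = play(informed_host, rng)
        if goat:  # condition on a goat being revealed
            shown += 1
            wins += win
    return wins / shown

# switch_win_rate(True) comes out near 2/3; switch_win_rate(False) near 1/2.
```

The conditioning step is the whole story: with Monty, every round shows a goat; with the ignorant contestant, the rounds where the car is accidentally revealed are discarded, and switching loses its edge.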

Solution to the Raven’s IQ Problem in Gladwell’s “Outliers”

with 11 comments

I can’t overstate how much I enjoyed Malcolm Gladwell’s book Outliers, and I highly recommend it to anyone who might be interested in delving deeper into an eclectic investigation of various “outlier” events. It’s an incredibly illuminating and engaging read, where the alchemy of sudden and colorful success is replaced with the muted quotidian tones of fortuitous circumstance, hard work, and a Pollock-sized splash of serendipity. One pair of chapters, however, interested me more than the others: “The Trouble with Geniuses—Parts 1 and 2.” Here, Gladwell discusses the fascinating—if surprisingly underwhelming—trajectory of Chris Langan, a former Long Island bouncer who possesses one of the world’s highest IQs (195-210). I’m not going to elaborate on Langan’s history or Gladwell’s treatment of him in the book, but I would like to discuss one of the IQ questions Gladwell includes in Part I as a way to underscore Langan’s transcendent intelligence. He states:

One of the most widely used intelligence tests is something called Raven’s Progressive Matrices. It requires no language skills or specific body of acquired knowledge. It’s a measure of abstract reasoning skills. A typical Raven’s test consists of [48] items, each one harder than the one before it, and IQ is calculated based on how many items are answered correctly.

After giving the reader an extremely easy (i.e., early) example from the RPM test, Gladwell submits “the kind of really hard question that comes at the end of the Raven’s” as a challenge to readers:

[Image: the Raven’s-style matrix puzzle reproduced in Outliers]

I’ve taken quite a few IQ tests in my lifetime, and this is one of the most challenging questions I’ve encountered. For some time, the pattern eluded me. (After praying and confessing Philippians 4:13—“I can do all things through Christ who strengthens me!”—the Lord immediately blessed me with the solution.) In the book, Gladwell provides RPM’s expected answer—matrix (A)—but he’s unable to provide a logical rationale or pattern for the correct choice. As we know, getting the right answer means nothing if you don’t know why it’s the correct answer. Inflated IQ scores are repositories for random guesses and good test-taking skills.

(Stop reading here if you want to attempt to find the pattern yourself.)

We approach the solution in precisely the same way we would have solved it had Gladwell not given us the answer. Encode each suit’s positions (1-9) within each 3 x 3 “matrix” as partitions of a set S = {145789236} with the partitions ordered as follows: diamonds ♦, hearts ♥, then clubs ♣. (Use left-to-right orthography. Think of the positions as a telephone keypad—without the zero.) Matrix positions likewise move horizontally then vertically (i.e., top left –> top middle –> top right –> middle left, etc.). At this point, our situation looks like this:

  • Matrix 1:  145 | 789 | 236
  • Matrix 2:  678 | 125 | 349
  • Matrix 3:  149 | 238 | 567
  • Matrix 4:  378 | 146 | 259
  • Matrix 5:  359 | 148 | 267
  • Matrix 6:  379 | 156 | 248
  • Matrix 7:  139 | 257 | 468
  • Matrix 8:  146 | 357 | 289

Viewed through the lens of the underlying positional structure rather than the hopeless clutter of the RPM matrices, the global design emerges. Do you see it? We will briefly postpone an exploration of the technical details in order to begin with an arresting visual aid:

  • Matrix 1:  145 789 236
  • Matrix 2:  678 125 349
  • Matrix 3:  149 238 567
  • Matrix 4:  378 146 259
  • Matrix 5:  359 148 267
  • Matrix 6:  379 156 248
  • Matrix 7:  139 257 468
  • Matrix 8:  146 357 289
  • Matrix 9:           ?

We clearly see what’s happening. If Matrix 1 represents the set S = {145789236}, then the elements of S in each successive matrix (i.e., the order positions of the respective suits) are reduced by one and permuted such that the partitions are arranged according to the colored pattern. This generates the meta-design for each larger group of three matrices. (One must assume the test designer broke the global pattern here, after matrices 3 and 6, in order to highlight the horizontal continuity of the sequence within each row; this makes it more difficult to deduce the pattern because it does not carry through to the next row of matrices.)[1] The mappings can be more formally understood as iterative permutations (in this case, r_j rotations) on the elements of the cyclic group ℤ/9ℤ. Sounds complicated, but it isn’t: Subtract 1 from the row immediately above it and permute the result according to the colored pattern—with all arithmetic calculated mod 9 (and 0 = 9).

That’s it.
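Within a row of matrices, the rule is mechanical enough to code. A minimal sketch (my own encoding, treating each partition as a set of keypad positions):

```python
def next_matrix(partitions):
    """Subtract 1 from every position (mod 9, reading 0 as 9), then
    rotate the three partitions one place left."""
    dec = [{(p - 1) or 9 for p in part} for part in partitions]
    return dec[1:] + dec[:1]

m7 = [{1, 3, 9}, {2, 5, 7}, {4, 6, 8}]           # Matrix 7: 139 | 257 | 468
m8 = next_matrix(m7)                              # Matrix 8: 146 | 357 | 289
m9 = next_matrix(m8)                              # the missing ninth matrix

assert m8 == [{1, 4, 6}, {3, 5, 7}, {2, 8, 9}]
assert m9 == [{2, 4, 6}, {1, 7, 8}, {3, 5, 9}]    # answer (A): 246 178 935
```

The rule holds within each row of three matrices; the designer’s break after matrices 3 and 6 means it can’t be run straight through all nine.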

Here, the permutation for each matrix is a rotation of the elements with a constant index (6), where x_{i,k} is the ith element at the kth position and r_j permutes the elements by rotating j positions as follows:

\displaystyle \begin{matrix}(x_{1,1}, x_{2,2}, x_{3,3},\cdots, x_{n-1,n-1}, x_{n,n})\\ r_j(x_{1,j+1}-1, x_{2,j+2}-1,\cdots, x_{n-1,j+n-1}-1, x_{n,j+n}-1)\\ r_j(x_{1,2j+1}-2, x_{2,2j+2}-2,\cdots, x_{n-1,2j+n-1}-2, x_{n,2j+n}-2)\\ r_j(x_{1,3j+1}-3, x_{2,3j+2}-3,\cdots, x_{n-1,3j+n-1}-3, x_{n,3j+n}-3)\\ \vdots\\ r_j(x_{1,jm+1}-m, x_{2,jm+2}-m,\cdots, x_{n-1,jm+n-1}-m, x_{n,jm+n}-m).\end{matrix}

To the observer, <349> and <934> are indistinguishable; thus, we can ignore canonical orderings of the partitions (i.e., smallest to largest, etc.) if we wish. The rest of the problem proceeds in a similar fashion, and decoding the pattern makes it a trivial exercise to predict what comes next: {246 178 935}—answer (A). One can confirm the blue, red, and green pattern continues into the ninth matrix. What’s wonderful about viewing the problem in this way is that you’re better prepared should you ever encounter a much more difficult mapping. Imagine, for example, I gave you the following design:

  • Matrix 1:  145 | 789 | 236
  • Matrix 2:  257 | 489 | 136
  • Matrix 3:  178 | 249 | 356
  • Matrix 4:  458 | 129 | 367
  • Matrix 5:  247 | 159 | 368
  • Matrix 6:           ?

What’s the global pattern in this case? This is more difficult than the RPM problem, so take your time.

Without decoding the suits into S and its partitions, this becomes a much more difficult problem to solve. If you’re familiar with automorphic maps, you might realize this matrix sequence involves a series of group automorphisms from G, the group (\mathbb{Z}/9\mathbb{Z})^{\times}, to itself.[2] Everyone will have seen the invariance of 9 in the middle partitions and likely the consistent presence of the 3/6 pairs in the last ones, but a more discriminating mathematical eye will translate these features into a clue to the design. Because (5,9) = 1 (i.e., 5 and 9 are coprime), the map f : G –> G (under mod 9 multiplication by 5) will permute the elements of the group in a predictable way, eventually returning to the initial permutation. In other words, if s\in G, then \exists\,\, t\in G : 5s\!\!\mod 9=t. As we’ve seen, s and t need not be unique elements of G.
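The multiply-by-5 rule is easy to verify directly. A brief sketch (my encoding; each keypad position is multiplied by 5 mod 9, with 0 read as 9):

```python
def times5(partition):
    """Multiply each keypad position by 5 mod 9, reading 0 as 9."""
    return {(5 * p) % 9 or 9 for p in partition}

m1 = [{1, 4, 5}, {7, 8, 9}, {2, 3, 6}]    # Matrix 1: 145 | 789 | 236
m2 = [times5(p) for p in m1]              # Matrix 2: 257 | 489 | 136
m5 = [{2, 4, 7}, {1, 5, 9}, {3, 6, 8}]    # Matrix 5: 247 | 159 | 368
m6 = [times5(p) for p in m5]              # the missing sixth matrix

assert m2 == [{2, 5, 7}, {4, 8, 9}, {1, 3, 6}]
assert m6 == [{1, 2, 8}, {5, 7, 9}, {3, 4, 6}]   # 281 | 579 | 463
```

Note how 9 maps to itself (45 mod 9 = 0, read as 9) and 3 and 6 swap—exactly the invariances called out above.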


[1] Someone pointed out the transformation from matrices 3 to 4 and 6 to 7 involves a counterclockwise rotation, mapping {369} –> {123}, {258} –> {456}, and {147} –> {789}. 

[2] The entire collection of automorphic maps from a group G to itself is called the automorphism group Aut(G), which is a subgroup of the symmetric group S_n on n elements.

[3] For interested readers, Matrix 6 should read as follows: 281 | 579 | 463. A hypothetical Matrix 7 would return to the initial permutation of Matrix 1.  

Written by u220e

2018-11-27 at 5:32 pm

The Mathematics of Thanksgiving

leave a comment »

Written by u220e

2018-11-23 at 8:03 am


Amazon’s Primes

leave a comment »

There’s an ongoing debate concerning the origin of mathematics: Is it the creation of mankind or does mathematics somehow transcend our existence? Would mathematics cease to exist if not for the specific descriptions, prescriptions, and labels developed over centuries of thought, or do the hieroglyphics we’ve adopted point to immanent, Platonic features of the fabric of the universe, the specifics of which are simply waiting to be uncovered? Many prominent authors have offered a range of answers to the question—from Mario Livio’s excellent (if equivocal) investigation that underscores the complexity of the debate to Max Tegmark’s claim that reality is, itself, a mathematical construct—yet the debate continues.

One thing, however, seems clear: If mathematics is, in its totality, nothing more than a man-made construction that doesn’t point to anything abstract and outside itself, we didn’t do a very good job designing it. Problems range from the serious to the annoying: Gödel proved that any consistent axiomatic system rich enough to express arithmetic contains true statements it cannot prove, requiring exogenous confirmation outside its domain (serious), and we can find exact solutions to differential equations of the form

\displaystyle M(t,y)+N(t,y)\frac{dy}{dt}=0

only when there is a function \phi(t,y) such that appropriately defined functions M(t,y) and N(t,y) satisfy the following criteria: M(t,y)=\partial\phi\,/\,\partial t, N(t,y)=\partial\phi\,/\,\partial y, and \partial M\,/\,\partial y=\partial N\,/\,\partial t, where

\displaystyle \phi(t,y) = \int M(t,y)\,dt + \int\Big[N(t,y) - \int \frac{\partial M(t,y)}{\partial y}\,dt\,\Big] dy

(annoying). It seems intuitive—to me, at least—that we would have avoided such terminal flaws when constructing an intellectual edifice as imposing and efficacious as mathematics, an argument that strongly suggests the limitations and deficiencies we encounter result from our failure to address sufficiently the details of an immanent Platonic structure with man-made techniques and notation.
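The exactness criterion is concrete enough to check numerically. A minimal sketch (my own example, not from the text: \phi = t^2 y + y^3, so M = 2ty and N = t^2 + 3y^2):

```python
def M(t, y): return 2 * t * y           # = d(phi)/dt for phi = t^2*y + y^3
def N(t, y): return t**2 + 3 * y**2     # = d(phi)/dy

h = 1e-6  # step for central finite differences

def dM_dy(t, y): return (M(t, y + h) - M(t, y - h)) / (2 * h)
def dN_dt(t, y): return (N(t + h, y) - N(t - h, y)) / (2 * h)

# The exactness criterion dM/dy = dN/dt holds at sample points,
# so M + N*dy/dt = 0 admits the implicit solution phi(t, y) = const.
for tt, yy in [(0.5, 1.0), (2.0, -1.5), (3.0, 0.25)]:
    assert abs(dM_dy(tt, yy) - dN_dt(tt, yy)) < 1e-5
```

For this pair both mixed partials equal 2t, so the criterion holds everywhere—precisely the special circumstance the text describes.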

But if, like me, you believe in God, there really is no debate; a universe created by an omniscient Creator is necessarily imbued with supernatural logic and order, a very small measure of which we created beings can access through our limited and fallible modes of inquiry, and we should expect to meet obstacles in our attempts to investigate such a supernal design. But can we say anything about the origin of mathematics outside the context of personal faith? I think we can. We certainly won’t settle the matter in a single blog post, but I hope to offer a few thoughts in an attempt to convince you that mathematics isn’t completely and irrevocably a man-made construct.

Think about the prime numbers. A prime number p\in\mathbb{P} is a positive integer greater than 1 whose only factors are 1 and itself. In other words, there’s no integer n with 1 < n < p such that pn^{-1}\in\mathbb{Z}^+. One could make the very convincing argument that a prime number p is nothing more than one of man’s creations. After all, we invented all the terms that describe prime numbers, right? We developed the concepts and symbols for “number,” “factor,” “integer,” “set,” “positive,” “less than,” “set membership,” and “divide.” It seems we’ve merely designed the rules of the game and then created a character that will obsequiously follow our carefully prescribed logical narrative.

But is it as simple as that?

Those close to me know I’m a voracious reader, so if you’re going to buy me a gift for my birthday or Christmas, I will always prefer an Amazon gift card so I can buy more books. My wish list is really long—in fact, I have two wish lists on two different Amazon sites—and when I received a gift card from my wife for Christmas, I was able to buy 13 books from my two lists. Always in a hurry to get a new book I’ve been wanting to read, I realized the worst-case shipping scenario would be the trivial partition of the prime 13—receiving 13 separate shipments with each package containing a single book—but I also knew (as do you) that it would be impossible to receive more than one but fewer than 13 UPS packages with the same number of books in each package. One of many possible mathematical descriptions of this shipping situation can be described as follows: There exists no partition (R) of a prime p where R_p := r_1+r_2+\cdots+r_j with 1 < j < p such that all of the following hold: (1) r_i\in\mathbb{Z}^+, (2) p = \sum_{i=1}^j r_i, and (3) r_1 = r_2 =\cdots =r_j. A much less jargon-intensive description would be that given a set S containing p elements, there is no way to construct proper subsets s_1, s_2, \cdots, s_j of S such that every subset has the same cardinality (i.e., contains the same number of elements) where 1 < j < p. (This assumes the subsets are pairwise disjoint: s_i\bigcap s_k = \varnothing for i\neq k.) Okay, so maybe that wasn’t any less jargon intensive! Anyway, it should be clear how our two definitions of a prime number—multiplicative factors and partitions—are related. (In fact, there are many ways to describe the mathematical notion of primality!) If we could construct such a collection of subsets, then the total number of subsets would be a divisor (or factor) of p. The following graphic represents some possible book-package groupings based on the prime number 13, and all of them fail the partition requirements given above:

13/2:  BB | BB | BB | BB | BB | BB | B
13/3:  BBB | BBB | BBB | BBB | B
13/4:  BBBB | BBBB | BBBB | B
13/5:  BBBBB | BBBBB | BBB
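The shipping constraint doubles as a primality test. A tiny sketch (the function name is mine):

```python
def equal_package_counts(p):
    """Ways to split p books into more than one but fewer than p
    packages with the same number of books in each."""
    return [j for j in range(2, p) if p % j == 0]

assert equal_package_counts(13) == []            # 13 is prime: no such grouping
assert equal_package_counts(12) == [2, 3, 4, 6]  # 12 books ship evenly many ways
```

A number is prime exactly when this list comes back empty—the partition failures pictured above, in one line.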

Okay, but what’s the point? The point is that we’re limited in how we can organize the physical space around us. Whether we’re discussing books or computers or people or planets, a set with p elements cannot be organized into subsets containing an equal number of objects no matter how hard you try. (Again, the trivial exceptions exist when the number of subsets equals 1 or p because p | p.) That’s not an abstraction. It’s not the product of a man-made system, and it doesn’t require the idea of a supernal Creator. It’s simply a physical impossibility. Primality emerges from the res extensa of grouping objects, but the quality of primeness, from which primality receives its import, is a Platonic quality. It would be true in any universe, it would be true if there were no objects at all to arrange, and it’s true even if we want to imagine grouping objects that don’t exist (e.g., unicorns, wizards, leprechauns, etc.).

The locutions and symbols and formalism we use to reify this quality of primeness (when defining primality, as we did above) must be distinguished from the quality itself. It is this transcendental primeness that represents a superordinate and abstract mathematical framework that exists (and has always existed!) apart from the existence of any object. Primeness is the very reason we are able to develop a formal definition and/or proof for, and useful descriptions of, primality. The process of formalizing primeness (i.e., primality) emerges from the Platonic quality of primeness, and various applications (e.g., cryptosystems, automorphic maps of cyclic groups, etc.) then emerge from primality: Abstract truth (e.g., primeness) —> rigorous definition (e.g., primality) —> functional use (e.g., applications). The causal chain begins (at least) prior to formalization, so the notion of “what it means to be a prime number” cannot be attributed to man.

Written by u220e

2018-03-31 at 3:18 pm

Does the “Free Lunch” Come with Fries?

leave a comment »

So, I’ve been thinking about getting into the consulting business (primarily in the U.S.), and a few high-profile wealth-management firms have advised me to move to the States in order to take advantage of its favorable tax plan. Saving money on taxes always sounds like a pretty good strategy, but is moving to a different country the best option?

We’ll get to that—but, first, we require a brief digression (or two).

Some of you might have heard of something called “Purchasing Power Parity” (PPP). It’s a concept from economic theory that “compares different…currencies through a…’basket of goods’ approach. [T]wo currencies are in equilibrium…when a…basket of goods (taking into account the exchange rate) is priced the same in both countries” (Investopedia). PPP basically compares the exchange rate r_x with the quotient of prices (E) for an identical item (or a basket of items) sold in both countries. It’s a speculative measure designed to “predict” the movement of the value of one currency against another. So, if a particular Einstein bobblehead costs p_u = 9.99 USD and p_c = 15.99 CAD, then E = p_c p_u^{-1} \approx 1.60, which is significantly higher than the then-current exchange rate of (about) 1.29 CAD per USD.

This means Canadians will prefer to purchase the bobblehead in the U.S. (@ $12.89 CAD) and use the other three-plus dollars for something else. In short, the Canadian price is too high, suggesting the USD will appreciate (over some time frame) until E equals the exchange rate. When E \neq r_x, we suspect a possible arbitrage opportunity. The concept of arbitrage is hardly new, yet it continues to drive the boldest (and, at times, most reckless) investment strategies on Wall Street and beyond. It’s often said, colloquially, that “there ain’t no such thing as a free lunch” (TANSTAAFL); the proven concept of arbitrage, however, completely belies that claim. I offer a quick and tailored primer for skeptical readers.
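To make the comparison concrete, here is a minimal Python sketch (mine, not from the original post) of the PPP check using the bobblehead figures:

```python
# PPP sketch: compare the price quotient E = p_c / p_u to the exchange rate r_x.
p_u, p_c, r_x = 9.99, 15.99, 1.29   # USD price, CAD price, then-current rate

E = p_c / p_u                        # implied "PPP rate" for this one item
us_cost_in_cad = p_u * r_x           # what the U.S. purchase costs a Canadian

print(round(E, 2))                   # 1.6 -> well above r_x = 1.29
print(round(us_cost_in_cad, 2))      # 12.89
```

Because E exceeds r_x, the model flags the Canadian price as too high, exactly as described above.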

“TISATAAFL” or “How to Guarantee a Gambling Profit Using Mathematics”

In very basic terms, a betting-arbitrage opportunity arises when the sum of the probabilities that the arbitrageur wins each bet, as implied by the bettors’ own odds, is less than one.[1] In a two-system bet, this is very simply calculated as 1 - a_1(a_1+b_1)^{-1} + 1 - a_2(a_2+b_2)^{-1} < 1, where Bettor 1’s a_1/b_1 odds imply a probability of a_1(a_1+b_1)^{-1} that his pick wins. The second bet placed with Bettor 2 follows similarly. An example: Suppose James desires to place a wager on the 2018 college football national title game between Alabama and Clemson. His brother, Joshua, is giving 2/1 odds on Alabama, and Michael is giving 1.25/1 odds on Clemson.

James wants to determine if this represents a genuine arbitrage opportunity. The 2/1 line means Joshua believes Alabama has a 2/3 probability of winning (i.e., 2/(2+1)), and the 1.25 (= 5/4)/1 line means Michael believes Clemson has a 5/9 probability of winning (i.e., (5/4)/[(5/4)+1]). Calculating the probability (by their own lights) that James wins each bet gives (1 – 2/3) + (1 – 5/9) = (1/3) + (4/9) = 7/9 < 1. It is, in fact, an arbitrage opportunity.
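The check above can be sketched in a few lines; the odds and the 7/9 result come from the example, while the function name is my own:

```python
# Two-bettor arbitrage condition from the text:
# 1 - a1/(a1+b1) + 1 - a2/(a2+b2) < 1.
from fractions import Fraction

def arbitrage_exists(a1, b1, a2, b2):
    """Return (total, exists): total is the summed probability that the
    arbitrageur wins each bet, per the bettors' own implied odds."""
    p_win_1 = 1 - Fraction(a1, a1 + b1)   # chance of winning the bet with Bettor 1
    p_win_2 = 1 - Fraction(a2, a2 + b2)   # chance of winning the bet with Bettor 2
    total = p_win_1 + p_win_2
    return total, total < 1

# Joshua: 2/1 on Alabama; Michael: 1.25/1 (= 5/4 over 1) on Clemson.
total, exists = arbitrage_exists(2, 1, 5, 4)
print(total, exists)   # prints: 7/9 True
```

Using `Fraction` keeps the arithmetic exact, so the 7/9 from the worked example drops out directly.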

How does arbitrage work? Let’s say James’s gambling allowance happens to be 100 dollars in total, and he’ll wager x dollars with Joshua (who backs Alabama) and 100 – x dollars with Michael (who backs Clemson). Let’s imagine Alabama wins. An Alabama victory means James loses his x dollars to Joshua and wins 5(100 – x)/4 dollars from Michael. This yields the first linear profit curve (red line, below): f(x) = -9x/4 + 125. Of course, we want the profit to be greater than zero, so we set -9x/4 + 125 > 0 and solve, yielding x < 500/9.

Now, imagine Clemson wins. This would mean James loses (100 – x) dollars to Michael and wins 2x dollars from Joshua, the sum of which yields the second linear profit curve (blue line, below): 3x – 100 > 0 such that x > 100/3. Thus, the amount of money James needs to wager with Joshua (x) and Michael (100 – x) to guarantee a profit no matter which team wins the game lies within the global inequality derived from both equations: 100/3 < x < 500/9. This is a range from (roughly) 33.33 dollars to 55.56 dollars.

Curious readers will want to know two things: (1) the maximum possible profit based on the gambling allowance and (2) the optimum wager James should place with Joshua to generate that amount. We’ve jumped the gun a bit by providing the above graphic, but the answer involves solving the system of linear equations in the previous paragraph; the linear profit curves cross each other at an equilibrium point—recall the supply-and-demand curves of elementary economics—and it is this intersection that represents the Cartesian coordinate that (a) reveals the maximum possible profit f(x_m) and (b) identifies the optimum bet x := x_m that guarantees that maximum profit amount.

Fortunately, as a general principle, we don’t need to graph these functions. Setting both equations equal to each other and solving for x_m is sufficient: Solving -(9/4)x_m + 125 = 3x_m - 100 gives us an optimum bet value of x_m \approx 42.86 dollars with Joshua and 100 – 42.86 = 57.14 dollars with Michael. This betting profile generates a maximum guaranteed profit (p) of p := f(x_m) \approx 28.57 dollars based on a 100-dollar total wager—no matter which team wins the game. The above graphic provides the relevant visual representation.
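A minimal sketch of the equilibrium-bet calculation, assuming the two profit lines derived above:

```python
# Equilibrium bet: solve -(9/4)x + 125 = 3x - 100 for the wager x_m with Joshua
# that makes the guaranteed profit identical no matter which team wins.
def optimal_bet(bankroll=100.0):
    x_m = 225 / (21 / 4)             # (9/4)x + 3x = 125 + 100  =>  x = 900/21
    profit = 3 * x_m - bankroll      # either profit line evaluated at x_m
    return x_m, bankroll - x_m, profit

x_m, with_michael, profit = optimal_bet()
print(round(x_m, 2), round(with_michael, 2), round(profit, 2))  # 42.86 57.14 28.57
```

Any x inside 100/3 < x < 500/9 yields some profit; the intersection point simply maximizes the worst case.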

This should give you a general sense of the power and seductiveness arbitrage offers and why it’s essentially the Holy Grail of any investment strategy. (For some readers, it might be cool enough to know arbitrage exists, and you may want to make a few bets with your friends. But do the math first!) To put it simply, there’s no better option available to you than the one that generates a financial profit no matter the outcome. (Sorry, Milton Friedman!) One might ask what this has to do with PPP or moving to a foreign country. It involves the notion of currency arbitrage as a “free lunch.”

Recall the Canadian who was interested in the Einstein bobblehead. She essentially earns three dollars by making the (online) purchase from the States. It’s as if she bought the bobblehead in Canada and the government deposited three dollars into her account. Unfortunately, PPP doesn’t tell us when to make the purchase (we need the exchange rate for that), but we use that information to make inferences, like whether we’ll save money if we buy a book from Spokane rather than Vancouver. PPP, however, only involves “tradeable” commodities. “Immobile goods” like real estate and services are inaccessible to PPP calculations.

One such “inaccessible” item is tax liability. The professional advice I received was simple: Move to the United States in order to avail yourself of the more attractive federal tax rates. But can I get a “free lunch” by staying in my country of residence? PPP tells me whether there exists a currency imbalance, not whether I should move. An approach that does help me make this determination involves what I will call the “Net-Purchasing-Power Index” (NPPI). NPPI simply calculates the exchange rate r_q that represents the equilibrium point between two baskets of post-tax income portfolios and compares it to the current exchange rate r_x. We begin with first principles—the technical definition of net income—and derive the NPPI from there:

\displaystyle a(1-t_u) = ar_q(1-t_c)\quad\Longrightarrow\quad r_q = \frac{1-t_u}{1-t_c}\,,\qquad \text{NPPI} := r_q\, r_x^{-1}

Here, a is the total value of the income portfolio in the U.S., r_q is (again) the equilibrium rate, t_c is the relevant federal tax rate in the target country, and t_u is the relevant federal tax rate in the U.S. Our goal is to calculate the NPPI as the quotient of r_q and r_x. When NPPI < 1, the exchange rate is greater than the equilibrium rate, and we have the potential for an arbitrage opportunity—but not yet a guarantee. For that, we need to do a bit more work. As the NPPI tends to zero (i.e., as the exchange rate gets larger), the portions of our potential free lunch grow significantly.

Let’s walk through an example.

Suppose an analysis suggests my U.S. corporation will generate $500,000 USD in consulting fees in 2018. Conventional wisdom, as we’ve seen, suggests relocating to the U.S. That is, a $500,000 portfolio at a U.S. federal tax rate of 39% leaves me with $500,000(1 – 0.39) = $305,000 if I move to the States. If I bring that money into a target country with a federal tax rate of, say, 47%, I’ll only have an after-tax amount of $265,000. It seems as if I’m losing money by choosing not to move. But what about r_x? Let’s imagine the exchange rate between the U.S. and the target country is r_x \approx 1.10. Is an arbitrage opportunity possible? Using the equation above, r_q \approx 1.151, which is the rate that “equalizes” the post-exchange purchasing power between both countries. Because r_q r_x^{-1} > 1, I would (really) be losing money.

Clearly, the after-tax, after-conversion portfolio of $291,500 is less than the $305,000 I’d be able to spend on goods and services in the U.S. If I think the extra $13,500 I’d save by moving to the U.S. is worth the time and effort, I should relocate. But what if r_x \approx 1.2532? Then r_q r_x^{-1} < 1, and I have a real chance to make some free money by staying put: In this case, I create $332,098 by bringing my U.S. income into the target country, and I enjoy a net-purchasing-power gain of +$27,098. (Assume I’ve taken advantage of the legal means to minimize double-taxation issues.)
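Here is a hypothetical sketch of the NPPI bookkeeping, assuming (consistent with the r_q ≈ 1.151 example) that the equilibrium rate is r_q = (1 − t_u)/(1 − t_c):

```python
# NPPI sketch: r_q equalizes post-tax purchasing power; NPPI = r_q / r_x.
def nppi(t_u, t_c, r_x):
    r_q = (1 - t_u) / (1 - t_c)
    return r_q, r_q / r_x

r_q, index = nppi(t_u=0.39, t_c=0.47, r_x=1.2532)
stay = 500_000 * 1.2532 * (1 - 0.47)   # convert the income, pay target-country tax
move = 500_000 * (1 - 0.39)            # keep USD, pay U.S. tax
print(round(r_q, 3), index < 1)        # 1.151 True -> potential arbitrage
print(round(stay), round(move))        # 332098 305000
```

With NPPI < 1 the exchange rate outstrips the tax gap, reproducing the +$27,098 advantage from the example.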

But isn’t this a guaranteed arbitrage opportunity? No, for two reasons: (1) we haven’t accounted for price differentials, and (2) arbitrage also depends on how much income we’re bringing into the target country. What if, for example, prices are much higher in the target country? That is, what if PPP is severely unbalanced, as in the bobblehead example? In that case, the increased prices eat away at any NPPI surplus, though if we’re dealing with tradeable commodities, as we saw earlier, one would simply purchase those items from the States. Unfortunately, importing goods isn’t always a guarantor of profits. Let’s say, for argument’s sake, the amount of income we’re dealing with is $100. If r_x \approx 1.2532, then NPPI < 1 and I’m left with $61 living in the U.S. and $66.41 in the target country. After buying the bobblehead at $9.99 USD, I have $51.01 in the U.S. and $50.42 if I buy it in the target country ($66.41 – $15.99). NPPI < 1, but I’ve still lost money. (For simplicity’s sake, assume the sales tax is equal.)

This means I either have to (a) import the bobblehead from the States to have a chance at maintaining my NPPI advantage or (b) increase the amount of money I’m converting from USD into CAD. If I choose to import the bobblehead, I do still come out ahead: $66.41 – 9.99(1.2532) = $53.89, which means staying in the target country is still $2.88 better than if I’d moved to the U.S. and paid for the bobblehead in USD. It’s a very slight advantage, but that’s only because the amounts we’re dealing with are small. As the portfolio (a) grows, so does the advantage. (This ineluctably leads to the notion of leverage as an investment strategy, but we won’t address that here.) Unfortunately, as the price grows, the advantage decreases, and if the price is high enough, choosing not to move becomes a disadvantage.

The question, then, becomes this: Is there any way to evaluate an arbitrage opportunity given a specific constellation of values for the variables we’ve been discussing? Yes, there is. Such an evaluation involves solving a linear optimization problem that accounts for price levels. I will call this the Currency Arbitrage Price (CAP), and it utilizes both NPPI and PPP values. In what follows, however, we assume NPPI < 1. (Recall that if NPPI > 1, then no arbitrage opportunity is possible.) So, what do we need to know? We need to determine the maximum price level of a specific item in the target country that guarantees a post-purchase profit. We can calculate this by adding our price variables to the calculation of r_q. Solving the necessary inequality for p_c, we have:

\displaystyle\begin{array}{rcl}  a-at_u-p_u & < & ar_x-ar_xt_c-\text{PPI}\cdot p_u\\  a(1-t_u)-p_u & < & ar_x(1-t_c)-p_c\\  p_c & < & a((1-t_c)r_x-(1-t_u))+p_u\end{array}

Notice the cancellation that occurs when \text{PPI}\cdot p_u \to p_c. This last inequality tells us how much an (identical) item needs to cost in the target country in order to guarantee a profit given the other variables. We can visualize this inequality by graphing the linear CAP function f : \mathbb{Q}^+\to\mathbb{Q} defined by

\displaystyle f(p_c)=a((1-t_c)r_x - (1-t_u)) + p_u - p_c,

and we guarantee arbitrage when f(p_c) > 0. Armed with this information, let’s revisit the $100/bobblehead example. Solving the above inequality gives us p_c < 15.41. This means we are guaranteed an arbitrage opportunity when the bobblehead price is $9.99 in the U.S. and less than $15.41 in the target country. Let’s imagine it’s priced in the clearance bin (in the target country) at $11.99. In the U.S., paying in USD, we’d be left with the usual after-tax, after-purchase amount of $51.01, but in the target country, we’d now have an after-tax, after-purchase balance of $54.42. Despite the disparity in currency valuations and the higher tax rate, we enjoy an overall profit, which is an increase from the earlier amount of $53.89 we gained from importing.
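The CAP bound can be sketched directly from the final inequality; the figures are the ones used in the bobblehead example:

```python
# CAP sketch: the target-country price must satisfy
# p_c < a*((1 - t_c)*r_x - (1 - t_u)) + p_u to guarantee a post-purchase profit.
def cap_price(a, t_u, t_c, r_x, p_u):
    """Maximum target-country price that still yields f(p_c) > 0."""
    return a * ((1 - t_c) * r_x - (1 - t_u)) + p_u

limit = cap_price(a=100, t_u=0.39, t_c=0.47, r_x=1.2532, p_u=9.99)
print(round(limit, 2))   # 15.41 -> the clearance price of 11.99 qualifies
```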

Free lunch.

We can do a bit more. Imagine the target country decides all bobbleheads should be $11.99, and the U.S. decides it must reduce bobblehead prices to stay competitive. We love this Einstein bobblehead so much that we want to send it to all our friends. But the U.S. price keeps falling. How long can we purchase the bobblehead at $11.99 in the target country until we lose our arbitrage advantage? In other words, at what U.S. price does our profit reach zero? To solve this problem, we simply solve the above inequality for p_u. This gives us

\displaystyle p_u > p_c - a((1-t_c)r_x - (1-t_u)).

The function for p_u follows similarly. As long as the U.S. price is greater than p_u, we retain our arbitrage advantage. So, if bobblehead prices remain fixed at $11.99 in the target country, the U.S. price can fall to $6.57 per unit and we’ll still earn a profit (as small as it might be at that price). You don’t need any extra information to calculate p_u. PPI gives us the prices we need, and the values for all the other variables—exchange and tax rates—are easily accessible to the public. We’re simply doing some basic algebraic shuffling.
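A companion sketch for the U.S. price floor, written in the form that reproduces the $6.57 figure, i.e., p_u > p_c − a((1 − t_c)r_x − (1 − t_u)):

```python
# U.S. price floor: below this, buying in the target country no longer profits.
def us_price_floor(a, t_u, t_c, r_x, p_c):
    return p_c - a * ((1 - t_c) * r_x - (1 - t_u))

floor = us_price_floor(a=100, t_u=0.39, t_c=0.47, r_x=1.2532, p_c=11.99)
print(round(floor, 2))   # 6.57
```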

If we factor sales tax into the price differential, we add a layer of complexity to the problem of quantifying arbitrage. If s_u and s_c are the sales-tax rates in the U.S. and Canada, respectively, then our profit functions become f : \mathbb{Q}^+\to\mathbb{Q} defined by

\displaystyle f(p_c) = \left(1+s_u\right)p_u + a\left(\left(1-t_c\right)r_x -\left(1-t_u\right)\right) -\left(1+s_c\right)p_c

for the price in Canada and

\displaystyle f(p_u) = -a\left(\left(1-t_c\right)r_x -\left(1-t_u\right)\right)+\left(1+s_c\right)p_c -\left(1+s_u\right)p_u

for the price in the U.S., respectively. In this more complex case, imagine we import $5000 USD at an exchange rate of r_x = 1.2532 with federal income-tax rates of t_u = 0.10 and t_c = 0.15 and sales-tax rates of s_u = 0.0875 and s_c = 0.12 in the U.S. and Canada, respectively. We note that NPPI < 1, and we want to purchase a new computer where p_u = 2500 USD and p_c = 3499 CAD. Do we have an arbitrage opportunity? Unfortunately, we don’t—not until the Canadian price is reduced to less than $3,165.
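A sketch of the two sales-tax-adjusted balances, using the computer-example values (the function and variable names are mine):

```python
# Sales-tax-adjusted comparison: stay-and-buy-local versus move-and-buy-in-the-U.S.
def balances(a, r_x, t_u, t_c, s_u, s_c, p_u, p_c):
    stay = a * r_x * (1 - t_c) - (1 + s_c) * p_c   # income in CAD, computer in CAD
    move = a * (1 - t_u) - (1 + s_u) * p_u         # income in USD, computer in USD
    return stay, move

stay, move = balances(a=5000, r_x=1.2532, t_u=0.10, t_c=0.15,
                      s_u=0.0875, s_c=0.12, p_u=2500, p_c=3499)
print(round(stay, 2), round(move, 2), round(stay - move, 2))
# 1407.22 1781.25 -374.03 -> no arbitrage at the current Canadian price
```

Re-running with p_c = 2999 reproduces the post-sale balance of $1,967.22 discussed below.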

The function for the Canadian price (above) reveals this upper bound when f(p_c) = 0 (solid green). As you can see from the graph, we lose money at the current price ratio ($1407.22 – $1781.25 = –$374.03). This is shown by the gap between the red- and blue-dashed lines traversed by the vertical line that represents the U.S. price x = p_u; this is the difference between f(p_u) values of the individual profit curves (and not the above functions that arise from setting those equations equal to each other and solving for p_c and p_u, which are represented by the solid lines on the graphs). Though we lose money if we purchase the computer in Canada at the current price of $3,499, we do gain a profit of $137.71 by importing it from the U.S. But let’s imagine we choose to wait for a local sale, and the Best Buy in Vancouver reduces the price to $2,999. Now, we do have an arbitrage opportunity:

We’ve now earned a better-than-importing profit of $1967.22 – $1781.25 = $185.97, despite the higher federal- and sales-tax rates, by bringing in the USD-based income and paying for the computer in CAD. The graph also reveals that we’ll continue to generate a profit until the U.S. price—in response to Canada’s competitiveness—drops to about $2,329 (purple), at which point the individual (dashed) profit curves intersect with each other at the equilibrium price point and the total profit drops to zero: Both the Canadian and U.S. consumers, at that point, would be left with a remaining balance of $1967.22 after purchasing the computer.

Below is a list of some real-world examples (between Canada and the U.S.) at the time of publication with a variety of values for some of the variables under consideration (NY and BC sales taxes were used for the calculations):



So, that’s it. Currency arbitrage in a nutshell. Perhaps something like this exists somewhere in the literature—I imagine it might, even though I’ve never seen it explicitly during my study of economics—but we offer it here in the event it will pique general interest. It might be beneficial to review the basic process involved in calculating the CAP for a given portfolio combined with a certain collection of data points:

(1) Calculate the equilibrium rate r_q
(2) Confirm NPPI < 1
(3) Determine p_c and p_u
(4) Determine s_c and s_u if necessary
(5) Purchase “identical items” in the target country if the price is less than p_c
(6) Continue purchasing (5)’s items until the U.S. price falls to p_u (assume p_c remains constant)

Anticipated objections to the CAP model:

(I) Availability of identical goods

If a tradeable good in the target country is truly unique, you couldn’t have purchased it anywhere else; the notion of PPP is simply unimportant in those cases. Of course, you will have to decide whether you wish to (or, for some reason, must) pay for that uniqueness or if you’d prefer to choose an item that closely (but not precisely) matches the one you’re considering, assuming such an item is available. As far as price modeling is concerned, very closely matched items can be (and probably should be) considered “identical.” Variations among packages of Bic pens, for example, probably don’t mean very much with respect to the sticker price.

(II) PPP applicability

PPP only accounts for so-called tradeable goods, but it is possible to compare “immobile” goods using a number of objective metrics. For homes and real estate, for example, we could use price/sq.ft., location, year of construction, amenities, projected repairs, and many other measures of objective value. Much like the issue of identical goods, then, we can gain a pretty good comparison of immobile goods across countries that will allow us to use a generalized approach to PPP. Value is in the eye of the beholder, which means an eye toward equality of value between such goods is achievable.

So, what about the big question: Should I move to the U.S. based on my fanciful financial projections or remain in the target country and bring the money here? Well, if NPPI < 1, which means the exchange rate outstrips the taxation gap, then I’m guaranteed a free lunch (or two) as long as I purchase (near-)identical goods that fall below the p_c upper bound. If I can do that through importing goods with a favorable exchange rate or by taking advantage of cheaper relative prices in the target country given a certain sales-tax profile, then it’s in my interest to eschew the idea of relocating, even though the tax rates are more favorable in the States.


[1] In wagers like these, everyone must hold their money until after the event is completed. In this way, an arbitrageur can cover her losses with her winnings and keep the remaining profit. Online betting sites require you to front the money as you make the wager, which is why this arbitrage strategy won’t work in those cases.

Can We Quantify Certain Kinds of Ethical Choices?

leave a comment »

In his book Ethics in the Real World, the renowned philosopher Peter Singer proposes a metric for ethical risk informed by his (generally held) worldview of consequentialism (i.e., the idea that the consequence of an act determines the ethical value of that act). Singer states that, generally speaking, “we can measure how bad a particular risk is by multiplying the probability of the bad outcome by how bad the outcome would be” (183). Thus, an act is considered more ethical if it offers less general risk (for death, for torture, for financial waste, for suffering, for climate change, etc.) than an alternative act. We can model Singer’s non-mathematical comments by the very simple product \text{S}_n = p_n\sigma_n, where \text{S}_n is the “Singer risk” for the nth event and \sigma_n and p_n are the outcome and probability for the nth event, respectively. Note that \text{S}_n is really just an area calculation in \mathbb{R}^2 with the “sides” of the rectangle defined as the two variables in question; the larger the area, the greater the risk. Simple, right? We will return to the concept of area later.

But is this a viable model for risk? Forget for a moment about other kinds of ethical choices we make that have less definitive outcomes—a decision to break a friend’s confidentiality, defending a colleague from a false accusation that risks alienation among one’s coworkers, telling the truth despite hurting someone’s feelings, etc. Limited to the probability of “bad outcomes” we can quantify, however, does Singer’s product capture a quantification of ethical risk in a real and intuitive way? Is the ethical value of an act, in general, determined by the consequence(s) of that act? At first glance, it seems we shouldn’t take Singer’s metric too seriously—and, perhaps, he doesn’t either—because it immediately strikes the reader as an inadequate method to quantify ethical risk in any meaningful way. How can we, to imagine one easy example, compare the loss of life between, and among, different demographics? Is it more ethical to prevent the death of a child if that preventive measure causes the death of, say, an elderly person? Five elderly people? What if it caused the death of a young, female professional at the height of her earning and reproductive powers? Is it even possible to balance those scales when making a risk assessment?

Even if we could achieve some sort of balance involving what I will call congruent cases (i.e., outcomes that involve a single parameter: the number of people harmed, the tonnage of CO2 released into the atmosphere, etc.), we’re still left with the much more difficult problem of quantifying incongruent outcomes: Is the ethical risk for blindness and malnutrition in third-world countries equal to that of domestic homelessness and drug addiction? Is rolling back the pursuit of nuclear energy (and the problems associated with managing its toxic, immutable waste) on par with diminishing our carbon footprint by reducing CO2 levels? If it is, how can we model that risk relationship? If not, why not, and how do we build into Singer’s model an objective and unbiased evaluation of those disparities? Assuming we accept Singer’s basic design, it would be an extraordinarily difficult task to “nondimensionalize,” as it were, the innumerable combinations of outcomes that would necessarily inform our decision-making process. If Singer’s model—and the philosophical platform of consequentialism, in general—has any hope of offering even a partial solution to the important kinds of ethical dilemmas he raises in his book, it must be able to handle the complexities involved in comparing these kinds of incongruities. But for the sake of argument, let’s set aside those additional complexities—as well as general critiques of consequentialism—and address the model in its most simplified form: a risk metric as a simple product limited to congruent outcomes.

Singer’s basic approach isn’t entirely without precedent. Financial risk models, for example, involve (the sum of the) products of probabilities and returns, but they are couched within much larger mathematical and statistical machinery and require several additional calculations (e.g., expected rate of return, variance, etc.). The expected value E(X) of a continuous random variable involves the integration of the product of the random variable and the PDF, which has attached to it certain conditions (the PDF is nonnegative, and its total integral equals 1). There are other examples. Singer, however, argues that a quantification of risk could be limited to the product of an outcome and the probability that outcome occurs, and it is the validity of this basic approach we will challenge.

In light of this very narrow definition of ethical risk, then, consider the following thought experiment, couched in the form of a poll question, posted on three different FB groups:

Which of the following options would you consider to be the ethically superior choice?
(1) Ten people are killed if you roll a three with a ten-sided die.

(2) One person is killed if you fail to roll a three with (a different) ten-sided die.

Here, we set two independent, stochastic events (very nearly) equal to each other, though it seems clear they’re not equal ethically; in doing so, we hoped to investigate whether people would respond to the quantification of risk, as defined by Singer, or, perhaps, something else. (I’ve reasonably defined “how bad the outcome would be” simply by the number of people who would be killed.) Contrary to predictions based on Singer’s metric, a sizable majority of people (33/44 = 0.75) selected option 1 as the more ethical choice, even though (a) option 2 actually offers slightly LESS risk, which makes it the preferred choice according to Singer’s model, and (b) the number of people at risk for harm in option 1 is ten times greater.
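As a quick sanity check, the two options expressed as Singer products:

```python
# Singer risks S_n = p_n * sigma_n for the two poll options.
S_1 = 0.1 * 10   # option 1: ten deaths with probability 1/10
S_2 = 0.9 * 1    # option 2: one death with probability 9/10
print(S_1, S_2, S_2 < S_1)   # 1.0 0.9 True -> option 2 is "safer" per the metric
```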

So, what happened?

Most people seem to have responded not to Singer’s risk metric but to the probability of the outcomes. The risk of ten people dying—\text{S}_1 = 1—is very much mitigated by the fact that there’s a 90-percent chance nothing happens and the ten people at risk will remain unharmed. This stands in sharp contrast to option 2—\text{S}_2 = 0.9—where there exists a 90-percent chance the person at risk will be killed, despite the fact that the total number of people at risk is one-tenth that of option 1. It seems the respondents simplified the ethical dilemma by focusing on the probabilities of the outcomes, as if the poll options were as follows:

(1) There’s a 90% chance no one dies.
(2) There’s a 10% chance no one dies.

Notice the sigma values have vanished. Risk has now been reduced to reflect the p-values for harm, as if participants (subconsciously) treated Singer’s metric like a function f:\mathbb{R}^+_0\to\mathbb{R} defined by f(p_n,\sigma_n) = p_n\sigma_n and evaluated the poll options as \partial_{\sigma_n}f. (Because \sigma_1\neq\sigma_2, we can’t simply cancel the outcomes.) This result is not particularly surprising. Most participants seemed to follow a probabilistic risk-aversion strategy rather than an outcome-averse one, but it’s an evaluation process that’s clearly not linked to Singer’s consequentialism, which demands a deference to the fact that \text{S}_2 < \text{S}_1. That is, the poll results reify the notion that ethical preferences might very well engender greater risk according to Singer’s model.

One might imagine what the polling would have looked like if it followed Singer’s metric. Perhaps everyone would have picked option 2, the result of privileging \sigma regardless of the associated probabilities and/or recognizing it offers less overall risk (0.9 < 1.0). In another scenario, the polling might have been split almost equally between both options, reflecting the (near) equality of risk between the two options. The next question seems inevitable: How could we equate these outcomes in the minds of respondents? How much would we have to increase the value of \sigma_1 such that people felt the objective evaluation of both risks were, in fact, very much the same, where it made essentially no difference (in terms of ethical risk) whether we chose option 1 or 2? Perhaps the relatively low p-value for option 1 would overwhelm any value we could assign to \sigma_1. Perhaps there’s more structure to perceptions involving probability-outcome relationships; for example, they might be inversely proportional to each other: p_1\propto\sigma_1^{-1}. It’s difficult to speculate. If Singer’s model were more robust, we could simply solve the equation for the appropriate variable and calculate the perfect balance of risk, much like we attempt to do in finance or economics. Unfortunately, like so many mathematical models in other fields, things are never quite so simple.

So, what do we do when mathematical equality doesn’t transpose to psychological or ethical “equality”? How can we make sense of two poll options with essentially equal risk that engender such a divergent response? Fortunately, we can use some tools from linear algebra to help us explore the degree to which two risk values—as Singer products with congruent outcomes—are (dis)similar. Assume a nonsingular risk matrix A is a 2 x 2 matrix whose entries are defined as follows: a_{11} = p_i, a_{12} = \sigma_i, a_{21} = p_j, and a_{22} = \sigma_j, where u = [p_i\,\,\sigma_i] and v = [p_j\,\,\sigma_j]. The length of the cross product of (these risk) vectors u, v is equal to the absolute value of det A:

\displaystyle \omega = \Vert \textbf{u}\times\textbf{v}\Vert = \left |\,\text{det}\!\begin{bmatrix}p_i & \sigma_i\\ p_j & \sigma_j\end{bmatrix}\right|=\left|\,p_i\sigma_j - p_j\sigma_i\right | .

The value of omega reveals a relationship between risk vectors. The determinant of a 2 x 2 matrix can be thought of as the area of a projected parallelogram in \mathbb{R}^2 delimited by its vectors—in this case, u and v. The greater the \omega value, the greater the area of the projected parallelogram and the larger the dissimilarity between Singer risks. Though \text{S}_1\approx\text{S}_2 according to the Singer metric, \omega = 8.9, revealing the relationship is not nearly as close as the risk products suggest. This result might also proffer a partial explanation for the poll results, which, despite near equality in risk values, are heavily skewed toward option 1. Perhaps \omega responds in some way to the respondents’ decision to privilege likelihood over outcome. For a quick comparison, consider \omega = 1.1 when p_1 = 0.5, \sigma_1 = 6, p_2 = 0.6, and \sigma_2 = 5. Here, the Singer risks are equal (3), yet even while comparing two events with identical risk products, the sensitivity of \omega is able to differentiate between them. That may be a helpful and quick initial guide when comparing the risk of two congruent ethical choices.
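A minimal sketch of the omega calculation for two (probability, outcome) risk vectors:

```python
# Omega dissimilarity: |det A| for the 2x2 risk matrix built from u and v.
def omega(u, v):
    (p_i, s_i), (p_j, s_j) = u, v
    return abs(p_i * s_j - p_j * s_i)

print(round(omega((0.1, 10), (0.9, 1)), 2))   # 8.9  (the poll options)
print(round(omega((0.5, 6), (0.6, 5)), 2))    # 1.1  (equal Singer risks of 3)
```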

Preference Rules

At this point, we might be inclined to consider the feasibility of certain kinds of “preference rules” (PR) with respect to ethical risk; that is, are there any ways to make an objectively unequivocal decision between Singer risks given certain values? The short answer: Yes, there are, and we list three such rules (PR1-3) that will always hold in any Singer-risk comparison. We also include two “derived preference rules” (DPR) that similarly hold in any situation:

PR1: p_i = p_j\to \min\,(\sigma_i,\sigma_j)
PR2: \sigma_i = \sigma_j\to \min\,(p_i,p_j)
PR3: \left((p_i < p_j) \land (\sigma_i < \sigma_j)\right) \to \text{S}_i

DPR1:  ((p_i\sigma_j < p_j\sigma_i) \land (p_i p_j^{-1} > 1)) \to \text{S}_j
DPR2:  ((p_i\sigma_j < p_j\sigma_i) \land (p_i p_j^{-1} < k^{-1})) \to \text{S}_i

PR1-3 are almost insultingly obvious, and for those unfamiliar with the symbols of formal logic, I offer an informal exposition. PR1 states that if the probabilities between two Singer risks are equal, we will prefer the smaller outcome, which is equivalent to preferring the smaller Singer-risk value. (Remember, we’re limiting our investigation to risk products with congruent outcomes.) PR2 simply reverses the issue addressed in PR1: If the outcomes of two Singer-risk values are the same, we will prefer the smaller probability, where, again, we’re preferring the smaller Singer-risk value. PR3 formalizes the concept inherent in PR1-2: If both the probability and the outcome of a Singer-risk value are smaller than those of a second Singer-risk value, we will prefer, as we should expect, the smaller Singer-risk value. These rules are inviolable and will obviously hold in all cases. (The universal quantifier \forall\, i,j is implied in each of the above cases.)
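For concreteness, here is one hypothetical way to encode PR1-3; returning None flags the mismatched cases the rules cannot decide:

```python
# Hypothetical encoding of PR1-PR3: returns which Singer risk to prefer.
def prefer(p_i, s_i, p_j, s_j):
    if p_i == p_j:                       # PR1: equal probabilities -> smaller outcome
        return 'i' if s_i < s_j else 'j'
    if s_i == s_j:                       # PR2: equal outcomes -> smaller probability
        return 'i' if p_i < p_j else 'j'
    if p_i < p_j and s_i < s_j:          # PR3: dominated on both axes
        return 'i'
    if p_j < p_i and s_j < s_i:          # PR3, roles reversed
        return 'j'
    return None                          # mismatched case, like the poll options

print(prefer(0.1, 10, 0.9, 1))   # None -- PR1-3 cannot decide the poll question
```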

The DPRs are only slightly less obvious, and we only construct them because they relate to our earlier exploration of omega. DPR1 says that if p_i\sigma_j is the smaller risk-matrix value and p_i p_j^{-1} is greater than 1, prefer \text{S}_j. This is a convenient rule if you’re given det A products and the associated probabilities. The proof for this is trivial.

Proof (direct): Suppose A is a 2 x 2 nonsingular (i.e., \text{det}\,A\neq 0) risk matrix such that p_1\sigma_2 < p_2\sigma_1. Then, p_1 p_2^{-1}<\sigma_1\sigma_2^{-1}. If p_1 p_2^{-1} > 1, then p_1 > p_2. But \sigma_1\sigma_2^{-1}>p_1 p_2^{-1}, which means \sigma_1\sigma_2^{-1}>1 and \sigma_1 > \sigma_2. \Box

Thus, we will prefer the smaller Singer-risk value as prescribed by PR3. Unfortunately, a proof for DPR2 must use a different approach, but we can at least state it as follows: Suppose A is a 2 x 2 nonsingular risk matrix such that p_i\sigma_j < p_j\sigma_i and the ratio of probabilities, p_ip_j^{-1}, is less than 1/k; then prefer Singer risk \text{S}_i. The same definitions from DPR1 apply here as well. One might have already asked the obvious question: Whence k? It arises in the process of transforming the principal inequality to an equality (writing i = 1 and j = 2 for concreteness):

\displaystyle p_1\sigma_2 < p_2\sigma_1\;\Longrightarrow\; kp_1\sigma_2 = p_2\sigma_1\;\Longrightarrow\;\ln k = \ln \sigma_1\sigma_2^{-1} - \ln p_1p_2^{-1}.

We know k > 1, so k^{-1} < 1, and we’re now in a position to formalize a proof for DPR2.

Proof (direct): We need to show that if p_1 p_2^{-1} < k^{-1}, then \sigma_2 > \sigma_1 given the det A inequality. By transforming the former inequality into one we can use, we see that p_1 p_2^{-1} < k^{-1} becomes -\ln p_1p_2^{-1} > \ln k by the properties of logarithms, and it’s no coincidence this latter inequality involves the last two (RHS) terms of the final equality displayed above. It is the case that \sigma_2 > \sigma_1 if -\ln p_1p_2^{-1} > \ln k because of the signs of the terms. Simplifying, we have

\displaystyle \ln \sigma_1\sigma_2^{-1} < 0,

which only holds when \sigma_2 > \sigma_1, as desired.  \Box

Because p_2 > p_1 and \sigma_2 > \sigma_1, we will prefer \text{S}_1 as prescribed by PR3. How do these DPRs relate to the poll options? We have 0.1(1) < 0.9(10) and 0.1 < 0.9, so we need to determine if 0.1/0.9 < 1/k. In this case, 1/k is smaller, which confirms our intuition that PR3 can’t be invoked in the poll-question case: j > i with respect to p-values, and i > j in terms of outcomes. Unfortunately, we cannot establish a similarly inviolable preference rule for these kinds of mismatched inequalities, for it is this mismatched relationship that makes it difficult to determine the ethical preference between single products involving congruent outcomes.
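The arithmetic in this paragraph can be checked directly. A quick sketch (my computation, following the definitions above): k is the factor that scales the smaller det-A product up to the larger one.

```python
# Poll values: option 1 (p1, s1) vs. option 2 (p2, s2).
p1, s1 = 0.1, 10
p2, s2 = 0.9, 1

# k turns the inequality p1*s2 < p2*s1 into an equality: k*p1*s2 = p2*s1.
k = (p2 * s1) / (p1 * s2)

print(round(k, 6))        # 90.0
print(p1 / p2 < 1 / k)    # False: 1/k is smaller, so DPR2 cannot be invoked
```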

Area as a Quantification for Risk

What about Singer’s implied use of “area” as a metric? We’ve already mentioned its very limited scope prevents it from being a comprehensive model, but the fact that his model quantifies risk as an area calculation is not, ipso facto, a problem. There exists a long and storied tradition in mathematics, for example, in which area calculations are the very calculations we want: probability densities, work, distance, center-of-mass problems, kinetic energy, average value of a function, and arc length are only a few examples. We’ll see a few more shortly. Within the context of that rich tradition, then, we can imagine a number of other area-based models in an effort to uncover a metric that (at least) tracks the poll results. Part of what complicates matters is that the probability values of “Singer risks” aren’t built upon the same mathematical infrastructure. In other words, we cannot directly compare the probabilities as if they were drawn from the same distribution. We’d like to be able to do so, but if we view \sigma_n as a continuous random variable, the associated PDFs cannot be equal. This can be seen with even a quick glance at the poll options: What probability distribution, for example, decreases probability values as we increase the area under the distribution curve? A cohesive PDF in the case of the Singer metric would have to yield both 0.9 at x = 1 and 0.1 at x = 10. If there is such a distribution, I’m not aware of it. Of course, the lack of a distinct PDF is mitigated by the reality that our ethical dilemma isn’t entirely random. Yes, there is a stochastic process attached to an impending if-then action, but that’s not the same thing as having a truly random variable.

We consider three area calculations as integrations of functions f : \mathbb{R}_0^+\to\mathbb{R} defined implicitly below:

1.  \displaystyle\int_0^{\sigma_n}\!\!\!xc_n\,\mathrm{d}x = p_n\int_0^{\sigma_n}\!\!x\,\mathrm{d}x

The quantification of risk is now the area under the above linear function where p_n=c_n \in (0,1) is simply the slope of the function. Option 2 is far less risky (0.45 << 5) because risk now involves the quadratic growth of the outcome. A mental visualization of the (respective areas under the) graphs of these Singer metrics will be enough to convince anyone of the inadmissibility of this approach as a viable model. As we’ve seen, most people seemed to respond to the likelihood of the event and not the value of \sigma, as if the outcomes were largely irrelevant to the decision-making process, yet the linearity of an outcome-based design dramatically increases sensitivity to \sigma; we’re simply privileging the wrong variable.
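This first model has a closed form: integrating p_n x from 0 to \sigma_n gives p_n\sigma_n^2/2. A quick numeric check of the 0.45 vs. 5 claim (the function name is mine):

```python
# Model 1: area under y = p*x on [0, sigma] -> p * sigma**2 / 2.

def linear_area_risk(p, sigma):
    return p * sigma**2 / 2

print(linear_area_risk(0.1, 10))   # option 1: 5.0
print(linear_area_risk(0.9, 1))    # option 2: 0.45
```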

2.  \displaystyle \iint\limits_Rf(x,y)\,\,\mathrm{d}A=\int_0^{p_n}\!\!\!\!\int_0^{\sigma_n}\!xy\,\,\mathrm{d}x\,\mathrm{d}y

A volume calculation in \mathbb{R}^3 does a better job of approximating the Singer-risk values, but it also fails to model the poll results. The integration is straightforward and simplifies to \frac{1}{4}(p_n\sigma_n)^2. This approach only slightly reduces the value of the quadratic growth in the previous example by squaring the p-value, but it’s not enough of a reduction in most cases. (Recall from analysis that if a > 1, then (a^{-n})\to 0 as n\to\infty.) Even though this new model tightens the risk difference between both options (+0.0475 vs. +0.1), it still suggests option 2 offers slightly less risk: \text{S}_1 = 0.25 and \text{S}_2 = 0.2025.
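The closed form quoted above, \frac{1}{4}(p_n\sigma_n)^2, reproduces the stated values directly (function name mine):

```python
# Model 2: iterated integral of xy over [0, sigma] x [0, p] -> (p*sigma)**2 / 4.

def volume_risk(p, sigma):
    return (p * sigma) ** 2 / 4

s1 = volume_risk(0.1, 10)   # 0.25
s2 = volume_risk(0.9, 1)    # 0.2025
print(s1, s2, round(s1 - s2, 6))   # the gap tightens to 0.0475
```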

3.  \displaystyle \iint\limits_Gg(x,y,f(x,y))\left((\partial_xf)^2+(\partial_yf)^2+1\right)^{1/2}\!\mathrm{d}A=\!\left(\frac{1}{4}\sigma_n^{-2}+\frac{5}{4}\right)^{1/2}\!\!\int_0^{p_n}\!\!\!\!\int_0^{\sigma_n}\left(xy+\frac{x}{2\sigma_n}-\frac{y}{2}+1\right)\mathrm{d}x\,\mathrm{d}y

Here, the surface-area calculation in \mathbb{R}^3 also fails to model the poll results. We (somewhat arbitrarily) choose the plane z=f(x,y)=x(2\sigma_n)^{-1} - y/2 + 1 in the hope of striking a better balance between probabilities and outcomes. Simplifying and solving leaves us with the following product:

Here, like the other two approaches, option 1 remains the riskier option: \text{S}_1\approx 1.7 and \text{S}_2\approx 1.4. The risk gap between options, however, has now widened compared to the volume calculation, and we still have the quadratic growth of \sigma built into the model. An alternative iterated integral—namely, \int_0^{p_n}\!\int_0^{\sigma_n}xyz\,\sec\gamma\,\mathrm{d}x\,\mathrm{d}y—shrinks this gap (\text{S}_1\approx 0.36 and \text{S}_2\approx 0.26), but it still fails to track the majority decision to treat option 1 as the more ethical choice. Thus, in every case, the models we’ve explored produce a risk value for option 2 that is less than option 1. This is disappointing, but we only intend to offer a brief investigation into the possibility of an alternative model. A fully realized and robust design is well beyond the scope of a blog post, so I will leave it to interested readers to pursue a viable solution, including a better motivation for z, concerning the kinds of ethical problems we’re investigating here.


After all this, though, we’ve neglected to ask, perhaps, the most crucial question: Is the concept of risk a vitally important and pervasive consideration? To this, we must offer a full-throated “yes!” We need only remind the reader that notions of risk aren’t, as this post might suggest to some, mere fodder for a tiresome intellectual and mathematical exercise; we as a society make many decisions based on quantifications of risk—from actuaries calculating life expectancy for insurance policies and the beta risk of financial investments to disaster management and the cost-benefit analysis involved with safety recalls. And though pure notions of ethical risk are absent in most of these examples, we still very much engage in just the kinds of stochastic events reified in the poll options—where lives can, and often do, (literally) hang in the balance; there are probabilities associated with dying an unnatural death by being hit by a drunk driver, with the collapse of hedge funds holding severely over-leveraged arbitrage portfolios, with the flooding and destruction of Florida’s coastal cities during hurricane season, and with how many people might be killed by driving Company X’s new car. But that’s not all: We use these quantifications to draft legislation, evaluate legal settlements, decide how maintenance funds are allocated, and design the constellation of food and products your children will put in their mouths. Sometimes, such decisions can lead to subversive, and even illegal, acts.

Yet despite the vertiginous ubiquity of risk assessments that swirl around us, many people simply refused to choose a poll option on the grounds of some misguided moral indignation (“The only ethical choice is not to choose!”). Perhaps their reluctance involves the potential dread that comes with an increased awareness of self, that adjudicating a tough ethical decision requires a prism through which some are afraid to see themselves. It takes the willingness of an honest and restless soul to subject oneself to such psychic refractions.

That we should all have such courage.

A Proposed Proof for the Existence of God

leave a comment »

Assume it is impossible to prove God does not exist. Then the probability that God exists, denoted p(G), is greater than zero.[1] Also assume, as many important physicists and cosmologists do, that (1) the multiverse exists and is composed of an infinite number of independent universes and (2) our current universe is but one of those infinite universes existing in the multiverse.[2]

If the probability of the non-existence of God, denoted p(\neg G), in some universe is defined as

\displaystyle p(\neg G) = (1 - ab^{-1})\in\left(0,1\right),

then as the number of universes (n) approaches infinity,

\displaystyle \lim_{n \rightarrow \infty} (1 - ab^{-1})^n = 0.

That is, the sequence \left(1-ab^{-1}\right)^n\to 0 as n\to\infty. Any event that can happen will ineluctably happen given enough trials.[3] This means God must exist in at least one universe within the multiverse, and if He does, then He must exist in all universes, including our universe, because omnipresence is a necessary condition for God to exist.  \blacksquare
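The limit at the heart of the argument is just geometric decay. A minimal numeric illustration (the values of a and b are mine, chosen purely for illustration):

```python
# (1 - a/b)**n -> 0 as n grows, for any fixed a/b in (0, 1).
a, b = 1, 10**6            # even a minuscule p(G) = a/b suffices
q = 1 - a / b              # chance God does not exist in a single universe

for n in (10**6, 10**7, 10**8):
    print(n, q ** n)       # shrinks toward zero as n grows
```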


[1] That is, if p(G)=ab^{-1}\in (0,1), then ab^{-1}\in\mathbb{Q}^+ with a,b\in\mathbb{Z}^+ and b > a.

[2] This is certainly a reasonable, if not ubiquitously held, concept that follows from the mathematics of inflationary theory. In Our Mathematical Universe, for example, Max Tegmark suggests if “inflation…made our space infinite[, t]hen there are infinitely many Level I parallel universes” (121). If this still seems unreasonable, consider the fact that a random walk on a lattice in \mathbb{R}^2 “has unity probability of reaching any point (including the starting point) as the number of steps approaches infinity.”

[3] The multiverse is often thought of as a way for cosmologists to bowdlerize fine-tuning arguments by contextualizing them as products of inevitability.

Written by u220e

2016-09-15 at 9:13 pm

The Myth of Altruism?

leave a comment »

The American Heritage Dictionary (2011) defines “altruism” as “selflessness.” If one accepts that standard definition, then it seems reasonable to view an “altruistic act” as one that fails to produce a net gain in personal benefit for the actor subsequent to its completion. (Here, we privilege psychological altruism as opposed to biological altruism, which is often dismissed by the “selfish gene” theory of Darwinian selection and notions of reproductive fitness.) Most people, however, assume psychologically-based altruistic acts exist because they believe an act that does not demand or expect overt reciprocity or recognition by the recipient (or others) is so defined. But is this view sufficiently comprehensive, and is it really possible to behave toward others in a way that is completely devoid of self? Is self-interest an ineluctable process with respect to volitional acts of kindness? Here, we explore the likelihood of engaging in an authentically selfless act and capturing true altruism, in general. (Note: For those averse to mathematical jargon, feel free to skip to the paragraph that begins with “[A]t this stage” to get a basic understanding of orthogonality and then move to the next section, “Semantic States as ‘Intrinsic Desires’,” without losing much traction.)

The Model

Imagine for a moment every potential (positive) outcome that could emerge as a result of performing some act—say, holding the door for an elderly person. You might receive a “thank you,” a smile from an approving onlooker, someone reciprocating in kind, a feeling you’ve done what your parents (or your religious upbringing) might have expected you to do, perhaps even a monetary reward—whatever. (Note: We assume there will never be an eager desire or expectation for negative consequences, so we require all outcomes to be positive, beneficial events. Of course, a comprehensive model would also include the desire to avoid negative consequences—the ignominy of failing to return a wallet or aiding a helpless animal (an example we will revisit later)—but these can be transformed into positive statements that avoid the unnecessary complications associated with the contrapositive form.)

We suppose there are n outcomes, and we can imagine each outcome enjoys a certain probability of occurring. We will call this the potential vector \mathbf{p}, the components of which are simply the probabilities that each outcome (ordered 1 through n) will occur:

\displaystyle \mathbf{p} = [p(1), p(2), p(3),\dots,p(n-1),p(n)]

and 0\leq p(i)\leq 1, where \sum_{i=1}^n p(i) does not have to equal 1 because events are independent and more than a single outcome is possible. (You might, for example, receive both a “thank you” and a dollar bill for holding the door for an elderly woman.) So, the vector \mathbf{p} represents the agglomeration of the discrete probabilities of every positive thing that could occur to one’s benefit by engaging in the act.

Consider, now, another vector, \mathbf{q}, that represents the constellation of desires and expectations for the possible outcomes enumerated in \mathbf{p}. That is, if \mathbf{q} = [q(1),q(2),q(3),\dots,q(n-1),q(n)], then q(i) catalogs the interest and desire in outcome p(i). (It might be convenient to imagine \mathbf{q} as a binary vector of length n and an element of \text{R}_2^n, but it will be better to treat \mathbf{q} vectors as a subset of the parent vector space \text{R}^n to which \mathbf{p} belongs.) In other words, q(i) = 0,1: either you desire the outcome (whose probability is denoted by) p(i) or you don’t. (There are no “probabilities of expectation or desire” in our model.) We will soon see how these vectors address our larger problem of quantifying acts of altruism.

The point \text{Q} in \text{R}^n is determined by \mathbf{q}, and we want to establish a plane parallel to (and including) \mathbf{q} with normal vector \mathbf{p}. Define a point X generated by a vector \mathbf{x} = t\mathbf{q}, where the scalar t>1 and \mathbf{x} = [c_1,c_2,c_3,\dots,c_{n-1},c_n]. If \mathbf{p} is a normal vector of \mathbf{x} - \mathbf{q}, then the normal-form equation of the plane is given by \mathbf{p}\cdot(\mathbf{x} - \mathbf{q})=0, and its general equation is

\displaystyle\sum_{i=1}^n p(i)c_i = p(1)c_1 + p(2)c_2 + \dots + p(n-1)c_{n-1} + p(n)c_n=0.

We now have a foundation upon which to establish a basic, quantifiable metric for altruism. If we assume, as we did above, that an altruistic act benefits the recipient and fails to generate any positive benefits for the actor, then such an act must involve potential and expectation vectors whose scalar product equals zero, which means they stand in an orthogonal (i.e., right-angle) relationship to each other. It is interesting to note there are only two possible avenues for orthogonality between \mathbf{p} and \mathbf{q} within our model: (a) the actor desires and/or expects absolutely no rewards (i.e., \mathbf{q}=\mathbf{0}), which is the singular and generally understood notion of altruism, and (b) the actor only desires and/or expects rewards that are simply impossible (i.e., p(i)=0 where q(i)=1). (We will assume \mathbf{p}\neq\mathbf{0}.) In all other cases, the scalar product will be greater than zero, violating the altruism requirement that there be no benefit to the actor. Framed another way, (the vector of) an altruistic act forms part of a basis for a subspace in \text{R}^n.

At this stage, it might be beneficial to pause and walk through a very easy example. Imagine there are only three possible outcomes for buying someone their morning coffee at Starbucks: (1) the recipient says “thank you,” (2) someone buys your coffee for you (“paying it forward”), and (3) the person offers to pay your mortgage. A reasonable potential vector might be [0.9, 0.5, 0]—i.e., there’s a 90% chance you’ll get a “thank you,” a 50% chance someone else will buy your coffee for you, and a zero-percent chance this person will pay your mortgage. Now, assume your expectation vector for those outcomes is [1, 0, 0]—you expect people to say “thank you” when someone does something nice for them, but you don’t expect someone to buy your coffee or pay your mortgage as a result. The scalar product is greater than zero (0.9(1) + 0.5(0) + 0(0) = 0.9), which means the act of buying the coffee fails to meet the requirement for altruism (i.e., the potential vector is not orthogonal to the plane that includes Q and X = t\mathbf{q}). In this example, as we’ve seen in the general case, the only way buying the coffee could have been an altruistic act is if (a) the actor expects or desires no outcome at all or (b) the actor expected or desired her mortgage to be paid (and nothing else). We will discuss later the reasonableness of the former scenario. (It might also be interesting to note the model can quantify the degree to which an act is altruistic.)
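The coffee example reduces to a single dot product. A minimal sketch (vector names are mine): the act counts as altruistic only when the potential and expectation vectors are orthogonal.

```python
# Altruism test: scalar product of potential vector p and expectation vector q.

def dot(p, q):
    """Scalar product of potential vector p and expectation vector q."""
    return sum(pi * qi for pi, qi in zip(p, q))

p = [0.9, 0.5, 0]   # chances of: 'thank you', coffee paid forward, mortgage paid
q = [1, 0, 0]       # expectations: a 'thank you' only

print(dot(p, q))                 # 0.9 -> not orthogonal, so not altruistic
print(dot(p, [0, 0, 0]) == 0)    # True: expecting nothing at all
print(dot(p, [0, 0, 1]) == 0)    # True: desiring only the impossible outcome
```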

The above formalism will work in every case where there is a single, fixed potential vector and a specified constellation of expectations; curious readers, however, might be interested in cases where there exists a non-scalar-multiple range of expectations (i.e., when X=\mathbf{x}\neq t\mathbf{q} for some scalar t), and we can dispatch the formalism fairly quickly. In these cases, orthogonality would involve a specific potential vector and a plane involving the displacement of expectation vectors. The vector form of this plane is \mathbf{x}=\mathbf{q} + t_1\mathbf{u} + t_2\mathbf{v}, and direction vectors \mathbf{u},\mathbf{v} are defined as follows:

\displaystyle \mathbf{u}=\overrightarrow{QS}=[s(1)-q(1),\,s(2)-q(2),\,\dots,\,s(n)-q(n)],

with \mathbf{v} defined similarly for points Q and R; t_i are scalars (possibly understood as time per some unit of measurement for a transition vector), and points S and R of the direction vectors are necessarily located on the plane in question. Unpacking the vector form of the equation yields the following matrix equation:

\displaystyle\begin{bmatrix}c_1\\c_2\\c_3\\ \vdots\\c_{n-1}\\c_n\end{bmatrix}=\begin{bmatrix}q(1)\\q(2)\\q(3)\\ \vdots\\q(n-1)\\q(n)\end{bmatrix}+t_1\begin{bmatrix}s(1)-q(1)\\s(2)-q(2)\\s(3)-q(3)\\ \vdots\\s(n-1)-q(n-1)\\s(n)-q(n)\end{bmatrix}+t_2\begin{bmatrix}r(1)-q(1)\\r(2)-q(2)\\r(3)-q(3)\\ \vdots\\r(n-1)-q(n-1)\\r(n)-q(n)\end{bmatrix}

whose parametric equations are

\displaystyle\begin{matrix}c_1=q(1)+t_1[s(1)-q(1)]+t_2[r(1)-q(1)]\\ \vdots\\ c_n=q(n)+t_1[s(n)-q(n)]+t_2[r(n)-q(n)].\end{matrix}

It’s not at all clear how one might interpret “altruistic orthogonality” between a potential vector and a transition or range (i.e., subtraction) vector of expectations within this alternate plane, but it will be enough for now to consider its normal vectors—one at Q and, if we wish, one at X (through the appropriate mathematical adjustments)—as secondary altruistic events orthogonal to the relevant plane intersections:

\displaystyle p_1(1)c_1 - p_2(1)c_1 + p_1(2)c_2 - p_2(2)c_2 + \dots + p_1(n)c_n - p_2(n)c_n = 0.

Semantic States as ‘Intrinsic Desires’

To this point, we’ve established a very simple mathematical model that allows us to quantify a notion of altruism, but even this model hinges on the likelihood that one’s expectation vector equals zero: an actor neither expects nor desires any outcome or benefit from engaging in the act. This seems plausible for events we can recognize and catalog (e.g., reciprocal acts of kindness, expressions of affirmation, etc.), but what about the internal motivations—philosophers refer to these as intrinsic desires—that very often drive our decision-making process? What can we say about acts that resonate with these subjective, internal motivations like religious upbringing, a generic sense of rectitude, cultural conditioning, or the Golden Rule? These intrinsic desires must also be included in the collection of benefits we might expect to gain from engaging in an act and, thus, must be included in the set of components of potential outcomes. If you’ve been following the above mathematical discussion, such internal states guarantee non-orthogonality; that is, they secure a nonzero scalar product \mathbf{p}\cdot\mathbf{q} because p_k, q_k > 0 for some internal state k. This means internal states preclude a genuine act of altruism. It is important to note, too, these acts are closely associated with notions of social exchange theory, where (1) “assets” and “liabilities” are not necessarily objective, quantifiable things (e.g., wealth, beauty, education, etc.) and (2) one’s decisions often work toward shrinking the gap between the perceived self and ideal self. (See, particularly, Murstein, 1971.) In considering the context of altruism, internal states combine these exchange features: An act that aligns with some intrinsic desire will bring the actor closer to the vision of his or her ideal self, which, in turn, will be subjectively perceived and experienced as an asset. Altruism is perforce banished in the process.

So, the question then becomes: Is it possible to act in a way that is completely devoid of both a desire for external rewards and any motivation involving intrinsic desires, internal states that provide (what we will conveniently call) semantic assets? As I hope I’ve shown, yes, it is (mathematically) possible—and in light of that, then, I might have been better served placing quotes around the word myth in the title—but we must also ask ourselves the following question: How likely is it that an act would be genuinely altruistic given our model? If we imagine secondary (non-scalar) planes P_1, P_2,\dots, P_n composed of expectation vectors from arbitrary points p_1,p_2,\dots,p_n (with p_j \in P_j) parallel to the x-axis, as described above, then it is easy to see there are a countably infinite number of planes orthogonal to the relevant potential vector. (Assume q\neq 0 because if q is the zero vector, it is orthogonal to every plane.) But there are an (uncountably) infinite number of angles 0<\theta<\pi with \theta\neq\pi/2, which means there exists a far greater number of planes that are non-orthogonal to a given potential vector, but this only considers \theta rotations in \mathbb{R}^2 as a two-dimensional slice of our outcome space \mathbb{R}^n. As you might be able to visualize, the number of non-orthogonal planes grows considerably if we include \theta rotations in \mathbb{R}^3. Within the context of three dimensions, and to get a general sense of the unlikelihood of acquiring random orthogonality, suppose there exists a secondary plane, as described above, for every integer-based value of 0<\theta<\pi (and \theta\neq\pi/2) with rotations in \mathbb{R}^2; then the probability of a potential vector being orthogonal to a randomly chosen plane P_j of independent expectation vectors is quite small: p = 1/178 \approx 0.00561797753 (to eleven decimal places). If we include \mathbb{R}^3 rotations to those already permitted, the p-value for random orthogonality decreases to 0.00001564896, a value so small as to be essentially nonexistent. So, although altruism is theoretically possible because our model admits the potential for orthogonality, our model also suggests such acts are quite unlikely, especially for large n. For philosophically sophisticated readers, the model supports the theory of psychological altruism (henceforth ‘PA’) that informs the vast majority of decisions we make in response to others, but based on p-values associated with the prescribed model, I would argue we’re probably closer to Thomas Hobbes’s understanding of psychological egoism (henceforth ‘PE’), even though the admission of orthogonality subverts the totalitarianism and inflexibility inherent within PE.

One final thought explicates the obvious problem with our discussion to this point: There isn’t any way to quantify probabilities of potential outcomes based on events that haven’t yet happened, even though we know intuitively such probabilities, outcomes, and expectations exist. To be sure, the concept of altruism is palpably more philosophical or psychological or Darwinian than mathematical, but our model is successful in its attempt to provide a skeletal structure to a set of disembodied, intrinsic desires—to posit our choices are, far more often than they are not, means to ends (whether external or internal) rather than selfless, other-directed ends in themselves.

Some Philosophical Criticisms

Philosophical inquiry concerning altruism is rich and varied. Aristotle believed the concept of altruism—the specific word was not coined until 1851 by Auguste Comte—was an outward-directed moral good that benefited oneself, the benefits accruing in proportion to the number of acts committed. Epicurus argued that selfless acts should be directed toward friends, yet he viewed friendship as the “greatest means of attaining pleasure.” Kant held for acts that belied self-interest but argued, curiously, they could also emerge from a sense of duty and obligation. Thomas Hobbes rejected the notion of altruism altogether; for him, every act is pregnant with self-interest, and the notion of selflessness is an unnatural one. Nietzsche felt altruistic acts were degrading to the self and sabotaged each person’s obligation to pursue self-improvement and enlightenment. Emmanuel Levinas argued individuals are not ends in themselves and that our priority should be (and can only be!) acting benevolently and selflessly towards others—an argument that fails to address the conflict inherent in engaging with a social contract where each individual is also a receiving “other.” (This is the problem with utilitarian-based approaches to altruism, in general.) Despite the varied historical analyses, nearly every modern philosopher (according to most accounts) rejects the notion of psychological egoism—the notion that every act is driven by benefits to self—and accepts, as our model admits, that altruism does motivate a certain number of volitional acts. But because our model suggests very low p-values for PA, it seems prudent to address some of the specific arguments against a prevalent, if not unshirted, egoism.

1. Taking the blue pill: Testing for ‘I-desires’

Consider the following story:

Mr. Lincoln once remarked to a fellow passenger…that all men were prompted by selfishness in doing good. His [companion] was antagonizing this position when they were passing over a corduroy bridge that spanned a slough. As they crossed this bridge they espied an old razor-backed sow on the bank making a terrible noise because her pigs had got into the slough and were in danger of drowning. [M]r. Lincoln called out, ‘Driver can’t you stop just a moment?’ Then Mr. Lincoln jumped out, ran back and lifted the little pigs out of the mud….When he returned, his companion remarked: ‘Now Abe, where does selfishness come in on this little episode?’ ‘Why, bless your soul, Ed, that was the very essence of selfishness. I should have had no peace of mind all day had I gone on and left that suffering old sow worrying over those pigs.’ [Feinberg, Psychological Altruism]

The author continues:

What is the content of his desire? Feinberg thinks he must really desire the well-being of the pigs; it is incoherent to think otherwise. But that doesn’t seem right. Feinberg says that he is not indifferent to them, and of course, that is right, since he is moved by their plight. But it could be that he desires to help them simply because their suffering causes him to feel uncomfortable (there is a brute causal connection) and the only way he has to relieve this discomfort is to help them. Then he would, at bottom, be moved by an I-desire (‘I desire that I no longer feel uncomfortable’), and the desire would be egoistic. Here is a test to see whether the desire is basically an I-desire. Suppose that he could simply have taken a pill that quietened the worry, and so stopped him being uncomfortable, and taking the pill would have been easier than helping the pigs. Would he have taken the pill and left the pigs to their fate? If so, the desire is indeed an I-desire. There is nothing incoherent about this….We can apply similar tests generally. Whenever it is suggested that an apparently altruistic motivation is really egoistic, since it [is] underpinned by an I-desire, imagine a way in which the I-desire could be satisfied without the apparently altruistic desire being satisfied. Would the agent be happy with this? If they would, then it is indeed an egoistic desire. If not, it isn’t.

This is a powerful argument. If one could take a pill—say, a tranquilizer—that would relieve the actor from the discomfort of engaging the pigs’ distress, which is the assumed motivation for saving the pigs according to the (apocryphal?) anecdote, and if the actor would nonetheless get out of the coach and save the pigs, then that volitional act must be considered a genuinely altruistic act because it is directed toward the welfare of the pigs and is, by definition, not an “I-desire.” But this analysis makes two very large assumptions: (1) there is a singular motivation behind an act and (2) we can whisk away a proposed motivation by some physical or mystical means. To be sure, there could be more than one operative motivation for an action—say, avoiding discomfort and receiving a psychosocial reward—and the thought-experiment of a pill removing the impetus to act does not apply in all cases.

Suppose, for example, one only desires to avoid the pigs’ death and not the precursor of their suffering. Is it meaningful to imagine the possibility of a magical pill that could avoid the pigs’ death? If by the “pill test” we intend to eviscerate any and all possible motivations by some fantastic means, then we really haven’t said much at all. We’ve only argued the obvious tautology: that things would be different if things were different. (Note: the conditional A \to A is always true, which means A \leftrightarrow A is, too.) Could we, for example, apply this test to our earlier coffee experiment? Imagine our protagonist could take a pill that would, by acting on neurochemical transmitters, magically satisfy her expectation and desire for being thanked for purchasing the coffee. Can we really say her motivation is now altruistic, presumably because the pill has rendered an objective “thank you” from the recipient unnecessary? In terms of our mathematical model, does the pill create a zero expectation vector? It’s quite difficult to imagine this is the case; the motivation—that is, the expectation of, and desire for, a “thank you”—is not eliminated because it is fulfilled by a different mechanism.

2. Primary object vs. Secondary possessor

As a doctor who desires to cure my patient, I do not desire pleasure; I desire that my patient be made better. In other words, as a doctor, not all my particular desires have as their object some facet of myself; my desire for the well-being of my patient does not aim at alteration in myself but in another. My desire is other-regarding; its object is external to myself. Of course, pleasure may arise from my satisfied desire in such cases, though equally it may not; but my desire is not aimed at my own pleasure. The same is true of happiness or interest: my satisfied desire may make me happy or further my interest, but these are not the objects of my desire. Here, [Joseph] Butler simply notices that desires have possessors – those whose desires they are – and if satisfied desires produce happiness, their possessors experience it. The object of a desire can thus be distinguished from the possessor of the desire: if, as a doctor, my desire is satisfied, I may be made happy as a result; but neither happiness nor any other state of myself is the object of my desire. That object is other-regarding, my patient’s well-being. Without some more sophisticated account, psychological egoism is false. [See Butler, J. (1726) Fifteen Sermons Preached at the Rolls Chapel, London]

Here, the author errs not in assuming pleasure can be a residual feature of helping his patients—it can be—but in presuming his desire for the well-being of others is a first cause. It is likely that such a desire originates from a desire to fulfill the Hippocratic oath, to avoid imposing harm, which demands professional and moral commitments from a good physician. The desire to be (seen as) a good physician, which requires a (“contrapositive”) desire to avoid harming patients, is clearly a motivation directed toward self. Receiving a “thank you” for buying someone’s coffee might create a feeling of pleasure within the actor (in response to the pleasure felt and/or exhibited by the recipient), but the pleasure of the recipient is not necessarily (and is unlikely to be) a first cause. If it were a first (and only) cause, then all the components of the expectation vector would be zero and the act would be considered altruistic.

Notice we must qualify that if-then statement with the word “only” because our model treats such secondary “I-desires” as unique components of the expectation vector. (“Do I desire the feeling of pleasure that will result in pleasing someone else when I buy him or her coffee?”) We will set aside the notion that an expectation of a residual pleasurable feeling in response to another’s pleasure is not necessarily an intrinsic desire. I can expect to feel good in response to doing X without desiring, or being motivated by, that feeling—this is the heart of the author’s argument—but if any part of the motivation for buying the coffee involves a desire to receive pleasure—even if the first cause involves a desire for the pleasure of others—then the act cannot truly be cataloged as altruistic because, as mentioned above, it must occupy a component within q. The issue of desire, then, requires an investigation into first-cause (i.e., “ultimate”) motivations, and the logical fallacy of Joseph Butler’s argument (against what is actually psychological hedonism) demands it.
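The expectation-vector idea can be made concrete with a small sketch. Everything here—the names, the binary components, the `is_altruistic` helper—is my own illustration rather than the author’s formal model: each component of q records one expected return (“I-desire”) attached to the act, and the act qualifies as altruistic only when q is the zero vector.

```python
# Sketch of the expectation-vector test for altruism.
# Each component of q records one expected return attached to an act;
# the act counts as altruistic only if every component is zero.

def is_altruistic(q):
    """True only when the expectation vector q is the zero vector."""
    return all(component == 0 for component in q)

# Buying a coffee while expecting a "thank you" (1) and a residual
# glow of pleasure (1): two nonzero components, so not altruistic.
coffee = [1, 1]

# The same act with every expectation genuinely absent:
selfless = [0, 0]

print(is_altruistic(coffee))    # False
print(is_altruistic(selfless))  # True
```

Even a single nonzero component—a secondary “I-desire” like the anticipated glow of another’s pleasure—is enough to disqualify the act under this reading.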

3. Sacrifice or pain

Also taken from the above link:

A simple argument against psychological egoism is that it seems obviously false….Hume rhetorically asks, ‘What interest can a fond mother have in view, who loses her health by assiduous attendance on her sick child, and afterwards [sic] languishes and dies of grief, when freed, by its death, from the slavery of that attendance?’ Building on this observation, Hume takes the ‘most obvious objection’ to psychological egoism.…[A]s it is contrary to common feeling and our most unprejudiced notions, there is required the highest stretch of philosophy to establish so extraordinary a paradox. To the most careless observer there appear to be such dispositions as benevolence and generosity; such affections as love, friendship, compassion, gratitude. […] And as this is the obvious appearance of things, it must be admitted, till some hypothesis be discovered, which by penetrating deeper into human nature, may prove the former affections to be nothing but modifications of the latter. Here Hume is offering a burden-shifting argument.  The idea is that psychological egoism is implausible on its face, offering strained accounts of apparently altruistic actions. So the burden of proof is on the egoist to show us why we should believe the view.

Sociologist Emile Durkheim argued that altruism involves voluntary acts of “self-destruction for no personal benefit,” and like Levinas, Durkheim believed selflessness was informed by a utilitarian morality despite also counting duty, obligation, and obedience to authority among selfless acts. The notion of sacrifice is perhaps the most convincing counterpoint to overriding claims to egoism. It is difficult to imagine a scenario, all things being equal, where sacrifice (and especially pain) would be a desired outcome. It would seem that a decision to act in the face of personal sacrifice, loss, or physical pain would almost certainly guarantee a genuine expression of altruism, yet we must again confront the issue of first causes. In the case of the assiduous mother, sacrifice might serve an intrinsic (and “ultimate”) desire to be considered a good mother. In the context of social-exchange theory, the asset of being (perceived as) a good mother outweighs the liability inherent within self-sacrifice. Sacrifice, after all, is what good mothers do, and being a good mother resonates more closely with the ideal self, as well as society’s coeval definition of what it means to be a “good mother.” In a desire to “do the right thing” and “be a good mother,” then, she chooses sacrifice. It is the desire for rectitude (perceived or real) and the positive perception of one’s approach to motherhood, not solely the sacrifice itself, that becomes the galvanizing force behind the act. First causes very often answer the following question: “What would a good [insert category or group to which membership is desired] do?”

What of pain? We can imagine a scenario in which a captured soldier is being tortured in the hope he or she will reveal critical military secrets. Is the soldier acting altruistically by enduring intense pain rather than revealing the desired secrets? We can’t say it is impossible, but, here, the aegis of a first cause likely revolves around pride or honor; to use our interrogative test for first causes: “Remaining true to a superordinate code is what [respected and honorable soldiers] do.” They certainly don’t dishonor themselves by betraying others, even when doing so is in their best interest. Recalling Durkheim’s definition, obedience (as distinct from the obligatory notion of duty) also plays an active role here: Honorable soldiers are required to obey the established military code of conduct, so the choice to endure pain might be motivated by a desire to be (seen as) an obedient and compliant soldier who respects the code rather than (merely) an honorable person, though these two things are nearly inextricably enmeshed. To highlight a relevant religious example, Jesus’ sacrifice on the cross might not be considered a truly altruistic act if the then-operative value metric privileged a desire to be viewed by the Father as a good, obedient Son, who was willing to sacrifice Himself for humanity, above the sacrifice (and pain) associated with the crucifixion. (This is an example where the general criticism of Durkheim’s “utilitarian” altruism fails; Jesus did not receive from His utilitarian sacrifice in the way mankind did.) These are complex motivations that require careful parsing, but there’s one thing we do know: If neither sacrifice nor pain can be related to any sort of intrinsic desire that satisfies the above interrogative test, then the act probably should be classified as altruistic, even though, as our model suggests, this is not likely to be the case.

4. Self-awareness

Given the arguments, it is still unclear why we should consider psychological egoism to be obviously untrue.  One might appeal to introspection or common sense, but neither is particularly powerful. First, the consensus among psychologists is that a great number of our mental states, even our motives, are not accessible to consciousness or cannot reliably be reported…through the use of introspection. While introspection, to some extent, may be a decent source of knowledge of our own minds, it is fairly suspect to reject an empirical claim about potentially unconscious motivations….Second, shifting the burden of proof based on common sense is rather limited. Sober and Wilson…go so far as to say that we have ‘no business taking common sense at face value’ in the context of an empirical hypothesis. Even if we disagree with their claim and allow a larger role for shifting burdens of proof via common sense, it still may have limited use, especially when the common sense view might be reasonably cast as supporting either position in the egoism-altruism debate.  Here, instead of appeals to common sense, it would be of greater use to employ more secure philosophical arguments and rigorous empirical evidence.

In other words, we cannot trust thought processes in evaluating our motivations to act. We might think we’re acting altruistically—without any expectations or desires—but we are often mistaken because, as our earlier examples have shown, we fail to appreciate the locus of first causes. (It is also probably true, for better or worse, that most people prefer to think of themselves more highly than they ought—a tendency that echoes exchange-theoretic notions of the ideal self in choosing how and when to act.) Jeff Schloss, the T.B. Walker Chair of Natural and Behavioral Sciences at Westmont College, suggests precisely this when he states that “people can really intend to act without conscious expectation of return, but that [things like intrinsic desires] could still be motivating certain actions.” The interrogative test seems like one easy way to clarify our subjective intuitions surrounding what motivates our actions, but we need more tools. Our model seems to argue that the burden of proof for altruism rests with the actor—“proving,” without resorting to introspection, that one’s expectation vector really is zero—rather than “proving” the opposite, that egoism is the standard construct. Our proposed p-values based on the mathematics of our model strongly suggest the unlikelihood of genuine altruism for a random act (especially for large n), but despite the highly suggestive nature of the probability values, it is unlikely they rise to the level of “empirical evidence.”
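The “large n” remark can be illustrated with a toy calculation. Assume—purely as my own illustration, not the author’s actual model—that each of the n components of q is zero with some independent probability p. Then the chance that the entire expectation vector vanishes, i.e., that the act is genuinely altruistic, is p to the power n, which collapses quickly as the number of candidate desires grows.

```python
# Toy model (an assumption of mine, not the post's exact mathematics):
# if each of n expectation components is independently zero with
# probability p, the chance the whole vector q is zero is p**n.

def prob_all_zero(p, n):
    """Probability that all n components of q are zero."""
    return p ** n

# Even with a generous 50% chance per component, genuine altruism
# becomes vanishingly unlikely as more candidate desires are counted.
for n in (1, 5, 20):
    print(n, prob_all_zero(0.5, n))
```

Under this sketch, the more “I-desires” our interrogative test can surface, the smaller the probability that every one of them is absent—which is the flavor of the p-value claim above, though not its substance.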


Though I’ve done a little work, in a fun attempt, to convince you that genuine altruism is a rather rare occurrence, it should be said that even if my basic conceit is accurate, this is not a bad thing! The “intrinsic desires” and (internal) social exchanges that often motivate our decision-making process (1) lead to an increase in the number of desirable behaviors and (2) afford us an opportunity to better align our actions (and ourselves) with a subjective vision of an “ideal self.” We should note, too, that the “subjective ideal self” is frequently a reflection of an “objective ideal ([of] self)” constructed and maintained by coeval social constructs. This is a positive outcome, for if we only acted in accordance with genuine altruism, there would be a tragic contraction of good (acts) in the world. Choosing to act kindly toward others based on a private desire that references and reinforces self in a highly abstract way stands as a testament to the evolutionary psychosocial sophistication of humans, and it evinces the kind of higher-order thinking required to assimilate into, and function within, the complex interpersonal dynamic demanded by modern society. We should consider such sophistication a moral and ethical victory rather than evidence of some degenerate social contract surreptitiously pursued by selfish persons.


Bernard Murstein (Ed.). (1971). Theories of Attraction and Love. New York, NY: Springer Publishing Company, Inc.

Written by u220e

2016-09-05 at 2:30 pm

A Tuesday Riddle


You’re given three mislabeled boxes of chocolate.

One box has 20 pieces of mint-filled chocolate.
One box has 20 pieces of coconut-filled chocolate.
And one “mixed” box has 10 pieces of each—10 coconut, 10 mint.

All the chocolates are identical to the eye.

What is the minimum number of chocolates you must taste to correctly relabel all three boxes?

The answer is ONE chocolate (taken from the box labeled “mixed”). Why? Because every box is mislabeled, the “mixed” box must contain a single flavor—either mint or coconut—and the single piece you taste from that box determines which one it is. Suppose it’s mint. That leaves the true coconut and mixed boxes still unidentified, hiding behind the mislabeled “coconut” and “mint” boxes:

“Coconut” (could be coconut or mixed)
“Mint” (could be coconut or mixed).

But “coconut” is mislabeled, so it *must* be mixed—we’ve already found the real mint box—and the mislabeled “mint” box is really the coconut box because, well, that’s the only one left.

So, the correct poll response is “less than 3.”
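For the skeptical, the riddle’s logic can be brute-force checked in a few lines of code (my own sketch, not part of the original post): enumerate every way the three boxes can be fully mislabeled, taste one piece from the box labeled “mixed,” and confirm the deduction above always recovers the true contents.

```python
from itertools import permutations

# Brute-force check of the riddle: every box is mislabeled, so the true
# contents form a derangement of the labels. We draw one piece from the
# box labeled "mixed" and verify the full assignment is always forced.

labels = ("mint", "coconut", "mixed")

def solve(taste, truth):
    """Deduce every box's contents from one taste of the box labeled
    'mixed'. `truth` maps label -> actual contents (used only to show
    the deduction matches reality)."""
    # The box labeled "mixed" is not mixed, so it holds what we tasted.
    deduced = {"mixed": taste}
    # The box labeled with the other pure flavor can't hold that flavor
    # (wrong label) and can't hold `taste` (already found): it's mixed.
    other = "coconut" if taste == "mint" else "mint"
    deduced[other] = "mixed"
    # The remaining box holds the remaining flavor.
    deduced[taste] = other
    return deduced

# Check every fully mislabeled arrangement (derangement of the labels).
for perm in permutations(labels):
    truth = dict(zip(labels, perm))
    if any(label == contents for label, contents in truth.items()):
        continue  # some label is correct; not a valid mislabeling
    taste = truth["mixed"]  # the pure flavor one taste reveals
    assert solve(taste, truth) == truth

print("one taste always suffices")
```

Only two derangements of three labels exist, and in both the single taste forces the other two boxes, confirming the “less than 3” poll answer.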

Written by u220e

2021-03-09 at 2:44 pm


Humanities 101


Written by u220e

2020-03-22 at 12:28 pm


Proof By Celebrity


Written by u220e

2019-11-01 at 1:41 pm


