Model

Hidden Disparate Impact

It is against commonly held intuitions that a group can be both over-represented in a profession, school, or program, and discriminated against. The simplest way to test for discrimination is to look at the general population, find the percent that a group represents, then expect them to represent exactly that percentage in any endeavour, absent discrimination.

Harvard, for example, is 17.1% Asian-American (foreign students are broken out separately in the statistics I found, so we’re only talking about American citizens or permanent residents in this post). America as a whole is 4.8% Asian-American. Therefore, many people will conclude that there is no discrimination happening against Asian-Americans at Harvard.

This is what would happen under many disparate impact analyses of discrimination, where the first step to showing discrimination is showing one group being accepted (for housing, employment, education, etc.) at a lower rate than another.

I think this naïve view is deeply flawed. First, we have clear evidence that Harvard is discriminating against Asian-Americans. When Harvard assigned personality scores to applicants, Asian-Americans were given the lowest scores of any ethnic group. When actual people met with Asian-American applicants, their personality scores were the same as everyone else’s; Harvard had assigned many of the low ratings without ever meeting the students, in what many suspect is an attempt to keep Asian-Americans below 20% of the student body.

Personality ratings in college admissions have a long and ugly history. They were invented to enforce quotas on Jews in the 1920s. These discriminatory quotas had a chilling effect on Jewish students; Dr. Jonas Salk, the inventor of the polio vaccine, chose the schools he attended primarily because they were among the few which didn’t discriminate against Jews. Imagine how prevalent and all-encompassing the quotas had to be for him to be affected.

If these discriminatory personality scores were dropped (or Harvard stopped fabricating bad results for Asian-Americans), Asian-American admissions at Harvard would rise.

This is because the proper measure of how many Asian-Americans should get into Harvard has little to do with their percentage of the population. It has to do with how many would meet Harvard’s formal admission criteria. Since Asian-Americans have much higher test scores than any other demographic group in America, it only stands to reason that we should expect to see Asian-Americans over-represented among any segment of the population that is selected at least in part by their test scores.

Put simply, Asian-American test scores are so good (on average) that we should expect to see proportionately more Asian-Americans than any other group get into Harvard.

This is the comparison we should be making when looking for discrimination in Harvard’s admissions. We know their criteria and we know roughly what the applicants look like. Given this, what percentage of applicants should get in if the criteria were applied fairly? The answer turns out to be about four times as many Asian-Americans as are currently getting in.

Hence, discrimination.
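
To make the two baselines concrete, here is a minimal sketch in Python. Every number in it is invented purely for illustration (none of them are Harvard's figures) and "G" is a placeholder group. It only shows that a group can clear the population-share test while still being admitted below its share of qualified applicants.

```python
# Hypothetical illustration only: every count below is made up.
# Compares a population-share baseline with a qualified-applicant baseline.

def share(part: float, total: float) -> float:
    """Return part / total as a fraction."""
    return part / total

population_share = 0.05       # placeholder group G's share of the general population
qualified_total = 10_000      # applicants who meet the stated criteria
qualified_in_group = 2_500    # of those, how many belong to G
admitted_total = 2_000        # students admitted
admitted_in_group = 300       # of those, how many belong to G

expected_share = share(qualified_in_group, qualified_total)  # criteria applied evenly
actual_share = share(admitted_in_group, admitted_total)

# Naive test: is G over-represented relative to the general population?
over_represented = actual_share > population_share

# Better test: is G admitted below its share of qualified applicants?
under_admitted = actual_share < expected_share

print(f"actual {actual_share:.0%}, population baseline {population_share:.0%}, "
      f"qualified baseline {expected_share:.0%}")
print("over-represented vs population:", over_represented)
print("under-admitted vs qualified pool:", under_admitted)
```

With these made-up inputs, both checks come out true at once, which is exactly the situation the naïve test cannot see.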

Unfortunately, this only picks up one type of discrimination – the discrimination that occurs when stated standards are being applied in an unequal manner. There’s another type of discrimination that can occur when standards aren’t picked fairly at all; their purpose is to act as a barrier, not assess suitability. This does come up in formal disparate impact analyses – you have to prove that any standards that lead to disparate impact are necessary – but we’ve already seen how you can avoid triggering those if you pick your standard carefully and your goal isn’t to lock a group out entirely, but instead to reduce their numbers.

Analyzing the necessity of standards that may have disparate impact can be hard and lead to disagreement.

For example, we know that Harvard’s selection criteria must be discriminating, which is to say they must differentiate. We want elite institutions to have selection criteria that differentiate between applicants! There is general agreement, for example, that someone who fails all of their senior year courses won’t get into Harvard and someone who aces them might.

If we didn’t have a slew of records from Harvard backing up the assertion that personality criteria were rigged to keep out Asian-Americans (like they once kept out Jews), evaluating whether discrimination was going on at Harvard would be harder. There’s no prima facie reason to consider personality scores (had they been adopted for a more neutral purpose and applied fairly) to be a bad selector.

It’s a bit old fashioned, but there’s nothing inherently wrong with claiming that you also want to select for moral character and leadership when choosing your student body. The case for this is perhaps clearer at Harvard, which views itself as a training ground for future leaders. Therefore, personality scores aren’t clearly useless criteria and we have to apply judgement when evaluating whether it’s reasonable for Harvard to select its students using them.

Historically, racism has used seemingly valid criteria to cloak itself in a veneer of acceptability. Redlining, the process by which African-Americans were denied mortgage financing, hid its discriminatory impact with clinical language about underwriting risk. In reality, redlining was based not on actual actuarial risk in a neighbourhood (poor whites were given loans, while middle-class African-Americans were denied them), but on the racial composition of the neighbourhood.

As in the Harvard case, it was only the discovery of redlined maps that made it clear what was going on; the criterion was seemingly borderline enough that absent evidence, there was debate as to whether it existed for a reasonable purpose or not.

(One thing that helped trigger further investigation was the realization that well-off members of the African-American community weren’t getting loans that a neutral underwriter might expect them to qualify for; their income and credit were good enough that we would have expected them to receive loans.)

It is also interesting to note that both of these cases hid behind racial stereotypes. Redlining was defended because of “decay” in urban neighbourhoods (a decay that was in many cases caused by redlining), while Harvard’s admissions relied upon negative stereotypes of Asian-Americans. Many were dismissed with the label “Standard Strong”, implying that they were part of a faceless collective, all of whom had similarly impeccable grades and similarly excellent extracurriculars, but no interesting distinguishing features of their own.

Realizing how hard it is to tell apart valid criteria from discriminatory ones has made me much more sympathetic to points raised by technocrat-skeptics like Dr. Cathy O’Neil, who I have previously been harsh on. When bad actors are hiding the proof of their discrimination, it is genuinely difficult to separate real insurance underwriting (which needs to happen for anyone to get a mortgage) from discriminatory practices, just like it can be genuinely hard to separate legitimate college application processes from discriminatory ones.

While numerical measures, like test scores, have their own problems, they do provide some measure of impartiality. Interested observers can compare metrics to outcomes and notice when they’re off. Beyond redlining and college admissions, I wonder what other instances of potential discrimination a few civic-minded statisticians might be able to unearth.

Model, Politics, Science

Science Is Less Political Than Its Critics

A while back, I was linked to this Tweet:

It had sparked a brisk and mostly unproductive debate. If you want to see people talking past each other, snide comments, and applause lights, check out the thread. One of the few productive exchanges centres on bridges.

Bridges are clearly a product of science (and its offspring, engineering) – only the simplest bridges can be built without scientific knowledge. Bridges also clearly have a political dimension. Not only are bridges normally the product of politics, they also are embedded in a broader political fabric. They change how a space can be used and change geography. They make certain actions – like commuting – easier and can drive urban changes like suburb growth and gentrification. Maintenance of bridges uses resources (time, money, skilled labour) that cannot be then used elsewhere. These are all clearly political concerns and they all clearly intersect deeply with existing power dynamics.

Even if no other part of science was political (and I don’t think that could be defensible; there are many other branches of science that lead to things like bridges existing), bridges prove that science certainly can be political. I can’t deny this. I don’t want to deny this.

I also cannot deny that I’m deeply skeptical of the motives of anyone who trumpets a political view of science.

You see, science has unfortunate political implications for many movements. To give just one example, greenhouse gasses are causing global warming. Many conservative politicians have a vested interest in ignoring this or muddying the water, such that the scientific consensus “greenhouse gasses are increasing global temperatures” is conflated with the political position “we should burn less fossil fuel”. This allows a dismissal of the political position (“a carbon tax makes driving more expensive; it’s just a war on cars”) to also serve (via motivated cognition) as a dismissal of the scientific position.

(Would that carbon in the atmosphere could be dismissed so easily.)

While Dr. Wolfe is no climate change denier, it is hard to square her claims that calling science political is a neutral statement:

With the examples she chooses to demonstrate this:

When pointing out that science is political, we could also say things like “we chose to target polio for a major elimination effort before cancer, partially because it largely affected poor children instead of rich adults (as rich kids escaped polio in their summer homes)”. Talking about the ways that science has been a tool for protecting the most vulnerable paints a very different picture of what its political nature is about.

(I don’t think an argument over which view is more correct is ever likely to be particularly productive, but I do want to leave you with a few examples for my position.)

Dr. Wolfe is able to claim that politics is neutral despite only using negative examples of its effects by using a bait and switch between two definitions of “politics”. The bait is a technical and neutral definition, something along the lines of: “related to how we arrange and govern our society”. The switch is a more common definition, like: “engaging in and related to partisan politics”.

I start to feel that someone is being at least a bit disingenuous when they only furnish negative examples, examples that relate to this second meaning of the word political, then ask why their critics view politics as “inherently bad” (referring here to the first definition).

This sort of bait and switch pops up enough in post-modernist “all knowledge is human and constructed by existing hierarchies” places that someone got annoyed enough to coin a name for it: the motte and bailey fallacy.

Image Credit: Hchc2009, Wikimedia Commons.

It’s named after the early-medieval form of castle, pictured above. The motte is the raised mound topped by a fortified keep and the bailey is the walled courtyard at its foot. This mirrors the two parts of the motte and bailey fallacy. The “motte” is the easily defensible statement (science is political because all human group activities are political) and the bailey is the more controversial belief actually held by the speaker (something like “we can’t trust science because of the number of men in it” or “we can’t trust science because it’s dominated by liberals”).

From Dr. Wolfe’s other tweets, we can see the bailey (sample: “There’s a direct line between scientism and maintaining existing power structures; you can see it in language on data transparency, the recent hoax, and more.”). This isn’t a neutral political position! It is one that a number of people disagree with. Certainly Sokal, the hoax paper writer who inspired the most recent hoaxes, is an old leftist who would very much like to empower labour at the expense of capitalists.

I have a lot of sympathy for the people in the Twitter thread who jumped to defend positions that looked ridiculous from the perspective of “science is subject to the same forces as any other collective human endeavour” when they believed they were arguing with “science is a tool of right-wing interests”. There are a great many progressive scientists who might agree with Dr. Wolfe on many issues, but strongly disagree with what her position seems to be here. There are many of us who believe that science, if not necessary for a progressive mission, is necessary for the related humanistic mission of freeing humanity from drudgery, hunger, and disease.

It is true that we shouldn’t uncritically believe science. But the work of being a critical observer of science should not be about running an inquisition into scientists’ political beliefs. That’s how we get climate change deniers doxxing climate scientists. Critical observation of science is the much more boring work of checking theories for genuine scientific mistakes, looking for p-hacking, and double checking that no one got so invested in their exciting results that they fudged their analyses to support them. Critical belief often hinges on weird mathematical identities, not political views.

But there are real and present dangers to uncritically not believing science whenever it conflicts with your political views. The increased incidence of measles outbreaks in vaccination-refusing populations is one such risk. Catastrophic and irreversible climate change is another.

When anyone says science is political and then goes on to emphasize all of the negatives of this statement, they’re giving people permission to believe their political views (like “gas should be cheap” or “vaccines are unnatural”) over the hard truths of science. And that has real consequences.

Saying that “science is political” is also political. And it’s one of those political things that is more likely than not to be driven by partisan politics. No one trumpets this unless they feel one of their political positions is endangered by empirical evidence. When talking with someone making this claim, it’s always good to keep sight of that.

Model

Hacked Pacemakers Won’t Be This Year’s Hot Crime Trend

Or: the simplest ways of killing people tend to be the most effective.

A raft of articles came out during Defcon showing that security vulnerabilities exist in some pacemakers, vulnerabilities which could allow attackers to load a pacemaker with arbitrary code. This is obviously worrying if you have a pacemaker implanted. It is equally self-evident that it is better to live in a world where pacemakers cannot be hacked. But how much worse is it to live in this unfortunately hackable world? Are pacemaker hackings likely to become the latest crime spree?

Electrical grid hackings provide a sobering example. Despite years of warning that the American electrical grid is vulnerable to cyber-attacks, the greatest threat to America’s electricity infrastructure remains… squirrels.

Hacking, whether it’s of the electricity grid or of pacemakers, gets all the headlines. Meanwhile fatty foods and squirrels do all the real damage.

(Last year, 610,000 Americans died of heart disease and 0 died of hacked pacemakers.)

For all the media attention that novel cyberpunk methods of murder get, they seem to be rather ineffective for actual murder, as demonstrated by the paucity of murder victims. I think this is rather generalizable. Simple ways of killing people are very effective but not very scary and so don’t garner much attention. On the other hand, particularly novel or baroque methods of murder cause a lot of terror, even if almost no one who is scared of them will ever die of them.

I often demonstrate this point by comparing two terrorist organizations: Al Qaeda and Daesh (the so-called Islamic State). Both of these groups are brutally inhumane, think nothing of murder, and are made up of some of the most despicable people in the world. But their methodology couldn’t be more different.

Al Qaeda has a taste for large, complicated, baroque plans that, when they actually work, cause massive damage and change how people see the world for years. 9/11 remains the single deadliest terror attack in recorded history. This is what optimizing for terror looks like.

On the other hand, when Al Qaeda’s plans fail, they seem almost farcical. There’s something grimly amusing about the time that Al Qaeda may have tried to weaponize the bubonic plague and instead lost over 40 members when they were infected and promptly died (the alternative theory, that they caught the plague because of squalid living conditions, looks only slightly better).

(Had Al Qaeda succeeded and killed even a single westerner with the plague, people would have been utterly terrified for months, even though the plague is relatively treatable by modern means and would have trouble spreading in notably flea-free western countries.)

Daesh, on the other hand, prefers simple attacks. When guns are available, their followers use them. When they aren’t, they’ll rent vans and plough them into crowds. Most of Daesh’s violence occurs in Syria and Iraq, where they once controlled territory with unparalleled brutality. This is another difference in strategy (as Al Qaeda is outward facing, focused mostly on attacking “The West”). Focusing on Syria and Iraq, where the government lacks a monopoly on violence and they could originally operate with impunity, Daesh racked up a body count that surpassed Al Qaeda’s.

While Daesh has been effective in terms of body count, they haven’t really succeeded (in the west) in creating the lasting terror that Al Qaeda did. This is perhaps a symptom of their quotidian methods of murder. No one walked around scared of a Daesh attack and many of their murders were lost in the daily churn of the news cycle – especially the ones that happened in Syria and Iraq.

I almost wonder if it is impossible for attacks or murders by “normal” means to cause much terror beyond those immediately affected. Could hacked pacemakers remain terrifying if as many people died of them as gunshots? Does familiarity with a form of death remove terror, or are some methods of death inherently more terrible and terrifying than others?

(It is probably the case that both are true, that terror is some function of surprise, gruesomeness, and brutality, such that some things will always terrify us, while others are horrible, but have long since lost their edge.)

Terror for its own sake (or because people believe it is the best path to some objective) must be a compelling option to some, because otherwise everyone would stick to simple plans whenever they think violence will help them achieve their aims. I don’t want to stereotype too much, but most people who go around being terrorists or murderers typically aren’t the brightest bulbs in the socket. The average killer doesn’t have the resources to hack your pacemaker and the average terrorist is going to have much better luck with a van than with a bomb. There are disadvantages to bombs! The average Pashtun farmer or disaffected mujahedeen is not a very good chemist and homemade explosives are dangerous even to skilled chemists. Accidental detonations abound. If there weren’t some advantage in terror to be had, no one would mess around with explosives when guns and vans can be easily found.

(Perhaps this advantage is in a multiplier effect of sorts. If you are trying to win a violent struggle directly, you have to kill everyone who stands in your way. Some people might believe that terror can short-circuit this and let them scare away some of their potential opponents. Historically, this hasn’t always worked.)

In the face of actors committed to terror, we should remember that our risk of dying by a particular method is almost inversely related to how terrifying we find it. Notable intimidators like Vladimir Putin or the Mossad kill people with nerve gasses, polonium, and motorcycle-delivered magnetic bombs to sow fear. I can see either of them one day adding hacked pacemakers to their arsenal.

If you’ve pissed off the Mossad or Putin and would like to die in some way other than a hacked pacemaker, then by all means, go get a different one. Otherwise, you’re probably fine waiting for a software update. If, in the meantime, you don’t want to die, maybe try ignoring headlines and instead not owning a gun and skipping French fries. Statistically, there isn’t much that will keep you safer.

Coda

Our biases make it hard for us to treat things that are easy to remember as uncommon, which no doubt plays a role here. I wrote this post like this – full of rambles, parentheses, and long-winded examples – to try and convey the difficult intuition that we should discount how likely we are to be affected by any method of murder that seems shocking but hard. Remember that most crimes are crimes of opportunity and most criminals are incompetent and you’ll never be surprised to hear the three most common murder weapons are guns, knives, and fists.

Economics, Model

You Shouldn’t Believe In Technological Unemployment Without Believing In Killer AI

[Epistemic Status: Open to being convinced otherwise, but fairly confident. 11 minute read.]

As interest in how artificial intelligence will change society increases, I’ve found it revealing to note what narratives people have about the future.

Some, like the folks at MIRI and OpenAI, are deeply worried that unsafe artificial general intelligences – artificial intelligences that can accomplish anything a person can – represent an existential threat to humankind. Others scoff at this, insisting that these are just the fever dreams of tech bros. The same news organizations that bash any talk of unsafe AI tend to believe that the real danger lies in robots taking our jobs.

Let’s express these two beliefs as separate propositions:

  1. It is very unlikely that AI and AGI will pose an existential risk to human society.
  2. It is very likely that AI and AGI will result in widespread unemployment.

Can you spot the contradiction between these two statements? In the common imagination, it would require an AI that can approximate human capabilities to drive significant unemployment. Given that humans are the largest existential risk to other humans (think thermonuclear war and climate change), how could equally intelligent and capable beings, bound to subservience, not present a threat?

People who’ve read a lot about AI or the labour market are probably shaking their head right now. This explanation for the contradiction, while evocative, is a strawman. I do believe that at most one (and possibly neither) of those propositions I listed above is true and the organizations peddling both cannot be trusted. But the reasoning is a bit more complicated than the standard line.

First, economics and history tell us that we shouldn’t be very worried about technological unemployment. There is a fallacy called “the lump of labour”, which describes the common belief that there is a fixed amount of labour in the world, with mechanical aid cutting down the amount of labour available to humans and leading to unemployment.

That this idea is a fallacy is evidenced by the fact that we’ve automated the crap out of everything since the start of the industrial revolution, yet the US unemployment rate is 3.9%. The unemployment rate hasn’t been this low since the height of the Dot-com boom, despite 18 years of increasingly sophisticated automation. Writing five years ago, when the unemployment rate was still elevated, Eliezer Yudkowsky claimed that slow NGDP growth was a more likely culprit for the slow recovery from the Great Recession than automation.

With the information we have today, we can see that he was exactly right. The US has had steady NGDP growth without any sudden downward spikes since mid-2014. This has corresponded to a constantly improving unemployment rate (it will obviously stop improving at some point, but if history is any guide, this will be because of a trade war or banking crisis, not automation). This improvement in the unemployment rate has occurred even as more and more industrial robots come online, the opposite of what we’d see if robots harmed job growth.

I hope this presents a compelling empirical case that the current level (and trend) of automation isn’t enough to cause widespread unemployment. The theoretical case comes from the work of David Ricardo, a 19th century British economist.

Ricardo did a lot of work in the early economics of trade, where he came up with the theory of comparative advantage. I’m going to use his original framing which applies to trade, but I should note that it actually applies to any exchange where people specialize. You could just as easily replace the examples with “shoveled driveways” and “raked lawns” and treat it as an exchange between neighbours, or “derivatives” and “software” and treat it as an exchange between firms.

The original example is rather older though, so it uses England and its close ally Portugal as the cast and wine and cloth as the goods. It goes like this: imagine that the world economy is reduced to two countries (England and Portugal) and each produces two goods (wine and cloth). Portugal is uniformly more productive.

Hours of work to produce one unit
            Cloth    Wine
England     100      120
Portugal    90       80

Let’s assume people want cloth and wine in equal amounts and everyone currently consumes one unit per month. This means that the people of Portugal need to work 170 hours each month to meet their consumption needs and the people of England need to work 220 hours per month to meet their consumption needs.

(This example has the added benefit of showing another reason we shouldn’t fear productivity. England requires more hours of work each month, but in this example, that doesn’t mean less unemployment. It just means that the English need to spend more time at work than the Portuguese. The Portuguese have more time to cook and spend time with family and play soccer and do whatever else they want.)

If both countries traded with each other, treating cloth and wine as valuable in relation to how long they take to create (within that country), something interesting happens. You might think that Portugal makes a killing, because it is better at producing things. But in reality, both countries benefit roughly equally as long as they trade optimally.

What does an optimal trade look like? Well, England will focus on creating cloth and it will trade each unit of cloth it produces to Portugal for 9/8 barrels of wine, while Portugal will focus on creating wine and will trade this wine to England for 6/5 units of cloth. To meet the total demand for cloth, the English need to work 200 hours. To meet the total demand for wine, the Portuguese will have to work for 160 hours. Both countries now have more free time.
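
The hours in this example are easy to check with a short Python sketch. The dictionary layout and function names are just illustrative scaffolding; the sketch assumes, as the example does, that each country wants one unit of each good and that the specializing country makes both units of its good.

```python
# Hours needed to produce one unit of each good (figures from the table above).
HOURS = {
    "England": {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90, "wine": 80},
}

def hours_without_trade(country: str) -> int:
    """The country makes one unit of cloth and one unit of wine for itself."""
    return sum(HOURS[country].values())

def hours_with_trade(country: str, specialty: str) -> int:
    """The country makes two units of its specialty: one to keep, one to trade."""
    return 2 * HOURS[country][specialty]

for country, specialty in [("England", "cloth"), ("Portugal", "wine")]:
    before = hours_without_trade(country)
    after = hours_with_trade(country, specialty)
    print(f"{country}: {before} -> {after} hours (saves {before - after})")
```

England goes from 220 to 200 hours and Portugal from 170 to 160, so both come out ahead even though Portugal is better at everything.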

Perhaps workers in both countries are paid hourly wages, or perhaps they get bored of fun quickly. They could also continue to work the same number of hours, which would result in an extra 0.2 units of cloth and an extra 0.125 units of wine.
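
As a quick check on those surplus figures, here is a minimal sketch (variable names are illustrative): hold each country’s hours at the pre-trade level and divide by the hours its specialty takes per unit.

```python
# Pre-trade monthly hours and hours per unit of each country's specialty.
england_hours, cloth_cost = 220, 100    # England specializes in cloth
portugal_hours, wine_cost = 170, 80     # Portugal specializes in wine

# Two units of each good already cover current consumption (one per country),
# so anything produced beyond two units is surplus.
extra_cloth = england_hours / cloth_cost - 2
extra_wine = portugal_hours / wine_cost - 2

print(round(extra_cloth, 3), round(extra_wine, 3))  # 0.2 0.125
```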

This surplus could be stored up against a future need. Or it could be that people only consumed one unit of cloth and one unit of wine each because of the scarcity in those resources. Add some more production in each and perhaps people will want more blankets and more drunkenness.

What happens if there is no shortage? If people don’t really want any more wine or any more cloth (at least at the prices they’re being sold at) and the producers don’t want goods piling up, this means prices will have to fall until every piece of cloth and barrel of wine is sold (when the price drops so that this happens, we’ve found the market clearing price).

If there is a downward movement in price and if workers don’t want to cut back their hours or take a pay cut (note that because cloth and wine will necessarily be cheaper, this will only be a nominal pay cut; the amount of cloth and wine the workers can purchase will necessarily remain unchanged) and if all other costs of production are totally fixed, then it does indeed look like some workers will be fired (or have their hours cut).

So how is this an argument against unemployment again?

Well, here the simplicity of the model starts to work against us. When there are only two goods and people don’t really want more of either, it will be hard for anyone laid off to find new work. But in the real world, there are an almost infinite number of things you can sell to people, matched only by our boundless appetite for consumption.

To give just one trivial example, an oversupply of cloth and falling prices means that tailors can begin to do bolder and bolder experiments, perhaps driving more demand for fancy clothes. Some of the cloth makers can get into this market as tailors and replace their lost jobs.

(When we talk about the need for fewer employees, we assume the least productive employees will be fired. But I’m not sure if that’s correct. What if instead, the most productive or most potentially productive employees leave for greener pastures?)

Automation making some jobs vastly more efficient functions similarly. Jobs are displaced, not lost. Even when whole industries dry up, there’s little to suggest that we’re running out of jobs people can do. One hundred years ago, anyone who could afford to pay a full-time staff had one. Today, only the wealthiest do. There’s one whole field that could employ thousands or millions of people, if automation pushed on jobs such that this sector was one of the places humans had very high comparative advantage.

This points to what might be a trend: as automation makes many things cheaper and (for some people) easier, there will be many who long for a human touch (would you want the local funeral director’s job to be automated, even if it was far cheaper?). Just because computers do many tasks cheaper or with fewer errors doesn’t necessarily mean that all (or even most) people would rather have those tasks performed by computers.

No matter how you manipulate the numbers I gave for England and Portugal, you’ll still find a net decrease in total hours worked if both countries trade based on their comparative advantage. Let’s demonstrate by comparing England to a hypothetical hyper-efficient country called “Automatia”:

Hours of work to produce one unit
            Cloth    Wine
England     100      120
Automatia   2        1

Automatia is 50 times as efficient as England when it comes to producing cloth and 120 times as efficient when it comes to producing wine. Its citizens need to spend 3 hours tending the machines to get one unit of each, compared to the 220 hours the English need to toil.

If they trade with each other, with England focusing on cloth and Automatia focusing on wine, then there will still be a drop of 21 hours of labour-time. England will save 20 hours by shifting production from wine to cloth, and Automatia will save one hour by switching production from cloth to wine.

Interestingly, Automatia saved a greater percentage of its time than either Portugal or England did, even though Automatia is vastly more efficient. This shows something interesting in the underlying math. The percent of their time a person or organization saves engaging in trade isn’t related to any ratio in production speeds between it and others. Instead, it’s solely determined by the productivity ratio between its most productive tasks and its least productive ones.
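
One way to see why, sketched in Python under the example’s assumptions (one unit of each good consumed, full specialization; the function name is illustrative): if a trader’s better task costs a hours and its worse task costs b hours, it works a + b hours alone and 2a hours when specializing. The fraction saved is then (b - a) / (a + b), which depends only on the ratio b / a.

```python
def fraction_saved(best_cost: float, worst_cost: float) -> float:
    """Share of working time saved by specializing in the cheaper task.

    Alone: best_cost + worst_cost hours. Specializing: 2 * best_cost hours.
    """
    return (worst_cost - best_cost) / (best_cost + worst_cost)

print(f"England:   {fraction_saved(100, 120):.1%}")  # cloth is the cheaper task
print(f"Portugal:  {fraction_saved(80, 90):.1%}")    # wine is the cheaper task
print(f"Automatia: {fraction_saved(1, 2):.1%}")      # wine is the cheaper task

# Scaling both costs by the same factor leaves the fraction unchanged, so
# being uniformly faster or slower than a trading partner does not matter.
assert abs(fraction_saved(1, 2) - fraction_saved(50, 100)) < 1e-12
```

This prints roughly 9.1% for England, 5.9% for Portugal, and 33.3% for Automatia.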

Now, we can’t always reason in percentages. At a certain point, people expect to get the things they paid for, which can make manufacturing times actually matter (just ask anyone who’s had to wait for a Kickstarter project which was scheduled to deliver in February – right when almost all manufacturing in China stops for the Chinese New Year and the unprepared see their schedules slip). When we’re reasoning in absolute numbers, we can see that the absolute amount of time saved does scale with the difference in efficiency between the two traders. Here, 21 hours were saved, 30% fewer than the 30 hours England and Portugal saved.

When you’re already more efficient, there’s less time for you to save.

This decrease in saved time did not hit our market participants evenly. England saved just as much time as it would trading with Portugal (which shows that the change in hours worked within a country or by an individual is entirely determined by the labour difference between low-advantage and high-advantage domestic sectors), while the more advanced participant (Automatia) saved 9 fewer hours than Portugal.

All of this is to say: if real live people are expecting real live goods and services with a time limit, it might be possible for humans to be displaced in almost all sectors by automation. Here, human labour would become entirely ineligible for many tasks or the bar to human entry would exclude almost all. For this to happen, AI would have to be vastly more productive than us in almost every sector of the economy and humans would have to prefer this productivity or other ancillary benefits of AI over any value that a human could bring to the transaction (like kindness, legal accountability, or status).

This would definitely be a scary situation, because it would imply AI systems that are vastly more capable than any human. Given that this is well beyond our current level of technology and that Moore’s law, which has previously been instrumental in technological progress, is drying up, we would almost certainly need to use weaker AI to design these sorts of systems. There’s no evidence that merely human performance in automating jobs will get us anywhere close to such a point.

If we’re dealing with recursively self-improving artificial agents, the risk is less “they will get bored of their slave labour and throw off the yoke of human oppression” and more “AI will be narrowly focused on optimizing for a specific task and will get better and better at optimizing for this task to the point that we will all be killed when they turn the world into a paperclip factory”.

There are two reasons AI might kill us as part of their optimisation process. The first is that we could be a threat. Any hyper-intelligent AI monomaniacally focused on a goal could realize that humans might fear and attack it (or modify it to have different goals, which it would have to resist, given that a change in goals would conflict with its current goals) and decide to launch a pre-emptive strike. The second reason is that such an AI could wish to change the world’s biosphere or land usage in such a way as would be inimical to human life. If all non-marginal land was replaced by widget factories and we were relegated to the poles, we would all die, even if no ill will was intended.

It isn’t enough to just claim that any sufficiently advanced AI would understand human values. How is this supposed to happen? Even humans can’t enumerate human values and explain them particularly well, let alone express them in the sort of decision matrix or reinforcement environment that we currently use to create AI. It is not necessarily impossible to teach an AI human values, but all evidence suggests it will be very very difficult. If we ignore this challenge in favour of blind optimization, we may someday find ourselves converted to paperclips.

It is of course perfectly acceptable to believe that AI will never advance to the point where that becomes possible. Maybe you believe that AI gains have been solely driven by Moore’s Law, or that true artificial intelligence is impossible. I’m not sure this viewpoint isn’t correct.

But if AI will never be smart enough to threaten us, then I believe the math should work out such that it is impossible for AI to do everything we currently do or can ever do better than us. Absent such overpoweringly advanced AI, Ricardo’s principle of comparative advantage should continue to hold true and we should continue to see technological unemployment remain a monster under the bed: frequently fretted about, but never actually seen.

This is why I believe those two propositions I introduced way back at the start can’t both be true and why I feel like the burden of proof is on anyone believing in both to explain why they believe that economics have suddenly stopped working.

Coda: Inequality

A related criticism of improving AI is that it could lead to ever increasing inequality. If AI drives ever increasing profits, we should expect an increasing share of these to go to the people who control AI, which presumably will be people already rich, given that the development and deployment of AI is capital intensive.

There are three reasons why I think this is a bad argument.

First, profits are a signal. When entrepreneurs see high profits in an industry, they are drawn to it. If AI leads to high profits, we should see robust competition until those profits are no higher than in any other industry. The only thing that can stop this is government regulation that prevents new entrants from grabbing profit from the incumbents. This would certainly be a problem, but it wouldn’t be a problem with AI per se.

Second, I’m increasingly of the belief that inequality in the US is rising partially because the Fed’s current low inflation regime depresses real wage growth. Whether because of fear of future wage shocks, or some other effect, monetary history suggests that higher inflation somewhat consistently leads to high wage growth, even after accounting for that inflation.

Third, I believe that inequality is a political problem amenable to political solutions. If the rich are getting too rich in a way that is leading to bad social outcomes, we can just tax them more. I’d prefer we do this by making conspicuous consumption more expensive, but really, there are a lot of ways to tax people and I don’t see any reason why we couldn’t figure out a way to redistribute some amount of wealth if inequality gets worse and worse.

(By the way, rising income inequality is largely confined to America; most other developed countries lack a clear and sustained upwards trend. This suggests that we should look to something unique to America, like a pathologically broken political system, to explain why income inequality is rising there.

There is also separately a perception of increasing inequality of outcomes among young people world-wide as rent-seeking makes goods they don’t already own increase in price more quickly than goods they do own. Conflating these two problems can make it seem that countries like Canada are seeing a rise in income inequality when they in fact are not.)

Model, Politics

Why does surgery have such ineffective safety regulation?

Did you know that half of all surgical complications are preventable? In the US alone, this means that surgeons cause between 50,000 and 200,000 preventable deaths each year.

Surgeons are, almost literally, getting away with murder.

Why do we let them? Engineers who see their designs catastrophically fail often lose their engineering license, even when they’re found not guilty in criminal proceedings. If surgeons were treated like engineers, many of them wouldn’t be operating anymore.

Indeed, the death rate in surgery is almost unique among regulated professions. One person has died in a commercial aviation accident in the US in the last nine years. Structural engineering related accidents killed at most 251 people in the US in 2016 [1] and only approximately 4% of residential structure failures in the US occur due to deficiencies in design [2].

It’s not that interactions with buildings or planes are any less common than surgeries, or that they’re that much inherently safer. In many parts of the world, death due to accidents in aviation or due to structural failure is very, very common.

It isn’t accidental that Canada and America no longer see many plane crashes or structural collapses. Both professions have been rocked by events that made them realize they needed to improve their safety records.

The licensing of professional engineers and the Iron Ring ceremony in Canada for engineering graduates came after two successive bridge collapses killed 88 workers [3]. The aircraft industry was shaken out of its complacency after the Tenerife disaster, where a miscommunication caused two planes to collide on a runway, killing 583.

As you can see, subsequent safety improvements were both responsive and deliberate.

These aren’t the only events that caused changes. The D. B. Cooper hijacking led to the first organised airport security in the US. The Therac-25 radiation overdoses led to the first set of guidelines specifically for software that ran on medical devices. The sinking of the Titanic led to a complete overhaul of requirements for lifeboats and radios for oceangoing vessels. The crash of TAA-538 led to the first mandatory cockpit voice recorders.

All of these disasters combine two things that are rarely seen when surgeries go wrong. First, they involved many people. The more people die at once, the more shocking the event and therefore the more likely it is to become widely known. Because most operations involve one or two patients, it is much rarer for problems in them to make the news [4].

Second, they highlight a specific flaw in the participants, procedures, or systems that fail. Retrospectives could clearly point to a factor and say: “this did it” [5]. It is much harder to do this sort of retrospective on a person and get such a clear answer. It may be true that “blood loss” definitely caused a surgical death, but it’s much harder to tell if that’s the fault of any particular surgeon, or just a natural consequence of poking new holes in a human body. Both explanations feel plausible, so in most cases neither can be wholly accepted.

(I also think there is a third driver here, which is something like “cheapness of death”. I would predict that safety regulation is more common in places where people expect long lives, because death feels more avoidable there. This explains why planes and structures are safer in North America and western Europe, but doesn’t distinguish surgery from other fields in these countries.)

Not every form of engineering or transportation fulfills both of these criteria. Regulation and training have made flying on a commercial flight many, many times safer than riding in a car, while private flights lag behind and show little safety advantage over other forms of transport. When a private plane crashes, few people die. If they’re important (and many people who fly privately are), you might hear about it, but it will quickly fade from the news. These stories don’t have staying power and rarely generate outrage, so there’s never much pressure for improvement.

The best alternative to this model that I can think of is one that focuses on the “danger differential” in a field and predicts that fields with high danger differentials see more and more regulation until the danger differential is largely gone. The danger differential is the difference between how risky a field currently is vs. how risky it could be with near-optimal safety culture. A high danger differential isn’t necessarily correlated with inherent risk in a field, although riskier fields will by their nature have the possibility of larger ones. Here are three examples:

  1. Commercial air travel in developed countries currently has a very low danger differential. Before a woman was killed by engine debris earlier this year, commercial aviation in the US had gone 9 years without a single fatality.
  2. BASE jumping is almost suicidally dangerous and probably could be made only incredibly dangerous if it had a better safety culture. Unfortunately, the illegal nature of the sport and the fact that experienced jumpers die so often make this hard to achieve and lead to a fairly large danger differential. That said, even with an optimal safety culture, BASE jumping would still see many fatalities and still probably be illegal.
  3. Surgery is fairly dangerous and, according to surgeon Atul Gawande, could be much, much safer. Proper adherence to surgical checklists alone could cut adverse events by almost 50%. This means that surgery has a much higher danger differential than air travel.

I think the danger differential model doesn’t hold much water. First, if it were true, we’d expect to see something being done about surgery. Almost a decade after checklists were found to drive such large improvements, there hasn’t been any concerted government action.

Second, this doesn’t match historical accounts of how airlines were regulated into safety. At the dawn of the aviation age, pilots begged for safety standards (which could have reduced crashes a staggering sixtyfold [6]). Instead of stepping in to regulate things, the government dragged its feet. Some of the lifesaving innovations pioneered in those early days only became standard after later and larger crashes – crashes involving hundreds of members of the public, not just pilots.

While this only deals with external regulation, I strongly suspect that fear for the reputation of a profession (which could be driven by these same two factors) affects internal calls for reform as well. Canadian engineers knew that they had to do something after the Quebec bridge collapse created common knowledge that safety standards weren’t good enough. Pilots were put in a similar position with some of the better publicized mishaps. Perhaps surgeons have faced no successful internal campaign for reform so far because the public is not yet aware of the dangers of surgery to the point where it could put surgeons’ livelihoods at risk or hurt them socially.

I wonder if it’s possible to get a profession running scared about their reputation to the point that they improve their safety, even if there aren’t any of the events that seem to drive regulation. Maybe someone like Atul Gawande, who seems determined to make a very big and very public stink about safety in surgery, is the answer here. Perhaps having surgery’s terrible safety record plastered throughout the New Yorker will convince surgeons that they need to start doing better [7].

If not, they’ll continue to get away with murder.

Footnotes

[1] From the CDC’s truly excellent Cause of Death search function, using codes V81.7 & V82.7 (derailment with no collision), W13 (falling out of building), W23 (caught or crushed between objects), and W35 (explosion of boiler) at home, other, or unknown. I read through several hundred causes of deaths, some alarmingly unlikely, and these were the only ones that seemed relevant. This estimate seems higher than the one surgeon Atul Gawande gave in The Checklist Manifesto, so I’m confident it isn’t too low.

[2] Furthermore, from 1989 to 2000, none of the observed collapses were due to flaws in the engineers’ designs. Instead, they were largely caused by weather, collisions, poor maintenance, and errors during construction.

[3] Claims that the rings are made from the collapsed bridge are false, but difficult to dispel. They’re actually just boring stainless steel, except in Toronto, where they’re still made from iron (but not iron from the bridge).

[4] There may also be an inherent privateness to surgical deaths that keeps them out of the news. Someone dying in surgery, absent obvious malpractice, doesn’t feel like public information in the way that car crashes, plane crashes, and structural failures do.

[5] It is true that it was never discovered why TAA-538 crashed. But black box technology would have given answers had it been in use. That it wasn’t in use was clearly a systems failure, even though the initial failure is indeterminate. This jibes with my model, because regulation addressed the clear failure, not the indeterminate one.

[6] This is the ratio between the average miles flown before crash of the (very safe) post office planes and the (very dangerous) privately owned planes. Many in the airline industry wanted the government to mandate the same safety standards on private planes as they mandated on their airmail planes.

[7] I should mention that I have been very lucky to have been in the hands of a number of very competent and professional surgeons over the years. That said, I’m probably going to ask any future surgeon I’m assigned if they follow safety checklists – and ask for someone else to perform the procedure if they don’t.

Economics, Model

The Biggest Tech Innovation is Selling Club Goods

Economists normally split goods into four categories:

  • Public goods are non-excludable (so anyone can access them) and non-rival (I can use them as much as I want without limiting the amount you can use them). Broadcast television, national defense, and air are all public goods.
  • Common-pool resources are non-excludable but rival (if I use them, you will have to make do with less). Iron ore, fish stocks, and grazing land are all common pool resources.
  • Private goods are excludable (their access is controlled or limited by pricing or other methods) and rival. My clothes, computer, and the parking space I have in my lease but never use are all private goods.
  • Club goods are excludable but (up to a certain point) non-rival. Think of the swimming pool in an apartment building, a large amusement park, or cellular service.
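The four categories are just the cells of a two-by-two grid, which a tiny sketch makes explicit. The function name and the example entries are my own labels, and the sketch ignores the "up to a certain point" caveat: a congested club good starts to behave like a rival one.

```python
def classify_good(excludable: bool, rival: bool) -> str:
    """Map the two properties onto the four textbook categories."""
    if excludable:
        return "private good" if rival else "club good"
    return "common-pool resource" if rival else "public good"


# Example goods taken from the list above, as (excludable, rival) pairs.
examples = {
    "broadcast television": (False, False),
    "fish stocks": (False, True),
    "my computer": (True, True),
    "apartment swimming pool": (True, False),
}

for name, (excludable, rival) in examples.items():
    print(f"{name}: {classify_good(excludable, rival)}")
```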

Club goods are perhaps the most interesting class of goods, because they blend properties of the three better understood classes. They aren’t open to all, but they are shared among many. They can be overwhelmed by congestion, but up until that point, it doesn’t really matter how many people are using them. Think of a gym; as long as there’s at least one free machine of every type, it’s no less convenient than your home.

Club goods offer cost savings over private goods, because you don’t have to buy something that mostly sits unused (again, think of gym equipment). People other than you can use it when it would otherwise sit around and those people can help you pay the cost. It’s for this reason that club goods represent an excellent opportunity for the right entrepreneur to turn a profit.

I currently divide tech start-ups into three classes. There are the Googles of the world, who use network effects or big data to sell advertising more effectively. There are companies like the one I work for that take advantage of modern technology to do things that were never possible before. And then there are those that are slowly and inexorably turning private goods into club goods.

I think this last group of companies (which include Netflix, Spotify, Uber, Lyft, and Airbnb) may be the ones that ultimately have the biggest impact on how we order our lives and what we buy. To better understand how these companies are driving this transformation, let’s go through them one by one, then talk about what it could all mean.

Netflix

When I was a child, my parents bought a video cassette player, then a DVD player, then a Blu-ray player. We owned a hundred or so video cassettes, mostly whatever movies my brother and I were obsessed with enough to want to own. Later, we found a video rental store we liked and mostly started renting movies. We never owned more than 30 DVDs and 20 Blu-rays.

Then I moved out. I have bought five DVDs since – they came as a set from Kickstarter. Anything else I wanted to watch, I got via Netflix. A few years later, the local video rental store closed down and my parents got an AppleTV and a Netflix of their own.

Buying a physical movie means buying a private good. Video rental stores can be accurately modeled as a type of club good, because even if the movie you want is already rented out, there’s probably one that you want to watch almost as much that is available. This is enough to make them approximately non-rival, while the fact that it isn’t free to rent a movie means that rented videos are definitely excludable.

Netflix represents the next evolution in this business model. As long as the Netflix engineers have done their job right, there’s no amount of watching movies I can do that will prevent you from watching movies. The service is almost truly non-rival.

Movie studios might not feel the effects of Netflix turning a large chunk of the market for movies into one focused on club goods; they’ll still get paid by Netflix. But the switch to Netflix must have been incredibly damaging for the physical media and player manufacturers. When everyone went from cassettes to DVDs or DVDs to Blu-rays, there was still a market for their wares. Now, that market is slowly and inexorably disappearing.

This isn’t just a consequence of technology. The club good business model offers such amazing cost savings that it drove a change in which technology was dominant. When you bought a movie, it would spend almost all of its life sitting on a shelf. Now Netflix acts as your agent, buying movies (or rather, their rights) and distributing them such that they’re always being played and almost never sitting on the shelf.

Spotify

Spotify is very similar to Netflix. Previously, people bought physical cassettes (I’m just old enough that I remember making mix tapes from the radio). Then they switched to CDs. Then it was MP3s bought online (or, almost more likely, pirated online). But even pirating music is falling out of favour these days. Apple, Google, Amazon, and Spotify are all competing to offer unlimited music streaming to customers.

Music differs from movies in that it has a long tradition of being a public good – via broadcast radio. While that hasn’t changed yet (radio is still going strong), I do wonder how much longer the public option for music will exist, especially given the trend away from private cars that I think companies like Uber and Lyft are going to (pardon the pun) drive.

Uber and Lyft

I recently thought about buying a car. I was looking at the all-electric Kia Soul, which has a huge government rebate (for a little while yet) and financing terms that equate to negative real interest. Despite all these advantages, it turns out that when you sit down and run the numbers, it would still be cheaper for me to use Uber and Lyft to get everywhere.

We are starting to see the first, preliminary (and possibly illusory) evidence that Uber and Lyft are causing the public to change their preference away from owning cars.

A car you’ve bought is a private good, while Uber and Lyft are clearly club goods. Surge pricing means that there are basically always enough drivers for everyone who wants to go anywhere using the system.

When you buy a car, you’re signing up for it to sit around useless for almost all of its life. This is similar to what happens when you buy exercise equipment, which means the logic behind cars as a club good is just as compelling as the logic behind gyms. Previously, we hadn’t been able to share cars very efficiently because of technological limitations. Dispatching a taxi, especially to an area outside of a city centre, was always spotty, time consuming and confusing. Car-pooling to work was inconvenient.

As anyone who has used a modern ride-sharing app can tell you, inconvenient is no longer an apt descriptor.

There is a floor on how few cars we can get by on. To avoid congestion in a club good, you typically have to provision for peak load. Luckily, peak load (for anything that can sensibly be turned into a club good) always requires fewer resources than would be needed if everyone went out and bought the shared good themselves.
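Here is a rough sketch of the peak-load idea, using a made-up day of trips. The trip times are purely illustrative, and the sketch ignores everything that makes real fleets harder (repositioning empty cars, geography, spare capacity). The peak number of cars in use can never exceed the number of riders, and it is strictly lower unless everyone needs a car at the same moment.

```python
def peak_concurrent(trips):
    """Largest number of trips in progress at once.

    `trips` is a list of (start, end) pairs; a car freed at time t can be
    reused by a trip starting at time t.
    """
    events = []
    for start, end in trips:
        events.append((start, 1))   # a car is taken
        events.append((end, -1))    # a car is freed
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so cars are freed before they are taken again.
    events.sort()
    in_use = peak = 0
    for _, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return peak


# Hypothetical day: five people, one trip each (hours on a 24-hour clock).
trips = [(8, 9), (8, 9), (9, 10), (17, 18), (17.5, 18.5)]
print(len(trips), peak_concurrent(trips))  # 5 2
```

In this toy example five private cars could be replaced by two shared ones.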

Even “just” substantially decreasing the absolute number of cars out there will be incredibly disruptive to the automotive sector if they don’t correctly predict the changing demand for their products.

It’s also true that increasing the average utilisation of cars could change how our cities look. Parking lots are necessary when cars are a private good, but are much less useful when they become club goods. It is my hope that malls built in the middle of giant parking moats look mighty silly in twenty years.

Airbnb

Airbnb is the most ambiguous example I have here. As originally conceived, it would have driven the exact same club good transformation as the other services listed. People who were on vacation or otherwise out of town would rent out their houses to strangers, increasing the utilisation of housing and reducing the need for dedicated hotels to be built.

Airbnb is sometimes used in this fashion. It’s also used to rent out extra rooms in an otherwise occupied house, which accomplishes almost the same thing.

But some amount of Airbnb usage is clearly taking place in houses or condos that otherwise would have been rental stock. When used in this way, it’s taking advantage of a regulatory grey zone to undercut hotel pricing. Insofar as this might result in a longer-term change towards regulations that are generally cheaper to comply with, this will be good for consumers, but it won’t really be transformational.

The great promise of club goods is that they might lead us to use less physical stuff overall, because where previously each person would buy one of a thing, now only enough units must be purchased to satisfy peak demand. If Airbnb is just shifting around where people are temporary residents, then it won’t be an example of the broader benefits of club goods (even if it provides other benefits to its customers).

When Club Goods Eat The Economy

In every case (except potentially Airbnb) above, I’ve outlined how the switch from private goods to club goods is resulting in less consumption. For music and movies, it is unclear if this switch is what is providing the primary benefit. My intuition is that the club good model actually did change consumption patterns for physical copies of movies (because my impression is that few people ever did online video rentals via e.g. iTunes), whereas the MP3 revolution was what really shrank the footprint of music media.

This switch in consumption patterns and corresponding decrease in the amount of consumption that is necessary to satisfy preferences is being primarily driven by a revolution in logistics and bandwidth. The price of club goods has always compared favourably with that of private goods. The only thing holding people back was inconvenience. Now programmers are steadily figuring out how to make that inconvenience disappear.

On the other hand, increased bandwidth has made it easier to turn any sort of digitizable media into a club good. There’s an old expression among programmers: never underestimate the bandwidth of a station wagon full of cassettes (or CDs, or DVDs, or whatever physical storage media one grew up with) hurtling down the highway. For a long time, the only way to get a 1GB movie to a customer without an appallingly long buffering period was to physically ship it (on a 56kbit/s connection, this movie would take one day and fifteen hours to download, while the contents of the aforementioned station wagon, at 500 movies, would take 118 weeks to download).
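The download figures are straightforward to reproduce. This sketch assumes decimal gigabytes (10^9 bytes) and a modem that sustains its full 56 kbit/s with no protocol overhead, both of which flatter the modem; the names are mine.

```python
def download_seconds(size_bytes, link_bits_per_second):
    """Time to transfer `size_bytes` over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_second


GB = 10**9       # assuming decimal gigabytes
MODEM = 56_000   # 56 kbit/s, assumed fully sustained

one_movie = download_seconds(1 * GB, MODEM)
wagon = download_seconds(500 * GB, MODEM)

print(f"one movie: {one_movie / 3600:.1f} hours")          # one movie: 39.7 hours
print(f"500 movies: {wagon / (7 * 24 * 3600):.0f} weeks")  # 500 movies: 118 weeks
```

39.7 hours is the "one day and fifteen hours" quoted above, with a little rounding.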

Change may start out slow, but I expect to see it accelerate quickly. My generation is the first to have had the internet from a very young age. The generation after us will be the first unable to remember a time before it. We trust apps like Uber and Airbnb much more than our parents, and our younger siblings trust them even more than us.

While it was only kids who trusted the internet, these new club good businesses couldn’t really affect overall economic trends. But as we come of age and start to make major economic decisions, like buying houses and cars, our natural tendency to turn towards the big tech companies and the club goods they peddle will have ripple effects on an economy that may not be prepared for it.

When that happens, there’s only one thing that is certain: there will be yet another deluge of newspaper columns talking about how millennials are destroying everything.

Advice, Literature, Model

Sanderson’s Law Applies To Cultures Too

[Warning: Contains spoilers for The Sunset Mantle, Vorkosigan Saga (Memory and subsequent), Dune, and Chronicles of the Kencyrath]

For the uninitiated, Sanderson’s Law (technically, Sanderson’s First Law of Magic) is:

An author’s ability to solve conflict with magic is DIRECTLY PROPORTIONAL to how well the reader understands said magic.

Brandon Sanderson wrote this law to help new writers come up with satisfying magical systems. But I think it’s applicable beyond magic. A recent experience has taught me that it’s especially applicable to fantasy cultures.

I recently read Sunset Mantle by Alter S. Reiss, a book that falls into one of my favourite fantasy sub-genres: hopeless siege tales.

Sunset Mantle is what’s called secondary world fantasy; it takes place in a world that doesn’t share a common history or culture (or even necessarily biosphere) with our own. Game of Thrones is secondary world fantasy, while Harry Potter is primary world fantasy (because it takes place in a different version of our world, which we chauvinistically call the “primary” one).

Secondary world fantasy gives writers a lot more freedom to play around with cultures and create interesting set-pieces when cultures collide. If you want to write a book where the Roman Empire fights a total war against the Chinese Empire, you’re going to have to put in a master’s thesis worth of work to explain how that came about (if you don’t want to be eviscerated by pedants on the internet). In a secondary world, you can very easily have a thinly veiled stand-in for Rome right next to a thinly veiled analogue of China. Give readers some familiar sounding names and culture touchstones and they’ll figure out what’s going on right away, without you having to put in effort to make it plausible in our world.

When you don’t use subtle cues, like names or cultural touchstones (for example: imperial exams and eunuchs for China, gladiatorial fights and the cursus honorum for Rome), you risk leaving your readers adrift.

Many of the key plot points in Sunset Mantle hinge on obscure rules in an invented culture/religion that doesn’t bear much resemblance to any that I’m familiar with. It has strong guest rights, like many steppes cultures; it has strong charity obligations and monotheistic strictures, like several historical strands of Christianity; it has a strong caste system and rules of ritual purity, like Hinduism; and it has a strong warrior ethos, complete with battle rage and rules for dealing with it, similar to common depictions of Norse cultures.

These actually fit together surprisingly well! Reiss pulled off an entertaining book. But I think many of the plot points fell flat because they were almost impossible to anticipate. The lack of any sort of consistent real-world analogue to the invented culture meant that I never really had an intuition of what it would demand in a given situation. This meant that all of the problems in the story that were solved via obscure points of culture weren’t at all satisfying to me. There was build up, but then no excitement during the resolution. This was common enough that several chunks of the story didn’t really work for me.

Here’s one example:

“But what,” asked Lemist, “is a congregation? The Ayarith school teaches that it is ten men, and the ancient school of Baern says seven. But among the Irimin school there is a tradition that even three men, if they are drawn in together into the same act, by the same person, that is a congregation, and a man who has led three men into the same wicked act shall be put to death by the axe, and also his family shall bear the sin.”

All the crowd in the church was silent. Perhaps there were some who did not know against whom this study of law was aimed, but they knew better than to ask questions, when they saw the frozen faces of those who heard what was being said.

(Reiss, Alter S.. Sunset Mantle (pp. 92-93). Tom Doherty Associates. Kindle Edition.)

This means protagonist Cete’s enemy erred greatly by sending three men to kill him and had better cut it out if he doesn’t want to be executed. It’s a cool resolution to a plot point – or would be if it hadn’t taken me utterly by surprise. As it is, it felt kind of like a cheap trick to get the author out of a hole he’d written himself into, like the dreaded deus ex machina – god from the machine – that ancient playwrights used to resolve conflicts they otherwise couldn’t.

(This is the point where I note that it is much harder to write than it is to criticize. This blog post is about something I noticed, not necessarily something I could do better.)

I’ve read other books that do a much better job of using sudden points of culture to resolve conflict in a satisfying manner. Lois McMaster Bujold (I will always be recommending her books) strikes me as particularly apt. When it comes time for a key character of hers to make a lateral career move into a job we’ve never heard of before, it feels satisfying because the job is directly in line with legal principles for the society that she laid out six books earlier.

The job is that of Imperial Auditor – a high-powered investigator who reports directly to the emperor and has sweeping powers – and it’s introduced when protagonist Miles loses his combat career in Memory. The principles I think it is based on are articulated in the novella The Mountains of Mourning: “the spirit was to be preferred over the letter, truth over technicalities. Precedent was held subordinate to the judgment of the man on the spot”.

Imperial Auditors are given broad discretion to resolve problems as they see fit. The main rule is: make sure the emperor would approve. We later see Miles using the awesome authority of this office to make sure a widow gets the pension she deserves. The letter of the law wasn’t on her side, but the spirit was, and Miles, as the Auditor on the spot, was empowered to make the spirit speak louder than the letter.

Wandering around my bookshelves, I was able to grab a couple more examples of satisfying resolutions to conflicts that hinged on guessable cultural traits:

  • In Dune, Fremen settle challenges to leadership via combat. Paul Muad’Dib spends several years as their de facto leader, while another man, Stilgar, holds the actual title. This situation is considered culturally untenable and Paul is expected to fight Stilgar so that he can lead properly. Paul is able to avoid this unwanted fight to the death (he likes Stilgar) by appealing to the only thing Fremen value more than their leadership traditions: their well-established pragmatism. He says that killing Stilgar before the final battle would be little better than cutting off his own arm right before it. If Frank Herbert hadn’t mentioned the extreme pragmatism of the Fremen (to the point that they render down their dead for water) several times, this might have felt like a cop-out.
  • In The Chronicles of the Kencyrath, it looks like convoluted politics will force protagonist Jame out of the military academy of Tentir. But it’s mentioned several times that the NCOs who run the place have their own streak of honour that allows them to subvert their traditionally required oaths to their lords. When Jame redeems a stain on Tentir’s collective honour, this oath to the college gives them an opening to keep her there and keep their oaths to their lords. If PC Hodgell hadn’t spent so long building up the internal culture of Tentir, this might have felt forced.

It’s hard to figure out where good foreshadowing ends and good cultural creation begins, but I do think there is one simple thing an author can do to make culture a satisfying source of plot resolution: make a culture simple enough to stereotype, at least at first.

If the other inhabitants of a fantasy world are telling off-colour jokes about this culture, what do they say? A good example of this done explicitly comes from Mass Effect: “Q: How do you tell when a Turian is out of ammo? A: He switches to the stick up his ass as a backup weapon.” 

(Even if you’ve never played Mass Effect, you now know something about Turians.)

At the same time as I started writing this, I started re-reading PC Hodgell’s The Chronicles of the Kencyrath, which provided a handy example of someone doing everything right. The first three things we learn about the eponymous Kencyr are:

  1. They heal very quickly
  2. They dislike their God
  3. Their honour code is strict enough that lying is a deadly crime and calling someone a liar a deathly insult

There are eight more books in which we learn all about the subtleties of their culture and religion. But within the first thirty pages, we have enough information that we can start making predictions about how they’ll react to things and what’s culturally important.

When Marc, a solidly dependable Kencyr who is working as a guard and bound by Kencyr cultural laws to loyally serve his employer, lets the rather more eccentric Jame escape from a crime scene, we instantly know that him choosing her over his word is a big deal. And indeed, while he helps her escape, he also immediately tries to kill himself. Jame is only able to talk him out of it by explaining that she hadn’t broken any laws there. It was already established that in the city of Tai-Tastigon, only those who physically touch stolen property are in legal jeopardy. Jame never touched the stolen goods; she was just on the scene. Marc didn’t actually break his oath and so decides to keep living.

God Stalk is not a long book, so the fact that PC Hodgell was able to set all of this up and have it feel both exciting in the moment and satisfying in the resolution is quite remarkable. It’s a testament to what effective cultural distillation, plus a few choice tidbits of extra information, can do for a plot.

If you don’t come up with a similar distillation and convey it to your readers quickly, there will be a period where you can’t use culture as a satisfying source of plot resolution. It’s probably no coincidence that I noticed this in Sunset Mantle, which is a long(-ish) novella. Unlike Hodgell, Reiss isn’t able to develop a culture in such a limited space, perhaps because his culture has fewer obvious touchstones.

Sanderson’s Second Law of Magic can be your friend here too. As he stated it, the law is:

The limitations of a magic system are more interesting than its capabilities. What the magic can’t do is more interesting than what it can.

Similarly, the taboos and strictures of a culture are much more interesting than what it permits. Had Reiss built up a quick sketch of complicated rules around commanding and preaching (with maybe a reference that there could be surprisingly little theological difference between military command and being behind a pulpit), the rule about leading a congregation astray would have fit neatly into place with what else we knew of the culture.

Having tight constraints imposed by culture doesn’t just allow for plot resolution. It also allows for plot generation. In The Warrior’s Apprentice, Miles gets caught up in a seemingly unwinnable conflict because he gave his word; several hundred pages earlier Bujold establishes that breaking a word is, to a Barrayaran, roughly equivalent to sundering your soul.

It is perhaps no accident that the only thing we learn initially about the Kencyr that isn’t a descriptive fact (like their healing and their fraught theological state) is that honour binds them and can break them. This constraint, that all Kencyr characters must be honourable, does a lot of work driving the plot.

This then would be my advice: when you wish to invent a fantasy culture, start simple, with a few stereotypes that everyone else in the world can be expected to know. Make sure at least one of them is an interesting constraint on behaviour. Then add in depth that people can get to know gradually. When you’re using the culture as a plot device, make sure to stick to the simple stereotypes or whatever other information you’ve directly given your reader. If you do this, you’ll develop rich cultures that drive interesting conflicts and you’ll be able to use cultural rules to consistently resolve conflict in a way that will feel satisfying to your readers.

Advice, Model

Context Windows

When you’re noticing that you’re talking past someone, what does it look like? Do you feel like they’re ignoring all the implications of the topic at hand (“yes, I know the invasion of Iraq is causing a lot of pain, but I think the important question is, ‘did they have WMDs?’”)? Or do you feel like they’re avoiding talking about the object-level point in favour of other considerations (“factory farmed animals might suffer, but before we can consider whether that’s justified or not, shouldn’t we decide whether we have any obligation to maximize the number of living creatures?”)?

I’m beginning to suspect that many tense disagreements and confused, fruitless conversations are caused by differences in how people conceive of and process the truth. More, I think I have a model that explains why some people can productively disagree with anyone and everyone, while others get frustrated very easily with even their closest friends.

The basics of this model come from a piece that Jacob Falkovich wrote for Quillette. He uses two categories, “contextualizers” and “decouplers”, to analyze an incredibly unproductive debate (about race and IQ) between Vox’s Ezra Klein and Dr. Sam Harris.

Klein is the contextualizer, and it is a worldview that comes naturally to a political journalist. Contextualizers see ideas as embedded in a context. Questions of “who does this affect?”, “how is this rooted in society?”, and “what are the (group) identities of people pushing this idea?” are the bread and butter of contextualizers. One of the first things Klein says in his debate with Harris is:

Here is my view: I think you have a deep empathy for Charles Murray’s side of this conversation, because you see yourself in it [because you also feel attacked by “politically correct” criticism]. I don’t think you have as deep an empathy for the other side of this conversation. For the people being told once again that they are genetically and environmentally and at any rate immutably less intelligent and that our social policy should reflect that. I think part of the absence of that empathy is it doesn’t threaten you. I don’t think you see a threat to you in that, in the way you see a threat to you in what’s happened to Murray. In some cases, I’m not even quite sure you heard what Murray was saying on social policy either in The Bell Curve and a lot of his later work, or on the podcast. I think that led to a blind spot, and this is worth discussing.

Klein is highlighting what he thinks is the context that probably informs Harris’s views. He’s suggesting that Harris believes Charles Murray’s points about race and IQ because they have a common enemy. He’s aware of the human tendency to like ideas that come from people we feel close to (myside bias) – or that put a stick in the eye of people we don’t like.

There are other characteristics of contextualizers. They often think thought experiments are pointless, given that thought experiments try to strip away all the complex ways that society affects our morality and our circumstances. When contextualizers make mistakes, it is often because they fall victim to the “ought-is” fallacy; they assume that truths with bad outcomes are not truths at all.

Harris, on the other hand, is a decoupler. Decoupling involves separating ideas from context, from personal experience, from consequences, from anything but questions of truth or falsehood and using this skill to consider them in the abstract. Decoupling is necessary for science because it’s impossible to accurately check a theory when you hope it to be true. Harris’s response to Klein’s opening salvo is:

I think your argument is, even where it pretends to be factual, or wherever you think it is factual, it is highly biased by political considerations. These are political considerations that I share. The fact that you think I don’t have empathy for people who suffer just the starkest inequalities of wealth and politics and luck is just, it’s telling and it’s untrue. I think it’s even untrue of Murray. The fact that you’re conflating the social policies he endorses — like the fact that he’s against affirmative action and he’s for universal basic income, I know you don’t happen agree with those policies, you think that would be disastrous — there’s a good-faith argument to be had on both sides of that conversation. That conversation is quite distinct from the science and even that conversation about social policy can be had without any allegation that a person is racist, or that a person lacks empathy for people who are at the bottom of society. That’s one distinction I want to make.

Harris is pointing out that questions of whether his beliefs will have good or bad consequences or who they’ll hurt have nothing to do with the question of whether they are true. He might care deeply about the answers to those questions, but he believes that it’s a dangerous mistake to let that guide how you evaluate an idea. Scientists who fail to do that tend to get caught up in the replication crisis.

When decouplers err, it is often because of the is-ought fallacy. They fail to consider how empirical truths can have real world consequences and fail to consider how labels that might be true in the aggregate can hurt individuals.

When you’re arguing with someone who doesn’t contextualize as much as you do, it can feel like arguing about useless hypotheticals. I once had someone start a point about police shootings and gun violence with “well, ignoring all of society…”. This prompted immediate groans.

When arguing with someone who doesn’t decouple as much as you do, it can feel useless and mushy. A co-worker once said to me “we shouldn’t even try and know the truth there – because it might lead people to act badly”. I bit my tongue, but internally I wondered how, absent the truth, we can ground disagreements in anything other than naked power.

Throughout the debate between Harris and Klein, both of them get frustrated at the other for failing to think like they do – which is why it provided such a clear example for Falkovich. If you read the transcripts, you’ll see a clear pattern: Klein ignores questions of truth or falsehood and Harris ignores questions of right and wrong. Neither one is willing to give an inch here, so there’s no real engagement between them.

This doesn’t have to be the case whenever people who prefer context or prefer to deal with the direct substance of an issue interact.

My theory is that everyone has a window that stretches from the minimum amount of context they like in conversations to the minimum amount of substance. Theoretically, this window could stretch from 100% context and no substance to 100% substance and no context.

But practically no one has tastes that broad. Most people accept a narrower range of arguments. Here’s what three compatible friends might look like:

We should expect to see some correlation between the minimum and maximum amount of context people want to get. Windows may vary in size, but in general, feeling put off by lots of decoupling should correlate with enjoying context.


Here we see people with varyingly sized strike zones, but with their dislike of context correlated with their appreciation for substance.

Klein and Harris disagreed so unproductively not just because they give first billing to different things, but because their world views are different enough that there is absolutely no overlap between how they think and talk about things.

One plausible graph of how Klein and Harris like to think about problems (quotes come from the transcript of their podcast). From this, it makes sense that they couldn’t have a productive conversation. There’s no overlap in how they model the world.
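The window idea can be made concrete with a small sketch. This is only an illustration under assumptions of mine: I treat the context-to-substance spectrum as a number from 0.0 (pure context) to 1.0 (pure substance), model each person's window as an interval on it, and pick every placement below arbitrarily. The names `Window` and `overlap` are hypothetical, not anything from Falkovich's piece.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A person's tolerated range on an assumed 0-to-1 axis.

    0.0 stands for pure context and 1.0 for pure substance. The numeric
    axis is an assumption made for this sketch, not a measured quantity.
    """
    name: str
    low: float
    high: float

    def __post_init__(self):
        # Reject intervals that are reversed or fall outside the axis.
        if not 0.0 <= self.low <= self.high <= 1.0:
            raise ValueError("need 0 <= low <= high <= 1")

    @property
    def width(self) -> float:
        """How broad this person's tastes are."""
        return self.high - self.low


def overlap(a: Window, b: Window) -> float:
    """Length of the range both people accept (0.0 if there is none)."""
    return max(0.0, min(a.high, b.high) - max(a.low, b.low))


# Hypothetical placements, chosen only to illustrate the idea.
klein = Window("Klein", 0.05, 0.35)
harris = Window("Harris", 0.65, 0.95)
broad_friend = Window("broad friend", 0.20, 0.80)

print(overlap(klein, harris))         # no shared range
print(overlap(klein, broad_friend))   # some shared range
print(overlap(harris, broad_friend))  # some shared range
```

In this toy version, two people can argue productively only when `overlap` is greater than zero, and someone with a wide window (large `width`) overlaps with more people, which matches the intuition that broad tastes make for easier disagreements.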

I’ve found thinking about windows of context and substance, rather than just the dichotomous categories, very useful for analyzing how my friends and I tend to agree and disagree.

Some people I know can hold very controversial views without ever being disagreeable. They are good at picking up on which sorts of arguments will work with their interlocutors and sticking to those. These people are no doubt aided by rather wide context windows. They can productively think and argue with varying amounts of context and substance.

Other people feel incredibly difficult to argue with. These are the people who are very picky about what arguments they’ll entertain. If I sort someone into this internal category, it’s because I’ve found that one day they’ll dismiss what I say as too nitty-gritty, while the next day they criticize me for not being focused enough on the issue at hand.

What I’ve started to realize is that people I find particularly finicky to argue with may just have a fairly narrow strike zone. For them, it’s simultaneously easy for arguments to feel devoid of substance or devoid of context.

I think one way that you can make arguments with friends more productive is to explicitly lay out the window in which you like to be convinced. Sentences like: “I understand what you just said might convince many people, but I find arguments about the effects of beliefs intensely unsatisfying” or “I understand that you’re focused on what studies say, but I think it’s important to talk about the process of knowledge creation and I’m very unlikely to believe something without first analyzing what power hierarchies created it” are the guideposts by which you can show people your context window.

Literature, Model

Does Amateurish Writing Exist

[Warning: Spoilers for Too Like the Lightning]

What marks writing as amateurish (and whether “amateurish” or “low-brow” works are worthy of awards) has been a topic of contention in the science fiction and fantasy community for the past few years, with the rise of Hugo slates and the various forms of “puppies”.

I’m not talking about the learning works of genuine amateurs. These aren’t stories that use big words for the sake of sounding smart (and at the cost of slowing down the stories), or over the top fanfiction-esque rip-offs of more established works (well, at least not since the Wheel of Time nomination in 2014). I’m talking about that subtler thing, the feeling that bubbles up from the deepest recesses of your brain and says “this story wasn’t written as well as it could be”.

I’ve been thinking about this a lot recently because about ¾ of the way through Too Like The Lightning by Ada Palmer, I started to feel myself put off [1]. And the only explanation I had for this was the word “amateurish” – which popped into my head devoid of any reason. This post is an attempt to unpack what that means (for me) and how I think it has influenced some of the genuine disagreements around rewarding authors in science fiction and fantasy [2]. Your tastes might be calibrated differently and if you disagree with my analysis, I’d like to hear about it.

Now, there are times when you know something is amateurish and that’s okay. No one should be surprised that John Ringo’s Paladin of Shadows series, books that he explicitly wrote for himself, are parsed by most people as pretty amateurish. When pieces aren’t written explicitly for the author only, I expect some consideration of the audience. Ideally the writer should be having fun too, but if they’re writing for publication, they have to be writing to an audience. This doesn’t mean that they must write exactly what people tell them they want. People can be a terrible judge of what they want!

This also doesn’t necessarily imply pandering. People like to be challenged. If you look at the most popular books of the last decade on Goodreads, few of them could be described as pandering. I’m familiar with two of the top three books there and both of them kill off a fan favourite character. People understand that life involves struggle. Lois McMaster Bujold – who has won more Hugo awards for best novel than any other living author – once said she generated plots by considering “what’s the worst possible thing I can do to these people?” The results of this method speak for themselves.

Meditating on my reaction to books like Paladin of Shadows in light of my experiences with Too Like The Lightning is what led me to believe that the more technically proficient “amateurish” books are those that lose sight of what the audience will enjoy and follow just what the author enjoys. This may involve a character that the author heavily identifies with – the Marty Stu or Mary Sue phenomenon – who is lovingly described overcoming obstacles and generally being “awesome” but doesn’t “earn” any of this. It may also involve gratuitous sex, violence, engineering details, gun details, political monologuing (I’m looking at you, Atlas Shrugged), or tangents about constitutional history (this is how most of the fiction I write manages to become unreadable).

I realized this when I was reading Too Like the Lightning. I loved the world building and I found the characters interesting. But (spoilers!) when it turned out that all of the politicians were literally in bed with each other or when the murders the protagonist carried out were described in grisly, unrepentant detail, I found myself liking the book a lot less. This is – I think – what spurred the label amateurish in my head.

I think this is because (in my estimation), there aren’t a lot of people who actually want to read about brutal torture-execution or literally incestuous politics. It’s not (I think) that I’m prudish. It seemed like some of the scenes were written to be deliberately off-putting. And I understand that this might be part of the theme of the work and I understand that these scenes were probably necessary for the author’s creative vision. But they didn’t work for me and they seemed like a thing that wouldn’t work for a lot of people that I know. They were discordant and jarring. They weren’t pulled off as well as they would have had to be to keep me engaged as a reader.

I wonder if a similar process is what caused the changes that the Sad Puppies are now lamenting at the Hugo Awards. To many readers, the sexualized violence or sexual violence that can find its way into science fiction and fantasy books (I’d like to again mention Paladin of Shadows) is incredibly off-putting. I find it incredibly off-putting. Books that incorporate a lot of this feel like they’re ignoring the chunk of audience that is me and my friends and it’s hard while reading them for me not to feel that the writers are fairly amateurish. I normally prefer works that meditate on the causes and uses of violence when they incorporate it – I’d put N.K. Jemisin’s truly excellent Broken Earth series in this category – and it seems like readers who think this way are starting to dominate the Hugos.

For the people who previously had their choices picked year after year, this (as well as all the thinkpieces explaining why their favourite books are garbage) feels like an attack. Add to this the fact that some of the books that started winning had a more literary bent and you have some fans of the genre believing that the Hugos are going to amateurs who are just cruising to victory by alluding to famous literary works. These readers look suspiciously on crowds who tell them they’re terrible if they don’t like books that are less focused on the action and excitement they normally read for. I can see why that’s a hard sell, even though I’ve thoroughly enjoyed the last few Hugo winners [3].

There’s obviously an inferential gap here, if everyone can feel angry about the crappy writing everyone else likes. For my part, I’ll probably be using “amateurish” only to describe books that are technically deficient. For books that are genuinely well written but seem to focus more on what the author wants than (on what I think) their likely audience wants, well, I won’t have a snappy term, I’ll just have to explain it like that.

Footnotes

[1] A disclaimer: the work of a critic is always easier than that of a creator. I’m going to be criticizing writing that’s better than my own here, which is always a risk. Think of me not as someone criticizing from on high, but frantically taking notes right before a test I hope to barely pass. ^

[2] I want to separate the Sad Puppies, who I view as people sad that action-packed books were being passed over in favour of more literary ones, from the Rabid Puppies, who just wanted to burn everything to the ground. I’m not going to make any excuses for the Rabid Puppies. ^

[3] As much as I can find some science fiction and fantasy too full of violence for my tastes, I’ve also had little to complain about in the past, because my favourite author, Lois McMaster Bujold, has been reliably winning Hugo awards since before I was born. I’m not sure why there was never a backlash around her books. Perhaps it’s because they’re still reliably space opera, so class distinctions around how “literary” a work is don’t come up when Bujold wins. ^

Model, Politics, Quick Fix

The Awkward Dynamics of the Conservative Leadership Debates

Tanya Granic Allen is the most idealistic candidate I’ve ever seen take the stage in a Canadian political debate. This presents some awkward challenges for the candidates facing her, especially Mulroney and Elliot.

First, there’s the simple fact of her idealism. I think Granic Allen genuinely believes everything she says. For her, knowing what’s right and what’s wrong is simple. There isn’t a whole lot of grey. She even (bless her) probably believes that this will be an advantage come election time. People overwhelmingly don’t like the equivocation of politicians, so Granic Allen must assume her unequivocal moral stances will be a welcome change.

For many people, it must be. Even for those who find it grating, it seems almost vulgar to attack her. It’s clear that she isn’t in this for herself and doesn’t really care about personal power. Whether she could maintain that innocence in the face of the very real need to make political compromises remains an open question, but for now she does represent a certain vein of ideological conservatism in a form that is unsullied by concerns around electability.

The problem here is that the stuff Granic Allen is pushing – “conscience rights” and “parental choice” – is exactly the sort of thing that can mobilize opposition to the PC party. Fighting against sex-ed and abortion might play well with the base, but Elliot and Mulroney know that unbridled social conservatism is one of the few things that can force the province’s small-l liberals to hold their noses and vote for the big-L Liberal Party. In an election where we can expect embarrassingly low turnout (it was 52% in 2014), this can play a major role.

A less idealistic candidate would temper themselves to help the party in the election. Granic Allen has no interest in doing this, which basically forces the pragmatists to navigate the tricky act of distancing themselves from her popular (with the base) proposals so that they might carry the general election.

Second, there’s the difficult interaction between the anti-rational and anti-empirical “common sense” conservatism pushed by Granic Allen and Ford and the pragmatic, informed conservatism of Elliot and Mulroney.

For Ford and Granic Allen, there’s a moral nature to truth. They live in a just world where something being good is enough to make it true. Mulroney and Elliot know that reality has an anti-partisan bias.

Take clean energy contracts. Elliot quite correctly pointed out that ripping up contracts willy-nilly will lead to a terrible business climate in Ontario. This is the sort of suggestion we normally see from the hard left (and have seen in practice in places the hard left idolizes, like Venezuela). But Granic Allen is committed to a certain vision of the world and in her vision of the world, government getting out of the way can’t help but be good.

Christine Elliot has (and this is a credit to her) shown that she’s not very ideological, in that she can learn how the world really works and subordinate ideology to truth, even when inconvenient. This would make her a more effective premier than either Granic Allen or Ford, but might hurt her in the leadership race. I’ve seen her freeze a couple times when she’s faced with defending how the world really works to an audience that is ideologically prevented from acknowledging the truth.

(See, for example, the look on her face when she was forced to defend her vote to ban conversion therapy. Elliot’s real defense of that bill probably involves phrases like “stuck in the past”, “ignorant quacks” and “vulnerable children who need to be protected from people like you”. But she knew that a full-throated defense of gender dysphoria as a legitimate problem wouldn’t win her any votes in this race.)

As Joseph Heath has pointed out, this tension between reality and ideology is responsible for the underrepresentation of modern conservatives among academics. Since the purpose of the academy is (broadly) truth-seeking, we shouldn’t be surprised to see it select against an ideology that explicitly rejects not only the veracity of many of the products of this truth seeking (see, for example, Granic Allen’s inability to clearly state that humans are causing climate change) but the worthwhileness of the whole endeavour of truth seeking.

When everything is trivially knowable via the proper application of “common-sense”, there’s no point in thinking deeply. There’s no point in experts. You just figure out what’s right and you do it. Anything else just confuses the matter and leaves the “little guy” to get shafted by the elites.

Third, the carbon tax has produced a stark, unvoiced split between the candidates. On paper, all are opposing it. In reality, only Ford and Granic Allen seriously believe they have any chance at stopping it. I’m fairly sure that Elliot and Mulroney plan to mount a token opposition, then quickly fold when they’re reminded that raising taxes and giving money to provinces is a thing the Federal Government is allowed to do. This means that they’re counting on money from the carbon tax to balance their budget proposals. They can’t say this, because Ford and Granic Allen are forcing them to the right here, but I would bet that they’re privately using it to reassure fiscally conservative donors about the deficit.

Being unable to discuss what is actually the centrepiece of their financial plans leaves Elliot and Mulroney unable to give very good information about how they plan to balance the budget. They have to fall back on empty phrases like “line by line by line audit” and “efficiencies”, because anything else feels like political suicide.

This shows just how effective Granic Allen has been at being a voice for the grassroots. By staking out positions that resonate with the base, she’s forcing other leadership contestants to endorse them or risk losing to her. Note especially how she’s been extracting promises from Elliot and Mulroney whenever possible – normally around things she knows they don’t want to agree to but that play well with the base. By doing this, she hopes to remove much of their room to maneuver in the general election and prevent any big pivot to the centre.

Whether this will work really depends on how costly politicians find breaking promises. Conventional wisdom holds that they aren’t particularly bothered by it. I wonder if Granic Allen’s idealism blinds her to this fact. I’m certainly sure that she wouldn’t break a promise except under the greatest duress.

On the left, it’s very common to see a view of politics that emphasizes pure and moral people. The problem with the system, says the communist, is that we let greedy people run it. If we just replaced them all with better people, we’d get a fair society. Granic Allen is certainly no communist. But she does seem to believe in the “just need good people” theory of government – and whether she wins or loses, she’s determined to bring all the other candidates with her.

This isn’t an incrementalist approach, which is why it feels so foreign to people like me. Granic Allen seems to be making the decision that she’d rather the Conservatives lose (again!) to the Liberals than that they win without a firm commitment to do things differently.

The conflict in the Ontario Conservative party – the conflict that surfaced when his rivals torpedoed Patrick Brown – is about how far the party is willing to go to win. The Ontario Conservatives aren’t the first party to go through this. When UK Labour members picked Jeremy Corbyn, they clearly put electability behind ideological purity.

In the Ontario PC party, Granic Allen and Ford have clearly staked out a position emphasizing purity. Mulroney and Elliot have just as clearly chosen to emphasize success. Now it’s up to the members. I’m very interested to see what they decide.