Economics, Model

You Shouldn’t Believe In Technological Unemployment Without Believing In Killer AI

[Epistemic Status: Open to being convinced otherwise, but fairly confident. 11 minute read.]

As interest in how artificial intelligence will change society increases, I’ve found it revealing to note what narratives people have about the future.

Some, like the folks at MIRI and OpenAI, are deeply worried that unsafe artificial general intelligences – AIs that can accomplish anything a person can – represent an existential threat to humankind. Others scoff at this, insisting that these are just the fever dreams of tech bros. The same news organizations that bash any talk of unsafe AI tend to believe that the real danger lies in robots taking our jobs.

Let’s express these two beliefs as separate propositions:

  1. It is very unlikely that AI and AGI will pose an existential risk to human society.
  2. It is very likely that AI and AGI will result in widespread unemployment.

Can you spot the contradiction between these two statements? In the common imagination, it would require an AI that can approximate human capabilities to drive significant unemployment. Given that humans are the largest existential risk to other humans (think thermonuclear war and climate change), how could equally intelligent and capable beings, bound to subservience, not present a threat?

People who’ve read a lot about AI or the labour market are probably shaking their heads right now. This explanation for the contradiction, while evocative, is a strawman. I do believe that at most one (and possibly neither) of the propositions I listed above is true and that the organizations peddling both cannot be trusted. But the reasoning is a bit more complicated than the standard line.

First, economics and history tell us that we shouldn’t be very worried about technological unemployment. There is a fallacy called “the lump of labour”, which describes the common belief that there is a fixed amount of labour in the world, with mechanical aid cutting down the amount of labour available to humans and leading to unemployment.

That this idea is a fallacy is evidenced by the fact that we’ve automated the crap out of everything since the start of the industrial revolution, yet the US unemployment rate is 3.9%. The unemployment rate hasn’t been this low since the height of the Dot-com boom, despite 18 years of increasingly sophisticated automation. Writing five years ago, when the unemployment rate was still elevated, Eliezer Yudkowsky claimed that slow NGDP growth was a more likely culprit for the slow recovery from the Great Recession than automation.

With the information we have today, we can see that he was exactly right. The US has had steady NGDP growth without any sudden downward spikes since mid-2014. This has corresponded to a constantly improving unemployment rate (it will obviously stop improving at some point, but if history is any guide, this will be because of a trade war or banking crisis, not automation). This improvement in the unemployment rate has occurred even as more and more industrial robots come online, the opposite of what we’d see if robots harmed job growth.

I hope this presents a compelling empirical case that the current level (and trend) of automation isn’t enough to cause widespread unemployment. The theoretical case comes from the work of David Ricardo, a 19th century British economist.

Ricardo did a lot of work in the early economics of trade, where he came up with the theory of comparative advantage. I’m going to use his original framing which applies to trade, but I should note that it actually applies to any exchange where people specialize. You could just as easily replace the examples with “shoveled driveways” and “raked lawns” and treat it as an exchange between neighbours, or “derivatives” and “software” and treat it as an exchange between firms.

The original example is rather older though, so it uses England and its close ally Portugal as the cast and wine and cloth as the goods. It goes like this: imagine that the world economy is reduced to two countries (England and Portugal), each producing two goods (wine and cloth). Portugal is uniformly more productive.

Hours of work to produce:

|          | Cloth | Wine |
|----------|-------|------|
| England  | 100   | 120  |
| Portugal | 90    | 80   |

Let’s assume people want cloth and wine in equal amounts and everyone currently consumes one unit per month. This means that the people of Portugal need to work 170 hours each month to meet their consumption needs and the people of England need to work 220 hours per month to meet their consumption needs.

(This example has the added benefit of showing another reason we shouldn’t fear productivity. England requires more hours of work each month, but in this example, that doesn’t mean less unemployment. It just means that the English need to spend more time at work than the Portuguese. The Portuguese have more time to cook and spend time with family and play soccer and do whatever else they want.)

If both countries traded with each other, treating cloth and wine as valuable in relation to how long they take to create (within that country), something interesting happens. You might think that Portugal makes a killing, because it is better at producing things. But in reality, both countries benefit roughly equally as long as they trade optimally.

What does an optimal trade look like? Well, England will focus on creating cloth and it will trade each unit of cloth it produces to Portugal for 9/8 barrels of wine, while Portugal will focus on creating wine and will trade this wine to England for 6/5 units of cloth. To meet the total demand for cloth, the English need to work 200 hours. To meet the total demand for wine, the Portuguese will have to work for 160 hours. Both countries now have more free time.
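If you want to check the arithmetic, here’s a minimal sketch in Python (the hour figures come from the table above; the function names and the framing of total demand as two units of each good are mine, not Ricardo’s):

```python
# Hours of labour needed to produce one unit of each good (from the table above).
hours = {
    "England":  {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90,  "wine": 80},
}

def autarky_hours(country):
    """Hours worked per month if a country produces one unit of each good itself."""
    return sum(hours[country].values())

def specialized_hours(country, specialty):
    """Hours worked per month if a country produces two units of its specialty:
    one for itself and one to trade away."""
    return 2 * hours[country][specialty]

for country, specialty in [("England", "cloth"), ("Portugal", "wine")]:
    before = autarky_hours(country)
    after = specialized_hours(country, specialty)
    print(f"{country}: {before}h in autarky, {after}h specializing, {before - after}h saved")

# England: 220h in autarky, 200h specializing, 20h saved
# Portugal: 170h in autarky, 160h specializing, 10h saved
```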

Perhaps workers in both countries are paid hourly wages, or perhaps they get bored of fun quickly. They could also continue to work the same number of hours, which would result in an extra 0.2 units of cloth and an extra 0.125 units of wine.

This surplus could be stored up against a future need. Or it could be that people only consumed one unit of cloth and one unit of wine each because of the scarcity in those resources. Add some more production in each and perhaps people will want more blankets and more drunkenness.

What happens if there is no shortage? If people don’t really want any more wine or any more cloth (at least at the prices they’re being sold at) and the producers don’t want goods piling up, this means prices will have to fall until every piece of cloth and barrel of wine is sold (when the price drops so that this happens, we’ve found the market clearing price).

If there is a downward movement in price, and if workers don’t want to cut back their hours or take a pay cut (note that because cloth and wine will necessarily be cheaper, this will only be a nominal pay cut; the amount of cloth and wine the workers can purchase will remain unchanged), and if all other costs of production are totally fixed, then it does indeed look like some workers will be fired (or have their hours cut).

So how is this an argument against unemployment again?

Well, here the simplicity of the model starts to work against us. When there are only two goods and people don’t really want more of either, it will be hard for anyone laid off to find new work. But in the real world, there are an almost infinite number of things you can sell to people, matched only by our boundless appetite for consumption.

To give just one trivial example, an oversupply of cloth and falling prices means that tailors can begin to do bolder and bolder experiments, perhaps driving more demand for fancy clothes. Some of the cloth makers can get into this market as tailors and replace their lost jobs.

(When we talk about the need for fewer employees, we assume the least productive employees will be fired. But I’m not sure if that’s correct. What if instead, the most productive or most potentially productive employees leave for greener pastures?)

Automation making some jobs vastly more efficient functions similarly. Jobs are displaced, not lost. Even when whole industries dry up, there’s little to suggest that we’re running out of jobs people can do. One hundred years ago, anyone who could afford to pay a full-time staff had one. Today, only the wealthiest do. Domestic service is one whole field that could once again employ thousands or millions of people, if automation pushed on jobs such that this sector became one of the places where humans have a very high comparative advantage.

This points to what might be a trend: as automation makes many things cheaper and (for some people) easier, there will be many who long for a human touch (would you want the local funeral director’s job to be automated, even if it was far cheaper?). Just because computers do many tasks cheaper or with fewer errors doesn’t necessarily mean that all (or even most) people will rather have those tasks performed by computers.

No matter how you manipulate the numbers I gave for England and Portugal, you’ll still find a net decrease in total hours worked if both countries trade based on their comparative advantage. Let’s demonstrate by comparing England to a hypothetical hyper-efficient country called “Automatia”.

Hours of work to produce:

|           | Cloth | Wine |
|-----------|-------|------|
| England   | 100   | 120  |
| Automatia | 2     | 1    |

Automatia is 50 times as efficient as England when it comes to producing cloth and 120 times as efficient when it comes to producing wine. Its citizens need to spend 3 hours tending the machines to get one unit of each, compared to the 220 hours the English need to toil.

If they trade with each other, with England focusing on cloth and Automatia focusing on wine, then there will still be a drop of 21 hours of labour-time. England will save 20 hours by shifting production from wine to cloth, and Automatia will save one hour by switching production from cloth to wine.

Interestingly, Automatia saved a greater percentage of its time than either Portugal or England did, even though Automatia is vastly more efficient. This shows something interesting in the underlying math. The percent of their time a person or organization saves engaging in trade isn’t related to any ratio in production speeds between it and others. Instead, it’s solely determined by the productivity ratio between its most productive tasks and its least productive ones.
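Here’s a quick derivation of that claim (my notation; it keeps the simplifying assumptions of the examples above: two goods, with everyone consuming one unit of each). Let a ≤ b be a country’s hours per unit for its faster and slower good:

```latex
\text{Autarky: } a + b \text{ hours.} \qquad
\text{Specializing: } 2a \text{ hours (two units of the faster good).}

\text{Fraction saved} = \frac{(a+b) - 2a}{a+b} = \frac{b-a}{b+a} = \frac{r-1}{r+1},
\qquad r = \frac{b}{a}.
```

The trading partner never appears in the formula. Within this simple model, England (r = 120/100) saves 1/11 ≈ 9%, Portugal (r = 90/80) saves 1/17 ≈ 6%, and Automatia (r = 2/1) saves 1/3 ≈ 33%, whomever they trade with.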

Now, we can’t always reason in percentages. At a certain point, people expect to get the things they paid for, which can make manufacturing times actually matter (just ask anyone who’s had to wait for a Kickstarter project that was scheduled to deliver in February – right when almost all manufacturing in China stops for the Chinese New Year and the unprepared see their schedules slip). When we’re reasoning in absolute numbers, we can see that the absolute amount of time saved does scale with the difference in efficiency between the two traders. Here, 21 hours were saved, 30% fewer than the 30 hours England and Portugal saved.

When you’re already more efficient, there’s less time for you to save.

This decrease in saved time did not hit our market participants evenly. England saved just as much time as it would trading with Portugal (which shows that the change in hours worked within a country or by an individual is entirely determined by the labour difference between low-advantage and high-advantage domestic sectors), while the more advanced participant (Automatia) saved 9 fewer hours than Portugal.

All of this is to say: if real live people are expecting real live goods and services within a time limit, it might be possible for humans to be displaced in almost all sectors by automation. Here, human labour would become entirely ineligible for many tasks, or the bar to human entry would exclude almost all. For this to happen, AI would have to be vastly more productive than us in almost every sector of the economy and humans would have to prefer this productivity (or other ancillary benefits of AI) over any value that a human could bring to the transaction (like kindness, legal accountability, or status).

This would definitely be a scary situation, because it would imply AI systems that are vastly more capable than any human. Given that this is well beyond our current level of technology and that Moore’s Law, which has previously been instrumental in technological progress, is drying up, we would almost certainly need to use weaker AI to design these sorts of systems. There’s no evidence that merely human-level performance in automating jobs will get us anywhere close to such a point.

If we’re dealing with recursively self-improving artificial agents, the risk is less “they will get bored of their slave labour and throw off the yoke of human oppression” and more “AI will be narrowly focused on optimizing for a specific task and will get better and better at optimizing for this task, to the point that we will all be killed when they turn the world into a paperclip factory”.

There are two reasons AI might kill us as part of their optimisation process. The first is that we could be a threat. Any hyper-intelligent AI monomaniacally focused on a goal could realize that humans might fear and attack it (or modify it to have different goals, which it would have to resist, given that a change in goals would conflict with its current goals) and decide to launch a pre-emptive strike. The second reason is that such an AI could wish to change the world’s biosphere or land usage in such a way as would be inimical to human life. If all non-marginal land was replaced by widget factories and we were relegated to the poles, we would all die, even if no ill will was intended.

It isn’t enough to just claim that any sufficiently advanced AI would understand human values. How is this supposed to happen? Even humans can’t enumerate human values and explain them particularly well, let alone express them in the sort of decision matrix or reinforcement environment that we currently use to create AI. It is not necessarily impossible to teach an AI human values, but all evidence suggests it will be very very difficult. If we ignore this challenge in favour of blind optimization, we may someday find ourselves converted to paperclips.

It is of course perfectly acceptable to believe that AI will never advance to the point where that becomes possible. Maybe you believe that AI gains have been solely driven by Moore’s Law, or that true artificial intelligence is impossible. I’m not sure this viewpoint is wrong.

But if AI will never be smart enough to threaten us, then I believe the math should work out such that it is impossible for AI to do everything we currently do (or could ever do) better than us. Absent such overpoweringly advanced AI, Ricardo’s comparative advantage principles should continue to hold and we should continue to see technological unemployment remain a monster under the bed: frequently fretted about, but never actually seen.

This is why I believe those two propositions I introduced way back at the start can’t both be true and why I feel like the burden of proof is on anyone believing in both to explain why they believe that economics has suddenly stopped working.

Coda: Inequality

A related criticism of improving AI is that it could lead to ever increasing inequality. If AI drives ever increasing profits, we should expect an increasing share of these to go to the people who control AI, which presumably will be people already rich, given that the development and deployment of AI is capital intensive.

There are three reasons why I think this is a bad argument.

First, profits are a signal. When entrepreneurs see high profits in an industry, they are drawn to it. If AI leads to high profits, we should see robust competition until those profits are no higher than in any other industry. The only thing that can stop this is government regulation that prevents new entrants from grabbing profit from the incumbents. This would certainly be a problem, but it wouldn’t be a problem with AI per se.

Second, I’m increasingly of the belief that inequality in the US is rising partially because the Fed’s current low inflation regime depresses real wage growth. Whether because of fear of future wage shocks, or some other effect, monetary history suggests that higher inflation somewhat consistently leads to high wage growth, even after accounting for that inflation.

Third, I believe that inequality is a political problem amenable to political solutions. If the rich are getting too rich in a way that is leading to bad social outcomes, we can just tax them more. I’d prefer we do this by making conspicuous consumption more expensive, but really, there are a lot of ways to tax people and I don’t see any reason why we couldn’t figure out a way to redistribute some amount of wealth if inequality gets worse and worse.

(By the way, rising income inequality is largely confined to America; most other developed countries lack a clear and sustained upwards trend. This suggests that we should look to something unique to America, like a pathologically broken political system, to explain why income inequality is rising there.

There is also separately a perception of increasing inequality of outcomes among young people world-wide as rent-seeking makes goods they don’t already own increase in price more quickly than goods they do own. Conflating these two problems can make it seem that countries like Canada are seeing a rise in income inequality when they in fact are not.)

Model, Politics

Why does surgery have such ineffective safety regulation?

Did you know that half of all surgical complications are preventable? In the US alone, this means that surgeons cause between 50,000 and 200,000 preventable deaths each year.

Surgeons are, almost literally, getting away with murder.

Why do we let them? Engineers who see their designs catastrophically fail often lose their engineering license, even when they’re found not guilty in criminal proceedings. If surgeons were treated like engineers, many of them wouldn’t be operating anymore.

Indeed, the death rate in surgery is almost unique among regulated professions. One person has died in a commercial aviation accident in the US in the last nine years. Structural-engineering-related accidents killed at most 251 people in the US in 2016 [1] and only approximately 4% of residential structure failures in the US occur due to deficiencies in design [2].

It’s not that interactions with buildings or planes are any less common than surgeries, or that they’re that much inherently safer. In many parts of the world, death due to accidents in aviation or due to structural failure is very, very common.

It isn’t accidental that Canada and America no longer see many plane crashes or structural collapses. Both professions have been rocked by events that made them realize they needed to improve their safety records.

The licensing of professional engineers and the Iron Ring ceremony in Canada for engineering graduates came after two successive bridge collapses killed 88 workers [3]. The aircraft industry was shaken out of its complacency after the Tenerife disaster, where a miscommunication caused two planes to collide on a runway, killing 583.

As you can see, subsequent safety improvements were both responsive and deliberate.

These aren’t the only events that caused changes. The D. B. Cooper hijacking led to the first organised airport security in the US. The Therac-25 radiation overdoses led to the first set of guidelines specifically for software that ran on medical devices. The sinking of the Titanic led to a complete overhaul of requirements for lifeboats and radios for oceangoing vessels. The crash of TAA-538 led to the first mandatory cockpit voice recorders.

All of these disasters combine two things that are rarely seen when surgeries go wrong. First, they involved many people. The more people die at once, the more shocking the event and therefore the more likely it is to become widely known. Because most operations involve one or two patients, it is much rarer for problems in them to make the news [4].

Second, they highlight a specific flaw in the participants, procedures, or systems that fail. Retrospectives could clearly point to a factor and say: “this did it” [5]. It is much harder to do this sort of retrospective on a person and get such a clear answer. It may be true that “blood loss” definitely caused a surgical death, but it’s much harder to tell if that’s the fault of any particular surgeon, or just a natural consequence of poking new holes in a human body. Both explanations feel plausible, so in most cases neither can be wholly accepted.

(I also think there is a third driver here, which is something like “cheapness of death”. I would predict that safety regulation is more common in places where people expect long lives, because death feels more avoidable there. This explains why planes and structures are safer in North America and western Europe, but doesn’t distinguish surgery from other fields in these countries.)

Not every form of engineering or transportation fulfills both of these criteria. Regulation and training have made flying on a commercial flight many, many times safer than riding in a car, while private flights lag behind and show little safety advantage over other forms of transport. When a private plane crashes, few people die. If they’re important (and many people who fly privately are), you might hear about it, but it will quickly fade from the news. These stories don’t have staying power and rarely generate outrage, so there’s never much pressure for improvement.

The best alternative to this model that I can think of is one that focuses on the “danger differential” in a field and predicts that fields with high danger differentials see more and more regulation until the danger differential is largely gone. The danger differential is the difference between how risky a field currently is vs. how risky it could be with near-optimal safety culture. A high danger differential isn’t necessarily correlated with inherent risk in a field, although riskier fields will by their nature have the possibility of larger ones. Here’s three examples:

  1. Commercial air travel in developed countries currently has a very low danger differential. Before a woman was killed by engine debris earlier this year, commercial aviation in the US had gone 9 years without a single fatality.
  2. BASE jumping is almost suicidally dangerous and probably could be made only incredibly dangerous if it had a better safety culture. Unfortunately, the illegal nature of the sport and the fact that experienced jumpers die so often make this hard to achieve and lead to a fairly large danger differential. That said, even with an optimal safety culture, BASE jumping would still see many fatalities and still probably be illegal.
  3. Surgery is fairly dangerous and according to surgeon Atul Gawande, could be much, much safer. Proper adherence to surgical checklists alone could cut adverse events by almost 50%. This means that surgery has a much higher danger differential than air travel.

I think the danger differential model doesn’t hold much water. First, if it were true, we’d expect to see something being done about surgery. Almost a decade after checklists were found to drive such large improvements, there hasn’t been any concerted government action.

Second, this doesn’t match historical accounts of how airlines were regulated into safety. At the dawn of the aviation age, pilots begged for safety standards (which could have reduced crashes a staggering sixtyfold [6]). Instead of stepping in to regulate things, the government dragged its feet. Some of the lifesaving innovations pioneered in those early days only became standard after later and larger crashes – crashes involving hundreds of members of the public, not just pilots.

While this only deals with external regulation, I strongly suspect that fear for the reputation of a profession (which could be driven by these same two factors) affects internal calls for reform as well. Canadian engineers knew that they had to do something after the Quebec bridge collapse created common knowledge that safety standards weren’t good enough. Pilots were put in a similar position by some of the better-publicized mishaps. Perhaps surgeons have faced no successful internal campaign for reform so far because the public is not yet aware of the dangers of surgery to the point where it could put surgeons’ livelihoods at risk or hurt them socially.

I wonder if it’s possible to get a profession running scared about their reputation to the point that they improve their safety, even if there aren’t any of the events that seem to drive regulation. Maybe someone like Atul Gawande, who seems determined to make a very big and very public stink about safety in surgery, is the answer here. Perhaps having surgery’s terrible safety record plastered throughout the New Yorker will convince surgeons that they need to start doing better [7].

If not, they’ll continue to get away with murder.

Footnotes

[1] From the CDC’s truly excellent Cause of Death search function, using codes V81.7 & V82.7 (derailment with no collision), W13 (falling out of building), W23 (caught or crushed between objects), and W35 (explosion of boiler) at home, other, or unknown. I read through several hundred causes of deaths, some alarmingly unlikely, and these were the only ones that seemed relevant. This estimate seems higher than the one surgeon Atul Gawande gave in The Checklist Manifesto, so I’m confident it isn’t too low. ^

[2] Furthermore, from 1989 to 2000, none of the observed collapses were due to flaws in the engineers’ designs. Instead, they were largely caused by weather, collisions, poor maintenance, and errors during construction. ^

[3] Claims that the rings are made from the collapsed bridge are false, but difficult to dispel. They’re actually just boring stainless steel, except in Toronto, where they’re still made from iron (but not iron from the bridge). ^

[4] There may also be an inherent privateness to surgical deaths that keeps them out of the news. Someone dying in surgery, absent obvious malpractice, doesn’t feel like public information in the way that car crashes, plane crashes, and structural failures do. ^

[5] It is true that it was never discovered why TAA-538 crashed. But black box technology would have given answers had it been in use. That it wasn’t in use was clearly a systems failure, even though the initial failure is indeterminate. This jibes with my model, because regulation addressed the clear failure, not the indeterminate one. ^

[6] This is the ratio between the average miles flown before crash of the (very safe) post office planes and the (very dangerous) privately owned planes. Many in the airline industry wanted the government to mandate the same safety standards on private planes as they mandated on their airmail planes. ^

[7] I should mention that I have been very lucky to have been in the hands of a number of very competent and professional surgeons over the years. That said, I’m probably going to ask any future surgeon I’m assigned if they follow safety checklists – and ask for someone else to perform the procedure if they don’t. ^

Economics, Model

The Biggest Tech Innovation is Selling Club Goods

Economists normally split goods into four categories, defined by two properties, excludability and rivalry (see the short sketch after this list):

  • Public goods are non-excludable (so anyone can access them) and non-rival (I can use them as much as I want without limiting the amount you can use them). Broadcast television, national defense, and air are all public goods.
  • Common-pool resources are non-excludable but rival (if I use them, you will have to make do with less). Iron ore, fish stocks, and grazing land are all common pool resources.
  • Private goods are excludable (their access is controlled or limited by pricing or other methods) and rival. My clothes, computer, and the parking space I have in my lease but never use are all private goods.
  • Club goods are excludable but (up to a certain point) non-rival. Think of the swimming pool in an apartment building, a large amusement park, or cellular service.
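Since the taxonomy is just a two-by-two grid over those two properties, here’s a minimal sketch of it (the function name and the choice of example goods in the comments are mine):

```python
def classify_good(excludable: bool, rival: bool) -> str:
    """Map the two standard economic properties onto the four categories of goods."""
    return {
        (False, False): "public good",           # broadcast TV, national defense, air
        (False, True):  "common-pool resource",  # iron ore, fish stocks, grazing land
        (True,  True):  "private good",          # clothes, computers, parking spaces
        (True,  False): "club good",             # swimming pools, amusement parks, cell service
    }[(excludable, rival)]

print(classify_good(excludable=True, rival=False))  # club good
```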

Club goods are perhaps the most interesting class of goods, because they blend properties of the three better understood classes. They aren’t open to all, but they are shared among many. They can be overwhelmed by congestion, but up until that point, it doesn’t really matter how many people are using them. Think of a gym; as long as there’s at least one free machine of every type, it’s no less convenient than your home.

Club goods offer cost savings over private goods, because you don’t have to buy something that mostly sits unused (again, think of gym equipment). People other than you can use it when it would otherwise sit around and those people can help you pay the cost. It’s for this reason that club goods represent an excellent opportunity for the right entrepreneur to turn a profit.

I currently divide tech start-ups into three classes. There are the Googles of the world, who use network effects or big data to sell advertising more effectively. There are companies like the one I work for that take advantage of modern technology to do things that were never possible before. And then there are those that are slowly and inexorably turning private goods into club goods.

I think this last group of companies (which include Netflix, Spotify, Uber, Lyft, and Airbnb) may be the ones that ultimately have the biggest impact on how we order our lives and what we buy. To better understand how these companies are driving this transformation, let’s go through them one by one, then talk about what it could all mean.

Netflix

When I was a child, my parents bought a video cassette player, then a DVD player, then a Blu-ray player. We owned a hundred or so video cassettes, mostly whatever movies my brother and I were obsessed with enough to want to own. Later, we found a video rental store we liked and mostly started renting movies. We never owned more than 30 DVDs and 20 Blu-rays.

Then I moved out. I have bought five DVDs since – they came as a set from Kickstarter. Anything else I wanted to watch, I got via Netflix. A few years later, the local video rental store closed down and my parents got an AppleTV and a Netflix of their own.

Buying a physical movie means buying a private good. Video rental stores can be accurately modeled as a type of club good, because even if the movie you want is already rented out, there’s probably one that you want to watch almost as much that is available. This is enough to make them approximately non-rival, while the fact that it isn’t free to rent a movie means that rented videos are definitely excludable.

Netflix represents the next evolution in this business model. As long as the Netflix engineers have done their job right, there’s no amount of watching movies I can do that will prevent you from watching movies. The service is almost truly non-rival.

Movie studios might not feel the effects of Netflix turning a large chunk of the market for movies into one focused on club goods; they’ll still get paid by Netflix. But the switch to Netflix must have been incredibly damaging for the physical media and player manufacturers. When everyone went from cassettes to DVDs or DVDs to Blu-rays, there was still a market for their wares. Now, that market is slowly and inexorably disappearing.

This isn’t just a consequence of technology. The club good business model offers such amazing cost savings that it drove a change in which technology was dominant. When you bought a movie, it would spend almost all of its life sitting on a shelf. Now Netflix acts as your agent, buying movies (or rather, their rights) and distributing them such that they’re always being played and almost never sitting on the shelf.

Spotify

Spotify is very similar to Netflix. Previously, people bought physical cassettes (I’m just old enough that I remember making mix tapes from the radio). Then they switched to CDs. Then it was MP3s bought online (or, almost more likely, pirated online). But even pirating music is falling out of favour these days. Apple, Google, Amazon, and Spotify are all competing to offer unlimited music streaming to customers.

Music differs from movies in that it has a long tradition of being a public good – via broadcast radio. While that hasn’t changed yet (radio is still going strong), I do wonder how much longer the public option for music will exist, especially given the trend away from private cars that I think companies like Uber and Lyft are going to (pardon the pun) drive.

Uber and Lyft

I recently thought about buying a car. I was looking at the all-electric Kia Soul, which has a huge government rebate (for a little while yet) and financing terms that equate to negative real interest. Despite all these advantages, it turns out that when you sit down and run the numbers, it would still be cheaper for me to use Uber and Lyft to get everywhere.

We are starting to see the first, preliminary (and possibly illusory) evidence that Uber and Lyft are causing the public to change their preference away from owning cars.

A car you’ve bought is a private good, while Uber and Lyft are clearly club goods. Surge pricing means that there are basically always enough drivers for everyone who wants to go anywhere using the system.

When you buy a car, you’re signing up for it to sit around useless for almost all of its life. This is similar to what happens when you buy exercise equipment, which means the logic behind cars as a club good is just as compelling as the logic behind gyms. Previously, we hadn’t been able to share cars very efficiently because of technological limitations. Dispatching a taxi, especially to an area outside of a city centre, was always spotty, time consuming and confusing. Car-pooling to work was inconvenient.

As anyone who has used a modern ride-sharing app can tell you, inconvenient is no longer an apt descriptor.

There is a floor on how few cars we can get by on. To avoid congestion in a club good, you typically have to provision for peak load. Luckily, peak load (for anything that can sensibly be turned into a club good) always requires fewer resources than would be needed if everyone went out and bought the shared good themselves.
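A toy simulation makes the point concrete (every number here is invented for illustration): even if each of 1,000 people wants a car in any given half-hour slot with 5% probability, the fleet needed to cover the busiest slot of the day is far smaller than 1,000 cars.

```python
import random

random.seed(42)  # reproducible illustration

PEOPLE = 1_000   # people sharing the fleet
SLOTS = 48       # half-hour slots in one day
P_NEED = 0.05    # chance a given person wants a car in a given slot

# Simultaneous demand in each slot of one simulated day.
demand = [sum(random.random() < P_NEED for _ in range(PEOPLE)) for _ in range(SLOTS)]

print(f"Cars needed if everyone owns one: {PEOPLE}")
print(f"Cars needed if provisioned for peak load: {max(demand)}")
# The simulated peak comes out well under 100 cars, an order of magnitude fewer than 1,000.
```

Real demand is peakier than this independence assumption allows (everyone wants a car at rush hour), which pushes the needed fleet up, but the gap to individual ownership stays enormous as long as no one needs a car most of the time.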

Even “just” substantially decreasing the absolute number of cars out there will be incredibly disruptive to the automotive sector if they don’t correctly predict the changing demand for their products.

It’s also true that increasing the average utilisation of cars could change how our cities look. Parking lots are necessary when cars are a private good, but are much less useful when they become club goods. It is my hope that malls built in the middle of giant parking moats look mighty silly in twenty years.

Airbnb

Airbnb is the most ambiguous example I have here. As originally conceived, it would have driven the exact same club good transformation as the other services listed. People who were on vacation or otherwise out of town would rent out their houses to strangers, increasing the utilisation of housing and reducing the need for dedicated hotels to be built.

Airbnb is sometimes used in this fashion. It’s also used to rent out extra rooms in an otherwise occupied house, which accomplishes almost the same thing.

But some amount of Airbnb usage is clearly taking place in houses or condos that otherwise would have been rental stock. When used in this way, it’s taking advantage of a regulatory grey zone to undercut hotel pricing. Insofar as this might result in a longer-term change towards regulations that are generally cheaper to comply with, this will be good for consumers, but it won’t really be transformational.

The great promise of club goods is that they might lead us to use less physical stuff overall, because where previously each person would buy one of a thing, now only enough units must be purchased to satisfy peak demand. If Airbnb is just shifting around where people are temporary residents, then it won’t be an example of the broader benefits of club goods (even if it provides other benefits to its customers).

When Club Goods Eat The Economy

In every case above (except potentially Airbnb), I’ve outlined how the switch from private goods to club goods is resulting in less consumption. For music and movies, it is unclear if this switch is what is providing the primary benefit. My intuition is that the club good model actually did change consumption patterns for physical copies of movies (because my impression is that few people ever rented videos online via e.g. iTunes), whereas the MP3 revolution was what really shrank the footprint of music media.

This switch in consumption patterns and corresponding decrease in the amount of consumption that is necessary to satisfy preferences is being primarily driven by a revolution in logistics and bandwidth. The price of club goods has always compared favourably with that of private goods. The only thing holding people back was inconvenience. Now programmers are steadily figuring out how to make that inconvenience disappear.

On the other hand, increased bandwidth has made it easier to turn any sort of digitizable media into a club good. There’s an old expression among programmers: never underestimate the bandwidth of a station wagon full of cassettes (or CDs, or DVDs, or whatever physical storage media one grew up with) hurtling down the highway. For a long time, the only way to get a 1GB movie to a customer without an appallingly long buffering period was to physically ship it (on a 56kbit/s connection, this movie would take one day and fifteen hours to download, while downloading the 500 movies such a station wagon might carry would take 118 weeks).
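The arithmetic behind those figures, as a quick sketch (taking 1 GB as an even 10^9 bytes):

```python
def transfer_time_seconds(size_bytes: float, bits_per_second: float) -> float:
    """Time to move a payload over a link of the given speed."""
    return size_bytes * 8 / bits_per_second

MOVIE_BYTES = 1e9   # a 1 GB movie
MODEM = 56_000      # a 56 kbit/s connection

hours = transfer_time_seconds(MOVIE_BYTES, MODEM) / 3600
print(f"One movie: {hours:.1f} hours")  # ~39.7 hours, i.e. one day and fifteen-odd hours

weeks = transfer_time_seconds(500 * MOVIE_BYTES, MODEM) / (3600 * 24 * 7)
print(f"500 movies: {weeks:.0f} weeks")  # ~118 weeks
```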

Change may start out slow, but I expect to see it accelerate quickly. My generation is the first to have had the internet from a very young age. The generation after us will be the first unable to remember a time before it. We trust apps like Uber and Airbnb much more than our parents, and our younger siblings trust them even more than us.

While it was only kids who trusted the internet, these new club good businesses couldn’t really affect overall economic trends. But as we come of age and start to make major economic decisions, like buying houses and cars, our natural tendency to turn towards the big tech companies and the club goods they peddle will have ripple effects on an economy that may not be prepared for it.

When that happens, there’s only one thing that is certain: there will be yet another deluge of newspaper columns talking about how millennials are destroying everything.

Advice, Literature, Model

Sanderson’s Law Applies To Cultures Too

[Warning: Contains spoilers for The Sunset Mantle, Vorkosigan Saga (Memory and subsequent), Dune, and Chronicles of the Kencyrath]

For the uninitiated, Sanderson’s Law (technically, Sanderson’s First Law of Magic) is:

An author’s ability to solve conflict with magic is DIRECTLY PROPORTIONAL to how well the reader understands said magic.

Brandon Sanderson wrote this law to help new writers come up with satisfying magical systems. But I think it’s applicable beyond magic. A recent experience has taught me that it’s especially applicable to fantasy cultures.

I recently read Sunset Mantle by Alter S. Reiss, a book that falls into one of my favourite fantasy sub-genres: hopeless siege tales.

Sunset Mantle is what’s called secondary world fantasy; it takes place in a world that doesn’t share a common history or culture (or even necessarily biosphere) with our own. Game of Thrones is secondary world fantasy, while Harry Potter is primary world fantasy (because it takes place in a different version of our world, which we chauvinistically call the “primary” one).

Secondary world fantasy gives writers a lot more freedom to play around with cultures and create interesting set-pieces when cultures collide. If you want to write a book where the Roman Empire fights a total war against the Chinese Empire, you’re going to have to put in a master’s thesis’ worth of work to explain how that came about (if you don’t want to be eviscerated by pedants on the internet). In a secondary world, you can very easily have a thinly veiled stand-in for Rome right next to a thinly veiled analogue of China. Give readers some familiar sounding names and cultural touchstones and they’ll figure out what’s going on right away, without you having to put in effort to make it plausible in our world.

When you don’t use subtle cues, like names or cultural touchstones (for example: imperial exams and eunuchs for China, gladiatorial fights and the cursus honorum for Rome), you risk leaving your readers adrift.

Many of the key plot points in Sunset Mantle hinge on obscure rules in an invented culture/religion that doesn’t bear much resemblance to any that I’m familiar with. It has strong guest rights, like many steppes cultures; it has strong charity obligations and monotheistic strictures, like several historical strands of Christianity; it has a strong caste system and rules of ritual purity, like Hinduism; and it has a strong warrior ethos, complete with battle rage and rules for dealing with it, similar to common depictions of Norse cultures.

These actually fit together surprisingly well! Reiss pulled off an entertaining book. But I think many of the plot points fell flat because they were almost impossible to anticipate. The lack of any sort of consistent real-world analogue to the invented culture meant that I never really had an intuition of what it would demand in a given situation. This meant that all of the problems in the story that were solved via obscure points of culture weren’t at all satisfying to me. There was build up, but then no excitement during the resolution. This was common enough that several chunks of the story didn’t really work for me.

Here’s one example:

“But what,” asked Lemist, “is a congregation? The Ayarith school teaches that it is ten men, and the ancient school of Baern says seven. But among the Irimin school there is a tradition that even three men, if they are drawn in together into the same act, by the same person, that is a congregation, and a man who has led three men into the same wicked act shall be put to death by the axe, and also his family shall bear the sin.”

All the crowd in the church was silent. Perhaps there were some who did not know against whom this study of law was aimed, but they knew better than to ask questions, when they saw the frozen faces of those who heard what was being said.

(Reiss, Alter S.. Sunset Mantle (pp. 92-93). Tom Doherty Associates. Kindle Edition.)

This means protagonist Cete’s enemy erred greatly by sending three men to kill him and had better cut it out if he doesn’t want to be executed. It’s a cool resolution to a plot point – or it would be if it hadn’t taken me utterly by surprise. As it is, it felt kind of like a cheap trick to get the author out of a hole he’d written himself into, like the dreaded deus ex machina – god from the machine – that ancient playwrights used to resolve conflicts they otherwise couldn’t.

(This is the point where I note that it is much harder to write than it is to criticize. This blog post is about something I noticed, not necessarily something I could do better.)

I’ve read other books that do a much better job of using sudden points of culture to resolve conflict in a satisfying manner. Lois McMaster Bujold (I will always be recommending her books) strikes me as particularly apt. When it comes time for a key character of hers to make a lateral career move into a job we’ve never heard of before, it feels satisfying because the job is directly in line with legal principles for the society that she laid out six books earlier.

The job is that of Imperial Auditor – a high powered investigator who reports directly to the emperor and has sweeping powers –  and it’s introduced when protagonist Miles loses his combat career in Memory. The principles I think it is based on are articulated in the novella Mountains of Mourning: “the spirit was to be preferred over the letter, truth over technicalities. Precedent was held subordinate to the judgment of the man on the spot”.

Imperial Auditors are given broad discretion to resolve problems as they see fit. The main rule is: make sure the emperor would approve. We later see Miles using the awesome authority of this office to make sure a widow gets the pension she deserves. The letter of the law wasn’t on her side, but the spirit was, and Miles, as the Auditor on the spot, was empowered to make the spirit speak louder than the letter.

Wandering around my bookshelves, I was able to grab a couple more examples of satisfying resolutions to conflicts that hinged on guessable cultural traits:

  • In Dune, Fremen settle challenges to leadership via combat. Paul Muad’Dib spends several years as their de facto leader, while another man, Stilgar, holds the actual title. This situation is considered culturally untenable and Paul is expected to fight Stilgar so that he can lead properly. Paul is able to avoid this unwanted fight to the death (he likes Stilgar) by appealing to the only thing Fremen value more than their leadership traditions: their well-established pragmatism. He says that killing Stilgar before the final battle would be little better than cutting off his own arm right before it. If Frank Herbert hadn’t mentioned the extreme pragmatism of the Fremen (to the point that they render down their dead for water) several times, this might have felt like a cop-out.
  • In The Chronicles of the Kencyrath, it looks like convoluted politics will force protagonist Jame out of the military academy of Tentir. But it’s mentioned several times that the NCOs who run the place have their own streak of honour that allows them to subvert their traditionally required oaths to their lords. When Jame redeems a stain on the Tentir’s collective honour, this oath to the college gives them an opening to keep her there and keep their oaths to their lords. If PC Hodgell hadn’t spent so long building up the internal culture of Tentir, this might have felt forced.

It’s hard to figure out where good foreshadowing ends and good cultural creation begins, but I do think there is one simple thing an author can do to make culture a satisfying source of plot resolution: make a culture simple enough to stereotype, at least at first.

If the other inhabitants of a fantasy world are telling off-colour jokes about this culture, what do they say? A good example of this done explicitly comes from Mass Effect: “Q: How do you tell when a Turian is out of ammo? A: He switches to the stick up his ass as a backup weapon.” 

(Even if you’ve never played Mass Effect, you now know something about Turians.)

At the same time as I started writing this, I started re-reading PC Hodgell’s The Chronicles of the Kencyrath, which provided a handy example of someone doing everything right. The first three things we learn about the eponymous Kencyr are:

  1. They heal very quickly
  2. They dislike their God
  3. Their honour code is strict enough that lying is a deadly crime and calling someone a liar a deathly insult

There are eight more books in which we learn all about the subtleties of their culture and religion. But within the first thirty pages, we have enough information that we can start making predictions about how they’ll react to things and what’s culturally important.

When Marc, a solidly dependable Kencyr who is working as a guard and is bound by Kencyr cultural laws to loyally serve his employer, lets the rather more eccentric Jame escape from a crime scene, we instantly know that his choosing her over his word is a big deal. And indeed, while he helps her escape, he also immediately tries to kill himself. Jame is only able to talk him out of it by explaining that she hadn’t broken any laws there. It was already established that in the city of Tai-Tastigon, only those who physically touch stolen property are in legal jeopardy. Jame never touched the stolen goods; she was just on the scene. Marc didn’t actually break his oath and so decides to keep living.

God Stalk is not a long book, so the fact that PC Hodgell was able to set all of this up and have it feel both exciting in the moment and satisfying in the resolution is quite remarkable. It’s a testament to what effective cultural distillation, plus a few choice tidbits of extra information, can do for a plot.

If you don’t come up with a similar distillation and convey it to your readers quickly, there will be a period where you can’t use culture as a satisfying source of plot resolution. It’s probably no coincidence that I noticed this in Sunset Mantle, which is a long(-ish) novella. Unlike Hodgell, Reiss isn’t able to develop a culture in such a limited space, perhaps because his culture has fewer obvious touchstones.

Sanderson’s Second Law of Magic can be your friend here too. As he stated it, the law is:

The limitations of a magic system are more interesting than its capabilities. What the magic can’t do is more interesting than what it can.

Similarly, the taboos and strictures of a culture are much more interesting than what it permits. Had Reiss built up a quick sketch of complicated rules around commanding and preaching (with maybe a reference that there could be surprisingly little theological difference between military command and being behind a pulpit), the rule about leading a congregation astray would have fit neatly into place with what else we knew of the culture.

Having tight constraints imposed by culture doesn’t just allow for plot resolution. It also allows for plot generation. In The Warrior’s Apprentice, Miles gets caught up in a seemingly unwinnable conflict because he gave his word; several hundred pages earlier Bujold establishes that breaking a word is, to a Barrayaran, roughly equivalent to sundering your soul.

It is perhaps no accident that the only thing we learn initially about the Kencyr that isn’t a descriptive fact (like their healing and their fraught theological state) is that honour binds them and can break them. This constraint, that all Kencyr characters must be honourable, does a lot of work driving the plot.

This then would be my advice: when you wish to invent a fantasy culture, start simple, with a few stereotypes that everyone else in the world can be expected to know. Make sure at least one of them is an interesting constraint on behaviour. Then add in depth that people can get to know gradually. When you’re using the culture as a plot device, make sure to stick to the simple stereotypes or whatever other information you’ve directly given your reader. If you do this, you’ll develop rich cultures that drive interesting conflicts and you’ll be able to use cultural rules to consistently resolve conflict in a way that will feel satisfying to your readers.

Advice, Model

Context Windows

When you’re noticing that you’re talking past someone, what does it look like? Do you feel like they’re ignoring all the implications of the topic at hand (“yes, I know the invasion of Iraq is causing a lot of pain, but I think the important question is, ‘did they have WMDs?'”)? Or do you feel like they’re avoiding talking about the object-level point in favour of other considerations (“factory farmed animals might suffer, but before we can consider whether that’s justified or not, shouldn’t we decide whether we have any obligation to maximize the number of living creatures?”)?

I’m beginning to suspect that many tense disagreements and confused, fruitless conversations are caused by differences in how people conceive of and process the truth. More, I think I have a model that explains why some people can productively disagree with anyone and everyone, while others get frustrated very easily with even their closest friends.

The basics of this model come from a piece that Jacob Falkovich wrote for Quillette. He uses two categories, “contextualizers” and “decouplers”, to analyze an incredibly unproductive debate (about race and IQ) between Vox’s Ezra Klein and Dr. Sam Harris.

Klein is the contextualizer, a worldview that comes naturally to a political journalist. Contextualizers see ideas as embedded in a context. Questions of “who does this affect?”, “how is this rooted in society?”, and “what are the (group) identities of people pushing this idea?” are the bread and butter of contextualizers. One of the first things Klein says in his debate with Harris is:

Here is my view: I think you have a deep empathy for Charles Murray’s side of this conversation, because you see yourself in it [because you also feel attacked by “politically correct” criticism]. I don’t think you have as deep an empathy for the other side of this conversation. For the people being told once again that they are genetically and environmentally and at any rate immutably less intelligent and that our social policy should reflect that. I think part of the absence of that empathy is it doesn’t threaten you. I don’t think you see a threat to you in that, in the way you see a threat to you in what’s happened to Murray. In some cases, I’m not even quite sure you heard what Murray was saying on social policy either in The Bell Curve and a lot of his later work, or on the podcast. I think that led to a blind spot, and this is worth discussing.

Klein is highlighting what he thinks is the context that probably informs Harris’s views. He’s suggesting that Harris believes Charles Murray’s points about race and IQ because they have a common enemy. He’s aware of the human tendency to like ideas that come from people we feel close to (myside bias) – or that put a stick in the eye of people we don’t like.

There are other characteristics of contextualizers. They often think thought experiments are pointless, given that they try and strip away all the complex ways that society affects our morality and our circumstances. When they make mistakes, it is often because they fall victim to the “ought-is” fallacy; they assume that truths with bad outcomes are not truths at all.

Harris, on the other hand, is a decoupler. Decoupling involves separating ideas from context, from personal experience, from consequences, from anything but questions of truth or falsehood and using this skill to consider them in the abstract. Decoupling is necessary for science because it’s impossible to accurately check a theory when you hope it to be true. Harris’s response to Klein’s opening salvo is:

I think your argument is, even where it pretends to be factual, or wherever you think it is factual, it is highly biased by political considerations. These are political considerations that I share. The fact that you think I don’t have empathy for people who suffer just the starkest inequalities of wealth and politics and luck is just, it’s telling and it’s untrue. I think it’s even untrue of Murray. The fact that you’re conflating the social policies he endorses — like the fact that he’s against affirmative action and he’s for universal basic income, I know you don’t happen agree with those policies, you think that would be disastrous — there’s a good-faith argument to be had on both sides of that conversation. That conversation is quite distinct from the science and even that conversation about social policy can be had without any allegation that a person is racist, or that a person lacks empathy for people who are at the bottom of society. That’s one distinction I want to make.

Harris is pointing out that questions of whether his beliefs will have good or bad consequences or who they’ll hurt have nothing to do with the question of if they are true. He might care deeply about the answers of those questions, but he believes that it’s a dangerous mistake to let that guide how you evaluate an idea. Scientists who fail to do that tend to get caught up in the replication crisis.

When decouplers err, it is often because of the is-ought fallacy. They fail to consider how empirical truths can have real world consequences and fail to consider how labels that might be true in the aggregate can hurt individuals.

When you’re arguing with someone who doesn’t contextualize as much as you do, it can feel like arguing about useless hypotheticals. I once had someone start a point about police shootings and gun violence with “well, ignoring all of society…”. This prompted immediate groans.

When arguing with someone who doesn’t decouple as much as you do, it can feel useless and mushy. A co-worker once said to me “we shouldn’t even try and know the truth there – because it might lead people to act badly”. I bit my tongue, but internally I wondered how, absent the truth, we can ground disagreements in anything other than naked power.

Throughout the debate between Harris and Klein, both of them get frustrated at the other for failing to think like they do – which is why it provided such a clear example for Falkovich. If you read the transcripts, you’ll see a clear pattern: Klein ignores questions of truth or falsehood and Harris ignores questions of right and wrong. Neither one is willing to give an inch here, so there’s no real engagement between them.

This doesn’t have to be the case whenever people who prefer context interact with people who prefer to deal with the direct substance of an issue.

My theory is that everyone has a window that stretches from the minimum amount of context they like in conversations to the minimum amount of substance. Theoretically, this window could stretch from 100% context and no substance to 100% substance and no context.

But practically no one has tastes that broad. Most people accept a narrower range of arguments. Here’s what three broadly compatible friends might look like:

We should expect to see some correlation between the minimum and maximum amounts of context people want. Windows may vary in size but, in general, feeling put off by lots of decoupling should correlate with enjoying context.


Here we see people with variously sized strike zones, but with their dislike of context correlated with their appreciation for substance.

Klein and Harris disagreed so unproductively not just because they give first billing to different things, but because their world views are different enough that there is absolutely no overlap between how they think and talk about things.

One plausible graph of how Klein and Harris like to think about problems (quotes come from the transcript of their podcast). From this, it makes sense that they couldn’t have a productive conversation. There’s no overlap in how they model the world.
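If it helps to make the model concrete, here’s a minimal sketch in Python (the windows and numbers below are entirely invented, not measurements of anyone) that treats each person’s window as an interval on a 0-to-1 axis running from pure substance to pure context, and checks whether two windows overlap:

```python
# A minimal sketch of the context-window model. The windows below are
# invented for illustration; 0.0 means pure substance/decoupling and
# 1.0 means pure context.

def overlap(a, b):
    """Width of the shared range between two windows (0 if disjoint)."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return max(0.0, hi - lo)

windows = {
    "narrow decoupler": (0.0, 0.35),       # tolerates very little context
    "narrow contextualizer": (0.6, 1.0),   # tolerates very little bare substance
    "wide-window friend": (0.2, 0.9),
}

people = list(windows.items())
for i, (name_a, win_a) in enumerate(people):
    for name_b, win_b in people[i + 1:]:
        shared = overlap(win_a, win_b)
        verdict = "can engage" if shared > 0 else "talk past each other"
        print(f"{name_a} vs {name_b}: shared range {shared:.2f} ({verdict})")
```

Under these made-up windows, the two narrow, non-overlapping intervals talk past each other, while the wide-window friend can engage with both – the same pattern I describe below.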

I’ve found thinking about windows of context and substance, rather than just the dichotomous categories, very useful for analyzing how my friends and I tend to agree and disagree.

Some people I know can hold very controversial views without ever being disagreeable. They are good at picking up on which sorts of arguments will work with their interlocutors and sticking to those. These people are no doubt aided by rather wide context windows. They can productively think and argue with varying amounts of context and substance.

Other people are incredibly difficult to argue with. These are the people who are very picky about which arguments they’ll entertain. If I sort someone into this internal category, it’s because I’ve found that one day they’ll dismiss what I say as too nitty-gritty, while the next day they’ll criticize me for not being focused enough on the issue at hand.

What I’ve started to realize is that people I find particularly finicky to argue with may just have a fairly narrow strike zone. For them, it’s simultaneously easy for arguments to feel devoid of substance or devoid of context.

I think one way that you can make arguments with friends more productive is to explicitly lay out the window in which you like to be convinced. Sentences like “I understand what you just said might convince many people, but I find arguments about the effects of beliefs intensely unsatisfying” or “I understand that you’re focused on what studies say, but I think it’s important to talk about the process of knowledge creation and I’m very unlikely to believe something without first analyzing what power hierarchies created it” are the guideposts by which you can show people your context window.

Literature, Model

Does Amateurish Writing Exist?

[Warning: Spoilers for Too Like the Lightning]

What marks writing as amateurish (and whether “amateurish” or “low-brow” works are worthy of awards) has been a topic of contention in the science fiction and fantasy community for the past few years, with the rise of Hugo slates and the various forms of “puppies”.

I’m not talking about the learning works of genuine amateurs. These aren’t stories that use big words for the sake of sounding smart (and at the cost of slowing down the story), or over-the-top, fanfiction-esque rip-offs of more established works (well, at least not since the Wheel of Time nomination in 2014). I’m talking about that subtler thing, the feeling that bubbles up from the deepest recesses of your brain and says “this story wasn’t written as well as it could be”.

I’ve been thinking about this a lot recently because, about ¾ of the way through Too Like The Lightning by Ada Palmer, I found myself put off [1]. And the only explanation I had for this was the word “amateurish” – which popped into my head devoid of any reason. This post is an attempt to unpack what that means (for me) and how I think it has influenced some of the genuine disagreements around rewarding authors in science fiction and fantasy [2]. Your tastes might be calibrated differently and if you disagree with my analysis, I’d like to hear about it.

Now, there are times when you know something is amateurish and that’s okay. No one should be surprised that John Ringo’s Paladin of Shadows series – books that he explicitly wrote for himself – is parsed by most people as pretty amateurish. When pieces aren’t written for the author alone, I expect some consideration of the audience. Ideally the writer should be having fun too, but if they’re writing for publication, they have to be writing to an audience. This doesn’t mean that they must write exactly what people tell them they want. People can be terrible judges of what they want!

This also doesn’t necessarily imply pandering. People like to be challenged. If you look at the most popular books of the last decade on Goodreads, few of them could be described as pandering. I’m familiar with two of the top three books there and both of them kill off a fan favourite character. People understand that life involves struggle. Lois McMaster Bujold – who has won more Hugo awards for best novel than any living author – once said she generated plots by considering “what’s the worst possible thing I can do to these people?” The results of this method speak for themselves.

Meditating on my reaction to books like Paladin of Shadows in light of my experiences with Too Like The Lightning is what led me to believe that the more technically proficient “amateurish” books are those that lose sight of what the audience will enjoy and follow just what the author enjoys. This may involve a character that the author heavily identifies with – the Marty Stu or Mary Sue phenomenon – who is lovingly described overcoming obstacles and generally being “awesome” but doesn’t “earn” any of this. It may also involve gratuitous sex, violence, engineering details, gun details, political monologuing (I’m looking at you, Atlas Shrugged), or tangents about constitutional history (this is how most of the fiction I write manages to become unreadable).

I realized this when I was reading Too Like the Lightning. I loved the world building and I found the characters interesting. But (spoilers!) when it turned out that all of the politicians were literally in bed with each other or when the murders the protagonist carried out were described in grisly, unrepentant detail, I found myself liking the book a lot less. This is – I think – what spurred the label amateurish in my head.

I think this is because there aren’t (in my estimation) a lot of people who actually want to read about brutal torture-execution or literally incestuous politics. It’s not (I think) that I’m prudish. It seemed like some of the scenes were written to be deliberately off-putting. And I understand that this might be part of the theme of the work and that these scenes were probably necessary for the author’s creative vision. But they didn’t work for me and they seemed like a thing that wouldn’t work for a lot of people I know. They were discordant and jarring. They weren’t pulled off as well as they would have had to be to keep me engaged as a reader.

I wonder if a similar process is what caused the changes that the Sad Puppies are now lamenting at the Hugo Awards. To many readers, the sexualized violence or sexual violence that can find its way into science fiction and fantasy books (I’d like to again mention Paladin of Shadows) is incredibly off-putting. I find it incredibly off-putting. Books that incorporate a lot of this feel like they’re ignoring the chunk of the audience that is me and my friends, and while reading them it’s hard for me not to feel that the writers are fairly amateurish. I normally prefer works that meditate on the causes and uses of violence when they incorporate it – I’d put N.K. Jemisin’s truly excellent Broken Earth series in this category – and it seems like readers who think this way are starting to dominate the Hugos.

For the people whose favourites had previously won year after year, this (as well as all the thinkpieces explaining why their favourite books are garbage) feels like an attack. Add to this the fact that some of the books that started winning had a more literary bent, and you have some fans of the genre believing that the Hugos are going to amateurs who are just cruising to victory by alluding to famous literary works. These readers look suspiciously on crowds who tell them they’re terrible if they don’t like books that are less focused on the action and excitement they normally read for. I can see why that’s a hard sell, even though I’ve thoroughly enjoyed the last few Hugo winners [3].

There’s obviously an inferential gap here, if everyone can feel angry about the crappy writing everyone else likes. For my part, I’ll probably be using “amateurish” only to describe books that are technically deficient. For books that are genuinely well written but seem to focus more on what the author wants than (on what I think) their likely audience wants, well, I won’t have a snappy term, I’ll just have to explain it like that.

Footnotes

[1] A disclaimer: the work of a critic is always easier than that of a creator. I’m going to be criticizing writing that’s better than my own here, which is always a risk. Think of me not as someone criticizing from on high, but frantically taking notes right before a test I hope to barely pass. ^

[2] I want to separate the Sad Puppies, who I view as people sad that action-packed books were being passed over in favour of more literary ones from the Rabid Puppies, who just wanted to burn everything to the ground. I’m not going to make any excuses for the Rabid Puppies. ^

[3] As much as I can find some science fiction and fantasy too full of violence for my tastes, I’ve also had little to complain about in the past, because my favourite author, Lois McMaster Bujold, has been reliably winning Hugo awards since before I was born. I’m not sure why there was never a backlash around her books. Perhaps it’s because they’re still reliably space opera, so class distinctions around how “literary” a work is don’t come up when Bujold wins. ^

Model, Politics, Quick Fix

The Awkward Dynamics of the Conservative Leadership Debates

Tanya Granic Allen is the most idealistic candidate I’ve ever seen take the stage in a Canadian political debate. This presents some awkward challenges for the candidates facing her, especially Mulroney and Elliot.

First, there’s the simple fact of her idealism. I think Granic Allen genuinely believes everything she says. For her, knowing what’s right and what’s wrong is simple. There isn’t a whole lot of grey. She even (bless her) probably believes that this will be an advantage come election time. People overwhelmingly don’t like the equivocation of politicians, so Granic Allen must assume her unequivocal moral stances will be a welcome change.

For many people, it must be. Even for those who find it grating, it seems almost vulgar to attack her. It’s clear that she isn’t in this for herself and doesn’t really care about personal power. Whether she could maintain that innocence in the face of the very real need to make political compromises remains an open question, but for now she does represent a certain vein of ideological conservatism in a form that is unsullied by concerns around electability.

The problem here is that the stuff Granic Allen is pushing – “conscience rights” and “parental choice” – is exactly the sort of thing that can mobilize opposition to the PC party. Fighting against sex-ed and abortion might play well with the base, but Elliot and Mulroney know that unbridled social conservatism is one of the few things that can force the province’s small-l liberals to hold their noses and vote for the big-L Liberal Party. In an election where we can expect embarrassingly low turnout (it was 52% in 2014), this can play a major role.

A less idealistic candidate would temper themselves to help the party in the election. Granic Allen has no interest in doing this, which basically forces the pragmatists to navigate the tricky act of distancing themselves from her popular (with the base) proposals so that they might carry the general election.

Second, there’s the difficult interaction between the anti-rational and anti-empirical “common sense” conservatism pushed by Granic Allen and Ford, and the pragmatic, informed conservatism of Elliot and Mulroney.

For Ford and Granic Allen, there’s a moral nature to truth. They live in a just world where something being good is enough to make it true. Mulroney and Elliot know that reality has an anti-partisan bias.

Take clean energy contracts. Elliot quite correctly pointed out that ripping up contracts willy-nilly will lead to a terrible business climate in Ontario. This is the sort of suggestion we normally see from the hard left (and have seen in practice in places the hard left idolizes, like Venezuela). But Granic Allen is committed to a certain vision of the world and in her vision of the world, government getting out of the way can’t help but be good.

Christine Elliot has (and this is a credit to her) shown that she’s not very ideological, in that she can learn how the world really works and subordinate ideology to truth, even when inconvenient. This would make her a more effective premier than either Granic Allen or Ford, but might hurt her in the leadership race. I’ve seen her freeze a couple times when she’s faced with defending how the world really works to an audience that is ideologically prevented from acknowledging the truth.

(See for example, the look on her face when she was forced to defend her vote to ban conversion therapy. Elliot’s real defense of that bill probably involves phrases like “stuck in the past”, “ignorant quacks” and “vulnerable children who need to be protected from people like you”. But she knew that a full-throated defense of gender dysphoria as a legitimate problem wouldn’t win her any votes in this race.)

As Joseph Heath has pointed out, this tension between reality and ideology is responsible for the underrepresentation of modern conservatives among academics. Since the purpose of the academy is (broadly) truth-seeking, we shouldn’t be surprised to see it select against an ideology that explicitly rejects not only the veracity of much of the products of this truth seeking (see, for example, Granic Allen’s inability to clearly state that humans are causing climate change) but the worthwhileness of the whole endeavour of truth seeking.

When everything is trivially knowable via the proper application of “common-sense”, there’s no point in thinking deeply. There’s no point in experts. You just figure out what’s right and you do it. Anything else just confuses the matter and leaves the “little guy” to get shafted by the elites.

Third, the carbon tax has produced a stark, unvoiced split between the candidates. On paper, all are opposing it. In reality, only Ford and Granic Allen seriously believe they have any chance at stopping it. I’m fairly sure that Elliot and Mulroney plan to mount a token opposition, then quickly fold when they’re reminded that raising taxes and giving money to provinces is a thing the Federal Government is allowed to do. This means that they’re counting on money from the carbon tax to balance their budget proposals. They can’t say this, because Ford and Granic Allen are forcing them to the right here, but I would bet that they’re privately using it to reassure fiscally conservative donors about the deficit.

Being unable to discuss what is actually the centrepiece of their financial plans leaves Elliot and Mulroney unable to give very good information about how they plan to balance the budget. They have to fall back on empty phrases like “line by line by line audit” and “efficiencies”, because anything else feels like political suicide.

This shows just how effective Granic Allen has been at being a voice for the grassroots. By staking out positions that resonate with the base, she’s forcing other leadership contestants to endorse them or risk losing to her. Note especially how she’s been extracting promises from Elliot and Mulroney whenever possible – normally around things she knows they don’t want to agree to but that play well with the base. By doing this, she hopes to remove much of their room to maneuver in the general election and prevent any big pivot to the centre.

Whether this will work really depends on how costly politicians find breaking promises. Conventional wisdom holds that they aren’t particularly bothered by it. I wonder if Granic Allen’s idealism blinds her to this fact. I’m quite sure that she, at least, wouldn’t break a promise except under the greatest duress.

On the left, it’s very common to see a view of politics that emphasizes pure and moral people. The problem with the system, says the communist, is that we let greedy people run it. If we just replaced them all with better people, we’d get a fair society. Granic Allen is certainly no communist. But she does seem to believe in the “just need good people” theory of government – and whether she wins or loses, she’s determined to bring all the other candidates with her.

This isn’t an incrementalist approach, which is why it feels so foreign to people like me. Granic Allen seems to be making the decision that she’d rather the Conservatives lose (again!) to the Liberals than that they win without a firm commitment to do things differently.

The conflict in the Ontario Conservative party – the conflict that surfaced when Patrick Brown’s rivals torpedoed him – is about how far the party is willing to go to win. The Ontario Conservatives aren’t the first party to go through this. When UK Labour members picked Jeremy Corbyn, they clearly chose ideological purity over electability.

In the Ontario PC party, Granic Allen and Ford have clearly staked out a position emphasizing purity. Mulroney and Elliot have just as clearly chosen to emphasize success. Now it’s up to the members. I’m very interested to see what they decide.

Economics, Model, Quick Fix

Not Just Zoning: Housing Prices Driven By Beauty Contests

No, this isn’t a post about very pretty houses or positional goods. It’s about the type of beauty contest described by John Maynard Keynes.

Imagine a newspaper that publishes one hundred pictures of strapping young men. It asks everyone to send in the names of the five that they think are most attractive. They offer a prize: if your selection matches the five men most often appearing in everyone else’s selections, you’ll win $500.

You could just do what the newspaper asked and send in the names of those men that you think are especially good looking. But that’s not very likely to give you the win. Everyone’s tastes are different and the people you find attractive might not be very attractive to anyone else. If you’re playing the game a bit smarter, you’ll instead pick the five people that you think have the broadest appeal.

You could go even deeper and realize that many other people will be trying to win and so will also be trying to pick the most broadly appealing people. Therefore, you should pick people that you think most people will view as broadly appealing (which differs from picking broadly appealing people if you know something that isn’t widely known about what most people find attractive). This can go on indefinitely (although Yudkowsky’s Law of Ultrafinite Recursion states that “In practice, infinite recursions are at most three levels deep”, which gives me a convenient excuse to stop before this devolves into “I know you know I know that you know that…” ad infinitum).
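To make the recursion concrete, here’s a toy simulation (a sketch with invented numbers, not anything rigorous). Each voter scores faces as a shared component everyone perceives plus a private quirk; level-0 voters vote their own taste, while a level-1 voter ignores their own quirks and targets the shared component – that is, they predict the aggregate:

```python
# Toy Keynesian beauty contest. All numbers are invented.
import random
from collections import Counter

random.seed(42)
N_FACES, N_VOTERS, PICKS = 100, 1000, 5

# Shared appeal that everyone perceives; each voter adds a private quirk.
shared = [random.random() for _ in range(N_FACES)]

def personal_scores(rng):
    return [shared[f] + 1.5 * rng.random() for f in range(N_FACES)]

def top_picks(scores):
    return sorted(range(N_FACES), key=lambda f: -scores[f])[:PICKS]

# Level 0: everyone votes their own taste.
votes = Counter()
for v in range(N_VOTERS):
    votes.update(top_picks(personal_scores(random.Random(v))))

# Level 1: ignore your quirks and bet on broad appeal.
strategic = top_picks(shared)

winners = [face for face, _ in votes.most_common(PICKS)]
print("actual winners:  ", sorted(winners))
print("level-1 strategy:", sorted(strategic))
```

On most seeds the two sets largely coincide: betting on broad appeal beats betting on your own taste. And a level-2 voter, best-responding to a crowd of level-1 voters, would target exactly the same faces – one way of seeing why the recursion tends to bottom out after a level or three.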

This thought experiment was relevant to an economist because many assets work like this. Take gold: its value cannot be fully explained by its prettiness or industrial usefulness; some of its value comes from the belief that someone else will want it in the future and be willing to pay more for it than they would for a similarly useful or pretty metal. For whatever reason, we have a collective delusion that gold is especially valuable. Because this delusion is collective enough, it almost stops being a delusion. The delusion gives gold some of its value.

When it comes to houses, beauty contests are especially relevant in Toronto and Vancouver. Faced with many years of steadily rising house prices, people are willing to pay a lot for a house because they believe that they can unload it on someone else in a few years or decades for even more.

When talking about highly speculative assets (like Bitcoin), it’s easy to point out the limited intrinsic value they hold. Bitcoin is an almost pure Keynesian Beauty Contest asset, with most of its price coming from an expectation that someone else will want it at a comparable or better price in the future. Houses are obviously fairly intrinsically valuable, especially in very desirable cities. But the fact that they hold some intrinsic value doesn’t, by itself, prove that none of their value comes from beliefs about how much they can be unloaded for in the future – see again gold, which has value both as an article of commerce and as a beauty contest asset.

There’s obviously an element of self-fulfilling prophecy here, with steadily increasing house prices needed to sustain this myth. Unfortunately, the housing market seems especially vulnerable to this sort of collective mania, because the sunk cost fallacy makes many people unwilling to sell their houses at a price below what they paid for them. Any softening of the market removes sellers, which immediately drives prices back up. Only a massive liquidation event, like we saw in 2007-2009, can push enough supply into the market to make prices truly fall.

But this isn’t just a self-fulfilling prophecy. There’s deliberateness here as well. To some extent, public policy is used to guarantee that house prices continue to rise. NIMBY residents and their allies in city councils deliberately stall projects that might affect property values. Governments provide tax credits or access to tax-advantaged savings accounts for homes. In America, mortgage interest is even tax-deductible!

All of these programs ultimately make housing more expensive wherever supply cannot expand to meet the artificially increased demand – which basically describes any dense urban centre. Therefore, these home buying programs fail to accomplish their goal of making housing more affordable, but do serve to guarantee that housing prices will continue to go up. Ultimately, they really just represent a transfer of wealth from taxpayers generally to those specific people who own homes.

Unfortunately, programs like this are very sticky. Once people buy into the collective delusion that home prices must always go up, they’re willing to heavily leverage themselves to buy a home. Any dip in the price of homes can wipe out the value of this asset, making it worth less than the money owed on it. Since this tends to make voters very angry (and also leaves many people with no money), governments of all stripes are very motivated to avoid it.

This might imply that the smart thing is to buy into the collective notion that home prices always go up. There are so many people invested in this belief at all levels of society (banks, governments, and citizens) that it can feel like home prices are too important to fall.

Which would be entirely convincing, except I’m pretty sure people believed the same thing in 2007, and we all know how that ended. Unfortunately, it looks like there’s no safe answer here. Maybe the collective mania will abate and home prices will stop being buoyed ever upwards. Or maybe they won’t, and the prices we currently see in Toronto and Vancouver will be reckoned cheap in twenty years.

Better zoning laws can help make houses cheaper. But it really isn’t just zoning. The beauty contest is an important aspect of the current unaffordability.

Economics, Model

Against Job Lotteries

In simple economic theory, wages are supposed to act as signals. When wages increase in a sector, it should signal people that there’s lots of work to do there, incentivizing training that will be useful for that field, or causing people to change careers. On the flip side, when wages decrease, we should see a movement out of that sector.

This is all well and good. It explains why the United States has seen (over the past 45 years) little movement in the number of linguistics degrees, a precipitous falloff in library sciences degrees, some decrease in English degrees, and a large increase in engineering and business degrees [1].

This might be the engineer in me, but I find things that are working properly boring. What I’m really interested in is when wage signals break down and are replaced by a job lottery.

Job lotteries exist whenever there are two tiers to a career. On one hand, you’ll have people making poverty wages and enduring horrendous conditions. On the other, you’ll see people with cushy wages, good job security, and (comparatively) reasonable hours. Job lotteries exist in the “junior doctor” system of the United Kingdom, in the academic system of most western countries, and in teaching in Ontario (up until very recently). There’s probably a much less extreme version of this going on even in STEM jobs (in that many people go in thinking they’ll work for Google or the next big unicorn and end up building websites for the local chamber of commerce or writing internal tools for the company billing department [2]). A slightly different type of job lottery exists in industries where fame plays a big role: writing, acting, music, video games, and other creative endeavours.

Job lotteries are bad for two reasons. Compassionately, it’s really hard to see idealistic, bright, talented people endure terrible conditions, all in the hope of something better – something that might never materialize. Economically, it’s bad when people spend a lot of time unemployed or underemployed because they’re hopeful they might someday get their dream job. Both of these reasons argue for us to do everything we can to dismantle job lotteries.

I do want to make a distinction between the first type of job lottery (doctors in the UK, professors, teachers), which is a property of how institutions have happened to evolve, and the second, which seems much more inherent to human nature. “I’ll just go with what I enjoy” is a very common strategy among creatives, and it will tend to split artists (of all sorts) into a handful of mega-stars, a small group of people making a modest living, and a vast mass of hopefuls searching for their break. To fix this would require careful consideration and the building of many new institutions – projects I think we lack the political will and the know-how for.

The problems in the job market for professors, doctors, or teachers feel different. These professions don’t rely on tastemakers and network effects. There’s also no stark difference in skills that would imply discontinuous compensation. This doesn’t imply that skills are flat – just that they exist on a steady spectrum, which should imply that pay could reasonably follow a similar smooth distribution. In short, in all of these fields, we see problems that could be solved by tweaks to existing institutions.

I think institutional change is probably necessary because these job lotteries present a perfect storm of misdirection to our primate brains. That is to say: (1) people are really bad at probability, and (2) the pay level of the highest earners suggests that lots of people should be entering the industry. Combined, this means that people will fixate on the highest earners, without really understanding how unlikely it is that they’ll join them.

Two heuristics drive our inability to reason about probabilities: the representativeness heuristic (ignoring base rates and information about reliability in favour of what feels “representative”) and the availability heuristic (events that are easier to imagine or recall feel more likely). The combination of these heuristics means that people are uniquely sensitive to accounts of the luckiest members of a profession (especially if this is the social image the profession projects) and unable to correctly predict their own chances of reaching that desired outcome (because they can imagine how they will successfully persevere and make everything come out well).

Right now, you’re probably laughing to yourself, convinced that you would never make a mistake like this. Well, let’s try an example.

Imagine a scenario in which only ten percent of current Ph.D. students will get tenure (basically true). Now, Ph.D. students are quite bright and are incredibly aware of their long odds. Let’s say that if a student three years into a program makes a guess as to whether or not they’ll get a tenure track job offer, they’re correct 80% of the time. If a student tells you they think they’ll get a tenure track job offer, how likely do you think it is that they will? Stop reading right now and make a guess.

Seriously, make a guess.

This won’t work if you don’t try.

Okay, you can keep reading.

It is not 80%. It’s not even 50%. It’s 31%. This is probably best illustrated visually.

Craft Design Online has inadvertently created a great probability visualization tool.

There are four things that can happen here (I’m going to conflate tenure track job offers with tenure out of a desire to stop typing “tenure track job offers”).

Ten students will get tenure. Of these ten:

  • eight (0.8 x 10) will correctly believe they’ll get it (1/green), and
  • two (10 – 0.8 x 10) will incorrectly believe they won’t (2/yellow).

Ninety students won’t get tenure. Of these ninety:

  • eighteen (90 – 0.8 x 90) will incorrectly believe they will get tenure (3/orange), and
  • seventy-two (0.8 x 90) will correctly believe they won’t (4/red).

Twenty-six students – those coloured green (1) and orange (3) – believe they’ll get tenure. But we know that only eight of them really will, which works out to just below the 31% I gave above.
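If you’d rather check the arithmetic than count coloured squares, the same calculation is just Bayes’ rule. Here’s a quick sketch in Python (the 10% base rate and 80% accuracy are the numbers from the thought experiment):

```python
# Bayes' rule for the tenure thought experiment.
base_rate = 0.10  # P(gets tenure)
accuracy = 0.80   # P(a student's prediction is correct), either way

true_positives = accuracy * base_rate                # 0.08: will get it and says so
false_positives = (1 - accuracy) * (1 - base_rate)   # 0.18: won't get it but says so

# P(tenure | student predicts tenure)
posterior = true_positives / (true_positives + false_positives)
print(f"{posterior:.0%}")  # prints 31%
```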

Almost no one can do this kind of reasoning, especially if they aren’t primed for a trick. The stories we build in our heads about the future feel so solid that we ignore the base rate. We think that we’ll know if we’re going to make it. And even worse, we think that a feeling of “knowing” we’ll make it provides good information. We think that relatively accurate predictors provide useful information in the face of a small base rate. They clearly don’t. When the base rate is small (here 10%), the base rate is the single greatest predictor of your chances.

But this situation doesn’t even require small chances for us to make mistakes. Imagine you had two choices: a career that leaves you feeling fulfilled 100% of the time, but is so competitive that you only have an 80% chance of getting into it (assume in the other 20%, you either starve or work a soul-crushing fast food job with negative fulfillment) or a career where you are 100% likely to get a job, but will only find it fulfilling 80% of the time.

Unless that last 20% of fulfillment is strongly super-linear [3][4], or you place no value at all on eating/avoiding McDrugery, it is better to take the guaranteed career. But many people looking at this probably rounded 80% up to 100% – another known flaw in human reasoning. You can very easily have a job lottery even when the majority of people in a career are in the “better” tier of the job, because many entrants to the field will treat “majority” as “all” and stick with it when they end up shafted.
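The expected value comparison between the two careers is short enough to write out. Here’s a sketch (assigning, as my own charitable assumption, a fulfillment of exactly zero to the fast-food fallback):

```python
# Expected fulfillment of the two careers, assuming (charitably)
# that the fast-food fallback is worth exactly zero fulfillment.
risky = 0.8 * 1.0 + 0.2 * 0.0  # 80% shot at a 100%-fulfilling career
safe = 1.0 * 0.8               # guaranteed 80%-fulfilling career
print(risky, safe)             # 0.8 vs 0.8: a dead heat, even then

# The thought experiment stipulates that the fallback is *negative*
# (starving, or a soul-crushing job), which pushes the risky career
# strictly below the safe one unless fulfillment is super-linear.
```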

Now, you might believe that these problems aren’t very serious, or that surely people making a decision as big as a college major or career would correct for them. But these fallacies date to the 70s! Many people still haven’t heard of them. And the studies that first identified them found them to be pretty much universal. Look, the CIA couldn’t even get people to do probability right. You think the average job seeker can? You think you can? Make a bunch of predictions for the next year and then talk with me when you know how calibrated (or uncalibrated) you are.

If we could believe that people would become better at probabilities, we could assume that job lotteries would take care of themselves. But I think it is clear that we cannot rely on that, so we must try to dismantle them directly. Unfortunately, there’s a reason many jobs are this way; many job lotteries have come about because current workers have stacked the deck in their own favour. This is really great for them, but really bad for the next group of people entering the workforce. I can’t help but believe that some of the instability faced by millennials is a consequence of past generations entrenching their benefits at our expense [5]. Other job lotteries have come about because of poorly planned policies, bad enrolment caps, etc.

There are two ways we can deal with a job lottery: we can limit the supply indirectly (by making the job – or the perception of the job once you’ve “made it” – worse), or limit the supply directly (by changing the credentials necessary for the job, or implementing other training caps). In many of the examples of job lotteries I’ve found, limiting the supply directly might be a very effective way to deal with the problem.

I can make this claim because limiting supply directly has worked in the real world. Faced with a chronic 33% oversupply of teachers and soaring unemployment rates among teaching graduates, Ontario chose to cut in half the number of slots in teacher’s college and double the length of teacher’s college programs. No doubt this was annoying for the colleges, which made good money off of those largely doomed extraneous pupils, but it did end the oversupply of teachers and tighten their job market – which was probably better for the economy than the counterfactual.

Why? Because having people who’ve completed four years of university do an extra year or two of schooling only to wait around and hope for a job is a real drag. They could be doing something productive with that time! The advantage of increasing gatekeeping around a job lottery and increasing it as early as possible is that you force people to go find something productive to do. It is much better for an economy to have hopeful proto-teachers who would in fact be professional resume submitters go into insurance, or real estate, or tutoring, or anything at all productive and commensurate with their education and skills.

There’s a cost here, of course. When you’re gatekeeping (e.g. for teacher’s college or medical school), you’re going to be working with lossy proxies for the thing you actually care about, which is performance in the eventual job. The lossier the proxy, the more you needlessly depress the quality of the people who are allowed to do the job – a serious concern when you’re dealing with heart surgery, or with the people providing foundational education to your next generation.

You can also find some cases where increasing selectiveness at an early stage doesn’t successfully force failed applicants to stop wasting their time and get on with their lives. I was very briefly enrolled in a Ph.D. program for biomedical engineering a few years back. Several professors I interviewed with while considering graduate school wanted to make sure I had no aspirations toward medical school – because they were tired of their graduate students abandoning research as soon as their Ph.D. was complete. For students who didn’t make it into medical school after undergrad, a Ph.D. was a ticket to another shot at getting in [6]. Anecdotally, I’ve seen people who failed to get into medical school or optometry get a master’s degree, then try again.

Banning extra education before medical school cuts against the idea that people should be able to better themselves, or persevere to get to their dreams. It would be institutionally difficult. But I think that it would, in this case, probably be a net good.

There are other fields where limiting supply is rather harmful. Graduate students are very necessary for science. If we punitively limited their number, we might find a lot of valuable scientific progress grinding to a standstill. We could try to replace graduate students with a class of professional scientific assistants, but as long as the lottery for professorship is so appealing (for those who are successful), I bet we’d see a strong preference for Ph.D. programs over professional assistantships.

These costs sometimes make it worth it to go right to the source of the job lottery: the salaries and benefits of people already employed [7]. Of course, this has its own downsides. In the case of doctors, high salaries and benefits are useful for making really clever applicants choose to go into medicine rather than engineering or law. For other jobs, there are the problems of practicality and fairness.

First, it is very hard to get people to agree to wage or benefit cuts and it almost always results in lower morale – even if you have “sound macro-economic reasons” for it. In addition, many jobs with lotteries have them because of union action, not government action. There is no czar here to change everything. Second, people who got into those careers made those decisions based on the information they had at the time. It feels weird to say “we want people to behave more rationally in the job market, so by fiat we will change the salaries and benefits of people already there.” The economy sometimes accomplishes that on its own, but I do think that one of the roles of political economics is to decrease the capriciousness of the world, not increase it.

We can of course change the salaries and benefits only for new employees. But this somewhat confuses the signalling (for a long time, people’s principal examples of the profession will still come from the earlier cohort). It also rarely alleviates a job lottery, because in practice these schemes give new employees reduced salaries and benefits only for a time. Once they get seniority, they’ll expect to enjoy all the perks of seniority.

Adjunct professorships feel like a failed attempt to remove the job lottery for full professorships. Unfortunately, they’ve only worsened it, by giving people a toe-hold that makes them feel like they might someday claw their way up to full professorship. I feel that when it comes to professors, the only tenable thing to do is greatly reduce salaries (making them closer to the salary progression of mechanical engineers, rather than doctors), hire far more professors, cap graduate students wherever there is high under- and unemployment, and have more professional assistants who do short 2-year college courses. Of course, this is easy to say and much harder to do.

If these problems feel intractable and all the solutions feel like they have significant downsides, welcome to the pernicious world of job lotteries. When I thought of writing about them, coming up with solutions felt like by far the hardest part. There’s a complicated trade-off between proportionality, fairness, and freedom here.

Old-fashioned economic theory held that the freer people were, the better off they would be. I think modern economists increasingly believe this is false. Is a world in which people are free to get very expensive training – despite very long odds of a job and cognitive biases that keep them from understanding just how punishing those odds are; training, in short, that they’d in expectation be better off without – really a better one than a world where they can’t?

I increasingly believe that it isn’t. And I increasingly believe that having rough encounters with reality early on, and having smooth salary gradients, is important for preventing this world. Of course, this is easy for me to say. I’ve been very deliberate about taking my skin out of job lotteries. I dropped out of graduate school. I write often and would like to someday make money off of writing, but I viscerally understand the odds of that happening, so I’ve been very careful to have a day job that I’m happy with [8].

If you’re someone who has made the opposite trade, I’m very interested in hearing from you. What experiences do you have that I’m missing that allowed you to make that leap of faith?

Footnotes:

[1] I should mention that there’s a difference between economic value, normative/moral value, and social value and I am only talking about economic value here. I wouldn’t be writing a blog post if I didn’t think writing was important. I wouldn’t be learning French if I didn’t think learning other languages is a worthwhile endeavour. And I love libraries.

And yes, I know there are many career opportunities for people holding those degrees and no, I don’t think they’re useless. I simply think a long-term shift in labour market trends has made them relatively less attractive to people who view a degree as a path to prosperity. ^

[2] That’s not to knock these jobs. I found my time building internal tools for an insurance company to be actually quite enjoyable. But it isn’t the fame and fortune that some bright-eyed kids go into computer science seeking. ^

[3] That is to say, that you enjoy each additional percentage of fulfillment at a multiple (greater than one) of the previous one. ^

[4] This almost certainly isn’t true, given that the marginal happiness curve for basically everything is logarithmic (it’s certainly true for money and I would be very surprised if it wasn’t true for everything else); people may enjoy a 20% fulfilling career twice as much as a 10% fulfilling career, but they’ll probably enjoy a 90% fulfilling career very slightly more than an 80% fulfilling career. ^

[5] It’s obvious that all of this applies especially to unions, which typically fight for seniority to matter quite a bit when it comes to job security and pay and do whatever they can to bid up wages, even if that hurts hiring. This is why young Canadians end up supporting unions in theory but avoiding them in practice. ^

[6] I really hope that this doesn’t catch on. If an increasing number of applicants to medical school already have graduate degrees, it will be increasingly hard for those with “merely” an undergraduate degree to get in to medical school. Suddenly we’ll be requiring students to do 11 years of potentially useless training, just so that they can start the multi-year training to be a doctor. This sort of arms race is the epitome of wasted time.

In many European countries, you can enter medical school right out of high school and this seems like the obviously correct thing to do vis a vis minimizing wasted time. ^

[7] The behaviour of Uber drivers shows job lotteries on a small scale. As Uber driver pay rises, more people join and all drivers spend more time waiting around, doing nothing. In the long run (here meaning eight weeks), an increase in per-trip fares leads to no change whatsoever in take-home pay.
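A toy free-entry model shows the mechanism (all the numbers are invented). Drivers enter whenever driving beats their outside option, and entry dilutes trips per driver until hourly take-home pay is right back where it started:

```python
# Toy free-entry model of driver supply. All numbers are invented, and
# citywide demand is assumed fixed (price-insensitive) for simplicity.
TRIPS_PER_HOUR = 10_000   # total citywide trips per hour
OUTSIDE_OPTION = 20.0     # $/hour drivers could earn elsewhere

def equilibrium(pay_per_trip):
    drivers = 1_000.0
    for _ in range(10_000):
        hourly = pay_per_trip * TRIPS_PER_HOUR / drivers
        if hourly > OUTSIDE_OPTION + 0.001:
            drivers *= 1.001   # driving beats the outside option: entry
        elif hourly < OUTSIDE_OPTION - 0.001:
            drivers *= 0.999   # driving lags the outside option: exit
        else:
            break
    return drivers, pay_per_trip * TRIPS_PER_HOUR / drivers

for fare in (2.0, 3.0):
    n, pay = equilibrium(fare)
    print(f"${fare:.2f}/trip -> {n:.0f} drivers, ${pay:.2f}/hour take-home")

# Raising per-trip pay 50% attracts ~50% more drivers; hourly take-home
# pay ends up at the outside option either way.
```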

The taxi medallion system that Uber has largely supplanted prevented this. It moved the job lottery one step further back, with getting the medallion becoming the primary hurdle, forcing those who couldn’t get one to go work elsewhere, but allowing taxi drivers to largely avoid dead times.

Uber could restrict supply, but it doesn’t want to and its customers certainly don’t want it to. Uber’s chronic driver oversupply (relative to a counterfactual where drivers waited around very little) is what allows it to react quickly during peak hours and ensure there’s always an Uber relatively close to where anyone would want to be picked up. ^

[8] I do think that I would currently be a much better writer if I’d instead tried to transition immediately to writing, rather than finding a career and writing on the side. Having a substantial safety net removes almost all of the urgency that I’d imagine I’d have if I was trying to live on (my non-existent) writing income.

There’s a flip side here too. I’ve spent all of zero minutes trying to monetize this blog or worrying about SEO, because I’m not interested in that and I have no need to. I also spend zero time fretting over popularizing anything I write (again, I don’t enjoy this). Having a safety net makes this something I do largely for myself, which makes it entirely fun. ^

Advice, Model

Improvement Without Superstition

[7 minute read]

When you make continuous, incremental improvements to something, one of two things can happen. You can improve it a lot, or you can fall into superstition. I’m not talking about black cats or broken mirrors, but rather humans becoming addicted to whichever steps were last seen to work, instead of whichever steps produce their goal.

I’ve seen superstition develop first-hand. It happened in one of the places you might least expect it – a biochemistry lab. In the summer of 2015, I found myself trying to understand which mutants of a certain protein were more stable than the wildtype. Because science is perpetually underfunded, the computer that drove the equipment we were using was ancient and frequently crashed. Each crash wiped out an hour or two of painstaking, hurried labour and meant we had less time to use the instrument to collect actual data. We really wanted to avoid crashes! Therefore, over the course of that summer, we came up with about 12 different things to do (in sequence) before each experiment to prevent crashes.

We were sure that 10 of the 12 things were probably useless; we just didn’t know which ten. There may have been no good reason that opening the instrument, closing it, then opening it again to load our sample would prevent computer crashes, but as far as we could tell, when we did that the machine crashed far less. It was the same for the other eleven. More self-aware than I was, the graduate student I worked with joked: “this is how superstitions get started”, and I laughed along. Until I read two articles in The New Yorker.

In The Score (How Childbirth Went Industrial), Dr. Atul Gawande talks about the influence of the Apgar score on childbirth. Through a process of continuous competition and optimization, doctors have found out ways to increase the Apgar scores of infants in their first five minutes of life – and how to deal with difficult births in ways that maximize their Apgar scores. The result of this has been a shocking (six-fold) decrease in infant mortality. And all of this is despite the fact that according to Gawande, “[in] a ranking of medical specialties according to their use of hard evidence from randomized clinical trials, obstetrics came in last. Obstetricians did few randomized trials, and when they did they ignored the results.”

Similarly, in The Bell Curve (What happens when patients find out how good their doctors really are), Gawande found that the differences between the best CF (cystic fibrosis) treatment centres and the rest turned out to hinge on how rigorously each centre followed the guidelines established by big clinical trials. That is to say, those that followed the accepted standard of care to the letter had much lower survival rates than those that hared off after any potentially lifesaving idea.

It seems that obstetricians and CF specialists were able to get incredible results without too much in the way of superstitions. Even things that look at first glance to be minor superstitions often turned out not to be. For example, when Gawande looked deeper into a series of studies that showed forceps were as good as or better than Caesarian sections, he was told by an experienced obstetrician (who was himself quite skilled with forceps) that these trials probably benefitted from serious selection effects (in general, only doctors particularly confident in their forceps skills volunteer for studies of them). If forceps were used on the same industrial scale as Caesarian sections, that doctor suspected that they’d end up worse.

But I don’t want to give the impression that there’s something about medicine as a field that allows doctors to make these sorts of improvements without superstition. In The Emperor of all Maladies, Dr. Siddhartha Mukherjee spends some time talking about the now discontinued practices of “super-radical” mastectomy and “radical” chemotherapy. In both treatments, doctors believed that if some amount of a treatment was good, more must be better. And for a while, it seemed better. Cancer survival rates improved after these procedures were introduced.

But randomized controlled trials showed that there was no benefit to those invasive, destructive procedures beyond that offered by their less-radical equivalents. Despite this evidence, surgeons and oncologists clung to these treatments with an almost religious zeal, long after they should have given up and abandoned them. Perhaps they couldn’t bear to believe that they had needlessly poisoned or maimed their patients. Or perhaps the superstition was so strong that they felt they were courting doom by doing anything else.

The simplest way to avoid superstition is to wait for large-scale trials. But from both Gawande articles, I get a sense that matches the anecdotal evidence from my own life and that of my friends: if you want to do something, anything, important – if you want to increase your productivity, or manage your depression/anxiety, or keep CF patients alive – you’re likely to do much better if you take the large-scale empirical results and use them as a springboard (or ignore them entirely if they don’t seem to work for you).

For people interested in nootropics, melatonin, or vitamins, there are self-blinded trials, which provide many of the benefits of larger trials without the wait. But for other interventions, it’s very hard to effectively blind yourself. If you want to see if meditation improves your focus, for example, then you can’t really hide from yourself the fact that you meditated on certain days [1].

When I think about how far from the established evidence I’ve gone to increase my productivity, I worry about the chance I could become superstitious.

For example, trigger-action plans (TAPs) have a lot of evidence behind them. They’re also entirely useless to me (I think because I lack a visual imagination with which to prepare a trigger) and I haven’t tried to make one in years. The Pomodoro method is widely used to increase productivity, but I find I work much better when I cut out the breaks entirely – or work through them and later take an equivalent amount of time off whenever I please. I use pomos only as a convenient, easy-to-Beemind measure of how long I worked on something.

I know modest epistemologies are supposed to be out of favour now, but I think it can be useful to pause, reflect, and wonder: when is one like the doctors saving CF patients and when is one like the doctors doing super-radical mastectomies? I’ve written at length about the productivity regime I’ve developed. How much of it is chaff?

It is undeniable that I am better at things. I’ve rigorously tracked the outputs on Beeminder and the graphs don’t lie. Last year I averaged 20,000 words per month. This year, it’s 30,000. When I started my blog more than a year ago, I thought I’d be happy if I could publish something once per month. This year, I’ve published 1.1 times per week.

But people get better over time. The uselessness of super-radical mastectomies was masked by other cancer treatments getting better. Survival rates went up, but when the accounting was finished, none of that was to the credit of those surgeries.

And it’s not just uselessness that I’m worried about, but also harm; it’s possible that my habits have constrained my natural development, rather than promoting it. This has happened in the past, when poorly chosen metrics made me fall victim to Campbell’s Law.

From the perspective of avoiding superstition: even if you believe that medicine cannot wait for placebo controlled trials to try new, potentially life-saving treatments, surely you must admit that placebo controlled trials are good for determining which things aren’t worth it (take as an example the very common knee surgery, arthroscopic partial meniscectomy, which has repeatedly performed no better than sham surgery when subjected to controlled trials).

Scott Alexander recently wrote about an exciting new antidepressant failing in Stage I trials. When the drug was first announced, a few brave souls managed to synthesize some. When they tried it, they reported amazing results – results that we now know were placebo. Look: you aren’t getting an experimental drug synthesized and trying it unless you’re pretty familiar with nootropics. Is the state of self-experimentation really that poor in the nootropics community? Or is it just really hard to figure out whether something works on you [2]?

Still, reflection isn’t the same thing as abandoning the inside view entirely. I’ve been thinking up heuristics since I read Dr. Gawande’s articles; armed with these, I expect to have a reasonable shot at knowing when I’m at risk of becoming superstitious. They are:

  • If you genuinely care only about the outcome, not the techniques you use to attain it, you’re less likely to mislead yourself (beware the person with a favourite technique or a vested interest!).
  • If the thing you’re trying to improve doesn’t tend to get better on its own and you’re only trying one potentially successful intervention at a time, fewer of your interventions will turn out to be superstitions and you’ll need to prune less often (much can be masked by a steady rate of change!).
  • If you regularly abandon sunk costs (“You abandon a sunk cost. You didn’t want to. It’s crying.”), superstitions do less damage, so you can afford to spend less mental effort on avoiding them.

Finally, it might be that you don’t care that some effects are placebo, so long as you get them and get them repeatedly. That’s what happened with the experiment I worked on that summer. We knew we were superstitious, but we didn’t care. We just needed enough data to publish. And eventually, we got it.

[Special thanks go to Tessa Alexanian, who provided incisive comments on an earlier draft. Without them, this would be very much an incoherent mess. This was cross-posted on Less Wrong 2.0 and as of the time of posting it here, there’s at least one comment over there.]

Footnotes:

[1] Even so, there are things you can do here to get useful information. For example, you could get in the habit of collecting information on yourself for a month or so (like happiness, focus, etc.), then try several combinations of interventions you think might work (e.g. A, B, C, AB, BC, CA, ABC, then back to baseline) for a few weeks each. Assuming that at least one of the interventions doesn’t work, you’ll have a placebo to compare against. Although be sure to correct any results for multiple comparisons. ^
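As a concrete illustration of that correction (with entirely fabricated focus scores), here’s a sketch that compares each intervention period to baseline with a t-test and applies a simple Bonferroni correction:

```python
# Sketch of analyzing self-experiment data with a Bonferroni correction.
# The daily focus scores below are fabricated for illustration.
from scipy import stats

baseline = [5.1, 4.8, 5.3, 5.0, 4.7, 5.2, 4.9, 5.1]
interventions = {
    "A":  [5.6, 5.9, 5.4, 5.8, 6.0, 5.7],
    "B":  [5.0, 4.9, 5.2, 5.1, 4.8, 5.0],
    "AB": [5.9, 6.1, 5.8, 6.2, 5.7, 6.0],
}

alpha = 0.05
corrected = alpha / len(interventions)  # Bonferroni: divide by number of tests

for name, scores in interventions.items():
    t, p = stats.ttest_ind(scores, baseline)
    verdict = "significant" if p < corrected else "not significant"
    print(f"{name}: p = {p:.4f} ({verdict} at corrected alpha = {corrected:.3f})")
```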

[2] That people still buy anything from HVMN (after they rebranded themselves, in what might have been an attempt to distance themselves from a study showing their product did no better than coffee) actually makes me suspect the latter explanation is true, but still. ^