Error bar

Phil Matier is a terrible journalist. One of the ways in which he’s terrible is that he doesn’t understand statistics at all, or at least, the way he uses statistics in his writing is uninformed (at best) or intentionally misleading (at worst). I’ll be charitable and assume that the fallacies in his work come from a place of ignorance rather than deception.

Today he wrote a piece on cycling demographics which is sure to annoy bike people. (It annoyed me, too, but not for the same reasons.) In it, he makes a number of claims about the demographics of cycling in the Bay Area (focusing, as Matier always does, on the San Francisco experience). For example:

Across the Bay Area, white riders represented 61% of the bike commuters, followed by Hispanics at 17%, Asians at 15% and African Americans at 2.4%.

In San Francisco, the white percentage was even higher — 65% of regular riders — followed by Asians and Hispanics at 14% each and African Americans at just over 1% of regular San Francisco bike commuters. 

He’s quoting numbers from the American Community Survey 5-year estimates, which are our best source of city-level data on cycling. But the fact that the ACS is our best source of data doesn’t mean that it’s a good source of data, especially for breakdowns at this level.

I work with ACS data about cycling rates a lot, and the sample sizes are incredibly small. For example, census tract 35201 in the Outer Sunset:

Map of San Francisco showing Golden Gate Park and a selected census tract (67000) in the Outer Sunset
Census tract 35201, western San Francisco

The ACS estimates that there are 57 bicycle commuters in this census tract, with a margin of error of 55. ACS margins of error are published at the 90% confidence level, so there’s a 90% statistical confidence that the number of commute cyclists here is somewhere between 2 and 112. That’s really not very helpful to whatever analysis you’re trying to do. And it gets worse the more you try to subdivide it; you really can’t do more than amplify noise.
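Here is a minimal Python sketch of that interval arithmetic. It assumes the published margin of error is at the 90% confidence level (z = 1.645), which is how the Census Bureau reports ACS margins of error. The class and field names are mine, purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

Z_90 = 1.645  # z-score behind published ACS margins of error (90% confidence)
Z_95 = 1.960  # z-score for a 95% interval, for comparison


@dataclass
class AcsEstimate:
    """An ACS count estimate with its published margin of error (illustrative type)."""

    estimate: float
    moe_90: float  # margin of error as published, at 90% confidence

    @property
    def standard_error(self) -> float:
        # Back out the standard error from the published 90% margin of error.
        return self.moe_90 / Z_90

    def interval(self, z: float = Z_90) -> tuple[float, float]:
        half_width = self.standard_error * z
        # A count cannot be negative, so clamp the lower bound at zero.
        return max(0.0, self.estimate - half_width), self.estimate + half_width

    @property
    def cv_percent(self) -> float:
        # Coefficient of variation: standard error as a percentage of the estimate.
        return 100 * self.standard_error / self.estimate


# The Outer Sunset tract from the text: 57 bike commuters, +/- 55.
tract = AcsEstimate(estimate=57, moe_90=55)
low, high = tract.interval()
low_95, high_95 = tract.interval(Z_95)

print(f"90% interval: {low:.0f} to {high:.0f}")
print(f"95% interval: {low_95:.0f} to {high_95:.0f}")
print(f"Coefficient of variation: {tract.cv_percent:.0f}%")
```

Widening the interval to 95% pushes the lower bound below zero, so it gets clamped: at that confidence level the data can't even rule out a tract with no bike commuters at all.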

Those error bars are virtually never reported on outside of academic publications. I talk about them here, sometimes, but even I often report numbers which have huge uncertainty without really qualifying them properly. (Sorry about that.)

Even at the level of the entire city the error bars are pretty big. San Francisco is estimated to have 20,298 commute cyclists, +/- 1,108. When you break that down into eight or 10 (or more) racial and ethnic groups, you have to be very cautious about drawing any firm conclusions.
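To get a feel for how fast the uncertainty grows when you slice the citywide number, here is a back-of-the-envelope sketch in Python. It leans on a loud assumption: that the relative margin of error scales like one over the square root of the subgroup's share, which is what you'd expect if sampling variance were roughly proportional to the count. That is a rule of thumb, not the Census Bureau's method, and the real margins of error for each group should come from the ACS tables themselves. The function names are hypothetical; the shares are the ones quoted from the article above.

```python
import math


def relative_moe(estimate: float, moe: float) -> float:
    """Margin of error as a fraction of the estimate."""
    return moe / estimate


def approx_subgroup_relative_moe(total_estimate: float, total_moe: float, share: float) -> float:
    """Rough relative MOE for a subgroup making up `share` of the total.

    ASSUMPTION: relative error grows like 1 / sqrt(share). This is a rule of
    thumb for intuition only, not the published ACS methodology.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return relative_moe(total_estimate, total_moe) / math.sqrt(share)


CITY_ESTIMATE = 20_298  # San Francisco commute cyclists, from the text
CITY_MOE = 1_108        # published margin of error, from the text

print(f"Citywide: +/- {relative_moe(CITY_ESTIMATE, CITY_MOE):.1%}")
for share in (0.65, 0.14, 0.01):  # shares quoted in the article
    rel = approx_subgroup_relative_moe(CITY_ESTIMATE, CITY_MOE, share)
    print(f"{share:.0%} of bike commuters: roughly +/- {rel:.0%} of that group's count")
```

Under that assumption, a group that makes up about 1% of bike commuters carries a margin of error on the order of half its own estimated size.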

Matier, being a terrible journalist, doesn’t worry about such things. Not only does he completely ignore the margins of error on the numbers he’s quoting, he sets misleading comparative baselines. One of his factoids is that Whites make up 40% of San Francisco’s resident population but 65% of its bike commuters. Resident population is the wrong baseline to use; the bike commuter population should be compared to the overall commuter population, which is 46% White. Whites are over-represented in the commuter population, probably because they’re more likely to be of working age and to have a job. The gap between 46% and 65% is probably still statistically significant, but the 40% comparison is misleading.
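The effect of the baseline is easy to put a number on. Here is a small Python sketch using only the three percentages above; the "representation ratio" name is my own shorthand, and the point estimates are taken at face value, margins of error ignored.

```python
def representation_ratio(group_share: float, baseline_share: float) -> float:
    """How over-represented (>1) or under-represented (<1) a group is against a baseline."""
    return group_share / baseline_share


BIKE_COMMUTERS_WHITE = 0.65  # share of SF bike commuters who are White
RESIDENTS_WHITE = 0.40       # share of SF residents (the baseline Matier uses)
ALL_COMMUTERS_WHITE = 0.46   # share of all SF commuters (the fairer baseline)

vs_residents = representation_ratio(BIKE_COMMUTERS_WHITE, RESIDENTS_WHITE)
vs_commuters = representation_ratio(BIKE_COMMUTERS_WHITE, ALL_COMMUTERS_WHITE)

print(f"Against residents: {vs_residents:.2f}x")
print(f"Against commuters: {vs_commuters:.2f}x")
```

The over-representation is still there against the commuter baseline; it's just noticeably smaller than the resident comparison makes it look.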

The article also fails to address spatial disparities. Look at where the bike commuters in SF live:

Map of bike mode share in San Francisco. All of the darkest census tracts are in the center of the city, near downtown.

Excepting the Presidio, the areas where San Francisco sees a lot of bike commuting are the areas which are easy to get around on a bike (flat), and close enough to downtown to be worth riding. Cycling rates are very low in the south, west, and east of the city, because riding to downtown from those areas requires a long ride over substantial hills. That those neighborhoods are predominantly people of color (over 80% for most of them) and also have low bike commute rates is mostly an accident of history.

Another lovely bit of Matier’s non-data-analysis:

As for age, about 7 out of 10 bike commuters in San Francisco are between 20 and 40 years old, a trend replicated in other Bay Area counties, except for Marin, Napa, Solano and Sonoma, where the lion’s share of bike commuters are in their 50s.

So, this point would be great, except that it doesn’t hold for almost half of the Bay Area: specifically, the rural counties. A good journalist would wonder what’s going on in Marin and Napa that the bike commuters are older than in San Francisco, and a really good journalist would wonder if there’s also a difference in racial and ethnic makeup of the groups. (Guess what: There is.) Unfortunately we’re dealing with Matier, so we get none of that.

A couple of quotes from Jeffrey Tumlin provide a bit of redemption: Tumlin notes that, “For many neighborhoods, a protected bikeway to downtown is not their highest priority — getting kids safely to school may be more important.” Which may be true. This would be a fine moment to note that we have very little data on non-commute trips, which make up the majority of our transportation environment, and maybe even to speculate that our lack of understanding of the interests of those groups may be contributing to poor policy decisions.

Instead, Matier concludes, “For now, however, the roads are ruled by the bike bros.”

What an ass.

My conclusions:

  • If you are forced to read the SF Comical, skip Phil Matier.
  • U.S. bike advocacy does have a real issue with skewing male and white.
  • U.S. cycling participation really doesn’t skew white, though it does skew male, and it skews white in a few tech-oriented metros.
  • “Error Bar” would be a good name for a data science drinking establishment.
  • Be skeptical of statistical claims, because all of our data sucks.

