Correlation still ain’t causation

Tim Harford has written an excellent article about the pitfalls of “big data” analysis. It basically boils down to this:

Correlation isn’t causation, no matter how much data you have.

But the bigger your dataset, the more likely you are to find correlations within it.

To illustrate, here is a graph of 100 random walks, each with 20 points (a random walk equals its previous value plus a random number, which I’ve chosen to draw from a uniform distribution between -0.5 and 0.5):

I’m actually quite surprised that Excel didn’t crash when I drew this chart, so maybe it doesn’t meet the definition of big data.

Anyway, these random walks have absolutely no relationship to each other: each is just the result of adding up 20 random numbers. But if you tested all 4,950 possible pairwise correlations between these 100 series, you would find that a staggering 2,175 of them (44%!) are statistically significant at the 5% level (95% confidence) just by chance (yes, I did test them).
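For anyone who wants to try this at home without Excel, here is a minimal sketch in Python. It assumes a plain Pearson correlation with a two-sided t-test at the 5% level, which may not be exactly the test behind the numbers above, and the function names and seed are made up for illustration. The count will differ from run to run and from the 2,175 quoted here, since the walks are random.

```python
import itertools
import math
import random

N_POINTS = 20   # points per random walk, as in the chart
T_CRIT = 2.101  # two-sided 5% critical t value for N_POINTS - 2 = 18 degrees of freedom


def random_walk(n_points, rng):
    """Running sum of n_points draws from Uniform(-0.5, 0.5)."""
    walk, level = [], 0.0
    for _ in range(n_points):
        level += rng.uniform(-0.5, 0.5)
        walk.append(level)
    return walk


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_significant(n_series=100, seed=1):
    """Count pairs of independent walks whose correlation looks 'significant'."""
    rng = random.Random(seed)  # hypothetical seed, only for repeatability
    walks = [random_walk(N_POINTS, rng) for _ in range(n_series)]
    # Convert the critical t value into the equivalent critical |r|.
    r_crit = T_CRIT / math.sqrt(T_CRIT ** 2 + (N_POINTS - 2))
    pairs = list(itertools.combinations(walks, 2))
    hits = sum(1 for x, y in pairs if abs(pearson_r(x, y)) > r_crit)
    return hits, len(pairs)


hits, total = count_significant()
print(f"{hits} of {total} pairs look significant ({hits / total:.0%})")
```

The standard test assumes independent observations within each series, which a random walk badly violates. That is why far more than 5% of the pairs pass.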

One solution to this mess is of course to have some good theory behind your analysis in the first place, and only test for correlations that make sense within that theory. This still doesn’t guarantee that you won’t find spurious correlations, but it will put you in a much better place than blindly testing everything possible in your dataset.

So far from making theory redundant, I think big data makes it even more important. The nice thing about big data is that it potentially allows testing more intricate and nuanced theories than is possible with small datasets.

Counteracting the sad faces

The Royal NZ Herald today includes a story by the notorious property editor Anne Gibson about a planned “huge” 14-unit apartment complex in Parnell that has upset the neighbours because it has only 10 carparks and might cause parking congestion on the street.

Never mind that (a) a rational developer will choose the number of carparks to meet expected demand, and (b) the real problem, if any, is that on-street parking in the area is not properly priced (i.e. it is free).

The article is of course accompanied by a photo of five NZ Herald Sad Faces (TM):

In a bold attempt to bring some balance to Gibson’s story, I have sourced photos of the 28 happy people who will move into these new apartments:

What’s the use of economic impact analysis?

Big businesses seem to like to commission economic impact studies (eg Airbnb or Auckland Airport). These studies usually claim to measure the total amount of economic activity directly and indirectly associated with the business in question.

Lots of methodological criticisms can be made about these studies — that economic activity is not the same thing as wellbeing, that costs are sometimes counted as benefits, that opportunity costs are not properly accounted for, that they often lack a counterfactual scenario and so measure gross impacts rather than net, or that the “multipliers” sometimes used in the analysis are subject to large uncertainties.

But put all these things aside and suppose an economic impact study had successfully and accurately measured the true total net contribution of a particular business to economic activity. What is the use of this number?

Such an analysis would tell us how much economic activity would be lost if the business were to cease to exist, taking into account the value of the next best use of the resources currently used by the business. But is this ever a relevant scenario? For example, no one is threatening to close Auckland Airport, so the economic activity that it creates is not at risk. The total economic impact of a business, even if correctly measured, is meaningless unless we are making a decision about whether or not to shut the business down.

Instead, the actual decisions that typically get made are much more marginal: should we allow this business to expand, or build a thing, or some such. An overall economic impact number does not inform these kinds of decisions. Analysing the incremental effects of actual and realistic decisions would yield much smaller but much more useful numbers.

The case of Airbnb is particularly interesting. I’m a big fan of Airbnb — I think they operate a very innovative business that generates real value in some interesting ways. So I was a bit disappointed to see them fall back on fairly typical economic impact analysis, when instead they could have highlighted the new ways that their business really generates wellbeing and economic activity. This would be harder to do but would tell a much more compelling story. I’ll write another post about this shortly.

Census data galore

Stats NZ has released the 2013 Census meshblock dataset, which gives detailed data from the latest Census. To celebrate, I’ve made a couple of maps of household income distribution in Auckland.

The first shows median household income by Census Area Unit. The five shades of blue show the quintiles of the distribution.

The next shows the annual average percentage change in household income between 2006 and 2013.
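An “annual average percentage change” can be calculated in more than one way. The sketch below assumes a compound growth rate over the seven years between the two Censuses; the function name and the income figures are hypothetical and are not taken from the Census data.

```python
def annual_average_growth(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical area unit: median household income in 2006 and in 2013.
income_2006 = 60_000.0
income_2013 = 75_000.0
growth = annual_average_growth(income_2006, income_2013, years=7)
print(f"Annual average change: {growth:.2%}")
```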

So did the rich get richer? This plot shows the growth rate of median household income between 2006 and 2013 versus median household income in 2006. There doesn’t seem to be a relationship, though the median household income in 2006 is obviously truncated at $100,000.

Dose of rationality

I stumbled on the excellent Healthcare Triage YouTube channel yesterday. The presenter (a doctor) uses medical research to answer important everyday health questions, such as whether it is safe to eat food dropped on the floor if you pick it up within five seconds (I thought it was three seconds?):

Whether there is any benefit from using antibacterial soap (spoiler: no, but don’t stop washing your hands entirely):

And this great one about risk, and how things like an irrational fear of flying can lead to safety policies that actually result in more deaths as well as having extraordinary costs:

Dude, where’s my house?

The Royal NZ Herald ran an appalling headline yesterday:

Chinese snap up more kiwi properties

I knew times were tough for traditional media, but their business must really be circling the drain if they’re resorting to racist click-bait like this.

The article is based on the latest property market report by the BNZ and REINZ, which shows that, among properties sold to overseas buyers, the proportion sold to overseas Chinese has increased from 15% to 25% in a year. But overseas buyers account for only 6.4% of all properties sold, so overseas Chinese are buying about 1.6% of all properties that are sold (25% of 6.4%), a tiny number that hardly qualifies as “snapping up”.
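The 1.6% figure is just one share multiplied by another. Here is the arithmetic as a short Python sketch, using the numbers quoted from the report; the variable names are mine, and the year-ago figure assumes the overseas share of sales was also 6.4% back then, which the report does not state.

```python
# Shares quoted from the BNZ-REINZ report.
overseas_share_of_sales = 0.064      # sales to overseas buyers / all sales
chinese_share_of_overseas = 0.25     # latest share of overseas sales
chinese_share_year_ago = 0.15        # same share a year earlier

# Share of ALL sales going to overseas Chinese buyers.
share_now = overseas_share_of_sales * chinese_share_of_overseas

# ASSUMPTION: the overseas share of sales was the same 6.4% a year ago.
share_year_ago = overseas_share_of_sales * chinese_share_year_ago

print(f"Now: {share_now:.1%} of all sales")
print(f"A year ago (assumed): {share_year_ago:.2%} of all sales")
```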

The BNZ-REINZ report also states (page 5):

There is no upward trend evident in the proportion of NZ dwelling sales to people located offshore.

So if the share of sales going to overseas Chinese is increasing while the overall share going offshore is not, the share going to people living in other countries must be decreasing. Unless you’re racist, there’s nothing to see here.

Should we even care if foreigners are buying “kiwi properties”? Houses don’t move easily, so if a foreigner buys a house in NZ, the house stays in NZ. If the foreigner chooses to rent the house out then the house is still available to provide housing to “kiwis”, and who owns it is irrelevant.

But what if the nasty foreigner keeps the house empty for most of the time, reserving it for their own pleasure when they occasionally visit NZ? Then the kiwi battler mums and dads can’t use the house any more. Should we care?

I think not: this is just another export like milk or kiwifruit. Saying that foreigners should be restricted from buying houses to keep them available for “kiwis” is like saying that milk exports should be restricted to keep the price of milk low in NZ supermarkets. Yes, milk might be cheaper for some, but this is economic nonsense. Just as NZ farmers are happy to sell their milk to foreigners, so too are some NZ homeowners happy to sell (export) their house at a good price to a foreigner. Restricting sales to foreigners will make those sellers worse off, something that is too easily overlooked.

Instead, the real problem, if there is one, is that NZ housing supply is not responding well to the signal provided by higher house prices. An increase in demand (whether from foreigners or locals) should cause more houses to be built, but this does not seem to be happening to any great extent, due to planning restrictions, NIMBYs, and construction costs. These are trickier problems to solve than pointing fingers at foreigners.

NZ tertiary education strategy: Fail?

The government’s brand new strategy for NZ’s tertiary education sector for the next five years contains not a single reference to online education, and the word “internet” appears only once:

While patterns of competition, demand, and work continue to change rapidly, geographical barriers to learning are reducing as a result of advances enabled by digital technologies. For example, super-fast broadband is supporting new modes of internet-based provision and a broader trend toward more flexible, less place-based provision. These technology-driven changes will require New Zealand’s tertiary education sector to advance its thinking quickly on new delivery models

That’s a pretty glib quote with no clear strategy. Those geographic barriers were what protected NZ’s tertiary sector for a long time, and they are crumbling fast. Why learn at a NZ university from an average professor (apologies to my former colleagues …) when you can learn at a world-leading university from a brilliant teacher without even leaving home? Yes, there are some aspects of student life that can’t (yet) be duplicated online, but those with the highest willingness to pay for education want quality.

And why enroll in a rigid multi-year degree structured around a set of 1-hour lectures delivered over three-month semesters with long holidays when you can speed things up considerably and skip the boring bits and the padding that inevitably gets added to justify three or four years of fees?

I see online education as a huge threat to the NZ tertiary sector; the government seems to have its head in the sand.

Two things that made me go hmmm

Chorus’s results for the second half of 2013 show an after-tax profit of $78m. Donal shows that, under conservative assumptions, this translates to an after-tax return of 19%. Not bad for a (mostly) regulated business!

And Telecom’s latest half-year result indicates a dividend of $43 million from its 50% holding in the Southern Cross Cable. No one knows what SCC’s total profit is, but as Sam Morgan commented:

Opening the bank data vaults

This tweet by Rod Drury is exciting:

Banks are treasure troves of data but legitimate concerns about privacy and security have presumably prevented them from opening this up until now. Online banking was once cutting edge, but despite some recent innovations, its basic functions have remained the same for a long time.

The privacy and security issues won’t go away though. I suppose people could be persuaded to give up privacy in exchange for valuable uses of their data, but security is another thing. When you give a Twitter app access to your account, for example, you let it read your stream and possibly tweet on your behalf. You’d have to trust an app a lot to let it make transactions on your behalf, and one security breach would dramatically undermine the whole system.

So how to overcome these problems while opening up bank data for new innovative uses? One possible solution (I’m not sure if this is what ASB has in mind) is to not let the data leave the bank’s servers. Instead, open up online banking (or some other platform) as a secure operating system for developers to create apps on. Done right, the bank can retain control of the data and transactions, while still allowing a banking app ecosystem to flourish inside a secure sandbox.

Perhaps once people get used to this in online banking, it could be rolled out to other platforms, allowing developers to create apps that are authenticated or branded as operating within the sandbox, but live outside the online banking website itself. This would certainly be exciting if ASB and others can pull it off.

The final Countdown

John is pessimistic about the prospects of the Commerce Commission taking any action against Countdown regarding recent allegations, partly because a key bit of NZ’s competition law is broken, but also because what Countdown has allegedly done is shaft its suppliers, not its customers or its competitors. Countdown would probably argue that squeezing its suppliers is a sign of competition between supermarkets.

The only possibly robust argument that I can think of is that Countdown’s alleged behaviour could be designed to reduce competition faced by its “house brands” in certain product categories. So Countdown could be taking advantage of its market power to reduce competition in certain segments. This is plausible but a difficult case to prove.

Anyway, as a consumer what often irks me at the supermarket (not just Countdown) in New Zealand is the frequently pathetic quality of fresh produce for sale (at not-cheap prices). On more than one occasion I have been tempted to start a Facebook page or some such, to allow people to post photos of limp, rotten, or damaged produce, with the hope of shaming supermarkets into improving their quality.

I don’t know if it’s true, but it seems to me that in the wholesale markets, the supermarkets often purchase pretty much the lowest grade of export-reject produce, and sell it at prices comparable to those that exported produce fetches overseas. Which is a shame, considering that NZ grows most of this stuff, and there are lower transport costs.

My tailor recently blogged about The Locavore Edition, an Australian guide that connects consumers to local food producers. Can we have something like this in NZ ASAP please, so that we can bypass the supermarkets altogether?