Tim Harford's Blog

June 8, 2013

A statistical needle in a bureaucratic haystack

Since You Asked

The data aren’t useful because they’re spread across a gazillion spreadsheets, says Tim Harford


‘Finding government statistics is not easy. Both expert users and occasional users struggle to navigate their way through the multiple places in which statistics are published.’ UK House of Commons public administration select committee report, May 2013


How hard can it be to find a few statistics? And since when is this a matter for a parliamentary committee?



You’ve obviously never tried to use the Office for National Statistics website. Try a simple-sounding query – such as what households are currently spending in a week, or retail price inflation for the past 50 years – and you are highly unlikely to get anywhere using the search window. It’s like Google on an acid trip, throwing several thousand random results at you.


It can’t be that hard.



I recently sat down with one of the UK’s finest economic journalists, Evan Davis of the BBC, and we tried to get the results we wanted either through the search window or by trying to second-guess the tormented mind of the person who constructed the branches of the database’s hierarchy. It was hopeless. Even when Mr Davis used his expertise to shortcut the process, we found ourselves thwarted at every turn. (As an aside, Google delivered the correct result in seconds.)


I am sure Chris Giles, the FT’s economics editor, would not be defeated so easily.


Perhaps not, but Mr Giles testified to the public administration committee and took the trouble to run through, step by step, just how difficult it would be to find the answer to a simple, practical statistical question – such as whether unemployment today is lower or higher than it was in the mid-1990s. For an expert user, who knows that the relevant code for the data in question is MGSX, finding an answer to that question is slow and awkward. For a more typical user, finding an answer might be impossible.


Let’s return to the question: why should a parliamentary committee care?



It’s encouraging that MPs care at all, given that professional researchers at the Commons library will do the hard work for them and they need never do battle with the ONS website themselves. Most other people have to do the legwork on their own; and, if the ONS site is hard to use, they will turn to other sources, which are more likely to be wrong or to contain partisan spin.


Why is this such a hard problem?



I suspect the ONS is making it look harder than it really is, but making statistics accessible isn’t a straightforward task. Our official statistics have their own longstanding categorisation system, which makes little sense to the lay person, so a user-friendly navigation system must help someone sidestep that. There’s a lot of data available, in principle, and there are many different ways in which users might reasonably want to see them presented, not to mention the difficulty of dealing with synonyms such as “family spending” instead of “household expenditure”. All that said: the ONS website is a national embarrassment.


Should I conclude that other countries make a better job of this?



The US’s Fred database (short for Federal Reserve Economic Data) is well-respected for being comprehensive and easy to use. The World Development Indicators, under the guardianship of the World Bank, are impressive if fiddly. But the truth is that this stuff isn’t terribly easy.


I thought the government was going to release more data. Does that mean the problem will get worse?



Demand for data can only rise, so the ONS needs to get its house in order. But the government’s “open data” plan is a somewhat separate issue. The data released under that plan could contain almost anything – for instance, the real-time location of every bus in London, to enable applications and websites that help people plan their journeys. Handy stuff – but processed official statistics, which are quality-assured, are a different matter.


Aren’t government departments and councils meant to be releasing highly detailed data about what they’re spending?



They are – for every item of spending over £25,000. But the data aren’t very useful at the moment: they are often a bit unreliable and spread across a gazillion spreadsheets. There are many such problems but, as with the ONS website, we are promised improvements in due course. The journey of a thousand miles begins with a single step, as they say. I’ll grant the government this much: the steps have been in the right direction.


Also published at ft.com.



June 1, 2013

Why weird science is all in a day’s work

Since You Asked

Stories of the formula for the perfect penalty kick are cheaper than ads, writes Tim Harford


‘People who have surgery towards the end of the week are more likely to die than those who have procedures earlier on, researchers say’ BBC.co.uk, May 29


This is presumably the National Health Service’s equivalent of Detroit’s lemons. If you buy a car that was assembled on a Friday afternoon, woe betide you . . . 


It is conceivable that surgeons operate after a boozy Friday lunch. A more plausible explanation is that the NHS is short-staffed at weekends, so if your surgery leads to complications you may be less likely to get prompt attention from experienced staff. Several studies have suggested it’s not a great idea to be stuck in hospital over the weekend, but there has been a suspicion that the problem may not be the staff but the patients. Those who rock up for emergency surgery at three o’clock on a Saturday morning may just be different sorts of people with different conditions, compared with those who arrive at lunchtime on a Wednesday. This research looked at planned surgery, not emergency surgery, which (one hopes) removes that source of confusion.



I’m sceptical. Haven’t we heard “researchers say” all sorts of things about different days of the week?


“Researchers say” the strangest things, at least according to the newspapers. You may be thinking of the “Blue Monday” equation, which purported to show the last Monday in January was, scientifically speaking, the year’s most depressing day.


That’s the one!


It’s nonsense. Harry Frankfurt’s essay “On Bullshit” pointed out that while a liar knows the truth and is determined to conceal it, the bull merchant has no interest in whether something is true or not. This particular piece of nonsense is arbitrary pseudoscience.


The media are full of nonsensical pseudoscience attributed to “researchers”. Which is why I was sceptical about the “don’t get sick on Friday” study.


The problem is that we constantly read that “researchers say” one thing or another. Perhaps that phrase once conveyed something meaningful – that experts had conducted a rigorous study on a topic and that we didn’t need to worry about the details, but could skip to the punchline. If so, public relations companies have hijacked the phrase, using it as a vector to infect the nervous systems of journalists and their editors. The Blue Monday study was paid for by a travel agency. Other nonsensical equations have been commissioned by ice cream makers, lingerie manufacturers, supermarkets and bookmakers. Some academic is persuaded to attach his good name to the sorry affair – and the definition of “academic” is often very loose. Nobody cares. Stories breathlessly relating the discovery of the mathematical formula for the perfect penalty kick, the perfect pair of breasts or the perfect weekend are routinely published. They are cheaper than paying for advertising.


But you’re going to tell me that the hospital mortality study was different, because it wasn’t sponsored by some corporate PR outfit?


You’re missing the point. The real flaw with Blue Monday wasn’t that it was commissioned by a corporation. It was that it had no scientific content. Science isn’t just whatever emerges from the mouth of someone with a tenuous university affiliation. Science is a process. The hospital study is part of that scientific process. It identified a hypothesis of importance. It gathered data – more than 4m inpatient admissions, covering every hospital in England over three years. It analysed the data with enough statistical power that the observed patterns were enormously unlikely to be the result of chance. It found an effect large enough to be of real practical concern. The research refers to, and supplements, previous studies in the area – and future studies will refer to, and supplement, this one. It was peer-reviewed and published in the British Medical Journal, an organ with a reputation to defend.
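

To get a feel for what “enormously unlikely to be the result of chance” means at this scale, here is a toy two-proportion test with invented numbers of roughly the right magnitude – these are not the BMJ study’s figures.

```python
from math import sqrt
from statistics import NormalDist

# Invented counts, for illustration only -- NOT the BMJ study's data.
n_mon, d_mon = 1_000_000, 4_000   # assumed Monday operations and deaths
n_fri, d_fri = 1_000_000, 4_600   # assumed Friday operations and deaths

p_mon, p_fri = d_mon / n_mon, d_fri / n_fri
p_pool = (d_mon + d_fri) / (n_mon + n_fri)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_mon + 1 / n_fri))
z = (p_fri - p_mon) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.1f}, two-sided p = {p_value:.1e}")  # z ~ 6.5, p ~ 1e-10
```

With samples in the millions, even a modest difference in mortality rates produces a p-value vanishingly close to zero – which is the sense in which the pattern is very unlikely to be chance.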


So the research checks out.


With apologies to Star Trek, I’m an economist, Jim, not a doctor. But it looks solid to me. Whether or not the BMJ study ultimately turns out to be correct, it is a world away from Blue Monday or the equation for the beer-goggles effect. Yet to the casual consumer of newspaper reporting, the difference is far from clear. So now half the country is credulous about pseudoscience, while the other half is disbelieving of perfectly good research. It’s all far more disheartening than a Monday morning.


Also published at ft.com.



Being economical with the data

Undercover Economist

Economics will have to change what it recognises as a question, and what it recognises as an answer


According to IBM, the computers with which we have surrounded ourselves are now generating 2.5 quintillion bytes of data a day around the world. That’s about half a CD’s worth of data per person per day. “Big data” is the topic of countless breathless conference presentations and consultants’ reports. What, then, might it contribute to economics?
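

That per-person figure is easy to sanity-check (a back-of-envelope sketch; the seven-billion population and 700MB CD capacity are my assumptions, not IBM’s):

```python
# Rough arithmetic behind "half a CD per person per day".
bytes_per_day = 2.5e18   # IBM's 2.5 quintillion bytes
population = 7e9         # assumed world population
cd_bytes = 700e6         # assumed capacity of a standard CD

per_person = bytes_per_day / population
print(f"{per_person / 1e6:.0f} MB per person per day")       # ~357 MB
print(f"{per_person / cd_bytes:.2f} CDs per person per day") # ~0.51
```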


Not everyone means the same thing when they talk about “big data”, but here are a few common threads. First, the dataset is far too big for a human to comprehend without a lot of help from some sort of visualisation software. The time-honoured trick of plotting a scatter graph to see what patterns or anomalies it suggests is no use here. Second, the data is often available at short notice, at least to some people. Your mobile phone company knows where your phone is right now. Third, the data may be heavily interconnected – in principle Google could have your email, your Android phone location, knowledge of who your friends are on the Google Plus social network, and your online search history. Fourth, the data is messy: videos that you store on your phone are “big data” but a far cry from neat database categories – date of birth, employment status, gender, income.


This hints at problems for economists. We have been rather spoiled: in the 1930s and 1940s pioneers such as Simon Kuznets and Richard Stone built tidy, intellectually coherent systems of national accounts. Literally billions of individual transactions are summarised as “UK GDP in 2012”; billions of price movements are represented by a single index of inflation. The data come in nice “rectangular” form – inflation for x countries over y years, for instance.
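

In code terms, the “rectangular” ideal is just a small table (a sketch with made-up numbers, not real inflation rates):

```python
import pandas as pd

# The tidy, "rectangular" data economists are used to: one row per
# country, one column per year. Figures are illustrative only.
inflation = pd.DataFrame(
    {"2010": [3.3, 1.6, -0.7], "2011": [4.5, 3.2, -0.3], "2012": [2.8, 2.1, 0.0]},
    index=["UK", "US", "Japan"],
)
print(inflation)
```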


The big data approach is very different. Take, for instance, credit card data. In principle Mastercard has a wonderful dataset: it knows who is spending how much, where, and on what kind of product, and it knows it instantly. But this is what economists Liran Einav and Jonathan Levin call a “convenience sample” – not everyone has a Mastercard, and not everyone who has one will use it much.


It would be astonishing if the Mastercard dataset couldn’t tell economic researchers something useful, but it’s very poorly matched to the kind of data we normally use or even the kind of questions we normally ask. We like to find causal links, not just patterns – and for everyone, or a representative sample of people, not for an arbitrary sub-group.


Perhaps it’s no surprise that the most immediate use of big data in economics has been in forecasting (or “nowcasting”), which has always been a pragmatic and academically marginal activity. Analyses of tweets, of Google searches for unemployment benefit or motor insurance, or of trackers of trucks in Germany have been used ad hoc to understand how the economy is doing, and they seem to work well enough. MIT’s “billion prices project” provides daily estimates of inflation from around the world.


More traditional attempts to use big data have been influential. For instance, Raj Chetty, John Friedman and Jonah Rockoff linked administrative data on 2.5m schoolchildren from New York City to information on what they earned as adults decades later. A single year’s exposure to a poor teacher turns out to have large and long-lasting effects on career success. Amy Finkelstein and a team of colleagues evaluated Medicaid, the low-income US healthcare programme, linking data on hospital records to credit history and other variables. Without large datasets such research would be impossible.


These recent studies promise much more to come for economics. But to take full advantage of the data revolution, the profession will have to change both what it recognises as a question, and what it recognises as an answer.


Also published at ft.com.



May 29, 2013

Royal Economic Society Annual Public Lecture

Speeches

I am delighted to announce that I will be giving the Royal Economic Society’s annual public lecture this year, in Sheffield on 26th November and London on 28th November. Come along! (Tickets will be released in September.)



May 26, 2013

Next series of Pop Up Ideas to be recorded on June 10th

Radio

Pop Up Economics is now Pop Up Ideas, and the next series will be inspired by ideas from anthropology and political science. We’re recording the entire series in front of a live audience – please come along. You can apply for free tickets here.


Speakers will include the anthropologist and Financial Times capital markets editor Gillian Tett; counterinsurgency expert David Kilcullen; and a man who needs no introduction, Malcolm Gladwell. And yours truly, of course.



A simple way to make a great cup of coffee

Marginalia

This is a public service announcement.




May 25, 2013

Misinformation can be beautiful too

Undercover Economist

Data visualisation creates powerful, elegant images from complex information, but can also be potentially deceptive


Camouflage usually means blending in. That wasn’t an option for the submarine-dodging battleships of a century ago, which advertised their presence against an ever-changing sea and sky with bow waves and smokestacks. And so dazzle camouflage was born, an abstract riot of squiggles and harlequin patterns. It wasn’t hard to spot a dazzle ship but the challenge for the periscope operator was quickly to judge a ship’s speed and direction before firing a torpedo on a ponderous intercept. Dazzle camouflage was intended to provoke misjudgments, and there is some evidence that it worked.


Now let’s talk about data visualisation, the latest fashion in numerate journalism, albeit one that harks back to the likes of Florence Nightingale. She was not only the most famous nurse in history but the creator of a beautiful visualisation technique, the “Coxcomb diagram”, and the first woman to be elected as a member of the Royal Statistical Society.


Data visualisation creates powerful, elegant images from complex data. It’s like good prose: a pleasure to experience and a force for good in the right hands, but also seductive and potentially deceptive. Because we have less experience of data visualisation than of rhetoric, we are naive, and allow ourselves to be dazzled. Too much data visualisation is the statistical equivalent of dazzle camouflage: striking looks grab our attention but either fail to convey useful information or actively misdirect us.


For a relatively harmless example, consider The New Yorker’s recent online subway map of inequality. “New York has a problem with inequality,” we are told. Then we are invited to click on different subway maps to see a cross-sectional graph, showing us the peaks and troughs of median income along different subway lines. The result is gorgeous but far less informative than a map would have been. It is a piece of art pretending to be a piece of statistical analysis.


A more famous example is David McCandless’s unforgettable animation “Debtris”, in which large blocks fall slowly against an eight-bit soundtrack in homage to the addictive computer game Tetris. Their size indicates their dollar value. “$60bn: estimated cost of Iraq war in 2003” is followed by “$3000bn: estimated total cost of Iraq war”, and then Walmart’s revenue, the UN’s budget, the cost of the financial crisis, and much else.


The animation is pure dazzle camouflage. Statistical apples are compared with statistical oranges throughout. The Iraq comparison, for instance, is not one of “then versus now” as it first appears – but one of what the US Department of Defense once thought it would spend versus a broader estimate, including a financial value on the lives of dead soldiers, and over a trillion dollars of “macroeconomic costs”. The war was a disaster. No need for a statistical bait-and-switch to make that case.


Information can be beautiful, McCandless tells us. Unfortunately misinformation can be beautiful too. Or, as statistical guru Michael Blastland puts it, “We are in danger of making the same statistical mistakes that we’ve always made – only prettier.”


Those beautiful Coxcomb diagrams are no exception. They show the causes of mortality in the Crimean war, and make a powerful case that better hygiene saved lives. But Hugh Small, a biographer of Nightingale, argues that she chose the Coxcomb diagram in order to make exactly this case. A simple bar chart would have been clearer: too clear for Nightingale’s purposes, because it suggested that winter was as much of a killer as poor hygiene was. Nightingale’s presentation of data was masterful. It was also designed not to inform but to persuade. When we look at modern data visualisations, we should remember that.
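

To see why the same numbers can read so differently, here is a minimal matplotlib sketch putting a polar-area “coxcomb” beside a plain bar chart. The monthly figures are invented, not Nightingale’s Crimean data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented monthly death counts -- illustrative only.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
deaths = [80, 70, 55, 40, 30, 25, 30, 45, 60, 75, 90, 95]

fig = plt.figure(figsize=(9, 4))

# Coxcomb / polar-area version: each wedge's AREA encodes the count,
# so the radius is the square root of the value.
ax1 = fig.add_subplot(1, 2, 1, projection="polar")
theta = np.linspace(0.0, 2 * np.pi, len(deaths), endpoint=False)
ax1.bar(theta, np.sqrt(deaths), width=2 * np.pi / len(deaths))
ax1.set_title("Coxcomb (polar area)")

# The same numbers as a plain bar chart: duller, but easier to compare.
ax2 = fig.add_subplot(1, 2, 2)
ax2.bar(months, deaths)
ax2.set_title("Bar chart")

plt.tight_layout()
plt.show()
```

The coxcomb is striking; the bar chart makes the month-to-month comparison plain. Which one you choose is itself a rhetorical decision.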


Also published at ft.com.



Loose money in all that spare change

Since You Asked

The disappearance of small coins will be little noticed, writes Tim Harford


‘There is a wish among the German population to keep hold of the small coins. I can personally only join that opinion.’ Jens Weidmann, Bundesbank president, quoted in the German newspaper Bild


Why is a German central banker bothering to campaign for the retention of the euro cent? It’s not as if anyone is proposing scrapping them.


Actually, the European Commission has proposed exactly that.


Oh.


They have a case: one and two euro-cent coins are expensive to make, relative to their face value. And they are useless things, as are the British penny and the US cent.


But people don’t want to see these coins disappear.


There does seem to be a psychological barrier. This is most obvious in the US, which did away with the half cent back in 1857, when it was worth quite a bit – 14 modern cents, after adjusting for consumer prices. Relative to the wages of the day, the half cent was worth almost a dollar in today’s money. It was no small thing to get rid of the half cent when people were only paid a few cents an hour. Nevertheless, the coin was withdrawn and everyone survived. Modern pennies, because they feel like some kind of foundational building block of the monetary system, have clung on stubbornly.
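

A back-of-envelope version of that adjustment (the index ratios below are rough assumptions chosen to match the figures quoted, not official series):

```python
# Adjusting the 1857 half cent for inflation, two ways.
half_cent = 0.005   # dollars, 1857

price_ratio = 28    # assumed growth in consumer prices since 1857
wage_ratio = 190    # assumed growth in average wages since 1857

print(f"in consumer-price terms: {half_cent * price_ratio * 100:.0f} modern cents")
print(f"in wage terms: about ${half_cent * wage_ratio:.2f}")
```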


And why not? We like them.


No, we hate them. Actions speak louder than words. People don’t use these coins as money, for the excellent reason that they’re not terribly useful as a medium of exchange – too heavy and too difficult to count when purchasing anything but the tiniest item. The reason that so many euro cents have to be minted is that people get the things in change but don’t then spend them. They end up down the back of sofas or gathering lint in people’s belly buttons.


Isn’t that Gresham’s law or something?


It’s actually a weird inverted form of Gresham’s law, which says that “bad money drives out good”. When two coins with the same face value are circulating, people will tend to spend the coins with lower metallic value and keep the rest. The commemorative John F. Kennedy half dollar, for instance, was 90 per cent silver when first minted in 1964. Later half dollars contained less, then no silver. Gresham’s law predicts that the silver half-dollars will not circulate – and indeed, they do not. According to the website Coinflation, the melt value of a 1964 half dollar is $8. But the situation with the euro cent is the reverse: people hold on to them not because they are too valuable to lose but because they are too trivial to use.
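

The melt-value arithmetic is easy to check (a sketch; the silver price here is an assumption, and silver prices move around):

```python
# A 90 per cent silver US half dollar weighs 12.5g, so it contains
# about 11.25g of silver -- roughly 0.36 troy ounces.
GRAMS_PER_TROY_OUNCE = 31.1035
silver_ounces = 12.5 * 0.90 / GRAMS_PER_TROY_OUNCE

silver_price = 22.0  # assumed US dollars per troy ounce
print(f"melt value: ${silver_ounces * silver_price:.2f}")  # about $8
```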


Or they are popped in a charity collection box. Charities have a lot to lose if small-denomination coins are abolished.


The cost of keeping these coins in circulation has been €1.4bn since 2002, according to the European Commission. We should be able to figure out a cheaper way to encourage donations.


But the euro cent is hardly the least valuable coin in the world.


Indeed not. The More or Less programme on the BBC looked into this in 2012 and found a number of absurdly small-value coins. A jar of 500 of Tanzania’s smallest coin, the 5 cent piece, was worth just one British penny. There were 1,300 Burmese pya to the penny. The lowest-value coin in the world was Uzbekistan’s 1 tiyin.


How much is that worth?


One hundredth of a som – which, as I am sure you know, means that the tiyin was worth about one three-thousandth of a penny.


Has any country withdrawn small-denomination coins?


Various economic basket cases have, yes: Sweden, Australia, New Zealand and the home of the next Bank of England governor, Canada. New Zealand’s smallest coin is the 10 cent piece, worth more than 6 euro cents or 5 pence. The most valuable smallest coin of all is the Norwegian krone, worth more than 13 euro cents or 11 pence. The Norwegian economy has coped.


Things are awfully expensive in Norway. Perhaps people are right to fear inflation if these small coins are eliminated.


There’s a question of cause and effect here. Are things expensive in Norway because the country abolished smaller denominations, or were smaller denominations abolished because things were expensive? Don’t worry about inflation: the effect of scrapping small coins is unlikely to be noticeable, and inflation in the eurozone is low anyway. I don’t think the bastions of small-denominationism – Tanzania, Myanmar and Uzbekistan – suggest there is anything especially prophylactic about keeping our smallest coins.


Also published at ft.com.



May 18, 2013

The antisocial network?

Undercover Economist

Two economists have been collecting data to assess whether online friends are good for the soul. The quick answer: not really


Eliza was the first software to simulate human conversation. It was developed in the mid-1960s by Joseph Weizenbaum, a computer scientist at MIT. The most famous version mimicked a psychotherapist, thus hiding its conversational incompetence: “Men are all alike.” “In what way?” “They’re always bugging us about something or other.” “Can you think of a specific example?” “Well, my boyfriend made me come here.” “Your boyfriend made you come here?” And so on.
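

The trick is pattern-matching and pronoun reflection. Here is a minimal sketch of the idea in Python – a toy illustration, not Weizenbaum’s actual script:

```python
import re

# Eliza-style "reflection": match a pattern, swap pronouns, and echo
# the patient's statement back as a question.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are", "you": "I"}

def reflect(text: str) -> str:
    return " ".join(REFLECTIONS.get(w, w) for w in text.lower().split())

def respond(statement: str) -> str:
    m = re.match(r"(?:well, )?(.*) made me (.*)", statement.lower().rstrip("."))
    if m:
        return f"{reflect(m.group(1)).capitalize()} made you {m.group(2)}?"
    return "Can you think of a specific example?"

print(respond("Well, my boyfriend made me come here."))
# -> "Your boyfriend made you come here?"
```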


It was a clever program but Professor Weizenbaum, who died in 2008, was disturbed by the fact that several people seemed to find talking to Eliza genuinely therapeutic. Computers, concluded Weizenbaum, might not be terribly good for our emotional wellbeing.


These days, concern has moved to the amount of time we spend checking social networks – even if our online friends are, presumably, human beings. Two economists, John Helliwell and Haifang Huang, have been collecting the data necessary to assess whether such online friends are good for the soul. The quick answer: not really.


Helliwell and Huang analyse a Canadian social survey of 5,000 people called the “Happiness Monitor”, which measures how happy, stressed or satisfied with their lives people are, using a variety of standard questions. The Happiness Monitor also asks people how many friends they have in their real-life social network, as well as how large their online network is.


They find that having troops of friends is correlated with a sense of wellbeing. (As is common with such exercises, the direction of causation is unclear: perhaps happy people attract friends.) The effect is substantial: having twice as many friends is associated with the same increase in happiness as a 50 per cent increase in income. But move the social network online, and larger networks do nothing for our happiness. Millions of digital sceptics will be unsurprised.
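

One way to make that friends-versus-income comparison concrete (a sketch: the log-log specification below is my assumption, not necessarily the authors’ exact model):

```latex
% Suppose wellbeing W is modelled as
\[
  W = \alpha + \beta_f \ln(\text{friends}) + \beta_y \ln(\text{income}) + \varepsilon
\]
% Doubling friends raises W by \(\beta_f \ln 2\); a 50 per cent income rise
% raises it by \(\beta_y \ln 1.5\). The reported equivalence amounts to
\[
  \beta_f \ln 2 \approx \beta_y \ln 1.5
  \quad\Longrightarrow\quad
  \beta_f \approx 0.58\,\beta_y .
\]
```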


I am sceptical about the value of Facebook myself, but the most natural reading of Helliwell and Huang’s results is that a Facebook “friend” is not necessarily a friend at all, just a setting that tells software whose status updates to show us. The Happiness Monitor doesn’t even use the word “friend” when asking about the size of online social networks. (I have 73 Facebook friends and 65,000 Twitter followers, and it is not clear which represents the size of my online social network.)


Another study, by Fenne Deters and Matthias Mehl, published late last year in Social Psychological & Personality Science, asks a different question about our online socialising: how do we feel when we post status updates to Facebook? And how do we feel if nobody responds? Deters and Mehl ran a randomised trial with 102 students at the University of Arizona. The control group was given no specific instructions; the treatment group was asked to post more status updates “than they usually post per week”. Some ignored the instruction – but those who did not said they felt less lonely. It would be easy to over-interpret these results: the sample is small and there is something artificial about posting updates to Facebook in response to the request of an experimental psychologist.


The study is intriguing. It did not seem to matter whether anyone responded to the status updates. Perhaps people felt that they were being read even if there was no feedback; or perhaps responses came via email, text or face to face, unseen by the experimenters.


Or perhaps Facebook updates make us feel connected even though nobody out there is listening. That suggests a curious view of social networking: it may have little to do with true socialising. We may simply feel satisfied with the illusion that someone is paying attention. Joseph Weizenbaum, the creator of Eliza, would not have been surprised.


Also published at ft.com.



Mile-high bid to step up to a better class

Since You Asked

Auctions seem a fine way of assessing willingness to pay, writes Tim Harford


‘For those seeking a way out of the economy cabin, a number of airlines, including Virgin Atlantic, Etihad, Austrian Airlines and Tap Portugal offer a facility whereby you tell them what you are prepared to pay to get upgraded.’

Financial Times, May 15


Am I supposed to be impressed? Isn’t the normal method of being upgraded to pay for it?


I think the distinction here is that traditionally, the customer – or the customer’s employer or client – pays the sticker price to fly business class or first class. This new scheme invites the customer to suggest the price. The airline will sit on that bid for a while, see whether better offers come in and, depending on how busy the front of the aeroplane is, will accept or reject the offer.
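

In pseudocode terms the decision might look something like this – a toy acceptance rule of my own invention, not any airline’s actual policy:

```python
# The fuller the premium cabin, the higher the bar for accepting a bid.
# The thresholds and the linear rule are illustrative assumptions.
def accept_upgrade_bid(bid: float, seats_left: int, cabin_size: int) -> bool:
    scarcity = 1 - seats_left / cabin_size   # 0 = empty cabin, 1 = full
    reserve = 100 + 400 * scarcity           # assumed £100-£500 sliding reserve
    return bid >= reserve

print(accept_upgrade_bid(bid=200, seats_left=20, cabin_size=24))  # True: cabin mostly empty
print(accept_upgrade_bid(bid=200, seats_left=2, cabin_size=24))   # False: nearly full
```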


I see – that is a change.


Yes, but not a big change. A well-designed auction flushes out information about what people might be willing to pay, and charges accordingly. In some markets that is a very useful innovation, but there’s a reason why Tesco doesn’t run auctions when you pop in to buy milk.


It’s too much hassle, of course.


Partly that – although it is possible to run auctions incredibly quickly and cheaply in some circumstances. Every time you type in a search term on Google, the company runs a quick auction to decide which adverts to display. But Tesco doesn’t need to bother trying to figure out how to run auctions because it already has a fantastic amount of information about willingness to pay in general. It may also have information about your own shopping habits through which it can offer selective discounts.
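

For flavour, here is a toy second-price auction of the sort search-ad platforms built on. Real ad auctions are more elaborate – they weight bids by ad quality and run across many slots – so this sketch shows only the core pricing rule:

```python
# Highest bidder wins, but pays the second-highest bid.
def second_price_auction(bids: dict[str, float]) -> tuple[str, float]:
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

bids = {"advertiser_a": 1.20, "advertiser_b": 0.90, "advertiser_c": 0.50}
print(second_price_auction(bids))  # ('advertiser_a', 0.9)
```

Charging the second-highest bid is what makes it safe for bidders to reveal their true willingness to pay – which is exactly the information the auctioneer wants flushed out.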


And the same must be true for airlines.


Yes. Airlines are always tweaking the prices of their seats on each particular flight as the departure date looms and the plane starts to look full or empty. The auction may generate a bit of extra information, and therefore a bit of extra cash, but it seems marginal. My theory is that airlines want to differentiate between people who insist on flying business class and people who are willing to take a chance that they will not. A good way to offer different prices to these people is to sell some seats upfront and others in a more opaque auction.



Still, you’re making auctions sound rather passé.


This idea seems passé, but auctions certainly aren’t. The Bank of England uses auctions to determine how to inject liquidity into the banking system, for instance. This is a challenge because the problem is multidimensional: the BoE could offer loans backed by all sorts of different collateral, but would rather lend against safe collateral than riskier stuff. The potential interest rates at which the loans might be made depend on how desperate the system as a whole is for liquidity, but the spread between loans on safe collateral and loans on dodgy collateral should also vary depending on demand. Paul Klemperer, an economist at Oxford university, calls this general problem a “product mix auction” and has figured out a way to make the process run instantaneously, rather than dragging out for weeks as government auctions of yesteryear used to do.


It’s a long way from Sotheby’s, though. Or for that matter, a long way from eBay.


Both Sotheby’s and eBay accept proxy bids that come into play only when needed, and that is the key to Prof Klemperer’s scheme. And Sotheby’s could use a product mix auction to sell off a wine cellar, full of cases of similar but not identical wines. But you are right: auctions are becoming automated and a lot of auctions take place without us puny humans knowing. Your electricity supplier may soon be installing a computer that will vary the price of electricity by the second, and if prices are high some of your appliances will drop out of the bidding: the lights will dim, the fridge will allow the temperature to rise a little, the immersion heater will wait for a more propitious moment. An auction is a natural way for the computers to determine who gets priority.
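

A toy sketch of those appliances “bidding” for electricity: each has a reservation price per kWh, and as the spot price rises the least urgent loads drop out. The names and numbers are purely illustrative:

```python
# Assumed maximum price each load will pay, in pounds per kWh.
appliances = {
    "fridge": 0.60,            # will pay a lot to stay cold
    "lights": 0.45,
    "immersion_heater": 0.15,  # happy to wait for cheap power
}

def loads_still_running(price_per_kwh: float) -> list[str]:
    return [name for name, limit in appliances.items() if limit >= price_per_kwh]

for price in (0.10, 0.30, 0.50):
    print(f"at £{price:.2f}/kWh: {loads_still_running(price)}")
# at £0.10 all three run; at £0.30 the heater drops out; at £0.50 only the fridge stays
```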


I, for one, welcome our new silicon masters. Anything else?


You can already auction off your time to the highest bidder through Amazon’s “Mechanical Turk”, an online marketplace for small tasks requiring a bit of human judgment. It’s possible that sort of thing might become more widespread. On which point, I’ll have to leave you. This conversation was pleasant enough but I’ve just received a better offer.


Also published at ft.com.

