
Oct 5, 2023

In this episode we talk about the growth of data use in the media and the potential impact of misinformation on the public’s trust in official statistics.  

Navigating podcast host Miles Fletcher through this minefield are Prof Sir David Spiegelhalter, from the University of Cambridge; Ed Humpherson, Head of the Office for Statistics Regulation; and award-winning data journalist Simon Rogers. 

 

Transcript 

 

MILES FLETCHER 

Welcome again to Statistically Speaking, the official podcast of the UK’s Office for National Statistics, I'm Miles Fletcher. Now we've talked many times before in these podcasts about the rise of data and its impact on our everyday lives. It's all around us of course, and not least in the media we consume every day. But what or who to trust: mainstream media, public figures and national institutions like the ONS, or those random strangers bearing gifts of facts and figures in our social media feeds? 

To help us step carefully through the minefields of misinformation and on, we hope, to the terra firma of reliable statistical communication, we have three interesting and distinguished voices, each with a different perspective. Professor Sir David Spiegelhalter is a well-known voice to UK listeners. He's chair of the Winton Centre for Risk and Evidence Communication at the University of Cambridge and was a very prominent voice on the interpretation of public health data here during the COVID pandemic. Also, we have Ed Humpherson, Director General for Regulation and Head of the Office for Statistics Regulation (OSR), the official stats watchdog if you like. And later in this podcast, I'll be joined by award-winning data journalist and writer Simon Rogers, who now works as data editor at Google. 

Professor, you've been one of the most prominent voices these last few years – a fascinating few years, obviously, for statistics, in which we were told, quite frankly, that this was a golden age for statistics and data. I mean, reflecting on your personal experience as a prominent public voice in that debate, when it comes to statistics and data, to be very general, how well-informed are we now as a public – or indeed, how ill-informed – on statistics?  

 

DAVID SPIEGELHALTER 

I think things have improved after COVID. You know, for a couple of years we saw nothing but numbers and graphs on the news and in the newspapers and everywhere, and that went down very well. People didn't object to that. In fact, they wanted more. And I think that has led to an increased profile for data journalism, and there are some brilliant ones out there – I'm just thinking of John Burn-Murdoch on the FT, but lots of others as well, who do really good work. Of course, in the mainstream media there is still the problem of non-specialists getting hold of data and getting it wrong, and dreadful clickbait headlines. It is the sub-editors that wreck it all, just by sticking some headline on what might be a decent story to get the attention, and which is quite often misleading. So that's a standard problem. In social media, yeah, during COVID and afterwards, there are people I follow who you might consider as – I wouldn't say amateurs at all, but they're not professional pundits or media people – who just do brilliant stuff, and who I've learned so much from. There are also some terrible people out there, and widespread misinformation claims which are based on data and sound convincing because they have got numbers in them. And that, I mean, it's not a new problem, but now it is widespread, and it's really tricky to counter and deal with, but very important indeed.  

 

MF 

So, the headline issue aside – those of us who deal with the media have heard it a hundred times: "I don't write the headlines", reporters will tell you when you challenge that kind of misleading headline – would you say it's the mainstream media then, because they can be called out on what they report, who broadly get things right? And that the challenge is everything else – out there in the Wild West of social media?  

 

DS 

Yeah, mainstream media is not too bad, partly because, you know, we've got the BBC in this country, we’ve got regulations, and so it's not too bad. And social media, it's the Wild West. You know, there are people who really revel in using numbers and data to make inappropriate and misleading claims.  

 

MF 

Is there anything that can be done? Is it the government, or those of us like the ONS who produce statistics – should we be wading in more than we do? Should we be getting out there onto the social media platforms and putting people right?  

 

DS 

It's difficult. I mean, I don't believe in censorship – I don't think you can stop this at source at all. But just because people can say this, it doesn't give them a right for it to be broadcast widely and dumped into people's feeds. And so my main problem is with the recommendation algorithms of social media, where people will see things because it's getting clicks and the algorithm thinks a person will like it. And so we just get fed all this stuff. That is my real problem, and the obscurity and the lack of accountability of recommendation algorithms right across social media is, I think, a really shocking state of affairs. Of course – and we come on to this later – we should be doing something about education, and actually pre-empting some of the misunderstandings is something I feel very strongly about, with my colleagues. You've got to get in there quick, rather than being on the back foot and just reacting to false claims that have been made; you've got to realise how to take the initiative, to realise what misunderstandings and misinterpretations can be made, and get in there quickly to try to pre-empt them. But that of course comes down to the whole business of how ONS and others communicate their data.  

 

MF 

Because when you ask the public whether they trust ONS statistics – and the UK Statistics Authority does this every two years – a large proportion of them say they do. But of course, if they're not being presented with those statistics, then they're still going to end up being misled.  

 

DS 

Yeah, I mean, it's nice to get those responses back. But, you know, that's in terms of respondents being asked a simple question: do you trust something or not? I think it's good to hear, but we can't be complacent about that at all. I'm massively influenced by the approach of the philosopher Baroness Onora O'Neill, who makes a sharp distinction between organisations wanting to be trusted and revelling in being trusted – she says that being trusted shouldn't be your objective. Your objective should be to be trustworthy, to deserve trust, and then it might be offered up to you. And so the crucial thing is trustworthiness of the statistics system and of its communications, and that's what I love talking about, because I think it's absolutely important and it puts the responsibility really firmly back on the communicator to demonstrate trustworthiness.  

 

MF 

So stats producers should be doing more to actively promote data – perhaps drawing people away from the social platforms – and to have their own websites that present data in an accessible, understandable way, where people can get it for nothing, without the expensive subscription that some of the best media outlets would require.  

 

DS 

The other thing I'd say is there's no point in being trustworthy if you're dull, as no one's going to look at it or take any notice, and other media aren't going to use it. So I think it's really worthwhile to invest, to make a lot of effort to make what you're putting out there as attractive, as vivid and as grabbing as possible. The problem is that in trying to do that – and that's what a lot of communicators and media people want to do, because of course they want people to read their stuff – what that tends to do, largely, is make their stuff opinionated, with a very strong line, essentially to persuade you to do something or think something or buy something or vote for something. So much communication has to do with persuading, and I think that's just completely inappropriate in this context. What we should be doing is informing people. 

 

In a way we want to persuade them to take notice, so that's why you want to have really good quality communications – vivid, with good people out there. But in the end, they're just trying to inform people, and that's why I love working with ONS. I just think this is a really decent organisation whose job – obviously to provide official statistics – but in its communications, is to try to raise the level of awareness and the level of discussion. And by being part of a non-ministerial department, the comms department is not there to make the minister look good, or to make anyone look good. It's just there to tell people how it is. 

 

MF 

Exactly. To put that data into context. Is this a big number or is this a small number, right? Adjectives can sometimes be very unhelpful, but often the numbers don't speak for themselves, do they? 

 

DS  

Numbers never speak for themselves; we imbue them with meaning – which is a great quote, as well, from Nate Silver. 

 

MF 

And in doing that, of course, you have to walk the same line that the media do, in making them relevant and putting them into context, but not at the same time distorting them. There's been a big debate going on recently, of course, about revisions. And if you've listened to this podcast, which we'd always advise, and consumed other articles that the ONS has published, you'll know we've said a lot about the whole process of revising GDP, and the uncertainty that's built into those initial estimates, which, although helpful, are going to be pretty broad. And then of course, when the picture changes dramatically, people are kind of entitled to say: oh, hang on, you told us this was something different and the narrative has changed. The story has changed because of that uncertainty in the numbers – shouldn't you have done more to tell us about that uncertainty? That message can sometimes get lost, can't it?  

 

DS 

Yeah, it's terribly important. You've got to be upfront. We developed these five points on trustworthy communication, and the first one was inform, not persuade. The second is to be balanced and not to have a one-sided message – to tell both sides of the story, winners and losers, positives and negatives. And then to admit uncertainty, to just say what you don't know. And in particular, in this case, "provisionality" – the fact that things may change in the future – is incredibly important to emphasise, and I think not part of a lot of discussion. Politicians find it kind of impossible, I think, to say that things are provisional and to talk about the quality of the evidence and the limitations in the evidence – if you're only basing GDP on limited returns to start with, on the monthly figures, then you need to be clear about that. And the other one is to pre-empt the misunderstandings, and again, that means getting in there first to tell people: this may change, this is a provisional judgement. And, you know, I think that could be emphasised yet more.  

 

MF 

And yet there's a risk in that though, of course the message gets lost and diluted and the... 

 

DS 

Oh no, it always gets trotted out – oh, we can't admit uncertainty, we can't tell both sides of the story, we have to tell a message that is simple because people are too stupid to understand it otherwise. It's so insulting to the audience. I really feel a lot of media people do not respect their audience. They treat them as children – oh, we've got to keep it simple, we mustn't give the nuances or the complexity. All right, if you're going to be boring and just put long paragraphs of caveats on everything, no one is going to read that or take any notice of them. But there are ways to communicate balance and uncertainty and limitations without being dull. And that's what media people should actually focus on, instead of saying, oh, we can't do that. You should be able to do it. Good media, good storytelling should be able to have that nuance in. You know, that's the skill.  

 

MF 

You're absolutely right, you can't disagree with any of that, and yet, in communicating with the public, even as a statistics producer, you are limited somewhat by the public's ability to get used to certain content. I mean, for example, the Met Office, a couple of years back, started putting in the percentage chance of rainfall, which is something it hadn't done before. And some work on that revealed just how few people actually understood what it was saying, and what the chances actually were of it raining when they went out for the afternoon.  

 

DS 

Absolute nonsense. Sorry, but I completely rely on those percentages. My 90-year-old father used to understand those percentages. Because it's a novelty, if you ask people what they understand, they might say something wrong, such as: oh, that's the percentage of the area that it's going to rain in, or something like that. No, it's the percentage of times it makes that claim that it's right. And those percentages have been used in America for years; they're completely part of routine forecasts, and I wouldn't say the American public is enormously better educated than the British public. So this is just reluctance and conservatism. It's like saying, oh well, people don't understand graphs, we can't put up line graphs on the news, people don't understand that. This is contempt for the public. And it just shows, I think, a reluctance to make an effort to explain things. And people get used to stuff: once they've learned what a graph looks like, when they see it again they'll understand it. So you need to educate the public – not, you know, in a patronising way – because otherwise you're just being misleading. If you just say, oh, you know, it'll rain or not rain, you're misleading them. If you just say it might rain, that's misleading. What does that mean? It can mean different things. I want a percentage, and people do understand them, when they've got some experience of them.  

 

MF 

And what about certainty in estimates? Here is a reaction we had to the migration figures that ONS published earlier in the summer. Somebody tweeted back to say: well, estimates, that's all very good, but I want the actual figures. I want to know how many people have migrated. 

 

DS 

Yeah, I think actually it's quite a reasonable question. Because, you know, you kind of think: well, can't you count them, we actually know who comes in and out of the country. In that case it's really quite a reasonable question to ask – I want to know why you can't count them. And in fact, of course, ONS is moving towards counting them; it's moving away from the survey towards using administrative data to count them. So I think in that case, that's quite a good question to ask. Now, in other situations, it's a stupid question. If someone says: oh, I don't want an estimate of how many people, you know, go and vote one way or do something or other, I want to know how many – well, then you think: don't be daft, we can't go and ask everybody this all the time. So that's a stupid question. So the point is that in certain contexts, asking whether something is an estimate or not is reasonable. Sometimes it's not, and that can be explained, I think, quite reasonably to people.  

 

MF 

And yet, we will still want to be entertained. We also want to have numbers to confirm our own prejudices.  

 

DS 

Yeah, people will always do that. But that's not what the ONS is for, to confirm people's prejudices. People are hopeless at estimating how many, you know, migrants there are, how many people there are, what size ethnic minorities are, and things like that – we know if you ask people these numbers, they're pretty bad at it. But people are bad at estimating all numbers. So no, it's ONS's job to try to explain things, in a vivid way that people will be interested in, particularly when there's an argument about a topic going on – to present the evidence, not one side or the other, but evidence that each side can use. And that's why, with the ONS's migration team – you know, I have a lot of respect for them – when they're changing their format or consulting on it, they go to organisations on both sides. They go to Migration Watch and the Migration Observatory and talk to them about, you know, can they understand what's going on, is this data helping them in their deliberations.  

 

MF 

Now, you mentioned education earlier in the conversation. Do we have a younger generation coming up who are more stats-literate, or does an awful lot more need to be done?  

 

DS 

A lot more needs to be done in terms of data education in schools. I'm actually part of a group at the Royal Society that is proposing a whole new programme of mathematics and data education, putting the two together within a single framework, because a lot of this isn't particularly maths, and maths is not the right place to teach it. But it still should be an essential part of education: understanding numbers, understanding data, their limitations and their strengths. It uses some numeracy, uses some maths, but it's not part of maths. The problem has always been where that fits in the syllabus, because at the moment it doesn't, particularly. That's something every country is struggling with – we're not unique in that – and I think it's actually essential that it happens. And when the Prime Minister, I think quite reasonably, says people should study mathematics until 18, I hope he doesn't mean mathematics in the sense of the algebra and the geometry that kids get forced to do for GCSE, and which some of them absolutely loathe. That's not really the sort of mathematics that everyone needs. Everyone needs data literacy. Everyone needs that. 

 

MF 

Lies, damned lies and statistics is an old cliché, still robustly wheeled out in the media every time there's some perceived reason to doubt what the statisticians have said. Looking ahead, how optimistic are you? Do you think that one day we might finally see the end of all that?  

 

DS 

Well, my eyes always go to heaven and I just say, for goodness sake. But I like it when it's used, because I say: do you really believe that? You know, do you really believe that? Because if you do, you're just rejecting evidence out of hand, and that is utter stupidity – nobody could live like that. And it emphasises this idea, somehow, among the more non-data-literate; it encourages them to think that numbers they hear either have to be accepted as God-given truths or rejected out of hand. And this is a terrible state to be in. The point is we should interpret any number we hear, any claim based on data, the same as we'd interpret any other claim made by anybody about anything. We've got to judge it on its merits at the time, and that includes: do we trust the source? Do I understand how this is being explained to me? What am I not being told? Why is this person telling me this? All of that comes into interpreting numbers as well. We hear this all the time on programmes like More or Less, and so on. So I like it as a phrase because it is so utterly stupid, and so utterly easily demolished, that it encourages, you know, a healthy debate. 

 

MF 

That cliché is certainly not talking about good statistics – quality statistics, properly used. And that, of course, is the territory of the statistics watchdog, as we're obliged to call him, or certainly as the media always call him, and that's our other guest, Ed Humpherson. 

  

Ed, having listened to what the professor had to say there, from your perspective, how much misuse of statistics is there out there? What does your organisation, your office, do to try and combat that?  

  

ED HUMPHERSON 

  

Well, Miles, the first thing to say is I wish I could give you a really juicy point of disagreement with David to set off some kind of sparky dialogue. Unfortunately, almost everything, if not everything, that David said I completely agree with – he said it more fluently and more directly than I would, but I think we are two fellow travellers on all of these issues.  

  

In terms of the way we look at things at the Office for Statistics Regulation, which I head up, we are a statistics watchdog – that's how we are reported. Most of our work is, so to speak, below the visible waterline: we do lots and lots of work assessing and reviewing the production of statistics across the UK public sector. We require organisations like the ONS, but also many other government departments, to demonstrate their trustworthiness, to explain their quality, and to deliver value. And a lot of that work just goes on, week in, week out, year in, year out, to support and drive up the evidence base that's available to the British public. I think what you're referring to is that if we care about the value and the worth of statistics in public life, we can't just sit behind the scenes and make sure there's a steady flow. We actually have to step up and defend statistics when they are being misused, because it's very toxic, I think, to the public's confidence in statistics if they're subjected to rampant misuse or mis-explanation. It's all very well having good statistics, but if they go out into the world and they get garbled or misquoted, that I think is very destructive. So what we do is we either have members of the public raise cases with us when they see something and they're not sure about it, or indeed we spot things ourselves, and we will get in contact with the relevant department and want to understand why this thing has been said and whether it really is consistent with the underlying evidence – often it isn't – and then we make an intervention to correct the situation. And we are busy, right – there's a lot of demand for that work. 

  

MF 

Are instances of statistical misuse on the rise? 

  

EH 

We recently published our annual summary of what we call casework – that's handling the individual situations where people are concerned. And we revealed in that that we had our highest ever number of cases, 372, which might imply that, you know, things are getting worse. I'd really strongly caution against that interpretation. I think what that increase is telling you is two other things. One is that, as we at the Office for Statistics Regulation do our work, we are gradually growing our profile and more people are aware that they can come to us – that's the first thing it's telling you. And the second thing is that people care a lot more about statistics and data now, exactly as Sir David was saying about the raised profile during the pandemic. I don't think it's a sign that there's more misuse per se. The thing I would be willing to accept is that there's just a generally greater tendency for communication to be datafied – in other words, for communication to want to use data: it sounds authoritative, it sounds convincing. And I think that may be driving more instances of people saying: well, a number has been used there, I want to really understand what that number is. So I would be slightly cautious about saying there is more misuse, but I would be confident in saying there's probably a greater desire to use data, and therefore a greater awareness both of the opportunity to complain to us and of its importance.  

  

MF 

Underlying all of your work is compliance with the UK Code of Practice for Statistics – a very important document, and one that we haven't actually mentioned in this podcast so far…  

  

EH 

Shame on you, Miles, shame on you.  

  

MF 

We're here to put that right, immediately. Tell us what the Code of Practice is. What is it for? What does it do?  

  

EH 

So the Code of Practice is a statutory code, and its purpose is to ensure that statistics serve the public good. And it does that through a very simple structure. It says that in any situation where an individual or an organisation is providing information to an audience, there are three things going on. There's the trustworthiness of the speaker, and the Code sets out lots of requirements on organisations as to how they can demonstrate their trustworthiness. It's exactly in line with what David was saying earlier, and exactly in line with the thinking of Onora O'Neill – a set of commitments which demonstrate trustworthiness. A really simple commitment is to say: we will pre-announce at least four weeks in advance when the statistics are going to be released, and we will release them at the time that we say, so there is no risk of any political interference in when the news comes out. It comes out at the time that has been pre-announced. A very clear commitment, very tangible, evidence-based. It's a binary thing, right? You either do that or you do not, and if you do not, you're not being trustworthy. The second thing, in any situation where people are exchanging information, is the information itself. What's its quality? Where's this data from? How's it been compiled? What are its strengths and limitations? And the Code has requirements on all of those areas: clarity about what the numbers are, what they mean, and what they don't mean. And then thirdly, in that exchange of information, is the information of any use to the audience? It could be high quality, it could be very trustworthy, but – to use David's excellent phrase – it could just be dull. It could be irrelevant; it could be unimportant. The value pillar is all about that: the user having relevant, insightful information on a question that they care about. That's what the Code of Practice is, Miles: it's trustworthiness, it's quality and it's value. And those things we think are pretty universal, actually, which is why they don't just apply to official statistics. We take them out and apply them to all sorts of situations where Ministers and Departments are using numbers – we always want to ask those three questions. Is it trustworthy? Is it quality? Is it value? That's the Code. 

  

MF 

And when they've satisfied your stringent requirements and been certified as good quality, there is of course a badge to tell the users that they have been.  

  

EH 

There's a badge – the badge means that we have accredited them as complying with that Code of Practice. It's called the National Statistics badge. The term is less important than what it means, and what it means is that we have independently assessed that they comply in full with that Code. 

  

MF 

Most people, if they have heard of the OSR's work at all, will have seen it in the media. They'll have seen you as the so-called data watchdog, the statistics watchdog. It's never gently "explained"; it's usually "slammed" or "criticised", despite the extremely measured and calm language you use. But you're seen as being the body that takes politicians to task. Is that really what you do? It seems more often that you're sort of gently helping people to be right.  

  

EH 

That's exactly right. I mean, it's not unhelpful, frankly, that there's a degree of respect for the role and that when we do make statements, they are taken seriously and seen as significant, but we are absolutely not trying to generate those headlines. We are absolutely not trying to intimidate or scare or, you know, browbeat people. Our role is very simple. Something has been said which is not consistent with the underlying evidence, and we want to make that clear publicly. And a lot of the time what our intervention does, actually, is strengthen the hand of the analysts in government departments, so that their advice is taken more seriously at the point when things are being communicated. Now, as I say, it's not unwelcome sometimes that our interventions do get reported on. But I always try and make these interventions in a very constructive and measured way, because the goal is not column inches. Absolutely not. The goal is the change in the information that's available to the public.  

  

MF 

You're in the business of correcting the record and not giving people a public shaming.  

  

EH 

Exactly, exactly. And even correcting the record – actually, there's some quite interesting stuff about whether parliamentarians correct the record. In some ways, it'd be great if parliamentarians corrected the record when they have been shown to have misstated statistics. But you could end up in a world where people correct the record in a sort of tokenistic way – it's, you know, buried in the depths of the Hansard parliamentary report. What we want is for people not to be misled; for people not to think that, for example, the number of people in employment is different from what it actually is. So it's the outcome that really matters most: not so much the correction, as whether people are left understanding what the numbers actually say. 

  

MF 

Surveys show - I should be careful using that phrase, you know - but surveys, including the UKSA survey, show that the public were much less inclined to trust, in the words of the survey, politicians' use of statistics. And indeed, Chris Bryant, the Labour MP, said that politicians who've been found to have erred statistically should be forced to apologise to Parliament. Did you take that on board? Is there much in that? 

  

EH 

When he said that, he was actually directly quoting instances we've been involved with, and he talked about our role very directly in that sense. So yeah, absolutely, we support that – it would be really, really good. I think the point about the correction, Miles, is that it's a manifestation of a culture that takes fidelity to the evidence – truthfulness to the evidence, faithfulness to the evidence – seriously. As I say, what I don't want to get into is a world where, you know, corrections are sort of tokenistic and buried. I think the key thing is that it's part of an environment in which all actors in public debate realise it's in everybody's interests for evidence, data and statistics to be used fairly and appropriately, and part of that is that if they've misspoken, they correct the record. From our experience, by and large, when we deal with these issues, the politicians concerned want to get it right. What they want to do is communicate their policy vision, their idea of the policy or of what the state of the country is. They want to communicate that – sure, that's their job as politicians – but they don't want to do so in a way that is demonstrably not consistent with the underlying evidence. And in almost all cases they are, I wouldn't say grateful, but respectful of the need to get it right, and respectful of the intervention. And very often the things that we encounter are the result more of a cock-up than a conspiracy, really – something wasn't signed off by the right person in the right place, and a particular number gets blown out of proportion, ripped from its context, and becomes sort of weaponised; it's not really a deliberate attempt to mislead. Now, there are probably some exceptions to that generally positive picture I'm giving, but overall it's not really in their interests for the story to be about how they misused the numbers. That's not a very good look for them. They'd much rather the story be about what they're trying to persuade the public of, and staying on the right side of all the principles we set out helps that to happen.  

  

MF 

Your remit runs across the relatively controlled world of government, Parliament and so forth, and I think the UK is quite unusual in having a body that does this in an independent sort of way. Do you think the public expects you to be active in other areas? We mentioned earlier, you know, the wilder shores of social media, where it's not cock-up theories you're going to be hearing, it's conspiracy theories based on misuse of data. Is there any role that a statistics regulator could possibly take on in that arena?  

  

EH 

Absolutely. So I mentioned earlier that the way we often get triggered into this environment is when members of the public raise things with us. And I always think that's quite a solemn sort of responsibility. You know, you have a member of the public who's concerned about something and they care about it enough to contact us – use the "raise a concern" part of our website – so I always try and take it seriously. And sometimes they're complaining about something which isn't actually an official statistic. In those circumstances, even if we say to them, well, this isn't really an official statistic, we will say: but, applying our principles, this would be our judgement. Because I think we owe it to those people who've taken the time to care about a statistical usage, we owe it to take them seriously. And we have stepped in. Only recently we were looking at some claims about the impact of gambling, which are not from a government department but from parts of the gambling industry. We also look at things from local government, which is not part of central government. So we do look at those things, Miles. It's a relatively small part of our work but, as I say, our principles are universal, and you've got to take seriously a situation in which a member of the public is concerned about a piece of evidence.  

 

MF 

Professor Spiegelhalter, what do you make of this regulatory function that the OSR pursues? Are we unusual in the UK in having something along those lines?  

 

DS 

Ed probably knows better than I do, but I haven't heard of anybody else, and I get asked about it when I'm travelling and talking to other people. I should declare a conflict of interest: I'm a Non-Executive Director for the UK Statistics Authority, and I sit on the regulation committee that oversees the way it works. So of course I'm a huge supporter of what they do. And as described, it's a subtle role, because it's not to do with performing, you know, making a big song and dance and grabbing all that attention, but working away just to try to improve the standard of stats in this country. I think we're incredibly fortunate to have such a body. Things are never perfect and there's always room for improvement, of course, but I think we're very lucky to have our statistical system.  

 

MF 

A final thought from you...we’re at a moment in time now where people are anticipating the widespread implementation of AI, artificial intelligence, large language models and all that sort of thing. Threat or opportunity for statistics, or both? 

 

DS 

Oh, my goodness me, it is very difficult to predict. I use GPT a lot in my work, you know, both for research and making inquiries about stuff, and also to help me do coding I'm not very good at. I haven't yet explored GPT-4's capacity for doing automated data analysis, but I want to, and actually I'd welcome it. If it's good – if you can put some data in and it does stuff – that's great. However, I would love to see what guardrails are being put into it to prevent it doing stupid, misleading things. I hope that does become an issue in the future: that if AI is automatically interpreting data, for example, it's actually got some idea of what it's doing. And I don't see that that's impossible. I mean, there are already a lot of guardrails in there about sexist statements, racist statements, violent statements and so on – there's all sorts of protection already in there. Well, can't we have protection against grossly misleading statistical analysis?  

 

MF 

A future role for the statistics watchdog, perhaps? 

 

DS 

Quite possibly. 

 

EH 

Miles, I never turn down suggestions for doing new work. 
 
MF 

So we've heard how statistics are regulated in the UK, and covered the role of the media in communicating data accurately. Now, to give some insight into what that might all look like from a journalist's perspective, it's time to introduce our next guest, all the way from California: award-winning journalist and data editor at Google, Simon Rogers. Simon, welcome to Statistically Speaking. Now, before you took up the role at Google you were at the forefront of something of a data journalism movement here in the UK, responsible for launching and editing The Guardian's data blog. Looking at where we are now, and how things have come on since that period, to what extent do you reckon journalists can offer some kind of solution to online misinterpretation of information? 

 

SIMON ROGERS 

At a time when misinformation is pretty rampant, you need people there who can make sense of the world and help you make sense of the world through data and facts and things that are true, as opposed to things that we feel might be right. There is a kind of battle between the heart and the head out there in the world right now, and there are things that people feel might be right but are completely wrong, and that, I think, is where data journalists can be the solution. Now, having said that, there are people, as we know, who will never believe something, and it doesn't matter – there are people for whom it literally doesn't matter, you can do all the fact checks that you want – and I think that is a bit of a shock for people, this realisation that sometimes it's just not enough. But honestly, there are more data journalists now than before. The European Journalism Centre did a survey earlier this year about the state of data journalism, and there are way more data journalists now than there were the last time they did it. It's just a part of being a reporter now; you don't have to necessarily be identified as a separate data journalist to work with data. So we're definitely living in a world where there are more people doing this really important work, but the need, I would say, has never been greater.  

 

MF 

How do you think data journalists then tend to see their role? Is it simply a mission to explain, or do some of them see it as their role to actually prove some theories and vindicate a viewpoint? Or is it a mixture – are there different types of data journalist?  

 

SR 

I would say there are as many types of data journalist as there are types of journalist. And that's the thing about the field – there's no standard form of data journalism, which is one of the things that I love about it: your output at the end of the day can be anything. It can be a podcast, or an article, or a number, or something on social media. And because of that kind of variety, and the fact that, unlike almost any other role in the newsroom, there really isn't a standard pattern to becoming a data journalist, what you get are very different kinds of motivations among very different kinds of people. I mean, for me personally, the thing that interested me when I started working in the field was the idea of understanding and explaining. That is my childhood, with Richard Scarry books and Dorling Kindersley – you know, trying to understand the world a little bit better. I do think sometimes people have theories. Sometimes people come in from very sophisticated statistical backgrounds. My background certainly wasn't that, and I would say a lot of the work – the stats and the way that we use data – isn't necessarily that complicated. It's often things like: is this thing bigger than that thing? Has this thing grown? Where in the world is this thing the biggest? And so on. But you can tell amazing stories that way. There are still those people who get put off by maths in the same way that I did when I was at school, you know, but the motivation to try and make things clear for people – that definitely seems to me to be a common thread among most of the data journalists that I've met. 

 

MF 

Do you think that journalists, therefore – people going into journalism, and mentioning no names – used to be seen as a bit less numerate as an occupation, perhaps people whose skills tended to be in the verbal domain? Do you think, therefore, that these days you've got to have at least a feel for data and statistics to be credible as a journalist?  

 

SR 

I think it is becoming a basic skill for lots of journalists who wouldn't necessarily consider themselves data journalists. We always said that eventually it is just journalism. And the reason is the sheer number of sources that are now out there – I don't think you can tell a full story unless you take account of those. COVID's a great example of that; you know, here's a story where data journalists, I think, performed incredibly well. Someone like John Burn-Murdoch on the Financial Times, say, where they've got a mission to explain what's going on and make it clear to people at a time when nothing was clear – we didn't really know what was going on down the road, never mind globally. So I think that is becoming a really important part of being a journalist. I mean, I remember one of my first big data stories at the Guardian was around the release of the COINS database – a big spending database from the government – and we had it on the list as a "data story", and people would chuckle, snigger a little bit at the idea that there'd be a story on the front page of the paper about data, which they felt to be weird. I don't think people would be snickering or chuckling about that now; it's just normal. So my feeling is that if you're a reporter now, not being afraid of data and understanding the tools that are there to help you is a basic part of the role, and it's being reflected in the way that journalism schools are working. I teach one semester a year at the San Francisco campus of Medill. There's an introduction to data journalism course, and we get people coming in there from all kinds of backgrounds. Often half the class will put their hands up if you ask whether they're worried about maths or scared of data, but somehow at the end of the course they are all making visualisations and telling data stories, so you know, those concerns can always be overcome.  

 

MF 

I suppose it's not that radical a development really, if you think back, particularly from where we're sitting in the ONS. Of course, many of the biggest news stories outside of COVID have been data-driven. Think only of inflation, for example: the cost of living has been a big running story in this country, and internationally of course, over the last couple of years. Ultimately, that's a data-driven story. People are relying on the statisticians to tell them what the rate of inflation is, confirming of course what they're seeing every day in the shops and when they're spending money.  

 

SR 

Yeah, no, I agree, absolutely. And half of the stories out there are probably about data – people don't realise they're writing about data. However, I think there is a tendency, or there has been in the past, to just believe all data without questioning it, in the way that, as a reporter, you would question a human source and make sure you understood what they were saying. If we gained one thing, and that thing is that reporters would come back to you guys and ask an informed question about the data and dive into it a little bit more, then I think we've gained a lot.  

 

MF 

So this is perhaps what good data journalists are bringing to the table: an ability to actually sort out the good data from the bad data, to use it appropriately, to understand uncertainty, and to understand how the number on the page might not be providing the full picture. 

 

SR  

Absolutely. I think it's that combination of traditional journalistic skills and data that, to me, always makes the strongest storytelling. When you see somebody who knows a story inside out, like a health correspondent who knows everything there is to know about health policy, and then they're telling a human story, perhaps about somebody in that condition, and then they've got data to back it up – it's like the near and the far. This idea of the near view and the far view, and journalism being the thing that brings those two together. There's the view from 30,000 feet that the data gives you, and then the individual view that the more qualitative interview – with somebody who is in that situation – gives you. The two things together: that's incredibly powerful. 

 

MF 

And when choosing the data you use for a story I guess it’s about making sound judgements – you know, basic questions like “is this a big number?”, “is this an important number?”  

 

SR 

Yeah, a billion pounds sounds like a lot of money, but people need to know how much a billion pounds really is – or is it more like a rounding error for the government? 

 

MF 

Yes, and you still see – outside of data journalism, I stress – news organisations making much of percentage increases, or what looks like a significant increase in something that's pretty rare to start with.  

 

SR 

Yeah, it's all relative. Understanding what something means in relative terms, without having to give people a maths lesson, I think is important.  

 

MF 

So, talking about supply – the availability of data journalism – where do people go to find good data journalism, perhaps without having to subscribe? You know, some of the publications that do it best are, after all, behind paywalls. Where do we find the good stuff that's freely available? 

 

SR 

If I was looking from scratch for the best data journalism, I think there are lots of places you can find it without having to subscribe to every service. Obviously, you now have the traditional big organisations like the Guardian, the New York Times, and Der Spiegel in Germany, but there is a tonne of data journalism now happening in other countries around the world, which I work on supporting through the Sigma Data Journalism Awards. Over half of those entries come from small one- or two-person units, you know, practising their data journalism in countries where it's a lot more difficult to do than it is in the UK. For example, Texty in Ukraine, which is a Ukrainian data journalism site – they're in the middle of a war zone right now and they're producing data journalism. In fact, Anatoly Barranco, their data editor, is literally in the army and on the frontline, but he's also producing data journalism, and they produce incredible visualisations. They've used AI in interesting ways to analyse propaganda and social media posts and so on. So the stuff happening everywhere is not just limited to those big names behind paywalls. And what you do find also, often around big stories like what happened with COVID, is that people will put their work outside of the paywall. But yeah, data is an attraction – I think visualisation is an attraction for readers – so I'm not surprised people try and monetise that, but there is enough going on out there in the world. 

 

MF 

And all that acknowledged, could the producers of statistics, like the ONS and similar bodies around the world – could we be doing more to make sure that people using data in this way have it in forms that can readily be interpreted? Is there more that we can do?  

 

SR 

I mean, there was the EJC survey that I mentioned earlier – it's definitely worth checking out, because one thing it shows is that 57% of data journalists say that getting access to data is still their biggest challenge, followed by things like lack of resources and time pressure. PDFs are still an issue out there in the world. There are two things to this for me. On one side it's: how do I use the data, help me understand what I'm looking at. On the other side is that access – so, you know, having more APIs and easy downloads, things that are not formatted to look pretty but formatted for use. Those kinds of things are still really important. I would say the ONS has made tremendous strides, certainly since I was working in the UK, on accessibility to data, and that's notable, and I've seen the same thing with gov.us here in the States. 

 

MF 

Well it’s good to hear the way the ONS has been moving in the right direction. Certainly I think we've been tough on PDFs. 
 
SR 
Yes, and to me it's noticeable – you've obviously made a deliberate decision to do that, which is great. That makes the data more useful, right, and more helpful for people.  

 

MF 

Yes, and at the other end of the chain, what about the big publishers and web platforms – particularly, well, you're at Google currently, but generally – what can these big platforms do to promote good data journalism and combat misinformation? I mean, big question there. 

 

SR 

Obviously, I work with Google Trends data, which is probably the world's biggest publicly available dataset. I think a big company like Google has a responsibility to make this data public, and the fact that it is – that you can download reusable datasets – is incredibly powerful. I'm very proud to work on that. I think that all companies have a responsibility to be transparent, especially when you have a unique dataset that didn't exist 20 years earlier, and it's there now, and it can tell you something about how the world works. For instance, I've mentioned COVID before, but it's such a big event in our recent history – how people were searching around COVID is incredibly fascinating, and it was important information to get out there, especially at a time when the official data is always going to be behind what's actually happening. And is there a way you can use that data to predict stuff, predict where cases are going to come up... We work with this data every day and we're still just scratching the surface of what's possible with it. 

 

MF  

And when it comes to combating misinformation we stand, so we're told, on the threshold of another revolution from artificial intelligence, large language models, and so forth. How do you see that future? Is AI friend, foe, or both?  

 

SR 

I work for a company that is a significant player in the AI area, so I give you that background. But I think in the field of data, we've seen a lot of data users use AI to really help produce incredible work, where instead of having to read through a million documents, they can get the system to do it for them and pull out stories. Yeah, like any other tool, it can be anything, but the potential to help journalists do their jobs better, and for good, I think is pretty high. I'm going to be optimistic and hope that that's the way things go. 

 

MF 

Looking optimistically to the future then, thank you very much Simon for joining us. And thanks also to my other guests, Professor Sir David Spiegelhalter and Ed Humpherson. Taking their advice on board then, when we hear or read about data through the news or experience it on social media, perhaps we should first always ask ourselves – do we trust the source? Good advice indeed.  

 

You can subscribe to new episodes of this podcast on Spotify, Apple Podcasts, and all the other major podcast platforms. You can also get more information, or ask us a question, by following @ONSFocus on X, or Twitter – take your pick. I'm Miles Fletcher; from myself and our producer Steve Milne, thanks for listening. 
 
ENDS