If open data sets aren’t being deleted, is government data still ‘endangered’?
The large-scale deletion of open government datasets feared at the outset of Donald Trump’s presidency has not happened. And yet there remain sneaky ways to undermine openness, panelists agreed Tuesday at an event on endangered data.
“The good news is that it’s not as bad as we feared,” New America’s Denice Ross said, referring to the concern, after the 2016 election, that the new administration would not match President Barack Obama’s commitment to open data. This, Ross suggested, is partly owing to the attention that initiatives like Data Refuge, which preserves datasets on climate change, have brought to the issue.
“[Data Refuge] really raised the nation’s awareness about these valuable data assets that could be at risk,” Ross said at the event held by the Center for Data Innovation and the Information Technology and Innovation Foundation. “And I think the mere act of holding these Data Refuge events around the country itself probably stemmed some of the data loss we might have seen.”
Paul M. Farber, managing director of the Penn Program in Environmental Humanities, agreed that calls to action after the 2016 election to save open data — from Data Refuge and others — stopped the “doomsday scenario” of mass data deletion.
Panelists also noted that although some datasets — such as those on climate change — can be politically charged, open data isn’t really a partisan issue. They cited the bipartisan OPEN Government Data Act, which would require that public data be published in a machine-readable format, as proof.
But deleting entire datasets is not the only way to damage open data, the group acknowledged. An agency could, for example, release data with fewer attributes — still technically available, but less useful. The most recent FBI Crime in the United States report contains close to 70 percent fewer data tables than its 2015 edition did.
An agency could also decrease openness by changing a dataset’s schema, so that the data is no longer comparable across years. Such changes could be intentional, or the result of limited funding and staffing, panelists pointed out. Opening government data takes investment.
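To see why a schema change matters even when nothing is deleted, consider a minimal sketch in Python. The column names and figures below are entirely hypothetical, not drawn from any actual government dataset; the point is only that a renamed field silently breaks year-over-year comparison.

```python
import pandas as pd

# Hypothetical annual releases of the same dataset.
# The 2015 release publishes a "violent_crime" column...
release_2015 = pd.DataFrame({
    "state": ["AZ", "CO"],
    "violent_crime": [410, 321],
})

# ...but the 2016 release renames it "crime_violent" (a schema change).
release_2016 = pd.DataFrame({
    "state": ["AZ", "CO"],
    "crime_violent": [398, 344],
})

# A naive multi-year combination now scatters each year's figures
# into a different column, padded with NaN, instead of one trend line.
combined = pd.concat(
    [release_2015.assign(year=2015), release_2016.assign(year=2016)],
    ignore_index=True,
)
print(combined)
# The data is still "open," but no longer directly comparable across years
# without someone reconciling the two schemas by hand.
```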
The group agreed that storytelling — finding specific individuals who have benefited from open data and showing how it affected them — is key to building support for open data initiatives.
“I mentioned a use case of ranchers requiring the NOAA [National Oceanic and Atmospheric Administration] data in order to apply for federal relief when they have a drought,” Ross said, by way of illustrating the storytelling point. “Find a rancher who used that data to apply for relief and what it meant to that rancher.”
The panel ended with a thought-provoking question from moderator Daniel Castro. Is it possible, he asked, that data could come under attack on the basis of its veracity, much like the “fake news” attacks on the media?
The panelists did not speculate.