How the cloud is helping turn data catalogs into content


GeoPlatform serves as a platform for people to make mapping data come alive.

It’s not enough for people like Jerry Johnston to simply publish the abundance of geospatial data that belongs to the federal government — he wants the public to be able to derive value from it.

That’s where cloud computing comes in.

Johnston, director of the Interior Department’s Information and Technology Management Division, is using Amazon Web Services to turn the government’s geospatial data into robust, functional content for anyone interested in using it. Through his work with the Federal Geographic Data Committee, Johnston has leaned on AWS to support GeoPlatform, a repository where people can find, search for and share various geospatial data sets.

“Using AWS, we built our own platform and software offerings that agencies are using right now,” Johnston said Monday during a session at Amazon Web Services’ Public Sector Summit. “It’s been a huge advancement in making things available.”

Prior to the platform’s launch, Johnston said, agencies struggled with the storage and compute power needed to deliver the data to the public. Now, with the capabilities cloud computing provides, data offices can ship their data sets to the FGDC and leave the infrastructure worries to Amazon.

[Read more: Culture jeopardizes open data’s future beyond Obama]

“Our infrastructure is scaling up and down as we need it,” Johnston said. “For the first time for the infrastructure that we have, we can start to realize this vision of ‘If you can’t host it, we can host it for you.’”

The website not only serves as a repository but also as a platform for users to do more than just post data sets. An ArcGIS system is built into the site, which allows agencies to build maps off their data. There is also a marketplace that shows where various data sets live throughout the federal government and where people can acquire the licenses to work with that data.

The website was built as part of the National Spatial Data Infrastructure, which seeks ways to push geospatial data use across multiple sectors of the economy. While Johnston said GeoPlatform has been a big part of what the FGDC has accomplished over the last 18 months, the committee is still working on other ways to make the data more accessible.

One of the things the FGDC is working on is retooling the metadata associated with the geospatial data sets, since the majority of it isn’t machine readable.

“That’s something that we are suffering from as we try to automate some of the discovery and build better tools that can find data for users who are trying to solve a specific problem,” Johnston said. “The reality is a lot of the metadata came from scientists who are trying to describe their projects and write narratives. It’s not the easiest thing to crawl from an internet standpoint.”
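The gap Johnston describes can be illustrated with a small sketch. The records and field names below are hypothetical, not drawn from any FGDC standard: a free-text narrative description is opaque to a crawler, while a structured record with standard fields can be filtered automatically.

```python
# Hypothetical sketch of narrative vs. machine-readable metadata.
# Field names are illustrative only, not an FGDC or ISO standard.

narrative_record = (
    "This project collected stream-gauge readings across the upper "
    "Colorado basin during the 2014 field season..."
)

structured_record = {
    "title": "Upper Colorado Basin Stream Gauges",
    "keywords": ["hydrology", "stream gauge", "colorado"],
    "bbox": [-110.0, 36.0, -105.5, 41.0],  # west, south, east, north
    "temporal_extent": ["2014-05-01", "2014-09-30"],
}

def matches(record, keyword):
    """A discovery tool can answer this only for structured records."""
    return keyword in record.get("keywords", [])

print(matches(structured_record, "hydrology"))  # True
```

A narrative paragraph like the first record can only be searched by brittle full-text matching; the structured record supports the automated discovery Johnston wants to build.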

[Read more: PDF, HTML files dominate]

Johnston also said the committee is working on data standards for agencies that would give the public a better chance of finding relevant data and cut down on duplicative data sets being submitted to the catalog.

“In truth, some people are really good at that, they’re motivated to do that, they have the budget to do that — others aren’t for whatever reason,” Johnston said, referring to agencies adhering to standards. “What we are trying to do is make it easy.”

But even as Johnston is trying to make GeoPlatform easy to use, getting there will be a complex process.

“If this is really going to be successful, we are going to need to build tools that enable people to take the functionality, content, tools and data available on GeoPlatform and embed them in their own applications,” he said.

