Census Bureau moving beyond surveys and censuses with cloud-based data ecosystem
The Census Bureau plans to increase cloud migration to meet growing demand for its data and make better use of nontraditional sources, according to a request for information.
Dubbed the Census Acceleration to Secure Cloud (CASC), the IT modernization effort is in its early stages, with the bureau still assessing the state of the industry and its offerings and planning procurements for both small businesses and full and open competition.
Demand for data at the pandemic’s outset prompted the bureau to create the Household Pulse Survey to fill gaps in the government’s understanding of the social and economic impacts, but now it wants an ecosystem for collection, storage and processing.
“The USCB’s focus as an agency must no longer be simply to field surveys and censuses, and to publish the results, but rather to shift to combining data science with traditional survey methods, elevating and diversifying data products, and placing data at the center of the approach by accelerating to secure native cloud services,” reads the request for information (RFI). “The objective of this initiative is to address challenges and propose new ways in which the USCB will take advantage of native secure cloud services.”
The CASC approach consists of three pillars: technical support reducing the bureau’s on-premises data center footprint over time through cloud migration, secure cloud technical capabilities and services, and technical services assisting the migration of applications and development of new ones in the cloud.
Four initiatives form the foundation of CASC, the first being an enterprise data lake (EDL).
Next is the Frames Program, involving the collocation and linking of datasets within the EDL to do everything from tailoring a survey to answering new questions about jobs and COVID-19 vaccination rates.
“Centralization and ‘linkability’ will increase efficiency, reduce duplicative efforts to maintain and manage data, and greatly expand our capacity to answer critical questions about the population and economy at multiple geographic scales,” reads the RFI. “These linked, augmented and continuously updated datasets will provide a more comprehensive means for maintaining and updating the inventory of our nation’s addresses, jobs, businesses, people and other linked data.”
The third initiative is Data Ingest and Collection for the Enterprise (DICE), a modern platform serving as the entry point for all of the bureau’s data. A foundation for DICE was deployed during 2020 census work, but more needs to be done to enable adaptive survey design and reduce the need for costly updates and system rebuilds.
Last is the Center for Enterprise Dissemination Services and Consumer Innovation (CEDSCI), the primary platform for public data dissemination. The bureau envisions CEDSCI as a means to provide data products quickly and improve the user experience to allow for discovery and new visualizations.
Together the four initiatives will form an integrated system of systems called the Census Operations and Data Ecosystem (CODE).
“CODE will provide myriad data linking capabilities, using secure and confidential data sources, for evidence-building questions like: ‘Did a government business incentive program reduce poverty in selected neighborhoods?’” reads the RFI.
Market research will continue beyond the initial CASC RFI to keep pace with the evolving IT environment. Respondents have until 9 a.m. EST on Oct. 11 to submit comments on the RFI.