Federal chief data officers seek information on synthetic data generation
The Federal Chief Data Officers Council is looking for information on synthetic data generation as it works to establish best practices, according to a solicitation posted Thursday.
The General Services Administration posted the request for information for public inspection in the Federal Register. The RFI asks the public for comments on how to define synthetic data generation, as well as on the technology's potential applications, challenges, and ethics.
After the document is officially published Friday, the CDO Council will accept comments for roughly a month.
The National Institute of Standards and Technology defines synthetic data generation as a “process in which seed data are used to create artificial data that have some of the statistical characteristics of the seed data.” Such data can be used to create “larger and more diverse datasets,” improve model performance, and protect privacy, the solicitation said.
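As a rough illustration of that definition, and not a method drawn from the solicitation or from NIST, the Python sketch below fits a normal distribution to a small set of seed values and samples new values from it. The function name `generate_synthetic`, the seed values, and the choice of a normal distribution are all assumptions made for the example; real generators model far more structure and add privacy safeguards.

```python
import random
import statistics

def generate_synthetic(seed_data, n, rng_seed=0):
    """Draw n artificial values from a normal distribution fitted to seed_data.

    Hypothetical helper for illustration only: it preserves just the mean
    and standard deviation of the seed data, nothing else.
    """
    mu = statistics.mean(seed_data)
    sigma = statistics.stdev(seed_data)
    rng = random.Random(rng_seed)  # fixed seed so the run is repeatable
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Made-up seed values standing in for a real, sensitive measurement.
seed_data = [52.0, 61.5, 47.3, 58.9, 66.1, 49.8, 55.4, 63.2]
synthetic = generate_synthetic(seed_data, n=10_000)

# Compare the summary statistics of the seed and synthetic data.
print("mean: ", round(statistics.mean(seed_data), 1), round(statistics.mean(synthetic), 1))
print("stdev:", round(statistics.stdev(seed_data), 1), round(statistics.stdev(synthetic), 1))
```

The synthetic values share the seed data's mean and spread but are newly sampled rather than copied, which is the sense in which such data keeps "some of the statistical characteristics" of the original.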
The council has already determined that the technology has wide-ranging applications and challenges, and that creating a more formal definition for synthetic data generation would benefit the federal government, according to the document.
President Joe Biden’s October executive order on artificial intelligence listed synthetic data generation as an example of a “privacy-enhancing technology.” The process is already used by agencies, such as the Department of Veterans Affairs, which employed synthetic data during the COVID-19 pandemic to protect veteran health information.