It could be a group of five or a group of 50, but if you asked a group of experts for a definition of big data, you’d be hard pressed to get a clear-cut answer. What’s clear is that whatever big data is, lines need to be drawn that shape how it impacts the public, industry and government.
How to measure this impact was the basis for the Federal Trade Commission’s big data workshop, which brought business leaders, academics and consumer advocates together Monday to discuss whether big data is helping or harming consumers.
Pamela Dixon, the founder and executive director of the World Privacy Forum, said she could find examples of big data both offering help and causing harm, but that it's difficult to build policies off either side of the argument due to a lack of understanding of what big data actually is.
“Big data is immature,” Dixon said. “There is no firm, scalpel-like definition of big data. Show me an actual legislative definition of it. I know you can’t, because there isn’t one yet. So what do we do with that? We can’t just throw out the existing fairness structures. We need to use the existing fairness structures that we have.”
FTC Commissioner Julie Brill spoke about how those current fairness structures — particularly the Fair Credit Reporting Act — should serve as benchmarks for new regulations aimed at companies that are creating alternative credit scores out of the data they collect.
“The use of new sources of information, including information that goes beyond traditional credit files, to score consumers raises fresh questions about whether these alternate scores may have disparate impacts along racial, ethnic or other lines that the law protects,” Brill said. “Those questions are likely to linger and grow more urgent…until the companies that develop these alternate scores go further to demonstrate that their models do not contain racial, ethnic, or other prohibited biases.”
Bias was a common thread throughout each panel, largely because big data, in practice, divides and sorts people into myriad groups.
“As businesses segment consumers to determine what products are marketed to them, the prices they are charged, and the level of customer service they receive, the worry is that existing disparities will be exacerbated,” said FTC Chairwoman Edith Ramirez. “Is this discrimination? In one sense, yes. By its nature, that is what big data does in the commercial sphere — analyzes vast amounts of information to differentiate among us at lightning speed through a complex and opaque process. But is it unfair, biased, or even illegal discrimination?”
Throughout the day, examples of inclusive and exclusive big data practices were on display. Gene Gsell, a senior vice president of SAS, spoke about how big data has helped serve people who haven’t been able to use banks or secure car loans by traditional means.
“One of the things that is driving change is the ability to process this data, the ability to collect it, the ability to do something with it,” Gsell said. “[Businesses] don’t say they want to discriminate. But we want to be able to predict.”
However, studies conducted by Latanya Sweeney, a professor of government and technology at Harvard University who serves as the FTC’s chief technologist, show that the line between discrimination and predicting consumer behavior is a tenuous one.
Sweeney conducted a study that found web searches for black names were 25 percent more likely than searches for white names to return ads suggesting the person had an arrest record, regardless of whether the person had ever actually been arrested. Another study Sweeney reviewed at the workshop showed ads for harshly criticized credit cards were often directed to the homepage of a popular black fraternity.
Bias aside, Nicol Turner-Lee, vice president of the Minority Media and Telecommunications Council, said everyone needs to be aware that these practices are now a reality for anyone connected to the Internet.
“People have to understand that their data is being used for particular purposes,” she said. “Let’s face it, the Internet is this big buffet of places. It’s not that simple to say ‘I’m going to the Internet for this or for that.’ When you give your email address on the Internet, there is an information service that’s taking that and making algorithms that tailor a search to you.”
Gsell said these tailored searches are not novel but rather an evolution of what businesses have been trying to do for the last century.
“Big data has been around for a long time; today there is just more of it,” Gsell said. “This phenomenon is not something that just came into vogue. The industry gets more credit than what actually exists. Most people are overwhelmed with all of that data.”
Even if most businesses have yet to discover a way to harness all of the data they collect, Dixon argued that collection alone is enough to create bias, and that policy is needed to protect the public from the negative impact that bias could have on their daily lives.
“The moment a person is put into a category or is classified, that triggers a data paradox,” Dixon said. “The bottom line is when you classify an individual you trigger this and when that is triggered, we have to do something about it.”