AI relies on quality data – and the more of it, the better. But that can be easier said than done now that recent privacy regulations like the General Data Protection Regulation (GDPR) require organizations to mask personal data. GDPR is a European regulation developed to protect privacy and make data collection and management more transparent and secure.
While GDPR is a European regulation, the U.S. may well be following suit. Several U.S. states have already introduced legislation to expand data privacy rules, and more states are expected to join over the coming months.
During a European Union privacy conference, Apple’s Tim Cook recently issued a call to action for U.S.-wide data-protection regulation, saying individuals’ personal information has been “weaponized.” According to the same article, U.S. Senator Mark Warner said he was encouraged by Microsoft, Apple and others’ support of regulation. He said, “Too often we’ve heard companies whose business models depend on intrusive and opaque collection of user data claim that any changes to the status quo will radically undermine American innovation, but Apple and others demonstrate that innovation doesn’t have to be a race to the bottom when it comes to data protection and user rights.”
But how will GDPR really impact AI? What happens, for example, when specific information needs to be collected on individuals to predict customer behavior, such as who might be likely to purchase a product or upgrade technology? Besides allowing people to request that companies remove their data, GDPR also requires companies to anonymize their data, unless identifying information is crucial to the data’s usefulness.
This is especially true when it comes to AI in healthcare. According to an article, the GDPR introduced a right to explanation, which means that the logic of an automated decision can be challenged and the results tested – so businesses will need to think carefully before building an AI solution that cannot explain itself. Where required by GDPR, privacy impact assessments will be needed and privacy will take on even more urgency.
To conform to data privacy requirements, professionals in all industries working with big data need to remove identifying details before processing the information. Similarly, the businesses employing them should provide training, or verify their workers’ knowledge of handling big data, to avoid ethical violations and significant fines.
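As a rough illustration of removing identifying details before processing, here is a minimal Python sketch. The field names (name, email, phone, customer_id) and the pseudonymize helper are hypothetical, chosen only for this example. The sketch drops direct identifiers and replaces the customer ID with a keyed hash, so records can still be linked to each other without exposing the original ID. Pseudonymized data is generally still treated as personal data under GDPR, so this is a first step and not full anonymization.

```python
import hashlib
import hmac

# Hypothetical field names for illustration; adjust to the real schema.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(record: dict, secret_key: bytes, id_field: str = "customer_id") -> dict:
    """Return a copy of `record` with direct identifiers dropped and the
    ID field replaced by a keyed SHA-256 hash (HMAC)."""
    # Keep only the fields that are not direct identifiers.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if id_field in cleaned:
        # A keyed hash keeps the mapping stable across records while making
        # it hard to recover the original ID without the secret key.
        digest = hmac.new(secret_key, str(cleaned[id_field]).encode("utf-8"), hashlib.sha256)
        cleaned[id_field] = digest.hexdigest()
    return cleaned


if __name__ == "__main__":
    raw = {
        "customer_id": 1042,
        "name": "Jane Doe",
        "email": "jane@example.com",
        "phone": "555-0100",
        "plan": "premium",
        "monthly_spend": 79.0,
    }
    print(pseudonymize(raw, secret_key=b"replace-with-a-real-secret"))
```

The secret key would need to be stored separately from the data, since anyone holding both could re-link the hashed IDs to known customers.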
Maybe it’s not all bad news for AI innovation
There is a lot of speculation out there about what privacy regulations will mean for AI, but ironically, the regulations that some feel are stifling innovation may actually be forcing businesses to get their data houses in order. Consider that one of the main barriers to AI is data collection, and many companies don’t know what type of data they have, let alone where it is or what shape it is in. It’s common for different departments or business units to have their own data silos, leaving everyone to guess about what information is available in their organization.
Now, GDPR and other regulations are forcing companies to take a long hard look at their data and get their houses in order. They have to think critically about their data, what type they are storing, and what rules they are implementing to protect it. They need to find out where the data is, what it contains, how it is being used and ensure that the quality is good. Accomplishing all of these things and unlocking the data that has been stuck inside an organization is critical for any AI effort.
There’s always another way
The market is finding a way to unlock all of this hidden data – and to do so securely. Technology like synthetic data generation lets companies work with data that mirrors the patterns in personal information without identifying any individuals. Apple uses differential privacy to gather information on a group of users without using individual information, and Google offers a data loss prevention technology that strips personal information from databases.
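To give a sense of how differential privacy can report on a group without revealing any one person, here is a minimal Python sketch of the textbook Laplace mechanism applied to a count query. It is a simplified illustration and not a description of Apple's or Google's actual systems. The function name private_count and the sample data are assumptions made for the example.

```python
import random


def private_count(values, predicate, epsilon: float, rng: random.Random = None) -> float:
    """Return a noisy count of items matching `predicate`.

    Adds Laplace noise with scale 1/epsilon. A count changes by at most 1
    when one person is added or removed, so its sensitivity is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    rate = epsilon  # rate = 1 / scale, with scale = sensitivity / epsilon = 1 / epsilon
    # The difference of two independent exponential draws with the same rate
    # follows a Laplace distribution centered at zero with scale 1 / rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


if __name__ == "__main__":
    # Hypothetical data: whether each user upgraded their device this year.
    upgraded = [True, False, True, True, False, False, True, False]
    print(private_count(upgraded, lambda u: u, epsilon=0.5))
```

A smaller epsilon adds more noise and gives stronger privacy; a larger epsilon gives a more accurate count. Choosing the value is a policy decision as much as a technical one.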
Vendors are continuing to provide incentives for individuals to share their data. While that’s nothing new – it’s been around since loyalty cards were introduced – they are finding new and more attractive ways to encourage customers to opt in with their information.
The retail and e-commerce industries, in particular, will have a hard time adjusting to new privacy regulations because they have never had to comply with rules like these before. They can learn from industries like healthcare and financial services, which have grappled with these data privacy issues for a while and have figured out how to manage data privacy effectively and still mine the data they need. For example, healthcare organizations know how to share information in files while masking personal data.
How personal do you want to get?
When it comes down to it, how often do you really need to know about a specific customer? To determine what products to offer and the order in which to display website content, for example, companies need to know about patterns of behavior, and in some cases where individuals are located, but very specific and personalized data may not necessarily be needed. After all, data is needed to train algorithms, not to expose specific individuals. While AI innovation is being fueled to new heights thanks to solid, quality data, individual privacy need not be sacrificed in the process.