Is the data you use to train your AI really as safe as you think?
AI and neural networks are spreading throughout society. This growth is partially driven by the large amounts of data fed to these systems during training. This data is often private and sensitive in nature, such as user data or patient records. The prevailing belief is that once a model has been trained, the training data cannot be accessed through the model alone. But is this really the case?
The fact that AI systems generally react differently to training data than to new data has given rise to a new research field and to several methods that allow a potential attacker to infer information about unknown training data. For example, it is possible to learn about gender biases, check whether a particular image was included in the training set, or even fully recreate the training data.
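To make the idea concrete, here is a minimal, hypothetical sketch of the simplest variant of such an attack, a confidence-threshold membership inference on synthetic data. All model choices, dataset names, and thresholds below are illustrative assumptions, not the NFC project's actual method:

```python
# Minimal sketch of a confidence-threshold membership inference attack.
# Everything here (synthetic data, model size, threshold) is illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a private dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# A deliberately overparameterised classifier playing the role of the victim model.
model = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=1000, random_state=0)
model.fit(X_train, y_train)

def true_label_confidence(model, X, y):
    """Confidence the model assigns to each sample's true label."""
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

# Attack: guess "was in the training set" whenever the model is unusually confident.
threshold = 0.95
flagged_members = true_label_confidence(model, X_train, y_train) > threshold
flagged_outsiders = true_label_confidence(model, X_out, y_out) > threshold

print(f"Flagged as training data: {flagged_members.mean():.0%} of real members, "
      f"{flagged_outsiders.mean():.0%} of unseen samples")
```

Because the overfit model is systematically more confident on samples it was trained on, even this crude threshold separates members from non-members far better than chance, which is exactly the behavioural gap the attacks above exploit.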
The National Forensic Center (NFC) has carried out a research project on recreating training data from image classifiers, and will present the results along with the state-of-the-art techniques.
Speaker: Elliot Gestrin is a student at Linköping University, chairman of the university's AI and robotics association FIA, and has worked for NFC this summer.