Finance meets Artificial Intelligence

Privacy-preserving Natural Language Processing

How can millions of pieces of data be automatically bundled and processed to train algorithms while protecting sensitive content and preserving the privacy of those affected? 
We address this question under the topic of privacy-preserving natural language processing (NLP), that is, AI-based text processing that respects privacy and data protection. A large proportion of user data originates as natural language: search and chatbot queries, call centers, call notes, (automatic) transcriptions of telephone calls, voice assistants, and text-based information such as emails, documents and websites, to name just a few. It is therefore essential to curate NLP datasets that preserve user privacy and to train machine learning models that retain only non-identifying user data.

The main methods and challenges here are:

1. Personal information detection, i.e. how to automatically find those words or phrases in texts that contain personal user information.

2. Privacy-preserving text analysis, i.e. how to integrate differential privacy methods and homomorphic encryption methods into automatic text analysis.

3. Privacy-enhancing technologies, i.e. how to integrate and improve data protection and privacy in current AI methods.
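
To make the first two points more concrete, here is a minimal sketch in Python. It is not a method prescribed by this seminar. It assumes a purely rule-based detector with two hypothetical patterns (PII_PATTERNS), whereas practical systems usually add a trained named-entity recognizer, and it illustrates differential privacy with the standard Laplace mechanism applied to a simple count. All function names (find_pii, redact, laplace_count) are made up for illustration.

```python
import random
import re

# Hypothetical, deliberately simple patterns; real detectors usually combine
# rules like these with a trained named-entity recognizer.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s/-]{6,}\d"),
}


def find_pii(text):
    """Return (label, start, end, matched_text) for every pattern hit."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((label, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])


def redact(text):
    """Replace each detected span with its label, e.g. [EMAIL]."""
    out, last = [], 0
    for label, start, end, _ in find_pii(text):
        if start < last:  # skip hits that overlap an earlier one
            continue
        out.append(text[last:start])
        out.append(f"[{label}]")
        last = end
    out.append(text[last:])
    return "".join(out)


def laplace_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise
```

For example, redact("Mail lily@example.org or call +49 421 1234567.") returns "Mail [EMAIL] or call [PHONE].", and laplace_count(10, epsilon=0.5) returns the count 10 plus random noise. A smaller epsilon means more noise and stronger privacy. Homomorphic encryption needs a dedicated library and is not sketched here.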

 

Host: Prof. Dr.-Ing. Tanja Schultz

Contact person: Lily Meister