German-French Ph.D. Workshop on Secure Big Data
The development of ICT has tremendously changed people's way of living over the past decade. The resulting big data can, on the one hand, help build appealing industrial products; on the other hand, it raises serious concerns about people's security and privacy.
This workshop aims to bring together Ph.D. students from CISPA Helmholtz Center for Information Security (CISPA) and Lorraine Research Laboratory in Computer Science and its Applications (LORIA) to address the security and privacy issues of big data. Participants are expected to present their current research projects, engage in scientific discussion, and establish potential collaborations. The workshop will also invite leading researchers in the field to share their newest results and research experiences.
When: October 24 - 26, 2018
Where: Landhotel Saarschleife in Saarland, Germany
Day 1 (October 24th)
| Time | Program |
|---|---|
| 11:30-12:00 | Welcome and Introduction: Michael Backes and Jean-Yves Marion |
| 13:30-14:50 | Ph.D. Session 1 |
| 15:20-16:40 | Ph.D. Session 2 |
Day 2 (October 25th)
| Time | Program |
|---|---|
| 09:00-10:20 | Ph.D. Session 3 |
| 10:45-12:00 | Keynote Speech (Mirco Musolesi) |
| 13:30-14:45 | Keynote Speech (Emiliano De Cristofaro) |
Day 3 (October 26th)
| Time | Program |
|---|---|
| 09:00-10:15 | Keynote Speech (Sara Hajian) |
Emiliano De Cristofaro
Title: Privacy and Machine Learning: It's Complicated
Abstract: In this talk, we will cover our recent work at the intersection of privacy and machine learning. First, we show how to efficiently support simple unsupervised learning applications that rely on users' data, without invading their privacy. We do so by combining data structures for succinct data representation (such as count-min sketches) with additively homomorphic encryption, showing that the error introduced by the sketches does not affect the accuracy of the model. Then, we turn to generative models -- which are increasingly used to artificially generate plausible samples of various kinds of data, including images, videos, texts, and music. We present a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models, showing that our techniques provide realistic synthetic samples which can also be used to accurately answer an arbitrary number of counting queries. Finally, we analyze privacy in the context of collaborative/federated learning: these approaches allow multiple participants, each with its own training dataset, to build a joint model by training local models and periodically exchanging model parameters or gradient updates. We demonstrate that these updates leak unintended information about the participants' training data, presenting both well-known "membership inference" attacks and "property inference" attacks, in which the adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture.
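For readers unfamiliar with the succinct data structure mentioned in the first part of the abstract, here is a minimal, illustrative count-min sketch in plain Python. This is a generic textbook sketch, not the speaker's construction: the additively homomorphic encryption layer described in the abstract is omitted, and all parameter choices below are arbitrary.

```python
import hashlib

class CountMinSketch:
    """Minimal count-min sketch: approximate frequency counts in
    O(width * depth) space, with one-sided (over-)estimation error."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item, row):
        # Derive `depth` different hash functions by salting with the row index.
        digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._hash(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum across
        # rows upper-bounds the error: estimate >= true count.
        return min(self.table[row][self._hash(item, row)]
                   for row in range(self.depth))

sketch = CountMinSketch()
for word in ["apple", "apple", "banana", "apple"]:
    sketch.add(word)
```

Because each cell holds a plain sum, the update step is exactly the kind of addition that an additively homomorphic scheme can perform over encrypted counters, which is what makes sketches a natural fit for the private aggregation setting the talk describes.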
Bio: Emiliano De Cristofaro is an Associate Professor ("Reader" until recently) in Security and Privacy Enhancing Technologies at University College London (UCL)'s Computer Science Department, where he heads the Information Security Research Group. He is also a Faculty Fellow at the Alan Turing Institute, the national institute for data science and AI. Before joining UCL in 2013, he was a research scientist at Xerox PARC. He received a summa-cum-laude Laurea degree in Computer Science from the University of Salerno, Italy (2005), then, in 2011, a Ph.D. in Networked Systems from the University of California, Irvine, advised by Gene Tsudik. His dissertation, titled "Sharing Sensitive Information with Privacy," can be found [here](https://emilianodc.com/PAPERS/dissertation.pdf). During his Ph.D., he also spent a few months on research internships at NEC in Heidelberg (2008), INRIA in Grenoble (2009), and Nokia in Lausanne (2010). Overall, he does research in security and privacy enhancing technologies. These days he works on understanding and countering security issues via measurement studies and data-driven analysis, as well as tackling problems at the intersection of machine learning and security/privacy.
Mirco Musolesi
Title: Identification (and Obfuscation) in the Smartphone Era
Abstract: An increasing number of mobile users is actively sharing their location and other personal information through a variety of applications and services. Many mobile applications are continuously collecting location data that allow service providers to map user movement and profile users, for example for marketing applications. The same is true for the majority of the most popular social networking platforms, which offer the possibility of associating the current location of users to their posts and photos. Conversely, this type of data can be used to identify individuals, for example for crime prevention and national security.
In this talk, I will give an overview of the work of my lab in the area of user identification and profiling using sensor data from smartphones and online social networks. I will discuss the challenges and opportunities in this area and I will outline our research agenda for the coming years.
Bio: Mirco Musolesi is a Reader (equivalent to an Associate Professor in the North-American system) in Data Science at University College London and a Turing Fellow at the Alan Turing Institute, the UK National Institute for Data Science and Artificial Intelligence. At UCL he leads the Intelligent Social Systems Lab. He held research and teaching positions at Dartmouth, Cambridge, St Andrews and Birmingham. He is a computer scientist with a strong interest in sensing, modelling, understanding and predicting human behaviour and social dynamics in space and time, at different scales, using the "digital traces" we generate daily in our online and offline lives. He is interested in developing mathematical and computational models as well as implementing real-world systems based on them. This work has applications in a variety of domains, such as intelligent systems design, ubiquitous computing, security & privacy, and data science for social good. More details about his research profile can be found at: http://www.ucl.ac.uk/~ucfamus/
Sara Hajian
Title: Discovering and Mitigating Algorithmic Discrimination
Abstract: Algorithms and decision-making based on Big Data are becoming pervasive and essential tools in personal finance, health care, hiring, housing, education, and policy-making. They determine the media we consume, the stories we read, the people we meet, the places we visit, but also whether we get a job or whether our loan request is approved. It is therefore of societal and ethical importance to ask whether these algorithms can be discriminative on grounds such as gender, ethnicity, marital or health status. The answer is yes, as several high-profile cases of algorithmic discrimination have been described in recent years. In this talk, I will present different aspects of algorithmic discrimination, starting with a comprehensive survey of cases in which algorithmic bias has been found. Then, I will present two complementary approaches: computational methods for discrimination discovery, and discrimination prevention by means of fairness-aware algorithms.
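As a rough illustration of what discrimination discovery can look like in its simplest form, the sketch below computes the disparate-impact ratio (the "80% rule") on invented toy data. This is a generic fairness measure chosen for illustration, not necessarily the method used in the talk, and all names and numbers are made up.

```python
def disparate_impact(outcomes):
    """Disparate-impact ratio: rate of positive decisions for the
    protected group divided by the rate for the unprotected group.
    outcomes: list of (protected: bool, positive_decision: bool)."""
    prot = [positive for protected, positive in outcomes if protected]
    unprot = [positive for protected, positive in outcomes if not protected]
    return (sum(prot) / len(prot)) / (sum(unprot) / len(unprot))

# Toy data: 2 of 4 protected applicants approved vs 4 of 5 unprotected.
decisions = [(True, True), (True, False), (True, True), (True, False),
             (False, True), (False, True), (False, True), (False, True),
             (False, False)]
ratio = disparate_impact(decisions)  # 0.5 / 0.8 = 0.625
```

Under the common 80% rule of thumb, a ratio below 0.8 (as in this toy example) flags the decision process for closer scrutiny; fairness-aware algorithms of the kind the talk covers aim to push such ratios back toward 1.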
Bio: Sara Hajian (https://scholar.google.es/citations?user=rXY4178AAAAJ&hl=en) is a data scientist at NTENT, a search technology company. She received her Ph.D. degree from the Computer Engineering and Mathematics Department of the Universitat Rovira i Virgili (URV). She received her M.Sc. degree in Computer Science from the Iran University of Science and Technology (IUST). Her research interests are data mining methods and algorithms, social media and social network analysis, privacy-preserving data mining and publishing, and algorithmic discrimination. She has been a visiting student at the Knowledge Discovery and Data Mining Laboratory (KDD-Lab), a joint research group of the Information Science and Technology Institute of the Italian National Research Council (CNR) in Pisa and the Computer Science Department of the University of Pisa. She has been a visiting scientist at Yahoo! Labs in Barcelona. The results of her research on algorithmic discrimination were featured in the Communications of the ACM. She co-organized the first IEEE ICDM International Workshop on Privacy and Discrimination in Data Mining (IEEE PDDM 2016).
Ph.D. Session 1
| Time | Speaker | Title |
|---|---|---|
| 13:30-13:50 | Ahmed Salem | ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models |
| 13:50-14:10 | Abdallah Dawoud | OS Support for Capability-based Permissions in Android |
| 14:10-14:30 | Sergiu Bursuc | Private Data on Untrusted Platforms: Votes, Witnesses and More |
| 14:30-14:50 | Tribhuvanesh Orekondy | User Linkability in Federated Learning |
Ph.D. Session 2
| Time | Speaker | Title |
|---|---|---|
| 16:00-16:20 | Brij Mohan Lal Srivastava | Privacy-preserving Speech Processing |
| 16:20-16:40 | Kathrin Grosse | Attacker Models for Machine Learning |
Ph.D. Session 3
| Time | Speaker | Title |
|---|---|---|
| 09:00-09:20 | Tahleen Rahman | Everything About You: A Multimodal Approach towards Friendship Inference in Online Social Networks |
| 09:20-09:40 | Bizhan Alipour | Phishing Detection by Machine Learning Techniques |
| 09:40-10:00 | Bartek Surma | Discovering Hidden Attributes of Online Social Networks' Users |
| 10:00-10:20 | Sourya Joyee De | User-centric Privacy Risk Analysis in Online Social Networks |
Please apply electronically by sending an email to firstname.lastname@example.org with:
- a one-page CV
- an application letter describing your current Ph.D. project, your motivation for participating in the workshop, and your desired outcome.
The program committee will review the applications and select the participants.
Application deadline: September 8, 2018
Participation fee: free of charge
Michael Backes (CISPA)
Jilles Vreeken (CISPA)
Sven Bugiel (CISPA)
Jannik Dreier (LORIA)
Abdelkader Lahmadi (LORIA)
Marine Minier (LORIA)
Yang Zhang (CISPA)
Sandra Strohbach (CISPA)
Abdessamad Imine (LORIA)