A survey of privacy attacks in machine learning


Note

Hey guys, this is my personal reading note. There may be some mistakes in my understanding, so please feel free to correct me (hsiangjenli@gmail.com) if you find any. Thanks!

Abstract

  • Analysis of more than 45 papers related to privacy attacks against machine learning

  • Attack taxonomy (focus on privacy and confidentiality attacks)

  • Exploration of the causes of privacy leaks

  • The most common defense methods, open problems, and future directions

  • Implementation of the attacks

Introduction

How models leak information

  1. The way models are constructed

    • For example, adversarial robustness (training the model to defend against adversarial examples) can leak information about the training data

    • This happens because the model may overfit the training data, which means it has effectively memorized that data

  2. Poor generalization and memorization of sensitive data samples
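To make the memorization point concrete, here is a toy sketch of how an attacker could exploit it. It is not taken from the paper: the "model" is a hypothetical stand-in that just stores its training points, and `train_memorizing_model`, `membership_guess`, and the `0.99` threshold are all assumptions made for illustration. The idea is that a model that memorized its training data is suspiciously confident on exactly those samples, so thresholding the confidence reveals who was in the training set.

```python
import math
import random


def train_memorizing_model(train_points):
    """Hypothetical overfit model: it simply stores every training point."""
    stored = list(train_points)

    def confidence(x):
        # Confidence decays with the distance to the closest stored point,
        # so an exact training point always scores 1.0.
        nearest = min(abs(x - p) for p in stored)
        return math.exp(-nearest)

    return confidence


def membership_guess(confidence, x, threshold=0.99):
    """Threshold attack: very high confidence is read as 'was in the training set'."""
    return confidence(x) >= threshold


random.seed(0)
members = [random.uniform(0, 100) for _ in range(20)]      # used for training
non_members = [random.uniform(0, 100) for _ in range(20)]  # never seen by the model

model = train_memorizing_model(members)

# Fraction of real members the attacker flags, and fraction of outsiders flagged by mistake.
true_positive_rate = sum(membership_guess(model, x) for x in members) / len(members)
false_positive_rate = sum(membership_guess(model, x) for x in non_members) / len(non_members)

print("members flagged:", true_positive_rate)
print("non-members flagged:", false_positive_rate)
```

A model that generalizes well would give similar confidence to members and non-members, which is why poor generalization and memorization are listed as causes of leakage.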

Three types of attacks on machine learning systems:

  1. Attacks against integrity - tamper with the input data so that the model makes wrong decisions

  2. Attacks against a system’s availability - maximize the misclassification error

  3. Attacks against privacy and confidentiality !!! - try to infer information about user data or sensitive information about the model

Types of machine learning architectures

  • Centralized Learning

  • Distributed Learning
    1. collaborative or federated learning (FL)

    2. fully decentralized or peer-to-peer (P2P) learning

    3. split learning
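As a rough illustration of how federated learning differs from centralized learning, here is a minimal sketch assuming a federated-averaging style scheme. The note itself only names the architecture, so the model (a single weight for y ≈ weight * x), the client data, and the function names `local_update` and `federated_round` are all hypothetical. Each client trains on its own private data, and the server only ever sees the resulting weights, never the raw samples.

```python
def local_update(weight, data, lr=0.1, steps=50):
    """Gradient descent on one client's private (x, y) pairs for y ≈ weight * x."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the weight.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight, clients):
    """One round: each client trains locally, then the server averages the weights."""
    updates = [(local_update(global_weight, data), len(data)) for data in clients]
    total = sum(n for _, n in updates)
    # Weighted average, so clients with more samples count more.
    return sum(w * n for w, n in updates) / total


# Two clients with private datasets that both follow y = 2 * x (made-up data).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (3.0, 6.0), (0.5, 1.0)],
]

weight = 0.0
for _ in range(5):
    weight = federated_round(weight, clients)

print("learned weight:", weight)
```

Even though raw data stays on the clients, the shared weights are still derived from that data, which is why distributed settings remain a target for privacy attacks.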

References

[1] Maria Rigaki and Sebastian Garcia. A survey of privacy attacks in machine learning. ACM Computing Surveys, 56(4):1–34, 2023.