DS603: Robust Machine Learning

Elective, IIT Bombay, C-MInDS, 2025

Course Title: Robust Machine Learning
Instructor: Arjun Bhagoji
TA: Mohamad Hassan N C (mohamad.hassan@iitb.ac.in)
Time: Wednesdays and Fridays, 9:30-11:00 am
Room: LT103
Office Hours: Wednesdays, 4-5 pm, CC120

Course Description

Progress in machine learning is often measured under controlled, well-understood conditions. However, safety-critical workflows in realistic settings require ML systems to be reliable even when faced with new and unexpected conditions. Sufficiently adverse conditions may violate the statistical assumptions underlying common ML models, causing undesirable behavior. This undesirable behavior manifests along three dimensions:

  1. Robustness: Lack of robustness to unseen and adversarial inputs,
  2. Privacy: Leakage of private data or model parameters, and
  3. Fairness: Uneven performance across subpopulations.

This course will equip students, on the theoretical front, to rigorously reason about the conditions under which unreliable behavior occurs and, on the practical front, to use these insights to build reliable ML systems. While the course will cover all three aspects, the focus will largely be on robustness, with lighter treatment of the other two.
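For a concrete taste of the robustness dimension, the minimal sketch below (illustrative only, not official course material) crafts an adversarial example with the fast gradient sign method of Goodfellow et al.; the function name fgsm_attack and the budget eps are placeholders, and a trained PyTorch classifier model with a suitable loss_fn is assumed:

    import torch

    def fgsm_attack(model, loss_fn, x, y, eps=0.03):
        # Take one step of size eps in the direction of the sign of the
        # input gradient: the classic first-order adversarial perturbation.
        x = x.clone().detach().requires_grad_(True)
        loss = loss_fn(model(x), y)
        loss.backward()
        x_adv = x + eps * x.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()  # stay in the valid pixel range

A model that labels x correctly will often mislabel x_adv even when eps is visually imperceptible; characterizing and closing this gap is a central concern of Module 1.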

Intended Audience: This class is intended for graduate students working in machine learning and data science who are interested in doing research in this area. However, interested undergraduates (3rd year and higher) are welcome to attend as well.

Pre-requisites: Mathematical maturity will be assumed, as will the basics of algorithms, probability, linear algebra, and optimization. Students should have taken an introductory course in machine learning. For the project component, familiarity with scientific programming in Python and with libraries such as NumPy and PyTorch will be beneficial.
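As an informal calibration of that bar (an illustrative sketch, not part of the syllabus), students should be comfortable reading a snippet like the following, which fits a least-squares model by gradient descent in PyTorch:

    import torch

    # Generate a toy regression problem: y = X @ w_true + noise.
    X = torch.randn(100, 5)
    w_true = torch.randn(5)
    y = X @ w_true + 0.1 * torch.randn(100)

    # Recover w by minimizing mean squared error with gradient descent.
    w = torch.zeros(5, requires_grad=True)
    for _ in range(200):
        loss = ((X @ w - y) ** 2).mean()
        loss.backward()
        with torch.no_grad():
            w -= 0.1 * w.grad   # gradient step with learning rate 0.1
            w.grad.zero_()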

Course Schedule

Week | Date (Day) | Topic | References | Notes
1 | 30/07 (Wed) | Course Outline, Supervised Learning Recap | Podcast on Note-taking; Section 3 of Percy Liang's Lecture Notes on SLT; selected material from Chapters 2, 3, 4, 5, and 12 of Understanding Machine Learning; Section 3.1 of Convex Optimization: Algorithms and Complexity |
1 | 01/08 (Fri) | Unsupervised learning for anomaly detection | A unifying review of deep and shallow anomaly detection; Chapter 8 of Learning with Kernels | Start of Module 1 on Robustness
2 | 06/08 (Wed) | Modern approaches to out-of-distribution detection | A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges |
2 | 08/08 (Fri) | Formalizing distributionally robust optimization and robust statistics | Distributionally Robust Optimization and Robust Statistics |
3 | 13/08 (Wed) | Robust mean estimation | Recent Advances in Algorithmic High-Dimensional Robust Statistics | Project Milestone 0: Project groups due
3 | 15/08 (Fri) | Independence Day | |
4 | 20/08 (Wed) | Learning with label noise | Learning with Noisy Labels |
4 | 22/08 (Fri) | Poisoning attacks | Machine Learning Security against Data Poisoning: Are We There Yet?; Poisoning Attacks against Support Vector Machines |
5 | 27/08 (Wed) | Defenses against poisoning attacks | Stronger Data Poisoning Attacks Break Data Sanitization Defenses; Planting Undetectable Backdoors in Machine Learning Models | Ganesh Chaturthi (Makeup TBD)
5 | 29/08 (Fri) | Adversarial examples | Intriguing Properties of Neural Networks; Towards Evaluating the Robustness of Neural Networks; Delving into Transferable Adversarial Examples and Black-box Attacks; Square Attack: a query-efficient black-box adversarial attack via random search |
6 | 03/09 (Wed) | Jailbreaking, or adversarial examples by any other name | Jailbreaking LLMs and Agentic Systems | Quiz (Tentative)
6 | 05/09 (Fri) | (Empirical) Defenses against adversarial examples | Towards Deep Learning Models Resistant to Adversarial Attacks; Theoretically Principled Trade-off between Robustness and Accuracy | Id-e-Milad (Makeup TBD)
7 | 10/09 (Wed) | Learning with adversarial examples: Optimal robust loss | Adversarial Risk via Optimal Transport and Optimal Couplings; Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries |
7 | 12/09 (Fri) | Learning with adversarial examples: Generalization bounds | Rademacher Complexity for Adversarially Robust Generalization | Project Milestone 1: Idea pitch to instructor
8 | 17/09 (Wed) | Mid-Semester Week | |
8 | 19/09 (Fri) | Mid-Semester Week | |
9 | 24/09 (Wed) | Verified robust training: Exact certification [Guest Lecture: Prof. Supratik Chakraborty (CSE, IITB)] | |
9 | 26/09 (Fri) | Verified robust training: Convex relaxations | Provable Defenses via the Convex Outer Adversarial Polytope; Certified Defenses against Adversarial Examples | End of Module 1 on Robustness
10 | 01/10 (Wed) | Buffer for Module 1 | |
10 | 03/10 (Fri) | Privacy attacks on ML models | Enhanced Membership Inference Attacks against Machine Learning Models; Privacy in Pharmacogenetics: An End-to-End Case Study of Personalized Warfarin Dosing; High-Fidelity Extraction of Neural Network Models | Start of Module 2 on Privacy
11 | 08/10 (Wed) | Differential Privacy and private training of ML models [Guest Lecture: Prof. Krishna Pillutla (DSAI, IITM)] | |
11 | 10/10 (Fri) | Decentralized learning [Guest Lecture: Prof. Pranay Sharma (C-MInDS, IITB)] | | Project Milestone 2: Progress update
12 | 15/10 (Wed) | Privacy-robustness tradeoffs: Attacks on decentralized learning | A Little Is Enough: Circumventing Defenses For Distributed Learning; Analyzing federated learning through an adversarial lens |
12 | 17/10 (Fri) | Buffer for Module 2 | The Hidden Vulnerability of Distributed Learning in Byzantium | End of Module 2 on Privacy
13 | 22/10 (Wed) | Bias in ML models | Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification; Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints | Start of Module 3 on Fairness and Explainability
13 | 24/10 (Fri) | Fairness definitions | Tutorial: 21 fairness definitions and their politics; Fairness and Machine Learning, Chapter 3 |
14 | 29/10 (Wed) | Fair training of ML models | A Reductions Approach to Fair Classification |
14 | 31/10 (Fri) | Interpretability techniques: classical and modern | Understanding Black-box Predictions via Influence Functions; "Why Should I Trust You?": Explaining the Predictions of Any Classifier |
15 | 05/11 (Wed) | Challenges with interpretability | Interpretation of Neural Networks is Fragile; Impossibility Theorems for Feature Attribution | End of Module 3 on Fairness and Explainability; Guru Nanak's Birthday (Makeup TBD)
15 | 07/11 (Fri) | Towards responsible AI models | Fawkes: Protecting Privacy against Unauthorized Deep Learning Models; Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models; Algorithmic Collective Action in Machine Learning; MultiRobustBench: Benchmarking Robustness Against Multiple Attacks |

Resources

Supplementary Books

  1. Understanding Machine Learning: From Theory to Algorithms
  2. All of Statistics
  3. Mathematics for Machine Learning
  4. Convex Optimization: Algorithms and Complexity
  5. Convex Optimization

Similar Courses

  1. Jerry Li’s course
  2. Jacob Steinhardt’s course

Formats

  1. Scribe notes

Other references mentioned in class

  1. Lecture 1: Feldman (2012), Agnostic Learning of Halfspaces is Hard; Bartlett et al., Convexity, Classification, and Risk Bounds

Grading (Tentative)

Best 2 out of 3 exams: 40%
Final project: 40% (Project presentations will be held at mutually convenient times between 10/11/2025 and 26/11/2025)
Scribing: 10% (Scribes are due one week after the lecture; sign up here)
Class participation: 10%

Attendance Policy

Four unexplained absences are allowed. Any absences beyond that require instructor permission.

Accommodations

Students with disabilities or health issues should approach the instructor at any point during the semester to discuss accommodations. The aim of the course is to learn together, and legitimate bottlenecks will be resolved collaboratively.