Introduction to Machine Learning (2023)

Introduction

The course introduces the foundations of learning and making predictions from data. We will study basic concepts such as trading off goodness of fit against model complexity. We will discuss important machine learning algorithms used in practice and provide hands-on experience in a series of course projects. VVZ information is available here.
News
  • [05.02.2024] The 2024 winter exam and the respective solutions catalog (source file used for the grading) are now online!
  • [13.08.2023] The 2023 summer exam and the respective solutions, and solutions catalog (source file used for the grading) are now online!
  • [01.08.2023] The lecture notes have been updated. We have two new chapters: Clustering (provisional and not yet proof-read by the professors) and Probabilistic modelling.
  • [13.07.2023] (Exam review session) There will be an exam review session on 31 July from 10-12 in ETA F5. It will be held by Andisheh Amrollahi and Mohammad Reza Karimi, who were in charge of the exams in 2021 and 2022. They will solve the previous years' exam questions that you voted for on Moodle and answer general questions you might have about the exam.
    Here is the vote on Moodle: https://moodle-app2.let.ethz.ch/mod/choice/view.php?id=920239.
  • [13.07.2023] (Plagiarism checks) We have finished the plagiarism checks for the projects. If you have not received an email accusing you of plagiarism and you have a grade of 4 or above in the projects, you are eligible to take the exam. If you do not satisfy both of these requirements, we ask you to deregister from the exam (the deadline for online deregistration is 30 July). If you are not eligible and do not deregister, we will give you a NO-SHOW grade.
  • [13.07.2023] (Attendance-only doctoral students) Your grades will be passed on by tomorrow.
  • [04.07.2023] We have added the exam for 2021 winter together with its solution, please find them in the Performance Assessment section.
  • [24.06.2023] The lecture notes have been updated. We have two new chapters: Chapter 8 Neural Networks and Chapter 9 PCA. More coming soon!
  • [07.06.2023] The two exams for the 2022 academic year have been posted in the exam section to help you familiarize yourself with the exam contents.
  • [07.06.2023] Project grades have been emailed. They are still subject to plagiarism checks.
  • [31.05.2023] There will be a Q&A next Wednesday (07 June) presenting solutions for Project 4.
  • [30.05.2023] There will be NO lectures held this week on Tuesday (30 May) and Wednesday (31 May). There still WILL BE a tutorial on Friday (2 June), as usual at 2pm, on large language models.
  • [02.05.2023] A new version of the lecture notes is released! This version includes two new chapters: Chapter 6 Model Evaluation and Selection and Chapter 7 Bias-Variance Tradeoff and Regularization. In case you have anything to report, please contact Xinyu Sun.
  • [03.04.2023] Because of the Easter break, there is no tutorial for this week on Friday (07.04). Instead, the session will be recorded and put online.
  • [14.03.2023] For students who do not want to solve projects on their personal laptops, check out this guide by Vukasin Bozic explaining how to use Euler, the scientific computer clusters of ETH. More information regarding Euler is available here.
  • [07.03.2023] The Projects and FAQ sections have been updated. Importantly, there will be a project introduction session on the Wednesday each project is released, and a project solution session on the Wednesday of the week after the project deadline. Both will be held during the Q&A session, 17:00 (sharp) to 18:00.
  • [02.03.2023] Project 0 is online. It is ungraded and aims to help you familiarize yourself with the project workflow.
  • [27.02.2023] Regarding the Q&A on March 1st: Introduction to Python for Data Science (jupyter, numpy, pandas, seaborn, matplotlib). Please check the README before the tutorial. This is for installing all the necessary libraries so you can follow along during the tutorial.
  • [22.02.2023] During the Q&A session on March 1st, Olga Mineeva and Stefan Stark will present a Python libraries introduction covering Numpy and Pandas.
  • [10.02.2023] Welcome to the course Introduction to Machine Learning!

FAQ
This course is compulsory for my program, but I cannot register. What should I do? A: If this course is compulsory for your study program (Kernfach), you are able to register irrespective of the waiting list. Please allow some time for the transfer from the waiting list.
Is physical attendance at the teaching activities mandatory? A: Physical attendance at lectures, tutorials and Q&As is not mandatory but strongly encouraged.
How can I access the materials on the website? A: Lecture slides, exercises, lecture notes, and recordings of the Q&A are password protected. To obtain the password you need to be inside the ETH network or use the ETH VPN and click here. Check here to learn how to establish a VPN connection.
What programming knowledge is required for this course? A: For the programming background, we recommend knowing Python. For those without experience in it, check out this Python tutorial.
Will the projects contain boilerplate code? A: Yes. There will be some code to guide you through the steps.
What are the most useful libraries to learn for the projects? A: numpy, sklearn, pandas, torch.
Is there a preferred library for the deep learning section? A: We will give a tutorial on PyTorch and use PyTorch in the boilerplate code for the projects. However, the use of other libraries is allowed.
Am I eligible to take the exam if I fail some of the projects? A: You must receive an average project grade of 4 or above to be eligible to take the exam. Failing a project (not passing an easy baseline) means a grade of 2 for that project. However, as long as your average is 4 or above (you can achieve this by getting a 6 in other projects), you can take the exam. If your average is below 4, we will ask you to deregister from the exam. If you do not deregister, we will assign you a no-show grade.
I did the projects last year (spring semester) and didn’t take or failed the exam in the spring or following autumn semester (or both). Do they count towards this year’s projects? A: No, they do not. Projects can be submitted only in the spring semester, and they make you eligible to take the exam in the same (spring) semester and the following autumn semester. If you do not take the exam or fail the exam in these two sessions, you have to enroll in the class next year and redo the projects.
I am an attendance-only doctoral student. What do I need to do? A: You need to enroll in the class and complete the projects to get your attendance certified. Your department decides how many credits you get for just attendance. Most departments require you to take the exam as well to be eligible for credits. This includes the D-INFK and D-MAVT departments. Please contact your department study administrator for details.
I am a Ph.D. student. Can I skip the projects or the exam but still get a grade? A: Most Ph.D. students who would like to get credits for the class need to do the projects and the exam. A minority of departments allow you to get credits without taking an exam. See the question above.
Is distance examination allowed? A: Distance examination is allowed, but you need to file an official request via study administration. We do not handle these requests.

Lectures
Lectures will be on Tuesdays and Wednesdays, 14:15 to 16:00, in ETA F5 with a simultaneous video screening in ETF E1. You can ask questions in person or via the course channel on EduApp.
All lectures and tutorials will be recorded, and the recording will be made available within a day after the lecture in the ETH Video Portal and below.
The first lecture will take place on Tuesday, 21 February 2023.
Date Topic Slides Recording
Tue 21.02. Introduction Preliminary [Final] Recording
Wed 22.02. Linear Regression Preliminary [Final] Recording
Tue 28.02. Optimization Preliminary [Final] Recording
Wed 01.03. Optimization & Nonlinear Features Preliminary [Final] Recording
Tue 07.03. Model Selection Preliminary [Final] Recording
Wed 08.03. Bias-Variance Tradeoff & Regularization Preliminary [Final] Recording
Tue 14.03. Classification Preliminary [Final] Recording
Wed 15.03. Classification II Preliminary [Final] Recording
Tue 21.03. Classification & Kernel Methods Preliminary [Final] Recording
Wed 22.03. Kernel & Other Methods Preliminary [Final] Recording
Tue 28.03. Neural Networks Preliminary [Final] Recording
Wed 29.03. Neural Networks II Preliminary [Final] Recording
Tue 04.04. Neural Networks III Preliminary [Final] Recording
Wed 05.04. Neural Networks IV Preliminary [Final] Recording
Tue 18.04. Clustering Preliminary [Final] Recording
Wed 19.04. Dimension Reduction Preliminary [Final] Recording
Tue 25.04. Dimension Reduction II Preliminary [Final] Recording
Wed 26.04. PyTorch Tutorial Demo Recording
Tue 02.05. Probabilistic Modeling Preliminary [Final] Recording
Wed 03.05. Probabilistic Modeling II Preliminary [Final] Recording
Tue 09.05. Probabilistic Modeling III Preliminary [Final] Recording
Tue 16.05. Gaussian Mixture Models Preliminary [Final] Recording
Wed 17.05. Gaussian Mixture Models II Preliminary [Final] Recording
Tue 23.05. Gaussian Mixture Models III Preliminary [Final] Recording
Wed 24.05. Generative Models with Neural Networks Preliminary [Final] Recording
Mon 31.07. Exam review session Notes

Lecture Notes
We provide a detailed manuscript that contains the most important mathematical background needed for understanding the course. These notes will also serve as a reference for the lectures and set up the notation and needed theorems, definitions, and concepts.
The manuscript is not final and will be updated and expanded with notes for some of the lectures as the course progresses, so please check back regularly. For typos, errors and suggestions, please contact Xinyu Sun at xinsun@student.ethz.ch .
Lecture Notes (last version: 1 August 2023)
Tutorials
Tutorials will be held on Fridays, 14:15 to 16:00, in ETA F5 with a simultaneous video screening in ETF E1. As in the lectures, you will be able to ask questions via the course channel on the EduApp.
The tutorial will be recorded and the recording will be made available after the tutorial on ETH Video Portal.
The first tutorial will take place on Friday, 24 February 2023.
Date Topic Materials Recording Homework/Solution
Fri 24.02. Math Recap Notes Recording
Fri 03.03. Linear Regression & Optimization Materials Recording Homework 1
Fri 10.03. Review of Homework 1 Slides Recording Solution 1
Fri 17.03. Classification Materials Recording Homework 2
Fri 24.03. Review of Homework 2 Materials Recording Solution 2
Fri 31.03. Neural Network Slides Recording Homework 3
Fri 07.04. Review of Homework 3 Slides Recording Solution 3
Fri 21.04. Clustering and Dimension Reduction Slides Recording Homework 4
Fri 28.04. Review of Homework 4 Slides Recording Solution 4
Fri 05.05. Probabilistic Modeling Slides Recording Homework 5
Fri 12.05. Review of Homework 5 Slides Recording Solution 5
Fri 19.05. Generative Models Slides Recording Homework 6
Fri 26.05. Review of Homework 6 Slides Recording Solution 6
Fri 02.06. Large Language Models - Recording
Q&A Sessions
Q&A sessions (virtual office hours) will be held on Wednesdays, 17:00 to 18:00 virtually on Zoom. The Q&A sessions are an informal opportunity to ask questions about the course. We may use some Q&A sessions for giving more information about the projects. You will be able to ask questions via the native Zoom chat or by speaking out, if requested to do so. It is not mandatory to attend the Q&A sessions.
The Q&A session will be recorded and the recording will be made available after the Q&A session.
The first Q&A session will take place on Wednesday, 22 February.
Date Topic Recording
Wed 22.02. Administration Recording
Wed 01.03. Python Tutorial Recording
Wed 08.03. Linear Regression & Optimization Recording
Wed 15.03. Project 1 Introduction Recording
Wed 22.03. Classification & Kernel Methods Recording
Wed 29.03. Project 2 Introduction Recording
Wed 05.04. Project 1 Solution Recording
Wed 19.04. (No one attended)
Wed 26.04. Project 3 Introduction Recording
Wed 03.05. Project 2 Solution Recording
Wed 10.05. Project 4 Introduction Recording
Wed 17.05. Project 3 Solution Recording
Wed 24.05. (No one attended)
Wed 07.06. Project 4 Solution Recording
Contact
Instructors Prof. Andreas Krause and Prof. Fanny Yang
Head TA Andisheh Amrollahi
Assistants
Pragnya Alatur, Parnian Kassraie, Lars Lorch, Lenart Treven, Bhavya Sukhija, David Lindner, Yarden As, Scott Sussex, Hugo Yeche, Vignesh Ram, Vukasin Bozic, Charlotte Bunne, Cynthia Chen, Sonali Andani, Mojmir Mutny, Zhenrong Lang, Gavrilopoulos Georgios, Phillip Scherer, Xinyu Sun, Zhiyuan Hu, Zhenru Jia, Rajesh Sharma, Giorgia Racca, Angeline Pouget, Yuhao Mao, Thomas Out, Javier Abad Martinez, Piersilvio De Bartolomeis, Alexandru Tifrea, Viacheslav Borovitskiy, Jannis Bolick, Stefan Stark, Olga Mineeva, Harun Mustafa, Daniel Yang
Mailing List
Please use Moodle for questions regarding course material, organization and projects. If you need to contact the Head TA or the lecturer directly, please send an email to introml23-info@inf.ethz.ch. Please think twice before you send an email though and make sure you read all information here carefully.
Lectures
Tue 14-16 ETA F5 ETF E1 (via video)
Wed 14-16 ETA F5 ETF E1 (via video)
Tutorials
Fri 14-16 ETA F5 ETF E1 (via video)
Questions & Answers
Wed 17-18 Virtual Zoom

Code Projects
The code projects will require solving machine learning problems with methods taught within the course. Projects will require handing in the solution code as well as a short report. You are allowed to work in groups of 1 – 3 students, but it is your responsibility to find a group. You can search for teammates by posting on Moodle.
In particular, there will be 5 code projects. The first project is ungraded and will allow you to become familiar with our code submission workflow. The remaining projects are graded (pass/fail) and mandatory for passing the course.
Following is a timetable of the projects. Details regarding the projects can be found here. If you are having technical issues please send an email to introml23-projects@inf.ethz.ch.
Project | Release Date | End Date | Weight on Project Grade
Project 0 (dummy) | Wed, 01.03.2023 08:00 | - | 0
Project 1a&b | Wed, 15.03.2023 08:00 | Wed, 29.03.2023 12:00 (Noon) | 0.25 (0.125 each)
Project 2 | Wed, 29.03.2023 08:00 | Wed, 26.04.2023 12:00 (Noon) | 0.25
Project 3 | Wed, 26.04.2023 16:00 | Wed, 10.05.2023 12:00 (Noon) | 0.25
Project 4 | Wed, 10.05.2023 08:00 | Wed, 31.05.2023 12:00 (Noon) | 0.25

There will be a presentation introducing the project in the Q&A session on the same Wednesday each project gets released. Also, there will be a project solution session in the Q&A session the week after the deadline of a project. Hence there are 8 sessions for all 4 graded projects in total.
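To make the weights in the timetable concrete, here is a minimal sketch of how an overall project grade could be computed. It assumes the overall grade is the plain weighted average of the per-project grades with the weights listed above and that no rounding is applied; the function names and the example grades are hypothetical and only for illustration.

```python
# Weights taken from the project timetable; they sum to 1.
PROJECT_WEIGHTS = {
    "1a": 0.125,
    "1b": 0.125,
    "2": 0.25,
    "3": 0.25,
    "4": 0.25,
}

PASS_THRESHOLD = 4.0  # overall project grade needed for exam eligibility


def project_grade(grades):
    """Weighted average of per-project grades (hypothetical helper)."""
    missing = set(PROJECT_WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades for projects: {sorted(missing)}")
    return sum(PROJECT_WEIGHTS[p] * grades[p] for p in PROJECT_WEIGHTS)


def is_exam_eligible(grades):
    """True if the weighted project grade reaches the pass threshold."""
    return project_grade(grades) >= PASS_THRESHOLD


# Illustrative grades only: Project 2 failed (grade 2), all others at 6.
example = {"1a": 6.0, "1b": 6.0, "2": 2.0, "3": 6.0, "4": 6.0}
print(project_grade(example))     # 5.0
print(is_exam_eligible(example))  # True
```

Under these assumptions, a single failed project can be compensated by strong grades in the others, which matches the FAQ answer on eligibility.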
Demos
The demos are hosted on GitLab and are based on Jupyter notebooks (with Python 3.9). Please look at this intro for installation and usage instructions. We recommend that you create a conda environment to maintain the code base. You should have been automatically added to the repository as a Reporter, which allows you to clone the project on your machine and run the demos. If you still cannot access the repository, please send an email to zhiyhu@student.ethz.ch with the subject “[IML 2023] GitLab access” and include your nethz in the email.
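Once the environment is set up, a quick check like the following can confirm that the interpreter and the main libraries are available. This is only a sketch: the module list is an assumption taken from the FAQ on this page (numpy, pandas, sklearn, torch), and the demo repository's own instructions remain the authoritative list of requirements. The helper name `check_environment` is hypothetical.

```python
import importlib.util
import sys

# Assumption: module names taken from the FAQ on this page, not from
# the demo repository's actual requirements.
MODULES = ["numpy", "pandas", "sklearn", "torch"]


def check_environment(modules=MODULES):
    """Map each module name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print the interpreter version, then the availability of each module.
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, available in check_environment().items():
        print(f"{name}: {'ok' if available else 'missing'}")
```

Run it inside the activated conda environment; any module reported as missing still needs to be installed there.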
Performance Assessment
70% session examination, 30% code project; the final grade will be calculated as weighted average of both these elements. As a compulsory continuous performance assessment task, the project must be passed on its own. The coding projects are an integral part (60 hours of work, 2 credits) of the course. Participation is mandatory. To be eligible for the examination of Introduction to Machine Learning (252-0220-00L), you need to pass the code projects, i.e., attain an overall project grade of 4 or higher. Students who do not pass the project are required to de-register from the exam and will otherwise be treated as a “no show”.
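The 70/30 weighting can be sketched as follows. This is an illustration under stated assumptions, not the official grading procedure: the function name and the example grades are hypothetical, and any rounding of the final grade is not specified here and is left out.

```python
EXAM_WEIGHT = 0.7     # session examination
PROJECT_WEIGHT = 0.3  # code projects
PROJECT_PASS = 4.0    # minimum overall project grade for exam eligibility


def final_grade(exam_grade, project_grade):
    """Weighted average of exam and project grade (hypothetical helper)."""
    if project_grade < PROJECT_PASS:
        # The projects must be passed on their own before the exam counts.
        raise ValueError("project grade below 4: not eligible for the exam")
    return EXAM_WEIGHT * exam_grade + PROJECT_WEIGHT * project_grade


# Illustrative values only: exam grade 4.5, project grade 5.0.
print(round(final_grade(4.5, 5.0), 2))
```

Note that the project grade acts both as a gate (it must be at least 4) and as a 30% component of the final grade.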
For the final exam, you can bring two A4-pages (that is, one A4-sheet of paper), either handwritten or 11 point minimum font size. A simple non-programmable calculator is allowed during the exam. The exam will be multiple choice. Here you can find an example of the question types as well as how to fill out the answer sheet to guarantee successful automatic grading.
Previous Exams: Exam 2015, Exam 2016, Exam 2017, Exam 2018, Exam 2019, Exam 2020, Sol 2020, Exam 2021 Summer, Sol 2021 Summer, Exam 2021 Winter, Sol 2021 Winter, Exam 2022 Summer, Sol 2022 Summer, Sol 2022 Summer final, Exam 2022 Winter, Sol 2022 Winter, Sol 2022 Winter final
Other Resources