Introduction to Machine Learning (2024)

Introduction

The course will introduce the foundations of learning and making predictions from data. We will study basic concepts such as trading off goodness of fit against model complexity. We will discuss important machine learning algorithms used in practice, and provide hands-on experience in a series of course projects. VVZ information is available here.
News
  • [14.08.2024] The solutions to the exam are now available: Solutions.
  • [18.06.2024] The final version (up to typos) of the lecture notes has been published.
  • [18.05.2024] The lecture notes on kernels have been published.
  • [24.04.2024] The lecture notes chapters on clustering and PCA have been published! The kernels chapter will appear soon.
  • [15.04.2024] The lecture notes chapter on neural networks has been published. The kernels chapter will appear soon.
  • [29.03.2024] Link to today's tutorial
  • [26.03.2024] The lecture notes chapter on classification has been published!
  • [13.03.2024] Project 1 is online until Wednesday, 27.03.2024, 23:59 CEST. In today's Q&A session the project team will answer any questions about the project. The solutions will be presented in the first Q&A session after the Easter break, on 10.04.2024.
  • [19.02.2024] Welcome to the course Introduction to Machine Learning!

FAQ
This course is compulsory for my program, but I cannot register. What should I do?
A: If this course is compulsory for your study program (Kernfach), you are able to register irrespective of the waiting list. Please allow some time for the transfer from the waiting list.
Is physical attendance at the didactic activities mandatory?
A: Physical attendance at lectures, tutorials and Q&As is not mandatory but strongly encouraged.
How can I access the materials on the website?
A: Lecture slides, exercises, lecture notes, and recordings of the Q&A are password protected. To obtain the password you need to be inside the ETH network or use the ETH VPN and click here. Check here to learn how to establish a VPN connection. This year’s credentials to the recordings are (username: per-24s, password: scR694P).
What programming knowledge is required for this course?
A: For the programming background, we recommend knowing Python. For those without experience in it, check out this Python tutorial.
Will the projects contain boilerplate code?
A: Yes. There will be some code to guide you through the steps.
What are the most useful libraries to learn for the projects?
A: numpy, sklearn, pandas, torch.
Is there a preferred library for the deep learning section?
A: We will give a tutorial on PyTorch and use PyTorch in the boilerplate code for the projects. However, you may use other libraries.
Am I eligible to take the exam if I fail some of the projects?
A: You must receive an average project grade of 4 or above to be eligible to take the exam. Failing a project (not passing an easy baseline) means a grade of 2 for that project. However, as long as the average is at least 4 (you can achieve this by getting a 6 in other projects), you can take the exam. If your average is below 4, we will ask you to de-register from the exam. If you do not de-register, we will assign you a no-show grade.
I did the projects last year (spring semester) and didn’t take or failed the exam in the spring or following autumn semester (or both). Do they count towards this year’s projects?
A: No, they do not. Projects can be submitted only in the spring semester, and they make you eligible to take the exam in the same (spring) semester and in the following autumn semester. If you do not take the exam or fail the exam in these two sessions, you have to enroll in the class next year and redo the projects.
I am an attendance-only doctoral student. What do I need to do?
A: You need to enroll in the class and complete the projects to get your attendance certified. Your department decides how many credits you get for just attendance. Most departments, including D-INFK and D-MAVT, require you to take the exam as well to be eligible for credits. Please contact your department study administrator for details.
I am a Ph.D. student. Can I skip the projects or the exam but still get a grade?
A: Most Ph.D. students who would like to get credits for the class need to do the projects and the exam. A minority of departments allow you to get credits without taking an exam. See the question above.
Is distance examination allowed?
A: Distance examination is allowed, but you need to file an official request via study administration. We do not handle these requests.

Lectures
Lectures will be on Tuesdays and Wednesdays, 14:15 to 16:00, in ETA F5 with a simultaneous video screening in ETF E1. You can ask questions in person or via the course channel on EduApp.
All lectures and tutorials will be recorded, and the recording will be made available within a day after the lecture in the ETH Video Portal and below.
The first lecture will take place on Tuesday, 20 February 2024.
Date | Topic | Slides | Recording
Tue 20.02. | Introduction | Preliminary [Final] | Recording
Wed 21.02. | Linear Regression | Preliminary [Final] | Recording
Tue 27.02. | Nonlinear regression & Optimization I | Preliminary [Final] | Recording
Wed 28.02. | Optimization II & overparameterization | Preliminary [Final] | Recording
Tue 05.03. | PyTorch and Python tutorial | - | Recording
Wed 06.03. | Model evaluation and selection | Preliminary [Final] | Recording
Tue 12.03. | Bias-variance tradeoff & Regularization | Preliminary [Final] | Recording
Wed 13.03. | Classification I | Preliminary [Final] | Recording
Tue 19.03. | Classification II | Preliminary [Final] | Recording
Wed 20.03. | Classification III & Kernel methods | Preliminary [Final] | Recording
Tue 26.03. | Kernel and other methods | Preliminary [Final] | Recording
Wed 27.03. | Neural Networks I | Preliminary [Final] | Recording
Tue 09.04. | Neural Networks II (Backpropagation/Blackboard) | Preliminary [Final] | Recording
Wed 10.04. | Neural Networks III | Preliminary [Final] | Recording
Tue 16.04. | Neural Networks IV | Preliminary [Final] | Recording
Wed 17.04. | Clustering | Preliminary [Final] | Recording
Tue 23.04. | Dimension Reduction I | Preliminary [Final] | Recording
Wed 24.04. | Dimension Reduction II | Preliminary [Final] | Recording
Tue 30.04. | Probabilistic Modeling I | Preliminary [Final] | Recording
Wed 01.05. | Holiday | - | -
Tue 07.05. | Probabilistic Modeling II | Preliminary [Final] | Recording
Wed 08.05. | Probabilistic Modeling III | Preliminary | Recording
Tue 14.05. | No class | - | -
Wed 15.05. | Gaussian Mixture Models I | Preliminary [Final] | Recording
Tue 21.05. | Gaussian Mixture Models II | Preliminary [Final] | Recording
Wed 22.05. | Gaussian Mixture Models III | Preliminary [Final] | Recording
Tue 28.05. | No class | - | -
Wed 29.05. | LLMs | Preliminary | Recording

Lecture Notes
We provide a detailed manuscript that contains the most important mathematical background needed for understanding the course. These notes will also serve as a reference for the lectures and set up the notation and needed theorems, definitions, and concepts.
The manuscript is not final and will be updated and expanded with notes for some of the lectures as the course progresses, so please check back regularly. For typos, errors and suggestions, please go to the corresponding Moodle section.
Lecture Notes (latest version: 20 June 2024)
Tutorials
Tutorials will be held on Fridays, 14:15 to 16:00, in ETA F5 with a simultaneous video screening in ETF E1. As in the lectures, you will be able to ask questions via the course channel on EduApp.
The tutorial will be recorded and the recording will be made available after the tutorial in the ETH Video Portal and below.
The first tutorial will take place on Friday, 23 February 2024.
Date | Topic | Materials | Recording | Homework/Solution
Fri 23.02. | Math recap | - | Recording | Homework/Solution
Fri 01.03. | Linear regression and Optimisation | Slides [Final] | Recording | -
Fri 08.03. | Probability recap and model selection | Slides | Recording | Homework/Solution
Fri 15.03. | Homework 2 discussion | Slides | Recording | -
Fri 22.03. | Classification and Kernel Methods | Slides [Final] | Recording | Homework/Solution
Fri 29.03. | Review of Homework 3 | Slides [Final] | Recording | Notebook
Fri 12.04. | Homework 4 (Exercises 1-3) | Slides [Final] | Recording | Homework/Solution
Fri 19.04. | Homework 4 (Exercise 4) | Slides [Final] | Recording | -
Fri 26.04. | Clustering and Dimensionality Reduction | Slides | Recording | Homework/Solution
Fri 03.05. | Clustering and Dimensionality Reduction | Slides | Recording | -
Fri 10.05. | Probabilistic modeling | Slides | Recording | Homework/Solution
Fri 17.05. | Discussion of Homework 5 | Slides [Final] | Recording | -
Fri 24.05. | Generative Models and Gaussian Mixture Models | Slides [Final] | Recording | Homework/Solutions
Fri 31.05. | LLMs | Slides [Final] | Recording | -
Q&A Sessions
Q&A sessions (virtual office hours) will be held on Zoom on Wednesdays, 17:00 to 18:00. The Q&A sessions are an informal opportunity to ask questions about the course. We may use some Q&A sessions to give more information about the projects. You will be able to ask questions via the Zoom chat or by speaking up when invited to do so. Attending the Q&A sessions is not mandatory.
The Q&A session will be recorded and the recording will be made available after the Q&A session.
The first Q&A session will take place on Wednesday, 21 February 2024.
Date | Topic | Recording
Wed 21.02. | Administration | Recording
Wed 28.02. | Math recap | Recording
Wed 06.03. | Python environment setup | Recording
Wed 13.03. | Project 1 | Recording
Wed 20.03. | General questions about material | Recording
Wed 27.03. | Project 2 | Recording
Wed 10.04. | Solution of Project 1 | Recording
Wed 17.04. | No one attended | Recording
Wed 24.04. | Project 3 | Recording
Wed 08.05. | Solution of Project 2 and Introduction to Project 4 | Recording
Wed 15.05. | Solution of Project 3 | Recording
Wed 22.05. | No one attended | Recording
Wed 29.05. | Solution of Project 4 | Recording
Tue 06.08. | Exam Q&A | Recording/Slides
Contact
Instructors: Prof. Fernando Perez Cruz and Prof. Fanny Yang
Head TA: Lenart Treven
Assistants:
Jonas Hübotter, Lars Lorch, Dongho Kang, Jin Cheng, Viacheslav Borovitskiy, Berken Utku Demirel, Alexandru Tifrea, Bhavi Sukhija, Yarden As, Scott Sussex, Miguel Zamora, Kiran Doshi, Marco Baumann, Zijun Hui, Tobias Wegel, Julia Kostin, Cynthia Chen, Maxim Huber, Aidyn Ubingazhibov, Sascha Bongni, Yue Li, Mohammad Reza Karimi, Armin Lederer, Piersilvio DE Bartolomeis, Sergeev Fedor, Javier Abad Martinez, Paola Malsot, Anna Kerekes, Hugo Yéche, Alex Immer, Sonali Andani, Mamoun Chami, Charlton Connor, Rajesh Sharma, Filippo Masotti
Mailing List: Please use Moodle for questions regarding course material, organization and projects. If you need to contact the Head TA or the lecturers directly, please send an email to introml24-info@inf.ethz.ch. Before sending an email, please make sure you have read all the information here carefully.
Lectures
Tue 14-16 ETA F5 ETF E1 (via video)
Wed 14-16 ETA F5 ETF E1 (via video)
Tutorials
Fri 14-16 ETA F5 ETF E1 (via video)
Questions & Answers
Wed 17-18 Virtual Zoom

Code Projects
The code projects will require solving machine learning problems with methods taught in the course. Each project requires handing in the solution code as well as a short report. You are allowed to work in groups of 1 to 3 students, but it is your responsibility to find a group. You can search for teammates by posting on Moodle.
There will be 5 code projects. The first project is ungraded and will allow you to become familiar with our code submission workflow. The remaining projects are graded (pass/fail) and mandatory for passing the course.
The timetable of the projects is given below. Details regarding the projects can be found here. If you are having technical issues, please send an email to introml24-projects@inf.ethz.ch.
Project | Release Date | End Date | Weight on Project Grade
Project 0 | Fri, 01.03.2024 08:00 | - | 0
Project 1a&b | Wed, 13.03.2024 08:00 | Wed, 27.03.2024 23:59 CEST | 0.25 (0.125 each)
Project 2 | Wed, 27.03.2024 08:00 | Tue, 23.04.2024 23:59 CEST | 0.25
Project 3 | Tue, 23.04.2024 08:00 | Wed, 08.05.2024 23:59 CEST | 0.25
Project 4 | Wed, 08.05.2024 08:00 | Wed, 22.05.2024 23:59 CEST | 0.25
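
As a rough illustration of how the weights in the timetable combine, the following sketch computes an overall project grade and checks it against the eligibility threshold of 4 mentioned in the FAQ. The function names, the separate grading of Projects 1a and 1b, and the absence of any rounding are assumptions made for this example; the official computation may differ.

```python
# Weights copied from the timetable above; Project 0 is ungraded (weight 0).
# Assumption: Projects 1a and 1b are graded separately with weight 0.125 each.
PROJECT_WEIGHTS = {
    "1a": 0.125,
    "1b": 0.125,
    "2": 0.25,
    "3": 0.25,
    "4": 0.25,
}

# Minimum overall project grade needed to be eligible for the exam.
PASSING_GRADE = 4.0


def project_grade(grades: dict[str, float]) -> float:
    """Return the weighted average of the per-project grades (no rounding)."""
    missing = PROJECT_WEIGHTS.keys() - grades.keys()
    if missing:
        raise ValueError(f"missing grades for: {sorted(missing)}")
    return sum(PROJECT_WEIGHTS[name] * grades[name] for name in PROJECT_WEIGHTS)


def exam_eligible(grades: dict[str, float]) -> bool:
    """Check the overall project grade against the threshold of 4."""
    return project_grade(grades) >= PASSING_GRADE


# Hypothetical student who fails Project 2 (grade 2) but gets a 5 elsewhere.
example = {"1a": 5.0, "1b": 5.0, "2": 2.0, "3": 5.0, "4": 5.0}
print(project_grade(example))  # 4.25
print(exam_eligible(example))  # True
```

In this hypothetical case a single failed project is compensated by the other grades, which matches the FAQ statement that a failed project does not by itself rule out exam eligibility.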

There will be a presentation introducing each project in the Q&A session on the Wednesday the project is released. There will also be a project solution session in the Q&A session the week after the deadline of each project. Hence there are 8 such sessions in total for the 4 graded projects.
Setting up Anaconda
This guide will help you set up the Anaconda environment that you will need for the projects: Setting up Anaconda.
Demos
The demos are hosted on GitLab. Demos are based on Jupyter notebooks (with Python 3.9). Please look at this intro for installation and usage instructions. We recommend that you create a conda environment to maintain the code base. You should have been added to the repository automatically. If you cannot access the repository, please send an email to sbongni@student.ethz.ch with the subject “[IML 2024] GitLab access” and include your nethz username in the email, surrounded by square brackets, like this: “[sbongni]”. (If you get a 404 error or can’t see the repository, this is likely because you don’t have access yet. If you have never logged into GitLab before, I was not able to add you because your account is not displayed.)
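
Once the conda environment is active, a quick sanity check along the following lines may help confirm that the interpreter and the libraries named in the FAQ (numpy, sklearn, pandas, torch) are available. This is only a sketch: the helper name `missing_packages` is made up for this example, and the actual requirements of the demos and projects are defined by the course materials, not by this script.

```python
import importlib.util
import sys

# Libraries listed in the FAQ as most useful for the projects.
REQUIRED = ["numpy", "sklearn", "pandas", "torch"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment.

    find_spec only locates a top-level module; it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor}")
    if (major, minor) != (3, 9):
        # The demos are stated to be based on Python 3.9.
        print("Note: the demos are based on Python 3.9.")
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```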
Performance Assessment
70% session examination, 30% code projects; the final grade will be calculated as a weighted average of these two elements. As a compulsory continuous performance assessment task, the projects must be passed on their own. The coding projects are an integral part (60 hours of work, 2 credits) of the course. Participation is mandatory. To be eligible for the examination of Introduction to Machine Learning (252-0220-00L), you need to pass the code projects, i.e., attain an overall project grade of 4 or higher. Students who do not pass the projects are required to de-register from the exam and will otherwise be treated as a “no show”.
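
The weighting described above can be sketched as follows. The function name is made up for this example, and any rounding of the final grade is left out because it is not specified here.

```python
EXAM_WEIGHT = 0.7      # session examination
PROJECT_WEIGHT = 0.3   # code projects
MIN_PROJECT_GRADE = 4.0


def final_grade(exam_grade: float, project_grade: float) -> float:
    """Weighted average of exam and project grade (illustrative, unrounded)."""
    if project_grade < MIN_PROJECT_GRADE:
        # Without a passed project grade, students must de-register from the exam.
        raise ValueError("overall project grade below 4: not eligible for the exam")
    return EXAM_WEIGHT * exam_grade + PROJECT_WEIGHT * project_grade


# Hypothetical example: exam grade 5.0 and overall project grade 4.25.
print(round(final_grade(5.0, 4.25), 3))  # about 4.775
```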
For the final exam, you can bring two A4 pages (that is, one A4 sheet of paper), either handwritten or typed in a font size of at least 11 points. A simple non-programmable calculator is allowed during the exam. The exam will be multiple choice. Here you can find an example of the question types as well as how to fill out the answer sheet to guarantee successful automatic grading.

Other Resources