Introduction
The course will introduce the foundations of learning and making predictions from data. We will study basic concepts such as trading off goodness of fit against model complexity. We will discuss important machine learning algorithms used in practice, and provide hands-on experience in a series of course projects. VVZ Information is available here.
News
- [14.08.2024] The solutions to the exam are now available: Solutions.
- [18.06.2024] The final version (up to typos) of the lecture notes has been published.
- [18.05.2024] The lecture notes on kernels have been published.
- [24.04.2024] The lecture notes chapters on clustering and PCA have been published! The kernels chapter will appear soon.
- [15.04.2024] The lecture notes chapter on neural networks has been published. The kernels chapter will appear soon.
- [29.03.2024] Link to today's tutorial
- [26.03.2024] The lecture notes chapter on classification has been published!
- [13.03.2024] Project 1 is online until Wednesday, 27.03.2024, 23:59 CEST. In today's Q&A session the project team will answer any questions about the project. The solutions will be presented in the first Q&A session after the Easter break, on 10.04.2024.
- [19.02.2024] Welcome to the course Introduction to Machine Learning!
FAQ
This course is compulsory for my program, but I cannot register. What should I do?
A: If this course is compulsory for your study program (Kernfach), you are able to register irrespective of the waiting list. Please allow some time for the transfer from the waiting list.
Is physical attendance at the teaching activities mandatory?
A: Physical attendance at lectures, tutorials and Q&As is not mandatory but strongly encouraged.
How can I access the materials on the website?
A: Lecture slides, exercises, lecture notes, and recordings of the Q&A are password protected. To obtain the password you need to be inside the ETH network or use the ETH VPN and click here. Check here to learn how to establish a VPN connection. This year’s credentials for the recordings are (username: per-24s, password: scR694P).
What programming knowledge is required for this course?
A: We recommend knowing Python. If you have no experience with it, check out this Python tutorial.
Will the projects contain boilerplate code?
A: Yes. There will be some code to guide you through the steps.
What are the most useful libraries to learn for the projects?
A: numpy, sklearn, pandas, torch.
Is there a preferred library for the deep learning section?
A: We will give a tutorial on PyTorch and use PyTorch in the boilerplate code for the projects. However, you are allowed to use other libraries.
Am I eligible to take the exam if I fail some of the projects?
A: You must receive an average project grade of 4 or above to be eligible to take the exam. Failing a project (not passing an easy baseline) means a grade of 2 for that project. However, as long as the average is 4 or above (you can achieve this by getting a 6 in other projects), you can take the exam. If your average is below 4, we will ask you to deregister from the exam. If you do not deregister, we will assign you a no-show grade.
I did the projects last year (spring semester) and didn’t take or failed the exam in the spring or following autumn semester (or both). Do they count towards this year’s projects?
A: No, they do not. Projects can be submitted only in the spring semester, and they make you eligible to take the exam in the same (spring) semester and in the following autumn semester. If you do not take the exam or fail it in both of these sessions, you have to enroll in the class next year and redo the projects.
I am an attendance-only doctoral student, what do I need to do?
A: You need to enroll in the class and complete the projects to get your attendance certified. Your department decides how many credits you get for attendance alone. Most departments, including D-INFK and D-MAVT, require you to take the exam as well to be eligible for credits. Please contact your department study administrator for details.
I am a Ph.D. student, can I skip the projects or the exam but still get a grade?
A: Most Ph.D. students who would like to get credits for the class need to do the projects and the exam. A minority of departments allow you to get credits without taking an exam. See the question above.
Is distance examination allowed?
A: Distance examination is allowed, but you need to file an official request via study administration. We do not handle these requests.
Lectures
Lectures will be on Tuesdays and Wednesdays, 14:15 to 16:00, in ETA F5 with a simultaneous video screening in ETF E1. You can ask questions in person or via the course channel on EduApp.
All lectures and tutorials will be recorded, and the recordings will be made available within a day after the lecture in the ETH Video Portal and below.
The first lecture will take place on Tuesday, 20 February 2024.
Date | Topic | Slides | Recording |
---|---|---|---|
Tue 20.02. | Introduction | Preliminary [Final] | Recording |
Wed 21.02. | Linear Regression | Preliminary [Final] | Recording |
Tue 27.02. | Nonlinear regression & Optimization I | Preliminary [Final] | Recording |
Wed 28.02. | Optimization II & overparameterization | Preliminary [Final] | Recording |
Tue 05.03. | PyTorch and Python tutorial | | Recording |
Wed 06.03. | Model evaluation and selection | Preliminary [Final] | Recording |
Tue 12.03. | Bias-variance tradeoff & Regularization | Preliminary [Final] | Recording |
Wed 13.03. | Classification I | Preliminary [Final] | Recording |
Tue 19.03. | Classification II | Preliminary [Final] | Recording |
Wed 20.03. | Classification III & Kernel methods | Preliminary [Final] | Recording |
Tue 26.03. | Kernel and other methods | Preliminary [Final] | Recording |
Wed 27.03. | Neural Networks I | Preliminary [Final] | Recording |
Tue 09.04. | Neural Networks II (Backpropagation/Blackboard) | Preliminary [Final] | Recording |
Wed 10.04. | Neural Networks III | Preliminary [Final] | Recording |
Tue 16.04. | Neural Networks IV | Preliminary [Final] | Recording |
Wed 17.04. | Clustering | Preliminary [Final] | Recording |
Tue 23.04. | Dimension Reduction I | Preliminary [Final] | Recording |
Wed 24.04. | Dimension Reduction II | Preliminary [Final] | Recording |
Tue 30.04. | Probabilistic Modeling I | Preliminary [Final] | Recording |
Wed 01.05. | Holiday | ||
Tue 07.05. | Probabilistic Modeling II | Preliminary [Final] | Recording |
Wed 08.05. | Probabilistic Modeling III | Preliminary | Recording |
Tue 14.05. | No class | ||
Wed 15.05. | Gaussian Mixture Models I | Preliminary [Final] | Recording |
Tue 21.05. | Gaussian Mixture Models II | Preliminary [Final] | Recording |
Wed 22.05. | Gaussian Mixture Models III | Preliminary [Final] | Recording |
Tue 28.05. | No class | ||
Wed 29.05. | LLMs | Preliminary | Recording |
Lecture Notes
We provide a detailed manuscript that contains the most important mathematical background needed for understanding the course. These notes will also serve as a reference for the lectures and set up the notation and the required theorems, definitions, and concepts.
The manuscript is not final and will be updated and expanded with notes for some of the lectures as the course progresses, so please check back regularly. For typos, errors and suggestions, please go to the corresponding Moodle section.
Lecture Notes | Last version: 20 June 2024 |
Tutorials
Tutorials will be held on Fridays, 14:15 to 16:00, in ETA F5 with a simultaneous video screening in ETF E1. As in the lectures, you will be able to ask questions via the course channel on EduApp.
The tutorials will be recorded, and the recordings will be made available after each tutorial in the ETH Video Portal and below.
The first tutorial will take place on Friday, 23 February 2024.
Date | Topic | Materials | Recording | Homework/Solution |
---|---|---|---|---|
Fri 23.02. | Math recap | | Recording | Homework/Solution |
Fri 01.03. | Linear regression and Optimisation | Slides [Final] | Recording | |
Fri 08.03. | Probability recap and model selection | Slides | Recording | Homework/Solution |
Fri 15.03. | Homework 2 discussion | Slides | Recording | |
Fri 22.03. | Classification and Kernel Methods | Slides [Final] | Recording | Homework/Solution |
Fri 29.03. | Review of Homework 3 | Slides [Final] | Recording | Notebook |
Fri 12.04. | Homework 4 (Exercises 1-3) | Slides [Final] | Recording | Homework/Solution |
Fri 19.04. | Homework 4 (Exercise 4) | Slides [Final] | Recording | |
Fri 26.04. | Clustering and Dimensionality Reduction | Slides | Recording | Homework/Solution |
Fri 03.05. | Clustering and Dimensionality Reduction | Slides | Recording | |
Fri 10.05. | Probabilistic modeling | Slides | Recording | Homework/Solution |
Fri 17.05. | Discussion of Homework 5 | Slides [Final] | Recording | |
Fri 24.05. | Generative Models and Gaussian Mixture Models | Slides [Final] | Recording | Homework/Solutions |
Fri 31.05. | LLMs | Slides [Final] | Recording | |
Q&A Sessions
Q&A sessions (virtual office hours) will be held on Wednesdays, 17:00 to 18:00, on Zoom. The Q&A sessions are an informal opportunity to ask questions about the course. We may use some Q&A sessions to give more information about the projects. You will be able to ask questions via the native Zoom chat or by speaking, if requested to do so. It is not mandatory to attend the Q&A sessions.
The Q&A sessions will be recorded, and the recordings will be made available after each session.
The first Q&A session will take place on Wednesday, 21 February.
Date | Topic | Recording |
---|---|---|
Wed 21.02. | Administration | Recording |
Wed 28.02. | Math recap | Recording |
Wed 06.03. | Python environment setup | Recording |
Wed 13.03. | Project 1 | Recording |
Wed 20.03. | General questions about material | Recording |
Wed 27.03. | Project 2 | Recording |
Wed 10.04. | Solution of Project 1 | Recording |
Wed 17.04. | No one attended | Recording |
Wed 24.04. | Project 3 | Recording |
Wed 08.05. | Solution of Project 2 and Introduction to Project 4 | Recording |
Wed 15.05. | Solution of Project 3 | Recording |
Wed 22.05. | No one attended | Recording |
Wed 29.05. | Solution of Project 4 | Recording |
Tue 06.08. | Exam Q&A | Recording/Slides |
Contact
Instructors | Prof. Fernando Perez Cruz and Prof. Fanny Yang |
Head TA | Lenart Treven |
Assistants | Jonas Hübotter, Lars Lorch, Dongho Kang, Jin Cheng, Viacheslav Borovitskiy, Berken Utku Demirel, Alexandru Tifrea, Bhavi Sukhija, Yarden As, Scott Sussex, Miguel Zamora, Kiran Doshi, Marco Baumann, Zijun Hui, Tobias Wegel, Julia Kostin, Cynthia Chen, Maxim Huber, Aidyn Ubingazhibov, Sascha Bongni, Yue Li, Mohammad Reza Karimi, Armin Lederer, Piersilvio De Bartolomeis, Sergeev Fedor, Javier Abad Martinez, Paola Malsot, Anna Kerekes, Hugo Yéche, Alex Immer, Sonali Andani, Mamoun Chami, Charlton Connor, Rajesh Sharma, Filippo Masotti |
Mailing List | Please use Moodle for questions regarding course material, organization and projects. If you need to contact the Head TA or the lecturer directly, please send an email to introml24-info@inf.ethz.ch. Please think twice before you send an email though and make sure you read all information here carefully. |
Lectures
Tue 14-16 | ETA F5 | ETF E1 (via video) |
Wed 14-16 | ETA F5 | ETF E1 (via video) |
Tutorials
Fri 14-16 | ETA F5 | ETF E1 (via video) |
Questions & Answers
Wed 17-18 | Virtual | Zoom |
Code Projects
The code projects will require solving machine learning problems with methods taught within the course. Projects will require handing in the solution code as well as a short report. You are allowed to work in groups of 1 to 3 students, but it is your responsibility to find a group. You can search for teammates by posting on Moodle.
In particular, there will be 5 code projects. The first project is ungraded and will allow you to become familiar with our code submission workflow. The remaining projects are graded (pass/fail) and mandatory for passing the course.
The following is the timetable of the projects. Details regarding the projects can be found here. If you are having technical issues, please send an email to introml24-projects@inf.ethz.ch.
Project | Release Date | End Date | Weight on Project Grade |
---|---|---|---|
Project 0 | Fri, 01.03.2024 08:00 | - | 0 |
Project 1a&b | Wed, 13.03.2024 08:00 | Wed, 27.03.2024 23:59 CEST | 0.25 (0.125 each) |
Project 2 | Wed, 27.03.2024 08:00 | Tue, 23.04.2024 23:59 CEST | 0.25 |
Project 3 | Tue, 23.04.2024 08:00 | Wed, 08.05.2024 23:59 CEST | 0.25 |
Project 4 | Wed, 08.05.2024 08:00 | Wed, 22.05.2024 23:59 CEST | 0.25 |
There will be a presentation introducing the project in the Q&A session on the same Wednesday each project gets released. Also, there will be a project solution session in the Q&A session the week after the deadline of a project. Hence there are 8 sessions for all 4 graded projects in total.
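As a rough illustration of how the weights in the timetable combine into an overall project grade, here is a minimal Python sketch. It assumes that the overall project grade is the plain weighted sum of the per-project grades and that eligibility means a result of 4 or above, as stated in the FAQ. The names `PROJECT_WEIGHTS`, `project_grade` and `exam_eligible` are hypothetical and not part of any course tooling, and the official computation (for example, any rounding) may differ.

```python
# Weights taken from the timetable above; Project 0 is ungraded (weight 0).
# The dictionary keys are illustrative labels, not official identifiers.
PROJECT_WEIGHTS = {"1a": 0.125, "1b": 0.125, "2": 0.25, "3": 0.25, "4": 0.25}

# Assumption: eligibility requires an overall project grade of 4 or above.
PASS_THRESHOLD = 4.0


def project_grade(grades):
    """Return the weighted sum of per-project grades (assumed formula)."""
    missing = set(PROJECT_WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades for projects: {sorted(missing)}")
    return sum(PROJECT_WEIGHTS[name] * grades[name] for name in PROJECT_WEIGHTS)


def exam_eligible(grades):
    """Check the assumed eligibility rule: overall project grade >= 4."""
    return project_grade(grades) >= PASS_THRESHOLD


# One failed project (grade 2) compensated by a 6 in all other projects.
example = {"1a": 6.0, "1b": 6.0, "2": 2.0, "3": 6.0, "4": 6.0}
print(project_grade(example), exam_eligible(example))
```

Under these assumptions, a single failed project can be offset by strong results elsewhere, which matches the FAQ answer on eligibility.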
Setting up Anaconda
This is a guide that will help you set up the Anaconda environment that you will need for the projects: Setting up Anaconda.
Demos
The demos are hosted on GitLab. Demos are based on Jupyter Notebook (with Python 3.9). Please look at this intro for installation and running instructions. We recommend that you create a conda environment to maintain the code base. You should have been added to the repository automatically. If you cannot access the repository, please send an email to sbongni@student.ethz.ch with the subject “[IML 2024] GitLab access” and include your nethz in square brackets, like this: “[sbongni]”. (If you get a 404 error or can’t see the repository, this is likely because you don’t have access yet. If you have never logged into GitLab before, I was not able to add you because your account is not displayed.)
Performance Assessment
70% session examination, 30% code project; the final grade will be calculated as the weighted average of these two elements. As a compulsory continuous performance assessment task, the project must be passed on its own. The coding projects are an integral part (60 hours of work, 2 credits) of the course. Participation is mandatory. To be eligible for the examination of Introduction to Machine Learning (252-0220-00L), you need to pass the code projects, i.e., attain an overall project grade of 4 or higher. Students who do not pass the project are required to de-register from the exam and will otherwise be treated as a “no show”.
For the final exam, you can bring two A4 pages (that is, one A4 sheet of paper), either handwritten or typed in a font size of at least 11 points. A simple non-programmable calculator is allowed during the exam. The exam will be multiple choice. Here you can find an example of the question types as well as how to fill out the answer sheet to guarantee successful automatic grading.
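The 70/30 weighting described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the final grade is taken to be the plain weighted average of the exam grade and the overall project grade, with no rounding applied, since the page does not specify a rounding rule. The names `EXAM_WEIGHT`, `PROJECT_WEIGHT` and `final_grade` are hypothetical.

```python
# Weights stated in this section: 70% session examination, 30% code project.
EXAM_WEIGHT = 0.7
PROJECT_WEIGHT = 0.3


def final_grade(exam_grade, project_grade):
    """Weighted average of exam and project grade (assumed, unrounded)."""
    return EXAM_WEIGHT * exam_grade + PROJECT_WEIGHT * project_grade


# Hypothetical grades: exam 4.5, overall project grade 5.0.
print(final_grade(4.5, 5.0))
```

Note that this formula only applies once the project has been passed on its own, since students with an overall project grade below 4 are not admitted to the exam.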
Other Resources
- M. P. Deisenroth, A. A. Faisal, and C. S. Ong. Mathematics for Machine Learning. Cambridge University Press, 2020.
- K. Murphy. Machine Learning: a Probabilistic Perspective. MIT Press, 2012.
- C. Bishop. Pattern Recognition and Machine Learning. Springer, 2007. (optional)
- T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, 2001.
- L. Wasserman. All of Statistics: A Concise Course in Statistical Inference. Springer, 2004.
- G. James, D. Witten, et al. An Introduction to Statistical Learning. Springer, 2021.