Teaching ML @ ECML-PKDD 2021

Content

Important dates
Programme
1. Preface Satellite Event - September 8, 2021
2. Schedule for Workshop Day - September 13, 2021
  1. Video Conference Details
About this workshop
Motivation
Topics Covered
Accepted Papers
Questions, Concerns or Feedback

Important dates

May 05, 2021: Paper submissions open
Jun 25, 2021: Paper submissions due (deadline extended)
Jul 21, 2021: Paper decisions
Aug 18, 2021: Camera ready due
Sep 01, 2021: all papers online
Sep 08, 2021: satellite event (virtual)
Sep 13, 2021: day of the workshop (paper presentations and discussion, virtual)

Programme

This workshop is planned as an online event. For both events we will use Zoom as our online conference service. We will publish the conference link on this website before the event.

Preface Satellite Event - September 8, 2021

Time/CEST	Title	Speaker
2:00 pm	AI Campus - the learning platform for artificial intelligence (video)	Cornelia Gamst
3:00 pm	Embedding Ethics in Data Structures Classes (video)	Sorelle Friedler
4:00 pm	Rethinking the dreaded textbook: taking inspiration from the inclusive, ethical, and active machine learning classroom (video)	Alicia Johnson
5:00 pm	Wrap up

The video of each talk will be made available on YouTube. The notes including the Q&A are available here.

Demo of “AI Campus - the learning platform for artificial intelligence”

Everybody (well, almost) wants to learn Machine Learning or Data Science these days. The big online learning platforms already offer a wide variety of options, differing in level, length, depth and also learning formats. But somehow in Germany there still seems to be a missing piece. “AI Campus” wants to fill that gap with specialized courses and learning nuggets for a broad target group: from in-depth courses for domain experts to learning nuggets offering a basic understanding of what’s behind AI and machine learning for the general public.

In this workshop, we will give a brief demo of the learning formats available on AI Campus, discuss our general curriculum framework, and take a closer look at how we do things in specific courses that are of interest to the workshop audience.

About Cornelia Gamst

Cornelia Gamst is the curriculum manager at the KI-Campus and, together with the DFKI, is responsible for the content of the overall programme offered at the KI-Campus.

Embedding Ethics in Data Structures Classes

Fairness in machine learning, or more broadly AI Ethics, has become a hot topic in research over the past 5 years, and these topics are increasingly being incorporated into the machine learning and AI curriculum. In this talk, I’ll argue that we can best prepare our students to participate in these conversations and build better machine learning systems by introducing them to ethical ideas early in the curriculum. Data Structures, generally taught in the second semester of a college computer science curriculum, is an early and required class where students learn to think of themselves as problem solvers. Integrating data-driven real-world projects and associated ethical concerns into data structures teaches students the background they’ll need to build better ML. I’ll discuss projects integrating racial equity concerns and environmental impact into the data structures curriculum.

About Sorelle Friedler

Sorelle Friedler is an Associate Professor of Computer Science at Haverford College and an Affiliate at the Data & Society Research Institute. She is a Co-Founder of the ACM Conference on Fairness, Accountability, and Transparency. Sorelle’s research focuses on the fairness and interpretability of machine learning algorithms, with applications from criminal justice to materials discovery, and she has incorporated these ideas into introductory computer science, data science, and algorithms courses.

Rethinking the dreaded textbook: taking inspiration from the inclusive, ethical, and active machine learning classroom

As machine learning educators, we’re trained to translate and weave developing “best practices” into our classrooms. For example, we might aim to create inclusive courses which center data ethics while promoting active learning. Especially since these same ends aren’t always embedded in our classroom resources (e.g., textbooks), pulling this off can require some patchwork and wizardry. In this talk, we’ll discuss Bayes Rules! An Introduction to Bayesian Modeling with R (Johnson, Ott, & Dogucu), our attempt at: (1) supporting educators in their implementation of inclusive, ethical, and active learning practices; and (2) reflecting that these goals are critical to the entire machine learning workflow, not just the classroom; all while (3) being ourselves.

About Alicia Johnson

Professor Johnson’s primary research interest is in Markov chain Monte Carlo (MCMC) methods. The focus of her research has been on the convergence rates of chains corresponding to MCMC algorithms on general state spaces. In addition, she enjoys the unlimited applications of statistics. As a consultant for a division of the Centers for Disease Control and Prevention and the University of Minnesota, she has collaborated on projects in entomology, forestry, primatology, public health, and others. In other words, her work in statistics allows her to keep learning a little about a lot!

Schedule for Workshop Day - September 13, 2021

Time/CEST	Topic
2:30 - 2:45 pm	Welcome
2:45 - 3:45 pm	Community Connect: During the community connect, participants will have the opportunity to discuss the workshop’s papers with the authors. This is a chance to connect over emerging pedagogical practices and philosophies while also building a more robust community of machine learning educators.
3:45 - 4:00 pm	Break
4:00 - 5:00 pm	Workshop discussions
5:00 - 5:30 pm	Wrap up

Video Conference Details

The satellite event will be streamed on Zoom. To connect, use the following link. Telephone dial-ins are available here.

About this workshop

Machine Learning-based approaches have become ubiquitous in many areas of society, industry and academia. Understanding what Machine Learning (ML) provides and reproducing what it infers has become an essential prerequisite for adoption. In this line of thought, course materials, introductory media and lecture series of a broad variety, depth, and quality are publicly available. To date and to the best of our knowledge, there is no structured approach to collecting and discussing best practices in teaching Machine Learning. This workshop strives to change this.

With our workshop, we want to start an academic discussion on best practices. We would like to help improve existing material as a community and make conceiving new material more effective. We are very happy that this idea was approved for the ECML PKDD 2021 workshop programme. Like ECML PKDD 2021, Teaching ML 2021 will be a virtual event.

Motivation

Many experts and practitioners who develop Machine Learning models or infrastructure around these models are confronted with the opportunity to teach Machine Learning at some point in their career. Traditionally, many rely on their gut feeling to design courses that are motivated by these circumstances. The methods of choice are often PowerPoint or similar technologies.

This workshop targets those who would like to know how teachers from around the globe approach teaching Machine Learning: How deep do they dive into the matter? What mental models do they use to visualize concepts? What media do others use when teaching ML?

With this workshop, we hope that all participants obtain a better sense of where they stand with their teaching and how they can improve or collaborate with others.

Topics Covered

The main goal of this workshop is to motivate and nourish best practices at any stage of the teaching process. For this, we would like to cover a structured approach to teaching motivated by The Carpentries or a variation thereof. We believe that the core concepts contained in this approach are helpful for any teaching practitioner.

The central activity of the workshop will be twofold:

a call for papers in which teaching professionals or beginners are asked to describe their method of choice when teaching a given ML topic. We would like to attract mini-articles of at most 4 pages (excluding references and acknowledgements) that present or discuss a teaching activity related to machine learning. For more details, see below.
a community connection session in which accepted papers are shared and presenters and participants can discuss them. (This is like a poster session, but without posters.)

Proceedings and Paper Format

All papers will be published in the Proceedings of Machine Learning Research. The papers must be written in English and formatted according to the ICML 2021 LaTeX template.

The maximum length of papers is 4 pages (excluding references and acknowledgements) in this format. The program chairs reserve the right to reject any over-length papers without review. Papers that ‘cheat’ the page limit, for instance by using smaller than specified margins or font sizes, will also be treated as over-length. Note that negative vspaces, for example, are also not allowed.

Additional materials (e.g. proofs, audio, images, video, data, or source code) can be provided as URLs inside the paper of your submission. The reviewers and the program committee reserve the right to judge the paper solely based on the 4 pages; looking at any additional material is at the discretion of the reviewers and is not required.

We strive to pursue a double-blind review process. All papers need to be ‘best-effort’ anonymized. We strongly encourage authors to also make code and data available anonymously (e.g., in an anonymous git repository or Dropbox folder). It is allowed to have a (non-anonymous) pre-print online, but it should not be cited in the submitted paper to preserve anonymity. Reviewers will be asked not to search for such pre-prints.

Paper Submission

We invite interested authors to submit their article on openreview.net here.

Paper Reviews

We will conduct an open double-blinded peer-review using openreview.net on all contributions and select contributions based on the reviewers’ feedback. Here are the important dates:

May 5, 2021: Submission opens
June 25, 2021: Submission deadline (extended; no submissions past this date)
July 21, 2021: Paper Confirmations

Each submitted paper will be reviewed publicly by at least two experienced machine learning instructors.

Accepted Papers

Workshop Proceedings

Accepted papers are published in the Proceedings of Machine Learning Research (PMLR). They are open-access and can be retrieved from the PMLR website.

Staying Ahead in the MOOC-Era by Teaching Innovative AI Courses

Patrick Glauner

As a result of the rapidly advancing digital transformation of teaching, universities have started to face major competition from Massive Open Online Courses (MOOCs). Universities thus have to set themselves apart from MOOCs in order to justify the added value of three to five-year degree programs to prospective students. In this paper, we show how we address this challenge at Deggendorf Institute of Technology in ML and AI. We first share our best practices and present two concrete courses including their unique selling propositions: Computer Vision and Innovation Management for AI. We then demonstrate how these courses contribute to Deggendorf Institute of Technology’s ability to differentiate itself from MOOCs (and other universities).

Teaching machine learning through end-to-end decision making

Hussain Kazmi

Advances in machine learning and data science hold the potential to greatly optimize the overall energy sector, and prevent the worst outcomes of anthropogenic climate change. However, despite the urgent need for trained energy data scientists and the presence of a number of technically challenging issues that need to be tackled, the sector continues to suffer from a personnel shortage and remains mired in outdated technology. In many programs, energy engineers continue to graduate without even rudimentary programming skills, let alone knowledge of data science. This paper highlights key findings from an introductory course on machine learning and optimization designed specifically for energy engineering students. The course employs a number of teaching aids, which we hope will be useful for the broader community as well. The course was developed in a pan-European setting, supported by four different European universities as part of a broader roadmap to overhaul energy education.

Deep Learning Projects from a Regional Council: An Experience Report

Jónathan Heras

Due to the impact of Deep Learning both in industry and academia, there is a growing demand for graduates with skills in this field, and Universities are starting to offer courses that include Deep Learning subjects. Hands-on assignments that teach students how to tackle Deep Learning tasks are an instrumental part of those courses. However, most Deep Learning assignments have two main drawbacks. First, they use either toy datasets, which are useful to teach concepts but whose solutions do not generalise to real problems, or datasets that require specialised knowledge to fully understand the problem. Secondly, most Deep Learning assignments are focused on training a model, and do not take into account other stages of the Deep Learning pipeline, such as data cleaning or model deployment. In this work, we present an experience in an Artificial Intelligence course where we have tackled the aforementioned drawbacks by using datasets from the regional council where our University is located. Namely, the students of the course have developed several computer vision and natural language processing projects; for instance, a news classifier or an application to colourise historical images. We share the workflow followed to organise this experience, several lessons that we have learned, and challenges that can be faced by other instructors that try to conduct a similar initiative.

An Introduction to AI for GLAM

Daniel van Strien, Mark Bell, Nora Rose McGregor, Michael Trizna

There is a growing interest in utilising Machine Learning (ML) techniques within Galleries, Libraries, Archives and Museums (GLAM), and a corresponding demand for training to enable practitioners to engage confidently in this area. Staff at these institutions are seeking practical knowledge and skills in ML concepts and methods specific to the sector’s work, such as in the curation and collection of heritage collections. In this paper, we discuss the motivations and methods behind “An Introduction to AI for GLAM”, a new Carpentries workshop under development through an international partnership between British Library, Smithsonian Institution, and The National Archives UK. This new workshop aims to introduce GLAM practitioners to the essential conceptual and practical considerations for supporting, participating in and undertaking machine learning-based research and projects within the GLAM sector.

Using Matchboxes to Teach the Basics of Machine Learning: an Analysis of (Possible) Misconceptions

Erik Marx, Thiemo Leonhardt, David Baberowski, Nadine Bergner

The idea of chess-playing matchboxes, conceived by Martin Gardner as early as 1962, is becoming more and more relevant in learning materials in the area of AI and Machine Learning. Thus, it can be found in a large number of workshops and papers as an innovative teaching method to convey the basic ideas of reinforcement learning. In this paper the concept and its variations will be presented and the advantages of this analog approach will be shown. At the same time, however, the limitations of the approach are analyzed and the question of alternatives is raised.

Teaching Deep Learning, a boisterous ever-evolving field

Alfredo Canziani

Machine and deep learning techniques are actively being developed with over 150 papers submitted daily to arXiv, each of which is introducing its own notation. To offer a course that reflects the latest developments of the field and illustrate them in a cohesive and consistent manner, one needs to systematically consume the literature, summarise and standardise it, implement working examples, and deliver a concise and consistent presentation of a given topic. This paper reports all the best practices developed by the author in their last decade of teaching experience.

Teaching Machine Learning for the Physical Sciences: A summary of lessons learned and challenges

Viviana Acquaviva

This paper summarizes some challenges encountered and best practices established in several years of teaching Machine Learning for the Physical Sciences at the undergraduate and graduate level. I discuss motivations for teaching ML to Physicists, desirable properties of pedagogical materials such as accessibility, relevance, and likeness to real-world research problems, and give examples of components of teaching units.

Teaching Responsible Machine Learning to Engineers

Hilde Jacoba Petronella Weerts, Mykola Pechenizkiy

With the increasing application of machine learning models in practice, there is a growing need to incorporate ethical considerations in engineering curricula. In this paper, we reflect upon the development of a course on responsible machine learning for undergraduate engineering students. We found that technical material was relatively easy to grasp when it was directly linked to prior knowledge on machine learning. However, it was non-trivial for engineering students to make a deeper connection between real-world outcomes and ethical considerations such as fairness. Moving forward, we call upon educators to focus on the development of realistic case studies that invite students to interrogate the role of an engineer.

Deeper Learning By Doing: Integrating Hands-On Research Projects Into A Machine Learning Course

Sebastian Raschka

Machine learning has seen a vast increase of interest in recent years, along with an abundance of learning resources. While conventional lectures provide students with important information and knowledge, we also believe that additional project-based learning components can motivate students to engage in topics more deeply. In addition to incorporating project-based learning in our courses, we aim to develop project-based learning components aligned with real-world tasks, including experimental design and execution, report writing, oral presentation, and peer-reviewing. This paper describes the organization of our project-based machine learning courses with a particular emphasis on the class project components and shares our resources with instructors who would like to include similar elements in their courses.

Teaching Machine Learning in the Context of Critical Quantitative Information Literacy

Carrie Diaz Eaton

Bates College is a small liberal arts postsecondary institution in the northeast United States. An information literacy course, Calling Bull, serves as an introductory data science class as well as a prerequisite-free quantitative literacy class. In this context, we spend a week discussing machine learning, with an emphasis on facial recognition algorithms. The emphasis is on the general algorithmic approach, critical inquiry of the process and careful interpretation of results presented in research or decision-making. This module relies on the use of open educational materials, discussion, and careful attention to issues of marginalization and algorithmic justice.

Teaching Uncertainty Quantification in Machine Learning through Use Cases

Matias Valdenegro-Toro

Uncertainty in machine learning is not generally taught as general knowledge in Machine Learning course curricula. In this paper we propose a short curriculum for a course about uncertainty in machine learning, and complement the course with a selection of use cases, aimed to trigger discussion and let students play with the concepts of uncertainty in a programming setting. Our use cases cover the concept of output uncertainty, Bayesian neural networks and weight distributions, sources of uncertainty, and out of distribution detection. We expect that this curriculum and set of use cases will motivate the community to adopt these important concepts into courses for safety in AI.

Experiences from Teaching Practical Machine Learning Courses to Master’s Students with Mixed Backgrounds

Omar Shouman, Simon Fuchs, Holger Wittges

Machine learning education has become more accessible and relevant to students from various backgrounds. Practical courses complement theoretical lectures by focusing on applied machine learning. In this work, we report on our experiences from teaching two practical machine learning courses to master’s students from different study programs: an introductory and an advanced course. We present a summary of the teaching and evaluation methods used in both courses. We summarize our experiences and the feedback collected from the students through a survey. We conclude with our recommendations on teaching and designing practical machine learning courses.

A lesson for teaching fundamental Machine Learning concepts and skills to molecular biologists

Rabea Müller, Akinyemi Mandela Fasemore, Muhammad Elhossary, Konrad U Förstner

Machine Learning represents an invaluable set of tools for the analysis of data in molecular biology as well as bio-medicine. Here we present a training approach to teach fundamental machine learning skills to researchers in their early career stage (PhD and postdoc level) with the aim to empower them to apply these methods in their own research projects. The content was developed to be delivered in a short and intense learning period as part of a remote systems biology workshop but can be adapted to other scenarios with a less restricted time frame.

Participatory Live Coding and Learning-Centered Assessment in Programming for Data Science

Sarah M Brown

Programming for Data Science is a programming-intensive data science course. This paper discusses a revision of the course to center student learning. The revision effort centered the desired learning outcomes and resulted in a course that charted an explicit path toward achieving them for students. This paper summarizes the design overall and provides practical details about the instruction via participatory live coding and assessment with a competency-based grading scheme.

Putting the “Machine” Back in Machine Learning for Engineering Students

Rudy Chin, Dimitrios Stamoulis, Diana Marculescu

Computer hardware architecture has played an important role in the recent advances made in deep learning and associated applications. However, effective teaching strategies for hardware architectures for machine learning require a different structure and technical background than classic machine learning. More specifically, the material needs not only to convey necessary machine learning concepts to students, but also to cover the hardware and software infrastructure concepts required for supporting machine learning systems. In this paper, we describe our approach to designing the course materials along with student assessment and evaluation for the “Hardware Architectures for Machine Learning” course targeting Electrical and Computer Engineering graduate students.

Teaching Machine Learning in Argentina: the ClusterAI pipeline

Martin Palazzo, Agustin Velazquez, Melisa Breda, Matias Callara, Nicolas Aguirre

Teaching machine learning has been a growing activity in almost any educational establishment. Despite the high availability of study materials, the Latin America region has seen a lack of educational programs focused on machine learning. Additionally, the majority of educational materials are available only in English. In this work we propose the ClusterAI pipeline, based on a curated list of topics in Spanish and a collaboration with the Buenos Aires city government, which opens public data sets that let students apply machine learning models to real data.

Reviewers

We are extremely grateful to the group of volunteers who make this event happen by reviewing the submitted papers:

Claudia A. Engel (Stanford)
Melody Su (Mount Holyoke College)
Carola Gajek (University of Augsburg)
Lukas Heinrich (CERN)
Colin Sauze (Prifysgol Aberystwyth University)
David Rousseau (National institute of nuclear and particle physics, IN2P3)
Sebastian Starke (HZDR)
Steve Schmerler (HZDR)
Ludwig Bothmann (University of Munich)
John Laudun (US Army Combined Arms Center)
Sinead Williamson (University of Texas)
Patrick Glauner (Deggendorf Institute of Technology)
Jónathan Heras (Universidad de La Rioja)
Matias Valdenegro (German Research Center for Artificial Intelligence, DFKI)
Mike Trizna (Smithsonian Institution)
Guillaume Muller (Université Jean Monnet Saint-Etienne)
Martin Palazzo (Universidad Tecnológica Nacional Facultad Regional Buenos Aires)
Thiemo Leonhardt (TU Dresden)

Questions, Concerns or Feedback

We are happy to hear your questions, concerns or feedback. Please open an issue here or contact us.