Teaching Machine Learning Workshop at ECML 2022

Content

  1. Content
  2. Important dates
  3. About this workshop
  4. Programme
    1. Preface Satellite Event
      1. “MOOC Machine learning in python with scikit-learn” by David Arturo Amor Quiroz
      2. “Teaching in the Open: advancing education by adopting open source and open science practices” by Prof. Lorena A. Barba
    2. Hybrid Conference Workshop at ECML’22
  5. Motivation
  6. Topics Covered
    1. Proceedings and Paper Format
    2. Paper Submission
    3. Paper Reviews
  7. Accepted Papers
    1. Developing Open Source Educational Resources for Machine Learning and Data Science
    2. Teaching Machine Learning with mlr3 using Shiny
    3. Teaching Machine Learning with Applied Interdisciplinary Real World Projects
    4. A Deep Learning Bootcamp for Engineering & Management Students
    5. Hearts Gym: Learning Reinforcement Learning as a Team Event
    6. Stimulating student engagement with an AI board game tournament
    7. Will the sun shine? – An accessible dataset for teaching machine learning and deep learning
    8. Data, Trees, and Forests – Decision Tree Learning in K-12 Education
    9. Introduction to AI: Crash Course for an audience with diverse backgrounds
    10. Machine Learning Students Overfit to Overfitting
    11. Reviewers
  8. Questions, Concerns or Feedback

Important dates

  • June 20, 2022: Paper submissions due
  • June 27, 2022: extended Paper submissions due
  • July 24, 2022: Peer Review submissions due
  • July 26, 2022: Paper decisions
  • Sep 06, 2022: Camera ready due (including video recording)
  • Sep 09, 2022: all papers online
  • Sep 13, 2022: virtual satellite event
  • Sep 23, 2022: day of the workshop (paper presentations as posters and coordinated discussions)

About this workshop

Machine Learning based approaches have become ubiquitous in many areas of society, industry and academia. Understanding what Machine Learning (ML) provides and reproducing what it infers has become an essential prerequisite for adoption. In this line of thought, course materials, introductory media and lecture series of a broad variety, depth, and quality are publicly available. To date and to the best of our knowledge, there is no structured approach to collect and discuss best practices in teaching Machine Learning. This workshop strives to change this.

With our workshop, we want to perpetuate an academic discussion on best practices. We would like to help improve existing material as a community and make conceiving new material more effective. We are very happy that this idea was approved for the ECML PKDD 2020 and ECML PKDD 2021 workshop programmes. We hope to continue this in 2022.

Programme

This workshop is currently planned as a hybrid event. We will use Zoom as our online conference service unless the conference endorses a different one.

The workshop programme is split into two parts:

  • a satellite event as a preface to the conference workshop
  • the conference workshop

Preface Satellite Event

This event will take place on September 13, 2022, starting at 2:45 pm CEST. The times and presenters are indicated in the table below. All details needed to participate, e.g. the Zoom connection details, are available on the event note pad.

Note: To participate in the satellite event, no ECML ticket is needed.

Time (CEST) | Title | Speaker
2:45 pm | Welcome and Housekeeping | The Organizers
3:00 pm | MOOC Machine learning in python with scikit-learn | David Arturo Amor Quiroz
4:00 pm | Teaching in the Open: advancing education by adopting open source and open science practices | Lorena Barba

“MOOC Machine learning in python with scikit-learn” by David Arturo Amor Quiroz

The MOOC “Machine learning in Python with scikit-learn” first aired in spring 2021 and has since had two sessions with an average of 11,500 registered participants per session. In this talk we are going to discuss how the material can be used by teachers and students, as well as the technical and pedagogical choices and the general experience that the team has gained through the first 2 sessions.

The course is free of charge, requires no installation, and includes a final attestation and a discussion forum where the scikit-learn core developers answered students’ questions. It offers a hands-on course with 7 modules (+ 1 introductory module), 15 video lessons, 70 programming notebooks, 26 quizzes, 7 wrap-up quizzes and 21 non-graded exercises. A static version of the course material is available on JupyterBook and the code can be found on GitHub where everybody can contribute.

David Arturo Amor Quiroz did his PhD in physics at the Institute of Nuclear Sciences of UNAM, Mexico (2014-2018). He is currently working at the National Institute for Research in Digital Science and Technology (INRIA), France, as part of the maintenance team of the Machine Learning library called scikit-learn.

“Teaching in the Open: advancing education by adopting open source and open science practices” by Prof. Lorena A. Barba

Subjects like machine learning and data science are changing very fast. It’s thus a challenge for any particular department or college to teach high-quality and up-to-date courses on these topics. By looking to adopt ethos and processes from open source software and open science, we can enhance quality and outcomes in teaching new subjects. Ethos means the practices and values that characterize open source communities. The open education movement starting in the 1990s was inspired by open source software; its most visible efforts have been open courseware (OCW) and open educational resources (OER). But key features were missed: the open development model, community building, and networked collaboration. Teaching in the Open means looking to form collaborations in the development of curricula and content, sharing learning objects under permissive licenses, thinking of reuse from the beginning, participating in peer review of content and learning objects, and accepting community contributions. In this vein, we founded The Journal of Open Source Education (https://jose.theoj.org), publishing papers on both software for educational purposes, and learning modules, particularly on computing-based courses. The contributions of authors, editors, and reviewers show that communities of teacher-scholars are forming, growing, and having impact.

Lorena A. Barba is professor of mechanical and aerospace engineering at the George Washington University in Washington, DC. An international leader in computational science and engineering, she is also a long-standing advocate of open source software for science and education, and she is well known for her courses and open educational resources. She was a recipient of the 2016 Leamer-Rosenthal Award for Open Social Sciences, and in 2017, was nominated and received an honorable mention in the Open Education Awards for Excellence of the Open Education Consortium. Barba served (2014–2021) on the Board of Directors for NumFOCUS, a 501(c)(3) public charity in the United States that supports and promotes world-class, innovative, open-source scientific software. She is an expert in research reproducibility, and was a member of the National Academies study committee on Reproducibility and Replicability in Science, which released its report in 2019. She served as Reproducibility Chair for the SC19 (Supercomputing) Conference, is Editor-in-Chief and track editor for Reproducible Research in IEEE’s Computing in Science & Engineering, was founder and Associate Editor-in-Chief for the Journal of Open Source Software, and is Editor-in-Chief of The Journal of Open Source Education. She was General Chair of the global JupyterCon 2020 and was named Jupyter Distinguished Contributor in 2020.

Hybrid Conference Workshop at ECML’22

As we have a strong European and US community involved, we try to reflect this in our workshop setup on September 23, 2022, in Grenoble (France).

Our workshop will take place at the conference (European community, hybrid setup) and online-only after this event (US and European community). We start with a hybrid workshop at ECML’22 in Grenoble from 2:30 pm CEST to 6:30 pm CEST.

We will use gather.town as a virtual meeting space and ether.pad for collaborative notes. Please contribute!

Agenda:

  • welcome
  • identifying topics of interest
  • guided group discussion
  • intermediate reports
  • break and chance to change topic
  • guided group discussion
  • conclusion of group discussions
  • hand-over to virtual on gather.town

Online workshop on Sep 23, 2022, starting at 7:00 pm CEST, fully virtual on gather.town:

  • welcome and hand-over report from physical event
  • group discussion
  • conclusion

For every paper, the authors need to provide a poster (or similar) and a short video introducing the paper. The videos will be posted along with the paper link on our website. Before the workshop, every participant will get an email with a random selection of 3 videos to watch.

The key contents of the workshop discussion will find their way into a summary paper published alongside all papers of the workshop on PMLR.

Motivation

Many experts and practitioners who develop Machine Learning models or infrastructure around these models are confronted with the opportunity to teach Machine Learning at some point in their career. Traditionally, many rely on their gut feeling to design courses that are motivated by these circumstances. The methods of choice are often PowerPoint or similar technologies.

This workshop targets those who would like to know how teachers from around the globe approach teaching Machine Learning: How deep do they dive into the matter? What mental models do they use to visualize concepts? What media do others use when teaching ML?

With this workshop, we hope that all participants get a better sense of where they stand with their teaching and how they can improve or collaborate with others.

Topics Covered

The main goal of this workshop is to motivate and nourish best practices at any stage of the teaching process. For this, we would like to cover a structured approach to teaching motivated by The Carpentries or a variation thereof. We believe that the core concepts contained in this approach are helpful for any teaching practitioner.

The central activity of the workshop will be twofold:

  1. a call for papers whereby teaching professionals or beginners are asked to describe their method of choice when teaching a given ML topic. We would like to attract mini-articles of at most 4 pages (excluding references and acknowledgements) that present or discuss a teaching activity related to machine learning. For more details, see below.

  2. accepted papers will be shared in a community connection session, where presenters and participants can discuss the papers. (This is like a poster session but without posters)

Proceedings and Paper Format

All papers will be published in the Proceedings of Machine Learning Research. The papers must be written in English and formatted according to the ICML 2021 LaTeX template.

The maximum length of papers is 4 pages (excluding references and acknowledgements) in this format. The program chairs reserve the right to desk reject any over-length papers without review. Papers that ‘cheat’ the page limit, for example by using smaller than specified margins or font sizes, will also be treated as over-length. Note that, for example, negative \vspace values are also not allowed.

Additional materials (e.g. proofs, audio, images, video, data, or source code) can be provided as URLs inside your submitted paper. The reviewers and the program committee reserve the right to judge the paper solely based on the 4 pages; looking at any additional material is at the discretion of the reviewers and is not required. In order not to disclose the submitting authors’ identity, please consider using tools like anonymous.4open.science.

We strive to pursue a double-blind review process. All papers need to be ‘best-effort’ anonymized. We strongly encourage authors to also make code and data available anonymously (e.g., in an anonymous git repository or Dropbox folder). It is allowed to have a (non-anonymous) pre-print online, but it should not be cited in the submitted paper to preserve anonymity. Reviewers will be asked not to search for such pre-prints.

Paper Submission

For past content accepted at our workshop, please see the proceedings of 2021 and 2020. We are open to any submission aligned with the goals of our workshop. In 2022, we cordially encourage authors to focus on

  • in-depth discussions of teaching exercises
  • quantitative studies of learner progression
  • quantitative assessment of teaching exercises
  • in-depth discussions of data sets amenable to teaching ML
  • discussions of unplugged material that teaches ML without a computer
  • discussions on how to foster feedback between learners and instructors (perhaps in an automated fashion)
  • discussions on how to manage learner expectations prior to a course
  • comparing in-presence versus online teaching experiences

Please submit your papers on our openreview.net page. Should you encounter any problems, please reach out to us on our issue tracker or as described below.

Paper Reviews

We will conduct an open double-blind peer review of all contributions using openreview.net and select contributions based on the reviewers’ feedback. Here are the important dates:

  • May 13, 2022: Submission opens
  • June 27, 2022: extended Paper submissions due
  • July 24, 2022: Peer Review submissions due
  • July 26, 2022: Paper decisions

Each submitted paper will be reviewed publicly by at least two experienced machine learning instructors.

Accepted Papers

For past workshops, see the accepted papers in 2021 and 2020. All published papers are available on PMLR.

Developing Open Source Educational Resources for Machine Learning and Data Science

by Ludwig Bothmann, Sven Strickroth, Giuseppe Casalicchio, David Rügamer, Marius Lindauer, Fabian Scheipl, Bernd Bischl

Resources: video, paper

Teaching Machine Learning with mlr3 using Shiny

by Gero Szepannel, Laurens Tetzlaff, Alexander Frahm, Karsten Lübke

Resources: video, paper

Teaching Machine Learning with Applied Interdisciplinary Real World Projects

by Gulustan Dogan

Resources: paper

A Deep Learning Bootcamp for Engineering & Management Students

by Lukas Lodes, Alexander Schiendorfer

Resources: video, paper

Hearts Gym: Learning Reinforcement Learning as a Team Event

by Stefan Kesselheim, Jan Ebert, Danimir T Doncevic

Resources: video, paper

Stimulating student engagement with an AI board game tournament

by Ken Hasselmann, Quentin Lurkin

Resources: video, paper

Will the sun shine? – An accessible dataset for teaching machine learning and deep learning

by Florian Huber, Dafne Erica van Kuppevelt, Peter Steinbach, Colin Sauze, Yang Liu, Berend Weel

Resources: video, paper

Data, Trees, and Forests – Decision Tree Learning in K-12 Education

by Tilman Michaeli, Stefan Seegerer, Lennard Kerber, Ralf Romeike

Resources: video, paper

Introduction to AI: Crash Course for an audience with diverse backgrounds

by Donatella Cea, Helene Hoffmann, Marie Piraud

Resources: video, paper

Machine Learning Students Overfit to Overfitting

by Matias Valdenegro-Toro, Matthia Sabatelli

Resources: video, paper

Reviewers

We are extremely grateful to the group of volunteers who have made this event happen by reviewing submitted papers over the past years. We hope to attract reviewers again this year. Should you be interested, please let us know and contact us as indicated below.

Questions, Concerns or Feedback

We are happy to hear from you regarding your questions, concerns or feedback. Please open an issue here or contact us.

Session Chairs

Katherine M. Kinnaird

Clare Boothe Luce Assistant Professor, Department of Computer Science and Program in Statistical & Data Sciences, Smith College

Peter Steinbach

Team Lead AI Consultants for Matter Research at Helmholtz-Zentrum Dresden-Rossendorf

Oliver Guhr

Research Fellow in the Department of Artificial Intelligence at HTW Dresden

Registration

More information about the registration process will be published soon.