This workshop is planned as an online event. For both events we will use Zoom as our online conference service. We will publish the conference link on this website before the event.
Time/CEST | Title | Speaker |
---|---|---|
2:00 pm | AI Campus - the learning platform for artificial intelligence (video) | Cornelia Gamst |
3:00 pm | Embedding Ethics in Data Structures Classes (video) | Sorelle Friedler |
4:00 pm | Rethinking the dreaded textbook: taking inspiration from the inclusive, ethical, and active machine learning classroom (video) | Alicia Johnson |
5:00 pm | Wrap up | |
The videos of each talk will be made available on YouTube. The notes including the Q&A are available here.
Everybody (well, almost) wants to learn Machine Learning or Data Science these days. The big online learning platforms already offer a wide variety of options, differing in level, length, depth and also learning formats. But somehow in Germany there still seems to be a missing piece. “AI Campus” wants to fill that gap with specialized courses and learning nuggets for a broad target group: from in-depth courses for domain experts to learning nuggets offering a basic understanding of what’s behind AI and machine learning for the general public.
In this workshop, we will give a brief demo of the learning formats available on AI Campus, discuss our general curriculum framework, and take a closer look at how we do things in specific courses that are of interest to the workshop audience.
About Cornelia Gamst
Cornelia Gamst is the curriculum manager at the KI-Campus and, together with the DFKI, is responsible for the content of the overall programme offered at the KI-Campus.
Fairness in machine learning, or more broadly AI Ethics, has become a hot topic in research over the past 5 years and these topics are increasingly being incorporated into the machine learning and AI curriculum. In this talk, I’ll argue that we can best prepare our students to participate in these conversations and build better machine learning systems by introducing them to ethical ideas early in the curriculum. Data Structures, generally taught in the second semester of a college computer science curriculum, is an early and required class where students learn to think of themselves as problem solvers. Integrating data-driven real world projects and associated ethical concerns into data structures teaches students the background they’ll need to build better ML. I’ll discuss projects integrating racial equity concerns and environmental impact into the data structures curriculum.
About Sorelle Friedler
Sorelle Friedler is an Associate Professor of Computer Science at Haverford College and an Affiliate at the Data & Society Research Institute. She is a Co-Founder of the ACM Conference on Fairness, Accountability, and Transparency. Sorelle’s research focuses on the fairness and interpretability of machine learning algorithms, with applications from criminal justice to materials discovery, and she has incorporated these ideas into introductory computer science, data science, and algorithms courses.
As machine learning educators, we’re trained to translate and weave developing “best practices” into our classrooms. For example, we might aim to create inclusive courses which center data ethics while promoting active learning. Especially since these same ends aren’t always embedded in our classroom resources (e.g., textbooks), pulling this off can require some patchwork and wizardry. In this talk, we’ll discuss Bayes Rules! An Introduction to Bayesian Modeling with R (Johnson, Ott, & Dogucu), our attempt at: (1) supporting educators in their implementation of inclusive, ethical, and active learning practices; and (2) reflecting that these goals are critical to the entire machine learning workflow, not just the classroom; all while (3) being ourselves.
About Alicia Johnson
Professor Johnson’s primary research interest is in Markov chain Monte Carlo (MCMC) methods. The focus of her research has been on the convergence rates of chains corresponding to MCMC algorithms on general state spaces. In addition, she enjoys the unlimited applications of statistics. As a consultant for a division of the Centers for Disease Control and Prevention and the University of Minnesota, she has collaborated on projects in entomology, forestry, primatology, public health, and others. In other words, her work in statistics allows her to keep learning a little about a lot!
Time/CEST | Topic |
---|---|
2:30 - 2:45 pm | Welcome |
2:45 - 3:45 pm | Community Connect: During the community connect, participants will have the opportunity to discuss the workshop’s papers with the authors. This is a chance to connect over emerging pedagogical practices and philosophies while also building a more robust community of machine learning educators. |
3:45 - 4:00 pm | Break |
4:00 - 5:00 pm | Workshop discussions |
5:00 - 5:30 pm | Wrap up |
The satellite event will be streamed on Zoom. To connect, use the following link. Telephone dial-ins are available here.
Machine Learning based approaches have become ubiquitous in many areas of society, industry and academia. Understanding what Machine Learning (ML) is providing and reproducing what it infers has become an essential prerequisite for adoption. In this line of thought, course materials, introductory media and lecture series of a broad variety, depth, and quality are publicly available. To date, and to the best of our knowledge, there is no structured approach to collect and discuss best practices in teaching Machine Learning. This workshop strives to change this.
With our workshop, we want to start an academic discussion on best practices. We would like to help improve existing material as a community and make conceiving new material more effective. We are very happy that this idea was approved for the ECML PKDD 2021 workshop programme. Like ECML PKDD 2021, Teaching ML 2021 will be a virtual event.
Many experts and practitioners who develop Machine Learning models or infrastructure around these models are confronted with the opportunity to teach Machine Learning at some point in their career. Traditionally, many rely on their gut feeling to design courses that are motivated by these circumstances. The methods of choice are often PowerPoint or similar technologies.
This workshop targets those who would like to know how teachers from around the globe approach teaching Machine Learning: How deep do they dive into the matter? What mental models do they use to visualize concepts? What media do others use when teaching ML?
With this workshop, we hope that all participants get a better sense of where they stand with their teaching and how they can improve or collaborate with others.
The main goal of this workshop is to motivate and nourish best practices at any stage of the teaching process. For this, we would like to cover a structured approach to teaching motivated by The Carpentries or a variation thereof. We believe that the core concepts contained in this are helpful for any teaching practitioner.
The central activity of the workshop will be twofold:
a call for papers in which teaching professionals or beginners are asked to describe their method of choice when teaching a given ML topic. We would like to attract mini-articles of at most 4 pages (excluding references and acknowledgements) that present or discuss a teaching activity related to machine learning. For more details, see below.
accepted papers will be shared in a community connection session, where presenters and participants can discuss the papers. (This is like a poster session, but without posters.)
All papers will be published in the Proceedings of Machine Learning Research. The papers must be written in English and formatted according to the ICML 2021 LaTeX template.
The maximum length of papers is 4 pages (excluding references and acknowledgements) in this format. The program chairs reserve the right to reject any over-length papers without review. Papers that ‘cheat’ the page limit, for example by using smaller than specified margins or font sizes, will also be treated as over-length. Note that negative vspaces, for example, are also not allowed.
Additional materials (e.g. proofs, audio, images, video, data, or source code) can be provided as URLs inside the paper of your submission. The reviewers and the program committee reserve the right to judge the paper solely based on the 4 pages; looking at any additional material is at the discretion of the reviewers and is not required.
We strive to pursue a double-blind review process. All papers need to be ‘best-effort’ anonymized. We strongly encourage authors to also make code and data available anonymously (e.g., in an anonymous git repository or Dropbox folder). It is allowed to have a (non-anonymous) pre-print online, but it should not be cited in the submitted paper to preserve anonymity. Reviewers will be asked not to search for it.
We invite interested authors to submit their article on openreview.net here.
We will conduct an open double-blinded peer-review using openreview.net on all contributions and select contributions based on the reviewers’ feedback. Here are the important dates:
Each submitted paper will be reviewed publicly by at least two experienced machine learning instructors.
Accepted papers are published in the Proceedings of Machine Learning Research (PMLR). They are open-access and can be retrieved from the PMLR website.
Patrick Glauner
As a result of the rapidly advancing digital transformation of teaching, universities have started to face major competition from Massive Open Online Courses (MOOCs). Universities thus have to set themselves apart from MOOCs in order to justify the added value of three to five-year degree programs to prospective students. In this paper, we show how we address this challenge at Deggendorf Institute of Technology in ML and AI. We first share our best practices and present two concrete courses including their unique selling propositions: Computer Vision and Innovation Management for AI. We then demonstrate how these courses contribute to Deggendorf Institute of Technology’s ability to differentiate itself from MOOCs (and other universities).
Hussain Kazmi
Advances in machine learning and data science hold the potential to greatly optimize the overall energy sector, and prevent the worst outcomes of anthropogenic climate change. However, despite the urgent need for trained energy data scientists and the presence of a number of technically challenging issues that need to be tackled, the sector continues to suffer from a personnel shortage and remains mired in outdated technology. In many programs, energy engineers continue to graduate without even rudimentary programming skills, let alone knowledge of data science. This paper highlights key findings from an introductory course on machine learning and optimization designed specifically for energy engineering students. The course employs a number of teaching aids, which we hope will be useful for the broader community as well. The course was developed in a pan-European setting, supported by four different European universities as part of a broader roadmap to overhaul energy education.
Jónathan Heras
Due to the impact of Deep Learning both in industry and academia, there is a growing demand for graduates with skills in this field, and universities are starting to offer courses that include Deep Learning subjects. Hands-on assignments that teach students how to tackle Deep Learning tasks are an instrumental part of those courses. However, most Deep Learning assignments have two main drawbacks. First, they either use toy datasets that are useful to teach concepts but whose solutions do not generalise to real problems, or employ datasets that require specialised knowledge to fully understand the problem. Secondly, most Deep Learning assignments are focused on training a model, and do not take into account other stages of the Deep Learning pipeline, such as data cleaning or model deployment. In this work, we present an experience in an Artificial Intelligence course where we have tackled the aforementioned drawbacks by using datasets from the regional council where our University is located. Namely, the students of the course have developed several computer vision and natural language processing projects; for instance, a news classifier or an application to colourise historical images. We share the workflow followed to organise this experience, several lessons that we have learned, and challenges that can be faced by other instructors who try to conduct a similar initiative.
Daniel van Strien, Mark Bell, Nora Rose McGregor, Michael Trizna
There is a growing interest in utilising Machine Learning (ML) techniques within Galleries, Libraries, Archives and Museums (GLAM), and a corresponding demand for training to enable practitioners to engage confidently in this area. Staff at these institutions are seeking practical knowledge and skills in ML concepts and methods specific to the sector’s work, such as in the curation and collection of heritage collections. In this paper, we discuss the motivations and methods behind “An Introduction to AI for GLAM”, a new Carpentries workshop under development through an international partnership between the British Library, the Smithsonian Institution, and The National Archives UK. This new workshop aims to introduce GLAM practitioners to the essential conceptual and practical considerations for supporting, participating in and undertaking machine learning-based research and projects within the GLAM sector.
Erik Marx, Thiemo Leonhardt, David Baberowski, Nadine Bergner
The idea of chess-playing matchboxes, conceived by Martin Gardner as early as 1962, is becoming more and more relevant in learning materials in the area of AI and Machine Learning. Thus, it can be found in a large number of workshops and papers as an innovative teaching method to convey the basic ideas of reinforcement learning. In this paper the concept and its variations will be presented and the advantages of this analog approach will be shown. At the same time, however, the limitations of the approach are analyzed and the question of alternatives is raised.
Alfredo Canziani
Machine and deep learning techniques are actively being developed with over 150 papers submitted daily to arXiv, each of which is introducing its own notation. To offer a course that reflects the latest developments of the field and illustrate them in a cohesive and consistent manner, one needs to systematically consume the literature, summarise and standardise it, implement working examples, and deliver a concise and consistent presentation of a given topic. This paper reports all the best practices developed by the author in their last decade of teaching experience.
Viviana Acquaviva
This paper summarizes some challenges encountered and best practices established in several years of teaching Machine Learning for the Physical Sciences at the undergraduate and graduate level. I discuss motivations for teaching ML to Physicists, desirable properties of pedagogical materials such as accessibility, relevance, and likeness to real-world research problems, and give examples of components of teaching units.
Hilde Jacoba Petronella Weerts, Mykola Pechenizkiy
With the increasing application of machine learning models in practice, there is a growing need to incorporate ethical considerations in engineering curricula. In this paper, we reflect upon the development of a course on responsible machine learning for undergraduate engineering students. We found that technical material was relatively easy to grasp when it was directly linked to prior knowledge on machine learning. However, it was non-trivial for engineering students to make a deeper connection between real-world outcomes and ethical considerations such as fairness. Moving forward, we call upon educators to focus on the development of realistic case studies that invite students to interrogate the role of an engineer.
Sebastian Raschka
Machine learning has seen a vast increase of interest in recent years, along with an abundance of learning resources. While conventional lectures provide students with important information and knowledge, we also believe that additional project-based learning components can motivate students to engage in topics more deeply. In addition to incorporating project-based learning in our courses, we aim to develop project-based learning components aligned with real-world tasks, including experimental design and execution, report writing, oral presentation, and peer-reviewing. This paper describes the organization of our project-based machine learning courses with a particular emphasis on the class project components and shares our resources with instructors who would like to include similar elements in their courses.
Carrie Diaz Eaton
Bates College is a small liberal arts postsecondary institution in the northeast United States. An information literacy course, Calling Bull, serves as an introductory data science class as well as a prerequisite-free quantitative literacy class. In this context, we spend a week discussing machine learning, with an emphasis on facial recognition algorithms. The emphasis is on the general algorithmic approach, critical inquiry of the process and careful interpretation of results presented in research or decision-making. This module relies on the use of open educational materials, discussion, and careful attention to issues of marginalization and algorithmic justice.
Matias Valdenegro-Toro
Uncertainty in machine learning is not generally taught as general knowledge in Machine Learning course curricula. In this paper we propose a short curriculum for a course about uncertainty in machine learning, and complement the course with a selection of use cases, aimed to trigger discussion and let students play with the concepts of uncertainty in a programming setting. Our use cases cover the concept of output uncertainty, Bayesian neural networks and weight distributions, sources of uncertainty, and out-of-distribution detection. We expect that this curriculum and set of use cases will motivate the community to adopt these important concepts into courses for safety in AI.
Omar Shouman, Simon Fuchs, Holger Wittges
Machine learning education has become more accessible and relevant to students from various backgrounds. Practical courses complement theoretical lectures by focusing on applied machine learning. In this work, we report on our experiences teaching two practical machine learning courses to master’s students from different study programs: an introductory and an advanced course. We present a summary of the teaching and evaluation methods used in both courses. We summarize our experiences and the feedback collected from the students through a survey. We conclude with our recommendations on teaching and designing practical machine learning courses.
Rabea Müller, Akinyemi Mandela Fasemore, Muhammad Elhossary, Konrad U Förstner
Machine Learning represents an invaluable set of tools for the analysis of data in molecular biology as well as bio-medicine. Here we present a training approach to teach fundamental machine learning skills to researchers in their early career stage (PhD and postdoc level) with the aim of empowering them to apply these methods in their own research projects. The content was developed to be delivered in a short and intense learning period as part of a remote systems biology workshop but can be adapted to other scenarios with a less restricted time frame.
Sarah M Brown
Programming for Data Science is a programming-intensive data science course. This paper discusses a revision of the course to center student learning. The revision effort centered the desired learning outcomes and resulted in a course that charted an explicit path toward achieving them for students. This paper summarizes the design overall and provides practical details about the instruction via participatory live coding and assessment with a competency-based grading scheme.
Rudy Chin, Dimitrios Stamoulis, Diana Marculescu
Computer hardware architecture has played an important role in the recent advances made in deep learning and associated applications. However, effective teaching strategies for hardware architectures for machine learning require a different structure and technical background than classic machine learning. More specifically, the material needs not only to convey the necessary machine learning concepts to students, but also to cover the hardware and software infrastructure concepts required for supporting machine learning systems. In this paper, we describe our approach to designing the course materials along with student assessment and evaluation for the “Hardware Architectures for Machine Learning” course targeting Electrical and Computer Engineering graduate students.
Martin Palazzo, Agustin Velazquez, Melisa Breda, Matias Callara, Nicolas Aguirre
Teaching machine learning has been a growing activity in almost any educational establishment. Despite the high availability of study materials, the Latin America region has seen a lack of educational programs focused on machine learning. Additionally, the majority of educational materials are available only in English. In this work we propose the ClusterAI pipeline, based on a curated list of topics in Spanish and a collaboration with the Buenos Aires city government, which opens public datasets that let students apply machine learning models to real data.
We are extremely grateful to the group of volunteers who make this event happen by reviewing the submitted papers:
We are happy to hear from you regarding your questions, concerns or feedback. Please reach us by opening an issue here or by contacting us.
Clare Boothe Luce Assistant Professor Department of Computer Science and Program in Statistical & Data Sciences Smith College
Team Lead AI Consultants for Matter Research at Helmholtz-Zentrum Dresden-Rossendorf
Research fellow at the HTW Dresden in the department of artificial intelligence.