- Learning from others
- What do experienced teachers (e.g. Carpentries trainers) do that could be used in ML education?
- How do you set the goal of a course?
- How do you test if you achieved your goal?
- How do you balance theoretical foundations and practical skills?
1.1 Learning from others
- short introduction to The Carpentries, https://carpentries.org
- PS: the live-coding approach might not be ideally suited for ML
- AS: maybe live derivations / hands-on math for people with the appropriate background
- PS: during bootcamps, we saw that coding imposes a mental overhead that overpowers the ML (people have to deal with the code and cannot focus on the ML behind it)
- AS: MOOCs by Andrew Ng
- Andrew Ng provides fill-in-the-blanks MATLAB scripts (a sketch of this style follows below)
- AS doubts that this helps learners ingest the material
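A hedged sketch of what such a fill-in-the-blanks exercise could look like, written in Python rather than MATLAB (the task and all names are invented for illustration): learners complete only the marked line, while the surrounding scaffold already runs.

```python
import numpy as np

def gradient_descent_step(w, X, y, lr=0.1):
    """One step of gradient descent for least-squares linear regression."""
    residual = X @ w - y                 # given to the learner
    # >>> YOUR CODE HERE: gradient of 0.5 * ||Xw - y||^2 w.r.t. w <<<
    grad = X.T @ residual                # sample solution for the blank
    return w - lr * grad

# self-check the learner can run after filling in the blank
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.zeros(2)
for _ in range(500):
    w = gradient_descent_step(w, X, y)
print(np.round(w, 2))                    # should approach [1. 2.]
```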
- DH: interpreted the Carpentries approach freely for ML
- would rather not focus on live-coding
- concentrate on the interplay of instruction and exercises to help learners ingest the material
- when doing live-coding with Jupyter notebooks, take the time to do it properly (done with astrophysics material)
- AS: maybe beneficial to focus on the application side rather than the maths behind it
- example: an idea adapted directly from Bishop (linear regression with Gaussian inputs, followed by uncertainty quantification) was rather tough on coding-heavy learners (see the sketch below)
- DH confirms this observation
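A minimal sketch of the Bishop-style exercise mentioned above (PRML §3.3): Bayesian linear regression with a zero-mean isotropic Gaussian prior, followed by uncertainty quantification via the posterior predictive. The synthetic 1-D data and the values of the prior precision alpha and noise precision beta are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 2.0, 25.0                   # prior precision, noise precision (assumed)
x = rng.uniform(-1, 1, size=20)
t = 0.5 + 2.0 * x + rng.normal(0.0, beta ** -0.5, size=x.shape)

Phi = np.column_stack([np.ones_like(x), x])                   # design matrix
S_N = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)   # posterior covariance
m_N = beta * S_N @ Phi.T @ t                                  # posterior mean

# posterior predictive mean and variance on a small grid
xs = np.linspace(-1, 1, 5)
Phis = np.column_stack([np.ones_like(xs), xs])
mean = Phis @ m_N
var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phis, S_N, Phis)
for xi, mi, sd in zip(xs, mean, np.sqrt(var)):
    print(f"x={xi:+.2f}  predictive mean={mi:.2f}  std={sd:.2f}")
```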
- PS: know your learners (and what they need)! Take their domain into account!
- PS: saw this with Rebecca Fiebrink when designing a curriculum for application designers: first let learners build an intuition, then focus on the math (nicely done by Javier in one of the papers)
- DH: warning, many people use ML only as a black box without considering the implications (think AI ethics)
- if you teach something as a black box, people will use it as a black box
- AS: for the many coders trying to pick up ML, starting with something simple (like linear regression/classification) has the benefit of not overloading learners
- with a simple ansatz, one can concentrate on validating the results
- such a linear classifier can help uncover (hidden) biases and increase awareness of the consequences of automated decisions
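A hedged sketch of this point: a logistic regression whose coefficients can be read off directly. The synthetic "historical decisions" data, in which a sensitive attribute secretly leaks into the label, is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
skill = rng.normal(0, 1, n)               # legitimate feature
group = rng.integers(0, 2, n)             # sensitive attribute (0/1)
# biased historical labels: group membership leaks into the outcome
y = (skill + 1.5 * group + rng.normal(0, 0.5, n) > 0.75).astype(int)

X = np.column_stack([skill, group])
clf = LogisticRegression().fit(X, y)
print(dict(zip(["skill", "group"], clf.coef_[0].round(2))))
# a large weight on "group" makes the hidden bias visible and opens
# the discussion on consequences of automated decisions
```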
- DH: many students just want to learn the magic "Deep Learning"
- DL is nothing more than mapping data a to b with a very complicated mapping
- AS: think of GPT-3; this system **appears** to know the next word
- DH: (this belongs to item 1.4, balancing theory and practice)
- PS: concept for a 1 week course:
- day 1 without computers (GUIs at best; have learners discuss)
- days 2-3: a simple black box (training, validation)
- days 4-5: Deep Learning (with math!)
- DH: a good measure from the Carpentries: use a pre-/post-survey
- pre-workshop survey to ask your learners BEFORE the workshop what they know and what they can do
- assign those who identify as advanced users the role of deputy instructor
- post-workshop survey
- DH: helpful to have something creative/simple as the first thing to teach
- DH: to start a summer school on data science, give learners pen and paper and have them draw a prototype for a data visualization (helps minimize fear of code)
- AS: interesting to change media; too much code can create negative associations with the ML content (emphasize this creative part more strongly in math-focused courses)
- for a one-week ML course, invest the first day on AI ethics only
- PS: super important to keep the creative part in mind
- creative handling of data is extremely vital in practice: describe your data! can you sketch your data's shape and types? what story does your data tell?
- AS: kaggleification (help!)
- kaggleification = teaching students to disentangle their problem from its "real-world" setting into a clean formulation (input, output, goal/optimization); see the sketch after this exchange
- DH: this approach depends on the crowd; academic learners come with heterogeneous knowledge
- DH: common observation -> learners stay with their domain's technical terms and have a hard time generalizing to terms valid across ML
- DH: ML is often considered magic that helps find interesting stuff -> this stands in direct contrast to the (empirical) scientific method (hypothesis first, empirical check, synthesis, knowledge)
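A minimal sketch of that kaggleification step, assuming scikit-learn and using the iris data purely as a stand-in for a learner's own domain problem: whatever the domain story is, reduce it to inputs X, targets y, and one agreed-upon metric to optimize.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)         # input: features, output: class labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# goal/optimization: one agreed-upon metric on held-out data
print(accuracy_score(y_te, model.predict(X_te)))
```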
- AS: would be nice to have a fixed set of use cases or example data sets
- DH: many example datasets are too clean to teach students anything
- PS: nice idea by Javier: sample the learners for height, shoe size, and age, then play with this dataset in the course (see the sketch below)
- PS: common problem with example datasets -> a single dataset doesn't fit all crowds (especially with mixed experience)
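A sketch of the in-class dataset idea, assuming pandas; the numbers are invented placeholders for values actually collected from the learners.

```python
import pandas as pd

# collect these values live from the learners in the room
df = pd.DataFrame({
    "height_cm": [162, 175, 181, 158, 170],
    "shoe_size_eu": [38, 42, 44, 37, 41],
    "age": [24, 31, 28, 45, 36],
})
print(df.describe())                      # describe your data!
print(df.corr().round(2))                 # which columns tell a story together?
```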
- DH: an in-class approach: have the students order themselves
- DH: give each student a token of one of 2 types and ask them to find out what the classification criterion behind the 2 label/token types is
- AS: related to this, a taxonomy of data (tables, images, time series, ...) is important to convey
- AS: a goal for learners: know how to deal with different types of data
- 1.1 Other material that is available online
- PS: 1.2 how to set goals?
- DH: the Carpentries style is beneficial (each sub-lesson has designated learning objectives)
- PS: essential for mixing-and-matching (leave lessons out, reorder the flow, adapt to learner speed, ...)
- PS: depends on where people want to go with their ML knowledge
- PS: use the pre-workshop survey to ask learners: why do you want to learn ML? what do you want to use ML for?
- PS: post-workshop survey 6 months after the course: what have you done with the knowledge?
- AS: agree with PS to use pre-workshop survey
- DH: find minimal viable teaching concepts
- for example: ML is not magic; every learner should be able to contemplate why the classifier returns this result
- be humble: no course will achieve a complete understanding in every learner
- AS: interesting to see both sides (what do learners hear versus what do I want to teach)
- learner goals and teacher goals
- DH: the Carpentries approach (sticky notes, half-day pro/con rounds) provides frequent feedback on learner progress
- PS: adapt teacher goals on-the-fly
- PS: 1.3 how to check if goals have been met?
- DH/PS: Carpentries methods have been very helpful
- PS: half-day feedback on the way out of the room
- (red sticky note with items people were confused with or want trainers to improve)
- (green sticky notes with items that people liked)
- discuss this feedback and act on it
- AS: in the context of academic exams
- create awareness during class ("this could be an exam question")
- exam situation is tricky (pressure)
- DH: creative ways to do exams (in the US, examiners have a lot of freedom to design exams)
- saw many ways online (e.g. finish a novel with a different ending and discuss its relevance to the course)
- AS: exam as find-the-error with a synthetic example
- DH: discuss use cases rather than play Q&A game
- AS: used a capstone project (a rudimentary ML implementation with an accuracy of 60%)
- goal: improve given code
- students had problems with this
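A hedged sketch of what such a capstone starting point could look like (the dataset and the weak baseline are assumptions, not the original course material): a linear model on data that is not linearly separable, scoring near chance, which students are then asked to improve.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# deliberately mismatched model/data: a linear boundary on circles
X, y = make_circles(n_samples=500, noise=0.15, factor=0.5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

baseline = LogisticRegression().fit(X_tr, y_tr)
print("baseline accuracy:", baseline.score(X_te, y_te))  # near chance level
# student task: beat this baseline, e.g. with feature engineering
# (x^2, y^2) or a non-linear model, and justify the improvement
```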
- DH: learning by failure
- assign all learners the same task
- have them work in groups
- give each group only a subset of the entire training set; all groups have the same test set
- each group's subset has a different problem (different bias, variance, ...)
- motivate groups to talk to each other
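A minimal sketch of this group exercise, with invented flaws per subset: every group evaluates on the same test set, so the differing scores expose each subset's problem.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
# one shared test set for all groups
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

idx0, idx1 = np.where(y_tr == 0)[0], np.where(y_tr == 1)[0]
subsets = {
    "A: tiny sample (high variance)": np.arange(40),
    "B: 10:1 class imbalance (bias)": np.concatenate([idx0[:500], idx1[:50]]),
    "C: plenty of balanced data": np.arange(1000),
}
for name, idx in subsets.items():
    clf = LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx])
    print(f"{name}: test accuracy {clf.score(X_te, y_te):.2f}")
# groups compare scores on the SAME test set and diagnose their flaw
```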
- PS: surveys are a good tool, but I have no strong confidence that my surveys are "good"
- DH: lucky to be in an interdisciplinary project where an expert on surveys was present and helped
- PS: many experts from other domains have little to no time to help
- DH: maybe start by asking in the Carpentries community
Survey Design Resources:
Summary
1.1 learning from others
- use the Carpentries teaching pillars (know your learners, stay close to the application, ...)
- adapt live-coding to math-heavy topics
- be courageous and teach math/code without computers (candy, crayons, ...)
1.2 how to set goals?
- Carpentries-style surveys (pre-workshop to get to know learners; [long-term] post-workshop to see where learners are going)
- learner goals and teacher goals --> teachers should have a clear idea of what they want learners to take out of the class
- adaptive material is helpful (mix-and-match lessons to learners)
1.3 how to check if goals have been met?
- sticky notes for rapid visual feedback
- feedback made transparent to all learners
- surveys essential, but can be tricky to design
1.4 balancing theory and practice
- from domain problems to a form amenable to ML ("kaggleification")
- spend some time discussing topics unrelated to math or programming ("ethics", "applications", "suitability"), ideally from the start
- good use cases for teaching, covering the different cornerstones of data types ("image-like", "tabular", "time series", "text", ...) to show breadth