Thanks for coming!
Thanks to everyone for coming! I think ML is a really fascinating and important field that I'd love to see a lot more mathematicians getting into, and I hope I managed to make a good case for that! Feel totally free to message me if you have any questions about that or about the talk.
I'd really appreciate hearing any feedback about what went well with the talk, and what I could do better in future talks!
Feedback form or just message me
Note for anyone reading this: I skipped chapters 1e, 2c and 3 in the full talk (alas), but there’s some awesome content in there! I recommend reading them if they sound interesting:
1e: What’s really going on with adversarial examples
2c: How the network sees the data set
3: An overview of reinforcement learning
Message me if you have any questions about them!
Graphics
Note: From here on, the graphics are specific points in long articles, so for each one I give a short phrase you can CTRL+F to find it
I recommend layer 4c
Visualising neuron interaction
Search for by jointly optimizing two
Building Blocks of Interpretability
Search for Making sense of these
Search for Semantic dictionaries give us a
Search for network detected features at that position
Which parts of the image matter?
Search for concisely summarizing the story of the neural network
Search for something we can optimize the factorization for
Search for how well does this work
What data matters for a class?
Search for which we ignored in this discussion
What differentiates nearby classes? (i.e. the baseball shark)
Search for Further Isolating Classes
Best further reading:
Learning more about ML
Getting started
fast.ai online course - really good for getting started with making something, and the basic ideas
Interpretability
distill.pub - A small journal with incredibly well-written ML papers
Feature Visualization, The Building Blocks of Interpretability, Activation Atlas, Circuits
These have papers, graphics & code
Reinforcement Learning
Sutton & Barto textbook – mathematical introduction, not overly deep learning focused
OpenAI Gym – deep learning focused, coding focused
References + Further Reading:
Explanation of momentum: https://distill.pub/2017/momentum/
Image Kernels: https://setosa.io/ev/image-kernels/
A Very Unlikely Chess Game (Slate Star Codex): https://web.archive.org/web/20200310063743/https://slatestarcodex.com/2020/01/06/a-very-unlikely-chess-game/
Maths + Navy Seal Copypasta: https://www.gwern.net/GPT-3#navy-seal-copypasta-parodies
AI Dungeon - play an RPG with GPT-2 https://aidungeon.io/
GPT-3 paper https://arxiv.org/abs/2005.14165
Adversarial Examples are Features not Bugs https://arxiv.org/pdf/1905.02175.pdf
Commentary https://distill.pub/2019/advex-bugs-discussion
Feature visualisation https://distill.pub/2017/feature-visualization/
Visualisation of every neuron https://distill.pub/2017/feature-visualization/appendix/
Building Blocks of Interpretability https://distill.pub/2018/building-blocks/
Activation Atlas https://distill.pub/2019/activation-atlas/
Circuits https://distill.pub/2020/circuits/
Meaning of neurons in early layers https://distill.pub/2020/circuits/early-vision/
Curve detectors https://distill.pub/2020/circuits/curve-detectors/
Introduction to Reinforcement Learning (fairly mathsy!) https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html
Examples of bad metrics https://danielmiessler.com/blog/how-to-create-bad-metrics-incentivize-wrong-behaviors/
Specification gaming https://deepmind.com/blog/article/Specification-gaming-the-flip-side-of-AI-ingenuity
List of specification gaming examples http://tinyurl.com/specification-gaming
Deep Reinforcement Learning from Human Preferences https://openai.com/blog/deep-reinforcement-learning-from-human-preferences/