Jaehoon Lee
Google DeepMind, Brain Team
San Francisco Bay Area
E-Mail: jaehlee at google dot com
Curriculum Vitae: CV
I am a Staff Research Scientist at Google DeepMind and was part of the Google Brain Team, working on the scientific understanding of deep neural networks. I was in the second-year cohort of the AI Residency program.
Before joining Google in 2017, my main research focus was theoretical high-energy physics. I was a postdoctoral researcher in the String Theory Group of the Department of Physics & Astronomy at the University of British Columbia (UBC). Before that, I completed my PhD at the Center for Theoretical Physics (CTP) at MIT, working on theoretical physics.
News
- [NEW!] Jan 2024: Our paper Small-scale proxies for large-scale Transformer training instabilities was accepted at ICLR 2024 as an oral (1.2% of submitted papers)! See you in Vienna, Austria!
- [NEW!] Dec 2023: Our new paper Beyond human data: Scaling self-training for problem-solving with language models is on arXiv!
- [NEW!] Nov 2023: Our new paper Frontier Language Models are not Robust to Adversarial Arithmetic, or “What do I need to say so you agree 2 + 2 = 5?” is on arXiv!
- Sep 2023: Our new paper Small-scale proxies for large-scale Transformer training instabilities is on arXiv!
- Sep 2023: Our new paper Replacing softmax with relu in vision transformers is on arXiv!
- May 2023: Our paper Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models is now published in TMLR!
- May 2023: Now I’m part of Google DeepMind! Excited to join forces with our DeepMind colleagues.
Selected Publications
For the full publication list, see: [Google Scholar] [Semantic Scholar] [arXiv]
- Small-scale proxies for large-scale Transformer training instabilities
  Mitchell Wortsman et al., Jaehoon Lee*, Justin Gilmer*, Simon Kornblith*
  [arXiv: 2309.14322]
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
  BIG-bench collaboration, member of Core Contributors
  [Transactions on Machine Learning Research (TMLR), 2023] [https://github.com/google/BIG-bench] [arXiv: 2206.04615]
- Dataset Distillation with Infinitely Wide Convolutional Networks
  Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee
  Neural Information Processing Systems (NeurIPS), 2021
  [arXiv: 2107.13034] [code / dataset] [Google AI Blog]
- Explaining Neural Scaling Laws
  Yasaman Bahri*, Ethan Dyer*, Jared Kaplan*, Jaehoon Lee*, Utkarsh Sharma*
  [arXiv: 2102.06701]
- Towards NNGP-guided Neural Architecture Search
  Daniel S. Park*, Jaehoon Lee*, Daiyi Peng, Yuan Cao, Jascha Sohl-Dickstein
  [arXiv: 2011.06006] [code] [US Patent App. 17/377,142]
- Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit
  Ben Adlam*, Jaehoon Lee*, Lechao Xiao*, Jeffrey Pennington, and Jasper Snoek
  International Conference on Learning Representations (ICLR), 2021 [code / dataset]
  ICML 2020 Workshop on Uncertainty & Robustness in Deep Learning [arXiv: 2010.07355]
- Finite Versus Infinite Neural Networks: an Empirical Study
  Jaehoon Lee, Samuel S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein
  Neural Information Processing Systems (NeurIPS), 2020 [spotlight]
  [arXiv: 2007.15801]
- Neural Tangents: Fast and Easy Infinite Neural Networks in Python
  Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz
  International Conference on Learning Representations (ICLR), 2020 [spotlight]
  [arXiv: 1912.02803] [code]
- Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
  Jaehoon Lee*, Lechao Xiao*, Samuel S. Schoenholz, Yasaman Bahri, Jascha Sohl-Dickstein, Jeffrey Pennington
  Neural Information Processing Systems (NeurIPS), 2019
  Special Issue, Journal of Statistical Mechanics: Theory and Experiment, 2020
  [arXiv: 1902.06720] [code1] [code2] [Wikipedia(Neural tangent kernel)]
- Measuring the Effects of Data Parallelism on Neural Network Training
  Christopher J. Shallue*, Jaehoon Lee*, Joseph Antognini, Jascha Sohl-Dickstein, Roy Frostig, George E. Dahl
  Journal of Machine Learning Research, 2019
  [arXiv: 1811.03600]
- Deep Neural Networks as Gaussian Processes
  Jaehoon Lee*, Yasaman Bahri*, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein
  International Conference on Learning Representations (ICLR), 2018
  [arXiv: 1711.00165] [code] [Wikipedia(Neural network Gaussian process)]
Research
- Recent research interests include:
  - Theoretical aspects of deep neural networks
  - Scientific and principled study of deep neural networks and their learning algorithms
  - Principled study of large-scale neural networks (e.g., neural scaling laws, the infinite-width limit)
  - Theoretical physics with a focus on high-energy physics
  - Interplay between physics and machine learning
- Services:
  - Action Editor for TMLR
  - Area Chair for NeurIPS, ICLR, ICML
  - Reviewer for ICLR / ICML / NeurIPS / JMLR / Neural Computation / Pattern Recognition Letters / Nature Communications / TPAMI / AISTATS
  - Organizer for the Aspen Winter Conference on Physics for Machine Learning
  - Organizer for the ICML Workshop on Theoretical Physics for Deep Learning
  - Organizer for the Vancouver deep learning study group