I am a fifth-year PhD student advised by Prof. Sanjay Shakkottai in the Wireless Networking and Communications Group at UT Austin. Before joining UT Austin, I received my B.S. in Computer Science from Texas A&M University.
I work on statistical learning theory.
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
Conference on Neural Information Processing Systems (NeurIPS) 2024 (🏆 Spotlight)
On the role of the softmax activation in pretraining for in-context learning.
InfoNCE Loss Provably Learns Cluster-Preserving Representations
Conference on Learning Theory (COLT) 2023
On how InfoNCE can learn meaningful representations by inheriting inductive biases.
A Theoretical Justification for Image Inpainting using Denoising Diffusion Probabilistic Models
On whether diffusion-based inpainting can recover samples.
PAC Generalization via Invariant Representations
International Conference on Machine Learning (ICML) 2023
On the number of random interventions needed for PAC generalization of invariant representations.
Improved Algorithms for Misspecified Linear Markov Decision Processes
International Conference on Artificial Intelligence and Statistics (AISTATS) 2022
On algorithms for linear bandits and MDPs where the reward function isn't exactly linear.
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
International Conference on Machine Learning (ICML) 2022
L1 Regression with Lewis Weights Subsampling
Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (RANDOM) 2021
On the label complexity of finding approximately correct solutions to least absolute deviations (LAD) regression.
Stochastic Linear Bandits with Protected Subspace
On an algorithm for linear bandits when there is an unknown subspace that provides no reward.
Locating conical degeneracies in the spectra of parametric self-adjoint matrices
SIAM Journal on Matrix Analysis and Applications, Volume 42
On the convergence of the Newton-Raphson algorithm for locating matrices with two identical eigenvalues within a parameterized family of self-adjoint matrices.
I participated in various contests in high school and college. Here is a quick plug for how I did:
I was a counselor at the SMaRT Camp at the Dept. of Math at TAMU from 2017 through 2019.
At a very different time, I got a 523 on the MCAT. I haven't used that number anywhere else, so I thought I would put it here at least.