My main reason for applying to the MSc in Computing (AI and ML) at a prestigious institution is the opportunity to study with Dr. [First Placeholder Name] and Dr. [Second Placeholder Name], who have done outstanding work in the field I want to dive into: Bayesian deep learning.
My interest in Bayesian deep learning grew out of my past research experience. Ever since I stepped into the world of AI research, I have been a fan of deep learning and have believed that these powerful, complex networks are, and will remain, its most important tools. The success of deep learning relies largely on its scalability: in this era of big data, with inexpensive high-speed computing available, 175-billion-parameter models can be trained. The downside of this power, however, is that such models are too complex to understand, so we have to treat them as black boxes. This opacity makes it hard for humans to trust them and deploy them in the real world. Therefore, while trying to improve deep neural networks (DNNs) for monocular depth estimation (MDE), the task of predicting scene depth from a single image, I decided to draw on methodologies from explainable AI. I first discovered that in well-trained DNNs for MDE, some hidden units are selective to certain ranges of depth. Later, I was influenced by Professor [Third Placeholder Name], who argued that researchers should focus on making ML models inherently interpretable rather than explainable, because post-hoc explanations are always inaccurate and often misleading. Inspired by this idea, I proposed a training method that makes every hidden unit selective, boosting the DNNs’ interpretability.
While I am excited that my first-authored paper was published at a top conference, I keep reflecting on the broader impact of my research. What kind of interpretability or explainability do we need from AI for real-world deployment? After all, humans are not interpretable either: we can explain our decisions, but for ‘System 1’ decisions (intuitions) those explanations are not really accurate. So opacity itself is not the problem; the problem is that these models are not safe enough to be trusted.
Dr. [First Placeholder Name]’s tutorial on approximate inference at NeurIPS 2020 prompted me to think about the importance of uncertainty for trustworthy intelligence. Unlike deep learning models, humans can tell when they are not sure. A responsible doctor will order further examinations if she finds it hard to tell whether a tumour is benign or malignant, and this makes her more reliable than a DNN. The ability to know what one does not know can be quantified as a model’s uncertainty, and fortunately, Bayesian inference provides principled methods for representing uncertainty. In recent years, researchers have tackled the limitations of deep learning from a Bayesian perspective and made many advances in an emerging field called Bayesian deep learning (BDL), in which the strengths of deep learning, such as scalability, are combined with the abilities of Bayesian methods, such as uncertainty reasoning. BDL is attractive because this theoretically grounded framework elegantly captures not only aleatoric uncertainty (irreducible data uncertainty) but also epistemic uncertainty (reducible model uncertainty), which is more useful for safety-critical applications. Moreover, only in this framework can uncertainty be propagated forward through complex intelligent systems.
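To make this concrete, here is a minimal sketch, in my own notation and following the standard BDL formulation rather than any particular work cited above, of how a Bayesian neural network forms predictions and how its predictive uncertainty splits into the two kinds just mentioned:

% Posterior predictive: average the network's likelihood over the weight posterior
p(y \mid x, \mathcal{D}) = \int p(y \mid x, w)\, p(w \mid \mathcal{D})\, \mathrm{d}w

% A standard decomposition of predictive uncertainty (classification case)
\underbrace{\mathbb{H}\big[p(y \mid x, \mathcal{D})\big]}_{\text{total}}
  = \underbrace{\mathbb{E}_{p(w \mid \mathcal{D})}\big[\mathbb{H}[p(y \mid x, w)]\big]}_{\text{aleatoric}}
  + \underbrace{\mathbb{I}\big[y;\, w \mid x, \mathcal{D}\big]}_{\text{epistemic}}

The mutual-information term vanishes when all plausible weight settings agree, which is exactly the sense in which epistemic uncertainty is reducible with more data.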
In particular, in my future postgraduate and doctoral studies I plan to work on the adversarial robustness of DNNs from a BDL perspective. Studies have shown that DNNs are vulnerable to adversarial attacks, and this vulnerability could be dangerous for real-world AI systems. I also find it an intriguing direction because research has shown that the adversarial training framework proposed by [Fourth Placeholder Name] et al. for adversarial robustness can disentangle the robust and non-robust features of DNNs. But do BNNs also learn robust and non-robust features, and can they be marginalised? Are non-robust features also transferable across different BNNs? Do adversarially robust BNNs also transfer better? Can adversarial robustness be incorporated into the Bayesian inference framework as a prior? And, perhaps most importantly, does learning a robust posterior help in learning the true posterior?
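For context, the adversarial training framework referred to above is usually written as a min-max problem (a standard formulation, stated here in my own notation):

\min_{\theta}\; \mathbb{E}_{(x, y) \sim \mathcal{D}} \Big[ \max_{\|\delta\|_{p} \le \epsilon} \mathcal{L}\big(f_{\theta}(x + \delta),\, y\big) \Big]

where the inner maximisation finds a worst-case perturbation \delta within an \epsilon-ball and the outer minimisation trains the weights \theta against it.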
In the NeurIPS 2020 tutorial, Dr. [First Placeholder Name] and her collaborator introduced approximate inference: methods that approximate difficult-to-compute probability densities, especially the integrals arising in Bayesian inference. It is one of the core topics in BDL because, for complex models like DNNs, these integrals cannot be computed exactly. I am especially attracted to her work on the adversarial robustness of deep generative classifiers, motivated by the view that DNNs’ vulnerability stems from their discriminative nature and by the poor performance of non-deep Bayesian generative classifiers on image classification. They proposed a method that improves on the classical naive Bayes classifier and achieves better robustness: it not only detects adversarial examples more successfully but also performs well under attacks that the detectors fail to catch.

Dr. [Second Placeholder Name] is another researcher I dream of working with. He defended his PhD thesis on Gaussian processes (GPs) three years ago. In more recent research, he and his colleagues have connected GPs and DNNs to achieve more flexible and accurate deep GPs. He has also contributed to invariance learning in neural networks: within a Bayesian framework, data augmentation can be replaced by optimising the marginal likelihood to learn invariances (which can be seen as a more general form of robustness than adversarial robustness). I am curious whether this scheme can be adapted to the adversarial training framework to solve the inner maximisation problem more efficiently.
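Both threads above hinge on the same intractable quantity, the log marginal likelihood, which variational approximate inference bounds from below (again a textbook formulation in my own notation, not a description of either researcher's specific method):

\log p(\mathcal{D}) \;\ge\; \mathbb{E}_{q_{\phi}(w)}\big[\log p(\mathcal{D} \mid w)\big] - \mathrm{KL}\big(q_{\phi}(w)\,\|\,p(w)\big)

where a tractable distribution q_{\phi}(w) is optimised to approximate the intractable posterior p(w \mid \mathcal{D}), turning integration into optimisation; maximising a bound of this form with respect to invariance parameters is one route to learning invariances from data.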
In the taught programme, the only compulsory module, Mathematics for Machine Learning, will be taught by Dr. [Second Placeholder Name]. It will arm me with a solid mathematical background and skills, and with his instruction I believe I will be able to connect these fundamentals with cutting-edge research. He will also teach Probabilistic Inference in the spring term, covering topics in Bayesian machine learning such as Gaussian processes and approximate inference. In addition, Dr. [First Placeholder Name] is the instructor of Deep Learning. Thanks to her rich research experience in BDL, the course places particular emphasis on the mathematical principles behind deep learning models, especially those optimised with Bayesian approaches such as VAEs. More importantly, I will have the opportunity to conduct a research project under their guidance, in which I will explore the adversarial robustness of deep learning from a Bayesian view. Moreover, courses including Natural Language Processing, Robot Learning and Control, and Logic-Based Learning will give me knowledge of AI methods and application areas beyond computer vision and help me fully appreciate the potential of AI. In sum, I believe the programme is the perfect next step for me, and I am ready for it.