Doctor of Philosophy with a major in Machine Learning
The Doctor of Philosophy with a major in Machine Learning program has the following principal objectives, each of which supports an aspect of the Institute’s mission:
- Create students who are able to advance the state of knowledge and practice in machine learning through innovative research contributions.
- Create students who are able to integrate and apply principles from computing, statistics, optimization, engineering, mathematics and science to innovate, create machine learning models, and apply them to solve important real-world data-intensive problems.
- Create students who are able to participate in multidisciplinary teams that include individuals whose primary background is in statistics, optimization, engineering, mathematics and science.
- Provide a high-quality education that prepares individuals for careers in industry, government (e.g., national laboratories), and academia in terms of knowledge, computational (e.g., software development) skills, and mathematical modeling skills.
- Foster multidisciplinary collaboration among researchers and educators in areas such as computer science, statistics, optimization, engineering, social science, and computational biology.
- Foster economic development in the state of Georgia.
- Advance Georgia Tech’s position of academic leadership by attracting high-quality students who would not otherwise apply to Tech for graduate study.
All PhD programs must incorporate a standard set of Requirements for the Doctoral Degree.
The central goal of the PhD program is to train students to perform original, independent research. The most important part of the curriculum is the successful defense of a PhD Dissertation, which demonstrates this research ability. The academic requirements are designed in service of this goal.
The curriculum for the PhD in Machine Learning is truly multidisciplinary, containing courses taught in nine schools across three colleges at Georgia Tech: the Schools of Computational Science and Engineering, Computer Science, and Interactive Computing in the College of Computing; the Schools of Aerospace Engineering, Chemical and Biomolecular Engineering, Industrial and Systems Engineering, Electrical and Computer Engineering, and Biomedical Engineering in the College of Engineering; and the School of Mathematics in the College of Sciences.
Summary of General Requirements for a PhD in Machine Learning
- Core curriculum (4 courses, 12 hours). Machine Learning PhD students will be required to complete courses in four different areas: Mathematical Foundations, Probabilistic and Statistical Methods in Machine Learning, ML Theory and Methods, and Optimization.
- Area electives (5 courses, 15 hours).
- Responsible Conduct of Research (RCR) (1 course, 1 hour, pass/fail). Georgia Tech requires that all PhD students complete an RCR requirement that consists of an online component and in-person training. The online component is completed during the student’s first semester enrolled at Georgia Tech. The in-person training is satisfied by taking PHIL 6000 or the in-house RCR course of the student’s academic program.
- Qualifying examination (1 course, 3 hours). This consists of a one-semester independent literature review followed by an oral examination.
- Doctoral minor (2 courses, 6 hours).
- Research Proposal. The purpose of the proposal is to give the faculty an opportunity to give feedback on the student’s research direction, and to make sure the student is developing into an able communicator.
- PhD Dissertation.
Almost all of the courses in both the core and elective categories are already taught regularly at Georgia Tech. However, two core courses (designated in the next section) are being developed specifically for this program. The proposed outlines for these courses can be found in the Appendix. Students who complete these required courses as part of a master’s program will not need to repeat the courses if they are admitted to the ML PhD program.
Core Courses
Machine Learning PhD students will be required to complete courses in four different areas. With the exception of the Foundations course, each of these area requirements can be satisfied using existing courses from the College of Computing or Schools of ECE, ISyE, and Mathematics.
Machine Learning core:
Mathematical Foundations of Machine Learning. This required course is the gateway into the program, and covers the key subjects from applied mathematics needed for a rigorous graduate program in ML. Particular emphasis will be put on advanced concepts in linear algebra and probabilistic modeling. This course is cross-listed between CS, CSE, ECE, and ISyE.
- ECE 7750/ISYE 7750/CS 7750/CSE 7750, Mathematical Foundations of Machine Learning
Probabilistic and Statistical Methods in Machine Learning
- ISYE 6412, Theoretical Statistics
- ECE 7751/ISYE 7751/CS 7751/CSE 7751, Probabilistic Graphical Models
- MATH 7251, High Dimensional Probability
- MATH 7252, High Dimensional Statistics
Machine Learning: Theory and Methods. This course serves as an introduction to the foundational problems, algorithms, and modeling techniques in machine learning. Each of the courses listed below treats roughly the same material using a mix of applied mathematics and computer science, and each has a different balance between the two.
- CS 7545, Machine Learning Theory and Methods
- CS 7616, Pattern Recognition
- CSE 6740/ISYE 6740, Computational Data Analysis
- ECE 6254, Statistical Machine Learning
- ECE 6273, Methods of Pattern Recognition with Application to Voice
Optimization. Optimization plays a crucial role in both developing new machine learning algorithms and analyzing their performance. The courses below all provide a rigorous introduction to this topic; each emphasizes different material and provides a unique balance of mathematics and algorithms.
- ECE 8823, Convex Optimization: Theory, Algorithms, and Applications
- ISYE 6661, Linear Optimization
- ISYE 6663, Nonlinear Optimization
- ISYE 7683, Advanced Nonlinear Programming
Electives
After core requirements are satisfied, all courses listed in the core not already taken can be used as (appropriately classified) electives.
In addition to meeting the core area requirements, each student is required to complete five elective courses. These courses give students breadth in ML and must be chosen from at least two of the five subject areas listed below. In addition, students can use up to six special problems research hours to satisfy this requirement.
i. Statistics and Applied Probability: To build breadth and depth in the areas of statistics and probability as applied to ML.
- AE 6505, Kalman Filtering
- AE 8803, Gaussian Processes
- BMED 6700, Biostatistics
- ECE 6558, Stochastic Systems
- ECE 6601, Random Processes
- ECE 6605, Information Theory
- ISYE 6402, Time Series Analysis
- ISYE 6404, Nonparametric Data Analysis
- ISYE 6413, Design and Analysis of Experiments
- ISYE 6414, Regression Analysis
- ISYE 6416, Computational Statistics
- ISYE 6420, Bayesian Statistics
- ISYE 6761, Stochastic Processes I
- ISYE 6762, Stochastic Processes II
- ISYE 7400, Advanced Design of Experiments
- ISYE 7401, Advanced Statistical Modeling
- ISYE 7405, Multivariate Data Analysis
- ISYE 8803, Statistical and Probabilistic Methods for Data Science
- ISYE 8813, Special Topics in Data Science
- MATH 6221, Probability Theory for Scientists and Engineers
- MATH 6266, Statistical Linear Modeling
- MATH 6267, Multivariate Statistical Analysis
- MATH 7244, Stochastic Processes and Stochastic Calculus I
- MATH 7245, Stochastic Processes and Stochastic Calculus II
ii. Advanced Theory: To build a deeper understanding of foundations of ML.
- AE 8803, Optimal Transport Theory and Applications
- CS 7280, Network Science
- CS 7510, Graph Algorithms
- CS 7520, Approximation Algorithms
- CS 7530, Randomized Algorithms
- CS 7535, Markov Chain Monte Carlo Algorithms
- CS 7540, Spectral Algorithms
- CS 8803, Continuous Algorithms
- ECE 6283, Harmonic Analysis and Signal Processing
- ECE 6555, Linear Estimation
- ISYE 7682, Convexity
- MATH 6112, Advanced Linear Algebra
- MATH 6241, Probability I
- MATH 6262, Advanced Statistical Inference
- MATH 6263, Testing Statistical Hypotheses
- MATH 6580, Introduction to Hilbert Space
- MATH 7338, Functional Analysis
- MATH 7586, Tensor Analysis
- MATH 88XX, Special Topics: High Dimensional Probability and Statistics
iii. Applications: To develop breadth and depth in a variety of application domains impacted by ML.
- AE 6373, Advanced Design Methods
- AE 8803, Machine Learning for Control Systems
- AE 8803, Nonlinear Stochastic Optimal Control
- BMED 6780, Medical Image Processing
- BMED 6790/ECE 6790, Information Processing Models in Neural Systems
- BMED 7610, Quantitative Neuroscience
- BMED 8813BHI, Biomedical and Health Informatics
- BMED 8813MHI, mHealth Informatics
- BMED 8813MLB, Machine Learning in Biomedicine
- BMED 8823ALG, OMICS Data and Bioinformatics Algorithms
- CHBE 6745, Data Analytics for Chemical Engineers
- CHBE 6746, Data-Driven Process Engineering
- CS 6440, Introduction to Health Informatics
- CS 6465, Computational Journalism
- CS 6471, Computational Social Science
- CS 6474, Social Computing
- CS 6475, Computational Photography
- CS 6476, Computer Vision
- CS 6601, Artificial Intelligence
- CS 7450, Information Visualization
- CS 7476, Advanced Computer Vision
- CS 7630, Autonomous Robots
- CS 7632, Game AI
- CS 7636, Computational Perception
- CS 7643, Deep Learning
- CS 7646, Machine Learning for Trading
- CS 7647, Machine Learning with Limited Supervision
- CS 7650, Natural Language Processing
- CSE 6141, Massive Graph Analysis
- CSE 6240, Web Search and Text Mining
- CSE 6242, Data and Visual Analytics
- CSE 6301, Algorithms in Bioinformatics and Computational Biology
- ECE 4580, Computational Computer Vision
- ECE 6255, Digital Processing of Speech Signals
- ECE 6258, Digital Image Processing
- ECE 6260, Data Compression and Modeling
- ECE 6273, Methods of Pattern Recognition with Application to Voice
- ECE 6550, Linear Systems and Controls
- ECE 8813, Network Security
- ISYE 6421, Biostatistics
- ISYE 6810, Systems Monitoring and Prognosis
- ISYE 7201, Production Systems
- ISYE 7204, Info Prod & Ser Sys
- ISYE 7203, Logistics Systems
- ISYE 8813, Supply Chain Inventory Theory
- HS 6000, Healthcare Delivery
- MATH 6759, Stochastic Processes in Finance
- MATH 6783, Financial Data Analysis
iv. Computing and Optimization: To provide more breadth and foundation in areas of math, optimization and computation for ML.
- AE 6513, Mathematical Planning and Decision-Making for Autonomy
- AE 8803, Optimization-Based Learning Control and Games
- CS 6515, Introduction to Graduate Algorithms
- CS 6550, Design and Analysis of Algorithms
- CSE 6140, Computational Science and Engineering Algorithms
- CSE 6643, Numerical Linear Algebra
- CSE 6644, Iterative Methods for Systems of Equations
- CSE 6710, Numerical Methods I
- CSE 6711, Numerical Methods II
- ECE 6553, Optimal Control and Optimization
- ISYE 6644, Simulation
- ISYE 6645, Monte Carlo Methods
- ISYE 6662, Discrete Optimization
- ISYE 6664, Stochastic Optimization
- ISYE 6679, Computational Methods for Optimization
- ISYE 7686, Advanced Combinatorial Optimization
- ISYE 7687, Advanced Integer Programming
v. Platforms: To provide breadth and depth in computing platforms that support ML and Computation.
- CS 6421, Temporal, Spatial, and Active Databases
- CS 6430, Parallel and Distributed Databases
- CS 6290, High-Performance Computer Architecture
- CSE 6220, High Performance Computing
- CSE 6230, High Performance Parallel Computing
Qualifying Examination
The purpose of the Qualifying Examination is to judge the candidate’s potential as an independent researcher.
The PhD qualifying exam consists of a focused literature review that will take place over the course of one semester. At the beginning of the second semester of the student’s second year, a qualifying committee consisting of three members of the ML faculty will assign, in consultation with the student and the student’s advisor, a course of study consisting of influential papers, books, or other intellectual artifacts relevant to the student’s research interests. The student’s focus area and current research efforts (and related portfolio) will be considered in defining the course of study.
At the end of the semester, the student will submit a written summary of each artifact that highlights their understanding of the importance (and weaknesses) of the work in question and the relationship of this work to their current research. Subsequently, the student will have a closed oral exam with the three members of the committee. The exam will be interactive, with the student and the committee discussing and criticizing each work and posing questions related to the student’s current research to determine the breadth of the student’s knowledge in that specific area.
The success of the examination will be determined by the committee’s qualitative assessment of the student’s understanding of the theory, methods, and ultimate impact of the assigned syllabus.
The student will be given a passing grade for meeting the requirements of the committee in both the written and the oral part. Unsatisfactory performance on either part will require the student to redo the entire qualifying exam in the following semester year. Each student will be allowed only two attempts at the exam.
Students are expected to perform the review by the end of their second year in the program.
Doctoral Dissertation
The primary requirement of the PhD student is to do original and substantial research. This research is reported for review in the PhD dissertation, and presented at the final defense.
As the first step towards completing a dissertation, the student must prepare and defend a Research Proposal. The proposal is a document of no more than 20 pages in length that carefully describes the topic of the dissertation, including references to prior work, and any preliminary results to date. The written proposal is submitted to a committee of three faculty members from the ML PhD program, and is presented in a public seminar shortly thereafter. The committee members provide feedback on the proposed research directions, comment on the strength of the student’s writing and oral presentation skills, and might suggest further courses to solidify the student’s background. Approval of the Research Proposal by the committee is required at least six months prior to the scheduling of the PhD defense. It is expected that the student complete this proposal requirement no later than their fourth year in the program.
The PhD thesis committee consists of five faculty members: the student’s advisor, three additional members from the ML PhD program, and one faculty member external to the ML program. The committee is charged with approving the written dissertation and administering the final defense. The defense consists of a public seminar followed by an oral examination by the thesis committee.
Doctoral minor (2 courses, 6 hours):
The minor follows the standard Georgia Tech requirement: 6 hours, preferably outside the student’s home unit, with a GPA in those graduate-level courses of at least 3.0. The courses for the minor should form a cohesive program of study outside the area of Machine Learning. No ML core or elective courses may be used to fulfill this requirement, and the minor courses must be approved by the student’s thesis advisor and the ML Academic Advisor. Typical programs will consist of two courses from the same school (any school at the Institute) or two courses from the same area of study.