Statistics
24 Hillhouse, 432.0666
www.stat.yale.edu/
M.A., Ph.D.
Chair
Joseph Chang
Director of Graduate Studies
John Emerson [F] (24 Hillhouse, john.emerson@yale.edu)
Joseph Chang [Sp] (24 Hillhouse)
Professors
Donald Andrews (Economics), Andrew Barron, Joseph Chang, John Hartigan (Emeritus), Theodore Holford (Epidemiology & Public Health; Biostatistics), Peter Phillips (Economics), David Pollard, Edward Tufte (Political Science; Computer Science), Heping Zhang (Epidemiology & Public Health; Biostatistics)
Associate Professors
Hannes Leeb, Edmund Yeh (Electrical Engineering)
Assistant Professors
Lisha Chen, John Emerson, Mokshay Madiman, Sekhar Tatikonda (Electrical Engineering), Harrison Zhou
Lecturer
Jonathan Reuning-Scherer
Fields of Study
Fields comprise the main areas of statistical theory (with emphasis on foundations, Bayes theory, decision theory, nonparametric statistics), probability theory (stochastic processes, asymptotics, weak convergence), information theory, econometrics, classification, statistical computing, and graphical methods.
Special Admissions Requirements
GRE scores for the General Test and for the Subject Test in the area closest to the undergraduate major should accompany an application; the math subject test is strongly recommended. All applicants should have a strong mathematical background, including advanced calculus, linear algebra, elementary probability theory, and at least one course providing an introduction to mathematical statistics. An undergraduate major may be in statistics, mathematics, computer science, or a subject in which significant statistical problems may arise. Applicants whose native language is not English must submit scores from the Test of English as a Foreign Language (TOEFL).
Special Requirements for the Ph.D. Degree
There is no foreign language requirement. Normally during the first two years, fourteen term courses in this and other departments are taken to prepare students for research and practice of statistics. These include courses devoted to case studies and practical work, for which students prepare a written report and give an oral presentation. The qualifying examination consists of three parts: a written report on an analysis of a data set, a written examination on theoretical statistics, and an oral examination. The examination is taken not later than when scheduled by the department in the middle of the second year, with provision for one subsequent reexamination of one or more parts in the event that a student does not pass the first time. All parts of the qualifying examination must be completed before the beginning of the third year. A prospectus for the dissertation should be submitted no later than the first week of March in the third year. The prospectus must be accepted by the department before the end of the third year if the student is to register for a fourth year. Upon successful completion of the qualifying examination and the prospectus (and meeting of Graduate School requirements), the student is admitted to candidacy. Students are expected to attend weekly departmental seminars.
Master’s Degree
M.A. (en route to the Ph.D.). This degree may be awarded upon completion of eight term courses and two terms of residence.
Master’s Degree Program. Students are also admitted directly to a terminal master’s degree program. To qualify for the M.A., the student must successfully complete an approved program of eight term courses, chosen in consultation with the director of graduate studies. Full-time students must take a minimum of three courses per term. Part-time students are also accepted into the master’s degree program. See Terminal M.A./M.S. Degrees.
Program information is available on the Web at www.stat.yale.edu.
Courses
STAT 500b, Introductory Statistics. John Emerson.
MWF 10.30–11.20
An introduction to statistical reasoning. Topics include numerical and graphical summaries of data, data acquisition and experimental design, probability, hypothesis testing, confidence intervals, correlation and regression. Application of statistical concepts to data; analysis of real-world problems.
STAT 501–506, Introduction to Statistics.
A basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks are attended by all students in STAT 501–506 together as general concepts and methods of statistics are developed. The course separates for the last six and a half weeks, which develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence and only one may be taken for credit.
STAT 501au, Introduction to Statistics: Life Sciences. Jonathan Reuning-Scherer, Günter Wagner.
TTh 1–2.15
Statistical and probabilistic analysis of biological problems presented with a unified foundation in basic statistical theory. Problems are drawn from genetics, ecology, epidemiology, and bioinformatics. Also E&EB 510au.
STAT 502au, Introduction to Statistics: Political Science. Jonathan Reuning-Scherer, Sung-youn Kim.
TTh 1–2.15
Statistical analysis of politics, elections, and political psychology. Problems presented with reference to a wide array of examples: public opinion, campaign finance, racially motivated crime, and public policy.
STAT 503au, Introduction to Statistics: Social Sciences. Jonathan Reuning-Scherer.
TTh 1–2.15
Descriptive and inferential statistics applied to analysis of data from the social sciences. Introduction of concepts and skills for understanding and conducting quantitative research.
STAT 505au, Introduction to Statistics: Medicine. Jonathan Reuning-Scherer, David Salsburg.
TTh 1–2.15
Statistical methods relied upon in medicine and medical research. Practice in reading medical literature competently and critically, as well as practical experience performing statistical analysis of medical data.
[STAT 506au, Introduction to Statistics: Data Analysis.]
STAT 530bu, Introductory Data Analysis. Hannes Leeb.
MW 2.30–3.45
Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. R and Web data sources are used. After STAT 501a.
STAT 538au, Probability and Statistics. Joseph Chang.
MWF 2.30–3.20
Fundamental principles and techniques of probabilistic thinking, statistical modeling, and data analysis. Essentials of probability: conditional probability, random variables, distributions, law of large numbers, central limit theorem, Markov chains. Statistical inference with emphasis on the Bayesian approach: parameter estimation, likelihood, prior and posterior distributions, Bayesian inference using Markov chain Monte Carlo. Introduction to regression and linear models. Computers are used throughout for calculations, simulations, and analysis of data. After MATH 118a or b or 120a or b. Some acquaintance with matrix algebra and computing assumed.
STAT 541au, Probability Theory. Harrison Zhou.
MWF 9.25–10.15
A first course in probability theory: probability spaces, random variables, expectations and probabilities, conditional probability, independence, some discrete and continuous distributions, central limit theorem, Markov chains, probabilistic modeling. After or concurrent with MATH 120a or b or the equivalent.
STAT 542bu, Theory of Statistics. Mokshay Madiman.
MWF 9.25–10.15
Principles of statistical analysis: maximum likelihood, sampling distributions, estimation; confidence intervals; tests of significance; regression; analysis of variance; and the method of least squares. Some statistical computing. After STAT 541a and concurrently with or after MATH 222a or b or 225a or b or the equivalent.
STAT 551bu, Stochastic Processes. Joseph Chang.
MW 1–2.15
Introduction to the study of random processes, including Markov chains, Markov random fields, martingales, random walks, Brownian motion, and diffusions. Techniques in probability such as coupling and large deviations. Applications to image reconstruction, Bayesian statistics, finance, probabilistic analysis of algorithms, genetics, and evolution. After STAT 541a or the equivalent.
STAT 600bu, Advanced Probability. David Pollard.
TTh 2.30–3.45
Measure theoretic probability, conditioning, laws of large numbers, convergence in distribution, characteristic functions, central limit theorems, martingales. Some knowledge of real analysis is assumed.
[STAT 602b, Probability Coupling.]
[STAT 603a, Stochastic Calculus.]
[STAT 606b, Markov Processes and Random Fields.]
[STAT 607b, Inequalities for Probability and Statistics.]
STAT 610a, Statistical Inference. Hannes Leeb.
TTh 10.30–11.45
A systematic development of the mathematical theory of statistical inference covering methods of estimation, hypothesis testing, and confidence intervals. An introduction to statistical decision theory. Undergraduate probability at the level of STAT 541a assumed.
STAT 612au, Linear Models. David Pollard.
TTh 9–10.15
The geometry of least squares; distribution theory for normal errors; regression, analysis of variance, and designed experiments; numerical algorithms (with particular reference to S-plus); alternatives to least squares. Generalized linear models. Linear algebra and some acquaintance with statistics assumed.
STAT 617b, Random Matrices in Statistics. Hannes Leeb.
HTBA
Contemporary data often feature a large number of explanatory variables and a comparatively small sample size. In such settings, traditional large-sample approximations can be inappropriate, because the asymptotics have not taken hold. The class covers a variety of results on large-dimensional random matrices that can be used to address some of these problems. These include convergence of the largest or smallest eigenvalue, the distribution of individual eigenvalues, the distribution of the ensemble of all eigenvalues, as well as some applications to statistical problems. After STAT 541a and STAT 612a (or similar).
[STAT 619b, Statistical Decision Theory in Modern Statistical Methodology.]
STAT 625a, Case Studies. David Pollard.
Statistical analysis of a variety of problems including the value of a baseball player, the fairness of real estate taxes, how to win the Tour de France, energy consumption in Yale buildings, and interactive questionnaires for course evaluations. We emphasize methods of choosing data, acquiring data, and assessing data quality. Computations use R.
STAT 626b, Practical Work. John Emerson.
Individual one-term projects, with students working on studies outside the department, under the guidance of a statistician.
STAT 627a or b, Statistical Consulting. John Emerson, Lisha Chen.
Statistical consulting and collaborative research projects often require statisticians to explore new topics outside their area of expertise. This course exposes students to real problems, requiring them to draw on their expertise in probability, statistics, and data analysis. Students complete the course with individual projects supervised jointly by faculty outside the department and by one of the instructors. (1⁄2 credit per semester.)
[STAT 636b, Monte Carlo Methods.]
STAT 637a, Deterministic and Stochastic Optimization. Mokshay Madiman.
HTBA
Study of the theory and algorithms used to solve optimization problems in both deterministic and stochastic settings, with an emphasis on the latter. Topics include duality theory and descent methods in deterministic optimization; stochastic approximation, motivated by the need to optimize in the presence of noisy measurements; simulated annealing, motivated by the global optimization problem; and the theory of optimal transportation, an important example of infinite-dimensional optimization problems. Familiarity with stochastic processes (e.g., STAT 551b) is assumed. Knowledge of ordinary differential equations and real analysis is recommended.
STAT 645b, Statistical Methods in Genetics and Bioinformatics. Joseph Chang.
TTh 10.30–11.45
Stochastic modeling and statistical methods applied to problems such as mapping quantitative trait loci, analyzing gene expression data, sequence alignment, and reconstructing evolutionary trees. Statistical methods include maximum likelihood, Bayesian inference, Markov chain Monte Carlo, and some methods of classification and clustering. Models introduced include variance components, hidden Markov models, Bayesian networks, and coalescent. Recommended background: STAT 541a, STAT 542b. Prior knowledge of biology is not required. Also BIS 692b, CB&B 645b.
[STAT 654a, Topics in Bayesian Inference and Data Analysis.]
STAT 660b, Multivariate Statistical Methods for the Social Sciences. Jonathan Reuning-Scherer.
TTh 1–2.15
An introduction to the analysis of multivariate data. Topics include principal components analysis, factor analysis, cluster analysis (hierarchical clustering, k-means), discriminant analysis, multidimensional scaling, and structural equations modeling. Emphasis is placed on practical application of multivariate techniques to a variety of examples in the social sciences. Students complete extensive computer work using either SAS or SPSS. Prerequisites: knowledge of basic inferential procedures, experience with linear models (regression and ANOVA). Experience with some statistical package and/or familiarity with matrix notation is helpful but not required. Requirements: regular assignments and a final project.
STAT 661au, Data Analysis. Lisha Chen.
MW 2.30–3.45
Students analyze data sets using the S-plus statistical computing language to study a selection of statistical topics: linear and nonlinear models, maximum likelihood, resampling methods, curve estimation, model selection, classification, and clustering. Weekly sessions are held in the Social Sciences Statistical Laboratory. After STAT 542a and MATH 222a or b or 225a or b or the equivalents.
STAT 664bu, Information Theory. Andrew Barron.
TTh 9–10.15
Foundations of information theory in communications, statistical inference, statistical mechanics, probability, and algorithm complexity. Quantities of information and their properties: entropy, conditional entropy, divergence, mutual information, channel capacity. Basic theorems of data compression and coding for noisy channels. Applications in statistics, communication networks, and finance. After STAT 541a. Also ENAS 954bu.
STAT 665bu, Data Mining and Machine Learning. Lisha Chen.
MW 11.30–12.45
Techniques for data mining and machine learning from both statistical and computational perspectives, including support vector machines, bagging, boosting, neural networks, and other nonlinear and nonparametric regression methods. Discussion includes the basic ideas and intuition behind these methods, a more formal understanding of how and why they work, and opportunities to experiment with machine learning algorithms and to apply them to data. After STAT 542b.
[STAT 667a, Probabilistic Networks, Algorithms, and Applications.]
[STAT 668a, Information and Probability.]
[STAT 669a, Information and Statistics.]
STAT 673a, Functional Data Analysis. Harrison Zhou.
HTBA
Data in the form of observed functions (curves and surfaces) arise in applications including growth analysis, meteorology, economics, and medicine. This course presents ideas and techniques for the statistical analysis of such data. Included are smoothing methods (wavelets, Fourier series, and splines), curve registration, principal components analysis, linear modeling, and canonical correlation analysis. We cover one topic each week, with one lecture for introducing real data and the other lecture for methodology and theory. Additional topics in asymptotic analysis as time permits. Knowledge of statistical theory at the level of STAT 542b is assumed.
[STAT 674au, Analysis of Spatial and Time Series Data.]
[STAT 680b, Nonparametric Statistics.]
STAT 690a or b, Independent Study.
By arrangement with faculty. Approval of director of graduate studies required.
STAT 695a, Internship in Statistical Research. John Emerson.
The internship gives students practical exposure to problems in the analysis of statistical data as part of a research group in industries such as medical and pharmaceutical research, finance, information technologies, telecommunications, and public policy. The internship experience often serves as a basis for the Ph.D. dissertation. Students work with the director of graduate studies (DGS) and other faculty advisers to select suitable placements. Students submit a one-page description of their internship plans to the DGS by May 1; the DGS and other faculty advisers evaluate it by May 15. Upon completion of the internship, students submit a written report of their work to the DGS no later than October 1. The internship is graded Satisfactory/Unsatisfactory on the basis of the student’s written report and an oral presentation. This course is an elective requirement for the Ph.D. degree. Prerequisite: completion of one semester of the Ph.D. program.