Statistics
24 Hillhouse, 432.0666
M.A., Ph.D.
Chair
Andrew Barron
Director of Graduate Studies
John Hartigan (Rm 207, 24 Hillhouse, john.hartigan@yale.edu)
Professors
Donald Andrews (Economics), Andrew Barron, Joseph Chang, John Hartigan, Theodore Holford (Epidemiology & Public Health; Biostatistics), Peter Phillips (Economics), David Pollard, Edward Tufte (Political Science; Computer Science)
Associate Professor
Heping Zhang (Epidemiology & Public Health; Biostatistics)
Assistant Professors
John Emerson, Hannes Leeb, Harrison Zhou
Lecturer
Jonathan Reuning-Scherer
Fields of Study
Fields comprise the main areas of statistical theory (with emphasis on foundations, Bayes theory, decision theory, nonparametric statistics), probability theory (stochastic processes, asymptotics, weak convergence), information theory, econometrics, classification, statistical computing, and graphical methods.
Special Admissions Requirements
GRE scores for the General Test and for the Subject Test in the area of the undergraduate major should accompany an application. All applicants should have a strong mathematical background, including advanced calculus, linear algebra, elementary probability theory, and at least one course providing an introduction to mathematical statistics. An undergraduate major may be in statistics, mathematics, computer science, or a subject in which significant statistical problems may arise. For those whose native language is not English, Test of English as a Foreign Language (TOEFL) scores are required.
Special Requirements for the Ph.D. Degree
There is no foreign language requirement. Normally during the first two years, fourteen term courses in this and other departments are taken to prepare students for research and practice of statistics. These include courses devoted to case studies and practical work, for which students prepare a written report and give an oral presentation. The qualifying examination consists of three parts: a written report on an analysis of a data set, a written examination on theoretical statistics, and an oral examination. The examination is taken no later than the date scheduled by the department in the middle of the second year, with provision for one reexamination of one or more parts if a student does not pass the first time. All parts of the qualifying examination must be completed before the beginning of the third year. A prospectus for the dissertation should be submitted no later than the first week of March in the third year. The prospectus must be accepted by the department before the end of the third year if the student is to register for a fourth year. Upon successful completion of the qualifying examination and the prospectus (and fulfillment of Graduate School requirements), the student is admitted to candidacy.
Master's Degree
M.A. (en route to the Ph.D.). This degree may be awarded upon completion of eight term courses and two terms of residence.
Master's Degree Program. Students are also admitted directly to a terminal master's degree program. To qualify for the M.A., the student must successfully complete eight term courses, chosen in consultation with the director of graduate studies. Full-time students must take a minimum of three courses per term. Part-time students are also accepted into the master's degree program. See Terminal M.A./M.S. Degrees.
Program materials are available upon request to the Director of Graduate Studies, Department of Statistics, Yale University, PO Box 208290, New Haven CT 06520-8290; e-mail, susan.jackson-mack@yale.edu.
Courses
STAT 501–506, Introduction to Statistics.
A basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks are attended by all students in STAT 501–506 together as general concepts and methods of statistics are developed. The course separates for the last six and a half weeks, which develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence and only one may be taken for credit.
STAT 501au, Introduction to Statistics: Life Sciences. Jonathan Reuning-Scherer, Günter Wagner.
TTh 1–2.15
Statistical and probabilistic analysis of biological problems presented with a unified foundation in basic statistical theory. Problems are drawn from genetics, ecology, epidemiology, and bioinformatics. Also E&EB 510au.
STAT 502au, Introduction to Statistics: Political Science. Jonathan Reuning-Scherer, Donald Green.
TTh 1–2.15
Statistical analysis of politics, elections, and political psychology. Problems presented with reference to a wide array of examples: public opinion, campaign finance, racially motivated crime, and public policy.
STAT 503au, Introduction to Statistics: Social Sciences. John Hartigan, Jonathan Reuning-Scherer.
TTh 1–2.15
Descriptive and inferential statistics applied to analysis of data from the social sciences. Introduction of concepts and skills for understanding and conducting quantitative research.
[STAT 504au, Introduction to Statistics in Psychology.]
STAT 505au, Introduction to Statistics: Medicine. Jonathan Reuning-Scherer and staff.
TTh 1–2.15
Statistical methods relied upon in medicine and medical research. Practice in reading medical literature competently and critically, as well as practical experience performing statistical analysis of medical data.
[STAT 506au, Introduction to Statistics: Data Analysis.]
STAT 530bu, Introductory Data Analysis. John Hartigan.
MW 2.30–3.45
Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. S-PLUS and Web data sources are used. After STAT 501a.
STAT 538au, Probability and Statistics for Scientists. Joseph Chang.
MWF 2.30–3.20
Fundamental principles and techniques of probabilistic thinking, statistical modeling, and data analysis. Essentials of probability: conditional probability, random variables, distributions, law of large numbers, central limit theorem, Markov chains. Statistical inference with emphasis on the Bayesian approach: parameter estimation, likelihood, prior and posterior distributions, Bayesian inference using Markov chain Monte Carlo. Introduction to regression and linear models. Computers are used throughout for calculations, simulations, and analysis of data. After MATH 118a or b or 120a or b. Some acquaintance with matrix algebra and computing assumed.
STAT 541au, Probability Theory. Hannes Leeb.
MWF 9.30–10.20
A first course in probability theory: probability spaces, random variables, expectations and probabilities, conditional probability, independence, some discrete and continuous distributions, central limit theorem, Markov chains, probabilistic modeling. After or concurrent with MATH 120a or b or the equivalent.
STAT 542bu, Theory of Statistics. Harrison Zhou, Andrew Barron.
MWF 9.30–10.20
Principles of statistical analysis: maximum likelihood, sampling distributions, estimation; confidence intervals; tests of significance; regression; analysis of variance; and the method of least squares. Some statistical computing. After STAT 541a and concurrently with or after MATH 222a or b or 225a or b or the equivalent.
STAT 551bu, Stochastic Processes. Joseph Chang.
MW 1–2.15
Introduction to the study of random processes, including Markov chains, Markov random fields, martingales, random walks, Brownian motion, and diffusions. Techniques in probability such as coupling and large deviations. Applications to image reconstruction, Bayesian statistics, finance, probabilistic analysis of algorithms, genetics, and evolution. After STAT 541a or the equivalent.
STAT 600bu, Advanced Probability. David Pollard.
TTh 2.30–3.45
Measure theoretic probability, conditioning, laws of large numbers, convergence in distribution, characteristic functions, central limit theorems, martingales. Some knowledge of real analysis is assumed.
STAT 603a, Stochastic Calculus. David Pollard.
HTBA
Martingales in discrete and continuous time, Brownian motion, sample path properties, predictable processes, stochastic integrals with respect to Brownian motion and semimartingales, stochastic differential equations. Applications mostly to counting processes and finance. Knowledge of measure-theoretic probability at the level of STAT 600b is a prerequisite for the course, although some key concepts, such as conditioning, are reviewed. After STAT 600b.
STAT 605a, Foundations of Statistics. John Hartigan.
This course investigates philosophical and historical issues in the foundations of statistics. The origins and evolution of probability. The Bayesian-frequentist dichotomy. Is decision theory necessary or useful? Is robustness possible? Are asymptotic results applicable? How are independence assumptions justified, and what to do if they are not? Puzzles and paradoxes. The likelihood and invariance principles. Fiducial inference. Practical probability.
STAT 607b, Inequalities for Probability and Statistics. David Pollard.
HTBA
A guided tour of some inequalities useful in statistical and probabilistic problems. The course is broken down into independent segments, each treating a specific method and an illustrative application. Acquaintance with probability at the 600 level is helpful for some segments. Possible topics: convexity arguments; tail bounds for martingales and independent summands; metric entropy and maximal inequalities; VC dimension and combinatorial methods; distances between probability measures; majorizing measures and generic chaining; isoperimetric inequalities; concentration inequalities; Gaussian processes. Applications to: statistical inference; asymptotic theory; minimax rates of convergence; machine learning; complexity.
STAT 610a, Statistical Inference. Harrison Zhou.
HTBA
A systematic development of the mathematical theory of statistical inference covering methods of estimation, hypothesis testing, and confidence intervals. An introduction to statistical decision theory. Undergraduate probability at the level of STAT 541a assumed.
STAT 612au, Linear Models. Hannes Leeb.
TTh 9–10.15
The geometry of least squares; distribution theory for normal errors; regression, analysis of variance, and designed experiments; numerical algorithms (with particular reference to S-PLUS); alternatives to least squares. Generalized linear models. Linear algebra and some acquaintance with statistics assumed.
STAT 625a, Case Studies. John Hartigan, John Emerson.
Statistical analysis of a variety of problems including the value of a baseball player, the fairness of real estate taxes, how to win the Tour de France, energy consumption in Yale buildings, and interactive questionnaires for course evaluations. We emphasize methods of choosing data, acquiring data, and assessing data quality. Computations use R.
STAT 626b, Practical Work. John Emerson, John Hartigan.
Individual one-term projects, with students working on studies outside the department, under the guidance of a statistician.
STAT 627b, Statistical Consulting. John Emerson, John Hartigan.
Statistical consulting and collaborative research projects often require statisticians to explore new topics outside their area of expertise. This course exposes students to real problems, requiring them to draw on their expertise in probability, statistics, and data analysis. Students complete the course with individual projects supervised jointly by faculty outside the department and by one of the instructors.
STAT 645b, Statistical Methods in Genetics and Bioinformatics. Joseph Chang.
HTBA
Stochastic modeling and statistical methods applied to problems such as mapping quantitative trait loci, analyzing gene expression data, sequence alignment, and reconstructing evolutionary trees. Statistical methods include maximum likelihood, Bayesian inference, Monte Carlo Markov chains, and some methods of classification and clustering. Models introduced include variance components, hidden Markov models, Bayesian networks, and coalescent. Recommended background: STAT 541a, STAT 542b. Prior knowledge of biology is not required. Times to be arranged at organizational meeting.
STAT 660b, Multivariate Statistical Methods for the Social Sciences. Jonathan Reuning-Scherer.
HTBA
An introduction to the analysis of multivariate data. Topics include principal components analysis, factor analysis, cluster analysis (hierarchical clustering, k-means), discriminant analysis, multidimensional scaling, and structural equations modeling. Emphasis is placed on practical application of multivariate techniques to a variety of examples in the social sciences. Students complete extensive computer work using either SAS or SPSS. Prerequisites: knowledge of basic inferential procedures, experience with linear models (regression and ANOVA). Experience with some statistical package and/or familiarity with matrix notation is helpful but not required. Requirements: regular assignments and a final project.
STAT 661au, Data Analysis. John Emerson.
MW 2.30–3.45
Through the analysis of data sets using the S-PLUS statistical computing language, a selection of statistical topics is studied: linear and nonlinear models, maximum likelihood, resampling methods, curve estimation, model selection, classification, and clustering. Weekly sessions are held in the Social Sciences Statistical Laboratory. After STAT 542a and MATH 222a or b or 225a or b or the equivalents.
STAT 664bu, Information Theory. Edmund Yeh.
TTh 9–10.15
Foundations of information theory in communications, statistical inference, statistical mechanics, probability, and algorithm complexity. Quantities of information and their properties: entropy, conditional entropy, divergence, mutual information, channel capacity. Basic theorems of data compression and coding for noisy channels. Applications in statistics, communication networks, and finance. After STAT 541a.
STAT 665bu, Data Mining and Machine Learning. Hannes Leeb.
MW 11.30–12.45
Techniques for data mining and machine learning from both statistical and computational perspectives, including support vector machines, bagging, boosting, neural networks, and other nonlinear and nonparametric regression methods. Discussion includes the basic ideas and intuition behind these methods, a more formal understanding of how and why they work, and opportunities to experiment with machine learning algorithms and to apply them to data. After STAT 542b.
[STAT 674au, Analysis of Spatial and Time Series Data.]
STAT 695a, Internship in Statistical Research. John Hartigan.
The internship is designed to give students an opportunity to gain practical exposure to problems in the analysis of statistical data, as part of a research group within industries such as medical and pharmaceutical research, finance, information technologies, telecommunications, and public policy. The internship experience often serves as a basis for the Ph.D. dissertation. Students work with the director of graduate studies (DGS) and other faculty advisers to select suitable placements. Students submit a one-page description of their internship plans to the DGS by May 1; the plans are evaluated by the DGS and other faculty advisers by May 15. Upon completion of the internship, students submit a written report of their work to the DGS no later than October 1. The internship is graded Satisfactory/Unsatisfactory on the basis of the student's written report and an oral presentation. This course is an elective requirement for the Ph.D. degree. Prerequisite: completion of one semester of the Ph.D. program.
STAT 700, Departmental Seminar.
Important activity for all members of the department. See weekly seminar announcements.