Fr. 126.00
Christophe Giraud (Paris Sud University)
High-dimensional Statistics
English · Hardcover
Published 30.12.2014
Description
This book provides a straightforward, up-to-date introduction to high-dimensional statistics. It avoids any unnecessary technicalities, instead focusing on the main underlying concepts in simple settings. It gives readers a strong background in high-dimensional statistics by explaining key concepts for analyzing complex and high-dimensional data. Along with end-of-chapter exercises, the book includes many new proofs that are significantly simpler than those found in research papers.
Table of contents
Preface
Acknowledgments
Introduction
High-Dimensional Data
Curse of Dimensionality
Lost in the Immensity of High-Dimensional Spaces
Fluctuations Cumulate
An Accumulation of Rare Events May Not Be Rare
Computational Complexity
High-Dimensional Statistics
Circumventing the Curse of Dimensionality
A Paradigm Shift
Mathematics of High-Dimensional Statistics
About This Book
Statistics and Data Analysis
Purpose of This Book
Overview
Discussion and References
Take-Home Message
References
Exercises
Strange Geometry of High-Dimensional Spaces
Volume of a p-Dimensional Ball
Tails of a Standard Gaussian Distribution
Principal Component Analysis
Basics of Linear Regression
Concentration of the Square Norm of a Gaussian Random Variable
Model Selection
Statistical Setting
To Select among a Collection of Models
Models and Oracle
Model Selection Procedures
Risk Bound for Model Selection
Oracle Risk Bound
Optimality
Minimax Optimality
Frontier of Estimation in High Dimensions
Minimal Penalties
Computational Issues
Illustration
An Alternative Point of View on Model Selection
Discussion and References
Take-Home Message
References
Exercises
Orthogonal Design
Risk Bounds for the Different Sparsity Settings
Collections of Nested Models
Segmentation with Dynamic Programming
Goldenshluger–Lepski Method
Minimax Lower Bounds
Aggregation of Estimators
Introduction
Gibbs Mixing of Estimators
Oracle Risk Bound
Numerical Approximation by Metropolis–Hastings
Numerical Illustration
Discussion and References
Take-Home Message
References
Exercises
Gibbs Distribution
Orthonormal Setting with Power Law Prior
Group-Sparse Setting
Gain of Combining
Online Aggregation
Convex Criteria
Reminder on Convex Multivariate Functions
Subdifferentials
Two Useful Properties
Lasso Estimator
Geometric Insights
Analytic Insights
Oracle Risk Bound
Computing the Lasso Estimator
Removing the Bias of the Lasso Estimator
Convex Criteria for Various Sparsity Patterns
Group–Lasso (Group Sparsity)
Sparse–Group Lasso (Sparse–Group Sparsity)
Fused–Lasso (Variation Sparsity)
Discussion and References
Take-Home Message
References
Exercises
When Is the Lasso Solution Unique?
Support Recovery via the Witness Approach
Lower Bound on the Compatibility Constant
On the Group–Lasso
Dantzig Selector
Projection on the l1-Ball
Ridge and Elastic-Net
Estimator Selection
Estimator Selection
Cross-Validation Techniques
Complexity Selection Techniques
Coordinate-Sparse Regression
Group-Sparse Regression
Multiple Structures
Scaled-Invariant Criteria
References and Discussion
Take-Home Message
References
Exercises
Expected V-Fold CV l2-Risk
Proof of Corollary 5.5
Some Properties of Penalty (5.4)
Selecting the Number of Steps for the Forward Algorithm
Multivariate Regression
Statistical Setting
A Reminder on Singular Values
Low-Rank Estimation
If We Knew the Rank of A*
When the Rank of A* Is Unknown
Low Rank and Sparsity
Row-Sparse Matrices
Criterion for Row-Sparse and Low-Rank Matrices
Convex Criterion for Low Rank Matrices
Convex Criterion for Sparse and Low-Rank Matrices
Discussion and References
Take-Home Message
References
Exercises
Hard-Thresholding of the Singular Values
Exact Rank Recovery
Rank Selection with Unknown Variance
Graphical Models
Reminder on Conditional Independence
Graphical Models
Directed Acyclic Graphical Models
Nondirected Models
Gaussian Graphical Models (GGM)
Connection with the Precision Matrix and the Linear Regression
Estimating g by Multiple Testing
Sparse Estimation of the Precision Matrix
Estimation of g by Regression
Practical Issues
Discussion and References
Take-Home Message
References
Exercises
Factorization in Directed Models
Moralization of a Directed Graph
Convexity of –log(det(K))
Block Gradient Descent with the l1 / l2 Penalty
Gaussian Graphical Models with Hidden Variables
Dantzig Estimation of Sparse Gaussian Graphical Models
Gaussian Copula Graphical Models
Restricted Isometry Constant for Gaussian Matrices
Multiple Testing
An Introductory Example
Differential Expression of a Single Gene
Differential Expression of Multiple Genes
Statistical Setting
p-Values
Multiple Testing Setting
Bonferroni Correction
Controlling the False Discovery Rate
Heuristics
Step-Up Procedures
FDR Control under the WPRDS Property
Illustration
Discussion and References
Take-Home Message
References
Exercises
FDR versus FWER
WPRDS Property
Positively Correlated Normal Test Statistics
Supervised Classification
Statistical Modeling
Bayes Classifier
Parametric Modeling
Semi-Parametric Modeling
Nonparametric Modeling
Empirical Risk Minimization
Misclassification Probability of the Empirical Risk Minimizer
Vapnik–Chervonenkis Dimension
Dictionary Selection
From Theoretical to Practical Classifiers
Empirical Risk Convexification
Statistical Properties
Support Vector Machines
AdaBoost
Classifier Selection
Discussion and References
Take-Home Message
References
Exercises
Linear Discriminant Analysis
VC Dimension of Linear Classifiers in Rd
Linear Classifiers with Margin Constraints
Spectral Kernel
Computation of the SVM Classifier
Kernel Principal Component Analysis (KPCA)
Gaussian Distribution
Gaussian Random Vectors
Chi-Square Distribution
Gaussian Conditioning
Probabilistic Inequalities
Basic Inequalities
Concentration Inequalities
McDiarmid Inequality
Gaussian Concentration Inequality
Symmetrization and Contraction Lemmas
Symmetrization Lemma
Contraction Principle
Birgé’s Inequality
Linear Algebra
Singular Value Decomposition (SVD)
Moore–Penrose Pseudo-Inverse
Matrix Norms
Matrix Analysis
Subdifferentials of Convex Functions
Subdifferentials and Subgradients
Examples of Subdifferentials
Reproducing Kernel Hilbert Spaces
Notations
Bibliography
Index
About the author
Christophe Giraud was a student of the École Normale Supérieure de Paris, and he received a Ph.D. in probability theory from the University Paris 6. He was assistant professor at the University of Nice from 2002 to 2008. He has been associate professor at the École Polytechnique since 2008 and professor at Paris Sud University (Orsay) since 2012. His current research focuses mainly on the statistical theory of high-dimensional data analysis and its applications to life sciences.
Summary
Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. However, analyzing such data has presented a challenge for statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise.
Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for handling high-dimensional data. The book is intended to expose the reader to the key concepts and ideas in the most simple settings possible while avoiding unnecessary technicalities.
Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this highly accessible text:
Describes the challenges related to the analysis of high-dimensional data
Covers cutting-edge statistical methods including model selection, sparsity and the lasso, aggregation, and learning theory
Provides detailed exercises at the end of every chapter with collaborative solutions on a wikisite
Illustrates concepts with simple but clear practical examples
Introduction to High-Dimensional Statistics is suitable for graduate students and researchers interested in discovering modern statistics for massive data. It can be used as a graduate text or for self-study.
Additional text
"Introduction to High-Dimensional Statistics by Christophe Giraud succeeds singularly at providing a structured introduction to this active field of research. … it is arguably the most accessible overview yet published of the mathematical ideas and principles that one needs to master to enter the field of high-dimensional statistics. … recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research."—Journal of the American Statistical Association, December 2015
"This is an attractive textbook. It will prove a very useful addition to any library or personal reference collection. … This book achieves well what it sets out to provide, an introduction to the mathematical foundations of high-dimensional statistics. … likely to stand the test of time well."—International Statistical Review, 83, 2015
"There is a real need for this book. It can quickly make someone new to the field familiar with modern topics in high-dimensional statistics and machine learning, and it is great as a textbook for an advanced graduate course."—Marten H. Wegkamp, Cornell University, Ithaca, New York, USA
"As a mathematician, I am quite charmed by the book and its focus on getting the important ideas through in as short a form as possible, all the while sacrificing none of the mathematical correctness. I certainly plan to use it myself as a support in my own lectures!"—Gilles Blanchard, University of Potsdam, Germany
"The book Introduction to High-Dimensional Statistics by Christophe Giraud succeeds singularly at providing a structured introduction to this active field of research. It describes a statistical pipeline where statistical principles enable the development of new methods, which, in turn, require a new mathematical analysis … A striking aspect of this book is the omnipresence of computational considerations across chapters. The author carefully points to potential implementations, R packages and algorithmic details that have now become inherent to modern high-dimensional statistical research … Giraud also offers informative and fairly comprehensive bibliographical notes that point to the main results of the field as well as connected work … It should be recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research." - Philippe Rigollet, Massachusetts Institute of Technology, USA
Product details
| Authors | Christophe Giraud (Paris Sud University) |
| Publisher | Taylor & Francis Ltd. |
| Languages | English |
| Format | Hardcover |
| Publication | 30.12.2014, delayed |
| EAN | 9781482237948 |
| ISBN | 978-1-4822-3794-8 |
| Pages | 150 |
| Series | Chapman & Hall/CRC Monographs on Statistics & Applied Probability |
| Categories | Natural sciences, medicine, computer science, technology > Mathematics > Probability theory, stochastics, mathematical statistics; Social sciences, law, economics > Economics > General topics, encyclopedias |