Book Title: Machine Learning: A Practical Approach on the Statistical Learning Theory
Author: Rodrigo Fernandes de Mello, Moacir Antonelli Ponti
Publisher: Springer
Pages: 368
Publishing Date: Sep 2018
Language: English
Size: 11 MB
Format: PDF (text version)
ISBN: 978-3-319-94989-5
Edition: 1st
This book presents the Statistical Learning Theory in a detailed and easy-to-understand way, using practical examples, algorithms and source code. It can be used as a textbook in graduate or undergraduate courses, by self-learners, or as a reference on the main theoretical concepts of Machine Learning. Fundamental concepts of Linear Algebra and Optimization applied to Machine Learning are provided, as well as source code in R, making the book as self-contained as possible.
It starts with an introduction to Machine Learning concepts and algorithms such as the Perceptron, the Multilayer Perceptron and Distance-Weighted Nearest Neighbors, with worked examples, providing the foundation the reader needs to understand the Bias-Variance Dilemma, the central point of the Statistical Learning Theory.
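To give a taste of the sort of R code the book works with, here is a minimal Perceptron training sketch. This is my own illustrative version (the function name and toy data are mine), not a listing from the book:

# Minimal Perceptron: learns weights w and bias b for labels y in {-1, +1}.
# Illustrative sketch only; not code taken from the book.
perceptron_train <- function(X, y, eta = 0.1, max_epochs = 100) {
  w <- rep(0, ncol(X))   # weight vector, one entry per feature
  b <- 0                 # bias term
  for (epoch in 1:max_epochs) {
    errors <- 0
    for (i in 1:nrow(X)) {
      # predict with the current hyperplane: sign(w . x + b)
      y_hat <- sign(sum(w * X[i, ]) + b)
      if (y_hat != y[i]) {
        # misclassified: shift the hyperplane toward this example
        w <- w + eta * y[i] * X[i, ]
        b <- b + eta * y[i]
        errors <- errors + 1
      }
    }
    if (errors == 0) break  # converged on a separating hyperplane
  }
  list(w = w, b = b)
}

# Toy linearly separable data: class -1 around (0,0), class +1 around (2,2)
set.seed(1)
X <- rbind(matrix(rnorm(20, 0, 0.3), ncol = 2),
           matrix(rnorm(20, 2, 0.3), ncol = 2))
y <- c(rep(-1, 10), rep(1, 10))
model <- perceptron_train(X, y)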
Afterwards, it introduces all assumptions and formalizes the Statistical Learning Theory, allowing the practical study of different classification algorithms. It then proceeds through concentration inequalities until arriving at the Generalization and Large-Margin bounds, which provide the main motivation for Support Vector Machines.
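The concentration-inequality material lends itself to a quick experiment. The R sketch below, my own and not the book's listing, draws repeated i.i.d. Bernoulli samples and compares the observed deviation frequency of the sample mean against the two-sided bound 2*exp(-2*n*eps^2):

# Empirical check of the two-sided Chernoff/Hoeffding-style bound for a
# Bernoulli(p) sample mean: P(|mean - p| > eps) <= 2 * exp(-2 * n * eps^2).
# A sketch under the i.i.d. assumption, not the book's own code.
set.seed(42)
p      <- 0.3    # true probability of the event
eps    <- 0.1    # deviation tolerance
n      <- 100    # sample size per experiment
trials <- 10000  # number of repeated experiments

deviations <- replicate(trials, {
  nu <- mean(rbinom(n, size = 1, prob = p))  # empirical frequency
  abs(nu - p) > eps
})

cat("observed P(|nu - p| > eps):", mean(deviations), "\n")
cat("bound 2*exp(-2*n*eps^2):   ", 2 * exp(-2 * n * eps^2), "\n")

The observed frequency stays below the bound, which is the practical content of such generalization guarantees: the bound is loose but holds regardless of the underlying distribution.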
From there, it introduces the optimization concepts necessary to implement Support Vector Machines. As a next stage of development, the book finishes with a discussion of SVM kernels as a way to study data spaces and improve classification results.
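As a preview of the kernel chapter, the following R sketch computes Gram matrices for two of the typical kernels covered there, the polynomial kernel k(x, z) = (x . z + coef0)^degree and the Radial Basis Function kernel k(x, z) = exp(-gamma * ||x - z||^2). The function and parameter names are my own illustration, not the book's code:

# Two typical SVM kernels evaluated pairwise over a toy dataset.
poly_kernel <- function(x, z, degree = 2, coef0 = 1) {
  (sum(x * z) + coef0)^degree
}
rbf_kernel <- function(x, z, gamma = 1) {
  exp(-gamma * sum((x - z)^2))
}

# Gram matrix: kernel values for all pairs of rows of X
gram <- function(X, kernel, ...) {
  n <- nrow(X)
  K <- matrix(0, n, n)
  for (i in 1:n) for (j in 1:n) K[i, j] <- kernel(X[i, ], X[j, ], ...)
  K
}

X <- rbind(c(0, 0), c(1, 1), c(2, 0))
gram(X, poly_kernel, degree = 2)  # 3x3 polynomial Gram matrix
gram(X, rbf_kernel, gamma = 0.5)  # 3x3 RBF Gram matrix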
Table of Contents
Foreword......Page 3
Contents......Page 4
Acronyms......Page 7
1.1 Machine Learning Definition......Page 8
1.2 Main Types of Learning......Page 11
1.3 Supervised Learning......Page 12
1.4 How a Supervised Algorithm Learns?......Page 26
1.5.1 The Perceptron......Page 35
1.5.2 Multilayer Perceptron......Page 59
1.6 Concluding Remarks......Page 79
References......Page 80
2.1 Motivation......Page 82
2.2 Basic Concepts......Page 83
2.2.1 Probability Densities and Joint Probabilities......Page 84
2.2.2 Identically and Independently Distributed Data......Page 89
2.2.3 Statistical Learning Theory Assumptions......Page 96
2.2.4 Expected Risk and Generalization......Page 97
2.2.5 Bounds for Generalization: A Practical Example......Page 99
2.2.6 Bayes Risk and Universal Consistency......Page 104
2.2.7 Consistency, Overfitting and Underfitting......Page 105
2.2.8 Bias of Classification Algorithms......Page 108
2.3 Empirical Risk Minimization Principle......Page 109
2.3.1 Consistency and the ERM Principle......Page 111
2.3.2 Restriction of the Space of Admissible Functions......Page 112
2.3.3 Ensuring Uniform Convergence in Practice......Page 115
2.4 Symmetrization Lemma and the Shattering Coefficient......Page 117
2.4.1 Shattering Coefficient as a Capacity Measure......Page 118
2.4.2 Making the ERM Principle Consistent for Infinite Functions......Page 120
2.5 Generalization Bounds......Page 122
2.6 The Vapnik-Chervonenkis Dimension......Page 125
2.6.1 Margin Bounds......Page 128
2.7 Computing the Shattering Coefficient......Page 129
2.8 Concluding Remarks......Page 133
References......Page 134
3.2 Distance-Weighted Nearest Neighbors......Page 136
3.3 Using the Chernoff Bound......Page 145
3.4 Using the Generalization Bound......Page 153
3.5 Using the SVM Generalization Bound......Page 156
3.6 Empirical Study of the Biases of Classification Algorithms......Page 164
3.8 List of Exercises......Page 167
References......Page 168
4.2.1 Basis......Page 169
4.2.2 Linear Transformation......Page 171
4.2.3 Inverses of Linear Transformations......Page 174
4.2.4 Dot Products......Page 176
4.2.5 Change of Basis and Orthonormal Basis......Page 178
4.2.6 Eigenvalues and Eigenvectors......Page 180
4.3 Using Basic Algebra to Build a Classification Algorithm......Page 185
4.4 Hyperplane-Based Classification: An Intuitive View......Page 196
4.5 Hyperplane-Based Classification: An Algebraic View......Page 204
4.5.1 Lagrange Multipliers......Page 208
4.5.2 Karush-Kuhn-Tucker Conditions......Page 212
4.6 Formulating the Hard-Margin SVM Optimization Problem......Page 217
4.7 Formulating the Soft-Margin SVM Optimization Problem......Page 225
4.9 List of Exercises......Page 231
References......Page 232
5.2 Introducing Optimization Problems......Page 233
5.3 Main Types of Optimization Problems......Page 234
5.4.1 Solving Through Graphing......Page 240
5.4.2.1 Using the Table and Rules......Page 247
5.4.2.2 Graphical Interpretation of Primal and Dual Forms......Page 251
5.4.2.3 Using Lagrange Multipliers......Page 255
5.4.3 Using an Algorithmic Approach to Solve Linear Problems......Page 259
5.4.4 On the KKT Conditions for Linear Problems......Page 269
5.4.4.1 Applying the Rules......Page 272
5.4.4.2 Graphical Interpretation of the KKT Conditions......Page 275
5.5 Convex Optimization Problems......Page 277
5.5.1 Interior Point Methods......Page 287
5.5.1.1 Primal-Dual IPM for Linear Problem......Page 288
5.5.2 IPM to Solve the SVM Optimization Problem......Page 303
5.5.3 Solving the SVM Optimization Problem Using Package LowRankQP......Page 317
5.6 Concluding Remarks......Page 328
References......Page 329
6 A Brief Introduction on Kernels......Page 331
6.1 Definitions, Typical Kernels and Examples......Page 332
6.1.1 The Polynomial Kernel......Page 333
6.1.2 The Radial Basis Function Kernel......Page 334
6.1.3 The Sigmoidal Kernel......Page 335
6.1.4 Practical Examples with Kernels......Page 336
6.2 Principal Component Analysis......Page 338
6.3 Kernel Principal Component Analysis......Page 342
6.4 Exploratory Data Analysis......Page 345
6.4.1 How Does the Data Space Affect the Kernel Selection?......Page 346
6.4.2 Kernels on a 3-Class Problem......Page 355
6.4.3 Studying the Data Spaces in an Empirical Fashion......Page 358
6.4.4 Additional Notes on Kernels......Page 361
6.5 SVM Kernel Trick......Page 362
6.6 A Quick Note on Mercer's Theorem......Page 366
6.8 List of Exercises......Page 367
Reference......Page 368
== Reply to this post for the free download ==
Disclaimer: This resource is provided for academic research and reference only. The poster assumes no legal responsibility; downloaders are asked to support the authors by purchasing a legitimate copy.