
Pattern Recognition and Machine Learning

Published by Supoet Srinutapong, 2018-11-26 20:04:28

Description: This leading textbook provides a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners. No previous knowledge of pattern recognition or machine learning concepts is assumed. This is the first machine learning textbook to include a comprehensive coverage of recent developments such as probabilistic graphical models and deterministic inference methods, and to emphasize a modern Bayesian perspective. It is suitable for courses on machine learning, statistics, computer science, signal processing, computer vision, data mining, and bioinformatics. This hardcover book has 738 pages in full colour and includes 431 graded exercises.

Keywords: machine learning, statistics, computer science, signal processing, computer vision, data mining, bioinformatics


266   5. NEURAL NETWORKS

in which the parameter $\xi$ is drawn from a distribution $p(\xi)$, then the error function defined over this expanded data set can be written as

\[
\widetilde{E} = \frac{1}{2} \iiint \{y(s(\mathbf{x},\xi)) - t\}^{2}\, p(t|\mathbf{x})\, p(\mathbf{x})\, p(\xi)\, \mathrm{d}\mathbf{x}\, \mathrm{d}t\, \mathrm{d}\xi. \tag{5.130}
\]

We now assume that the distribution $p(\xi)$ has zero mean with small variance, so that we are only considering small transformations of the original input vectors. We can then expand the transformation function as a Taylor series in powers of $\xi$ to give

\[
s(\mathbf{x},\xi) = s(\mathbf{x},0) + \xi \left.\frac{\partial}{\partial \xi} s(\mathbf{x},\xi)\right|_{\xi=0} + \frac{\xi^{2}}{2} \left.\frac{\partial^{2}}{\partial \xi^{2}} s(\mathbf{x},\xi)\right|_{\xi=0} + O(\xi^{3}) = \mathbf{x} + \xi\boldsymbol{\tau} + \frac{1}{2}\xi^{2}\boldsymbol{\tau}' + O(\xi^{3})
\]

where $\boldsymbol{\tau}'$ denotes the second derivative of $s(\mathbf{x},\xi)$ with respect to $\xi$ evaluated at $\xi = 0$, and $\boldsymbol{\tau}$ the corresponding first derivative. This allows us to expand the model function to give

\[
y(s(\mathbf{x},\xi)) = y(\mathbf{x}) + \xi\,\boldsymbol{\tau}^{\mathrm{T}}\nabla y(\mathbf{x}) + \frac{\xi^{2}}{2}\left[(\boldsymbol{\tau}')^{\mathrm{T}}\nabla y(\mathbf{x}) + \boldsymbol{\tau}^{\mathrm{T}}\nabla\nabla y(\mathbf{x})\,\boldsymbol{\tau}\right] + O(\xi^{3}).
\]

Substituting into the mean error function (5.130) and expanding, we then have

\[
\begin{aligned}
\widetilde{E} = {} & \frac{1}{2}\iint \{y(\mathbf{x}) - t\}^{2}\, p(t|\mathbf{x})\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}\, \mathrm{d}t \\
& + \mathbb{E}[\xi]\iint \{y(\mathbf{x}) - t\}\, \boldsymbol{\tau}^{\mathrm{T}}\nabla y(\mathbf{x})\, p(t|\mathbf{x})\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}\, \mathrm{d}t \\
& + \mathbb{E}[\xi^{2}]\iint \left[\{y(\mathbf{x}) - t\}\,\frac{1}{2}\left\{(\boldsymbol{\tau}')^{\mathrm{T}}\nabla y(\mathbf{x}) + \boldsymbol{\tau}^{\mathrm{T}}\nabla\nabla y(\mathbf{x})\,\boldsymbol{\tau}\right\} + \frac{1}{2}\left(\boldsymbol{\tau}^{\mathrm{T}}\nabla y(\mathbf{x})\right)^{2}\right] p(t|\mathbf{x})\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}\, \mathrm{d}t + O(\xi^{3}).
\end{aligned}
\]

Because the distribution of transformations has zero mean we have $\mathbb{E}[\xi] = 0$. Also, we shall denote $\mathbb{E}[\xi^{2}]$ by $\lambda$. Omitting terms of $O(\xi^{3})$, the average error function then becomes

\[
\widetilde{E} = E + \lambda\Omega \tag{5.131}
\]

where $E$ is the original sum-of-squares error, and the regularization term $\Omega$ takes the form

\[
\Omega = \int \left[\{y(\mathbf{x}) - \mathbb{E}[t|\mathbf{x}]\}\,\frac{1}{2}\left\{(\boldsymbol{\tau}')^{\mathrm{T}}\nabla y(\mathbf{x}) + \boldsymbol{\tau}^{\mathrm{T}}\nabla\nabla y(\mathbf{x})\,\boldsymbol{\tau}\right\} + \frac{1}{2}\left(\boldsymbol{\tau}^{\mathrm{T}}\nabla y(\mathbf{x})\right)^{2}\right] p(\mathbf{x})\, \mathrm{d}\mathbf{x} \tag{5.132}
\]

in which we have performed the integration over $t$.
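The expansion above can be checked numerically in a minimal one-dimensional sketch. The model $y(x) = \tanh(ax)$, the data point $(x, t)$, and the choice of a pure translation $s(x,\xi) = x + \xi$ (for which $\tau = 1$ and $\tau' = 0$) are illustrative assumptions, not from the text; a symmetric two-point distribution $\xi = \pm\sigma$ gives $\mathbb{E}[\xi] = 0$ and $\mathbb{E}[\xi^2] = \sigma^2$ exactly, so the averaged error should match $E + \lambda\Omega$ up to $O(\sigma^4)$:

```python
import math

# Illustrative smooth model y(x) = tanh(a*x) and its first two derivatives.
a = 1.3
def y(x):   return math.tanh(a * x)
def dy(x):  return a / math.cosh(a * x) ** 2
def d2y(x): return -2 * a**2 * math.tanh(a * x) / math.cosh(a * x) ** 2

x, t = 0.4, 0.9     # a single (input, target) pair, chosen arbitrarily
sigma = 1e-2        # small transformation scale

# Expanded error (5.130) averaged exactly over the two-point distribution
# xi in {+sigma, -sigma}, each with probability 1/2.
E_tilde = 0.5 * (0.5 * (y(x + sigma) - t) ** 2 +
                 0.5 * (y(x - sigma) - t) ** 2)

# Original sum-of-squares error and the regularizer of (5.132) for a
# translation (tau = 1, tau' = 0), before integrating over t:
# Omega = (1/2)[{y(x) - t} y''(x) + y'(x)^2].
E = 0.5 * (y(x) - t) ** 2
Omega = 0.5 * ((y(x) - t) * d2y(x) + dy(x) ** 2)
lam = sigma ** 2    # lambda = E[xi^2]

# Residual should be of order sigma^4, far smaller than lam * Omega itself.
print(abs(E_tilde - (E + lam * Omega)))
```

Because the two-point distribution has no odd moments, the first neglected term is fourth order in $\sigma$, so the residual printed at the end is several orders of magnitude smaller than the correction $\lambda\Omega$ it validates.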



























