The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics)
Authors: Trevor Hastie, Robert Tibshirani, Jerome Friedman
Publisher: Springer
Category: Book
List Price: $89.95
Buy New: $69.92 as of 3/16/2010 01:58 WIT
You Save: $20.03 (22%)
New (30) Used (15) from $59.82
Seller: allnewbooks
Rating: 38 reviews
Media: Hardcover
Edition: 2nd ed. 2009. Corr. 3rd printing
Pages: 746
Number Of Items: 1
Shipping Weight (lbs): 3.1
Dimensions (in): 9 x 6.3 x 1.6
ISBN: 0387848576
Dewey Decimal Number: 519
EAN: 9780387848570
Editorial Reviews:
Product Description
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting (the first comprehensive treatment of this topic in any book).
This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Customer Reviews:
Showing reviews 1-5 of 38
Useful book on data mining February 6, 2002 frank lindemann 90 out of 96 found this review helpful
I use data mining tools in my financial engineering and financial modeling work and I have found this book to be very useful. This book provides two crucial types of information. First, it provides enough theory to allow a potential user to understand the essential insights that motivate specific techniques and to evaluate the situations in which those techniques are appropriate. Second, the book gives the exact algorithms to implement the various techniques. While no book I have seen covers every data mining methodology available, this one has the strongest coverage I have seen in additive models, non-linear regression, and CART/MART (regression/classification trees). It also has very strong coverage in many other areas. I highly recommend it.
data mining from the viewpoint of statisticians January 24, 2008 Michael R. Chernick (Holland PA) 26 out of 26 found this review helpful
Data mining is a field developed by computer scientists, but many of its crucial elements are embedded in important and subtle statistical concepts. Statisticians can play an important role in the development of this field but, as was the case with artificial intelligence, expert systems and neural networks, the statistical research community has been slow to respond. Hastie, Tibshirani and Friedman are changing this.
Friedman has been a major player in pattern recognition of high dimensional data, in tree classification, regularized discriminant analysis and multivariate adaptive regression splines. He has also done some exciting new research on boosting methods.
Hastie and Tibshirani invented additive models which are very general types of regression models. Tibshirani invented the lasso method and is a leader among the researchers on bootstrap. Hastie invented principal curves and surfaces.
These tools and the expertise of these authors make them naturals to contribute to advances in data mining. They come with great expertise and see data mining from the statistical perspective. They see it as part of a more general process of statistical learning from data.
The book is well written and illustrated with many pretty color graphs and figures. Color adds a dimension in pattern recognition and the authors exploit it in this book. It is really the first of its kind that treats data mining from a statistical perspective and is so comprehensive and up-to-date.
The important statistical tools covered in this book include, under the category of supervised learning: regression, discriminant analysis, kernel methods, model assessment and selection, bootstrapping, maximum likelihood and Bayesian inference, additive models, classification and regression trees, multivariate adaptive regression splines, boosting, regularization methods, nearest neighbor classification, k means clustering algorithms and neural networks. These methods are illustrated using real problems.
Similarly under the category of unsupervised learning, clustering and association are covered. They cover the latest developments in principal components and principal curves, multidimensional scaling, factor analysis and projection pursuit.
This book is innovative and fresh. It is an important contribution that will become a classic. The level is between intermediate and advanced. Good for an advanced special topics course for graduate students in statistics. A comparable text is the text by Mannila, Hand and Smyth.
This book made effective use of color and maintained a competitive price. This had a major impact on publishers like Wiley that could not sell a book at this size and initial price. Wiley is still looking for a book comparable to this one that they can use to compete with Springer-Verlag. I know this information because I heard from the Wiley acquisitions editor that I worked with on my two books.
Counter to review from Sep 8 September 11, 2003 Dr. Thomas Lengauer (Germany) 21 out of 24 found this review helpful
The review from September 8 expresses an opinion which is the exact opposite of mine, and is worded so strongly that I have to object. I gave a course using the book to bioinformaticians, most of them with a computer science background, and found the book exceptionally well prepared and suitable for a graduate course. The book serves the dual purpose of an introduction and a reference. An especially nice feature is how the authors explain the relationships and differences between different methods. By doing so, they provide context which I have not seen in any other book on this subject. The book is a very nice combination of basic theory and performance evaluation on data from a wide variety of domains and it is quite up-to-date. It has a well developed website going with it and the graphical material can be obtained electronically from the publisher. The book is an outstanding contribution to the field.
Most Useful Machine Learning Book September 24, 2007 K. Branson (Pasadena, CA) 5 out of 5 found this review helpful
This book describes most of the important topics in machine learning. Most machine learning books just present a criterion and an optimization algorithm. For instance, LDA is often presented as: here is the Fisher criterion, it seems like a good thing to maximize. "The Elements of Statistical Learning" also presents that this is the right criterion if the distributions of the data for each class are Gaussian with the same covariance. This book puts all the algorithms in the same statistical language, which makes them easy to compare and choose between.
I also appreciate the emphasis this book puts on algorithms that are more recently popular/effective. I very much appreciate the discussions of logistic regression vs. LDA, ridge and lasso regression, boosting/additive logistic regression and additive trees, decision and regression trees, ...
The only qualm I have with this book is that it is rather biased toward the authors' own research. It is difficult from reading this book alone to differentiate between classical techniques and the authors' recent proposed algorithms.
my big brown book of statistic learning tools March 22, 2009 S. Matthews (Mainz, Germany) 6 out of 7 found this review helpful
This is a quite interesting, and extremely useful book, but it is wearing to read in large chunks. The problem, if you want to call it that, is that it is essentially a 700 page catalogue of clever hacks in statistical learning. From a technical point of view it is well enough structured, but there is not the slightest trace of an overarching philosophy. And if you don't actually have a philosophical perspective in place before you start, the read you face might well be an even harder grind. Be warned.
Some of the reviews here complain that there is too much math. I don't think that is an issue. If you have decent intuitions in geometry, linear algebra, probability and information theory, then you should be able to cruise through and/or browse in a fairly relaxed way. If you don't have those intuitions, then you are attempting to read the wrong book.
There were a couple of things that I expected (things I happen to know a bit about), but that were missing. On the unsupervised learning side, the discussion of Gaussian mixture clustering was, I thought, a bit short and superficial, and did not bring out the combination of theoretical and practical power that the method offers. On the supervised learning side, I was surprised that a book that dedicates so much time to linear regression finds no room for a discussion of Gaussian process regression as far as I could see (the nearest point of approach is the use of Gaussian radial basis functions [oops: having written that, I immediately came across a brief discussion (§5.8.1) of, essentially, GP regression - though with no reference to standard literature]).