Data Mining Algorithms in C++
Released: May 06, 2017
Publisher: CreateSpace Independent Publishing Platform
Format: Paperback, 326 pages
Description:
In my decades of custom programming and consultation, I have explored diverse applications, including automated analysis of high-altitude photographs, automated medical diagnosis, realtime detection of threatening military vehicles, and automated trading of financial markets. A common thread in all of these applications is that I was faced with a multitude of observed or computed variables, and my task involved finding and analyzing relationships among these variables. As a result, I have accumulated a wealth of algorithms for doing so. This book presents theoretical and intuitive justifications, along with highly commented source code, for my favorite data-mining techniques.
This book makes no pretense of being 'complete' in any manner whatsoever. Please do not be annoyed if your own favorite techniques did not make my cut, or if the book ignores some popular standard techniques. These are simply the algorithms that I have found most useful in my own work over the years.
Some of them are venerable old techniques such as the use of maximum-likelihood factor analysis for determining the degree to which variables contain unique information, versus being redundant due to hidden common factors impacting several variables. Some of them are powerful modern techniques, such as Combinatorially Symmetric Cross Validation for determining if a model is hampered by overfitting, or Feature Weighting as Regularized Energy-Based Learning for ranking variables in predictive power when there are too few training cases to employ traditional methods. Some of them are (I believe) my own invention, such as a method for clustering variables in the restricted context of a subspace of interest, and visual display of anomalous regions in which joint and marginal densities conflict, or in which contribution to mutual information is concentrated.
But all of them share a great quality: I have found them to be exceptionally useful in my own data-mining endeavors. I suspect that you will as well.