This textbook first appeared in early 2007 and has been used by numerous
students and practitioners and in many courses, ranging from dedicated
data mining classes to more general business analytics courses (including our
own experience teaching this material both online and in person for more than
10 years). The first edition, based on the Excel add-in XLMiner, was followed
by two more XLMiner editions, a JMP edition, and now this R edition, with
its companion website, www.dataminingbook.com.
This new R edition, which relies on the free and open-source R software,
presents output from R, as well as the code used to produce that output,
including specification of a variety of packages and functions. Unlike computer
science or statistics-oriented textbooks, the focus in this book is on data mining
concepts, and how to implement the associated algorithms in R. We assume a
basic facility with R.
For this R edition, two new co-authors stepped on board—Inbal Yahav and
Casey Lichtendahl—bringing both expertise teaching business analytics courses
using R and data mining consulting experience in business and government.
Such practical experience is important, since the open-source nature of R software
makes a plethora of approaches, packages, and functions available
for data mining. Given that the main goal of this book is to introduce data mining
concepts using R software for illustration, our challenge was to choose an
R code cocktail that supports highlighting the important concepts. In addition
to providing R code and output, this edition also incorporates updates and
new material based on feedback from instructors teaching MBA, undergraduate,
diploma, and executive courses, and from their students as well.
One update, compared to the first two editions of the book, is the title:
we now use Business Analytics in place of Business Intelligence. This reflects the
change in terminology since the second edition: Business Intelligence today
refers mainly to reporting and data visualization (“what is happening now”),
while Business Analytics has taken over “advanced analytics,” which includes
predictive analytics and data mining. In this new edition, we therefore use the
updated terms.