Data Mining: A Tutorial-Based Primer, Second Edition by Richard J. Roiger


By Richard J. Roiger

"Dr. Roiger does an excellent job of describing in step-by-step detail the formulae involved in various data mining algorithms, along with illustrations. In addition, his tutorials in Weka software provide excellent grounding for students in comprehending the underpinnings of machine learning as applied to data mining. The inclusion of RapidMiner software tutorials and examples in the book is also a definite plus since it is one of the most popular data mining software platforms in use today."

--Robert Hughes, Golden Gate University, San Francisco, CA, USA

Data Mining: A Tutorial-Based Primer, Second Edition provides a comprehensive introduction to data mining with a focus on model building and testing, as well as on interpreting and validating results. The text guides students to understand how data mining can be employed to solve real problems and to recognize whether a data mining solution is a feasible alternative for a specific problem. Fundamental data mining strategies, techniques, and evaluation methods are presented and implemented with the help of well-known software tools.

Several new topics have been added to the second edition, including an introduction to big data and data analytics, ROC curves, Pareto lift charts, methods for handling large-sized, streaming, and imbalanced data, support vector machines, and extended coverage of textual data mining. The second edition contains tutorials for attribute selection, dealing with imbalanced data, outlier analysis, time series analysis, mining textual data, and more.

The text provides in-depth coverage of RapidMiner Studio and Weka’s Explorer interface. Both software tools are used for stepping students through the tutorials depicting the knowledge discovery process. This allows the reader maximum flexibility for their hands-on data mining experience.




Similar machine theory books

Genetic Programming: First European Workshop, EuroGP’98 Paris, France, April 14–15, 1998 Proceedings

This book constitutes the refereed proceedings of the First European Workshop on Genetic Programming, EuroGP'98, held in Paris, France, in April 1998, under the sponsorship of EvoNet, the European Network of Excellence in Evolutionary Computing. The volume presents 12 revised full papers and 10 short presentations carefully selected for inclusion in the book.

Operators for Similarity Search: Semantics, Techniques and Usage Scenarios

This book provides a comprehensive tutorial on similarity operators. The authors systematically survey the set of similarity operators, primarily focusing on their semantics, while also touching upon mechanisms for processing them effectively. The book starts by providing introductory material on similarity search systems, highlighting the central role of similarity operators in such systems.

Graph-Based Social Media Analysis

Focusing on the mathematical foundations of social media analysis, Graph-Based Social Media Analysis provides a comprehensive introduction to the use of graph analysis in the study of social and digital media. It addresses an important scientific and technological challenge, namely the confluence of graph analysis and network theory with linear algebra, digital media, machine learning, big data analysis, and signal processing.

The Digital Dionysus: Nietzsche and the Network-Centric Condition

Patricia Ticineto Clough: 'A superb collaboration among critical theorists from a variety of disciplines to explore the import of Nietzschean thought for contemporary issues in media, technologies and digitization. The result is The Digital Dionysus, a must-read for scholars in media, aesthetics, politics, and philosophy'

Extra info for Data Mining: A Tutorial-Based Primer, Second Edition

Sample text

USING WEKA AND RAPIDMINER

Students are likely to benefit most by developing a working knowledge of both tools. This is best accomplished by students beginning their data mining experience with Weka’s Explorer interface. The Explorer is easy to navigate and makes several of the more difficult preprocessing tasks transparent to the user. Missing data are automatically handled by most data mining algorithms, and data type conversions are automatic. The format for model evaluation, be it a training/test set scenario or cross-validation, is implemented with a simple click of the mouse.
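For readers who want to see what the two evaluation formats mentioned above amount to, here is a minimal sketch in plain Python. It is not how Weka or RapidMiner implement them; the function names, the 30% test fraction, and the fixed random seed are all assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them as a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def k_fold_splits(rows, k=10, seed=0):
    """Yield (train, test) pairs so that each row is in exactly one test fold."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled rows into k folds, round-robin.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test


if __name__ == "__main__":
    data = list(range(20))
    train, test = train_test_split(data)
    print(len(train), len(test))
    for train, test in k_fold_splits(data, k=5):
        print(len(train), len(test))
```

In the training/test scenario a model is built once and scored once on the held-out rows. In cross-validation the build-and-score cycle repeats k times, so every row is used for testing exactly once.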

Before each data item enters the warehouse, the item is time-stamped, transformed as necessary, and checked for errors. The transfer process can be complex, especially when several operational databases are involved. Once entered, the records in the data warehouse become read-only and are not subject to change. A data warehouse stores all data relating to the same subject (such as a customer) in the same table. This distinguishes the data warehouse from an operational database, which stores information so as to optimize transaction processing.
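The loading steps described above (error checking, transformation, time-stamping, read-only storage grouped by subject) can be sketched roughly as follows. This is an illustrative toy, not a real warehouse design: the `customer_id` and `name` fields, the validation rule, and the `loaded_at` key are hypothetical names chosen for the example.

```python
from datetime import datetime, timezone
from types import MappingProxyType


def load_into_warehouse(warehouse, source_rows):
    """Add checked, transformed, time-stamped records to the 'customer' table.

    Returns the rows that failed the error check.
    """
    rejected = []
    for row in source_rows:
        # Error check (hypothetical rule): every record needs a customer id.
        if row.get("customer_id") is None:
            rejected.append(row)
            continue
        # Transform as necessary (illustrative): normalize the name field.
        record = dict(row)
        record["name"] = str(record.get("name", "")).strip().title()
        # Time-stamp the record as it enters the warehouse.
        record["loaded_at"] = datetime.now(timezone.utc).isoformat()
        # Store a read-only view; all customer data goes in the same table.
        warehouse.setdefault("customer", []).append(MappingProxyType(record))
    return rejected


if __name__ == "__main__":
    warehouse = {}
    bad = load_into_warehouse(
        warehouse,
        [{"customer_id": 1, "name": " ada lovelace "},
         {"customer_id": None, "name": "unknown"}],
    )
    print(warehouse["customer"][0]["name"], len(bad))
```

Wrapping each stored record in `MappingProxyType` means an attempt to assign to one of its fields raises `TypeError`, which loosely mirrors the read-only property described in the text.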

Chapter 10 employs RapidMiner to cover the same material. There are advantages to examining at least some of the material in both chapters. Weka’s neural network function is able to mine data having a numeric output attribute, and RapidMiner’s self-organizing map operator can perform dimensionality reduction as well as unsupervised clustering.

SUGGESTED COURSE OUTLINES

The text is appropriate for the undergraduate information systems or computer science student. It can also provide assistance for the graduate student who desires a working knowledge of data mining and knowledge discovery.
