Feature Selection for Knowledge Discovery and Data Mining

Huan Liu, Hiroshi Motoda (Authors)

Book | Hardcover

214 pages

1998
Springer (Publisher)
978-0-7923-8198-3 (ISBN)

As computer power grows and data collection technologies advance, a plethora of data is generated in almost every field where computers are used. The computer generated data should be analyzed by computers; without the aid of computing technologies, it is certain that huge amounts of data collected will not ever be examined, let alone be used to our advantage. Even with today's advanced computer technologies (e.g., machine learning and data mining systems), discovering knowledge from data can still be fiendishly hard due to the characteristics of the computer generated data. In their simplest form, raw data are represented as feature values. The size of a dataset can be measured in two dimensions, number of features (N) and number of instances (P). Both N and P can be enormously large. This enormity may cause serious problems to many data mining systems. Feature selection is one of the long existing methods that deal with these problems. Its objective is to select a minimal subset of features according to some reasonable criteria so that the original task can be achieved equally well, if not better. By choosing a minimal subset of features, irrelevant and redundant features are removed according to the criterion. When N is reduced, the data space shrinks and in a sense, the dataset is now a better representative of the whole data population. If necessary, the reduction of N can also give rise to the reduction of P by eliminating duplicates.
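
The last point of the description, that reducing N can in turn reduce P, can be illustrated with a minimal sketch. The toy data, the choice of which feature counts as irrelevant, and the helper names `project` and `drop_duplicates` are all assumptions made for illustration; they are not taken from the book, and the sketch does not implement any of the book's selection algorithms.

```python
def project(instances, keep):
    """Keep only the feature columns listed in `keep` for every instance."""
    return [tuple(row[i] for i in keep) for row in instances]


def drop_duplicates(instances):
    """Remove repeated instances while preserving first-seen order."""
    return list(dict.fromkeys(instances))


# Toy dataset: P = 4 instances, N = 3 features.
# The third feature is assumed (for illustration only) to be irrelevant.
data = [
    ("sunny", "hot", 17),
    ("sunny", "hot", 42),
    ("rainy", "cool", 8),
    ("rainy", "cool", 23),
]

selected = [0, 1]  # hypothetical result of a feature selection step
reduced = drop_duplicates(project(data, selected))

print(len(data[0]), "->", len(reduced[0]), "features")
print(len(data), "->", len(reduced), "instances")
```

Once the assumed irrelevant column is dropped, instances that differed only in that column become identical, so removing duplicates shrinks the number of instances as well.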

1. Data Processing and KDD.- 1.1 Inductive Learning from Observation.- 1.2 Knowledge Discovery and Data Mining.- 1.3 Feature Selection and Its Roles in KDD.- 1.4 Summary.- References.
2. Perspectives of Feature Selection.- 2.1 Feature Selection for Classification.- 2.2 A Search Problem.- 2.3 Selection Criteria.- 2.4 Univariate vs. Multivariate Feature Selection.- 2.5 Filter vs. Wrapper Models.- 2.6 A Unified View.- 2.7 Conclusion.- References.
3. Aspects of Feature Selection.- 3.1 Overview.- 3.2 Basic Feature Generation Schemes.- 3.3 Search Strategies.- 3.4 Evaluation Measures With Examples.- 3.5 Conclusion.- References.
4. Feature Selection Methods.- 4.1 Representative Feature Selection Algorithms.- 4.2 Employing Feature Selection Methods.- 4.3 Conclusion.- References.
5. Evaluation and Application.- 5.1 Performance Assessment.- 5.2 Evaluation Methods for Classification.- 5.3 Evaluation of Selected Features.- 5.4 Evaluation: Some Examples.- 5.5 Balance between Different Performance Criteria.- 5.6 Applying Feature Selection Methods.- 5.7 Conclusions.- References.
6. Feature Transformation and Dimensionality Reduction.- 6.1 Feature Extraction.- 6.2 Feature Construction.- 6.3 Feature Discretization.- 6.4 Beyond the Classification Model.- 6.5 Conclusions.- References.
7. Less is More.- 7.1 A Look Back.- 7.2 A Glance Ahead.- References.
Appendices.
A-Data Mining and Knowledge Discovery Sources.- A.1 Web Site Links.- A.2 Electronic Newsletters, Pages and Journals.- A.3 Some Publically Available Tools.
B-Data Sets and Software Used in This Book.- B.1 Data Sets.- B.2 Software.- References.

Series	The Springer International Series in Engineering and Computer Science ; 454
Additional info	XXIII, 214 p.
Place of publication	Dordrecht
Language	English
Dimensions	155 x 235 mm
Subject area	Mathematics / Computer Science ► Computer Science ► Databases
	Computer Science ► Theory / Studies ► Algorithms
	Computer Science ► Theory / Studies ► Cryptology
	Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
ISBN-10	0-7923-8198-X / 079238198X
ISBN-13	978-0-7923-8198-3 / 9780792381983
Condition	New