
Classification, (Big) Data Analysis and Statistical Learning (eBook)

eBook Download: PDF
2018
XVI, 242 pages
Springer International Publishing (publisher)
978-3-319-55708-3 (ISBN)

106,99 incl. VAT
(CHF 104,50)
The eBook is sold by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Available for immediate download

This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, and social networks. It covers both methodological aspects and applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pula (Cagliari), Italy, October 8-10, 2015.



Francesco Mola is full professor of Statistics at the Department of Business and Economics at the University of Cagliari. He received his Ph.D. in Computational Statistics and Data Analysis from the University of Naples Federico II. His research interests are in the field of multivariate data analysis and statistical learning, particularly data science and computational statistics. He has published more than sixty papers in international journals, encyclopedias, conference proceedings, and edited books.

Claudio Conversano is associate professor of Statistics at the Department of Business and Economics at the University of Cagliari. He received his Ph.D. in Computational Statistics and Data Analysis from the University of Naples Federico II. His research interests include nonparametric statistics, statistical learning and computational finance. He has published more than forty papers in international journals, encyclopedias, conference proceedings, and edited books.

Maurizio Vichi is full professor of Statistics and head of the Department of Statistical Sciences at the Sapienza University of Rome. He is president of the Federation of European National Statistical Societies (FENStatS), former president of the Italian Statistical Society, and of the International Federation of Classification Societies (IFCS). He is coordinating editor of the journal Advances in Data Analysis and Classification, editor of the international book series Classification, Data Analysis and Knowledge Organization, and the series Studies in Theoretical and Applied Statistics, published by Springer. He is a member of ESAC


Preface 6
Acknowledgements 8
Contents 10
Contributors 13
Big Data 17
From Big Data to Information: Statistical Issues Through a Case Study 18
1 Introduction on Big Data 18
2 Case Study 20
2.1 ISTAT Dataset 21
2.2 Telecom Italia Dataset 21
2.3 Method of Comparison of the Two Datasets 23
2.4 Results 23
3 Conclusions 25
References 25
Enhancing Big Data Exploration with Faceted Browsing 27
1 Introduction 28
2 Faceted Browsing and Big Data 28
3 Improving Faceted Navigation with Bayesian Networks 30
4 Big Data and Solr 31
5 Test 33
6 Conclusion and Future Work 34
References 35
Big Data Meet Pharmaceutical Industry: An Application on Social Media Data 36
1 Introduction 36
2 The Goal and the Data 38
3 Methodology and Results 40
4 Further Developments 41
References 42
Electre Tri Machine Learning Approach to the Record Linkage 44
1 Linked Data: The Record Linkage 44
2 The Multiple Criteria Electre Tri Method: A Brief Description 46
3 Application to Real Data: A Preliminary Stage 48
4 Conclusions 51
References 52
Social Networks 53
Finite Sample Behavior of MLE in Network Autocorrelation Models 54
1 Introduction 55
2 A Brief Review of Network Autocorrelation Models 55
3 Simulation Design 56
4 Results 57
5 Discussion and Conclusions 60
References 61
Network Analysis Methods for Classification of Roles 62
1 Introduction 62
2 Theoretical Framework 63
3 Materials and Methods 63
4 Results: The Classification of Social Roles 65
5 Conclusions 68
References 69
MCA-Based Community Detection 70
1 Community Detection Algorithms 70
2 MCA Based Consensus Community Detection 71
3 The Analysis of the Consensus Matrix 72
4 Simulation Study 73
5 Application on Real Data 75
6 Conclusions 76
References 76
Exploratory Data Analysis 78
Rank Properties for Centred Three-Way Arrays 79
1 Introduction 79
2 Notation and Known Results 81
3 Main Result 82
4 Examples 84
5 Conclusion 86
References 86
Principal Component Analysis of Complex Data and Application to Climatology 87
1 Introduction 87
2 Complex Singular Value Decomposition 89
3 Data 90
4 Results 90
4.1 The Comparison of Methods 90
4.2 The CPCA of Winds 92
5 Conclusion 94
References 94
Motivations and Expectations of Students' Mobility Abroad: A Mapping Technique 96
1 Introduction 96
2 Questionnaire and Sample 97
3 VOSviewer 98
4 Results 100
5 Conclusions 103
References 104
Testing Circular Antipodal Symmetry Through Data Depths 105
1 Introduction 105
2 Reflective and Antipodal Symmetry 106
3 Data Depth-Based Tests for Antipodal Symmetry 107
3.1 The Angular Simplicial and the Arc Distance Depths 107
3.2 Tests for Antipodal Symmetry 108
4 Evaluating the Test Procedures: An Empirical Study 109
4.1 Simulation Design 109
4.2 Simulation Results: Nominal Versus Observed Significance Level 110
4.3 Simulation Results: Power of the Tests 110
4.4 Simulation Results: Computational Costs 111
5 Findings and Final Remarks 112
References 112
Statistical Modeling 113
Multivariate Stochastic Downscaling for Semicontinuous Data 114
1 Introduction 114
2 Joint Spatial Modeling 115
2.1 Beyond the First Stage of Modeling 117
3 Application to Emilia-Romagna Data 118
3.1 Results 119
References 122
Exploring Italian Students' Performances in the SNV Test: A Quantile Regression Perspective 123
1 Introduction 123
2 Students' Performances Data: Description and Main Evidence 124
3 Quantile Regression: The Essentials 127
4 The Effects of Educational Predictors on Italian Students' Performances: QR Results 128
5 Concluding Remarks and Future Avenues 131
References 131
Estimating the Effect of Prenatal Care on Birth Outcomes 133
1 Introduction 133
2 Data 134
3 Model 136
4 Results 137
5 Conclusions 138
References 138
Clustering and Classification 140
Clustering Upper Level Units in Multilevel Models for Ordinal Data 141
1 Introduction 141
2 Density-Based Clustering of Upper Level Units 142
3 Application 143
3.1 Null Model 144
3.2 Model with Covariates 145
4 Final Remarks 147
References 147
Clustering Macroseismic Fields by Statistical Data Depth Functions 149
1 Introduction 150
2 Statistical Tools: Depth Functions and Similarity Measures 151
3 Application to Simulated and Real Datasets 154
3.1 Analysis of Simulated Datasets 154
3.2 Analysis of a Set of Italian Macroseismic Fields 155
References 157
Comparison of Cluster Analysis Approaches for Binary Data 158
1 Introduction 158
2 Methods 160
2.1 Monothetic Analysis Cluster 160
2.2 Model-Based Co-clustering 161
3 Comparison of the Methods on UNESCO Data 162
4 Conclusions 164
References 165
Classification Models as Tools of Bankruptcy Prediction—Polish Experience 166
1 Introduction 166
2 Dataset and Examination Variants 167
3 Comparison of the Prognostic Capabilities of the Models 168
4 Financial Ratios Most Often Used in the Models and Their Distributions 170
5 Summary 173
References 175
Quality of Classification Approaches for the Quantitative Analysis of International Conflict 176
1 General Purpose 176
2 Data Sets 177
3 Evaluative Comparisons 179
3.1 Evaluation Design 180
3.2 Evaluation Results 180
4 Conclusion 182
References 183
Time Series and Spatial Data 184
P-Splines Based Clustering as a General Framework: Some Applications Using Different Clustering Algorithms 185
1 Introduction 185
2 P-Spline Based Clustering in a Nutshell 186
3 Some Experiments on Real Data Sets 187
3.1 Hierarchical Clustering 188
3.2 Partitioning Algorithms 189
4 Conclusion 191
References 191
Comparing Multistep Ahead Forecasting Functions for Time Series Clustering 193
1 Introduction 193
2 The Distance Measure 194
3 Visual Exploration and Time Series Clustering 195
4 A Case Study 197
References 200
Comparing Spatial and Spatio-temporal FPCA to Impute Large Continuous Gaps in Space 202
1 Introduction 202
2 Methodology 203
3 Variance Functions 205
3.1 Reconstruction of Long Gaps in Functional Data 206
4 An Application to Air Pollution Data and a Simulation Study 207
5 Discussion and Further Developments 209
References 209
Finance and Economics 210
A Graphical Tool for Copula Selection Based on Tail Dependence 211
1 Introduction 211
2 Graphical Tools to Detect Tail Dependence 212
3 An Application to Financial Time Series 215
4 Conclusions 217
References 218
Bayesian Networks for Financial Market Signals Detection 219
1 Introduction 219
2 Data Description 220
3 The Financial Market Network 221
4 Examination of Different Scenarios 222
4.1 Scenario A: The Effect of Volatility 222
4.2 Scenario B: The Effect of Price/Earnings 225
5 Concluding Remarks 225
References 226
A Multilevel Heckman Model to Investigate Financial Assets Among Older People in Europe 227
1 Introduction 227
2 The Model 228
3 Data 230
4 Empirical Application 230
5 Conclusions 233
References 234
Bifurcation and Sunspots in Continuous Time Optimal Model with Externalities 235
1 Introduction 235
2 The Deterministic Economic General Model 237
3 Application: Natural Resource System 238
4 Stochastic Dynamic 239
4.1 Simulations 240
5 Conclusions 241
References 242
Erratum to: Big Data Meet Pharmaceutical Industry: An Application on Social Media Data 243

Publication date (per publisher) 21.2.2018
Series Studies in Classification, Data Analysis, and Knowledge Organization
Additional info XVI, 242 p. 65 illus., 21 illus. in color.
Place of publication Cham
Language English
Subject areas Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Mathematics > Probability / Combinatorics
Business and Economics
Keywords 00B25, 03-06, 03C45, 91C15, 62H30, 68T10, 91C20, 91D30 • applications of statistics • Big Data • classification • Clustering • Computational Statistics • Data Science • integration of data • machine learning • Social Networks • Statistical Data Analysis • Statistical Learning • time series clustering
ISBN-10 3-319-55708-4 / 3319557084
ISBN-13 978-3-319-55708-3 / 9783319557083
PDF (watermark)
Size: 7.4 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for specialist books with columns, tables and figures. A PDF can be displayed on almost all devices, but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
