Distributed Computing in Big Data Analytics (eBook)

Concepts, Technologies and Applications
eBook Download: PDF
2017 | 1st ed. 2017
X, 162 pages
Springer International Publishing (publisher)
978-3-319-59834-5 (ISBN)

EUR 96.29 incl. VAT
(CHF 93.95)
The eBook is sold by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Download available immediately

Big data technologies make analytics of any type fast and predictable, enabling better decision making by both humans and machines. The principles of distributed computing are key to these technologies and to the analytics built on them. Mechanisms for data storage, data access, data transfer, visualization, and predictive modeling, implemented as distributed processing across multiple low-cost machines, are what make big data analytics feasible within a stipulated cost and time, and thus practical for consumption by humans and machines. However, the current literature on big data analytics lacks a holistic perspective that highlights the relation between big data analytics and distributed processing in a way that is easy to understand and usable by practitioners.

This book fills that gap by addressing key aspects of distributed processing in big data analytics. The chapters tackle the essential concepts and patterns of distributed computing widely used in big data analytics. The book also covers the main technologies that support distributed processing. Finally, it provides insight into applications of big data analytics, highlighting how the principles of distributed computing are applied in those situations.

Practitioners and researchers alike will find this book a valuable tool for their work, helping them to select the appropriate technologies, while understanding the inherent strengths and drawbacks of those technologies.

Editor’s Notes 5
Contents 8
On the Role of Distributed Computing in Big Data Analytics 9
1 Introduction 9
2 History and Key Characteristics of Big Data 11
3 Key Aspects of Big Data Analytics 14
4 Popular Technologies for Big Data Analytics Utilizing Concepts of Distributed Computing 15
4.1 Hadoop 15
4.2 Yarn 16
4.3 Hadoop Map Reduce 16
4.4 Spark 16
5 Conclusion 17
References 17
Fundamental Concepts of Distributed Computing Used in Big Data Analytics 19
1 Introduction 19
2 Multithreading and Multiprocessing 20
2.1 Concept of Multiprocessing 20
2.2 Example of Multiprocessing 20
2.3 Concept of Multithreading 20
2.4 Example of Multithreading 21
2.5 Difference between Multiprocessing and Multithreading 22
3 Computing Architecture in Distributed Computing 24
3.1 SISD 24
3.2 Vector Processor 24
3.3 SIMD 24
3.4 MIMD 26
3.5 SM-MIMD 26
3.6 DM-MIMD 27
4 Scalability in Distributing Computing 28
4.1 Scalability Requirement and Category 28
4.2 Scaling Up 29
4.3 Scaling Out 30
4.4 Prospect of Scale Up and Scale Out 31
5 Queuing Network Model for Distributed Computing 31
5.1 Asynchronous Communication 32
5.2 Queue System 32
5.3 Queue Modeling 33
6 Application of CAP Theorem 34
6.1 Basic Concepts of Consistency, Availability, and Partition Tolerance 34
6.2 Combination of Consistency, Availability, and Partition Tolerance 35
7 Quality of Service (QoS) Requirements in Big Data Analytics 36
7.1 Performance 36
7.2 Interoperability 36
7.3 Fault-Tolerance 37
7.4 Security 37
7.5 Manageability 38
7.6 Load-Balance 39
7.7 High-Availability (HA) 39
7.8 SLA 40
8 Conclusion 41
References 41
Distributed Computing Patterns Useful in Big Data Analytics 43
1 Introduction 43
2 Primitives for Concurrent Programming 45
2.1 Concurrency Expression 45
2.2 Synchronization 46
3 Communication Protocols and Message Exchange 47
3.1 Synchronous Communication 47
3.2 Asynchronous Communication 48
3.3 Pseudo-Synchronous Communication 48
3.4 Client/Server Paradigm 49
3.5 Communication Deployment in Big Data 49
4 Data Distribution in Big Data on Distributed Environments 51
5 Implementation Problems 56
5.1 Race Condition Problems 56
5.2 Message Exchange 58
6 Conclusion 59
References 60
Distributed Computing Technologies in Big Data Analytics 64
1 Introduction 64
2 Distributed Database 66
2.1 NoSQL Database 67
3 Distributed Storage 71
3.1 Hadoop Distributed File System (HDFS) 72
4 Distributed Computation 74
4.1 Map-Reduce in Hadoop 75
4.2 Spark 77
5 Machine Learning Platforms 78
6 Search System 79
6.1 Search Software 80
7 Big Data Messaging Software 82
8 Cache 84
8.1 Distributed Caching Systems 84
9 Data Visualization 86
10 Conclusion 86
References 88
Security Issues and Challenges in Big Data Analytics in Distributed Environment 90
1 Introduction 90
1.1 Security Issues in Big Data in Distributed Environment 92
2 Infrastructure Based Security 92
2.1 Secure Computations 92
2.2 Secure Non-relational Data Stores 94
3 Data Privacy 94
3.1 Privacy Preservation in Data Mining 94
3.2 Cryptography Control Mechanism 95
3.3 Granular Access Control 95
4 Data Integrity and Data Management 96
4.1 Granular Audits 96
4.2 Secure Transactions and Transaction Logs 96
4.3 Data Provenance 97
5 Reactive Security 97
5.1 Input Validation at Distributed Nodes 97
5.2 Real Time Security 98
6 Countermeasures 98
7 Conclusion 100
References 100
Scientific Computing and Big Data Analytics: Application in Climate Science 102
1 Introduction 102
2 Computational Challenges in Solving Scientific Problems 103
3 Climate Change and Big Data Analytics 105
4 Use Case on Climate Analytics 105
4.1 The Scientific Challenge of the Climate System 105
4.2 Computational Challenge of the Climate Modeling 107
4.3 Post-processing Climate Model Output 109
4.4 BigData Climate Analytics Using Spark 109
5 Conclusions 111
References 112
Distributed Computing in Cognitive Analytics 114
1 Introduction 114
2 Building Blocks of Cognitive Analytic System 115
2.1 The Data Repositories 115
2.2 The Data Ingestion Tools 115
2.3 The Analytical Frameworks 116
2.4 The Hardware Components 118
2.5 Key Non-functional Requirements to Consider 118
2.5.1 High Concurrency Throughput 118
2.5.2 Interfaces for Interaction with Systems 118
2.5.3 High Availability and Disaster Recovery 119
2.5.4 Linear Scalability 119
2.5.5 Ability to Prioritize Workload 119
2.6 Cognitive System – Implementation Patterns 120
3 Cognitive System – Use Cases 120
3.1 Cognitive Systems in Health Care 121
3.2 Cognitive Systems in Internet of Things Domain 122
3.3 Cognitive Analytics to Become a Customer Centric Organization 124
3.3.1 Next Best Action 124
3.3.2 Changing Engagement Patterns 124
3.3.3 360 ° View of Customer 124
3.3.4 Understand Thy Customer 125
4 Conclusion 126
References 127
Distributed Computing in Social Media Analytics 128
1 Introduction 128
2 Open Source Tools for Social Media Analytics 129
3 Influencer Analytics 129
3.1 Understanding the Impact of Influencers 129
3.2 Wimbledon Influencer Case Study 130
4 Social Polling 132
4.1 Sentiment Analysis 132
4.2 Intent Detection 134
4.3 Topic Monitoring 134
4.4 User Segmentation 136
4.5 Some Social Polling Examples 137
4.6 Social Polling for Demand Planning 138
5 Conclusion 139
References 140
Utilizing Big Data Analytics for Automatic Building of Language-agnostic Semantic Knowledge Bases 143
1 Introduction 143
2 Search Engines 144
2.1 Key Technologies 144
2.2 Inverted Index 145
2.3 Sharding of Data 145
2.4 Replication of Data 146
2.5 Denormalized Data Model 147
2.6 Distributed Aggregation and Scoring 147
3 Recommendation Systems 148
4 Semantic Discovery 149
4.1 Problem Description 149
4.2 Semantic Similarity 150
4.3 Probabilistic Semantic Similarity Scoring Using PGMHD 151
4.4 Distributed PGMHD 152
5 Word Sense Ambiguity Detection 152
5.1 Ambiguity Score 154
5.2 Resolving Word Sense Ambiguity 155
6 Semantic Knowledge Graph 157
6.1 Model Structure 158
6.2 Materialization of Nodes and Edges 158
6.3 Discovering Semantic Relationships 160
6.4 Scoring Semantic Relationships 160
6.5 Scaling Characteristics 163
7 Real World Applications 164
8 Conclusion 165
References 165

Publication date (per publisher) 29.8.2017
Series Scalable Computing and Communications
Additional info X, 162 p. 72 illus., 63 illus. in color.
Place of publication Cham
Language English
Subject areas Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Computer Science > Networks
Engineering > Communications Engineering
Keywords Cloud Computing • Cognitive analytics • Distributed Computing • Graph Computing • Hadoop • internet of things • machine learning • Scientific data analytics • Social Media Analytics • Spark • Streaming analytics
ISBN-10 3-319-59834-1 / 3319598341
ISBN-13 978-3-319-59834-5 / 9783319598345
PDF (watermark)
Size: 5.5 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is of only limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
