Apache Spark 2.x Cookbook - Rishi Yadav

Apache Spark 2.x Cookbook (eBook)

Over 70 cloud-ready recipes for distributed Big Data processing and analytics

Rishi Yadav (Author)

eBook Download: EPUB
2017
294 pages
Packt Publishing (publisher)
978-1-78712-751-7 (ISBN)
39,59 incl. VAT
(CHF 38,65)
eBook sales are handled by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Available for immediate download

While Apache Spark 1.x gained a lot of traction and adoption in its early years, Spark 2.x delivers notable improvements in API design, schema awareness, performance, and Structured Streaming, and simplifies the building blocks needed to build better, faster, smarter, and more accessible big data applications. This book covers all of these features in the form of structured recipes for analyzing and maturing large and complex sets of data.

Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames, and Datasets to operate on schema-aware data, and to real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning, and recommendation engines in Spark.

Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.


Over 70 recipes to help you use Apache Spark as your single big data computing platform and master its libraries

About This Book
  • Contains recipes on how to use Apache Spark as a unified compute engine
  • Covers how to connect various source systems to Apache Spark
  • Covers various parts of machine learning, including supervised and unsupervised learning and recommendation engines

Who This Book Is For
This book is for data engineers, data scientists, and those who want to implement Spark for real-time data processing. Anyone who is using Spark (or is planning to) will benefit from this book. The book assumes you have a basic knowledge of Scala as a programming language.

What You Will Learn
  • Install and configure Apache Spark with various cluster managers and on AWS
  • Set up a development environment for Apache Spark, including the Databricks Cloud notebook
  • Find out how to operate on data in Spark with schemas
  • Get to grips with real-time streaming analytics using Spark Streaming and Structured Streaming
  • Master supervised learning and unsupervised learning using MLlib
  • Build a recommendation engine using MLlib
  • Process graphs using the GraphX and GraphFrames libraries
  • Develop a set of common applications or project types, and solutions that solve complex big data problems

Style and approach
This book is packed with intuitive recipes supported with line-by-line explanations to help you understand Spark 2.x's real-time processing capabilities and deploy scalable big data solutions. This is a valuable resource for data scientists and those working on large-scale data projects.
Publication date (per publisher) 31 May 2017
Language English
Subject area Non-fiction / Guides Leisure / Hobbies Collecting / Collector catalogues
ISBN-10 1-78712-751-6 / 1787127516
ISBN-13 978-1-78712-751-7 / 9781787127517
EPUB (no DRM)

Digital Rights Management: no DRM
This eBook contains no DRM or copy protection. Passing it on to third parties is nevertheless not legally permitted, because your purchase grants you only the rights to personal use.

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to displaying fiction and non-fiction. The body text adapts dynamically to the display and font size, which also makes EPUB a good fit for mobile reading devices.

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You will need the free Adobe Digital Editions software.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You will need a free app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
