Dataproc Cookbook
Running Spark and Hadoop Workloads in Google Cloud
2025
O'Reilly Media (publisher)
978-1-0981-5770-8 (ISBN)
- Not yet published (expected around March 2025)
Get up to speed with Dataproc, the fully managed and highly scalable service for running open source big data tools and frameworks, including Hadoop, Spark, Flink, and Presto. This cookbook shows data engineers, data scientists, data analysts, and cloud architects how to use Dataproc, integrated with Google Cloud, for data lake modernization, ETL, and secure data science at a fraction of the cost.
Narasimha Sadineni from Google and former Googler Anu Venkataraman show you how to set up and run Hadoop and Spark jobs on Dataproc. You'll learn how to create Dataproc clusters and run data engineering and data science workloads in long-running, ephemeral, and serverless ways. In the process, you'll gain an understanding of Dataproc, orchestration, logging and monitoring, Spark History Server, and migration patterns.
This cookbook includes hands-on examples for configuring and securing clusters, setting up logging, and migrating from on-prem to Dataproc. You'll learn how to:
- Create Dataproc clusters on Compute Engine and Kubernetes Engine
- Run data science workloads on Dataproc
- Execute Spark jobs on Dataproc Serverless
- Optimize Dataproc clusters to be cost effective and performant
- Monitor Spark jobs in various ways
- Orchestrate various workloads and activities
- Use different methods for migrating data and workloads from existing Hadoop clusters to Dataproc
Publication date (per publisher) | 31 March 2025 |
---|---|
Place of publication | Sebastopol |
Language | English |
Dimensions | 178 x 232 mm |
Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
ISBN-10 | 1-0981-5770-2 / 1098157702 |
ISBN-13 | 978-1-0981-5770-8 / 9781098157708 |
Condition | New |