Big Data Analytics

This book covers the fundamentals of Apache Spark and Hadoop. All Spark components (Spark Core, Spark SQL, DataFrames, Datasets, conventional streaming, Structured Streaming, MLlib, and GraphX) and the Hadoop core components (HDFS, MapReduce, and YARN) are explored in depth, with implementation examples on Spark + Hadoop clusters.

Venkat Ankam

Packt Publishing

2016

Abstract

This book is based on version 2.0 of Apache Spark and version 2.7 of Hadoop, integrated with the most commonly used tools.

Learn all the Spark stack components, including recent topics such as DataFrames, Datasets, GraphFrames, Structured Streaming, DataFrame-based ML Pipelines, and SparkR.

The book covers integrations with frameworks such as HDFS and YARN and with tools such as Jupyter, Zeppelin, NiFi, Mahout, the HBase Spark Connector, GraphFrames, H2O, and Hivemall.

The focus shifts from MapReduce to Spark, so the advantages of Spark over MapReduce are explained in depth to help readers reap the benefits of in-memory speeds. The DataFrames API, the Data Sources API, and the new Dataset API are explained for building big data analytical applications. Real-time data analytics using Spark Streaming with Apache Kafka and HBase is covered to help readers build streaming applications. The new Structured Streaming concept is explained with an IoT (Internet of Things) use case. Machine learning techniques are covered using MLlib, ML Pipelines, and SparkR, and graph analytics are covered with the GraphX and GraphFrames components of Spark.

Readers will also get an opportunity to get started with web-based notebooks such as Jupyter and Apache Zeppelin, and with the data flow tool Apache NiFi, to analyze and visualize data.

What you will learn

Discover and implement the tools and techniques of big data analytics using Spark on Hadoop clusters, with a wide variety of tools used alongside Spark and Hadoop

Understand all the Hadoop and Spark ecosystem components

Get to know all the Spark components: Spark Core, Spark SQL, DataFrames, Datasets, Conventional and Structured Streaming, MLlib, ML Pipelines, and GraphX

Explore batch and real-time data analytics using Spark Core, Spark SQL, and Conventional and Structured Streaming

Get to grips with data science and machine learning using MLlib, ML Pipelines, H2O, Hivemall, GraphX, and SparkR

Venkat has delivered hundreds of trainings, presentations, and white papers in the big data sphere. While this is his first attempt at writing a book, many more books are in the pipeline.

Table of Contents

Big Data Analytics at a 10,000-Foot View

Getting Started with Apache Hadoop and Apache Spark

Deep Dive into Apache Spark

Big Data Analytics with Spark SQL, DataFrames, and Datasets

Real-Time Analytics with Spark Streaming and Structured Streaming

Notebooks and Dataflows with Spark and Hadoop

Machine Learning with Spark and Hadoop

Building Recommendation Systems with Spark and Mahout

Graph Analytics with GraphX

Interactive Analytics with SparkR

Citation

Venkat Ankam, Big Data Analytics, Packt Publishing, 2016

Collection

Information Technology

Related document

Big Data Analytics
Beginning 3D game development with Unity 4: All-in-one, multi-platform game development
Embedded systems: Introduction to ARM Cortex-M Microcontrollers. Volume 1

Content

  • Saturday, 20:49 05/11/2022
