Big Data Analytics

The book Big Data Analytics aims to provide the fundamentals of Apache Spark and Hadoop. All Spark components (Spark Core, Spark SQL, DataFrames, Datasets, Conventional Streaming, Structured Streaming, MLlib, and GraphX) and the Hadoop core components (HDFS, MapReduce, and YARN) are explored in depth, with implementation examples on Spark + Hadoop clusters.

Venkat Ankam

Packt Publishing

2016

Abstract

This book is based on version 2.0 of Apache Spark and version 2.7 of Hadoop, the latest releases at the time of writing, integrated with the most commonly used tools.

Learn all the components of the Spark stack, including the latest topics such as DataFrames, Datasets, GraphFrames, Structured Streaming, DataFrame-based ML Pipelines, and SparkR.

The book also covers integrations with frameworks such as HDFS and YARN, and with tools such as Jupyter, Zeppelin, NiFi, Mahout, HBase Spark Connector, GraphFrames, H2O, and Hivemall.

The industry is moving away from MapReduce to Spark, so the advantages of Spark over MapReduce are explained in depth to help readers reap the benefits of in-memory speeds. The DataFrames API, the Data Sources API, and the new Dataset API are explained for building big data analytical applications. Real-time data analytics using Spark Streaming with Apache Kafka and HBase is covered to help readers build streaming applications. The new Structured Streaming concept is explained with an IoT (Internet of Things) use case. Machine learning techniques are covered using MLlib, ML Pipelines, and SparkR, and graph analytics is covered with the GraphX and GraphFrames components of Spark.

Readers will also have the opportunity to get started with web-based notebooks such as Jupyter and Apache Zeppelin, and with the data flow tool Apache NiFi, to analyze and visualize data.

What you will learn

Discover and apply the tools and techniques of big data analytics using Spark on Hadoop clusters, along with the wide variety of tools used with Spark and Hadoop

Understand all the Hadoop and Spark ecosystem components

Get to know all the Spark components: Spark Core, Spark SQL, DataFrames, Datasets, Conventional and Structured Streaming, MLlib, ML Pipelines, and GraphX

See batch and real-time data analytics using Spark Core, Spark SQL, and Conventional and Structured Streaming

Get to grips with data science and machine learning using MLlib, ML Pipelines, H2O, Hivemall, GraphX, and SparkR

Venkat has delivered hundreds of trainings, presentations, and white papers in the big data sphere. While this is his first attempt at writing a book, many more books are in the pipeline.

Table of Contents

Big Data Analytics at a 10,000-Foot View

Getting Started with Apache Hadoop and Apache Spark

Deep Dive into Apache Spark

Big Data Analytics with Spark SQL, DataFrames, and Datasets

Real-Time Analytics with Spark Streaming and Structured Streaming

Notebooks and Dataflows with Spark and Hadoop

Machine Learning with Spark and Hadoop

Building Recommendation Systems with Spark and Mahout

Graph Analytics with GraphX

Interactive Analytics with SparkR

Citation

Venkat Ankam, Big Data Analytics, Packt Publishing, 2016

Collection

Information Technology

Related document

Big Data Analytics
Beginning 3D game development with Unity 4: All-in-one, multi-platform game development
Embedded systems: Introduction to ARM Cortex-M Microcontrollers. Volume 1

Content

  • Saturday, 20:49 05/11/2022
