alfianhid/Big-Data-Apache-Spark-Projects

Big-Data-Apache-Spark-Projects

Big data refers to high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing to enable enhanced insight, decision making, and process automation. Apache Spark is a fast, unified analytics engine for large-scale data processing, with built-in support for big data workloads and machine learning.

This repository collects the projects I completed for the Big Data & Data Mining course in college. For my final exam, I built a project that classifies air quality in London using the Naive Bayes algorithm, with a dataset derived from https://datahub.io/core/london-air-quality.
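
To give a rough idea of how Naive Bayes classification works, here is a minimal pure-Python sketch, independent of Spark and not the project's actual code. It assumes the Gaussian variant (a guess, since pollutant concentrations are continuous), and the two feature columns, the `good`/`poor` labels, and every reading below are made up for illustration rather than taken from the dataset.

```python
import math
from collections import defaultdict


def fit(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one reading."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        # "Naive" step: features are treated as independent given the class,
        # so their Gaussian log-likelihoods are simply summed.
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical pairs of pollutant readings; NOT real values from the dataset.
readings = [
    (20.0, 15.0), (25.0, 18.0), (22.0, 16.0),
    (60.0, 45.0), (65.0, 50.0), (58.0, 48.0),
]
labels = ["good", "good", "good", "poor", "poor", "poor"]

model = fit(readings, labels)
print(predict(model, (23.0, 17.0)))
print(predict(model, (62.0, 47.0)))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together. On a real dataset the same idea would typically be run through a library implementation rather than hand-written code.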