Cloud Computing + Big Data
Last year's project: https://github.com/ankitjoshinitjsr/SummerInternship2017
Improvements in Cloud :-
1.) Added a new feature to the cloud - multiple simultaneous terminals for Docker containers.
2.) PaaS service is now provided through Docker containers.
3.) SaaS service is now provided through Docker containers.
4.) Added Big Data analysis on a Hadoop cluster, along with a few basic mapper-reducer programs.
5.) Automated Hadoop cluster setup using the DevOps tool Ansible.
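A rough sketch of how the multi-terminal and automation pieces might be invoked from the command line. The container name `web1`, the playbook `hadoop.yml`, the inventory file, and the `hadoop-nodes` group are illustrative placeholders, not files from this repo:

```shell
# Attach an interactive shell to an already-running container.
# Running this command from several host terminals gives multiple
# independent terminal sessions into the same container.
docker exec -it web1 /bin/bash   # web1 is a placeholder container name

# Run an Ansible playbook that automates the Hadoop cluster setup
# across the inventory hosts (playbook/inventory names are placeholders).
ansible-playbook -i inventory hadoop.yml --limit hadoop-nodes
```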
Big Data :-
Big data is a term for data sets that are too large or complex for traditional data-processing application software to deal with adequately. To process such data sets we use Hadoop, a distributed storage framework, together with MapReduce, a distributed processing model. MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. A MapReduce program is composed of a map procedure, which performs filtering and sorting, and a reduce procedure, which performs a summary operation. Hadoop provides a software framework for distributed storage and processing of big data using the MapReduce programming model.
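A basic mapper-reducer pair like the ones mentioned above can be sketched in Python. This is a minimal word-count example in the style of Hadoop Streaming (where the mapper and reducer are plain programs exchanging key/value pairs); the function names `mapper` and `reducer` and the sample input are illustrative, not code from the repo:

```python
from itertools import groupby

def mapper(lines):
    # Map step: for every word in the input, emit a (word, 1) pair,
    # analogous to the tab-separated lines a streaming mapper writes.
    for line in lines:
        for word in line.strip().split():
            yield (word.lower(), 1)

def reducer(pairs):
    # Reduce step: Hadoop's shuffle/sort delivers pairs grouped by key;
    # here we sort explicitly, then sum the counts for each word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield (word, sum(count for _, count in group))

if __name__ == "__main__":
    sample = ["big data needs big clusters"]
    print(dict(reducer(mapper(sample))))
    # → {'big': 2, 'clusters': 1, 'data': 1, 'needs': 1}
```

In a real Hadoop Streaming job the map and reduce steps would run as separate processes reading stdin and writing stdout across the cluster; the local chaining here just shows the data flow.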