Agile Data Science - SparkML



Spark's machine learning library, referred to as “SparkML” or “MLlib”, consists of common learning algorithms, including classification, regression, clustering and collaborative filtering.

Why learn SparkML for Agile?

Spark is becoming the de facto platform for building machine learning algorithms and applications. Developers use it to implement machine learning algorithms in a scalable and concise manner. In this chapter, we will become familiar with machine learning concepts, their uses and the algorithms available in this framework. Agile methodology favors frameworks that deliver results in short, quick iterations.

ML Algorithms

The library covers common learning algorithms for classification, regression, clustering and collaborative filtering.

Features

Feature handling covers feature extraction, transformation, dimensionality reduction and selection.

Pipelines

Pipelines provide tools for constructing, evaluating and tuning machine learning workflows, in which a sequence of stages is applied to the data in order.
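The pipeline idea can be illustrated with a minimal plain-Python sketch. It does not use Spark: the classes Tokenizer, VocabularyIndexer and SimplePipeline are hypothetical stand-ins written for this example, and the sample sentences are made up. The sketch only shows the general pattern of chaining stages, where some stages learn from the training data (fit) and every stage converts data (transform).

```python
# Plain-Python sketch of the pipeline idea. These classes are hypothetical
# stand-ins for illustration and are NOT Spark's API.

class Tokenizer:
    """Stage that needs no fitting: splits text into lowercase words."""

    def transform(self, rows):
        return [row.lower().split() for row in rows]


class VocabularyIndexer:
    """Stage that must be fitted: learns a word -> index mapping."""

    def fit(self, rows):
        vocab = sorted({word for row in rows for word in row})
        self.index = {word: i for i, word in enumerate(vocab)}
        return self

    def transform(self, rows):
        # Words that were never seen during fit are dropped.
        return [[self.index[w] for w in row if w in self.index] for row in rows]


class SimplePipeline:
    """Runs the stages in order, fitting those that define fit()."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage.fit(rows)
            # Each stage's output becomes the next stage's input.
            rows = stage.transform(rows)
        return self

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


pipeline = SimplePipeline([Tokenizer(), VocabularyIndexer()])
pipeline.fit(["Spark makes ML simple", "Agile favours quick results"])
encoded = pipeline.transform(["Spark favours agile teams"])
print(encoded)
```

Because the whole chain is fitted and applied as a single object, the same preprocessing is guaranteed to run on training data and on new data, which suits short agile iterations where stages are swapped frequently.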

Popular Algorithms

The following are a few popular categories of algorithms:

  • Basic Statistics

  • Regression

  • Classification

  • Recommendation System

  • Clustering

  • Dimensionality Reduction

  • Feature Extraction

  • Optimization

Recommendation System

A recommendation system is a subclass of information filtering system that seeks to predict the “rating” or “preference” that a user would give to an item.

Recommendation systems commonly rely on the following filtering approaches:

Collaborative Filtering

This approach builds a model from a user's past behavior together with similar decisions made by other users. The model is then used to predict items the user may be interested in.
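As a rough illustration, the following plain-Python sketch predicts a missing rating from the ratings of similar users. The users, items and ratings are invented for the example, and the function names are hypothetical. It uses a simple neighborhood-based method with cosine similarity; Spark's own collaborative filtering is based on alternating least squares (ALS), which this sketch does not implement.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob": {"book_a": 4, "book_b": 5, "book_d": 5},
    "carol": {"book_a": 1, "book_c": 5, "book_d": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    # None signals that no other user has rated the item.
    return weighted_sum / weight_total if weight_total else None


prediction = predict_rating("alice", "book_d")
print(round(prediction, 2))
```

Here alice's tastes resemble bob's far more than carol's, so the predicted rating for book_d lands closer to bob's rating of 5 than to carol's rating of 2.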

Content-based Filtering

This approach uses the discrete characteristics of an item to recommend additional items with similar properties.
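A minimal plain-Python sketch of this idea follows. The catalogue, its tags and the function names are hypothetical, and Jaccard similarity over tag sets is only one of several possible ways to compare item characteristics.

```python
# Hypothetical catalogue: item -> set of descriptive tags.
items = {
    "spark_intro": {"spark", "big-data", "tutorial"},
    "ml_basics": {"machine-learning", "tutorial"},
    "spark_mllib": {"spark", "machine-learning", "big-data"},
    "cooking_101": {"cooking", "recipes"},
}


def jaccard(a, b):
    """Share of tags that two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(liked_item, top_n=2):
    """Rank the other items by tag overlap with the liked item."""
    scores = {
        name: jaccard(items[liked_item], tags)
        for name, tags in items.items()
        if name != liked_item
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]


recommendations = recommend("spark_intro")
print(recommendations)
```

Unlike collaborative filtering, this sketch needs no ratings from other users; it relies only on how the items themselves are described.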

In subsequent chapters, we will concentrate on using a recommendation system to solve a specific problem and on improving its prediction performance from the perspective of agile methodology.




