Machine Learning Basics for your next Interview!!

Sanjeev Sahu
6 min read · Aug 22, 2021


Read till the end and you’ll get your basics clear.

When people hear the words Machine Learning or AI, they think of Terminators or get afraid 😨 of losing their jobs.

But Machine learning is much more than that.

A good framework for thinking about computer technologies is to think of them as tools that help us increase our productivity.

The same is true of Machine Learning. It helps us simplify daily tasks and automate them.

Think of spam filters, predictions, recommendations, unusual-activity detection, etc. All of these are possible because of Machine Learning.

Read till the end to learn more about this fascinating subject of Machine Learning.

1. What is Machine Learning?

Machine Learning is the science (and art) of programming computers so they can learn from data.

Take the example of spam filters. Given examples of spam emails (for instance, ones flagged by users) and examples of regular (non-spam, also called “ham”) emails, the ML program can learn to flag spam.

Without ML, our inboxes would be filled with tons of spam mail!

2. But why use ML?

It’s simple, actually: with ML we don’t need to do all the hard work of writing and maintaining programs that detect specific words and mark emails as spam.

With ML algorithms, the program learns from the data (emails labelled spam or ham) and marks future emails as spam or ham accordingly.

It can still make mistakes when identifying spam, but it makes our lives easier.

Figure: The traditional approach to solving problems.
Figure: The ML approach to solving problems.
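
To make the contrast concrete, here is a minimal sketch in Python. The keyword list, the toy emails and the function names are all made up for illustration, and the "learning" is just a simple word-count ratio, not what a real spam filter does.

```python
from collections import Counter

# Traditional approach: a hand-written rule with hard-coded keywords.
SPAM_KEYWORDS = {"free", "winner", "prize"}  # hypothetical list we maintain by hand


def rule_based_is_spam(email):
    return any(word in SPAM_KEYWORDS for word in email.lower().split())


# ML approach: learn which words signal spam from labelled examples.
def train_spam_words(emails, labels, min_ratio=3.0):
    spam_counts, ham_counts = Counter(), Counter()
    for text, label in zip(emails, labels):
        target = spam_counts if label == "spam" else ham_counts
        target.update(set(text.lower().split()))  # count each word once per email
    # Keep words that show up much more often in spam than in ham
    # (add-one smoothing avoids dividing by zero).
    return {
        word
        for word, count in spam_counts.items()
        if (count + 1) / (ham_counts[word] + 1) >= min_ratio
    }


def learned_is_spam(email, spam_words):
    return any(word in spam_words for word in email.lower().split())


emails = [
    "win a free prize now",
    "free credit offer for you",
    "claim your credit bonus now",
    "meeting notes for monday",
    "lunch on monday with you",
    "project notes and offer letter",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

spam_words = train_spam_words(emails, labels)
new_email = "exclusive credit bonus inside"

print(rule_based_is_spam(new_email))           # the fixed rule never heard of "credit"
print(learned_is_spam(new_email, spam_words))  # the learned words came from the data
```

The point of the sketch: when spammers change their wording, the rule-based version needs a human to edit the keyword list, while the learned version only needs fresh labelled examples.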

3. Types of ML systems

Machine Learning systems can be grouped along 3 broad criteria:

  • Whether or not they are trained with human supervision

🎯 supervised

😵 unsupervised

🚛 semi-supervised

🤑 reinforcement learning

  • Whether or not they can learn incrementally on the fly

📶 online learning

🤝 batch learning

  • Whether they work by simply comparing new data points to known data points or instead detect patterns in the training data and build a predictive model.

📣 Instance-based learning

💃 model-based learning

Too much to remember? Let’s try to understand each one of them.

4. Supervised Learning

Let’s say we want to know whether a customer will stop using our product or service in the near future.

What we’ll do is take all the data we have on our customers and try to predict whether each customer will churn.

Doing this manually would be too tough for a big corporation with hundreds of thousands of customers.

But Machine Learning algorithms make this easier.

The algorithm recognizes the patterns that show up when a customer churns. It finds these patterns in the data that is fed into the model; the individual attributes in that data (for example, how often a customer uses the product) are known as features. The goal is to predict whether future customers will churn.

Customer churn is your label.

Labels are desired solutions.

We use the various features to predict the labels.

Problems where the labels are given in the data can be solved with supervised learning algorithms.
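
Here is a small sketch of what features and labels look like in code for the churn example. The customer records, the two features (`logins`, `tickets`) and the scaling constants are invented for illustration, and the model is a bare-bones logistic regression trained with gradient descent, just one of several algorithms that could be used here.

```python
import math

# Hypothetical customer records: two features plus the label "churned".
customers = [
    {"logins": 25, "tickets": 0, "churned": 0},
    {"logins": 20, "tickets": 1, "churned": 0},
    {"logins": 28, "tickets": 0, "churned": 0},
    {"logins": 18, "tickets": 1, "churned": 0},
    {"logins": 3, "tickets": 4, "churned": 1},
    {"logins": 5, "tickets": 3, "churned": 1},
    {"logins": 2, "tickets": 5, "churned": 1},
    {"logins": 6, "tickets": 4, "churned": 1},
]

# Features (inputs) and labels (desired solutions) are kept separate.
X = [[c["logins"] / 30, c["tickets"] / 5] for c in customers]  # scaled to roughly 0..1
y = [c["churned"] for c in customers]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    weights, bias = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            # error = predicted churn probability minus the true label
            error = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def churn_probability(logins, tickets, weights, bias):
    x = [logins / 30, tickets / 5]  # same scaling as the training features
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


weights, bias = train_logistic_regression(X, y)
print(churn_probability(4, 4, weights, bias))   # rarely logs in, many support tickets
print(churn_probability(27, 0, weights, bias))  # very active, no support tickets
```

The model never sees a rule like "few logins means churn"; it only sees the features next to the labels and works out the relationship on its own.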

A typical supervised learning task is classification. The spam filter is a good example of this: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails.

Figure: Classifying spam and ham mails.

So when a new email arrives, the program is able to classify it as spam or non-spam.

Another typical task is to predict a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is called regression.

Figure: Regression problem.
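
As a rough sketch of regression, the snippet below fits a straight line to car prices using a single predictor (age). The numbers are made up, and a real model would use more features such as mileage and brand; the fitting method shown is ordinary least squares for one variable.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: car age in years and price in dollars.
ages = [1, 2, 3, 4, 5]
prices = [18400, 17100, 15500, 13900, 12600]

slope, intercept = fit_line(ages, prices)
predicted_price = slope * 6 + intercept  # numeric prediction for a 6-year-old car
print(slope, intercept, predicted_price)
```

Notice that the output is a number (a price), not a class like spam or ham. That is the difference between regression and classification.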

Some of the popular supervised learning algorithms are:

  • K-nearest neighbors
  • Linear Regression
  • Logistic Regression
  • Support Vector Machines
  • Decision Trees and Random forests
  • Neural Networks
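
To give a feel for the first algorithm on the list, here is a minimal K-nearest neighbors sketch. The two email features (number of links, number of exclamation marks) and the data points are assumptions made up for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    # Sort the training examples by Euclidean distance to the query point.
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    # The k closest examples vote on the class.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features per email: (number of links, number of exclamation marks)
train_points = [(8, 5), (7, 6), (9, 4), (1, 0), (0, 1), (2, 1)]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

print(knn_predict(train_points, train_labels, (6, 5)))
print(knn_predict(train_points, train_labels, (1, 1)))
```

K-nearest neighbors does not build a formula at all. It simply compares a new data point with the labelled points it already knows.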

5. Unsupervised learning algorithms

Problems where no labels are given are solved by unsupervised learning algorithms: the data is unlabeled. A good example is when you want to learn more about your website visitors. You might not know much about them, but you can cluster them together based on their similarities. Then you can study the behavior of each cluster.

Figure: Unsupervised Learning.
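
The visitor-clustering idea can be sketched with k-means, one common clustering algorithm. The visitor features (pages per visit, minutes on site) are invented, and starting the centroids from the first k points is a simplification to keep the example deterministic.

```python
import math


def k_means(points, k, iterations=10):
    centroids = [list(p) for p in points[:k]]  # simplistic init: first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical visitors: (pages per visit, minutes on site). No labels anywhere.
visitors = [(1, 1), (2, 1), (1, 2), (9, 10), (10, 9), (9, 9)]

assignments, centroids = k_means(visitors, k=2)
print(assignments)
print(centroids)
```

Nobody told the algorithm that there are "casual" and "engaged" visitors. It found the two groups from the similarities alone, and naming them is left to you.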

Applying ML techniques to dig into large amounts of data and discover patterns is known as data mining.

6. Semi-supervised learning

Some algorithms can deal with partially labelled training data, usually a lot of unlabeled data and a little bit of labelled data. This is called semi-supervised learning.

We all know Google Photos (way ahead of its competition). When we upload group pics from our phones, the app automatically recognizes that the same person appears in different pics (this clustering part is unsupervised learning). Now all the app needs is for you to tell it who this person is.

Next time you click a pic, you’ll see that Google automatically recognizes the person.
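
The photo example boils down to "cluster first, then label each cluster once". Here is a toy sketch of that idea. It is not how Google Photos works internally: the 2-D "face embeddings", the names and the distance threshold are all hypothetical, and the clustering is a very simple greedy one.

```python
import math


def cluster_by_distance(points, threshold):
    """Greedy clustering: a point joins the first cluster whose first member is close enough."""
    clusters = []  # each cluster is a list of point indices
    for i, p in enumerate(points):
        for cluster in clusters:
            if math.dist(p, points[cluster[0]]) <= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no cluster was close enough, start a new one
    return clusters


def propagate_labels(clusters, known_labels):
    """Copy the one known name in each cluster to every member of that cluster."""
    labels = {}
    for cluster in clusters:
        names = [known_labels[i] for i in cluster if i in known_labels]
        name = names[0] if names else None  # None if nobody in the cluster is named
        for i in cluster:
            labels[i] = name
    return labels


# Hypothetical 2-D face embeddings for six photos.
faces = [(0.10, 0.20), (0.15, 0.22), (0.90, 0.80), (0.12, 0.18), (0.88, 0.83), (0.92, 0.79)]
known_labels = {0: "Asha", 2: "Ravi"}  # the user names only two faces

clusters = cluster_by_distance(faces, threshold=0.2)
labels = propagate_labels(clusters, known_labels)
print([labels[i] for i in range(len(faces))])
```

Two labels were enough to name all six photos. That is the appeal of semi-supervised learning: a little labelled data goes a long way when the unlabeled data has structure.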

One example of a semi-supervised algorithm is the deep belief network (DBN).

Deep belief networks (DBNs) are based on unsupervised components called restricted Boltzmann machines (RBMs) stacked on top of one another. RBMs are trained sequentially in an unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques.

7. Reinforcement Learning

Reinforcement learning is a very different beast.

The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards).

It must then learn by itself what is the best strategy, called a policy, to get the most reward over time.
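
The agent, reward and policy loop can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm. The environment here is a made-up corridor with five positions and a reward at the far right; the reward values and hyperparameters are assumptions picked for the example.

```python
import random

N_STATES = 5        # positions 0..4 in a corridor; the goal is position 4
ACTIONS = [-1, +1]  # move left or move right


def step(state, action):
    """Environment: returns the next state, the reward and whether the episode ended."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01  # small penalty for every step that misses the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate towards reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# The policy: the best action in every non-terminal state.
policy = [ACTIONS[max(range(2), key=lambda i: q_table[s][i])] for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent that "right" is the correct direction. It works that out purely from the rewards and penalties it collects.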

8. Batch and Online Learning

Another criterion used to classify Machine Learning systems is whether or not the system can learn incrementally from a stream of incoming data.

Twitter reportedly generates 12+ TB of data every day, Facebook generates 25+ TB of data every day, and Google generates much more than that. Now imagine how much data piles up as these services keep running day after day!

How is this connected you ask? 🤔

In batch learning, the algorithm must be trained using all the available data. This takes a lot of time ⏳ and computing resources (RAM, CPU). To learn from new data, the system has to be retrained from scratch on the full dataset. When companies hold petabytes of data, training on all of it is impossible on a local computer.
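
Here is a tiny sketch of what batch learning looks like in practice. The model is a simple nearest-centroid classifier and the data points are invented; the thing to notice is that training needs the whole dataset at once, and new data means training again on everything.

```python
def fit_centroids(points, labels):
    """Batch training: needs the full dataset in memory at once."""
    sums, counts = {}, {}
    for p, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(p))
        for j, v in enumerate(p):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    # One centroid (mean point) per class.
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, point):
    # Pick the class whose centroid is closest (squared Euclidean distance).
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], point)),
    )


points = [(1.0, 1.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
labels = ["low", "low", "high", "high"]
model = fit_centroids(points, labels)

# New data arrives: a batch learner retrains from scratch on old + new data.
points += [(3.0, 2.0), (10.0, 10.0)]
labels += ["low", "high"]
model = fit_centroids(points, labels)

print(model)
print(predict(model, (2.0, 2.0)))
```

With six points the retraining is instant. With petabytes, every retrain means reading all of that data again, which is where the time and hardware costs come from.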

9. Online learning!

In online learning, you train the system incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches.

Each learning step is fast and cheap, so the system can learn about new data on the fly, as it arrives.

Online learning is great for systems that receive data as a continuous flow (e.g., stock prices) and need to adapt to change rapidly or autonomously.
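
A minimal sketch of online learning: a linear model that updates itself with one stochastic gradient descent step per incoming data point and never stores the stream. The class name, the `learn_one` method and the synthetic stream (which follows y = 2x + 1) are assumptions made up for this example.

```python
class OnlineLinearModel:
    """Linear model y = w * x + b, updated one instance at a time."""

    def __init__(self, learning_rate=0.1):
        self.w = 0.0
        self.b = 0.0
        self.lr = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def learn_one(self, x, y):
        # One cheap gradient step on the squared error of this single instance.
        error = self.predict(x) - y
        self.w -= self.lr * error * x
        self.b -= self.lr * error


def data_stream(n):
    """Hypothetical stream of (x, y) pairs that follows y = 2x + 1."""
    for t in range(n):
        x = (t % 10) / 10
        yield x, 2.0 * x + 1.0


model = OnlineLinearModel()
for x, y in data_stream(5000):
    model.learn_one(x, y)  # each instance is used once and can then be discarded

print(round(model.w, 2), round(model.b, 2))
```

Because every update only touches the current data point, memory use stays constant no matter how long the stream runs.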

Bonus:

Check out my notes on Google’s ML course in a Notion page. 👇

Machine Learning crash course Notes.
