Machine Learning Interview Questions and Answers

Q1: What is machine learning?

A: In answering this question, try to show your understanding of the broad applications of machine learning, as well as how it fits into AI. Put it into your own words, but convey that machine learning is a form of AI that automates data analysis, enabling computers to learn from experience and adapt so they can perform specific tasks without being explicitly programmed.

Q2: Why is naive Bayes so ‘naive’?

A: Naive Bayes is so ‘naive’ because it assumes that all of the features in a data set are equally important and independent of one another (given the class). As we know, these assumptions are rarely true in real-world scenarios.
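
As a rough illustration, here is a minimal sketch of training and using a naive Bayes classifier, assuming scikit-learn is available and using made-up toy data; the model treats every feature as conditionally independent given the class.

    # Minimal naive Bayes sketch (scikit-learn, toy data for illustration only).
    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    # Two features per sample; the model assumes they are independent given the class.
    X = np.array([[1.0, 20.0], [1.2, 22.0], [3.5, 60.0], [3.7, 65.0]])
    y = np.array([0, 0, 1, 1])

    model = GaussianNB().fit(X, y)
    print(model.predict([[1.1, 21.0]]))        # expected to predict class 0
    print(model.predict_proba([[3.6, 62.0]]))  # per-class probabilities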

Q3: How do classification and regression differ?

A: Classification predicts discrete group or class membership, while regression predicts a continuous response. Classification is the better technique when you need a definite, categorical answer.
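
The contrast can be made concrete with a small sketch, assuming scikit-learn and toy data: the classifier returns a discrete label, the regressor a continuous value.

    # Classification returns a discrete label; regression returns a continuous value.
    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    X = np.array([[1.0], [2.0], [3.0], [4.0]])

    # Classification: targets are class labels.
    clf = LogisticRegression().fit(X, np.array([0, 0, 1, 1]))
    print(clf.predict([[2.5]]))   # a class, e.g. 0 or 1

    # Regression: targets are continuous quantities.
    reg = LinearRegression().fit(X, np.array([1.1, 1.9, 3.2, 3.9]))
    print(reg.predict([[2.5]]))   # a real-valued estimate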

Q4: What is kernel SVM?

A: Kernel SVM is short for kernel support vector machine. Kernel methods are a class of algorithms for pattern analysis that use a kernel function to implicitly map the data into a higher-dimensional feature space, which allows the SVM to learn non-linear decision boundaries; kernel SVM is the most common of these methods.
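
The following sketch, assuming scikit-learn and a toy XOR-style data set, shows why the kernel matters: an RBF-kernel SVM can separate classes that no straight line can.

    # Kernel SVM sketch: an RBF kernel handles data that is not linearly separable.
    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0])  # XOR pattern: no single line separates the classes

    model = SVC(kernel="rbf", gamma=2.0).fit(X, y)
    print(model.predict([[0.9, 0.1]]))  # expected: class 1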

Q5: What is a recommendation system?

A: Anyone who has used Spotify or shopped at Amazon will recognize a recommendation system: It’s an information filtering system that predicts what a user might want to hear or see based on choice patterns provided by the user.
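
One simple way such a system can work is collaborative filtering. Below is a minimal sketch, using only NumPy and an invented toy ratings matrix, that scores a user's unseen items from the ratings of similar users.

    # Toy user-based collaborative filtering with cosine similarity (NumPy only).
    import numpy as np

    # Rows = users, columns = items; 0 means "not rated yet".
    ratings = np.array([
        [5, 4, 0, 1],
        [4, 5, 1, 0],
        [1, 0, 5, 4],
    ], dtype=float)

    def cosine_sim(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

    user = 0
    sims = np.array([cosine_sim(ratings[user], ratings[u]) for u in range(len(ratings))])

    # Score every item by a similarity-weighted average of all users' ratings.
    scores = sims @ ratings / (sims.sum() + 1e-9)
    unseen = ratings[user] == 0
    print(np.where(unseen)[0], scores[unseen])  # candidate items and their scores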

Q6: What are some factors that explain the success and recent rise of deep learning?

A: The success of deep learning in the past decade can be explained by three main factors:

  1. More data. The availability of massive labeled datasets allows us to train models with more parameters and achieve state-of-the-art scores. Other ML algorithms do not scale as well as deep learning when it comes to dataset size.
  2. GPUs. Training models on a GPU can reduce the training time by orders of magnitude compared to training on a CPU. Currently, cutting-edge models are trained on multiple GPUs or even on specialized hardware.
  3. Improvements in algorithms. ReLU activation, dropout, and complex network architectures have also been very significant factors.
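
To make the third point a little more concrete, here is a tiny NumPy-only sketch (illustrative, not a full network) of the ReLU activation and inverted dropout applied to a layer's activations.

    # ReLU and (inverted) dropout on a batch of activations, NumPy only.
    import numpy as np

    rng = np.random.default_rng(0)
    activations = rng.normal(size=(4, 5))       # batch of 4 examples, 5 units

    relu_out = np.maximum(0.0, activations)     # ReLU: zero out negative values

    keep_prob = 0.8                             # dropout keeps each unit with prob 0.8
    mask = rng.random(relu_out.shape) < keep_prob
    dropped = relu_out * mask / keep_prob       # rescale so expected activation is unchanged

    print(dropped)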

Q7: How is kNN different from k-means clustering?

A: Don’t be misled by the ‘k’ in their names. The fundamental difference between the two algorithms is that k-means is unsupervised while kNN is supervised: k-means is a clustering algorithm, whereas kNN is a classification (or regression) algorithm.

The k-means algorithm partitions a data set into k clusters such that each cluster is homogeneous, with the points in a cluster close to each other, while the algorithm tries to keep the clusters themselves well separated. Because it is unsupervised, the clusters have no labels.

The kNN algorithm classifies an unlabeled observation based on its k (any chosen number) nearest labelled neighbors. It is also known as a lazy learner because it involves no explicit model training: it simply stores the training data and defers all computation to prediction time, rather than building a generalized model in advance.
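
A short side-by-side sketch, assuming scikit-learn and toy data, makes the difference visible: k-means receives no labels and invents its own cluster ids, while kNN must be given labels to train on.

    # k-means (unsupervised) vs. kNN (supervised) on the same toy data.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import KNeighborsClassifier

    X = np.array([[1, 1], [1.2, 0.8], [0.9, 1.1], [5, 5], [5.2, 4.8], [4.9, 5.1]])

    # k-means: only features are given; the cluster ids it assigns are arbitrary.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print(kmeans.labels_)                 # e.g. [0 0 0 1 1 1]

    # kNN: labels must be supplied for training.
    y = np.array([0, 0, 0, 1, 1, 1])
    knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
    print(knn.predict([[1.0, 0.9]]))      # predicts the label of the nearby group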

Q8: Why do ensembles typically have higher scores than individual models?

A: An ensemble is the combination of multiple models to create a single prediction. The key idea for making better predictions is that the models should make different errors. That way the errors of one model will be compensated by the right guesses of the other models and thus the score of the ensemble will be higher.

We need diverse models for creating an ensemble. Diversity can be achieved by:

  • Using different ML algorithms. For example, you can combine logistic regression, k-nearest neighbors and decision trees.
  • Using different subsets of the data for training. This is called bagging.
  • Giving a different weight to each of the samples of the training set. If this is done iteratively, weighting the samples according to the errors of the ensemble, it’s called boosting.

Many winning solutions to data science competitions are ensembles. However, in real-life machine learning projects, engineers need to find a balance between execution time and accuracy.
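
As a rough sketch of the idea, assuming scikit-learn and a synthetic data set, the snippet below combines three different algorithms with majority voting and scores the ensemble with cross-validation.

    # A simple voting ensemble of three diverse models (scikit-learn, synthetic data).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    ensemble = VotingClassifier(estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ])

    # The ensemble often scores at least as well as its best individual member.
    print(cross_val_score(ensemble, X, y, cv=5).mean())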

Q9: What is decision tree classification?

A: A decision tree builds classification (or regression) models as a tree structure: the data set is broken up into ever smaller subsets as the tree is developed, producing a tree-like structure of decision nodes and branches. Decision trees can handle both categorical and numerical data.
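
For illustration, here is a minimal sketch, assuming scikit-learn and its bundled Iris data set, that fits a small decision tree and prints the learned splits as readable rules.

    # Decision tree classification sketch; export_text prints the learned if/else rules.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_iris(return_X_y=True)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

    print(export_text(tree))    # the splits learned from the data
    print(tree.predict(X[:3]))  # predicted classes for the first few samples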

Q10: Is rotation necessary in PCA? If yes, Why? What will happen if you don’t rotate the components?

A: Yes, rotation (orthogonal) is necessary because it maximizes the variance captured by each successive component, which makes the components easier to interpret. Remember, that is the motive for doing PCA in the first place: we aim to select fewer components (than features) that explain the maximum variance in the data set. Rotation does not change the relative locations of the points; it only changes their actual coordinates along the new axes.

If we don’t rotate the components, the effect of PCA diminishes and we have to select a larger number of components to explain the same variance in the data set.
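
A quick sketch, assuming scikit-learn and the Iris data set, shows the point about variance: a couple of components typically explain most of the variance in the data.

    # PCA sketch: each successive component captures the maximum remaining variance.
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)
    pca = PCA(n_components=2).fit(X)

    print(pca.explained_variance_ratio_)   # e.g. roughly [0.92, 0.05]
    print(pca.transform(X).shape)          # 4 original features reduced to 2 components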

Q11: How is ML different from artificial intelligence?

A: AI is the broader field of building machines that carry out tasks normally requiring human intelligence, whereas ML is a subset of AI in which machines learn from data. Rather than being explicitly programmed for each task, ML systems gradually improve at their tasks and can automatically build models from what they learn.

Q12: What is optimization in ML?

A: Optimization in general refers to minimising or maximising an objective function. In the context of ML, optimisation refers to adjusting a model’s parameters, and tuning its hyperparameters, so as to minimise the error (or loss) function.
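
A concrete example of this is gradient descent. The toy sketch below, using only NumPy and an invented one-parameter linear model, repeatedly steps the parameter against the gradient of a mean-squared-error loss.

    # Gradient descent minimizing MSE for a one-parameter model y = w * x (toy data).
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = 2.0 * x                  # true slope is 2

    w = 0.0                      # parameter to optimize
    lr = 0.01                    # learning rate (a hyperparameter)

    for _ in range(500):
        grad = 2 * np.mean((w * x - y) * x)   # d(MSE)/dw
        w -= lr * grad                        # step against the gradient

    print(w)  # converges towards 2.0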

Q13: What is data augmentation? Can you give some examples?

A: Data augmentation is a technique for synthesizing new data by modifying existing data in such a way that the target is not changed, or it is changed in a known way.

Computer vision is one of fields where data augmentation is very useful. There are many modifications that we can do to images:

  • Resize
  • Horizontal or vertical flip
  • Rotate
  • Add noise
  • Deform
  • Modify colors

Each problem needs a customized data augmentation pipeline. For example, in OCR, flips will change the text and won’t be beneficial; however, resizes and small rotations may help.
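
The sketch below shows a few of these modifications using only NumPy on a stand-in image array; real pipelines would typically use a dedicated augmentation library, but the idea is the same: the image changes, the label does not.

    # Simple image augmentations on a stand-in RGB image array (NumPy only).
    import numpy as np

    image = np.random.rand(32, 32, 3)           # placeholder for a real image

    h_flip = image[:, ::-1, :]                   # horizontal flip
    v_flip = image[::-1, :, :]                   # vertical flip
    rotated = np.rot90(image)                    # 90-degree rotation
    noisy = np.clip(image + np.random.normal(0, 0.05, image.shape), 0.0, 1.0)  # add noise

    augmented = [h_flip, v_flip, rotated, noisy]
    print(len(augmented), augmented[0].shape)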

Q14: What are neural networks and where do they find their application in ML? Elaborate.

A: Neural networks are information-processing models whose design is loosely based on the biological neurons found in the human brain. They are a popular choice of technique in ML because they help discover patterns in data that are sometimes too complex for humans to comprehend directly.
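
As a small example of that, the sketch below (assuming scikit-learn and its synthetic "two moons" data) fits a small multi-layer perceptron to a pattern a linear model would struggle with.

    # A tiny neural network (multi-layer perceptron) on non-linear toy data.
    from sklearn.datasets import make_moons
    from sklearn.neural_network import MLPClassifier

    X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

    mlp = MLPClassifier(hidden_layer_sizes=(16, 16), activation="relu",
                        max_iter=2000, random_state=0)
    mlp.fit(X, y)
    print(mlp.score(X, y))   # training accuracy, typically close to 1.0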

Q15: What is dimensionality reduction? Explain in detail.

A: Dimensionality reduction is the process of reducing the number of variables considered in an ML problem. It is divided into two sub-processes: feature selection and feature extraction. Dimensionality reduction makes the training data easier to visualise and work with, and it finds an appropriate reduced set of variables known as the principal variables.

Q16: What is deep learning?

A: Deep learning is a subset of machine learning in which models are built from neural networks with many layers, allowing them to learn increasingly abstract representations of the data.

Q17: What are the three stages of building a model in machine learning?

A:

  • Model building
  • Model testing
  • Applying the model

Q18: What is the function of unsupervised learning?

A: The functions of unsupervised learning are as follows:

  • Find clusters in the data
  • Find low-dimensional representations of the data
  • Find interesting directions in the data
  • Find interesting coordinates and correlations
  • Find novel observations

Q19: What are the advantages of decision trees?

A: The advantages of decision trees are:

  • Decision trees are easy to interpret
  • They are non-parametric
  • There are relatively few parameters to tune

Q20: What is overfitting in machine learning?

A: Overfitting in machine learning occurs when a statistical model describes random error or noise instead of the underlying relationship, typically because the model is excessively complex.

Q21: What is the difference between supervised and unsupervised machine learning?

A: Supervised learning requires labeled training data, while unsupervised learning does not require labeled data.

Q22: How do bias and variance play out in machine learning?

A: Both bias and variance are errors. Bias is an error due to flawed assumptions in the learning algorithm. Variance is an error resulting from too much complexity in the learning algorithm.

Q23: How can you avoid overfitting?

A: We can avoid overfitting by using:

  • Lots of data
  • Cross-validation

Q24: What is bias error in a machine learning algorithm?

A: Bias is a common error in machine learning algorithms that arises from overly simplistic assumptions. A high-bias model underfits the data, preventing you from achieving maximum accuracy and making it difficult to generalize knowledge from the training set to the test set.

Q25: How do you handle missing or corrupted data in a data set?

A: You can handle missing or corrupted data in a data set either by dropping the affected rows or columns, or by replacing the missing values with another value (imputation).

In Pandas, two methods are particularly useful for this: isnull(), which identifies missing values, and dropna(), which drops them (fillna() can be used to replace them instead).
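
A short Pandas sketch, using a small invented DataFrame, shows these methods in practice.

    # Identify, drop, or replace missing values with Pandas.
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"age": [25, np.nan, 40], "salary": [50000, 60000, np.nan]})

    print(df.isnull().sum())          # count of missing values per column

    dropped = df.dropna()             # remove rows that contain missing values
    imputed = df.fillna(df.mean())    # or replace them, e.g. with column means
    print(dropped)
    print(imputed)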

Q26: What is inductive machine learning?

A: Inductive machine learning is the process of learning general rules or models from a set of observed examples, which can then be applied to new, unseen instances.

Q27: How can you ensure that you are not overfitting with a particular model?

A: In machine learning, there are three main methods to avoid overfitting:

  • Keep the model simple
  • Use cross-validation techniques
  • Use regularization techniques, for example, LASSO
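
The sketch below illustrates two of these ideas with scikit-learn on synthetic data: cross-validation to estimate generalization, and LASSO (L1) regularization, which keeps the model simple by driving many coefficients to zero.

    # Cross-validation plus LASSO regularization (scikit-learn, synthetic data).
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                           noise=10.0, random_state=0)

    lasso = Lasso(alpha=1.0)                          # alpha sets regularization strength
    print(cross_val_score(lasso, X, y, cv=5).mean())  # cross-validated R^2 score

    lasso.fit(X, y)
    print(np.sum(lasso.coef_ != 0))                   # many coefficients end up exactly zero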

Q28: What is your favorite algorithm? Can you explain it briefly, in under a minute?

A: This type of question is very common and is asked by interviewers to understand the candidate’s skills and assess how well they can communicate complex theories in the simplest language.

This can be a tough question if you are not prepared, so choose an algorithm (or a few) in advance and practice explaining it before going into any interview.
