Top 100 interview questions on Data Science & Machine Learning
Questions may include both coding and theory questions, based on the following tech skills. Languages: R, Python. Skills: Machine Learning, Statistics.
Question bank on data science concepts

Why did you do a master's in mathematics?

Rate yourself in statistics.

Rate yourself in Machine Learning.

You don’t know Java. Why should we hire you?

Explain Logistic Regression

What is logit?

Rate yourself in R.

How will you extract the rows of a data frame in R that match only two categories in column A and one category in column B?

Do you know how to write pseudocode in Python?

What is the 99th percentile?

What is the probability that a ball chosen from a bag containing 5 red balls, 7 green balls and 2 black balls will not be green?
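
One way to work through this question is to count the non-green balls and divide by the total. The snippet below is a minimal sketch that assumes a single ball is drawn uniformly at random; the variable names are illustrative only.

```python
from fractions import Fraction

# Counts taken from the question: 5 red, 7 green, 2 black.
counts = {"red": 5, "green": 7, "black": 2}

total = sum(counts.values())           # 14 balls in the bag
not_green = total - counts["green"]    # 7 balls are red or black

# Assumes one ball is drawn uniformly at random.
p_not_green = Fraction(not_green, total)
print(p_not_green)  # 1/2
```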

What is regression?

What is correlation?

How will you calculate correlation?
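
The question does not say which correlation coefficient is meant. As a sketch, assuming the Pearson coefficient is intended, it can be computed as the covariance of the two variables divided by the product of their standard deviations. The function name `pearson_r` and the sample data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a 1/n factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data: y is an exact linear function of x.
print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
```

The 1/n factors cancel between numerator and denominator, so they are omitted. The function raises `ZeroDivisionError` if either input is constant, since correlation is undefined in that case.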

What are the assumptions behind logistic regression?

Are you familiar with SQL databases?

What do you do if there is multicollinearity in the dataset?

Can you give examples of a normally distributed dataset?

Can you give examples of a uniformly distributed dataset?

Explain SVM.

What is a hyperplane?

Why didn’t you normalise the dataset? (Asked while I was explaining a project.)

Explain the chi-square distribution.

When will you use a chi-square test?

What is the difference between a decision tree and a random forest?

How will you tune a random forest model?

How would you build a model on textual data?

How would you convert text data to vector format?

What is TF-IDF?

What is the difference between bag of words and TF-IDF?
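
To make the contrast concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenisation and the textbook weighting tf × log(N / df); library implementations such as scikit-learn use smoothed variants by default, so their numbers will differ. The three documents and all function names are made up for illustration.

```python
import math
from collections import Counter

# Made-up toy corpus, tokenised on whitespace.
docs = ["the food was good", "the service was bad", "good food good service"]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})

def bag_of_words(tokens):
    """Raw term counts, one entry per vocabulary word."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

n_docs = len(tokenized)
# Document frequency: in how many documents each word appears.
doc_freq = {w: sum(w in doc for doc in tokenized) for w in vocab}

def tfidf(tokens):
    """Term frequency scaled by log inverse document frequency (unsmoothed)."""
    counts = Counter(tokens)
    return [
        (counts[w] / len(tokens)) * math.log(n_docs / doc_freq[w])
        for w in vocab
    ]

print(vocab)
print(bag_of_words(tokenized[2]))
print(tfidf(tokenized[2]))
```

Bag of words only records how often each word occurs in a document, whereas TF-IDF also down-weights words that occur in many documents of the corpus.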

What is a confusion matrix?

How will you evaluate a classification model?

What are variance and bias?

What is recall score?

What is precision score?

What is the F-score?
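
The three metrics above can be computed directly from the confusion-matrix counts. The following is a rough sketch for a binary problem, assuming label 1 is the positive class and that "F-score" refers to the F1 score (the harmonic mean of precision and recall); the labels are invented for illustration.

```python
# Invented ground-truth and predicted labels (1 = positive class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

precision = tp / (tp + fp)  # of the predicted positives, how many are correct
recall = tp / (tp + fn)     # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(precision, recall, f1)  # 0.75 0.75 0.75
```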

We don’t know how data science works. We can just get you a project and connect you with the client. You will have to solve the problem yourself. Can you do that?

You haven’t worked on any production-level project. Why?

You have mostly worked with AV and Kaggle datasets. Why not some real datasets?

What role does correlation play in modelling?

What would you do if you were given an imbalanced dataset to work on?

Write a SQL query to get the top 5 students by marks in mathematics, given a table that contains the mark sheet data for the students of a class.
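
A possible answer, sketched with Python's built-in `sqlite3` module so that it can be run as-is. The table name `marksheet`, the columns `student_name` and `mathematics`, and the rows are all hypothetical, since the question does not give a schema. `LIMIT` is the SQLite, MySQL and PostgreSQL spelling; SQL Server would use `SELECT TOP 5` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data; the question does not specify either.
conn.execute("CREATE TABLE marksheet (student_name TEXT, mathematics INTEGER)")
rows = [
    ("Asha", 91), ("Ben", 78), ("Chen", 85), ("Dev", 96),
    ("Esha", 64), ("Farid", 88), ("Gita", 73),
]
conn.executemany("INSERT INTO marksheet VALUES (?, ?)", rows)

# Sort by mathematics marks, highest first, and keep the first five rows.
query = """
    SELECT student_name, mathematics
    FROM marksheet
    ORDER BY mathematics DESC
    LIMIT 5
"""
top5 = conn.execute(query).fetchall()
print(top5)
conn.close()
```

This sketch does not treat ties specially: if several students share the fifth-highest mark, only some of them are returned.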

Which algorithm was used in the restaurant reviews classification project?

Explain Naive Bayes and its assumptions.
Machine Learning question bank for experienced candidates

Explain this project and your role.

What problems did you face while working on project A?

How many frames were passed per second in the OpenCV-based project?

Why did you use a regulariser?

Have you worked with PySpark? If yes, with which API (RDD or DataFrame)?

Explain ReLU.

When will you use ReLU?

What is the difference between the softmax and sigmoid functions?
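
As a small illustration, sigmoid maps a single score to a value in (0, 1), while softmax maps a vector of scores to a probability distribution that sums to 1. The sketch below uses only the standard library; the max-subtraction in `softmax` is a common numerical-stability trick added here, not something the question asks about.

```python
import math

def sigmoid(x):
    """Squash one real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))              # 0.5
print(softmax([1.0, 2.0, 3.0]))  # three probabilities summing to 1

# With two classes and scores [x, 0], softmax reduces to sigmoid(x).
print(softmax([2.0, 0.0])[0], sigmoid(2.0))
```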

How does a CNN work?

What is max pooling and how does it work?
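
Max pooling slides a window over a feature map and keeps only the largest value in each window; the stride controls how far the window moves between steps. The following is a minimal pure-Python sketch for a single 2D feature map without padding. The function name `max_pool2d` and the input values are made up for illustration, and a real model would use a deep learning library's pooling layer instead.

```python
def max_pool2d(image, size=2, stride=2):
    """Max pooling over a 2D list of numbers, with no padding."""
    rows, cols = len(image), len(image[0])
    pooled = []
    # Move the window top-left corner in steps of `stride`.
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# Made-up 4x4 feature map.
image = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [3, 4, 1, 8],
]
print(max_pool2d(image))            # [[6, 4], [7, 9]]
print(max_pool2d(image, stride=1))  # overlapping windows give a 3x3 output
```

With a 2x2 window and stride 2 the windows do not overlap, so the 4x4 input shrinks to 2x2; with stride 1 the windows overlap and the output is 3x3.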

What is recall score?

What is precision score?

What is the F-score?

What was the data source used in project A?

Why have you mostly worked with Keras and not TensorFlow?

Do you know data structures?

Can you write a custom function for a deep learning model?

What is the role of the loss function and the optimiser in a deep learning model?

How will you use SVM for multi-class classification (more than 2 categories)?

How do strides work in a CNN?

Which Python libraries have you worked with?

Which Python libraries related to NLP have you worked with?

Where have you used pandas in your projects?

What is a tensor?

Define the features you would need to tell whether a person is diabetic or not.

Tell us about a project in which you had to create data from scratch, and the problems you faced while working on it.

What base model would you suggest for Google’s Smart Reply?
