Machine Learning Interview Questions and Answers

Question - 71 : - What is meant by Ensemble Learning?

Answer - 71 : -

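Ensemble learning combines the predictions of multiple Machine Learning models to produce a model that is more accurate or more robust than any single one. The two primary techniques are bagging, which trains models independently on bootstrap samples of the data and then averages or votes on their outputs, and boosting, which trains models sequentially so that each one concentrates on the errors made by its predecessors.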
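
As a rough illustration of bagging, the sketch below trains several one-dimensional threshold classifiers ("stumps") on bootstrap samples and combines them by majority vote. The toy data, the function names, and the choice of stumps as base models are assumptions made for this example only.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a 1-D threshold classifier that predicts 1 when x > threshold."""
    best_threshold, best_errors = None, None
    for candidate, _ in points:
        # Count how many training points this threshold misclassifies.
        errors = sum((x > candidate) != bool(label) for x, label in points)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold

def bagging_fit(points, n_models=15, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        stumps.append(fit_stump(sample))
    return stumps

def bagging_predict(stumps, x):
    """Combine the individual stump predictions by majority vote."""
    votes = Counter(int(x > threshold) for threshold in stumps)
    return votes.most_common(1)[0][0]

# Toy training set of (feature, label) pairs.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
stumps = bagging_fit(train)
print(bagging_predict(stumps, 0.5))  # 0
print(bagging_predict(stumps, 9.5))  # 1
```

Boosting would differ in the training loop: instead of independent bootstrap samples, each new model would be fitted with more weight on the points the previous models got wrong.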

Question - 72 : - Which Tools can be Used to Discover Outlier Values?

Answer - 72 : -

Common tools for discovering outlier values include scatterplots, boxplots, and Z-scores.
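
The two numeric rules behind these tools can be sketched with the standard library alone. The function names, thresholds, and sample data below are assumptions chosen for illustration; the second function applies the 1.5 × IQR whisker rule that a boxplot conventionally draws.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > threshold]

def iqr_outliers(values, k=1.5):
    """Return values outside the boxplot whiskers [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(zscore_outliers(data, threshold=2.0))  # [95]
print(iqr_outliers(data))                    # [95]
```

A threshold of 2.0 is used here because, with only nine points, a Z-score computed from the population standard deviation cannot exceed about 2.83, so the usual cutoff of 3 would never flag anything.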

Question - 73 : - What are the Two Main Types of Filtering in Machine Learning? Explain.

Answer - 73 : -

The two types of filtering are:

Collaborative filtering
Content-based filtering

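Collaborative filtering is a recommender-system approach in which an individual user's interests are matched against the preferences of many other users; items liked by users with similar tastes are then recommended to that user.

Content-based filtering is a recommender-system approach that relies only on the individual user's own preferences and on the attributes of the items: it recommends items similar to those the user has already liked, without using other users' data.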
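
The difference between the two approaches can be sketched on a tiny, made-up ratings table. Everything below (user names, films, ratings, the two genre features, and the use of cosine similarity) is a hypothetical setup for illustration, not a production recommender.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

ITEMS = ["film_a", "film_b", "film_c", "film_d"]
# Ratings from 1 to 5; 0 means the user has not rated the film.
RATINGS = {
    "ana": [5, 4, 0, 0],
    "ben": [5, 5, 4, 1],
    "cy": [1, 0, 2, 5],
}
# Item attributes as [action, romance] flags.
FEATURES = {
    "film_a": [1, 0],
    "film_b": [1, 0],
    "film_c": [1, 1],
    "film_d": [0, 1],
}

def unseen_items(user):
    return [i for i, rating in enumerate(RATINGS[user]) if rating == 0]

def collaborative_recommend(user):
    """Score unseen items by other users' ratings, weighted by user similarity."""
    target = RATINGS[user]
    scores = [0.0] * len(ITEMS)
    for other, ratings in RATINGS.items():
        if other == user:
            continue
        weight = cosine(target, ratings)
        for i, rating in enumerate(ratings):
            scores[i] += weight * rating
    return ITEMS[max(unseen_items(user), key=lambda i: scores[i])]

def content_based_recommend(user):
    """Score unseen items by similarity to the user's own taste profile."""
    profile = [0.0, 0.0]
    for item, rating in zip(ITEMS, RATINGS[user]):
        for k, value in enumerate(FEATURES[item]):
            profile[k] += rating * value
    return ITEMS[max(unseen_items(user),
                     key=lambda i: cosine(profile, FEATURES[ITEMS[i]]))]

print(collaborative_recommend("ana"))   # film_c
print(content_based_recommend("ana"))   # film_c
```

Both functions pick the same film here, but for different reasons: the collaborative version follows the ratings of the most similar user, while the content-based version never looks at other users and relies only on the genre features.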

Question - 74 : - What are the Various Tests for Checking the Normality of a Dataset?

Answer - 74 : -

Many statistical methods and some Machine Learning models assume that the data is normally distributed, so checking the normality of a dataset is often an important step. Some of the tests used to check normality are:

D’Agostino Skewness Test
Shapiro-Wilk Test
Anderson-Darling Test
Jarque-Bera Test
Kolmogorov-Smirnov Test
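
To make one of these tests concrete, here is a minimal standard-library sketch of the Jarque-Bera test, which combines sample skewness and kurtosis into a single statistic. The function name and the sample data are assumptions for illustration; the P-value uses the asymptotic chi-square approximation with two degrees of freedom, which is unreliable for samples as small as these, so in practice a statistics library such as SciPy would normally be used instead.

```python
import math

def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic P-value."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    if m2 == 0:
        raise ValueError("sample has zero variance")
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2)
    return statistic, p_value

symmetric = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
skewed = [0, 0, 0, 0, 0, 0, 0, 0, 10]

print(jarque_bera(symmetric))  # small statistic, large P-value
print(jarque_bera(skewed))     # large statistic, small P-value
```

A small P-value is evidence against normality; a large one only means the test found no evidence of non-normality.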

Question - 75 : - What is meant by Correlation and Covariance?

Answer - 75 : -

Covariance measures how two variables vary together. A positive covariance means that the variables tend to increase together, and a negative covariance means that one tends to decrease as the other increases. Its magnitude depends on the units of the variables, so it is hard to interpret on its own.

Correlation is the covariance divided by the product of the two variables' standard deviations. This normalization yields a unit-free value between -1 and 1 that expresses both the direction and the strength of the linear relationship between the two variables.
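
The relationship between the two quantities can be checked with a few lines of plain Python. The helper names and the toy data are assumptions for this sketch; the sample (n - 1) form of covariance is used.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]
score_x10 = [v * 10 for v in score]  # same data expressed in a different unit

print(covariance(hours, score))       # 5.0
print(covariance(hours, score_x10))   # 50.0
print(correlation(hours, score))      # 1.0 (up to floating-point rounding)
print(correlation(hours, score_x10))  # 1.0 (up to floating-point rounding)
```

Changing the unit of one variable multiplies the covariance by ten but leaves the correlation unchanged, which is why correlation is the preferred measure for comparing the strength of relationships.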

Question - 76 : - What do you understand about the P-value?

Answer - 76 : -

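The P-value is used in decision-making while testing a hypothesis. It is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one actually observed. Equivalently, it is the smallest significance level at which the null hypothesis would be rejected. The P-value is compared against a chosen significance level (commonly 0.05): if it is below that level, the null hypothesis is rejected. A lower P-value indicates stronger evidence against the null hypothesis.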
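
As a worked example, the sketch below computes a two-sided P-value for a one-sample z-test using only the standard library. The scenario (a sample of 100 with mean 103, a null mean of 100, and a known standard deviation of 15) and the function names are assumptions made up for illustration.

```python
import math

def z_statistic(sample_mean, null_mean, sigma, n):
    """Standardized distance of the sample mean from the null-hypothesis mean."""
    return (sample_mean - null_mean) / (sigma / math.sqrt(n))

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))

z = z_statistic(sample_mean=103.0, null_mean=100.0, sigma=15.0, n=100)
p = two_sided_p_value(z)
alpha = 0.05  # chosen significance level

print(f"z = {z:.2f}, p = {p:.4f}")
print("reject null hypothesis" if p < alpha else "fail to reject null hypothesis")
```

Here the P-value is about 0.0455, just under 0.05, so the null hypothesis is rejected at the 5% level but would not be rejected at the 1% level.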

Question - 77 : - What is Rescaling of Data and how is it done?

Answer - 77 : -

In real-world datasets, attributes are often measured on widely different scales. Rescaling the attributes to a common scale helps many algorithms, especially those based on distances or gradient descent, to process the data efficiently.

We can rescale data using Scikit-learn. The code for rescaling the data using MinMaxScaler is as follows:

# Rescaling data
import numpy
import pandas
from sklearn.preprocessing import MinMaxScaler

# Placeholder: replace with the path or URL of a headerless CSV file
# that has nine columns (eight inputs followed by one output).
url = "data.csv"
names = ['Abhi', 'Piyush', 'Pranay', 'Sourav', 'Sid', 'Mike', 'pedi', 'Jack', 'Tim']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values

# Splitting the array into input and output
X = array[:, 0:8]
Y = array[:, 8]

# Fitting the scaler and rescaling every input column to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
rescaledX = scaler.fit_transform(X)

# Summarizing the modified data
numpy.set_printoptions(precision=3)
print(rescaledX[0:5, :])
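
For a target range of 0 to 1, min-max rescaling maps each value x in a column to (x - min) / (max - min). A minimal hand-written sketch of that formula for a single column follows; the function name and the handling of a constant column are assumptions of this example.

```python
def min_max_rescale(column):
    """Rescale one column of numbers to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        # A constant column carries no spread; map it to zeros by convention.
        return [0.0 for _ in column]
    return [(v - low) / (high - low) for v in column]

print(min_max_rescale([50, 60, 80, 100]))  # [0.0, 0.2, 0.6, 1.0]
```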

Question - 78 : - What is Binarizing of Data? How to Binarize?

Answer - 78 : -

Converting data into binary values on the basis of a threshold is known as binarizing of data. Values that are less than or equal to the threshold are set to 0, and values that are greater than the threshold are set to 1. This is useful in feature engineering, for example to turn a count or a probability into a yes/no indicator feature.

Data can be binarized using Scikit-learn. The code for binarizing data using Binarizer is as follows:

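# Binarizing data
import numpy
import pandas
from sklearn.preprocessing import Binarizer

# Placeholder: replace with the path or URL of a headerless CSV file
# that has nine columns (eight inputs followed by one output).
url = "data.csv"
names = ['Abhi', 'Piyush', 'Pranay', 'Sourav', 'Sid', 'Mike', 'pedi', 'Jack', 'Tim']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values

# Splitting the array into input and output
X = array[:, 0:8]
Y = array[:, 8]

# Values greater than the threshold become 1; all others become 0
binarizer = Binarizer(threshold=0.0).fit(X)
binaryX = binarizer.transform(X)

# Summarizing the modified data
numpy.set_printoptions(precision=3)
print(binaryX[0:5, :])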
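
The thresholding rule itself is a one-liner. The sketch below, with an assumed function name and made-up values, shows that a value exactly equal to the threshold maps to 0.

```python
def binarize(row, threshold=0.0):
    """Map values greater than the threshold to 1 and all others to 0."""
    return [1 if v > threshold else 0 for v in row]

values = [-1.5, 0.0, 0.3, 2.0]
print(binarize(values))       # [0, 0, 1, 1]
print(binarize(values, 0.5))  # [0, 0, 0, 1]
```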

Question - 79 : - How to Standardize Data?

Answer - 79 : -

Standardization is a method of rescaling data attributes so that each attribute has a mean of 0 and a standard deviation of 1. This is done by subtracting the attribute's mean from every value and dividing the result by the attribute's standard deviation. The main objective is to put attributes that have different means and standard deviations on a comparable scale.

Data can be standardized using Scikit-learn. The code for standardizing the data using StandardScaler is as follows:

# Python code to standardize data (0 mean, 1 stdev)
import numpy
import pandas
from sklearn.preprocessing import StandardScaler

# Placeholder: replace with the path or URL of a headerless CSV file
# that has nine columns (eight inputs followed by one output).
url = "data.csv"
names = ['Abhi', 'Piyush', 'Pranay', 'Sourav', 'Sid', 'Mike', 'pedi', 'Jack', 'Tim']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values

# Separate the array into input and output components
X = array[:, 0:8]
Y = array[:, 8]

# Learn each column's mean and standard deviation, then apply the transform
scaler = StandardScaler().fit(X)
rescaledX = scaler.transform(X)

# Summarize the transformed data
numpy.set_printoptions(precision=3)
print(rescaledX[0:5, :])
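
Written out by hand for a single column, standardization is z = (x - mean) / stdev. The sketch below uses the population standard deviation; the function name, the sample column, and the handling of a constant column are assumptions of this example.

```python
import statistics

def standardize(column):
    """Rescale one column to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(column)
    stdev = statistics.pstdev(column)
    if stdev == 0:
        # A constant column has no spread; map it to zeros by convention.
        return [0.0 for _ in column]
    return [(v - mean) / stdev for v in column]

scaled = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(scaled)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The input column has mean 5 and standard deviation 2, so the value 9 ends up two standard deviations above the mean.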

Question - 80 : - We know that one-hot encoding increases the dimensionality of a dataset, but label encoding doesn’t. How?

Answer - 80 : -

When one-hot encoding is used, the dimensionality of a dataset increases because every class of a categorical variable becomes a separate binary variable.

Example: Suppose there is a variable “Color” with three levels, “Yellow,” “Purple,” and “Orange.” One-hot encoding “Color” will create three different variables: Color.Yellow, Color.Purple, and Color.Orange.

In label encoding, each class of a variable is replaced by an integer code (0, 1, 2, and so on) within the same column, so no new variables are created. Because these codes imply an ordering between the classes, label encoding is best suited to binary or ordinal variables.

This is why one-hot encoding increases the dimensionality of data and label encoding does not.
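
The difference in dimensionality is easy to see in a small sketch. The helper functions below are hypothetical stand-ins written for this example (classes are sorted alphabetically to assign codes), and the “Color” values follow the example above.

```python
def label_encode(values):
    """Replace each class by an integer code; the result is a single column."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes

def one_hot_encode(values):
    """Create one 0/1 column per class; the result has len(classes) columns."""
    classes = sorted(set(values))
    return [[1 if v == c else 0 for c in classes] for v in values], classes

colors = ["Yellow", "Purple", "Orange", "Purple"]

labels, classes = label_encode(colors)
print(classes)  # ['Orange', 'Purple', 'Yellow']
print(labels)   # [2, 1, 0, 1]

rows, _ = one_hot_encode(colors)
print(rows)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Label encoding keeps one value per row, while one-hot encoding produces three values per row, one for each class of “Color.”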
