Deep Learning Interview Questions and Answers
Question - 91 : - How does the method of unsupervised learning aid in deep learning?
Answer - 91 : -
Unlike supervised learning, unsupervised learning involves no labeled categories at all. It is used to detect hidden attributes, patterns, and structure in unlabeled data. Beyond that, the method is commonly applied to the following tasks:
- Detect clusters in the data
- Find low-dimensional representations of the data
- Identify interesting directions in the data
- Find interesting coordinates, correlations, and links
- Detect novel observations and clean the data
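Clustering, the first task above, can be sketched with a minimal k-means implementation. This is an illustrative sketch in plain numpy, not production code; the data and all settings are made up for the example.

```python
import numpy as np

# Minimal k-means sketch: unsupervised clustering discovers group
# structure in unlabeled data, with no categories given in advance.
def kmeans(X, k, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centers at k randomly chosen data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated synthetic blobs; k-means recovers them without labels.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centers = kmeans(X, k=2)
```

With well-separated blobs, the two recovered clusters line up with the two generating distributions even though the algorithm never sees any labels.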
Question - 92 : - Mention the three steps to build the necessary assumption structure in deep learning
Answer - 92 : -
The process of developing an assumption (hypothesis) structure involves three specific steps. The first step is algorithm development; this phase is lengthy, as the data has to undergo several processes before an outcome is generated. The second step is algorithm analysis, which examines the in-process behavior of the method. The third step is implementing the final algorithm in the production procedure. The entire framework is interlinked and requires continuity throughout the process.
Question - 93 : - Define the concept of the perceptron
Answer - 93 : -
The perceptron is a model used for supervised classification: it maps a single input to one of two possible (binary) outcomes by computing a weighted sum of the input features and applying a threshold.
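The definition above can be made concrete with a minimal sketch of the classic perceptron learning rule. This is an illustrative numpy implementation trained on the linearly separable AND function; all settings are chosen for the example.

```python
import numpy as np

# Perceptron sketch: a linear binary classifier trained with the
# classic error-driven update rule.
def train_perceptron(X, y, lr=0.1, epochs=20):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            # Update weights only when the prediction is wrong.
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return w, b

# Learn the linearly separable AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
preds = [1 if xi @ w + b > 0 else 0 for xi in X]  # [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this rule finds a separating weight vector in a finite number of updates.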
Question - 94 : - Demonstrate the significant elements suffused in the Bayesian logic system
Answer - 94 : -
There are mainly two elements involved in this system. The first is a logical (qualitative) component: a set of Bayesian clauses that captures the qualitative structure of the specific domain. The second is a quantitative component, which records the measurable, probabilistic information about that domain.
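The two components can be sketched in miniature. The structure, variable names, and probabilities below are all hypothetical and chosen only to illustrate the split between the qualitative and quantitative parts; this is not the API of any particular library.

```python
# 1. Qualitative component: logical clauses describing which facts
#    depend on which (the structure of the domain).
structure = {
    "alarm": ["burglary"],  # alarm depends on burglary
}

# 2. Quantitative component: conditional probabilities attached to
#    those clauses (all numbers invented for illustration).
cpt = {
    "burglary": 0.01,        # P(burglary)
    ("alarm", True): 0.95,   # P(alarm | burglary)
    ("alarm", False): 0.02,  # P(alarm | no burglary)
}

# Inference combines both parts: P(alarm) by marginalizing over burglary.
p_alarm = (cpt["burglary"] * cpt[("alarm", True)]
           + (1 - cpt["burglary"]) * cpt[("alarm", False)])
print(p_alarm)  # 0.0293
```

The qualitative part alone says only *what* depends on *what*; the quantitative part supplies the numbers that make calculation possible.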
Question - 95 : - Define the concept of an additive learning algorithm
Answer - 95 : -
An incremental (additive) learning algorithm is one that can keep learning from new data that becomes accessible after a classifier has already been produced from an existing dataset, rather than retraining from scratch on the combined data.
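As a sketch of the idea, here is a toy nearest-centroid classifier whose class means can be folded together with new batches of data after the initial fit. The class and method names are hypothetical (the `partial_fit` name simply mirrors a common convention for incremental learners).

```python
import numpy as np

# Illustrative incremental learner: per-class running sums let us absorb
# new data without revisiting the old data.
class IncrementalCentroid:
    def __init__(self):
        self.sums, self.counts = {}, {}

    def partial_fit(self, X, y):
        # Fold the new batch into the running per-class sums.
        for xi, yi in zip(X, y):
            self.sums[yi] = self.sums.get(yi, 0) + xi
            self.counts[yi] = self.counts.get(yi, 0) + 1

    def predict(self, X):
        classes = sorted(self.sums)
        centroids = np.array([self.sums[c] / self.counts[c] for c in classes])
        return [classes[int(np.argmin(((xi - centroids) ** 2).sum(-1)))]
                for xi in X]

clf = IncrementalCentroid()
clf.partial_fit(np.array([[0.0, 0.0], [0.1, 0.0]]), [0, 0])
# Later, new data (even a brand-new class) arrives after the classifier
# has already been produced:
clf.partial_fit(np.array([[5.0, 5.0]]), [1])
preds = clf.predict(np.array([[0.0, 0.1], [4.9, 5.0]]))  # [0, 1]
```

The second `partial_fit` call updates the existing model in place, which is the defining property of incremental learning.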
Question - 96 : - Why are GPUs important for implementing deep learning models?
Answer - 96 : -
Whenever we build a neural network model, the training phase is the most resource-consuming job. Each iteration of training involves thousands (or more) of matrix multiplication operations. If a neural network has fewer than around 100,000 (1 lakh) parameters, training takes no more than a few minutes, or a few hours at most. But when we have millions of parameters, an ordinary computer will struggle. This is where GPUs come into the picture. GPUs (Graphics Processing Units) are processors with far more ALUs (arithmetic logic units) than a normal CPU, which makes them well suited to exactly this kind of heavy, highly parallel mathematical computation.
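A quick back-of-envelope calculation shows why training is dominated by matrix multiplication. The layer sizes below are made up for illustration (an MNIST-sized fully connected network).

```python
# Count parameters and multiply-accumulate (MAC) operations in one
# forward pass of a small fully connected network.
layers = [784, 512, 256, 10]  # illustrative layer widths

params = sum(n_in * n_out + n_out          # weights + biases
             for n_in, n_out in zip(layers, layers[1:]))
macs = sum(n_in * n_out                    # one multiply-add per weight
           for n_in, n_out in zip(layers, layers[1:]))

print(params)  # 535818 parameters
print(macs)    # 535040 multiply-adds per example, per forward pass
```

Even this small network performs over half a million multiply-adds per example per forward pass, and training repeats that (plus the backward pass) over many examples and many epochs, which is why the massively parallel ALUs of a GPU pay off.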
Question - 97 : - Which is the best algorithm for face detection ?
Answer - 97 : -
There are several machine learning algorithms available for face detection and recognition, but the best-performing ones involve CNNs and deep learning. Some notable algorithms are listed below:
- FaceNet
- Probabilistic Face Embeddings (PFE)
- ArcFace
- CosFace
- SphereFace
Question - 98 : - For any given problem, how do you decide if you have to use transfer learning or fine-tuning?
Answer - 98 : -
Transfer learning is a method in which a model developed for one task is reused as the starting point for a second task. Fine-tuning is one approach to achieving transfer learning. In plain transfer learning, we train a model on one dataset and then reuse it (for example, as a fixed feature extractor) on another dataset that has a different distribution of classes. In fine-tuning, we continue training the pretrained model on the new data, usually with a smaller learning rate so that the updates do not significantly disturb the already adjusted weights; a common setup is to hold out part of the new data (say an 80-20 split) for validating the result. To decide which method to choose, experiment first with plain transfer learning, as it is easy and fast; if it does not suffice for the purpose, then use fine-tuning.
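The key mechanical difference in fine-tuning, the reduced learning rate on pretrained weights, can be sketched in a few lines. All numbers and names below are invented for illustration; in a real framework the same idea appears as per-parameter-group learning rates or frozen layers.

```python
import numpy as np

# Conceptual fine-tuning sketch: reuse pretrained weights, but update
# them with a much smaller learning rate than the freshly initialized
# task head, so the pretrained knowledge is not destroyed.
pretrained_base = np.array([1.0, 2.0, 3.0])  # weights learned on task A
new_head = np.zeros(2)                       # fresh weights for task B

grads_base = np.array([0.5, 0.5, 0.5])       # illustrative gradients
grads_head = np.array([1.0, 1.0])
lr_base, lr_head = 1e-4, 1e-2                # small lr for pretrained layers

pretrained_base -= lr_base * grads_base      # barely moves
new_head -= lr_head * grads_head             # moves freely
```

Setting `lr_base` to zero would freeze the base entirely, which is the plain transfer-learning (feature-extractor) end of the same spectrum.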
Question - 99 : - Explain the difference between Gradient Descent and Stochastic Gradient Descent.
Answer - 99 : -
To begin with, gradient descent (GD) and stochastic gradient descent (SGD) are both popular machine learning and deep learning optimization algorithms used to update a set of parameters iteratively in order to minimize an error function. In gradient descent, the entire dataset is considered for each parameter update, while in stochastic gradient descent, the computation is carried out over only a single training sample per update. For example, if a dataset has 10,000 data points, GD must process all 10,000 points to make one update, so each step takes longer; SGD updates the parameters after every single sample, so updates are cheap and far more frequent. Because of these frequent, inexpensive updates, stochastic gradient descent usually converges faster than gradient descent on large datasets.
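The contrast can be sketched on a one-parameter least-squares problem. The data is synthetic and all settings (learning rates, epoch counts) are illustrative.

```python
import numpy as np

# Compare full-batch GD with per-sample SGD on fitting y = w * x.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)  # true slope: 3

def gd_epoch(w, lr=0.1):
    # One update per epoch, using the gradient averaged over ALL samples.
    grad = -2 * np.mean((y - X[:, 0] * w) * X[:, 0])
    return w - lr * grad

def sgd_epoch(w, lr=0.05):
    # One update PER SAMPLE: 100 cheap, noisy updates in a single epoch.
    for i in range(len(X)):
        grad = -2 * (y[i] - X[i, 0] * w) * X[i, 0]
        w -= lr * grad
    return w

w_gd, w_sgd = 0.0, 0.0
for _ in range(50):          # GD needs many full passes over the data
    w_gd = gd_epoch(w_gd)
w_sgd = sgd_epoch(w_sgd)     # one SGD epoch already lands near the slope
```

GD makes one careful step per pass over the data, while SGD makes a noisy step per sample; on large datasets the frequent noisy steps usually reach a good solution with far fewer passes.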
Question - 100 : - Why Sigmoid or Tanh is not preferred to be used as the activation function in the hidden layer of the neural network?
Answer - 100 : -
A common problem with the Tanh and Sigmoid functions is that they saturate: for inputs of large magnitude they output values at their extremes, where their gradient is nearly zero. Once saturated, the learning algorithm can barely update the weights and improve the model. Thus, sigmoid or tanh activations in the hidden layers can prevent the neural network from learning effectively, leading to the vanishing gradient problem. This problem can be addressed by using the Rectified Linear Unit (ReLU) activation function instead of sigmoid, together with a suitable weight initialization such as Xavier initialization.
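The saturation effect is easy to see numerically. The sketch below compares the sigmoid's derivative at a saturated input with ReLU's gradient at the same point.

```python
import numpy as np

# Why saturation causes vanishing gradients: the sigmoid's derivative
# shrinks toward zero for large |x|, while ReLU's gradient stays at 1
# for any positive input.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    return float(x > 0)

g_center = sigmoid_grad(0.0)    # 0.25: the sigmoid derivative's maximum
g_saturated = sigmoid_grad(10.0)  # ~4.5e-05: almost no gradient flows back
g_relu = relu_grad(10.0)        # 1.0: gradient passes through unchanged
```

When such tiny factors as `g_saturated` are multiplied across many layers during backpropagation, the gradient reaching the early layers effectively vanishes, which is exactly what ReLU's constant unit gradient avoids.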