Data Warehouse Interview Questions and Answers
Question - 71 : - What is the junk dimension?
Answer - 71 : -
A Junk Dimension is a type of dimension table consisting of attributes that do not belong in the fact table or in any of the other existing dimension tables. The characteristics of these attributes are usually text or various flags, e.g., non-generic comments or very simple yes/no or true/false indicators. These sorts of attributes typically remain when all the apparent dimensions within the business process are identified, and thus the designer is faced with the challenge of where to place these attributes that don’t belong within the other dimensions.
In some scenarios where data might not be appropriate to store within the schema. The info or attributes are often stored during a junk dimension; the character of the junk during this particular dimension is typically Boolean or flag values. A single dimension is formed by lumping a small number of dimensions. This is called a junk dimension adjunct dimension has unrelated attributes the process of grouping these random flags and text attributes in a dimension by transmitting them to a distinguished sub-dimension is related to junk dimension, so essentially any data that need not be stored in the data warehouse because it is unnecessary is stored in the junk dimension.
Question - 72 : - What are the different types of SCDs used in data warehousing?
Answer - 72 : -
SCDs stands for slowly changing dimensions. It is basically a dimension where data changes do not happen frequently or on any regular basis. There are three types of SCDs the first is SCD1 it is a record that is used to replace the original record. Even when there is only one record existing within the database, the present data is going to be replaced, and therefore the new data will take its place.
SCD2 is the new record file that is added to the dimension table. The record exists in the database with the current data and the previous data that is stored in the audit or history.
SCD3 uses the original data that is modified to the new data. This consists of two records, one record that exists in the database and the other record which will replace the old database record with this new information.
Question - 73 : - Which one is faster: multidimensional OLAP or relational OLAP?
Answer - 73 : -
Multi-dimensional OLAP also known as MOLAP is faster than relational OLAP because of the following reasons in MOLAP.
The data is stored in a multi-dimensional queue; the storage is not in the relational database but in proprietary formats. MOLAP stores all the possible combinations of data in a multidimensional array.
Question - 74 : - What is hybrid SCD?
Answer - 74 : -
Hybrid SCDs are combinations of both SCD1 and SCD2. It may happen that in a table, some columns are important and need to track changes for them that are captured by the historical data for them. Whereas in some columns, even if the data changes, that does not need to bother. For such tables are implemented hybrid SCDs wherein some columns are of type 1, and some are of type 2. So basically, a blanket rule is not applied on the entire table rather than customized on which particular columns where a particular rule needed to be applied.
Question - 75 : - Why do we overwrite the execute method and struts so as parts of the start framework?
Answer - 75 : -
We can develop the action servlets and the action form servlets and other circuit classes in the action form class. We can develop a validated method that can return action errors object and in this method. We can write the validation code as well if this method returns null or action errors with the size of zero. The web container will call execute as part of the action class. If it returns a size greater than zero, it will call the execute method. It will rather execute the JSP servlet or the HTML file as the value for the input attribute is part of the attribute in the struts-config XML file.
Question - 76 : - How are the time dimensions loaded?
Answer - 76 : -
Time dimensions are usually loaded by a program that loops through all possible dates appearing within the data, its commonplace for 100 years to be represented during a time dimension with one row per day.
Question - 77 : - What is the difference between a data warehouse and a data mart?
Answer - 77 : -
A data warehouse is a set of data isolated from operational systems, so it is basically a way from the database itself; it is a view of the database. This helps an organization deal with its decision-making process. The Data Mart is a subset of a data warehouse that is geared to a particular business line. Data mods provide the stock of condensed data collected in the organization for analysis on a particular field or entity. So this is basically stating that the data warehouse contains a whole variety of information while a data mart is just a subset of that information based on a particular business line or model.
Question - 78 : - What are the five main testing phases of a project?
Answer - 78 : -
ETL test is performed in five stages which are the following the identification of data sources and requirements; first, you will identify which data sources you want for your data warehouse and what are the requirement of the data warehouse and the analytical requirements that your organization needs the acquisition of data naturally after identifying the data source you will acquire that data implementing business logic and dimensional modeling on that data building and publishing that data and the reports that you will create out of the analytics that you perform.
Question - 79 : - What do you mean by the slice action and how many slice operated dimensions are used?
Answer - 79 : -
A slice operation is the filtration process in a data warehouse. It selects a specific dimension from a given cube and provides a new sub-cube in the slice operation. Only a single dimension is used, so basically, out of a multi-dimensional data warehouse, if it needs a very specific dimension that needs further analytics or processing, then it will use the slice operation in that data warehouse.
Question - 80 : - What are the stages of data warehousing?
Answer - 80 : -
There are 7 Steps to Data Warehousing:
Step 1: Determine Business Objectives
Step 2: Collect and Analyze Information
Step 3: Identify Core Business Processes
Step 4: Construct a Conceptual Data Model
Step 5: Identify Data Sources and Data Transformations planning
Step 6: Set Tracking Duration
Step 7: Implement the Plan