Data Warehouse Interview Questions and Answers
Question - 81 : - What is the difference between data cleaning and data transformation?
Answer - 81 : -
Data cleaning is the process of removing data that doesn’t belong in your dataset. Data transformation is the process of converting data from one format or structure into another. Transformation is also referred to as data wrangling or data munging: transforming and mapping data from one “raw” form into another format for warehousing and analysis.
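As a rough illustration of the distinction, the sketch below first cleans a small list of records by dropping rows that don't belong, then transforms the survivors into a different structure. The record layout and the function names clean and transform are assumptions made for this example, not part of any particular tool.

```python
from datetime import datetime

# Hypothetical raw records; the field names are illustrative only.
raw_rows = [
    {"id": "1", "amount": "19.99", "date": "2023-01-05"},
    {"id": "2", "amount": "", "date": "2023-01-06"},        # missing amount
    {"id": "1", "amount": "19.99", "date": "2023-01-05"},  # exact duplicate
    {"id": "3", "amount": "5.00", "date": "2023-01-07"},
]

def clean(rows):
    """Cleaning: drop rows with a missing amount and exact duplicates."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["id"], row["amount"], row["date"])
        if row["amount"] == "" or key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return kept

def transform(rows):
    """Transformation: map string fields to typed values in a new structure."""
    return [
        (
            int(r["id"]),
            float(r["amount"]),
            datetime.strptime(r["date"], "%Y-%m-%d").date(),
        )
        for r in rows
    ]

cleaned = clean(raw_rows)            # cleaning changes which rows exist
warehouse_rows = transform(cleaned)  # transformation changes their shape
```

Cleaning decides which rows survive, while transformation changes the format of the rows that do.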
Question - 82 : - What is the benefit of Normalization?
Answer - 82 : -
Normalization reduces data redundancy. As a result, it saves physical storage space and keeps the cost of write operations low, because each piece of data is stored and updated in only one place.
Question - 83 : - What is Denormalization in a Database?
Answer - 83 : -
Denormalization is used to access data from a higher or lower normal form of a database. It introduces redundancy by storing multiple copies of the same data in different tables.
Question - 84 : - What is the benefit of denormalization?
Answer - 84 : -
Denormalization adds the required redundant data to tables to avoid complex joins and other costly operations. Denormalization doesn’t mean that normalization is skipped; rather, the denormalization process takes place after the normalization process.
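The trade-off between normalized and denormalized designs can be sketched with Python's built-in sqlite3 module. The table and column names (customers, orders, orders_denorm) are hypothetical and chosen only for this example. The denormalized table is derived from the normalized ones, which mirrors the point that denormalization comes after normalization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: each customer name is stored exactly once.
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0)],
)

# Reading orders together with customer names requires a join.
normalized = cur.execute(
    "SELECT o.order_id, c.name, o.total FROM orders o "
    "JOIN customers c ON c.customer_id = o.customer_id ORDER BY o.order_id"
).fetchall()

# Denormalized design: the customer name is copied into every order row,
# so the same read needs no join, at the cost of redundant storage.
cur.execute(
    "CREATE TABLE orders_denorm AS "
    "SELECT o.order_id, c.name AS customer_name, o.total "
    "FROM orders o JOIN customers c ON c.customer_id = o.customer_id"
)
denormalized = cur.execute(
    "SELECT order_id, customer_name, total FROM orders_denorm ORDER BY order_id"
).fetchall()
conn.close()
```

Both queries return the same rows. In the denormalized table, however, "Alice" is stored twice, so renaming a customer would mean updating several rows instead of one.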
Question - 85 : - What is an Extent?
Answer - 85 : -
An Extent is a fixed number of contiguous data blocks as per configuration. It is obtained during a single allocation and used to store a specific type of information.
Question - 86 : - What is a source qualifier?
Answer - 86 : -
In Informatica PowerCenter, a Source Qualifier represents the rows that the server reads when it executes a session. When you add a relational or flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation.
Question - 87 : - What is ETL Pipeline?
Answer - 87 : -
ETL Pipeline refers to a set of processes that extract data from one system, transform it, and load it into a database or data warehouse. ETL pipelines are built for data warehousing applications, which include both enterprise data warehouses and subject-specific data marts. ETL pipelines are also used for data migration solutions. Data warehouse and business intelligence engineers build ETL pipelines.
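The three stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the CSV text stands in for a source system, an in-memory SQLite database stands in for the warehouse, and the names SOURCE_CSV, extract, transform, load, and the sales table are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system.
SOURCE_CSV = "sku,qty,unit_price\nA1,2,3.50\nB2,1,10.00\n"

def extract(text):
    """Extract: read rows from the source (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast string fields to numbers and derive a revenue column."""
    return [
        (r["sku"], int(r["qty"]), int(r["qty"]) * float(r["unit_price"]))
        for r in rows
    ]

def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (sku TEXT, qty INTEGER, revenue REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT sku, qty, revenue FROM sales ORDER BY sku").fetchall()
```

Keeping each stage in its own function makes it possible to test or replace one stage (for example, a different source) without touching the others.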
Question - 88 : - What is the Data Pipeline?
Answer - 88 : -
Data Pipeline refers to any set of processing elements that move data from one system to another. A data pipeline can be built for any application that uses data to deliver value. It is often used for integrating data across applications, building data-driven web products, and carrying out data mining activities. Data engineers build data pipelines.
Question - 89 : - What is Fact? What are the types of Facts?
Answer - 89 : -
A fact is a central component of a multi-dimensional model that contains the measures to be analyzed. Facts are related to dimensions.
Types of facts are:
- Additive Facts
- Semi-additive Facts
- Non-additive Facts
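The three types listed above differ in how safely their values can be summed. As commonly defined in dimensional modeling, additive facts can be summed across all dimensions, semi-additive facts across some dimensions but not others (typically not across time), and non-additive facts across none. The sketch below uses a hypothetical fact table with made-up columns to illustrate each case.

```python
# Hypothetical fact rows: (day, account, sales_amount, balance, margin_pct)
rows = [
    ("Mon", "A", 100.0, 500.0, 0.20),
    ("Mon", "B", 300.0, 200.0, 0.40),
    ("Tue", "A", 100.0, 550.0, 0.20),
    ("Tue", "B", 100.0, 250.0, 0.40),
]

# Additive: sales_amount can be summed across every dimension (day and account).
total_sales = sum(r[2] for r in rows)

# Semi-additive: balance can be summed across accounts for a single day,
# but summing it across days would count the same money more than once.
tuesday_balance = sum(r[3] for r in rows if r[0] == "Tue")

# Non-additive: margin_pct is a ratio, so it cannot be summed at all.
# Recompute it from additive components instead.
total_margin = sum(r[2] * r[4] for r in rows)
overall_margin_pct = total_margin / total_sales
```

Note that the overall margin is neither the sum nor the simple average of the per-row percentages. It has to be rebuilt from the additive amounts.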