ETL Testing Interview Questions and Answers
1) What is ETL?
In data warehousing architecture, ETL is an important component that manages the data for any business process. ETL stands for Extract, Transform and Load. Extract reads data from a source database. Transform converts the data into a format appropriate for reporting and analysis. Load writes the data into the target database.
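The three steps can be sketched in a few lines of Python. This is only an illustration: the source rows, the `sales` table and the function names are assumptions made for the example, and an in-memory SQLite database stands in for the target.

```python
import sqlite3

# Hypothetical source rows; a real job would read these from a source database.
SOURCE_ROWS = [
    ("alice", "2021-01-05", "100.50"),
    ("bob", "2021-01-06", "20"),
]

def extract():
    """Extract: read the raw rows from the source."""
    return list(SOURCE_ROWS)

def transform(rows):
    """Transform: tidy the names and convert amounts from text to numbers."""
    return [(name.title(), day, float(amount)) for name, day, amount in rows]

def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, day TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the target database
load(conn, transform(extract()))
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```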
2) Why is ETL testing required?
• To keep a check on the data being transferred from one system to the other.
• To keep track of the efficiency and speed of the process.
• To be well acquainted with the ETL process before it is implemented in your business and production.
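One common way to check transferred data is to reconcile row counts and a column total between source and target. The sketch below assumes rows are tuples and that the amounts are integers; the function name `reconcile` is hypothetical.

```python
def reconcile(source_rows, target_rows, amount_index):
    """Compare row counts and a column total between source and target."""
    source_total = sum(row[amount_index] for row in source_rows)
    target_total = sum(row[amount_index] for row in target_rows)
    return {
        "row_count_match": len(source_rows) == len(target_rows),
        "total_match": source_total == target_total,
    }

source = [(1, 100), (2, 250), (3, 75)]
target_ok = [(1, 100), (2, 250), (3, 75)]
target_bad = [(1, 100), (2, 250)]  # one row was lost during the load

print(reconcile(source, target_ok, 1))
print(reconcile(source, target_bad, 1))
```

Matching counts and totals do not prove the load is correct, but a mismatch is a cheap early signal that rows were dropped or altered.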
3) What is a three-tier data warehouse?
Most data warehouses are considered to be three-tier systems. This is essential to their structure. The first layer is where the data lands. This is the collection point where data from outside sources is compiled. The second layer is known as the ‘integration layer.’ This is where the data that has been stored is transformed to meet company needs. The third layer is called the ‘dimension layer,’ and is where the transformed data is stored for internal use.
4) What are the types of data warehouse applications?
The types of data warehouse applications are:
• Info Processing
• Analytical Processing
• Data Mining
5) What are ETL tester responsibilities?
• Requires in-depth knowledge of the ETL tools and processes.
• Needs to write the SQL queries for the various given scenarios during the testing phase.
• Should be able to carry out different types of tests such as Primary Key and defaults, and keep a check on the other functionality of the ETL process.
• Quality Check
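As an illustration of a Primary Key test, the sketch below checks a key column for missing and duplicate values. It assumes rows are dictionaries; `check_primary_key` and the sample data are hypothetical.

```python
from collections import Counter

def check_primary_key(rows, key):
    """Report missing and duplicate values in the column used as primary key."""
    values = [row.get(key) for row in rows]
    present = [value for value in values if value is not None]
    counts = Counter(present)
    return {
        "null_count": len(values) - len(present),
        "duplicates": sorted(value for value, n in counts.items() if n > 1),
    }

rows = [
    {"order_id": 1, "amount": 10},
    {"order_id": 2, "amount": 15},
    {"order_id": 2, "amount": 15},     # duplicate key
    {"order_id": None, "amount": 99},  # missing key
]
report = check_primary_key(rows, "order_id")
print(report)
```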
6) What are the types of data warehouse?
There are mainly three types of data warehouse:
• Enterprise Data Warehouse
• Operational Data Store
• Data Mart
7) What is a Data Mart?
A Data Mart is a subset of a data warehouse that can provide data for reporting and analysis on a section, unit or department such as the Sales Dept., HR Dept., etc. Data Marts are sometimes also called HPQS (Higher Performance Query Structure).
8) What is the difference between data mining and data warehousing?
Data warehousing comes before the mining process. It is the act of gathering data from various outside sources and organizing it into one specific place, the warehouse. Data mining is when that data is analyzed and used as information for making decisions.
9) What is data purging?
Data purging is the process of deleting data from a data warehouse. It deletes junk data such as rows with null values or extra spaces.
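A minimal sketch of a purge step follows. It assumes rows are dictionaries and treats a row as junk when any value is null (`None`) or a string made up only of spaces; real purge rules vary by project, and the names here are hypothetical.

```python
def is_junk(row):
    """A row is junk if any value is None or a blank/whitespace-only string."""
    return any(
        value is None or (isinstance(value, str) and value.strip() == "")
        for value in row.values()
    )

def purge(rows):
    """Return only the rows that are not junk."""
    return [row for row in rows if not is_junk(row)]

rows = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": None},    # null value
    {"id": 3, "name": "   "},   # only spaces
    {"id": 4, "name": "Bob"},
]
kept = purge(rows)
print(kept)
```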
10) What is partitioning?
Partitioning is when an area of data storage is sub-divided to improve performance. Think of it as an organizational tool. If all your collected data is in one large space without organization, the digital tools used for analyzing it will have a more difficult time finding the data in order to analyze it. Partitioning your warehouse will create an organizational structure that will make locating and analyzing easier and faster.
11) What are some types of partitioning?
Two types of partitioning are round-robin partitioning and hash partitioning.
• Round-robin partitioning is when the data is evenly distributed among all partitions. This means that the number of rows in each partition is relatively the same.
• Hash partitioning is when the server applies a hash function in order to create partition keys to group data.
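The two schemes can be sketched as follows. This is not how any particular ETL server implements them; the function names are hypothetical, and `zlib.crc32` is used only as a simple, repeatable hash function for the example.

```python
import zlib

def round_robin_partition(rows, n_partitions):
    """Deal rows out in turn so partition sizes differ by at most one."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions

def hash_partition(rows, n_partitions, key):
    """Send rows with the same key value to the same partition."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        partitions[digest % n_partitions].append(row)
    return partitions

customers = ["a", "b", "c", "a", "b", "a", "d"]
rows = [{"customer": c, "amount": i} for i, c in enumerate(customers)]

rr = round_robin_partition(rows, 3)
hp = hash_partition(rows, 3, "customer")
print([len(p) for p in rr])  # even sizes
print([sorted({r["customer"] for r in p}) for p in hp])  # keys grouped together
```

Round-robin gives balanced partitions but scatters related rows; hash partitioning keeps rows with the same key together, at the cost of possibly uneven sizes.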
12) What are the various tools used in ETL?
• Cognos Decision Stream
• Oracle Warehouse Builder
• Business Objects XI
• SAS Business Warehouse
• SAS Enterprise ETL Server
13) What is a fact?
It is a central component of a multi-dimensional model which contains the measures to be analyzed. Facts are related to dimensions.
14) What are the types of Facts?
The types of facts are as follows:
• Additive Facts: A fact which can be summed up for any of the dimensions available in the fact table.
• Semi-Additive Facts: A fact which can be summed up for a few dimensions but not for all dimensions available in the fact table.
• Non-Additive Facts: A fact which cannot be summed up for any of the dimensions available in the fact table.
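A small worked example may help with the first two types. The data is invented: deposits are treated as an additive fact, and account balance as a semi-additive one, since balances can be summed across accounts but not across days.

```python
# Hypothetical fact rows: (day, account, deposits, balance)
rows = [
    ("2021-01-01", "A", 100, 500),
    ("2021-01-01", "B", 50, 300),
    ("2021-01-02", "A", 20, 520),
    ("2021-01-02", "B", 0, 300),
]

# Additive: deposits can be summed over every dimension (day and account).
total_deposits = sum(r[2] for r in rows)

# Semi-additive: balances can be summed across accounts for one day...
balance_day2 = sum(r[3] for r in rows if r[0] == "2021-01-02")

# ...but summing them across days double counts the same money,
# so the latest balance per account is used instead.
rows_for_a = [r for r in rows if r[1] == "A"]
latest_balance_a = max(rows_for_a, key=lambda r: r[0])[3]

print(total_deposits, balance_day2, latest_balance_a)
```

A ratio such as a profit margin percentage is a typical non-additive fact: it has to be recomputed from its components rather than summed.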
15) What are Fact Tables?
A Fact Table is a table that contains summarized numerical (facts) and historical data. This Fact Table has a foreign key-primary key relation with a dimension table. The Fact Table maintains the data in 3rd normal form.
A star schema is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for a Data Warehouse or Data Mart.
16) What are the types of Fact Tables?
The types of Fact Tables are:
• Cumulative Fact Table: This type of fact table generally describes what happened over a period of time. It contains additive facts.
• Snapshot Fact Table: This type of fact table deals with a particular period of time. It contains non-additive and semi-additive facts.
17) What is Grain of Fact?
The Grain of Fact is defined as the level at which the fact information is stored in a fact table. This is also called Fact Granularity or Fact Event Level.
18) What is a Factless Fact Table?
A Fact Table which does not contain facts is called a Factless Fact Table. Generally, when we need to combine two data marts, one data mart will have a factless fact table and the other one a common fact table.
19) What are Dimensions?
Dimensions are categories by which summarized data can be viewed. For example, a profit Fact table can be viewed by a time dimension.
20) What are Conformed Dimensions?
Dimensions which are reusable and fixed in nature, for example customer, time and geography dimensions.
21) What is a transformation?
A transformation is a repository object which generates, modifies or passes data. Transformations are of two types: Active and Passive.
22) What are active and passive transformations?
In an active transformation, the number of rows that is created as output can be changed once a transformation has occurred. This does not happen during a passive transformation; the data passes through with the same number of rows given to it as input.
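The difference shows up in the row counts. The sketch below uses plain Python functions rather than any ETL tool's transformation objects; the function names and data are hypothetical.

```python
def passive_uppercase(rows):
    """Passive: every input row produces exactly one output row."""
    return [{**row, "name": row["name"].upper()} for row in rows]

def active_filter(rows):
    """Active: rows can be dropped, so the output row count can change."""
    return [row for row in rows if row["amount"] > 0]

rows = [
    {"name": "alice", "amount": 10},
    {"name": "bob", "amount": 0},
    {"name": "carol", "amount": 5},
]
passive_out = passive_uppercase(rows)
active_out = active_filter(rows)
print(len(rows), len(passive_out), len(active_out))
```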
23) Explain the use of the Lookup Transformation?
The Lookup Transformation is useful for:
• Getting a related value from a table using a column value
• Updating a slowly changing dimension table
• Verifying whether records already exist in the table
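The first and third uses can be sketched with a dictionary standing in for the lookup table. This is an illustration only, not a tool's Lookup Transformation; `dim_customer`, `lookup` and the sample rows are assumptions.

```python
# Hypothetical lookup table: natural customer id -> key in the dimension table
dim_customer = {"C-1": 101, "C-2": 102}

def lookup(dim, incoming_rows):
    """Attach the related key to each row and separate rows with no match."""
    matched, missing = [], []
    for row in incoming_rows:
        key = dim.get(row["customer_id"])
        if key is None:
            missing.append(row)  # record does not exist in the table yet
        else:
            matched.append({**row, "customer_key": key})
    return matched, missing

incoming = [
    {"customer_id": "C-1", "amount": 10},
    {"customer_id": "C-9", "amount": 7},
]
matched, missing = lookup(dim_customer, incoming)
print(matched)
print(missing)
```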
24) What is the difference between a dimension table and a fact table?
A dimension table consists of tuples of attributes of the dimension. A fact table can be thought of as having tuples, one per recorded fact. This fact contains some measured or observed variables and identifies them with pointers to dimension tables.
25) What is OLAP?
OLAP stands for Online Analytical Processing. It uses database tables (Fact and Dimension tables) to enable multidimensional viewing, analysis and querying of large amounts of data.
26) What is OLTP?
OLTP stands for Online Transaction Processing. Except for data warehouse databases, the other databases are OLTPs. These OLTP databases use a normalized schema structure. They are designed for recording the daily operations and transactions of a business.
27) What is Operational Data Store [ODS]?
It is a collection of integrated databases designed to support operational monitoring. Unlike the OLTP databases, the data in the ODS is integrated, subject-oriented and enterprise-wide.
28) What are Measures?
Measures are numeric data based on columns in a fact table.
29) What are Cubes and OLAP Cubes?
Cubes are data processing units comprised of fact tables and dimensions from the data warehouse. They provide multi-dimensional analysis.
OLAP stands for Online Analytical Processing, and an OLAP cube stores large data in multi-dimensional form for reporting purposes. It consists of facts, called measures, categorized by dimensions.
30) What are Virtual Cubes?
These are combinations of one or more real cubes and require no disk space to store them. They store only the definition and not the data.
31) What is Bus Schema?
A BUS schema is used to identify the common dimensions across the various business processes. It comes with conformed dimensions along with a standardized definition of information.
32) What is a Star schema design?
A Star schema is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for a Data Warehouse or Data Mart.
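A tiny star schema can be sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_date`) are made up for the example; the point is only that the fact table sits in the middle and joins out to each dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);

-- Central fact table: one foreign key per dimension plus the measure.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date VALUES (10, 2020), (11, 2021);
INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 11, 7.0), (2, 11, 3.0);
""")

# Typical star query: join the fact to its dimensions, then aggregate.
totals = conn.execute("""
    SELECT p.name, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY p.name, d.year
    ORDER BY p.name, d.year
""").fetchall()
print(totals)
```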
33) What is a Snowflake schema design?
In a Snowflake design the dimension table (de-normalized table) will be further divided into one or more dimensions (normalized tables) to organize the information in a better structural format. To design a snowflake schema we should first design the star schema.
34) What are Schema Objects?
Schema objects are the logical structures that directly refer to the database's data. Schema objects include tables, views, sequences, synonyms, indexes, clusters, functions, packages and database links.
35) What is a staging area and what is its purpose?
Data staging is an area where you hold the data temporarily on the data warehouse server. Data staging includes the following steps:
• Source data extraction and data transformation (restructuring)
• Data transformation (data cleansing, value transformation)
• Surrogate key assignments
36) Explain ETL Mapping Sheets?
ETL mapping sheets contain all the required information from the source file, including all the rows and columns. This sheet helps the experts in writing the SQL queries for ETL tools testing.
37) What is Denormalization?
Denormalization means a table with multiple duplicate keys. The dimension table follows the denormalization method with the technique of a surrogate key.
38) What is a Surrogate Key?
A Surrogate Key is a sequence-generated key which is assigned to be a primary key in the system (table).
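A rough sketch of sequence-based key assignment follows. In practice a database sequence or identity column usually generates the keys; here `itertools.count` plays that role, and `assign_surrogate_keys` is a hypothetical helper.

```python
from itertools import count

def assign_surrogate_keys(natural_keys, start=1):
    """Give each distinct natural key the next number from a sequence."""
    sequence = count(start)
    mapping = {}
    for natural_key in natural_keys:
        if natural_key not in mapping:  # repeated keys reuse their number
            mapping[natural_key] = next(sequence)
    return mapping

keys = assign_surrogate_keys(["C-9", "C-2", "C-9"])
print(keys)
```

The surrogate key carries no business meaning, so it stays stable even if the natural key from the source system changes format.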
39) Explain these terms: Mapping, Session, Worklet, Mapplet and Workflow?
• Mapping is the movement of data from the source to the destination.
• Session is the set of parameters that instructs the data during the above movement.
• Worklet represents a specific set of tasks given.
• A workflow is a set of instructions that tell the server how to execute tasks.
• A mapplet creates or arranges sets of transformations.
40) List a few ETL bugs?
Calculation Bug, User Interface Bug, Source Bugs, Load condition bug and ECP related bug are some of the ETL bugs.