Dimensional Data Modeling Interview Questions and Answers

Q1 : What is relational data modeling?
A : The visual representation of objects in a relational database (usually a normalized) is called as relational data modeling. Table contains rows and columns

Q2 : Distinguish between OLTP and OLAP?
A : OLTP stands for Online Transaction Processing system & OLAP stands for Online Analytical processing system. OLTP maintains the transactional data of the business & is highly normalized generally. On the contrary, OLAP is for analysis and reporting purpose & it is in de-normalized form.
This difference between OLAP and OLTP also gives you the way to choosing the design of schema. If your system is OLTP, you should go with star schema design and if your system is OLAP, you should go with snowflake schema.

Q3 : Is this true that all databases should be in 3NF?
A : It is not mandatory for a database to be in 3NF. However, if your purpose is easy maintenance of data, less redundancy, and efficient access then you should go with a de-normalized database.

Q4 : What is data model repository?
A : Data Model and its relevant data like entity definition, attribute definition, columns, data types etc. are stored in a repository, which can be accessed by data modelers and the entire team.

Q5 : What is a foreign key constraint?
A : Parent table has primary key and a foreign key constraint is imposed on a column in the child table. The foreign key column value in the child table will always refer to primary key values in the parent table.

Q6 : What are a physical data model and physical data modeling?
A : The physical data model includes all required tables, columns, relationship, database properties for the physical implementation of databases. Database performance, indexing strategy, and physical storage are important parameters of a physical model. The important or main object in a database is a table which consists or rows and columns. The approach by which physical data models are created is called as physical data modeling

Q7 : What is identifying relationship?
A : Usually, in a data model, parent tables and child tables are present. Parent table and child table are connected by a relationship line. If the referenced column in the child table is a part of the primary key in the child table, relationship is drawn by thick lines by connecting these two tables, which is called as identifying relationship.

Q8 : What is an enterprise data model?
A : Enterprise data model comprises of all entities required by an enterprise. The development of a common consistent view and understanding of data elements and their relationships across the enterprise is referred to as Enterprise Data Modeling. For better understanding purpose, these data models are split up into subject areas

Q9 : What is normalization?
A : E.F. Codd gave some rules to design relational databases and the rules were focused on removing data redundancy which helps to overcome normal data modeling problems. The process of removing data redundancy is known as normalization.

Q10 : What is De-Normalization?
A : De-Normalization is a process of adding redundancy to the data. This helps to quickly retrieve the information from the database.

Q11 : What is reverse engineering in a data model?
A : Reverse Engineering is a process useful for creating the data models from database or scripts. Data modeling tools have options to connect to the database by which we can reverse engineer a database into a data model.

Q12 : What is the difference between the hashed file stage and sequential file stage in relates to DataStage Server?
A : In datastage server jobs, can we use sequential file stage for a lookup instead of hashed filestage. If yes, then what’s the advantage of a Hashed File stage over sequential file stage
search is faster in hash files as you can directly get the address of record directly by hash algorithm as records are stored like that but in case of sequential file u must compare all the records.

Q13 : What is forward engineering in a data model?
A : Forward Engineering is a process by which DDL scripts are generated from the data model. Data modeling tools have options to create DDL scripts by connecting with various databases. With these scripts, databases can be created.

Q14 : What are the types of normalization?
A : First normal form, Second normal form, third normal forms are three types of normalization used in practice. Beyond these normal forms, Boyce-Codd fourth and fifth normal forms are also available.

Q15 : What is reverse engineering in a data model?
A : Reverse Engineering is a process useful for creating the data models from database or scripts. Data modeling tools have options to connect to the database by which we can reverse engineer a database into a data model.

Q16 : What is third normal form?
A : An entity is in the third normal form if it is in the second normal form and all of its attributes are not transitively dependent on the primary key. Transitive dependence means that descriptor key attributes depend not only on the whole primary key, but also on other descriptor key attributes that, in turn, depend on the primary key. In SQL terms, the third normal form means that no column within a table is dependent on a descriptor column that, in turn, depends on the primary key.
For 3NF, first, the table must be in 2NF, plus, we want to make sure that the non-key fields are dependent upon ONLY the PK, and not other non-key fields for its existence. This is very similar to to 2NF, except that now you are comparing the non-key fields to OTHER non-key fields. After all, we know that the relationship to the PK is good, because we established that in 2NF.

Q17 : What is a subtype and super type entity?
A : An entity can be split into many entities (sub-entities) and grouped based on some characteristics and each sub-entity will have attributes relevant to that entity. These entities are called subtype entities. The attributes which are common to these entities are moved to a super (higher) level entity, which is called a supertype entity.

Q18 : What is a sequence?
A : Sequence is a database object to generate unique number.

Q19 : How many null values can be inserted in a column that has unique constraint?
A : Many null values can be inserted in an unique constraint column because one null value is not equal to another null value.

Q20 : What is the self-recursive relationship?
A : A standalone column in a table will be connected to the primary key of the same table, which is called as recursive relationship.

Q21 : What is a column (attribute)?
A : Column also known as a field is a vertical alignment of the data and contains related information to that column.

Q22 : What is the difference between a logical and physical data model?
Logical | Physical Data Modeling:
A : When a data modeler works with the client, his title may be a logical data modeler or a physical data modeler or combination of both. A logical data modeler designs the data model to suit business requirements, creates and maintains the lookup data, compare the versions of data model, maintains change log, generate reports from data model and whereas a physical data modeler has to know about the source and target databases properties.
A physical data modeler should know the technical know-how to create data models from existing databases and to tune the data models with referential integrity, alternate keys, indexes and how to match indexes to SQL code. It would be good if the physical data modeler knows about replication, clustering and so on.

Q23 : What is a table (entity)?
A : Data stored in form of rows and columns is called as table. Each column has datatype and based on the situation, integrity constraints are enforced on columns.

Q24 : What are the important types of Relationships in a data model?
A : Identifying, Non-Identifying Relationship, Self-Recursive relationship are the types of relationship.

Q25 : What is ER (entity relationship) diagram or ERD?
A : ER diagram is a visual representation of entities and the relationships between them. In a data model, entities (tables) look like square boxes or rectangular boxes, which contain attributes and these entities, are connected by lines (relationship).