Database Design - Part 3

Database Design - Part 3

In the second part of the article we end with the transformation to the relational model, that is, designing the tables for the database. Said tables were obtained based on the entity-relationship model and in accordance with the transformation rules described in the previous article.

Now we need to analyze the dependencies that exist between the attributes or fields of the tables and in this way to be able to normalize said tables.

Normalization involves trying to eliminate the redundancy of data that could arise when constructing the already transformed tables and start storing data in them. Some degrees of redundancy will have to be accepted in most databases depending on the functionality of the application, the semantics of the data and the user restrictions.

The issue of standardization is a bit more complex and the objective of this article is not boring into the technical details, so we will try to explain it in a simple and clear way.

Given the simplicity of our example, when carrying out the transformation to the relational model, the resulting tables were obtained within the second and third normal form and this is due to the dependencies between attributes or fields.

There may be dependencies of various types between the fields of a table. The dependencies are inherent properties to the semantic content of the data, forming part of the user restrictions of the relational model.

There are different types of dependencies: functional, multi-valued, hierarchical and in combination.

Functional dependencies

Given table R formed by the fields TOB Y C, representing it of the form R (A, B, C). If B depends functionally on A (then A implies or determines B), if for each value of A there is only a single possible value for B.

Notation: A -> B

example: For the Tour table, a functional dependency exists between the number field and the name field:
number -> name

If it were also true that there can not be two routes with the same name, this would imply that the name can also be current as a key of the route table and therefore also functionally derefines the number:

number <-> name

Multivalued dependencies

From the table R (A, B, C), A multidetermines B if for each value of A there is a well-defined set of possible values ​​in B, independent of the rest of the fields in the table.

Notation: A -> -> B

example: If in the tourist table we would also like to keep the tourist's phone and that some of them indicate more than one number: cell phone, house, office, etc. The multivalued dependence would be given by:

name -> -> phone

For each person we have several telephone numbers.

Normal forms

There are six normalization levels of a table. A table is in one or another degree of normalization if it meets a series of properties (restrictions) that will be explained below.

The first three normal forms were defined by Codd and are based on the functional dependencies that exist between the fields in the table. Subsequently, and based on the multivalued dependencies and in combination, two more levels of normalization were defined, 4th and 5th respectively.

Between the third and the fourth one was defined which is known as the normal form of Boyce-Codd (FNBC) which re-defines the 3rd normal form due to some anomalies found.

In this way the tables in the normal way have more data redundancy than the higher levels and therefore, more data update anomalies. The 5th normal form is the maximum degree of normalization that can be achieved for a table.

First normal form

A table is said to be in the 1st normal form when each field only takes a value from the underlying simple domain. That is, there are no repetitive groups.

We can show an example with the following tables, the first table does not comply with the restriction of the first normal form, however the second one does comply with it.

The first normal form is an inherent constraint to the relational model, so compliance is mandatory for all tables.

Second normal form

It is said that a table is in the second normal form if:

  • It is in the 1st normal form
  • Each field that is not part of the primary key has complete functional dependence with respect to any of its keys.

For example: Tourist (idTurist, name, nationality) where the primary key is idTurist, exists:

name -> nationality

Therefore the Tourist table with those fields is in the second normal form.

Third normal form

It is said that the table is in the 3rd normal form if:

  • It is in the second normal form
  • There is no field that is not part of the primary key that transitively depends on any key in the table.

For example: R (A, B, C) if B -> CR is in the 2nd normal form, since there is no transitive dependence of B or C with respect to A

Analysis

The objective of the analysis is to carry out a process of decomposition by means of successive projections to obtain, from a table, a series of resulting tables, which must comply with the conservation of the information (fields and records), of the dependencies and that there is minimum data redundancy.

In this way the n resulting tables will be equivalent to the initial table and will have less data redundancy because they are at a higher normalization level.

Since normalization seems a complicated issue, we can benefit greatly by understanding the most elementary concepts of normalization.

One of the easiest ways to understand this is to think of our tables as spreadsheets. For example, in the case of tourists' phones, if we add a field to the table for each phone number that we could capture for a tourist, we would have something like this:

Some tourists tourists have one, two or three numbers, not all have the three numbers for which we have capacity in the table.

Just as we have mentioned that the goal of normalization is to eliminate data redundancy, so is the elimination of the number of empty cells.

The realization that the tourist table has a fixed set of fields (name, sex, age, nationality) and a variable set of fields (telefono1, telefono2, telefono3), gives us an idea of ​​how to divide the data into multiple tables that they will be related to each other.

When we talk about the decomposition process to obtain n resulting tables, we see it in the example of the telephones and the process is described with the following figure:

When applying the rules of normalization to the original table R, it is broken down into two tables R1 and R2, then, given the redundancy of R2, we apply the rules and break it down into R3 and R4. All the resulting tables: R1, R2, R3, and R4 retain the attribute A, which is the primary key and with which we will establish the relationship between them, however for each resulting table, the primary key will be a combination of the key primary of the original table and some other field of each resulting table, depending on the constraints.

If we apply the rule of the first normal form for the data table of the example, which tells us to separate the repetitive groups. In this case the repetitive group is the set of fields for telephone, which depend on the id of the tourist, so we separate them together.

Separating into two groups the fields we would have:

1st Group (idTourist, name, sex, age, nationality)
2nd Group (idTurist, telefono1, telefono2, telefono3)

Now for the second normal form, in this we analyze only those groups that have combined keys, since there are no combined keys, we ignore this step.

For the third normal form we examine the interdependencies between non-key fields.

All fields or attributes in each group that are not keys must be examined to verify that there are no interdependencies between them. If some are found, such dependencies must be separated into different groups whose key must be the field of which they are dependent, leaving this key field also in the original group.

For the first group we analyzed that name, sex, age and nationality are dependent on idTurist and there is no interdependence between them, therefore, we ignore the first group.

In the second group the same thing happens, each phone field is dependent on the idTurist. But now not all tourists will have three phones, so we modify the second group and let each record identify a unique telephone for each tourist, in this way the second group will be:

2nd Group (idTurist, telephone)

In it, we make the primary and unique key the combination of both fields, that is, the primary key will be formed by idTurist and telephone. This way we can also avoid that the same number is captured twice for the same tourist, but we can have the same number for different tourists.

Now, we know that for the field sex, there can only be two values ​​(H or M), so we can use a list of values ​​for this field and avoid errors in the capture of it.

In the case of the nationality field, the amount of values ​​that can be had is great, however to avoid errors in the capture with the possible data we could separate it from group 1 and create a new group, to which we will have to assign a unique key :

3rd group (idNationality, nationality)

The first group would be modified in the following way:

1st Group (idTourist, name, sex, age, idNAtionality)

With this we could create a list of values ​​for nationality and avoid errors in the capture.

Given the few fields that we have in the Tour and Realize tables, there are no functional independence and dependence and multivalued in these tables, so there is no need to carry out an additional normalization, these tables are in the 3rd normal form.

Finally the tables to be built in the database are the following:

These are already the normalized tables until the third normal form that will be built in the database. For this the next step is to analyze the types of data that are required for each of the fields.

In the fourth and last part of this article we will talk about the details for the construction of these tables, the data types for the fields, restrictions and validations, as well as the considerations necessary for the creation of presentations (layouts) for the application: , consultation, reports, etc.

0 0 votes
Article Rating
Subscribe
Notify of
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x

JacobSoft

Receive notifications of new articles and tutorials every time a new one is added

Thank you, you have subscribed to the blog and the newsletter

There was an error while trying to send your request. Please try again.

JacobSoft will use the information you provide to be in contact with you and send you updates.