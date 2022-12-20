While running, mobile and web apps generate and consume large amounts of data. Software must therefore be able to save and modify incoming data, find information on request, and keep its interface and back-end interacting correctly. How do you ensure all this? By designing a database. This guide describes five steps of designing a relational database to help you with the task.

A brief overview of database design

A database is a server technology aimed at dynamically storing information for software. Database design includes:

– defining the purpose of database development;

– thinking out the structure;

– deciding on how information will be stored;

– building data relationships.

In this piece, we are talking about designing a relational database at three levels: conceptual, logical, and physical. In a relational database, data is stored in tables interconnected with primary and foreign keys, forming one-to-one (1:1), one-to-many (1:M), and many-to-many (M:M) relationships.

In order to deliver a robust database, developers do the following:

– analyze the requirements and find out what tasks it should resolve;

– think out the database schema;

– arrange data in tables according to the selected type of relationship;

– assign primary keys;

– ensure compliance with relational database rules (normalization).

A correctly designed database keeps data consistent, takes up less disk space, and is easy to administer.

Scrutinizing a database development life cycle

How do you properly organize a space for storing and processing information? Let’s take a look at how the database development life cycle unfolds.

#1. Requirements analysis: why the database is required

Developers analyze requirements for the database to get a better understanding of what the business will use it for and what database model is the most suitable for the project. According to estimates by the database management expert ScaleGrid, SQL is used in 60.5% of cases.

Understanding the purpose of the database helps you get through the next stages of designing it. It is also worth understanding how different users will work with the database. If a team is building a database for a public library, its developers need to think through how librarians and readers will access it. Developing an online store requires considering how e-tailers, buyers, and couriers will interact with its database.

In order to collect information about database use cases, one needs to conduct mini-research:

– study product requirements;

– reconfirm client expectations;

– survey users of the future software;

– analyze working documentation (invoices, item cards, order placement forms, and other artifacts);

– study physical and digital files a business deals with.

These activities yield the data the database will hold. For convenience, you can group this data into types (Clients, Goods, Orders, etc.) and add attributes describing each type (full name, address, phone number, email, etc.). Looking ahead, take the time to split information into separate details: for instance, put the name of the country in its own field so that people can be filtered by place of residence.

The aim of this stage of the database development life cycle is to comprehend and sort through:

– what types of data to include;

– where data comes from;

– how data is used by different users.

Knowing this will facilitate the work on the second stage called planning.

#2. Planning: working on a database schema

To trace how data from different tables is interrelated, developers use a database framework called a schema. Without one, a software team risks going off course, especially on a large-scale project with hundreds of tables.

A schema defines the database structure and allows developers to think out the architecture so that:

– data is formatted consistently;

– a unique primary key is assigned to each entry;

– all essential data is included.

A schema includes table titles, columns with data names and types, primary keys, and top-priority information. It can be displayed like this:

Table 1

Title: Employees

Columns: ID, full name, contact number, position, department.
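As a rough illustration, the outline above could be turned into an actual table definition. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The column names, the data types, and the singular table name `employee` are assumptions made for this example, not part of the outline.

```python
import sqlite3

# Hypothetical names and types derived from the "Employees" outline.
EMPLOYEE_SCHEMA = """
CREATE TABLE employee (
    id             INTEGER PRIMARY KEY,  -- unique identifier of each entry
    full_name      TEXT NOT NULL,
    contact_number TEXT,
    position       TEXT,
    department     TEXT
);
"""

def create_employee_table(conn: sqlite3.Connection) -> list[str]:
    """Create the table and return its column names as reported by SQLite."""
    conn.executescript(EMPLOYEE_SCHEMA)
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

conn = sqlite3.connect(":memory:")
columns = create_employee_table(conn)
print(columns)
```

Reading the columns back from the database is a quick way to confirm that the created table matches the planned schema.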

It is advisable to visualize schemas with database schema diagrams, in particular entity-relationship diagrams (ERDs), to make the information easier to take in.

A database design diagram helps to gain greater insight into three relationship types:

– 1:1 – each record in one table corresponds to at most one record in the other (one passport number can only belong to one person);

– 1:M – one record in the first table can be linked to many records in the second, but not vice versa (one mother can have several children, but one child can only have one mother);

– M:M – records in each table can be linked to many records in the other (several scientists can publish their works in multiple journals).
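Here is a minimal sketch of how the three relationship types might look in SQL, again using Python's built-in sqlite3 module. The table names, column names, and sample rows are all made up for illustration. The passport, mother, and scientist examples come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
-- 1:1: the UNIQUE foreign key lets each person hold at most one passport
CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE passport (
    number    TEXT PRIMARY KEY,
    person_id INTEGER NOT NULL UNIQUE REFERENCES person(id)
);

-- 1:M: many children may reference the same mother
CREATE TABLE mother (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE child (
    id        INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL,
    mother_id INTEGER NOT NULL REFERENCES mother(id)
);

-- M:M: a junction table pairs scientists with journals
CREATE TABLE scientist (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE journal (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE scientist_journal (
    scientist_id INTEGER NOT NULL REFERENCES scientist(id),
    journal_id   INTEGER NOT NULL REFERENCES journal(id),
    PRIMARY KEY (scientist_id, journal_id)
);

-- Made-up sample rows
INSERT INTO person VALUES (1, 'Ann Lee');
INSERT INTO passport VALUES ('P-001', 1);
INSERT INTO scientist VALUES (1, 'Scientist A'), (2, 'Scientist B');
INSERT INTO journal VALUES (1, 'Journal X'), (2, 'Journal Y');
INSERT INTO scientist_journal VALUES (1, 1), (1, 2), (2, 1);
""")

def second_passport_rejected() -> bool:
    """Try to give person 1 a second passport; the UNIQUE constraint should refuse."""
    try:
        conn.execute("INSERT INTO passport VALUES ('P-002', 1)")
    except sqlite3.IntegrityError:
        return True
    return False

# One scientist publishes in several journals, and one journal has several scientists.
journals_of_first_scientist = conn.execute(
    "SELECT COUNT(*) FROM scientist_journal WHERE scientist_id = 1"
).fetchone()[0]
scientists_in_first_journal = conn.execute(
    "SELECT COUNT(*) FROM scientist_journal WHERE journal_id = 1"
).fetchone()[0]
```

The design choice worth noting is that a many-to-many relationship is never stored directly. It is broken into two one-to-many relationships that meet in the junction table.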

Database schemas themselves come in several types:

– Physical – describes how data is physically stored on disk (files, indexes, and storage volume);

– Logical – shows how data is organized in tables;

– View – describes the user’s interaction with the database;

– Star – links several dimension tables through a central fact table;

– Snowflake – a more complex version of a star schema, where dimension tables are additionally connected with sub-dimension tables.
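To make the star schema less abstract, here is a small sketch with one fact table and two dimension tables, written with Python's built-in sqlite3 module. The tables (`fact_sale`, `dim_product`, `dim_date`) and all sample rows are hypothetical and serve only to show the shape of such a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/when" of each fact
CREATE TABLE dim_product (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (id INTEGER PRIMARY KEY, iso_date TEXT);

-- The central fact table references every dimension
CREATE TABLE fact_sale (
    product_id INTEGER REFERENCES dim_product(id),
    date_id    INTEGER REFERENCES dim_date(id),
    amount     REAL
);

-- Made-up sample rows
INSERT INTO dim_product VALUES (1, 'Keyboard', 'Hardware'), (2, 'Editor license', 'Software');
INSERT INTO dim_date VALUES (1, '2022-12-19'), (2, '2022-12-20');
INSERT INTO fact_sale VALUES (1, 1, 50.0), (1, 2, 70.0), (2, 2, 30.0);
""")

# A typical star-schema query: join the fact table to a dimension and aggregate.
totals = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sale AS f
    JOIN dim_product AS p ON p.id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(totals)  # [('Hardware', 120.0), ('Software', 30.0)]
```

A snowflake variant would go one step further and move `category` out of `dim_product` into its own sub-dimension table.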

When a developer knows how many tables to include, how to title them, and how many records should be there, orchestrating data and building appropriate relationships becomes easier.

#3. Creation of tables according to the required type of relationships

When such a frame is ready, it’s time to fill it with the necessary data. Here are the database design tips to link data correctly:

– ensure consistency by assigning a data type to each column (text, number, binary data, etc.);

– assign a primary key – a unique identifier of an object, thing, or person (e.g., user ID);

– analyze the relationships between tables.

Primary keys ensure database integrity, so there are rules to follow when assigning them. A primary key can’t be NULL, can’t repeat within its table, and shouldn’t change over time. A foreign key in Table B must match a primary key value in the Table A it refers to. If a primary key value is deleted or changed, the change must either be propagated to the rows that reference it or be rejected, so that no foreign key is left pointing at a missing record.
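The sketch below shows how a database engine can enforce these key rules. It uses Python's built-in sqlite3 module. The `customer` and `purchase` tables and their rows are hypothetical, and cascading is only one of several possible policies for handling a changed or deleted key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE purchase (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
        REFERENCES customer(id) ON UPDATE CASCADE ON DELETE CASCADE
);
INSERT INTO customer VALUES (1, 'Ann Lee');
INSERT INTO purchase VALUES (10, 1), (11, 1);
""")

def try_insert(sql: str) -> bool:
    """Return True if the INSERT is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return False
    return True

# A repeated primary key value is rejected.
duplicate_key_accepted = try_insert("INSERT INTO customer VALUES (1, 'Bob Ray')")
# A foreign key that points at a missing customer is rejected.
orphan_accepted = try_insert("INSERT INTO purchase VALUES (12, 999)")

# Changing a primary key propagates to the referencing rows (ON UPDATE CASCADE).
conn.execute("UPDATE customer SET id = 2 WHERE id = 1")
referenced_ids = sorted({cid for (cid,) in conn.execute("SELECT customer_id FROM purchase")})

# Deleting the customer removes the dependent purchases (ON DELETE CASCADE).
conn.execute("DELETE FROM customer WHERE id = 2")
purchases_left = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
```

If dependent rows must never disappear silently, `ON DELETE RESTRICT` is the stricter alternative: the delete is refused while references exist.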

When naming tables and columns, follow established database conventions. Don’t use SQL reserved words in names, as they can cause syntax errors. Spaces, hyphens, quotation marks, and other special characters should be avoided too. It is not recommended to name tables with plural nouns, since a table represents a collection of values by default. Unnecessary prefixes and suffixes should also be avoided.

A robust database only has as many tables as required. Follow the reduction strategy to determine the limit: the fewer tables there are, the easier they are to administer.
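A quick experiment shows why these naming rules matter. The sketch below asks SQLite, through Python's built-in sqlite3 module, to create tables with several candidate names. The names are hypothetical, and other database engines may react differently to the same input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def can_create(table_name: str) -> bool:
    """Return True if SQLite accepts the unquoted table name."""
    try:
        # The name is interpolated on purpose: this demo tests the raw identifier.
        conn.execute(f"CREATE TABLE {table_name} (id INTEGER PRIMARY KEY)")
    except sqlite3.OperationalError:  # raised for SQL syntax errors
        return False
    return True

candidates = ["order", "customer-order", "customer order", "customer_order"]
results = {name: can_create(name) for name in candidates}
print(results)
```

In this run only `customer_order` is accepted: `order` is a reserved word, and the hyphen and the space both break the statement's syntax.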

#4. Database normalization

Normalization is vital in database design. It is the process of organizing the database structure according to certain rules. Its purpose is to eliminate redundancy and inconsistency so that data does not contradict itself and no fact is stored more than once.

A database needs to go through the following three stages of normalization:

– First normal form (1NF): each cell holds only one value, the table has a primary key, and there are no duplicate rows;

– Second normal form (2NF): the table is in 1NF, and each non-key column depends on the whole primary key rather than on a part of it;

– Third normal form (3NF): the table is in 2NF, and each non-key column depends only on the key, not on other non-key columns.

If you are tempted to enter two values in one cell, that is a good reason to move the list of values to a second table and link the two tables through a third one. Let’s suppose we have a table that specifies the IDs, titles, genres, and durations of movies. The table has two entries: Logan and Shutter Island. The ID of the first movie is 1, its genres are thriller and drama, and the duration is 141 minutes. The ID of the second movie is 2, its genre is detective, and the duration is 138 minutes.

If we search for movies whose genre is exactly “drama”, nothing will come up because 1NF is violated: the cell next to the title of Logan holds two genres, thriller and drama.

If we edit the table and put each genre in a separate row, we will violate another principle of normalization: eliminating redundancy. The title and duration are now repeated, and it looks as though there are two different movies with the same title.
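The failed search can be reproduced in a few lines. This sketch uses Python's built-in sqlite3 module. The column names and the comma-separated storage format are assumptions, while the movie data is taken from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (
    id           INTEGER PRIMARY KEY,
    title        TEXT,
    genre        TEXT,     -- holds several values in one cell, which violates 1NF
    duration_min INTEGER
);
INSERT INTO movie VALUES (1, 'Logan', 'thriller, drama', 141);
INSERT INTO movie VALUES (2, 'Shutter Island', 'detective', 138);
""")

# An exact-match search cannot see 'drama' inside 'thriller, drama'.
dramas = conn.execute("SELECT title FROM movie WHERE genre = 'drama'").fetchall()
print(dramas)  # []
```

A pattern search such as `LIKE '%drama%'` would find the row, but it would also match unrelated genres that merely contain the same letters, so it only hides the design problem.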

This problem will be fixed if we divide the data from the Movie table into two tables – Movie and Genre – and then link these tables with an intermediate one – Movie_genre.

Such a structure allows a developer to easily add new movies and genres, change durations, and quickly find information via the join operator. A database must be designed following normalization principles; otherwise, the app is likely to malfunction.
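Here is one possible sketch of the three-table structure, using Python's built-in sqlite3 module. The table names follow the text (Movie, Genre, Movie_genre). The column names and the helper function `movies_in_genre` are assumptions introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE movie (
    id           INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    duration_min INTEGER NOT NULL
);
CREATE TABLE genre (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- The intermediate table stores one row per (movie, genre) pair
CREATE TABLE movie_genre (
    movie_id INTEGER NOT NULL REFERENCES movie(id),
    genre_id INTEGER NOT NULL REFERENCES genre(id),
    PRIMARY KEY (movie_id, genre_id)
);
INSERT INTO movie VALUES (1, 'Logan', 141), (2, 'Shutter Island', 138);
INSERT INTO genre VALUES (1, 'thriller'), (2, 'drama'), (3, 'detective');
INSERT INTO movie_genre VALUES (1, 1), (1, 2), (2, 3);
""")

def movies_in_genre(genre_name: str) -> list[str]:
    """Find movie titles by genre by joining the three tables."""
    rows = conn.execute(
        """
        SELECT m.title
        FROM movie AS m
        JOIN movie_genre AS mg ON mg.movie_id = m.id
        JOIN genre AS g ON g.id = mg.genre_id
        WHERE g.name = ?
        ORDER BY m.title
        """,
        (genre_name,),
    )
    return [title for (title,) in rows]

print(movies_in_genre("drama"))  # ['Logan']
movie_count = conn.execute("SELECT COUNT(*) FROM movie").fetchone()[0]
```

Each movie is now stored exactly once, so its duration can be changed in a single place, and a new genre is just one more row in `genre`.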

#5. Data upload and database testing

The final stage is devoted to ensuring that the database is reliable and secure, so that financial, medical, space, or other software built on it operates correctly. Before launching a database, it is necessary to check whether:

Data is displayed correctly in both the interface and the back-end

Does the information entered by a user match the records in the tables? Does a user action on the front-end cause a corresponding action on the server?

Transactions comply with ACID (atomicity, consistency, isolation, durability) properties

Atomicity means a transaction either completes in full or not at all: if any part of it fails, none of its changes are applied. Consistency guarantees that every transaction leaves the database in a valid state. Isolation means that if transactions run simultaneously, the result is the same as if they had run one after another. Durability ensures that the changes made by a completed transaction persist, so external circumstances (power outages, failures, etc.) cannot undo them.
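Atomicity is the easiest of these properties to check in a test. The sketch below uses Python's built-in sqlite3 module. The `account` table, the balances, and the `transfer` helper are hypothetical. The failed transfer deliberately credits the receiver first, so the test can confirm that this partial change is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)  -- consistency rule
);
INSERT INTO account VALUES (1, 100), (2, 50);
""")

def transfer(src: int, dst: int, amount: int) -> bool:
    """Move money between accounts as a single all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
    except sqlite3.IntegrityError:
        return False
    return True

def balances() -> dict:
    return dict(conn.execute("SELECT id, balance FROM account"))

failed = transfer(1, 2, 500)   # the debit would make the balance negative
after_failed = balances()      # the earlier credit must have been undone
succeeded = transfer(1, 2, 30)
after_success = balances()
```

After the failed attempt both balances are unchanged, which is exactly what atomicity promises. Only the second transfer leaves a visible trace.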

Information is updated after changes are made

Data updates carried out in one place must be reflected in all screens, forms, and tables. It is therefore recommended to plan tests covering the basic operations with data: creation, retrieval, update, and deletion (CRUD). It is crucial to make sure that procedures work correctly and that there are no inconsistencies in the data.
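A test covering these four operations can be quite small. Here is a sketch using Python's built-in sqlite3 module. The `client` table, the email addresses, and the `crud_roundtrip` function are made up for illustration. A real project would more likely run such checks inside its own test framework and against its own schema.

```python
import sqlite3

def crud_roundtrip() -> dict:
    """Run create, retrieve, update, and delete once and record what was observed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    select_email = "SELECT email FROM client WHERE id = ?"
    observed = {}

    # Create
    cursor = conn.execute("INSERT INTO client (email) VALUES (?)", ("ann@example.com",))
    client_id = cursor.lastrowid

    # Retrieve
    observed["after_create"] = conn.execute(select_email, (client_id,)).fetchone()

    # Update
    conn.execute(
        "UPDATE client SET email = ? WHERE id = ?", ("ann.lee@example.com", client_id)
    )
    observed["after_update"] = conn.execute(select_email, (client_id,)).fetchone()

    # Delete
    conn.execute("DELETE FROM client WHERE id = ?", (client_id,))
    observed["after_delete"] = conn.execute(select_email, (client_id,)).fetchone()

    conn.close()
    return observed

observed = crud_roundtrip()
print(observed)
```

Each step reads the row back straight from the table, so a mismatch between what was written and what is stored shows up immediately.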

Database schema design is carried out properly

Developers need to check that the database schema organizes data properly, that tables have primary and foreign keys assigned and are named correctly, and that fields have constraints.
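Part of this check can be automated by asking the database to describe itself. The sketch below relies on SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list`, called through Python's built-in sqlite3 module. Other engines expose the same information differently, and the sample tables and the `schema_report` helper are hypothetical.

```python
import sqlite3

def schema_report(conn: sqlite3.Connection) -> dict:
    """Map each table to its primary-key columns and the tables it references."""
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )]
    report = {}
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        pk = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")') if row[5] > 0]
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        refs = sorted({row[2] for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')})
        report[table] = {"primary_key": pk, "references": refs}
    return report

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employee (
    id            INTEGER PRIMARY KEY,
    full_name     TEXT NOT NULL,
    department_id INTEGER REFERENCES department(id)
);
CREATE TABLE note (body TEXT);  -- deliberately lacks a primary key
""")

report = schema_report(conn)
missing_pk = [table for table, info in report.items() if not info["primary_key"]]
print(missing_pk)  # ['note']
```

A report like this will not judge whether the design is good, but it does catch mechanical omissions such as a table that was created without a primary key.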

Triggers work as planned

An event in a certain table can invoke a trigger (an automatically executed piece of code). For instance, adding a student to a Group table can automatically add them to tables with the classes they attend. To verify a trigger, testers perform the operation that fires it, or execute the SQL query embedded in it, and then analyze the result using a SELECT statement.
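The student example could be sketched as follows with Python's built-in sqlite3 module. The table names, the trigger name `enroll_new_member`, and the sample classes are assumptions made for this illustration, and the trigger syntax shown is SQLite's, so other engines will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_class (group_id INTEGER NOT NULL, class_name TEXT NOT NULL);
CREATE TABLE group_member (student_id INTEGER NOT NULL, group_id INTEGER NOT NULL);
CREATE TABLE class_attendee (student_id INTEGER NOT NULL, class_name TEXT NOT NULL);

-- Fires after a student joins a group and enrolls them in the group's classes
CREATE TRIGGER enroll_new_member
AFTER INSERT ON group_member
BEGIN
    INSERT INTO class_attendee (student_id, class_name)
    SELECT NEW.student_id, gc.class_name
    FROM group_class AS gc
    WHERE gc.group_id = NEW.group_id;
END;

-- Made-up sample data: group 1 attends two classes, group 2 attends one
INSERT INTO group_class VALUES (1, 'Algebra'), (1, 'Physics'), (2, 'History');
""")

# Perform the operation that fires the trigger...
conn.execute("INSERT INTO group_member VALUES (42, 1)")

# ...then inspect the result with a SELECT.
enrolled = conn.execute(
    "SELECT class_name FROM class_attendee WHERE student_id = 42 ORDER BY class_name"
).fetchall()
print(enrolled)  # [('Algebra',), ('Physics',)]
```

The test never writes to `class_attendee` itself, so any rows found there can only have come from the trigger.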

The database complies with the business logic specified in the documentation

Testers assess whether the database is built in compliance with the business rules specified in the project documentation, whether its logic is correct, and whether complex functions (like relational constraints, triggers, and stored procedures) effectively maintain it.

Testing a database follows the same procedure as testing any other software: prepare the environment, run a test, analyze the result, report the result. Verifying that a database operates correctly is crucial for preventing severe malfunctions that would only get worse as the database grows.

Four reasons why database design is important

Almost any program is full of data. A computer game contains lots of information about its characters, the location of enemies, map characteristics, and so on. An app for working with clients of an online shop stores full names of buyers, their phone numbers, addresses, and statuses in the loyalty program. In car rental software, there is data on renters, travel history, and geolocation.

Such data must be properly organized and stored so that software works without failures, namely:

– stores, changes, and finds information on request;

– prevents third-party users from accessing personal information (data confidentiality and protection);

– can be scaled smoothly;

– saves data even in case of malfunctions (for instance, automated database backup or other protection methods).

Effective database design is a must for an app’s high performance.

Conclusion

A properly organized database design process makes it possible to create a reliable mechanism for exchanging data between the user interface and the server. Provided that all the stages of designing a database are carried out properly, the resulting solution runs smoothly, quickly entering, finding, and changing information in tables.

Such a database has no random data in random places even if a thousand people use the app simultaneously. An effective database design process organizes complex data in a simple fashion so that information exchange stays fast despite this complexity. Follow these five steps of designing a database, and you will be well placed to achieve your business goals.