DATABASE MANAGEMENT SYSTEM (DBMS)

#bitscourseware #selflearning #database #DBMS #GATE

1. Basic Introduction

Two-Tier Architecture
Three-Tier Architecture
Three-Schema Architecture
- External Level
- Conceptual Level
- Internal Level
Data Independence
- Logical Data Independence
- Physical Data Independence

2. Data Models

Network Data Model
Hierarchical Data Model
Relational Data Model
Entity-Relationship (ER) Model
Object-Oriented Data Model

3. Entity-Relationship (E-R) Model

Entity Types
Attributes
- Simple and Composite
- Single-valued and Multi-valued
- Derived Attributes
Relationships
Types of Relationships (1:1, 1:N, M:N)
Weak Entities
Participation Constraints
Cardinality Constraints

4. Keys in DBMS

Super Key
Candidate Key
Primary Key
Alternate Key
Foreign Key

5. Functional Dependencies & Normalization

Functional Dependencies (FD)

Trivial and Non-trivial FD
Attribute Closure
Canonical Cover (Minimal Cover)
Armstrong’s Axioms

Decomposition Properties

Lossless Join Decomposition
Dependency Preservation

Normal Forms

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Multivalued Dependencies (MVD)
Fourth Normal Form (4NF)
Join Dependencies
Fifth Normal Form (5NF)

6. Transaction Control & Concurrency

ACID Properties

Atomicity
Consistency
Isolation
Durability

Concurrency Problems

Read-Write (R-W) Conflict
Write-Read (W-R) Conflict
Write-Write (W-W) Conflict
Dirty Read
Lost Update
Non-Repeatable Read

Schedules & Control

Serial and Concurrent Schedules
Conflict Serializability
View Serializability
Recoverability
Cascading Rollback

Concurrency Control Techniques

Two-Phase Locking (2PL)
- Strict 2PL
Timestamp Ordering
Deadlock Handling

7. SQL & Relational Algebra

SQL

Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
Constraints
- NOT NULL
- UNIQUE
- PRIMARY KEY
- FOREIGN KEY
- CHECK
Aggregate Functions
Joins
Nested Queries
Views
Triggers

Relational Algebra

Selection (σ)
Projection (π)
Union
Intersection
Set Difference
Cartesian Product
Join
Division

8. Indexing & File Organization

Primary Index
Clustering Index
Secondary Index
Dense Index
Sparse Index
Hash Indexing
B-Tree
B+ Tree

What is DBMS

Database is a collection of relation data. Pasted image 20260224231336

DBMS Advantages

Why DBMS?

Before DBMS, traditional File Systems were used to store data.

In file-based systems:

Data was stored in separate files.
Each application maintained its own data.
There was no centralized control.
Data redundancy and inconsistency were common.
Data sharing was difficult.
Concurrency handling was weak or manual.
Security mechanisms were limited.

Because of these limitations, Database Management Systems (DBMS) were introduced.

What Problems Did DBMS Solve?

Reduced Data Redundancy
Reduced Data Inconsistency
Improved Data Sharing
Centralized Data Management
Better Data Integrity
Efficient Data Access
Concurrency Control
Enhanced Security
Backup and Recovery Support

Why We Use DBMS?

DBMS follows a Client-Server Architecture.

Data is stored in a centralized database server.
Clients (applications/users) request data from the server.
The DBMS manages storage, retrieval, security, and concurrency.

The data can be stored on:

Local storage (HDD, SSD)
Cloud servers
Distributed systems

Key Advantages of DBMS

1. Concurrency Control

Multiple users can access data simultaneously without corrupting it.

2. Security

Authentication, authorization, and access control mechanisms protect data.

3. Data Integrity

Constraints ensure accuracy and consistency of data.

4. Backup and Recovery

System can recover from crashes and failures.

5. Data Independence

Changes in storage structure do not affect application programs.

6. Reduced Redundancy

Single source of truth reduces duplicate data.

t## Two-Tier and Three-Tier Architecture | Database Management System

1. Two-Tier Architecture (2-Tier System)

A Two-Tier Architecture consists of:

Client Layer (User Machine)
Database Server

In this model:

The client directly communicates with the database server.
Business logic may exist in the client application.
SQL queries are sent directly to the database.

Structure

Client Application <------> Database Server

Real-Life Example

A desktop application like a local banking software installed in a branch office.
The software directly connects to a central MySQL or Oracle database server.

Characteristics

Simple design
Suitable for small-scale systems
Faster communication (direct connection)
Less secure and less scalable compared to 3-tier

2. Three-Tier Architecture (3-Tier System)

A Three-Tier Architecture consists of:

Presentation Layer (Client / UI)
Application Layer (Business Logic Server)
Database Layer (Database Server)

In this model:

The client does NOT directly communicate with the database.
The application server processes business logic.
The database server stores and manages data.

Structure

Client (Frontend)
↓
Application Server (Backend)
↓
Database Server

Real-Life Example

Modern web applications:
- Browser (Frontend)
- Node.js / Django / Spring Boot (Backend Server)
- MySQL / PostgreSQL (Database)

Example:

Instagram
Amazon
Banking apps

Characteristics

Better security (database is not directly exposed)
Better scalability
Easier maintenance
Suitable for large-scale systems

What is Schema? | How to Define Schema

What is Schema?

A Schema is the logical structure or blueprint of a database.

It defines:

Tables
Attributes (columns)
Data types
Relationships
Constraints

Schema describes how data is organized, not the actual data itself.

In RDBMS

In a Relational Database Management System (RDBMS):

Data is represented in the form of tables (relations).
Each table consists of rows (tuples) and columns (attributes).

Example in ER Model

Entity: Student

Attributes:

name
age
roll_no
address

Entity: Course

Attributes:

course_id
course_name
duration

These entities and their attributes form the conceptual schema.

Implementing Schema in SQL

Schema is implemented using DDL (Data Definition Language) commands.

DDL is used to:

CREATE tables
ALTER tables
DROP tables

Example SQL Implementation

CREATE TABLE Student (
    roll_no INT PRIMARY KEY,
    name VARCHAR(100),
    age INT,
    address VARCHAR(200)
);

CREATE TABLE Course (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(100),
    duration INT
);

Three-Schema Architecture | Three Levels of Abstraction

What is Schema?

Schema = Structure of the database.

It defines:

Tables
Attributes
Relationships
Constraints

Schema describes how data is organized.

Three-Schema Architecture

The Three-Schema Architecture is used to separate user view, logical structure, and physical storage of the database.

It consists of three levels:

1. External Schema (View Level)

Also called View Level.
Represents how different users see the database.
Each user can have a different external schema.

Example:

Student view
Faculty view
Dean view

Each view shows only relevant data.

Purpose:

Security
Customization
Data hiding

2. Conceptual Schema (Logical Level)

Describes the complete logical structure of the database.
Defines entities, attributes, relationships, and constraints.
Independent of physical storage.

Models used:

ER Model
Relational Model

Example: Tables like:

Student(roll_no, name, age, address)
Course(course_id, course_name, duration)

This level describes WHAT data is stored.

3. Internal Schema (Physical Level)

Also called Physical Schema.
Describes how data is stored on disk.
Defines:
- File organization
- Indexing
- Storage structures
- Access paths

This level describes HOW data is stored.

Flow of Architecture

Users
↓
External Schema (View Level)
↓
Conceptual Schema (Logical Level)
↓
Internal Schema (Physical Level)
↓
Database (Disk Storage)

Key Concept: Data Independence

Logical Data Independence
Changes in conceptual schema do not affect external schema.
Physical Data Independence
Changes in internal schema do not affect conceptual schema.

What is Data Independence | Logical vs. Physical Independence

Data Independence in DBMS

Data Independence refers to the ability to modify the schema at one level of the database architecture without affecting the next higher level.

It provides transparency to users and application programs.

1. Logical Data Independence

Logical Data Independence is the ability to modify the conceptual schema without affecting the external schema (view level).

The conceptual schema defines the logical structure of the database, such as:

Adding or removing columns
Adding new tables
Modifying relationships

These changes should not require rewriting application programs.

This is achieved using views:

Views are virtual tables.
They display only selected columns or data.
Even if the base table structure changes, the view can remain consistent for the user.

Benefit: Application programs continue to work even when the logical structure of the database changes.

2. Physical Data Independence

Physical Data Independence is the ability to modify the internal (physical) schema without affecting the conceptual schema.

The physical schema defines how data is stored:

File organization
Indexing methods
Storage structures
Disk changes (HDD to SSD, cloud migration, etc.)

Users and application programs do not perceive these backend changes.

Example: A company may change indexing strategy or migrate to a different storage system, but the database structure (tables, attributes) remains the same from the user's perspective.

Benefit: System performance can be improved without modifying application logic.

Why Data Independence is Important

Provides transparency to users
Reduces application maintenance
Improves system flexibility
Prevents frequent rewriting of application programs
Enhances long-term scalability

Integrity Constraints in Database

Integrity Constraints are rules defined in a database to ensure the accuracy, consistency, and reliability of data.

They enforce predefined conditions that data must satisfy before it is stored in the database.

Types of Integrity Constraints

1. Domain Constraints

Domain constraints define the valid set of values that an attribute can take.

They enforce:

Data type (INT, VARCHAR, DATE, etc.)
Size limitations
Range restrictions
Format conditions

Example:

Age must be an integer.
Age must be between 0 and 120.
Email must follow a valid format.

These constraints ensure that attribute values belong to a valid domain.

2. Entity Integrity Constraint

Entity integrity ensures that:

Every table must have a Primary Key.
The Primary Key must be unique.
The Primary Key cannot be NULL.

This ensures that each row (tuple) in a table is uniquely identifiable.

Example: Student(roll_no, name, age)

Here, roll_no as PRIMARY KEY guarantees no duplicate or NULL values.

3. Referential Integrity Constraint

Referential integrity ensures consistency between related tables.

A Foreign Key in one table must match an existing Primary Key in the referenced table.
Or it must be NULL (if allowed).

Example:

Student(roll_no, name)
Enrollment(roll_no, course_id)

If roll_no in Enrollment is a foreign key, it must exist in the Student table.

This prevents orphan records.

4. Key Constraints

Key constraints ensure uniqueness of certain attributes.

Candidate Key: Minimal attribute set that uniquely identifies a tuple.
Primary Key: Selected candidate key.
UNIQUE constraint: Enforces uniqueness but may allow NULL (depending on DBMS).

Example: Email in a User table may have a UNIQUE constraint.

Why Integrity Constraints Are Important

Prevent invalid data entry
Maintain consistency across tables
Avoid duplication
Protect relational structure
Ensure reliable query results

What is Candidate Key and Primary Key?

What is a Key?

A Key is an attribute or a set of attributes that uniquely identifies a tuple (row) in a relation (table).

A key is not just a normal attribute. It must uniquely identify each row in a table.

Example: Student Table

Student(roll_no, email, name, age)

Assume:

roll_no is unique
email is unique

Both can uniquely identify a student.

Candidate Key

A Candidate Key is a minimal set of attributes that uniquely identifies a tuple in a table.

Important properties:

Must be unique
Must be minimal (no unnecessary attribute included)

In the example:

roll_no → Candidate Key
email → Candidate Key

If we combine (roll_no, email), it is still unique, but it is NOT minimal. So it is not a candidate key.

Primary Key

A Primary Key is one candidate key chosen by the database designer to uniquely identify tuples.

Rules:

Cannot be NULL
Must be unique
Only one primary key per table

In the example:

If we choose roll_no as the primary key:

PRIMARY KEY (roll_no)

Then email remains a candidate key (often enforced using UNIQUE).

Simple Difference

Candidate Key = All possible minimal unique identifiers
Primary Key = One selected candidate key

Key Insight

All Primary Keys are Candidate Keys.
But not all Candidate Keys are Primary Keys.

What is Primary Key in DBMS

What is Primary Key in DBMS?

A Primary Key is an attribute (or a set of attributes) that uniquely identifies each row (tuple) in a table.

It is selected from the candidate keys of the table.

Properties of a Primary Key

Must be Unique
No two rows can have the same primary key value.
Cannot be NULL
Every row must have a valid primary key value.
Only One Primary Key per Table
A table can have multiple candidate keys, but only one is chosen as the primary key.

Example

Student(roll_no, email, name, age)

If:

roll_no is unique
email is also unique

Both are candidate keys.

If we choose roll_no as the primary key:

PRIMARY KEY (roll_no)

Then:

roll_no → Primary Key
email → Candidate Key (may use UNIQUE constraint)

Why Primary Key is Important

Uniquely identifies records
Prevents duplicate entries
Enables relationships between tables (via Foreign Keys)
Ensures entity integrity

Foreign Key in DBMS

What is a Foreign Key?

A Foreign Key is an attribute or a set of attributes in one table that refers to the Primary Key of the same table or another table.

It is used to maintain Referential Integrity between tables.

Purpose of Foreign Key

Establish relationship between two tables
Prevent invalid data entry
Maintain consistency across related tables

Example

Student Table

Student(roll_no, name, address)

roll_no	name	address
1	A	Delhi
2	B	Mumbai
3	C	Chandigarh

Here: roll_no is the PRIMARY KEY.

Course Table

Course(course_id, course_name, roll_no)

course_id	course_name	roll_no
C1	DBMS	1
C2	Networks	2

Here: roll_no is a FOREIGN KEY that references Student(roll_no).

SQL Implementation

CREATE TABLE Student (
    roll_no INT PRIMARY KEY,
    name VARCHAR(50),
    address VARCHAR(100)
);

CREATE TABLE Course (
    course_id VARCHAR(10),
    course_name VARCHAR(50),
    roll_no INT,
    FOREIGN KEY (roll_no) REFERENCES Student(roll_no)
);

Insert, Update & Delete from Foreign Key table