A Database Management System (DBMS) offers various essential functions to ensure the effective management and utilization of data. These functions include transaction support, enabling necessary operations on the database, and data communication support, which facilitates data exchange.
Components of DBMS:
1.User Interface
2.Data Definition Language (DDL) and Data Manipulation Language (DML) Compilers
3.System Catalog/Data Dictionary
4.Runtime Database Processor
5.Concurrency Control, Backup, & Recovery Manager
5.Concurrency Control, Backup, & Recovery Manager:
Supports multiple user interaction.
Backup the database (scheduled or nonscheduled).
Recovers the database in case of system crash.
4. Runtime Database Processor:
Executes privileged commands while communicating with system catalog.
Executes the compiled queries.
3.System Catalog/Data Dictionary:
Stores information about the database itself.
Including schema, procedures, functions, triggers, and much more.
Every time a DDL statement is run, the data dictionary becomes modified
2. DDL and DML Compilers:
DDL compiler processes schema definitions and stores descriptions of the schema in the system catalog.
DML compiler is responsible for compiling DML commands into object code to access the database.
1.User Interface:
Web Forms
Application Front Ends
SQL Command Line Interface
Database Administrators (DBA):
Most technical class of database users.
Know how to design and manage the database.
Responsible for making database backups and recovering data in case of problems.
End Users
Naïve Users:
Use database applications.
No interaction with other parts of the database.
Not interested in the inner workings of the database.
Sophisticated Users:
Have some additional rights over the Naïve users.
Can access data in more ways.
They need to be aware of the database structure.
Application Programmers:
Write applications for database users.
Know structure of the database.
Know needs of an organization.
Relational Algebra:
Relational Algebra is collection of algebraic operators
Difference:
Difference of two relations
Same rules as in Union operation apply here.
Notation: Relation1 - Relation2
Example:Difference between the Fname and Lname of Student and Employee.
ΠFname, Lname (Student) - ΠFname, Lname (Employee)
Intersection:
Similar information between two relations
Example:Same rules as in Union operation apply here.
Notation: Relation1 ∩ Relation2
Example: Intersection between Fname and Lname of Student and Employee.
ΠFname, Lname (Student) ∩ ΠFname, Lname (Employee)
Union:
Combine information vertically
Duplicate tuples are eliminated.
Notation: Relation1 U Relation2
Example:List of Fname and Lname from Student and Employee
ΠFname, Lname (Student) U ΠFname, Lname (Employee)
Cartesian Product:
Cross-product to combine two or more relations.
Notation: Relation1 X Relation2
Example: ID, Fname, Lname of students with GPA > 3.4 who signed up for CS courses.
σStudent.ID==Course.Sid ∧ GPA > 3.4 ∧ Dept. == “CS” (Student X Course)
Projection:
Vertical decomposition.
Notation: Π attribute names relation
Example: tudents with Fname and Lname:
ΠFname, Lname Student
Selection:
Horizontal decomposition.
Notation: σcondition relation
Example:
Students with GPA > 3.6:
σGPA > 3.6 Student
Codds Rule
Rule 12: Non-Subversion Rule
If a relational system has or supports a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher-level (multiple-records-at-a-time) relational language
Rule 11: Distribution Independence
The data manipulation sublanguage of a relational DBMS must enable application programs and terminal activities to remain logically unimpaired whether and whenever data are physically centralized or distributed.
Rule 10: Integrity Independence
Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
Rule 9: Logical Data Independence
Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
Rule 8: Physical Data Independence
Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.
Rule 7: High-Level Insert, Update, and Delete
The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update, and deletion of data.
Rule 6: View Updating Rule
All views that are theoretically updatable are also updatable by the system.
Rule 5: Comprehensive Data Sublanguage Rule
A relational system may support several languages and various modes of terminal use.
Rule 4: Dynamic Online Catalog Based on the Relational Model
The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data.
Rule 3: Systematic Treatment of NULL Values
NULL values (distinct from empty character string or a string of blank characters and distinct from zero or any other number) are supported in the fully relational RDBMS for representing missing information in a systematic way, independent of data type.
Rule 2: Guaranteed Access Rule
Each and every datum (atomic value) is guaranteed to be logically accessible by resorting to a combination of table name, primary key value, and column name
Rule 1: The Information Rule
All Information in the relational database is represented in exactly one and only one way – by values in tables.
Constructing an ERD:
Identify Entities:
Remove duplicate entities:
list the attributes of each entity:
Mark the primary keys:
Define relationships:
Describe the cardinality of the relationships:
Remove redundant relationships:
Combine into single diagram.
If there are many to many relationships hiding another entity, introduce a new entity, and turn many to many relationship into 1-M and M-1.
MySQL:
Datatype:
Numeric
String
Date and time
Large storage
Larger storage:
TINYBLOB(M) – Maximum length of 255 bytes.
BLOB(M) – Maximum length of 64K bytes.
MEDIUMBLOB(M) – Maximum length of 16M bytes.
LONGBLOB(M) – Maximum length of 4G bytes.
Date and Time:
DATE – To store dates. For example: 1982-10-23
TIME – To store time values. For example: 10:55:10
DATETIME – To store date/time combinations. For example: 1970-06-18 11:49:59
TIMESTAMP – To store event logging.
String:
CHAR(M) – The range is 0 to 255 characters.
VARCHAR(M) – The range of M is 0 to 65,535.
TEXT – If we are not sure about the length!
Numeric:
Integer – to store whole numbers
Floating point – real numbers
Fixed point – decimal numbers with fixed precision
Functions of DBMS:
1.Data Processing
2.A User Accessible Catalog
3.Transaction Support
4.Concurrency Control Services
5.Recovery Services
6.Authorization Services
7.Support for Data Communication
8.Integrity Services
8.Integrity Services:
Integrity services make sure that the information is real and correct as defined by the rules
7.Support for Data Communication:
DBMS allows the communication of the data
6.Authorization Services:
Who is authorized to access the database?
If a user is authorized then what kind of user actions are authorized?
5.Recovery Services:
A database can enter an inconsistent state due to some invalid action.
DBMS should be able to recover -> minimal data loss.
4.Concurrency Control Services:
DBMS provides concurrency control services to make sure all operations are correct
3.Transaction Support:
DBMS is responsible for supporting all required operations/actions on the database.
2.A User Accessible Catalog:
DBMS provides access to administrative users about database catalog
1.Data Processing:
Operations on the data:
Creation of data
Storing of the data,
Providing access to the data,
Placing data in storage devices, etc.
Normalization:
Transforming data from a problem into relations while ensuring data integrity and eliminating data redundancy
BCNF - Table is in 3NF and every determinant is a candidate key.
3NF - Table is in 2NF and there are no non-key transitive functional dependencies.
2NF - Table is in 1NF and has no partial key dependencies.
1NF - Table contains no repeating attributes or groups of attributes and has a primary key.
External:
Each end user has a data view of the database subject to limited permissions.
View of the data
Can be different, for example DOB
Calculated data, for example age
Barrier/limitations
Restricts user when adding the data or making changes in the data.
Conceptual:
A complete description of the information content of the database.
Represents all entities, attributes, their relationships, and constraints.
Conceptual schema defines present and future needs of an organization.
Internal:
Describes the physical storage structure of the database.
How the data is going to be stored on the disk.
Level of data:
1.Real World Data
2.Metadata
3.Data Occurrences
Data Occurrences: Data Existence
Actual data about real world objects.
Meeting rules defined at the meta data level.
For example: Any student:
Andy, Murphy, 27/09/1983, 39 lower clanbrassil street - Dublin 8
Adam, Brennan, 27/12/1981, some address here - Dublin 7
Metadata: Data about data
It tells you about type, size, and number of attributes for a real world object.
For example: Any student:
First Name, Text, 25 character size field,
Last Name, Text, 25 character size field,
Age, Date type, 8 byte size,
Real World Data: Objects that exist in reality
For example: Any student, Any employee.
Database terms
1.Data
2.Information
3.Relational database terminology
4.Flat database vs. Relational database
5.Database
6.Database Management Systems (DBMS)
7.Structured Query Language (SQL)
SQL:
DBMS use SQL to Interact with the database, Create new tables, Insert data into tables, Update data, Delete data from tables, Retrieve information across tables
DBMS:
Software that manages data and performs certain operations on data
Database:
are the data and the associated rules
Flat vs Relational Database
Flat database:
All the data is stored into a single table.
Data duplication occurs.
For example: A phone directory
Relational database:
Splits the data into several tables.
Reduces data duplication.
For example: A phone directory (broken down into two tables)
Database Terminology:
Attribute: Column or field: Name and data type
Attribute value: Valid value for an attribute
Tuple: Set of attribute values: Row or record
Relation: Table: set of tuples sharing the same attributes
Information: Data converted into a meaningful form is called information
Data: Collection of raw facts collected for a specific purpose
Features of Database
1.Efficiency
2.Accuracy
3.Ease of Updating
4.Security
5.Controlled Redundancy
Introduction:
A database is a shared collection of logically related data, designed to meet the information needs of multiple users in an organization