Database Management System
3.1. Introduction of Database: Database is systematic collection of data. Database management system has numerous advantage over traditional file system such as support multiple users concurrently, can add role, constraints, backup features, multiple users etc.
Commercially, there are DBMS packages available in market. It manages for users to define schema and provide language to manipulate data.
Fundamental characteristics of DBMS is it offers data independence which means changing on one level doesn’t harm on other level. Database architecture are generally 3 tier architecture. They are external (view) level (provide data to application), logical (what data is stored and relationships) and physical layer (how data is stored).
3.2. Database Model: Relational Database Model, E-R Model
Relational database is widely used in web applications. It is necessary to make normalized relation to remove anomaly and inconsistencies of database. Generally 3nf is required to make free from general inconsistencies. 1nf is achieved when data is atomic and no repeated groups. 2nf is achieved by removing partial dependencies. 3nf is achieved by removing transitive dependencies. BCNF is achieved by decomposing relation if there are multiple and overlapped composite keys. 4nf remove multivalued dependencies. Foreign key in any relation must be primary key or any other candidate key of parent table and cascade delete or update is necessary (optional) operation. Selection and projection are basic operation to retrieve rows and display columns respectively. Cardinality refers numbers of rows or tuples, degree refers to number of attributes and arity is number of relations participating in relationship (in ER model).
E-R model is a conceptual model in DBMS. Rectangle denotes entity, diamond represents relationship between two entities, ellipse represent attribute. Dotted ellipse represents derived attribute, double rectangle represent weak entity. Similarly, dotted ellipse represent derived attribute. Double ellipse represent multi-valued attribute.
3.3. Database Design: Logical Design, Conceptual Design, Mapping Conceptual to Logical, Logical to Physical
Logical design is schema design. Conceptual design is constructed by database designer for example ER diagram to find out entities and relationships. To map conceptual to logical means representing entities and relationships in schema, identifying primary keys and adding constraints by designer. And logical to physical specifies storing keys, indexing, file organization which is generally performed by dbms itself for users.
3.4. Normalization: 1NF, 2NF, 3NF, BCNF
Already written.
3.5. Transaction and Transaction Processing, Multi-User & Concurrency, Types of Failure, Backups and Recovery of Database:
ACID property is required in transaction. Atomic means either all or none of a transaction. Consistency means while transaction performing it is in inconsistent state and after committing it must in consistent state. Isolation means there can many transactions and each transaction must isolated from other transaction during simultaneous process. Durability means database must be persist even in case of media system failure.
Concurrency can be achieved lock manager either shared or exclusive through lock manager. It can also be managed by 2PL method.
Types of failure are system failure, media failure and internal error.
Backups are full backup, incremental back (each day compare with previous day) and differential backup (each day compare with full backup day).
Recovery of database: Shadow paging (create a copy of table and update in it), logging (immediate or deferred)
3.6. Query processing and optimization
There are three basic steps: parsing, evaluation of parse tree which is optimization to choose best evaluation plan and finally one evaluation plan is pushed to evaluation engine to generate data. Materialization (generate temp files) and pipelining are methods used in final step of query processing.
3.7. Basic Concept of major RDBMS products: Oracle, SQL Server and MYSQL
3.8. Basic Concept of Object-relational and object-oriented database management system.
3.9. Introduction to Data Warehousing, Data Mining and Big Data: Data warehouse feature is subject oriented, integrated, non-volatile and time-variant. Data store generally from 5-10 years. Data mining has set of steps to find pattern and knowledge in data. Steps are data collection, processing and apply model. Big data is multiple of V: velocity, veracity, volume.
3.10. OLAP and OLTP: OLAP process non-transactional data or historical data. Its size is huge in comparison to OLTP. OLTP is transactional or real-time data of database. Its size is small in comparison to OLAP.