Friday, 27 June 2014

File Organization and its type

Posted by Khushboo Arora With No comments

FILE DATA ORGANIZATION

INTRODUCTION

It is concern with how the data is stored in the database.It allows the designer to make decisions of how the database is to be implemented whereas the logical DB design is concerned with what data is stored in the database.

It follows 2 steps:-

1.) Translate the global logical data model for target DBMS-It includes operations like designing of base relation ,derived data and design of constraints.

2.) Design physical representation- It includes operations like analysis transaction,selection of file organization,selection of indexes and estimate the disk space requirements.

TYPES OF FILE ORGANIZATION-The various types of file organization are:-

1.) Heap or un-ordered file organization

2.) Hash file organization

3.) Index sequential access method ( ISAM )

4.) B tree or Binary tree

5.) Clustered tables

1.) Heap or un-ordered file organization-It is the simplest type of file organization in which the record are placed in the file in the same order as they are inserted.

Advantages-

1.) For storing bulky data

2.) Traversing in the form of linear search

Dis-Advantage-

1.) Time consuming process

2.) Hash File Organization- In this records are not stored sequentially in the file instead a hash function is used to calculate the address of the page in which the record is to be stored.The field on which hash function is calculated is known as hash field and if that field act as a key of the relation then it is known as hash key.

Advantages-Search is easy.

Dis-Advantage- Not possible for finding range of values

3.) Index sequential access method ( ISAM )-Data is organized into records which are composed of fixed length and the records are record sequentially. ISAM is a most versatile storage structure than hashing and it provides better functionality when data is retrieve based up on exact key match, pattern matching or a range of values.

4.) B tree or Binary tree-

a.) It is a data structure used to store large amount of information and in this method secondary storage is used to store the leaf nodes of the tree.

b.) The leaf nodes are the only ones that actually stores data icons.

c.) All other nodes are called index nodes or I nodes and these are used to store the guide values which allows us to traverse a tree structure from the root and arrive at the leaf node containing the data item.

d.)The performance of a B tree does not decreases as the relation is updated.

5.) Clustered tables- Clusteres are group of one or more tables physically stored together because they same common columns.

PHYSICAL DATA ORGANIZATION | PRIMARY INDEXES | DENSE INDEX |SPARSE INDEX

Posted by Khushboo Arora With No comments

PRIMARY INDEXES-

a.) A Primary index is an ordered file whose records are of fixed length with two fields.

b.) The first field is of the same data type as the ordered key field called the primary key of the data file.

c.) The second field is a pointer to a disk block(a block address)

d.) There is one index entry (or index record) in the index file for each block in the data file.

e.) Each index entry has the value of the primary key field for the first record in a block and a pointer that block as its two field values.

INDEXES CAN BE CHARACTERIZED AS DENSE OR SPARSE.

1.) DENSE INDEX-A dense index has an index entry for every search key value (and hence entry record) in the data file. A dense index in databases is a file with pairs of keys and pointers for every record in the data file. Every key in this file is associated with a particular pointer to a record in the sorted data file. In clustered indices with duplicate keys, the dense index points to the first record with that key.

2.) SPARSE INDEX- A sparse index on the other hand, has index entries for only some of the search values.A sparse index in databases is a file with pairs of keys and pointers for every block in the data file. Every key in this file is associated with a particular pointer to the block in the sorted data file. In clustered indices with duplicate keys, the sparse index points to the lowest search key in each block.

Therefore a primary index is a non-dense index,since it includes an entry for each disk block of the data file and the keys of its anchor record rather than for every search value(or every record).

Introduction to normalization | NORMALIZATION

Posted by Khushboo Arora With No comments

NORMALIZATION

INTRODUCTION

It is the process of decomposing or splitting the relation into relations with fewer attributes ,thereby minimizing the redundancy of data and minimizing insertion,deletion and updation anomalies.
It is a step by step reversible process of transforming and unnormalized relation into relations with simpler structure.
Normalization works through a series of stages called normal forms.The normal forms are applicable to individual relations.
The relation (table) is said to be in particular normal form if it satisfies a certain sets of constraints.

WHY WE NEED NORMALIZATION?

Normalization consists of a series of guidelines that help to guide you in creating a good database structure.

ANOMALY IN NORMALIZATION

Insert Anomaly- Insert anomaly refers to a situation when one cannot insert a new tuple(row) into a relation due to lack of data. E.G- if we want to insert a student with a value of rollno ,name and phone no attributes to be 28765 ,'kapil' and 78779988 respectively but does not have any hobby.This information cannot be inserted until the student has specified some hobby.This is because the primary key is composed of rollno and hobby attributes.
Delete Anomaly-The delete anomaly refers to a situation where the deletion of data results in unintended loss of some important data.E.G-Suppose the student with roll no 9876 is no longer interested in gardening then in that case deleting the tuple(row) that gives information about varun's gardening hobby will result in loss of vital information that the student varun with roll no 9876 ever existed in the class. This is because there is only one row which contains information about the student with roll no 9876.
Update Anomaly-The update anomaly refers to a situation where an update of a single data value requires multiple rows of data to be updated.E.G- if the student with roll no 7798 changes his phone no 987532567 then it requires updation in both the rows that give information about him.

ADVANTAGES OF NORMALIZATION

Minimizes data redundancy
Greater overall database organization
Data consistency within the database
Much more flexible database design
Enforces concept of referential integrity

DIS-ADVANTAGES OF NORMALIZATION

You cannot start building the database before you know what the user needs.
On normalizing the relations to higher normal form i.e. 4NF,5NF the performance degrades.
It is very time consuming and difficult process in normalizing relations to higher degree.
Careless decomposition may lead to bad design of database which may lead to serious problems.

TYPES OF NORMALIZATION

FIRST NORMAL FORM(1NF)
SECOND NORMAL FORM(2NF)
THIRD NORMAL FORM(3NF)
BOYCE-CODD NORMAL FORM(BCNF)
FOURTH NORMAL FORM(4NF)
FIFTH NORMAL FORM(5NF)

First Normal Form (1NF) | normalization and its types

Posted by Khushboo Arora With 1 comment

FIRST NORMAL FORM(1NF)

A relation is said to be in first normal form, if it satisfies the following rules.

There are no duplicate rows i.e. all the primary key attributes are defined.
There are no repeating groups in the table i.e. each rows and columns intersection contains one and only one value and not a set of vales.

The first normal form does not allows a relations to contain nested relations.

E.G-Consider a relation "record"

St_id	St_name	subject	marks
101	Amrit	Computer Economics Maths	98 56 43
102	Anupreet	Maths Eco	98 99

It is not in 1NF

HOW TO CONVERT IT INTO 1NF?

METHOD 1

STEP 1-To remove repeating values for a column the record relation is converted to a flat relation i.e. STU1 by repeating the pair having composite primary key (St_id,St_name)

STEP 2- For every entry in the table consequently removing the attribute record with its sub-ordinate attributes.

Relation "STU1"

St_id	St_name	subject	marks
101	Amrit	Computer	98
101	Amrit	Economics	56
101	Amrit	Maths	43
102	Anupreet	Maths	98
102	Anupreet	Eco	99

It is in 1NF but this method increases redundancy.

METHOD 2

Another method is to remove the attributes that violates the first normal form and place it in a separate relation along with the primary key.So, un-normalized relation("record") is decomposed into 2 sub-relations .

(stu_details),(stu_performance)

Relation (stu_details)

St_id	St_name
101	Amrit
102	Anupreet

Relation (stu_performance)

St_id	subject	marks
101	Computer	98
101	Economics	56
101	Maths	43
102	Maths	98
102	Eco	99

It is in 1NF.

Anomalies -Insertion,deletion and updation anomaly

Functional Dependency (FD) | Database design

Posted by Khushboo Arora With No comments

FUNCTIONAL DEPENDENCY(FD)

An attribute of an relational 'R' is said to be functional dependent on attribute 'X' if and only if each 'X' value is associated with exactly one value of 'Y' at any given time.

REPRESENTATION-

Where 'X' is known as determinant and 'Y' is known as determined.

EXAMPLE-

HOW TO SHOW FUNCTIONAL DEPENDENCY DIAGRAMMATICALLY-

Consider a relation

S.NO	SNAME	DEPT	CITY
S1	A	10	X
S2	B	20	Y
S3	C	20	Y

Now find various functional dependencies which are

Represent this functional dependency diagrammatically

FULLY FUNCTIONAL DEPENDENCY-

An attribute 'Y' Of relation 'R' is fully functional dependent on attribute 'X' if it is function dependent on 'X' and not functionally dependent on any subset of 'X'.

When primary key is composite then the non key attributes must be identified by the entire key and not by some attributes.

Second Normal Form(2NF) | normalization and its type

Posted by Khushboo Arora With No comments

SECOND NORMAL FORM(2NF)

A relation is in 2NF if it holds these two conditions:-

1. The relation is in 1NF.

2. Every non -key attribute should be fully functional dependent on its primary key.If the primary key consists of a single attribute then the relation is in 2NF.

EXAMPLE- Consider a relation. "order books" where primary key is ('ORDER NO','BOOK NAME')

ORDER NO	BOOK NAME	QUANTITY	PRICE
1	C	15	175
1	DB	20	220
2	.NET	30	200
3	C	40	175
4	DB	15	220

Let's find fully functional dependency,functional dependency and also which are not fully functional dependent in this relation.

In this order no, book name and quantity are fully functional dependent,order no, book name and price are not fully functional dependent because book name is functional dependent on price. Therefore the relation is not in 2NF.

HOW TO MAKE IT IN 2NF-

Partially key dependency should be remove by splitting the 1NF relation into appropriate relations such that every non-key attribute is fully functional dependent on primary key of respective relations.

1. Create a new relation from source relation that contains the attributes that are fully functional dependent on the concatenated primary key.

ORDER NO	BOOK NAME	QUANTITY
1	C	15
1	DB	20
2	.NET	30
3	C	40
4	DB	15

2. Create any relation from the source relation with the attributes that are not fully functionally dependent on the concatenated primary key and the member of the key on which they are dependent.

BOOK NAME	PRICE
C	175
DB	220
.NET	200
C	175
DB	220

Now the relation is in 2NF.

Anomalies-Insertion,deletion and updation anomalies are still there.

Third Normal Form (3NF) |Normalization and its type

Posted by Khushboo Arora With No comments

THIRD NORMAL FORM(3NF)

INTRODUCTION--

A relation is said to be in third normal form(3NF) if both conditions hold simultaneously.

1.) The relation is in Second Normal Form(2NF)

2.) Non-key attributes of the relation should not be transitively dependent on the primary key.

The main purpose of the 3NF is to remove the transitive dependency which is the main cause of anomalies in the 2NF.

EXAMPLE- To explain the 3NF, let us consider the example of STUDENT relation.Here the primary key is Stu_Id.

Stu_Id	Stu_Name	Teach_Id	Teach_Name	Teach_Qual
2523	Anurag	T001	Navathe	Ph.D
3712	Pankaj	T004	Date	M.Tech
4096	Gagan	T001	Navathe	Ph.D
2716	Anshu	T004	Date	M.Tech
1768	Harman	T009	Desai	M.Tech

The STUDENT relation is in 2NF but it suffers from insertion,deletion,updation anomalies.The main reason of these anomalies is because some attributes are transitively functionally dependent on the primary key.This results in redundancy in the relation.The dependencies that exist in STUDENT relation are:-

Here, the Stu_Name is functionally dependent on the attribute Stu_Id. Also Teach_Id is functionally dependent on the attribute Stu_Id. Also, Teach_Name is functionally dependent on Teach_Id so we can say that Teach_Name is transitively functionally dependent on Stu_Id.Similarly, Teach_Qual is transitively functional dependent on Stu_Id attribute.

So to make the relation in 3NF, We have to eliminate this transitive dependence on the primary key.To achieve this

1.) Create a new relation from the source relation that contains all the original attributes but without those attributes that are transitively dependent on the primary key.So in case of STUDENT relation,the resulting relation will contains Stu_Id ,Stu_Name and Teach_Id attributes.The primary key in this relation is same as in the original relation i.e. (Stu_Id).

Stu_Id	Stu_Name	Teach_Id
2523	Anurag	T001
3712	Pankaj	T004
4096	Gagan	T001
2716	Anshu	T004
1768	Harman	T009

2.) The other relation created from the source relation will contain attributes that are transitively functionally dependent on the primary key of the original relation plus the non key attribute from the original source relation on which the transitive functional dependent attributes are functionally dependent. The primary key of the resulting relation will be a non key attribute on which the non key attributes are functionally dependent.

Teach_Id	Teach_Name	Teach_Qual
T001	Navathe	Ph.D
T004	Date	M.Tech
T009	Desai	M.Tech

Now the relation is in 3NF.

DbmsEnotes

A complete collection of Database management system notes,tools and project..

Friday, 27 June 2014

File Organization and its type

PHYSICAL DATA ORGANIZATION | PRIMARY INDEXES | DENSE INDEX |SPARSE INDEX

Introduction to normalization | NORMALIZATION

First Normal Form (1NF) | normalization and its types

Functional Dependency (FD) | Database design

Second Normal Form(2NF) | normalization and its type

Third Normal Form (3NF) |Normalization and its type

Top Popular Posts

Categories

Contact Form

Contributors

Blog Archive

Pages - Menu

Save as PDF