Showing posts with label Data Recovery. Show all posts
Showing posts with label Data Recovery. Show all posts

Friday, 27 June 2014

Decomposition in Database Design | Database Design


DECOMPOSITION

A decomposition means dividing a table into more than one table.
There are 2 types of decomposition.
1.) Losy Decompsition
2.) Lossless Decomposition

1.) LOSY DECOMPOSITION- When we decompose the table,after recombining it  we doesn't get the accurate result or we can say the same relation as that of original.
2.) LOSSLESS DECOMPOSITION- When we decompose the table,after recombining it  we get the accurate result or we can say the same relation as that of original.

EXAMPLE-Consider a relation "PRO"

FACTORY
COMPONENT
PROJECT
GM
Engine
MPC
GM
Gear box
125A
Honda
Engine
125A
GM
Engine
125A

Now decompose this relation into two parts-First part is A1 and second part is A2.

Relation A1

FACTORY
COMPONENT
GM
Engine
GM
Gear box
Honda
Engine

Relation A2

COMPONENT
PROJECT
Engine
MPC
Gear box
125A
Engine
125A

Now again combine these two relations and check either we get the original relation or not. The relation after recombine is

FACTORY
COMPONENT
PROJECT
GM
Engine
MPC
GM
Gear box
125A
Honda
Engine
125A
GM
Engine
125A
Honda
Engine
MPC

Now here we can see that after recombine the relation, we could not get the original relation.An extra tuple is added to the relation which is Honda,Engine,MPC. So, this relation contains losy decomposition.

Relational Algebra in Database management system | Relational Model


RELATIONAL ALGEBRA

INTRODUCTION-
It is the collection of operations used to manipulate a relations or tables.These operations enables the user to specify the retrieval request on one or more relations that result in a new relation without changing the original relation.It is a procedural language.The user has to specify what is required and what are the sequence of steps to get the desired output.

Relational Closure- When operations are performed on existing relations to produce a new relation then the original relation remains unaffected. Therefore the output from one operation can become the input of some other operation.This property is called relational closure.

TYPES OF RELATIONAL OPERATIONS- There are two types of relational operations
1.) Traditional Set operations or Basic Set oriented operations
2.) Special relational Operations


1.) Traditional Set operations or Basic Set oriented operations- 
These include Union,integration ,difference and Cartesian product.These are the binary operations, it means that these can be applied to a pair of relations.

2.) Special relational Operations-
These includes operations like
  • Selection
  • Projection
  • Join
  • Division  
where selection and projection are unary operations and Join and division are binary operations.

Saturday, 1 March 2014

Object based data model | Data model


OBJECT BASED DATA MODEL

DEFINITION:-
It uses the concept such as entities ,attributes and relationship.

What is an entity?- An Entity is the distinct object in the organization and that is to be represented in database.There are 2 types of entities.Strong and weak entity.Entity type that do not have any key attributes are known as weak entity type but regular entity type that have key attributes are known as strong entity.

What is an attribute?- An attribute is that which describe some aspect of the object that we wish to record.
Types of attributes:-
1. Single value attributes:- These are the attribute that have single value.
2. Composite attributes:- It composed of multiple components.e.g

3. Simple attributes:- It is an attribute that composed of a single component with an independent existence. These attributes can't be further sub-divided.e.g age,rollno.
4. Multi-valued attributes: It can hold multiple value for a single attribute. e.g phone no

TYPES OF OBJECT BASED DATA MODELS:-There are 4 types of Object based data models:-
1.) ER(entity relationship) model
2.) Object oriented ER model
3.) Semantic model
4.) Functional model

1.) ER MODEL(entity relationship model)(Top-down approach):-It is the graphical technique i.e. used to convert the requirements of the system to a graphical representation for easy understanding.It is a conceptual model that used the objects as entity, attributes and relationships.Its basic component is ER diagram.It is simple and easy to understand .It can be easily converted to other models.
ER NOTATIONS:-


HOW TO DESIGN AN ER DIAGRAM:- EG. BANKING SYSTEM


2.) OBJECT ORIENTED ER MODEL:-The object oriented data model extends the definition of an entity to include,not only the attributes that describe the state of the object but also the actions that are associated with the object,that is its behavior.The object is said to encapsulate both state and behavior.

3.) SEMANTIC MODEL:- It is used to express greater inter-dependencies among the entities of interest.These inter-dependencies enable the model to represent semantics of the data in DB.This class of data models is influenced by the work of artificial intelligence researches semantic data model. It is developed to organize and represent knowledge not data. In coming times,DBMS will be fully partially intelligent

4.)FUNCTIONAL MODEL:-It describes computation within system(How output is derived from input).

Relationship and their types | Data models


RELATIONSHIPS AND RELATIONSHIP SETS

INTRODUCTION OF RELATIONSHIPS-

After two or more entities are identified and defined with attributes, the participants determine if a relationship exists between the entities. A relationship is any association,linkage, or connection between the entities of interest to the business; it is a two-directional, significant association between two entities, or between an entity and itself.Each relationship has a name, an optional (optional or mandatory), and a degree (how many).

A relationship is described in real terms.Relationship represents an association between two or more entities.Entities enrolled in a relationship are called its participants

EXAMPLE-   Employees are assigned to projects
                    Projects have sub-tasks
                    Departments manage one or more projects

TYPES OF RELATIONS-
1.) 1:M(One to many)-In a one-to-many relationship, each row in the related to table can be related to many rows in the relating table. This allows frequently used information to be saved only once in a table and referenced many times in all other tables.
EXAMPLE-In class there is only one faculty member and there are number of children. This is an example of one to many relationship.


                                     (ONE-MANY RELATIONSHIP)

2.)M:M (Many to many relationship)-A many-to-many relationship refers to a relationship between tables in a database when a parent row in one table contains several child rows in the second table, and vice versa. Many-to-many relationships are often tricky to represent.
EXAMPLE- There are so many skills of a person and its vice versa.

                                      (MANY-MANY RELATIONSHIP)

 3.) M:1(Many to one relationship)- A many-to-one relationship is where one entity (typically a column or set of columns) contains values that refer to another entity (a column or set of columns) that has unique values. In relational databases, these many-to-one relationships are often enforced by foreign key/primary key relationships, and the relationships typically are between fact and dimension tables and between levels in a hierarchy.
EXAMPLE- In any company there are no of stores and for one store there should be one head employee.

4.) 1:1(One to one relationship)-In a one-to-one relationship, each row in a table is related to a single row in another table.
EXAMPLE- 
In a warehouse application a storage-bin table might have a one-to-one relationship with a widget table. This application would model a physical warehouse where each storage bin contains one type of widget and each widget resides in one storage bin.

Relational model | Data models


RELATIONAL MODEL

INTRODUCTION-
Relational model stores the data in the form of table. It consists of three major components.
1.) The set of relations and set of domains that defines the way data can be represented.
2.) Integrity rules that defines the procedure to protect the data.
3.) The operations that can be performed on the table.

CHARACTERISTICS-
Characteristics of relational database are:-
1.) All data is conceptually represented as an orderly arrangement of data into rows and column called a relation or table.
2.) All the values are scaler i.e. at any given row or column position there is one and only one value.
3.) All operations are performed on the entire relation and the result is an entire relation.

BASIC TERMINOLOGY OF RELATIONAL DATABASE-  

1.) TUPLE OF A RELATION-Each row in a table is known as tuple.
2.) CARDINALITY OF A RELATION-  It is the no of tuples in the relation.
3.) DEGREE OF A RELATION-No of columns is known as degree of a relation.
4.) DOMAIN OF A RELATION-It defines the kind of data represented by the attribute.It is the set of all possible values that an attribute may contain.
5.) BODY OF THE RELATION-It consists of an unordered set of 0 or more tuples.

EXAMPLE:- Consider a table


1.) TUPLE OF A RELATION 8
2.) CARDINALITY OF A RELATION-  8
3.) DEGREE OF A RELATION5
4.) DOMAIN OF A RELATION- All the data entered in the table is the domain of a relation.

KEYS OF A RELATION-
1.) PRIMARY KEY- It is the key that uniquely identifies a record. It doesn't have null values.

2.) FOREIGN KEY- It refers to the primary key of some other table.It permits only those values which appear in the primary key of the table to which it refers.

Hierarchical model (Record based data model) | Data models


HIERARCHICAL MODEL(RECORD BASED DATA MODEL)

INTRODUCTION-
It is developed to manage large amounts of data for complex manufacturing projects.
                             (STRUCTURE OF HIERARCHICAL MODEL)

This structure contains levels or segments.The top layer is known as the parent of the segment .Root segment is the parent of level 1 segment.This model defects a set of one to many relationship between parent and children segments.E.G COLLEGE STRUCTURE


ADVANTAGES OF HIERARCHICAL MODEL:-
1.) It provides simplicity and it is easy to design.
2.) It also provide data integrity.Since, it is based on parent-child relationship.So, there is always a link between these segments.

DRAWBACKS OF HIERARCHICAL MODEL:-
1.) There exist operational anomalies in this model.
2.) Implementation Complexity
3.) It is easy to design but very difficult to implement.
4.) Database Management System-If you make any change in the database structure then you need to make all the necessary changes in all the application program that access the database.

Data models and their types | database managenent system


DATA MODELS

INTRODUCTION:-
It can be  defined as an integrated collection of concepts for describing and manipulating data,relationship between data and the constrains of the data in any organization.
It consists of three components:-
1.)Structured part:- It consist of set of rules according to which database can be connected.
2.) Manipulative part:- It defines the types of operations that are allowed on data.
3.) Set of integrity rules:-It ensures that the data is accurate.

PURPOSE OF DATA MODEL:- Purpose of data model is to represent the data and to make it understandable.

TYPES OF DATA MODEL:-
There are 3 types of data models:-
1.) Object based data models
2.) Physical data models
3.) Record based

1.) OBJECT BASED DATA MODELS:-It uses the concepts such as entities,attributes and relationship.It is of further 4 types:-
1.) ER(entity relationship)model
2.) Object oriented ER model
3.) Semantic model
4.) Functional model

2.) PHYSICAL DATA MODEL:-It describes how the data is stored in the computer representing the information such as record structure ,record modeling and access paths.

3.) RECORD BASED DATA MODEL:- These are used in describing the data at logical and view levels.These are used to specify the overall logical structure of the database.Record database models are of 3 types which are:-
1.) Hierarchical model
2.) Network model
3.) Relational model

Tuesday, 25 February 2014

Timestamp and its protocol |Concurrency Control


TIMESTAMP BASED PROTOCOLS-

This protocol require that we must have prior knowledge about the order in which the transaction will be accessed.In order to ordering the transaction, we associate a unique fixed timestamp ,denoted by TS(Ti) to each transaction like Ti.This time stamp is assigned by the database system before the transaction Ti starts execution.If a transaction Ti has been assigned timestamp TS(Ti) ,and a new transaction Tj enters the system then

                                                  TS(Ti)<TS(Tj)

DRAWBACKS OF TIMESTAMP-

1. Each value stored in the database requires two additional timestamp fields one for the last time the field was read and one for the last update.

2. It increases the memory requirements and the processing overhead of database.

IMPLEMENTATION-

To implement this scheme ,we associate with each data item Q two timestamp values:-

1. W-timestamp(Q) denoted the largest timestamp of any transaction that executed write(Q) successfully.
2.  R-timestamp(Q) denoted the largest timestamp of any transaction that executed read(Q) successfully.

TIMESTAMP ORDERING PROTOCOL-

The timestamp-ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.This protocol operates as follows-


1. SUPPOSE A TRANSACTION Ti ISSUES READ(Q)

(a) If TS(Ti)<W-timestamp(Q),then Ti needs to read a value of Q that was already overwritten.Hence the read operation is rejected and Ti is roll back.(Q)

(b) If TS(Ti)>=W-timestamp(Q),then read operation is executed and R-timesatmp(Q) is set to the maximum of R-timestamp(Q) i.e. TS(Ti).


2. SUPPOSE A TRANSACTION Ti ISSUES WRITE(Q)

(a) If TS(Ti)<R-timestamp(Q),then the value of Q that Ti is producing was needed previously and the system assumed that value would never be produced.Hence the write operation is rejected and Ti is roll back.(Q)

(b) If TS(Ti)<W-timestamp(Q),then Ti is attempting to write an obsolete value of Q. Hence the write operation is rejected and Ti is roll back.(Q).

(c) Otherwise the write operation is executed and W- timestamp(Q) is set o TS(Ti).

Cascading Rollback |Types of Cascading Rollback |Concurrency Control


CASCADING ROLLBACK

As shown in partial schedule each transaction observes two phase locking protocol. EG-


                    T5                 T6                 T7

                  Lock-X(A)                                                       
                  Read(A,a)
                  Lock-S(B)                                                       
                  Read(B,b)
                  Write(A,a)
                  Unlock(A)
                                        Lock-X(A)                                                       
                                        Read(A,a)
                                        Write(A,a)
                                        Unlock(A)
                                                               Lock-S(A)                                                       
                                                                Read(A,a)


                  Rollback



Let us consider if transaction T5 fails after the read(A,a) operation of transaction of T7. Then, T5 must be rolled back which also result into rolled back T6 & T7. Because transaction T6 & T7 reads the value modified by transaction T5. Thus,we can say that rollback of T5 results into rollback of T6 & T7 also. This problem is called a cascading of rollback.

SOLUTIONS TO AVOID CASCADING OF ROLLBACKS: -

1.) STRICT TWO PHASE LOCKING PROTOCOL:- It requires that in addition to locking being Two phase, all exclusive mode locks taken by a transaction must be held until that transaction commits.

2.) RIGOROUS TWO PHASE LOCKING PROTOCOL:- It requires all locks(Shared and Exclusive) to be held until the transaction commits.It can be easily verified that,this transaction can be serialized in the order in which they commit.                                                

Deadlock and their prevention |Concurrency Control


DEADLOCK-
In order to understand deadlock, let us consider the following example:
           
 T1                T2                CONCURRENCY CONTROLLER

 lock-X(A)                         Lock granted
                     lock-X(B)     Lock granted
 lock-X(B)                        Request is queued i.e T1 wait for T2 to                              release the lock on data item B.

                  lock-X(A)        Request is again queued i.e T2 will                                                 wait for T1 to release the lock on data                                                                                               item A.

As shown in partial schedule transaction T1 is waiting for transactions T2 to release its lock on data item B and transaction T2 is waiting for transaction T1 to release its lock on data item A. Such a cycle of transactions waiting for locks to be released is called a Deadlock.

DEADLOCK CONDITIONS:
1. Circular wait
2. No preemption
3. Mutual exclusion
4. Hold & wait

HOW TO PREVENT FROM DEADLOCK?

IST APPROACH:- To allow every transaction lock all the data items, it needs before it starts execution and if any of the data items can't be obtained then none of the data items should be locked.

DRAWBACKS: It is very difficult to acquires locks on 3 data items simultaneously.

2ND APPROACH:- It involves assigning an arbitrary serial order on the data item and ensure that data items are locked by the transaction in the order.

DRAWBACKS: It reduces concurrency.

3RD APPROACH:- Based on the concept of Time Stamp Values.This approach is used to decide whether a transaction has to wait or abort & rollback.


TIME STAMP VALUES
A time stamp is a unique identifier assigned to each transaction. It usually based on the order in which the transactions are started.

Two methods under 3RD Approach:-
1.) Wait Die- If T1 is the requesting transaction and T2 is requested transaction then
CASE 1 - t(T1)<t(T2) then T1 is allow to wait.
CASE 2 - t(T1)>t(T2) then T1 is abort and rollback.
where t represents time stamp value.

2.) Wound Wait-If T1 is the requesting transaction and T2 is requested transaction then
CASE 1 - t(T1)<t(T2) then T1 is abort and rollback.
CASE 2 - t(T1)>t(T2) then T1 is allow to wait.
where t represents time stamp value.


DEADLOCK DETECTION:
For detecting a deadlock information requires are:
1.) The current set of transactions.
2.) Information about the current allocations of the data item to each of the transactions.
3.) Current set of data items for which each transaction is waiting.
Wait for graph method is used.In this we have to make graph for all transaction.Each node represent the active transaction and edges represents the waiting request.If the cycle is formed then it is in deadlock otherwise not.


DEADLOCK RECOVERY:
There are three methods in which deadlock will recover which are:-

1.) SELECTION OF VICTIM: We should rollback those transactions that will incur the minimum cost. When a deadlock is detected the choice of which transaction to abort can be made using following criteria:
1.) The transaction which have the fewest locks
2.) The transaction that done the least work
3.) The transaction that is farthest from competition.

2.) ROLLBACK: Once we have decided that a particular transaction must be rolled back, we must determine how far this transaction should be rolled back.The simplest solution is to total rollback;Abort the transaction and then restart it. However it is necessary to rollback the transaction only as far as necessary to break the deadlock. But this method requires the system to maintain additional information about the state of all the running system.

3.) PROBLEM OF STARVATION: In a system where the selection of victim is based primary on cost factors, it may happen that the same transaction is always picked as a victim. As a result this transaction never completes its designated task. This situation is called Starvation. The most common solution is to include the number of rollbacks in the cost factor. 

Introduction to Database management system | Database management system


DATABASE
A database is a collection of related data.By data we mean known facts that can be recorded and that have implicit meaning. 
              EG. Consider the names,tel-phone numbers and address of the people you know. you may have recorded this data in an indexed address book or you may have stored it on a hard drive,using a personal computer and software such as Microsoft access or excel. This collection of related data with an implicit meaning is a database.

DATABASE MANAGEMENT SYSTEM- A DBMS is a collection of programs that enables users to create and maintain a database.The DBMS is a general purpose software system that facilitates the processes of defining,constructing, manipulating and sharing databases among various users and applications.Defining a database involves specifying the data types,structures and constraints of the data to be stored in the database.

OBJECTIVES OF DBMS:-
1.) To solve growing business needs of any organization .
2.) Store,retrieve or update the data.
3.) Sharing data among various users.
4.) Providing security to the database.
5.) Recovery of data in case of any failure.

REQUIREMENTS OF DBMS:-
1.) It provides data definition language.
2.) It should provide facilities for storing,updating or deleting the data.
3.) It should support multiple views of a data.
4.) It provides facilities for specifying integrity constraints so that only valid information is stored in the database.
5.) It should provide security of data and prevent un-authorized access and update.
6.) It should provide con-currency control mechanisms.
7.) Facilities for database recovery and maintenance.


MASTER FILE AND TRANSACTION FILE IN DB-

MASTER FILE-It stores static data.It changes occasionally and stores all the details of the object.

TRANSACTION FILE-The file that contains data about the costumer transaction is known transaction file.It is a dynamic file.

INSTANCE AND SCHEMA OF DB-
Collection of information stored in the database at any particular movement is known as instance of the database.The overall design of the database and its description is known as schema and it does not change frequently.

SUB-SCHEMA-It is a subset of schema and inherits the same properties that a schema has.

File system Versus DBMS |Introduction to Database management system


FILE SYSTEMS VERSUS A DBMS-

FILE SYSTEM
Also known as traditional file processing system.File processing systems was an early attempt to computerize the manual filing system that we are all familiar with. A file system is a method for storing and organizing computer files and the data they contain to make it easy to find and access them.
                            File systems may use a storage device such as a hard disk or CD-ROM and involve maintaining the physical location of the files. EG- In BANK there is a file name "accounts file".If anyone wants to search they have the ability to search the accounts of all users.

Characteristics of File Processing System:-
1.) It is a group of files storing data of an organization.
2.) Data stored in the simple files known as flat files.
3.) Data is stored in hard-disk or any CD-ROM.
4.) Each file is independent from one another.
(File Management System)

DBMS:-A DBMS is a collection of programs that enables users to create and maintain a database.The DBMS is a general purpose software system that facilitates the process of defining,constructing manipulating and sharing databases among various users and applications.Defining a database involves specifying the data types,structures and constraints of the data to be stored in the database.Only user can access his own data.

ADVANTAGE-The advantages of DBMS are:-
1.Security
2. Non-redundancy
3. Consistency
4. Correctness Isolation


                                  (Data Base Management System)

Data Independence |Introduction to Database management system


DEFINITION-
Data independence means upper layers are unaffected by any change in the lower layers.

TYPES OF DATA INDEPENDENCE-
There are 2 types of data independence.
1. LOGICAL DATA INDEPENDENCE
2. PHYSICAL DATA INDEPENDENCE

1. LOGICAL DATA INDEPENDENCE- If conceptual level changes no effect on external level.

2. PHYSICAL DATA INDEPENDENCE-If Internal level changes no effect on conceptual level.


 (3 LEVEL ARCHITECTURE)

Architecture of database System | Introduction to Database management system


DATABASE SYSTEM ARCHITECTURE
Three layers of architecture of DBMS:-
1. External level/View level
2. Conceptual level/Logical level
3. Internal level/Physical level

1. External level/View level-It is the user view and it shows the data that is relevant to the user. It deals with the way in which individual user view the data.It hides the details of internal and conceptual level.

2. Conceptual level/Logical level-It describes what data is stored in the database and the relationship among them.It contains the logical structure of the database as seen by the DBA.

3. Internal level/Physical level- It is the physical representation of the database. It covers the data structures used to store the data on storage devices,It also deals with storage space allocation.


                                  (3-LEVEL ARCHITECTURE)

OBJECTIVES OF 3 LEVEL ARCHITECTURE-
1. Each user access the same data but have a different view.
2. No dealing with physical storage details.
3. DBA changes database storage structure without effecting users view.
4. The main objective of this 3 level architecture is to provide DATA INDEPENDENCE.