2. A B+ tree can contain a maximum of 7 pointers in a node. What is the minimum      number of keys in leaves?           a. 6           b. 3           c. 4           d. 7    3. Which of the following is false?           a. A B+ -tree grows downwards           b. A B+ -tree is balanced           c. In a B+ -tree, the sibling pointers allow sequential searching           d. B+ -tree is shallower than B-tree    4. Which of the following is true?           a. larger the order of B-tree, less frequently the split occurs           b. larger the order of B-tree, more frequently the split occurs           c. smaller the order of B-tree, more frequently the split occurs           d. smaller the order of B-tree, less frequently the split occurs    5. The index consists of_________.           a. list of keys           b. pointers to the master list           c. both (a) and (b)           d. All of these    Answer  1. (c) 2. (b) 3. (a) 4. (a) 5. (c)    7.7 REFERENCES    Text Books:      • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson           Education, New Delhi.      • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi.      • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley           Publications, New Delhi.    Reference Books:      • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts,           McGraw-Hill, International Edition.      • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications.                                                                                101                                        CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 8: FILE ORGANIZATION 2    Structure  8.0 Learning objective  8.1 Hashing, Hashing Functions, Collision Resolution  8.2 Extendible Hashing  8.3 Dynamic Hashing Approach Implementation and Performance  8.4 Summary  8.5 Keywords  8.6 Learning Activity  8.7 Unit End Questions  8.8 References    8.0 LEARNING OBJECTIVE    After studying this unit, student will be able to      ▪ Explain the concepts in Hashing technique, how to use hash functions and Collision           resolution.      ▪ Describe the performance and implement the extendible hashing and Dynamic           Hasing.    8.1 HASHING, HASHING FUNCTIONS, COLLISION RESOLUTION    Hashing technique is used to calculate the direct location of a data record on the disk without  using an index structure. In this technique, data is stored at the data blocks whose address is  generated by using the hashing function. The memory location where these records are stored  is known as a data bucket or data blocks.  In this, a hash function can choose any of the column value to generate the address. Most of  the time, the hash function uses the primary key to generate the address of the data block. A  hash function is a simple mathematical function to any complex mathematical function. We  can even consider the primary key itself as the address of the data block. That means each  row whose address will be the same as a primary key stored in the data block.                                          102    CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.1: Data Block Address Representation  Figure 8.1 shows data block addresses same as primary key value. This hash function can  also be a simple mathematical function like exponential, mod, cos, sin, etc. Suppose we have  mod (5) hash function to determine the address of the data block. In this case, it applies mod  (5) hash function on the primary keys and generates 3, 3, 1, 4 and 2 respectively, and records  are stored in those data block addresses.                                  Figure 8.2: Hashing Key Representation                        103  Types of Hashing:                                                          CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.3: Types of Hashing    Static Hashing:  In static hashing, the resultant data bucket address will always be the same. That means if we  generate an address for EMP_ID =103 using the hash function mod (5) then it will always  result in same bucket address 3. Here, there will be no change in the bucket address.  Hence in this static hashing, the number of data buckets in memory remains constant  throughout. In this example, we will have five data buckets in the memory used to store the  data.    Figure 8.3: Static Hashing Representation    Operations of Static Hashing  Searching a record: When a record needs to be searched, then the same hash function  retrieves the address of the bucket where the data is stored.                                               104    CU IDOL SELF LEARNING MATERIAL (SLM)
Insert a Record: When a new record is inserted into the table, then we will generate an  address for a new record based on the hash key and record is stored in that location.  Delete a Record: To delete a record, we will first fetch the record which is supposed to be  deleted. Then we will delete the records for that address in memory.  Update a Record: To update a record, we will first search it using a hash function, and then  the data record is updated.  If we want to insert some new record into the file but the address of a data bucket generated  by the hash function is not empty, or data already exists in that address. This situation in the  static hashing is known as bucket overflow. This is a critical situation in this method.    To overcome this situation, there are various methods. Some commonly used methods are as  follows:  Open Hashing: When a hash function generates an address at which data is already stored,  then the next bucket will be allocated to it. This mechanism is called as Linear Probing.  For example: suppose R3 is a new address which needs to be inserted, the hash function  generates address as 112 for R3. But the generated address is already full. So the system  searches next available data bucket, 113 and assigns R3 to it.                                            Figure 8.4: Open Hashing    Close Hashing: When buckets are full, then a new data bucket is allocated for the same hash  result and is linked after the previous one. This mechanism is known as Overflow chaining.  For example: Suppose R3 is a new address which needs to be inserted into the table, the hash  function generates address as 110 for it. But this bucket is full to store the new data. In this  case, a new bucket is inserted at the end of 110 buckets and is linked to it.                                          105    CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.5: Close Hashing    Quadratic probing:  Quadratic probing is very much similar to open hashing or linear probing. Here, the only  difference between old and new bucket is linear. Quadratic function is used to determine the  new bucket address.  Quadratic probing is an open-addressing scheme where we look for i2‘th slot in i’th iteration  if the given hash value x collides in the hash table.  How Quadratic Probing is done?  Let hash(x) be the slot index computed using the hash function.    If the slot hash(x) % S is full, then we try (hash(x) + 1*1) % S.  If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S.  If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S.  This process is repeated for all the values of i until an empty slot is found.    Quadratic Probing uses a hash function of the form  h (k,i) = (h' (k) + c1i + c2i2) mod m  Where (as in linear probing) h' is an auxiliary hash function c1 and c2 ≠0 are auxiliary  constants and i=0, 1...m-1. The initial position is T [h' (k)]; later position probed is offset by  the amount that depend in a quadratic manner on the probe number i the method of quadratic  probing is found to be better than linear probing. However, to ensure that the full hash table is  covered, the values of c1, and c2 are constrained. It may happen that two keys produce the  same probe sequence such that:  h(k1, i) = h(k2, i)  Consider inserting the keys 74, 28, 36,58,21,64 into a hash table of size m =11 using  quadratic probing with c1=1 and c2=3. Further consider that the primary hash function is h'  (k) = k mod m.                                          106    CU IDOL SELF LEARNING MATERIAL (SLM)
01234567                                                             89    10   ////////                                                             //    /                                             Figure 8.6: Initial State    89    10                                                                        74 /  /  This is the initial state of hash table  Here c1= 1 and c2=3                                                   89    10  h (k, i) = [k mod m + i + 3i2 ] mod m                                 74 /  /  Insert 74.  h (74,0)= (74 mod 11+0+3x0) mod 11                                    89    10  = (8 +0+0) mod 11 = 8                                                 74 /  /  T [8] is free; insert the key 74 at this place.                                                                                   107   01234567   ////////                                          Figure 8.7: After inserting 74  Insert 28.  h (28, 0) = (28 mod 11 + 0 + 3 x 0) mod 11  = (6 +0 + 0) mod 11 = 6.  T [6] is free; insert key 28 at this place.     01234567   / / / / / / 28 /                                          Figure 8.8: After inserting 28  Insert 36.  h (36, 0) = (36 mod 11 + 0 + 3 x 0) mod 11  = (3 + 0+0) mod 11=3  T [3] is free; insert key 36 at this place.     01234567   / / / 36 / / 28 /                                          Figure 8.9: After inserting 36    Insert 58.  h (58, 0) = (58 mod 11 + 0 + 3 x 0) mod 11  = (3 + 0 + 0) mod 11 = 3  T [3] is not free, so next probe sequence is computed as  h (59, 1) = (58 mod 11 + 1 + 3 x12) mod 11    CU IDOL SELF LEARNING MATERIAL (SLM)
= (3 + 1 + 3) mod 11  =7 mod 11= 7  T [7] is free; insert key 58 at this place.    0 1 2 3 4 5 6 7 8 9 10  / / / 36 / / 28 58 74 / /                                         Figure 8.10: After inserting 58  Insert 21.  h (21, 0) = (21 mod 11 + 0 + 3 x 0)  = (10 + 0) mod 11 = 10  T [10] is free; insert key 21 at this place.    0 1 2 3 4 5 6 7 8 9 10  / / / 36 / / 28 58 74 / 21    Figure 8.11: After inserting 21    Insert 64.  h (64, 0) = (64 mod 11 + 0 + 3 x 0)  = (9 + 0+ 0) mod 11 = 9.  T [9] is free; insert key 64 at this place.  Thus, after inserting all keys, the hash table is    0 1 2 3 4 5 6 7 8 9 10  / / / 36 / / 28 58 74 64 21                                      Figure 8.12: After inserting all keys  Double Hashing:  Double Hashing is another method similar to linear probing. Here the difference is fixed as in  linear probing, but this fixed difference is calculated by using another hash function. That’s  why the name is double hashing.  Double Hashing is one of the best techniques available for open addressing because the  permutations produced have many of the characteristics of randomly chosen permutations.  Double hashing uses a hash function of the form  h (k, i) = (h1(k) + i h2 (k)) mod m  Where h1 and h2 are auxiliary hash functions and m is the size of the hash table. h1 (k) = k  mod m or h2 (k) = k mod m'. Here m' is slightly less than m (say m-1 or m-2)    let hash(x) be the slot index computed using hash function.                                                                          108    CU IDOL SELF LEARNING MATERIAL (SLM)
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S  If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S  If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S    Example: Consider inserting the keys 76, 26, 37,59,21,65 into a hash table of size m = 11  using double hashing.  Consider that the auxiliary hash functions are h1 (k)=k mod 11 and h2(k) = k mod 9.  Solution: Initial state of Hash table is    0 1 2 3 4 5 6 7 8 9 10  / / / / / / / / / //                                             Figure 8.13: Initial State  Insert 76.  h1(76) = 76 mod 11 = 10  h2(76) = 76 mod 9 = 4  h (76, 0) = (10 + 0 x 4) mod 11  = 10 mod 11 = 10  T [10] is free, so insert key 76 at this place.    0 1 2 3 4 5 6 7 8 9 10  / / / / / / / / / / 76                                         Figure 8.14: After Inserting 76  Insert 26.  h1(26) = 26 mod 11 = 4  h2(26) = 26 mod 9 = 8  h (26, 0) = (4 + 0 x 8) mod 11  = 4 mod 11 = 4  T [4] is free, so insert key 26 at this place.    0 1 2 3 4 5 6 7 8 9 10  / / / / 26 / / / / / 76    Figure 8.15: After Inserting 26    Insert 37.  h1(37) = 37 mod 11 = 4  h2(37) = 37 mod 9 = 1  h (37, 0) = (4 + 0 x 1) mod 11 = 4 mod 11 = 4                                                                                        109    CU IDOL SELF LEARNING MATERIAL (SLM)
T [4] is not free, the next probe sequence is                         8  9 10  h (37, 1) = (4 + 1 x 1) mod 11 = 5 mod 11 = 5                         /  / 76  T [5] is free, so insert key 37 at this place.                                                                        8  9 10   01234567                                                             /  59 76   / / / / 26 37 / /                                                                        8  9 10                                       Figure 8.16: After Inserting 37  /  59 76  Insert 59.  h1(59) = 59 mod 11 = 4                                                                   110  h2(59) = 59 mod 9 = 5  h (59, 0) = (4 + 0 x 5) mod 11 = 4 mod 11 = 4  Since, T [4] is not free, the next probe sequence is  h (59, 1) = (4 + 1 x 5) mod 11 = 9 mod 11 = 9  T [9] is free, so insert key 59 at this place.     01234567   / / / / 26 37 / /                                         Figure 8.17: After Inserting 59  Insert 21.  h1(21) = 21 mod 11 = 10  h2(21) = 21 mod 9 = 3  h (21, 0) = (10 + 0 x 3) mod 11 = 10 mod 11 = 10  T [10] is not free, the next probe sequence is  h (21, 1) = (10 + 1 x 3) mod 11 = 13 mod 11 = 2  T [2] is free, so insert key 21 at this place.     01234567   / / 21 / 26 37 / /                                         Figure 8.18: After Inserting 21  Insert 65.  h1(65) = 65 mod 11 = 10  h2(65) = 65 mod 9 = 2  h (65, 0) = (10 + 0 x 2) mod 11 = 10 mod 11 = 10  T [10] is not free, the next probe sequence is  h (65, 1) = (10 + 1 x 2) mod 11 = 12 mod 11 = 1  T [1] is free, so insert key 65 at this place.  Thus, after insertion of all keys the final hash table is    CU IDOL SELF LEARNING MATERIAL (SLM)
0 1 2 3 4 5 6 7 8 9 10   / 65 21 / 26 37 / / / 59 76                                     Figure 8.19: After Inserting all keys    8.2 EXTENDIBLE HASHING    Extendible hashing is a dynamically updateable disk-based index structure which implements  a hashing scheme utilizing a directory. The index is used to support exact match queries, i.e.,  find the record with a given key.  Compared with the BC-tree index which also supports exact match queries (in logarithmic  number of I/Os), extendible hashing has better expected query cost O(1) I/O. Compared with  linear hashing, extendible hashing does not have any overflow page. Overflows are handled  by doubling the directory which logically doubles the number of buckets. Physically, only the  overflown bucket is split.  The extendible hashing scheme was introduced by Fagin R, Nievergelt J, Pippenger N,  Strong HR. Extendible hashing. A hash table is an in-memory data structure that associates  keys with values.  The primary operation it supports efficiently is a lookup: given a key, find the corresponding  value. It works by transforming the key using a hash function into a hash, a number that is  used as an index in an array to locate the desired location where the values should be.  Multiple keys may be hashed to the same bucket, and all keys in a bucket should be searched  upon a query. Hash tables are often used to implement associative arrays, sets, and caches.  Like arrays, hash tables have O (1) lookup cost on average.    Extendible hashing uses a directory to access its buckets. This directory is usually small  enough to be kept in main memory and has the form of an array with 2d entries, each entry  storing a bucket address (pointer to a bucket). The variable d is called the global depth of the  directory. To decide where a key k is stored, extendible hashing uses the last d bits of some  adopted hash function h(k) to choose the directory entry. Multiple directory entries may point  to the same bucket. Every bucket has a local depth leqd. The difference between local depth  and global depth affects overflow handling.  An example of extendible hashing is shown in Fig. 8.6. Here there are four directory entries  and four buckets. The global depth and all the four local depths are 2. For simplicity assume  the adopted hash function is h(k) D k. For instance, to search for record 15, one refers to  directory entry 15% 4 D 3 (or 11 in binary format), which points to bucket D.                                          111    CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.20: Extendible Hashing Representation    Overflow Handling:  If a bucket overflow happens, the bucket is split into two. The directory may or may not  double, depending on whether the local depth of the overflown bucket was equal to the global  depth before split.  If the local depth was equal to global depth, d bits are not enough to distinguish the search  values of the overflown bucket. Thus, a directory doubling occurs, which effectively uses one  more bit from the hash value. The directory size is then doubled (this does not mean that the  number of buckets is doubled as buckets will share directory entries). As an example, Fig. 8.7  illustrates extendible hashing after inserting a new record with key 63 into Fig.8.6. Bucket D  overflows and the records in it are redistributed between D (where the last three bits of a  record’s hash value are 011) and D0 (where the last three bits of a record’s hash value are  111). The directory doubles. The global depth is increased by one. The local depth of buckets  D and D0 are increased by one, while the local depth of the other buckets remains to be two.  Except 111, which points to the new bucket D0, each of the new directory entries points to  the existing bucket which shares the last two bits. For instance, directory entry 101 points to  the bucket referenced by directory entry 001.    In general, if the local depth of a bucket is d0, the number of directory entries pointing to the  bucket is 2dd0. All these directory entries share the last d0 bits. To split an overflown bucket  whose local depth is smaller than the global depth, one does not need to double the size of the  directory. Instead, half of the 2dd0 directory entries will point to the new bucket, and the  local depth of both the overflown bucket and its split image are increased by one. For  instance, Fig.8.8 illustrates the extendible hashing after inserting 17 and 13 into Fig. 8.7.  Bucket B overflows and a split image, bucket B0, is created. There are two directory entries                                          112    CU IDOL SELF LEARNING MATERIAL (SLM)
(001 and 101) that pointed to B before the split. Half of them (101) now points to the split  image B0. The local depth of both buckets B and B0 are increased by one.                            Figure 8.21: Extendible Hashing after inserting 63.    Figure 8.22: Extendible Hashing after inserting 17 and 13.    8.3 DYNAMIC HASHING APPROACH IMPLEMENTATION AND  PERFORMANCE    Dynamic Hashing                                                                113                     CU IDOL SELF LEARNING MATERIAL (SLM)
▪ The dynamic hashing method is used to overcome the problems of static hashing like           bucket overflow.        ▪ In this method, data buckets grow or shrink as the records increases or decreases. This           method is also known as Extendable hashing method.        ▪ This method makes hashing dynamic, i.e., it allows insertion or deletion without           resulting in poor performance.    How to search a key      ▪ First, calculate the hash address of the key.      ▪ Check how many bits are used in the directory, and these bits are called as i.      ▪ Take the least significant i bits of the hash address. This gives an index of the           directory.      ▪ Now using the index, go to the directory and find bucket address where the record           might be.    How to insert a new record      ▪ Firstly, you have to follow the same procedure for retrieval, ending up in some           bucket.      ▪ If there is still space in that bucket, then place the record in it.      ▪ If the bucket is full, then we will split the bucket and redistribute the records.    For example: Consider the following grouping of keys into buckets, depending on the prefix  of their hash address:                             Figure 8.23: Dynamic Hashing address with key    The last two bits of 2 and 4 are 00. So, it will go into bucket B0. The last two bits of 5 and 6  are 01, so it will go into bucket B1. The last two bits of 1 and 3 are 10, so it will go into  bucket B2. The last two bits of 7 are 11, so it will go into B3.                                          114    CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.24: Dynamic Hashing Bucket Representation    Insert key 9 with hash address 10001 into the above structure:      ▪ Since key 9 has hash address 10001, it must go into the first bucket. But bucket B1 is           full, so it will get split.      ▪ The splitting will separate 5, 9 from 6 since last three bits of 5, 9 are 001, so it will go           into bucket B1, and the last three bits of 6 are 101, so it will go into bucket B5.      ▪ Keys 2 and 4 are still in B0. The record in B0 pointed by the 000 and 100 entry           because last two bits of both the entry are 00.      ▪ Keys 1 and 3 are still in B2. The record in B2 pointed by the 010 and 110 entry           because last two bits of both the entry are 10.      ▪ Key 7 are still in B3. The record in B3 pointed by the 111 and 011 entry because last           two bits of both the entry are 11.                               Figure 8.25: Dynamic Hashing After Insertion                     115  Advantages of Dynamic Hashing:                                                          CU IDOL SELF LEARNING MATERIAL (SLM)
▪ In this method, the performance does not decrease as the data grows in the system. It           simply increases the size of memory to accommodate the data.        ▪ In this method, memory is well utilized as it grows and shrinks with the data. There           will not be any unused memory lying.        ▪ This method is good for the dynamic database where data grows and shrinks           frequently.    Disadvantages of Dynamic Hashing:      ▪ In this method, if the data size increases then the bucket size is also increased. These           addresses of data will be maintained in the bucket address table. This is because the           data address will keep changing as buckets grow and shrink. If there is a huge           increase in data, maintaining the bucket address table becomes tedious.      ▪ In this case, the bucket overflow situation will also occur. But it might take little time           to reach this situation than static hashing.    8.4 SUMMARY        ▪ In DBMS, hashing is a technique to directly search the location of desired data on the           disk without using index structure.        ▪ Hashing method is used to index and retrieve items in a database as it is faster to           search that specific item using the shorter hashed key instead of using its original           value.        ▪ Data bucket, Key, Hash function, Linear Probing, Quadratic probing, Hash index,           Double Hashing, Bucket Overflow are important terminologies used in hashing        ▪ Two types of hashing methods are 1) static hashing 2) dynamic hashing      ▪ In the static hashing, the resultant data bucket address will always remain the same.      ▪ Dynamic hashing offers a mechanism in which data buckets are added and removed             dynamically and on demand.      ▪ In order Indexing addresses in the memory are sorted according to a critical value             while in hashing addresses are always generated using a hash function on the key           value.      ▪ Hash collision is a state when the resultant hashes from two or more data in the data           set, wrongly map the same place in the hash table.      ▪ Rehashing and chaining are two methods which help you to avoid hashing collision.    8.5 KEYWORDS        ▪ Data bucket – Data buckets are memory locations where the records are stored.      ▪ Key: A DBMS key is an attribute or set of an attribute which helps you to identify a             row(tuple) in a relation(table). This allows you to find the relationship between two           tables.                                          116    CU IDOL SELF LEARNING MATERIAL (SLM)
▪ Hash function: A hash function, is a mapping function which maps all the set of           search keys to the address where actual records are placed.        ▪ Linear Probing – Linear probing is a fixed interval between probes. In this method,           the next available data block is used to enter the new record, instead of overwriting on           the older record.        ▪ Quadratic probing- It helps you to determine the new bucket address. It helps you to           add Interval between probes by adding the consecutive output of quadratic           polynomial to starting value given by the original computation.        ▪ Hash index – It is an address of the data block. A hash function could be a simple           mathematical function to even a complex mathematical function.        ▪ Double Hashing –Double hashing is a computer programming method used in hash           tables to resolve the issues of has a collision.        ▪ Bucket Overflow: The condition of bucket-overflow is called collision. This is a fatal           stage for any static has to function.    8.6 LEARNING ACTIVITY    Colleges have multiple departments where every department offers many courses. These  departments have a head (HOD) and various instructors. Even though there are many  instructors, one instructor can only work in one department. As you can see the organization  structure of a college is quite complicated and requires a lot of effort to manage.  You can add the enrollment information of the students as to how many students have taken a  particular course.  The system should allow easy access. Your developed DBMS-based solution would allow a  college to save a lot of time and resources; moreover, the user could see all the college  information from one place and modify it accordingly.    ___________________________________________________________________________  ____________________________________________________________________    8.7 UNIT END QUESTIONS    A. Descriptive Questions    Short Questions        1. What is meant by hashing?      2. Define hash function?      3. List out the types of hashing.      4. What is an open address in hashing?      5. What is dynamic hashing?                                          117    CU IDOL SELF LEARNING MATERIAL (SLM)
Long Questions        1. Explain two types of SQL hashing methods.      2. List and explain the hashing collision resolution technique.      3. Explain in detail about Open address in hashing.      4. Difference between open hashing and closed hashing.      5. Explain the important terminologies which are used in Hashing.    B. Multiple choice Questions        1. The term ______ is used to denote a unit of storage that can store one or more records               a. Basket               b. Bucket               c. Unit               d. Set        2. If K denotes the set of all the search key values, and B denotes the set of all bucket           addresses, a function from K to B is called as __________               a. Bucket function               b. Address function               c. Hash function               d. Search function        3. In a __________ , we obtain the address of the disk block containing a desired record           directly by computing a function on the search key value of the record               a. Hash file organization               b. Hash index organization               c. Hashing address               d. None of these        4. In a __________ we organize the search keys, with their associated pointers, into a           hash file structure               a. Hash file organization               b. Hash index organization               c. Hashing address               d. None of these        5. What is a bucket overflow?                          a. When a bucket does not have enough space                          b. There are insufficient buckets                          c. When Bucket skew occurs                                          118    CU IDOL SELF LEARNING MATERIAL (SLM)
d. All of these    Answer    1.(b) 2. (c) 3. (a) 4. (b) 5. (b)    8.8 REFERENCES    Text Books:      • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson           Education, New Delhi.      • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi.      • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley           Publications, New Delhi.    Reference Books:      • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts,           McGraw-Hill, International Edition.      • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications.                                                                             119                                       CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 9: RELATIONAL DATA MODEL    Structure  9.0 Learning Objective  9.1 Relational Model Concepts  9.2 Relational Constraints  9.3 Summary  9.4 Keywords  9.5 Learning Activity  9.6 Unit End Questions  9.7 References    9.0 LEARNING OBJECTIVES    After studying this unit, student will be able to      ▪ Learn the Relational Model and the uses of constraints.      ▪ Discuss the concepts in relational constraints.    9.1 RELATIONAL MODEL CONCEPTS    The database is defined as a set of relations in the Relational Model (RM). A table of values  is all that a relationship is. Each table row represents a grouping of related data values. The  table's rows represent a real-world person or relationship.  The table name and column names are helpful to interpret the meaning of values in each row.  The data are represented as a set of relations. In the relational model, data are stored as tables.  However, the physical storage of the data is independent of the way the data are logically  organized.  The following are a few of the most widely used Relational Database Management Systems:        ▪ DB2 and Informix Dynamic Server - IBM      ▪ Oracle and RDB – Oracle      ▪ SQL Server and Access - Microsoft  Relational Model Concepts:  Attribute: Each column in a Table. Attributes are the properties which define a relation. e.g.,  Student_Rollno, NAME, etc.  Tables – In the Relational model the, relations are saved in the table format. It is stored along  with its entities. A table has two properties rows and columns. Rows represent records and  columns represent attributes.  Tuple – It is nothing but a single row of a table, which contains a single record.  Relation Schema: A relation schema represents the name of the relation with its attributes.  Degree: The total number of attributes which in the relation is called the degree of the  relation.                                                                   120    CU IDOL SELF LEARNING MATERIAL (SLM)
Cardinality: Total number of rows present in the Table.  Column: The column represents the set of values for a specific attribute.  Relation instance – Relation instance is a finite set of tuples in the RDBMS system. Relation  instances never have duplicate tuples.  Relation key - Every row has one, two or multiple attributes, which is called relation key.   – Every attribute has some pre-defined value and scope which is known as attribute domain                                     Figure 9.1: Relation Representation    9.2 RELATIONAL CONSTRAINTS    Relational Integrity Constraints  Relational Integrity constraints in DBMS are referred to conditions that must be present for a  valid relation. These Relational constraints in DBMS are derived from the rules in the mini-  world that the database represents.  There are many types of Integrity Constraints in DBMS. Constraints on the Relational  database management system is mostly divided into three main categories are:        1. Domain Constraints      2. Key Constraints      3. Referential Integrity Constraints  Domain Constraints  Domain constraints can be violated if an attribute value is not appearing in the corresponding  domain or it is not of the appropriate data type.  Domain constraints specify that within each tuple, and the value of each attribute must be  unique. This is specified as data types which include standard data types integers, real  numbers, characters, Booleans, variable length strings, etc.  Example:  Create DOMAIN CustomerName  CHECK (value not NULL)  The example shown demonstrates creating a domain constraint such that CustomerName is  not NULL  Key Constraints                                          121    CU IDOL SELF LEARNING MATERIAL (SLM)
An attribute that can uniquely identify a tuple in a relation is called the key of the table. The  value of the attribute for different tuples in the relation has to be unique.  Example:  In the given table, CustomerID is a key attribute of Customer Table. It is most likely to have  a single key for one customer, CustomerID =1 is only for the CustomerName =\" Google\".                                           Table 9.1: Customer Table    CustomerID  CustomerName                          Status         1         Google                           Active         2         Amazon                           Active         3          Apple                           Inactive    Referential Integrity Constraints  Referential Integrity constraints in DBMS are based on the concept of Foreign Keys. A  foreign key is an important attribute of a relation which should be referred to in other  relationships. Referential integrity constraint state happens where relation refers to a key  attribute of a different or same relation. However, that key element must exist in the table.    Example:  In the below example, we have 2 relations, Customer and Billing.  Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know  CustomerName=Google has billing amount $300                                                                122                CU IDOL SELF LEARNING MATERIAL (SLM)
Table 9.2: Customer and Billing Relation    Operations in Relational Model  Four basic update operations performed on relational database model are  Insert, update, delete and select.        • Insert is used to insert data into the relation      • Delete is used to delete tuples from the table.      • Modify allows you to change the values of some attributes in existing tuples.      • Select allows you to choose a specific range of data.  Whenever one of these operations are applied, integrity constraints specified on the relational  database schema must never be violated.    Insert Operation  The insert operation gives values of the attribute for a new tuple which should be inserted  into a relation.                                          Figure 9.3: Insert Operation    Update Operation  You can see that in the below-given relation table CustomerName= 'Apple' is updated from  Inactive to Active.                                         Figure 9.4: Update Operation    Delete Operation  To specify deletion, a condition on the attributes of the relation selects the tuple to be deleted.                                          Figure 9.5: Delete Operation                          123  In the above-given example, Customer Name= \"Apple\" is deleted from the table.                                                          CU IDOL SELF LEARNING MATERIAL (SLM)
The Delete operation could violate referential integrity if the tuple which is deleted is  referenced by foreign keys from other tuples in the same database.  Select Operation    Figure 9.6: Select Operation    In the above-given example, Customer Name=\"Amazon\" is selected    Best Practices for creating a Relational Model      • Data need to be represented as a collection of relations      • Each relation should be depicted clearly in the table      • Rows should contain data about instances of an entity      • Columns must contain data about attributes of the entity      • Cells of the table should hold a single value      • Each column should be given a unique name      • No two rows can be identical      • The values of an attribute should be from the same domain    Advantages of using Relational Model      • Simplicity: A Relational data model in DBMS is simpler than the hierarchical and           network model.      • Structural Independence: The relational database is only concerned with data and           not with a structure. This can improve the performance of the model.      • Easy to use: The Relational model in DBMS is easy as tables consisting of rows and           columns are quite natural and simple to understand      • Query capability: It makes possible for a high-level query language like SQL to           avoid complex database navigation.      • Data independence: The Structure of Relational database can be changed without           having to change any application.      • Scalable: Regarding a number of records, or rows, and the number of fields, a           database should be enlarged to enhance its usability.    Disadvantages of using Relational Model      • Few relational databases have limits on field lengths which can't be exceeded.      • Relational databases can sometimes become complex as the amount of data grows,           and the relations between pieces of data become more complicated.      • Complex relational database systems may lead to isolated databases where the           information cannot be shared from one system to another.                                                                     124    CU IDOL SELF LEARNING MATERIAL (SLM)
9.3 SUMMARY        ▪ The Relational database modelling represents the database as a collection of relations           (tables)        ▪ Attribute, Tables, Tuple, Relation Schema, Degree, Cardinality, Column, Relation           instance, are some important components of Relational Model        ▪ Relational Integrity constraints are referred to conditions which must be present for a           valid Relation approach in DBMS        ▪ Domain constraints can be violated if an attribute value is not appearing in the           corresponding domain or it is not of the appropriate data type        ▪ Insert, Select, Modify and Delete are the operations performed in Relational Model           constraints        ▪ The relational database is only concerned with data and not with a structure which can           improve the performance of the model        ▪ Advantages of Relational model in DBMS are simplicity, structural independence,           ease of use, query capability, data independence, scalability, etc.        ▪ Few relational databases have limits on field lengths which can't be exceeded.    9.4 KEYWORDS        • Insert: Insert is used to insert data into the relation      • Delete: Delete is used to delete tuples from the table.      • Modify: Modify allows you to change the values of some attributes in existing tuples.      • Select: Select allows you to choose a specific range of data    9.5 LEARNING ACTIVITY    Library Data Management  If you’re an avid reader, then chances are, you must’ve gone to a library. And you may  already know how many books a library has to keep track of. Libraries don’t have a lot of  staff, but they have to keep a record of all the books they have and the books they have lent.  You can simplify the management of a library’s data.  You should start with students and faculties, i.e., people can get books from the library. Now,  there would be a significant difference between the number of books a student can get and the  number of books a faculty can get. So, add those limits in your system as well. Then, every  book would have a unique ID.  Books with the same title and author would have different IDs according to their copies.  You’ll have to add entries for every book. And then, add the details of who issued the book  and when with the duration of their ownership.  Your DBMS-based solution should also have details on the books that people haven’t  returned and the due fines.                                          125    CU IDOL SELF LEARNING MATERIAL (SLM)
___________________________________________________________________________  ____________________________________________________________________    9.6 UNIT END QUESTIONS    A. Descriptive Questions    Short Questions       1. What is Relational Model?        2. What is Referential Integrity Constraints?      3. Write short notes on Domain constraints.      4. What is a table?      5. Define tuple?    Long Questions        1. Explain in detail about Relational Model Concepts.      2. Explain the Relational Constraints.      3. What is the constraint in a database? Explain types of constraints with suitable             example.      4. Explain in detail the operation of the Relational Constraint Model.      5. What are the types of relational constraints? Explain in detail?    B. Multiple choice Questions        1. An attribute is a __________ in a relation.               a. Row               b. Column               c. Value               d. Tuple    2. Statement 1: A tuple is a row in a relation Statement      2: Existence of multiple foreign keys in a same relation is possible           a. Both the statements are true           b. Statement 1 is correct but Statement 2 is false           c. Statement 1 is false but Statement 2 is correct           d. Both the statements are false    3. Data integrity constraints are used to:                                           126           a. Control who is allowed access to the data           b. Ensure that duplicate records are not entered into the table           c. Improve the quality of data entered for a specific property                                                   CU IDOL SELF LEARNING MATERIAL (SLM)
d. Prevent users from changing the values stored in the table        4. To include integrity constraint in an existing relation use :               a. Create table               b. Modify table               c. Alter table               d. Drop table        5. Foreign key is the one in which the ________ of one relation is referenced in another           relation.               a. Foreign key               b. Primary key               c. References               d. Check constraint    Answer  1.(b) 2. (a) 3. (c) 4. (c) 5. (b)    9.7 REFERENCES    Text Books:      • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson           Education, New Delhi.      • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi.      • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley           Publications, New Delhi.    Reference Books:      • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts,           McGraw-Hill, International Edition.      • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications.                                          127    CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 10: RELATIONAL ALGEBRA SQL    Structure  10.0 Learning objective  10.1 SQL Queries  10.2 Programming Using SQL  10.3 Summary  10.4 Keywords  10.5 Learning Activity  10.6 Unit End Questions  10.7 References    10.0 LEARNING OBJECTIVE    After studying this unit, student will be able to      ▪ Explain how to build new relations from other relations in the database      ▪ Write SQL commands to create tables and indexes, insert/update/delete data, and           query data in a relational DBMS    10.1 SQL QUERIES    Relational Algebra:      ▪ Relational Algebra is a procedural query language that accepts instances of relations           as input and outputs instances of relations. To run queries, it hires operators.      ▪ Unary and binary operators are also possible. They take relations as feedback and give           them back relations.      ▪ Relational algebra is performed recursively on relation and intermediate results are           also considered relations.           Types of Relational Operation:    Figure 10.1 Relational Operation      128    CU IDOL SELF LEARNING MATERIAL (SLM)
Select Operator (σ): Select Operator is denoted by sigma (σ) and it is used to find the tuples  (or rows) in a relation (or table) which satisfy the given condition.  If you understand little bit of SQL then you can think of it as a where clause in SQL, which is  used for the same purpose.    Syntax of Select Operator (σ)  σ Condition / Predicate (Relation/Table name)    Select Operator (σ) Example                                 Table 10.1 Customer    Customer_ID                  Customer_Name                          Customer_City      C10100                          Steve                                 Agra      C10111                          Raghu                                 Agra      C10115                                                                Noida      C10117                       Chaitanya                                Delhi      C10118                          Ajeet                                 Delhi                                       Carl    Query:  σ Customer_City=\"Agra\" (CUSTOMER)    SQL Query:           Select * from customer where Customer_City =”Agra”;    Output:                                      Table 10.2 Output of the Query    Customer ID                  Customer Name                          Customer City     C10100                           Steve                                 Agra     C10111                          Raghu                                  Agra    Project Operator (∏): Project operator is denoted by ∏ symbol and it is used to select  desired columns (or attributes) from a table (or relation).  Project operator in relational algebra is similar to the Select statement in SQL.    Syntax of Project Operator (∏)  ∏ column_name1, column_name2, ...., column_nameN(table_name)    Project Operator (∏) Example    In this example, we have a table CUSTOMER with three columns, we want to fetch only two  columns of the table, which we can do with the help of Project Operator ∏.                                                                                       129                 CU IDOL SELF LEARNING MATERIAL (SLM)
Table 10.3 Customer    Customer_ID                   Customer_Name         Customer_City      C10100                           Steve                Agra      C10111                           Raghu                Agra      C10115                                                Noida      C10117                        Chaitanya               Delhi      C10118                           Ajeet                Delhi                                        Carl    Query:  ∏ Customer_Name, Customer_City (CUSTOMER)    SQL Query:  Select Customer_Name, Customer_City from customer;    Output:                 Table 10.4 Output of the Query             Customer Name                              Customer City                  Steve                                     Agra                 Raghu                                      Agra                                                           Noida               Chaitanya                                    Delhi                  Ajeet                                     Delhi                  Carl    Union Operator (∪): Union operator is denoted by ∪ symbol and it is used to select all the  rows (tuples) from two tables (relations).        ▪ Let’s say we have two relations R1 and R2 both have same columns and we want to           select all the tuples(rows) from these relations then we can apply the union operator           on these relations.        ▪ The rows (tuples) that are present in both the tables will only appear once in the union           set. In short you can say that there are no duplicates present after the union operation.    Syntax of Union Operator (∪)  table_name1  ∪  table_name2                                                                       130                 CU IDOL SELF LEARNING MATERIAL (SLM)
Query:  ∏ Student_Name (student1)    ∪  ∏ Student_Name (student2)                               Table 10.5 Student1    ID NAME    1 Jack    2 Harry    3 Jackson                                 Table 10.6 Student2                 ID NAME                    3 Jackson                  4 Stephan                  5 David    SQL Query:  SELECT * FROM student1  UNION  SELECT * FROM student2;                                        Table 10.7 Output of the Query    ID NAME    1 Jack  2 Harry                                                                        131                               CU IDOL SELF LEARNING MATERIAL (SLM)
3 Jackson                    4 Stephan                    5 David    Union All: Union All operation is equal to the Union operation. It returns the set without  removing duplication and sorting the data.  Syntax:  SELECT column_name FROM table1  UNION ALL  SELECT column_name FROM table2;  SQL Query:  SELECT * FROM student1  UNION ALL  SELECT * FROM Student2;                                        Table 10.8 Output of the Query                 ID NAME                    1 Jack                  2 Harry                  3 Jackson                  3 Jackson                  4 Stephan                  5 David    Intersection Operator (∩): Intersection operator is denoted by ∩ symbol and it is used to  select common rows (tuples) from two tables (relations). Let’s say we have two relations R1    and R2 both have same columns and we want to select all those tuples(rows) that are present    in both the relations, then in that case we can apply intersection operation on these two  relations R1 ∩ R2.                                          132    CU IDOL SELF LEARNING MATERIAL (SLM)
Syntax of Intersection Operator (∩)           table_name1           ∩           table_name2    Query:  ∏ Student_Name (student1)  ∩  ∏ Student_Name (student2)  SQL Query:  SELECT * FROM student1  INTERSECT  SELECT * FROM student2;                                        Table 10.9 Output of the Query                  ID NAME                    3 Jackson    Set Difference (-): Set Difference is denoted by – symbol. Let’s say we have two relations  R1 and R2 and we want to select all those tuples(rows) that are present in Relation R1  but not present in Relation R2, this can be done using Set difference R1 – R2.        ▪ It combines the result of two SELECT statements. Minus operator is used to display           the rows which are present in the first query but absent in the second query.        ▪ It has no duplicates and data arranged in ascending order by default.    Syntax of Set Difference (-)  table_name1  -  table_name2    Query:  ∏ Student_Name (STUDENT)  -  ∏ Student_Name (COURSE)  SQL Query:  SELECT * FROM table1  MINUS                                                                        133    CU IDOL SELF LEARNING MATERIAL (SLM)
SELECT * FROM table2;                                     Table 10.10 Output of the Query    ID NAME    1 Jack    2 Harry    Cartesian product (X): Cartesian Product is denoted by X symbol. Let’s say we have two  relations R1 and R2 then the cartesian product of these two relations (R1 X R2) would  combine each tuple of first relation R1 with each tuple of second relation R2.  Syntax of Cartesian product (X)  R1 X R2    Query:  ∏ Student_Name (STUDENT)    X  ∏ Student_Name (Details)                              Table 10.11 Student    SNO                       FNAME                                     LNAME      1                        Albert                                     Singh      2                         Nora                                     Fatehi                                  Table 10.12 Details    AGE                            ROLLNO                        18                                                          21                                   5                                   9    SQL Query:  SELECT * FROM                                                                                   134                   CU IDOL SELF LEARNING MATERIAL (SLM)
student,                 Table 10.12 Details  Details;                FNAME      LNAME                ROLLNO               AGE         SNO       Albert     Singh                    5                18             1     Albert     Singh                    9                21             1      Nora      Fatehi                   5                18             2      Nora      Fatehi                   9                21             2    Rename (ρ): Rename (ρ) operation can be used to rename a relation or an attribute of a    relation.  Rename (ρ) Syntax:  ρ (new_relation_name, old_relation_name)  Rename (ρ) Example: Let’s say we have a table customer; we are fetching customer names    and we are renaming the resulted relation to CUST_NAMES.                                              Table 10.13 Customer    Customer_ID              Customer_Name              Customer_City     C10100                   Steve                      Agra     C10111                   Raghu                      Agra     C10115                   Chaitanya                  Noida     C10117                   Ajeet                      Delhi     C10118                   Carl                       Delhi    Query:  ρ(CUST_NAMES, ∏(Customer_Name)(CUSTOMER))    SQL Query:  SELECT CustomerName as CUST_NAMES FROM Customer;                                                                              135                  CU IDOL SELF LEARNING MATERIAL (SLM)
Table 10.14 Output of the Query                                     CUST_NAMES                                      Steve                                      Raghu                                      Chaitanya                                      Ajeet                                      Carl    10.2 PROGRAMMING USING SQL    Structured Query Language is a standard Database language that is used to create, maintain  and retrieve the relational database. Following are some interesting facts about SQL.        ▪ SQL is case insensitive. But it is a recommended practice to use keywords (like           SELECT, UPDATE, CREATE, etc) in capital letters and use user-defined things           (liked table name, column name, etc) in small letters.        ▪ We can write comments in SQL using “–” (double hyphen) at the beginning of any           line.        ▪ SQL is the programming language for relational databases (explained below) like           MySQL, Oracle, Sybase, SQL Server, Postgre, etc. Other non-relational databases           (also called NoSQL) databases like MongoDB, DynamoDB, etc do not use SQL        ▪ Although there is an ISO standard for SQL, most of the implementations slightly vary           in syntax. So we may encounter queries that work in SQL Server but do not work in           MySQL.                                              Table 10.14 Student    ROLL_NO    NAME                  ADDRESS         PHONE       AGE          1     RAM                    DELHI          2                                        9455123451  18          3  RAMESH                 GURGAON        9652431543  18          4    SUJIT                 ROHTAK        9156253131  20                                                   9156768971  18             SURESH                    DELHI    DDL COMMAND:  Data Definition Language helps you to define the database structure or schema. Let's learn  about DDL commands with syntax.  Five types of DDL commands in SQL are:                                                                 136               CU IDOL SELF LEARNING MATERIAL (SLM)
CREATE                                                                                      137  CREATE statements is used to define the database structure schema:  Syntax:  CREATE TABLE table_name (       column1 datatype,     column2 datatype,     column3 datatype,    ....  );  For example:  Create database university;  CREATE TABLE Student (     ROLL_NO int,     Name varchar(255),     Address varchar(255),     Phone varchar(12),     age int  );  DROP  Drops commands remove tables and databases from RDBMS.  Syntax  DROP TABLE Table_Name;  For example:  Drop table student;  ALTER  Alters command allows you to alter the structure of the database.  Syntax:  To add a new column in the table  ALTER TABLE table_name  ADD  column_name COLUMN-definition;    To modify an existing column in the table:  ALTER TABLE MODIFY (COLUMN DEFINITION....);    For example:  Alter table Student add subject varchar (255);                                         Table 10.16 After add Subject                                                          CU IDOL SELF LEARNING MATERIAL (SLM)
ROLL_NO  NAME    ADDRESS  PHONE                      AGE Subject     1     2     RAM     DELHI    9455123451                 18     3     RAMESH  GURGAON  9652431543                 18     4     SUJIT   ROHTAK   9156253131                 20           SURESH  DELHI    9156768971                 18    TRUNCATE:  This command used to delete all the rows from the table and free the space containing the  table.  Syntax:  TRUNCATE TABLE table_name;  Example:  TRUNCATE table students;    DML COMMAND:  Data Manipulation Language (DML) allows you to modify the database instance by inserting,  modifying, and deleting its data. It is responsible for performing all types of data  modification in a database.  There are three basic constructs which allow database program and user to enter data and  information are:  1. INSERT  2. UPDATE  3. DELETE    INSERT: This is a statement is a SQL query. This command is used to insert data into the  row of a table.  Syntax:  INSERT INTO TABLE_NAME (col1, col2, col3,.... col N)  VALUES (value1, value2, value3, .... valueN);  Or  INSERT INTO TABLE_NAME  VALUES (value1, value2, value3, .... valueN);  For example:  INSERT INTO students VALUES ('60', 'Tom', ‘abcdef',’9787097870’,’35’);                                           Table 10.17 After Insertion                                                             138                   CU IDOL SELF LEARNING MATERIAL (SLM)
ROLL_NO    NAME     ADDRESS                      PHONE                 AGE          1     RAM       DELHI          2                                        9455123451            18          3  RAMESH    GURGAON                     9652431543            18          4    SUJIT    ROHTAK                     9156253131            20         60                                        9156768971            18             SURESH       DELHI                    9787097870            35                Tom       abcdef    UPDATE: This command is used to update or modify the value of a column in the table.  Syntax:  UPDATE table_name SET [column_name1= value1,...column_nameN = valueN] [WHERE  CONDITION]  For example:  UPDATE students  SET Name = 'Jhon', phone= '9595950000'  WHERE ROLL_NO = 3;                                            Table 10.18 After Update    ROLL_NO    NAME     ADDRESS                      PHONE                 AGE          1     RAM       DELHI          2                                        9455123451            18          3  RAMESH    GURGAON                     9652431543            18          4     Jhon    ROHTAK                     9595950000            20         60                                        9156768971            18             SURESH       DELHI                    9787097870            35                Tom       abcdef    DELETE: This command is used to remove one or more rows from a table.  Syntax:  DELETE FROM table_name [WHERE condition];  For example:  DELETE FROM students WHERE FirstName = 'Jhon';                                             Table 10.19 After Delete                                                                           139               CU IDOL SELF LEARNING MATERIAL (SLM)
ROLL_NO  NAME       ADDRESS     PHONE          AGE     1        RAM        DELHI       9455123451     18     2        RAMESH     GURGAON     9652431543     18     4        SURESH     DELHI       9156768971     18     60       Tom        abcdef      9787097870     35    DCL (Data Control Language): DCL includes commands like GRANT and REVOKE,  which are useful to give \"rights & permissions.\" Other permission controls parameters of the  database system.  Examples of DCL commands:  Commands that come under DCL:        ▪ Grant      ▪ Revoke  Grant:  This command is use to give user access privileges to a database.  Syntax:  GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;  For example:  GRANT SELECT ON Users TO'Tom'@'localhost;  Revoke:  It is useful to back permissions from the user.  Syntax:  REVOKE privilege_nameON object_nameFROM {user_name |PUBLIC |role_name}  For example:  REVOKE SELECT, UPDATE ON student FROM BCA, MCA;    TRANSACTION CONTROL LANGUAGE (TCL): TCL commands deal with the  transaction within the database.  Commit  This command is used to save all the transactions to the database.  Syntax:  Commit;  For example:  DELETE FROM Students WHERE RollNo =4;  COMMIT;  Rollback                                                   140             CU IDOL SELF LEARNING MATERIAL (SLM)
Rollback command allows you to undo transactions that have not already been saved to the  database.  Syntax:  ROLLBACK;  Example:  DELETE FROM Students WHERE RollNo =4;  SAVEPOINT  This command helps you to sets a save point within a transaction.  Syntax:  SAVEPOINT SAVEPOINT_NAME;  Example:  SAVEPOINT RollNo;    Data Query Language (DQL): DQL is used to fetch the data from the database. It uses only  one command:  SELECT:  This command helps you to select the attribute based on the condition described by the  WHERE clause.  Syntax:  SELECT expressions  FROM TABLES  WHERE conditions;  For example:  SELECT Name  FROM Student  WHERE RollNo > 3;  Table 10.20 Output of the Query                                                        NAME                                                       SURESH                                                           Tom    10.3 SUMMARY    • Select Operation (σ): It selects tuples that satisfy the given predicate from a relation.  • Project Operation (∏): It projects column(s) that satisfy a given predicate. Notation −        ∏A1, A2, An (r) Where A1, A2, An are attribute names of relation r. Duplicate rows are      automatically eliminated, as relation is a set.  • Intersection Operator (∩): Only those rows that are present in both the tables will      appear in the result set.                                          141    CU IDOL SELF LEARNING MATERIAL (SLM)
• Cartesian Product (Χ): Combines information of two different relations into one.  • Set Difference (−): The result of set difference operation is tuples, which are present in        one relation but are not in the second relation.  • Rename Operation (ρ):The rename operation allows us to rename the output relation.        'rename' operation is denoted with small Greek letter rho ρ.  • SQL is a database language designed for the retrieval and management of data in a        relational database.  • It helps users to access data in the RDBMS system  • Data Definition Language (DDL) helps you to define the database structure or schema.  • Data Manipulation Language (DML) allows you to modify the database instance by        inserting, modifying, and deleting its data.  • DCL (Data Control Language) includes commands like GRANT and REVOKE, which        are useful to give \"rights & permissions.\"  • Transaction control language or TCL commands deal with the transaction within the        database.  • Data Query Language (DQL) is used to fetch the data from the database.    10.4 KEYWORDS        • TRC: Tuple Relational Calculus      • DRC: Domain Relational Calculus      • DDL: Data Definition Language (DDL)      • DML: Data Manipulation Language (DML)      • DCL: Data Control Language (DCL)      • TCL: Transaction Control Language (TCL)      • DQL: Data Query Language (DQL)    10.5 LEARNING ACTIVITY    You can build a solution that saves student records for an educational institution. Handling  student records is no easy feat. You need to keep their name, subjects, fees, any provision of  concession, and their academic progress. A DBMS-based solution will allow the client to  save a lot of time and effort.  Your design goal should be to have separate files for each student where the data will store  information about the student. You can start by adding the following sections:             ▪ Student’s Name           ▪ Fees           ▪ Subjects (or Stream)           ▪ Grades (or Marks)           ▪ Concessions (or Scholarship)                                          142    CU IDOL SELF LEARNING MATERIAL (SLM)
▪ Additional Information  It’s one of the easy database project ideas. You can take it a step further, and add the option  to include students of different grades or sections. Your designed system should allow the  admin to enter the details mentioned above. And the admin should be able to access it easily.    ___________________________________________________________________________  ____________________________________________________________________    10.6 UNIT END QUESTIONS    A. Descriptive Questions    Short Questions        1. What is meant by relational algebra?      2. Define Rename operation in Relational Algebra with example?      3. Explain the Cartesian Product Operation in relational algebra.      4. What is DML command in sql?      5. What is DDL and TCL command?    Long Questions        1. List and explain the fundamental operation of relational algebra.      2. Explain Select and Intersection operation of Relational Algebra with example.      3. Explain Project and Union operation of Relational Algebra with example.      4. Consider the following Relational Database.             Student (roll_no, name, city,marks,c_no)           Course (c_no,cname,fees)           Construct Queries into Relational algebra.                 a. List Student Details enrolled for ‘BBA (C.A)’ Course.               b. List the Course having fees < 20000               c. Display all students living in either ‘Nasik’ or ‘Pune’ city.               d. Display Course detail for student ‘Gaurav Sharma’.      5. List out the various DDL commands. Explain anyone with an example.  B. Multiple choice Questions        1. Relational Algebra is a __________ query language that takes two relations as input           and produces another relation as an output of the query.               a. Relational               b. Structural               c. Procedural               d. Fundamental                                          143    CU IDOL SELF LEARNING MATERIAL (SLM)
2. Which of the following is a fundamental operation in relational algebra?               a. Set intersection               b. Natural join               c. Assignment               d. None of these        3. Which of the following is used to denote the selection operation in relational algebra?               a. Pi (Greek)               b. Sigma (Greek)               c. Lambda (Greek)        4. Which of the following is not a DDL command?               a. UPDATE               b. TRUNCATE               c. ALTER               d. None of these        5. Which of the following are TCL commands?               a. UPDATE and TRUNCATE               b. SELECT and INSERT               c. GRANT and REVOKE               d. ROLLBACK and SAVEPOINT    Answer  1.(a) 2. (d) 3. (b) 4. (a) 5. (d)    10.7 REFERENCES    Text Books:        • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson           Education, New Delhi.        • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi.      • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley             Publications, New Delhi.    Reference Books:      • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts,           McGraw-Hill, International Edition.      • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications.                                          144    CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 11: EER AND ER TO RELATIONAL MAPPING    Structure     11.0 Learning objective     11.1 Database Design Using EER To Relational Language     11.2 Summary     11.3 Keywords     11.4 Learning Activity     11.5 Unit End Questions     11.6 References    11.0 LEARNING OBJECTIVE    After studying this unit, student will be able to      ▪ Design and implement the ER diagram of the system.      ▪ Explain to Convert ER diagram to Relational Model for different scenarios.    11.1 DATABASE DESIGN USING EER TO RELATIONAL LANGUAGE    Outline      • ER-to-Relational mapping algorithm               • step 1: mapping of regular entity types               • step 2: napping of weak entity types               • step 3: mapping of binary 1:1 relation types               • step 4: mapping of binary 1:N relationship types               • step 5: mapping of binary M:N relationship types               • step 6: mapping of multivalued attributes               • step 7: mapping of N-ary relationship types      • mapping EER model constructs to relations               • step 8: options for mapping specialization or generalization               • step 9: mapping of union types (categories)                                          145    CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 11.1: ER conceptual schema diagram    Figure 11.2: Mapping the Company ER schema into a relational database schema                                                                                  146    CU IDOL SELF LEARNING MATERIAL (SLM)
Together                                                    147              CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 11.3: Diagram for together relation schema    An algorithm for ER-to-relational mapping  Step 1: mapping of regular entity types              • for each regular entity type E in the ER schema, create a relation R that includes                 all the simple attributes of E              • include only the simple component attributes of a composite attribute choose one                 of the key attributes of E as primary key for R              • if the chosen key of E is composite, the set of simple attributes that form it will                 together form the primary key of R              • e.g., EMPLOYEE, DEPARTMENT, PROJECT  Step 2: mapping of weak entity types        • for each weak entity type W in the ER schema with owner entity type E, create a           relation R and include all simple attributes of Was attributes of R        • include as foreign key attributes of R the primary key attribute(s) of the relation(s)           that correspond to the owner entity type(s)        • the primary key of R is the combination of the primary key(s) of the owner(s) and the           partial key of the weak entity type W, if any        • if there is a weak entity type E2 whose owner is also a weak entity type E1, then E1           should be mapped before E2 to determine its primary key first                                                       148    CU IDOL SELF LEARNING MATERIAL (SLM)
• e.g., DEPENDENT    Step 3: mapping of binary 1:1 relationship types      • for each binary 1:1 relationship type R in the ER schema, identify the relations S and           T that correspond to the entity types participating in R      • foreign key approach                  ▪ choose one of the relations, S, and include as a foreign key in S the primary                      key of T                  ▪ include all the simple attributes of R as attributes of S      • merged relation option                ▪ merge the two entity types and the relationship into a single relation      • relationship relation option                ▪ set up a third relation R for the purpose of cross-referencing the primary keys                    of S and T      • example: MANAGES -> DEPARTMENT.MGRSSN           DEPARTMENT.MGRSTARTDATE    Step 4: mapping of binary 1: N relationship types      • for each binary 1:N relationship type R, identify the relation S that represents the           participating entity type at the N-side of the relationship type      • include as foreign key in S the primary key of the relation T that represents the other           entity type participating in R      • include any simple attributes of the 1:N relationship type as attributes of S      • e.g., WORKS_FOR: S = EMPLOYEE, T = DEPARTMENT, DNO: the primary key           of T    Step 5: mapping of binary M: N relationship types      • for each binary M:N relationship type R, create a new relation S to represent R      • include as foreign key attributes in S the primary keys of the relations that represent           the participating entity types      • their combination will form the primary key of S      • include any simple attributes of R as attributes of S      • e.g., WORKS_ON: S = WORKS_ON    Step 6: mapping of multivalued attributes           • for each multivalued attribute A, create a new relation R           • R will include an attribute corresponding to A, plus the primary key attribute K -               as a foreign key in R – of the relation that represents the entity type or relationship               type that has A as an attribute                                          149    CU IDOL SELF LEARNING MATERIAL (SLM)
• the primary key of R is the combination of A and K           • if the multivalued attribute is composite, include its simple components           • e.g., Locations: A = DLOCATION, R = DEPT_LOCATIONS, K =DNUMBER    Step 7: mapping of N-ary relationship types               • for each n-ary relationship type R, where n > 2, create a new relation S to                    represent R               • include as foreign key attributes in S the primary keys of the relations that                    represent the participating entity types               • include any simple attributes of R as attributes of S               • the primary key of S is usually a combination of all the foreign key reference                    the relations representing the participating entity types               • e.g., SUPPLY                                 Figure 11.4: Diagram for mapping steps7    An algorithm for EER-to-relational mapping  Step 8: options for mapping specialization or generalization        • convert each specialization with m subclasses {S1, S2, …, Sm} and superclass C,           where the attributes of C are {k, a1, …, an} and k is the key, into relation schemas           using one of the following options:        • option 8A: multiple relations-superclass and subclasses                        o create a relation L for C with attributes Attrs(L) = {k, a1, …, an}and                             PK(L) = k                        o create a relation L i for each subclass Si, with the attributes                             Attrs(Li) = {k} {attributes of Si} and PK(Li) = k        • works for any specialization (total or partial, disjoint or overlapping)                                          150    CU IDOL SELF LEARNING MATERIAL (SLM)
                                
                                
                                Search
                            
                            Read the Text Version
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 37
- 38
- 39
- 40
- 41
- 42
- 43
- 44
- 45
- 46
- 47
- 48
- 49
- 50
- 51
- 52
- 53
- 54
- 55
- 56
- 57
- 58
- 59
- 60
- 61
- 62
- 63
- 64
- 65
- 66
- 67
- 68
- 69
- 70
- 71
- 72
- 73
- 74
- 75
- 76
- 77
- 78
- 79
- 80
- 81
- 82
- 83
- 84
- 85
- 86
- 87
- 88
- 89
- 90
- 91
- 92
- 93
- 94
- 95
- 96
- 97
- 98
- 99
- 100
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
- 111
- 112
- 113
- 114
- 115
- 116
- 117
- 118
- 119
- 120
- 121
- 122
- 123
- 124
- 125
- 126
- 127
- 128
- 129
- 130
- 131
- 132
- 133
- 134
- 135
- 136
- 137
- 138
- 139
- 140
- 141
- 142
- 143
- 144
- 145
- 146
- 147
- 148
- 149
- 150
- 151
- 152
- 153
- 154
- 155
- 156
- 157
- 158
- 159
- 160
- 161
- 162
- 163
- 164
- 165
- 166
- 167
- 168
- 169
- 170
- 171
- 172
- 173
- 174
- 175
- 176
- 177
- 178
- 179
- 180
- 181
- 182
- 183
- 184
- 185
- 186
- 187
- 188
- 189
- 190
- 191
- 192
- 193
- 194
- 195
- 196
- 197
- 198
- 199
- 200
- 201
- 202
- 203
- 204
- 205
- 206
- 207
- 208
- 209
- 210
 
                    