2. A B+ tree can contain a maximum of 7 pointers in a node. What is the minimum number of keys in leaves? a. 6 b. 3 c. 4 d. 7 3. Which of the following is false? a. A B+ -tree grows downwards b. A B+ -tree is balanced c. In a B+ -tree, the sibling pointers allow sequential searching d. B+ -tree is shallower than B-tree 4. Which of the following is true? a. larger the order of B-tree, less frequently the split occurs b. larger the order of B-tree, more frequently the split occurs c. smaller the order of B-tree, more frequently the split occurs d. smaller the order of B-tree, less frequently the split occurs 5. The index consists of_________. a. list of keys b. pointers to the master list c. both (a) and (b) d. All of these Answer 1. (c) 2. (b) 3. (a) 4. (a) 5. (c) 7.7 REFERENCES Text Books: • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson Education, New Delhi. • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi. • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley Publications, New Delhi. Reference Books: • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts, McGraw-Hill, International Edition. • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications. 101 CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 8: FILE ORGANIZATION 2 Structure 8.0 Learning objective 8.1 Hashing, Hashing Functions, Collision Resolution 8.2 Extendible Hashing 8.3 Dynamic Hashing Approach Implementation and Performance 8.4 Summary 8.5 Keywords 8.6 Learning Activity 8.7 Unit End Questions 8.8 References 8.0 LEARNING OBJECTIVE After studying this unit, student will be able to ▪ Explain the concepts in Hashing technique, how to use hash functions and Collision resolution. ▪ Describe the performance and implement the extendible hashing and Dynamic Hasing. 8.1 HASHING, HASHING FUNCTIONS, COLLISION RESOLUTION Hashing technique is used to calculate the direct location of a data record on the disk without using an index structure. In this technique, data is stored at the data blocks whose address is generated by using the hashing function. The memory location where these records are stored is known as a data bucket or data blocks. In this, a hash function can choose any of the column value to generate the address. Most of the time, the hash function uses the primary key to generate the address of the data block. A hash function is a simple mathematical function to any complex mathematical function. We can even consider the primary key itself as the address of the data block. That means each row whose address will be the same as a primary key stored in the data block. 102 CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.1: Data Block Address Representation Figure 8.1 shows data block addresses same as primary key value. This hash function can also be a simple mathematical function like exponential, mod, cos, sin, etc. Suppose we have mod (5) hash function to determine the address of the data block. In this case, it applies mod (5) hash function on the primary keys and generates 3, 3, 1, 4 and 2 respectively, and records are stored in those data block addresses. Figure 8.2: Hashing Key Representation 103 Types of Hashing: CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.3: Types of Hashing Static Hashing: In static hashing, the resultant data bucket address will always be the same. That means if we generate an address for EMP_ID =103 using the hash function mod (5) then it will always result in same bucket address 3. Here, there will be no change in the bucket address. Hence in this static hashing, the number of data buckets in memory remains constant throughout. In this example, we will have five data buckets in the memory used to store the data. Figure 8.3: Static Hashing Representation Operations of Static Hashing Searching a record: When a record needs to be searched, then the same hash function retrieves the address of the bucket where the data is stored. 104 CU IDOL SELF LEARNING MATERIAL (SLM)
Insert a Record: When a new record is inserted into the table, then we will generate an address for a new record based on the hash key and record is stored in that location. Delete a Record: To delete a record, we will first fetch the record which is supposed to be deleted. Then we will delete the records for that address in memory. Update a Record: To update a record, we will first search it using a hash function, and then the data record is updated. If we want to insert some new record into the file but the address of a data bucket generated by the hash function is not empty, or data already exists in that address. This situation in the static hashing is known as bucket overflow. This is a critical situation in this method. To overcome this situation, there are various methods. Some commonly used methods are as follows: Open Hashing: When a hash function generates an address at which data is already stored, then the next bucket will be allocated to it. This mechanism is called as Linear Probing. For example: suppose R3 is a new address which needs to be inserted, the hash function generates address as 112 for R3. But the generated address is already full. So the system searches next available data bucket, 113 and assigns R3 to it. Figure 8.4: Open Hashing Close Hashing: When buckets are full, then a new data bucket is allocated for the same hash result and is linked after the previous one. This mechanism is known as Overflow chaining. For example: Suppose R3 is a new address which needs to be inserted into the table, the hash function generates address as 110 for it. But this bucket is full to store the new data. In this case, a new bucket is inserted at the end of 110 buckets and is linked to it. 105 CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.5: Close Hashing Quadratic probing: Quadratic probing is very much similar to open hashing or linear probing. Here, the only difference between old and new bucket is linear. Quadratic function is used to determine the new bucket address. Quadratic probing is an open-addressing scheme where we look for i2‘th slot in i’th iteration if the given hash value x collides in the hash table. How Quadratic Probing is done? Let hash(x) be the slot index computed using the hash function. If the slot hash(x) % S is full, then we try (hash(x) + 1*1) % S. If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S. If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S. This process is repeated for all the values of i until an empty slot is found. Quadratic Probing uses a hash function of the form h (k,i) = (h' (k) + c1i + c2i2) mod m Where (as in linear probing) h' is an auxiliary hash function c1 and c2 ≠0 are auxiliary constants and i=0, 1...m-1. The initial position is T [h' (k)]; later position probed is offset by the amount that depend in a quadratic manner on the probe number i the method of quadratic probing is found to be better than linear probing. However, to ensure that the full hash table is covered, the values of c1, and c2 are constrained. It may happen that two keys produce the same probe sequence such that: h(k1, i) = h(k2, i) Consider inserting the keys 74, 28, 36,58,21,64 into a hash table of size m =11 using quadratic probing with c1=1 and c2=3. Further consider that the primary hash function is h' (k) = k mod m. 106 CU IDOL SELF LEARNING MATERIAL (SLM)
01234567 89 10 //////// // / Figure 8.6: Initial State 89 10 74 / / This is the initial state of hash table Here c1= 1 and c2=3 89 10 h (k, i) = [k mod m + i + 3i2 ] mod m 74 / / Insert 74. h (74,0)= (74 mod 11+0+3x0) mod 11 89 10 = (8 +0+0) mod 11 = 8 74 / / T [8] is free; insert the key 74 at this place. 107 01234567 //////// Figure 8.7: After inserting 74 Insert 28. h (28, 0) = (28 mod 11 + 0 + 3 x 0) mod 11 = (6 +0 + 0) mod 11 = 6. T [6] is free; insert key 28 at this place. 01234567 / / / / / / 28 / Figure 8.8: After inserting 28 Insert 36. h (36, 0) = (36 mod 11 + 0 + 3 x 0) mod 11 = (3 + 0+0) mod 11=3 T [3] is free; insert key 36 at this place. 01234567 / / / 36 / / 28 / Figure 8.9: After inserting 36 Insert 58. h (58, 0) = (58 mod 11 + 0 + 3 x 0) mod 11 = (3 + 0 + 0) mod 11 = 3 T [3] is not free, so next probe sequence is computed as h (59, 1) = (58 mod 11 + 1 + 3 x12) mod 11 CU IDOL SELF LEARNING MATERIAL (SLM)
= (3 + 1 + 3) mod 11 =7 mod 11= 7 T [7] is free; insert key 58 at this place. 0 1 2 3 4 5 6 7 8 9 10 / / / 36 / / 28 58 74 / / Figure 8.10: After inserting 58 Insert 21. h (21, 0) = (21 mod 11 + 0 + 3 x 0) = (10 + 0) mod 11 = 10 T [10] is free; insert key 21 at this place. 0 1 2 3 4 5 6 7 8 9 10 / / / 36 / / 28 58 74 / 21 Figure 8.11: After inserting 21 Insert 64. h (64, 0) = (64 mod 11 + 0 + 3 x 0) = (9 + 0+ 0) mod 11 = 9. T [9] is free; insert key 64 at this place. Thus, after inserting all keys, the hash table is 0 1 2 3 4 5 6 7 8 9 10 / / / 36 / / 28 58 74 64 21 Figure 8.12: After inserting all keys Double Hashing: Double Hashing is another method similar to linear probing. Here the difference is fixed as in linear probing, but this fixed difference is calculated by using another hash function. That’s why the name is double hashing. Double Hashing is one of the best techniques available for open addressing because the permutations produced have many of the characteristics of randomly chosen permutations. Double hashing uses a hash function of the form h (k, i) = (h1(k) + i h2 (k)) mod m Where h1 and h2 are auxiliary hash functions and m is the size of the hash table. h1 (k) = k mod m or h2 (k) = k mod m'. Here m' is slightly less than m (say m-1 or m-2) let hash(x) be the slot index computed using hash function. 108 CU IDOL SELF LEARNING MATERIAL (SLM)
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S Example: Consider inserting the keys 76, 26, 37,59,21,65 into a hash table of size m = 11 using double hashing. Consider that the auxiliary hash functions are h1 (k)=k mod 11 and h2(k) = k mod 9. Solution: Initial state of Hash table is 0 1 2 3 4 5 6 7 8 9 10 / / / / / / / / / // Figure 8.13: Initial State Insert 76. h1(76) = 76 mod 11 = 10 h2(76) = 76 mod 9 = 4 h (76, 0) = (10 + 0 x 4) mod 11 = 10 mod 11 = 10 T [10] is free, so insert key 76 at this place. 0 1 2 3 4 5 6 7 8 9 10 / / / / / / / / / / 76 Figure 8.14: After Inserting 76 Insert 26. h1(26) = 26 mod 11 = 4 h2(26) = 26 mod 9 = 8 h (26, 0) = (4 + 0 x 8) mod 11 = 4 mod 11 = 4 T [4] is free, so insert key 26 at this place. 0 1 2 3 4 5 6 7 8 9 10 / / / / 26 / / / / / 76 Figure 8.15: After Inserting 26 Insert 37. h1(37) = 37 mod 11 = 4 h2(37) = 37 mod 9 = 1 h (37, 0) = (4 + 0 x 1) mod 11 = 4 mod 11 = 4 109 CU IDOL SELF LEARNING MATERIAL (SLM)
T [4] is not free, the next probe sequence is 8 9 10 h (37, 1) = (4 + 1 x 1) mod 11 = 5 mod 11 = 5 / / 76 T [5] is free, so insert key 37 at this place. 8 9 10 01234567 / 59 76 / / / / 26 37 / / 8 9 10 Figure 8.16: After Inserting 37 / 59 76 Insert 59. h1(59) = 59 mod 11 = 4 110 h2(59) = 59 mod 9 = 5 h (59, 0) = (4 + 0 x 5) mod 11 = 4 mod 11 = 4 Since, T [4] is not free, the next probe sequence is h (59, 1) = (4 + 1 x 5) mod 11 = 9 mod 11 = 9 T [9] is free, so insert key 59 at this place. 01234567 / / / / 26 37 / / Figure 8.17: After Inserting 59 Insert 21. h1(21) = 21 mod 11 = 10 h2(21) = 21 mod 9 = 3 h (21, 0) = (10 + 0 x 3) mod 11 = 10 mod 11 = 10 T [10] is not free, the next probe sequence is h (21, 1) = (10 + 1 x 3) mod 11 = 13 mod 11 = 2 T [2] is free, so insert key 21 at this place. 01234567 / / 21 / 26 37 / / Figure 8.18: After Inserting 21 Insert 65. h1(65) = 65 mod 11 = 10 h2(65) = 65 mod 9 = 2 h (65, 0) = (10 + 0 x 2) mod 11 = 10 mod 11 = 10 T [10] is not free, the next probe sequence is h (65, 1) = (10 + 1 x 2) mod 11 = 12 mod 11 = 1 T [1] is free, so insert key 65 at this place. Thus, after insertion of all keys the final hash table is CU IDOL SELF LEARNING MATERIAL (SLM)
0 1 2 3 4 5 6 7 8 9 10 / 65 21 / 26 37 / / / 59 76 Figure 8.19: After Inserting all keys 8.2 EXTENDIBLE HASHING Extendible hashing is a dynamically updateable disk-based index structure which implements a hashing scheme utilizing a directory. The index is used to support exact match queries, i.e., find the record with a given key. Compared with the BC-tree index which also supports exact match queries (in logarithmic number of I/Os), extendible hashing has better expected query cost O(1) I/O. Compared with linear hashing, extendible hashing does not have any overflow page. Overflows are handled by doubling the directory which logically doubles the number of buckets. Physically, only the overflown bucket is split. The extendible hashing scheme was introduced by Fagin R, Nievergelt J, Pippenger N, Strong HR. Extendible hashing. A hash table is an in-memory data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key, find the corresponding value. It works by transforming the key using a hash function into a hash, a number that is used as an index in an array to locate the desired location where the values should be. Multiple keys may be hashed to the same bucket, and all keys in a bucket should be searched upon a query. Hash tables are often used to implement associative arrays, sets, and caches. Like arrays, hash tables have O (1) lookup cost on average. Extendible hashing uses a directory to access its buckets. This directory is usually small enough to be kept in main memory and has the form of an array with 2d entries, each entry storing a bucket address (pointer to a bucket). The variable d is called the global depth of the directory. To decide where a key k is stored, extendible hashing uses the last d bits of some adopted hash function h(k) to choose the directory entry. Multiple directory entries may point to the same bucket. Every bucket has a local depth leqd. The difference between local depth and global depth affects overflow handling. An example of extendible hashing is shown in Fig. 8.6. Here there are four directory entries and four buckets. The global depth and all the four local depths are 2. For simplicity assume the adopted hash function is h(k) D k. For instance, to search for record 15, one refers to directory entry 15% 4 D 3 (or 11 in binary format), which points to bucket D. 111 CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.20: Extendible Hashing Representation Overflow Handling: If a bucket overflow happens, the bucket is split into two. The directory may or may not double, depending on whether the local depth of the overflown bucket was equal to the global depth before split. If the local depth was equal to global depth, d bits are not enough to distinguish the search values of the overflown bucket. Thus, a directory doubling occurs, which effectively uses one more bit from the hash value. The directory size is then doubled (this does not mean that the number of buckets is doubled as buckets will share directory entries). As an example, Fig. 8.7 illustrates extendible hashing after inserting a new record with key 63 into Fig.8.6. Bucket D overflows and the records in it are redistributed between D (where the last three bits of a record’s hash value are 011) and D0 (where the last three bits of a record’s hash value are 111). The directory doubles. The global depth is increased by one. The local depth of buckets D and D0 are increased by one, while the local depth of the other buckets remains to be two. Except 111, which points to the new bucket D0, each of the new directory entries points to the existing bucket which shares the last two bits. For instance, directory entry 101 points to the bucket referenced by directory entry 001. In general, if the local depth of a bucket is d0, the number of directory entries pointing to the bucket is 2dd0. All these directory entries share the last d0 bits. To split an overflown bucket whose local depth is smaller than the global depth, one does not need to double the size of the directory. Instead, half of the 2dd0 directory entries will point to the new bucket, and the local depth of both the overflown bucket and its split image are increased by one. For instance, Fig.8.8 illustrates the extendible hashing after inserting 17 and 13 into Fig. 8.7. Bucket B overflows and a split image, bucket B0, is created. There are two directory entries 112 CU IDOL SELF LEARNING MATERIAL (SLM)
(001 and 101) that pointed to B before the split. Half of them (101) now points to the split image B0. The local depth of both buckets B and B0 are increased by one. Figure 8.21: Extendible Hashing after inserting 63. Figure 8.22: Extendible Hashing after inserting 17 and 13. 8.3 DYNAMIC HASHING APPROACH IMPLEMENTATION AND PERFORMANCE Dynamic Hashing 113 CU IDOL SELF LEARNING MATERIAL (SLM)
▪ The dynamic hashing method is used to overcome the problems of static hashing like bucket overflow. ▪ In this method, data buckets grow or shrink as the records increases or decreases. This method is also known as Extendable hashing method. ▪ This method makes hashing dynamic, i.e., it allows insertion or deletion without resulting in poor performance. How to search a key ▪ First, calculate the hash address of the key. ▪ Check how many bits are used in the directory, and these bits are called as i. ▪ Take the least significant i bits of the hash address. This gives an index of the directory. ▪ Now using the index, go to the directory and find bucket address where the record might be. How to insert a new record ▪ Firstly, you have to follow the same procedure for retrieval, ending up in some bucket. ▪ If there is still space in that bucket, then place the record in it. ▪ If the bucket is full, then we will split the bucket and redistribute the records. For example: Consider the following grouping of keys into buckets, depending on the prefix of their hash address: Figure 8.23: Dynamic Hashing address with key The last two bits of 2 and 4 are 00. So, it will go into bucket B0. The last two bits of 5 and 6 are 01, so it will go into bucket B1. The last two bits of 1 and 3 are 10, so it will go into bucket B2. The last two bits of 7 are 11, so it will go into B3. 114 CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 8.24: Dynamic Hashing Bucket Representation Insert key 9 with hash address 10001 into the above structure: ▪ Since key 9 has hash address 10001, it must go into the first bucket. But bucket B1 is full, so it will get split. ▪ The splitting will separate 5, 9 from 6 since last three bits of 5, 9 are 001, so it will go into bucket B1, and the last three bits of 6 are 101, so it will go into bucket B5. ▪ Keys 2 and 4 are still in B0. The record in B0 pointed by the 000 and 100 entry because last two bits of both the entry are 00. ▪ Keys 1 and 3 are still in B2. The record in B2 pointed by the 010 and 110 entry because last two bits of both the entry are 10. ▪ Key 7 are still in B3. The record in B3 pointed by the 111 and 011 entry because last two bits of both the entry are 11. Figure 8.25: Dynamic Hashing After Insertion 115 Advantages of Dynamic Hashing: CU IDOL SELF LEARNING MATERIAL (SLM)
▪ In this method, the performance does not decrease as the data grows in the system. It simply increases the size of memory to accommodate the data. ▪ In this method, memory is well utilized as it grows and shrinks with the data. There will not be any unused memory lying. ▪ This method is good for the dynamic database where data grows and shrinks frequently. Disadvantages of Dynamic Hashing: ▪ In this method, if the data size increases then the bucket size is also increased. These addresses of data will be maintained in the bucket address table. This is because the data address will keep changing as buckets grow and shrink. If there is a huge increase in data, maintaining the bucket address table becomes tedious. ▪ In this case, the bucket overflow situation will also occur. But it might take little time to reach this situation than static hashing. 8.4 SUMMARY ▪ In DBMS, hashing is a technique to directly search the location of desired data on the disk without using index structure. ▪ Hashing method is used to index and retrieve items in a database as it is faster to search that specific item using the shorter hashed key instead of using its original value. ▪ Data bucket, Key, Hash function, Linear Probing, Quadratic probing, Hash index, Double Hashing, Bucket Overflow are important terminologies used in hashing ▪ Two types of hashing methods are 1) static hashing 2) dynamic hashing ▪ In the static hashing, the resultant data bucket address will always remain the same. ▪ Dynamic hashing offers a mechanism in which data buckets are added and removed dynamically and on demand. ▪ In order Indexing addresses in the memory are sorted according to a critical value while in hashing addresses are always generated using a hash function on the key value. ▪ Hash collision is a state when the resultant hashes from two or more data in the data set, wrongly map the same place in the hash table. ▪ Rehashing and chaining are two methods which help you to avoid hashing collision. 8.5 KEYWORDS ▪ Data bucket – Data buckets are memory locations where the records are stored. ▪ Key: A DBMS key is an attribute or set of an attribute which helps you to identify a row(tuple) in a relation(table). This allows you to find the relationship between two tables. 116 CU IDOL SELF LEARNING MATERIAL (SLM)
▪ Hash function: A hash function, is a mapping function which maps all the set of search keys to the address where actual records are placed. ▪ Linear Probing – Linear probing is a fixed interval between probes. In this method, the next available data block is used to enter the new record, instead of overwriting on the older record. ▪ Quadratic probing- It helps you to determine the new bucket address. It helps you to add Interval between probes by adding the consecutive output of quadratic polynomial to starting value given by the original computation. ▪ Hash index – It is an address of the data block. A hash function could be a simple mathematical function to even a complex mathematical function. ▪ Double Hashing –Double hashing is a computer programming method used in hash tables to resolve the issues of has a collision. ▪ Bucket Overflow: The condition of bucket-overflow is called collision. This is a fatal stage for any static has to function. 8.6 LEARNING ACTIVITY Colleges have multiple departments where every department offers many courses. These departments have a head (HOD) and various instructors. Even though there are many instructors, one instructor can only work in one department. As you can see the organization structure of a college is quite complicated and requires a lot of effort to manage. You can add the enrollment information of the students as to how many students have taken a particular course. The system should allow easy access. Your developed DBMS-based solution would allow a college to save a lot of time and resources; moreover, the user could see all the college information from one place and modify it accordingly. ___________________________________________________________________________ ____________________________________________________________________ 8.7 UNIT END QUESTIONS A. Descriptive Questions Short Questions 1. What is meant by hashing? 2. Define hash function? 3. List out the types of hashing. 4. What is an open address in hashing? 5. What is dynamic hashing? 117 CU IDOL SELF LEARNING MATERIAL (SLM)
Long Questions 1. Explain two types of SQL hashing methods. 2. List and explain the hashing collision resolution technique. 3. Explain in detail about Open address in hashing. 4. Difference between open hashing and closed hashing. 5. Explain the important terminologies which are used in Hashing. B. Multiple choice Questions 1. The term ______ is used to denote a unit of storage that can store one or more records a. Basket b. Bucket c. Unit d. Set 2. If K denotes the set of all the search key values, and B denotes the set of all bucket addresses, a function from K to B is called as __________ a. Bucket function b. Address function c. Hash function d. Search function 3. In a __________ , we obtain the address of the disk block containing a desired record directly by computing a function on the search key value of the record a. Hash file organization b. Hash index organization c. Hashing address d. None of these 4. In a __________ we organize the search keys, with their associated pointers, into a hash file structure a. Hash file organization b. Hash index organization c. Hashing address d. None of these 5. What is a bucket overflow? a. When a bucket does not have enough space b. There are insufficient buckets c. When Bucket skew occurs 118 CU IDOL SELF LEARNING MATERIAL (SLM)
d. All of these Answer 1.(b) 2. (c) 3. (a) 4. (b) 5. (b) 8.8 REFERENCES Text Books: • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson Education, New Delhi. • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi. • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley Publications, New Delhi. Reference Books: • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts, McGraw-Hill, International Edition. • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications. 119 CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 9: RELATIONAL DATA MODEL Structure 9.0 Learning Objective 9.1 Relational Model Concepts 9.2 Relational Constraints 9.3 Summary 9.4 Keywords 9.5 Learning Activity 9.6 Unit End Questions 9.7 References 9.0 LEARNING OBJECTIVES After studying this unit, student will be able to ▪ Learn the Relational Model and the uses of constraints. ▪ Discuss the concepts in relational constraints. 9.1 RELATIONAL MODEL CONCEPTS The database is defined as a set of relations in the Relational Model (RM). A table of values is all that a relationship is. Each table row represents a grouping of related data values. The table's rows represent a real-world person or relationship. The table name and column names are helpful to interpret the meaning of values in each row. The data are represented as a set of relations. In the relational model, data are stored as tables. However, the physical storage of the data is independent of the way the data are logically organized. The following are a few of the most widely used Relational Database Management Systems: ▪ DB2 and Informix Dynamic Server - IBM ▪ Oracle and RDB – Oracle ▪ SQL Server and Access - Microsoft Relational Model Concepts: Attribute: Each column in a Table. Attributes are the properties which define a relation. e.g., Student_Rollno, NAME, etc. Tables – In the Relational model the, relations are saved in the table format. It is stored along with its entities. A table has two properties rows and columns. Rows represent records and columns represent attributes. Tuple – It is nothing but a single row of a table, which contains a single record. Relation Schema: A relation schema represents the name of the relation with its attributes. Degree: The total number of attributes which in the relation is called the degree of the relation. 120 CU IDOL SELF LEARNING MATERIAL (SLM)
Cardinality: Total number of rows present in the Table. Column: The column represents the set of values for a specific attribute. Relation instance – Relation instance is a finite set of tuples in the RDBMS system. Relation instances never have duplicate tuples. Relation key - Every row has one, two or multiple attributes, which is called relation key. – Every attribute has some pre-defined value and scope which is known as attribute domain Figure 9.1: Relation Representation 9.2 RELATIONAL CONSTRAINTS Relational Integrity Constraints Relational Integrity constraints in DBMS are referred to conditions that must be present for a valid relation. These Relational constraints in DBMS are derived from the rules in the mini- world that the database represents. There are many types of Integrity Constraints in DBMS. Constraints on the Relational database management system is mostly divided into three main categories are: 1. Domain Constraints 2. Key Constraints 3. Referential Integrity Constraints Domain Constraints Domain constraints can be violated if an attribute value is not appearing in the corresponding domain or it is not of the appropriate data type. Domain constraints specify that within each tuple, and the value of each attribute must be unique. This is specified as data types which include standard data types integers, real numbers, characters, Booleans, variable length strings, etc. Example: Create DOMAIN CustomerName CHECK (value not NULL) The example shown demonstrates creating a domain constraint such that CustomerName is not NULL Key Constraints 121 CU IDOL SELF LEARNING MATERIAL (SLM)
An attribute that can uniquely identify a tuple in a relation is called the key of the table. The value of the attribute for different tuples in the relation has to be unique. Example: In the given table, CustomerID is a key attribute of Customer Table. It is most likely to have a single key for one customer, CustomerID =1 is only for the CustomerName =\" Google\". Table 9.1: Customer Table CustomerID CustomerName Status 1 Google Active 2 Amazon Active 3 Apple Inactive Referential Integrity Constraints Referential Integrity constraints in DBMS are based on the concept of Foreign Keys. A foreign key is an important attribute of a relation which should be referred to in other relationships. Referential integrity constraint state happens where relation refers to a key attribute of a different or same relation. However, that key element must exist in the table. Example: In the below example, we have 2 relations, Customer and Billing. Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know CustomerName=Google has billing amount $300 122 CU IDOL SELF LEARNING MATERIAL (SLM)
Table 9.2: Customer and Billing Relation Operations in Relational Model Four basic update operations performed on relational database model are Insert, update, delete and select. • Insert is used to insert data into the relation • Delete is used to delete tuples from the table. • Modify allows you to change the values of some attributes in existing tuples. • Select allows you to choose a specific range of data. Whenever one of these operations are applied, integrity constraints specified on the relational database schema must never be violated. Insert Operation The insert operation gives values of the attribute for a new tuple which should be inserted into a relation. Figure 9.3: Insert Operation Update Operation You can see that in the below-given relation table CustomerName= 'Apple' is updated from Inactive to Active. Figure 9.4: Update Operation Delete Operation To specify deletion, a condition on the attributes of the relation selects the tuple to be deleted. Figure 9.5: Delete Operation 123 In the above-given example, Customer Name= \"Apple\" is deleted from the table. CU IDOL SELF LEARNING MATERIAL (SLM)
The Delete operation could violate referential integrity if the tuple which is deleted is referenced by foreign keys from other tuples in the same database. Select Operation Figure 9.6: Select Operation In the above-given example, Customer Name=\"Amazon\" is selected Best Practices for creating a Relational Model • Data need to be represented as a collection of relations • Each relation should be depicted clearly in the table • Rows should contain data about instances of an entity • Columns must contain data about attributes of the entity • Cells of the table should hold a single value • Each column should be given a unique name • No two rows can be identical • The values of an attribute should be from the same domain Advantages of using Relational Model • Simplicity: A Relational data model in DBMS is simpler than the hierarchical and network model. • Structural Independence: The relational database is only concerned with data and not with a structure. This can improve the performance of the model. • Easy to use: The Relational model in DBMS is easy as tables consisting of rows and columns are quite natural and simple to understand • Query capability: It makes possible for a high-level query language like SQL to avoid complex database navigation. • Data independence: The Structure of Relational database can be changed without having to change any application. • Scalable: Regarding a number of records, or rows, and the number of fields, a database should be enlarged to enhance its usability. Disadvantages of using Relational Model • Few relational databases have limits on field lengths which can't be exceeded. • Relational databases can sometimes become complex as the amount of data grows, and the relations between pieces of data become more complicated. • Complex relational database systems may lead to isolated databases where the information cannot be shared from one system to another. 124 CU IDOL SELF LEARNING MATERIAL (SLM)
9.3 SUMMARY ▪ The Relational database modelling represents the database as a collection of relations (tables) ▪ Attribute, Tables, Tuple, Relation Schema, Degree, Cardinality, Column, Relation instance, are some important components of Relational Model ▪ Relational Integrity constraints are referred to conditions which must be present for a valid Relation approach in DBMS ▪ Domain constraints can be violated if an attribute value is not appearing in the corresponding domain or it is not of the appropriate data type ▪ Insert, Select, Modify and Delete are the operations performed in Relational Model constraints ▪ The relational database is only concerned with data and not with a structure which can improve the performance of the model ▪ Advantages of Relational model in DBMS are simplicity, structural independence, ease of use, query capability, data independence, scalability, etc. ▪ Few relational databases have limits on field lengths which can't be exceeded. 9.4 KEYWORDS • Insert: Insert is used to insert data into the relation • Delete: Delete is used to delete tuples from the table. • Modify: Modify allows you to change the values of some attributes in existing tuples. • Select: Select allows you to choose a specific range of data 9.5 LEARNING ACTIVITY Library Data Management If you’re an avid reader, then chances are, you must’ve gone to a library. And you may already know how many books a library has to keep track of. Libraries don’t have a lot of staff, but they have to keep a record of all the books they have and the books they have lent. You can simplify the management of a library’s data. You should start with students and faculties, i.e., people can get books from the library. Now, there would be a significant difference between the number of books a student can get and the number of books a faculty can get. So, add those limits in your system as well. Then, every book would have a unique ID. Books with the same title and author would have different IDs according to their copies. You’ll have to add entries for every book. And then, add the details of who issued the book and when with the duration of their ownership. Your DBMS-based solution should also have details on the books that people haven’t returned and the due fines. 125 CU IDOL SELF LEARNING MATERIAL (SLM)
___________________________________________________________________________ ____________________________________________________________________ 9.6 UNIT END QUESTIONS A. Descriptive Questions Short Questions 1. What is Relational Model? 2. What is Referential Integrity Constraints? 3. Write short notes on Domain constraints. 4. What is a table? 5. Define tuple? Long Questions 1. Explain in detail about Relational Model Concepts. 2. Explain the Relational Constraints. 3. What is the constraint in a database? Explain types of constraints with suitable example. 4. Explain in detail the operation of the Relational Constraint Model. 5. What are the types of relational constraints? Explain in detail? B. Multiple choice Questions 1. An attribute is a __________ in a relation. a. Row b. Column c. Value d. Tuple 2. Statement 1: A tuple is a row in a relation Statement 2: Existence of multiple foreign keys in a same relation is possible a. Both the statements are true b. Statement 1 is correct but Statement 2 is false c. Statement 1 is false but Statement 2 is correct d. Both the statements are false 3. Data integrity constraints are used to: 126 a. Control who is allowed access to the data b. Ensure that duplicate records are not entered into the table c. Improve the quality of data entered for a specific property CU IDOL SELF LEARNING MATERIAL (SLM)
d. Prevent users from changing the values stored in the table 4. To include integrity constraint in an existing relation use : a. Create table b. Modify table c. Alter table d. Drop table 5. Foreign key is the one in which the ________ of one relation is referenced in another relation. a. Foreign key b. Primary key c. References d. Check constraint Answer 1.(b) 2. (a) 3. (c) 4. (c) 5. (b) 9.7 REFERENCES Text Books: • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson Education, New Delhi. • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi. • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley Publications, New Delhi. Reference Books: • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts, McGraw-Hill, International Edition. • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications. 127 CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 10: RELATIONAL ALGEBRA SQL Structure 10.0 Learning objective 10.1 SQL Queries 10.2 Programming Using SQL 10.3 Summary 10.4 Keywords 10.5 Learning Activity 10.6 Unit End Questions 10.7 References 10.0 LEARNING OBJECTIVE After studying this unit, student will be able to ▪ Explain how to build new relations from other relations in the database ▪ Write SQL commands to create tables and indexes, insert/update/delete data, and query data in a relational DBMS 10.1 SQL QUERIES Relational Algebra: ▪ Relational Algebra is a procedural query language that accepts instances of relations as input and outputs instances of relations. To run queries, it hires operators. ▪ Unary and binary operators are also possible. They take relations as feedback and give them back relations. ▪ Relational algebra is performed recursively on relation and intermediate results are also considered relations. Types of Relational Operation: Figure 10.1 Relational Operation 128 CU IDOL SELF LEARNING MATERIAL (SLM)
Select Operator (σ): Select Operator is denoted by sigma (σ) and it is used to find the tuples (or rows) in a relation (or table) which satisfy the given condition. If you understand little bit of SQL then you can think of it as a where clause in SQL, which is used for the same purpose. Syntax of Select Operator (σ) σ Condition / Predicate (Relation/Table name) Select Operator (σ) Example Table 10.1 Customer Customer_ID Customer_Name Customer_City C10100 Steve Agra C10111 Raghu Agra C10115 Noida C10117 Chaitanya Delhi C10118 Ajeet Delhi Carl Query: σ Customer_City=\"Agra\" (CUSTOMER) SQL Query: Select * from customer where Customer_City =”Agra”; Output: Table 10.2 Output of the Query Customer ID Customer Name Customer City C10100 Steve Agra C10111 Raghu Agra Project Operator (∏): Project operator is denoted by ∏ symbol and it is used to select desired columns (or attributes) from a table (or relation). Project operator in relational algebra is similar to the Select statement in SQL. Syntax of Project Operator (∏) ∏ column_name1, column_name2, ...., column_nameN(table_name) Project Operator (∏) Example In this example, we have a table CUSTOMER with three columns, we want to fetch only two columns of the table, which we can do with the help of Project Operator ∏. 129 CU IDOL SELF LEARNING MATERIAL (SLM)
Table 10.3 Customer Customer_ID Customer_Name Customer_City C10100 Steve Agra C10111 Raghu Agra C10115 Noida C10117 Chaitanya Delhi C10118 Ajeet Delhi Carl Query: ∏ Customer_Name, Customer_City (CUSTOMER) SQL Query: Select Customer_Name, Customer_City from customer; Output: Table 10.4 Output of the Query Customer Name Customer City Steve Agra Raghu Agra Noida Chaitanya Delhi Ajeet Delhi Carl Union Operator (∪): Union operator is denoted by ∪ symbol and it is used to select all the rows (tuples) from two tables (relations). ▪ Let’s say we have two relations R1 and R2 both have same columns and we want to select all the tuples(rows) from these relations then we can apply the union operator on these relations. ▪ The rows (tuples) that are present in both the tables will only appear once in the union set. In short you can say that there are no duplicates present after the union operation. Syntax of Union Operator (∪) table_name1 ∪ table_name2 130 CU IDOL SELF LEARNING MATERIAL (SLM)
Query: ∏ Student_Name (student1) ∪ ∏ Student_Name (student2) Table 10.5 Student1 ID NAME 1 Jack 2 Harry 3 Jackson Table 10.6 Student2 ID NAME 3 Jackson 4 Stephan 5 David SQL Query: SELECT * FROM student1 UNION SELECT * FROM student2; Table 10.7 Output of the Query ID NAME 1 Jack 2 Harry 131 CU IDOL SELF LEARNING MATERIAL (SLM)
3 Jackson 4 Stephan 5 David Union All: Union All operation is equal to the Union operation. It returns the set without removing duplication and sorting the data. Syntax: SELECT column_name FROM table1 UNION ALL SELECT column_name FROM table2; SQL Query: SELECT * FROM student1 UNION ALL SELECT * FROM Student2; Table 10.8 Output of the Query ID NAME 1 Jack 2 Harry 3 Jackson 3 Jackson 4 Stephan 5 David Intersection Operator (∩): Intersection operator is denoted by ∩ symbol and it is used to select common rows (tuples) from two tables (relations). Let’s say we have two relations R1 and R2 both have same columns and we want to select all those tuples(rows) that are present in both the relations, then in that case we can apply intersection operation on these two relations R1 ∩ R2. 132 CU IDOL SELF LEARNING MATERIAL (SLM)
Syntax of Intersection Operator (∩) table_name1 ∩ table_name2 Query: ∏ Student_Name (student1) ∩ ∏ Student_Name (student2) SQL Query: SELECT * FROM student1 INTERSECT SELECT * FROM student2; Table 10.9 Output of the Query ID NAME 3 Jackson Set Difference (-): Set Difference is denoted by – symbol. Let’s say we have two relations R1 and R2 and we want to select all those tuples(rows) that are present in Relation R1 but not present in Relation R2, this can be done using Set difference R1 – R2. ▪ It combines the result of two SELECT statements. Minus operator is used to display the rows which are present in the first query but absent in the second query. ▪ It has no duplicates and data arranged in ascending order by default. Syntax of Set Difference (-) table_name1 - table_name2 Query: ∏ Student_Name (STUDENT) - ∏ Student_Name (COURSE) SQL Query: SELECT * FROM table1 MINUS 133 CU IDOL SELF LEARNING MATERIAL (SLM)
SELECT * FROM table2; Table 10.10 Output of the Query ID NAME 1 Jack 2 Harry Cartesian product (X): Cartesian Product is denoted by X symbol. Let’s say we have two relations R1 and R2 then the cartesian product of these two relations (R1 X R2) would combine each tuple of first relation R1 with each tuple of second relation R2. Syntax of Cartesian product (X) R1 X R2 Query: ∏ Student_Name (STUDENT) X ∏ Student_Name (Details) Table 10.11 Student SNO FNAME LNAME 1 Albert Singh 2 Nora Fatehi Table 10.12 Details AGE ROLLNO 18 21 5 9 SQL Query: SELECT * FROM 134 CU IDOL SELF LEARNING MATERIAL (SLM)
student, Table 10.12 Details Details; FNAME LNAME ROLLNO AGE SNO Albert Singh 5 18 1 Albert Singh 9 21 1 Nora Fatehi 5 18 2 Nora Fatehi 9 21 2 Rename (ρ): Rename (ρ) operation can be used to rename a relation or an attribute of a relation. Rename (ρ) Syntax: ρ (new_relation_name, old_relation_name) Rename (ρ) Example: Let’s say we have a table customer; we are fetching customer names and we are renaming the resulted relation to CUST_NAMES. Table 10.13 Customer Customer_ID Customer_Name Customer_City C10100 Steve Agra C10111 Raghu Agra C10115 Chaitanya Noida C10117 Ajeet Delhi C10118 Carl Delhi Query: ρ(CUST_NAMES, ∏(Customer_Name)(CUSTOMER)) SQL Query: SELECT CustomerName as CUST_NAMES FROM Customer; 135 CU IDOL SELF LEARNING MATERIAL (SLM)
Table 10.14 Output of the Query CUST_NAMES Steve Raghu Chaitanya Ajeet Carl 10.2 PROGRAMMING USING SQL Structured Query Language is a standard Database language that is used to create, maintain and retrieve the relational database. Following are some interesting facts about SQL. ▪ SQL is case insensitive. But it is a recommended practice to use keywords (like SELECT, UPDATE, CREATE, etc) in capital letters and use user-defined things (liked table name, column name, etc) in small letters. ▪ We can write comments in SQL using “–” (double hyphen) at the beginning of any line. ▪ SQL is the programming language for relational databases (explained below) like MySQL, Oracle, Sybase, SQL Server, Postgre, etc. Other non-relational databases (also called NoSQL) databases like MongoDB, DynamoDB, etc do not use SQL ▪ Although there is an ISO standard for SQL, most of the implementations slightly vary in syntax. So we may encounter queries that work in SQL Server but do not work in MySQL. Table 10.14 Student ROLL_NO NAME ADDRESS PHONE AGE 1 RAM DELHI 2 9455123451 18 3 RAMESH GURGAON 9652431543 18 4 SUJIT ROHTAK 9156253131 20 9156768971 18 SURESH DELHI DDL COMMAND: Data Definition Language helps you to define the database structure or schema. Let's learn about DDL commands with syntax. Five types of DDL commands in SQL are: 136 CU IDOL SELF LEARNING MATERIAL (SLM)
CREATE 137 CREATE statements is used to define the database structure schema: Syntax: CREATE TABLE table_name ( column1 datatype, column2 datatype, column3 datatype, .... ); For example: Create database university; CREATE TABLE Student ( ROLL_NO int, Name varchar(255), Address varchar(255), Phone varchar(12), age int ); DROP Drops commands remove tables and databases from RDBMS. Syntax DROP TABLE Table_Name; For example: Drop table student; ALTER Alters command allows you to alter the structure of the database. Syntax: To add a new column in the table ALTER TABLE table_name ADD column_name COLUMN-definition; To modify an existing column in the table: ALTER TABLE MODIFY (COLUMN DEFINITION....); For example: Alter table Student add subject varchar (255); Table 10.16 After add Subject CU IDOL SELF LEARNING MATERIAL (SLM)
ROLL_NO NAME ADDRESS PHONE AGE Subject 1 2 RAM DELHI 9455123451 18 3 RAMESH GURGAON 9652431543 18 4 SUJIT ROHTAK 9156253131 20 SURESH DELHI 9156768971 18 TRUNCATE: This command used to delete all the rows from the table and free the space containing the table. Syntax: TRUNCATE TABLE table_name; Example: TRUNCATE table students; DML COMMAND: Data Manipulation Language (DML) allows you to modify the database instance by inserting, modifying, and deleting its data. It is responsible for performing all types of data modification in a database. There are three basic constructs which allow database program and user to enter data and information are: 1. INSERT 2. UPDATE 3. DELETE INSERT: This is a statement is a SQL query. This command is used to insert data into the row of a table. Syntax: INSERT INTO TABLE_NAME (col1, col2, col3,.... col N) VALUES (value1, value2, value3, .... valueN); Or INSERT INTO TABLE_NAME VALUES (value1, value2, value3, .... valueN); For example: INSERT INTO students VALUES ('60', 'Tom', ‘abcdef',’9787097870’,’35’); Table 10.17 After Insertion 138 CU IDOL SELF LEARNING MATERIAL (SLM)
ROLL_NO NAME ADDRESS PHONE AGE 1 RAM DELHI 2 9455123451 18 3 RAMESH GURGAON 9652431543 18 4 SUJIT ROHTAK 9156253131 20 60 9156768971 18 SURESH DELHI 9787097870 35 Tom abcdef UPDATE: This command is used to update or modify the value of a column in the table. Syntax: UPDATE table_name SET [column_name1= value1,...column_nameN = valueN] [WHERE CONDITION] For example: UPDATE students SET Name = 'Jhon', phone= '9595950000' WHERE ROLL_NO = 3; Table 10.18 After Update ROLL_NO NAME ADDRESS PHONE AGE 1 RAM DELHI 2 9455123451 18 3 RAMESH GURGAON 9652431543 18 4 Jhon ROHTAK 9595950000 20 60 9156768971 18 SURESH DELHI 9787097870 35 Tom abcdef DELETE: This command is used to remove one or more rows from a table. Syntax: DELETE FROM table_name [WHERE condition]; For example: DELETE FROM students WHERE FirstName = 'Jhon'; Table 10.19 After Delete 139 CU IDOL SELF LEARNING MATERIAL (SLM)
ROLL_NO NAME ADDRESS PHONE AGE 1 RAM DELHI 9455123451 18 2 RAMESH GURGAON 9652431543 18 4 SURESH DELHI 9156768971 18 60 Tom abcdef 9787097870 35 DCL (Data Control Language): DCL includes commands like GRANT and REVOKE, which are useful to give \"rights & permissions.\" Other permission controls parameters of the database system. Examples of DCL commands: Commands that come under DCL: ▪ Grant ▪ Revoke Grant: This command is use to give user access privileges to a database. Syntax: GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER; For example: GRANT SELECT ON Users TO'Tom'@'localhost; Revoke: It is useful to back permissions from the user. Syntax: REVOKE privilege_nameON object_nameFROM {user_name |PUBLIC |role_name} For example: REVOKE SELECT, UPDATE ON student FROM BCA, MCA; TRANSACTION CONTROL LANGUAGE (TCL): TCL commands deal with the transaction within the database. Commit This command is used to save all the transactions to the database. Syntax: Commit; For example: DELETE FROM Students WHERE RollNo =4; COMMIT; Rollback 140 CU IDOL SELF LEARNING MATERIAL (SLM)
Rollback command allows you to undo transactions that have not already been saved to the database. Syntax: ROLLBACK; Example: DELETE FROM Students WHERE RollNo =4; SAVEPOINT This command helps you to sets a save point within a transaction. Syntax: SAVEPOINT SAVEPOINT_NAME; Example: SAVEPOINT RollNo; Data Query Language (DQL): DQL is used to fetch the data from the database. It uses only one command: SELECT: This command helps you to select the attribute based on the condition described by the WHERE clause. Syntax: SELECT expressions FROM TABLES WHERE conditions; For example: SELECT Name FROM Student WHERE RollNo > 3; Table 10.20 Output of the Query NAME SURESH Tom 10.3 SUMMARY • Select Operation (σ): It selects tuples that satisfy the given predicate from a relation. • Project Operation (∏): It projects column(s) that satisfy a given predicate. Notation − ∏A1, A2, An (r) Where A1, A2, An are attribute names of relation r. Duplicate rows are automatically eliminated, as relation is a set. • Intersection Operator (∩): Only those rows that are present in both the tables will appear in the result set. 141 CU IDOL SELF LEARNING MATERIAL (SLM)
• Cartesian Product (Χ): Combines information of two different relations into one. • Set Difference (−): The result of set difference operation is tuples, which are present in one relation but are not in the second relation. • Rename Operation (ρ):The rename operation allows us to rename the output relation. 'rename' operation is denoted with small Greek letter rho ρ. • SQL is a database language designed for the retrieval and management of data in a relational database. • It helps users to access data in the RDBMS system • Data Definition Language (DDL) helps you to define the database structure or schema. • Data Manipulation Language (DML) allows you to modify the database instance by inserting, modifying, and deleting its data. • DCL (Data Control Language) includes commands like GRANT and REVOKE, which are useful to give \"rights & permissions.\" • Transaction control language or TCL commands deal with the transaction within the database. • Data Query Language (DQL) is used to fetch the data from the database. 10.4 KEYWORDS • TRC: Tuple Relational Calculus • DRC: Domain Relational Calculus • DDL: Data Definition Language (DDL) • DML: Data Manipulation Language (DML) • DCL: Data Control Language (DCL) • TCL: Transaction Control Language (TCL) • DQL: Data Query Language (DQL) 10.5 LEARNING ACTIVITY You can build a solution that saves student records for an educational institution. Handling student records is no easy feat. You need to keep their name, subjects, fees, any provision of concession, and their academic progress. A DBMS-based solution will allow the client to save a lot of time and effort. Your design goal should be to have separate files for each student where the data will store information about the student. You can start by adding the following sections: ▪ Student’s Name ▪ Fees ▪ Subjects (or Stream) ▪ Grades (or Marks) ▪ Concessions (or Scholarship) 142 CU IDOL SELF LEARNING MATERIAL (SLM)
▪ Additional Information It’s one of the easy database project ideas. You can take it a step further, and add the option to include students of different grades or sections. Your designed system should allow the admin to enter the details mentioned above. And the admin should be able to access it easily. ___________________________________________________________________________ ____________________________________________________________________ 10.6 UNIT END QUESTIONS A. Descriptive Questions Short Questions 1. What is meant by relational algebra? 2. Define Rename operation in Relational Algebra with example? 3. Explain the Cartesian Product Operation in relational algebra. 4. What is DML command in sql? 5. What is DDL and TCL command? Long Questions 1. List and explain the fundamental operation of relational algebra. 2. Explain Select and Intersection operation of Relational Algebra with example. 3. Explain Project and Union operation of Relational Algebra with example. 4. Consider the following Relational Database. Student (roll_no, name, city,marks,c_no) Course (c_no,cname,fees) Construct Queries into Relational algebra. a. List Student Details enrolled for ‘BBA (C.A)’ Course. b. List the Course having fees < 20000 c. Display all students living in either ‘Nasik’ or ‘Pune’ city. d. Display Course detail for student ‘Gaurav Sharma’. 5. List out the various DDL commands. Explain anyone with an example. B. Multiple choice Questions 1. Relational Algebra is a __________ query language that takes two relations as input and produces another relation as an output of the query. a. Relational b. Structural c. Procedural d. Fundamental 143 CU IDOL SELF LEARNING MATERIAL (SLM)
2. Which of the following is a fundamental operation in relational algebra? a. Set intersection b. Natural join c. Assignment d. None of these 3. Which of the following is used to denote the selection operation in relational algebra? a. Pi (Greek) b. Sigma (Greek) c. Lambda (Greek) 4. Which of the following is not a DDL command? a. UPDATE b. TRUNCATE c. ALTER d. None of these 5. Which of the following are TCL commands? a. UPDATE and TRUNCATE b. SELECT and INSERT c. GRANT and REVOKE d. ROLLBACK and SAVEPOINT Answer 1.(a) 2. (d) 3. (b) 4. (a) 5. (d) 10.7 REFERENCES Text Books: • T1 R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Pearson Education, New Delhi. • T2 C.J. Date, An Introduction to Database Systems Pearson Education, New Delhi. • T3 Data, C. and Darwen, H, Reading, A Guide to the SQL Standard, Addison-Wesley Publications, New Delhi. Reference Books: • R1 A. Silberschatz, H.F. Korth and S. Sudarshan, Database System Concepts, McGraw-Hill, International Edition. • R2 Ivan Bayross, SQL / PL/SQL, BPB Publications. 144 CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 11: EER AND ER TO RELATIONAL MAPPING Structure 11.0 Learning objective 11.1 Database Design Using EER To Relational Language 11.2 Summary 11.3 Keywords 11.4 Learning Activity 11.5 Unit End Questions 11.6 References 11.0 LEARNING OBJECTIVE After studying this unit, student will be able to ▪ Design and implement the ER diagram of the system. ▪ Explain to Convert ER diagram to Relational Model for different scenarios. 11.1 DATABASE DESIGN USING EER TO RELATIONAL LANGUAGE Outline • ER-to-Relational mapping algorithm • step 1: mapping of regular entity types • step 2: napping of weak entity types • step 3: mapping of binary 1:1 relation types • step 4: mapping of binary 1:N relationship types • step 5: mapping of binary M:N relationship types • step 6: mapping of multivalued attributes • step 7: mapping of N-ary relationship types • mapping EER model constructs to relations • step 8: options for mapping specialization or generalization • step 9: mapping of union types (categories) 145 CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 11.1: ER conceptual schema diagram Figure 11.2: Mapping the Company ER schema into a relational database schema 146 CU IDOL SELF LEARNING MATERIAL (SLM)
Together 147 CU IDOL SELF LEARNING MATERIAL (SLM)
Figure 11.3: Diagram for together relation schema An algorithm for ER-to-relational mapping Step 1: mapping of regular entity types • for each regular entity type E in the ER schema, create a relation R that includes all the simple attributes of E • include only the simple component attributes of a composite attribute choose one of the key attributes of E as primary key for R • if the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R • e.g., EMPLOYEE, DEPARTMENT, PROJECT Step 2: mapping of weak entity types • for each weak entity type W in the ER schema with owner entity type E, create a relation R and include all simple attributes of Was attributes of R • include as foreign key attributes of R the primary key attribute(s) of the relation(s) that correspond to the owner entity type(s) • the primary key of R is the combination of the primary key(s) of the owner(s) and the partial key of the weak entity type W, if any • if there is a weak entity type E2 whose owner is also a weak entity type E1, then E1 should be mapped before E2 to determine its primary key first 148 CU IDOL SELF LEARNING MATERIAL (SLM)
• e.g., DEPENDENT Step 3: mapping of binary 1:1 relationship types • for each binary 1:1 relationship type R in the ER schema, identify the relations S and T that correspond to the entity types participating in R • foreign key approach ▪ choose one of the relations, S, and include as a foreign key in S the primary key of T ▪ include all the simple attributes of R as attributes of S • merged relation option ▪ merge the two entity types and the relationship into a single relation • relationship relation option ▪ set up a third relation R for the purpose of cross-referencing the primary keys of S and T • example: MANAGES -> DEPARTMENT.MGRSSN DEPARTMENT.MGRSTARTDATE Step 4: mapping of binary 1: N relationship types • for each binary 1:N relationship type R, identify the relation S that represents the participating entity type at the N-side of the relationship type • include as foreign key in S the primary key of the relation T that represents the other entity type participating in R • include any simple attributes of the 1:N relationship type as attributes of S • e.g., WORKS_FOR: S = EMPLOYEE, T = DEPARTMENT, DNO: the primary key of T Step 5: mapping of binary M: N relationship types • for each binary M:N relationship type R, create a new relation S to represent R • include as foreign key attributes in S the primary keys of the relations that represent the participating entity types • their combination will form the primary key of S • include any simple attributes of R as attributes of S • e.g., WORKS_ON: S = WORKS_ON Step 6: mapping of multivalued attributes • for each multivalued attribute A, create a new relation R • R will include an attribute corresponding to A, plus the primary key attribute K - as a foreign key in R – of the relation that represents the entity type or relationship type that has A as an attribute 149 CU IDOL SELF LEARNING MATERIAL (SLM)
• the primary key of R is the combination of A and K • if the multivalued attribute is composite, include its simple components • e.g., Locations: A = DLOCATION, R = DEPT_LOCATIONS, K =DNUMBER Step 7: mapping of N-ary relationship types • for each n-ary relationship type R, where n > 2, create a new relation S to represent R • include as foreign key attributes in S the primary keys of the relations that represent the participating entity types • include any simple attributes of R as attributes of S • the primary key of S is usually a combination of all the foreign key reference the relations representing the participating entity types • e.g., SUPPLY Figure 11.4: Diagram for mapping steps7 An algorithm for EER-to-relational mapping Step 8: options for mapping specialization or generalization • convert each specialization with m subclasses {S1, S2, …, Sm} and superclass C, where the attributes of C are {k, a1, …, an} and k is the key, into relation schemas using one of the following options: • option 8A: multiple relations-superclass and subclasses o create a relation L for C with attributes Attrs(L) = {k, a1, …, an}and PK(L) = k o create a relation L i for each subclass Si, with the attributes Attrs(Li) = {k} {attributes of Si} and PK(Li) = k • works for any specialization (total or partial, disjoint or overlapping) 150 CU IDOL SELF LEARNING MATERIAL (SLM)
Search
Read the Text Version
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 37
- 38
- 39
- 40
- 41
- 42
- 43
- 44
- 45
- 46
- 47
- 48
- 49
- 50
- 51
- 52
- 53
- 54
- 55
- 56
- 57
- 58
- 59
- 60
- 61
- 62
- 63
- 64
- 65
- 66
- 67
- 68
- 69
- 70
- 71
- 72
- 73
- 74
- 75
- 76
- 77
- 78
- 79
- 80
- 81
- 82
- 83
- 84
- 85
- 86
- 87
- 88
- 89
- 90
- 91
- 92
- 93
- 94
- 95
- 96
- 97
- 98
- 99
- 100
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
- 111
- 112
- 113
- 114
- 115
- 116
- 117
- 118
- 119
- 120
- 121
- 122
- 123
- 124
- 125
- 126
- 127
- 128
- 129
- 130
- 131
- 132
- 133
- 134
- 135
- 136
- 137
- 138
- 139
- 140
- 141
- 142
- 143
- 144
- 145
- 146
- 147
- 148
- 149
- 150
- 151
- 152
- 153
- 154
- 155
- 156
- 157
- 158
- 159
- 160
- 161
- 162
- 163
- 164
- 165
- 166
- 167
- 168
- 169
- 170
- 171
- 172
- 173
- 174
- 175
- 176
- 177
- 178
- 179
- 180
- 181
- 182
- 183
- 184
- 185
- 186
- 187
- 188
- 189
- 190
- 191
- 192
- 193
- 194
- 195
- 196
- 197
- 198
- 199
- 200
- 201
- 202
- 203
- 204
- 205
- 206
- 207
- 208
- 209
- 210