Fundamentals of Database Systems [ PART II ]

Published by Willington Island, 2021-09-06 03:27:43

Description: [ PART II ]

For database systems courses in Computer Science

This book introduces the fundamental concepts necessary for designing, using, and implementing database systems and database applications. Our presentation stresses the fundamentals of database modeling and design, the languages and models provided by the database management systems, and database system implementation techniques.

The book is meant to be used as a textbook for a one- or two-semester course in database systems at the junior, senior, or graduate level, and as a reference book. The goal is to provide an in-depth and up-to-date presentation of the most important aspects of database systems and applications, and related technologies. It is assumed that readers are familiar with elementary programming and data-structuring concepts and that they have had some exposure to the basics of computer organization.


1200 Bibliography Lenat, D. [1995] “CYC: A Large-Scale Investment in Longley, P. et al [2001] Geographic Information Systems Knowledge Infrastructure,” CACM 38:11, November and Science, John Wiley, 2001. 1995, pp. 32–38. Lorie, R. [1977] “Physical Integrity in a Large Segmented Lenzerini, M., and Santucci, C. [1983] “Cardinality Con- Database,” TODS, 2:1, March 1977. straints in the Entity Relationship Model,” in ER Con- ference [1983]. Lorie, R., and Plouffe, W. [1983] “Complex Objects and Their Use in Design Transactions,” in SIGMOD [1983]. Leung, C., Hibler, B., and Mwara, N. [1992] “Picture Retrieval by Content Description,” in Journal of Infor- Lowe, D. [2004] “Distinctive Image Features from Scale- mation Science, 1992, pp. 111–119. Invariant Keypoints”, Int. Journal of Computer Vision, Vol. 60, 2004, pp. 91–110. Levesque, H. [1984] “The Logic of Incomplete Knowledge Bases,” in Brodie et al., Ch. 7 [1984]. Lozinskii, E. [1986] “A Problem-Oriented Inferential Data- base System,” TODS, 11:3, September 1986. Li, W.-S., Seluk Candan, K., Hirata, K., and Hara, Y. [1998] Hierarchical Image Modeling for Object-based Media Lu, H., Mikkilineni, K., and Richardson, J. [1987] Retrieval in DKE, 27:2, September 1998, pp. 139–176. “Design and Evaluation of Algorithms to Compute the Transitive Closure of a Database Relation,” in Lien, E., and Weinberger, P. [1978] “Consistency, Concur- ICDE [1987]. rency, and Crash Recovery,” in SIGMOD [1978]. Lubars, M., Potts, C., and Richter, C. [1993] “A Review of Lieuwen, L., and DeWitt, D. [1992] “A Transformation- the State of Practice in Requirements Modeling,” Proc. Based Approach to Optimizing Loops in Database Pro- IEEE International Symposium on Requirements Engi- gramming Languages,” in SIGMOD [1992]. neering, San Diego, CA, 1993. Lilien, L., and Bhargava, B. [1985] “Database Integrity Lucyk, B. [1993] Advanced Topics in DB2, Addison- Block Construct: Concepts and Design Issues,” TSE, Wesley, 1993. 11:9, September 1985. 
Luhn, H. P. [1957] “A Statistical Approach to Mechanized Lin, J., and Dunham, M. H. [1998] “Mining Association Encoding and Searching of Literary Information,” IBM Rules,” in ICDE [1998]. Journal of Research and Development, 1:4, October 1957, pp. 309–317. Lindsay, B. et al. [1984] “Computation and Communica- tion in R*: A Distributed Database Manager,” TOCS, Lunt, T., and Fernandez, E. [1990] “Database Security,” in 2:1, January 1984. SIGMOD Record, 19:4, pp. 90–97. Lippman R. [1987] “An Introduction to Computing with Lunt, T. et al. [1990] “The Seaview Security Model,” IEEE Neural Nets,” IEEE ASSP Magazine, April 1987. TSE, 16:6, pp. 593–607. Lipski, W. [1979] “On Semantic Issues Connected with Luo, J., and Nascimento, M. [2003] “Content-based Sub- Incomplete Information,” TODS, 4:3, September 1979. image Retrieval via Hierarchical Tree Matching,” in Proc. ACM Int Workshop on Multimedia Databases, Lipton, R., Naughton, J., and Schneider, D. [1990] “Practi- New Orleans, pp. 63–69. cal Selectivity Estimation through Adaptive Sampling,” in SIGMOD [1990]. Madria, S. et al. [1999] “Research Issues in Web Data Min- ing,” in Proc. First Int. Conf. on Data Warehousing and Liskov, B., and Zilles, S. [1975] “Specification Techniques Knowledge Discovery (Mohania, M., and Tjoa, A., eds.) for Data Abstractions,” TSE, 1:1, March 1975. LNCS 1676. Springer, pp. 303–312. Litwin, W. [1980] “Linear Hashing: A New Tool for File Madria, S., Baseer, Mohammed, B., Kumar,V., and Bhow- and Table Addressing,” in VLDB [1980]. mick, S. [2007] “A transaction model and multiversion concurrency control for mobile database systems,” Dis- Liu, B. [2006] Web Data Mining: Exploring Hyperlinks, tributed and Parallel Databases (DPD), 22:2–3, 2007, Contents, and Usage Data (Data-Centric Systems pp. 165–196. and Applications), Springer, 2006. Maguire, D., Goodchild, M., and Rhind, D., eds. [1997] Liu, B. and Chen-Chuan-Chang, K. 
[2004] “Editorial: Spe- Geographical Information Systems: Principles and cial Issue on Web Content Mining,” SIGKDD Explora- Applications. Vols. 1 and 2, Longman Scientific and tions Newsletter 6:2 , December 2004, pp. 1–4. Technical, New York. Liu, K., and Sunderraman, R. [1988] “On Representing Mahajan, S., Donahoo. M. J., Navathe, S. B., Ammar, M., Indefinite and Maybe Information in Relational Data- Malik, S. [1998] “Grouping Techniques for Update bases,” in ICDE [1988]. Propagation in Intermittently Connected Databases,” in ICDE [1998]. Liu, L., and Meersman, R. [1992] “Activity Model: A Declar- ative Approach for Capturing Communication Behavior Maier, D. [1983] The Theory of Relational Databases, in Object-Oriented Databases,” in VLDB [1992]. Computer Science Press, 1983. Lockemann, P., and Knutsen, W. [1968] “Recovery of Disk Contents After System Failure,” CACM, 11:8, August 1968.

Bibliography 1201 Maier, D., and Warren, D. S. [1988] Computing with McClure, R., and Krüger, I. [2005] “SQL DOM: Compile Logic, Benjamin Cummings, 1988. Time Checking of Dynamic SQL Statements,” Proc. 27th Int. Conf. on Software Engineering, May 2005. Maier, D., Stein, J., Otis, A., and Purdy, A. [1986] “Devel- opment of an Object-Oriented DBMS,” OOPSLA, 1986. Mckinsey [2013] Big data: The next frontier for innova- tion, competition, and productivity, McKinsey Global Malewicz, G, [2010] “Pregel: a system for large-scale graph Institute, 2013, 216 pp. processing,” in SIGMOD [2010]. McLeish, M. [1989] “Further Results on the Security of Malley, C., and Zdonick, S. [1986] “A Knowledge-Based Partitioned Dynamic Statistical Databases,” TODS, Approach to Query Optimization,” in EDS [1986]. 14:1, March 1989. Mannila, H., Toivonen, H., and Verkamo, A. [1994] “Effi- McLeod, D., and Heimbigner, D. [1985] “A Federated cient Algorithms for Discovering Association Rules,” in Architecture for Information Systems,” TOOIS, 3:3, KDD-94, AAAI Workshop on Knowledge Discovery in July 1985. Databases, Seattle, 1994. Mehrotra, S. et al. [1992] “The Concurrency Control Prob- Manning, C., and Schütze, H. [1999] Foundations of lem in Multidatabases: Characteristics and Solutions,” Statistical Natural Language Processing, MIT Press, in SIGMOD [1992]. 1999. Melton, J. [2003] Advanced SQL: 1999—Understanding Manning, C., Raghavan, P., and and Schutze, H. [2008] Object-Relational and Other Advanced Features, Introduction to Information Retrieval, Cambridge Morgan Kaufmann, 2003. University Press, 2008. Melton, J., and Mattos, N. [1996] “An Overview of SQL3— Manola. F. [1998] “Towards a Richer Web Object Model,” The Emerging New Generation of the SQL Standard, in ACM SIGMOD Record, 27:1, March 1998. Tutorial No. T5,” VLDB, Bombay, September 1996. Manolopoulos, Y., Nanopoulos, A., Papadopoulos, A., and Melton, J., and Simon, A. R. [1993] Understanding the Theodoridis, Y. 
[2005] R-Trees: Theory and Applica- New SQL: A Complete Guide, Morgan Kaufmann, tions, Springer, 2005. 1993. March, S., and Severance, D. [1977] “The Determination Melton, J., and Simon, A. R. [2002] SQL: 1999—Under- of Efficient Record Segmentations and Blocking Fac- standing Relational Language Components, Morgan tors for Shared Files,” TODS, 2:3, September 1977. Kaufmann, 2002. Mark, L., Roussopoulos, N., Newsome, T., and Laohapipat- Melton, J., Bauer, J., and Kulkarni, K. [1991] “Object ADTs tana, P. [1992] “Incrementally Maintained Network to (with improvements for value ADTs),” ISO WG3 Report Relational Mappings,” Software Practice & Experience, X3H2-91-083, April 1991. 22:12, December 1992. Menasce, D., Popek, G., and Muntz, R. [1980] “A Locking Markowitz, V., and Raz, Y. [1983] “ERROL: An Entity- Protocol for Resource Coordination in Distributed Relationship, Role Oriented, Query Language,” in ER Databases,” TODS, 5:2, June 1980. Conference [1983]. Mendelzon, A., and Maier, D. [1979] “Generalized Mutual Martin, J., and Odell, J. [2008] Principles of Object- Dependencies and the Decomposition of Database oriented Analysis and Design, Prentice-Hall, 2008. Relations,” in VLDB [1979]. Martin, J., Chapman, K., and Leben, J. [1989] DB2- Mendelzon, A., Mihaila, G., and Milo, T. [1997] “Querying Concepts, Design, and Programming, Prentice-Hall, the World Wide Web,” Journal of Digital Libraries, 1989. 1:1, April 1997. Maryanski, F. [1980] “Backend Database Machines,” ACM Mesnier, M. et al. [ 2003]. “Object-Based Storage.” IEEE Computing Surveys, 12:1, March 1980. Communications Magazine, August 2003, pp. 84–90. Masunaga, Y. [1987] “Multimedia Databases: A Formal Metais, E., Kedad, Z., Comyn-Wattiau, C., and Bouzeg- Framework,” Proc. IEEE Office Automation Symposium, houb, M., “Using Linguistic Knowledge in View Inte- April 1987. gration: Toward a Third Generation of Tools,” DKE, 23:1, June 1998. 
Mattison, R., Data Warehousing: Strategies, Technolo- gies, and Techniques, McGraw-Hill, 1996. Mihailescu, M., Soundararajan, G., and Amza, C. “MixA- part: Decoupled Analytics for Shared Storage Systems” Maune, D. F. [2001] Digital Elevation Model Technolo- In USENIX Conf on File And Storage Technologies gies and Applications: The DEM Users Manual, (FAST), 2013 ASPRS, 2001. Mikkilineni, K., and Su, S. [1988] “An Evaluation of Rela- McCarty, C. et al. [2005]. “Marshfield Clinic Personalized tional Join Algorithms in a Pipelined Query Processing Medicine Research Project (PMRP): design, methods Environment,” TSE, 14:6, June 1988. and recruitment for a large population-based biobank,” Personalized Medicine, 2005, pp. 49–70.

1202 Bibliography Mikolajczyk, K., and Schmid, C. [2005] “A performance Distributed Software and Database Systems, IEEE CS, evaluation of local descriptors”, IEEE Transactions on July 1982. PAMI, 10:27, 2005, pp. 1615–1630. Motro, A. [1987] “Superviews: Virtual Integration of Mul- tiple Databases,” TSE, 13:7, July 1987. Miller, G. A. [1990] “Nouns in WordNet: a lexical inheri- Mouratidis, K. et al. [2006] “Continuous nearest neighbor tance system.” in International Journal of Lexicography monitoring in road networks,” in VLDB [2006], 3:4, 1990, pp. 245–264. pp. 43–54. Mukkamala, R. [1989] “Measuring the Effect of Data Dis- Miller, H. J., (2004) “Tobler’s First Law and Spatial Analysis,” tribution and Replication Models on Performance Annals of the Association of American Geographers, Evaluation of Distributed Systems,” in ICDE [1989]. 94:2, 2004, pp. 284–289. Mumick, I., Finkelstein, S., Pirahesh, H., and Ramakrish- nan, R. [1990a] “Magic Is Relevant,” in SIGMOD [1990]. Milojicic, D. et al. [2002] Peer-to-Peer Computing, HP Lab- Mumick, I., Pirahesh, H., and Ramakrishnan, R. [1990b] oratories Technical Report No. HPL-2002-57, HP Labs, “The Magic of Duplicates and Aggregates,” in VLDB Palo Alto, available at www.hpl.hp.com/techre- [1990]. ports/2002/HPL-2002-57R1.html. Muralikrishna, M. [1992] “Improved Unnesting Algorithms for Join and Aggregate SQL Queries,” in VLDB [1992]. Minoura, T., and Wiederhold, G. [1981] “Resilient Muralikrishna, M., and DeWitt, D. [1988] “Equi-depth Extended True-Copy Token Scheme for a Distributed Histograms for Estimating Selectivity Factors for Database,” TSE, 8:3, May 1981. Multi-dimensional Queries,” in SIGMOD [1988]. Murthy, A.C. and Vavilapalli, V.K. [2014] Apache Hadoop Missikoff, M., and Wiederhold, G. [1984] “Toward a Uni- YARN: Moving beyond MapReduce and Batch fied Approach for Expert and Database Systems,” in Processing with Apache Hadoop 2, Addison Wesley, EDS [1984]. 2014, 304 pp. Mylopolous, J., Bernstein, P., and Wong, H. 
[1980] “A Lan- Mitchell, T. [1997] Machine Learning, McGraw-Hill, 1997. guage Facility for Designing Database-Intensive Appli- Mitschang, B. [1989] “Extending the Relational Algebra to cations,” TODS, 5:2, June 1980. Naedele, M., [2003] Standards for XML and Web Services Capture Complex Objects,” in VLDB [1989]. Security, IEEE Computer, 36:4, April 2003, pp. 96–98. Moczar, L. [2015] Enterprise Lucene and Solr, Addison Naish, L., and Thom, J. [1983] “The MU-PROLOG Deduc- tive Database,” Technical Report 83/10, Department of Wesley, forthcoming, 2015, 496 pp. Computer Science, University of Melbourne, 1983. Mohan, C. [1993] “IBM’s Relational Database Products: Natan R. [ 2005] Implementing Database Security and Auditing: Includes Examples from Oracle, SQL Features and Technologies,” in SIGMOD [1993]. Server, DB2 UDB, and Sybase, Digital Press, 2005. Mohan, C. et al. [1992] “ARIES: A Transaction Recovery Navathe, S. [1980] “An Intuitive Approach to Normalize Network-Structured Data,” in VLDB [1980]. Method Supporting Fine-Granularity Locking and Par- Navathe, S., and Balaraman, A. [1991] “A Transaction tial Rollbacks Using Write-Ahead Logging,” TODS, Architecture for a General Purpose Semantic Data 17:1, March 1992. Model,” in ER [1991], pp. 511–541. Mohan, C., and Levine, F. [1992] “ARIES/IM: An Efficient Navathe, S. B., Karlapalem, K., and Ra, M. Y. [1996] “A and High-Concurrency Index Management Method Mixed Fragmentation Methodology for the Initial Dis- Using Write-Ahead Logging,” in SIGMOD [1992]. tributed Database Design,” Journal of Computers and Mohan, C., and Narang, I. [1992] “Algorithms for Creating Software Engineering, 3:4, 1996. Indexes for Very Large Tables without Quiescing Navathe, S. B. et al. [1994] “Object Modeling Using Clas- Updates,” in SIGMOD [1992]. sification in CANDIDE and Its Application,” in Dogac Mohan, C., Haderle, D., Lindsay, B., Pirahesh, H., and et al. [1994]. Schwarz, P. 
[1992] “ARIES: A Transaction Recovery Navathe, S., and Ahmed, R. [1989] “A Temporal Relational Method Supporting Fine-Granularity Locking and Par- Model and Query Language,” Information Sciences, tial Rollbacks Using Write-Ahead Logging,” TODS, 47:2, March 1989, pp. 147–175. 17:1, March 1992. Navathe, S., and Gadgil, S. [1982] “A Methodology for View Morris, K. et al. [1987] “YAWN! (Yet Another Window on Integration in Logical Database Design,” in VLDB [1982]. NAIL!), in ICDE [1987]. Morris, K., Ullman, J., and VanGelden, A. [1986] “Design Overview of the NAIL! System,” Proc. Third International Conference on Logic Programming, Springer-Verlag, 1986. Morris, R. [1968] “Scatter Storage Techniques,” CACM, 11:1, January 1968. Morsi, M., Navathe, S., and Kim, H. [1992] “An Extensible Object-Oriented Database Testbed,” in ICDE [1992]. Moss, J. [1982] “Nested Transactions and Reliable Distrib- uted Computing,” Proc. Symposium on Reliability in

Bibliography 1203 Navathe, S., and Kerschberg, L. [1986] “Role of Data Dic- O’Neil, P., and O’Neil, P. [2001] Database: Principles, Pro- tionaries in Database Design,” Information and Man- gramming, Performance, Morgan Kaufmann, 1994. agement, 10:1, January 1986. Obermarck, R. [1982] “Distributed Deadlock Detection Navathe, S., and Savasere, A. [1996] “A Practical Schema Algorithms,” TODS, 7:2, June 1982. Integration Facility Using an Object Oriented Approach,” in Multidatabase Systems (A. Elmagarmid Oh, Y.-C. [1999] “Secure Database Modeling and Design,” and O. Bukhres, eds.), Prentice-Hall, 1996. Ph.D. dissertation, College of Computing, Georgia Institute of Technology, March 1999. Navathe, S., and Schkolnick, M. [1978] “View Representa- tion in Logical Database Design,” in SIGMOD [1978]. Ohsuga, S. [1982] “Knowledge Based Systems as a New Interactive Computer System of the Next Generation,” Navathe, S., Ceri, S., Wiederhold, G., and Dou, J. [1984] in Computer Science and Technologies, North-Hol- “Vertical Partitioning Algorithms for Database Design,” land, 1982. TODS, 9:4, December 1984. Olken, F., Jagadish, J. [2003] Management for Integrative Navathe, S., Elmasri, R., and Larson, J. [1986] “Integrating Biology,” OMICS: A Journal of Integrative Biology, User Views in Database Design,” IEEE Computer, 7:1, January 2003. 19:1, January 1986. Olle, T. [1978] The CODASYL Approach to Data Base Navathe, S., Patil, U., and Guan, W. [2007] “Genomic and Management, Wiley, 1978. Proteomic Databases: Foundations, Current Status and Future Applications,” in Journal of Computer Science Olle, T., Sol, H., and Verrijn-Stuart, A., eds. [1982] Informa- and Engineering, Korean Institute of Information Sci- tion System Design Methodology, North-Holland, 1982. entists and Engineers (KIISE), 1:1, 2007, pp. 1–30 Olston, C. et al. [2008] Pig Latin: A Not-So-Foreign lan- Navathe, S., Sashidhar, T., and Elmasri, R. [1984a] “Relation- guage for Data Processing, in SIGMOD [2008]. 
ship Merging in Schema Integration,” in VLDB [1984]. Omiecinski, E., and Scheuermann, P. [1990] “A Parallel Negri, M., Pelagatti, S., and Sbatella, L. [1991] “Formal Algorithm for Record Clustering,” TODS, 15:4, Decem- Semantics of SQL Queries,” TODS, 16:3, September 1991. ber 1990. Ng, P. [1981] “Further Analysis of the Entity-Relationship Omura, J. K. [1990] “Novel applications of cryptography in Approach to Database Design,” TSE, 7:1, January 1981. digital communications,” IEEE Communications Magazine, 28:5, May 1990, pp. 21–29. Ngu, A. [1989] “Transaction Modeling,” in ICDE [1989], pp. 234–241. O’Neil, P. and Graefe, G., ‘Multi-Table Joins Through Bitmapped Join Indices’, SIGMOD Record, Vol. 24, Nicolas, J. [1978] “Mutual Dependencies and Some Results No. 3, 1995. on Undecomposable Relations,” in VLDB [1978]. Open GIS Consortium, Inc. [1999] “OpenGIS® Simple Nicolas, J. [1997] “Deductive Object-oriented Databases, Features Specification for SQL,” Revision 1.1, OpenGIS Technology, Products, and Applications: Where Are Project Document 99-049, May 1999. We?” Proc. Symposium on Digital Media Information Base (DMIB ’97), Nara, Japan, November 1997. Open GIS Consortium, Inc. [2003] “OpenGIS® Geography Markup Language (GML) Implementation Specifica- Nicolas, J., Phipps, G., Derr, M., and Ross, K. [1991] tion,” Version 3, OGC 02-023r4., 2003. “Glue-NAIL!: A Deductive Database System,” in SIGMOD [1991]. Oracle [2005] Oracle 10, Introduction to LDAP and Oracle Internet Directory 10g Release 2, Oracle Corporation, Niemiec, R. [2008] Oracle Database 10g Performance 2005. Tuning Tips & Techniques , McGraw Hill Osborne Media, 2008, 967 pp. Oracle [2007] Oracle Label Security Administrator’s Guide, 11g (release 11.1), Part no. B28529-01, Oracle, Nievergelt, J. [1974] “Binary Search Trees and File Organiza- available at http://download.oracle.com/docs/cd/ tion,” ACM Computing Surveys, 6:3, September 1974. B28359_01/network.111/b28529/intro.htm. 
Nievergelt, J., Hinterberger, H., and Seveik, K. [1984]. “The Oracle [2008] Oracle 11 Distributed Database Concepts Grid File: An Adaptable Symmetric Multikey File 11g Release 1, Oracle Corporation, 2008. Structure,” TODS, 9:1, March 1984, pp. 38–71. Oracle [2009] “An Oracle White Paper: Leading Practices for Nijssen, G., ed. [1976] Modelling in Data Base Manage- Driving Down the Costs of Managing Your Oracle Iden- ment Systems, North-Holland, 1976. tity and Access Management Suite,” Oracle, April 2009. Nijssen, G., ed. [1977] Architecture and Models in Data Osborn, S. L. [1977] “Normal Forms for Relational Data- Base Management Systems, North-Holland, 1977. bases,” Ph.D. dissertation, University of Waterloo, 1977. Nwosu, K., Berra, P., and Thuraisingham, B., eds. [1996] Osborn, S. L. [1989] “The Role of Polymorphism in Design and Implementation of Multimedia Database Schema Evolution in an Object-Oriented Database,” Management Systems, Kluwer Academic, 1996. TKDE, 1:3, September 1989.

1204 Bibliography Osborn, S. L.[1979] “Towards a Universal Relation Inter- Patterson, D., Gibson, G., and Katz, R. [1988] “A Case for face,” in VLDB [1979]. Redundant Arrays of Inexpensive Disks (RAID),” in SIGMOD [1988]. Ozsoyoglu, G., Ozsoyoglu, Z., and Matos, V. [1985] “Extending Relational Algebra and Relational Calculus Paul, H. et al. [1987] “Architecture and Implementation of the with Set Valued Attributes and Aggregate Functions,” Darmstadt Database Kernel System,” in SIGMOD [1987]. TODS, 12:4, December 1987. Pavlo, A. et al. [2009] A Comparison of Approaches to Ozsoyoglu, Z., and Yuan, L. [1987] “A New Normal Form Large Scale Data Analysis, in SIGMOD [2009]. for Nested Relations,” TODS, 12:1, March 1987. Pazandak, P., and Srivastava, J., “Evaluating Object DBMSs Ozsu, M. T., and Valduriez, P. [1999] Principles of Distrib- for Multimedia,” IEEE Multimedia, 4:3, pp. 34–49. uted Database Systems, 2nd ed., Prentice-Hall, 1999. Pazos- Rangel, R. et. al. [2006] “Least Likely to Use: A New Palanisamy, B. et al. [2011] “Purlieus: locality-aware Page Replacement Strategy for Improving Database resource allocation for MapReduce in a cloud,” In Proc. Management System Response Time,” in Proc. CSR ACM/IEEE Int. Conf for High Perf Computing, Net- 2006: Computer Science- Theory and Applications, St. working , Storage and Analysis, (SC) 2011. Petersburg, Russia, LNCS, Volume 3967, Springer, 2006, pp. 314–323. Palanisamy, B. et al. [2014] “VNCache: Map Reduce Anal- ysis for Cloud-archived Data”, Proc. 14th IEEE/ACM PDES [1991] “A High-Lead Architecture for Implementing Int. Symp. on Cluster, Cloud and Grid Computing, a PDES/STEP Data Sharing Environment,” Publication 2014. Number PT 1017.03.00, PDES Inc., May 1991. Palanisamy, B., Singh, A., and Liu, Ling, “Cost-effective Pearson, P. et al. [1994] “The Status of Online Mendelian Resource Provisioning for MapReduce in a Cloud”, Inheritance in Man (OMIM) Medio 1994” Nucleic IEEE TPDS, 26:5, May 2015. 
Acids Research, 22:17, 1994. Papadias, D. et al. [2003] “Query Processing in Spatial Net- Peckham, J., and Maryanski, F. [1988] “Semantic Data work Databases,” in VLDB [2003] pp. 802–813. Models,” ACM Computing Surveys, 20:3, September 1988, pp. 153–189. Papadimitriou, C. [1979] “The Serializability of Concur- rent Database Updates,” JACM, 26:4, October 1979. Peng, T. and Tsou, M. [2003] Internet GIS: Distributed Geographic Information Services for the Internet Papadimitriou, C. [1986] The Theory of Database Con- and Wireless Network, Wiley, 2003. currency Control, Computer Science Press, 1986. Pfleeger, C. P., and Pfleeger, S. [2007] Security in Comput- Papadimitriou, C., and Kanellakis, P. [1979] “On Concur- ing, 4th ed., Prentice-Hall, 2007. rency Control by Multiple Versions,” TODS, 9:1, March 1974. Phipps, G., Derr, M., and Ross, K. [1991] “Glue-NAIL!: A Deductive Database System,” in SIGMOD [1991]. Papazoglou, M., and Valder, W. [1989] Relational Database Management: A Systems Programming Approach, Piatetsky-Shapiro, G., and Frawley, W., eds. [1991] Prentice-Hall, 1989. Knowledge Discovery in Databases, AAAI Press/ MIT Press, 1991. Paredaens, J., and Van Gucht, D. [1992] “Converting Nested Algebra Expressions into Flat Algebra Expres- Pistor P., and Anderson, F. [1986] “Designing a General- sions,” TODS, 17:1, March 1992. ized NF2 Model with an SQL-type Language Interface,” in VLDB [1986], pp. 278–285. Parent, C., and Spaccapietra, S. [1985] “An Algebra for a General Entity-Relationship Model,” TSE, 11:7, Pitoura, E., and Bhargava, B. [1995] “Maintaining Consis- July 1985. tency of Data in Mobile Distributed Environments.” In 15th ICDCS, May 1995, pp. 404–413. Paris, J. [1986] “Voting with Witnesses: A Consistency Scheme for Replicated Files,” in ICDE [1986]. Pitoura, E., and Samaras, G. [1998] Data Management for Mobile Computing, Kluwer, 1998. Park, J., Chen, M., and Yu, P. 
[1995] “An Effective Hash- Based Algorithm for Mining Association Rules,” in Pitoura, E., Bukhres, O., and Elmagarmid, A. [1995] SIGMOD [1995]. “Object Orientation in Multidatabase Systems,” ACM Computing Surveys, 27:2, June 1995. Parker Z., Poe, S., and Vrbsky, S.V. [2013] “Comparing NoSQL MongoDB to an SQL DB,” Proc. 51st ACM South- Polavarapu, N. et al. [2005] “Investigation into Biomedical east Conference [ACMSE ’13], Savannah, GA, 2013. Literature Screening Using Support Vector Machines,” in Proc. 4th Int. IEEE Computational Systems Bioinfor- Paton, A. W., ed. [1999] Active Rules in Database Sys- matics Conference (CSB’05), August 2005, pp. 366–374. tems, Springer-Verlag, 1999. Ponceleon D. et al. [1999] “CueVideo: Automated Multi- Paton, N. W., and Diaz, O. [1999] Survey of Active Data- media Indexing and Retrieval,” Proc. 7th ACM Multi- base Systems, ACM Computing Surveys, 31:1, 1999, media Conf., Orlando, Fl., October 1999, p.199. pp. 63–103.

Bibliography 1205 Ponniah, P. [2010] Data Warehousing Fundamentals for IT Reese, G. [1997] Database Programming with JDBC and Professionals, 2nd Ed., Wiley Interscience, 2010, 600pp. Java, O’Reilley, 1997. Poosala, V., Ioannidis, Y., Haas, P., and Shekita, E. [1996] Reisner, P. [1977] “Use of Psychological Experimentation “Improved Histograms for Selectivity Estimation of as an Aid to Development of a Query Language,” TSE, Range Predicates,” in SIGMOD [1996]. 3:3, May 1977. Porter, M. F. [1980] “An algorithm for suffix stripping,” Reisner, P. [1981] “Human Factors Studies of Database Program, 14:3, pp. 130–137. Query Languages: A Survey and Assessment,” ACM Computing Surveys, 13:1, March 1981. Ports, D.R.K. and Grittner, K. [2012] “Serializable Snap- shot Isolation in PostgreSQL,” Proceedings of VLDB, Reiter, R. [1984] “Towards a Logical Reconstruction 5:12, 2012, pp. 1850–1861. of Relational Database Theory,” in Brodie et al., Ch. 8 [1984]. Potter, B., Sinclair, J., and Till, D. [1996] An Introduction to Formal Specification and Z, 2nd ed., Prentice-Hall, Reuter, A. [1980] “A Fast Transaction Oriented Logging 1996. Scheme for UNDO recovery,” TSE 6:4, pp. 348–356. Prabhakaran, B. [1996] Multimedia Database Manage- Revilak, S., O’Neil, P., and O’Neil, E. [2011] “Precisely Seri- ment Systems, Springer-Verlag, 1996. alizable Snapshot Isolation (PSSI),” in ICDE [2011], pp. 482–493. Prasad, S. et al. [2004] “SyD: A Middleware Testbed for Collaborative Applications over Small Heterogeneous Ries, D., and Stonebraker, M. [1977] “Effects of Locking Devices and Data Stores,” Proc. ACM/IFIP/USENIX 5th Granularity in a Database Management System,” International Middleware Conference (MW-04), TODS, 2:3, September 1977. Toronto, Canada, October 2004. Rissanen, J. [1977] “Independent Components of Rela- Price, B. [2004] “ESRI Systems IntegrationTechnical tions,” TODS, 2:4, December 1977. 
Brief—ArcSDE High-Availability Overview,” ESRI, 2004, Rev 2 (www.lincoln.ne.gov/city/pworks/gis/pdf/ Rivest, R. et al.[1978] “A Method for Obtaining Digital arcsde.pdf ). Signatures and Public-Key Cryptosystems,” CACM, 21:2, February 1978, pp. 120–126. Rabitti, F., Bertino, E., Kim, W., and Woelk, D. [1991] “A Model of Authorization for Next-Generation Database Robbins, R. [1993] “Genome Informatics: Requirements Systems,” TODS, 16:1, March 1991. and Challenges,” Proc. Second International Conference on Bioinformatics, Supercomputing and Complex Ramakrishnan, R., and Gehrke, J. [2003] Database Man- Genome Analysis, World Scientific Publishing, 1993. agement Systems, 3rd ed., McGraw-Hill, 2003. Robertson, S. [1997] “The Probability Ranking Principle Ramakrishnan, R., and Ullman, J. [1995] “Survey of in IR,” in Readings in Information Retrieval (Jones, K. Research in Deductive Database Systems,” Journal of S., and Willett, P., eds.), Morgan Kaufmann Multimedia Logic Programming, 23:2, 1995, pp. 125–149. Information and Systems Series, pp. 281–286. Ramakrishnan, R., ed. [1995] Applications of Logic Data- Robertson, S., Walker, S., and Hancock-Beaulieu, M. bases, Kluwer Academic, 1995. [1995] “Large Test Collection Experiments on an Oper- ational, Interactive System: Okapi at TREC,” Informa- Ramakrishnan, R., Srivastava, D., and Sudarshan, S. [1992] tion Processing and Management, 31, pp. 345–360. “{CORAL} : {C} ontrol, {R} elations and {L} ogic,” in VLDB [1992]. Rocchio, J. [1971] “Relevance Feedback in Information Retrieval,” in The SMART Retrieval System: Experi- Ramakrishnan, R., Srivastava, D., Sudarshan, S., and She- ments in Automatic Document Processing, (G. shadri, P. [1993] “Implementation of the {CORAL} Salton, ed.), Prentice-Hall, pp. 313–323. deductive database system,” in SIGMOD [1993]. Rosenkrantz, D., Stearns, D., and Lewis, P. [1978] System- Ramamoorthy, C., and Wah, B. 
[1979] “The Placement of Level Concurrency Control for Distributed Database Relations on a Distributed Relational Database,” Proc. Systems, TODS, 3:2, pp. 178–198. First International Conference on Distributed Comput- ing Systems, IEEE CS, 1979. Rotem, D., [1991] “Spatial Join Indices,” in ICDE [1991]. Roth, M. A., Korth, H. F., and Silberschatz, A. [1988] Ramesh, V., and Ram, S. [1997] “Integrity Constraint Inte- gration in Heterogeneous Databases an Enhanced “Extended Algebra and Calculus for Non-1NF Rela- Methodology for Schema Integration,” Information tional Databases,” TODS, 13:4, 1988, pp. 389–417. Systems, 22:8, December 1997, pp. 423–446. Roth, M., and Korth, H. [1987] “The Design of Non-1NF Relational Databases into Nested Normal Form,” in Ratnasamy, S. et al. [2001] “A Scalable Content-Address- SIGMOD [1987]. able Network.” SIGCOMM 2001. Rothnie, J. et al. [1980] “Introduction to a System for Dis- tributed Databases (SDD-1),” TODS, 5:1, March 1980. Reed, D. P. [1983] “Implementing Atomic Actions on Decen- tralized Data,” TOCS, 1:1, February 1983, pp. 3–23.

Roussopoulos, N. [1991] “An Incremental Access Method for View-Cache: Concept, Algorithms, and Cost Analysis,” TODS, 16:3, September 1991.
Roussopoulos, N., Kelley, S., and Vincent, F. [1995] “Nearest Neighbor Queries,” in SIGMOD [1995], pp. 71–79.
Rozen, S., and Shasha, D. [1991] “A Framework for Automating Physical Database Design,” in VLDB [1991].
Rudensteiner, E. [1992] “Multiview: A Methodology for Supporting Multiple Views in Object-Oriented Databases,” in VLDB [1992].
Ruemmler, C., and Wilkes, J. [1994] “An Introduction to Disk Drive Modeling,” IEEE Computer, 27:3, March 1994, pp. 17–27.
Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., and Lorensen, W. [1991] Object Oriented Modeling and Design, Prentice-Hall, 1991.
Rumbaugh, J., Jacobson, I., Booch, G. [1999] The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.
Rusinkiewicz, M. et al. [1988] “OMNIBASE—A Loosely Coupled: Design and Implementation of a Multidatabase System,” IEEE Distributed Processing Newsletter, 10:2, November 1988.
Rustin, R., ed. [1972] Data Base Systems, Prentice-Hall, 1972.
Rustin, R., ed. [1974] Proc. BJNAV2.
Sacca, D., and Zaniolo, C. [1987] “Implementation of Recursive Queries for a Data Language Based on Pure Horn Clauses,” Proc. Fourth International Conference on Logic Programming, MIT Press, 1986.
Sadri, F., and Ullman, J. [1982] “Template Dependencies: A Large Class of Dependencies in Relational Databases and Its Complete Axiomatization,” JACM, 29:2, April 1982.
Sagiv, Y., and Yannakakis, M. [1981] “Equivalence among Relational Expressions with the Union and Difference Operators,” JACM, 27:4, November 1981.
Sahay, S. et al. [2008] “Discovering Semantic Biomedical Relations Utilizing the Web,” in Journal of ACM Transactions on Knowledge Discovery from Data (TKDD), Special issue on Bioinformatics, 2:1, 2008.
Sakai, H. [1980] “Entity-Relationship Approach to Conceptual Schema Design,” in SIGMOD [1980].
Salem, K., and Garcia-Molina, H. [1986] “Disk Striping,” in ICDE [1986], pp. 336–342.
Salton, G. [1968] Automatic Information Organization and Retrieval, McGraw Hill, 1968.
Salton, G. [1971] The SMART Retrieval System—Experiments in Automatic Document Processing, Prentice-Hall, 1971.
Salton, G. [1990] “Full Text Information Processing Using the Smart System,” IEEE Data Engineering Bulletin 13:1, 1990, pp. 2–9.
Salton, G., and Buckley, C. [1991] “Global Text Matching for Information Retrieval” in Science, 253, August 1991.
Salton, G., Yang, C. S., and Yu, C. T. [1975] “A theory of term importance in automatic text analysis,” Journal of the American Society for Information Science, 26, pp. 33–44 (1975).
Salzberg, B. [1988] File Structures: An Analytic Approach, Prentice-Hall, 1988.
Salzberg, B. et al. [1990] “FastSort: A Distributed Single-Input Single-Output External Sort,” in SIGMOD [1990].
Samet, H. [1990] The Design and Analysis of Spatial Data Structures, Addison-Wesley, 1990.
Samet, H. [1990a] Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS, Addison-Wesley, 1990.
Sammut, C., and Sammut, R. [1983] “The Implementation of UNSW-PROLOG,” The Australian Computer Journal, May 1983.
Santucci, G. [1998] “Semantic Schema Refinements for Multilevel Schema Integration,” DKE, 25:3, 1998, pp. 301–326.
Sarasua, W., and O’Neill, W. [1999]. “GIS in Transportation,” in Taylor and Francis [1999].
Sarawagi, S., Thomas, S., and Agrawal, R. [1998] “Integrating Association Rules Mining with Relational Database systems: Alternatives and Implications,” in SIGMOD [1998].
Savasere, A., Omiecinski, E., and Navathe, S. [1995] “An Efficient Algorithm for Mining Association Rules,” in VLDB [1995].
Savasere, A., Omiecinski, E., and Navathe, S. [1998] “Mining for Strong Negative Association in a Large Database of Customer Transactions,” in ICDE [1998].
Schatz, B. [1995] “Information Analysis in the Net: The Interspace of the Twenty-First Century,” Keynote Plenary Lecture at American Society for Information Science (ASIS) Annual Meeting, Chicago, October 11, 1995.
Schatz, B. [1997] “Information Retrieval in Digital Libraries: Bringing Search to the Net,” Science, 275:17 January 1997.
Schek, H. J., and Scholl, M. H. [1986] “The Relational Model with Relation-valued Attributes,” Information Systems, 11:2, 1986.
Schek, H. J., Paul, H. B., Scholl, M. H., and Weikum, G. [1990] “The DASDBS Project: Objects, Experiences, and Future Projects,” TKDE, 2:1, 1990.
Scheuermann, P., Schiffner, G., and Weber, H. [1979] “Abstraction Capabilities and Invariant Properties Modeling within the Entity-Relationship Approach,” in ER Conference [1979].
Schlimmer, J., Mitchell, T., and McDermott, J. [1991] “Justification Based Refinement of Expert Knowledge” in Piatetsky-Shapiro and Frawley [1991].

Schmarzo, B. [2013] Big Data: Understanding How Data Powers Big Business, Wiley, 2013, 240 pp.
Schlossnagle, G. [2005] Advanced PHP Programming, Sams, 2005.
Schmidt, J., and Swenson, J. [1975] “On the Semantics of the Relational Model,” in SIGMOD [1975].
Schneider, R. D. [2006] MySQL Database Design and Tuning, MySQL Press, 2006.
Scholl, M. O., Voisard, A., and Rigaux, P. [2001] Spatial Database Management Systems, Morgan Kaufmann, 2001.
Sciore, E. [1982] “A Complete Axiomatization for Full Join Dependencies,” JACM, 29:2, April 1982.
Scott, M., and Fowler, K. [1997] UML Distilled: Applying the Standard Object Modeling Language, Addison-Wesley, 1997.
Selinger, P. et al. [1979] “Access Path Selection in a Relational Database Management System,” in SIGMOD [1979].
Senko, M. [1975] “Specification of Stored Data Structures and Desired Output in DIAM II with FORAL,” in VLDB [1975].
Senko, M. [1980] “A Query Maintenance Language for the Data Independent Accessing Model II,” Information Systems, 5:4, 1980.
Shapiro, L. [1986] “Join Processing in Database Systems with Large Main Memories,” TODS, 11:3, 1986.
Shasha, D., and Bonnet, P. [2002] Database Tuning: Principles, Experiments, and Troubleshooting Techniques, Morgan Kaufmann, Revised ed., 2002.
Shasha, D., and Goodman, N. [1988] “Concurrent Search Structure Algorithms,” TODS, 13:1, March 1988.
Shekhar, S., and Chawla, S. [2003] Spatial Databases, A Tour, Prentice-Hall, 2003.
Shekhar, S., and Xong, H. [2008] Encyclopedia of GIS, Springer Link (Online service).
Shekita, E., and Carey, M. [1989] “Performance Enhancement Through Replication in an Object-Oriented DBMS,” in SIGMOD [1989].
Shenoy, S., and Ozsoyoglu, Z. [1989] “Design and Implementation of a Semantic Query Optimizer,” TKDE, 1:3, September 1989.
Sheth, A. P., and Larson, J. A. [1990] “Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases,” ACM Computing Surveys, 22:3, September 1990, pp. 183–236.
Sheth, A., Gala, S., and Navathe, S. [1993] “On Automatic Reasoning for Schema Integration,” in International Journal of Intelligent Co-operative Information Systems, 2:1, March 1993.
Sheth, A., Larson, J., Cornelio, A., and Navathe, S. [1988] “A Tool for Integrating Conceptual Schemas and User Views,” in ICDE [1988].
Shipman, D. [1981] “The Functional Data Model and the Data Language DAPLEX,” TODS, 6:1, March 1981.
Shlaer, S., Mellor, S. [1988] Object-Oriented System Analysis: Modeling the World in Data, Prentice-Hall, 1988.
Shneiderman, B., ed. [1978] Databases: Improving Usability and Responsiveness, Academic Press, 1978.
Shvachko, K.V. [2012] “HDFS Scalability: the limits of growth,” Usenix legacy publications, Login, Vol. 35, No. 2, pp. 6–16, April 2010 (https://www.usenix.org/legacy/publications/login/2010-04/openpdfs/shvachko.pdf)
Sibley, E., and Kerschberg, L. [1977] “Data Architecture and Data Model Considerations,” NCC, AFIPS, 46, 1977.
Siegel, M., and Madnick, S. [1991] “A Metadata Approach to Resolving Semantic Conflicts,” in VLDB [1991].
Siegel, M., Sciore, E., and Salveter, S. [1992] “A Method for Automatic Rule Derivation to Support Semantic Query Optimization,” TODS, 17:4, December 1992.
SIGMOD [1974] Proc. ACM SIGMOD-SIGFIDET Conference on Data Description, Access, and Control, Rustin, R., ed., May 1974.
SIGMOD [1975] Proc. 1975 ACM SIGMOD International Conference on Management of Data, King, F., ed., San Jose, CA, May 1975.
SIGMOD [1976] Proc. 1976 ACM SIGMOD International Conference on Management of Data, Rothnie, J., ed., Washington, June 1976.
SIGMOD [1977] Proc. 1977 ACM SIGMOD International Conference on Management of Data, Smith, D., ed., Toronto, August 1977.
SIGMOD [1978] Proc. 1978 ACM SIGMOD International Conference on Management of Data, Lowenthal, E., and Dale, N., eds., Austin, TX, May/June 1978.
SIGMOD [1979] Proc. 1979 ACM SIGMOD International Conference on Management of Data, Bernstein, P., ed., Boston, MA, May/June 1979.
SIGMOD [1980] Proc. 1980 ACM SIGMOD International Conference on Management of Data, Chen, P., and Sprowls, R., eds., Santa Monica, CA, May 1980.
SIGMOD [1981] Proc. 1981 ACM SIGMOD International Conference on Management of Data, Lien, Y., ed., Ann Arbor, MI, April/May 1981.
SIGMOD [1982] Proc. 1982 ACM SIGMOD International Conference on Management of Data, Schkolnick, M., ed., Orlando, FL, June 1982.
SIGMOD [1983] Proc. 1983 ACM SIGMOD International Conference on Management of Data, DeWitt, D., and Gardarin, G., eds., San Jose, CA, May 1983.
SIGMOD [1984] Proc. 1984 ACM SIGMOD International Conference on Management of Data, Yormark, E., ed., Boston, MA, June 1984.

SIGMOD [1985] Proc. 1985 ACM SIGMOD International Conference on Management of Data, Navathe, S., ed., Austin, TX, May 1985.
SIGMOD [1986] Proc. 1986 ACM SIGMOD International Conference on Management of Data, Zaniolo, C., ed., Washington, May 1986.
SIGMOD [1987] Proc. 1987 ACM SIGMOD International Conference on Management of Data, Dayal, U., and Traiger, I., eds., San Francisco, CA, May 1987.
SIGMOD [1988] Proc. 1988 ACM SIGMOD International Conference on Management of Data, Boral, H., and Larson, P., eds., Chicago, June 1988.
SIGMOD [1989] Proc. 1989 ACM SIGMOD International Conference on Management of Data, Clifford, J., Lindsay, B., and Maier, D., eds., Portland, OR, June 1989.
SIGMOD [1990] Proc. 1990 ACM SIGMOD International Conference on Management of Data, Garcia-Molina, H., and Jagadish, H., eds., Atlantic City, NJ, June 1990.
SIGMOD [1991] Proc. 1991 ACM SIGMOD International Conference on Management of Data, Clifford, J., and King, R., eds., Denver, CO, June 1991.
SIGMOD [1992] Proc. 1992 ACM SIGMOD International Conference on Management of Data, Stonebraker, M., ed., San Diego, CA, June 1992.
SIGMOD [1993] Proc. 1993 ACM SIGMOD International Conference on Management of Data, Buneman, P., and Jajodia, S., eds., Washington, June 1993.
SIGMOD [1994] Proceedings of 1994 ACM SIGMOD International Conference on Management of Data, Snodgrass, R. T., and Winslett, M., eds., Minneapolis, MN, June 1994.
SIGMOD [1995] Proceedings of 1995 ACM SIGMOD International Conference on Management of Data, Carey, M., and Schneider, D. A., eds., Minneapolis, MN, June 1995.
SIGMOD [1996] Proceedings of 1996 ACM SIGMOD International Conference on Management of Data, Jagadish, H. V., and Mumick, I. P., eds., Montreal, June 1996.
SIGMOD [1997] Proceedings of 1997 ACM SIGMOD International Conference on Management of Data, Peckham, J., ed., Tucson, AZ, May 1997.
SIGMOD [1998] Proceedings of 1998 ACM SIGMOD International Conference on Management of Data, Haas, L., and Tiwary, A., eds., Seattle, WA, June 1998.
SIGMOD [1999] Proceedings of 1999 ACM SIGMOD International Conference on Management of Data, Faloutsos, C., ed., Philadelphia, PA, May 1999.
SIGMOD [2000] Proceedings of 2000 ACM SIGMOD International Conference on Management of Data, Chen, W., Naughton J., and Bernstein, P., eds., Dallas, TX, May 2000.
SIGMOD [2001] Proceedings of 2001 ACM SIGMOD International Conference on Management of Data, Aref, W., ed., Santa Barbara, CA, May 2001.
SIGMOD [2002] Proceedings of 2002 ACM SIGMOD International Conference on Management of Data, Franklin, M., Moon, B., and Ailamaki, A., eds., Madison, WI, June 2002.
SIGMOD [2003] Proceedings of 2003 ACM SIGMOD International Conference on Management of Data, Halevy, Y., Zachary, G., and Doan, A., eds., San Diego, CA, June 2003.
SIGMOD [2004] Proceedings of 2004 ACM SIGMOD International Conference on Management of Data, Weikum, G., Christian König, A., and DeBloch, S., eds., Paris, France, June 2004.
SIGMOD [2005] Proceedings of 2005 ACM SIGMOD International Conference on Management of Data, Widom, J., ed., Baltimore, MD, June 2005.
SIGMOD [2006] Proceedings of 2006 ACM SIGMOD International Conference on Management of Data, Chaudhari, S., Hristidis, V., and Polyzotis, N., eds., Chicago, IL, June 2006.
SIGMOD [2007] Proceedings of 2007 ACM SIGMOD International Conference on Management of Data, Chan, C.-Y., Ooi, B.-C., and Zhou, A., eds., Beijing, China, June 2007.
SIGMOD [2008] Proceedings of 2008 ACM SIGMOD International Conference on Management of Data, Wang, J. T.-L., ed., Vancouver, Canada, June 2008.
SIGMOD [2009] Proceedings of 2009 ACM SIGMOD International Conference on Management of Data, Cetintemel, U., Zdonik, S., Kossman, D., and Tatbul, N., eds., Providence, RI, June–July 2009.
SIGMOD [2010] Proceedings of 2010 ACM SIGMOD International Conference on Management of Data, Elmagarmid, Ahmed K. and Agrawal, Divyakant, eds., Indianapolis, IN, June 2010.
SIGMOD [2011] Proceedings of 2011 ACM SIGMOD International Conference on Management of Data, Sellis, T., Miller, R., Kementsietsidis, A., and Velegrakis, Y., eds., Athens, Greece, June 2011.
SIGMOD [2012] Proceedings of 2012 ACM SIGMOD International Conference on Management of Data, Selcuk Candan, K., Chen, Yi, Snodgrass, R., Gravano, L., Fuxman, A., eds., Scottsdale, Arizona, June 2012.
SIGMOD [2013] Proceedings of 2013 ACM SIGMOD International Conference on Management of Data, Ross, K., Srivastava, D., Papadias, D., eds., New York, June 2013.
SIGMOD [2014] Proceedings of 2014 ACM SIGMOD International Conference on Management of Data, Dyreson, C., Li, Feifei, Ozsu, T., eds., Snowbird, UT, June 2014.
SIGMOD [2015] Proceedings of 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, May-June 2015, forthcoming.
Silberschatz, A., Korth, H., and Sudarshan, S. [2011] Database System Concepts, 6th ed., McGraw-Hill, 2011.

Silberschatz, A., Stonebraker, M., and Ullman, J. [1990] “Database Systems: Achievements and Opportunities,” in ACM SIGMOD Record, 19:4, December 1990.
Simon, H. A. [1971] “Designing Organizations for an Information-Rich World,” in Computers, Communications and the Public Interest, (Greenberger, M., ed.), The Johns Hopkins University Press, 1971, (pp. 37–72).
Sion, R., Atallah, M., and Prabhakar, S. [2004] “Protecting Rights Proofs for Relational Data Using Watermarking,” TKDE, 16:12, 2004, pp. 1509–1525.
Sklar, D. [2005] Learning PHP5, O’Reilly Media, Inc., 2005.
Smith, G. [1990] “The Semantic Data Model for Security: Representing the Security Semantics of an Application,” in ICDE [1990].
Smith, J. et al. [1981] “MULTIBASE: Integrating Distributed Heterogeneous Database Systems,” NCC, AFIPS, 50, 1981.
Smith, J. R., and Chang, S.-F. [1996] “VisualSEEk: A Fully Automated Content-Based Image Query System,” Proc. 4th ACM Multimedia Conf., Boston, MA, November 1996, pp. 87–98.
Smith, J., and Chang, P. [1975] “Optimizing the Performance of a Relational Algebra Interface,” CACM, 18:10, October 1975.
Smith, J., and Smith, D. [1977] “Database Abstractions: Aggregation and Generalization,” TODS, 2:2, June 1977.
Smith, K., and Winslett, M. [1992] “Entity Modeling in the MLS Relational Model,” in VLDB [1992].
Smith, P., and Barnes, G. [1987] Files and Databases: An Introduction, Addison-Wesley, 1987.
Snodgrass, R. [1987] “The Temporal Query Language TQuel,” TODS, 12:2, June 1987.
Snodgrass, R., and Ahn, I. [1985] “A Taxonomy of Time in Databases,” in SIGMOD [1985].
Snodgrass, R., ed. [1995] The TSQL2 Temporal Query Language, Springer, 1995.
Soutou, G. [1998] “Analysis of Constraints for N-ary Relationships,” in ER98.
Spaccapietra, S., and Jain, R., eds. [1995] Proc. Visual Database Workshop, Lausanne, Switzerland, October 1995.
Spiliopoulou, M. [2000] “Web Usage Mining for Web Site Evaluation,” CACM 43:8, August 2000, pp. 127–134.
Spooner D., Michael, A., and Donald, B. [1986] “Modeling CAD Data with Data Abstraction and Object-Oriented Technique,” in ICDE [1986].
Srikant, R., and Agrawal, R. [1995] “Mining Generalized Association Rules,” in VLDB [1995].
Srinivas, M., and Patnaik, L. [1994] “Genetic Algorithms: A Survey,” IEEE Computer, 27:6, June 1994, pp. 17–26.
Srinivasan, V., and Carey, M. [1991] “Performance of B-Tree Concurrency Control Algorithms,” in SIGMOD [1991].
Srivastava, D., Ramakrishnan, R., Sudarshan, S., and Sheshadri, P. [1993] “Coral++: Adding Object-orientation to a Logic Database Language,” in VLDB [1993].
Srivastava, J, et al. [2000] “Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data,” SIGKDD Explorations, 1:2, 2000.
Stachour, P., and Thuraisingham, B. [1990] “The Design and Implementation of INGRES,” TKDE, 2:2, June 1990.
Stallings, W. [1997] Data and Computer Communications, 5th ed., Prentice-Hall, 1997.
Stallings, W. [2010] Network Security Essentials, Applications and Standards, 4th ed., Prentice-Hall, 2010.
Stevens, P., and Pooley, R. [2003] Using UML: Software Engineering with Objects and Components, Revised edition, Addison-Wesley, 2003.
Stoesser, G. et al. [2003] “The EMBL Nucleotide Sequence Database: Major New Developments,” Nucleic Acids Research, 31:1, January 2003, pp. 17–22.
Stoica, I., Morris, R., Karger, D. et al. [2001] “Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications,” SIGCOMM 2001.
Stonebraker, M., Aoki, P., Litwin W., et al. [1996] “Mariposa: A Wide-Area Distributed Database System” VLDB J, 5:1, 1996, pp. 48–63.
Stonebraker M. et al. [2005] “C-store: A column oriented DBMS,” in VLDB [2005].
Stonebraker, M. [1975] “Implementation of Integrity Constraints and Views by Query Modification,” in SIGMOD [1975].
Stonebraker, M. [1993] “The Miro DBMS” in SIGMOD [1993].
Stonebraker, M., and Rowe, L. [1986] “The Design of POSTGRES,” in SIGMOD [1986].
Stonebraker, M., ed. [1994] Readings in Database Systems, 2nd ed., Morgan Kaufmann, 1994.
Stonebraker, M., Hanson, E., and Hong, C. [1987] “The Design of the POSTGRES Rules System,” in ICDE [1987].
Stonebraker, M., with Moore, D. [1996] Object-Relational DBMSs: The Next Great Wave, Morgan Kaufmann, 1996.
Stonebraker, M., Wong, E., Kreps, P., and Held, G. [1976] “The Design and Implementation of INGRES,” TODS, 1:3, September 1976.
Stroustrup, B. [1997] The C++ Programming Language: Special Edition, Pearson, 1997.
Su, S. [1985] “A Semantic Association Model for Corporate and Scientific-Statistical Databases,” Information Science, 29, 1985.
Su, S. [1988] Database Computers, McGraw-Hill, 1988.
Su, S., Krishnamurthy, V., and Lam, H. [1988] “An Object-Oriented Semantic Association Model (OSAM*),” in

AI in Industrial Engineering and Manufacturing: Theoretical Issues and Applications, American Institute of Industrial Engineers, 1988.
Subrahmanian V. S., and Jajodia, S., eds. [1996] Multimedia Database Systems: Issues and Research Directions, Springer-Verlag, 1996.
Subrahmanian, V. [1998] Principles of Multimedia Database Systems, Morgan Kaufmann, 1998.
Sunderraman, R. [2007] ORACLE 10g Programming: A Primer, Addison-Wesley, 2007.
Swami, A., and Gupta, A. [1989] “Optimization of Large Join Queries: Combining Heuristics and Combinatorial Techniques,” in SIGMOD [1989].
Sybase [2005] System Administration Guide: Volume 1 and Volume 2 (Adaptive Server Enterprise 15.0), Sybase, 2005.
Tan, P., Steinbach, M., and Kumar, V. [2006] Introduction to Data Mining, Addison-Wesley, 2006.
Tanenbaum, A. [2003] Computer Networks, 4th ed., Prentice-Hall PTR, 2003.
Tansel, A. et al., eds. [1993] Temporal Databases: Theory, Design, and Implementation, Benjamin Cummings, 1993.
Teorey, T. [1994] Database Modeling and Design: The Fundamental Principles, 2nd ed., Morgan Kaufmann, 1994.
Teorey, T., Yang, D., and Fry, J. [1986] “A Logical Design Methodology for Relational Databases Using the Extended Entity-Relationship Model,” ACM Computing Surveys, 18:2, June 1986.
Thomas, J., and Gould, J. [1975] “A Psychological Study of Query by Example,” NCC AFIPS, 44, 1975.
Thomas, R. [1979] “A Majority Consensus Approach to Concurrency Control for Multiple Copy Data Bases,” TODS, 4:2, June 1979.
Thomasian, A. [1991] “Performance Limits of Two-Phase Locking,” in ICDE [1991].
Thuraisingham, B. [2001] Managing and Mining Multimedia Databases, CRC Press, 2001.
Thuraisingham, B., Clifton, C., Gupta, A., Bertino, E., and Ferrari, E. [2001] “Directions for Web and E-commerce Applications Security,” Proc. 10th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, 2001, pp. 200–204.
Thusoo, A. et al. [2010] Hive—A Petabyte Scale Data Warehouse Using Hadoop, in ICDE [2010].
Todd, S. [1976] “The Peterlee Relational Test Vehicle—A System Overview,” IBM Systems Journal, 15:4, December 1976.
Toivonen, H., “Sampling Large Databases for Association Rules,” in VLDB [1996].
Tou, J., ed. [1984] Information Systems COINS-IV, Plenum Press, 1984.
Tsangaris, M., and Naughton, J. [1992] “On the Performance of Object Clustering Techniques,” in SIGMOD [1992].
Tsichritzis, D. [1982] “Forms Management,” CACM, 25:7, July 1982.
Tsichritzis, D., and Klug, A., eds. [1978] The ANSI/X3/SPARC DBMS Framework, AFIPS Press, 1978.
Tsichritzis, D., and Lochovsky, F. [1976] “Hierarchical Database Management: A Survey,” ACM Computing Surveys, 8:1, March 1976.
Tsichritzis, D., and Lochovsky, F. [1982] Data Models, Prentice-Hall, 1982.
Tsotras, V., and Gopinath, B. [1992] “Optimal Versioning of Object Classes,” in ICDE [1992].
Tsou, D. M., and Fischer, P. C. [1982] “Decomposition of a Relation Scheme into Boyce Codd Normal Form,” SIGACT News, 14:3, 1982, pp. 23–29.
U.S. Congress [1988] “Office of Technology Report, Appendix D: Databases, Repositories, and Informatics,” in Mapping Our Genes: Genome Projects: How Big, How Fast? Johns Hopkins University Press, 1988.
U.S. Department of Commerce [1993] TIGER/Line Files, Bureau of Census, Washington, 1993.
Ullman, J. [1982] Principles of Database Systems, 2nd ed., Computer Science Press, 1982.
Ullman, J. [1985] “Implementation of Logical Query Languages for Databases,” TODS, 10:3, September 1985.
Ullman, J. [1988] Principles of Database and Knowledge-Base Systems, Vol. 1, Computer Science Press, 1988.
Ullman, J. [1989] Principles of Database and Knowledge-Base Systems, Vol. 2, Computer Science Press, 1989.
Ullman, J. D., and Widom, J. [1997] A First Course in Database Systems, Prentice-Hall, 1997.
Uschold, M., and Gruninger, M. [1996] “Ontologies: Principles, Methods and Applications,” Knowledge Engineering Review, 11:2, June 1996.
Vadivelu, V., Jayakumar, R. V., Muthuvel, M., et al. [2008] “A backup mechanism with concurrency control for multilevel secure distributed database systems.” Proc. Int. Conf. on Digital Information Management, 2008, pp. 57–62.
Vaidya, J., and Clifton, C., “Privacy-Preserving Data Mining: Why, How, and What For?” IEEE Security & Privacy (IEEESP), November–December 2004, pp. 19–27.
Valduriez, P., and Gardarin, G. [1989] Analysis and Comparison of Relational Database Systems, Addison-Wesley, 1989.
van Rijsbergen, C. J. [1979] Information Retrieval, Butterworths, 1979.
Valiant, L. [1990] “A Bridging Model for Parallel Computation,” CACM, 33:8, August 1990.
Vassiliou, Y. [1980] “Functional Dependencies and Incomplete Information,” in VLDB [1980].

Vélez, F., Bernard, G., Darnis, V. [1989] “The O2 Object Manager: an Overview.” In VLDB [1989], pp. 357–366.
Verheijen, G., and VanBekkum, J. [1982] “NIAM: An Information Analysis Method,” in Olle et al. [1982].
Verhofstad, J. [1978] “Recovery Techniques for Database Systems,” ACM Computing Surveys, 10:2, June 1978.
Vielle, L. [1986] “Recursive Axioms in Deductive Databases: The Query-Subquery Approach,” in EDS [1986].
Vielle, L. [1987] “Database Complete Proof Production Based on SLD-resolution,” in Proc. Fourth International Conference on Logic Programming, 1987.
Vielle, L. [1988] “From QSQ Towards QoSaQ: Global Optimization of Recursive Queries,” in EDS [1988].
Vielle, L. [1998] “VALIDITY: Knowledge Independence for Electronic Mediation,” invited paper, in Practical Applications of Prolog/Practical Applications of Constraint Technology (PAP/PACT ’98), London, March 1998.
Vin, H., Zellweger, P., Swinehart, D., and Venkat Rangan, P. [1991] “Multimedia Conferencing in the Etherphone Environment,” IEEE Computer, Special Issue on Multimedia Information Systems, 24:10, October 1991.
VLDB [1975] Proc. First International Conference on Very Large Data Bases, Kerr, D., ed., Framingham, MA, September 1975.
VLDB [1976] Systems for Large Databases, Lockemann, P., and Neuhold, E., eds., in Proc. Second International Conference on Very Large Data Bases, Brussels, Belgium, July 1976, North-Holland, 1976.
VLDB [1977] Proc. Third International Conference on Very Large Data Bases, Merten, A., ed., Tokyo, Japan, October 1977.
VLDB [1978] Proc. Fourth International Conference on Very Large Data Bases, Bubenko, J., and Yao, S., eds., West Berlin, Germany, September 1978.
VLDB [1979] Proc. Fifth International Conference on Very Large Data Bases, Furtado, A., and Morgan, H., eds., Rio de Janeiro, Brazil, October 1979.
VLDB [1980] Proc. Sixth International Conference on Very Large Data Bases, Lochovsky, F., and Taylor, R., eds., Montreal, Canada, October 1980.
VLDB [1981] Proc. Seventh International Conference on Very Large Data Bases, Zaniolo, C., and Delobel, C., eds., Cannes, France, September 1981.
VLDB [1982] Proc. Eighth International Conference on Very Large Data Bases, McLeod, D., and Villasenor, Y., eds., Mexico City, September 1982.
VLDB [1983] Proc. Ninth International Conference on Very Large Data Bases, Schkolnick, M., and Thanos, C., eds., Florence, Italy, October/November 1983.
VLDB [1984] Proc. Tenth International Conference on Very Large Data Bases, Dayal, U., Schlageter, G., and Seng, L., eds., Singapore, August 1984.
VLDB [1985] Proc. Eleventh International Conference on Very Large Data Bases, Pirotte, A., and Vassiliou, Y., eds., Stockholm, Sweden, August 1985.
VLDB [1986] Proc. Twelfth International Conference on Very Large Data Bases, Chu, W., Gardarin, G., and Ohsuga, S., eds., Kyoto, Japan, August 1986.
VLDB [1987] Proc. Thirteenth International Conference on Very Large Data Bases, Stocker, P., Kent, W., and Hammersley, P., eds., Brighton, England, September 1987.
VLDB [1988] Proc. Fourteenth International Conference on Very Large Data Bases, Bancilhon, F., and DeWitt, D., eds., Los Angeles, August/September 1988.
VLDB [1989] Proc. Fifteenth International Conference on Very Large Data Bases, Apers, P., and Wiederhold, G., eds., Amsterdam, August 1989.
VLDB [1990] Proc. Sixteenth International Conference on Very Large Data Bases, McLeod, D., Sacks-Davis, R., and Schek, H., eds., Brisbane, Australia, August 1990.
VLDB [1991] Proc. Seventeenth International Conference on Very Large Data Bases, Lohman, G., Sernadas, A., and Camps, R., eds., Barcelona, Catalonia, Spain, September 1991.
VLDB [1992] Proc. Eighteenth International Conference on Very Large Data Bases, Yuan, L., ed., Vancouver, Canada, August 1992.
VLDB [1993] Proc. Nineteenth International Conference on Very Large Data Bases, Agrawal, R., Baker, S., and Bell, D. A., eds., Dublin, Ireland, August 1993.
VLDB [1994] Proc. 20th International Conference on Very Large Data Bases, Bocca, J., Jarke, M., and Zaniolo, C., eds., Santiago, Chile, September 1994.
VLDB [1995] Proc. 21st International Conference on Very Large Data Bases, Dayal, U., Gray, P.M.D., and Nishio, S., eds., Zurich, Switzerland, September 1995.
VLDB [1996] Proc. 22nd International Conference on Very Large Data Bases, Vijayaraman, T. M., Buchman, A. P., Mohan, C., and Sarda, N. L., eds., Bombay, India, September 1996.
VLDB [1997] Proc. 23rd International Conference on Very Large Data Bases, Jarke, M., Carey, M. J., Dittrich, K. R., Lochovsky, F. H., and Loucopoulos, P., eds., Zurich, Switzerland, September 1997.
VLDB [1998] Proc. 24th International Conference on Very Large Data Bases, Gupta, A., Shmueli, O., and Widom, J., eds., New York, September 1998.
VLDB [1999] Proc. 25th International Conference on Very Large Data Bases, Zdonik, S. B., Valduriez, P., and Orlowska, M., eds., Edinburgh, Scotland, September 1999.
VLDB [2000] Proc. 26th International Conference on Very Large Data Bases, Abbadi, A. et al., eds., Cairo, Egypt, September 2000.

VLDB [2001] Proc. 27th International Conference on Very Large Data Bases, Apers, P. et al., eds., Rome, Italy, September 2001.
VLDB [2002] Proc. 28th International Conference on Very Large Data Bases, Bernstein, P., Ioannidis, Y., Ramakrishnan, R., eds., Hong Kong, China, August 2002.
VLDB [2003] Proc. 29th International Conference on Very Large Data Bases, Freytag, J. et al., eds., Berlin, Germany, September 2003.
VLDB [2004] Proc. 30th International Conference on Very Large Data Bases, Nascimento, M. et al., eds., Toronto, Canada, September 2004.
VLDB [2005] Proc. 31st International Conference on Very Large Data Bases, Böhm, K. et al., eds., Trondheim, Norway, August-September 2005.
VLDB [2006] Proc. 32nd International Conference on Very Large Data Bases, Dayal, U. et al., eds., Seoul, Korea, September 2006.
VLDB [2007] Proc. 33rd International Conference on Very Large Data Bases, Koch, C. et al., eds., Vienna, Austria, September, 2007.
VLDB [2008] Proc. 34th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 1, Auckland, New Zealand, August 2008.
VLDB [2009] Proc. 35th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 2, Lyon, France, August 2009.
VLDB [2010] Proc. 36th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 3, Singapore, August 2010.
VLDB [2011] Proc. 37th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 4, Seattle, August 2011.
VLDB [2012] Proc. 38th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 5, Istanbul, Turkey, August 2012.
VLDB [2013] Proc. 39th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 6, Riva del Garda, Trento, Italy, August 2013.
VLDB [2014] Proc. 40th International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 7, Hangzhou, China, September 2014.
VLDB [2015] Proc. 41st International Conference on Very Large Data Bases, as Proceedings of the VLDB Endowment, Volume 8, Kohala Coast, Hawaii, September 2015, forthcoming.
Voorhees, E., and Harman, D., eds., [2005] TREC Experiment and Evaluation in Information Retrieval, MIT Press, 2005.
Vorhaus, A., and Mills, R. [1967] “The Time-Shared Data Management System: A New Approach to Data Management,” System Development Corporation, Report SP-2634, 1967.
Wallace, D. [1995] “1994 William Allan Award Address: Mitochondrial DNA Variation in Human Evolution, Degenerative Disease, and Aging.” American Journal of Human Genetics, 57:201–223, 1995.
Walton, C., Dale, A., and Jenevein, R. [1991] “A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins,” in VLDB [1991].
Wang, K. [1990] “Polynomial Time Designs Toward Both BCNF and Efficient Data Manipulation,” in SIGMOD [1990].
Wang, Y., and Madnick, S. [1989] “The Inter-Database Instance Identity Problem in Integrating Autonomous Systems,” in ICDE [1989].
Wang, Y., and Rowe, L. [1991] “Cache Consistency and Concurrency Control in a Client/Server DBMS Architecture,” in SIGMOD [1991].
Warren, D. [1992] “Memoing for Logic Programs,” CACM, 35:3, ACM, March 1992.
Weddell, G. [1992] “Reasoning About Functional Dependencies Generalized for Semantic Data Models,” TODS, 17:1, March 1992.
Weikum, G. [1991] “Principles and Realization Strategies of Multilevel Transaction Management,” TODS, 16:1, March 1991.
Weiss, S., and Indurkhya, N. [1998] Predictive Data Mining: A Practical Guide, Morgan Kaufmann, 1998.
Whang, K. [1985] “Query Optimization in Office By Example,” IBM Research Report RC 11571, December 1985.
Whang, K., and Navathe, S. [1987] “An Extended Disjunctive Normal Form Approach for Processing Recursive Logic Queries in Loosely Coupled Environments,” in VLDB [1987].
Whang, K., and Navathe, S. [1992] “Integrating Expert Systems with Database Management Systems—an Extended Disjunctive Normal Form Approach,” Information Sciences, 64, March 1992.
Whang, K., Malhotra, A., Sockut, G., and Burns, L. [1990] “Supporting Universal Quantification in a Two-Dimensional Database Query Language,” in ICDE [1990].
Whang, K., Wiederhold, G., and Sagalowicz, D. [1982] “Physical Design of Network Model Databases Using the Property of Separability,” in VLDB [1982].
White, Tom [2012] Hadoop: The Definitive Guide, (3rd Ed.), O’Reilly, Yahoo! Press, 2012. [hadoopbook.com].
Widom, J., “Research Problems in Data Warehousing,” CIKM, November 1995.
Widom, J., and Ceri, S. [1996] Active Database Systems, Morgan Kaufmann, 1996.

Widom, J., and Finkelstein, S. [1990] “Set Oriented Production Rules in Relational Database Systems,” in SIGMOD [1990].
Wiederhold, G. [1984] “Knowledge and Database Management,” IEEE Software, January 1984.
Wiederhold, G. [1987] File Organization for Database Design, McGraw-Hill, 1987.
Wiederhold, G. [1995] “Digital Libraries, Value, and Productivity,” CACM, April 1995.
Wiederhold, G., and Elmasri, R. [1979] “The Structural Model for Database Design,” in ER Conference [1979].
Wiederhold, G., Beetem, A., and Short, G. [1982] “A Database Approach to Communication in VLSI Design,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1:2, April 1982.
Wilkinson, K., Lyngbaek, P., and Hasan, W. [1990] “The IRIS Architecture and Implementation,” TKDE, 2:1, March 1990.
Willshire, M. [1991] “How Spacey Can They Get? Space Overhead for Storage and Indexing with Object-Oriented Databases,” in ICDE [1991].
Wilson, B., and Navathe, S. [1986] “An Analytical Framework for Limited Redesign of Distributed Databases,” Proc. Sixth Advanced Database Symposium, Tokyo, August 1986.
Wiorkowski, G., and Kull, D. [1992] DB2: Design and Development Guide, 3rd ed., Addison-Wesley, 1992.
Witkowski, A., et al, “Spreadsheets in RDBMS for OLAP”, in SIGMOD [2003].
Wirth, N. [1985] Algorithms and Data Structures, Prentice-Hall, 1985.
Witten, I. H., Bell, T. C., and Moffat, A. [1994] Managing Gigabytes: Compressing and Indexing Documents and Images, Wiley, 1994.
Wolfson, O., Chamberlain, S., Kalpakis, K., and Yesha, Y. [2001] “Modeling Moving Objects for Location Based Services,” NSF Workshop on Infrastructure for Mobile and Wireless Systems, in LNCS 2538, pp. 46–58.
Wong, E. [1983] “Dynamic Rematerialization: Processing Distributed Queries Using Redundant Data,” TSE, 9:3, May 1983.
Wong, E., and Youssefi, K. [1976] “Decomposition—A Strategy for Query Processing,” TODS, 1:3, September 1976.
Wong, H. [1984] “Micro and Macro Statistical/Scientific Database Management,” in ICDE [1984].
Wood, J., and Silver, D. [1989] Joint Application Design: How to Design Quality Systems in 40% Less Time, Wiley, 1989.
Worboys, M., Duckham, M. [2004] GIS – A Computing Perspective, 2nd ed., CRC Press, 2004.
Wright, A., Carothers, A., and Campbell, H. [2002]. “Gene-environment interactions the BioBank UK study,” Pharmacogenomics Journal, 2002, pp. 75–82.
Wu, X., and Ichikawa, T. [1992] “KDA: A Knowledge-based Database Assistant with a Query Guiding Facility,” TKDE 4:5, October 1992.
www.oracle.com/ocom/groups/public/@ocompublic/documents/webcontent/039544.pdf.
Xie, I. [2008] Interactive Information Retrieval in Digital Environments, IGI Publishing, Hershey, PA, 2008.
Xie, W. [2005] “Supporting Distributed Transaction Processing Over Mobile and Heterogeneous Platforms,” Ph.D. dissertation, Georgia Tech, 2005.
Xie, W., Navathe, S., Prasad, S. [2003] “Supporting QoS-Aware Transaction in the Middleware for a System of Mobile Devices (SyD),” in Proc. 1st Int. Workshop on Mobile Distributed Computing in ICDCS ’03, Providence, RI, May 2003.
XML (2005): www.w3.org/XML/.
Yan, W.P., and Larson, P.A. [1995] “Eager aggregation and Lazy Aggregation,” in VLDB [1995].
Yannakakis, Y. [1984] “Serializability by Locking,” JACM, 31:2, 1984.
Yao, S. [1979] “Optimization of Query Evaluation Algorithms,” TODS, 4:2, June 1979.
Yao, S., ed. [1985] Principles of Database Design, Vol. 1: Logical Organizations, Prentice-Hall, 1985.
Yee, K.-P. et al. [2003] “Faceted metadata for image search and browsing,” Proc. ACM CHI 2003 (Conference on Human Factors in Computing Systems), Ft. Lauderdale, FL, pp. 401–408.
Yee, W. et al. [2002] “Efficient Data Allocation over Multiple Channels at Broadcast Servers,” IEEE Transactions on Computers, Special Issue on Mobility and Databases, 51:10, 2002.
Yee, W., Donahoo, M., and Navathe, S. [2001] “Scaling Replica Maintenance in Intermittently Synchronized Databases,” in CIKM, 2001.
Yoshitaka, A., and Ichikawa, K. [1999] “A Survey on Content-Based Retrieval for Multimedia Databases,” TKDE, 11:1, January 1999.
Youssefi, K. and Wong, E. [1979] “Query Processing in a Relational Database Management System,” in VLDB [1979].
Zadeh, L. [1983] “The Role of Fuzzy Logic in the Management of Uncertainty in Expert Systems,” in Fuzzy Sets and Systems, 11, North-Holland, 1983.
Zaharia M. et al. [2012] “Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing,” in Proc. Usenix Symp. on Networked System Design and Implementation (NSDI) April 2012, pp. 15–28.
Zaniolo, C. [1976] “Analysis and Design of Relational Schemata for Database Systems,” Ph.D. dissertation, University of California, Los Angeles, 1976.

1214 Bibliography Zaniolo, C. [1988] “Design and Implementation of a Logic Databases: Techniques and Applications (Shih, T. K., Based Language for Data Intensive Applications,” ed.), Idea Publishing, 2002. ICLP/SLP 1988, pp. 1666–1687. Zhou, X., and Pu, P. [2002] “Visual and Multimedia Infor- mation Management,” Proc. Sixth Working Conf. on Zaniolo, C. [1990] “Deductive Databases: Theory meets Visual Database Systems, Zhou, X., and Pu, P. (eds.), Practice,” in EDBT,1990, pp. 1–15. Brisbane Australia, IFIP Conference Proceedings 216, Kluwer, 2002. Zaniolo, C. et al. [1986] “Object-Oriented Database Sys- Ziauddin, M. et al. [2008] “Optimizer Plan Change tems and Knowledge Systems,” in EDS [1984]. Management: Improved Stability and Performance in Oracle 11g,” in VLDB [2008]. Zaniolo, C. et al. [1997] Advanced Database Systems, Zicari, R. [1991] “A Framework for Schema Updates in an Morgan Kaufmann, 1997. Object-Oriented Database System,” in ICDE [1991]. Zloof, M. [1975] “Query by Example,” NCC, AFIPS, 44, 1975. Zantinge, D., and Adriaans, P. [1996] Managing Client Zloof, M. [1982] “Office By Example: A Business Language Server, Addison-Wesley, 1996. That Unifies Data, Word Processing, and Electronic Mail,” IBM Systems Journal, 21:3, 1982. Zave, P. [1997] “Classification of Research Efforts in Zobel, J., Moffat, A., and Sacks-Davis, R. [1992] “An Requirements Engineering,” ACM Computing Efficient Indexing Technique for Full-Text Database Surveys, 29:4, December 1997. Systems,” in VLDB [1992]. Zvieli, A. [1986] “A Fuzzy Relational Calculus,” in EDS Zeiler, Michael. [1999] Modeling Our World—The ESRI [1986]. Guide to Geodatabase Design, 1999. Zhang, T., Ramakrishnan, R., and Livny, M. [1996] “Birch: An Efficient Data Clustering Method for Very Large Databases,” in SIGMOD [1996]. Zhao, R., and Grosky, W. [2002] “Bridging the Semantic Gap in Image Retrieval,” in Distributed Multimedia

Index ‘ ’, string notation (single quotation), 182, instantiation, 130 OQL collections and, 413–414 196, 347–348 knowledge representation (KR) and, parallel algorithms, 686 QBE (Query-by-Example) language, :, multiple inheritance (colon) notation, 129 393 Access control 1175–1177 query execution and, 709 @, XPath attribute names, 444 content-based, 1142 SQL query retrieval and, 216–219 =, EQUIJOIN comparison operator, 253 credentials and, 1142 relational algebra for, 260–261 –>, dereferencing in SQL, 386 defined, 1126 Aggregate operation implementation, –>, operation arrow notation, 392 Directory Services Markup Language ←, assignment operation, relational 678–679 (DSML) and, 1142 Aggregation algebra, 245 e-commerce environment and, 1141 ρ, RENAME operator, 245–246 mandatory access control (MAC), semantic modeling process, 131–133 “ ”, operator notation (double quotation), UML class diagrams, 87–88 1121, 1134–1137 Algorithms, concurrency control 196, 347–348 mobile applications, 1141–1142 Thomas’s write rule, 795 $, XQuery variable prefix, 445 row-level, 1139–1140 timestamp ordering (TO), 793 %, arbitrary number replacement Web policies, 1141–1142 Algorithms, data mining XML, 1140–1141, 1142 apriori algorithm, 1075–1076 symbol, SQL, 195–196 Access paths BIRCH algorithm, 1090 ( ), SQL notation data modeling, 34 FP-growth algorithm, 1077–1080 DBMS classification from, 52 genetic algorithms (GAs), 1093 constraint conditions for assertions, Action, SQL triggers, 227 k-means algorithm, 1088–1089 226 Active database systems, 4, 22 partition algorithm, 1081 Active database techniques, SQL, 202 sampling algorithm, 1076–1077 explicit set of values, 214 Active databases Algorithms, database recovery tuple value comparisons, 210 design issues, 967–972 ARIES recovery algorithm, 827–831 ( ), XML DTD element notation, 434 enhanced data models, 963–974 idempotent operations of, 815 *, SQL notation event-condition-action (ECA) model, NO-UNDO/REDO, 815, 821–823 attribute 
specification and retrieval, UNDO/REDO, 815 963–964 Algorithms, encryption 193 expert (knowledge-based) systems, asymmetric key encryption tuple rows in query results, 218 *, XPath elements (wildcard symbol), 444 962–963 algorithms, 1151 *__, NATURAL JOIN comparison implementation issues, 967–972 RSA public key encryption algorithm, triggers, 963–967, 973–974 operator, 253 Active rules 1152 / and //, path separators, XML, 443 applications for, 972–973 symmetric key algorithms, 1150–1151 /, escape operator, SQL, 196 event-condition-action (ECA) model, Algorithms, normalization [ ], UDT arrays (brackets), 383 alternative RDB designs, 524–527 _, single character replacement symbol, 963–964 BCNF schemas, 522–523 functionality of, 962 dependency preservation, 519–522 SQL, 195–196 statement-level rules in STARBURST, ER-to-relational mapping, 290–296 ||, concatenation operator (double bar), nonadditive (lossless) join property 970–972 SQL, 182–183 Actuator, disk devices, 551 decomposition, 519–523 d, disjointness constraint notation, Acyclic graphs, 52. 
See also Hierarchies RDB schema design, 519–527 Adaptive optimization, Oracle, 735 3NF schemas, 519–522 114–115 ADD CONSTRAINT keyword, SQL, 234 Algorithms, queries ∪, set union operation, 120 Advanced Encryption Standards (AES), external sorting, 660–663 ≡, equivalent to symbol, 274 heuristic algebra optimization, σ, SELECT operator, 241 1150 ⇒, implies symbol, 274 After image (AFIM) updating, 816 700–701 1NF, see First normal form (1NF) Agent-based approach, Web content parallel processing, 683–687 2NF, see Second normal form (2NF) PROJECT operation, 676–678 3NF, see Third normal form (3NF) analysis, 1053–1054 SELECT operation, 663–668 4NF, see Fourth normal form (4NF) Aggregate functions set operation, 676–678 5NF, see Fifth normal form (5NF) Alias (tuple variables) of attributes, 192 Abstraction concepts asterisk (*) for tuple rows of query results, 218 aggregation, 131–133 association, 131–132 discarded NULL values, 218 classification, 130 grouping and, 216–218, 260–261 identification, 130–131

1216 Index ALL option, SQL, 194–195, 210 Internet SCSI (iSCSI), 590 Associations, UML class diagrams 87–88 All-key relation, 491, 493 label security, 1156–1157 Associative arrays, PHP, 350 Allocation of file blocks on a disk, 564 mappings, 37 Asterisk (*) ALTER command, SQL, 233–234 network-attached storage (NAS), ALTER TABLE command, SQL, 180 all attribute specification, 193 Analysis, RDB design by, 503 589–590 tuple rows of query results, 218 Analytical data store (ADS), 1105 n-tier for Web applications, 49–51 Asymmetric key encryption algorithms, Analytical operations, spatial databases, parallel database, 683 parallel versus distributed, 869 1151 988 pure distributed databases, 869–871 Atom constructor, 368, 369 Anchor texts, 1027 shared-disk, 683 Atomic (single-valued) types, 368 AND/OR/NOT operators shared-memory, 683 Atomic literals, 388 shared-nothing, 684 Atomic objects, ODMG models, 388, Boolean conditions, 270–271 storage area networks (SANs), quantifier transformations using, 274 395–398 Annotations, XML language, 440 588–589 Atomic values Anomalies storage, 588–592 deletion, 467 three-schema, 36–38 domains, 151 insertion, 465–466 three-tier client/server, 49–51, 872–875 first normal form (1NF), 477–478 modification, 467 two-tier client-server, 49 tuples, 155 RDB design and, 465–467 Web applications, 49–51 Atomicity property, transactions, 14, 157 tuple redundant information YARN (Hadoop v2), 940–942 Atoms ARIES recovery algorithm, 827–831 domain relational calculus formulas, avoidance using, 465–467 Arithmetic operations, SQL query update, 465–467 277–278 Anti-join (AJ) operator, 658–660, recovery and, 196–197 tuple relational calculus formulas, Armstrong’s axioms, 506–509 677–678, 681, 719–720 Array constructor, 369 270–271 Apache systems Array processing, Oracle, 735–736 truth value of, 270, 277 Arrays Attribute data, 989 Apache Cassandra, 900 Attribute-defined specialization, 114, 126 Apache Giraph, 943 associative, 350 Attribute preservation, RDB 
Apache Hive, 933–936 brackets ([ ]) for, 383 Apache Pig, 932–933 dynamic, 345–346 decomposition condition, 513 Apache Tez, 943 numeric, 349 Attribute versioning, 982–984 Apache Zookeeper, 900 PHP programming, 345–346, 348–350 Attributes. See also Entities Big data technologies for, 932–936, UDT elements, 383 AS option, SQL, 196 ambiguous, prevention of, 191–192 943–944 Assertions asterisk (*) for, 193 API (Application programming constraint conditions in parentheses clarity of in RDB design, 461–465 complex, 66–67, 441 interface) ( ) for, 226 composite, 65–66, 441 client-side program calls from, 49 CREATE ASSERTION statement, conceptual data models, 33 data mining, 1095 constraints and defaults in SQL, database programming and, 312, 326 225–226 library of functions, 312, 326 declarative, 225–227 184–186 Application-based (semantic) relation schema and, 156 data types in SQL, 182–184 SQL constraint specification, 158, 165, default values, 184–186 constraints, 158 defined, 63 Application development environments, 225–226 defining, 114 Assignment operations (←), relational degree (arity) of, 152 47 derived, 66 Application programmers, 16 algebra, 245 discriminating, 299–300 Application programs, 6, 313 Association rules EER-to-relational mapping, 298–300 Application server, 44, 50 entities and, 63–65 ApplicationMaster (AM), YARN, 942 apriori algorithm, 1075–1076 ER models, 63–70 Apriori algorithm, 1075–1076 complications with, 1084 ER-to-relational mapping, 295–296 Arbitrary number replacement symbol confidence of, 1074 functional dependency of, 472–473 data mining, 1073–1084 grouping, 219, 260–261 (%), 195–196 FP-growth algorithm, 1077–1080 HTML tags, 430 Architecture frequent-pattern (FP) tree, 1077–1080 key (uniqueness constraint), 68–69 hierarchies and, 1081–1082 multiple keys for, 631–632 automated storage tiering (AST), 591 market-basket data model, 1073–1075 multivalued, 66, 295–296, 481 centralized DBMS, 46–47 multidimensional associations, normal form keys, 
477 client/server, 47–49 NULL values, 66, 184–186 data independence and, 37–38 1082–1083 ODMG model objects, 396 database systems and, 46–51 negative associations, 1082–1084 ordered indexes, 631–632 distributed databases (DDBs), 868–875 partition algorithm, 1081 partial key, 79 federated database (FDBS) schema, sampling algorithm, 1076–1077 prime/nonprime, 477 support for, 1074 project, 189 871–872 Association, semantic modeling process, query retrieval in SQL, 191–192 Fibre Channel over Ethernet (FCoE), 131–132 590–591 Fibre Channel over IP (FCIP), 590

Index 1217 RDB design and, 461–465, 472–473 Backup utility, 45 BIRCH algorithm, 1090 relation schema and, 152, 461–465 Bag constructor, 369 Bitemporal relations, 980–982 relational algebra, 245–246 Base class, 127 Bit-level striping, RAID, 584, 586 relational model domains and, Base tables (relations), 180, 182 Bit-string data types, 183 Before image (BFIM) updating, 816 Bits of data, 547 152–153 Behavior inheritance, 393 Bitmap indexes, 634–637, 1109–1110 relationships as, 74 BETWEEN comparison operator, SQL, BLOBs (binary large objects, 560–561 relationships types of, 78 Block-level striping, RAID, 584–585, 586 renaming, 192, 214–215, 245–246 196–197 Block transfer time, disk devices, 552 roles for a domain, 152 Bidirectional associations, UML class Blocking factor, records, 563 semantics for, 461–465 Blocking records, 563–564 simple (atomic), 65–66 diagrams, 87 Boolean data types, 183 single-valued, 66 Big data storage systems, 3, 26, 31, 51 Boolean model, IR, 1030 SQL use of, 184–186, 191–192 Big data technologies Boolean queries, 1035–1036 stored, 66 Boolean (TRUE/FALSE) statements subclass specialization, 114 Apache systems, 932–936, 943–944 tree-structured data models, XML, 433 cloud computing, 947–949 OQL, 414 tuple modification for, 166, 168–169 distributed and database combination, relational algebra expressions, 241–242 update (modify) operation for, SQL query retrieval, 212–214 841 tuple relational calculus formulas, 168–169 Hadoop, 916–917, 921–926 value sets (domains) of, 69–70 MapReduce (MR), 917–921, 926–936 270–271 versioning, 982–984 parallel RDBMS compared to, 944–946 Bottom-tier database server, DBMS as, visible/hidden, 371, 375 technological development of, XML, 433, 441 344 Audio data source analysis, 999 911–913 Bottom-up conceptual synthesis, 119 Audio sources, multimedia databases, variety of data, 915 Bottom-up method, RDB design, 460, velocity of data, 915 996 veracity of data, 915–916 504 Audit trail, 1127 volume of data, 914 Bound 
columns approach, SQL/CLI Authorization, SQL views as mechanisms YARN (Hadoop v2), 936–944, 949–953 Binary association, UML class diagrams, query results, 329 of, 232 Boyce-Codd normal form (BCNF) AUTHORIZATION command, SQL, 315 87 Authorization identifier, SQL schemas, Binary locks, 782–784 decomposition of relations not in, Binary operations 489–491 179 Automated storage tiering (AST), 591 complete set of, 255 definition of, 488 Autonomy, DDBs, 845–846 DIVISION operation, 255–257 nonadditive join test for binary Auxiliary access structure, 546 JOIN operation, 251–255 Availability OUTER JOIN operations, 262–264 decomposition (NJB), 490 query tree notation, 257–259 relations in, 487–489 DDBs, 844–845 relational algebra and, 240, 251–259, Browsing, 1027 loss of, database threat of, 1122 Browsing interfaces, 40 NOSQL, 885–886 262–264 Bucket join, MapReduce (MR), 931 AVERAGE function, grouping, 260 set theory for, 247 Buckets, hashing, 575–576 AVG function, SQL, 217 Binary relationships Buffer, disk blocks, 550–551 Axioms, 1005 cardinality ratios for, 76–77 Buffer replacement policy, 749 B-trees constraints on, 76–78 Buffer space, nested-loop join and, dynamic multilevel indexes degree of, 73 ER models, 73–74, 76–78 672–673 implementation, 617–622 ER-to-relational mapping, 293–295 Buffering file organization and, 583 existence dependency, 77–78 dynamic multilevel index participation constraints, 77–78 buffer management, 557–558 relationship type, 73–74 buffer replacement strategies, 559–560 implementation, 617–622 ternary relationships compared to, CPU processing and, 556–557 physical database design and, 601–602, data using disk devices, 552 88–91 database recovery, 815–816 617–622 Binary search, files, 570 disk blocks, 541, 556–560, 815–816 unbalanced, 617 Bind variables, SQL injection and, double buffering technique, 556–557 variations of, 629–630 Buffering (caching) modules, 20, 42 B+-trees 1145–1146 Built-in functions, UDT, 384 bitmaps for leaf nodes of, 636–637 
Binding Built-in interfaces, ODMG models, dynamic multilevel index C++ language binding, 417–418 393–396 implementation, 622–625 early (static), 344 Built-in variables, PHP, 352–353 physical database design and, 601–602, JDBC statement parameters, 333 Bulk loading process, indexes, 639 late (dynamic), 377 Bulk transfer time, disk devices, 552 622–630 OBDs, 377 Business rules, 21 search, insert and deletion with, ODMG standards and, 386, 417–418 Bytes of data, 547 programming language, 312 C language, SQL/CLI (SQI call level 625–629 polymorphism and, 377 variations of, 629–630 SQL/CLI statement parameters, 329 interface), 326–331 Backup and recovery subsystem, 20 C++ language binding, ODMG, 417–418

1218 Index Cache memory, 543 Child nodes, tree structures, 617 Column-based storage of relations, Caching (buffering) disk blocks, database Ciphertext, 1149 indexing for, 642 Class diagrams, UML, 85–88 recovery, 815–816 Class library Comments, PHP programming, 345 Calendar, 975 Commit point, transaction processing, 756 CALL statement, stored procedures, 337 OOPL (object-oriented programming Committed projection, schedules, 760 Candidate key, 159–160, 477 language) and, 312 Communication autonomy, DDBs, 845 Canned transactions, 15 Communication software, DBMS, 46 CAP theorem, NOSQL, 888–890 SQL imported from JDBC, 331, 332 Communication variables in embedded Cardinality Classes SQL, 315, 316 JOIN operations, 719–720 EER model relationships, 108–110 Commutative property, SELECT of a relational domain, 152 inheritance, 110, 118 CARDINALITY function, 383 interface inheritance, ODL, 404–405 operation, 243 Cardinality ratios, 76–77 interfaces, instantiable behavior and, Comparison operators Cartesian product of a relational domain, 392 select-from-where query structure 153 Java, 331 and, 188–190 CARTESIAN PRODUCT operation, object data models, 52 ODL, 400, 404–405 select-project-join query structure 249–251 ODMG models, 392, 404–405 and, 189, 191 CASCADE option, SQL, 233, 234 operations and type definitions, 371 Cascaded values property specification, 130 SQL query retrieval, 188–191, 195–197 subclasses, 108–110, 126 substring pattern matching, 195–197 insert violation and, 167 superclasses, 109, 110, 126 Compiled queries, 710 SELECT operation sequence of, 243 Clausal form, deductive databases, Compilers SQL constraint options, 186–187 DBMS interface modules, 42–45 Cascading rollback phenomenon 1003–1005 DDL for schema definitions, 42–43 database recovery and, 819–821 Client, defined, 48 query, 43–44 schedules, 762 Client computer, 44 precompiler, 44 timestamp ordering, 794 Client machines, 47 Complete schedule conditions, 760 CASE (computer-aided software Client module, 
31 Complete set of relational binary Client program, 313 engineering), 46–47 Client/server architectures operations, 255 CASE clause, SQL, 222–223 Completeness (totalness) constraint, 115 Casual end users, 15–16 basic, 47–49 Complex attributes, 66–67 Catalog management, DDBs, 875 centralized DBMS, 46–47 Complex elements, XML, 431, 441 Catalogs two-tier, 49 Composite, 65–66 Client tier, HTML and, 344 Composite (compound) attributes, component modules and, 42–45 CLOSE CURSOR command, SQL, 318 DBMS, 10–11, 35, 38, 42–45 Closed world assumption, 156 XML, 441 file storage in, 10–11 Closure, functional dependencies, Composite keys, 631 schema description storage, 35, 38, 180 Concatenation operator (||) in SQL, SQL concept, 179–180 505–506, 508 Catastrophic failures, database backup Cloud computing 182–183 Concept hierarchy, 1053 and recovery from, 832–833 Big data technology for, 947–949 Conceptual (schema) level, 37 Categories environment, 31 Conceptual data models, 33 Cloud storage, 3 Conceptual design defined, 126 Clustered file, 572, 583, 602–603 EER modeling concept, 108, 120–122, Clustering, data mining, 1088–1091 comparison of ODB and RDB, 405–406 Clustering indexes, 602, 606–608 high-level data model design, 61–62 126 Clusters, file blocks, 564 mapping EER schema to ODB schema, EER-to-relational mapping, 302–303 Code generator, query processing, 655 partial, 122 Code injection, SQL, 1144 407–408 superclasses and, 120–122 Collection (multivalued) constructors, Conceptualization, ontology and, 134 total, 122 Concurrency union types using, 120–122, 302–303 369 Cautious waiting algorithm, deadlock Collection objects, ODMG models, control, 749–752, 770–771 serializability of schedules and, prevention, 791 393–394 Central processing unit (CPU), primary Collection operators, OQL, 413–416 770–771 Collections transaction processing, 746–747 storage of, 542 Concurrency control protocols, 781 Centralized DBMS, 52 built-in interfaces, ODMG, 393–396 Concurrency control 
software, 13–14 Centralized DBMS architectures, 46–47 entity sets, 67–68 Concurrency control techniques Certification of transactions, 781 object extent and, 373, 376 data insertion and, 806 Certify locks, 796–797 persistent, 373, 376 deletion operation and, 806 Chaining, hashing collision resolution, 574 transient, 376 distributed databases (DDBs), 854–857 Character-string data types, 182–183 Collision resolution, hashing, 574 granularity of data items, 800–801 Characters of data, 547 Column, SQL, 179 index concurrency control using locks, CHECK clauses for, 187 Column-based data models, 51, 53 Checkpoints, database recovery, Column-based NOSQL, 888, 900–903 805–806 interactive transactions and, 807 818–819, 828–829 latches and, 807 locking data items, 781

Index 1219 locks used for, 782–786, 796–797, referential integrity, 21, 186–187 CREATE SCHEMA statement, 179–180 805–806 relational database schemas, 160–163 CREATE TABLE command, SQL, relational models and, 157–167 multiple granularity locking, 801–804 relationships and, 76–78 180–182 multiversion concurrency control, row-based, 187 CREATE TRIGGER statement, SQL, schema-based (explicit), 157 781, 795–797 semantics and, 21 225, 226–227 phantom records and, 806–807 specialization, 113–116 CREATE TYPE command, 184, 380–383 snapshot isolation, 781, 799–800 SQL specifications, 165, 184–187, CREATE VIEW statement, SQL, timestamp ordering (TO), 792–795, 796 timestamps, 781, 790–791, 793 225–226 228–229 two-phase locking (2PL), 782–792, state, 165 Credentials, access control and, 1142 structural, 78 CROSS PRODUCT operation 796–797 table-based, 184–187 validation (optimistic) of transactions, ternary relationships, 91–92 relational algebra set theory, 249–251 transition, 165 SQL tuple combinations, 192–193 781, 798–799 triggers in SQL, 58, 165 CRUD (create, read, update, and delete) Conditions UML notation for, 127–128 uniqueness, 21 operations, NOSQL, 887, 893, 903 constraint parentheses ( ) for user-defined subclasses, 114 Cursors assertions, 226 violations, 166–167 Constructor function, SQL declaration of, 317, 319–320 trigger component in SQL, 227 impedance mismatch and, 312 Conflict equivalence, schedules, 765–766 encapsulation, 384 iterator as, 318 Conjunctive selection, search methods Constructors, see Type constructors SQL query result processing, 312, Constructs, 35 for, 665–666 Content-based access control, 1142 317–320 CONNECT TO command, SQL, 315 Content-based retrieval, 995 updating records, 318 Connecting fields for mixed records, Contiguous allocation, file blocks, 564 Cypher query language, Neo4j system, Control measures, database security, 582–583 905–908 Connecting to a database 1123–1125 Dangling tuples, RDB design problems, Conversational information 
access, IR, embedded SQL, 315–316 523–524 PHP, 353–355 1059 Data Connection record, SQL/CLI, 327–328 Conversion of locks, 786 Connection to database server, 313 Core specifications, SQL, 178 Big data technology for, 914–916 Consistency preservation, transactions, Correlated nested queries, SQL, 211–212 complex relationships among, 21 Cost-based query optimization conceptual representation of, 12 757 databases and, 7–8, 12–14 Constant nodes, query graphs, 273 approach, 710–712 defined, 4 Constraint specification language, 165 defined, 710 directed graph representation of, Constraints dynamic programming compared to, 427–428 application-based (semantic), 158 716 elements, 7 assertions in SQL, 58, 165, 225–226 illustration of, 726–728 eXtended Markup Language (XML) attribute defaults and, 184–186 Cost estimation attribute-defined specialization, 114 catalog information in cost functions and, 25, 426–430 binary relationships, 76–78 granularity of data items, 800–801 business rules, 21 for, 712 insulation from programs and, 12–13 CHECK clauses for, 187 histograms for, 713 integrity constraints, 21–22 completeness (totalness), 115 JOIN optimization based on cost interchanging on the Web, 25 conditions in parentheses ( ) for logical independence, 37–38 formulas, 720–721 multiple views of, 13 assertions, 226 query execution components, 710–712 multiuser transactions and, 13–14 database applications, 21–22, 160–163 query optimization technique, 657, physical independence, 38 disjointness (d notation), 114–115 records, 6–7 domain, 158 710–713, 716–717 requirements collection and analysis, EER models and, 113–116 selection based on cost formulas, ER models and, 76–78, 91–92 60–61 existence dependency, 77–78 716–717 self-describing, 10, 427 foreign keys, 163, 186–187 Cost functions semantics and, 21 generalization, 113–116 semistructured, 426–428 indexes for management of, 641 JOIN operation use of, 717–726 sharing, 13–14 inherent model-based (implicit), 157 query 
optimization, 714–715, 717–726 storage, 3–4 inherent rules, 22 SELECT operation use of, 714 structured, 426 insert operation and, 166–167 COUNT function tag notation and use, HTML, 428–430 integrity, 21–22, 160–163 grouping, 260 three-schema architecture and, 37–38 key, 21, 158–160, 163–165, 186–187 SQL, 217 type, 7–8 minimum cardinality, 77 Covert channels, flow control and, unstructured, 428–430 naming, 187 variety of, 915 NULL value and, 160, 163 1148–1149 velocity of, 915 participation, 77–78 CREATE ASSERTION statement, SQL, veracity of, 915–916 predicate-defined subclasses, 113–114 volume of, 914 225–226 virtual, 13

1220 Index Data abstraction representational, 33 conceptual design, 61–62, 70–72 conceptual representation of, 12–13 self-describing, 34 data modal mapping, 62 data models and, 12, 32–34 Data normalization, 475–476 entities and attributes for, 70–72 program independence from, 12 Data organization transparency, DDBs, ER (Entity-Relationship) models for, Data allocation, DDBs, 849–853 843 60–62, 70–72 Data-based approach, Web content Data quality, database security and, 1154 functional requirements for, 61 Data replication, DDBs, 849–853 logical design, 62 analysis, 1054 Data security physical design, 62 Data buffers, transaction processing, requirements collection and analysis, access acceptability and, 1127 748–749 authenticity assurance and, 1127 60–61 Data-centric documents, XML, 431 data availability and, 1127 schema creation, 61–62 Data collection and records, PHP, 355–356 sensitivity of data and, 1126–1127 Database designer, 15 Data definition, SQL, 179 Data sources Database items, transaction processing, Data dictionary (data repository), 45–46 databases as, 425 Data Encryption Standards (DES), 1150 JDBC, 331 748 Data fragmentation, DDBs, 847–853 Data striping, RAID, 584–585 Database management systems, see DBMS Data independence, three-schema Data transfer costs, DDB query (database management system) architecture and, 37–38 processing, 860–862 Database monitoring, SQL triggers for, Data insertion, concurrency control and, Data types 226–227 806 attributes in SQL, 182–184 Database programming Data manipulation language (DML), bit strings, 183 Boolean, 183 application programming interface 39–40, 44 character strings, 182–183 (API), 312 Data marts, 1102 CREATE TYPE command, 184 Data mining DATE, 183 database application implementation, INTERVAL, 184 309 application programming interface numeric, 182 (API), 1095 records, 560–561 embedding commands in programming relational model domains, 151 language, 311, 314–320 applications of, 1094 spatial, 989–990 
association rules, 1073–1084 TIME, 183 evolution of, 309–310 BIRCH algorithm, 1090 TIMESTAMP, 183–184 impendence mismatch, 312–313 classification, 1085–1088 Data values, records, 560 language design for, 312, 339 clustering, 1088–1091 Data warehouses library of functions or classes for, commercial tools, 1094–1096 building, 1111–1114 data warehousing compared to, 1070 data modeling for, 1105–1110 311–312, 326–335 decision trees, 1085–1086 defined, 1102 overview of techniques and issues, genetic algorithms (GAs), 1093 ETL (extract, transform, load) process, graphical user interface (GUI), 1095 310–311 k-means algorithm, 1088–1089 1103 sequence of interaction, 313–314 knowledge discovery in databases functionality of, 1114–1115 stored procedures, 335–338 use of, 4 Web programming using PHP, 343–359 (KDD), 1070–1073 views compared to, 1115 Database recovery techniques neural networks, 1092 Data warehousing ARIES recovery algorithm, 827–831 Open Database Connectivity (ODBC) analytical data store (ADS), 1105 caching (buffering) disk blocks, 815–816 characteristics of, 1103–1104 cascading rollback and, 819–821 interface, 1094–1095 data mining compared to, 1070 checkpoints, 818–819, 828–829 regression, 1091–1092 DSS (decision-support systems), 1102 database backup and recovery from sequential pattern discovery, 1091 master data management (MDM), 1110 spatial databases, 993–994 OLAP (online analytical processing), catastrophic failures, 832–833 Data model mapping deferred updates for recovery, 814, database design and, 62 1102 logical database design, 289 OLTP (online transaction processing), 821–823 Data models. 
See also Object data models force/no-force rules, 817–818 access path, 34 1102–1103 fuzzy checkpointing, 819, 828 basic operations, 32 operational data store (ODS), 1105 idempotent operations, 815 categories of, 33–34 query optimization, 731–733 immediate updates for recovery, 815, conceptual, 12–13, 33 use of, 1101 data abstraction and, 12, 32–34 warehouse implementation difficulties, 823–826 database schemas for, 34–38 multidatabase system recovery, DBMS classification from, 51–53 1115–1117 dynamic aspect of applications, 23 Database administrator, see DBA 831–834 EER (enhanced entity-relationship), NO-UNDO/REDO algorithm, 815, (database administrator) 107–146 Database design 821–823 ER (entity-relationship), 59–105 shadow paging, 826–827 object, 33, 51, 52–53 active databases, 967–972 steal/no-steal rules, 817–818 relational, 33, 51, 52, 149–157 system log for, 814, 817, 818–819 transaction rollback and, 819 transactions not affecting database, 821 UNDO/REDO algorithm, 815, 818 write-ahead logging (WAL), 816–818 Database schema, ontology as, 134

Index 1221 Database security DBMS classification, 51–53 persistent storage, 19–20 access acceptability and, 1127 defined, 6 program-data independence, 12 access control, 1126 environment program-operation independence, 12 additional forms of protection, 1123 environment of, 6–7, 42–46 properties of, 5 authenticity assurance and, 1127 extension of, 35 protection, 6 challenges for maintaining, 1154–1155 initial state, 35 queries, 6, 20 control measures, 1123–1125 instances, 35 real-time technology, 4 data availability and, 1127 interfaces, 40–42 redundancy control, 18–19 database administrator (DBA) and, languages, 38–40 relational, 24 1125–1126 module functions in, 31, 42–45 rules for inferencing information, 22 discretionary action control, 1121, populating (loading), 35 search techniques, 4 1129–1134 schemas, 34–38 self-describing data, 10 discretionary privileges, types of, tools, 45–46 sharing, 6 1129–1130 utilities, 45 Structured Query Language (SQL), 26 discretionary security mechanisms, 1123 valid state, 35 standards enforced by, 22 encryption, 1149–1153 Databases traditional applications, 3 flow control, 1147–1149 big data storage systems and, 26 transactions, 6, 14 GRANT command for, 1131 DBMS (database management triggers for, 22 GRANT OPTION for, 1131 unauthorized access restriction, 19 granting and revoking privileges, systems) for, 6, 9, 17–23, 27 updating information, 23 1129–1134 active systems, 4, 22 Datalog language information privacy relationship to, application programs for, 6 clausal form, 1003–1005 1128–1129 backing up, magnetic tape storage for, deductive databases, 1001, 1002–1003 label-based security policy, 1139–1140, Horn clauses, 1004 1155–1158 555–556 notation, 1000–1003 limiting privilege propagation, backup and recovery subsystem, 20 program safety, 1007–1010 1133–1134 big data storage, 3 queries in, 1004, 1010–1012 mandatory access control (MAC), characteristics of, 10–14 DATE data type, 183 1121, 1134–1137 cloud storage, 3 DBA 
(database administrators) mandatory security mechanisms, 1123 constructing, 6, 9 interfaces for, 42 Oracle, 1155–1158 data abstraction, 12–13 role of, 15 precision compared to security, 1128 data relationship complexity and, 21 DBMIN method, transaction processing, privacy issues and preservation, database users and, 3–29 1153–1154 deductive systems, 22 757 privilege specification using views, defined, 4 DBMS (database management systems) 1130–1131 development time reduction, 22–23 propagation of privileges, 1131, economies of scale, 23 advantages of approach, 17–23 1133–1134 employment concerning, 15–17 access path options, 52 revoking of privileges, 1131 eXtended Markup Language (XML) backup and recovery subsystem, 20 role-based access control (RBAC), bottom-tier database server as, 344 1121, 1137–1139 and, 25 centralized, 51 row-level access control, 1139–1140 extending capabilities of, 25 centralized architecture of, 46–47 sensitivity of data and, 1126–1127 extracting XML documents from, classification of, 51–53 SQL injection, 1143–1146 client/server architectures, 47–49 statistical database security, 1146–1147 442–443, 447–453 component modules, 42–45 system log modifications and, 1125 file processing, 10–11 conceptual design phase, 9 threats to databases, 1122 flexibility of, 23 concurrency control software for, 13–14 types of security for, 1122 hierarchical and network systems used data complexity and, 21 XML access control, 1140–1141 data models and, 51–53 as, 23–24 defined, 6 Database security and authorization history of applications, 23–26 disadvantages of, 27 subsystem, DBMS, 1123 information retrieval (IR) systems distributed, 51 federated, 52 Database server, 44 compared to, 1025–1026 general purpose, 52 Database storage integrity constraints, 21–22 heterogeneous, 52 interchanging Web data, 25 homogeneous, 52 organization of, 545–546 maintenance, 6 integrity constraints, 21–22 reorganization, 45 manipulating, 6, 9 interfaces, 40–42 Database system 
meta-data, 6, 10 language, 38–40 architectures, 46–51 multiple user interfaces, 20–21 logical design phase, 9 catalog, 10–11, 35, 42–45 multiple views of, 13 multiple user interfaces, 20–21 communication software, 46 multiuser transaction processing, 13–14 multiuser systems, 51 current state, 35 NOSQL system, 3, 26 number of sites for, 51–52 data models, 32–34 object-oriented (OODB), 24–25 object-oriented systems and, 19 online transaction processing (OLTP), 14

1222 Index DBMS (continued) Decision-support systems, see DSS Dependency operators and maintenance personnel, (decision-support systems) diagrammatic notation for, 474 17 equivalence of sets of, 508 persistent storage, 19–20 Decision trees, data mining, 1085–1086 functional, 471–474, 505–512, physical design phase, 9 Declaration, XML documents, 433 527–528, 532 query processing, 20 Declarative assertions, 225–227 inclusion, 531–532 redundancy control, 18–19 Declarative expressions, 268 inference rules for, 505–509, 527–528 requirements specification and Declarative languages, 40, 999 join (JD), 494–495, 530–531 analysis phase, 9 Decomposition minimal sets of, 510–512 single-user systems, 51 multivalued (MVD), 491–494, 527–530 special purpose, 52 algorithms, 519–523 preservation property, 476 SQL and, 177–178 Boyce-Codd normal form (BCNF), stored procedures and, 336–337 Dependency preservation system designers and implementers, 17 489–491, 522–523 algorithms, 519–522 tool developers, 17 dependency preservation, 514–515, nonadditive (lossless) join two-tier client-server architecture, 49 decomposition and, 519–522 unauthorized access restriction, 19 519–522 property of decomposition, 514–515 XML document storage, 442 DDMS (distributed database 3NF schema using, 519–522 DBMS-specific buffer replacement management service), 863–865 Dereferencing (–>), SQL, 386 policies, 756–757 fourth normal form (4NF), 527–530 Derived attribute, 66 nonadditive (lossless) join property, Descendant nodes, tree structures, 617 DDBMSs (distributed database Description record, SQL/CLI, 327–328 management systems) 476, 515–518, 519–523, 530 Descriptors, SQL schemas, 179 nonadditive join test for binary Design, see Database design degree of local autonomy, 865–866 Design autonomy, DDBs, 845 degree of homogeneity, 865–866 decomposition (NJB), 490 Design transparency, DDBs, 844 technology and, 841 normalization and, 489–491 Destructor, object operation, 371 update decomposition and, 863–865 
properties of, 504, 513–518 Dictionary, ontology as, 134 DDBs (distributed databases) queries, 863–865 Dictionary constructor, 369 advantages of, 846 relations not in BCNF, 489–491 Digital certificates, 1153 architectures, 868–875 third normal form (3NF), 519–522 Digital libraries, 1047–1048 autonomy, 845–846 update, 863–865 Digital signatures, 1152–1153 availability, 844–845 Deductive database systems, 22 Digital terrain analysis, 988–989 catalog management, 875 Deductive databases Directed acyclic graph (DAG), 655 concurrency control and recovery in, clausal form, 1003–1005 Directed graph, XML data Datalog language for, 1001, 1002–1003 854–857 Datalog program safety, 1007–1010 representation, 427–428 conditions for, 842–843 Datalog rule, 1004 Dirty bit, buffer (cache) management, data allocation, 849–853 declarative language of, 999 data fragmentation, 847–853 enhanced data models, 962, 999–1012 558, 816 data replication, 849–853 Horn clauses, 1004 Dirty page tables, database recovery, network topologies, 843 nonrecursive query evaluation, partition tolerance, 845 828–831 query processing and optimization, 1010–1012 Dirty read problem, transaction overview of, 999–1000 859–865 Prolog language for, 1000–1001 processing, 750 reliability, 844–845 Prolog/Datalog notation, 1000–1003 DISCONNECT command, SQL, 316 scalability, 845 relational operators for, 1010 Discretionary access control, 1121, sharding, 847–848 rules, 1000, 1005–1007 technology and, 841 Deep Web, 1052 1129–1134 transaction management in, 857–859 Default values, SQL attributes, 184–186 Discretionary privileges, types of, transparency, 843–844 Deferred updates, database recovery, 814, DDL (data definition language) 1129–1130 compiler for schema definitions, 821–823 Discretionary security mechanisms, 1123 Degree of homogeneity, 865–866 Discriminating attributes, 299–300 42–43 Degree of local autonomy, 865–866 Discriminator key, UML class diagrams, 88 DBMS languages and, 39 Degree of relation Disjointness
constraint (d notation), Deadlock cautious waiting algorithm, 791 schema attributes, 152 114–115 detection, 791–792 SELECT operations, 243 Disjunctive selection, search methods no waiting algorithm, 791 PROJECT operation, 244 occurrence in transactions, 789–790 DELETE command, SQL, 200 for, 666–667 prevention protocols, 790–791 Delete operation, relational data models, Disk blocks (pages) timeouts for, 792 transaction timestamps and, 790–791 166, 167–168 allocating files on, 564 Debit–credit transactions, 773 Deletion, B-Trees, 629–630 block size, 549–550 Deletion anomalies, RDB design and, 467 buffering, 556–560, 815–816 Deletion marker, files, 568 database recovery, 815–816 Deletion operation, concurrency control hardware addresses of, 550–551 interblock gaps in, 550 and, 806 reading/writing data from, 551 Denormalization, 476

Index 1223 Disk drive, 550, 551–552 tree-structured data models for, B-trees and, 601–602, 617–622 Disk pack, 547 431–433, 449–453 concept of, 616 Disk storage devices search trees and, 618–619 type of element, 434 search, insert and deletion with, capacity of, 547 valid, 434 double-sided, 547 well-formed, 433–424 625–629 efficient data access from, 552–553 XML, 431–436, 442–443, 447–453 Dynamic programming, query fixed-head, 551 Domain-key normal form (DKNF), formatting, 549–550 optimization and, 716, 725–726 external hashing, 575–577 532–533 Dynamic random-access memory hardware disk drive (HDD), 547 Domain relational calculus hardware of, 547- (DRAM), 543 interfacing drives with computer formulas (conditions), 277–278 Dynamic spatial operators, 990–991 join condition, 278 Dynamic SQL systems, 551–552 nonprocedural language of, 268 moveable head, 551 quantifiers for, 279 command preparation and execution, parameters, 1167–1169 selection condition, 278 320–321 RAID, parallelizing access using, 542, variables, 277 Domain separation (DS) method, defined, 310 584–588 queries specified at runtime, 320–321 single-sided, 547 transaction processing, 756–757 DynamoDB model, 896–867 DISTINCT option, SQL, 188, 194 Domains e-commerce environment, access control Distributed computing systems, 841 Distributed database management atomic values of, 151 and, 1141 attribute roles, 152 e-mail servers, client/server architecture, systems, see DDBMs (distributed attribute value sets, 69–70 database management systems) cardinality of, 153 47 Distributed databases, see DDBs Cartesian product of, 153 Early (static) binding, 344 (distributed databases) constraints, 158 EER (Enhanced Entity-Relationship) Distributed DBMS, 51 data type specification, 151, 184 Distributed query processing ER model entity types, 69–70 model mapping, 859 format of, 151 abstraction concepts, 129–133 localization, 859 mathematical relation, 153 categories, 108, 120–122, 126 data transfer costs, 860–862 relation 
schema and, 152 class relationships, 108–110 semi-join operator, 862–863 relational data models, 151–152, 158 conceptual schema refinement, DIVISION operation, 255–257 SQL, 184 Document-based data models, 51, 53 tuples for, 151–152 119–120 Document-based NOSQL, 888, 890–895 Dot notation constraints, 113–116 Document body specification, HTML, object operation application, 372, 392 database schema, 122–124 429 path expressions, SQL, 386 design choices, 124–126 Document-centric documents, XML, 431 UDT components, 383 generalization, 108, 112–120, Document header specifications, HTML, Double buffering technique, 556–557 428 Double-sided disks, 547 124–128 Document type definition (DTD), XML, Downgrading locks, 786 hierarchies, 116–119 434–436 Driver manager, JDBC, 331 inheritance, 110, 117–119 Documents Drivers, JDBC, 331–332 knowledge representation (KR), data-centric, 431 DROP command, SQL, 233 DBMS storage of, 442 DROP TABLE command, SQL, 200 128–129 declaration, XML, 433 DROP VIEW command, SQL, 229 lattices, 116–119 document-centric, 431 DSS (decision-support systems), 1102 mapping to ODB schema, 407–408 extracting from databases, 442–443, Duplicates ontology, 129, 132–134 447–453 indexes for management of, 641 semantic data models, 107–108, graph-based data for, 447–452 parallel algorithm projection and, 685 hierarchical views of, 447–452 PROJECT operation elimination of, 129–134 hybrid, 431 specialization, 108, 110–120, hypertext, 425 245 parentheses for element specifications, unary operation elimination of, 124–128 434 subclasses, 108–110, 117–119, 126 relational data models for, 447–449 244–245 superclasses, 109, 110, 117–118, 126 schemaless, 432–433 Durability (permanency) property, UML class diagrams, 127–128 schemas, 448–452 union type modeling, 108, 120–122 self-describing, 425 transactions, 758 EER-to-Relational mapping storage of, 442–443 Dynamic arrays, PHP, 345–346 attributes of relations, 298–300 tags for XML unstructured data, Dynamic file expansion, 
hashing for, categories, 302–303 428–430 generalization options, 298–301 577–582 model constructs to relations, Dynamic files, 566 Dynamic hashing, 580 298–303 Dynamic multilevel indexes multiple inheritance and, 301 multiple-relation options, 299–300 B-trees and, 601–602, 622–630 shared subclasses, 301 single-relation options, 299–300 specialization options, 298–301 union types, 302–303 Element operator, OQL, 413

1224 Index Elements Enhanced Entity-Relationship model, see Unified Modeling Language (UML) complex, XML structure specification, EER (Enhanced Entity-Relationship) and, 60, 85–88 441 model empty elements, 440 Error checking, PHP, 355 parentheses for specifications of, 434 Enterprise flash drives (EFDs), 553 Errors, DDBs, 844 root elements, 440 Entities ER-to-Relational mapping tree-structured data models, 430–431 type of in documents, 434, 440–441 attributes, 63–70 algorithm, 290–296 XML, 430–431, 434, 440–441 conceptual data modeling, 33 binary relationship types, 293–295 conceptual design and, 70–72 entity types, 291–293 Embedded SQL defined, 63 ER model constructs, 296–298 communication variables in, 315, 316 ER mapping of, 291–293 multivalued attributes, 295–296 connecting to a database, 315–316 ER models and, 63–72, 75, 79 n-ary relationship types, 296 cursors for, 317–320 generalized, 126 relational database design, 290–298 database programming approach, 311, identifying (owner) type, 79 weak entity types, 292–293 338–339 key (uniqueness constraint) attributes, Escape operator (/) in SQL, 196 defined, 310, 311 ETL (extract, transform, load) process, 1103 host language for, 314 68–69, 79 Evaluation for query execution, 701–702 Java commands using SQLJ, 321–325 NULL values, 66 Event-condition-action (ECA) model precompiler or preprocessor for, 311, 314 overlapping, 115 active rules (triggers), 963–964 program variables in, 314–315 participation in relationships, 72–73 SQL trigger components, 227 query results and, 317–320 recursive (self-referencing) Event information versus duration shared variables in, 314 tuple retrieval, 311, 314–317 relationships and, 75 information, 976 role names, 75 Events, SQL trigger component, 227 Empty elements, XML, 440 sets (collection), 67–68 Eventual consistency, NOSQL, 885–886 Encapsulation strong, 79 EXCEPT operation, SQL sets, 194–195 subclass as, 110, 114–115 Exceptions constructor function for, 384 superclass as, 110 
mutator function for, 384 types, 67–68, 79, 110 error handling, 322–323, 393–394 ODBs, 366, 370–374, 384–385 value sets (domains) of attributes, 69–70 ODMG models, 393–394, 397–398 object behavior and, 366, 371 weak, 79, 292–293 operation signature and, 397–398 observer function for, 384 Entity integrity, relational data modeling, SQLJ, 322–323 operations, 366, 370–374, 384–385 Execution autonomy, DDBs, 845 object naming and reachability, 373–374 163–165 Execution for query optimization, 701–712 SQL, 379–380, 384–385 Entity-Relationship model, see ER Execution transparency, DDBs, 844 user-defined type (UDT) for, 384–385 Existence bitmap, 636 Encryption (Entity-Relationship) model Existence dependency, 77–78 Advanced Encryption Standards Entrypoints, object names as, 373, 387 Existential quantifiers, 271, 274 Environment record, SQL/CLI, 327–328 EXISTS function, SQL query retrieval, (AES), 1150 Environments asymmetric key encryption 212–214 application programs, 6–7, 46 Exists quantifier, OQL, 415 algorithms, 1151 communication software, 46 Expert (knowledge-based) systems, Data Encryption Standards (DES), 1150 database system, 6–7, 42–46 database security, 1149–1153 modules, 31, 42–45 962–963 defined, 1149–1150 tools, 45–46 Explicit set of values, SQL, 214 digital certificates, 1153 EQUIJOIN (=) comparison operator, 253 Expressions digital signatures, 1152–1153 Equivalence of sets of functional public key encryption, 1151–1152 Boolean, 241–242 RSA public key encryption algorithm, dependency, 508 declarative, 268 Equi-width/equi-height histograms, 713 formulas and, 270–271 1152 ER (Entity-Relationship) diagrams in-line, 245 symmetric key algorithms, 1150–1151 relational algebra, 239 End/start tag (</…>), HTML, 428 conceptual design choices, 82–84 safe, 276–277 End users, 15–16 database application use of, 63–64 tuple relational calculus, 270–271, Enhanced data models database schema as, 81 active databases, 963–974 entity type distinction, 79 276–277 active rules, 
962, 963–964, 969–973 notations for, 81, 83–88, 1163–1165 EXtended Markup Language, see XML deductive databases, 962, 999–1012 schema construct names, 82 functionality and, 961 ER (Entity-Relationship) model (EXtended Markup Language) logic databases, 962 applications of, 59, 62–64, 70–72, Extendible hashing, 578–580 multimedia databases, 962, 994–999 EXTENDS inheritance, 393 spatial databases, 962, 987–994 92–94 Extensible Stylesheet Language (XSL), 447 temporal databases, 962, 974–987 attributes, 63–70 Extensible Stylesheet Language for temporal querying constructs, constraints on, 73–74, 76–78, 91–92 data model type, 33 Transformations (XSLT), 447 984–986 data modeling using, 59–105 Extensions, SQL, 178 time series data, 986–987 database design using, 60–62, 80 Extent inheritance, 377, 385 entities, 63–72, 79 Extents relationships, 72–78, 88–92 schema and, 61–62, 81–85 class declaration of, 398 constraints on, 376–377

Index 1225 defined, 376 dynamic files, 566 nonadditive join decomposition into, object persistence and, 373 fully inverted file, 641 530 ODMG models, 373, 376–377, 398 grid files, 632–633 persistent collection for, 373, 376 hashing techniques, 572–582 normalizing relations, 493–494 transient collection for, 376 headers, 564 FP-growth algorithm, 1077–1080 type hierarchy and, 376–377 heaps, 567–568 Fragmentation transparency, DDBs, External hashing, 575–577 indexed-sequential, 571 External (schema) level (views), 37 indexes, 20 843–844 External sorting, files, 568 indexing structures for, 601–652 Free-form search request, 1023 External sorting algorithms, 660–663 inverted files, 641 Frequent-pattern (FP) tree, 1077–1080 Extraneous attribute, 510 linear search for, 564, 567–568 FROM clause, SQL, 188–189, 197, 232 F-score, IR, 1046–1047 main (master) files, 571 Full functional dependency, 2NF, 481–482 Faceted search, IR, 1058–1059 mixed records, 582–583 Fully inverted file, 641 Fact constellation, 1109 operations on, 564–567 Function-based indexing, 637–638 Fact tables, 1108 ordered (sorted) records, 568–572 Function call injection, SQL, 1144–1145 Factory objects, ODMG models, 398–400 overflow (transaction), 571 Functional data models, 75 Facts, relation schema and, 156 records, 560–564, 567–572, 582–583 Functional dependency (FD) Fan-out, multilevel indexes, 613, 622 static files, 566 Fault, DDBs, 844–845 storage of, 10–11, 560–572, 582–583 Armstrong’s axioms, 506–509 Fault tolerance, Big data technology and, unordered records (heaps), 567–568 closure, 505–506, 508 Filtering input, SQL injection and, 1146 defined, 472, 505 942, 946 First normal form (1NF) equivalence of sets of, 508 Federated database (FDBS) schema atomic (indivisible) values of, 477–478 extraneous attribute, 510 multivalued attributes, 481 full functional dependency, 2NF, 481–482 architecture, 871–872 nested relations, 479–480 inference rules for, 505–509, 527–528 Federated database system (FDBS), 
techniques for relations, 478–479 left- and right-hand attributes of, 472 unnest relation, 479–480 legal relation states (extensions), 472 866–868 Fixed-head disks, 551 minimal sets of, 510–512 Federated DBMS, 52 Fixed-length records, 561–563 normal forms, 481–483 FETCH command, SQL, 317, 319–320 Flag fields, EER-to-relational mapping notation for diagrams, 474 FETCH INTO command, 325 RDB design and, 471–474, 505–512 Fibre Channel over Ethernet (FCoE), with, 300 semantics of attributes and, 472–473 Flash memory, 543–544 transitive dependency, 3NF, 483 590–591 Flat files, 150 universal schema relation for, 471–474 Fibre Channel over IP (FCIP), 590 Flat relational model, 155 Functional requirements, 61 Fields Flow analysis operations, 988 Functions Flow control, 1147–1149 aggregate, 216–219, 260–261 connecting, 582–583 FLWOR expression, XQuery, 445 built-in, 384 data type of, 560 FOR clause, XQuery, 445–446 hashing (randomizing), 572, 580 Fields, records, 560, 561–563, 568–569 FOR UPDATE OF clause, SQL, 318 inheritance specifications and, 385 fixed-length records, 561 Force/no-force rules, 817–818 overloading, 385 key, 568 Foreign keys PHP programming, 350–352 mixed records, 582–583 query retrieval and, 216–219 optional, 562 relational data modeling, 163–165 relational algebra for, 260–261 ordered records, 568–569 SQL constraints, 186–187 SQL, 216–219, 384–385 ordering, 568 XML specification, 441 type (class) hierarchies and, 374–375 record type, 583 Formal languages, see Relational algebra; UDT, 384–385 records, 560, 561–563, 568–569 XML data creation using, 453–455 repeating, 562–563 Relational calculus Fuzzy checkpointing, 819, 828 variable-length records, 561 Format, relational model domains, 151 Garbage collection, 827 Fifth normal form (5NF) Formatting styles, HTML, 428 Generalization definition of, 494 Forms-based interfaces, 41 conceptual schema refinement, 119–120 functional dependency in, 532 Forms specification language, 41 constraints on, 113–116 join 
dependency (JD) in, 494–495, Formulas (conditions) defined, 113 design choices for, 124–128 530–531 atoms in, 270–271, 277–278 EER diagram notation for, 112 inclusion dependency in, 531–532 Boolean conditions, 270–271 EER modeling concept, 108, 112–120, File load factor, hashing, 582 domain relational calculus, 277–278 File processing, 10–11 tuple relational calculus, 270–271 124–128 File servers, client/server architecture, 47 Fourth normal form (4NF) entity type, 126 Files decomposition of relations, 529 hierarchies, 119 allocating blocks on a disk, 564 definition of, 493, 528 lattices, 116–119 B-trees for organization of, 583 functional dependency and, 527–528 semantic modeling process, 131 binary search for, 570 inference rules for, 527–528 superclass from subclasses, 112–113 clustered files, 572, 583, 602–603 multivalued dependency (MVD) and, total, 115 data storage using, 541–542 UML notation for, 127–128 database catalog for, 10–11 491–494, 527–528 defined, 7

1226 Index Generalized projection operation, 259–260 Hash key, 572 Horizontal fragmentation (sharding), Genetic algorithms (GAs), 1093 Hash partitioning, 684 DDB data, 843–844, 847–848 Geographic information systems (GISs), Hash tables, 572–573 Hashing techniques Horizontal partitioning, 684 4, 987 Horn clauses, 1004 Global depth, hashing, 578 dynamic file expansion, 577–582 Host language, embedded SQL, 314 Global query optimization, 860 dynamic hashing, 580 Hot set method, transaction processing, Global query optimizer, Oracle, 734–735 extendible hashing, 578–580 Glossary, ontology as, 134 external hashing, 575–577 757 GRANT command, 1131 file storage, 572–582 Hoya (Hortonworks HBase on YARN), GRANT OPTION, 1131 folding, 574 Granting and revoking privileges, internal hashing, 572–575 943–944 linear hashing, 580–582 HTML (HyperText Markup Language) 1129–1134 multiple keys and, 632 Graph-based data, XML document partitioned hashing, 632 client tier of, 344 static hashing, 577 tag notation and use, 428–430 extraction using, 447–452 Having clause, OQL, 416 Web data and, 25 Graph-based data models, 51, 53 HAVING clause, SQL, 219–221 HTML tag (<…>), 428 Graph-based NOSQL, 888, 903–909 Hbase data model Hybrid documents, XML, 431 Graphical User Interfaces, see GUI column based systems, 900–903 Hybrid-hash join, 675–676 CRUD operations, 903 Hyperlinks, 25, 1027 (Graphical User Interface) distributed system concepts for, 903 Hypertext documents, 425 Grid files, 632–633 NOSQL, 900–903 HyperText Markup Language, see HTML GROUP BY clause versioning, 900–902 Headers, file descriptors, 564 (HyperText Markup Language) SQL, 219–220 Heaps (unordered file records), 567–568 Idempotent operations, 815 view merging, subqueries, 705–706 Here documents, PHP, 347–348 Identification, semantic modeling Grouping Heterogeneous DBMS, 52 aggregate functions and, 216–218, Heuristic rules for query optimization, process, 130–131 Identifying (owner) entity type and 260–261 657, 692, 697–701 
attributes, 219, 260–261 Hidden attributes, objects, 371, 375 relationship, 79 GROUP BY clause for, 219–220 Hierarchical data models, 33, 53. See also Image data, 989 HAVING clause for, 219–221 Images NULL values in grouping attributes, 219 Tree-structured data models operator, 415–416 Hierarchical systems using databases, automatic analysis, 996–997 OQL, 415–416 color, 997 partitions, 219, 415–416 23–24 defined, 995 QBE (Query-by-Example) language, Hierarchical views, XML document multimedia databases for, 995–999 object recognition, 997–998 1175–1177 extraction using, 447–453 semantic tagging of, 998–999 relations partitioning into tuples, 219 Hierarchies shape, 997 separate groups for tuples, 219 texture, 997 SQL query retrieval and, 216–222 association rules for data mining, Immediate updates WHERE clause for, 221–222 1081–1082 database recovery, 815, 823–826 GUI (Graphical User Interface) SQL views, 230 data mining, 1095 EER models, 116–119 Immutable property of OID, 367 DBMS provision of, 20–21 generalization, 119 Impedance mismatch, 312–313 use of, 41 inheritance and, 118 Implementation Hadoop memory, 543–545 active databases, 967–972 advantages of technology, 936 object data models (acyclic graphs), 52 aggregate operations, 678–679 Big data technology for, 916–917, specialization, 116–119 database operations, 12 tree structure, 116, 452–453 JOIN operations for, 668–681 921–926 type (class), 366, 374–377, 385 operation encapsulation and, 371 distributed file system (HDFS), 921–926 High-level (conceptual) data models, 33, pipelining using iterators, 682–683 ecosystem, 926 query processing, 668–676, 679–681 historical background of, 916–917 60–62 temporal databases, 982 parallel RDBMS compared to, 944–946 High-level (nonprocedural) DML, 39–40 Implementation (physical storage) level, releases, 921 High-level language support, Big data YARN (Hadoop v2), 936–944, 949–953 RDB design, 459–460 Handles, SQL/CLI records, 328 technology and, 946 IN comparison
operator, SQL, 209–210 Handle variables, SQL/CLI declaration High-performance data access, NOSQL, In-line expression, 245 In-line views, SQL, 232 of, 328 886–887 In-place updating, 816 Hardware Hints, Oracle, 736 Inclusion dependency, 5NF, 531–532 Histograms Incorporating time, temporal databases, addresses, 550–551 disk storage devices, 547–552 cost estimation from, 713 977–984 Hash field, 572 equi-width/equi-height, 713 Incorrect summary problem, transaction Hash file, 572 selection conditions and, 668 Hash (randomizing) functions, 572, 580 HITS ranking algorithm, 1051 processing, 750 Hash indexes, 633–634 HOLAP (hybrid OLAP) option, 1114 Incremental updates, SQL views, 230 Homogeneous DBMS, 52 Incremental view maintenance, 707–710

Index 1227 Index-based nested-loop join, 559, Information privacy, security Inner join, SQL table (relations), 718–719 relationship to, 1128–1129 215–216 Indexed allocation, file blocks, 564 Information repository, DBMS, 46 Inner/outer joins, 254, 263–264 Indexed (ordered) collection expressions, Information retrieval (IR) Innermost nested query, 211 INSERT command, SQL, 198–200 OQL, 415 Boolean model, 1030 Insert operation Indexed-sequential file, 571, 616 data, 1024 Indexes databases compared to IR systems, constraint violations and, 166–167 relational data models, 166–167 bitmap indexes, 634–637 1025–1026 Insertion, B-trees, 626–629 clustering, 602, 606–608 defined, 1022–1023 Insertion anomalies, RDB design and, constraint management using, 641 desktop search engines for, 1025 creation of, 639–640 enterprise search systems for, 1024 465–466 data modeling access path, 34 F-score for, 1046–1047 Instance variables, 365–366 DBMS auxiliary files, 20 free-form search request, 1023 Instances (occurrences), 35, 72 duplicate management using, 641 history of, 1026–1027 Instantiable class behavior, interface fully inverted file, 641 information need, 1024 hash indexes, 633–634 inverted indexing, 1040–1044 and, 392 locks for concurrency control, levels of scale, 1024 Instantiation, semantic modeling modes of interaction in IR systems, 805–806 process, 130 logical versus physical, 638–639 1027–1028 Integrity constraints multilevel, 613–617 pipeline for processing, 1028–1029 multiple keys for, 613–633 probabilistic model, 1033–1034 database applications and, 21–22 ordered index on multiple attributes, queries in IR systems, 1035–1037 entity integrity, 163–165 recall and precision, 1044–1046 foreign keys and, 163–164 631–632 search relevance, 1044–1047 referential integrity, 21, 163–165 physical database file structures as, 641 semantic approach, 1028 relational modeling and, 160–165 primary, 602, 603–606 semantic model, 1034–1035 relational database schemas and, rebuilding, 
640 statistical approach, 1028 secondary, 603, 609–612 text preprocessing, 1037–1040 160–163 single-level ordered, 602–613 trends in, 1057–1063 semantic, 165 spatial data, 991–993 unstructured information, 1022 valid and not valid states and, 160–161 SQL creation of, 201–202 users, 1023–1024 Intellectual property rights, 1154–1155 tuning, 640–641 vector space model, 1031–1033 Intention, 35 Indexing fields, 601, 602 Information updating, 23 Interactive query interface, 43–44 Indexing structures Inherent model-based (implicit) Interactive transactions, concurrency column-based storage of relations, 642 hints in queries, 641–642 constraints, 157 control and, 807 physical database design and, 601–652 Inherent rules, 22 Interblock gaps, disk devices, 550 indexed sequential access method Inheritance Interface inheritance, 377, 393 Interfaces. See also GUI (Graphical User (ISAM), 601 behavior inheritance, 393 B-trees, 601–602, 622–630, 636–637 class–schema interface, ODL, 404–405 Interfaces) B-trees, 601–602, 617–622, 629–630 colon (:) notation for, 393 built-in, ODMG models, 393–396 single-level ordered indexes, 602–613 EER-to-relational mapping, 301 class–schema inheritance, ODL, multilevel indexes, 613–617 EXTENDS, 393 multiple keys for, 631–633 extent inheritance, 377, 385 404–405 hash indexes, 633–634 function overloading and, 385 database operations, 12 bitmap indexes, 634–637 generalization lattice or hierarchy, 119 DBMS, 20–21, 40–42 function-based indexing, 637–638 interface inheritance, 377, 393 disk drives with computer systems, issues concerning, 638–642 multiple, 118, 301, 377–378, 393 RDB design and, 643–646 ODBs, 366, 374–377, 377–378, 385, 393 551–552 strings, 640 ODMG object model and, 393, 404–405 instantiable class behavior and, 392 Industrial internet of things (IIOT or selective, 377 multiple user, 20–21 simplified model for, 347–377 noninstantiable object behavior and, IOT), 914 single, 118–119 Inference engine, deductive databases, specialization 
lattice or hierarchy, 392 object model definitions, 389–392 999, 1004–1005 117–118 ODMG models and, 389–396, Inference rules SQL, 380 subclass/superclass relationships, 110, 404–405 Armstrong’s axioms, 506–509 operation encapsulation and, 371 closure, 505–506, 508 117–119 operation specifications, 366 4NF schema using, 527–528 table inheritance, 385 Interleaved concurrency, 747 functional dependencies, 505–509, type inheritance, 385 Interleaved processes, 747 Initial hash function, 580 Internal hashing, 572–575 527–528 Initial state, populating (loading) Internal (schema) level, 36 proof by contradiction, 507 Internal nodes, tree structures, 622 multivalued dependencies, 527–528 databases and, 35 Internet SCSI (iSCSI), 590 Information extraction (IE), 1040 Interpolating variables within text strings, 347 Interpreted queries, 710 Interquery parallelism, 687

INTERSECT operation, SQL sets, distributed query processing, 862–863 indexes with, 631–633 194–195 dynamic programming approach to multiple keys, 631–633 normal forms and, 476–477 INTERSECTION operation, 247–249 ordering, 725–726 ODMG object model, 398 INTERVAL data type, 184 EQUIJOIN (=) comparison operator, primary key, 159, 186–187, 441, 477 INTO clause, 317 SQL, 186–187 Intraquery parallelism, 687 253 superkey, 158–159, 476–477 Inverse references, 366, 370, 396–397 hybrid-hash join, 675–676 unique keys, 160 Inverse relationships, ODMG objects, index-based nested-loop join, 559, XML schema specification, 441 Keyword-based data search, 41 396–397 718–719 Keyword queries, 1035 Inverted files, 641 inner/outer, 254, 263–264 Knowledge discovery in databases Inverted indexing join selectivity (js) operator, 717–718 MapReduce (MR), 930–932 (KDD), 1070–1073 construction of, 1041–1042 map-side hash join, 930 Knowledge representation (KR) defined, 1041 multiway joins, 668 information retrieval (IR), 1040–1044 N-way joins, 931–932 abstraction concepts, 129–133 Lucene indexing/search engine for, NATURAL JOIN (*__) comparison domain of knowledge for, 129 EER modeling and, 128–129 1043–1044 operator, 253, 262–263 ontology and, 129 process of, 1042 nested-loop join, 558–559, 672–673, 718 reasoning mechanisms, 129 IS-A relationship, 109, 126 non-equi-join, 681 Label-based security policy IS/IS NOT comparison operators, 209 optimization based on cost formulas, architecture, 1156–1157 Isolation.
See also Snapshot isolation multilevel security, 1139–1140 levels of in transactions, 758 720–721 Oracle, 1155–1158 property, transactions, 14, 158 ordering choices in multirelational Virtual private database (VPD) Iterator object, ODMG models, 393 Iterator variables queries, 721–724 technology, 1156 query results and, 312 OUTER JOIN operations, 262–264, Language design for database OQL, 409–410 Iterators 679–681 programming, 312, 339 defined, 682 parallel algorithms, 685–686 Latches, concurrency control and, 807 pipelining implementation using, partition-hash join, 559, 674–675, 719, Late (dynamic) binding, 377 Lattices 682–683 930–931 SQLJ query result processing with, performance of, 673–674 EER models, 116–119 physical optimization, 724 generalization, 119 323–325 query processing implementation, inheritance and, 117–118 Java specialization, 116–119 668–676, 679–681 Lazy updates, SQL views, 230 embedding SQL commands (SQLJ), recursive closure operations, 262 Leaf class, 127 321–325 relational algebra and, 251–255, Leaf nodes, tree structures, 257, 617, 623 Least recently used (LRU) strategy, exceptions for error handling, 322–323 262–264 Web programming technologies, 358 semi-join (SJ) operator, 658–660, 681, buffering, 559 Java server pages (JSP), 358 Legacy data models, 33, 51, 53 Java servlets, 358 719–720, 862–863 Legal relation states (extensions), 472 JavaScript, 358 SQL query retrieval, 215–216 Level trigger, 967 JavaScript Object Notation (JSON), 358 SQL relations, 215–216 Library of functions or classes JDBC (Java Database Connectivity) sort-merge join, 559, 719, 930 class library imported from, 331, 332 two-way join, 668 application programming interface drivers, 331–332 k-means algorithm, 1088–1089 (API), 312, 326 programming steps, 332–335 Key constraints SQL class library, 326, 331–335 attributes, 68–69, 302 database programming approach, 311, two-tier client/server architecture and, database integrity and, 21 338–339 integrity constraints and, 
163–165 49 referential integrity constraints and, JDBC: SQL class library, 326, 331–335 Join attribute, 253 SQL/CLI (SQL call level interface), Join condition, 189, 191, 252, 278 163–165 Join dependency (JD), 5NF, 494–495 relational modeling and, 158–160, 326–331 JOIN operations Lifetime of an object, 388 163–165 LIKE comparison operator, SQL, aggregate operation implementation relational schema and, 157–165 and, 678–679 surrogate, 302 195–196 uniqueness property, 68–69, 159 Linear hashing, 580–582 anti-join (AJ) operator, 658–660, Key field, records, 568 Linear regression, data mining, 1092 677–678, 681, 719–720 Key-value storage (data models), 34, 51, 53 Linear scale-up, 684 Key-value stores, NOSQL, 888, 895–900 Linear search, files, 564, 567–568 attributes, 668 Keys Linear speed-up, 684 bucket join, 931 attributes, 477 Link structure analysis, Web search and, buffer space and, 672–673 candidate key, 159–160, 477 cardinality, 719–720 composite keys, 631 1050–1051 cost functions for, 717–726 defined, 476 Linked allocation, file blocks, 564 foreign keys, 163–165, 186–187

Index 1229 Links, UML class diagrams, 87 storage devices, 555–556 defined, 6 List constructor, 369 tape reel, 555 schema storage, 35 Literal declaration, 392 Main (master) file, 571 Methods Literals Main memory, 543 database operations, 12 Maintenance, databases, 6 object data models, 53 atomic (single-valued) types, 368, 388 Maintenance personnel, 17 operation implementation and, 366, collection, 392 Mandatory access control (MAC), 1121, constructors for, 368–370 371 deductive databases, 1002–1003 1134–1137 Middle-tier Web server, PHP as, 344 objects compared to, 368 Mandatory security mechanisms, 1123 Middleware layer, n-tier architecture, ODBs, 368–370, 388, 392 Map data, 989 ODMG models, 388, 392 Mappings 50–51 structured, 388 MIN function, SQL, 217 type generators, 368–369 data model, 62 Minimal sets of functional dependency, type structures for, 368–370 database schema views, 37 Loading utility, 45 distributed query processing, 859 510–512 Local area network, 842 EER model constructs to relations, MINIMUM function, grouping, 260 Local depth, hashing, 578 Miniworld, 5 Local query optimization, 860 298–303 MINUS operation, 247–249 Localization, DDB query processing, EER schema to ODB schema, 407–408 Mirroring, (shadowing), RAID, 585 ER-to-relational, 290–298 Mixed (hybrid) fragmentation, DDB 859 ODB conceptual design, 407–408 Location analysis, 988 tuples for relations, 154 data, 847–848 Location transparency, DDBs, 843 MapReduce (MR) Mixed records, files for, 582–583 Locking data items, 781 advantages of technology, 936 Mobile applications, access control of, Locks Big data technology for, 917–921, 1141–1142 binary locks, 782–784 926–936 Mobile device apps certify locks, 796–797 historical background of, 917–918 concurrency control and, 782–786, joins in, 930–932 ER modeling and, 59 parallel RDBMS compared to, 944–946 interfacing, 40–41 796–797, 805–806 programming model, 918–921 user transactions by, 16 conversion of, 786 runtime, 927–930 Model-theoretic 
interpretation of rules, downgrading, 786 Map-side hash join, MapReduce (MR), index concurrency control using, 1005 930 Models, see Data models; EER (Enhanced 805–806 Mark up, XML documents for HTML, shared/exclusive (read/write) locks, Entity-Relationship) model; ER 428–429 (Entity-Relationship) model; Object 784–786 Market-basket data model, 1073–1075 data models upgrading, 786, 797 Mass storage, 543 Modification anomalies, RDB design Log buffers, 755, 756 Master data management (MDM), 1110 and, 467 Log sequence number (LSN), 828 Master-master replication, NOSQL, 886 Modifier, object operations, 371 Logic databases, 962 Master-slave replication, NOSQL, 886 Modules Logical (conceptual) level, RDB design, Materialized evaluation, 681, 702–702 buffering (caching), 20, 42 Materialized views, query execution, client module, 31 459–460 compilers, 42–45 Logical comparison operators, SQL, 707–710 database queries and, 20, 43–44 Mathematical relation, domains, 152 database systems, 31, 42–45 188–190 MAX function, SQL, 217 DBMS components, 42–45 Logical data independence, 37–38 MAXIMUM function, grouping, 260 interactive query interface, 43–44 Logical database design, see Data model Measurement operations, 988 server module, 31 Mechanical arm, disk devices, 551 stored data manager, 42 mapping Memory MOLAP (multidimensional OLAP) Logical design, 62 function, 1114 Logical index, 638–639 cache, 543 MongoDB data model Logical theory, ontology as, 134 dynamic random-access (DRAM), 543 CRUD operations, 893 Loss of confidentiality, database threat flash memory, 543–544 documents, 890–893 hierarchies, 543–545 NOSQL, 890–895 of, 1122 magnetic tape, 544–545 replication in, 894 Loss of integrity, database threat of, 1122 main, 543 sharding in, 894–895 Lossy design, 515 optical drives, 544 Moveable head disks, 551 Lost update problem, transaction random-access (RAM), 543 Multidatabase system recovery, 831–834 storage capacity and, 543 Multidimensional models, 1108 processing, 750 
storage devices for, 543–545 Multilevel indexes Low-level (physical) data models, 33–34 Menu-based interfaces, 40 dynamic, 616, 617–630 Low-level (procedural) DML, 40 Merging phase, external algorithms, 661 fan-out, 613, 622 Lucene indexing/search engine, 1043–1044 Meta-data levels, 613–616 database catalog and, 10–11 physical database design and, 613–617 Magnetic tape backing up databases using, 555–556 memory hierarchy and, 544–545

1230 Index Multimedia databases N-way joins, MapReduce (MR), 931–932 testing binary decompositions for, 517 audio data source analysis, 999 Named iterator, SQLJ, 323 3NF schema using, 519–522 concepts, 994–996 Namespace, XML, 440 Nonadditive join test for binary enhanced data models, 962, 994–999 Naming mechanisms image automatic analysis, 996–997 decomposition (NJB), 490 object recognition, 997–998 constraints, SQL, 187 Noninstantiable object behavior, semantic tagging of images, 998–999 database entrypoints, 373 types of, 3–4 object persistence and, 373–374 interface and, 392 operations for renaming attributes, Nonprocedural language, 268 Multiple granularity locking Nonrecursive query evaluation, 1010–1012 concurrency control and, 801–804 245–246 Nonserial schedules, 763, 764–765 granularity levels for, 801 query retrieval and, 192, 214–215 Normal form test, 475 granularity of data items, 800–801 renaming attributes, 192, 214–215, Normal forms protocol, 802–804 245–246 Boyce-Codd normal form (BCNF), Multiple hashing, collision resolution, 575 schema constructs, 82 487–491 Multiple inheritance, 118, 301, 377–378, Naming transparency, DDBs, 843 NATURAL JOIN (*__) comparison defined, 475 393 denormalization, 476 Multiple keys operator, 253, 262–263 domain-key (DKNF), 532–533 NATURAL JOIN operation, SQL tables, fifth normal form (5NF), 494–495 grid files and, 632–633 first normal form (1NF), 477–481 indexes on, 613–633 215 fourth normal form (4NF), 491–494 multiple attributes and, 631–632 Natural language interfaces, 41 insufficiency of for relational ordered index on, 631–632 Natural language queries, 1037 partitioned hashing with, 632 Neo4j system decomposition, 513–514 physical database design and, 613–633 join dependency (JD) and, 494–495 Multiple-relation options, EER-to- cypher query language of, 905–908 keys, attributes and definitions for, distributed system concepts for, relational mapping, 299–300 476–477 Multiple user interfaces, 20–21 908–909 multivalued 
dependency (MVD) and, Multiplicities, UML class diagrams, 87 nodes, 904–905 Multiprogramming NOSQL, 903–909 491–494 relationships, 904–905 normalization of relations, 474–476, concept of, 746–747 Nested-loop join, 558–559, 672–673, 718 operating systems, 747 Nested queries 482, 485, 486–487, 493–494 Multirelational queries, JOIN ordering comparison operators for, 210–211 practical use of, 476 correlated, 211–212 primary keys for, 483–495 choices and, 721–724 innermost query of, 211 RDB design and, 474–495, 513–514, Multiset (tuple) operations outer query of, 209 query optimization and, 702–704 528–533 comparisons for query retrieval, 209–211 subqueries, 702–704 second normal form (2NF), 481–482, SQL tables, 193–195 tuple values in, 209–211 Multiuser DBMS systems, 51 unnesting (decorrelation), 704 484–486 Multiuser transaction processing, 13–14 Nested relations, 1NF in, 479–480 third normal form (3NF), 483–484, Multivalued attributes, 66, 295–296, 481 Network-attached storage (NAS), 589–590 Multivalued dependency, see MVD Network data models, 33, 51, 53 486–487 Network systems using databases, 23–24 Normalization process (multivalued dependency) Network topologies, 843 Multiversion concurrency control, 781, Neural networks, data mining, 1092 algorithms, 519–527 No waiting algorithm, deadlock data normalization, 475–476 795–797 dependency preservation property, 476 certify locks for, 796–797 prevention, 791 multivalued dependency (MVD), timestamp ordering (TO), 796 NodeManager, YARN, 942 two-phase locking (2PL), 796–797 Nodes 493–494 Multiway joins nonadditive (lossless) join property, 476 implementing, 668 constant, query graphs, 273 normal form test for, 475 SQL table (relations), 216 leaf, query trees, 257 relations, 474–476 Mutator function, SQL encapsulation, 384 relation, query graphs, 273 NOSQL database system MVD (multivalued dependency) tree structures, 617 availability, 885–886 all-key relation of, 491, 493 Non-equi join implementation, 681 big data storage 
uses, 3, 26 definition of, 491–492 Nonadditive (lossless) join property CAP theorem, 888–890 fourth normal form (4NF) and, algorithms, 519–523 categories of, 887–888 Boyce-Codd normal form (BCNF) column-based, 888, 900–903 491–494, 527–530 CRUD (create, read, update, and inference rules for, 527–528 schemas using, 522–523 normalizing relations, 493–494 dependency preservation and, 519–522 delete) operations, 887, 893, 903 trivial/nontrivial, 493 4NF schema using, 530 data models, 34, 51 n-ary relationship types, mapping of, 296 normalization process, 476 DDB similar characteristics, 885–887 n-degree relationships, 88–92 RDB decomposition, 515–518, 519–522 distributed storage using, 883 n-tier architecture for Web applications, successive decompositions, 517–518 document-based, 888, 890–895 emergence of, 884–885 49–51 eventual consistency, 885–886 graph-based, 888, 903–909 Hbase data model, 900–903

high-performance data access, Object identifier, see OID (object identifier) literals in, 368–370, 388–392 886–887 Object identity Object Data Management Group key-value stores, 888, 895–900 literal values for, 368 (ODMG) model, 386–405, 417–418 MongoDB data model for, 890–895 ODBs, 367–368, 378 object definition language (ODL) and, Neo4j system, 903–909 OID implementation of, 367 query language similar characteristics, SQL, 379 386, 400–405 Object-oriented systems, persistent object identifier (OID), 367–368 887 object query language (OQL), 408–416 replication models for, 886 storage, 19–20 object-oriented (OO) concepts, 365–366 replication, 885–886, 894 Object query language, see OQL (object objects in, 365–371, 387–388, 395–400 scalability, 885 polymorphism (operator overloading), sharding, 886, 894–895 query language) versioning, 887, 899, 900–902 Object recognition, multimedia 366, 377 NOT FINAL, UDT inheritance RDB compared to, 405–406 databases, 997–998 SQL extended from, 379–386 specification, 385 Object-relational systems type (class) hierarchy, 366, 374–377 NOT operator, see AND/OR/NOT ODL (object definition language) extended-relational systems, 53 classes, 400, 404–405 operators SQL, 202 class–schema interface inheritance, NO-UNDO/REDO algorithm, 815, Objects arrow (–>) notation for, 392 401–404 821–823 atomic (single-valued) types, 368, 388, Object Data Management Group NULL values 396–398 (ODMG) model and, 386, 400–405 aggregate functions and, 218 attributes, 396 object databases (ODBs) and, 386–387, attribute not applicable, 208 behavior of based on operations, 371 complex query retrieval and, 208–209 collections, 373, 376 400–405 constraints on attributes, 160, constructors for, 368–370 schemas, 400–403 dot notation for, 372, 392 type constructors in, 369 184–186 encapsulation of, 366, 371 ODMG (Object Data Management Group) discarded values, 218 exceptions, 397–398 atomic (user-defined) objects, 395–398 entity attributes, 66 hidden
attributes, 371 bindings, 386, 417–418 grouping attributes with, 219 instance variables, 365–366 built-in interfaces and classes, 393–396 IS/IS NOT comparison operators for, interfaces, noninstantiable behavior C++ language binding, 386, 417–418 database standard, 33, 364–365 209 and, 392 extents, 373, 376–377, 398 query retrieval in SQL, 208–209, 218, lifetime, 388 factory objects, 398–400 literals compared to, 368 inheritance in object models, 393 219 naming, 373–374, 387 interface definitions for object models, RDB design problems, 523–524 ODBs, 365–371, 387–388, 395–400 referential integrity and, 163–164 ODMG models, 387–388, 392, 395–400 389–392 relational modeling and, 155–156, 160 operations for, 370–372 keys, 398 relation schema for RDB design and, persistent, 365, 373–374, 376 literals in object models, 388, 392 reachability, 373–374 object databases (ODBs), 386–405, 467–468 relationships, 396–397 grouping attributes, 219 signatures, 366, 397 417–418 SQL attribute constraints, 184–186 state of, 387 object definition language (ODL) and, three-valued logic for comparisons, structure of, 388 transient, 365, 373, 376 386, 400–405 208–209 type generators, 368–369 object model of, 387–400 tuples for relations, 155–156, 163, type structures for, 368–370 object query language (OQL) and, 386, unique identity, 367–368 467–468 visible/hidden attributes, 371, 375 408 unavailable (or withheld) value, 208 Observer function, SQL encapsulation, 384 objects, 387–388, 392, 395–400 unknown value, 208 ODBC (Open Database Connectivity) standards, 386, 417–417 Numeric arrays, PHP, 349 data mining, 1094–1095 OID (object identifier) Numeric data types, 182, 348 standard, 49, 326 immutable property of, 367 Object-based storage, 591–592 ODBs (object databases) ODB unique object identity and, 367–368 Object Data Management Group, see C++ language binding, 417–418 ODMG models, 387 conceptual design, 405–408 reference types used for in SQL, 383 ODMG (Object Data Management development 
of, 363–365 OLAP (Online analytical processing) Group) encapsulation of operations, 366, data warehousing and, 1102 Object data models data warehousing characteristics and, classes, 52 370–374, 384–385 data model type, 33 inheritance and, 366, 374–377, 1104 DBMS classification from, 51, 52–53 HOLAP (hybrid OLAP) option, 1114 hierarchies (acyclic graphs), 52 378, 385, 393 MOLAP (multidimensional OLAP) methods, 53 instance variables, 365–366 ODMG, 387–400 inverse references, 366, 370, 396–397 function, 1114 Object databases, see ODBs (object ROLAP (relational OLAP) function, databases) Object definition language, see ODL 1114 (object definition language) use of, 4

1232 Index OLTP (online transaction processing) generalized projection, 259–260 ORDER BY clause data warehousing and, 1102 insert, 166–167 SQL, 197–198 multiuser transaction processing, 14 JOIN, 251–255, 262–264, 668–676 XQuery, 446 relational data modeling, 169 method (body) of, 366, 371 special-purpose DBMS use, 52 ODBs, 366, 370–374, 384–385 Order preserving, hashing, 577 pipelining for combinations of, Ordered (sorted) records, 568–572 Online analytical processing, see OLAP Ordering field, records, 568 (Online analytical processing) 681–683 OUTER JOIN operations, 216, 262–264 program variables for, 565–566 Outer query, 209 Online transaction processing, see OLTP record-at-a-time, 566 OUTER UNION operation, 264–265 (online transaction processing) recursive closure, 262 Outlines, Oracle, 736 relational algebra, 240–259, 262–265 Overflow (transaction) file, 571 Ontology relational data modeling, 165–168 Overlapping entities, 115, 126 conceptualization and, 134 renaming attributes, 245–246 PageRank ranking algorithm, 1051 defined, 134 retrievals, 165–166, 564–565 Parallel algorithms knowledge representation (KR) and, 129 schedules, 759–760, 773 semantic Web data models, 133–134 selection conditions for, 564–565 aggregate operations for, 686 specification and, 134 sequence of, 245–246 architectures for, 683–684 types of, 134 set-at-a-time, 566 interquery parallelism, 687 set theory and, 246–251, 264–265 intraquery parallelism, 687 Ontology-based information integration, signature (interface) of, 366, 371 join techniques, 685 1052–1053 SQL query recovery and, 194–197 operator-level parallelism, 684–686 SQL sets, 194–195 partitioning strategies, 684 OO (object-oriented) concepts, 365–366 unary, 240, 241–246 projection and duplicate elimination, 685 OODB (object-oriented database) UNION, 194–195, 264–265 query processing using, 683–687 update (modify), 166, 168–169, 564–565 selection conditions, 685 attribute versioning, 982–984 user-defined functional requirements, 
set operations for, 686 database complexity and, 24–25 sorting, 684 development of, 363 61 Parallel database architecture, 683 temporal databases incorporating time Operator-level parallelism, 684–686 Parallel processing, 747 Operators Parameters in, 982–984 binding, 329, 333 OQL (object query language) aggregate functions, 216–219, 260–261 disks, 1167–1169 arithmetic, SQL, 196–197 JDBC statement parameters, 333 aggregate functions, 413–414 collections, 413–416 SQL/CLI statement parameters, 329 Boolean (true/false) results, 414 comparison, 209–211 stored procedure type and mode, collection operators, 413–416 nested queries, 209–211 element operator, 413 defined, 17 336–337 exists quantifier, 415 grouping, 415–416 Parametric (naïve) end users, 16 grouping operator, 415–416 logical comparison, SQL, 188–190 Parametric user interfaces, 42 indexed (ordered) collection OQL collections, 413–416 Parent nodes, tree structures, 617 spacial, 990–991 Parser, query processing, 655 expressions, 415 SQL query recovery, 188–190, Partial categories, 122 iterator variables for, 409–410 Partial key, 79, 479 named query specification, 412–413 196–197, 209–211 Partial specialization, 115, 126 ODBs, 408–416 SQL query translation into, 657–660 Participation constraints, 77–78 ODGM model queries and, 408–416 Optical drives, 544 Partition algorithm, 1081 ODMG standard and, 386 Optimistic protocols, 781 Partition-hash join, 559, 674–675, 719, path expressions, 410–412 Optional field, records, 561–562 query results, 410–412 OR logical connective, SQL, 209–210 930–931 select…from…where structure of, 409 OR operator, see AND/OR/NOT operators Partition tolerance, DDBs, 845 OOPL (object-oriented programming Oracle Partitioned hashing, 632 adaptive optimization, 735 Partitioning strategies language), class library for, 312 array processing, 735–736 op comparison operator, 270 global query optimizer, 734–735 NOSQL, 886 Open addressing, hashing collision hints, 736 parallel algorithms, 684 
key-value store, 899 Partitions resolution, 574 label-based security policy, 1155–1158 OQL, 415–416 OPEN CURSOR command, SQL, 317 outlines, 736 grouping and, 219, 415–416 OpenPGP (Pretty Good Privacy) physical optimizer, 733–734 SQL query retrieval and, 219 query optimization in, 733–737 Path expressions protocol, XML, 1140–1141 SQL plan management, 736–737 OQL, 410–412 Operating system (OS), 42 virtual private database (VPD) SQL, 386 Operational data store (ODS), 583, 1105 XPath for, 443–445 Operations. See also Query processing technology, 1156 Path separators (/ and //), XML, 443 ORDBMS (object-relational database Patterns, substring matching in SQL, strategies aggregate, 678–679 management system), 364 195–197 assignment (←) for, 245 binary, 240, 251–259, 262–264 defined, 12 delete, 166, 167–168 dot notation for objects, 372 encapsulation, 366, 370–374, 384–385 files, 564–567

Index 1233 PEAR (PHP Extension and Application Placeholders, PHP, 356 PROJECT operation Repository), 353–354 Plan caching, query optimization, 730 degree of relations, 244 Pointers duplicate elimination and, 244–245 Performance, Big data technology and, 945 query processing, algorithms for, Performance monitoring, 45 B-trees, 620, 623–624 676–678 Periodic updates, SQL views, 230 file records, 563, 575–576 relational algebra using, 243–245 Persistent data, storage of, 545 Polymorphism (operator overloading) Persistent objects, 365, 373–374 binding and, 377 Prolog language, deductive databases, Persistent storage, 19–20 ODBs, 366, 377 1000–1003 Persistent storage modules, 336 Populating (loading) databases, 35 Phantom records, concurrency control Positional iterator, SQLJ, 323 Proof by contradiction, 507 Practical relational model, 177–206. Proof-theoretic interpretation of rules, and, 806–807 PHP (Hypertext processor) See also SQL (Structured Query 1005 Language) system Properties of decomposition arrays, 345–346, 348–350 Precompiler built-in variables, 352–353 DML command extraction, 44 attribute preservation condition, 513 comments in, 345 embedded SQL and, 311, 314 dependency preservation, 514–515 connecting to a database, 353–355 Predefined variables, PHP, 345–346 insufficiency of normal forms, data collection and records, 355–356 Predicate, relation schema and, 156 error checking, 355 Predicate-defined subclasses, 113, 126 513–514 Extension and Application Repository Prefix compression, string indexing, 640 nonadditive (lossless) join, 515–517, PreparedStatement objects, JDBC, 333 (PEAR), 353–354 Preprocessor, embedded SQL and, 311, 314 519–523 functions, 350–352 Primary file organization, 546 RDB design and, 504, 513–518 here documents, 347–348 Primary indexes, 602, 603–606 universal relations and, 513 HTML and, 343–346 Primary keys Protection, databases, 6 middle-tier Web server as, 344 arbitrary designation of, 477 Proximity queries, 1036 numeric data types 
for, 348 normal form based on, 483–495 Public key encryption, 1151–1152 placeholders, 356 relational data modeling, 159 Pure distributed database architecture, predefined variables, 345–346 SQL constraints, 186–187 query retrieval, 356–357 XML specification, 441 869–871 query submission, 355 Primary storage, 542, 543 QBE (Query-by-Example) language text strings in, 346, 347–348 Prime/nonprime attributes, 477 use of, 343–345 Printer servers, client/server architecture, aggregate functions in, 1175–1177 variable names for, 346, 347 47 grouping, 1175–1177 Web programming using, 343–359 Privacy issues and preservation, modifying the database, 1177–1178 Phrase queries, 1036 1153–1154 retrievals in, 1171–1175 Physical clustering, mixed records, 583 Privileged software use, 19 Qualified association, UML class Physical data independence, 38 Privileges, granting and revoking in SQL, Physical data models, 33–34 202 diagrams, 88 Physical database design Probabilistic model, IR, 1033–1034 Qualifier conditions, XML, 443 data storage and, 546 Probabilistic topic modeling, IR, Quantifiers indexing design decisions, 645–646 1059–1061 indexing structures, 601–652 Program variables domain relational calculus, 279 job mix factors for, 643–645 embedded SQL, 314–315 existential, 271, 274 multilevel indexes, 613–617 file operations, 565–566 queries using, 274–276 relational databases (RDBs) with, Program-data independence, 12 transformation of, 274 Programming, see Database tuple relational calculus, 271, 274–276 643–646 programming; SQL programming universal, 271, 274–276 single-level ordered indexes, 602–613 Programming languages Queries Physical database file structures, 641. 
See DBMS, 38–40 buffering (caching) modules for, 20, 42 declarative, 40 compiler, 43–44 also Indexes design for database programming, complex retrieval, 207–225 Physical design, data modeling, 62 312–313, 339 constant nodes, 273 Physical index, 638–639 impendence mismatch, 312–313 Datalog language, 1004, 1010–1012 Physical optimization, queries, 724 Java, 321–325, 358 defined, 6 Physical optimizer, Oracle, 733–734 PHP (Hypertext processor), 343–359 indexes for, 20 Pin count, buffer management, 558 QBE (Query-by-Example), 1171–1178 indexing hints in, 641–642 Pin-unpin bit, database recovery cache, 816 XML, 434, 436–447 information retrieval (IR) systems, Pipelined parallelism, 687 Programming model, MapReduce (MR), Pipelining 918–921 1035–1037 Program-operation independence, 12 interactive interface, 43–44 combining operations using, 681–683 Project attributes, 189 join condition, 189, 191 iterators for implementation of, 682–683 keyword-based, 41 materialized evaluation and, 681 named specification, OQL, 412–413 pipelined evaluation, 682 nested, 209–212 processing information, 1028–1029 nonrecursive evaluation, 1010–1012 query processing using, 681–683 object query language (OQL), 408–416 ODMG model for, 408–416 optimizer, 44 outer, 209 processing in databases, 20

1234 Index Queries (continued) Query processing strategies renaming attributes, 192, 214–215 quantifiers for, 274–276 aggregate operation implementation, SELECT statement (clause) for, recursive, 223 678–679 relation nodes, 273 anti-join (AJ) operator for, 658–660 187–188, 194–195, 197 relational algebra for, 265–268 distributed databases (DDBs), 859–863 select-from-where block, 188–191 select-from-where structure, 188–190 external sorting algorithms, 660–663 set operations for, 194–195 selection condition, 189 importance of, 656–657 set/multiset comparisons, 209–211 select-project-join, 189–190, 273 JOIN operation implementation, SQL, 187–198, 207–225, 230–231 spatial, 991 668–676, 679–681 substring pattern matching, 195–197 SQL retrieval, 187–198, 207–225 parallel algorithms for, 683–687 table set relations, 193–195 temporal constructs, 984–986 pipelining to combine operations, three-valued logic for comparisons, TSQL2 language for, 984–986 681–683 tuple relational calculus for, 272–276 PROJECT operation algorithm, 676–678 208–209 XML languages for, 443–447 query block for, 657–658 tuple variables for, 192, 209–211 query optimization compared to, UNIQUE function for, 212–214 Query block, 657–658 655–657 views (virtual tables) for, 230–231 Query decomposition, DDBMS, 863–865 SELECT operation algorithms, 663–668 WHERE clause for, 188, 192–193, 197 Query execution semi-join (SJ) operator for, 658–660 WITH clause for, 222–223 set operation algorithm, 676–678 Query server, two-tier client/server aggregate functions for, 709 SQL query translation, 657–660 cost components for, 711–712 steps for, 655–656 architecture, 49 GROUP-BY view merging, 705–706 Query submission, PHP, 355 incremental view maintenance, 707–710 Query results Query tree materialized views for, 707–710 bound columns approach, 329 nested subqueries, 702–704 cursor (iterator variable) for, 312, defined, 257 query evaluation for, 701–702 317–320 heuristic optimization of, 694–694 subquery (view) merging 
embedded SQL, 312, 317–320 internal query representation by, 655 impedance mismatch and, 312 notation, 257–259, 692–694 transformation for, 704–706 iterators for, 323–325 query optimization, 692–697 Query graphs OQL, 410–412 RDBMS use of, 257–259 path expressions, 386, 410–412 semantic equivalence of, 694–695 internal query representation by, 655 PHP, 356–357 Query validation, 655 notation, 692–694 SQL/CLI processing, 329 Question answering (QA) systems, query optimization, 692–697 SQLJ processing of, 323–325 tuple relational calculus, 273–274 1061–1063 Query modification, SQL views, 229–230 Query retrieval RAID (redundant arrays of inexpensive Query optimization aggregate functions in, 216–219 cost estimation for, 657, 710–713, alias for, 192 disks) technology arithmetic operators for, 196–197 bit-level striping, 584, 586 716–717 asterisk (*) uses, 193, 218 block-level striping, 584–585, 586 cost functions for, 714–715, 717- attribute name qualification, 191 data striping, 584–585 cost-based optimization, 710–712, 716, Boolean (TRUE/FALSE) statements levels, 586–588 for, 212–214 mirroring, (shadowing), 585 726–728 CASE clause for, 222–223 parallelizing disk access using, 542, data warehouses, 731–733 clauses used in, 198–199 distributed databases (DDBs), 859–863 comparison operators, 188–191, 195–197 584–588 dynamic programming, 716, 725–726 complex queries, 207–225 performance, improvement with, 586 execution plan, display of, 729 EXISTS function for, 212–214 reliability, improvement with, heuristic rules for, 657, 692, 697–701 explicit sets of values, 214–215 histograms for, 713 FROM clause for, 188–189, 197, 232 585–586 JOIN operation for, 717–726 grouping, 216–222 Random-access memory (RAM), 543 multirelation queries, 721–724 joined tables (relations), 215–216 Random access storage devices, 554 operation size estimation, 729–730 LIKE comparison operator, 195–196 Range partitioning, 684, 886 Oracle, 733–737 logical comparison operators for, Range relations, 
tuple variables and, physical optimization, 724 188–190 plan caching, 730 multiset of tuples, 188, 193–195 269–270 query execution and, 701–712 nested queries, 209–212 RDBMS (Relational database query processing compared to, NULL values and, 208–209 ORDER BY clause for, 197–198 management system) 655–657 ordering results, 197 query tree notation, 257–259 query trees and graphs for, 692–697 PHP, 356–357 two-tier client/server architecture and, SELECT operation for, 714 QBE (Query-by-Example) language, semantic query optimization, 737–738 1171–1175 49 star-transformation optimization, recursive queries, 223 RDBs (relational databases) 731–733 application flexibility with, 24 top-k results, 730 data abstraction in, 24 transformation rules for relational indexing for, 643–646 integrity constraints and, 160–163 algebra operations, 697–699 physical database design in, 643–646 Query optimizer, 655 relation schema sets as, 160 schemas, 160–163 temporal databases incorporating time in, 977–982

Index 1235 tuple versioning, 977–982 degree (arity) of attributes, 152 delete operation, 166, 167–168 valid and invalid relational states, facts, 156 domains, 151–152 functional dependency of, 471–474 entity integrity, 163–165 160–161 goodness of, 459 extraction of XML documents using, Reachability, object persistence and, interpretation of, 156 key of, 159 447–449 373–374 nested relations, 479–480 flat files, 150 Read/write head, disk devices, 551 normalization of relations, 474–476 formal languages for, see Relational Read/write transactions, 748 NULL value in tuples, 467–468 Real-time database technology, 4 predicate, 156 algebra; Relational calculus Reasoning mechanisms, 129 redundant information in tuples, insert operation, 166–167 Recall and precision metrics, IR, key constraints, 21, 158–160, 163–165 465–467 mathematical relation of, 149 1044–1046 relational database (RDB) design notation for, 156–157 Record type (format), 560 operations, 165–168 Record-at-a-time, file operations, 566 guidelines, 461–471 referential integrity, 163–165 Record-at-time DML, 40 relational model constraints and, practical language for, see SQL Record-based data models, 33 Records 157–165 (Structured Query Language) relational model domains and, 152 relations, 152–156 blocking, 563–564 semantics of, 461–465 representational model type, 33 data types, 560–561 spurious tuple generation, 468–471 retrievals (operations), 165–166 data values, 560 superkey of, 158–159 schemas, 152–165 fields, 560, 561–563, 568–569, 582–583 universal, 471–474 table of values, 150–151 file storage, 560–564, 567–572, Relation state transactions, 169 current, 153 tuples, 152–156 582–583 relational model domains and, 152–153 update (modify) operation, 166, fixed-length, 561–563 relational database, 160–161 mixed, 582–583 tuple values in, 152–156 168–169 ordered (sorted), 568–572 valid and not valid, 160–161 Relational database (RDB) design spanned versus unspanned, 563–564 Relational algebra unordered 
(heaps), 567–568 aggregate functions, 240, 260–261 algorithms for schema design, variable-length, 561–563 binary operations, 240, 251–259, 519–523, 524–527 Recoverability basis of schedules, 761–762 Recoverable/nonrecoverable schedule, 761 262–264 bottom-up method, 460, 504 Recursive closure operations, 262 expressions for, 239, 241–242, 245 by analysis, 503 Recursive queries, 223 formal relational modeling and, 239–240 by synthesis, 504, 503 Recursive (self-referencing) generalized projection operation, dangling tuple problems, 523–524 data model mapping for, 289 relationships, 75 259–260 designer intention for, 459–460 Redis key-value cache, 900 groupings, 260–261 EER-to-relational mapping, 298–303 Redundancy control, 18–19 operations, purpose and notation of, 258 ER-to-relational mapping, 290–298 REF keyword, 383, 386 procedural order of, 268 functional dependency and, 471–474, Reference types, OIDs created using, 383 queries in, 265–268 References query optimization and, 697–699 505–512, 527–528, 532 recursive closure operations, 262 implementation (physical storage) dot notation for path expressions, 386 set theory and, 246–251, 264–265 inverse, 366, 370, 396–397 SQL query translation into, 657–660 level, 459–460 object identity from, 370 transformation rules for operations, inclusion dependency and, 531–532 object type relationships, 369–370 inference rules for, 505–509, 527–528 relationships specified by, 386 697–699 join dependency (JD) and, 494–495, SQL, 370, 386 unary operations, 240, 241–246 Referential integrity Relational calculus 530–531 constraints, 21, 163–165, 186–187 declarative expressions for, 268 keys for, 474–483 NULL values and, 163–164 domains and, 268, 277–279 logical (conceptual) level, 459–460 relational data modeling, 163–165 formal relational modeling and, multivalued dependency (MVD) and, SQL constraints, 186–187 Referential triggered action clause, SQL, 240–241 491–494, 527–530 nonprocedural language of, 268 normal forms, 474–495, 
513–514, 186 query graphs, 273–274 Reflexive association, UML class relationally complete language of, 268 528–533 tuples and, 268–277 normalization algorithm problems, diagrams, 87 Relational data models Regression, data mining, 1091–1092 attributes, 152–153 524–527 Regression function, data mining, 1092 breaking cycle for tree-structure model normalization of relations, 474–476, Relation extension/intension, 152 Relation nodes, query graphs, 273 conversion, 452–453 482, 485, 486–487, 493–494 Relation schema concepts, 150–157 NULL value problems, 523–524 constraints, 157–167 ODBs compared to, 405–406 anomalies and, 465–467 DBMS criteria and, 51–52 properties of decomposition, 504, assertion, 156 attribute clarity and, 464 513–518 relation schema, guidelines for, 461–471 top-down method, 460 universal relations, 471–474, 504

1236 Index Relational database management system, selection conditions, 564–565 recoverable/nonrecoverable schedule, see RDBMS (Relational database Retrieval, 1027 761 management system) RETURN clause, XQuery, 446 ROLAP (relational OLAP) function, 1114 result equivalence of, 765 Relational database state, 160–161 Role-based access control (RBAC), 1121, serial schedules, 763–764 Relational databases, see RDBs (relational serializability basis of, 763–766 1137–1139 serializable schedules, 763, 765–766 databases) Role names, 75 strict schedule, 762 Relational operators for deductive Roles of domain attributes, 152 testing for serializability, 767–770 Root, tree structures, 617 transaction processing, 759–773 databases, 1010 Root element, XML, 440 transactions for, 759–760 Relationally complete language of, 268 Root tag, XML documents, 434 view equivalence, 771–772 Relationships Rotational delay (latency), disk devices, 552 view serializability, 771–772 Round-robin partitioning, 684 Schema-based (explicit) constraints, 157 aggregation, 87–88 Row, SQL, 179 Schema change statements associations, 87–88 Row-based constraints, SQL, 187 ALTER command, 233–234 attributes of, 78 Row-level access control, 1139–1140 DROP command, 233 attributes, as, 74 ROW TYPE command, 380 schema evolution command use, binary types, 76–78, 293–295 RSA public key encryption algorithm, cardinality ratios for, 76–77 232–233 comparison of ternary and binary, 88–91 1152 Schema diagram, 34–35 conceptual data models, 33 Rules Schema matching, 1052 constraints on, 76–78, 91–92 Schemaless documents, XML, 432–433 degree of types, 71–74, 88 active databases systems, 22 Schemas entity participation in, 72–73 active rules, 962–964, 970–973 ER models and, 72–78, 88–92 association rules, 1073–1084 authorization identifier, 179 ER-to-relational mapping, 293–296 axioms, 1005 bottom-up conceptual synthesis, 119 existence dependency, 77–78 deductive database systems, 22 catalog collection of, 35, 38, 180 
identifying, 79 deductive databases, 1000, 1005–1007 conceptual level, 37, 61–62 instances, 72 defined, 1000 constraints and, 157–165 inverse, 396–397 force/no-force rules, 817–818 constructs, 35 multivalued attributes, 295–296 4NF schema, 527–528 data independence and, 37–38 n-degree, 88–92, 296 functional dependencies, 505–509, database descriptions, 34 ODMG model objects, 396–397 database state (snapshot) and, 35 order of instances in, 87 527–528 database requirements, 122–124 participation constraints of, 77–78 inference rules, 505–509, 527–528 descriptors, 179 recursive (self-referencing), 75 inferencing information using, 22 design creation (conceptual) of, 61–62 role names and, 75 interpretation of, 1005–1007 EER modeling and, 119–120, 122–124 sets, 72 models for, 1005–1006 EER schema to ODB schema, 407–408 structural constraints of, 78 model-theoretic interpretation of, 1005 ER diagram notation for, 81, 83–85 subtype/supertype, 375–376 proof-theoretic interpretation of, 1005 ER modeling and, 61–62 ternary, 88–92 stored procedure for, 22 evolution, 35 type, 72–78, 126 theorem proving, 1005 external level (views), 37 type hierarchies, 375–376 triggers as, 22 intention, 35 UML class diagrams, 87–88 Runtime, MapReduce (MR), 927–930 interface inheritance, ODL, 404–405 Reliability, DDBs, 844–845 Runtime database processor, 44, 655 internal level, 36 RENAME operator (ρ), 245–246 Safe expressions, 276–277 mappings, 37, 407–408 Renaming attributes in SQL, 192, 214–215 Sampling algorithm, 1076–1077 meta-data storage of, 35 Repeating field, records, 561–563 Scalability naming constructs, 82 Replication models, 886 DDBs, 845 ODB conceptual design and, 407–408 Replication transparency NOSQL, 885 ODL, 400–403 DDBs, 843 Scale-invariant feature transform refinement using generalization and NOSQL, 885–886, 894 Representational (implementation) data (SIFT), 998 specialization, 119–120 Scanner, query processing, 655 relation, 157–160, 163–165 models), 33 Schedules (histories) 
relational database, 160–163 Resource Description Framework SQL concepts, 179–180 cascading rollback phenomenon, 762 three-schema architecture, 36–38 (RDF), 447 committed projection of, 760 top-down conceptual refinement, 119 ResourceManager (RM), YARN, complete schedule conditions, 760 XML language, 434, 436–441 concurrency control and Script functions, HTML, 428 941–942 Search, B-trees, 625–626 RESTRICT option, SQL, 233, 234 serializability, 770–771 Search engines Result equivalence, schedules, 765 conflict equivalence of, 765–766 desktop, 1025 ResultSet object JDBC, 334–335 conflicting operations in, 759–760 Lucene, 1043–1044 Retrieval operations debit–credit transactions, 773 Web search, 1047 nonserial schedules, 763, 764–765 files, 564–565 operation semantics for, 773 object information, 371 recoverability basis of, 761–762 relational data models, 165–166

Index 1237 Search relevance, IR, 1044–1047 file operations, 564–565 parallel algorithms, 686 Search techniques parallel algorithms, 685 query processing, algorithms for, WHERE clause queries, 189 conjunctive selection, 665–666 Selective inheritance, 377 676–678 disjunctive selection, 666–667 Selectivity SQL, 194–195 keyword-based, 41 join operations, 254, 719–720 Set theory query processing, 663–667 of a condition, 243, 667–668 CARTESIAN PRODUCT operation, SELECT operation algorithms, Self-describing data, 10, 427 Self-describing data models, 34 249–251 663–667 Self-describing documents, 425. See also INTERSECTION operation, 247–249 simple selection, 663–665 MINUS operation, 247–249 Web database applications, 4 JSON; XML (EXtended Markup OUTER UNION operation, 264–265 Search trees, dynamic multilevel indexes, Language) relational algebra operations from, Semantic approach, IR, 1028 618–619 Semantic data models 246–251, 264–265 Second normal form (2NF) abstraction concepts, 129–133 SET DIFFERENCE operation, EER modeling, 107–108 definition of, 481 ontology for, 132–134 247–249 full functional dependency and, Semantic equivalence, query trees, type compatibility, 247 694–695 UNION operation, 246–249 481–482 Semantic heterogeneity, 857–858 Set type, legacy data modeling with, 53 general definition of, 484–486 Semantic model, IR, 1034–1035 Set-at-a-time, file operations, 566 normalizing relations, 482, 484–486 Semantic query optimization, 737–738 Set-at-time DML, 40 primary key and, 483–484 Semantic tagging, images, 998–999 Sets Secondary access path, indexing, 601 Semantics explicit set of values, 214 Secondary indexes, 603, 609–612 attribute clarity, 461–465 multiset comparisons, SQL query Secondary storage data constraints, 21 capacity of, 534 functional dependency of, 472–473 retrieval, 209–211 devices for, 547–556 relation schema, 461–465 parentheses for, 214 random access devices, 554 RDB design, 461–465, 472–473 SQL table relations, 188, 193–195 sequential 
access devices, 554–555 schedule operations, 773 Shadow directory, 826 solid-state drive (SSD), 542 Semi-join (SJ) operator, 658–660, 681, Shadow paging, database recovery, Security, see Data security; Database 719–720, 862–863 Semistructured data, XML, 426–428 826–827 security Separator characters, records, 561 Shadowing, 816 Security and authorization subsystems, 19 Sequence of interaction, database Sharding Seek time, disk devices, 552 programming and, 313–314 SELECT clause statement Sequence of operations, relational DDBs, 847–848 algebra, 245–246 NOSQL, 886, 894–895 ALL option with, 194–195 Sequential access storage devices, Shared-disk architecture, 683 AS option with, 196 554–555 Shared/exclusive (read/write) locks, DISTINCT option with, 188, 194 Sequential pattern discovery, data mandatory use of, 197 mining, 1091 784–786 multiset tables and, 194–195 Serial ATA (SATA), 551 Shared-memory architecture, 683 SQL query retrieval and, 187–188, Serial schedules, 763–764 Shared-nothing architecture, 684 Serializability Shared subclasses, 118, 301 194–197 basis of schedules, 763–766 Shared variables in embedded SQL, 314 SELECT operation concurrency control and, 770–771 Signature of operations, 366, 397. 
See also testing for, 767–770 Boolean expressions (clauses), Serializable schedules, 763, 765–766 Interfaces 241–242 Server, defined, 48 Simple (atomic) attributes, 65–66 Servers Simple elements, XML, 431 cascade (sequence) with, 243 application, 44 Simple Object Access Protocol (SOAP), conjunctive selection, 665–666 database, 44 cost functions for, 714 DBMS module, 31 447 degree of relations, 243 SET clause, SQL, 201 Simple selection, search methods for, disjunctive selection, 666–667 SET CONNECTION command, SQL, estimating selectivity of conditions, 316 663–665 Set constructor, 369 Single character replacement symbol (_), 667–668 SET DIFFERENCE operation, 247–249 implementation options for, 663 Set operations 195–196 query processing algorithms, 663–668 anti-join (AJ) operator for set Single inheritance, 118–119 relational algebra using, 241–243 difference, 677–678 Single-level ordered indexes search methods for, 663–667 selectivity of a condition, 243, 667–668 clustering indexes, 602, 606–608 simple selection, 663–665 concept of, 602–603 SELECT operator (σ), 241 physical database design and, 602–613 Select…from…where structure, OQL, primary indexes, 602, 603–606 secondary indexes, 603, 609–612 409 Single-relation options, EER-to- Select-from-where block, SQL, 188–191 Select-project-join query, 189–190, 273 relational mapping, 299–300 Selection conditions Single-sided disks, 547 Single time point, 976 domain variables, 278 Single-user DBMS systems, 51 Single-valued attribute, ER modeling, 66

Small computer system interface (SCSI), assertions, 158, 156, 165, 225–226 SQL plan management, Oracle, 736–737 551 attribute data types in, 182–184 SQL programming catalog concepts, 179–180 Snapshot isolation CHECK clause, 187 comparison of approaches, 338–339 concurrency control and, 758, 781, comparison operators, 188–191, 195–197 database programming language 799–800 complex queries, 207–225 defined, 775 constraints, 165, 184–187, 225–227 approaches, 309–314, 339 SQL transaction support and, 775–776 core specifications, 178 database stored procedures, 335–338 CREATE ASSERTION statement, dynamic SQL, 320–321 Snapshot (database) state, 35 embedded SQL, 311, 314–320, 338–339 Snowflake schema, 1108–1109 225–226 JDBC: SQL class library, 331–335 Social search, IR, 1058–1059 CREATE TABLE command, 180–182 library of functions or classes for, Software engineers, 16 CREATE TRIGGER statement, 225, Solid-state device (SSD) storage, 553–555 311–312, 326–335, 339 Solid-state drive (SSD), secondary 226–227 query specification and, 320–321 data definition, 179 SQL/CLI (SQL call level interface), storage of, 542 DBMS use of, 177–178 Sophisticated end users, 16 DELETE command, 200 326–331 Sorting phase, external algorithms, 661 domains, 184 SQLJ: Java commands, 321–325 Sort-merge join, 559, 719, 930 encapsulation of operations, 384–385 SQL server, two-tier client/server Spanned versus unspanned records, extensions, 178 function overloading, 385 architecture, 49 granting and revoking privileges, 202 SQL/CLI (SQL call level interface) 563–564 history of, 178 Spatial analysis operations, 988 index creation, 201–202 connection record, 327–328 Spatial colocation rules, 993–994 inheritance, type specification of, 385 database programming with, 326–331 Spatial databases INSERT command, 198–200 description record, 327–328 logical comparison operators, 188–190 environment record, 327–328 analytical operations, 988 NOSQL database system and, 26
handles for records, 328 data mining, 993–994 object identifiers, 383 statement record, 327–328 data types, 989–990 object-relational systems, 202 steps for programming, 328–331 enhanced data models, 962, 987–994 ODB extensions to, 379–386 SQL/PSM (SQL/persistent stored indexing, 991–993 operators, query translation into, models of information, 990 modules), 337–338 object storage by, 987–988 657–660 SQLCODE variable, 316 operators, 990–991 practical relational model, 177–206 SQLJ queries, 991 query processing, translation for, Specialization embedding SQL commands in Java, attribute-defined, 114 657–660 321–325 conceptual schema refinement, 119–120 query retrieval, 187–198, 207–225 constraints on, 113–116 reference types, 383 exceptions for error handling, 322–323 defined, 110 relational algebra, query translation iterators for, 323–325 design choices for, 124–128 query result processing, 323–325 disjointness (d notation), 114–115 into, 657–660 SQLSTATE variable, 316 EER diagram notation for, 109, 110 relational data models and, 51, 165 Standalone users, 16 EER modeling concept, 108, 110–120, schema change statements, 232–234 Standards, enforcement of, 22 schema concepts, 179–180 Star schema, 1108 124–128 syntax of, 235 STARBURST, statement-level rules in, EER-to-relational mapping options, table creation, 383–384 transaction support, 773–776 970–972 298–301 triggers, 158, 165, 226–227 Star-transformation optimization, hierarchies, 116–119 UPDATE command, 200–201 instances of, 111–112 user-defined types (UDTs), 380–384 731–733 lattices, 116–119 views (virtual tables), 228–232 Starvation, 792 partial, 115 XML data creation functions State constraints, 165 semantic modeling process, 131 State of an object or literal, 387 total, 115 (XML/SQL), 453–455 Statement object JDBC, 335 UML notation for, 127–128 SQL injection Statement parameter Specialized servers, client/server bind variables, 1145–1146 binding, 329, 333 architecture, 47 code injection, 1144 JDBC, 333 
Specification, ontology and, 134 database security, 1143–1146 SQL/CLI, 329 Speech input and output, 41 filtering input, 1146 Statement record, SQL/CLI, 327–329 Spurious tuple generation, RDB design function call injection, 1144–1145 Statement string, SQL/CLI, 329 function security for, 1146 Statement-level rules, STARBURST, and, 468–471 manipulation, 1143–1144 SQL (Structured Query Language) protection against attacks, 1145–1146 970–972 risks associated with, 1145 Statement-level trigger, 967 system Static files, 566 active database techniques, 202 Static hashing, 577 arithmetic operators, 196–197 Statistical approach, IR, 1028 Statistical database security, 1146–1147 Steal/no-steal rules, 817–818 Stemming, IR text processing, 1038

Index 1239 Stopword removal, IR text processing, Strings. See also Text strings System designers and implementers, 17 1037–1038 character data types, 182–183 System log double quotations (“ ”) for, 196, 347 Storage indexing, 640 database recovery, 814, 817, 818–819 architectures for, 588–592 prefix compression, 640 modifications for database security, automated storage tiering (AST), 591 single quotations (‘ ’) for, 182, 196, 347 big data, 3 SQL use of, 182–183, 195–197 1125 buffering blocks, 541, 556–560 substring pattern matching, 195–197 transaction processing, 755–756 capacity, 543 Table inheritance, SQL, 385 cloud, 3 Strong entity types, 79 Table of values, 150–151 column-based, indexing for, 642 Struct (tuple) constructor, 368, 369 Table-based constraints, SQL, 184–187 database catalog for, 10–11 Structural constraints, 78 Tables database organization of, 545–546 Structured data, XML, 426 ALTER TABLE command, 180 database reorganization, 45 Structured data extraction, WEB, 1052 base relations, 180, 182 devices for, 543–545, 547–556 Structured objects and literals, 388, 396 CREATE TABLE command, 180–182 Fibre Channel over Ethernet (FCoE), Structured Query Language, see SQL data definition statements, 180–182 590–591 database recovery, 828–831 Fibre Channel over IP (FCIP), 590 (Structured Query Language) inner join, 215–216 file records, 560–564, 567–572, Subclasses joined relations, 215–216 582–583 multiset operations, 193–195 files, 10–11, 560–572, 582–583 class relationships, 108–110 multiway join, 216 hashing techniques, 572–582 defined, 126 NATURAL JOIN operation, 215 Internet SCSI (iSCSI), 590 defining predicate of, 113–114 OUTER JOIN operations, 216 memory hierarchies, 543–545 EER diagram notation for, 109 query retrieval and, 193–195 meta-data, 6, 10 EER modeling concept, 108–110, 126 query retrieval and, 193–195, 215–216 network-attached storage (NAS), EER-to-relational mapping, 301 sets of relations in, 188, 193–195 589–590 entity type as, 110 
transaction, 828–831 object-based, 591–592 inheritance, 110, 117–119, 301 trigger activation from, 22 objects, 987–988 IS-A relationship, 109, 126 UDT creation of for SQL, 383–384 persistent, 19–20, 545 leaf class (UML node), 127 views (virtual tables), 228–232 primary, 542, 543 local attributes of, 110–111 virtual relations, 82 program objects, 19–20 overlapping entities, 115 Tags RAID technology, 542, 584–588 predicate-defined, 113–114 attributes, 430 secondary, 542, 543, 547–556 shared, 118, 301 document body specification, 429 spatial databases for, 987–988 specialization of set of, 110–112 document header specifications, 428 storage area networks (SANs), 588–589 specific relationship types, 110–111 end/start tag (</…>), 428 tertiary, 542, 543 union type, 108, 120–122 HTML tag (<…>), 428 XML documents, 442–443 user-defined, 114 mark up of documents using, 428–429 Subqueries notation and use, HTML, 428–430 Storage area networks (SANs), 588–589 nested, 702–704 semantic tagging of images, 998–999 Storage definition language (SDL), 39 query optimization and, 702–706 XML unstructured data and, 428–430 Storage devices unnesting (decorrelation), 704 Tape jukeboxes, 544 view merging transformation, 704–706 Taxonomy, ontology as, 134 databases, organization and, 545–546 Substring pattern matching, SQL, Temporal databases disks, 547–553 applications of, 974 flash memory, 543–544 195–197 calendar, 975 magnetic tape, 544–545, 555–556 Subtrees, 617 enhanced data models, 962, 974–987 memory, 543–545, 547–556 Subtypes, 375–376 implementation considerations, 982 optical drives, 544 SUM function incorporating time, 977–984 secondary, 547–556 object-oriented databases for, 982–984 solid-state device (SSD), 553–555 grouping, 260 relational databases for, 977–982 Stored attribute, 66 SQL, 217 time representation, 975–977 Stored data manager, 42, 44 Superclasses versioning, 977–984 Stored procedures base class (UML root), 127 Temporal querying constructs, 984–986 CALL statement, 
337 categories of, 120–122 Temporary update problem, transaction database programming and, 335–338 class relationships, 109 parameter type and mode, 336–337 EER modeling concept, 109, 110, 126 processing, 750 persistent storage modules, 336 entity type as, 110 Ternary relationships rule enforcement using, 22 inheritance, 110, 117–118 SQL/PSM (SQL/persistent stored subclass relationships, 110, 117–118 binary relationships compared to, Superkey, 158–159, 476–477 88–89 modules), 337–338 Supertypes, 375–376 Stream-based processing, 682. See also Surrogate key, 302 degree of, 73–74 Symmetric key algorithms, 1150–1151 ER diagrams, 88–92 Pipelining Synthesis, RDB design by, 503, 504 notation for diagrams, 88–89 Strict schedule, 762 System analysts, 16 Tertiary storage, 542, 543

1240 Index Testing for serializability, 767–770 generation of, 793 Transient data, storage of, 545 Text/document source, multimedia transaction timestamps, 790–791 Transient objects, 365, 373 Tool developers, 17 Transition constraints, 165 databases, 996 Tools, DBMS, 45–46 Transitive dependency, 3NF, 483 Text preprocessing Top-down conceptual refinement, 119 Transparency, DDBs, 843–844 Top-down method, RDB design, 460 Tree search data structures, see B-trees; information extraction (IE), 1040 Top-k results, query optimization, 730 information retrieval (IR), 1037–1040 Topological relationships, 989 B+-trees stemming, 1038 Total categories, 122 Tree-structured data models stopword removal, 1037–1038 Total specialization, 115, 126 thesaurus use, 1038–1039 Transaction management, DDBs, 857–859 attributes, 433 Text strings Transaction processing breaking graph cycles for conversion double-quoted, 347–348 commit point, 756 interpolating variables within, 347 concurrency control, 749–752 to, 452–453 length of, 346 concurrency of, 746–747 data-centric documents, 431 PHP programming, 346, 347–348 data buffers, 748–749 data mining, 1077–1080, 1085–1086 single-quoted, 347–348 database items, 748 decision trees, 1085–1086 Thematic search, 989 DBMS-specific buffer replacement document-centric documents, 431 Theorem proving, 1005 document extraction using, 447–453 Thesaurus policies, 756–757 elements, 430–431 IR text processing, 1038–1039 read/write transactions, 748 frequent-pattern (FP) tree, 1077–1080 ontology as, 134 recovery for, 752–753 graph conversion into, 452–453 THETA JOIN condition, 252 schedules (histories), 759–773 hierarchies for, 116, 452–453 Third normal form (3NF) single-user versus multiuser systems, hybrid documents, 431 algorithm for RDB schema design, schemaless documents, 432–433 746–747 XML, 51, 430–433, 447–453 519–522 SQL transaction support, 773–776 Triggers definition of, 483 system log, 755–756 active databases, 963–967, 973–974 dependency 
preservation and, 519–522 systems, 745 database tables and, 22 general definition of, 486–487 transaction failures, 752–753 CREATE TRIGGER statement, 225, nonadditive (lossless) join transaction states, 753–754 transactions for, 747–749, 757–758 226–227 decomposition and, 519–522 Transaction rollback, database recovery, database monitoring, 226–227 normalizing relations, 485, 486–487 event-condition-action (ECA) primary key and, 483–484 819 transitive dependency and, 483 Transaction server, two-tier client/server components, 227, 963–964 Thomas’s write rule, 795 Oracle notation for, 965–967 Three-schema architecture, 36–38 architecture, 49 SQL, 158, 165, 226–227 Three-tier/client-server architecture Transaction tables, database recovery, SQL-99 standards for, 973–974 distributed databases (DDBs), 872–875 Trivial/nontrivial MVD, 493 Web applications, 49–51 828–831 Truth value of atoms, 270, 277 Three-valued logic for SQL NULL Transaction time dimensions, 976–977 TSQL2 language, 984–986 Transaction time relations, 979–980 Tuning indexes, 640–641 comparisons, 208–209 Transaction timestamps, deadlock Tuple relational calculus Thrown exceptions, SQLJ, 322–323 expressions, 270–271, 276–277 TIME data type, 183 prevention, 790–791 formulas (conditions), 270–271 Time period, 976 Transaction-id, 755 nonprocedural language of, 268 Time reduction, development of, 22–23 Transactions quantifiers, 271, 274–276 Time representation, temporal databases, queries using, 272–276 atomicity property, 14, 757 query graphs, 273–274 975–977 certification of, 781 range relations, 269–270 Time series data, 986–987 concurrency control and, 781, requested attributes, 269 Time series management systems, 987 safe expressions, 276–277 Timeouts, deadlock prevention, 792 798–799, 807 selected combinations for, 269 TIMESTAMP data type, 183–184 consistency preservation, 757 variables, 269–270 Timestamp ordering (TO) database recovery, 821 Tuple variables debit–credit, 773 alias of attributes, 192
algorithm, 793 defined, 6, 169 bound, 271 basic, 794 desirable properties of, 757–758 free, 271 concurrency control based on, durability (permanency) property, 758 iterators, 189 interactive, 807 range relations and, 269–270 792–795 isolation property, 14, 758 Tuples multiversion technique based on, 796 multiuser processing, 13–14 alternative definition of a relation and, strict, 794–795 not affecting database, 821 Thomas’s write rule for, 795 OLTP systems, 14, 52, 169 154–155 Timestamps relational data modeling, 169 anomalies and, 465–467 concurrency control and, 781, user-defined functional requirements, asterisk (*) for rows in query results, 218 790–791, 793 61 deadlock prevention using, 790–791, 793 validation (optimistic) of, 781, 798–799

Index 1241 atomic value of, 155 functions in, 374–375 duplicate elimination and, 244–245 attribute ambiguity and, 191–192 inheritance, 385 PROJECT operation, 243–245 CHECK clause for, 187 ODBs, 366, 374–377 relational algebra and, 240, 241–246 CROSS PRODUCT operation for subtype/supertype relationships, renaming attributes, 245–246 SELECT operation, 241–243 combinations, 192–193 375–376 selectivity of condition, 243 dangling tuple problems, 523–524 visible/hidden attributes, 371, 375 sequence of operations for, 245–246 delete operation for, 166, 167–168 Type (union) compatibility, 247 Unauthorized access restriction, 19 embedded SQL retrieval of, 311, Type constructors UNDO/REDO algorithm, 815, 818 array, 369 Unidirectional association, UML class 314–317 atom, 368, 369 grouping and, 219 bag, 369 diagrams, 87 mapping relations with, 154 collection (multivalued), 369 Unified Modeling Language, see UML matching, 264–265 dictionary, 369 multisets of, 193–195 list, 369 (Unified Modeling Language) nested query values, 209–211 object definition language (ODL) UNION operations n-tuple for relations, 152 NULL value of, 155–156, 163, 467–468 and, 369 matching tuples, 264–265 ordering of, 154–155 object operation, 371 OUTER UNION operation, 264–265 OUTER UNION operation and, ODB objects and literals, 368–370 partially compatible relations, 264 references to object type relationships, relational algebra, 264–265 264–265 SQL sets, 194–195 parentheses for comparisons, 210 369–370 Union types partially compatible relations, 264 set, 369 categories of, 120–122, 302–303 partitioning relations into, 219 SQL, 379 EER diagram notation for, 120 precompiler or preprocessor for struct (tuple), 368, 369 EER modeling concept, 108, 120–122 type structures and, 368–370 EER-to-relational mapping, 302–303 retrieval of, 311, 314 Type generators set union operation (∪), 120 query retrieval and, 191–195, 209–211 ODB objects and literals, 368–369 surrogate key for, 302 RDB design problems, 523–524 
ODMG models, 394–395 UNIQUE function, SQL query retrieval, redundant information in, 465–467 Type inheritance, 385 referential integrity of, 163 Type structures, 368–370. See also Type 212–214 relation schema for RDB design, Unique keys, 160 constructors Uniqueness constraints 465–471 UDTs (User-defined types) relation state values, 152–156 ER model entity types, 68–68 row-based constraints, 187 arrays, 383 key attributes as, 68–69 separate groups for NULL grouping built-in functions for, 384 key constraints with, 158–160 CARDINALITY function, 383 relation schema and, 158–160 attributes, 219 CREATE TYPE command, 380–383 Universal quantifiers, 271, 274–276 set of, 154–155 dot notation for, 383 Universal relation assumption, 513 spurious tuple generation, 468–471 encapsulation of operations, 384–385 Universal schema relations, 471–474, SQL tables and, 187, 191–195 inheritance specification (NOT type (union) compatibility, 247 504, 513 update (modify) operation for, 166, FINAL), 385 Universe of Discourse (UoD), 5 SQL, 380–385 Unnest relation, 1NF, 479–480 168–169 table creation based on, 383–384 Unordered file records (heaps), 567–568 versioning, 977–982 UML (Unified Modeling Language) Unrepeatable read problem, transaction Two-phase locking (2PL) aggregation, 87–88 basic 2PL, 788 associations, 87–88 processing, 752 concurrency control, 782–792, 796–797 base class, 127 Unstructured data, XML, 428–430 conservative 2PL, 788 bidirectional associations, 87 Unstructured information, 1022 deadlock, 789–792 class diagrams, 85–88, 127–128 Unstructured/semistructured data expanding (first) phase, 786 EER models and, 127–128 locks for, 782–786 ER models and, 60, 85–88 handling, Big data technology and, multiversion concurrency control and, leaf class, 127 945 links, 87 Update (modify) operations 796–797 qualified association, 88 relational data models, 166, 168–169 protocol, 786–788 reflexive association, 87 files, 564–565 rigorous 2PL, 789 unidirectional association, 87
relational data models, 166, 168–169 shrinking (second) phase, 786 Unary operations selection conditions for, 564–565 starvation, 792 assignment operations (←) for, 245 tuple modification using, 166, 168–169 strict 2PL, 788–789 Boolean expressions (clauses), Update anomalies, RDB design and, subsystem for, 789 465–467 Two-tier client/server architecture, 49 241–242 UPDATE command, SQL, 200–201 Two-way join, 668 cascade (sequence) with, 243 Update decomposition, DDBMS, Type (class) hierarchies defined, 243 863–865 constraints on extents corresponding degree of relations, 243, 244 Update strategies for SQL views, 230–232 Upgrading locks, 786 to, 376–377

User views, 37 query modification for, 229–230 Well-formed documents, XML, 433–434 User-defined subclass, 114, 126 query retrieval using, 230–231 WHERE clause User-defined types, see UDTs SQL virtual tables, 228–232 update strategies for, 230–232 asterisk (*) for all attributes, 193 (User-defined types) virtual data in, 13 explicit set of values in, 214–215 Utilities, DBMS functions, 45 WITH CHECK option for, 232 grouping and, 221–222 Valid documents, XML, 434 XML document extraction and, SQL query retrieval and, 188–189, Valid state, databases, 35, 160–161 Valid time, temporal databases, 976 447–452 192–193, 197, 214–215 Valid time relations, temporal databases, Virtual data, 13 selection (Boolean) condition of, 189 Virtual private database (VPD) unspecified, 192–193 977–979 XQuery, 446 Validation (optimistic) concurrency technology, 1156 WHERE CURRENT OF clause, SQL, 318 Virtual relations (tables), 82 Wide area network, 842 control, 781, 798–799 Virtual storage access method (VSAM), Wildcard (*) queries, 1036–1037 Value (state) of an object or literal, 387 WITH CHECK option, SQL views, 232 Value sets (domains) of attributes, 69–70 541 WITH clause, SQL, 222–223 Variable-length records, 561–563 Virtual tables, 228–232. 
See also Views Wrapper, 1025 Variables Write-ahead logging (WAL), database (virtual tables) built-in, 352–353 Visible attributes, objects, 371, 375 recovery, 816–818 communication, 316 Volatile/nonvolatile storage, 545 XML (EXtended Markup Language) domain, 277 Voldemort key-value data store, 897–899 embedded SQL, 314–316 Weak entity types, 79, 292–293 access control, 1140–1141 iterator, OQL, 409–410 Web analytics, 1057 data models, 34, 51, 53 interpolating within text strings, 347 Web-based user interfaces, 40 database extraction of documents, names for, 346, 347 Web crawlers, 1057 PHP, 345–347, 352–353 Web database programming 442–443, 447–453 predefined, 345–346 document type definition (DTD), program, 314–315 HTML and, 343–346 shared, 314 Java technologies for, 358 434–436 tuple, 189, 192, 169–170 PHP for, 343–359 documents, 433–436, 442–443, Vector space model, IR, 1031–1033 Web database systems Versioning access control policies, 1141–1142 447–453 attribute approach, 982–984 data interchanging using XML, 25 hierarchical (tree) data models, 51, NOSQL, 887, 899, 900–902 HTML and, 25 object-oriented databases menu-based interfaces, 40 430–433, 447–453 n-tier architecture for, 49–51 hypertext documents and, 425 incorporating time, 982–984 security, 1141–1142 OpenPGP (Pretty Good Privacy) relational databases incorporating three-tier architecture for, 49–51 Web information integration, 1052 protocol, 1140–1141 time, 977–982 Web pages protocols for, 446–447 tuple approach, 977–982 hypertext documents for, 425 query languages, 443–447 Vertical fragmentation, DDBs, 844, segmentation and noise reduction, relational data model for document 848–849 1053 extraction, 447–449 Video source, multimedia databases, 996 XML and formatting of, 425–426 schema language, 434, 436–441 View definition language, 39 Web search semistructured data, 426–428 View merging transformation, defined, 1028 SQL functions for creation of data, digital libraries for, 1047–1048 subqueries, 704–706 
HITS ranking algorithm, 1051 453–455 Views link structure analysis, 1050–1051 structured data, 426 PageRank ranking algorithm, 1051 tag notation and use, HTML, 428–430 database designer development of, 15 search engines for, 1047 unstructured data, 428–430 equivalence, schedules, 771–772 Web analysis and, 1048–1049 Web data interchanging using, 25 serializability, schedules, 771–772 Web context analysis, 1051–1054 Web page formatting by, 425–426 support of multiple data, 13 Web structure analysis, 1049–1050 XPath for path expressions, 443–445 Views (virtual tables) Web usage analysis, 1054–1057 XQuery, 445–446 authorization using, 232 Web servers XPath, XML path expressions, 443–445 base tables compared to, 228 client/server architecture, 47 XQuery, XML query specifications, CREATE VIEW statement, 228–229 three-tier architecture, 50 data warehouses compared to, 1115 Web Services Description Language 445–446 defining tables of, 228 hierarchical, 447–452 (WSDL), 447 YARN (Hadoop v2) in-line, 232 Web spamming, 1057 architecture, 940–942 DROP VIEW command, 229 Big data technology for, 936–944, materialization, 230 949–953 frameworks on, 943–944 rationale behind development of, 937–939
