(key) to both communication parties while also preventing it from being intercepted by any adversary. Now we will study a general network security model with the help of the figure given below: Fig. 4.20 Network Security Model The network security model presents the two communicating party‘s sender and receiver who mutually agrees to exchange the information. The sender has information to share with the receiver. However, the transmitter cannot deliver the message in readable form to the information cannel since it would be targeted by the adversary. As a result, the message should be converted into an unreadable format before being sent through the information channel. When changing the message, secret information is employed, which will be required when the message is retransformed at the recipient's end. As a result, a trusted third party is expected to release this secret information to both parties involved in communication. As a result, when creating a security model based on this generic concept of network security, the following four responsibilities must be considered. 1. To convert a readable message into a secret code at the sender's end, a security algorithm should be built in such a way that it is difficult for an adversary to crack. 2. The next worry of the network security design creator is the development of the secret information known as a key. This secret information is used in conjunction with the security algorithm in order to transform the message. 151
3. The secret information is now required on both ends, the sender and the recipient. It is used to encrypt or transform the communication into unreadable form at the sender's end, and to decrypt or retransform the message into readable form at the receiver's end. As a result, a trustworthy third party must communicate the secret information to both the sender and the receiver. The means for distributing the key to the sender and recipient must also be developed while establishing the network security model. To convey the secret information to the transacting entities without the opponent's intervention, a suitable mechanism must be utilized. In order to accomplish the security service, it is also ensured that the communication protocols used by the transacting entities support the security algorithm and secret key. We've just talked about the security of information or messages sent over the network thus far. Now we'll look at the network access security model, which is meant to protect information systems that can be accessed via the network by an attacker. You're fully aware of the hackers who try to break into your internet-connected system. These assailants are divided into two groups: 1. Hacker: The one who is only interested in penetrating into your system. They do not cause any harm to your system they only get satisfied by getting access to your system. 2. Intruders: These attackers intend to do damage to your system or try to obtain the information from the system which can be used to attain financial gain. The attacker can place a logical program on your system through the network which can affect the software on your system. This leads to two kinds of risks: a. Information threat: This type of threat updates data on behalf of the user that the user should not have access to. As an example, enabling a critical privilege in the system. b. Service threat: This type of danger prevents a user from obtaining system data. These types of risks can be created by releasing worms and viruses, and your system may be infected with more of them. Worms and malware are examples of software attacks that can be transmitted to your computer via the internet. 152
The network security model to secure your system is shown in the figure below: Fig. 4.21 Network Access Security Model There are two ways to secure your system from attacker of which the first is to introduce the gatekeeper function. Adding a gatekeeper function entails adding login-ids and passwords to prevent unauthorized access. In the event that an unauthorized user gains access to the system, the second option to secure it is to implement internal controls that would detect the unauthorized user by analyzing system actions. This second option is known as antivirus, and it is software that we install on our computers to prevent unauthorized users from gaining access to our computers via the internet. 4.9 SUMMARY Malicious programs are software that is installed on a system with the intent of causing harm. • A virus is a piece of code that can \"infect\" other program by altering them, leaving behind a copy of the virus program that can then infect another program. • A worm is a computer software that can duplicate itself and spread copies over network connections. The worm may be triggered to multiply and proliferate once again once it arrives. Aside from reproduction, the worm usually serves some undesirable duty. 153
• A Trojan horse is a helpful (or seemingly useful) application or command procedure that contains concealed code that performs an undesirable or damaging purpose when executed. • A logic bomb is a piece of code embedded in a valid program that is programmed to \"explode\" if particular criteria are satisfied. • A multipartite virus infects in multiple ways. Typically, the multipartite virus is capable of infecting multiple types of files, so that virus eradication must deal with all of the possible sites of infection. • A blended attack uses multiple methods of infection or transmission, to maximize the speed of contagion and the severity of the attack. • Authorization is granting of a right or permission to a system entity to access a system resource. This function determines who is trusted for a given purpose. • The most common authentication mechanism for user to operating system is a password, a \"word\" known to computer and user. • The network security model presents the two communicating party‘s sender and receiver who mutually agrees to exchange the information. 4.10 KEYWORDS • Trapdoor- secret entry point into a program that allows someone who is aware of the backdoor to gain access without going through the usual security access procedures • Logic bomb- code embedded in some legitimate program that is set to ―explode‖ when certain conditions are met. • Infection mechanism- the means by which a virus spreads, enabling it to replicate. The mechanism is also referred to as the infection vector • Dormant phase - the virus is dormant. Some trigger, such as a time, the existence of another application or file, or the disc capacity reaching a certain limit, will finally activate the virus. This stage is not present in all viruses. • Propagation phase- the virus copies itself into other program or certain system regions on the hard drive. 154
It's possible that the duplicate isn't the same as the one that's spreading; viruses frequently change to avoid detection. Each infected application will now have a virus clone, which will undergo a propagation phase of its own. • Triggering phase- the virus is triggered and begins to fulfil its desired purpose. The activating phase, like the dormant phase, can be triggered by a variety of system events, such as a tally of how many times this copy of the virus has generated copies of itself. • Parasitic virus-spread by attaching itself to executable programs. When a program infected with a parasitic virus is opened, the virus code runs. 4.11 LEARNING ACTIVITY 1. The question arises as to whether it is possible to develop a program that can analyse a piece of software to determine if it is a virus. Consider that we have a program D that is supposed to be able to do that. That is, for any program P, if we run D(P), the result returned is TRUE (P is a virus) or FALSE (P is not a virus). Now consider the following program: Program CV := { ... main-program := {if D(CV) then goto next: else infect-executable; } next: } Determine if D can correctly decide whether CV is a virus. 155
2. Compare various forms of file protection mechanisms. 3. Give a few examples of user authentication. 4. What are the assets of a computer system? 5. How user authentication is done for the passengers on the airport? Also explain how it is done at the Aadhaar centres? 156
4.12 UNIT END QUESTIONS A. Descriptive Questions 1. What are typical phases of operation of a virus or worm? 2. What is a digital immune system? 3. Specify the importance of user authentication. 4. Define access control 5. What are the security features provided by operating system? 6. Describe about various types of viruses and their counter measures 7. How security is provided in Linux OS. 8. Explain how access control helps in information security 9. Describe the models of security. 10. Write about security policies. 11. What is Virus? How they affect the software program? 12. Explain Trojan Horse. 13. Explain the terms - Zombie, Bot, Worm, and Dropper. 14. Explain the layers of Operating system security 15. What do you mean by access control? Why it is required? What are the ways in which it is done? 16. What are the points to be considered while developing a security policy? B. Multiple Choice Questions 1. Which of the following is not a type of virus? a. Boot sector b. Polymorphic c. Multipartite d. Trojans 157
2. A computer _ is a malicious code which self-replicates by copying itself to other programs. a. Program b. Virus c. Application d. Worm 3. infects the master boot record and it is challenging and a complex task to remove this virus a. Boot Sector Virus b. Polymorphic c. Multipartite d. Trojans 4. Virus is difficult to find as they keep on altering their type and signature. a. Boot Sector b. Non-resident c. Polymorphic d. Multipartite 5. Which of the below-mentioned reasons do not satisfy the reason why people create a computer virus? a. Research purpose b. Identity theft c. Pranks d. Protection 158
Answers 1-d, 2-b, 3-a, 4-c, 5-d 4.13 REFERENCES Reference Books • William Stallings, Lawrie Brown, ―Computer Security Principles and Practices‖, Pearson, 2nd Edition • Charles P. Pfleeger, Shari Lawrence Pfleeger & Jonathan Margulies, ―Security in Computing‖, PHI 5th Edition • C K Shyamala, N Harini and Dr. T R Padmanabhan: Cryptography and Network Security, Wiley India Pvt.Ltd • Charlie Kaufman, Radia Perlman, and Mike Speciner, Network Security: Private Communication in a Public World, Prentice Hall, ISBN 0-13-046019-2 Textbooks • B. A. Forouzan & D Mukhopadhyay ,Cryptography and Network Security., McGraw Hill, 2nd ed.2010. • Stallings ,Cryptography and Network Security., PHI, 4th ed.2010. • Kahate, Cryptography and Network Security, TMH. Websites • https://www.ques10.com/p/3515/explain-different-targeted-malicious-code • https://binaryterms.com/network-security-model.html • https://www.upguard.com/blog/information-security-policy https://www.idrnd.ai/5-authentication-methods-that-can-prevent-the-next-breach 159
UNIT - 5 DATABASE SECURITY STRUCTURE 5.0 Learning Objectives 5.1 Introduction 5.2 Database Security 5.2.1 Security requirements 5.2.2 Reliability and Integrity 5.2.3 Sensitive data 5.2.4 Inference 5.2.5 Multilevel database 5.2.6 Proposals for Multilevel Security 5.3 Security in network 5.3.1 Threats in network 5.4 Network Security Controls 5.5 Firewall 5.6 Intrusion Detection System 5.7 Secure Email 5.8 Legal Ethical Issues 5.8.1 Protection of data and Information Laws 5.8.2 Employee rights 5.8.3 Software failure 5.8.4 Computer Crime 5.8.5 Privacy 5.8.6 Ethics 5.9 Summary 160
5.10 Keywords 5.11 Learning Activity 5.12 Unit End Questions 5.13 References 5.0 LEARNING OBJECTIVES After studying this unit, you will be able to: • Describe the overview of database security • Identify the role of firewalls • State the need of Intrusion detection systems • Secure emails from intrusion 5.1 INTRODUCTION Data protection means a set of actions taken to safeguard a database or database management software from unauthorized access and hostile cyber threats and assaults. Database security protocols protect not just the data within the database, but also the database management system and all programs that access it, against intrusion, data misuse, and destruction. It's a broad phrase that refers to a variety of processes, technologies, and approaches used to assuredatabase security. An intrusion detection system (IDS) is a device or software program that watches for hostile activity or policy breaches on a network. A security information and event management system is often used to report or gather any malicious activity or violation. The Intrusion Detection System (IDS) is the older of the two systems, and it is used offline, or out-of-band, to identify and log violations, as well as send an alert to an administrator or report the violation to a central repository known as a \"security information and event management (SIEM) system.\" 161
A firewall is a security mechanism, either hardware or software-based, that analyzes all outbound traffic and accepts, rejects, or drops that data depending on a set of security rules. Prior to firewalls, network security was handled through router-based Access Control Lists (ACLs). ACLs are rules that define whether a specific IP address should have network access granted or refused. ACLs, on the other hand, have no way of knowing what kind of packet they're preventing. Furthermore, ACL alone is insufficient to keep dangers out of the network. As a result, the Firewall was created. Organizations can no longer afford to be without Internet access. Accessing the Internet, on the other hand, promotes the business because it allows the outside world to connect with the organization's internal network. This puts the organization in jeopardy. A Firewall is required to protect the internal network from illegal traffic. 5.2 DATABASE SECURITY Database security covers and enforces security on all aspects and components of databases. This includes: • Data stored in database. • Database server. • Database management system (DBMS). • Other database workflow applications. Database security is generally planned, implemented and maintained by a database administrator and or other information security professional. The Need for Database Security: Organizational databases tend to concentrate sensitive information in a single logical system. Examples include: • Corporate financial data • Confidential phone records • Customer and employee information, such as name, Social Security number, bank account information, credit card information 162
• Proprietary product information • Health care information and medical records It is critical for several businesses and other organizations to be able to give this information to customers, partners, and employees. Internal and external hazards of misuse or unauthorized change can, however, attack such information. As a result, database-specific security is becoming an increasingly critical part of an overall corporate security strategy. The following are some of the causes why database security hasn't kept up with the rise in database usage: 1. The complexity of modern database management systems (DBMS) and the safety approaches employed to defend these essential systems are drastically out of sync. A database management system (DBMS) is a huge, sophisticated piece of software that offers numerous options, all of which must be well examined and then sealed to prevent data breaches. Despite advancements in security methods, the growing complexity of the DBMS—with many additional features and services—has resulted in a number of security flaws and the potential for misuse. 2. Databases utilise a sophisticated interaction protocol known as Structured Query Language (SQL), which is significantly more complex than the HTTP Protocol usedto interface with a Web service, for example. Effective database security necessitatesa strategy based on a thorough grasp of SQL's security flaws. 3. The average company does not have full-time database security specialists. As a result, there is a misalignment between objectives and capabilities. The duty of database administrators in most organisations is to administer the database to assure its availability, efficiency, correctness, and simplicity of use. Security expertise may be restricted, and time to master and deploy security techniques may be constrainedfor such administrators. On the other hand, personnel in charge of an organization's security may have just a rudimentary understanding of database and database management system (DBMS) technologies. 163
4. Most enterprise environments consist of a heterogeneous mixture of database platforms (Oracle, IBM DB1 and Informix, Microsoft, Sybase, etc.), enterprise platforms (Oracle E-Business Suite, PeopleSoft, SAP, Siebel, etc.), and OS platforms (UNIX, Linux, z/OS, and Windows, etc.). This creates an additional complexityhurdle for security personnel. An additional recent challenge for organizations is their increasing reliance on cloud technology to host part or all of the corporate database. This adds an additional burden to the security staff. In some cases, an organization can function with a relatively simple collection of files of data. Each file may contain text (e.g., copies of memos and reports) or numerical data (e.g., spreadsheets). A series of records makes up a more complex file. A more complex structure known as a database, on the other hand, is essential for any organization of any size. A database is a collection of structured data that can be accessed by one or more applications. Adatabase also stores the relationships between data pieces and groups of data items, inaddition to the data itself. Consider the following as an example of the difference between data files and databases. A simple personnel file might consist of a set of records, one foreach employee. Each record gives the employee‘s name, address, date of birth, position, salary, and other details needed by the personnel department. As previously stated, a personnel database comprises a personnel file. It could also include a time tracking file, which shows how many hours each person worked each week. These two files are linked together using a database organization so that a payroll program may extract information about time spent and pay for each worker in order to create paychecks. The database is accompanied by a database management system (DBMS), which is a collection of program for building and maintaining the database as well as providing ad hoc querying to numerous users and applications. For users and program, a query language provides a consistent interface to the database. A schematic illustration architecture of a DBMS architecture is shown in Figure 4.1. A data definition language (DDL) is used by database designers and administrators to define the database's logical structure and procedural attributes, which are expressed by a set of database description tables. For application developers, a data manipulation language (DML) provides a rich collection of tools. Query languages are declarative languages designed to support end 164
users. The database management system makes use of the database description tables to manage the physical database. The interface to the database is through a file manager module and a transaction manager module. In addition to the database description table, two other tables support the DBMS. To verify that the user has right to execute the query language expression on the database, the DBMS employs authorization tables. When many commands are run at the same time, the concurrent access table prevents conflicts. Database systems are critical to the running of many companies because they enable quick access to vast amounts of data. Database systems produce security requirements that are beyond the capabilities of traditional OS-based security measures or stand-alone security packages due to their complexity and criticality. Database systems provide efficient access to large volumes of data and are vital to the operation of many organizations. Database systems produce security requirements that are beyond the capabilities of traditional OS-based security measures or stand-alone security packages due to their complexity and criticality. Fig. 5.1 Database Architecture 165
Access to specific records or fields in that file. A DBMS typically does allow this type of more detailed access control to be specified. It also usually enables access controls to be specified over a wider range of commands, such as to select, insert, update, or delete specified items in the database. Thus, security services and mechanisms are needed that are designed specifically for, and integrated with, database systems. Relational Databases A relational database's basic building block is a table of data with rows and columns,analogous to an excel. Each row includes a unique value for each column, whereas each column stores a unique type of data. At least one column in the database should have a unique value for each value, functioning as an identification for each entry. A standard telephone directory, for example, has one item per subscriber, with columns for names, mobile number, and address. Since it is a single two-dimensional (rows and columns) file, such a table is referred to as a flat file. All of the data in a flat file is kept in a single table. There may be several users with the same name in the telephone directory, but the phone numbers should be distinctive, so that the contact number acts as a unique identifier for each row. Two or more people with the same phone number, on the other hand, may be listed in the directory. We may need a distinct column for secondary subscriber, tertiary subscriber, and so on in order to keep all of the data for the phone directory in a single table and offer a unique identification for each row. As a result, there would be a single item in the table for each phone number in use. The disadvantage of having a single table is that some column places for a particular row may be empty (not used). In addition, whenever a new service or kind of data is introduced to the database, more columns must be added, and the database and associated software must be updated and rebuilt. The relational database layout allows for the establishment of several tables that are all linked by a common identifier. Figure 4.2 illustrates how to add additional services and features to the telephony database without having to rebuild the primary table. There is a main table withbasic information for each contact number in this sample. The major key is the mobile number. The database administrator can then define a new table with a column for theprimary key and other columns for other information. 166
The database is accessed by users and applications using a relational query language. Rather than the operational commands of a programming language, the query language employs declarative statements. In essence, the query language lets users to pick data from all entries that meet a specific set of criteria. The software then determines how to retrieve the desired information from one or more tables. A telephone company representative, for example, might pull up a subscriber's billing information, as well as the condition of special services or the most recent payment,all on one screen. Elements of a Relational Database System The basic building component of a relational database is a relation, which is a flat table. Tuples are used to refer to rows, and attributes are used to refer to columns. A primary key is a segment of a row that is used to uniquely identify a row in a table; it consists of one or morecolumn names. Figure 4.2 shows how a single characteristic, Phone Number, is enough to identify a specific a row in a specific table. Fig. 5.2 Sample Relational model To create a relationship between two tables, the attributes that define the primary key in one table must appear as attributes in another table, where they are referred to as a foreign key . 167
Whereas the value of a primary key must be unique for each tuple (row) of its table, a foreign key value can appear multiple times in a table, so that there is a one-to-many relationship between a row in the table with the primary key and rows in the table with the foreign key. Figure 4.3a provides an example. In the Department table, the department ID ( Did ) is the primary key; each value is unique. This table gives the ID, name, and account number for each department. The Employee table contains the name, salary code, employee ID, and phone number of each employee. The Employee table also indicates the department to which each employee is assigned by including Did . Did is identified as a foreign key and provides the relationship between the Employee table and the Department table. Table 4.1: Basic Terminology for relational database Fig. 5.3 Basic Terminology for relational database A view is a virtual table. In essence, a view is the result of a query that returns selected rows and columns from one or more tables. Figure 4.3b is a view that includes the employee name, ID, and phone number from the Employee table and the corresponding department name from the Department table. The linkage is the Did, so that the view table includes data from each row of the Employee table, with additional data from the Department table. It is also possible to construct a view from a single table. For example, one view of the Employee table consists of all rows, with the salary code column deleted. A view can be qualified to include only some rows and/or some columns. For example, a view can be defined consisting of all rows in the Employee table. Views are often used for security purposes. A view can provide restricted access to arelational database so that a user or application only has access to certain rows or columns. 168
Fig. 5.3 Example for relational databaseStructural query language Structured Query Language (SQL), originally developed by IBM in the mid-1970s, is a standardized language that can be used to define schema, manipulate, and query data in a relational database. There are several versions of the ANSI/ISO standard and a variety of different implementations, but all follow the same basic syntax and semantics. For example, the two tables in Figure 4.3a are defined as follows: 169
The basic command for retrieving information is the SELECT statement. Consider this example: SELECT Ename, Eid, Ephone FROM Employee WHERE Did = 15 This query returns the Ename, Eid, and Ephone fields from the Employee table for all employees assigned to department 15. The view in Figure 4.3b is created using the following SQL statement: CREATE VIEW newtable (Dname, Ename, Eid, Ephone) AS SELECT D.Dname E.Ename, E.Eid, E.Ephone FROM Department D Employee E WHERE E.Did = D.Did The preceding are just a few examples of SQL functionality. SQL statements can be used to create tables, insert and delete data in tables, create views, and retrieve data with query statements. Database Access Control Commercial and open-source DBMSs typically provide an access control capability for the database. The DBMS operates on the assumption that the computer system has authenticated each user. As an additional line of defense, the computer system may use the overall access control system to determine whether a user may have access to the database as a whole. A database access control system offers a specialized feature that limits access to areas of the database for users who have been authenticated and authorized database access. Discretionary or role-based access control is available in both commercial and open-source DBMSs. A database management system (DBMS) can typically support a number of administrative policies, such as the ones listed below: • Centralized administration: A small number of privileged users may grant and revoke access rights. • Ownership-based administration: The owner (creator) of a table may grant and 170
revoke access rights to the table. • Decentralized administration: In addition to granting and revoking access rights to a table, the owner of the table may grant and revoke authorization rights to other users, allowing them to grant and revoke access rights to the table. As with any access control system, a database access control system distinguishes different access rights, including create, insert, delete, update, read, and write. Some DBMSs provide considerable control over the granularity of access rights. Access rights can be to the entire database, to individual tables, or to selected rows or columns within a table. Access rights can be determined based on the contents of a table entry. For example, in a personnel database, some users may be limited to seeing salary information only up to a certain maximum value. And a department manager may only be allowed to view salary information for employees in his or her department. SQL-Based Access Definition SQL provides two commands for managing access rights, GRANT and REVOKE. Fordifferent versions of SQL, the syntax is slightly different. In general terms, the GRANT command has the following syntax: GRANT { privileges | role } [ON table] TO { user | role | PUBLIC } [IDENTIFIED BY password] [WITH GRANT OPTION] This command can be used to grant one or more access rights or can be used to assign a user to a role. The command might optionally specify that it only applies to a specific table when it comes to access rights. The TO clause identifies the person or function to whom the privileges are being issued. Any user with the appropriate access rights has a PUBLIC value. The optional IDENTIFIED BY clause specifies a password that must be entered to withdraw this GRANT command's access rights. With or without the grant option, the GRANT OPTION indicates that the recipient can grant this access permission to other users. 171
As a simple example, consider the following statement. GRANT SELECT ON ANY TABLE TO ricflair This statement enables user ricflair to query any table in the database. Different implementations of SQL provide different ranges of access rights. The following is a typical list: • Select: Grantee may read entire database; individual tables; or specific columns in a table. • Insert: Grantee may insert rows in a table; or insert rows with values for specific columns in a table. • Update: Semantics is similar to INSERT. • Delete: Grantee may delete rows from a table. • References: Grantee is allowed to define foreign keys in another table that refer to the specified columns. The REVOKE command has the following syntax: REVOKE { privileges | role } [ON table] FROM { user | role | PUBLIC } Thus, the following statement revokes the access rights of the preceding example: REVOKE SELECT ON ANY TABLE FROM ricflair Cascading Authorizations The grant option enables an access right to cascade through a number of users. We consider a specific access right and illustrate the cascade phenomenon in Figure 4.4 . The figure indicates that Ann grants the access right to Bob at time t 10 and to Chris at time t 20. Assume that the grant option is always used. Thus, Bob is able to grant the access right to David at t = 30. At t 50, Chris grants David the access permission a second time. However, David grants the power to Ellen, who then gives it to Jim, and finally, David gives it to Frank. The revocation of rights cascaded in the same way as the grant option cascaded privileges from one user to another. If Ann denies the access rights of Bob and Chris, Ann also revokes the 172
access rights of David, Ellen, Jim, and Frank. When a user acquires the same access right many times, like in David's instance, a difficulty occurs. Let's say Bob takes away David's privilege. Because Chris acquired the access privilege at t 50, David still has it. David, on the other hand, granted Ellen access after acquiring the right, with granting option, from Bob butbefore gaining it from Chris. When Bob renounces the access right to David, most implementations state that the access right to Ellen and hence Jim is cancelled as well. Fig. 5.4 Bob Revokes privilege rom David This is because at t = 40, when David granted the access right to Ellen, David only had the grant option to do this from Bob. When Bob revokes the right, this causes all subsequent cascaded grants that are traceable solely to Bob via David to be revoked. Because David granted the access right to Frank after David was granted the access right with grant option from Chris, the access right to Frank remains. These effects are shown in the lower portion ofFigure 4.4. To summarize, most implementations follow the following convention. When user A renounces an access privilege, any cascaded access rights are also withdrawn, unless the access right would have existed even if A had never granted it. Role-Based Access Control A role-based access control (RBAC) scheme is a natural fit for database access control. Unlike a file system associated with a single or a few applications, a database system often supports dozens of applications. A one user can utilize a variety of apps to conduct a variety of tasks in 173
such an environment, each of which requires its own set of credentials. Giving users all of the access rights they need for all of the jobs they do would be poor administrative practice. RBAC reduces the administrative cost while also increasing security. We can divide database users into three types in a discretionary access control framework: • Application owner: A person who owns database objects (tables, columns, and rows) that are part of a programme. That is, the application generates or prepares database objects for usage by the application. • End user other than application owner: An end user who uses a specific application to interact with database objects but does not own any of the database objects. • Administrator: A user who has administrative authority over a portion or the entire database. In terms of these three sorts of users, we may make some broad statements regarding RBAC. A program has a variety of activities associated with it, each of which requires different access rights to different parts of the database. One or more responsibilities can be established for each task, each of which specifies the required access privileges. End users can be assigned responsibilities by the application's owner. Administrators are in charge of more sensitive or broad functions, such as controlling physical and logical database elements like data files, customers, and security methods. The system must be configured to grant certain administrators certain permissions. Users can be assigned to administrativeresponsibilities by administrators. A database RBAC facility needs to provide the following capabilities: • Create and delete roles. • Define permissions for a role. • Assign and cancel assignment of users to roles. A good example of the use of roles in database security is the RBAC facility provided by Microsoft SQL Server. SQL Server supports three types of roles: server roles, database roles, and user-defined roles. The first two types of roles are referred to as fixed roles (Table 4.2 ); these are preconfigured for a system with specific access rights. Fixed roles cannot be added, deleted, or modified by the administrator or user; users can only be added and removed as users of a fixed role. 174
Fixed server standards are established at the server level and do not rely on any user database. They're made to make administrative tasks easier. These roles have varying permissions and are designed to allow you to distribute administrative responsibilities without giving up complete control. Database administrators can allocate distinct administration responsibilities to staff and only grant them the rights they require using these predefined server roles. Fixed database roles operate at the level of an individual database. As with fixed server roles, some of the fixed database roles, such as db_accessadmin and db_securityadmin, are designed to assist a DBA with delegating administrative responsibilities. Others, such as db_datareader and db_datawriter, are designed to provide blanket permissions for an enduser. SQL Server allows users to create roles. These user-defined roles can then be assigned access rights to portions of the database. A user with the appropriate authority (usually, a user with the db securityadmin role) can create a new role and assign it access rights. Standard and application roles are the two sorts of user-defined roles. A standard role can be assigned to other users by an authorized user. An application role is associated with an application rather than with a group of users and requires a password. When a software performs the proper code, the role is active. The application role can be used for database access by a user whohas access to the application. Database applications frequently impose their own security based on the logic of the application. You can, for example, use a program role with its own login to allow a single user to only access and alter data during specific hours. As a result, more complex security management can be implemented within the application logic. 175
Fig. 5.5 Fixed Roles in SQL Server Database Encryption The database is typically the most valuable information resource for any organization and is therefore protected by multiple layers of security, including firewalls, authentication mechanisms, general access control systems, and database access control systems. In addition, for particularly sensitive data, database encryption is warranted and often implemented. Encryption becomes the last line of defense in database security. There are two disadvantages to database encryption: • Key management: Authorized users must have access to the decryption key for the data for which they have access. Because a database is typically accessible to a wide range of users and a number of applications, providing secure keys to selected parts of the database to authorized users and applications is a complex task. • Inflexibility: When part or all of the database is encrypted, it becomes more difficult to perform record searching. Encryption can be applied to the entire database, at the record level (encrypt selected records), at the attribute level (encrypt selected columns), or at the level of the individual field. 176
A number of approaches have been taken to database encryption. In this section, we look at a representative approach for a multiuser database. A database management system (DBMS) is a complicated combination of hardware and software. It necessitates a big storage capacity as well as experienced employees to maintain, preserve, upgrade, and secure the system. Outsourcing the DBMS and database to a service provider is an appealing choice for many small and medium-sized businesses. The database is kept off-site by the service provider, who can ensure high availability, disaster recovery, and quick access and updates. The primary worry with such a system is the data's confidentiality. In this case, encrypting the entire database and not providing the encryption/decryption credentials to the service provider is a simple solution to the security challenge. This solution is rigid on its own. The user has limited access to individual data items based on searches or indexing on key parameters; instead, the user must download whole tables from the database, decrypt the tables, and work with the findings. It must be feasible to deal with the information in its encrypted form to provide additional flexibility. An example of such an approach, depicted in Figure 4.5. Four entities are involved: • Data owner: An organization that produces data to be made available for controlled release, either within the organization or to external users. • User: Human entity that presents requests (queries) to the system. The user could be an employee of the organization who is granted access to the database via the server, or a user external to the organization who, after authentication, is granted access. Fig. 5.6 Database Encryption Scheme 177
• Client: Frontend that transforms user queries into queries on the encrypted data stored on the server. • Server: An organization that receives the encrypted data from a data owner and makes them available for distribution to clients. The server could in fact be owned by the data owner but, more typically, is a facility owned and maintained by an external provider. Let's start with the simplest possible configuration based on this circumstance. Assume that each database item is encrypted separately with the same encryption key. The encoded database is stored on the server, but the host does not have access to the key, ensuring that thedata is safe. Even if someone were to break into the server's system, all they'd be able to seeis encrypted data. The encryption key is copied on the client system. The following sequence can be used by a client user to obtain a record from the database: 1. The user issues an SQL query for fields from one or more records with a specific value of the primary key. 2. The query processor at the client encrypts the primary key, modifies the SQL query accordingly, and transmits the query to the server. 3. The server processes the query using the encrypted value of the primary key and returns the appropriate record or records. 4. The query processor decrypts the data and returns the results. For example, consider this query, on the database: SELECT Ename, Eid, Ephone FROM Employee WHERE Did = 15 Assume that the encryption key k is used and that the encrypted value of the department id 15 is E (k, 15) = 1000110111001110. Then the query processor at the client could transform the preceding query into SELECT Ename, Eid, Ephone FROM Employee WHERE Did = 1000110111001110 178
This method is certainly straightforward but, as was mentioned, lacks flexibility. For example, suppose the Employee table contains a salary attribute and the user wishes to retrieve all records for salaries less than $70K. There is no obvious way to do this, because the attribute value for salary in each record is encrypted. The set of encrypted values do not preserve the ordering of values in the original attribute. To provide more flexibility, the following approach is taken. Each record (row) of a table in the database is encrypted as a block. Referring to the abstract model of a relational database, each row Ri is treated as a contiguous block Bi =( xi1 || xi2 ||… || xiM ). Thus, each attribute value in Ri , regardless of whether it is text or numeric, is treated as a sequence of bits, and allof the attribute values for that row are concatenated together to form a single binary block. The entire row is encrypted, expressed as E( k , Bi ) = E( k , ( xi1 || xi2 || … || xiM )). To assistin data retrieval, attribute indexes are associated with each table. For some or all of the attributes an index value is created. For each row Ri of the unencrypted database, themapping is as follows: ( xi1 , xi2 , … , xiM ) -> [E( k , Bi ), Ii1 , Ii2 , … , IiM ] For each row in the original database, there is one row in the encrypted database. The index values are provided to assist in data retrieval. We can proceed as follows. For any attribute, the range of attribute values is divided into a set of non-overlapping partitions that encompassall possible values, and an index value is assigned to each partition. Table 4.6 provides an example of this mapping. Suppose that employee ID (eid ) values lie in the range [1, 1000]. We can divide these values into five partitions: [1, 200], [201, 400], [401, 600], [601, 800], and [801, 1000]; and then assign index values 1, 2, 3, 4, and 5, respectively. For a text field, we can derive an index from the first letter of the attribute value. For the attribute ename , let us assign index 1 to values starting with A or B, index 2 to values starting with C or D, and so on. Similar partitioning schemes can be used for each of the attributes. Table 5.7b shows the resulting table. The values in the first column represent the encrypted values for each row. The actual values depend on the encryption algorithm and the encryption key. The remaining columns show index values for the corresponding attribute values. The mapping functions between attribute values and index values constitute metadata that are stored at the client and data owner locations but not at the server. 179
Encryption scheme for database for Table Encrypted Database Table This arrangement provides for more efficient data retrieval. Suppose, for example, a user requests records for all employees with eid < 300. The query processor requests all records with I( eid ) 2. These are returned by the server. The query processor decrypts all rows returned, discards those that do not match the original query, and returns the requested unencrypted data to the user. The indexing scheme just described does provide a certain amount of information to an attacker, namely a rough relative ordering of rows by a given attribute. To obscure such information, the ordering of indexes can be randomized. 180
For example, the eid values could be partitioned by mapping [1, 200], [201, 400], [401, 600], [601, 800], and [801, 1000] into 2, 3, 5, 1, and 4, respectively. Because the metadata are not stored at the server, an attacker could not gain this information from the server. Other features could be introduced to this scheme in the future. The system could employ the encrypted values of the primary key feature values, or a hash value, to optimize the effectiveness of accessing records using the primary key. The row relating to the primary key value could be fetched individually in either situation. Different keys might be used to encrypt different parts of the database, giving users access to only the parts of the database for which they had the decryption key. This latter technique could be implemented into a system of role- based access control. Security requirements The basic security requirements of database systems are not unlike those of other computing systems we have studied. The basic problems—access control, exclusion of spurious data, authentication of users, and reliability—have appeared in many contexts so far in this book. Following is a list of requirements for database security. • Physical database integrity. A database's data is impervious to physical difficulties such as power outages, and it can be rebuilt if the database is lost due to a disaster. • The integrity of the logical database. The database's structure is preserved. With logical integrity of a database, a modification to the value of one field does not affect other fields, for example. • Element integrity. The data contained in each element are accurate. • Auditability. It is possible to track who or what has accessed (or modified) the elements in the database. • Access control. A user is allowed to access only authorised data, and different users can be restricted to different modes of access (such as read or write). • User authentication. Every user is positively identified, both for the audit trail and for permission to access certain data. • Availability. Users can access the database in general and all the data for which they are authorized. 181
We briefly examine each of these requirements.Integrity of the Database If a database is to serve as a central repository of data, users must be able to trust the accuracy of the data values. This scenario necessitates the database administrator ensuring that only authorized individuals are performing updates. It also means that the data must be safeguarded from corruption, whether caused by a malicious application or a natural disaster such as a fire or a power outage. The integrity of a database can be harmed in two ways: when the database as a whole is damaged (as can happen if the storage medium is broken) or when single data items are illegible. The Database, the operating system, and the (human) computer system manager are all responsible for the database's overall integrity. Databases and DBMSs are files and program, respectively, from the standpoint of the operating system and the computational system management. As a result, frequently backing up all files on the server is one technique to safeguard the database as a whole. These regular backups may be sufficient safeguards against catastrophic failure. It's sometimes critical to be able to rebuild a database in the event of a failure. When the power goes out unexpectedly, for example, a bank's customers may be in the middle of performing a transaction, or students may be in the middle of registering for classes online. We would like to be able to recover the systems to a stable state in these situations without requiring people to redo previously completed transactions. The DBMS must keep a transaction log to address these circumstances. Assume the banking system is set up so that each time an operation is executed, a message is recorded in a log (electronic, paper, or both). In the event of a system failure, the system can obtain accurate account balances by reverting to a backup copy of the database and reprocessing all later transactions from the log. Element Integrity The correctness or accuracy of database items is their integrity. Authorized users are ultimately accountable for entering accurate data into databases. Users and program, on the other hand, make errors when gathering data, calculating results, and inserting values. As a result, DBMSs occasionally take special measures to help capture errors as they occur and to make corrections after they have been inserted. 182
There are three options for corrective action. First, the database management system (DBMS) can do field checks, which are operations that verify for appropriate attributes in a position. A field can have to be numeric, uppercase, or one of a set of characters. The check guarantees that a value is within defined bounds or that it is not bigger than the sum of two other fields' values. Simple errors are prevented by these checks as the data is entered. A second integrity action is provided by access control. To see why, consider life without databases. Data files may contain data from several sources, and redundant data may be stored in several different places. For example, a student's home address may be stored in many different campus files: at class registration, for dining hall privileges, at the bookstore, and in the financial aid office. Indeed, the student may not even be aware that each separate office has the address on file. If the student moves from one residence to another, each of the separate files requires correction. Without a database, there are several risks to the data's integrity. First, at a given time, there could be some data files with the old address (they have not yet been updated) and some simultaneously with the new address (they have already been updated). Second, there is always the possibility that the data fields were changed incorrectly, again leading to files with incorrect information. Third, there may be files about which the student is uninformed, and as a result, he or she is uninformed that the file owner has to be notified about altering the address information. Databases are used to tackle these issues. They allow for the gathering and control of this data from a single location, ensuring that the student and users have the correct address. However, the centralization is easier said than done. Who owns this shared central file? Who has authorization to update which elements? What if two people apply conflicting modifications? What if modifications are applied out of sequence? How are duplicate records detected? What action is taken when duplicates are found? These are policy questions that must be resolved by the database administrator. The third means of providing database integrity is maintaining a change log for the database. A change log lists every change made to the database; it contains both original and modified values. Using this log, a database administrator can undo any changes that were made in error. For example, a library fine might erroneously be posted against Charles W. Robertson, instead of Charles M. Robertson, flagging Charles W. Robertson as ineligible to participate invarsity athletics. Upon discovering this error, the database administrator obtains Charles W.'s original eligibility value from the log and corrects the database. 183
Auditability For some applications it may be desirable to generate an audit record of all access (read or write) to a database. Such a record can help to maintain the database's integrity, or at least to discover after the fact who had affected which values and when. A second advantage, as we'll see later, is that users access protected data progressively; that is, no single access exposes protected data; rather, a series of sequential accesses seen together reveals the data, similar to how clues in a detective book are discovered. In this situation, an audit trail can be used to determine which hints a user has already been given as a criterion for whether or not to provide additional information. In auditing, granularity becomes a hindrance. In operating systems, audited events are actions such as opening a file or calling a procedure; they are rarely as specific as write record 3 or run instruction I. Database audits should include entries at the record, feature, and even item levels to be relevant for ensuring integrity. For most database applications, this detail is prohibitive. Furthermore, it is possible for a record to be accessed but not reported to a user, as when the user performs a select operation. (Accessing a record or an element without transferring to the user the data received is called the pass-through problem.) Also, you can determine the values of some elements without accessing them directly. (For example, you can ask for the average salary in a group of employees when you know the number of employees in the group is only one.) Thus, a log of all records accessed directly may both overstate and understate what a user actually knows. Access Control User access credentials are frequently used to divide databases logically. For example, all users may have access to a wide range data, but only the people department and the marketing department have access to income and sales data. Databases are beneficial because they centralize data storage and management. This centralization's obligation and benefit are both limited access. At the perspective, relationship, field, entry, or even element level, the database administrator decides who should have access to which data. This policy must be enforced by the DBMS, which must provide access to all specified data or deny access where it is banned. Furthermore, there may be numerous modalities of access. 184
A user or program may have the ability to read, update, delete, or add to a value, as well as edit or remove entire fields or records, and reorganize the database as a whole. Access control for databases appears to be similar to access control for computer systems or any other element of a computing system on the surface. The database problem, on the other hand, is more sophisticated, as we'll discover throughout this chapter. Objects in the operating system, such as files, are unconnected, whereas records, attributes, and elements are. While a person cannot understand the details of one file by reading the contents of others,a user may be able to identify one data element by reading the contents of others. Inference is the challenge of collecting data values from others, and we'll go over it in detail later in this chapter. It's worth noting that data can be accessed via inference without requiring immediate access to the secure object. In order to prevent inferences, restricting inference may include blocking particular paths. Restriction of control inference access, on the other hand, limits inquiries from individuals who do not want unlawful access to values. Furthermore, attempting to check requested entries for possible improper inferences may decrease the DBMS's performance. Finally, the size or granularity of operating system and database objects differs. An access control list for a repository with several hundred files with perhaps a hundred fields each is significantly simple to execute than an access control list for several hundred files with perhaps a hundred fields each. Processing efficiency is influenced by size. User Authentication The database management system (DBMS) may demand strict user authentication. A DBMS, for example, may require a user to pass both a specified login and time-of-day checks. This authentication is in addition to the operating system's authentication. On top of an operating system, the DBMS usually runs as a software application. Because there is no trusted link from the Database to the operating system, all data it gets, including user authentication, mustbe regarded with suspicion. As a result, the DBMS is compelled to perform its own authentication. Availability A DBMS has aspects of both a program and a system. It is a program that uses other hardware and software resources, yet to many users it is the only application run. 185
Users frequently take the database management system for granted, treating it as a necessary tool for completing specific tasks. Users are acutely aware of a DBMS's lack of availability when the system is busy servicing other users or is down for repair or upgrading. For instance, two users may seek the same record, and the DBMS must decide; one user will almost certainly be refused access for a period of time. Alternatively, the DBMS may withhold vulnerable data in order to prevent the disclosure of protected data, making the seeking user dissatisfied. Problems like these necessitate a DBMS with a high level of availability. Integrity/Confidentiality/Availability The three aspects of computer security integrity, confidentiality, and availability clearly relate to database management systems. As we have described, integrity applies to the individual elements of a database as well as to the database as a whole. Thus, integrity is a major concern in the design of database management systems. Due to the obvious inference problem, where a user might indirectly access sensitive data, confidentiality is a major concern with databases. Finally, due to the obvious shared access motivation that drives database development, availability is critical. Availability, on the other hand, is incompatible with confidentiality. The last sections of the chapter address availability in an environment in which confidentiality is also important. Reliability and Integrity Databases amalgamate data from many sources, and users expect a DBMS to provide access to the data in a reliable way. Whenever software engineers claim that a piece of software is reliable, they're referring to the fact that it can run for lengthy periods of time without crashing. Because the data is frequently critical to business or organizational objectives, usersneed a DBMS to be dependable. Furthermore, users entrust their information to a database management system (DBMS) and expect it to secure it from loss or harm. Concerns about trustworthiness are common security concerns, but they are especially noticeable when dealing with databases. A database management system (DBMS) protects against damage or loss in numerous methods, which we will look at in this section. 186
The controls we evaluate, however, are not absolute: There is no way to prevent an authorized user from inputting an acceptable but wrong value by accident. Concerns with database dependability and security can be regarded in three ways: • Database integrity: concern that the database as a whole is protected against damage, as from the failure of a disk drive or the corruption of the master database index. These concerns are addressed by operating system integrity controls and recovery procedures. • Element integrity: concern that the value of a specific data element is written or changed only by authorized users. Proper access controls protect a database from corruption by unauthorized users. • Element accuracy: concern that only correct values are written into the elements of a database. Checks on the values of elements can help prevent insertion of improper values. Also, constraint conditions can detect incorrect values. Protection Features from the Operating System A responsible system administrator backs up the files of a database periodically along with other user files. During normal execution, the operating system‘s standard access control facilities protect the files against outside access. Finally, the operating system performscertain integrity checks for all data as a part of normal read and write operations for I/O devices. These controls provide basic security for databases, but the database manager must enhance them. Two-Phase Update A serious problem for a database manager is the failure of the computing system in the middle of data modification. If the data item to be modified was a long field or a record consisting of several attributes, only some of the new data might have been written to permanent storage. Therefore, the database file would contain incorrect data that had not been updated. Even if such problems were easily detectable (which they aren't), a more subtle issue arises when multiple fields are modified and no one field seems to be incorrect. A two- phase update is used to solve this problem, as proposed by Lampson and Sturgis and incorporated by most DBMSs. 187
Update Technique The DBMS collects the resources necessary to conduct the update during the first step, known as the intent phase. It may collect data, generate dummy records, access files, lock outother customers, and compute final answers; in other words, it does everything designed to complete for the update but does not make any modifications. Since it takes no lasting action,the initial phase can be repeated an infinite number of times. There is no harm done if the system fails during the first phase because all of these procedures may be redone and performed once the system resumes processing. Committing is the last action of the first phase, and it entails writing a commit flag to the repository. The commit flag indicates that the DBMS has reached the end of its life cycle: The DBMS starts making permanent changes when you commit. The permanent alterations are made in the second phase. No actions taken first before commit can be repeated in the second phase, but the update activities in phase two can be repeated as many times as needed. If the system breaks down during the second stage, the database may include incomplete data; nevertheless, the system can restore this data by completing all second-phase tasks. The database is now complete once the second stage is complete. Two-Phase Update Example Consider a database that contains the inventory of a company‘s office supplies. The industry's central stockroom holds products like paper, pencils, and paper clips, and various departments requisition them as needed. To get the greatest pricing, the organization buys in bulk. Each sector has a cost for office supplies, and the cost of goods is collected from the department through a charge mechanism. In addition, the central stockroom keeps track of available supplies so that new ones may be ordered when the old ones run out. Let's say the procedure starts with an order for 50 boxes of paper clips from the accounting department. Assume there are 107 packages in stock, and that if the number in stock falls below 100, a new order is made. Following are the actions that must be taken after the stockroom gets the demand. 1. The stockroom checks the database to determine that 50 boxes of paper clips are on hand. If not, the requisition is rejected and the transaction is finished. 2. If enough paper clips are in stock, the stockroom deducts 50 from the inventory figure 188
in the database (107 – 50 = 57). 3. The stockroom charges accounting‘s supplies budget (also in the database) for 50 boxes of paper clips. 4. The stockroom checks its remaining quantity on hand (57) to determine whether the remaining quantity is below the reorder point. Because it is, a notice to order more paper clips is generated, and the item is flagged as ―on order‖ in the database. 5. A delivery order is prepared, enabling 50 boxes of paper clips to be sent to accounting. All five of these steps must be completed in the order listed for the database to be accurate and for the transaction to be processed correctly. Assume a failure occurs during the processing of these steps. If step 1 fails before it is completed, there is no harm done because the transaction history can be restarted. During stages 2, 3, and 4, however, changes to database elements are made. If a failure happens, the database values will be inconsistent. Worse, the transaction can't be completed because a requisition will be subtracted twice, a department will be charged twice, and two delivery orders will be generated. When a two-phase commit is used, shadow values are maintained for key data points. During the intent stage, a shadow information value is computed and saved locally, then copied to the data store during the commit phase. For a two-phase commit, the actions on the data would be as follows. Intent: 1. Check the value of COMMIT-FLAG in the database. If it is set, this phase cannot be performed. Halt or loop, checking COMMIT-FLAG until it is not set. 2. Compare number of boxes of paper clips on hand to number requisitioned; if more are requisitioned than are on hand, halt. 3. Compute TCLIPS = ONHAND – REQUISITION. 4. Obtain BUDGET, the current supplies budget remaining for accounting department. Compute TBUDGET = BUDGET – COST, where COST is the cost of50 boxes of clips. 5. Check whether TCLIPS is below reorder point; if so, set TREORDER = TRUE; else 189
set TREORDER = FALSE. Commit: 1. Set COMMIT-FLAG in database. 2. Copy TCLIPS to CLIPS in database. 3. Copy TBUDGET to BUDGET in database. 4. Copy TREORDER to REORDER in database. 5. Prepare notice to deliver paper clips to accounting department. Indicate transaction completed in log. 6. Unset COMMIT-FLAG. With this example, each step of the intent phase depends only on unmodified values from the database and the previous results of the intent phase. Each variable beginning with T is a shadow variable used only in this transaction. The steps of the intent phase can be repeated an unlimited number of times without affecting the integrity of the database. Once the DBMS begins the commit phase, it writes a COMMIT flag. When this flag is set, the DBMS will not perform any steps of the intent phase. Intent steps cannot be performed after committing because database values are modified in the commit phase. Notice, however, that the steps of the commit phase can be repeated an unlimited number of times, again with no negative effect on the correctness of the values in the database. The one remaining flaw in this logic occurs if the system fails after writing the ―transaction complete‖ message in the log but before clearing the commit flag in the database. It is a simple matter to work backward through the transaction log to find completed transactions for which the commit flag is still set and to clear those flags. Sensitive data Some databases contain what is called sensitive data. As a working definition, let us say that sensitive data are data that should not be made public. Identifying whether data items and attributes are delicate is dependent on the database in question as well as the data's underlying meaning. Obviously, some databases, such as a public library catalog, contain no sensitive data; other databases, such as defense-related ones, are wholly sensitive. 190
These two scenarios—nothing sensitive and everyone sensitive—are the most straightforward to address since they can be handled by database-wide access limits. Someone is either an authorized user or not. The operating system can provide these controls. The more challenging challenge, which is also the most interesting, is when some but not all of the database's components are sensitive. There could be different levels of sensitivity. For example, a university database might contain student data consisting of name, financial aid, dorm, drug use, sex, parking fines, and race. An example of this database is shown in Table 7- 6. Name and dorm are probably the least sensitive; financial aid, parking fines, and druguse the most; sex and race somewhere in between. That is, many people may have legitimate access to name, some to sex and race, and relatively few to financial aid, parking fines, or drug use. Indeed, knowledge of the existence of some fields, such as drug use, may itself be sensitive. Thus, security concerns not only the data elements but their context and meaning. Fig. 5.7 Example Database Furthermore, we must account for different degrees of sensitivity. Financial assistance, parking fines, and drugs use, for example, may not have the same access limitations as the other areas, despite the fact that they are all extremely sensitive. Our security standards may dictate that just a few persons have access to each column, but no one has access to all three. The access control problem's challenge is to restrict users' access so that they can only obtain data to which they have authorized access. Alternatively, the problem of access control demands us to guarantee that sensitive data is not shared with unauthorized persons. Several factors can make data sensitive. • Inherently sensitive. The value itself may be so revealing that it is sensitive. Examples are the locations of defensive missiles or the median income of barbers in a town with only one barber. • From a sensitive source. The source of the data may indicate a need for confidentiality. An example is information from an informer whose identity would 191
be compromised if the information were disclosed. • Declared sensitive. The database administrator or the owner of the data may have declared the data to be sensitive. Examples are classified military data or the name of the anonymous donor of a piece of art. • Part of a sensitive attribute or record. In a database, an entire attribute or record may be classified as sensitive. Examples are the salary attribute of a personnel database ora record describing a secret space mission. • Sensitive in relation to previously disclosed information. Some data become sensitive in the presence of other data. For example, the longitude coordinate of a secret gold mine reveals little, but the longitude coordinate in conjunction with the latitude coordinate pinpoints the mine. All of these factors must be considered when the sensitivity of the data is being determined. Types of Disclosures We all know that some data are sensitive. However, sometimes even characteristics of the data are sensitive. In this section, we see that even descriptive information about data (such astheir existence or whether they have an element that is nonzero) is a form of disclosure. Exact Data The most dangerous breach is when a sensitive data item's exact value is revealed. The user may be aware that sensitive data is being sought, or the user may be unaware that some of the data is sensitive. A defective database management may unintentionally distribute sensitive data to users who have not requested it. The outcome is the same in all of these cases: The safety of sensitive information has been jeopardized. Bounds Another exposure is disclosing bounds on a sensitive value, that is, indicating that a sensitive value, y, is between two values, L and H. Sometimes, by using a narrowing technique not unlike the binary search, the user may first determine that L ≤ y ≤ H and then see whether L ≤y ≤ H/2, and so forth, thereby permitting the user to determine y to any desired precision. In another example, simply exposing that a value exceeds a specific threshold, like the athletic scholarship fund or the amount of CIA agents, could constitute a significant security violation. Bounds, on the other hand, can be a beneficial approach to show sensitive data in some cases. 192
It's usual to publish data upper and lower bounds without identifying the individual records. For example, a company may announce that its salaries for programmers range from $50,000 to $82,000. If you are a programmer earning $79,700, you would suppose you are fairly well off, so you have the information you want; however, the announcement does not disclose who are the highest- and lowest-paid programmers. Negative Result Sometimes we can word a query to determine a negative result. That is, we can learn that z is not the value of y. Knowing that a person's total number of convicted felons is not zero indicates that the person has been convicted of a felony. The difference between 1 and 2 crimes, or 46 and 47 felonies, is not as subtle as the difference between 0 and 1. As a result, revealing that a value is not zero can be a big deal. Similarly, if a student is not on the honors list, you can assume that his or her grade point average is less than 3.50. However, so becauserange of grade point average (gpa from 0.0 to 3.49 is so vast, this information isn't veryuseful. Existence In some cases, the existence of data is itself a sensitive piece of data, regardless of the actual value. A company, for example, may not want workers to learn that their phone calls are being recorded. In this scenario, finding a column in a personnel file called NUMBER OF INDIVIDUAL TELEPHONE CALLS would reveal sensitive information. Probable Value Finally, it may be possible to determine the probability that a certain element has a certain value. To see how, suppose you want to find out whether the president of the United States is registered in the Tory party. Knowing that the president is in the database, you submit two queries to the database: A database manager can control access by direct queries; disclosure can occur in more subtle ways that are harder to control. From these queries you conclude there is a 25 percent likelihood that the president is a registered Tory. 193
Inference Inference is a way to infer or derive sensitive data from nonsensitive data. The inference problem is a subtle vulnerability in database security. The database in Table 4.5 illustrates the inference problem; this database has the same form as the one introduced in Table 4.6, but we have added more data to make some points related to multiple data items. Recall that AID is the amount of financial aid a student is receiving. FINES is the amount of parking fines still owed. DRUGS is the result of a drug-use survey: 0 means never used and 3 means frequent user. Obviously this information should be kept confidential. We assume that AID, FINES, and DRUGS are sensitive fields, although only when the values are related to a specific individual. In this section, we look at ways to determine sensitive data values from the database. Fig. 5.8 Database to illustrate inferences Direct Attack In a direct attack, a user tries to determine values of sensitive fields by seeking them directly with queries that yield few records. The most successful technique is to form a query so specific 194
that it matches exactly one data item. In Table, a sensitive query might be This query discloses that for record ADAMS, DRUGS=1. However, because it selects users for whom DRUGS=1, the attack is clear, and the DBMS may reject the query since it picks data for a particular value of the sensitive field DRUGS. A less obvious query is On the surface, this query looks as if it should conceal drug usage by selecting other non- drug- related records as well. This query, however, returns only one result, showing a name that matches to the DRUG value. The DBMS must be aware that Gender has only 2 choicesin order for the second clause to return no results. Even if it were possible, the DBMS would still need to know that no entries with DORM=AYRES exist, even though AYRES is an allowed DORM value. Inference by Arithmetic Another procedure, used by the U.S. Census Bureau and other organizations that gather sensitive data, is to release only statistics. The organizations suppress individual names, addresses, or other characteristics by which a single individual can be recognized. Only neutral statistics, such as count, sum, and mean, are released. The indirect attack seeks to infer a final result based on one or more intermediate statistical results. But this approach requires work outside the database itself. A statistical assault, in particular, aims to deduce individual data using a seemingly anonymous statistical measure. In the sections that follow, we'll look at a few instances of indirect attacks against databases that keep track of statistics. Sum Total is a type of attack that attempts to deduce a number from a stated sum. With the data base in Table 4.6, it could appear safe to report total student aid by sex and dorm, for example. Table 195
4.7 depicts such a report. This seemingly innocuous data discloses that no female resident of Grey receives financial assistance. As a result, any female resident of Grey (such as Liu) is very probably not receiving financial assistance. This approach often allows us to determine a negative result. Fig. 5.9 Table showing negative result Count The count can be combined with the sum to produce some even more revealing results. These two metrics are frequently released for a database in order for users to calculate average values. (On the other hand, if number and mean are released, the sum can be calculated.) The count of data for students by dorm and gender is shown in Table 4.7. By itself, this table is harmless. Combined with the sum table, however, this table demonstrates that the two males in Holmes and West are receiving financial aid in the amount of $5000 and $4000, respectively. We can obtain the names by selecting the subschema of NAME, DORM, which is not sensitive because it delivers only low-security data on the entire database. Fig. 5.10 Inference from count and sum 196
Mean The arithmetic mean (average) allows exact disclosure if the attacker can manipulate the subject population. As a trivial example, consider salary. Given the number of employees, themean salary for a company and the mean salary of all employees except the president, it is easy to compute the president‘s salary. Median By a slightly more complicated process, we can determine an individual value from the median, the midpoint of an ordered list of values. The attack requires finding selections having one point of intersection that happens to be exactly in the middle, as shown in Figure 4.7. Fig. 5.11 Intersecting Medians Fig. 5.12 Drug use and Aid results 197
For example, in our sample database, there are five males and three persons whose drug use value is 2. Arranged in order of aid, these lists are shown in Table 4.8. Notice that Majors is the only name common to both lists, and conveniently that name is in the middle of each list. Majors is a white dude with a drug-use score of 2. Somebody operating at the Health Clinic could be able to locate out. Majors is the junction of these two lists, and Majors' financial aid is $2000, according to this information. The queries in this example indicate the exact amount of financial help for Majors. Tracker Attacks As previously stated, database management systems may hide data when a small number of items account for a big amount of the data exposed. By employing additional queries that create minor results, a tracker attack can trick the database management into finding the needed data. For two different searches, the monitor adds extra records to be retrieved; the two sets of records cancel each other out, leaving only the statistic or data desired. The approach is to use intelligent padding of two queries. In other words, instead of trying to identify a unique value, we request n–1 other values (where there are n values in thedatabase). Given n and n–1, we can easily compute the desired single element. For instance, suppose we want to know how many female Caucasians live in Holmes Hall. A query posed might be The database management system might consult the database, find that the answer is 1, and block the answer to that query because one record dominates the result of the query. However, further analysis of the query allows us to track sensitive data through nonsensitive queries. The query 198
Because count(a) = 6 and count(a ������ ¬(b ������ c)) = 5, we can determine the suppressed value easily: 6 – 5 = 1. Furthermore, neither 6 nor 5 is a sensitive count. Linear System Vulnerability A tracker is a specific case of a more general vulnerability. With a little logic, algebra and luck in the distribution of the database contents, it may be possible to construct an algebraic linear system of equations that returns results relating to several different sets. For example, the following system of five queries does not overtly reveal any single c value from the database. However, the queries‘ equations can be solved for each of the unknown c values, revealing them all. To see how, use basic algebra to note that q1 – q2 = c3 + c5 , and q3 – q4 = c3 – c5 . Then, subtracting these two equations, we obtain c5 = ((q1 – q2 ) – (q3 – q4 ))/2. Once we know c5 , we can derive the others. In fact, this attack can also be used to obtain results other than numerical ones. Recall that we can apply logical rules to and (������) and or (∨), typical operators for database queries, to derive values from a series of logical expressions. For example, each expression might represent a query asking for precise data instead of counts, such as the equation q1 = s1∨ s2 s3 ∨s4 ∨s5 Inference is difficult to control because it can occur from algebraic calculations beyond the scope of database management systems. 199
The result of the query is a set of records. Using logic and set algebra in a manner similar to our numerical example, we can carefully determine the actual values for each of the si . Multilevel database A multilevel database system (MDBMS) supports the application of a multilevel policy for regulating access to the database objects. Information Granularity When designing a multilevel security model for a database, the first problem is to define the information granularity. The finer the information granularity is, the better the expressive power of the model is. For example, a model for relational databases where only tables or rows can be classified would be a model with a poor expressive power. Now, a model where attribute values (intersection between a row and a column) can be classified would be amodel with a great expressive power. In fact, defining the information granularity depends on the application needs. In order to illustrate our point, let us consider the Wing relation. Primary key of this relation is {Name}. Fig. 5.13 Wing table This relation represents an imaginary French American wing ready for attack! The military security administrator of this database knows the various sensitivities of the data contained in this table: • The existence of the wing is unclassified. • The existence of each plane except the Firefox is unclassified. • The existence of Firefox is confidential. 200
Search
Read the Text Version
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 37
- 38
- 39
- 40
- 41
- 42
- 43
- 44
- 45
- 46
- 47
- 48
- 49
- 50
- 51
- 52
- 53
- 54
- 55
- 56
- 57
- 58
- 59
- 60
- 61
- 62
- 63
- 64
- 65
- 66
- 67
- 68
- 69
- 70
- 71
- 72
- 73
- 74
- 75
- 76
- 77
- 78
- 79
- 80
- 81
- 82
- 83
- 84
- 85
- 86
- 87
- 88
- 89
- 90
- 91
- 92
- 93
- 94
- 95
- 96
- 97
- 98
- 99
- 100
- 101
- 102
- 103
- 104
- 105
- 106
- 107
- 108
- 109
- 110
- 111
- 112
- 113
- 114
- 115
- 116
- 117
- 118
- 119
- 120
- 121
- 122
- 123
- 124
- 125
- 126
- 127
- 128
- 129
- 130
- 131
- 132
- 133
- 134
- 135
- 136
- 137
- 138
- 139
- 140
- 141
- 142
- 143
- 144
- 145
- 146
- 147
- 148
- 149
- 150
- 151
- 152
- 153
- 154
- 155
- 156
- 157
- 158
- 159
- 160
- 161
- 162
- 163
- 164
- 165
- 166
- 167
- 168
- 169
- 170
- 171
- 172
- 173
- 174
- 175
- 176
- 177
- 178
- 179
- 180
- 181
- 182
- 183
- 184
- 185
- 186
- 187
- 188
- 189
- 190
- 191
- 192
- 193
- 194
- 195
- 196
- 197
- 198
- 199
- 200
- 201
- 202
- 203
- 204
- 205
- 206
- 207
- 208
- 209
- 210
- 211
- 212
- 213
- 214
- 215
- 216
- 217
- 218
- 219
- 220
- 221
- 222
- 223
- 224
- 225
- 226
- 227
- 228
- 229
- 230
- 231
- 232
- 233
- 234
- 235
- 236
- 237
- 238
- 239
- 240
- 241
- 242
- 243
- 244
- 245
- 246
- 247
- 248
- 249
- 250