
Shobit-MCA Sem II- Network Security and Cryptography (1)

Published by Teamlease Edtech Ltd (Amita Chitroda), 2023-05-18 05:49:20


2. Cipher Block Chaining Mode

To overcome the limitation of ECB, namely that a repeating block in the plain text produces the same ciphertext, a new technique was required: Cipher Block Chaining (CBC) mode. CBC ensures that even if the plain text has repeating blocks, its encryption will not produce the same cipher block twice. To achieve entirely different cipher blocks for two identical plain text blocks, chaining is added to the block cipher. For this, the result obtained from the encryption of the first plain text block is fed into the encryption of the next plain text block. In this way, each ciphertext block depends on its corresponding current plain text block and on all the previous plain text blocks. During the encryption of the first plain text block, no previous plain text block is available, so a random block of text called the initialization vector (IV) is generated.

The encryption steps of CBC are as follows:

Step 1: The initialization vector and the first plain text block are XORed, and the result of the XOR is then encrypted using the key to obtain the first ciphertext block.

Step 2: The first ciphertext block is fed into the encryption of the second plain text block. For the encryption of the second plain text block, the first ciphertext block and the second plain text block are XORed, and the result of the XOR is encrypted using the same key as in Step 1 to obtain the second ciphertext block. Similarly, the result of encrypting the second plain text block, i.e. the second ciphertext block, is fed into the encryption of the third plain text block to obtain the third ciphertext block, and the process continues to obtain all the ciphertext blocks. You can see the steps of CBC in the figure below.

Decryption steps of CBC:

Step 1: The first ciphertext block is decrypted using the same key that was used for encrypting all the plain text blocks. The result of the decryption is then XORed with the initialization vector (IV) to obtain the first plain text block.

Step 2: The second ciphertext block is decrypted, and the result of the decryption is XORed with the first ciphertext block to obtain the second plain text block. The process continues until all the plain text blocks are retrieved.
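The CBC steps above can be sketched in a few lines of Python. The sketch stands in for a real block cipher with a simple byte-wise XOR (`toy_cipher` is a placeholder of our own, not a real or secure cipher); with a real cipher such as AES, only `toy_cipher` would change.

```python
BLOCK = 8  # block size in bytes; inputs are assumed a multiple of BLOCK

def toy_cipher(key, block):
    # Placeholder for a real block cipher E_K. Byte-wise XOR is its own
    # inverse, so the same function also serves as D_K in this sketch.
    return bytes(b ^ k for b, k in zip(block, key))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key, iv, plaintext):
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # Steps 1 and 2: XOR the plain text block with the previous
        # ciphertext block (the IV for the first block), then encrypt.
        c = toy_cipher(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(c)
        prev = c
    return b"".join(out)

def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        c = ciphertext[i:i + BLOCK]
        # Decrypt, then XOR with the previous ciphertext block (IV first).
        out.append(xor(toy_cipher(key, c), prev))
        prev = c
    return b"".join(out)
```

Because of the chaining, two identical plain text blocks produce different ciphertext blocks, which is exactly the property CBC adds over ECB.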

CBC has the limitation that if we have two identical messages and use the same IV for both, it generates the same ciphertext blocks.

3. Cipher Feedback Mode

Not all applications are designed to operate on blocks of data; some may be character- or bit-oriented. Cipher feedback (CFB) mode is used to operate on units smaller than blocks. The encryption steps in cipher feedback mode are:

Step 1: Here also we use an initialization vector. The IV is kept in a shift register, and it is encrypted using the key.

Step 2: The leftmost s bits of the encrypted IV are then XORed with the first fragment of the plain text of s bits. This produces the first ciphertext C1 of s bits.

Step 3: The shift register containing the initialization vector is left-shifted by s bits, and the s bits of C1 replace the rightmost s bits of the initialization vector.

Then the encryption is performed again on the shift register contents, and the leftmost s bits of the encrypted value are XORed with the second fragment of plain text to obtain the s-bit ciphertext C2.

Encryption

The process continues to obtain all ciphertext fragments.

Decryption steps:

Step 1: The initialization vector is placed in the shift register. It is encrypted using the same key. Note that even in the decryption process the encryption algorithm is used instead of the decryption algorithm. Then the leftmost s bits of the encrypted IV are XORed with the s-bit ciphertext C1 to retrieve the s-bit plain text P1.

Step 2: The IV in the shift register is left-shifted by s bits, and the s bits of C1 replace the rightmost s bits of the IV. The process continues until all plain text fragments are retrieved.
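The CFB procedure can be sketched with s = 8 (one-byte fragments), again using a byte-wise XOR as a stand-in for the block cipher's encryption function (a placeholder, not a real cipher). Note that decryption also calls the encryption function, as described above, and that it is the ciphertext byte that is fed back into the shift register in both directions.

```python
BLOCK = 8  # shift-register size in bytes

def toy_cipher(key, block):
    # Placeholder for the block cipher's encryption function E_K.
    return bytes(b ^ k for b, k in zip(block, key))

def cfb_process(key, iv, data, decrypt=False):
    reg, out = iv, bytearray()
    for byte in data:
        # Leftmost s bits (here: the first byte) of the encrypted register.
        keystream = toy_cipher(key, reg)[0]
        result = byte ^ keystream
        out.append(result)
        # Shift left by s bits and feed the *ciphertext* byte back in:
        # when decrypting, the input byte is already ciphertext.
        fed_back = byte if decrypt else result
        reg = reg[1:] + bytes([fed_back])
    return bytes(out)
```

A round trip simply calls the same function twice, once with `decrypt=True`.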

Decryption

CFB has the limitation that if a bit error occurs in any ciphertext fragment Ci, it affects all the subsequent ciphertext units, as Ci is fed into the encryption of the next Pi+1 to obtain Ci+1. In this way, the bit error propagates.

4. Output Feedback Mode

The output feedback (OFB) mode is very similar to CFB. The difference between CFB and OFB is that in OFB it is the encrypted IV itself, rather than the ciphertext, that is fed into the encryption of the next plain text block. The other difference is that CFB operates on a stream of bits, whereas OFB operates on blocks of bits.

Steps for encryption:

Step 1: The initialization vector is encrypted using the key.

Step 2: The encrypted IV is then XORed with the plain text block to obtain the ciphertext block.

The encrypted IV is fed into the encryption step for the next plain text block, as you can see in the image below.

Encryption

Steps for decryption:

Step 1: The initialization vector is encrypted using the same key used for encrypting all the plain text blocks. Note: In the decryption process also, the encryption function is used.

Step 2: The encrypted IV is then XORed with the ciphertext block to retrieve the plain text block. The encrypted IV is also fed into the decryption step for the next ciphertext block. The process continues until all the plain text blocks are retrieved.
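Since OFB just XORs the data with a keystream derived only from the key and IV, one function serves for both encryption and decryption. The sketch below uses a byte-wise XOR placeholder for the block cipher, not a real cipher.

```python
BLOCK = 8

def toy_cipher(key, block):
    # Placeholder for the block cipher's encryption function E_K.
    return bytes(b ^ k for b, k in zip(block, key))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def ofb_process(key, iv, data):
    # The same function both encrypts and decrypts: XORing with the
    # same keystream twice restores the original data.
    out, feedback = [], iv
    for i in range(0, len(data), BLOCK):
        # The encrypted value itself is fed forward (not the ciphertext).
        feedback = toy_cipher(key, feedback)
        out.append(xor(data[i:i + BLOCK], feedback))
    return b"".join(out)
```

Because the keystream never depends on the ciphertext, a bit error in one ciphertext block affects only the corresponding plain text block, which is the advantage noted below.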

Decryption

The advantage of OFB is that it prevents the propagation of bit errors: if a bit error occurs in C1, it will only affect the retrieval of P1 and will not affect the retrieval of the subsequent plain text blocks.

5. Counter Mode

Counter (CTR) mode is similar to OFB, but there is no feedback mechanism. Nothing is fed from the previous step to the next step; instead it uses a sequence of numbers, termed a counter, which is input to the encryption function along with the key. After a plain text block is encrypted, the counter value is incremented by 1.

Steps of encryption:

Step 1: The counter value is encrypted using the key.

Step 2: The encrypted counter value is XORed with the plain text block to obtain a ciphertext block.

To encrypt the next plain text block, the counter value is incremented by 1 and steps 1 and 2 are repeated to obtain the corresponding ciphertext.

Encryption

The process continues until all plain text blocks are encrypted.

Steps for decryption:

Step 1: The counter value is encrypted using the key. Note: The encryption function is used in the decryption process, and the same counter values are used for decryption as were used during encryption.

Step 2: The encrypted counter value is XORed with the ciphertext block to obtain a plain text block.

Decryption
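The counter-mode steps can be sketched as follows. As in the earlier sketches, a byte-wise XOR stands in for the block cipher's encryption function; it is a placeholder, not a real cipher. Encryption and decryption are the same operation with the same starting counter.

```python
BLOCK = 8

def toy_cipher(key, block):
    # Placeholder for the block cipher's encryption function E_K.
    return bytes(b ^ k for b, k in zip(block, key))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_process(key, initial_counter, data):
    # Encryption and decryption are identical: each block is XORed
    # with the encryption of the current counter value.
    out, counter = [], initial_counter
    for i in range(0, len(data), BLOCK):
        keystream = toy_cipher(key, counter.to_bytes(BLOCK, "big"))
        out.append(xor(data[i:i + BLOCK], keystream))
        counter = (counter + 1) % (1 << (8 * BLOCK))  # increment by 1
    return b"".join(out)
```

Since the counter for block i is known in advance, blocks can be processed in any order or in parallel, which is a practical advantage of CTR over the chained modes.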

To decrypt the next ciphertext block, the counter value is incremented by 1 and steps 1 and 2 are repeated to obtain the corresponding plain text. The process continues until all ciphertext blocks are decrypted. This concludes our discussion of block ciphers, their design principles, and their modes of operation.

5.4 DES ROUNDS

The Data Encryption Standard (DES) is a symmetric-key block cipher published by the National Institute of Standards and Technology (NIST). DES is an implementation of a Feistel cipher. It uses a 16-round Feistel structure. The block size is 64 bits. Though the key length is 64 bits, DES has an effective key length of 56 bits, since 8 of the 64 bits of the key are not used by the encryption algorithm (they function as check bits only). The general structure of DES is depicted in the following illustration −

Fig 3.3 General structure of DES

Since DES is based on the Feistel cipher, all that is required to specify DES is −

• Round function
• Key schedule
• Any additional processing − initial and final permutation

Initial and Final Permutation

The initial and final permutations are straight permutation boxes (P-boxes) that are inverses of each other. They have no cryptographic significance in DES. The initial and final permutations are shown as follows −

Fig. 3.4 Initial and Final Permutation

Round Function

The heart of this cipher is the DES function, f. The DES function applies a 48-bit key to the rightmost 32 bits to produce a 32-bit output.
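The Feistel structure that DES instantiates can be sketched generically: in each round the halves swap roles and the round function's output is XORed into one half. Decryption is the same process with the round keys applied in reverse order, which is why only the round function, the key schedule, and the permutations need to be specified. The round function `f` below is an arbitrary stand-in for illustration, not the DES f.

```python
MASK32 = 0xFFFFFFFF

def f(right, key):
    # Arbitrary stand-in round function; any function of (R, K) works,
    # because the Feistel structure never needs to invert f.
    return ((right ^ key) * 2654435761) & MASK32

def feistel_encrypt(left, right, round_keys):
    for k in round_keys:
        # L(i) = R(i-1);  R(i) = L(i-1) XOR f(R(i-1), K(i))
        left, right = right, left ^ f(right, k)
    return left, right

def feistel_decrypt(left, right, round_keys):
    # Same structure, round keys applied in reverse order.
    for k in reversed(round_keys):
        right, left = left, right ^ f(left, k)
    return left, right
```

With 16 round keys this is exactly the skeleton DES fills in with its own f and key schedule.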

Fig. 3.5 Round Function

• Expansion Permutation Box − Since the right input is 32 bits and the round key is 48 bits, we first need to expand the right input to 48 bits. The permutation logic is graphically depicted in the following illustration −

Fig. 3.6 Expansion Permutation Box

• The graphically depicted permutation logic is generally described as a table in the DES specification, as shown −
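Applying a DES permutation table is mechanical: the table entries name, in order, which input bit (1-indexed) supplies each output bit. The sketch below uses the standard published DES expansion (E) table; note how the edge bits of each 4-bit group are duplicated to stretch 32 bits to 48. The demo input bits are arbitrary.

```python
# Standard DES expansion (E) table: 48 entries, 1-indexed input bits.
E_TABLE = [
    32,  1,  2,  3,  4,  5,
     4,  5,  6,  7,  8,  9,
     8,  9, 10, 11, 12, 13,
    12, 13, 14, 15, 16, 17,
    16, 17, 18, 19, 20, 21,
    20, 21, 22, 23, 24, 25,
    24, 25, 26, 27, 28, 29,
    28, 29, 30, 31, 32,  1,
]

def permute(bits, table):
    # Works for any DES-style table (IP, E, P, ...): output bit j is
    # input bit table[j] (tables are 1-indexed, lists are 0-indexed).
    return [bits[i - 1] for i in table]

right_half = [(i >> j) & 1 for i in range(4) for j in range(8)]  # demo 32 bits
expanded = permute(right_half, E_TABLE)
assert len(expanded) == 48
```

The same `permute` helper applied with the straight-permutation or initial-permutation tables implements those steps as well.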

• XOR (Whitener) − After the expansion permutation, DES performs an XOR operation on the expanded right section and the round key. The round key is used only in this operation.

• Substitution Boxes − The S-boxes carry out the real mixing (confusion). DES uses 8 S-boxes, each with a 6-bit input and a 4-bit output. Refer to the following illustration −

Fig. 3.6 Substitution Boxes

• The S-box rule is illustrated below −

Fig. 3.7 S-box rule

• There are a total of eight S-box tables. The outputs of all eight S-boxes are then combined into a 32-bit section.

• Straight Permutation − The 32-bit output of the S-boxes is then subjected to the straight permutation with the rule shown in the following illustration:

Fig. 3.8 Straight Permutation

Key Generation

The round-key generator creates sixteen 48-bit keys out of a 56-bit cipher key. The process of key generation is depicted in the following illustration −

Fig. 3.9 Key Generation

The logic for the parity drop, shifting, and compression P-box is given in the DES description.

DES Analysis

DES satisfies both of the desired properties of a block cipher. These two properties make the cipher very strong.

• Avalanche effect − A small change in the plaintext results in a very great change in the ciphertext.

• Completeness − Each bit of the ciphertext depends on many bits of the plaintext.

Over the years, cryptanalysts have found some weaknesses in DES when the keys selected are weak keys. These keys should be avoided. DES has proved to be a very well designed block cipher. There have been no significant cryptanalytic attacks on DES other than exhaustive key search.

S-BOXES

In cryptography, an S-Box (Substitution-box) is a basic component of symmetric-key algorithms. In block ciphers, the S-Boxes are used to make the relation between the key and the ciphertext (coded text) difficult to understand − Shannon's property of confusion. The S-Boxes are carefully chosen to resist cryptanalysis (decoding).

In general, an S-Box takes some number of input bits, m, and transforms them into some number of output bits, n: an m×n S-Box can be implemented as a lookup table with 2^m words of n bits each. Fixed tables are normally used, as in the Data Encryption Standard (DES), but in some ciphers the tables are generated dynamically from the key; e.g. the Blowfish and Twofish encryption algorithms. Bruce Schneier describes IDEA's modular multiplication step as a key-dependent S-Box. One good example is this 6×4-bit S-Box from DES (S5):

S5 − rows are selected by the outer two bits of the input, columns by the middle (inner) 4 bits:

              Middle (inner) 4 bits of input
         0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
Outer 00 0010 1100 0100 0001 0111 1010 1011 0110 1000 0101 0011 1111 1101 0000 1110 1001
bits  01 1110 1011 0010 1100 0100 0111 1101 0001 0101 0000 1111 1010 0011 1001 1000 0110
      10 0100 0010 0001 1011 1010 1101 0111 1000 1111 1001 1100 0101 0110 0011 0000 1110
      11 1011 1000 1100 0111 0001 1110 0010 1101 0110 1111 0000 1001 1010 0100 0101 0011
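The row-and-column selection rule for this S-box can be sketched directly in Python. The S5 values below are the standard published DES entries, written in decimal rather than binary; the final assertion reproduces the worked example discussed in the text (input 011011 → output 1001).

```python
# DES S-box S5 in decimal; rows are chosen by the outer two input bits,
# columns by the middle four.
S5 = [
    [ 2, 12,  4,  1,  7, 10, 11,  6,  8,  5,  3, 15, 13,  0, 14,  9],
    [14, 11,  2, 12,  4,  7, 13,  1,  5,  0, 15, 10,  3,  9,  8,  6],
    [ 4,  2,  1, 11, 10, 13,  7,  8, 15,  9, 12,  5,  6,  3,  0, 14],
    [11,  8, 12,  7,  1, 14,  2, 13,  6, 15,  0,  9, 10,  4,  5,  3],
]

def sbox_lookup(box, six_bits):
    # Row = outer bits (first and last of the 6), column = inner four bits.
    row = ((six_bits >> 5) & 1) << 1 | (six_bits & 1)
    col = (six_bits >> 1) & 0b1111
    return box[row][col]

# Worked example: input 011011 has outer bits "01" and inner bits "1101".
assert sbox_lookup(S5, 0b011011) == 0b1001  # output 9
```

The other seven DES S-boxes use the same lookup rule with different table contents.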

Given a 6-bit input, the 4-bit output is found by selecting the row using the outer two bits, and the column using the inner four bits. For example, an input "011011" has outer bits "01" and inner bits "1101"; the corresponding output would be "1001".

The 8 S-Boxes of DES were the subject of intensive study for many years because of a concern that a method of bypassing the DES cipher to obtain access to the plaintext − a vulnerability (susceptibility) known only to its designers − might have been planted in the cipher. In 1994, the S-Box design criteria were finally published by the designers after the public rediscovery of differential cryptanalysis, showing that the design had been carefully tuned to increase resistance against differential cryptanalysis attacks. Other research had already indicated that even a very small modification to one of the 8 S-Boxes used by DES could weaken it considerably. The design of good S-Boxes has been the subject of a great amount of research; much more is now understood about their use in block ciphers than when the DES S-Boxes were released.

3.4 SUMMARY

Computer cryptography started with the invention of the first computer, nicknamed 'Colossus'. The cipher machines that were used before and during the Second World War were in many ways the predecessors of today's computing devices. The simple language codes used in those early devices were replaced by the binary computer language of 0s and 1s to give rise to modern computer cryptography. Today, cryptographic algorithms are used to transfer electronic data

over the internet so that no third party is able to read the data. Cryptography historically dealt with the construction and analysis of protocols that would prevent any third parties from reading a private communication between two parties. In the digital age, cryptography has evolved to address the encryption and decryption of private communications through the internet and computer systems, a branch of cyber and network security, in a manner far more complex than anything the world of cryptography had seen before the arrival of computers.

The aim of cyber security is to attempt to create encryption systems that perform perfectly on all four of the above-mentioned parameters. This can be almost impossible to fully accomplish, since the strength of the encryption depends not only on computer programs but also on human behaviour. The best security systems in the world can still be defeated by an easily guessed password, or by a user not logging out after a session or discussing security information with outsiders. Today, cryptography uses some of the finest computer and mathematical minds on the planet. Every industry, from war to healthcare, makes use of encryption to protect sensitive information that is being transmitted across the internet.

3.5 KEYWORDS

• Plaintext − Text that is not computationally tagged, specially formatted, or written in code.

• Encryption − The process of encoding a message or information in such a way that only authorized parties can access it. Encryption does not itself prevent interference, but denies the intelligible content to a would-be interceptor.

• Ciphertext − The encrypted text. Plaintext is what you have before encryption, and ciphertext is the encrypted result. The term cipher is sometimes used as a synonym for ciphertext, but it more properly means the method of encryption rather than the result.

• Decryption − The process of taking encoded or encrypted text or other data and converting it back into text that you or the computer can read and understand.

• Secure Sockets Layer (SSL) − A protocol that uses cryptography to enable secure communication over the Internet. SSL is widely supported by leading web browsers and web servers.

• Security Console − An administrative user interface through which the user performs most of the day-to-day administrative activities.

• Security domain − A container that defines an area of administrative management responsibility, typically in terms of business units, departments, partners, and so on.

3.6 LEARNING ACTIVITY

1. Define the encryption and decryption process of Output Feedback Mode and the key schedule algorithm.

2. Implement substitution and transposition techniques in a network to add security.

3.7 UNIT END QUESTIONS (MCQ AND DESCRIPTIVE)

A. Descriptive Questions

1. Draw the structure of secret key cryptography.
2. What is block encryption?
3. What is the function of the Data Encryption Standard rounds?
4. Explain Substitution-Boxes.
5. Explain symmetric cryptography.

B. Multiple Choice Questions

1. When plain text is converted to an unreadable format, it is termed as
a. rotten text
b. raw text
c. cipher-text
d. ciphen-text

2. Cryptographic algorithms are based on mathematical algorithms, where these algorithms use a ______ for a secure transformation of data.
a. secret key
b. external programs
c. add-ons
d. secondary key

3. Data Encryption Standard is an example of a ______ cryptosystem.
a. conventional
b. public key
c. hash key
d. asymmetric-key

4. In asymmetric key cryptography, the private key is kept by
a. the sender
b. the receiver
c. the sender and receiver
d. all the connected devices to the network

5. In which way does Combined Encryption combine symmetric and asymmetric encryption?
a. First, the message is encrypted with symmetric encryption and afterwards it is encrypted asymmetrically together with the key.
b. The secret key is symmetrically transmitted, the message itself asymmetrically.
c. First, the message is encrypted with asymmetric encryption and afterwards it is encrypted symmetrically together with the key.
d. The secret key is asymmetrically transmitted, the message itself symmetrically.

Answers: 1. c, 2. a, 3. a, 4. b, 5. d

3.8 REFERENCES

• Becket, B. (1988). Introduction to Cryptology. Blackwell Scientific Publications. ISBN 978-0-632-01836-9. OCLC 16832704. Excellent coverage of many classical ciphers and cryptography concepts and of the "modern" DES and RSA systems.

• Esslinger, Bernhard. Cryptography and Mathematics, 200 pages, part of the free open-source package CrypTool; PDF download archived at the Wayback Machine (22 July 2011). CrypTool is the most widespread open-source e-learning program about cryptography and cryptanalysis.

• W. Stallings, "Cryptography and Network Security", Pearson Education.

• Douglas Stinson, "Cryptography: Theory and Practice", 2nd Edition, Chapman & Hall/CRC.

• B. A. Forouzan, "Cryptography & Network Security", Tata McGraw Hill.

• In Code: A Mathematical Journey by Sarah Flannery (with David Flannery). Popular account of Sarah's award-winning project on public-key cryptography, co-written with her father.

• Hirsch, Frederick J. "SSL/TLS Strong Encryption: An Introduction". Apache HTTP Server. Retrieved 17 April 2013. The first two sections contain a very good introduction to public-key cryptography.

• Ferguson, Niels; Schneier, Bruce (2003). Practical Cryptography. Wiley. ISBN 0-471-22357-3.

• Katz, Jonathan; Lindell, Yehuda (2007). Introduction to Modern Cryptography. CRC Press. ISBN 978-1-58488-551-1.

UNIT - 4 VIRUSES AND MALICIOUS CODE

STRUCTURE

4.0 Learning Objectives
4.1 Introduction
4.2 Virus
4.2.1 Malicious Code
4.2.2 Secure Programs
4.2.3 Non-malicious Program Errors
4.2.4 Targeted Malicious Code
4.2.5 Control against Program Threats
4.3 Operating System Security
4.3.1 Operating System Hardening
4.3.2 Linux / Unix Security
4.3.3 Windows Security
4.4 Access Control
4.5 File Protection
4.6 User Authentication
4.7 Security Policies
4.8 Models of Security
4.9 Summary
4.10 Keywords
4.11 Learning Activity
4.12 Unit End Questions
4.13 References

4.0 LEARNING OBJECTIVES

After studying this unit, you will be able to:

• Describe the effects of viruses
• Identify the mechanisms to safeguard against viruses
• Compare various security policies of operating systems
• Describe the models of security

4.1 INTRODUCTION

A computer virus is a piece of code that may "infect" other programs by altering them; the alteration involves injecting a routine into the original program that duplicates the virus program, which can then infect yet another program. Computer viruses first surfaced in the early 1980s, with Fred Cohen coining the phrase in 1983. Cohen has written a ground-breaking book on the topic. Biological viruses are tiny bits of genetic code, DNA or RNA, that may hijack a living cell's machinery and deceive it into producing thousands of perfect copies of the original virus. A piece of malware, like its biological equivalent, carries in its instructional code the blueprint for creating perfect copies of itself. A typical computer virus hides in a program. When an infected computer comes into contact with an uninfected piece of software, a replica of the virus is transferred to the new program. As a result, unknowing users who swap drives or exchange programs over a network can spread the infection from computer to computer. The ability to access applications and system services on other computers on a network creates the ideal setting for a virus to propagate.

A virus has the ability to do everything that other programs do. A virus, however, attaches itself to another program and runs invisibly when the host program is launched. Once a virus has been executed, it can perform any function that the current user's permissions permit, such as wiping files and applications.

4.2 VIRUS

A virus can be added to an executable program as a prefix or suffix, or it can be incorporated in some other way. The key to its operation is that when the infected application is run, the virus code is executed first, followed by the program's original code. Figure 4.1 depicts the nature of a virus in broad strokes. In this situation, the virus code V is prepended to the infected program, and it is assumed that the first line of the program is used as the program's point of entry when it is run.

Fig. 4.1 Simple Virus

The virus program is the first step in the infected program's operation. The very first line is a call to the virus's main program. The second line is a unique signature that the virus uses to assess whether or not a potential victim program has already been infected. When the program is started, control is transferred directly to the main virus program. The virus program may first seek out and attack uninfected executable files. The virus may next take some action, which is usually harmful to the system. This action could be carried out every time the program is run, or it may be a logic bomb that only activates under specific

circumstances. The virus then relinquishes control to the original application. A user is unlikely to detect any difference between the execution of an infected and an uninfected program if the infection phase is relatively quick.

Initial Infection

Whenever a virus gains access to a system by infiltrating a specific program, it has the ability to attack some or all of the system's executable files when the infected program runs. As a result, viral infection can be entirely avoided by stopping the virus from entering in the first place. However, because malware can be a component of any application that arrives from outside a machine, protection is quite difficult. As a result, unless you are willing to take a completely bare piece of hardware and build all of your own system and application programs, you are susceptible. Many types of infection can also be prevented by limiting normal users' access to the system's programs.

Traditional machine-code-based infections propagated quickly on early PCs due to the lack of access limitations on those systems. While it is quite simple to develop a machine-code virus for UNIX systems, they have almost never been observed in practice, because the access constraints on these systems preclude successful virus spread. Since contemporary PC operating systems have more effective access safeguards, traditional machine-code-based infections are becoming less common. Virus creators, on the other hand, have discovered new channels, such as macro and e-mail viruses, as detailed later.

Ever since the inception of viruses, there has been an ongoing arms race between virus authors and antivirus software developers. As effective remedies for existing forms of viruses are discovered, new types emerge. We categorize viruses along two orthogonal axes: the sort of target the virus wants to infect, and the approach the virus employs to disguise itself from discovery by users and antivirus software.
A virus classification by target includes the following categories:

• Boot sector infector: Infects a master boot record or boot record, and spreads when a system is started from the virus-infected disc.

• File infector: Infects executable files as defined by the operating system or shell.

• Macro virus: Infects files containing macro code that an application interprets.

A virus classification by concealment approach includes the following categories:

• Encrypted virus: A typical method is the following. A component of the virus generates a random encryption key and uses it to encrypt the rest of the virus. The key is stored by the virus. When an infected program is run, the virus decrypts itself using the stored random key. Each time the virus replicates, a new random key is chosen. Because the bulk of the virus is encrypted with a different key for each instance, there is no consistent bit pattern to examine.

• Stealth virus: A virus specifically designed to avoid detection by antivirus software. The entire virus is concealed, not just the payload.

• Polymorphic virus: A virus that mutates with each infection, making it impossible to identify by the virus's "signature."

• Metamorphic virus: A metamorphic virus mutates with each infection, just as a polymorphic virus does. The difference is that a metamorphic virus entirely rewrites itself with each iteration, making detection more difficult. Metamorphic viruses can change their behavior as well as their appearance.

A virus that employs compression to make the infected application exactly the same size as an uninfected one was described earlier as an example of a stealth virus. Far more advanced methods are possible. For instance, a virus can insert intercept logic into disc I/O routines, so that when an attempt is made to read suspected areas of the disc using these functions, the virus presents the original, uninfected application. As a result, the term "stealth" does not refer to a virus as such, but rather to a strategy used by a virus to avoid detection.

During propagation, a polymorphic virus makes copies that are functionally equivalent but have distinct bit patterns. The goal, as with a stealth virus, is to thwart virus-scanning software.
In this instance, the virus's "signature" will change with each copy. To accomplish this variation, the virus may insert redundant commands at random or vary the order of independent commands. A more effective strategy is encryption: the strategy of the encryption virus is employed, and the part of the virus in charge of creating keys and performing encryption and decryption is called the mutation engine. The mutation engine itself is altered with each use.

Virus Kits

The virus toolkit is yet another instrument in the arsenal of virus writers. Such a toolkit allows even a beginner to swiftly construct a variety of viruses. Viruses developed using toolkits are less sophisticated than those created from scratch, but the sheer number of new viruses they enable is staggering.

Macro Viruses

In the mid-1990s, macro viruses became by far the most prevalent type of virus. Macro viruses are particularly threatening for a number of reasons:

1. A macro virus is platform independent. Viruses that corrupt Microsoft Word or other Microsoft Office applications are known as macro viruses. These can infect any hardware platform and operating system that supports the applications.

2. Macro viruses infect documents rather than executable code. The majority of data entered into a computer system is in the form of a document, not a program.

3. Macro viruses spread very easily. Electronic mail is a popular vehicle.

4. Conventional file system access controls are ineffective in limiting the spread of macro viruses, because they infect user documents instead of system programs.

Macro viruses take advantage of a feature called a macro that is present in Word and other office applications such as Microsoft Excel. A macro is a program that is embedded in a word processing document or another type of file. Macros are generally used to automate repetitive operations and save keystrokes. Typically, the macro language is a variant of the Basic programming language. A user can record a series of keystrokes in a macro and have it activated when a function key or a special short combination of keys is pressed.

Successive releases of MS Office products have improved their defenses against macro infections. Microsoft, for instance, offers an optional Macro Virus Protection tool that detects suspicious Word files and warns customers about the risk of opening a file with macros.
Macro virus detection and removal tools have also been developed by a number of antivirus software companies. The arms race continues for macro viruses as for other types, although they are no longer the primary virus threat.

E-Mail Viruses

The e-mail virus is a more recent development in malicious software. Melissa, one of the first rapidly propagating e-mail viruses, used a Microsoft Word macro placed in an attachment. The Word macro is triggered when the recipient opens the e-mail attachment. Then:

1. The e-mail virus sends itself to everyone on the mailing list in the user's e-mail package.
2. The virus does local damage on the user's system.

Malicious Code

Malicious code, often known as rogue software or malware (short for MALicious software), consists of programs or code portions inserted by a malicious agent with the goal of causing unintended or undesirable effects. The agent is the author or distributor of the program. Malicious intent distinguishes this kind of code from unintentional errors, even though both can have similar and serious negative consequences. This definition also excludes coincidence, which occurs when minor flaws in two otherwise benign programs combine to produce a negative result. The majority of software flaws discovered during audits, reviews, and testing do not qualify as malicious code; their cause is almost always inadvertent. Unintentional flaws, however, can produce the same effects as intended malice; a good cause can nonetheless lead to a bad outcome.

Malicious code might be targeted at a specific user or group of users, or it can be aimed at everyone. You may have been affected by malware at some point, either because your machine was attacked or because you were unable to access an infected system while its administrators worked to clean up the mess the malware left behind. The malware could have been a worm, a virus, or neither; the infection metaphor is generally appropriate, although the term "malicious code" is commonly misused. We differentiate the names applied to different types of malware here, but rather than focusing on names, you should concentrate on methods and effects.
What we call a virus, by any other name, would smell just as bad. A virus is malicious code that can replicate itself and spread harmful code by modifying non-malicious programs.

The name "virus" was coined because the infected software behaves like a biological virus: it infects other healthy subjects by attaching itself to them and either killing them or coexisting with them. Since viruses are so stealthy, we cannot assume that software that was virus-free yesterday is still virus-free today. Furthermore, good code can be modified to include a copy of the virus code, causing the infected good program to begin attacking other software. In most cases, the infection spreads at a geometric rate, eventually overtaking an entire computing system and spreading to other connected devices.

Viruses can be transient or resident. The life span of a transient virus is tied to the life span of its host: the virus runs when the code to which it is attached runs, and it ends when the attached code stops. (While executing, the transient virus may attack other programs.) A resident virus locates itself in storage and can thus remain active, or be started as a stand-alone program, long after the attached program has ended.

Although the terms worm and virus are frequently used interchangeably, they refer to two distinct things. A worm is a program that replicates itself across a network. The main distinction between a worm and a virus is that a worm uses networks to spread, while a virus can spread through any medium (but usually uses a copied program or data files). Furthermore, the worm spreads itself as a stand-alone program, whereas the virus spreads as code that attaches to or embeds itself in other code.

Worm: program that spreads copies of itself through a network.

Spreading copies of yourself seems boring and perhaps narcissistic. But worms do have a common, useful purpose. How big is the Internet? What addresses are in use? Worm programs, sometimes called "crawlers," seek out machines on which they can install small pieces of code to gather such data.
The code items report back to collection points, describing what connectivity they have found. Such reconnaissance can also serve a hostile purpose, but the worms that travel and collect data do not have to be evil.

A Trojan horse is malicious code that has a second, nonobvious hostile effect in addition to its primary, apparent one. The name comes from the Trojan War. According to legend, the Greeks tricked the Trojans by leaving a massive wooden horse outside the Trojans' defensive wall. Thinking it a gift, the Trojans brought the horse inside and gave it pride of place.

Unknown to the naive Trojans, however, the horse was packed with the bravest Greek soldiers. At night the Greek warriors climbed out, opened the gates, and signaled their forces that the way into Troy was now clear. In the same way, Trojan horse malware slips inside a program undetected and produces unwelcome effects later on. Imagine a login script that collects a user identity and password, passes that information along to the rest of the system for normal login processing, but also keeps a copy of the data for later, malicious use. The user sees only the expected login process, so there is no reason to suspect that anything undesirable has happened.

Trojan horse: a program with a benign apparent effect but a second, hidden, malicious effect.

To remember the differences among these three types of malware, note that a Trojan horse is, on the surface, a useful program with extra, undocumented (malicious) features. It does not necessarily try to propagate. By contrast, a virus is a malicious program that attempts to spread to other computers, perhaps also performing unpleasant actions on its current host. A virus does not necessarily spread by using a network's properties; it can travel instead on a document carried by a portable device (that memory stick you just inserted in your laptop!) or be triggered to spread to other, similar file types when a file is opened. A worm, however, requires a network to spread itself elsewhere.

Fig. 4.2 Table Types of Malicious Code

Beyond this basic terminology, there is much similarity among types of malicious code. Several other types are shown in the table. As you can see, types of malware differ widely in their operation, transmission, and objective. All of these terms are used loosely in popular writing, and you will encounter imprecise and overlapping definitions; indeed, people sometimes use virus as a convenient general term for malicious code. Again, let us remind you that nomenclature is not critical; impact and effect are. Battling over whether something is a virus or a worm is beside the point; instead, we concentrate on understanding the mechanisms by which malware perpetrates its evil.

Malware's ability to replicate and to cause harm gives us insight into two key aspects of malicious code. Throughout the rest of this chapter we also use the general term malware for any type of malicious code. You should recognize that, although we are interested primarily in the malicious aspects of these code forms so that we can recognize and address them, not all activities listed here are always malicious.

Secure Programs

Secure programming is a method of writing computer code that protects it from the full range of vulnerabilities and attacks that could harm the software or the device that runs it. Because it deals with code security, secure programming is sometimes referred to as secure coding.

Software security, quality, and reliability are closely related, yet there are subtle differences. Software quality and reliability are concerned with the accidental failure of a program as a consequence of some theoretically random, unexpected input, system interaction, or use of incorrect code. These failures are expected to follow some probability distribution. The most common method for improving software quality is to employ some form of systematic design and testing to detect and remove as many defects as possible from a program. The testing typically uses variations of likely inputs and common mistakes, with the goal of reducing the number of problems that would be encountered in ordinary use.

In computer security, by contrast, the adversary chooses the probability distribution, focusing on specific defects that cause a failure the adversary can exploit. These vulnerabilities are frequently triggered by inputs significantly different from what is expected, and so are unlikely to be detected by standard testing methods.

Writing secure, safe code requires attention to all aspects of a program's execution, its environment, and the data it processes. Nothing can be taken for granted, and every potential flaw must be investigated. This definition underlines the importance of explicitly stating any assumptions about how a program will run and the sorts of input it will handle. To help clarify the concerns, consider the abstract model of a program; it depicts the concepts covered in most introductory programming classes.
A program receives data from a number of sources, processes it according to an algorithm, and then produces output, which may be sent to many destinations. It runs in an environment provided by an operating system, using machine instructions of a specific processor type. While processing the data, the program will use system calls, and possibly other programs available on the system. These may result in data being saved or modified on the system, or cause some other side effect of the program's execution. All of these aspects can interact with each other, often in complex ways.

Fig. 4.3 Abstract View of Program

When writing code, developers usually concentrate on what is required to solve the problem the software is intended to address. As a result, rather than addressing every possible point of failure, they focus on the steps required for success and the normal flow of the program. They frequently make assumptions about the types of input the code will receive and the environment in which it will run. Defensive programming means that these assumptions must be verified and that all potential failures must be handled gracefully and safely. Anticipating, checking, and handling all possible errors will almost certainly increase the amount of code in a program, and the time it takes to write it. This runs counter to business pressure to shorten development timelines as much as feasible in order to gain a competitive advantage. Unless application security is a design goal addressed from the beginning of development, secure software is unlikely to emerge.

Further, when changes are required to a program, the programmer often focuses only on the changes required and what must be achieved. Again, defensive programming means the programmer must carefully check any assumptions made, check and handle all possible errors, and carefully check any interactions with existing code. Failure to identify and manage such interactions can result in incorrect program behavior and introduce vulnerabilities into a previously secure program. Defensive programming therefore requires a shift in perspective from traditional programming approaches, which place a premium on programs that solve the desired problem for most users, most of the time.

This shift in thinking means the programmer must be aware of the consequences of failure and of the techniques used by attackers. The sheer volume of security patches shows that attackers really are trying to get you, so a degree of paranoia is a virtue! This mindset must acknowledge that conventional testing will miss many of the errors that are present but triggered only by highly unusual and unanticipated inputs. It means that lessons must be learned from earlier mistakes to ensure that new programs do not repeat the same flaws. And it means that, to the extent practicable, software should be designed to be as resilient as possible in the face of any error or unanticipated condition. Defensive programmers must understand how failures happen and what steps they can take to reduce the likelihood of them happening in their programs.

Most engineering disciplines have long recognized the importance of making safety and security design goals from the start of a project. Society does not tolerate bridges collapsing, buildings falling down, or airplanes crashing; such artifacts are required to be designed so that these catastrophic events are highly unlikely. Software development has not yet matured to this point, and society tolerates far more failure in software than it does in other engineering disciplines. This remains the case despite software engineers' best efforts and the creation of a number of software development and quality standards. While the focus of these standards is on the general software development life cycle, they increasingly identify security as a key design goal. In recent years, major companies, including Microsoft and IBM, have increasingly recognized the importance of software security. This is a positive development, but it needs to be repeated across the entire software industry before significant progress can be made in reducing the torrent of software vulnerability reports.
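As a small illustration of this defensive mindset, the sketch below validates a numeric input completely instead of assuming it is well formed. The function name and the port-number use case are our own invention for illustration; only the standard library routine strtol is real:

```c
#include <stdlib.h>
#include <errno.h>

/* Defensive parsing: reject NULL, empty input, trailing junk, numeric
 * overflow, and out-of-range values instead of assuming clean input.
 * Returns 1 and stores the value on success, 0 on any failure. */
int parse_port(const char *s, int *out) {
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return 0;                    /* missing input */
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0')
        return 0;                    /* overflow or trailing junk */
    if (v < 1 || v > 65535)
        return 0;                    /* outside the valid port range */
    *out = (int)v;
    return 1;
}
```

Note how much of the function is devoted to the failure cases rather than the success path; that ratio is typical of defensive code.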
The topic of software development techniques and standards, and the integration of security with them, is well beyond the scope of this text. [MCGR06] and [VIEG01] provide much greater detail on these topics. However, we will explore some specific software security issues that should be incorporated into a wider development methodology. We examine the software security concerns of the various interactions with an executing program, as illustrated in the figure. We start with the critical issue of safe input handling, followed by security concerns related to algorithm implementation, interaction with other components, and program output.

When looking at these potential areas of concern, it is worth acknowledging that many security vulnerabilities result from a small set of common mistakes; we discuss a number of these. The examples in this chapter focus primarily on problems seen in Web application security. These applications are especially vulnerable because of their rapid growth, often at the hands of developers unaware of security considerations, and their exposure via the Web to a potentially vast pool of attackers. It is important to note, however, that the principles outlined here apply to all programs. Because it is difficult to predict how a program will be used in the future, secure programming principles should always be followed, even for seemingly innocuous software. It is always possible that a small tool intended for local use will be incorporated into a larger, possibly Web-based program with fundamentally different security considerations.

Non-malicious Program Errors

Programs and their computer code are the basis of computing. Without a program to guide its activity, a computer is pretty useless. Because the early days of computing offered few programs for general use, early computer users had to be programmers too: they wrote the code and then ran it to accomplish some task. Today's computer users sometimes write their own code, but more often they buy programs off the shelf; they even buy or share code components and then modify them for their own uses. And all users gladly run programs all the time: spreadsheets, music players, word processors, browsers, email handlers, games, simulators, and more. Indeed, code is initiated in myriad ways, from turning on a mobile phone to pressing "start" on a coffee maker or microwave oven. But as programs have become more numerous and complex, users are less often able to know what a program is really doing or how. More significantly, users rarely know whether the tool they are using is giving them accurate results.
If a program stops unexpectedly, text vanishes from a page, or music skips passages, the code may not be working properly. (In some cases the disruption is expected: a CD player skips because the disc is damaged, or a medical device's software halts deliberately to prevent harm.) But if a spreadsheet produces a result that is off by a small amount, or an automated drawing package does not align objects exactly, you might not notice, or you notice but blame yourself rather than the program for the discrepancy.

These flaws, seen and unseen, are cause for concern in several ways. Because programs are written by fallible humans, program defects can range from minor to catastrophic. Despite extensive testing, faults may appear regularly or infrequently, depending on many unknown and unanticipated factors. Program faults can have two kinds of security consequences: they can cause integrity failures that produce harmful output or behavior, and they can offer an opportunity for a malicious actor to attack. We consider each in turn.

• A programming fault can be a defect that affects the validity of the program's result; in other words, a fault can cause the program to fail. Performing an operation incorrectly is an integrity failure. Integrity is one of the three fundamental security properties of the C-I-A triad, and it encompasses not only correctness but also precision, consistency, and accuracy. A faulty program can also change previously correct data in an unintended way, sometimes by overwriting or deleting the original data. Even if the flaw was not introduced intentionally, the consequences of an incorrect program can be disastrous.

• On the other hand, even a defect that stems from good intentions can be exploited by a malicious party. A simple, non-malicious defect can become part of an attack if an adversary discovers it and can use it to subvert the program's behavior. Benign defects can be, and frequently are, exploited for nefarious purposes.

Thus, in both cases program correctness becomes a security concern as well as an overall quality concern. In this chapter we look at several programming faults that have security ramifications. We also show how to improve software security during the design, construction, and distribution phases.

Buffer Overflow

We begin with a well-known vulnerability called buffer overflow.
Even though the basic problem is simple to describe, locating and preventing such flaws is challenging. Moreover, an overflow's effect can be subtle and out of all proportion to the underlying oversight. This disproportionate impact is due in part to the exploits that attackers have built on overflows; indeed, a buffer overflow is frequently used as a launching pad for a more destructive attack. Although most buffer overflows are caused by simple programming errors, they can be exploited for malevolent purposes.
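To make the underlying oversight concrete, here is a minimal sketch in C. The function names are ours; only the library routines strcpy and strncpy are real:

```c
#include <string.h>

/* Vulnerable pattern: strcpy copies until the source's terminating
 * null byte with no regard for the destination's size, so a long
 * source overwrites whatever lies in memory beyond dst. */
void copy_unsafe(char *dst, const char *src) {
    strcpy(dst, src);               /* overflows dst if src is too long */
}

/* Defensive version: copy at most cap-1 bytes and always terminate. */
void copy_safe(char *dst, size_t cap, const char *src) {
    strncpy(dst, src, cap - 1);
    dst[cap - 1] = '\0';
}
```

The single extra parameter, the destination's capacity, is the difference between the two: without it, the callee has no way to bound its writes.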

Buffer overflows are frequently caused by unintentional programming errors, or by a failure to anticipate and check for excess data. That early case was not the first instance of a buffer overflow, and many more have been identified in the years since, nearly two decades of them. The example does, however, clearly demonstrate how an attacker thinks. David was attempting to improve security in this case (he was working with one of the book's authors at the time), but attackers work to defeat security for a variety of reasons. We now look at the causes of buffer overflow attacks, as well as some potential remedies.

Incomplete Mediation

Mediation means checking: the process of intervening to confirm an actor's authorization before it takes an intended action. In the last chapter we discussed the steps and actors in the authentication process: the access control triple that describes what subject can perform what operation on what object. Verifying that the subject is authorized to perform the operation on an object is called mediation. Incomplete mediation is a security problem that has been with us for decades: Forgetting to ask "Who goes there?" before allowing the knight across the castle drawbridge is just asking for trouble. In the same way, attackers exploit incomplete mediation to cause security problems.

Time-of-Check to Time-of-Use

The third programming problem we discuss also involves synchronization. To increase efficiency, modern computers and operating systems frequently change the order in which instructions and procedures are executed. Statements that appear to be adjacent may not be executed immediately one after the other, either because they were deliberately reordered or because of the effects of other operations running in parallel.

Definition

Access control is a critical component of computer security; we want to ensure that only those who should access an object have that access.
Every request for access must be governed by an access policy that specifies who is allowed access to what, and then mediated by an access-policy-enforcement agent. When access is not checked uniformly, however, an incomplete mediation problem arises. The time-of-check to time-of-use (TOCTTOU) flaw is a vulnerability in mediation that involves a "bait and switch" in the middle.
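A classic instance of this flaw in C on POSIX systems, sketched below with the vulnerable pattern confined to a comment, is checking a file with one call and then using it with another; the function name open_checked is our own:

```c
#include <fcntl.h>
#include <unistd.h>

/* TOCTTOU-vulnerable pattern (shown only as a comment):
 *
 *     if (access(path, R_OK) == 0)      // time of check
 *         fd = open(path, O_RDONLY);    // time of use: the file the
 *                                       // path names may have been
 *                                       // switched in between
 *
 * The remedy is to collapse check and use into a single step: attempt
 * the operation directly and act on its result. */
int open_checked(const char *path) {
    int fd = open(path, O_RDONLY);  /* check and use are one atomic call */
    return fd;                      /* -1 on failure; caller handles it */
}
```

The general principle: the condition you check must still hold at the moment you rely on it, and the only way to guarantee that is to make the check part of the use.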

Undocumented Access Point

Next we describe a common programming situation. During program development and testing, the programmer needs a way to access the internals of a module. Perhaps a result is not being computed correctly, so the programmer wants a way to interrogate data values during execution. Maybe flow of control is not proceeding as it should, and the programmer needs to feed test values into a routine. It could be that the programmer wants a special debug mode to test conditions. For whatever reason, the programmer creates an undocumented entry point or execution mode.

These situations are understandable during program development. Sometimes, however, the programmer forgets to remove these entry points when the program moves from development to product. Or the programmer decides to leave them in to facilitate program maintenance later, believing that nobody will find the special entry. That belief is naive, because if there is a hole, someone will eventually find it.

Off-by-One Error

When learning to program, novices can easily fall victim to the off-by-one error: miscalculating the condition to end a loop (repeat while i <= n or i < n?) or overlooking that an array of A[0] through A[n] contains n+1 elements. Usually the programmer is at fault for failing to think correctly about when a loop should stop. Other times the problem is merging actual data with control data (sometimes called metadata, or data about the data). For example, a program may manage a list that grows and shrinks. Think of a list of unresolved problems in a customer service department: Today there are five open issues, numbered 10, 47, 38, 82, and 55; during the day, issue 82 is resolved but issues 93 and 64 are added to the list. A programmer may create a simple data structure, an array, to hold these issue numbers and may reasonably specify no more than 100 numbers. But to help with managing the numbers, the programmer may also reserve the first position in the array for the count of open issues.
Thus, in the first case the array really holds six elements: 5 (the count), 10, 47, 38, 82, and 55; and in the second case there are seven: 6, 10, 47, 38, 93, 55, and 64. A 100-element array will clearly not hold 100 data items plus one count.
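The capacity guard the programmer must remember can be sketched in C as follows; the function name and the choice of 100 slots follow the scenario above and are illustrative only:

```c
#define SLOTS 100   /* array size chosen for this sketch */

/* issues[0] holds the count of open issues, so only SLOTS - 1 data
 * items actually fit: forgetting the count cell is a classic
 * off-by-one error. Returns 1 if added, 0 if the array is full. */
int add_issue(int issues[SLOTS], int id) {
    int count = issues[0];
    if (count >= SLOTS - 1)         /* the easy-to-forget guard */
        return 0;
    issues[1 + count] = id;
    issues[0] = count + 1;
    return 1;
}
```

Writing the bound as SLOTS - 1 rather than SLOTS is exactly the step that is so often missed.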

Fig. 4.4 Both Data and Number of Used Cells in an Array

In this simple example, the program may run correctly for a long time, as long as no more than 99 issues are open at any time, but adding the 100th issue will cause the program to fail. A similar problem occurs when a procedure edits or reformats input, perhaps changing a one-character sequence into two or more characters (for example, when the one-character ellipsis symbol "…" available in some fonts is converted by a word processor into three successive periods to accommodate more limited fonts). These unanticipated changes in size can cause changed data to no longer fit in the space where it was originally stored. Worse, the error will appear to be sporadic, occurring only when the amount of data exceeds the size of the allocated space. Alas, the only control against these errors is correct programming: always checking to ensure that a container is large enough for the amount of data it is to contain.

Integer Overflow

An integer overflow is a peculiar type of overflow, in that its outcome is somewhat different from that of the other types. An integer overflow occurs because a storage location is of fixed, finite size and therefore can contain only integers up to a certain limit. The overflow depends on whether the data values are signed (that is, whether one bit is reserved for indicating whether the number is positive or negative). The table gives the range of signed and unsigned values for several memory location (word) sizes.
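The wrap-around behavior of fixed-size integers, discussed in what follows, can be demonstrated directly in C; the helper names below are our own:

```c
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^N: for 8-bit values, 255 + 1 == 0. */
uint8_t wrap_add_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Overflow must be checked BEFORE the addition, because afterward the
 * excess most-significant digits are already lost: a + b overflows
 * an 8-bit value exactly when b > UINT8_MAX - a. */
int add_would_overflow_u8(uint8_t a, uint8_t b) {
    return b > (uint8_t)(UINT8_MAX - a);
}
```

With a first operand of 147, the check reduces to asking whether the second operand exceeds 108, matching the worked example in the text.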

Fig. 4.5 Table Value Range by Word Size

When a computation causes a value to exceed one of the limits in the table, the extra data does not spill over to affect adjacent data items, because the arithmetic is performed in a hardware register of the processor, not in memory. Instead, either a hardware exception or fault condition is signaled, causing transfer to an error-handling routine, or the excess digits on the most significant end of the data item are simply lost. Thus, with 8-bit unsigned integers, 255 + 1 = 0. If a program uses an 8-bit unsigned integer for a loop counter and the stopping condition for the loop is count = 256, the condition will never be true.

Checking for this type of overflow is difficult, because an overflow can be detected only after the result has already overflowed. Using 8-bit unsigned values, for example, a program could determine that the first operand was 147 and then check whether the second was greater than 108. Such a test requires double work: first determine the maximum second operand that will keep the sum in range, and then compute the sum. Some compilers generate code to test for an integer overflow and raise an exception.

Unterminated Null-Terminated String

Long strings are the source of many buffer overflows. Sometimes an attacker intentionally feeds an overly long string into a processing program to see if and how the program will fail, as was true with the Dialer program. Other times the vulnerability has an accidental cause: a program mistakenly overwrites part of a string, causing the string to be interpreted as longer than it really is. How these errors actually occur depends on how the strings are stored, which is a function of the programming language, application program, and operating system involved. Variable-length character (text) strings are delimited in three ways.
The easiest way, used by Basic and Java, is to allocate space for the declared maximum string length and store the current length in a table separate from the string's data.

Fig. 4.6 Variable-Length String Representations

Some systems and languages, particularly Pascal, precede a string with an integer that gives the string's length. In this representation, the string "Hello" would be represented as 0x0548656c6c6f, because 0x48, 0x65, 0x6c, and 0x6f are the internal representations of the characters "H," "e," "l," and "o," respectively. The length of the string is the first byte, 0x05. With this representation, string buffer overflows are uncommon because the processing program receives the length first and can verify that adequate space exists for the string. (This representation is vulnerable to the problem we described earlier of failing to include the length element when planning space for a string.) Even if the length field is accidentally overwritten, the application reading the string will read only as many characters as the length field specifies. But the limit on a string's length thus becomes the maximum number that will fit in the length field, which is 255 for a 1-byte length and 65,535 for a 2-byte length.

The last way of representing a string, typically used in C, is called null-terminated, meaning that the end of the string is denoted by a null byte, 0x00. In this form the string "Hello" would be 0x48656c6c6f00. Representing strings this way can lead to buffer overflows because the processing program determines the end of the string, and hence its length, only after having received the entire string. This format is prone to misinterpretation. Suppose an erroneous process happens to overwrite the end of the string and its terminating null character; in that case, the application reading the string will continue reading memory until a null byte happens to appear (from some other data value), at any distance beyond the end of

the string. Thus, the application can read 1, 100, or 100,000 extra bytes or more until it encounters a null. The problem of buffer overflow arises in computation as well: functions that move and copy a string may cause overflows in the stack or heap as parameters are passed to them.

Parameter Length, Type, and Number

Another source of data-length errors is procedure parameters, whether from web or conventional applications. Among the sources of problems are these:

• Too many parameters. Even though an application receives only three incoming parameters, for example, that application can incorrectly write four outgoing result parameters by using stray data adjacent to the legitimate parameters passed in the calling stack frame. (The opposite problem, more inputs than the application expects, is less serious because the called application's outputs will stay within the caller's allotted space.)

• Wrong output type or size. A calling and a called procedure need to agree on the type and size of data values exchanged. If the caller provides space for a two-byte integer but the called routine produces a four-byte result, those extra two bytes will go somewhere. Or a caller may expect a date result as a number of days after 1 January 1970, but the result produced is a string of the form "dd-mmm-yyyy."

• Too-long string. A procedure can receive as input a string longer than it can handle, or it can produce a too-long string on output, either of which will also cause an overflow condition. Procedures often have or allocate temporary space in which to manipulate parameters, and that temporary space must be large enough to contain the parameter's value. If the parameter being passed is a null-terminated string, the procedure cannot know how long the string will be until it finds the trailing null, so a very long string will exhaust the buffer.
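The contrast between the length-prefixed and null-terminated representations described above can be sketched in a few lines of C; the function names are ours:

```c
#include <stddef.h>

/* Pascal-style representation: the first byte stores the length, so
 * the reader knows in advance how many characters follow. */
size_t pstr_len(const unsigned char *p) {
    return p[0];
}

/* C-style representation: the length is discovered only by scanning
 * for the 0x00 terminator; if that byte is lost, the scan runs on
 * into whatever memory happens to follow. */
size_t cstr_len(const char *s) {
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

The loop in cstr_len is exactly where an unterminated string becomes dangerous: nothing bounds it except the next null byte in memory.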
Unsafe Utility Program

Programming languages, especially C, provide a library of utility routines to assist with common activities, such as moving and copying strings. In C the function strcpy(dest, src) copies a string from src to dest, stopping on a null, with the potential to overrun allocated memory. A safer function is strncpy(dest, src, max), which copies up to the null delimiter or

max characters, whichever comes first. Although there are other sources of overflow problems, from these descriptions you can readily see why so many buffer overflow problems occur. Next, we describe several classic and significant exploits in which a buffer overflow was a significant contributing cause. From these examples you can see the amount of harm that a seemingly insignificant program fault can produce.

Race Condition

As the name implies, a race condition means that two processes are competing within the same time interval, and the race affects the integrity or correctness of the computing tasks. For instance, two processes may submit competing requests to the operating system for a given chunk of memory at the same time. In the two-step request process, each process first asks whether a chunk of that size is available and, if the answer is yes, then reserves that chunk for itself. Depending on the timing of the steps, the first process could ask for the chunk, get a "yes" answer, but then not get the chunk because it has already been assigned to the second process. In cases like this, the two requesters "race" to obtain a resource. A race condition occurs most often in an operating system, but it can also occur in multithreaded or cooperating processes.

Targeted Malicious Code

Targeted malicious code is written for a particular system. The adversary who writes it studies that system carefully, detecting its weaknesses. Some notable examples follow.

Brain: The Brain virus infiltrated the boot sector and other areas of the computer. To evade discovery and sustain its invasion, it then filtered all disc access: every time the boot sector was read, Brain checked whether it was still infected, and if not, it reinstalled itself in the boot sector and elsewhere. This made the virus difficult to remove completely.
Morris Worm: The Morris worm obtained remote access to machines on the network by guessing user account passwords. If that failed, it tried to exploit a buffer overflow, and it also tried to exploit a trapdoor in sendmail. Once access to a machine had been obtained, the worm sent a bootstrap

loader to the victim. The bootstrap loader then fetched the rest of the worm, and in this process the victim machine even authenticated the sender. The Morris worm went to great lengths to remain undetected. If the transmission of the worm was interrupted, all of the code that had been transmitted was deleted. The code was encrypted when it was downloaded, and once it had been decrypted and compiled, the downloaded source code was removed. While executing on a system, the worm regularly changed its name and process identifier (PID), making it less likely that a system administrator would notice anything strange.

Code Red: The Code Red worm exploited a buffer overflow in Microsoft's IIS server software to gain access to a system. It then looked for further prospective targets by monitoring traffic on port 80. Code Red's actions were determined by the day of the month: it attempted to spread its infection from day 1 to 19, then mounted a distributed denial-of-service (DDoS) attack from day 20 to 27.

SQL Slammer: The Slammer worm attacked systems by generating IP addresses at random. It could have made better use of the available bandwidth had it used a more effective search technique; as it was, Slammer spread so quickly that it effectively consumed all available Internet bandwidth. Had Slammer throttled itself a little more, it might have attacked more systems and inflicted substantially more harm.

Time Bomb: A time bomb is a program that activates at a specific time and date. The malicious code lies dormant in storage until it is ready to go off. Y2K is an often-cited example of a time-bomb condition.

Logic Bomb: A logic bomb is a malicious program initiated when a specific condition occurs. The logic can be tied to some event: a condition or a count that the code watches for in memory, waiting for the condition to occur so it can affect the system.
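The trigger logic of a time or logic bomb can be sketched in a few lines of C. The date condition here is invented, and the payload is deliberately a harmless stub:

```c
#include <time.h>

/* A logic bomb lies dormant until its trigger condition holds. Here
 * the condition is a calendar date (tm_mon is zero-based, so 3 means
 * April); a real bomb would replace the stub with destructive code. */
int trigger_holds(const struct tm *now) {
    return now->tm_mon == 3 && now->tm_mday == 1;   /* April 1st */
}

void payload_stub(void) {
    /* harmless placeholder for the malicious action */
}

void bomb_check(const struct tm *now) {
    if (trigger_holds(now))
        payload_stub();
}
```

What makes such code hard to find is precisely this shape: the dangerous branch is dead code on every day the defenders happen to test it.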

Controls Against Program Threats

Against this threat background you may well ask how anyone can ever write secure, trustworthy, flawless programs. As the size and complexity of programs grow, so does the number of opportunities for attack. There are, however, software engineering techniques that have been shown to improve the security of code. Developers should have a reasonable understanding of security, and especially of thinking in terms of threats and vulnerabilities. Armed with that mindset and good development practices, programmers can write code that maintains security.

Software Engineering Techniques

Code usually has a long shelf life and is enhanced over time as needs change and faults are found and fixed. As a result, one of the most important principles of software engineering is to break a design or a body of code into small, self-contained units known as components or modules; a system developed in this manner is called modular. Modularity benefits both program development in general and security in particular. If a component is kept separate from the effects of other components, that is, "isolated," then the system is designed in a way that limits the damage any single fault can cause. Maintaining the system is easier, because a problem that arises can be traced to the fault that caused it. Testing (especially regression testing, making sure that everything else still works when you make a corrective change) is simpler, since changes to an isolated component do not affect other components. And when modules are well separated, designers can more easily determine where flaws may lie.

Another feature of modular software is information hiding. Each component hides its precise implementation and other internal design decisions from the others. When a change is required, the overall design can be preserved while only the necessary adjustments are made to particular components.
Modularity

Modularization, as seen in Figure 3.4, is the process of breaking a task into smaller subtasks. This is frequently done on a logical or functional basis, with each component performing a separate, independent task. To be successful, each element must meet four criteria:

• Single-purpose: performs one function
• Small: consists of an amount of information for which a human can readily grasp both structure and content
• Simple: is of a low degree of complexity, so that a human can readily understand the purpose and structure of the module
• Independent: performs a task isolated from other modules

Fig. 4.7 Modularity

Encapsulation

Encapsulation conceals the implementation details of a module, although it does not always imply complete isolation. Many elements must, usually for good reason, share information with one another. However, this sharing is carefully documented so that other components in the system can affect an element only in known ways, and sharing is minimized so that as few interfaces as possible are used. The protective boundary around an encapsulated element may be translucent or transparent, depending on the situation. Encapsulation, according to Berard, is a "method for packaging the knowledge [inside an element] in such a way that it hides what should be concealed while making accessible what is supposed to be seen."
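The four module criteria and the idea of encapsulation can be sketched in a few lines of code. The following is a hypothetical illustration (the `PasswordStore` class and its methods are invented for this example, not taken from the text): a small, single-purpose module whose internal storage and hashing scheme are hidden behind two well-documented interface methods.

```python
import hashlib


class PasswordStore:
    """A small, single-purpose module: it only stores and verifies passwords.

    The internal dictionary and the choice of hash function are hidden
    implementation details; other components interact with this module
    only through its two public methods, so either detail can change
    without affecting the rest of the system.
    """

    def __init__(self):
        self._entries = {}  # hidden: callers never touch this directly

    def set_password(self, user: str, password: str) -> None:
        """Interface method: record a password for a user."""
        self._entries[user] = hashlib.sha256(password.encode()).hexdigest()

    def check_password(self, user: str, password: str) -> bool:
        """Interface method: verify a password without revealing storage."""
        digest = hashlib.sha256(password.encode()).hexdigest()
        return self._entries.get(user) == digest
```

Because the sharing between this module and its callers is limited to two documented methods, other components can affect it only in known ways, which is exactly the minimized-interface property described above.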

Information Hiding

Developers who work in an environment that emphasizes modularization can be assured that other elements will have little effect on the ones they create. As a result, we can think of a module as a black box, with well-defined inputs and outputs and a well-defined function. Designers of other modules do not need to understand how the component accomplishes its task; all they need to know is that the module does its job correctly.

Mutual Suspicion

Programs are not always reliable. Even if an operating system enforces access restrictions, it may be hard or impracticable to bound an unproven program's access permissions effectively. In this situation, user U has a good reason to be wary of a new program P. Program P, on the other hand, might be called by another program, Q, and Q has no way of knowing whether P is correct or appropriate, any more than a user does. As a result, the term "mutual suspicion" is used to characterize the connection between two programs. Mutually suspicious programs act as though the system's other procedures were malicious or incorrect. A calling program cannot rely on the correctness of its called subprocedures, and a called subprocedure cannot rely on the correctness of its calling program. Each protects its interface data, allowing only limited access to the other. For example, a procedure to sort the items in a list cannot be trusted not to change those items, and the procedure's caller cannot be trusted to supply any list at all, let alone the number of items expected.

Confinement

An operating system applies confinement to suspicious code to help guarantee that potential harm does not propagate to other sections of the system. The system resources that confined software can access are severely limited. If a program is untrustworthy, the data it can access are carefully limited as well. Strong confinement would be especially beneficial in reducing virus propagation.
Because viruses spread through transitivity and shared data, a confined program can affect only the programs and data within its own compartment. Therefore, a virus can spread only to objects in that compartment; it cannot get outside the compartment.
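The mutual-suspicion principle described above can be sketched as code in which neither side of a call trusts the other. This is a hypothetical illustration (the function names `sorted_copy` and `report_median` are invented for this example): the called routine validates everything its caller supplies and works on a copy so it cannot alter the caller's data, while the caller checks the callee's result before using it.

```python
def sorted_copy(items):
    """A mutually suspicious sort routine.

    It does not trust its caller: it verifies that it actually received
    a list of comparable items, and it sorts a copy so that it cannot
    modify the caller's interface data even by accident.
    """
    if not isinstance(items, list):
        raise TypeError("expected a list, got %s" % type(items).__name__)
    copy = list(items)  # protect the caller's data
    try:
        copy.sort()
    except TypeError as exc:
        raise ValueError("list elements are not comparable") from exc
    return copy


def report_median(values):
    """A mutually suspicious caller: it distrusts the callee's output too."""
    ordered = sorted_copy(values)
    if len(ordered) != len(values):
        raise RuntimeError("sort routine changed the number of items")
    return ordered[len(ordered) // 2]
```

Each side secures its own interface data, which is the limited-access relationship the text describes between a calling program and its called subprocedures.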

Simplicity

The case for simplicity—of both design and implementation—should be self-evident: simple solutions are easier to understand, leave less room for error, and are easier to review for faults. The value of simplicity goes deeper, however. With a simple design, all members of the design and implementation team can understand the role and scope of each element of the design, so each participant knows not only what to expect others to do but also what others expect. Perhaps the worst problem of a running system is maintenance: After a system has been running for some time, and the designers and programmers are working on other projects (or perhaps even at other companies), a fault appears and some unlucky junior staff member is assigned the task of correcting the fault. With no background on the project, this staff member must attempt to intuit the visions of the original designers and understand the entire context of the flaw well enough to fix it. A simple design and implementation facilitate correct maintenance.

Testing

Testing is a process activity that concentrates on product quality: It seeks to locate potential product failures before they actually occur. The goal of testing is to make the product failure free (eliminating the possibility of failure); realistically, however, testing will only reduce the likelihood or limit the impact of failures. Each software problem (especially when it relates to security) has the potential not only for making software fail but also for adversely affecting a business or a life. The failure of one control may expose a vulnerability that is not ameliorated by any number of functioning controls. Testers improve software quality by finding as many faults as possible and carefully documenting their findings so that developers can locate the causes and repair the problems if possible. Testing is easier said than done, and Herbert Thompson points out that security testing is particularly hard.
James Whittaker observes in the Google Testing Blog, 20 August 2010, that "Developers grow trees; testers manage forests," meaning the job of the tester is to explore the interplay of many factors. This problem is exacerbated by side effects, dependencies, unexpected users, and poor implementation bases (languages, compilers, infrastructure). However, one of the most difficult aspects of security testing is that we can't simply look at the one behavior that the program gets right; we also have to check for the hundreds of ways the program could go bad.

Types of testing

Testing usually involves several stages. First, each program component is tested on its own. Such testing, known as module testing, component testing, or unit testing, verifies that the component functions properly with the types of input expected from a study of the component's design. Unit testing is done so that the test team can feed a predetermined set of data to the component being tested and observe what output actions and data are produced. In addition, the test team checks the internal data structures, logic, and boundary conditions for the input and output data.

When collections of components have been subjected to unit testing, the next step is ensuring that the interfaces among the components are defined and handled properly. Indeed, interface mismatch can be a significant security vulnerability, so the interface design is often documented as an application programming interface, or API. Integration testing is the process of verifying that the system components work together as described in the system and program design specifications.

Once the developers verify that information is passed among components in accordance with their design, the system is tested to ensure that it has the desired functionality. A function test evaluates the system to determine whether the functions described by the requirements specification are actually performed by the integrated system. The result is a functioning system. The function test compares the services specified in the developers' requirements specification to the system being constructed. The system is next put through a performance test to see how it performs against the remaining software and hardware requirements. Testers review security criteria and ensure that the system is as protected as it needs to be during the function and performance testing. When the performance test is complete, developers are certain that the system functions according to their understanding of the system description.
The next step is conferring with the customer to make certain that the system works according to customer expectations. Developers join the customer to perform an acceptance test, in which the system is checked against the customer's requirements description. Upon completion of acceptance testing, the accepted system is installed in the environment in which it will be used.
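The unit-testing stage described above can be sketched with Python's built-in `unittest` module. This is a minimal, hypothetical example (the `clamp` function under test is invented for illustration): the test case feeds a predetermined set of inputs to the component and checks its outputs, including the boundary conditions the text mentions.

```python
import unittest


def clamp(value, low, high):
    """Component under test: restrict value to the range [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


class ClampUnitTest(unittest.TestCase):
    """Unit test: predetermined inputs, expected outputs, boundaries."""

    def test_inside_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_boundaries(self):
        # Boundary conditions for the input and output data
        self.assertEqual(clamp(0, 0, 10), 0)
        self.assertEqual(clamp(10, 0, 10), 10)

    def test_out_of_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(42, 0, 10), 10)

    def test_invalid_interval(self):
        with self.assertRaises(ValueError):
            clamp(5, 10, 0)


if __name__ == "__main__":
    unittest.main(exit=False, verbosity=2)
```

Integration, function, and performance testing build on the same idea at larger scale: instead of one component, they exercise the documented interfaces between components and then the behavior of the assembled system.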

