
CU-MCA-SEM II-Parallel and Distributed Computing-Second Draft

Published by Teamlease Edtech Ltd (Amita Chitroda), 2021-10-12 04:24:33


OSI distinguishes three additional layers above the transport layer. In practice, only the application layer is ever used; indeed, in the Internet protocol suite everything above the transport layer is grouped together. We will see in this section that neither the OSI nor the Internet approach is really appropriate when it comes to middleware systems.

The session layer is essentially an enhanced version of the transport layer. It provides dialogue control, keeping track of which party is currently talking, and it offers synchronisation facilities. The latter are useful because they allow checkpoints to be inserted into long transfers, so that in the event of a crash it is necessary to go back only to the last checkpoint rather than all the way to the beginning. In practice, few applications care about the session layer, and it is only occasionally supported; it is not even present in the Internet protocol suite. However, when it comes to developing middleware solutions, the concept of a session and its related protocols has proven quite useful, particularly for defining higher-level communication protocols.

Unlike the lower layers, which are concerned with getting the bits from the sender to the receiver reliably and efficiently, the presentation layer is concerned with the meaning of those bits. Most messages do not consist of random bit strings but of more structured data, such as people's names, addresses and amounts of money. In the presentation layer it is possible to define records containing fields like these and then have the sender notify the receiver that a message contains a particular record in a certain format. This makes it easier for machines with different internal representations to communicate with one another.

The OSI application layer was originally intended to contain a collection of standard network applications, such as those for e-mail, file transfer and terminal emulation. By now it has become the container for all applications and protocols that in one way or another do not fit into one of the underlying layers. From the perspective of the OSI reference model, virtually all distributed systems are just applications. What is missing from this architecture is a clear distinction between applications, application-specific protocols and general-purpose protocols.

For instance, the Internet File Transfer Protocol, FTP (Postel and Reynolds, 1985; Horowitz and Lunt, 1997), defines a protocol for transferring files between a client and a server machine. The protocol should not be confused with the ftp application, which is an end-user application for transferring files and which also (not entirely by coincidence) happens to implement the Internet FTP. Another example of an application-specific protocol is the HyperText Transfer Protocol (HTTP) (Fielding et al., 1999), which is designed to manage and handle the transfer of Web pages. The protocol is implemented by applications such as Web browsers and Web servers. However, HTTP is now also used by systems that are not intrinsically tied to the Web. For example, Java's object-invocation mechanism uses HTTP to request the invocation of remote objects that are protected by a firewall (Sun Microsystems, 2004b).

Middleware Protocols

Middleware is an application that logically lives (mostly) in the application layer, but which contains numerous general-purpose protocols that warrant their own layers, independent of other, more specific applications. A distinction can be made between high-level communication protocols and protocols for establishing various middleware services. There are numerous protocols to support a wide range of middleware services. For example, there are various ways of establishing authentication, that is, providing proof of a claimed identity. Authentication protocols are not closely tied to any specific application; instead, they can be integrated into a middleware system as a general service. Likewise, authorization protocols, by which authenticated users and processes are granted access only to those resources for which they have authorization, tend to have a general, application-independent nature.

Middleware communication protocols support high-level communication services. For example, in the next two sections we shall discuss protocols that allow a process to call a procedure or invoke an object on a remote machine in a highly transparent way. Likewise, there are high-level communication services for setting up and synchronising streams for transferring real-time data, such as those needed for multimedia applications. As a last example, some middleware systems offer reliable multicast services that scale to thousands of receivers spread across a wide-area network. Some of the middleware communication protocols could equally well belong in the transport layer, but there may be specific reasons to keep them at a higher level.

For example, reliable multicast services that guarantee scalability can be implemented only when application requirements are taken into account. Consequently, a middleware system may offer different (tunable) protocols, each in turn implemented using different transport protocols, yet offering a single interface.

Figure 8.7 Middleware protocols

This approach to layering leads to a slightly adapted reference model for communication, as shown in the figure. Compared to the OSI model, the session and presentation layers have been replaced by a single middleware layer that contains application-independent protocols. These protocols do not belong in the lower layers already discussed. Transport services may be offered as a middleware service without being modified; this approach is comparable to offering UDP at the transport level. Likewise, middleware communication services may include message-passing services similar to those offered by the transport layer. In the remainder of this chapter we concentrate on four high-level middleware communication services: remote procedure calls, message-queuing services, support for communication of continuous media through streams, and multicasting.

Types of Communication

To understand the various communication options that middleware can offer to applications, we view the middleware as an additional service in client-server computing, as depicted in the figure. Consider, for example, an electronic mail system. In principle, the core of the mail delivery system can be seen as a middleware communication service. Each host runs a user agent that allows users to compose, send and receive e-mail. A sending user agent passes such mail to the mail delivery system, expecting it, in turn, to eventually deliver the mail to the intended recipient. Likewise, the user agent at the receiver's side connects to the mail delivery system to see whether any mail has arrived.

If it has, the messages are transferred to the user agent so that they can be displayed and read by the user.

An electronic mail system is a typical example of persistent communication. With persistent communication, a message that has been submitted for transmission is stored by the communication middleware for as long as it takes to deliver it to the receiver. The middleware will store the message at one or more of the storage facilities shown in the figure. As a consequence, it is not necessary for the sending application to continue executing after submitting the message. Likewise, the receiving application need not be executing when the message is submitted. In transient communication, by contrast, a message is stored by the communication system only as long as the sending and receiving applications are executing. In terms of the figure, if the middleware cannot deliver a message owing to a transmission interrupt, or because the recipient is currently not active, the message is simply discarded. Typically, all transport-level communication services offer only transient communication. In this case, the communication system consists of traditional store-and-forward routers: if a router cannot deliver a message to the next router or to the destination host, it simply drops the message.

Besides being persistent or transient, communication can also be asynchronous or synchronous. The characteristic feature of asynchronous communication is that the sender continues immediately after it has submitted its message for transmission. This means that the middleware stores the message (at least temporarily) upon submission. With synchronous communication, the sender is blocked until its request is known to be accepted. There are essentially three points at which synchronisation can take place. First, the sender may be blocked until the middleware notifies it that it will take over transmission of the request. Second, the sender may synchronise until its request has been delivered to the intended recipient. Third, synchronisation may take place by letting the sender wait until its request has been fully processed, that is, until the recipient returns a response.

In practice, various combinations of persistence and synchronisation occur. One of the most popular is persistence combined with synchronisation at request submission, which is a typical scheme for many message-queuing systems that we discuss later in this chapter. Likewise, transient communication with synchronisation after the request has been fully processed is also widely used; this scheme corresponds to remote procedure calls.

Besides persistence and synchronisation, we should also distinguish between discrete and streaming communication. All the examples so far fall into the category of discrete communication: the parties communicate by messages, each message forming a complete unit of information. Streaming, in contrast, involves sending multiple messages one after the other, where the messages are related to each other either by the order in which they are sent or by an inherent temporal relationship.

8.3 REMOTE PROCEDURE CALL

Many distributed systems have been based on explicit message exchange between processes. However, the procedures send and receive do not conceal communication at all, and such concealment is important for achieving access transparency in distributed systems. This problem had long been recognised, but little was done about it until Birrell and Nelson (1984) published a paper introducing a completely different way of handling communication. Although the idea is refreshingly simple (once someone has thought of it), the implications are often subtle. In this section we examine the concept, its implementation, and its strengths and weaknesses.

In a nutshell, Birrell and Nelson suggested allowing programs to call procedures located on other machines. When a process on machine A calls a procedure on machine B, the calling process on A is suspended, and execution of the called procedure takes place on B. Information can be transported from the caller to the callee in the parameters and can come back in the procedure result; the programmer sees no message passing at all. This method is known as Remote Procedure Call, or RPC.

While the basic idea sounds simple and elegant, there are subtle problems. To start with, because the calling and called procedures run on different machines, they execute in different address spaces, which causes complications. Parameters and results also have to be passed, which can be complicated, especially if the machines are not identical. Finally, either or both machines can crash, and each of the possible failures causes different problems. Still, most of these issues can be dealt with, and RPC is a widely used technique that underlies many distributed systems.

Basic RPC Operation

To understand how RPC works, it is important first to understand how a conventional (i.e., single-machine) procedure call works. Consider the C call count = read(fd, buf, nbytes), where fd is an integer denoting a file, buf is an array of characters into which the data are read, and nbytes is the number of bytes to read. If the call is made from the main program, the stack will be as shown in the figure before the call. To make the call, the caller pushes the parameters onto the stack in order, last one first. (The reason C compilers push the parameters in reverse order has to do with printf: this way printf can always locate its first parameter, the format string.) When the read operation has finished, it puts the return value in a register, removes the return address, and transfers control back to the caller. The caller then removes the parameters from the stack, returning the stack to the state it had before the call.
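To make the conventional call concrete, the following C fragment is a minimal sketch of the call discussed above; the file name and buffer size are illustrative choices, not taken from the text. fd and nbytes are passed by value, while buf, being an array, is passed by reference, so read can fill the caller's buffer directly.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        char buf[128];                           /* buffer the callee fills (passed by reference) */
        int fd = open("example.txt", O_RDONLY);  /* illustrative file name */
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* fd and the byte count are value parameters: copies are pushed onto the stack.
           buf decays to a pointer, so its address is pushed: a reference parameter. */
        ssize_t count = read(fd, buf, sizeof(buf));
        if (count >= 0)
            printf("read %zd bytes\n", count);

        close(fd);
        return 0;
    }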

There are a few points worth noting. For one thing, in C, parameters can be passed by value or by reference. As shown in the figure, a value parameter, such as fd or nbytes, is simply copied onto the stack. To the called procedure, a value parameter is just an initialised local variable. The called procedure may modify it, but such changes do not affect the original value at the calling side.

A reference parameter in C is a pointer to a variable (i.e., the address of the variable), rather than the value of the variable. In the call to read, the second parameter is a reference parameter because arrays are always passed by reference in C. What is actually pushed onto the stack is the address of the character array. If the called procedure uses this parameter to store something into the character array, it does modify the array in the calling procedure. As we shall see, the difference between call-by-value and call-by-reference is quite important for RPC.

Figure 8.8 Conventional procedure call

One other parameter-passing mechanism also exists, although it is not used in C. It is called call-by-copy/restore. It consists of having the caller copy the variable to the stack, as in call-by-value, and then copy it back after the call, overwriting the caller's original value. Under most conditions this achieves the same effect as call-by-reference, but in some situations, such as the same parameter appearing multiple times in the parameter list, the semantics differ. The call-by-copy/restore mechanism is not used in many languages.

The decision of which parameter-passing mechanism to use is normally made by the language designers and is a fixed property of the language. Sometimes it depends on the data type being passed. In C, for example, integers and other scalar types are always passed by value, whereas arrays, as we have just seen, are always passed by reference. For in-out parameters, some Ada compilers use copy/restore, whereas others use call-by-reference; because the language definition permits either choice, the semantics are a little fuzzy.

Client and Server Stubs

The goal of RPC is to make a remote procedure call look as much as possible like a local one. In other words, we want RPC to be transparent: the calling procedure should not be aware that the called procedure is executing on a different machine, and vice versa. Suppose a program needs to read some data from a file. The programmer puts a call to read in the code to get the data. In a traditional (single-processor) system, the read routine is extracted from the library by the linker and inserted into the object program. It is a short procedure, generally implemented by invoking an equivalent read system call. In other words, the read procedure acts as an interface between the user code and the local operating system. Even though read does a system call, it is invoked in the usual way, by pushing its parameters onto the stack, as shown in the figure. Thus the programmer does not know that read is actually doing anything unusual.

RPC achieves its transparency in an analogous way. When read is actually a remote procedure (for example, one that will run on the file server's machine), a different version of read, called a client stub, is put into the library. Like the original, it is called using the normal calling sequence. Also like the original, it makes a call to the local operating system. Only, unlike the original, it does not ask the operating system to give it data. Instead, it packs the parameters into a message and requests that the message be sent to the server, as shown in the figure. Following the call to send, the client stub calls receive and blocks until the reply comes back.

When the message arrives at the server, the server's operating system passes it up to a server stub. A server stub is the server-side counterpart of a client stub: it is a piece of code that transforms requests coming in over the network into local procedure calls. Typically the server stub will have called receive and be blocked waiting for incoming messages. The server stub unpacks the parameters from the message and then calls the server procedure in the usual way (see the figure). From the server's point of view, it is as if it were being called directly by the client: the parameters and return address are on the stack where they belong, and nothing seems unusual. The server performs its work and returns the result to its caller in the usual way. In the case of read, for example, the server will fill the buffer pointed to by the second parameter with the data; this buffer will be internal to the server stub.

When the server stub gets control back after the call has completed, it packs the result (the buffer) into a message and calls send to return it to the client. After that, the server stub usually makes another call to receive, to wait for the next incoming request. When the reply message arrives at the client machine, the client's operating system sees that it is addressed to the client process (or, more precisely, to the client stub, though the operating system cannot tell the difference).

The client process is unblocked once the message has been copied into the waiting buffer. The client stub inspects the message, unpacks the result, copies it to its caller, and returns in the usual way. When the caller regains control following the call to read, all it knows is that its data are available. It has no idea that the work was done remotely rather than by the local operating system.

This blissful ignorance on the part of the client is the beauty of the whole scheme. Remote services are accessed by making ordinary (i.e., local) procedure calls, not by calling send and receive. All the details of message passing are hidden in the two library procedures, just as traditional libraries hide the details of actually making system calls.

To summarise, a remote procedure call takes place in the following steps:

1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local operating system.
3. The client's OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server.
6. The server does the work and returns the result to the stub.
7. The server stub packs the result in a message and calls its local OS.
8. The server's OS sends the message to the client's OS.
9. The client's OS gives the message to the client stub.
10. The stub unpacks the result and returns to the client.

The net effect of all these steps is to convert the local call by the client procedure to the client stub into a local call to the server procedure, without either client or server being aware of the intermediate steps or of the existence of the network.

8.4 REMOTE OBJECT INVOCATION

A remote object is an object whose methods can be invoked by a client running in another Java Virtual Machine (JVM), possibly on a different host. A remote object is described by one or more remote interfaces, which declare the methods that can be invoked remotely. Remote method invocation (RMI) is similar to remote procedure call (RPC), but extended into the world of distributed objects. Using RMI, a calling object can invoke a method on a potentially remote object. As with RPC, the underlying details are generally hidden from the user. RMI and RPC are similar in the following respects:

 They both support programming with interfaces, along with the benefits that this brings.

 They are both typically built on top of request-reply protocols and can support a range of call semantics, such as at-least-once and at-most-once.
 They both offer a similar level of transparency: local and remote calls employ the same syntax, but remote interfaces typically expose the distributed nature of the underlying call, for example by supporting remote exceptions.

The following differences lead to added expressiveness when it comes to programming sophisticated distributed systems and services. The programmer can use the full expressive power of object-oriented programming in the development of distributed systems software, including objects, classes and inheritance, together with the associated object-oriented design methodologies and tools. Building on the concept of object identity in object-oriented systems, most objects in an RMI-based system have unique object references (whether local or remote), and such references can also be passed as parameters, offering significantly richer parameter-passing semantics than RPC.

The issue of parameter passing is particularly important in distributed systems. RMI allows the programmer to pass parameters not only by value, as input or output parameters, but also by object reference. Passing references is particularly attractive if the underlying parameter is large or complex. The remote end, on receiving an object reference, can then access the object using remote method invocation instead of having the object's value transmitted across the network.

The rest of this section examines the concept of remote method invocation in more detail, looking first at the key distributed object models before turning to implementation issues, including distributed garbage collection.

Design issues for RMI

As mentioned above, RMI shares the same design issues as RPC with respect to programming with interfaces, call semantics and transparency; the reader is encouraged to refer back to Section 5.3.1 for coverage of these topics. The key additional design issue concerns the object model and, in particular, achieving the transition from objects to distributed objects. The conventional, single-image object model is described first, followed by the distributed object model.

The object model • An object-oriented program, for example in Java or C++, consists of a collection of interacting objects, each of which has a set of data and a set of methods. An object communicates with other objects by invoking their methods, generally passing arguments and receiving results. Objects can encapsulate their data as well as the code of their methods. Some languages, such as Java and C++, allow programmers to define objects whose instance variables can be accessed directly; but for use in a distributed object system, an object's data should be accessible only through its methods.
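To illustrate the difference in parameter-passing semantics mentioned above, the following C sketch shows two ways a request message might carry an argument: an inline copy of the data (call-by-value) or a remote object reference that the receiver can later use to invoke the object via RMI. The message layout and field names are purely illustrative assumptions, not the format of any particular RMI system.

    #include <stdint.h>
    #include <string.h>

    /* Illustrative remote object reference: just enough to locate the object. */
    struct remote_ref {
        char     host[64];     /* address of the server holding the object */
        uint16_t port;         /* endpoint the server listens on */
        uint32_t object_id;    /* identifies the object within that server */
    };

    /* An argument is either copied into the message or referred to remotely. */
    enum arg_kind { ARG_BY_VALUE, ARG_BY_REMOTE_REF };

    struct argument {
        enum arg_kind kind;
        union {
            char              value[256];  /* inline copy of (small) data */
            struct remote_ref ref;         /* reference to a (large) remote object */
        } u;
    };

    /* Pass a small argument by value: the bytes travel inside the message. */
    static void marshal_by_value(struct argument *a, const void *data, size_t len) {
        a->kind = ARG_BY_VALUE;
        memcpy(a->u.value, data, len < sizeof(a->u.value) ? len : sizeof(a->u.value));
    }

    /* Pass a large or complex argument by remote reference: only the reference
       travels; the receiver invokes the object via RMI when it needs the data. */
    static void marshal_by_ref(struct argument *a, const struct remote_ref *r) {
        a->kind = ARG_BY_REMOTE_REF;
        a->u.ref = *r;
    }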

References to objects: Objects can be accessed via object references. In Java, for example, a variable that appears to hold an object actually holds a reference to that object. To invoke a method in an object, the object reference and method name are given, together with any necessary arguments. The object whose method is invoked is sometimes called the target or the receiver. Object references are first-class values, meaning that they may, for example, be assigned to variables, passed as arguments and returned as the results of methods.

Interfaces: An interface defines the signatures of a set of methods (that is, the types of their arguments, return values and exceptions) without specifying their implementation. An object will provide a particular interface if its class contains code that implements the methods of that interface. In Java, a class may implement several interfaces, and the methods of an interface may be implemented by any class. An interface also defines types that can be used to declare the types of variables, parameters and method return values. Note that interfaces do not have constructors.

Actions: Action in an object-oriented program is initiated by an object invoking a method in another object. An invocation can include additional information (arguments) needed to carry out the method. The receiver executes the appropriate method and then returns control to the invoking object, sometimes supplying a result. An invocation of a method can have three effects:

1. The state of the receiver may be changed.
2. A new object may be instantiated, for example by using a constructor in Java or C++.
3. Further invocations on methods in other objects may take place.

Since an invocation can lead to further invocations of methods in other objects, an action is a chain of related method invocations, each of which eventually returns.

Exceptions: Programs can encounter many sorts of errors and unexpected conditions of varying seriousness. During the execution of a method, many different problems may be discovered, such as inconsistent values in the object's variables or failures in attempts to read or write files or network sockets. When programmers have to insert tests in their code to deal with every possible unusual or erroneous case, the clarity of the normal case suffers. Exceptions provide a clean way to deal with error conditions without complicating the code. In addition, each method heading explicitly lists as exceptions the error conditions it might encounter, allowing users of the method to deal with them. A block of code may be defined to throw an exception whenever particular unexpected conditions or errors arise. This means that control passes to another block of code that catches the exception; control does not return to the place where the exception was thrown.

Garbage collection: When objects are no longer needed, it is essential to provide a means of freeing the space they occupy.

A language such as Java, which can detect when an object is no longer accessible, recovers the space and makes it available for allocation to other objects; this process is called garbage collection. When a language (for example, C++) does not support garbage collection, the programmer has to cope with freeing the space allocated to objects, which is a common source of errors.

Distributed objects • The state of an object consists of the values of its instance variables. In the object-based paradigm the state of a program is partitioned into separate parts, each of which is associated with an object. Since object-based programs are logically partitioned, the physical distribution of objects into different processes or computers in a distributed system is a natural extension (the issue of placement is discussed in Section 2.3.1).

Distributed object systems may adopt the client-server architecture. In this case, objects are managed by servers and their clients invoke their methods using remote method invocation. In RMI, the client's request to invoke a method of an object is sent in a message to the server managing the object. The invocation is carried out by executing a method of the object at the server, and the result is returned to the client in another message. To allow for chains of related invocations, objects in servers are allowed to become clients of objects in other servers. Distributed objects can also assume other architectural models. For example, objects can be replicated in order to obtain the usual benefits of fault tolerance and improved performance, and objects can be migrated with a view to enhancing their performance and availability.

Having client and server objects in different processes enforces encapsulation. That is, the state of an object can be accessed only by the methods of the object, which means that it is not possible for unauthorised methods to act on the state. For example, the possibility of concurrent RMIs from objects in different computers implies that an object may be accessed concurrently, so the possibility of conflicting accesses arises. However, the fact that an object's data is accessed only by its own methods allows objects to provide methods for protecting themselves against incorrect accesses. For example, they may use synchronisation primitives such as condition variables to protect access to their instance variables (a small sketch follows below).

Another advantage of treating the shared state of a distributed program as a collection of objects is that an object may either be accessed via RMI or be copied into a local cache and accessed directly, provided that the class implementation is available locally.
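As a small illustration of how state that is reachable only through methods can protect itself against conflicting accesses, the following C sketch hides a counter behind access functions guarded by a mutex. The names and the use of POSIX threads are assumptions made for the example (a mutex is used rather than a condition variable for brevity); this is not the mechanism of any particular distributed object system.

    #include <pthread.h>

    /* The state is reachable only through the functions below,
       which serialise concurrent accesses. */
    struct counter {
        pthread_mutex_t lock;
        long            value;
    };

    void counter_init(struct counter *c) {
        pthread_mutex_init(&c->lock, NULL);
        c->value = 0;
    }

    /* The only way to modify the state: concurrent invocations arriving in
       different server threads cannot corrupt the value. */
    void counter_increment(struct counter *c) {
        pthread_mutex_lock(&c->lock);
        c->value++;
        pthread_mutex_unlock(&c->lock);
    }

    long counter_read(struct counter *c) {
        pthread_mutex_lock(&c->lock);
        long v = c->value;
        pthread_mutex_unlock(&c->lock);
        return v;
    }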

Figure 8.9 Remote invocation

Since objects can be accessed only via their methods, a further advantage in heterogeneous systems is that different data formats may be used at different sites; these formats go unnoticed by clients that use RMI to access the objects' methods.

The distributed object model • This section extends the object model to make it applicable to distributed objects. Each process contains a collection of objects, some of which can receive both remote and local invocations, whereas the other objects can receive only local invocations, as shown in the figure. Method invocations between objects in different processes, whether in the same computer or not, are known as remote method invocations. Method invocations between objects in the same process are local method invocations. We refer to objects that can receive remote invocations as remote objects; in the figure, the objects B and F are remote objects. All objects can receive local invocations, although they can receive them only from other objects that hold references to them. For example, object C must have a reference to object E so that it can invoke one of its methods.

The following two fundamental concepts are at the heart of the distributed object model:

Remote object references: Other objects can invoke the methods of a remote object only if they have access to its remote object reference. For example, a remote object reference for B must be available to A in the figure.

Remote interfaces: Every remote object has a remote interface that specifies which of its methods can be invoked remotely. For example, the objects B and F in the figure must have remote interfaces.

Remote object references, remote interfaces and other aspects of the distributed object model are examined next.

References to remote objects: The notion of an object reference is extended so that any object that can receive an RMI has a remote object reference.

A remote object reference is an identifier that can be used throughout a distributed system to refer to a single, unique remote object. Its representation, which differs from that of a local object reference, is discussed in Section 4.3.4. Remote object references are analogous to local ones in the following ways:

1. The invoker specifies the remote object that is to receive a remote method invocation as a remote object reference.
2. Remote object references may be passed as arguments and results of remote method invocations.

Remote interfaces: The class of a remote object implements the methods of its remote interface, for example as public instance methods in Java. Objects in other processes can invoke only the methods that belong to the remote interface, as shown in the figure. Local objects, by contrast, can invoke the methods of the remote interface as well as any other methods implemented by the remote object. Note that remote interfaces, like all interfaces, do not have constructors.

The CORBA system provides an interface definition language (IDL) for defining remote interfaces; the figure shows an example of a remote interface defined in CORBA IDL. The classes of remote objects and the client programs may be implemented in any language for which an IDL compiler is available, such as C++, Java or Python. CORBA clients need not use the same language as the remote object in order to invoke its methods remotely. In Java RMI, remote interfaces are defined in the same way as any other Java interface; they acquire the ability to be remote interfaces by extending the interface named Remote. Both CORBA IDL (Chapter 8) and Java support multiple interface inheritance; that is, an interface is allowed to extend one or more other interfaces.

Actions in a distributed object system • As in the non-distributed case, an action is initiated by a method invocation, which may result in further invocations on methods in other objects. In the distributed case, however, the objects and methods involved in a chain of related invocations may be located in different processes or on different computers. When an invocation crosses the boundary of a process or computer, RMI is used, and the invoker must hold a remote object reference to the invoked object. In the figure, object A needs to hold a remote object reference to object B. Remote object references may be obtained as the results of remote method invocations; for example, object A in the figure might obtain a remote reference to object F from object B.

When an action leads to the instantiation of a new object, that object will normally live within the process where instantiation is requested, for example, wherever the constructor is used.

If the newly instantiated object has a remote interface, it will be a remote object with a remote object reference. Distributed applications may provide remote objects with methods for instantiating objects that can then be accessed by RMI, thus effectively providing the effect of remote instantiation of objects. For example, if the object L in the figure contained a method for creating remote objects, then the remote invocations from C and K could lead to the instantiation of the objects M and N, respectively.

Figure 8.10 A remote object

8.5 MESSAGE ORIENTED COMMUNICATION

Remote procedure calls and remote object invocations contribute to hiding communication in distributed systems, that is, they enhance access transparency. Unfortunately, neither mechanism is always appropriate. In particular, when it cannot be assumed that the receiving side is executing at the time a request is issued, alternative communication services are needed. Likewise, the inherently synchronous nature of RPCs, by which a client is blocked until its request has been processed, sometimes needs to be replaced by something else. That something else is messaging. In this section we concentrate on message-oriented communication in distributed systems by first taking a closer look at what synchronous behaviour is and what its implications are. Then we discuss messaging systems that assume that both parties are executing at the time of communication. Finally, we examine message-queuing systems, which allow processes to exchange messages even if the other party is not executing at the moment communication is initiated.

Message-Oriented Transient Communication

Many distributed systems and applications are built directly on top of the simple message-oriented model offered by the transport layer. To better understand and appreciate message-oriented middleware solutions, we first discuss messaging through transport-level sockets.

Special attention has been paid to standardising the interface of the transport layer, so that programmers can use its whole suite of (messaging) protocols through a small set of primitives. Standard interfaces also make it easier to port an application to a different machine. As an example, we briefly discuss the sockets interface, introduced in Berkeley UNIX in the 1970s. Another important interface is XTI, the X/Open Transport Interface, formerly called the Transport Layer Interface (TLI) and developed by AT&T. Sockets and XTI are very similar in their model of network programming, but differ in their sets of primitives.

Conceptually, a socket is a communication end point to which an application can write data that are to be sent out over the underlying network, and from which incoming data can be read. A socket forms an abstraction over the actual communication end point used by the local operating system for a particular transport protocol. The text that follows concentrates on the socket primitives.

Servers generally execute the first four primitives, normally in the order given. When calling the socket primitive, the caller creates a new communication end point for a specific transport protocol. Internally, creating a communication end point means that the local operating system reserves resources to accommodate sending and receiving messages for the specified protocol.

The bind primitive associates a local address with the newly created socket. For example, a server should bind the IP address of its machine, together with a (possibly well-known) port number, to a socket. Binding tells the operating system that the server wants to receive messages only on the specified address and port.
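The following C sketch shows the server-side sequence just described, socket creation followed by bind, together with the listen, accept, receive and send steps summarised in Table 8.1 and discussed in the next paragraphs. The port number and buffer size are arbitrary choices made for the example, and error handling is omitted for brevity.

    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        /* socket: create a new communication end point for TCP. */
        int srv = socket(AF_INET, SOCK_STREAM, 0);

        /* bind: attach a local IP address and port to the socket. */
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(12345);            /* arbitrary example port */
        bind(srv, (struct sockaddr *)&addr, sizeof(addr));

        /* listen: reserve room for pending connection requests. */
        listen(srv, 5);

        /* accept: block until a client connects, then talk over the new socket. */
        int conn = accept(srv, NULL, NULL);
        char buf[128];
        ssize_t n = recv(conn, buf, sizeof(buf), 0);
        if (n > 0)
            send(conn, buf, (size_t)n, 0);       /* echo the request back */

        close(conn);
        close(srv);
        return 0;
    }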

Table 8.1 Socket primitives and their meanings

The listen primitive is called only in the case of connection-oriented communication. It is a nonblocking call that allows the local operating system to reserve enough buffers for a specified maximum number of pending connection requests that the caller is willing to accept. A call to accept blocks the caller until a connection request arrives. When a request arrives, the local operating system creates a new socket with the same properties as the original one and returns it to the caller. This approach allows the server, for example, to fork off a process that will handle the actual communication over the new connection. In the meantime, the server can go back to the original socket and wait for another connection request.

Let us now look at the client side. Here, too, a socket must first be created using the socket primitive, but explicitly binding the socket to a local address is not necessary, since the operating system can dynamically allocate a port when the connection is set up. The connect primitive requires that the caller specify the transport-level address to which a connection request is to be sent. The client is blocked until a connection has been set up successfully, after which both sides can start exchanging data through the send and receive primitives. Finally, closing a connection is symmetric when sockets are used, and is accomplished by having both the client and the server call the close primitive. The general pattern followed by a client and server for connection-oriented communication using sockets is shown in the figure. Details on network programming using sockets and other interfaces in a UNIX environment can be found in Stevens (1998).

Message Passing Interface (MPI)

With the advent of high-performance multicomputers, developers have been looking for message-oriented primitives that would allow them to write highly efficient applications. This means that the primitives should be at a convenient level of abstraction (to ease application development) and that their implementation should incur only minimal overhead. However, sockets were deemed insufficient for two reasons.

First, sockets support only simple send and receive primitives, which is the wrong level of abstraction. Second, sockets were designed to communicate across networks using general-purpose protocol stacks such as TCP/IP. They were not considered suitable for the proprietary protocols developed for high-speed interconnection networks, such as those used in high-performance server clusters. Those protocols required an interface that could handle more advanced features such as different forms of buffering and synchronisation.

Figure 8.11 Berkeley sockets

The result was that most interconnection networks and high-performance multicomputers were shipped with proprietary communication libraries. These libraries offered a wealth of high-level and generally efficient communication primitives. Of course, all the libraries were mutually incompatible, so application developers faced a portability problem.

The need to be hardware and platform independent eventually led to the definition of a standard for message passing: the Message-Passing Interface, or MPI. MPI is designed for parallel applications and as such is tailored to transient communication. It makes direct use of the underlying network. It also assumes that serious failures, such as process crashes or network partitions, are fatal and do not require automatic recovery.

MPI assumes that communication takes place within a known group of processes. Each group is assigned an identifier, and each process within a group is also assigned a (local) identifier. A (groupID, processID) pair therefore uniquely identifies the source or destination of a message and is used instead of a transport-level address. A computation may involve several, possibly overlapping, groups of processes that are all executing at the same time.

At the core of MPI are messaging primitives to support transient communication, the most intuitive of which are summarised in the table. Transient asynchronous communication is supported by means of the MPI_bsend primitive. The sender submits a message for transmission, which is generally first copied to a local buffer in the MPI runtime system. When the message has been copied, the sender continues. As soon as a receiver calls a receive primitive, the local MPI runtime system removes the message from its local buffer and takes care of its transmission.

Table 8.2 MPI primitives

There is also the MPI_send primitive, a blocking send operation whose semantics are implementation dependent. MPI_send may block the caller either until the specified message has been copied to the MPI runtime system at the sender's side, or until the receiver has initiated a receive operation. Synchronous communication, in which the sender blocks until its request is accepted for further processing, is available through the MPI_ssend primitive. Finally, the strongest form of synchronous communication is also supported: when a sender calls MPI_sendrecv, it sends a request to the receiver and blocks until the latter returns a reply. This primitive is essentially the same as a normal RPC.

Both MPI_send and MPI_ssend have variants that avoid copying messages from user buffers into buffers internal to the local MPI runtime system. These variants correspond to a form of asynchronous communication. With MPI_isend, a sender passes a pointer to the message, after which the MPI runtime system takes care of communication; the sender continues immediately. To avoid overwriting the message before communication completes, MPI offers primitives to check for completion, or even to block if required. As with MPI_send, whether the message has actually been transferred to the receiver or has merely been copied by the local MPI runtime system to an internal buffer is left unspecified. Likewise, with MPI_issend a sender passes only a pointer to the MPI runtime system; when the runtime system indicates that it has processed the message, the sender knows that the receiver has accepted the message and is working on it.

The operation MPI_recv is called to receive a message; it blocks the caller until a message arrives. There is also an asynchronous variant, MPI_irecv, by which a receiver indicates that it is prepared to accept a message. The receiver can check whether a message has indeed arrived, or block until one does.
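A minimal MPI program in C using the blocking primitives discussed above might look as follows. The message contents and tag are arbitrary choices for the example, the program assumes it is started with at least two processes (e.g. mpirun -np 2), and note that the official C bindings spell the primitives MPI_Send and MPI_Recv.

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* local identifier within the group */

        char msg[32];
        if (rank == 0) {
            strcpy(msg, "hello");
            /* Blocking send to process 1; tag 0 is an arbitrary choice. */
            MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Blocks until the message from process 0 arrives. */
            MPI_Recv(msg, sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("process 1 received: %s\n", msg);
        }

        MPI_Finalize();
        return 0;
    }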

The semantics of MPI's communication primitives are not always straightforward, and different primitives can sometimes be interchanged without affecting the correctness of a program. The reason so many different forms of communication are supported is that it gives implementers of MPI systems plenty of opportunities to optimise performance; cynics might say that the committee could not make up its mind and so threw everything in. MPI has been designed for high-performance parallel applications, which makes its diversity of communication primitives easier to understand. More on MPI can be found in Gropp et al. (1998b).

8.6 STREAM-ORIENTED COMMUNICATION

Communication as discussed so far has concentrated on exchanging more or less independent and complete units of information. Examples include a request to invoke a procedure, the reply to such a request, and messages exchanged between applications in message-queuing systems. The characteristic feature of this type of communication is that it does not matter at what particular point in time the communication takes place. Even if a system performs too slowly or too quickly, timing has no effect on correctness.

There are also forms of communication in which timing plays a crucial role. Consider, for example, an audio stream built up as a sequence of 16-bit samples, each representing the amplitude of the sound wave, as is done with Pulse Code Modulation (PCM). Assume also that the audio stream is of CD quality, meaning that the original sound wave was sampled at a frequency of 44,100 Hz. To reproduce the original sound, the samples in the audio stream must be played out in the order they appear in the stream, and also at intervals of exactly 1/44,100 seconds. Playing them out at a different rate will produce a distorted version of the original sound.

Support for continuous media

In this section we consider what facilities a distributed system should offer to exchange time-dependent data such as video and audio streams. Halsall (2001) discusses a number of network protocols that deal with stream-oriented communication. Steinmetz and Nahrstedt give a broad overview of multimedia issues, including stream-oriented communication, in their book. Babcock et al. discuss query processing on data streams.

Support for the exchange of time-dependent information is often formulated as support for continuous media. A medium refers to the means by which information is conveyed; storage and transmission media, as well as presentation media such as a monitor, are examples of such means. An important type of medium concerns the way information is represented, in other words, how information is encoded in a computer system. Different representations are used for different types of information. Text, for example, is usually encoded as ASCII or Unicode. Images can be represented in a variety of formats, including GIF and JPEG.

In a computer system, audio streams can be encoded by taking 16-bit samples using PCM, for example. In continuous (representation) media, the temporal relationships between the different data items are fundamental to correctly interpreting what the data actually mean. We have already seen how an audio stream must be played out to reproduce a sound wave. As another example, consider motion. Motion can be represented by a series of images in which successive images must be displayed at a uniform spacing T in time, typically 30-40 msec per image. Correct reproduction requires not only showing the stills in the right order but also showing them at a constant rate of 1/T images per second. In contrast to continuous media, discrete (representation) media are characterised by the fact that temporal relationships between data items are not fundamental to correctly interpreting the data. Typical examples of discrete media include text and still images, as well as object code and executable files.

Data streams

To capture the exchange of time-dependent information, distributed systems typically provide support for data streams. A data stream is simply a sequence of data items. Data streams can be applied to discrete as well as continuous media; UNIX pipes and TCP/IP connections are typical examples of (byte-oriented) discrete data streams. Playing an audio file usually requires setting up a continuous data stream between the file and the audio device.

Timing is crucial to continuous data streams. To capture timing aspects, a distinction is often made between different transmission modes. In asynchronous transmission mode, the data items in a stream are transmitted one after the other, but there are no further timing constraints on when the items should be delivered. This is typically the case for discrete data streams. A file, for example, can be transferred as a data stream, but exactly when each item is transferred is largely irrelevant.

In synchronous transmission mode, there is a maximum end-to-end delay defined for each unit in a data stream. It does not matter whether a data unit is transferred much faster than the maximum tolerated delay. For example, a sensor could sample temperature at a fixed rate and pass it over a network to an operator. In that case it may be important that the end-to-end propagation time over the network is guaranteed to be lower than the time interval between samples, but it does no harm if samples are propagated much faster than necessary.

Finally, in isochronous transmission mode, data units must be transferred on time. This means that data transfer is subject to both a maximum and a minimum end-to-end delay, also referred to as bounded (delay) jitter. The isochronous transmission mode is particularly interesting for distributed multimedia systems, since it is crucial for representing audio and video.

In this chapter we consider only continuous data streams using isochronous transmission, and we shall refer to them simply as streams.

Streams can be simple or complex. A simple stream consists of only a single sequence of data, whereas a complex stream consists of several related simple streams, called substreams. The relationship between the substreams in a complex stream is often also time dependent. For example, stereo audio can be transmitted as a complex stream consisting of two substreams, each carrying a single audio channel. It is essential, however, that the two substreams remain continuously synchronised; in other words, data units from each stream must be communicated pairwise to ensure the stereo effect. Another example of a complex stream is one for transmitting a movie. Such a stream could consist of a single video stream together with two streams for the movie's stereo sound. A fourth stream might contain subtitles for the deaf, or a translation into a language other than that of the audio. Here, too, synchronisation of the substreams is essential: if it fails, reproduction of the movie fails. We return to stream synchronisation below.

From the standpoint of distributed systems, we can identify several elements that are needed to support streams. For simplicity we concentrate on streaming stored data rather than live data. In the latter case, data are captured in real time and sent over the network to recipients; the main difference between the two is that live streaming leaves fewer opportunities for tuning a stream. Following Wu et al. (2001), we can then sketch a general client-server architecture for supporting continuous multimedia streams, as shown in the figure.

Figure 8.12 Stream-oriented communication

Streams and Quality of Service

Timing (as well as other non-functional) requirements are generally expressed as Quality of Service (QoS) requirements. These requirements describe what the underlying distributed system and network must provide to ensure, for example, that the temporal relationships in a stream can be preserved. The major QoS concerns for continuous data streams are timeliness, volume, and reliability.

In this section we look at QoS and how it relates to setting up a stream. Much has been written about how to specify the required QoS (see, for example, Jin and Nahrstedt, 2004). In many cases, it comes down to specifying a few important properties from the application's point of view (Halsall, 2001):

1. The required bit rate at which data should be transported.
2. The maximum delay until a session has been set up (i.e., when an application can start sending data).
3. The maximum end-to-end delay (the time it takes for a data unit to reach its destination).
4. The maximum delay variance, or jitter.
5. The maximum round-trip delay.

These parameters can be refined in various ways, as Steinmetz and Nahrstedt (2004) show. However, when dealing with stream-oriented communication based on the Internet protocol stack, we simply have to accept that the basis of communication is formed by an extremely simple, best-effort datagram service: IP. When the going gets tough, as can easily happen on the Internet, the IP specification allows a protocol implementation to drop packets as needed. Many, if not all, distributed systems that support stream-oriented communication are currently built on top of the Internet protocol stack. So much for QoS specifications. (To be fair, IP does provide some QoS support, but it is rarely used.)

Given that the underlying system offers only a best-effort delivery service, a distributed system can try to conceal the lack of quality of service as much as possible. Fortunately, it has several mechanisms at its disposal.

To begin with, the situation is not quite as bad as sketched so far. For example, the Internet provides a means of differentiating classes of data through its differentiated services. A sending host can essentially mark outgoing packets as belonging to one of several classes, including an expedited forwarding class that specifies that a packet should be forwarded by the current router with absolute priority (Davie et al., 2002). There is also an assured forwarding class, by which traffic is divided into four subclasses, along with three ways to drop packets if the network becomes congested. Assured forwarding therefore effectively defines a range of priorities that can be assigned to packets, allowing applications to differentiate time-sensitive packets from non-critical ones.

Besides these network-level solutions, a distributed system can also help get data across to receivers. Although there are generally not many tools available, one particularly useful one is the use of buffers to reduce jitter. The principle is simple, as shown in the figure. Assuming that packets are delayed with a certain variance when transmitted over the network, the receiver simply keeps incoming packets in a buffer for a while.

Buffering allows the receiver to pass packets to the application at a constant rate, knowing that there will always be enough packets in the buffer to play them back at that rate.

Figure 8.13 Buffering to reduce jitter

Of course, things can go wrong, as illustrated for packet #8 in the figure. The size of the receiver's buffer corresponds to 9 seconds' worth of packets to pass to the application. Unfortunately, packet #8 took 11 seconds to reach the receiver, by which time the buffer had been completely emptied. The result is a gap in the playback at the application. The only solution is to increase the buffer size, but the obvious drawback is that the delay before the receiving application can start playing back the data contained in the packets increases as well.

Other techniques can be used as well. Recognising that we are dealing with a best-effort service also means that packets may be lost. To compensate for this loss in quality of service, we need to apply error correction techniques (Perkins et al., 1998; Wah et al., 2000). Requesting the sender to retransmit a missing packet is generally not an option, so forward error correction (FEC) must be applied. One well-known technique is to encode the outgoing packets in such a way that any k out of n received packets is enough to reconstruct k correct packets.

One problem that may occur is that a single packet contains multiple audio and video frames. As a consequence, when a packet is lost, the receiver may perceive a large gap in the frames being played out. This effect can be mitigated by interleaving frames, as shown in the figure. In this way, when a packet is lost, the resulting gap in successive frames is spread out over time. Note, however, that this approach requires a larger receive buffer than non-interleaving, and so imposes a longer start-up delay on the receiving application. For example, in case (b) of the figure, the receiver needs four packets to play the first four frames, as opposed to only one packet with non-interleaved transmission.
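The interleaving idea can be sketched in a few lines of C: instead of placing consecutive frames in the same packet, frames are assigned to packets round-robin, so the loss of one packet removes every fourth frame rather than four consecutive frames. The frame and packet counts below are arbitrary illustrative values.

    #include <stdio.h>

    #define NUM_FRAMES        16
    #define FRAMES_PER_PACKET 4
    #define NUM_PACKETS       (NUM_FRAMES / FRAMES_PER_PACKET)

    int main(void) {
        /* packet[p] lists the frame numbers carried by packet p. */
        int packet[NUM_PACKETS][FRAMES_PER_PACKET];

        /* Interleaved assignment: frame f goes into packet f % NUM_PACKETS. */
        for (int f = 0; f < NUM_FRAMES; f++)
            packet[f % NUM_PACKETS][f / NUM_PACKETS] = f;

        for (int p = 0; p < NUM_PACKETS; p++) {
            printf("packet %d carries frames:", p);
            for (int i = 0; i < FRAMES_PER_PACKET; i++)
                printf(" %d", packet[p][i]);
            printf("\n");
            /* Losing this one packet therefore costs frames p, p+4, p+8, p+12:
               single-frame gaps spread over time, not four frames in a row. */
        }
        return 0;
    }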

Figure 8.14 Enforcing QoS 8.7 SUMMARY  These protocols are frequently modelled as a succession of layers, each of which addresses a different functional component of the connection. Each layer has a well- defined interface that connects it to the layers above and below it.  Layering network protocols and services not only simplifies networking protocols by breaking them down into smaller, more manageable components, but it also gives them more flexibility. Protocols can be built for interoperability by splitting them into layers.  Models of communication that focus on the conveyance of messages or information, or that limit meaning to explicit content (see also transmission models; compare informational communication). In practise, sender-oriented formulations are the most common.  Stream-oriented communication is a type of communication that emphasises the importance of timing. Continuous streams of data are another term for stream-oriented communication.  In audio and video streaming, stream-oriented communication is commonly employed. A request-response model can be applied to message communication. You 174 CU IDOL SELF LEARNING MATERIAL (SLM)

might start getting data without asking for it in Stream communication. Timing limitations are one feature of stream communication.  A remote procedure call (RPC) in distributed computing is when a computer programme causes a procedure (subroutine) to execute in a different address space (typically on another computer on a shared network), coded as if it were a normal (local) procedure call, without the programmer explicitly coding the details for the remote interaction. That is, whether the subroutine is local to the current programme or remote, the programmer writes essentially the same code.  A request–response message-passing mechanism is often used to achieve this type of client–server interaction (caller is client, executor is server). RPCs are represented via remote method invocation in the object-oriented programming paradigm (RMI). The RPC model implies a degree of location transparency, i.e., calling processes are basically the same whether they are local or remote, but they aren't always identical, allowing local and remote calls to be distinguished? Because remote calls are typically orders of magnitude slower and less dependable than local calls, it's critical to know the difference. 8.8 KEYWORDS  Authentication- A method usually based on encryption and shared secrets for establishing the identity of a message sender.  Client-server architecture -A widely used architecture model in which client processes interact with individual server processes on separate host computers in order to access shared data.  RMI (Remote Method Invocation)-Method invocations between objects in different processes.  RIP (Router Information Protocol)-The protocol run in a router which determines the next hop of a packet and periodically updates its routing table based on the information received from its neighbours.  WWW (The World Wide Web)-The evolving system for publishing and accessing resources and services across the Internet based on the following technologies: HTTP, URLs and client-server architecture. 8.9 LEARNING ACTIVITY 1. A client makes RMIs to a server. The client takes 5 ms to compute the arguments for each request, and the server takes 10 ms to process each request. The local OS processing time for each send or receive operation is 0.5 ms, and the network time to transmit each request or reply message is 3 ms. Marshalling or unmarshalling takes 0.5 ms per message. Estimate the 175 CU IDOL SELF LEARNING MATERIAL (SLM)

time taken by the client to generate and return from two requests (i) if it is single-threaded, and (ii) if it has two threads that can make requests concurrently on a single processor. Is there a need for asynchronous RMI if processes are multi-threaded? 2. Define an alternative signature for the methods pairIn and readPair, whose return value indicates when no matching pair is available. The return value should be defined as an enumerated type whose values can be ok and wait. Discuss the relative merits of the two alternative approaches. Which approach would you use to indicate an error such as a key that contains illegal characters? 8.10 UNIT END QUESTIONS 176 A. Descriptive Questions Short questions 1. What is a protocol? 2. Define MOM. 3. What is routing? 4. List the goals of client and server stubs. 5. What are the similarities between RMI and RPC? Long Questions 1. Explain the purpose of lower-level protocols. 2. Describe about transport protocols. 3. Explain the categories of load balancing algorithms. 4. Describe about the types of communication. 5. Explain the properties of remote procedure call. B. Multiple Choice Questions 1. MOM stands for____ a. Mails oriented middleware b. Message oriented middleware c. Middleware of messages CU IDOL SELF LEARNING MATERIAL (SLM)

d. Main object middleware

2. Middleware has enabled the production of various types of smart machines having microprocessor chips with ____ software.
a. Messaging
b. Cloud
c. Main
d. Embedded

3. A remote procedure call is _______
a. single process
b. single stream
c. inter process communication
d. client program

4. A remote procedure is uniquely identified by _________
a. program number
b. book name
c. program code
d. all of the above

5. An RPC application requires _________
a. system number
b. system code
c. system size
d. client program

Answers
1-b, 2-d, 3-c, 4-a, 5-d

8.11 REFERENCES Reference books  George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design” (4th Edition), Addison Wesley/Pearson Education.  Pradeep K Sinha, “Distributed Operating Systems: Concepts and design”, IEEE computer society press. Text Book References  M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers2009.  Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5. Websites:  https://www.geeksforgeeks.org/introduction-to-parallel  https://hpc.llnl.gov/training/tutorials/introduction-parallel-computing-tutorial  https://www.javatpoint.com/what-is-parallel-computing 178 CU IDOL SELF LEARNING MATERIAL (SLM)

UNIT 9 -RESOURCE AND PROCESS MANAGEMENT 1 STRUCTURE 9.0 Learning Objectives 9.1 Introduction 9.2 Task assignment approach 9.3 Load balancing in distributed systems 9.4 Balance of Load 9.5 Summary 9.6 Keywords 9.7 Learning Activity 9.8 Unit end questions 9.9 References 9.0 LEARNING OBJECTIVES After studying this unit, you will be able to:  Explain the basics of resource and process management  Outline the working procedure of task assignment method  Describe about load balancing in distributed systems  Explain the basics of balancing the load 9.1 INTRODUCTION Resource Management in a Distributed Environment is a system for managing resources such as files and data across a distributed system, with the goal of ensuring that a user or client may access remote resources as easily as local ones. Resource sharing is also the foundation of resource management. Because a computer may request a service or a file from some other computer by making the proper request via the communication network. Autonomous computers can share hardware and software resources. This communication is also known as peer-to-peer communication, and it is the foundation of a distributed system instead of the centralized-server plus client communication mechanism. The peer-to-peer communication system is far more efficient, versatile, convenient, and rapid than the centralized-server plus client system. All processes involved in a job like 179 CU IDOL SELF LEARNING MATERIAL (SLM)

resource management have similar functions in this architecture, engaging cooperatively as peers with no distinction made between client and the server processes or even the computers they execute on. The goal of a peer-to-peer architecture is to take advantage of the resources of a large number of collaborating computers to complete a task. It is critical to organise the interface between each computer. To allow the greatest possible variety and types of computers to be used, the protocol and communication channel must not contain or misuse information that may be misread by some machines. Special care should also be taken to ensure that messages are sent accurately and that incorrect messages are rejected, as this might bring the system and possibly the entire network down. Another significant factor is the ability to deliver portable software to another machine so that it can run and interact with both the existing network. When employing different hardware and resources, this may not always be viable or practical, in which case other approaches must be utilised, such as cross-compiling and manually porting this software. A distributed network is a combination of self-contained computers designed to improve resource sharing. Resource discovery & resource scheduling are two aspects of resource management in such a distributed system. Resource discovery can be done in two ways: centralised and decentralised. Our application is built on efficiently handling resource discovery in a distributed system using a decentralised approach. Because all servers must register with the information server in order to be a member of the network, we created a primary information server that holds information on all of the servers in the network. Every server maintains track of all of the resources available to its clients. When a client requests a resource, it checks first with itself, and if the requested resource isn't available there, it sends a request towards its immediate server. If it fails and at immediate server as well, the request is propagated across the network with the help of an information server. If this solution segment a server that owns the resource, an acknowledgement is sent to the network server that generated the request. The tasks aimed at designing a process, creating responsibilities, reviewing process performance, and identifying potential for improvement are all part of Process Management. Process migration entails copying a sufficient quantity of a process' state from one computer to another. On the target machine, the process runs.  The goal of migration is to relocate processes from systems that are substantially loaded to systems that are lightly laden.  The performance communication is improved as a result of this.  To save money on communications, processes that interact frequently might be moved to the same node. 180 CU IDOL SELF LEARNING MATERIAL (SLM)

 The goal of process migration is to improve resource usage across the entire system.  Because the computer on which the long-running process is running will be unavailable, it may be relocated.  The process can take advantage of special hardware or software. Desirable Features of Global Scheduling Algorithm The following are the characteristics of the Global Scheduling Algorithm: 1) No prior understanding of the processes: The scheduling algorithm's operation is based on information about the processes' characteristics and resource requirements. Users who have to provide this information when submitting their processes to execution face an additional barrier. For the global scheduling algorithm, no such information is necessary. The mechanism through which threads, processes, or data flows were given access to system resources is known as scheduling in computing (e.g., processor time, communications bandwidth). This is frequently done to properly load balance & share system resources, or to reach a certain level of service quality. Users that must provide this information when submitting their processes for execution have an additional burden when using scheduling algorithms which operate based on information about both the characteristics and system allocation of the processes. A smart process task scheduling should work without any prior knowledge of the procedures that will be carried out. Because it places an additional effort on the user to provide this information prior to execution. 2) Natural Dynamism: The decision about process assignment should be dynamic, that means it should be based on the system's present load rather than some static policy. The ability to transfer the process several times should be a feature of the algorithm. The process should be run on a specific node that can be altered later to react to changes in system load, based on initial decision. Process assignment decisions must be dynamic, i.e., depending on the system's present load rather than a fixed policy. Because the original decision to place a process on a certain node may need to be revised after some time to adjust to the new system load, it is advised that the scheduling algorithm have the flexibility to move a process more than once. • A good process scheduling algorithm will be able to handle the system's various nodes' dynamically changing load (or status). 3) Capabilities for making decisions: When we consider heuristic approaches, we can see that these require less computational effort, which means that the output takes less time. This will produce a near-optimal output and will be capable of making decisions. 181 CU IDOL SELF LEARNING MATERIAL (SLM)

4) Performance of the Balancing System and Scheduling Overhead: We need a method that will offer us with near-optimal system performance in this case. It's best to collect the bare minimum of global status data, such as CPU load. This information is critical because as the volume of international state data collected grows, so does the overhead. As a result of increased cost of collecting and evaluating the additional data, the need to improve system performance by reducing scheduling overhead. 5) Consistency: Processor thrashing (due to ineffective process migration) must be avoided. For example, if nodes two input are overburdened with processes while node n3 is idle, we can offload a part of the work to n3 without knowing about the offloading decisions made by other nodes. If n3 becomes overloaded as a result of this, it may begin shifting its processes to other nodes again. The fundamental reason for this is that each node's scheduling decisions are made independently of those made by other nodes. 6) Flexibility: As the number of nodes grows, the scheduling mechanism must be able to scale. If an algorithm makes issue commands by first querying the workload from all nodes and afterwards selecting the least heavily laden node, it is said to have poor scalability. Only when the system has a small number of nodes will this approach operate. This occurs when the inquirer gets a flood of responses at the same time, and the time it takes to process the responses in order to make a node selection is too long. The network traffic quickly consumes network bandwidth as the number of nodes (N) grows. 7) Tolerance to Errors: If one or even more nodes in the system fail, the excellent scheduling process should not be affected, and a means to deal with this should be accessible. If the nodes are divided into different groups owing to connection failures, the algorithm should be able to work efficiently inside the nodes. Algorithms should decentralise decision-making capabilities and evaluate just available nodes in their decision-making in order to have superior fault tolerance capability. 8) Service Fairness: If the worldwide scheduling policy seeks to distribute the load on all of the system's nodes blindly, it is not beneficial in terms of service fairness. This is due to the fact that in any load- balancing strategy, substantially laden nodes will reap all of the benefits while less loaded nodes will experience slower response times than in a stand-alone arrangement. As a result, we propose that load balancing be replaced with load sharing. To put it another way, a node can share much of its resources so long as its users are not adversely affected. A logical resource, such as a shared file, or a physical resource, such as a CPU, are both examples of resources (a node of the distributed system). One of the functions of such a 182 CU IDOL SELF LEARNING MATERIAL (SLM)

distributed system is to assign processes to the distributed system's nodes (resources) in order to maximise resource utilisation, response time, network congestion, & scheduling overhead. A distributed system's resource manager schedules operations to maximise the combination of resource utilisation, response time, network congestion, and scheduling overhead. A distributed system's processes can be scheduled using one of three methods: 1)Task Assignment Approach, in which each process submitted for processing by a user is considered as a group of linked tasks, which are then scheduled to appropriate nodes to improve performance. 2) Load-balancing strategy, in which all processes provided by users are dispersed throughout the system's nodes in order to balance the workload. 3)A load-sharing strategy, which simply ensures that no node remains idle while processes are being handled, hence conserving the system's ability to perform work. Because it works on the assumption that now the characteristics (e.g., execution time, IPC costs, etc.) of all the processes to be scheduled are known in advance, the task assignment approach offers limited applicability in practical circumstances. Because distributed systems involve multiple resources, it is necessary to give system transparency. • A distributed operating system's function is to assign processes to the distributed system's nodes (resources) in order to optimise resource usage, response time, network congestion, as well as scheduling overhead. • System management is divided into three categories: resource management, process management, and fault tolerance. 9.2 TASK ASSIGNMENT APPROACH The WfMS assigns a task to an agent who will be responsible for it. A person or a process can act as the agent. When it comes to people, the task usually entails notifying the agent. Database Support to Workflow Management Systems explains more. Each process is thought of as a series of tasks. To optimise performance, these jobs are assigned to the most appropriate processor. This strategy isn't generally employed because it necessitates knowing the features of all processes ahead of time. This method fails to account for the system's dynamically changing state. A process is regarded to be made up of several tasks in this technique, and the purpose is to determine the best assignment policy for each process's tasks. The following are some common assumptions for task assignment:  Minimize the expense of IPC (this problem can be modelled using network flow model) 183 CU IDOL SELF LEARNING MATERIAL (SLM)

 Resource utilisation that is as efficient as possible
 Quick turnaround time
 A high degree of parallelism

Load-Balancing Approach
In this approach the processes are spread among the nodes so that the load is distributed evenly across all nodes. Scheduling algorithms that use this strategy are known as Load-Balancing or Load-Levelling Methods. These methods are founded on the idea that, in order to maximise resource utilisation, a distributed system's load should be balanced equitably. A load-balancing algorithm therefore tries to balance the total system load by transparently moving workload from heavily loaded nodes to lightly loaded nodes, in an attempt to ensure good overall performance relative to some specific system metric. Load-balancing algorithms can be divided into the following categories:
Static: These do not consider the present state of the system. For example, if a node is overburdened, it randomly picks up a task and moves it to another node. Static algorithms are easier to implement, but they may not perform well.
Dynamic: Load-balancing decisions are based on current state information. There is a cost to collecting state information periodically, yet dynamic algorithms outperform static ones.
Deterministic: Algorithms in this class assign processes to nodes based on processor and process characteristics.
Probabilistic: These algorithms use information about the system's static attributes, such as the number of nodes, their processing capability, and so on.
Centralised: State information about the system is collected by a single node, which makes all scheduling decisions.
Distributed: This is the preferred method. Each node is responsible for its own scheduling decisions, based on its local state and on information received from other sites.
Cooperative: A distributed dynamic scheduling approach in which the distributed entities work together to make scheduling decisions. As a result, such algorithms are more complicated and have a higher overhead than non-cooperative ones, but a cooperative algorithm is also more stable.
Non-cooperative: A distributed dynamic scheduling approach in which individual entities act autonomously, making scheduling decisions independently of the actions of other entities.

Load-Sharing Methodology
Several experts feel that load balancing, which entails attempting to balance the workload across all of the system's nodes, is not an appropriate goal. This is because the overhead of gathering

state information to fulfil this goal is typically quite high, particularly in distributed systems with a large number of nodes. In fact, balancing the load across all nodes is not essential for optimal resource usage in a distributed system. It is both required and sufficient to keep nodes from becoming idle while others run multiple tasks. Instead of Dynamic Load Balancing, this solution is known as Dynamic Load Sharing. The design of load sharing algorithms necessitates making informed decisions on load estimation, process transfer, state information interchange, priority assignment, and migration limiting policies. Most of these principles are easier to decide in the case of load sharing because load pooling algorithms do not aim to balance overall average workload of all the system's nodes. Rather, when a node is substantially loaded, they just try to ensure that no node is idle. The policies for priority assignments and migration limitation for load-sharing algorithms are identical to those for load-balancing algorithms. Continuous advancements, intense competition, and a collaborative environment are all posing challenges to the manufacturing industry today. Product Research and Development (R&D) is by far the most significant competitive advantage of a country's manufacturing industry, according to the \"Advanced Manufacturing National Strategic Plan,\" \"Industry 4.0 Manufacturing,\" and \"Manufacturing 2025.\" Product R&D necessitates partnerships among teams and personnel with diverse expertise domains and organisational backgrounds due to the complexity of R&D. Task assignment is the most important aspect of successful partnerships. Task assignment, according to Alidaee et al., entails establishing correspondences between a set of tasks and a set of organisational units. One of the most significant tasks in product R&D is task assignment, which has a direct impact on product R&D's operational efficiency. Hundreds of tasks and their interdependencies are always present in a product R&D project. Task assignment that is efficient can reduce design lead time and expenses while preserving product quality. As a result, efficient task assignment is critical in the product R&D process in order to achieve operational competitive advantages. Task assignment is a working-process decision problem in which workloads and responsibilities for tasks are distributed across various organisational units, each of which will complete its respective sections of a common project. The goal of this research is to look into a specific form of task assignment problem that occurs when there is ambiguity. The following is a description of the issue. Various tasks must be given to R&D teams during the product development process. The goal is to reduce the total time it takes to complete all jobs. The theoretical execution time of tasks is unknown or erroneous, and the volatility of execution time in the future is likewise uncertain. This research investigates how to design an effective task assignment approach in order to reduce the task assignment plan's final completion time when faced with uncertainty. 185 CU IDOL SELF LEARNING MATERIAL (SLM)
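Before turning to how uncertainty is handled, the deterministic core of the problem can be illustrated with a simple greedy heuristic. The sketch below is not taken from the cited research: the task names, the hour estimates, and the two teams are assumptions, and the heuristic simply gives each task to the team that becomes free earliest, a common baseline for minimising the overall completion time (makespan) when execution-time estimates are treated as known.

```python
# Illustrative greedy task assignment (assumed data; not the robust model from the text).
# Tasks are taken in decreasing order of estimated execution time, and each task is
# assigned to the team whose accumulated workload is currently smallest.

task_estimates = {"T1": 8, "T2": 5, "T3": 7, "T4": 3, "T5": 6}   # hours (assumed)
teams = ["TeamA", "TeamB"]

def greedy_assign(estimates, teams):
    finish = {t: 0 for t in teams}          # accumulated work per team
    plan = {t: [] for t in teams}
    # longest tasks first, so the big items are placed while queues are still short
    for task, hours in sorted(estimates.items(), key=lambda kv: -kv[1]):
        team = min(finish, key=finish.get)  # team that becomes free earliest
        plan[team].append(task)
        finish[team] += hours
    return plan, max(finish.values())       # the assignment and the resulting makespan

if __name__ == "__main__":
    plan, makespan = greedy_assign(task_estimates, teams)
    print(plan)      # {'TeamA': ['T1', 'T2', 'T4'], 'TeamB': ['T3', 'T5']}
    print(makespan)  # 16
```

A robust formulation of the kind described here would replace the fixed hour estimates with uncertainty sets and optimise against the worst case, but the assignment skeleton remains the same.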

Decision-makers should consider not only the theoretical execution duration of tasks when making task assignment decisions, but also the possibility of ambiguity during the execution of the task assignment plan. Dealing with uncertainty can improve the resilience and stability of a work assignment strategy. In addition, in the age of the knowledge economy, knowledge is a valuable resource for product R&D. As a result, team knowledge and skill are critical for task performance. Each team possesses certain information and capabilities, and each task necessitates the possession of specific knowledge and capabilities by at least one of the teams. This means that the degree of match between the task's knowledge required and the teams' knowledge is critical for efficient task assignment. In light of this, this research offered a technique system for effective task assignment that takes into account the degree of task and knowledge satisfaction. Similar tasks identification, task execution time calculation, and task assignment model are the three components of the method system. To begin, the effective collection of related tasks is built depending on the satisfaction level of task understanding. Then, using a Knowledge-Task-Person (T-K-P) network, a BP neural network is developed to forecast the theoretical time complexity for tasks. When it comes to task assignment, a robust task assignment model is proposed that can be used in circumstances where there is minimal information about execution time fluctuations. This paper's primary contributions can be summarised as follows. To begin, a knowledge- based method for predicting task execution time is proposed, which can overcome the ambiguity & subjectivity of the manager's decision. Second, to solve the job assignment problem under uncertainty, a resilient optimization model and a corresponding enhanced genetic algorithm are devised, which can protect decision-makers from parameter ambiguity and stochastic uncertainty. Finally, in order to create more effective and robust job assignment plans, a comprehensive method system of task assignment that integrates knowledge and uncertainty is constructed. Similar tasks identification, task execution time computation, and task assignment model are the three components of the ETA system. The ETA system begins by identifying related jobs using the knowledge base as a starting point. Once similar activities have been found, historical data from past tasks can be used to estimate the level of knowledge satisfaction. The BP neural network can be used to compute task execution time based on the satisfaction level of knowledge. Then, to find the best work assignment scheme for product R&D, a robust task assignment model with uncertainty as well as a genetic algorithm are proposed. The three components of both the ETA system will be discussed in detail in the sections that follow: 186 CU IDOL SELF LEARNING MATERIAL (SLM)

Figure 9.1 Task Assignment Approach 9.3 LOAD BALANCING IN DISTRIBUTED SYSTEMS In order to improve overall execution of the distributed programme in any distributed architecture, a number of load balancing methods have been created. Load balancing entails allocating work to each processor and reducing the program's execution time. It would be able to run the applications on any machine in a globally distributed system in practise. With the advent of the internet, however, the concept of a \"distributed system\" gains popularity and appeal. The users' performance improves significantly as a result of this. This paper explains the fundamental concepts for numerous load balancing approaches in a computing environment that have recently been created. This paper also discusses several load balancing systems, their benefits and drawbacks, and comparisons based on various criteria. Distributed computing is a promising technology that entails the coordination and participation of resources in order to solve a wide range of computer problems. The design of an effective dynamic load balancing algorithm which enhances performance of distributed systems is one of the major concerns in distributed systems. In grid computing settings, scheduling and resource management are critical to obtaining high resource utilisation. Load balancing is indeed the process of evenly dividing the load among the nodes of a distributed system in order to enhance job response time & resource usage while avoiding a situation in which some nodes are severely loaded while others are idle or lightly loaded. Distributed systems allow disparate resources, like computers, storage systems, and other specialised devices, to be shared and aggregated. These resources are dispersed and may be owned by a variety of individuals or organisations. Users of a distributed system have a variety of goals, objectives, and methods, and it's impossible to predict their behaviour. The administration of resources and applications in such systems is a difficult undertaking. A distributed system is a collection of computer and communication resources that are shared 187 CU IDOL SELF LEARNING MATERIAL (SLM)

by multiple active users. The load balancing issue becomes more critical as the demand for processing power grows. The goal of load balancing would be to boost the effectiveness of a distributed system by distributing the application load more evenly. The problem can be stated in general as follows: given a large number of jobs, determine the best way to allocate them to machines while optimising a given objective function (e.g., total execution time). A system's processing speed is always important. Since the beginning of computer development, the focus has always been on system performance, or how to improve the speed and performance of the existing system, and thus we have arrived at the era of supercomputers. For the rapid changing of their day-to-day needs, corporate organisations, defence sectors, and science groups, in particular, require high-performance systems. Parallel computing & parallel distributed computing have indeed been evolved from the serial computer to the supercomputer. MPCs (massively parallel computers) are now available on the market. A network such as mesh, hypercube, or torus connects a collection of processors to memory modules in MPC. Because supercomputers are so expensive, a new alternative concept called parallel distributed computing has emerged (although it already existed), in which thousands of processors can be connected via a wide area network or across a large number of systems made up of inexpensive and readily available autonomous systems such as workstations or PCs. As a result, in comparison to MPC, it is becoming increasingly popular for massive processing tasks such as scientific calculations. Distributed systems with hundreds of powerful processors have recently been developed. Distributed computing systems provide a high-performance environment capable of processing times the amount of data. A multicomputer system can be effectively used by correctly splitting tasks (loads) and balancing them among the nodes. The processing nodes, network topology, communication medium, operating system, and other components of a dispersed network can all be different in different networks that are widely distributed around the world. The distributed computing system is currently made up of several hundred computers. The overall work load of a system must be dispersed among the nodes via the network in order to achieve optimal efficiency. As a result of the availability of distributed memory multiprocessor computing systems, the topic of load balancing has gained popularity. The load balancing problem is simply the distribution of loads to the processing elements. In a multi-node system, there's a good likelihood that some nodes will be idle while others will be overloaded. The purpose of load balancing algorithms is to keep each processing unit from being overloaded or idle, which means that each processing element should have an equal load at any given point during execution in order to achieve the system's optimal performance (shortest execution time). As a result, the right design of a load balancing algorithm can considerably increase the system's performance. There will be some fast-computing nodes and some slow computing nodes in the network. If we don't account for processing speed and connection speed (bandwidth), the total system's 188 CU IDOL SELF LEARNING MATERIAL (SLM)

performance will be limited by the network's slowest running node. As a result, load balancing solutions balance the loads across the nodes by preventing idle nodes from overloading the others. Furthermore, load balancing algorithms eliminate any node's inactivity during operation. 9.4 BALANCE OF LOAD Load balancing is the process of distributing load units (jobs or tasks) across a group of processors connected to a network that could be located anywhere on the planet. Excess or unexecuted load from one processor is moved to other processors with loads less than the threshold load. Threshold load is the amount of load applied to a processor that allows any additional load to be applied to it. In a multi-node system, there's a good likelihood that some nodes will be idle while others will be overloaded. As a result, processors in a system can be classified as fully loaded (enough jobs are awaiting execution), lightly loaded (fewer jobs are awaiting execution), or idle (no jobs are awaiting execution) (have no job to execute). It is possible to make every processor equally occupied and accomplish the tasks roughly at the same time using a load balancing method. Three rules govern a load balancing procedure. These are the rules of location, distribution, and selection. The selection rule can be used in either a pre-emptive or non-pre-emptive manner. The non-pre-emptive rule always picks up the freshly formed process, whereas the pre-emptive rule may take up the running process. Pre-emptive transfer is more expensive than non-pre-emptive transfer, which is the better option. In some cases, however, pre-emptive transfer is preferable to non-pre-emptive transfer. Load balancing's Advantages a) Load balancing boosts the performance of each node and, as a result, the entire system. b) Load balancing decreases work idle time c) small jobs are not starved for lengthy periods of time d) Maximum resource usage e) Response time is reduced f) Increased throughput; g) Increased reliability; h) Low cost, great gain I) Flexibility and progressive expansion Static Load Balancing (Static Load Balancing) (Static Load Bal Processes are assigned to processors in a static algorithm at build time based on the performance of the nodes. At run time, no changes or reassignments can be made to the processes that have already been assigned. In a static load balancing algorithm, the number of jobs in each node is fixed. No 189 CU IDOL SELF LEARNING MATERIAL (SLM)

information about the nodes is collected by static techniques. Jobs are assigned to processing nodes based on the following factors: incoming time, resource requirements, mean execution time, and inter-process communications. Static load balance is also known as a probabilistic method since these factors must be measured prior to the assignment. Because there is no job migration at runtime, there is no overhead or only a small amount of overhead. Even if a mathematical solution exists, there are numerous basic faults with static load balancing because load is balanced prior to execution: It's quite difficult to predict the execution times of various components of a programme without actually running them. Communication delays that vary depending on the situation in some cases, the number of steps required to solve an issue is unknown. In static load balancing, it is noticed that as the number of jobs exceeds the number of processors, the load balancing improves. Local tasks arrive at the assignment queue, as shown in the schematic diagram of static load balancing. From the assignment queue, a job can be transferred to a remote node or assigned to the threshold queue. A job from a remote node will be assigned to the threshold queue in the same way. A job cannot be transferred to any node once it has been allocated to a threshold queue. A job arriving at any node is either processed there or sent to another node over the communication network for remote processing. There are two types of static load balancing algorithms: optimal static load balancing and suboptimal static load balancing. Processing Node Model Techniques for Load Balancing: Round One of the most basic and often used load balancing algorithms is round-robin load balancing. In a rotating fashion, client requests are delivered among application servers. If you have three application servers, for example, the first client request will be sent to the first application server throughout the list, the second client request will be sent to the second application server, this same third client request will be sent to the third application server, the fourth client request will be sent to the first application server, and so on. This load balancing technique ignores application server characteristics, assuming that all application servers are identical in terms of availability, compute, and load management. Round Robin with Weights Round Weighted To accommodate for different application server characteristics, Robin expands on the fundamental Round-robin load balancing technique. To demonstrate the application server's traffic-handling capability, the administrator provides a weight to each application server based on criteria of their choosing. Application server #1 is furnished with a larger weight than application server #2 (and application server #3) if application server #1 is twice as powerful as application server #2 (and application server #3). The initial two (2) client requests go to application server #1, the third (3) to application server #2, the fourth (4) 190 CU IDOL SELF LEARNING MATERIAL (SLM)

to application server #3, and the fifth (5) to application server #1 if there are five (5) client requests in a row. The Weakest Link Client requests are distributed to the application server with the fewest number of active connections at the time the client request is received using the least connection load balancing algorithm. In circumstances where application servers have identical specs, a server may be overwhelmed due to longer-lasting connections; this approach considers active connection load. Least Connection (Weighted) Weighted Least Connection is a load balancing algorithm that takes into consideration the characteristics of different application servers. To demonstrate the application server's traffic- handling capability, the administrator provides a weight to each application server based on criteria of their choosing. The load balancing criteria are set by the LoadMaster based on active connections and application server weights. Based on Resources (Adaptive) The Resource Based (Adaptive) load balancing technique necessitates the installation of an agent on the application server that reports the load balancer's current load. The application server's availability and resources are monitored by the deployed agent. To assist in load balancing decisions, the load balancer requests the output from the agent. Based on Resources (SDN Adaptive) SDN Adaptive is a load balancing technique that makes more optimal traffic distribution decisions by combining knowledge from Layers 2, 3, 4, and 7, as well as input from an SDN Controller. This permits data about the status of the servers, the status of applications running on them, and health of both the network infrastructure, and the level of network congestion to all play a role in load balancing decisions. Weighting is predetermined. Fixed Weighting is indeed a load balancing algorithm in which the administrator assigns a weight to every application server based on their own criteria to indicate the application server's traffic-handling capability. The traffic will be directed to the application server the with highest weight. If the highest-weighted application server fails, all traffic is routed to the next-highest-weighted application server. Response Time (Weighted) Weighted Response Time is a load balancing mechanism that chooses which application server receives the next request based on the response timings of the application servers. The application server weights are calculated using the application server load to a health check. The next request is sent to the application server that responds the fastest. 191 CU IDOL SELF LEARNING MATERIAL (SLM)

Source IP Hash is a load balancing algorithm that generates a unique hash key by combining the client and server's source and destination IP addresses. The key is used to assign a client to a certain server. The client request is forwarded to the same server it was using earlier because the key can be regenerated if the session is broken. This is useful if it's critical for a client to reconnect to an active session after a disconnection. 9.5 SUMMARY  The WfMS assigns a task to an agent who will be responsible for it. A person or a process can act as the agent. When it comes to people, the task usually entails notifying the agent. Database Support for Workflow Management Systems explains more.  Task Assignment Approach: Each process submitted for processing by a user is considered as a collection of linked tasks, which are then scheduled to appropriate nodes to improve performance.  Task Assignment Approach: Each process submitted for processing by a user is considered as a collection of linked tasks, which are then scheduled to appropriate nodes to improve performance.  User traffic is distributed among many instances of your apps using a load balancer. Load balancing decreases the possibility of performance difficulties in your applications by distributing the load.  The logical and efficient distribution of network or application traffic among numerous servers in a server farm is known as load balancing. Each load balancer lies between client devices and backend servers, accepting and then distributing incoming requests to any server that can handle them. 9.6 KEYWORDS  Baseband- A transmission channel which carries a single communications channel, on which only one signal can transmit at a given time.  Call Setup Time - The time required to establish a switched call between DTE and devices.  Data Compression-A reduction in the size of data by exploiting redundancy. Many modems incorporate MNP5 or V.42bis protocols to compress data before it is sent over the phone line.  Dedicated Channel- An RF channel that is allocated solely for the use of a particular user or service. E.g., In CDPD, a channel may be dedicated to data. 192 CU IDOL SELF LEARNING MATERIAL (SLM)

 Channel- An individual communication path that carries signals at a specific frequency. The term also is used to describe the specific path between large computers (e.g., IBM mainframes) and attached peripherals. 9.7 LEARNING ACTIVITY 1. Suppose that the operations of the BLOB object are separated into two categories – public operations that are available to all users and protected operations that are available only to certain named users. State all of the problems involved in ensuring that only the named users can use a protected operation. Supposing that access to a protected operation provides information that should not be revealed to all users, what further problems arise? 2.For each of the factors that contribute to the time taken to transmit a message between two processes over a communication channel, state what measures would be needed to set a bound on its contribution to the total time. Why are these measures not provided in current general-purpose distributed systems? 9.8 UNIT END QUESTIONS 193 A. Descriptive Questions Short questions 1. What is load balancing? 2. Define task assignment. 3. List the advantages of load balancing. 4. Differentiate pre-emptive and non-pre-emptive transfer. 5. What is grid computing? Long Questions 1. Explain the purpose of load balancing algorithms. 2. Describe load balancing in distributed systems. 3. Explain the categories of load balancing algorithms. 4. Describe about task assignment approach. 5. Explain the desirable features of global scheduling algorithm. CU IDOL SELF LEARNING MATERIAL (SLM)

B. Multiple Choice Questions 1. A process stack does not contain __________ a. PID of child process b. Function parameters c. Local variables d. Return addresses 2. Which system call can be used by a parent process to determine the termination of child process? a. Exit b. Fork c. Wait d. Stack 3. The address of the next instruction to be executed by the current process is provided by the _________ a. CPU registers b. Program counter c. Distributed systems d. Call monitor 4. A part of a computer program that performs a well-defined task is known as? a. Software b. Hardware c. Program d. Algorithm 5. Which of the following is a feature of global scheduling algorithm? 194 a. Scalability b. Flexibility CU IDOL SELF LEARNING MATERIAL (SLM)

c. Security d. Transparency Answers 1-a, 2-c, 3-b, 4-d, 5-a 9.9 REFERENCES Reference books  George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design” (4th Edition), Addison Wesley/Pearson Education.  Pradeep K Sinha, “Distributed Operating Systems: Concepts and design”, IEEE computer society press. Text Book References  M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers2009.  Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5. Websites:  https://www.techopedia.com/definition/10062/wireless-communications  https://www.computernetworkingnotes.com/  https://www.guru99.com 195 CU IDOL SELF LEARNING MATERIAL (SLM)

UNIT 10 -RESOURCE AND PROCESS MANAGEMENT 2 STRUCTURE 10.0Learning Objectives 10.1Load sharing approach 10.2 Introduction to process management 10.3Process Mitigation 10.4Threads 10.5Virtualization 10.6Client and Server 10.7Code Migration 10.8Summary 10.9 Keywords 10.10 Learning Activity 10.11 Unit end questions 10.12 References 10.0 LEARNING OBJECTIVES After studying this unit, you will be able to:  Describe the approaches of load sharing  Explain about process management  State the working of process mitigation  Outline about threads and code mitigation 10.1 LOAD SHARING APPROACH Load balancing products are designed to establish a dispersed network in which requests are evenly spread among multiple servers. Load sharing, on the other hand, requires sending a part of the traffic with one server and the rest to another. The loads don't have to be equal; they only have to be set up in such a way that the system isn't overloaded. Load sharing is a natural feature of the server-to-endpoint device forwarding process. In general, algorithms plan to receive more traffic to routes that they consider to be more 196 CU IDOL SELF LEARNING MATERIAL (SLM)

efficient. Assume there are two links to a website's servers, one that sends 300 Mbps and the other that sends 150 Mbps. An algorithm designed to follow load sharing will transmit 1 packet of data down the second connection for every 2 data packets sent down the first link. The goal isn't to use the same amount of bandwidth on both lines, but to use the maximum amount that each can manage. The choice of a load sharing or global scheduling method is a crucial aspect of a distributed system's architecture. This paper presents a thorough review of the literature on the subject. We offer a load-sharing algorithm taxonomy that distinguishes between source-initiative & server-initiative strategies. The taxonomy allows for the selection of ten typical algorithms for performance evaluation. The Q-factor (quality of load sharing) is a performance indicator that summarises an algorithm's overall efficiency and fairness and allows algorithms to be rated by performance. The algorithms are then evaluated using a combination of mathematical and simulation methodologies. The study's findings show that: I selecting a load-sharing algorithm is indeed a critical design decision; ii) server-initiative algorithms have the potential to outperform source-initiative algorithms for the same level with scheduling information exchange (whether this potential is realised depends on the factors such as communication overhead); iii) this same Q-factor is indeed a useful yardstick. Several experts feel that load balancing, which entails attempting to balance workload across all of the system's nodes, is not an appropriate goal. This is because the overhead of gathering state information to fulfil this goal is typically quite high, particularly in distributed systems with a large number of nodes. In fact, balancing the load across all nodes is not essential for optimal resource usage in a distributed system. It is both required and sufficient to keep nodes from being idle while others run multiple tasks. Instead of Dynamic Load Balancing, this solution is known as Dynamic Load Sharing. The design of load sharing algorithms necessitates making informed decisions on load estimation, process transfer, state information interchange, priority assignment, and migration limiting policies. Most of these principles are easier to decide in the case of loa d sharing because load sharing algorithms do not aim to balance overall average workload of all the system's nodes. Rather, when a node is substantially loaded, they just try to ensure that neither node is idle. The policies for priority assignments and migration limitation for load- sharing algorithms are identical to those for load-balancing algorithms. Planning and, in some circumstances, configuring load sharing is required. By default, some protocols cannot provide load sharing. An Internetwork Packet Exchange (IPX) router, for example, can only remember one route to a remote network when using Novell's Routing Information Protocol (RIP). The ipx maximum-paths command on a Cisco router can be used to adjust this behaviour. 197 CU IDOL SELF LEARNING MATERIAL (SLM)

Configuring channel aggregation in ISDN systems can help with load sharing. As bandwidth demands grow, a router can use channel aggregation to automatically bring up several ISDN B channels. For ISDN B-channel aggregation, the Multilink Point-to-Point Protocol (MPPP) is also an Internet Engineering Task Force (IETF) standard. MPPP ensures all packets arrive to the receiving router in the correct order. Data is contained inside the Point-to-Point Protocol (PPP) and datagrams are assigned a sequence number to accomplish this. PPP employs the sequence number to recreate the original data stream at the receiving router. Multiple channels appear to upper-layer protocols as a single logical link. Load sharing over parallel links of equal cost is supported by most vendors' implementations of IP routing protocols. (Routing protocols utilise cost values to select the most desirable path to a destination; cost might be based on hop count, bandwidth, delay, or other considerations, depending on the routing protocol.) Cisco allows load sharing across six parallel paths. Cisco enables load sharing via the IGRP and Enhanced IGRP protocols, even when the paths may not have the same bandwidth (which is the main metric used for measuring cost for those protocols). IGRP and Enhanced IGRP may load balance among paths that do not have exactly the same aggregate bandwidth that used a feature called variance. Some routing protocols calculate the cost based on the number of hops required to reach a specific destination. As long as the hop count is equal, these routing methods load balance over unequal bandwidth channels. Higher-capacity links, on the other hand, cannot be filled once a sluggish link gets saturated. Pinhole congestion is the technical term for this. Pinhole congestion can be avoided by employing a routing protocol that costs bandwidth and has the variance feature, or by constructing equal transfer activities within one layer of the hierarchy. Resource Management in a Distributed Environment is a system for managing resources such as files and data across a distributed system, with the goal of ensuring that a user or client may access remote resources with the same ease as local resources. Resource sharing is also the foundation of resource management. Because a computer can request a service or file from another computer by submitting an acceptable request via the communication network. Hardware and software resources can be shared among autonomous computers. This communication is also known as peer-to-peer communication, which is the foundation of distributed systems rather than the centralized-server and client technique. Peer-to-peer communication is a type of communication in which two or more people communicate with each The mechanism is far more efficient, versatile, convenient, and quick than the centralized- server and client's mechanisms. All processes involved in a job like resource management have similar roles in this architecture, engaging cooperatively as peers without any distinction between client and server processes or the computers they execute on. The goal of a peer-to- peer architecture is to utilise the resources of a large number of collaborating computers to complete a task. The necessity of organising the relationship between each computer cannot be overstated. To allow the largest possible variety and types of computers to be used, the 198 CU IDOL SELF LEARNING MATERIAL (SLM)

protocol or communication channel should not contain or misuse information that may not be understood by some machines. Special care must also be taken to ensure that messages are sent correctly and that invalid messages are rejected, as this would otherwise bring the system and maybe the rest of the network down. Another essential consideration is the ability to send software to another machine in a portable format, allowing it to run and communicate with the current network. When employing different hardware and resources, this may not always be viable or practical, in which case other approaches must be utilised, such as cross-compiling or manually porting this software. 10.2 INTRODUCTION TO PROCESS MANAGEMENT A programme is useless unless the instructions it contains are executed by either a CPU. A process is a programme that is currently running. Processes require computer resources to do their tasks. There could be multiple processes in the system that need the same shared resource. As a result, the operating system must efficiently and effectively handle all processes and resources. To preserve consistency, some resources may have to be executed by one section at a time. Otherwise, the system may become inconsistent and a deadlock may develop. The execution of a programme that accomplishes the actions defined in that programme is referred to as a process. It can be characterised as a processing unit in which a programme executes. The operating system (OS) assists you in creating, scheduling, and terminating CPU processes. A child process is a process that is created by the primary process. With the use of PCBs, process processes may be simply managed (Process Control Block). You can think of it as the process's brain, as it includes all of the critical information about processing, such as the process's id, priority, state, and CPU registers. In terms of Process Management, the operating system is in charge of the following tasks.  Processes & threads on the CPUs are scheduled.  Both software requirement processes can be created and deleted.  Processes can be paused and resumed.  Providing synchronisation mechanisms for processes.  Providing communication mechanisms for processes. A process is a running programme. When we construct a C or C++ programme, for example, the compiler generates binary code. Both the original and binary codes are programmes. When we run the binary code, it turns into a process. 199 CU IDOL SELF LEARNING MATERIAL (SLM)

In contrast to a programme, which is considered a \"passive\" entity, a process is an \"active\" entity. When a single programme is run numerous times, it can generate multiple processes; for example, when we open a.exe or binary file multiple times, multiple instances are formed (multiple processes are created). Process management entails a variety of activities, including process development, scheduling, termination, and a deadlock. A process is a running programme that is an important component of current operating systems. The operating system must distribute resources to allow processes to share & exchange data. It also secures each process' resources from many other methods and enables for process synchronisation. The OS's responsibility is to keep track of all the system's operating processes. It manages operations by completing duties such as process scheduling and resource allocation. There can be multiple instances of a programme loaded in memory at one time in either modern operating system. For instance, multiple users could be running the same software, each with their own copy of the programme loaded into memory. It is feasible to have one copy of a programme loaded into memory while multiple users share access to it so that they can all execute the same programme code. This type of programme is referred to be re- entrant. The processor can only execute one instruction from a single programme at any given moment, but numerous processes can be sustained over time by allocating each process to a processor at intervals while others stay inactive. Concurrent execution refers to the execution of multiple processes over even a period of time rather than at the same time. A multiprogramming and multitasking operating system are one that runs multiple processes at the same time. Multiprogramming necessitates allocating the processor to each process for a set length of time and then de-allocating it at the right time. If a processor is de-allocated while a process is running, it must be designed in such a way that it may be restarted as quickly as feasible later. There are two ways for an operating system to reclaim control of a processor during the execution of a programme in order to execute de-allocation or allocation: A system call (also known as a software interrupt) is issued by the process; for example, an I/O request is made to access a file on the hard disc. A hardware interrupt happens when a key on the keyboard is pressed or a timer expires (used in pre-emptive multitasking). A context switch and context change occur when one process is stopped and another is started (or restarted). Processes in many operating systems can be broken down into multiple sub-processes. The concept of a thread is introduced here. A thread can be thought of as a sub-process, or a separate, independent execution sequence within the code of another process. Threads are becoming more significant in the design of distributed & client–server systems, as well as multi-processor software. 200 CU IDOL SELF LEARNING MATERIAL (SLM)

