Figure 10.1 Process architecture
Here is a diagram of the process architecture. The stack is a data structure that holds temporary data such as function parameters, return addresses, and local variables. The heap is the region of memory that is dynamically allocated during the program's execution. Global and static variables are kept in the data section, and the program's code resides in the text section; the process's current activity is represented by the value of the program counter together with the contents of the processor's registers.
Process Control Block
PCB stands for Process Control Block. It is a data structure that the operating system maintains for every process, and it is identified by an integer process ID (PID). The PCB stores all the information needed to keep track of a running process, including the contents of the processor registers, which are saved when the process leaves the running state and restored when it returns to it. Whenever the process transitions to a new state, the OS updates the information in its PCB.
Figure 10.2 Process control block
A process control block, also known as a task control block, represents each process in the operating system. The most important pieces of information held in the PCB are the following:
Process state: a process may be new, ready, running, waiting, and so on.
Program counter: the address of the next instruction to be executed by the process.
CPU registers: accumulators, index and general-purpose registers, and condition-code information.
CPU-scheduling information: the process priority, pointers to scheduling queues, and other scheduling parameters.
Accounting information: CPU usage and time statistics such as real time used, and job or process numbers.
Memory-management information: the values of the base and limit registers, and the page or segment tables, depending on the memory system the operating system employs.
I/O status information: a list of open files, the I/O devices allocated to the process, and so on.
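To make these fields concrete, here is a minimal sketch of how a PCB-like record could be modelled; it is not the layout used by any real operating system, and the field names are illustrative assumptions only.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class ProcessState(Enum):
    NEW = auto()
    READY = auto()
    RUNNING = auto()
    WAITING = auto()
    TERMINATED = auto()

@dataclass
class ProcessControlBlock:
    pid: int                                         # integer process identifier
    state: ProcessState = ProcessState.NEW           # process state
    program_counter: int = 0                         # address of the next instruction
    registers: dict = field(default_factory=dict)    # accumulator, index, general-purpose registers
    priority: int = 0                                # CPU-scheduling information
    base_register: int = 0                           # memory-management information
    limit_register: int = 0
    open_files: list = field(default_factory=list)   # I/O status information
    cpu_time_used: float = 0.0                       # accounting information

# When a process is preempted, the OS saves the CPU context into its PCB
# and restores it when the process is dispatched again.
pcb = ProcessControlBlock(pid=42)
pcb.state = ProcessState.READY
```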
Process States:
Figure 10.3 Process states
A process state is the condition of a process at a given point in time; it specifies where the process currently is in its life cycle. A process passes through the following seven states:
New: a new process is created when a program is brought from secondary memory (the hard disk) into primary memory (RAM).
Ready: the process has been loaded into primary memory and is ready for execution, waiting to be assigned to the CPU.
Waiting: the process is waiting for CPU time and other resources to be allocated so that it can execute.
Running: the process is currently being executed.
Blocked: a process is said to be blocked when it is waiting for an event, such as an I/O operation, to complete.
Suspended: the process is ready to run but has not yet been placed in the ready queue by the operating system.
Terminated: the process has finished execution. Once a process terminates, the resources it was using are released and its memory becomes available again.

10.3 PROCESS MIGRATION
Process migration is a specialised form of process management in which processes are moved from one computing environment to another. The term originated in distributed computing but is now used more broadly. Process migration is a standard part of process scheduling on multiprocessor machines (multiple cores on a single processor, or multiple processors), and it is relatively simple to migrate a process within a single machine because most resources (memory, files, sockets) do not have to be changed, only the execution context (primarily the programme counter and registers). Process migration across computer clusters, where processes are moved from machine to machine, is a much more complicated task, since it requires serialising the process image and migrating or reacquiring resources on the new system. OpenMosix, for example, implements process migration; the Sprite OS from the University of California, Berkeley, was one of the first systems to do so.
Moving a running process to another machine raises a number of issues, including:
I/O redirection: if a process performs I/O to files or devices that are bound to a specific machine, access to these resources must be redirected even after the process has migrated. This entails routing the I/O data stream over the network, which has security, performance, and reliability drawbacks.
Message forwarding: messages delivered to a process with process ID P on machine M must be forwarded to the new machine N and the new process ID Q. The machine from which the process was migrated must keep track of the moved processes, and the overhead grows if several migrations take place.
Shared memory: if one of a group of collaborating processes that use a shared memory segment migrates away, the network must be used to simulate shared memory access. This adds complexity and significantly slows access to shared memory for processes that have moved away from the machine holding it.
Residual dependency: the phenomenon in which a host computer must continue to offer services to a process that has moved away.
In addition to the issues outlined above, operating systems typically maintain a variety of tables and state for each running process. There is no simple way to extract all of this administrative information with a basic sequence of copy operations: tables on the destination machine must be searched, duplicated, edited, and re-established according to the process's execution state. As a result, replicating process state is a difficult task.
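A deliberately simplified sketch of what capturing and restoring execution state involves is shown below; the context fields are invented for illustration, and a real migration facility would also have to transfer the address space and reacquire files, sockets, and other kernel state.

```python
import pickle

# A toy "execution context" -- real process images also include the address
# space, open file descriptors, sockets, and other kernel-maintained state.
context = {
    "pid": 1234,
    "program_counter": 0x4005D6,
    "registers": {"r0": 7, "r1": 99, "sp": 0x7FFE0010},
    "open_files": ["/var/log/app.log"],   # must be re-opened or redirected on the target
}

# On the source node: serialise the snapshot into bytes that can be sent
# over the network (for example with a socket or an RPC mechanism).
image = pickle.dumps(context)

# On the destination node: rebuild the snapshot and resume from it.
restored = pickle.loads(image)
assert restored["program_counter"] == context["program_counter"]
```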
In integrated circuit design and engineering, a different definition of process migration is used. There, process migration or layout migration refers to a design flow that adapts and shrinks an existing IC layout to a new process technology node. Such a migration can be carried out manually, by redrawing the layout feature by feature, or automatically using EDA/CAD tools.
In load-sharing systems, process migration is the mechanism used to move a process from one node to another. Because processes are a well-established concept in operating systems, the concept itself is not re-introduced in this study. In a distributed computing context, process migration refers to the mobility of running (or suspended) processes: a process running on one system is moved over the network to another system and continues running there. Within the same computer, the term is sometimes used to describe moving a process from one processor to another. The act of moving a process between two machines is known as process migration. It makes possible dynamic load distribution, fault resilience, simplified system administration, and data access locality. Despite these objectives and continuing research, migration is still not widely used. Process migration is, however, once again attracting increased attention in both research and product development, thanks to the growing deployment of distributed systems in general and distributed operating systems in particular. We expect migration to become more important as high-performance facilities shift from supercomputers to networks of workstations, and as the role of the World Wide Web grows.
Process migration has been studied in computer science for over two decades. One of the earliest idea papers refers to "worms" as programmes that can traverse machine boundaries in search of a free machine on which to execute. Subsequent articles refined process migration as a viable means of sharing processing power and other resources among numerous processors. Several papers and implementations have introduced process migration on various architectures, including symmetric multiprocessors, Non-Uniform Memory Access (NUMA) multiprocessors, Massively Parallel Processors (MPPs), and networks of computers.
A process is the operating system (OS) entity that represents a programme under execution. An address space and other OS-internal attributes, such as the home directory, open file descriptors, user ID, programme counter, and so on, are associated with it. Process migration is the movement of a process between machines connected by a network. Reasons for process migration include the following.
Dynamic load balancing. Processes can migrate from overloaded nodes to less loaded ones, taking advantage of the spare capacity.
Availability. A failed node's processes can be transferred to other, healthy nodes.
Administration of the system. Processes that are currently running on a node that is undergoing system maintenance could be moved to other nodes. Data accessibility. Data locality or other specific features of a particular node can be used by processes. The ability to move. Before the device is disconnected from the network, processes can be transferred from a portable device or laptop computer towards more powerful server computer. Recovery from a fault. Technically, the ability to pause, transport, and resume a process is useful for fault recovery for mission-critical or transactional systems. MOSIX, Sprite, Charlotte, V, Condor, and Mach are just a few of the projects that have worked on process migration. Despite the fact that process migration is beneficial in many situations, it is not extensively used currently. One of the issues is the difficulty of supporting process migration upon top of a system which was not designed to enable it. Network transparency, naming transparency, location transparency, and other features are among them. Implementing process migration in systems that lack appropriate facilities can result in performance and security degradation, as well as complicated implementation and low reliability. Despite the fact that process migration can be classified in to the different implementation categories depending on the level of operating system it is built on, like UNIX (or variant) clear migration, microkernel, message-passing kernel, and user-space migration, all implementations tend to provide such a common subset of system functionalities throughout order to promote process migration effectively. Transparencies in Networking and Naming Processes that have been migrated to the target node ought to be able to run in the same way that they did on the source node. The migrated process should have easy access to communication channels, I/O devices, and file systems. A certain degree both network and naming transparency is enabled in order to provide execution context. MOSIX, for example, uses a super-root, \"/...\", as a network-wide origin to give a consistent method of accessing objects. Export & Import Process States Interfaces Certain process states must be collected first from source node & imported to the remote node in order for the migrated process to run correctly at the remote node. Program counter (PC), files handles, CPU registers, and other states are among them. To be meaningful to both the target architecture, heterogeneous process migration might require additional level of abstraction or translation. Mechanism for transferring processes. Different ways of process transfer are used in different implementations. Remote procedure calls (RPC), message passing, and other options are available (MP). 206 CU IDOL SELF LEARNING MATERIAL (SLM)
Management of load information (optional). Many solutions (e.g., Condor) have load-information management components that make process-migration decisions based on information gathered from local and remote nodes.

10.4 THREADS
A process is a running programme. A process is traditionally made up of an address space and a single thread of control executing in that address space. The operating system keeps environmental and accounting data for each process under its supervision in a process control block (PCB). This includes historical resource utilisation for costing and auditing purposes, as well as descriptors for open files and communication channels, I/O buffers, locks, and other OS assets in use by the process. The PCB holds the information the native operating system needs in order to switch contexts.
Figure 10.4 PCB
If a single thread of control runs in the address space of a process, the process is said to be sequential. Threads are quite useful for distributed system applications. Many systems use a client/server model of communication, in which a server process listens for client requests, performs them, and returns the results. If the server were operating as a sequential process, two or more concurrent requests would have to be executed serially: a queue of buffered, unanswered requests would wait until the server had finished with the ones that arrived before them. The client might believe the server has failed because its outstanding requests have timed out.
One option would be for the server to create a new process for each new request. When a listener process receives a message, it spawns a new process to run the service, passing the client's arguments to it. The operating system then controls the asynchronous execution of the requests, and, if necessary, the listener process can give an immediate first response to the client. However, process creation can be quite slow, because the operating system must find adequate free memory for the process and set up the other parts of its control block, and only a limited number of PCBs are available to the operating system. If the amount of time spent servicing each request is small, this approach is inefficient. Furthermore, if the server stores data of any kind, the server processes must synchronise their accesses to this data using interprocess communication, and the resulting delays degrade server performance.
A better option is for the server to create threads to handle new requests. New requests are received by a controlling thread, which creates a lightweight thread to execute each of them. From the operating system's perspective this is a more scalable strategy, since it makes better use of system resources. Shared-memory techniques allow efficient coordination and synchronisation between the cooperating parts of the server process. Threads enable parallelism to be combined with the client/server model's well-understood blocking communication. That is, programmers prefer sequential code execution and blocking communication because they are easy to reason about. With blocking calls and threads we can keep the sequential model of process interaction while still achieving parallelism within the application: when a thread in a process blocks while waiting for a response from a server, another thread in that process can be scheduled, so the process does not lose its allotted CPU time. Another advantage of threads is that they allow a process to exploit a multiprocessor system.
Depending on the application, there are several ways to organise parallelism within a process. The first two below are alternative ways to organise server processes; the third is better suited to producer-consumer applications. (A sketch of the first of these organisations appears after the descriptions below.)
Allowing many threads of control in a process introduces concurrency. Each thread shares and operates within the same process address space, but each thread has its own local processor state stored in the process's thread control block (TCB). Thread management is extremely lightweight, as thread creation and context switches have minimal overhead. Threads use fast shared-memory methods to communicate and synchronise with one another.
The first organisation employs one thread to listen for client requests; it then selects (or creates) a free slave thread to carry out the request and returns to listening for more client requests. Thread creation can be dynamic, meaning that the number of threads created varies in response to demand for the service. Incoming messages are visible only to the listening thread.
In the second organisation the threads are peers and work as a team to fulfil service requests; each of them can see the requests that come in. Apart from the use of shared-memory techniques to coordinate access to the process data, the threads run independently, and the number of threads is statically specified when the process is started.
The third organisation is a thread pipeline, in which one thread's output is used as the input of the next. Requests are supplied to the pipeline's initial thread. Compilers, for example, are organised in a comparable way. Unix commands can be made up of several commands, each of which pipes its output to the next. For example, "ls -l | grep dkelly" lists all files in the current directory owned by, or named, dkelly. The Unix shell spawns a separate process for each part of the command and uses the pipe communication mechanism to connect the output of one to the input of the next. The command could be treated as a single process if "ls -l" and "grep dkelly" were executed by distinct threads.
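As a rough sketch of the dispatcher-and-workers style described above, the following uses Python's threading and queue modules: the main thread plays the listener, handing each incoming request to a small pool of worker threads. The request strings and the pool size are arbitrary assumptions.

```python
import queue
import threading

requests = queue.Queue()

def worker(worker_id: int):
    # Each worker repeatedly takes a request from the shared queue,
    # services it, and goes back for more.
    while True:
        req = requests.get()
        if req is None:            # sentinel: no more work
            break
        print(f"worker {worker_id} handled {req}")
        requests.task_done()

# Start a small pool of worker ("slave") threads.
workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for w in workers:
    w.start()

# The listener/dispatcher hands incoming requests to the pool.
for req in ["req-1", "req-2", "req-3", "req-4"]:
    requests.put(req)

requests.join()                    # wait until every request has been serviced
for _ in workers:
    requests.put(None)             # tell each worker to exit
for w in workers:
    w.join()
```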
All modern operating systems now offer threads, allowing programmers to construct concurrent programmes quickly and easily. Many windowing applications, for example, use threads to make event coordination between distinct windows easier because the threads share memory. Consider a Word Processor as an example of an application programme. The word processor's code and data are both stored in the process memory area, and the application operates as a process on the system. If the application is multithreaded, one thread can read keystrokes while another checks spelling, another checks grammar, and yet another repaginates or draws visuals, all while sharing access to the same programme code and data. Consider a web server or a database as an example of an internet server programme. If the server is multithreaded, it can launch a new thread for each new request, allowing it to handle several requests simultaneously. If a single threaded server is utilised, each request will be queued and served in turn, and the client programme may have to wait for a long time for a response if the server is busy. This lack of response might sometimes result in timeout issues. With a new thread for each concurrent request, the server can immediately acknowledge the request while still processing requests from other clients. Another alternative is to have the server create a limited number of threads when it first starts up. A client request can be handled by any available thread, but it must wait for one to become available. If resources are scarce and the machine cannot handle a high degree of multithreading, this strategy avoids thread formation time for each request and reduces the resources needed by the server. Within a single application, multithreaded applications can use more operating system resources at the same time. If many threads may operate simultaneously on a multiprocessor system, then application may run faster. Another possibility is that one thread is waiting for an I/O device while another is using the processor. Single threaded processes, on the other hand, would completely halt while waiting for I/O. Blocking communication primitives are preferred by client/server application programmers for their simplicity and understandability. This indicates that the client is waiting for a response from the server. Without multithreading, concurrent execution of client and server code is impossible. A client or server thread might wait for communication whereas other threads in the application perform different tasks using multithreading. 209 CU IDOL SELF LEARNING MATERIAL (SLM)
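A thread-per-request server of the kind described above can be sketched as follows; the port number and the simple echo protocol are assumptions made only for illustration.

```python
import socket
import threading

def handle_client(conn: socket.socket, addr):
    # Each connection is serviced in its own thread, so a slow client
    # does not block the others.
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)         # echo the request back to the client

def serve(port: int = 9000):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("", port))
        srv.listen()
        while True:                # accept requests indefinitely
            conn, addr = srv.accept()
            threading.Thread(target=handle_client, args=(conn, addr),
                             daemon=True).start()

# serve()  # uncomment to run; the loop blocks until interrupted
```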
Figure 10.5 Thread processes
A programme can have one or more loci of execution; each one is called a thread of execution. In classic operating systems, each process has its own address space and a single thread of execution. A thread is the smallest unit of processing that an operating system can schedule. Within a process, a thread is a single sequential stream of control. Threads are frequently referred to as lightweight processes because they share some of the characteristics of processes. Threads within a process allow numerous streams of execution to proceed at the same time. Threads are the entities scheduled for execution on the CPU, while processes are used to group resources together. A thread has a programme counter that keeps track of which instruction is to be executed next, registers that hold its current working variables, and a stack containing its execution history, with one frame for each procedure called but not yet returned from. Although a thread must run as part of a process, the thread and the process are distinct notions that can be managed separately. Threads extend the process model by allowing several executions to take place in the same process environment, largely independently of one another. Having several threads running in a single process is analogous to having multiple processes running on a single machine: in the former case the threads share an address space, open files, and other resources; in the latter case the processes share physical memory, discs, printers, and other resources.
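Because the threads of a process share its address space, an update made by one thread is immediately visible to the others. The sketch below uses Python's threading module (its Thread and join operations play the role of the thread create and thread join calls discussed next) to run two threads over a shared counter; the lock keeps their concurrent updates consistent.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times: int):
    global counter
    for _ in range(times):
        with lock:                 # serialise access to the shared variable
            counter += 1

# thread create: build two threads that run in the same address space.
t1 = threading.Thread(target=increment, args=(100_000,))
t2 = threading.Thread(target=increment, args=(100_000,))
t1.start()
t2.start()

# thread join: block until both threads have exited.
t1.join()
t2.join()
print(counter)   # 200000 -- without the lock the result could be smaller
```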
Three typical procedures can be seen in Figure. Each process does have its own address space as well as a single control thread. In contrast, Figure. shows a single process having three control threads. Figure 10.6 Control threads Although there are three threads in both examples, each one works in its own address space in Figure. (a), whereas in Figure. (b), they all share the same address space. A thread can be in any of the following states: running, blocked, ready, or terminated, just like a standard process (i.e., a process with only one thread). When multithreading is enabled, processes typically begin with a single thread. This thread can generate new threads by executing the thread create library operation. When a thread has completed its task, it can exit by executing the thread exit library operation. A procedure thread join can be used to wait for a (particular) thread to exit. The calling thread is blocked until a (particular) thread has departed. Thread yield is another popular thread call that allows a thread to freely give up the CPU to allow another thread to run. 10.5 VIRTUALIZATION In distributed systems, virtualization is a crucial idea. In the area of networking, we've seen the one application in virtualization in the shape of overlay networks that support specific types of distributed applications. Virtualization is also used in operating systems, and it is in this area that virtualization does have the greatest impact. In this section, we'll look at what it means to use virtualization at the operating system level (system virtualization), as well as a case study of Xen, a popular system-level virtualization solution. System virtualization aims to provide numerous virtual machines (virtual hardware images) on top of the actual physical machine architecture, each running its own operating system instance. The concept is based on the fact that computer system architectures have the capability required to host potentially enormous numbers of virtual machines plus multiplex resources among them. On virtual machines, many instances of same operating system can 211 CU IDOL SELF LEARNING MATERIAL (SLM)
operate at the same time, or a variety of operating systems could be supported. The virtualization system distributes a real machine's processor(s) and other resources among all virtual machines it supports. Processes were once used to share its processor as well as other resources across several tasks that were running under behalf of one or more users. System virtualization is a more recent development that is now widely employed for this purpose. It provides advantages in terms of security and clean task separation, as well as more precise resource allocation and pricing for each user than can be achieved using processes running in such a single system. Consider the following use cases of virtualization technology to fully comprehend the purpose for virtualization only at operating system level: On server machines, an organisation assigns each service it provides to a virtual machine before allocating the virtual machines onto physical servers in the most efficient way possible. Virtual machines, unlike processes, may be easily transferred to other physical computers, giving administrators more freedom in managing server architecture. This strategy has the potential to lower server computing costs while also lowering energy usage, which is a major concern for large server farms. The provision for cloud computing necessitates the use of virtualization. Storage, computation, and higher-level objects created on top of them are all given as a service in cloud computing. The services offered range between low-level components like physical infrastructure (known as infrastructure as a service), to software platforms like the Google App Engine (platform as a service), to arbitrary input validation services (software as a service). Indeed, virtualization enables the first by allowing cloud users to be given with one or more virtual machines for their own use. Virtualization solution creators are also driven by the necessity for application components to quickly generate and delete virtual machines with minimal overhead. This is necessary in applications that require dynamic resource demands, such as cooperative online games or distributed multimedia applications. Adopting suitable resource allocation policies that meet virtual machine quality of service needs might improve support for such apps. On the other hand, offering quick access to multiple operating system environments on such a single desktop computer is a quite different issue. Multiple operating system types can be provided using virtualization on a single physical architecture. The Parallels Desktop virtual machine monitor, for example, on either a Macintosh OS X computer, allows a Windows or Linux system to be installed and cohabit with OS X while sharing same underlying physical resources. A thin layer of software called a virtual machine monitor or hypervisor is used to implement system virtualization on top of both the underlying physical machine architecture. This 212 CU IDOL SELF LEARNING MATERIAL (SLM)
virtual machine monitor presents an interface that is based on the underlying physical architecture. In full virtualization, the virtual machine monitor provides an interface that is identical to the underlying hardware architecture. This has the advantage that existing operating systems can run on the virtual machine monitor transparently and without modification. However, on many computer architectures, such as the x86 family of processors, experience has shown that full virtualization is difficult to achieve with satisfactory performance, and that performance can be improved by offering a modified interface instead (with the drawback that operating systems must then be ported to this modified interface).
Virtualization is a technique that uses software to construct an abstraction layer above the computer hardware, allowing a single computer's physical elements (processors, memory, storage, and other components) to be divided into several virtual computers, or virtual machines (VMs). Even though it runs on only a part of the real underlying hardware, each VM runs its own operating system (OS) and behaves like an independent machine. As a result, virtualization allows more efficient use of physical computer hardware and a higher return on a company's hardware investment. Virtualization is a common technique in today's enterprise IT architecture, and it is the technology that drives the economics of cloud computing. Virtualization allows cloud providers to serve customers using their existing physical computer hardware, and it allows cloud users to buy only the computing resources they require, when they require them, and to scale those resources cost-effectively as their workloads grow.
Benefits of Virtualization: Data centre operators and service providers benefit from virtualization in several ways:
More efficient use of hardware: prior to virtualization, every application server had its own dedicated physical machine, which meant that IT staff had to buy and set up a new server for each application they intended to run. (For reasons of reliability, IT preferred one application and one operating system (OS) per computer.) Every physical server was invariably underutilised. Server virtualization, in contrast, allows multiple programmes to run on a single physical machine (usually an x86 server) without affecting stability, making the most efficient use of the physical hardware's computational capacity.
Easier administration: it is easier to apply and administer policies specified in software when physical computers are replaced by software-defined virtual machines. This makes it possible to build automated IT service-management workflows. Administrators can use automated deployment and configuration tools to define groups of virtual machines and applications as services in software templates, for example, so that they can install such services repeatedly and consistently without going through a tedious and time-consuming manual process each time.
Manual setup, by contrast, is time-consuming and prone to errors. Administrators can also use virtualization security policies to impose particular security configurations based on the virtual machine's role, and policies can even improve resource efficiency by decommissioning unused virtual machines to save space and processing power.
Downtime is minimised: operating system and application failures cause downtime and impair user productivity. Administrators can run several redundant virtual machines side by side and fail over between them if one fails; running several redundant physical servers would be far more expensive.
Faster provisioning: buying, installing, and configuring hardware for each application takes time. Provided the hardware is already in place, provisioning virtual machines to run all of your applications is substantially faster, and management software can automate it and incorporate it into existing workflows.
Types of Virtualizations: We have only looked at server virtualization so far, but many other parts of the IT infrastructure can be virtualized to give major benefits to IT managers (in particular) and the organisation as a whole:
Desktop virtualization
Network virtualization
Storage virtualization
Data virtualization
Application virtualization
Data center virtualization
CPU virtualization
GPU virtualization
Linux virtualization
Cloud virtualization
Desktop Virtualization
Desktop virtualization allows numerous desktop operating systems to run on the same computer, each in its own virtual machine. It can be divided into two categories:
Virtual desktop infrastructure (VDI) runs multiple desktops in virtual machines (VMs) on a central server and streams them to thin client devices. In this approach, VDI enables an organisation to provide its users with access to a variety of operating systems from any
device, without the need to instal operating systems on those devices. For a more detailed explanation, see \"What is Virtual Desktop Infrastructure (VDI)?\" Local desktop virtualization instals a hypervisor on a local computer, allowing the user to run one or more extra operating systems on that machine and switch between them as needed without affecting the primary OS. Network Virtualization Network virtualization use software to generate a “view” of the network that allows an administrator to control the network from a single console. It abstracts hardware pieces and operations (for example, connections, switches, and routers) into software that runs on a hypervisor. The network administrator may change and control these features without having to touch the underlying physical components, making network management easier. Software-defined networking (SDN) virtualizes hardware which controls network traffic routing (referred to as the \"control plane\"), while network function virtualization (NFV) virtualizes one maybe more hardware appliance which provide a specific network purpose (e.g., a firewall, load balancer, as well as traffic analyzer), trying to make those appliances easier to configure. Storage virtualization Storage virtualization allows all storage devices on a network to be accessed & controlled by a single storage device, regardless of whether they're placed on individual servers or stand- alone storage units. Storage virtualization, in particular, consolidates all storage blocks into a single common database from which they'll be allocated to any VM mostly on network as needed. Storage virtualization makes it easy to provision storage for virtual machines and makes the most of all network storage. Data virtualization Data from numerous applications is stored in multiple locations, ranging first from cloud to on-premise hardware and the software systems, utilising multiple file formats. Any programme can access all of that data thanks to data virtualization, regardless of its source, format, or location. Between the applications that access the data and the systems that store it, data virtualization tools build a software layer. The layer converts a data request or query from an application as needed & returns results that can span several systems. When alternative forms of integration aren't practical, acceptable, or inexpensive, data virtualization can assist break down data silos. Application Virtualization Application virtualization allows users to execute software without having to instal it on their operating system. This differs from full desktop virtualization (discussed above) in that only 215 CU IDOL SELF LEARNING MATERIAL (SLM)
the programme runs in a virtual environment, but the OS on the end user's device continues to function normally. Application virtualization can be divided into three categories: Local application virtualization: Instead of running on native hardware, the complete application runs in a runtime environment on the endpoint device. Streaming applications: The programme is hosted on a server, which distributes small software components to the end user's device as needed. Virtualization of server-based applications the application is totally executed on a server, with the client device receiving only the user interface. Data center virtualization Virtualization of data centres abstracts the majority of a data center's hardware into software, allowing an administrator to separate a single physical data centre into many virtual data centres for various clients. Each client has access to their own infrastructure as a service (IaaS), which is hosted on the same physical hardware as the others. Virtual data centres provide a simple entry point into cloud computing, allowing businesses to quickly build up a complete data centre environment without having to invest in infrastructure hardware. CPU Virtualization Virtualization of the CPU (central processing unit) is the foundational technology that allows hypervisors, virtual machines, & operating systems to exist. It divides a single CPU into many virtual CPUs that can be used by multiple VMs. Initially, CPU virtualization was solely software-defined, but today's processors feature extended instruction sets that support CPU virtualization, resulting in improved VM performance. GPU Virtualization A GPU (graphics processing unit) is a multi-core processor that boosts overall computer performance by handling graphic and mathematical processing. GPU virtualization allows several virtual machines (VMs) to share a single GPU's processing capacity, allowing for quicker video, AI, and other graphic- or math-intensive applications. Pass-through GPUs provide a single guest OS access to the full GPU. For usage by server-based VMs, shared vGPUs divide actual GPU cores among numerous virtual GPUs (vGPUs). Linux Virtualization The kernel-based virtual machine (KVM) is a Linux hypervisor that supports Intel and AMD's virtualization processor extensions, allowing you to create x86-based VMs from within a Linux host OS. 216 CU IDOL SELF LEARNING MATERIAL (SLM)
Linux is highly adaptable as an open source operating system. You can create virtual machines (VMs) that run customised versions of Linux or security-hardened versions of Linux for more sensitive applications. Cloud Virtualization As previously stated, virtualization is essential to the cloud computing model. Cloud computing providers can provide a variety of services to consumers by virtualizing servers, storage, as well as other physical data centre resources, such as: Infrastructure as a service (IaaS) refers to virtualized server, storage, & network resources that you can tailor to your needs. PaaS (platform as a service) refers to virtualized developer tools, databases, as well as other cloud-based services that you can use to create your own cloud-based apps and solutions. Software as a service (SaaS) refers to cloud-based software applications. SaaS is the cloud- based service that is the most decoupled from hardware. 10.6 CLIENT AND SERVER A client-server networking architecture is one in which computers, such as servers, give network services to other computers, such as clients, in order for them to conduct user-based tasks. Client-server networking is the name given to this paradigm. The following strategies should be followed by client-server application programmes: Figure 10.7 Client server A client programme is an application programme that runs on a local machine and requests a service from a server programme that runs on a remote machine. 217 CU IDOL SELF LEARNING MATERIAL (SLM)
A client programme runs only when it requests a service from the server, whereas the server software operates continuously because it has no way of knowing when its service will be needed. A server serves a large number of clients rather than a single client. As a result, we can say that the client-server relationship is one-to-many. One server can serve a large number of clients. Many users have a specialised client-server application software, and services are commonly required. The client-server application software, for example, allows users to access files, send e-mail, and etc. If the services are more specialised, we should have a single general application software which enables users to access the remote computer's services. Client A client is an application that runs on a local machine and asks the server for help. A client programme is a finite programme, which means that the user initiates the service and it ends when it is completed. Server A server is a piece of software that runs on a remote computer and provides services to clients. The server opens a door for incoming requests when a client wants a service, but it never launches the service. A server programme is an endless programme, which implies it will operate indefinitely unless a problem occurs. The server sits and waits for the clients' requests. The server responds to the request when it arrives at the server. Client-server networks provide the following benefits: In client-server networks, centralised backup is possible since all data is saved on a server. Security: Because all shared resources are managed centrally, these networks are more secure. Performance: Using a dedicated server boosts the speed with which resources are shared. This improves the entire system's performance. Scalability: The number of clients and servers can be increased separately, i.e., a new element can be added at any time, or a new node in a network can be added at any time. Client-Server Network Disadvantages: In Client/Server networks, traffic congestion is a major issue. When a high number of customers send requests to the same server, traffic congestion can occur. It lacks network robustness, i.e., when the server is unavailable, the client requests are unable to be fulfilled. 218 CU IDOL SELF LEARNING MATERIAL (SLM)
A client/server network can be expensive: ordinary computer hardware may not be able to handle a large number of clients, and in such cases specialised hardware is required on the server side to carry out the task.
The resources may be present on the server but not on the client. For example, if the application is web-based, we cannot print directly to a printer without first closing the print-preview window.
The working of the client-server model
Browsers are used to access the internet; understanding this model gives us a solid web foundation and helps us navigate web technologies with ease.
Client: a client is a person or an organisation that uses a particular service. In the digital world, a client is a computer (host) capable of receiving information or using a specific service from the service providers (servers).
Servers: in the same way, a person or medium that serves something is called a server. In the digital world, a server is a remote computer that provides information (data) or access to particular services. So the client requests something, and the server fulfils the request, provided the requested information is present in its database.
Figure 10.8 Working of client server model

10.7 CODE MIGRATION
Data migration is the process of moving information from one location to another. Although this may appear to be a straightforward task, it involves a change of storage, database, or application. Viewed in terms of the extract/transform/load (ETL) process, any data migration requires at least the transform and load phases: the extracted data is passed through a series of transformation functions before being loaded into the target location.
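As a toy illustration of those transform and load phases, the sketch below extracts some records, reshapes them, and loads them into an in-memory target; the record layout, the cleaning rules, and the target itself are assumptions made purely for illustration.

```python
def extract():
    # In practice this would read from the legacy database or file store.
    return [{"NAME": " Alice ", "AGE": "34"}, {"NAME": "Bob", "AGE": "41"}]

def transform(rows):
    # Clean and reshape each record to match the target schema.
    return [{"name": r["NAME"].strip().lower(), "age": int(r["AGE"])} for r in rows]

def load(rows, target):
    # In practice this would write to the new database or storage system.
    target.extend(rows)

target_store = []
load(transform(extract()), target_store)
print(target_store)   # [{'name': 'alice', 'age': 34}, {'name': 'bob', 'age': 41}]
```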
Companies migrate data for a variety of reasons, including infrastructure redesign, database upgrades, building a new data warehouse, and integrating new data from an acquisition or another source. Data migration is also frequently required when installing a new system alongside existing applications.
There are three broad types of data movers. Host-based software is suited to application-specific migrations such as platform upgrades, database replication, and file copying. Array-based software is most commonly used to move data between similar storage systems. Network appliances migrate volumes, files, or blocks of data, depending on their configuration.
Making a data migration strategy: a data migration project can be difficult because managers must preserve data privacy, keep costs under control, and time the project so that it has minimal impact on the business. Any errors that arise during data migration can harm the organisation's business, so a well-defined data migration procedure is necessary to ensure that business processes are not disrupted. The length of the migration, the amount of downtime required, and the risks of data corruption, application performance problems, and data loss are all factors to consider when planning a data migration.
Application migration: application migration may be necessary when moving to a different vendor's programme or platform. Because systems interact with one another and each has its own data model, this kind of migration has its own inherent complexity. Management tools, operating systems, and virtual machine configurations will all differ from those in use when the programme was developed or deployed, so successful application migration may require middleware products to bridge the technology gaps.
Database migration: a database migration is carried out when there is a need to change storage vendors, upgrade the database infrastructure, or move a database to the cloud. When a protocol or data format is updated, the underlying data changes, which has an impact on the application layer. Server data replication, by contrast, is concerned with moving data without changing the schema. Among the most important tasks are assessing the database's size to determine how much storage is needed, testing the software, and protecting the confidentiality of the records. Compatibility problems may arise during the migration, so it is important to verify everything.
Storage Migration: storage migration is usually driven by infrastructure changes, and the procedure is seen as an ideal time to validate and reduce data by identifying obsolete or incorrect data.
The approach entails moving blocks of data and files from one storage medium to another, whether disc, tape, or the cloud. A variety of products and appliances are available to make storage transfers go more smoothly, and relocating storage is also an opportunity to address orphaned storage and other inefficiencies.
Data migration vs. data integration: what's the difference? Data migration is not the same as data integration. Data migration entails copying or moving data from one device or environment to another, whereas data integration is about combining and moving data across programmes and structures. Administrators who work with data should be familiar with data extraction, transformation, and loading technology, since each approach has its own underlying layers of complexity and its own data model.
Migration "Big Bang": in a big bang data migration, the full change is completed in a short period of time. Live systems experience downtime while data is processed by ETL and loaded into the new database. The advantage of this method is that everything takes place in a single, timed event that takes relatively little time to complete; however, because one of the organisation's systems is unavailable during that window, the pressure can be significant, and a failed run risks a corrupted installation.

10.8 SUMMARY
As computing moves more and more to the cloud, load balancing plays an increasingly crucial security role. A load balancer's off-loading function protects an organisation from distributed denial-of-service (DDoS) attacks. This is accomplished by redirecting attack traffic away from the corporate server and onto a public cloud
provider. As the quantity and size of DDoS attacks grows, they represent a significant share of cybercrime. Hardware defences, such as perimeter firewalls, can be expensive and time-consuming to maintain. Cloud offloading software load balancers provide efficient and cost-effective protection. User traffic is distributed among many instances of your apps using a load balancer. Load balancing decreases the possibility of performance difficulties in your applications by distributing the load. Use load balancing on a global scale. When your backends are deployed across several regions and your users need access to the same applications and content, you want to use a single anycast IP address to give that access. IPv6 termination is also possible with global load balancing. When your backends are all in the same region and you only need IPv4 termination, use regional load balancing. Process management is an approach to organisational architecture that indicates that firm operations are organised and optimised in processes. It supports all model types that you need to create cohesive and visible process documentation, such as the business model and organisational chart. In distributed systems design, dealing with failures is a key issue. Hardware and software failures are the two most common types of failures. Until the late 1980s, hardware breakdowns were a major worry, but internal hardware dependability has vastly improved since then. Smaller circuits produce less heat and use less power, reducing off-chip connections and wiring, and using high-quality manufacturing procedures have all helped to improve hardware reliability. Connections and mechanical devices, such as network and drive failures, are the most common causes of difficulties today. 10.9 KEYWORDS Proxy server-A server which provides a shared cache of resources for the client machines at a site or across several sites. The purpose of web proxy servers is to reduce load on the wide-area network and to act as a security measure across firewall. Synchronous interaction model-A model of distributed systems in which lower/upper time bounds are known for the time to: execute each step, transmit a message and clock drift rates. Thin client-A software layer that supports a window-based user interface on a computer that is local to the user while executing application programs on a remote computer. 222 CU IDOL SELF LEARNING MATERIAL (SLM)
WAN (Wide Area Network)-A network that uses technology designed to span a large geographic area. For example, a satellite network is a WAN because a satellite can relay communication across an entire continent. WANs have higher propagation delay than LANs. Connectionless communication service-A communication service, e.g., UDP, based on the sending and receiving of messages which can get lost/duplicated/delivered out of order/corrupted without telling the user. 10.10 LEARNING ACTIVITY 1. For the different failure types listed above, consider what makes each one difficult for a programmer trying to guard against it. What kinds of processing can be added to a program to deal with these failures? 2. Contrast TCP and UDP. Under what circumstances would you choose one over the other? 10.11 UNIT END QUESTIONS 223 A. Descriptive Questions Short questions 1. What are the different states of a process? 2. Define peer-peer communication. 3. Differentiate IGRP and ICRP. 4. Define pin hole congestion. 5. What is PCB? Long questions 1. Explain the different states of a process. 2. Describe about process mitigation. 3. How can we make use of threads? 4. Describe about virtualization. 5. Explain the requirements of code mitigation. CU IDOL SELF LEARNING MATERIAL (SLM)
B. Multiple Choice Questions 1. _____is a natural feature of the server-to-endpoint device forwarding process. a. security b. Load sharing c. Resistance d. quality 2. Resource sharing is also the foundation of _____management. a. local b. dedicated c. resource d. supply 3. Processes & threads on the CPUs are______. a. scheduled b. Administered c. complex d. Work stations 4. A system call also known as a ______ a. Replica b. Random c. TCP d. software interrupt 5. A ____ is a running programme. a. program b. TCP c. process d. provision 224 CU IDOL SELF LEARNING MATERIAL (SLM)
Answers 1-b, 2- c, 3- a, 4- d, 5- c 10.12 REFERENCES Reference books George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design” (4th Edition), Addison Wesley/Pearson Education. Pradeep K Sinha, “Distributed Operating Systems: Concepts and design”, IEEE computer society press. Text Book References M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers2009. Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5. Websites: https://www.techopedia.com/definition/10062/wireless-communications https://www.computernetworkingnotes.com/ https://www.guru99.com 225 CU IDOL SELF LEARNING MATERIAL (SLM)
UNIT 11- SYNCHRONIZATION STRUCTURE 11.0Learning Objectives 11.1Introduction 11.2 Clock synchronization 11.3 Logical clock 11.4Election Algorithm 11.5 Summary 11.6Keywords 11.7Learning Activity 11.8 Unit end questions 11.9References 11.0 LEARNING OBJECTIVES After studying this unit, you will be able to: Explain the basics of synchronization Outline the procedures of clock synchronization Describe about Logical clock Explain about election algorithm 11.1 INTRODUCTION Synchronization is the simultaneous operation of a system through coordinating events. The conductor of an orchestra, for example, maintains the symphony in sync or in time. Systems that work with all parts in sync are referred to as synchronous, whereas those that do not are referred to as asynchronous. Today, satellite navigation signals may be used to synchronise time across systems all over the world. A distributed system is a group of computers connected by a high-speed communication network. Message passing allows the hardware components in a distributed system to interact and coordinate their operations. In distributed systems, each node can share its resources with other nodes. As a result, efficient resource allocation is required to maintain the status of resources and to aid coordination between the various operations. 226 CU IDOL SELF LEARNING MATERIAL (SLM)
Synchronization is employed to address such issues. Clocks are used in distributed systems to achieve synchronisation, and nodes adjust their times using physical clocks. Each node in the system can communicate its local time to the other nodes. Time is set with reference to UTC (Coordinated Universal Time), which the nodes in the system use as their reference clock. Clock synchronisation can be achieved in two ways: external and internal. External clock synchronisation uses an external reference clock; it serves as a point of reference, and the system's nodes set and adjust their times accordingly. In internal clock synchronisation, each node shares its time with the other nodes, and each node adjusts its own clock accordingly.
Clock synchronisation algorithms are divided into two categories: centralised and distributed. A centralised algorithm uses a time server as a reference: a single time server propagates the time to the nodes, and all nodes adjust their clocks accordingly. Because it relies on a single time server, if that server fails the entire system falls out of synchronisation. The Berkeley algorithm and the passive and active time-server algorithms are examples of centralised algorithms. Distributed algorithms have no centralised time server; instead, each node starts from its local time and averages the time differences with the other nodes. Distributed algorithms address the scalability and single-point-of-failure problems of centralised algorithms. The global averaging algorithm, the localised averaging algorithm, and NTP (the Network Time Protocol) are examples of distributed algorithms.
In long-distance maritime navigation, timekeeping and the synchronisation of clocks was a key issue. Before radio and satellite navigation, navigators had to use accurate time and astronomical observations to determine how far east or west their ship had travelled, and marine navigation was revolutionised by the invention of an accurate marine chronometer. By the end of the nineteenth century, important ports offered time signals in the form of a signal gun, flag, or dropping time ball so that mariners could check and correct their chronometers. Synchronisation was also critical to the operation of nineteenth-century railways, the first mode of transport fast enough for differences in local mean time between nearby towns to matter. Each line dealt with the issue by synchronising all of its stations to headquarters according to a standard railway time. In some territories companies shared a single track and had to avoid collisions, so the need for strict timekeeping soon led them to agree on a single time standard, and civic authorities eventually abandoned local mean time in favour of railway time.
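Returning to the centralised algorithms mentioned at the start of this section, the arithmetic at the heart of a Berkeley-style synchronisation round can be sketched as follows; the clock readings are invented, and a real implementation would also compensate for message delays and discard outlying readings.

```python
# Clock readings (in seconds) collected by the coordinator from itself and
# from the other nodes; the values are invented for illustration.
readings = {"coordinator": 1000.0, "node-a": 1002.5, "node-b": 998.0}

# The coordinator averages the readings.
average = sum(readings.values()) / len(readings)

# Each node is then told the adjustment to apply to its own clock, rather
# than an absolute time, so the correction does not depend on how long the
# reply takes to arrive.
adjustments = {node: average - t for node, t in readings.items()}
print(adjustments)   # e.g. {'coordinator': 0.17, 'node-a': -2.33, 'node-b': 2.17}
```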
Processes are classified into one of two categories based on their synchronisation:
Independent Processes: The execution of one process has no bearing on the execution of the others.
Cooperative Processes: The execution of one process affects the execution of other processes. The challenge of process synchronisation arises in cooperative processes because resources are shared.

Race Condition
When more than one process executes the same code or accesses the same memory or shared variable, there is a chance that the value of the shared variable, and therefore the output, ends up incorrect, because all of the processes race to access and update it. This is known as a race condition. When several processes access and modify the same data at the same time, the outcome depends on the order in which the accesses take place. A race condition is something that can happen within a critical section: it occurs when the result of multiple threads executing in the critical section varies with the order in which the threads run. If the critical section is treated as an atomic instruction, race conditions in critical sections can be avoided. Race problems can also be avoided by employing thread synchronisation techniques such as locks or atomic variables.

11.2 CLOCK SYNCHRONIZATION
Clock synchronisation is a topic in engineering and computer science that aims to coordinate otherwise independent clocks. Real clocks will drift apart after a certain length of time owing to clock drift, which is caused by clocks counting time at slightly different rates. A variety of problems arise as a result of clock rate differences, and there is a variety of remedies, some more appropriate than others in particular situations.

Clock synchronisation in serial communication can refer to clock recovery that achieves frequency synchronisation rather than full phase synchronisation. Telecommunications synchronisation and automatic baud rate detection both use this type of clock synchronisation. A system with frequency synchronisation and loose phase synchronisation is referred to as plesiochronous or isochronous operation. Synchronous operation implies a tighter synchronisation based on time as well as frequency.

Because it is difficult to manage time at smaller scales, problems with clock skew become more complicated in distributed computing, where multiple computers must agree on the same global time. For example, the make command in Unix systems is used to build new
or modified code while avoiding recompiling unaltered code. The make command determines which source files need to be recompiled based on the clock of the computer it runs on. If the sources are on a separate file server and the two machines' clocks are out of sync, the make programme may not produce the expected output.

Synchronisation is also necessary for correct playback of streaming media, and the synchronisation of clocks is an important part of audio-over-Ethernet systems.

The synchronisation solution is simple in a system with a central server: the server dictates the system time. In this scenario, Cristian's algorithm and the Berkeley algorithm are potential answers to the clock synchronisation problem. Because a global time is not easily determined in distributed computing, the situation becomes more complicated there. The Network Time Protocol (NTP), a layered client-server architecture based on User Datagram Protocol (UDP) message passing, is the most widely used time synchronisation system on the Internet. In distributed computing, Lamport timestamps and vector clocks are forms of the logical clock. Due to the risk of synchronisation packets colliding on the wireless medium and the increased drift rate of clocks on low-cost wireless devices, the problem becomes considerably more difficult in a wireless network.

A distributed system is made up of a group of processors which communicate via message passing rather than a central clock. Nonetheless, the processors must frequently obtain some shared idea of time, where time might refer to either an approximation of real time or just an integer-valued counter. Clock synchronisation is the technique used to coordinate this notion of time. Synchronised clocks are useful for a variety of reasons. A distributed system is frequently built to achieve some synchronised behaviour, as in real-time processing in factories, aeroplanes, space vehicles, and military applications. If clocks are synchronised, algorithms can be run in "rounds", and algorithms intended for synchronous systems can be used. Version management and concurrency control in database systems rely on the ability to assign timestamps and version numbers to files and other objects. Some timeout-based algorithms, such as communication protocols, are extremely time-sensitive.

Giving each processor a receiver and using satellite-based time signals is one way to keep clocks synchronised, but this approach raises clear concerns about reliability and cost. The other option is to use software and develop synchronisation algorithms. Because the processors will not have access to a random number generator, probabilistic techniques are ruled out. We allow up to f faults in the network, with a fault being either a faulty processor or a faulty link. If f is always 0, we call the system reliable; otherwise the system is unreliable.
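As an illustration of the central-server approach, the following Python sketch shows the core of Cristian's algorithm mentioned above; request_server_time() is a stand-in for a real network call and is assumed here, not part of any library.

import time

# Sketch of Cristian's algorithm for a single client and time server.

def request_server_time():
    # Placeholder: in practice this would be a UDP/TCP request to the server.
    return time.time()

def cristian_sync():
    t0 = time.monotonic()              # client clock when the request is sent
    server_time = request_server_time()
    t1 = time.monotonic()              # client clock when the reply arrives
    round_trip = t1 - t0
    # Assume the reply took roughly half of the round trip to come back,
    # so the server's clock has advanced by about round_trip / 2.
    estimated_now = server_time + round_trip / 2
    return estimated_now, round_trip

estimate, rtt = cristian_sync()
print("estimated time:", estimate, "measured RTT:", rtt)

The accuracy of the estimate is bounded by half the round-trip time, which is why the client usually repeats the exchange and keeps the reply with the smallest RTT.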
Although some fault tolerance research distinguishes between node and link faults (e.g., [DHSS]), for the sake of simplicity we will assume that only node faults occur. If a link is broken, we can choose one of the two nodes that make up the broken link at random and designate that node as faulty. This is definitely a conservative assumption, because the chosen node could be the endpoint of many non-faulty links, all of which are now deemed faulty.

We have narrowed our focus to node faults, but there is still a number of models to choose from. The most basic of these models, known as fail safe, assumes that the only mode of fault is a processor crash. There is also the possibility that shortly before a processor crashes, it sends a signal to the rest of the system that it is about to crash; this is the only type where the broken processor is considerate enough to let the others know. Unannounced processor crashes, also known as failure-stop faults, are a more insidious form of failure. The omission fault model comes next in the fault hierarchy. In this case, the processor may simply fail to send or relay a message. A crashed processor will, of course, fail to send all of its messages.

11.3 LOGICAL CLOCK
Logical clocks refer to a technique implemented on all machines in a distributed system so that events are ordered consistently within some virtual timeframe. In a distributed system, a logical clock is a mechanism for capturing temporal and causal relationships. Because distributed systems may lack a physically synchronised global clock, a logical clock is used to provide a global ordering of events from different processes.

For example, when we go out, we have a detailed plan of where we should go first, second, and so on. We do not start with the second place and then move to the first; we always follow the order that has been planned ahead of time. Similarly, activities on our computers should be carried out one by one, in a systematic manner. Assume we have more than ten PCs in a distributed system, each conducting its own task. How can we get them to cooperate? The solution is the logical clock.

Method 1: Try to synchronise the clocks so as to order events across processes. This means that if one PC has a time of 2:00 p.m., all PCs should show the same time, which is impossible; not all clocks can be synchronised at the same moment, so this method cannot be used.

Method 2: Assign timestamps to events instead. Using the example above, this means that we allocate 1 to the first place, 2 to the second place, 3 to the third place, and so on. Then we know that the first place will always take precedence, and so
on. Similarly, if we assign a unique number to each PC, the system will be ordered so that the first PC completes its task first, followed by the second, and so on.

However, timestamps will only work if they respect causality. What is causality? The HAPPEN-BEFORE RELATIONSHIP is the sole basis for causality: if A happens before B and A has a timestamp of 1, then B should have a timestamp greater than 1. Consider the following scenario: you send a message to someone at 2:00:00 pm, and they receive it at 2:00:02 pm. Then TS(sender) < TS(receiver).

Happen-Before Relationship – Derived Properties
Transitive Relation – If TS(A) < TS(B) and TS(B) < TS(C), then TS(A) < TS(C).
Causally Ordered Relation – a -> b means that a happens before b, and any change in a can affect b.
Concurrent Events – A || B indicates that the two events are not ordered with respect to each other; they may happen simultaneously rather than one after the other.

In a distributed system, a logical clock is a technique for capturing temporal and causal relationships. Distributed systems frequently lack a physically synchronised global clock. Fortunately, in many programmes (such as distributed GNU make builds), the lack of synchronisation is undetectable as long as two processes never communicate. Furthermore, in these applications, agreeing on the ordering of events (i.e., a logical clock) rather than on the wall-clock time is sufficient. Leslie Lamport proposed Lamport timestamps, the first logical clock implementation, in 1978.

Each process keeps two data structures in a logical clock system: logical local time and logical global time. The process uses logical local time to timestamp its own events, while logical global time is its local knowledge of global time. When processes exchange data, a protocol is needed to update the logical local time after each local event and the logical global time after each message exchange.

Physical time is used to order events: an event at 8:15 a.m., for example, occurs before an event at 8:16 a.m. Physical clocks are not always accurate in distributed systems, so we cannot rely on physical time to order events. Logical clocks, on the other hand, can be used to generate a partial or total order of events.
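A minimal sketch of a Lamport logical clock follows, using the rules implied above: increment before every local event or send, and on receive take the maximum of the local and received timestamps and then increment. The class and method names are illustrative.

# Sketch of a Lamport logical clock for one process.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        self.time += 1
        return self.time          # timestamp carried by the outgoing message

    def receive(self, msg_timestamp):
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

# Two processes exchanging one message: TS(send) < TS(receive) always holds.
p1, p2 = LamportClock(), LamportClock()
ts_send = p1.send()               # p1's clock -> 1
p2.local_event()                  # p2's clock -> 1
ts_recv = p2.receive(ts_send)     # p2's clock -> max(1, 1) + 1 = 2
print(ts_send, ts_recv)           # prints 1 2, so happen-before is preserved

Note that the converse does not hold: two events with different timestamps are not necessarily causally related, which is exactly why the resulting order is only partial.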
Sets
A set is an unordered collection of distinct things. For example, we can bundle the integers 3, 11, and 0 into a set denoted {3,11,0}. Or, we could throw the integer 42 into a set with only one element: {42}. Or, we could have the empty set {}. We say an element a is a member of A if A contains a, and we denote this a ∈ A.

A subset A of a set B is a set whose elements are all members of B. We denote that A is a subset of B as A ⊆ B. For example: {1,2} ⊆ {1,2,3}, {} ⊆ {1,2,3}, and {1,2,3} ⊆ {1,2,3}. A set A is a strict subset of B, denoted A ⊂ B, if A ⊆ B and A ≠ B.

The Cartesian product of two sets A and B, denoted A × B, is the set of ordered pairs (a,b) for every a ∈ A and b ∈ B. For example, {1,2} × {a,b,c} = {(1,a), (1,b), (1,c), (2,a), (2,b), (2,c)}.

Relations
A binary relation R on a set A is a subset of A × A. Similarly, a binary relation on sets A and B is a subset of A × B. For example, here are a couple of relations on {1,2,3} and {a,b,c}: {(1,a), (2,c)}, {(1,a), (2,b), (3,c)}, {}, and {(1,a), (1,b), (1,c)}. We can denote (a,b) ∈ R as aRb. Consider, for example, the familiar less-than relation, <, on the set of natural numbers. Two naturals i and j are in the relation < if i is less than j. We denote this i < j. Concretely, < is the infinite set {(0,1), (0,2), (0,3), …, (1,2), (1,3), …}.

Partial and Total Orderings
An irreflexive partial ordering on a set A is a relation on A that satisfies three properties:
1. irreflexivity: a ≮ a
2. antisymmetry: if a < b then b ≮ a
3. transitivity: if a < b and b < c then a < c

11.4 ELECTION ALGORITHM
An algorithm that operates on a distributed system is known as a distributed algorithm. A distributed system is a group of separate computers that do not share memory. Each CPU has its own memory and communicates with other processors through communication networks. In a network, communication is achieved by a process on one system interacting with a process on another machine. Many distributed system methods require a coordinator that performs duties needed by the other processes in the system. The goal of election algorithms is to select this coordinator.
Election Algorithms: Election algorithms select a coordinator process from a collection of processors. If the coordinator process fails for whatever reason, a new coordinator is elected on a different processor. In essence, the election algorithm decides where a new copy of the coordinator should be started. According to the election algorithm, every active process in the system has a unique priority number, and the process with the highest priority will be chosen as the new coordinator. As a result, when a coordinator fails, this algorithm elects the active process with the highest priority number as the replacement. That number is then broadcast to all active processes in the distributed system.

We have two election algorithms for two different distributed system architectures.

1. The Bully Algorithm – This algorithm is used in systems where every process can send messages to every other process in the system. Consider the following scenario: Process P sends a message to the coordinator. If the coordinator does not respond within a time interval T, it is presumed to have failed. Process P then sends an election message to every process with a higher priority number. It waits for responses and, if no process responds within time interval T, it elects itself as the coordinator. It then notifies all processes with lower priority numbers that it has been elected as the new coordinator. If some other process Q responds within time T, then:
(I) Process P waits for a time interval T' to receive a further message from Q indicating that Q has been elected as coordinator.
(II) If Q does not respond within the time interval T', the election is considered to have failed and the procedure is restarted.

2. The Ring Algorithm – This algorithm is applicable to ring-organised systems (logically or physically). In this algorithm we assume that the links between the processes are unidirectional and that each process can only communicate with the process on its right. The data structure that this algorithm employs is the active list, a list that contains the priority numbers of all active processes in the system.

If process P1 detects a coordinator failure, it creates a new active list, which is initially empty. It sends an election message to its right-hand neighbour and adds its own number, 1, to its active list.

When process P2 receives a message from the process on its left, it responds in one of three ways:
(I) If the active list in the received message does not contain 2, P2 adds 2 to its active list and forwards the message.
(II) If this is the first election message P2 has received or sent, it creates a new active list with the numbers 1 and 2. It then sends out election message 1 followed by election message 2.
(III) If process P1 receives its own election message 1 back, the active list for P1 now contains the numbers of all the active processes in the system. Process P1 then chooses the highest priority number from the list as the new coordinator.
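The following minimal Python sketch follows the spirit of the ring election just described. It simulates a single, uncontested election round, leaves out message passing and concurrent elections, and the function name is purely illustrative.

# Sketch of one ring election round. Sites are numbered 0..ring_size-1 and
# arranged in a ring; each site forwards to (i + 1) % ring_size. The active
# list collects the numbers of all live sites; the highest one wins.

def ring_election(initiator, ring_size, alive):
    """initiator: site that detected the coordinator failure.
    alive: set of site numbers that are currently up."""
    active_list = [initiator]
    site = (initiator + 1) % ring_size
    while site != initiator:          # the message travels once around the ring
        if site in alive:             # a crashed site cannot add itself,
            active_list.append(site)  # so it never appears in the list
        site = (site + 1) % ring_size
    return max(active_list)           # highest priority number becomes coordinator

print(ring_election(1, 5, alive={0, 1, 3, 4}))   # site 2 (old coordinator) is down -> 4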
11.5 SUMMARY
Clock synchronisation is a field of computer science and engineering that aims to coordinate otherwise independent clocks. Real clocks will drift apart after a certain length of time owing to clock drift, which is caused by clocks counting time at slightly different rates.
Clock synchronisation in serial communication can refer to clock recovery that achieves frequency synchronisation rather than full phase synchronisation. Telecommunications synchronisation and automatic baud rate detection both use this type of clock synchronisation. A system with frequency synchronisation and loose phase synchronisation is referred to as plesiochronous or isochronous operation. Synchronous operation entails a tighter synchronisation that is based on time as well as frequency.
The synchronisation solution is simple in a system with a central server: the server sets the system time. In this scenario, Cristian's algorithm and the Berkeley algorithm are potential answers to the clock synchronisation problem. Because a global time is not easily determined in distributed computing, the situation becomes more complicated. The Network Time Protocol (NTP), a layered client-server architecture based on User Datagram Protocol (UDP) message passing, is the most widely used time synchronisation system on the Internet. In distributed computing, Lamport timestamps and vector clocks are forms of the logical clock. Due to the risk of synchronisation packets colliding on the wireless medium and the increased drift rate of clocks on low-cost wireless devices, the problem becomes considerably more difficult in a wireless network.
In essence, the election algorithm decides where a new copy of the coordinator should be started. Every active process in the system has a unique priority number, and the process with the highest priority will be chosen as the new coordinator.

11.6 KEYWORDS
Platform - The lowest layer of hardware and software layers, typically hardware plus operating system.
Socket abstraction - An endpoint for communication between processes supported by most operating systems, which is bound to a local port and an Internet address.
TCP (Transmission Control Protocol) - The TCP/IP protocol that provides application programs with access to a connection-oriented communication service. TCP offers reliable, flow-controlled delivery. More importantly, TCP accommodates changing conditions in the Internet by adapting its retransmission scheme.
Tunnelling - A type of communication needed when a pair of nodes belonging to different networks communicate through an 'alien' protocol.
URL (Uniform Resource Locator) - A syntactic form used to identify a page of information on the World Wide Web.

11.7 LEARNING ACTIVITY
1. Here is an interesting problem called partial connectivity that can occur in a distributed environment. Let's say A and B are systems that need to talk to each other. C is a master that also talks to A and B individually. The communication between A and B fails. C can tell that A and B are both healthy. C tells A to send something to B and waits for this to occur. C has no way of knowing that A cannot talk to B, and thus waits and waits and waits. What diagnostics can you add in your code to deal with this situation?

2. This is the Byzantine Generals problem: Two generals are on hills on either side of a valley. They each have an army of 1000 soldiers. In the woods in the valley is an enemy army of 1500 men. If each general attacks alone, his army will lose. If they attack together, they will win. They wish to send messengers through the valley to coordinate when to attack. However, the messengers may get lost or caught in the woods (or brainwashed into delivering different messages). How can they devise a scheme by which they either attack with high probability, or not at all?
11.8 UNIT END QUESTIONS
A. Descriptive Questions
Short Questions
1. Expand UTC.
2. Differentiate between external and internal clock synchronization.
3. What are the categories of clock synchronization?
4. Define race condition.
5. What is causality?
Long Questions
1. Explain synchronization and its categories.
2. Describe the logical clock.
3. How can we get the nodes of a distributed system to cooperate?
4. Describe sets and relations.
5. Explain the requirements of an election algorithm.
B. Multiple Choice Questions
1. In distributed systems, a logical clock is associated with ______________
a. Each instruction
b. Each process
c. Each memory
d. Each register
2. If the timestamps of two events are the same, then the events are ____________
a. Monotonic
b. Non-monotonic
c. Concurrent
d. Non-concurrent
3. For proper synchronization in distributed systems ____________
a. prevention of both deadlock and starvation is a must
b. prevention of deadlock is a must
c. prevention of starvation is a must
d. none of the mentioned
4. In distributed systems, election algorithms assume that ____________
a. there is no priority number associated with any process
b. priority of the processes is not required
c. a unique priority number is associated with each active process in the system
d. none of the above
5. Which of the following is an election algorithm?
a. Cristian's Algorithm
b. Lamport's Algorithm
c. Berkeley's Algorithm
d. Bully Algorithm
Answers
1-b, 2-c, 3-a, 4-c, 5-d

11.9 REFERENCES
Reference Books
George Coulouris, Jean Dollimore, Tim Kindberg, "Distributed Systems: Concepts and Design" (4th Edition), Addison Wesley/Pearson Education.
Pradeep K. Sinha, "Distributed Operating Systems: Concepts and Design", IEEE Computer Society Press.
Textbook References
M.R. Bhujade, "Parallel Computing", 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, "Distributed Systems: Principles and Paradigms", 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
Websites:
https://www.techopedia.com/definition/10062/wireless-communications
https://www.computernetworkingnotes.com/
https://www.guru99.com
UNIT 12 - MUTUAL EXCLUSION

STRUCTURE
12.0 Learning Objectives
12.1 Introduction
12.2 Distributed Mutual Exclusion
12.3 Requirements of Mutual Exclusion Algorithms
12.4 Classification of Mutual Exclusion Algorithms
12.5 Summary
12.6 Keywords
12.7 Learning Activity
12.8 Unit End Questions
12.9 References

12.0 LEARNING OBJECTIVES
After studying this unit, you will be able to:
Explain the importance of distributed mutual exclusion
Describe mutual exclusion algorithms
Outline the need for MUTEX algorithms
Compare the various types of algorithms

12.1 INTRODUCTION
Mutual exclusion is a concurrency control property used to avoid race conditions. It is the requirement that a process cannot enter its critical section while another concurrent process is executing in the critical section, i.e., only one process can execute its critical section at any given time.

Mutual exclusion in a single computer system vs. in distributed systems: In a single computer system, memory and other resources are shared among processes. Because the status of shared resources and of the users is readily available in shared memory, the mutual exclusion problem is easily solved using a shared variable (for example, a semaphore).
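As an illustration of the shared-variable solution on a single machine, here is a minimal Python sketch that uses a lock (a binary semaphore) to protect a shared counter; the variable names are illustrative.

import threading

# Sketch: mutual exclusion on a single machine with a shared lock. Without the
# lock, the two threads may race on `counter`, because counter += 1 is not atomic.

counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:                # entry section: only one thread gets in
            counter += 1          # critical section: update the shared variable
                                  # exit section: lock released on leaving `with`

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                    # always 200000 when the lock is held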
In distributed systems we have neither shared memory nor a common physical clock, so we cannot solve the mutual exclusion problem with shared variables. To solve the mutual exclusion problem in distributed systems, a message-passing mechanism is adopted. Because of the lack of shared memory and of a common physical clock, a site in a distributed system does not have complete information about the state of the system.

Requirements of mutual exclusion algorithms:
No deadlock: Two or more sites should not endlessly wait for a message that will never arrive.
No starvation: Every site that wants to execute its critical section should get the opportunity to do so in a finite amount of time. No site should wait indefinitely to execute its critical section while other sites execute their critical sections repeatedly.
Fairness: Each site should get a fair chance to execute its critical section. Requests to execute the critical section must be executed in the order in which they were made, i.e., in the order in which they arrived in the system.
Fault tolerance: In the event of a failure, the algorithm should be able to recognise it by itself and continue to work without any disruption.

Solutions to distributed mutual exclusion: In distributed systems, shared variables or a local kernel cannot be used to implement mutual exclusion; mutual exclusion has to be implemented using message passing. The three approaches to implementing mutual exclusion in distributed systems based on message passing are listed below:

Token-based approach: A unique token is shared among the sites. A site is allowed to enter its critical section if it possesses the token. This approach orders critical section requests using sequence numbers: every request for the critical section carries a sequence number, which is used to distinguish old requests from current ones. Because the token is unique, this approach ensures mutual exclusion. The Suzuki–Kasami Broadcast Algorithm is an example.

Non-token-based approach: A site communicates with a set of other sites to decide which site should execute the critical section next. This requires the exchange of two or more successive rounds of messages between sites. This approach orders requests for the critical section using timestamps rather than sequence numbers.
A timestamp is generated whenever a site requests the critical section. The timestamp is also used to resolve conflicts between competing critical section requests. Every non-token-based algorithm maintains a logical clock, and the logical clocks are updated according to Lamport's scheme. Lamport's algorithm and the Ricart–Agrawala algorithm are two examples.

Quorum-based approach: Instead of requesting permission to execute the critical section from all other sites, each site requests permission only from a quorum (a subset) of sites. Any two quorums contain at least one common site, and this common site is responsible for ensuring mutual exclusion. Maekawa's Algorithm is an example.

The simplest way to achieve mutual exclusion on uniprocessor systems is to disable interrupts during a process's critical section. This prevents any interrupt service routine from running (effectively preventing the process from being preempted). Although this technique works, it causes several problems. Because the timer interrupt is no longer serviced while a critical section executes, the system clock drifts during every critical section, making time tracking unreliable. Furthermore, if a process halts during its critical section, control is never returned to another process, effectively stopping the entire system.

The busy-wait method is a more elegant way to achieve mutual exclusion, and it works on both uniprocessor and multiprocessor systems. Mutual exclusion is achieved by using shared memory and an atomic test-and-set instruction. A process may test-and-set on a shared memory location, and because the operation is atomic, only one process at a time can set the flag. Any process that fails to set the flag can either move on to other tasks and try again later, yield the processor to another process and try again later, or loop while monitoring the flag until it succeeds. Because preemption is still possible, this strategy allows the system to keep working even if a process halts while holding the lock.

The compare-and-swap (CAS) atomic operation is one of several that can be used to achieve mutual exclusion on data structures. CAS can be used to provide wait-free mutual exclusion for any shared data structure by building a linked list in which each node represents a desired operation to be performed. CAS is then used to change the pointers in the linked list while inserting a new node. Only one process can succeed in its CAS at a time; any other process attempting to add a node at the same time must try again. Each process can then keep a local copy of the data structure and, by traversing the linked list, perform each operation from the list on its local copy.
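The busy-wait idea described above can be sketched as follows. Python exposes no user-level test-and-set instruction, so the sketch simulates one; on real hardware the body of test_and_set would be a single atomic instruction on a shared memory word. All names here are illustrative.

import threading

# Sketch of a spinlock built on an (emulated) atomic test-and-set.

class TestAndSetLock:
    def __init__(self):
        self._flag = False
        self._atomic = threading.Lock()   # stands in for instruction-level atomicity

    def test_and_set(self):
        with self._atomic:
            old = self._flag
            self._flag = True
            return old                     # atomically returns the previous value

    def acquire(self):
        while self.test_and_set():         # busy-wait until the flag was clear
            pass

    def release(self):
        self._flag = False

lock = TestAndSetLock()
lock.acquire()      # the first caller sees flag == False and enters
# ... critical section ...
lock.release()

Spinning burns CPU while waiting, which is why the text also mentions the alternatives of doing other work or yielding the processor and retrying later.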
12.2 DISTRIBUTED MUTUAL EXCLUSION
A distributed system is a group of individual computers that appears to its users as a single coherent system. Distributed computing is a way of improving performance and efficiency by distributing software components over numerous computers. All of the computers are connected by a network, either a Local Area Network (LAN) or a Wide Area Network (WAN), and communicate with one another, allowing different parts of a distributed programme to run on separate machines in different locations. A distributed programme is a computer programme that runs in a distributed system. In distributed computing, a problem is divided into numerous tasks, each of which is solved by one or more computers that communicate with one another via message passing.

A typical distributed system contains a large number of interacting devices, each of which runs its own software but is influenced by receiving messages, observing shared-memory updates, or observing the states of other devices. Distributed systems range from simple systems in which a single client communicates with a single server, up to vast amorphous networks such as the Internet as a whole.

Distributed computing architecture is characterised by different architectures at the hardware and software levels. At a lower level, some type of network is required to connect the CPUs, whether the system is printed on a circuit board or composed of loosely coupled devices and cables. At a higher level, some type of communication mechanism is required to connect the processes executing on those CPUs.

The most common architectures for distributed programming are client-server, three-tier, n-tier, and peer-to-peer; architectures are also classified as loosely coupled or tightly coupled.

Client-server architectures are those in which smart clients contact the server for data, then format and display the data to the users. When a client's input represents a permanent change, it is committed back to the server.
Three-tier designs move the client intelligence to a middle tier, allowing stateless clients to be used. This makes application deployment easier. Most web applications are three-tier.
N-tier architectures typically refer to web applications which forward their requests on to other enterprise services. This type of application is most responsible for the success of application servers.
Peer-to-peer architectures are those in which there are no dedicated machines that deliver services or manage network resources. Instead, all responsibilities are evenly distributed among all machines, known as peers. Peers can act as both clients and servers. BitTorrent and the bitcoin network are two examples of this architecture.
Another important aspect of distributed computing architecture is the method of communicating and coordinating work among concurrent processes. Processes can communicate with one another directly via various message-passing protocols, usually in a master/slave relationship. Alternatively, by employing a shared database, a "database-centric" design can enable distributed computing without any form of direct inter-process communication. Database-centric design, in particular, provides relational processing analytics within a schematic architecture, allowing for live-environment relay. This allows distributed computing operations to be performed both within and outside the boundaries of a networked database.

Benefits of Distributed Computing
High fault tolerance and reliability: A crash on one server has no effect on the other servers.
Scalability: In distributed computing systems, new machines can be added as needed.
Flexibility: It is simple to deploy, implement, and debug new services.
Faster computation speed: A distributed computer system can combine the computing capacity of several computers, allowing it to perform computations faster than traditional systems.
Openness: Because it is an open system, it can be accessed both locally and remotely.
High efficiency: It can deliver better performance and a better price/performance ratio than centralised computer clusters.

Drawbacks of Distributed Computing
Difficult troubleshooting: Because the servers are distributed, troubleshooting and diagnostics are more difficult.
Less software support: The lack of software support is one of the biggest disadvantages of distributed computer systems.
High network infrastructure costs: There can be issues with basic network setup, such as transmission problems, high load, and data loss.
Security and sharing risks: The openness of the system makes data security and sharing a risk in distributed computer networks.

Distributed Computing Examples
The following are some examples of distributed systems and distributed computing applications:
1. Telecommunication networks: telephone networks and cellular networks,
computer networks such as the Internet, wireless sensor networks, and their routing algorithms.
2. Network applications: the World Wide Web and peer-to-peer networks, massively multiplayer online games and virtual reality communities, distributed databases and distributed database management systems, network file systems, distributed caches such as burst buffers, and distributed information processing systems such as banking and airline reservation systems.
3. Real-time process control: aircraft control systems and industrial control systems.
4. Parallel computation: cluster computing, grid computing, cloud computing, various volunteer computing projects, and distributed rendering in computer graphics.

12.3 REQUIREMENTS OF MUTUAL EXCLUSION ALGORITHMS
The following properties should be guaranteed by a mutual exclusion (ME) algorithm:

1. Safety property: It ensures that only one process may execute the CS at any one time. This is an essential requirement of a MUTEX algorithm.

2. Liveness property: This property states the absence of deadlock and starvation. Two or more sites should not keep waiting for messages that will never arrive. Furthermore, a site must not wait indefinitely to execute the CS while other sites repeatedly execute the CS. That is, every requesting site should get a chance to execute the CS within a finite time.

3. Fairness: In the context of MUTEX, fairness means that each process gets a fair chance to execute the CS. The fairness property typically means that CS execution requests are executed in the order in which they arrive in the system (where the time is determined by a logical clock).

In other words:
No deadlock: Two or more sites should not endlessly wait for a message that will never arrive.
No starvation: Every site that wants to execute its critical section should get the opportunity to do so in a finite amount of time. No site should wait indefinitely to execute its critical section while other sites execute their critical sections repeatedly.
Fairness: Each site should get a fair chance to execute its critical section. Requests to execute the critical section must be executed in the order in which they were made, i.e., in the order in which they arrived in the system.
Fault Tolerance: In the event of a failure, the algorithm should be able to recognise it by itself and continue to work without any disruption.

12.4 CLASSIFICATION OF MUTUAL EXCLUSION ALGORITHMS
Centralized Algorithm
As the name implies, one coordinator handles all requests for access to the shared resource or data. Every process asks the coordinator for permission, and a process is allowed to enter the CS only if the coordinator grants that permission. The coordinator keeps track of pending requests in a queue.
a) Process 1 requests permission from the coordinator to access the critical region. Because the queue is empty and there are no pending requests, the coordinator grants permission.
b) Process 2 then requests permission to access the same critical region. Because P1 has not yet exited the CS, the coordinator does not respond.
c) When Process 1 leaves the critical region, it informs the coordinator, who then grants permission to Process 2.

Figure 12.1 Centralized Algorithm

Advantages
Fair algorithm: permission is granted in FIFO order.
The scheme is simple to implement and can be used for general resource allocation.

Shortcomings
The coordinator is a single point of failure, so there is no fault tolerance.
A waiting process cannot distinguish "no reply" (a dead coordinator) from "permission denied" (it simply has to keep waiting).
In a large system, the single coordinator becomes a performance bottleneck.
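A minimal sketch of the centralised scheme just described follows, with message passing reduced to method calls; the class and method names are illustrative and not part of any standard API.

from collections import deque

# Sketch: one coordinator grants the critical section (CS) in FIFO order.

class Coordinator:
    def __init__(self):
        self.holder = None          # process currently inside the CS
        self.queue = deque()        # pending requests, FIFO

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANTED"        # reply sent immediately
        self.queue.append(pid)
        return None                 # no reply yet; the requester keeps waiting

    def release(self, pid):
        assert self.holder == pid
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder          # next process to notify with GRANTED, if any

c = Coordinator()
print(c.request(1))   # GRANTED: the CS was free
print(c.request(2))   # None: P2 is queued behind P1
print(c.release(1))   # 2: the coordinator now grants the CS to P2

The sketch also makes the main shortcoming visible: if the Coordinator object were lost, every queued process would wait forever without knowing why.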
Distributed Algorithm
There is no coordinator in this scheme. Every process requests permission to enter the CS from the other processes. These algorithms fall into two classes:
1. Contention-based (non-token) algorithms
2. Controlled (token-based) algorithms

Contention-based (non-token) algorithms
In this type of algorithm, sites communicate with a group of other sites to decide which of them should execute the CS next. These algorithms are further split into two groups:
1. Timestamp-based
2. Voting-based

Two fundamental timestamp-based contention algorithms are Lamport's algorithm and the Ricart–Agrawala algorithm. Lamport's algorithm is described below.

Lamport's Algorithm
Lamport was the first to give a distributed mutual exclusion algorithm, based on his logical clock concept. It is a non-token-based scheme: in non-token-based protocols, requests for the CS are ordered using timestamps.

A request message includes the following:
Name of the resource
Timestamp
Identifier (machine id, process id)

When a site issues a CS request, it is assigned a timestamp, which is a unique integer; the timestamp is incremented with every new request. Requests with smaller timestamps take precedence over requests with larger timestamps.

In Lamport's scheme, the request set of a site Pi is Ri = {P1, P2, P3, ..., Pn}; it contains all of the sites from which Pi must obtain permission before entering the CS. Each process keeps a queue of pending CS requests, ordered by ascending timestamp. The channels in this method are assumed to be FIFO.

Algorithm
Requesting the critical section: When a process wants to enter the CS, it goes through the following steps:
1. It places its own request in its request queue (ordered by timestamps).
2. It sends a request message to all other nodes.
3. It waits for replies from all other nodes.
4. When another site receives the request message, it sends a timestamped reply message to the requesting site and places the request in its own request queue.

Executing the critical section: A site Pi may enter the CS when the following two conditions hold:
[L1]: Pi has received a message with a timestamp larger than (tsi, i) from all other sites.
[L2]: Pi's request is at the front of its request queue.

Releasing the critical section:
1. When a site exits the CS, it removes its request from the queue and sends a release message to all processes.
2. When the other sites receive the release message, they remove the corresponding entry from their own request queues.
3. A site then enters the CS if its own request is now at the top of its queue and replies have been received from all other sites.

Performance
Message complexity: (N-1) messages are required for the request, (N-1) for the replies, and (N-1) for the release, so a total of 3(N-1) messages per CS invocation under both heavy and light load.
Synchronization delay: the time it takes for one message to travel from one process to another, i.e., T.

Controlled (token-based) algorithms
In token-based algorithms, a unique token is shared among the sites. If a site holds the token, it is authorised to enter its CS. To distinguish old requests from current ones, token-based algorithms use sequence numbers rather than timestamps. In general they do not rely on FIFO message delivery, and their proofs of correctness are relatively simple. The main issue is how to locate and obtain the token, and this is what distinguishes the different algorithms. Based on the structure in which the processes are connected, these algorithms are classified into three groups:

Ring structure: All processes are connected in the shape of a ring, with each process assigned a position. The ring positions may be assigned in the numerical order of network addresses or by some other means; what matters is not how the order is defined but that each process knows who comes after it in the ring.
Advantages: simple, no deadlock, and fair.
Disadvantages: the token circulates even when no request is made (unnecessary traffic), and the path is long (O(N)), so there may be a long wait for the token.

Broadcast structure (Suzuki–Kasami Algorithm): If a site wants to enter the CS but does not hold the token, it broadcasts a REQUEST message for the token to all other sites. When the site holding the token receives the REQUEST message, it sends the token to the requesting site. If a site receives a REQUEST message while it is executing the CS, it does not send the token until it has finished the CS.

The token consists of the following items:
– Q: the queue of requesting sites, of length at most N.
– LN[1..N]: an array of integers, where LN[j] is the sequence number of the most recent request of Pj that has been satisfied.

Data structures:
• REQUEST(j, n): a REQUEST message denoting Pj's nth request to enter the CS.
• RNi[1..N]: at site Pi, RNi[j] is the largest sequence number received so far in a REQUEST message from site Pj.
• When Pi receives REQUEST(j, n), it sets RNi[j] := max(RNi[j], n).
• The message is outdated if RNi[j] > n.

The following two design issues must be addressed efficiently by this algorithm:
(1) How to distinguish an outdated REQUEST message from a current REQUEST message: Because of variable message delays, a site may receive a token request message after the corresponding request has already been satisfied. If a site cannot determine whether the request corresponding to a token request has been satisfied, the token may be sent to a site that does not need it. This does not violate correctness, but it can degrade performance.
(2) How to determine which sites have outstanding CS requests: After a site has finished executing the CS, it must determine which sites have outstanding CS requests so that the token can be sent to one of them.

The first issue is addressed as follows: A REQUEST message from site Pj has the form REQUEST(j, n), where n (n = 1, 2, ...) is a sequence number indicating that site Pj is requesting its nth CS execution. A site Pi maintains an array of numbers RNi[1..N], where RNi[j] is the largest sequence number received so far in a REQUEST message from site Pj. When Pi receives a REQUEST(j, n) message, it sets RNi[j] := max(RNi[j], n). If site Pi receives a REQUEST(j, n) message with RNi[j] > n, the request is outdated.

The second issue is addressed as follows: The token consists of a queue of requesting sites, Q, and an array of numbers LN[1..N], where LN[j] is the sequence number of the request of site Pj that was most recently executed. After executing its CS, a site Pi sets LN[i] := RNi[i] to indicate that its request with sequence number RNi[i] has been executed. At site Pi, if RNi[j] = LN[j] + 1, then site Pj is currently requesting the token.
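The bookkeeping just described can be sketched in Python as follows; message transport, the actual CS execution, and several corner cases are omitted, and all names are illustrative.

# Sketch of the Suzuki-Kasami bookkeeping: RN at each site, and LN plus the
# queue Q carried inside the token.

N = 4                                   # number of sites, illustrative

class Site:
    def __init__(self, i):
        self.i = i
        self.RN = [0] * N               # highest request number seen per site

    def make_request(self):
        self.RN[self.i] += 1
        return (self.i, self.RN[self.i])        # REQUEST(j, n) to broadcast

    def on_request(self, j, n):
        self.RN[j] = max(self.RN[j], n)         # remember the newest request number
        return self.RN[j] > n                   # True means this REQUEST is outdated

class Token:
    def __init__(self):
        self.LN = [0] * N               # last satisfied request number per site
        self.Q = []                     # sites waiting for the token

    def finish_cs(self, holder):
        """Called by the token holder when it leaves the critical section."""
        self.LN[holder.i] = holder.RN[holder.i]
        for k in range(N):              # site k has an outstanding request if
            if holder.RN[k] == self.LN[k] + 1 and k not in self.Q:
                self.Q.append(k)        # RN[k] = LN[k] + 1
        return self.Q.pop(0) if self.Q else None   # next token holder, if any

sites = [Site(i) for i in range(N)]
token = Token()
req = sites[2].make_request()           # site 2 asks for the CS: REQUEST(2, 1)
for s in sites:
    s.on_request(*req)                  # every site updates its RN[2]
print(token.finish_cs(sites[0]))        # site 0 leaves the CS; token goes to site 2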
12.5 SUMMARY
Mutual exclusion is a property of concurrency control in computer science, enforced in order to prevent race conditions. It is the requirement that one thread of execution never enters a critical section while another thread of execution is already accessing a critical section. A critical section refers to a period of time during which a thread of execution accesses a shared resource, such as a shared data object, shared resource, or shared memory.
The critical section concerns a data object which two or more concurrent threads are attempting to modify (two concurrent read operations are permitted, but no two concurrent write operations, and no concurrent read and write, are permitted, since these lead to data inconsistency). If a process is already performing write operations on a data object in its critical section, the mutual exclusion algorithm ensures that no other process or thread can access or modify the object until the first process has finished writing to it and has released the object for other processes to read and write.
Mutual exclusion solves a resource sharing problem: how can a software system control access by multiple processes to a shared resource, when each process needs exclusive control of that resource while doing its work? The mutual-exclusion solution makes the shared resource available only while the process is in the critical section of its code. It controls access to the shared resource by controlling each process's execution of the part of its programme where the resource would be used.
At the very least, a good solution to this problem must have the following two properties: It must enforce mutual exclusion, meaning that only one process at a time can be in the critical section. It must be free of deadlock: if several processes are trying to enter the critical section, one of them must eventually succeed, provided that no process stays in the critical section forever.
Most mutual exclusion algorithms are built on the assumption that no failure occurs while a process is executing inside the critical section. In practice, however, such failures can be common. A sudden loss of power or a faulty interconnect, for example, could leave a process in the critical section in an unrecoverable state or unable to continue. Conventional, non-fault-tolerant mutual exclusion algorithms may deadlock or otherwise fail key liveness properties if such a failure happens. Several solutions based on crash-recovery mechanisms have been proposed to address this issue.

12.6 KEYWORDS
Reliability - A protocol is said to be reliable if it delivers messages without loss or corruption across a transmission medium.
Security - The set of measures employed by a distributed system to ensure the protection of shared data and objects against attack.
Synchronous communication - A type of communication in which both send and receive are blocking operations.
UDP (User Datagram Protocol) - A transport-level protocol for transmitting messages between processes which does not guarantee delivery.
Collision - An event that occurs on a CSMA/CD network when two stations attempt to transmit simultaneously. The signals interfere with each other, forcing the two stations to back off and try again.

12.7 LEARNING ACTIVITY
1. What are some of the error conditions we need to guard against in a distributed environment that we do not need to worry about in a local programming environment?
2. Have you ever encountered a Heisenbug? How did you isolate and fix it?

12.8 UNIT END QUESTIONS
A. Descriptive Questions
Short Questions
1. Define mutual exclusion.
2. Define token-based algorithm.
3. What are the requirements of mutual exclusion algorithms?
4. What is fault tolerance?
5. What is a timestamp?
Long Questions
1. Explain the solutions to distributed mutual exclusion.
2. Describe mutual exclusion on uniprocessor and multiprocessor systems.
3. Explain distributed mutual exclusion.
4. Describe the benefits of distributed computing.
5. Explain the requirements of mutual exclusion algorithms.
B. Multiple Choice Questions
1. Which of the following facilities or capabilities are required to provide support for mutual exclusion?
i) A process that halts in its noncritical section must do so without interfering with other processes.
ii) The assumption should be made about relative process speeds or the number of processors.
iii) A process remains inside its critical section for a finite time only.
a. i and iii only
b. ii and iii only
c. i and ii only
d. None of the above
2. ……………………. techniques can be used to resolve conflicts, such as competition for resources, and to synchronize processes so that they can co-operate.
a. Synchronization
b. Mutual exclusion
c. Deadlock
d. System
3. ………………… are used for signalling among processes and can be readily used to enforce a mutual exclusion discipline.
a. Messages
b. Monitors
c. Office
d. Semaphores
4. A primitive yet powerful and flexible tool for enforcing mutual exclusion and for coordinating processes is called …………
a. Monitor
b. Semaphores
c. Messaging
d. Token ring