Chapter 21 • Memory and attention 469

Fourthly, memory can also be seen as a constructive process. Bransford et al. (1972) were able to show that we construct and integrate information from, for example, individual sentences. In an experiment they presented a group of people with a series of thematically related sentences and then presented them with a second set of sentences, asking ‘Have you seen this sentence before?’. They found that most people estimated that they had seen approximately 80 per cent of these sentences before. In fact all of the sentences were new. Bransford et al. concluded that people are happy to say that they recognized sentences they have not seen provided they are consistent with the theme of the other sentences.

Finally, many researchers would now argue that memory cannot be meaningfully studied in isolation, as it necessarily underpins all other aspects of cognition (thinking). For example, object recognition relies on memory; the production and understanding of language relies on some form of internal lexicon (or dictionary); finding our way about town relies on an internal representation of the environment, sometimes described as a cognitive map (Tversky, 2003); the acquisition of skills often begins with internalizing and remembering instructions.

Memory is related to attention, and these two are related to making mistakes, having accidents or doing things unintentionally. Memory, attention and error are also related to emotion. In this chapter we discuss the first three of these, devoting the next chapter to looking at emotion, or ‘affect’.

21.2 Memory

Memory is usually divided into a set of memory processes and a number of different types of memory store. Table 21.1 is a summary of the main memory stores and their sub-components and associated processes. Figure 21.1 is an illustration of this multi-store model of memory (note the role of attention).
Memory stores: working memory

As we have already noted, working memory, first identified and named by Baddeley and Hitch (1974), is made up from three linked components, namely a central executive, a visuo-spatial sketchpad and an articulatory loop (also called the phonological loop). The central executive is involved in decision making, planning and related activities. It is also closely linked to managing our ability to perform more than one thing at a time (see the section below which discusses the role of attention).

The articulatory or phonological loop can be thought of as behaving like a loop of audio tape. When we are trying to dial an unfamiliar telephone number or repeating a phrase in a foreign language, we tend to repeat the string of numbers (or words) either out loud or silently to ourselves. This process is called rehearsal. When we are doing this we are making use of the articulatory loop, which can also account for our experience of the inner voice. The analogy of the audio tape is useful as it allows us to see that the articulatory loop is limited in both capacity and duration.

The visuo-spatial sketchpad (also called the scratchpad) is the visual and spatial information equivalent of the articulatory loop and has been linked to our mind’s eye. We use our mind’s eye to visualize a route through a town or building or for the mental rotation of figures (visualize a coin and then rotate it to see what is on the other side). The visuo-spatial sketchpad is also limited in capacity and duration unless refreshed by means of rehearsal.

470 PART IV • Foundations of designing interactive systems

Table 21.1 A summary of the structure of memory

Sensory stores
  Main components: The iconic store (visual) and the echoic store (auditory) are temporary stores where information is held before it enters working memory.
  Key processes: The contents of these stores are transferred to working memory within a fraction of a second.

Working memory (WM)
  Main components: Working memory is made up from three key elements: the central executive, the articulatory loop and the visuo-spatial sketchpad. The central executive is involved in decision making, the articulatory loop holds auditory information and the visuo-spatial sketchpad, as the name suggests, holds visual information.
  Key processes: Rehearsal is the process of refreshing the contents of WM, such as repeating aloud a phone number. The contents of WM are said to decay (are lost/forgotten) if they are not rehearsed. Another way of forgetting from WM is displacement, which is the process by which the current contents of WM are pushed out by new material.

Long-term memory (LTM)
  Main components: Long-term memory comprises the following:
  Semantic memory. This holds information related to meaning.
  Procedural memory. This stores our knowledge of how to do things such as typing or driving.
  Episodic and/or autobiographical memory. This may be one or two different forms of memory that are related to memories personal to an individual such as memories of birthdays, graduation or getting married.
  Permastore. This has been suggested by Bahrick (1984) as the name for the part of LTM which lasts for our lifetime. It stores the things you never forget.
  Key processes: Encoding is the process by which information is stored in memory. Retrieval is the means by which memories are recovered from long-term storage. Forgetting is the name of a number of different possible processes by which we fail to recover information.
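For readers who think in code, the stores and processes summarized in Table 21.1 can be caricatured as a toy program. This is only an illustrative sketch, not a cognitive model: the class, the method names and the four-item capacity figure are our own assumptions, chosen to echo the 'three or four items' estimate discussed in the text.

```python
from collections import deque

class WorkingMemory:
    """Toy model of a capacity-limited store with displacement.

    The capacity of 4 follows the 'three or four items' estimate cited
    in the text; the class and method names are illustrative only.
    """
    def __init__(self, capacity=4):
        self.items = deque(maxlen=capacity)  # oldest items are displaced

    def attend(self, item):
        # New material entering WM pushes out the oldest contents
        # once capacity is exceeded (displacement).
        self.items.append(item)

    def rehearse(self, long_term_memory):
        # Rehearsal refreshes WM contents; repeated rehearsal is one
        # route by which material is encoded into long-term memory.
        long_term_memory.update(self.items)

ltm = set()
wm = WorkingMemory()
for digit in "0131455":   # seven items exceed the four-item capacity...
    wm.attend(digit)
print(list(wm.items))     # → ['1', '4', '5', '5'] ...only the last four survive
wm.rehearse(ltm)
```

Running the loop on the seven digits 0131455 leaves only the last four in the store: the displacement process from Table 21.1 in miniature, with rehearsal then copying the surviving contents into long-term storage.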
Finally, the capacity of working memory itself is approximately three or four items (e.g. MacGregor, 1987; LeCompte, 1999), where an item may be a word or a phrase or an image. It should be noted that older textbooks and papers suggest that the limit of short-term memory is 7 ± 2 items, sometimes called the magical number 7: this is now known to be incorrect.

Distinguishing between short-term and working memory

In their multi-store model of memory, Atkinson and Shiffrin (1968) distinguish between short- and long-term memory (reflecting William James's primary and secondary memory division 70 years earlier). While the term short-term memory (STM) is still widely used, we have chosen to employ the term working memory (WM) instead. STM is usually characterized as a limited, temporary store for information before it is transferred to long-term memory, while WM is much more flexible and detailed in structure and function. Our use of WM instead of STM also better reflects our everyday experience.
Memory stores: long-term memory

Long-term memory has an effectively unlimited capacity, and memories stored there may last as long as an individual’s lifetime. The coding (the internal representation) of the information held by it is primarily semantic in nature, that is, it is stored in terms of its meaning, e.g. knowledge of facts and the meaning of words (contrast this with the binary encoding of information in a computer). However, research has indicated that other forms of encoding are present too: for example, memories of music or the bark of a dog are encoded as auditory information, and similarly haptic (touch) encoding allows us to remember the feeling of silk and the sting of a cut. Finally, olfactory (smell) and gustatory (taste) encoding allows us to recognize and distinguish between the smell and taste of fresh and rotten food.

In addition to semantic memory, long-term memory includes other kinds of memories such as episodic or autobiographical memory (memory of our personal history, for example our first kiss, graduation day, the death of a parent) and procedural memory (e.g. the knowledge of how to ride a bike, type, play the euphonium). This neat three-way division of long-term memory into component parts - semantic, episodic and procedural - has been questioned by Cohen and Squire (1980), who argued that the real distinction is between ‘knowing that’ (declarative memory) and ‘knowing how’ (procedural memory), but in practice there is little difference between these two accounts.

Challenge 21.1
Contrast listing the components of a bicycle (e.g. frame, wheels, etc.) with knowing how to ride a bicycle (e.g. sitting on the saddle and pedalling) and with your memory of the first time you rode a bicycle (e.g. How old were you? What sort of day was it? Who else was there?). Which is hardest to describe?

How do we remember?
In everyday English, to remember means both to retrieve information (‘I think her birthday is the 18th of June’) and to store information in memory (‘I’ll remember that’). To remove this ambiguity we will use the terms store and encode to mean place in memory, and retrieve and recall to mean bring back from memory.

If what we want to store is not too complex (that is, it does not exceed the capacity of working memory), we will typically rehearse it, that is, repeat the string of words either aloud or using our inner voice. This is useful for remembering unfamiliar names or strings of numbers or words such as a foreign phrase, for example ‘Dos cervezas, por favor’. This technique exploits the articulatory loop of working memory. Similar strategies are also used to remember, for a short time, the shape of an object or a set of directions.

The capacity of working memory can effectively be enhanced by chunking the material to be remembered first. Chunking is the process by which we can organize material into meaningful groups (chunks). For example, an apparently random string of numbers such as 00441314551234 may defeat most people unless it is chunked. This particular number may be seen to be a telephone number made up from the code for international calls (0044), the area code for Edinburgh (131) and the prefix for Edinburgh Napier University (455), leaving only 1234 to remember. Thus the string of numbers has been reduced to four chunks.

So how do we remember things for longer periods? One answer is elaboration, which has been developed as an alternative view of memory in itself. The levels of processing
(LoP) model proposed by Craik and Lockhart (1972) argues that rather than focusing on the structural, multi-store model of memory we should emphasize the memory processes involved. The LoP model recognizes that any given stimulus (piece of information) can be processed in a number of different ways (or levels) ranging from the trivial or shallow all the way through to a deep, semantic analysis. Superficial processing may involve the analysis of the stimulus’s surface features such as its colour or shape; a deeper level of analysis may follow which may test for such things as whether the stimulus (e.g. cow) rhymes with the word ‘hat’. The final and deepest level of analysis is the semantic, which considers the stimulus’s meaning - does the word refer to a mammal?

Finally, we are able to retrieve stored information by way of recall and/or recognition. Recall is the process whereby individuals actively search their memories to retrieve a particular piece of information. Recognition involves searching our memory and then deciding whether the piece of information matches what we have in our memory stores.

How and why do we forget?

There are numerous theories of forgetting. However, before we discuss their strengths and weaknesses, we begin with another key distinction, namely the difference between accessibility and availability. Accessibility refers to whether or not we are able to retrieve information that has been stored in memory, while the availability of a memory depends on whether or not it was stored in memory. The metaphor of a library is often used to illustrate this difference. Imagine you are trying to find a specific book in a library. There are three possible outcomes: (a) you find the book (the memory is retrieved); (b) the book is not in the library (the memory is not available); or (c) the book is in the library but has been misfiled (not accessible).
There is, of course, a fourth possibility, namely that someone else has borrowed the book, which is where the metaphor breaks down! As we described earlier, information is transferred from working memory to long-term memory to be stored permanently, which means that availability is the main issue for working memory while accessibility is the main (potential) problem for long-term memory.

Challenge 21.2
Demonstrating recency and the serial order effect. The serial position curve is an elegant demonstration of the presence of (a) a short/long-term divide in memory and (b) the primacy and recency effects in forgetting. This is easily demonstrated. First, create a list of, say, 20 to 30 words. Present them in turn (read them or present them on a screen - try using PowerPoint) to a friend, noting the order in which the words were presented. At the end of the list, ask them to recall as many of the words as they can. Again note the order of the words. Repeat this process with another 6-10 people. Plot how many words presented first (in position 1) were recalled, then how many in positions 2, 3, 4, etc., up to the end of the list.

Forgetting from working memory

The first and perhaps oldest theory is decay theory, which argues that memory simply fades with time, a point which is particularly relevant to working memory, which maintains memories for only 30 seconds or so without rehearsal. Another account is
displacement theory, which has also been developed to account for forgetting from working memory. As we have already seen, working memory is limited in capacity, so it follows that if we were to try to add another item or two to this memory, a corresponding number of items must be squeezed out.

Forgetting from long-term memory (LTM)

We turn now to more widely respected theories of forgetting from long-term memory. Again psychology cannot supply us with one, simple, widely agreed view of how we forget from LTM. Instead there are a number of competing theories with varying amounts of supporting evidence. Early theories (Hebb, 1949) suggested that we forget from disuse. For example, we become less proficient in a foreign language learned at school if we never use it. In the 1950s it was suggested that forgetting from LTM may simply be a matter of decay. Perhaps memory engrams (= memory traces) simply fade with time, but except in cases of explicit neurological damage such as Alzheimer’s disease no evidence has been found to support this.

A more widely regarded account of forgetting is interference theory, which suggests that forgetting is more strongly influenced by what we have done before or after learning than by the passage of time itself. Interference takes two forms: retroactive interference (RI) and proactive interference (PI). Retroactive interference, as the name suggests, works backwards. That is, newer learning interferes with earlier learning. Having been used to driving a manual-shift car, spending time on holiday driving an automatic may interfere with the way one drives after returning home.

In contrast to RI, proactive interference may be seen in action in, for example, moving from word processor v1 to v2. Version 2 may have added new features and reorganized the presentation of menus. Having learned version 1 interferes with learning version 2. Thus earlier learning interferes with new learning.
However, despite these and numerous other examples of PI and RI, there is surprisingly little evidence outside the laboratory to support this theory.

Retrieval failure theory proposes that memories cannot be retrieved because we have not employed the correct retrieval cue. Recalling the earlier library metaphor, it is as if we have ‘filed’ the memory in the wrong place. The model is similar to the tip-of-the-tongue phenomenon (Box 21.2). All in all, many of these theories probably account for some forgetting from LTM.

Box 21.2 The tip-of-the-tongue phenomenon
Researchers Brown and McNeill (1966) created a list of dictionary definitions of unfamiliar words and asked a group of people to provide words that matched them. Not surprisingly, not everyone was able to provide the missing word. However, of those people who could not, many were able to supply the word's first letter, or the number of syllables, or even words that sounded like the missing word itself. Examples of the definitions are:
• Favouritism, especially governmental patronage extended to relatives (nepotism)
• The common cavity into which the various ducts of the body open in certain fish, birds and mammals (cloaca).
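The availability/accessibility distinction, and retrieval failure theory in particular, maps neatly onto a key-value store. The sketch below is only an analogy, with cues and the `recall` function invented for illustration: a value that was never stored is unavailable, while a value filed under the wrong cue is available but not accessible.

```python
# Long-term memory as a cue -> memory mapping (a deliberately crude analogy).
ltm = {
    "Edinburgh area code": "0131",                          # correctly filed
    "that phrase from the bar": "Dos cervezas, por favor",  # misfiled under a vague cue
}

def recall(cue):
    """Retrieval succeeds only when we search with the cue the memory was filed under."""
    return ltm.get(cue)  # None models a retrieval failure

assert recall("Edinburgh area code") == "0131"    # found: the memory is retrieved
assert recall("capital of Peru") is None          # never stored: not available
assert recall("Spanish for 'two beers'") is None  # stored under the wrong cue: not accessible
assert "Dos cervezas, por favor" in ltm.values()  # ...and yet it is still available
```

The last two assertions are the interesting case: the memory is in the store (available), but the search cue does not match how it was filed, so recall fails. That is the situation retrieval failure theory, and the tip-of-the-tongue state, describe.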
21.3 Attention

Attention is a pivotally important human ability and is central to operating a machine, using a computer, driving to work or catching a train. Failures in attention are a frequently cited reason for accidents: car accidents have been attributed to the driver using their mobile phone while driving; aircraft have experienced ‘controlled flight into terrain’ (to use the official jargon) when the pilots have paid too much attention to the ‘wrong’ cockpit warning; and control room operators can be overwhelmed by the range and complexity of instruments to which they must attend. Clearly we need to be able to understand the mechanism of attention, its capabilities and limitations, and how to design to make the most of these abilities while minimizing its limitations.

Attention is an aspect of cognition that is particularly important in the design and operation of safety-critical interactive systems (ranging from the all too frequently quoted control room operator through to inspection tasks on mundane production lines). While there is no single agreed definition of attention, Solso (1995) defines it as ‘the concentration of mental effort on sensory or mental events’, which is typical of many definitions. The problem with definitions in many ways reflects how attention has been studied and what mental faculties researchers have included under the umbrella term of attention. However, the study of attention has been split between two basic forms, namely selective attention and divided attention. Selective (or focused) attention generally refers to whether or not we become aware of sensory information. Indeed, Cherry (1953) coined the term the cocktail party effect to illustrate this (Box 21.3).
Box 21.3 The cocktail party effect
Cherry (1953), presumably while at a cocktail party, had noticed that we are able to focus our attention on the person we are talking to while filtering out everyone else's conversation. This principle is at the heart of the search for extra-terrestrial intelligence (SETI), which is selectively listening for alien radio signals against the background of natural radio signals.

Studies of selective attention have employed a dichotic listening approach. Typically, participants in such experiments are requested to shadow (repeat aloud) one of the two voices they will hear through a set of headphones. One voice will be played through the right headphone while another is played through the left - hence dichotic.

In contrast to selective attention, divided attention recognizes that attention can be thought of in terms of mental resources (e.g. Kahneman, 1973; Pashler, 1998) that can in some sense be divided between tasks being performed simultaneously (commonly referred to as multi-tasking). For example, when watching television while holding a conversation, attention is being split between two tasks. Unless an individual is very well practised, the performance of two simultaneously executed tasks would be expected to be poorer than attending to just one at a time. Studies of divided attention might employ the same physical arrangements as above but ask the participant to attend to (listen to) both voices and, say, press a button when a keyword is heard spoken in either channel.
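The contrast between the two experimental paradigms just described, shadowing one channel versus monitoring both, can be sketched in a few lines of code. The word lists and function names below are invented; the point is only the difference between reporting a single attended channel and dividing attention across two.

```python
# Toy dichotic-listening harness. 'shadow' models selective attention
# (repeat one channel, ignore the other); 'monitor' models divided
# attention (watch both channels for a keyword). Names are invented.
left  = ["the", "cat", "sat", "on", "the", "mat"]
right = ["press", "the", "button", "now", "please", "stop"]

def shadow(attended):
    """Selective attention: only the attended channel is reported."""
    return list(attended)

def monitor(channel_a, channel_b, keyword):
    """Divided attention: report the positions at which the keyword
    occurs in either channel."""
    hits = []
    for i, (a, b) in enumerate(zip(channel_a, channel_b)):
        if keyword in (a, b):
            hits.append(i)
    return hits

print(shadow(left))                 # the unattended right channel is filtered out
print(monitor(left, right, "now"))  # → [3]: keyword detected in either ear
```

The harness mirrors the button-press task: `monitor` has to track both channels at once, whereas `shadow` never inspects the unattended one.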
The Stroop effect
Stroop (1935) showed that if a colour word such as 'green' is written in a conflicting colour such as red, people find it remarkably difficult to name the colour the word is written in. The reason is that reading is an automatic process which conflicts with the task of naming the colour of the 'ink' a word is written in. The Stroop effect has also been shown to apply to suitably organized numbers and words. Try saying aloud the colour of the text - not the word itself:

Column 1    Column 2
RED         RED
GREEN       GREEN
BLUE        BLUE
RED         RED
GREEN       GREEN
RED         RED

(In the printed original, each word in column 1 appears in an ink colour that conflicts with its meaning; the colours cannot be reproduced here.) You should find that saying the colour of each word in column 1 is slower and more prone to error owing to the meaning of the word itself. The word 'red' interferes with the colour (green) it is printed in and vice versa.

How attention works

To date, there have been a number of different accounts or models of attention. The earliest date from the 1950s and are characterized by likening attention to a bottleneck. Later theories have concentrated on an allocation model that treats attention as a resource that can be spread (or allocated) across a number of different tasks. Other views of attention have concentrated on the automatic/controlled processing divide and on sequential/parallel processing. As in most aspects of psychology, there is no single account of attention; instead there is a mosaic of complementary views.

‘Bottleneck’ theories of attention

We begin with Donald Broadbent’s single-channel theory of attention (Broadbent, 1958). He proposed that information arriving at the senses is stored in short-term memory before being filtered or selected as being of interest (or being discarded), which in practice means that we attend to one particular channel and ignore others. This information (this channel) is then processed by a limited-capacity processor. On being processed, instructions may be sent to motor effectors (the muscles) to generate a response.
The presence of short-term memory, acting as a temporary buffer, means that information which is not selected is not immediately discarded either. Figure 21.3 is an illustration of Broadbent’s model. Broadbent realized that we might be able to attend to this information stored in the short-term memory, but switching between two different channels of information would be inefficient. (It has been observed by a number of researchers that Broadbent’s thinking reflects the technology of his day, as in many ways this single-channel model of attention is similar to the conventional model of a computer’s central processing unit (CPU), which also has a single channel and is a serial processing device - the von Neumann architecture.) This original single-channel model (sometimes referred to as a bottleneck account of attention) was refined and developed by Broadbent’s co-workers and
others (Triesman, 1960; Deutsch and Deutsch, 1963; Norman, 1968) but remained broadly similar.

Figure 21.3 Broadbent's single-channel model of attention

Triesman argued for the attenuation of the unattended channel, which is like turning down the volume of a signal, rather than an on-off switch. In Triesman’s model, competing information is analysed for its physical properties, and for sound, syllable pattern, grammatical structure and meaning, before being attended. The later Deutsch and Deutsch (1963) and Deutsch-Norman (Norman, 1968) models completely rejected Broadbent’s early selection model, instead arguing for a later-selection filter/pertinence account. Selection (or filtering) only occurs after all of the sensory inputs have been analysed. The major criticism of this family of single-channel models is their lack of flexibility, particularly in the face of the competing allocation model discussed below. It has also been questioned whether any single, general-purpose, limited-capacity processor can ever account for the complexity of selective attention.

The reality of everyday divided attention presents even greater problems for such accounts. As we have just discussed, models of selective attention assume the existence of a limited-capacity filter capable of dealing with only one information channel at a time. However, this is at odds with both everyday experience and experimental evidence.
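Broadbent's model is often described as a pipeline, and it can be sketched as one. The sketch below is a loose paraphrase under our own naming, not Broadbent's formulation: every channel is buffered briefly, an early-selection filter passes exactly one, and a limited-capacity processor then handles it serially.

```python
def broadbent(channels, selected):
    """Toy single-channel pipeline: buffer -> early filter -> serial processor."""
    buffer = dict(channels)      # short-term store briefly holds every channel
    attended = buffer[selected]  # early selection: one channel passes the filter
    responses = []
    for item in attended:        # limited-capacity processor works serially
        responses.append(item.upper())  # stand-in for generating a motor response
    return responses

channels = {"left": ["flight", "level", "three"], "right": ["coffee", "anyone"]}
print(broadbent(channels, "left"))  # → ['FLIGHT', 'LEVEL', 'THREE']
```

The unattended channel sits in the buffer but never reaches the processor. That all-or-nothing behaviour is the inflexibility that the attenuation and late-selection refinements described above were reacting against.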
Attention as capacity allocation

Next, we briefly discuss an example of a group of models of attention which treat attention as a limited resource that is allocated to different processes. The best known is Kahneman’s capacity allocation model (Kahneman, 1973). Kahneman argued that we have a limited amount of processing power at our disposal and whether or not we are able to carry out a task depends on how much of this capacity is applied to the task. Of course, some tasks require relatively little processing power and others may require more - perhaps more than we have available.

This intuitively appealing account does allow us to explain how we can divide our attention across a number of tasks depending upon how demanding they are and how experienced we are in executing them. However, there are a number of other variables that affect the ways in which we allocate this attentional capacity, including our state of arousal and what Kahneman describes as enduring dispositions, momentary intentions and the evaluation of the attentional demands. Enduring dispositions are described as the rules for allocating capacity that are not under voluntary control (e.g. hearing your own name spoken), and momentary intentions are voluntary shifts in attention, such as responding to a particular signal. A further variable is how aroused we are. Arousal in this context may be thought of as how awake we are.

Figure 21.4 is a diagram of the capacity allocation model, in which we can see that the limited-capacity central processor has been replaced by an allocation policy component that governs which of the competing demands should receive attention. While Kahneman portrays attention as being more flexible and dynamic than the single-channel models, he is unable to describe how attention is channelled or focused. Similarly, he is unable to define the limits of what is meant by ‘capacity’.
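Kahneman's idea of attention as a divisible resource can also be caricatured in code. The greedy allocation policy and the numeric 'demands' below are purely illustrative assumptions; Kahneman's account contains no such numbers.

```python
# A toy version of treating attention as a limited resource allocated
# across tasks: each task receives capacity up to its demand, in the
# priority order set by the 'allocation policy', until the pool is empty.
def allocate(tasks, capacity):
    """tasks: list of (name, demand) pairs in priority order.
    Returns how much capacity each task actually receives."""
    allocation = {}
    for name, demand in tasks:
        granted = min(demand, capacity)  # give what the task asks for, if any remains
        allocation[name] = granted
        capacity -= granted
    return allocation

# For a novice, driving demands most of the available capacity,
# so the concurrent conversation is starved:
print(allocate([("driving", 7), ("conversation", 4)], capacity=10))
# With practice, driving becomes less demanding (more automatic),
# and both tasks now fit within capacity:
print(allocate([("driving", 3), ("conversation", 4)], capacity=10))
```

In the first call the conversation receives only the leftover capacity; in the second, both tasks are fully served. This mirrors the everyday observation, discussed later in the chapter, that experienced drivers cope with secondary tasks far better than novices.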
Figure 21.4 Kahneman's capacity allocation model

Automatic and controlled processing

In contrast to the foregoing models of attention, Schneider and Shiffrin (1977) observed that we are capable of both automatic and controlled information processing. We generally use automatic processing with tasks we find easy (and this, of course, is
dependent upon our expertise in this task) but use controlled processing on unfamiliar and difficult tasks.

Schneider and Shiffrin distinguish between controlled and automatic processing in terms of attention as follows. Controlled processing makes heavy demands on attention and is slow, limited in capacity and involves consciously directing attention towards a task. In contrast, automatic processing makes little or no demand on attention, is fast, unaffected by capacity limitations, unavoidable and difficult to modify, and is not subject to conscious awareness.

Schneider and Shiffrin found that if people are given practice at a task, they can perform it quickly and accurately, but their performance is resistant to change. An example of apparent automaticity in real life occurs when we learn to drive a car. At first, focused attention is required for each component of driving, and any distraction can disrupt performance. Once we have learnt to drive, and as we become more experienced, our ability to attend simultaneously to other things increases.

Moving on from this very brief treatment of models of attention, we now consider how a wide range of internal and external factors can affect our ability to attend.

Factors affecting attention

Of the factors that affect our ability to pay attention to a task, stress is the most important. Stress is the effect of external and psychological stimuli on us and directly affects our level of arousal. Arousal is different from attention in that it refers to a general increase or decrease in perceptual and motor activity. For example, sexual arousal is typified by heightened levels of hormonal secretions, dilation of the pupils, increased blood flow and a whole range of mating behaviours. Stressors (stimuli which cause stress) include such things as noise, light, vibration (e.g.
flying through turbulence) and more psychological factors such as anxiety, fatigue, anger, threat, lack of sleep and fear (e.g. think about the days before an examination).

As long ago as 1908, Yerkes and Dodson found a relationship between performance of tasks and level of arousal. Figure 21.5 is an illustration of this relationship - the so-called Yerkes-Dodson law.

Figure 21.5 The Yerkes-Dodson law

There are two things to note about this relationship. First, for both simple and complex tasks there is an optimal level of arousal. As our level of
arousal increases, our ability to execute a task increases until we reach a point when we are too aroused and our performance falls off sharply. Secondly, simple tasks are more resistant to increased levels of arousal than are complex tasks. (Arousal is also important to the study of emotion, described in Chapter 22.) The other aspect of this is the skill of the individual involved. A simple task to a highly skilled individual is likely to be seen as complex by a less skilled or able individual.

Vigilance

Vigilance is a term applied to the execution of a task wherein an individual is required to monitor an instrument or situation for a signal. Perhaps the classic example of a vigilance task is being on watch on board a ship. During the Second World War mariners were required to be vigilant in scanning the horizon for enemy ships, submarines, aircraft or ice floes. Wartime aside, vigilance is still an important element of many jobs - consider the role of the operator of a luggage X-ray machine at an airport, or a safety inspector checking for cracks or loose fittings on a railway track.

Attention drivers!
Wikman et al. (1998) have reported differences in the performance of inexperienced (novice) and experienced drivers when given a secondary task to perform while driving. The drivers were asked to do such things as changing a CD, operating the car radio or using a mobile (cell) phone. Unsurprisingly, the novice drivers were distracted more (allocated their attention less effectively) than the experienced drivers. Experienced drivers took their eyes off the road for less than three seconds, while novice drivers were found to weave across the road.

Further thoughts: In-car systems
The use of spoken messages in-car, particularly for satellite navigation (satnav) systems, is now becoming commonplace.
The challenge for the designers of these systems is (a) to attract the attention of the driver without distracting him or her, and (b) to avoid habituation - that is, the driver learning to ignore the nagging voice. The choice of voice is also critical. Honda have decided upon 'Midori' - the name given to the voice of an unnamed bilingual Japanese actress whose voice is 'smooth as liqueur'. In contrast, Italian Range Rovers are equipped with a voice which is argumentative in tone, and Jaguar (the English motor manufacturer) retains British colloquialisms to reinforce their brand image. These brand images aside, manufacturers have found that drivers tend to listen to female voices more than male voices. Other issues in in-car HCI concern the design of devices such as phones and satellite navigation systems that require complex operation and hence result in divided attention (Green, 2012).

Mental workload

Mental workload addresses issues such as how busy the user or operator is and how difficult the tasks assigned to him or her are - will he or she be able to deal with an additional workload? A classic example of this occurred in the 1970s when it was decided
480 PART IV • Foundations of designing interactive systems

to remove the third crew member from a flight team on board a medium to large passenger jet. The Federal Aviation Administration now requires measures of the mental workload on the crew prior to the certification of a new aircraft or new control system.

Turning now to design issues in respect of mental workload, the first observation is that a discussion of mental workload does not necessarily equate workload with overload. Indeed, the reverse is often true: just consider the potential consequences of operator/user boredom and fatigue (Wickens and Hollands, 2000, p. 470). There are a number of different ways in which workload can be estimated, one of which is the NASA TLX scale. This scale (Table 21.2) is a subjective rating procedure that provides an overall workload score based on a weighted average of ratings on six sub-scales.

Table 21.2 Measuring workload

Title | Endpoints | Description
Mental demand | Low/high | How much mental and perceptual activity was required (e.g. thinking, deciding, etc.)? Was the task easy or demanding, simple or complex?
Physical demand | Low/high | How much physical effort was required (e.g. pushing, pulling, etc.)? Was the task easy or demanding, slack or strenuous, restful or laborious?
Temporal demand | Low/high | How much time pressure did you feel due to the rate or pace at which the tasks or task elements occurred? Was the pace slow and leisurely or rapid and frantic?
Performance | Perfect/failure | How successful do you think you were in accomplishing the goals of the task set by the experimenter (or yourself)? How satisfied were you with your performance in accomplishing these goals?
Effort | Low/high | How hard did you have to work (mentally and physically) to accomplish your level of performance?
Frustration level | Low/high | How insecure, discouraged, irritated, stressed and annoyed as opposed to secure, gratified, content, relaxed and complacent did you feel during your task?
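The weighted-average calculation behind the overall TLX score is simple enough to sketch. In the standard procedure each sub-scale is rated from 0 to 100 and the weights come from 15 pairwise comparisons between sub-scales (so the weights sum to 15); the ratings and weights below are invented for illustration.

```python
# Illustrative sketch of the NASA TLX weighted-average score.
# Ratings (0-100) and weights (from 15 pairwise comparisons) are invented.

SUBSCALES = ["mental", "physical", "temporal", "performance",
             "effort", "frustration"]

def tlx_score(ratings, weights):
    """Overall workload: weighted average of six sub-scale ratings."""
    assert sum(weights.values()) == 15  # 15 pairwise comparisons in total
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15

ratings = {"mental": 70, "physical": 10, "temporal": 60,
           "performance": 30, "effort": 55, "frustration": 40}
weights = {"mental": 5, "physical": 0, "temporal": 4,
           "performance": 2, "effort": 3, "frustration": 1}

print(tlx_score(ratings, weights))  # → 57.0
```

Note how the weighting step lets an analyst discount sub-scales that are irrelevant to a particular task (here, physical demand carries no weight at all).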
Source: Wickens, Christopher D. and Hollands, Justin G., Engineering Psychology and Human Performance, 3rd edn, © 2000. Printed and electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey.

Visual search

Visual search has been researched extensively by psychologists and ergonomists and refers to our ability to locate particular items in a visual scene. Participants in a visual search study, for example, may be required to locate a single letter in a block of miscellaneous characters. Try to find the letter 'F' in the matrix in Figure 21.6.

This is a good example of how perception and attention overlap, and an understanding of the issues involved in visual search can help in avoiding interactive systems such as that shown in Figure 21.7.

Research has revealed that there is no consistent visual search pattern which can be predicted in advance. Visual search cannot be presumed to be left to right, or clockwise rather than anti-clockwise, except to say that searching tends to be directed towards where the target is expected to be. However, visual attention will be drawn towards features which are large, bright and changing (e.g. flashing, which may be used for warnings). These visual features can be used to direct attention, particularly if they
have a sudden onset (i.e. a light being switched on, or a car horn sounding). Megaw and Richardson (1979) found that physical organization can also have an effect on search patterns. Displays or dials organized in rows tended to be scanned from left to right (just as in reading Western languages, but raising the question of cultural bias - would the same be true for those cultures who read from right to left or from top to bottom?). Parasuraman (1986) has reported evidence of an edge effect wherein during supervisory tasks (that is, the routine scanning of dials and displays) operators tended to concentrate on the centre of the display panel and tended to ignore the periphery.

As Wickens and Hollands (2000) note, research into visual scanning behaviour has yielded two broad conclusions. First, visual scanning reveals much about the internal expectancies that drive selective attention. Secondly, these insights are probably most useful in the area of diagnostics. Clearly those instruments which are most frequently watched are likely to be the most important to an operator's task. This should guide design decisions to place the instruments in prominent locations or to locate them adjacent to one another.

E E E E E E E E E E
E E E E E E E E E E
E E E E E E E E E E
E E E E E E E E E E
E E E E E E E E E E
E E E E E E E E E E
E E E E F E E E E E
E E E E E E E E E E

Figure 21.6 A matrix of letters
Figure 21.7 A practical example of the challenge of visual search (Source: www.HavenWorks.com)
Just how long is it reasonable to wait?

It is generally accepted that delays of less than 0.1 second are taken to be effectively instantaneous, but delays of a second or two may be perceived by the user of an interactive system as an interruption in the free flow of his or her interaction. Delays of more than 10 seconds present problems for people. Minimizing delay is important in the design of websites, for which numerous, often contradictory, guidelines have been published. Here are two perfectly reasonable suggestions:

• The top of your page should be meaningful and fast.
• Simplify complex tables as they display more slowly.

Signal detection theory

It is late at night. You are asleep alone in your apartment. You are awoken by a noise. What do you do? For many people the first thing to do is to wait and see (as it were) whether they hear the noise again. Here we are in the domain of signal detection theory - was there really a signal (e.g. the sound of breaking glass by the local axe-murderer) and if so, are we to act on it - or was it just the wind or a cat in the dustbin?

Signal detection theory (SDT) is applicable in any situation in which there are two different, non-overlapping states (i.e. signal and noise) that cannot be easily discriminated - for example, did a signal appear on the radar screen, did it move, has it changed size or shape? In such situations we are concerned with signals which must be detected, and in the process one of two responses may be produced - e.g. 'I detected the presence of a signal, so I shall press the stop button', or 'I failed to see anything, so I shall continue to watch'. This may vary in importance from the trivial, e.g. recognizing that a job has been printed (the printer icon has disappeared from the application's status bar), through to the safety-critical, e.g. a train driver spotting (or not) a stop light.
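SDT performance is usually summarized by two figures: the hit rate (the proportion of genuine signals that were detected) and the false-alarm rate (the proportion of noise-only events that were wrongly responded to). The sketch below computes both, plus the standard sensitivity measure d' - the separation between the signal and noise distributions in standard-deviation units - which goes beyond this chapter but follows directly from the two rates. All the counts are invented for illustration.

```python
# Minimal sketch of signal detection theory summary statistics.
# The four outcome counts below are invented, not taken from the book.
from statistics import NormalDist

def sdt_rates(hits, misses, false_alarms, correct_rejections):
    """Hit rate and false-alarm rate from the four SDT outcome counts."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return hit_rate, fa_rate

def d_prime(hit_rate, fa_rate):
    """Sensitivity d': z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# A vigilant operator: 100 signal trials, 100 noise-only trials
hr, fa = sdt_rates(hits=80, misses=20, false_alarms=10, correct_rejections=90)
print(hr, fa)                     # → 0.8 0.1
print(round(d_prime(hr, fa), 2))  # → 2.12
```

A conservative observer (few false alarms but many misses) and a trigger-happy one (the reverse) can have the same d'; the two rates together, not either alone, characterize detection performance.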
The following compelling examples of the importance of SDT have been identified by Wickens and Hollands (2000): the detection of a concealed weapon by an airport security guard; the identification of a malignant tumour on an X-ray plate by a radiologist; and a system malfunction detected by a nuclear plant supervisor. Their list goes on to include identifying critical incidents in the context of air traffic control, proofreading, detecting lies from a polygraph (lie detector) and spotting hairline cracks in aircraft wings, amongst other things. SDT recognizes that an individual faced with such a situation can respond in one of four ways: in the presence of a signal, the operator may detect it (hit) or fail to detect it (miss); in the absence of a signal, the operator may correctly reject it (correct rejection) or incorrectly identify it (false alarm). This is illustrated in Table 21.3.

Table 21.3 SDT decision table

Response | State: Signal | State: Noise
Yes | Hit | False alarm
No | Miss | Correct rejection

The probability of each response is typically calculated for a given situation and these figures are often quoted for both people and machines. So, a navigational aid on board an aircraft (e.g. ground collision radar) might be quoted as producing false alarms (also
called false positives) at a rate of less than 0.001 - one in a thousand. Similar figures are quoted as targets for medical screening operators (e.g. no more than 1 in 10,000 real instances of, say, breast cancer should be missed, while 1 in 1000 false alarms are acceptable).

Transcript from Apollo XIII: barber-poles and the Moon

The Apollo flights to the Moon in the late 1960s and early 1970s are excellent examples of both user-centred design and brilliant and innovative ergonomic design. One of the innovations can be found in the design of the Apollo spacecraft, which used barber-poles to provide status information to the astronauts. A barber-pole is a striped bar signalling that a particular circuit or function is active (for example, the communication system - the talkback system), or, as can be seen in the transcript below, measures of liquid helium and the state of the electrical systems. In the transcript we see that Jim Lovell reports to Mission Control that main bus 'B is barber poled and D is barber poled, helium 2, D is barber pole':

55:55:20 - Swigert: 'Okay, Houston, we've had a problem here.'
55:55:35 - Lovell: 'Houston, we've had a problem. We've had a main B bus undervolt.'
55:57:40 - DC main bus B drops below 26.25 volts and continues to fall rapidly.
55:57:44 - Lovell: 'Okay. And we're looking at our service module RCS helium 1. We have - B is barber poled and D is barber poled, helium 2, D is barber pole, and secondary propellants, I have A and C barber pole.' AC bus fails within 2 seconds.

Interestingly, the use of a barber-pole can be found in modern operating systems. For example, the OS X system uses barber-poles (Figure 21.8).

Figure 21.8 Barber-pole, OS X

21.4 Human error

Human error is studied in a wide variety of ways.
[Chapter 2 discusses mental models.]

Some researchers conduct laboratory investigations while others investigate the causes of major accidents after the event. A typical example of a laboratory study is that by Hull et al. (1988), who asked 24 ordinary men and women to wire an electric plug. They found that only 5 succeeded in doing so safely, despite the fact that 23 of the 24 had wired a plug in the previous 12 months. In analysing the results of this study it was found that a number of different factors contributed to these failures, including:

• Failure to read the instructions
• Inability to formulate an appropriate mental model
• Failure of the plug designers to provide clear physical constraints on erroneous actions.

This last point was regarded as the most significant.

Unhappily, error is an inescapable fact of life. Analysis of the causes of major accidents has found that human error is primarily responsible in 60-90 per cent of all major accidents (Rouse and Rouse, 1983; Reason, 1997). This figure is consistent with the findings of commercial organizations: for example, Boeing, the aircraft manufacturer, estimates that 70 per cent of all 'commercial airplane hull-loss accidents' are attributable to human error.
Understanding action slips

Research conducted by Reason (1992) has given insight into everyday errors. In one study he asked 36 people to keep a diary of action slips (i.e. actions which have deviated from what they intended) for a period of four weeks. Analysis of the 433 reported slips revealed that storage failures (e.g. repeating an action which has already been completed) were the most frequently reported. Figure 21.9 summarizes the key findings of this study and Table 21.4 describes each type of action slip (the miscellaneous errors are too diverse to discuss).

Figure 21.9 Five categories of action slips, as a percentage of all reported slips (Source: After Reason, 1992, Fig. 15.24)

Table 21.4 Action slips

Storage failures - These were the most common and involved errors such as repeating an action which has already been completed, e.g. sending the same e-mail twice.
Test failures - These refer to forgetting what the goal of the action was, owing to failing to monitor the execution of a series of actions, e.g. starting to compose an e-mail and then forgetting to whom you are sending it.
Subroutine failures - These errors were due to omitting a step in the sequence of executing an action, e.g. sending an e-mail and forgetting to attach the attachment.
Discrimination failures - Failure to discriminate between two similar objects used in the execution of an action resulted in this category of error, e.g. intending to send an e-mail and starting Word instead by mistake.
Programme assembly failures - This was the smallest category, accounting for only 5 per cent of the total. They involved incorrectly combining actions, e.g. saving the e-mail and deleting the attachment instead of saving the attachment and deleting the e-mail.

Each of these slips (and there are other classifications of errors, e.g. Smith et al.
(2012)) presents challenges for the interactive systems designer. Some can be reduced or managed, others cannot.
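A diary study like Reason's ultimately reduces to tallying slips by category and reporting each category's share of the total. A minimal sketch (the diary entries below are invented, not Reason's data):

```python
# Tallying an action-slip diary into Reason's categories.
# The diary entries are invented for illustration.
from collections import Counter

diary = ["storage", "storage", "test", "subroutine", "storage",
         "discrimination", "test", "storage", "programme assembly",
         "storage"]

counts = Counter(diary)
for category, n in counts.most_common():
    print(f"{category}: {100 * n / len(diary):.0f}%")
# storage comes out most frequent, mirroring Reason's finding
```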
Reducing action slips

Designers should design to minimize the chance of slips. For example, 'wizards' prompt people for, and help them recall, the steps which need to be undertaken to complete a task, such as installing a printer. In the sequence shown in Figure 21.10 the system prompts the user for the information the operating system needs to install a printer. The advantage of this approach is that only relatively small amounts of information are required at any one time. It also has the advantage of an error correction system (i.e. use of the Back and Next steps).

Figure 21.10 Using a Microsoft wizard to prompt a user to supply information one step at a time

One of the most demanding tasks in the work of an academic is marking coursework and examination scripts and tabulating the results without making mistakes. Figure 21.11 is a snapshot of a spreadsheet designed by Professor Jon Kerridge of the School of Computing, Edinburgh Napier University, to help reduce errors in this process. It is an example of good practice in this kind of manual tabulation of data as it employs a number of semi-automated checks (with corresponding error messages): in the column labelled Checked is a note indicating that an error message will appear if 'either the mark inserted for a question is more than the maximum mark obtainable for that question or ...'. The author of the system has annotated the spreadsheet using comments and has used a series of if statements to check the inputted data. So, for example, marks should be entered for three questions only and an error is signalled if this number is exceeded.
[Spreadsheet annotation: 'Will indicate ERROR if the mark inserted for a question is more than the maximum mark obtainable for that question, or the number of marks inserted for a candidate is greater than the maximum number of questions a candidate can answer. Subsequently the person checking the marks can insert a tick to indicate the marks for each question allocated in the text of the answers add up correctly.']

Figure 21.11 Avoiding mistakes with automated error checking (Source: Courtesy of Jon Kerridge)
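The spreadsheet's if-statement checks can be sketched in a few lines. All names, marks and thresholds here are illustrative; they are not taken from Professor Kerridge's actual spreadsheet.

```python
def check_candidate(marks, max_mark_per_question, max_questions):
    """Return error messages for one candidate's row, in the spirit of the
    spreadsheet's checks: no mark may exceed the question maximum, and no
    candidate may answer more than the permitted number of questions."""
    errors = []
    for question, mark in marks.items():
        if mark > max_mark_per_question[question]:
            errors.append(f"ERROR: Q{question} mark {mark} exceeds "
                          f"maximum {max_mark_per_question[question]}")
    if len(marks) > max_questions:
        errors.append(f"ERROR: {len(marks)} questions answered; "
                      f"the maximum is {max_questions}")
    return errors

# A candidate who answered four questions (one too many), with one
# over-maximum mark: both checks should fire
problems = check_candidate({1: 18, 2: 25, 3: 10, 4: 5},
                           {1: 20, 2: 20, 3: 20, 4: 20},
                           max_questions=3)
print(problems)
```

The design point is that the check runs as the data is entered, so a transcription slip is flagged immediately rather than discovered (or not) at moderation time.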
Challenge 21.3

What is wrong with the error message in Figure 21.12? How would you reword it?

'Photoshop quit unexpectedly. Click Reopen to open the application again. Click Report to see more detailed information and send a report to Apple.' [Buttons: Ignore | Report... | Reopen]

Figure 21.12 An unexpected error message

Summary and key points

We have seen that memory is divided into a number of different stores, each of different size, make-up and purpose. Information arriving at the senses is held very briefly in the sensory stores before moving on to working memory. Working memory (the modern equivalent of short-term memory) holds three or four items for up to 30 seconds unless rehearsed. Information may subsequently be stored in the long-term memory store with additional processing. The contents of long-term memory last a long time (minutes, hours, days, even years) and are held in several different types of memory, including memory for skills (procedural memory), semantic memory which holds the meaning of words, facts and knowledge generally, autobiographical memory which holds our personal experiences, and finally perma-store which holds information which literally may last a lifetime.

In terms of design these limitations and capabilities translate into two important principles: the need to chunk material to reduce the load on working memory, and the importance of designing for recognition rather than recall.

Attention can be thought of in terms of being divided or selective. Divided attention refers to our ability to carry out more than one task at a time, though our ability to carry out multiple tasks also depends upon our skill (expertise) and the difficulty of the task.
In contrast, selective attention is more concerned with focusing on particular tasks or things in the environment. It should come as no surprise that we make errors while using interactive devices. These errors have been classified and described by a range of researchers, storage failures being the most common. While all errors cannot be prevented, measures can be taken to minimize them, using devices such as wizards and automated error checking.

Exercises

1 As wizards can be used to prevent action slips being made, does it make good sense to use them for all dialogues with the system or application? When would you not use an error-preventing dialogue style such as wizards?

2 Compare and contrast how you would design a Web browser for recall as compared to one for recognition. What are the key differences?
3 (Advanced) You are responsible for designing the control panel for a nuclear reactor. Operators have to monitor numerous alerts, alarms and readings which (thankfully) indicate normal operating conditions almost all the time. If and when an abnormal state is indicated, the operator must take remedial action immediately. Discuss how you would design the control panel to take into account the qualities of human attention.

4 (Advanced) How far (if at all) does psychological research into the mechanisms of human memory support the effective design of interactive systems? Give concrete examples.

Further reading

Reason, J. (1990) Human Error. Cambridge University Press, Cambridge. Perhaps a little dated now but a highly readable introduction to the study of error.

Wickens, C.D. and Hollands, J.G. (2000) Engineering Psychology and Human Performance, 3rd edn. Prentice-Hall, Upper Saddle River, NJ. One of the definitive texts on engineering psychology.

Getting ahead

Baddeley, A. (1997) Human Memory: Theory and Practice. Psychology Press, Hove, Sussex. An excellent introduction to human memory.

Ericsson, K.A. and Smith, J. (eds) (1991) Towards a General Theory of Expertise. Cambridge University Press, Cambridge. This is an interesting collection of chapters written by experts in expertise.

The accompanying website has links to relevant websites. Go to www.pearsoned.co.uk/benyon

Comments on challenges

Challenge 21.1

The most difficult of these to describe is usually the procedural knowledge of how to ride a bicycle. Most people find the other two aspects reasonably easy.
Procedural knowledge is notoriously difficult to articulate - hence the advisability of having users show you how they perform particular tasks rather than try to tell you.

Challenge 21.2

The plot should resemble Figure 21.13. Words presented first, second, third, ... are recalled well, as are the last four or five words. The twin peaks represent recall from long-term memory (primacy) and working memory (recency), respectively. This is a well-known effect and explains why, when asking for directions or instructions, we tend to remember the beginning and end but are very vague about what was said in the middle.
Figure 21.13 Serial position curve

Challenge 21.3

This error message violates a number of good practice guidelines. 'Critical' sounds scary. 'Can't start program' is not helpful - what is the user supposed to do next? How does the user avoid the error in the future? See the guidelines above. Perhaps a better form of wording might be 'System problem encountered. Please restart application'.
Chapter 22 Affect

Contents
22.1 Introduction 490
22.2 Psychological theories of emotion 491
22.3 Detecting and recognizing emotions 497
22.4 Expressing emotion 501
22.5 Potential applications and key issues for further research 504
Summary and key points 506
Exercises 506
Further reading 506
Web links 507
Comments on challenges 507

Aims

In a special issue of an academic journal devoted to affective computing, Rosalind Picard quotes a MORI survey that found three-quarters of people using computers admit to swearing at them (Picard, 2003). This chapter focuses on the role of emotions (often termed affect in this context) in interactive systems design. We first introduce theories of human emotion and demonstrate their application in technologies that respond to emotion, or can generate 'emotions' themselves.

After studying this chapter you should be able to describe:

• The physical and cognitive accounts (models) of emotion
• The potential for affective computing in interactive systems design
• Applications of affective computing
• Sensing and recognizing human affective/emotional signals and understanding affective behaviour
• Synthesizing emotional responses in interactive devices.
22.1 Introduction

Affect is concerned with describing the whole range of emotions, feelings, moods, sentiment and other aspects of people that might be considered non-cognitive (not aiming to describe how we come to know and understand things) and non-conative (not aiming to describe intention and volition). Of course, affect interacts with the cognitive and conative in complex ways. In particular, a person's level of arousal and stress impacts on what they know, what they can remember, what they are attending to, how good they are at doing something and what they want to do! There are basic emotions such as fear, anger and surprise (discussed further below) and there are longer-term emotions such as love or jealousy that may be built up over years. People can be in different moods at different times. Moods tend to be longer lasting and slower to develop than emotions.

Affect interacts with both cognitive and conative aspects of people. For example, if you are afraid of something it will affect the attention you pay to it. If you are in a positive frame of mind, it might affect how you perceive some event. If an event has a strong emotional impact, you are more likely to remember it.

Affective computing concerns how computing devices can deal with emotions. There are three basic aspects to consider: getting interactive systems to recognize human emotions and adapt accordingly; getting interactive systems to synthesize emotions and hence to appear more engaging or desirable; and designing systems that elicit an emotional response from people or that allow people to express emotions.

A good example of getting computers to recognize human emotions and react accordingly might be the use of a sensor in a motor car to detect whether or not the driver is angry or stressed.
Sensors could be used to pick up on the fact that the driver is perspiring, is holding the steering wheel in a vice-like grip or has elevated blood pressure or heart rate. These are physiological signs of arousal. As statistics show that stress and anger are major contributory factors in road accidents, the car may then offer counselling, refuse to start (or something equally infuriating) or phone ahead to the emergency services. Another couple of examples, suggested by Picard and Healey (1997), are (a) the creation of an intelligent Web browser which responds to the wearer's degree of interest, continuing with a topic that the wearer found interesting until it detected the interest fading, and (b) an affective assistant agent that could intelligently filter your e-mail or schedule, taking into account your emotional state or degree of activity.

Synthesizing emotion is concerned with giving the impression of computers behaving or reacting with emotion. Here an example might be a machine showing signs of distress when a system crash has just destroyed several hours' work. The notion permeates much science fiction. A classic instance here is HAL, the onboard computer on the spaceship in Arthur C. Clarke's novel 2001: A Space Odyssey (Clarke, 1968). In Stanley Kubrick's film version HAL's voice eloquently expresses fear as Dave, the astronaut, considers switching 'him' off. HAL's 'death' is agonizingly slow and piteous:

'Dave, stop. Stop, will you? Stop, Dave. Will you stop, Dave? Stop, Dave. I'm afraid. I'm afraid, Dave. Dave, my mind is going. I can feel it. I can feel it. My mind is going. There is no question about it. I can feel it. I can feel it. I can feel it. I'm afraid.'

In the film the dialogue is particularly poignant when contrasted with the unchanging expression of HAL's 'eye'.
Chapter 22 • Affect 491

[Chapter 5 covers designing for pleasure.]

Designing interactive systems that communicate or evoke human emotions is another key aspect of affective computing. Designing for pleasure is one aspect of this - and commercially crucial for small consumer devices such as phones - but others include devices which allow people to communicate affect at a distance, and the creation of virtual environments which support the treatment of phobias or attempt to evoke the feelings associated with particular places.

Whether or not computers could ever actually feel emotion is beyond this discussion, but science fiction novels such as Do Androids Dream of Electric Sheep? by Philip K. Dick (1968) offer interesting discussions of such themes. At first sight, the idea of 'giving' computers emotion seems to be counter-intuitive. Computers are the epitome of logic and the idea of acting emotionally has strong negative connotations - just think of Star Trek's Mr Spock, or the android Data. There is no denying that emotion (affect) has traditionally had a bad press.

The other side of the argument is the recognition that emotions are part of day-to-day human functioning. Emotion plays a significant part in decision making, social interaction and most aspects of what we would describe as cognition, such as problem solving, thinking and perception.

Increasingly, these human activities and functions are supported by interactive systems, so an understanding of how emotion works can help us design systems that recognize, synthesize or evoke emotions. Or computers with affective capabilities might be more effective than conventional technology in making decisions with incomplete data - circumstances where affect helps human beings to respond quickly. Needless to say, we must now turn to what psychologists have concluded over the years. Be warned that the research findings are less well agreed than many other aspects of human behaviour.
Challenge 22.1

Is it ethical to attempt to manipulate people's emotions through technology? Do new technologies differ from older media such as films in this respect?

22.2 Psychological theories of emotion

What are the basic human emotions? Ekman et al. (1972) are widely quoted researchers who identified six basic emotions, namely fear, surprise, disgust, anger, happiness and sadness. These are generally regarded as being universal - that is, recognized and expressed (facially at least) in the same way in all cultures. Ekman and Friesen (1978) went on to develop the 'facial action coding system' (FACS), which uses facial muscle movements to quantify emotions; an automated version of FACS has also been produced (Bartlett et al., 1999). FACS is still the most widely used method of detecting emotion from facial expression.

Similar work has been undertaken by Plutchik (1980), who has argued for eight basic or primary emotions (four pairs of opposites) that can be combined to produce secondary emotions. In Figure 22.1 we can see that disgust and sadness combine to give the experience of remorse.

But what do we mean by basic or primary emotions? For Ekman this means that they have adaptive value (that is, they have evolved for some purpose), they are, as we have already said, common to everyone irrespective of culture and individual differences,
Figure 22.1 (Source: After Plutchik, Robert, Emotion: A Psychoevolutionary Synthesis, 1st edn, © 1979. Printed and electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey.)

and finally, they all have a quick onset - that is, they appear or start quickly. There is indeed some evidence of different patterns of activity in the ANS (the autonomic nervous system, which links organs such as the heart and stomach to the central nervous system embodied in the brain and spinal cord) for some of the basic emotions. Clearly it would be useful for the designers of affective systems if there is indeed a relatively small number of basic emotions to recognize or simulate.

However, the idea of basic emotions has been challenged, largely because of methodological issues. The main flaw, it is argued, is that the experiments of Ekman and others required participants to make a 'forced choice' between the eight emotions when identifying facial expressions rather than having a completely free choice of emotion terms. Instead of the eight emotions model, Russell and his colleagues propose that variations in just two dimensions - greater or lesser degrees of pleasure (or 'valence') and arousal - can describe the range of affective facial expressions (Russell and Fernandez-Dols, 1997). For example, 'happy' and 'content' lie at the pleasure end of the pleasure/displeasure dimension and entail slightly positive and slightly negative degrees of arousal respectively. Both the Ekman and Russell approaches continue to be used in affective computing research and development. In particular, the Russell wheel lays out a large number of emotions in the two-dimensional space described by an x-axis of valence and a y-axis of arousal.

It is generally agreed that emotions have three components:

• The subjective experience or feelings of fear and so on.
• The associated physiological changes in the ANS and the endocrine system (glands and the hormones released by them). We are aware of some but not all of these (e.g. trembling with fear) and have little or no conscious control over them.
• The behaviour evoked, such as running away.
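Russell's two-dimensional model lends itself to a simple computational sketch: if each emotion term occupies a point in valence-arousal space, then labelling an affective state reduces to finding the nearest labelled point. The coordinates below are illustrative assumptions for the sketch, not values taken from Russell's work.

```python
import math

# Illustrative (valence, arousal) coordinates on Russell's circumplex,
# each in the range -1..1. These positions are assumptions for the
# sketch, not empirically derived values.
CIRCUMPLEX = {
    "happy":     (0.8, 0.5),
    "content":   (0.7, -0.4),
    "sad":       (-0.7, -0.4),
    "angry":     (-0.6, 0.7),
    "afraid":    (-0.7, 0.6),
    "surprised": (0.1, 0.9),
    "bored":     (-0.3, -0.8),
    "calm":      (0.4, -0.7),
}

def nearest_emotion(valence, arousal):
    """Return the label whose circumplex position is closest to the point."""
    return min(
        CIRCUMPLEX,
        key=lambda label: math.dist((valence, arousal), CIRCUMPLEX[label]),
    )
```

Under these assumed coordinates, `nearest_emotion(0.75, 0.45)` returns `'happy'`: a point high in both valence and arousal lands nearest the ‘happy’ region of the wheel.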
Chapter 22 • Affect 493

Virtual environments

Consideration of these three aspects of emotions can be seen in the evaluation of virtual environments, as discussed in Chapter 10. Researchers evaluating the impact of a ‘precipice’ in the virtual environment might capture data on people's reported experience through questionnaires and interviews, their physiological changes through various sensors, and behaviour through observation. In convincing virtual environments, reports of fear, increases in heart rate and retreat from the ‘precipice’ have all been found. In this context the self-report measures are often termed ‘subjective’, and behavioural and physiological measures ‘objective’. However, as we shall see in this chapter, the so-called objective measures require a degree of interpretation by researchers, thereby introducing a substantial degree of subjectivity.

Beyond the simple cataloguing of emotion and its components are the various attempts to account for them. Until relatively recently this was the province of philosophers, until the early psychologists decided to try their hands at it. Probably the first of these were James and Lange.

The James-Lange theory

This theory, which dates from the 1890s, argues that action precedes emotions and the brain interprets the observed action or actions as emotions. So, for example, we see an axe-wielding maniac walking towards us: in response our pulse rate rises, we begin to sweat and we quicken our step - we run for our lives. These changes in the state of our body (increased pulse, sweating and running) are then interpreted as fear. Thus from interpreting the state of our bodies we conclude that we must be afraid.

This is summarized in Figure 22.2, but it is obviously a rather crude model. What bodily state, for example, corresponds to the emotional state ‘mildly disappointed but amused’?
Figure 22.2 The James-Lange theory of emotion (perception of emotion-arousing stimulus, leading to visceral and skeletal changes, interpreted as emotion, with a feedback loop)

The Cannon-Bard theory

Two psychologists working in the 1920s, Cannon and Bard, disagreed with the James-Lange theory and argued that when an emotion-arousing stimulus is first perceived, actions follow from cognitive appraisal. They also noted that the same visceral changes occur in a range of different emotions. In their view, the thalamus (a complex structure in the brain) plays a central role by interpreting an emotional situation while
simultaneously sending signals to the autonomic nervous system (ANS) and to the cortex, which interprets the situation. The ANS is responsible for the regulation of unconscious functions like heart rate and the secretion of hormones such as adrenaline. This is shown in Figure 22.3.

Figure 22.3 The Cannon-Bard theory (perception of emotion-arousing stimulus; the thalamus sends impulses to the cortex, giving the experience of emotion, and to the hypothalamus, producing physiological (bodily) changes)

Cognitive labelling and appraisal theories: Schachter-Singer and Lazarus

On a more contemporary note, Schachter and Singer conducted a series of experiments in the 1960s, basically working along the same lines as James and Lange. Schachter and Singer favoured the idea that the experience of emotions arises from the cognitive labelling of physiological sensation. However, they also believed that this was not enough to explain the more subtle differences in emotion self-perception, i.e. the difference between anger and fear. Thus, they proposed that, once the physiological symptoms or arousal have been experienced, an individual will gather information from the immediate context and use it to modify the label they attach to the sensation. Figure 22.4 is an illustration of the Schachter and Singer model of emotion.

In a series of classic experimental studies they tested these ideas. The most famous of their studies was the adrenaline experiment (Schachter and Singer, 1962). In this experiment they told the participants that they would receive an injection of a vitamin (adrenaline is not a vitamin) and then test to see whether it had affected their vision. They also divided the participants into four groups:

Group A These people were given accurate information as to the effect of the ‘vitamin’, that is, sweating, tremor, feeling jittery.
Group B They were given false information as to the effect of the ‘vitamin’, namely, itching and headaches. Group C These people were told nothing. Group D This group served as a control and were actually injected with saline (which has no side effects) and were also told nothing.
Figure 22.4 The Schachter-Singer theory (perception of emotion-arousing stimulus; the thalamus sends impulses to the cortex and physiological (bodily) changes occur; awareness of physiological arousal leads to interpreting the arousal as a particular emotion given the context)

Before the (fake) vision test, the experimenters exposed everyone to an emotion-arousing stimulus that in practice was invoked by a stooge - a laughing, happy individual who fooled around or an angry, bad-tempered person who was seen to rip up a questionnaire. The participants were then asked to rate to what degree they had joined in with the stooge's behaviour and to report on how they felt. As expected, groups A and D said that they felt less likely to join in, while groups B and C said that they shared the stooge's apparent emotional state.

There have been several criticisms and qualifications of the theory arising from later research:

• The situation in the experiment is atypical, in that there is usually much less ambiguity about what is happening.
• We base our labelling of the emotion not just on the behaviour of others, but on our own past experiences and many other sources of information.
• Unexplained emotional arousal tends to be experienced as negative - for example, a vague sense of unease - thus indicating that the nature of emotional experience is not entirely determined by cognitive labelling.

Cognitive labelling theory has been developed further by Lazarus (1982), who proposed the notion of cognitive appraisal. According to cognitive appraisal theory, some degree of evaluation of the situation (appraising the situation) always precedes the affective reaction, although this can be unconscious and does not prevent the immediacy of the sensation. Zajonc (1984), however, argues that some emotional responses do precede any cognitive processing.
In conclusion, it is generally believed that some cognitive evaluation, or appraisal, occurs in the experience of emotion, but there is no overall agreement about the relative
dominance and order of cognition and the affective reaction. Scherer (2005) proposes that appraisal consists of four checks that are carried out by people in assessing their environment. They consider:

• The relevance and implication of events for their well-being
• The relevance and implication of events for long-term goals
• How well they can cope with the situation
• The significance of the event for their self-concept and social norms.

Emotion is defined as the changes in the ANS and other subsystems such as the central nervous system. For our work as designers, the ‘take-home message’ is that it is not enough to induce emotional arousal: the context of the arousal must support the identification of the particular emotion that it is intended to evoke.

As a concrete instance of this, we return to the design issues for virtual environments. Many experiments have shown that events or features in the environment can engender some sensation of anticipation, or fear, or whatever. However, those feelings can be attenuated or labelled differently because of the knowledge that one is experiencing the world through an HMD (head-mounted display) within a laboratory. Moreover, it is (we hope) unlikely that one's colleagues have created a situation that is genuinely dangerous. Computer games minimize this problem by creating a strong narrative (or story) and a good deal of interaction, both of which help to reduce the influence of the real world beyond the virtual one. It is necessary to use similar stratagems in the design of virtual worlds intended to evoke an emotional response, whether this is for entertainment, therapy, training or some other purpose.

The EMMA project

In the EU-funded EMMA project researchers were investigating the relationship between presence (the sense of ‘being there’) and emotions.
EMMA uses tools such as virtual reality, intelligent agents, augmented reality and wireless devices to provide ways of coping with distressing emotions for users, including people with psychological problems. Emotions are stimulated through engagement with a virtual park, which changes in accordance with the emotion involved. Figure 22.5 shows the winter view of the park, designed to evoke sadness.

Figure 22.5 The ‘sad’ park development in the EMMA project (Source: www.psychology.org/The%20EMMA%20Project.htm. Courtesy of Mariano Alcaniz)

Source: http://cordis.europa.eu/search/index.cfm?fuseaction=proj.document&PJ_RCN=5874162
Challenge 22.2
Use Scherer's four checks to evaluate your response to someone leaping out at you and shouting ‘boo!’ very loudly.

22.3 Detecting and recognizing emotions

If technologies are to act upon human emotions, the first step is to recognize different affective states. As we have already seen from the psychology of emotion, human emotional states have physiological, cognitive and behavioural components. Behavioural and (some) physiological changes are, of course, most apparent to the outside world, unless we deliberately choose to disguise our feelings. Some signs of our affective state are more easily detected than others, however, as shown in Table 22.1. But while some of the physiological changes are obscure to other people, unless they are extremely physically close or have special monitoring equipment, they are virtually all accessible to a computer armed with the appropriate sensors.

However, detecting changes and attributing them to the correct emotion are two radically different problems. The second is much more intractable than the first, and one which causes much misunderstanding between people as well as potentially between people and machines. This area is also known as social signal processing. A good source of material on this is at http://sspnet.eu.

What if boredom couldn't be disguised?

We are, of course, capable of disguising the more overt symptoms of socially unacceptable emotions.
Writing in the Observer newspaper of 7 September 2003, the columnist Victoria Coren speculates thus: ‘What if (just as you blush when you're embarrassed or shiver when you're cold) you automatically removed your trousers when you were bored? The world of polite feigned interest would be dead and gone. You could smile all you liked as the boss made small talk - but no use, the trousers would be off. Everyone would have to try harder and waffle less. As things stand, boredom is too easily disguised, but we have tamed our wilder instincts.’

Table 22.1 Forms of sentic modulation

Apparent to other people: facial expression; voice intonation; gesture, movement; posture

Less apparent to other people: respiration; heart rate, pulse; temperature; electrodermal response, perspiration; pupillary dilation; muscle action potentials; blood pressure

Source: Adapted from Picard, Rosalind W., Affective Computing, Table 1.1, © 1997 Massachusetts Institute of Technology, by permission of The MIT Press
Basic capabilities for recognizing emotion

Technologies that successfully recognize emotion need to draw upon techniques such as pattern recognition and are likely to need to be trained to individual people - as for voice input technologies. The list below is reproduced from Picard (1997, p. 55) and sets out what capabilities a computer requires to be able to discriminate emotions.

• Input. Receiving a variety of input signals, for example face, hand gestures, posture and gait, respiration, electrodermal response, temperature, electrocardiogram, blood pressure, blood volume, and electromyogram (a test that measures the activity of the muscles).
• Pattern recognition. Performs feature extraction and classification on these signals. For example, analyzes video motion features to discriminate a frown from a smile.
• Reasoning. Predicts underlying emotion based on knowledge about how emotions are generated and expressed. This reasoning would require the system to reason about the context of the emotion and a wide knowledge of social psychology.
• Learning. As the computer ‘gets to know’ someone, it learns which of the above factors are most important for that individual, and gets quicker and better at recognizing his or her emotions.
• Bias. The emotional state of the computer, if it has emotions, influences its recognition of ambiguous emotions.
• Output. The computer names (or describes) the recognized expressions and the likely underlying emotion.

Progress has been made on many of these dimensions. Sensors and software that detect physiological changes such as heart rate, skin conductivity and so forth have long been available. However, there are practical issues with relying on this sort of data alone. The sensors themselves are too intrusive or awkward for most everyday uses, and the data requires expert analysis - or intelligent systems - to interpret the significance of changes.
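A minimal sketch of the first two capabilities in Picard's list - receiving an input signal and extracting a feature from it - might look like the following, which flags sudden rises in a skin-conductance trace against a moving baseline. The window size and threshold are arbitrary assumptions; a practical system would calibrate both to the individual wearer.

```python
def detect_arousal_events(samples, window=5, threshold=0.3):
    """Return indices where skin conductance jumps above a moving baseline.

    `samples` is a sequence of conductance readings (e.g. in microsiemens).
    The baseline is the mean of the previous `window` samples; a reading
    exceeding the baseline by more than `threshold` counts as an arousal
    event. Both parameters are illustrative assumptions.
    """
    events = []
    for i in range(window, len(samples)):
        baseline = sum(samples[i - window:i]) / window
        if samples[i] - baseline > threshold:
            events.append(i)
    return events
```

Note that this detects arousal only; as the text goes on to explain, attributing the change to a specific emotion is a much harder problem.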
Also, individual physiological signs tend to indicate a general increase in arousal rather than specific emotions, and the same combinations of physiological signs can belong to different emotions - the signs of disgust and amusement are very similar, for example. Hence the need for the detection of other physical signs and/or pattern recognition to support computers in recognizing emotions.

StartleCam

StartleCam is a wearable video camera, computer and sensing system, which enables the camera to be controlled via both conscious and preconscious events involving the wearer. Traditionally, a wearer consciously hits ‘record’ on the video camera, or runs a computer script to trigger the camera according to some prespecified frequency. The system described here offers an additional option: images are saved by the system when it detects certain events of supposed interest to the wearer. The implementation described here aims to capture events that are likely to get the user's attention and to be remembered. Attention and memory are highly correlated with what psychologists call arousal level, and the latter is often signalled by skin conductivity changes; consequently, StartleCam monitors the wearer's skin conductivity. StartleCam looks for patterns indicative of a ‘startle response’ in the skin conductivity signal. When this response is detected, a buffer of digital images, recently captured by the wearer's digital camera, is downloaded and optionally transmitted wirelessly to a Web server. This selective storage of digital images creates a ‘flashbulb’ memory archive for the wearable which
aims to mimic the wearer's own selective memory response. Using a startle detection filter, the StartleCam system has been demonstrated to work on several wearers in both indoor and outdoor ambulatory environments.

Source: StartleCam (1999)

Recognizing emotions in practice

Pattern recognition was exploited in work by Picard and her team which was designed to explore whether a wearable computer could recognize a person's emotions over an extended period of time (Picard et al., 2001). Over a period of ‘many weeks’, four sensors captured

• an electromyogram (indicating muscle activity)
• skin conductance
• blood volume pulse (a measure of arousal)
• respiration rate.

By using pattern recognition algorithms, eight emotions were distinguishable at levels significantly higher than chance. This does not mean, however, that computers can recognize people's emotions with reliable accuracy - the main reason being that the recognition software was constrained to a forced choice among the eight defined emotions. But as Picard notes, even partial recognition can be helpful - provided that the wrong emotion is not positively identified.

Elsewhere at MIT, work has been targeted at identifying rather more diffuse emotions such as ‘the state you are in when all is going well with the computer’, as contrasted with ‘the state you are in when encountering annoying usability problems’ (Picard, 2003). There is clear value here for developing applications that mitigate user frustration.

Tracking changes in facial expression offers another means of extending the data from physiology. In one experiment, for example, Ward et al. (2003) used a commercially available facial tracking package. The software works by tracking facial movements detected from a video of the face.
The findings suggested the following:

• Facial expressions change in response to even relatively minor interaction events (in this case low-key surprising and amusing events, where the latter produced a weaker reaction).
• These changes were detected by the tracking software.

The authors concluded that the approach has potential as a tool for detecting emotions evoked by interacting with computers, but that recognizing emotions (rather than simply tracking physical changes) is likely to be more successful with a combination of data sources.

Most applications of computer recognition of emotion lie in the development of systems that moderate their responses to respond to user frustration, stress or anxiety. However, there are worthwhile applications in domains beyond computing per se. One of the most significant of these is healthcare, where taking account of affective state is a vital element of patient care. In tele-healthcare, however, the clinician's scope for doing this is rather limited. Tele-healthcare is used for such applications as collecting ‘vital signs’ such as blood pressure, checking that medication has been taken, or compliance with other medical directions. Lisetti et al. (2003) report early work on an application designed to improve affective information in this context. The system models the patient's affective state using multiple inputs from wearable sensors and other devices such as a camera. The identified emotions are then mapped on to intelligent agents
which are embodied as avatars. The personal avatar is then able to ‘chat’ to the patient to confirm the emotions identified, and also to reflect this state in supplementing textual communication between patient and clinician (Figure 22.6). Preliminary results showed 90 per cent success in recognizing sadness, 80 per cent success for anger, 80 per cent for fear and 70 per cent for frustration.

Figure 22.6 An avatar mirroring a user's sad state (Source: Reprinted from International Journal of Human-Computer Studies, 59, Lisetti, C. et al., Developing multimodal intelligent affective interfaces for tele-home health care, pp. 245-55. Copyright 2003, with permission from Elsevier)

Affective wearables

‘An affective wearable is a wearable system equipped with sensors and tools that enables recognition of its wearer's affective patterns’ (Picard, 1997, p. 227). Wearable computers are not merely portable like a laptop or a Walkman but can be used whether we are walking, standing or travelling. Wearables are also always on (in every sense). (Wearables are discussed in Chapter 20.) At present, a wide range of prototypes of affective wearables already exists, though they are far from complete or polished and require regular attention/maintenance. One of the clear advantages to the design and use of affective wearables is that they can supply information on affect naturalistically. Affective wearables provide an opportunity to study and test theories of emotion.

Currently, the most common examples of affective wearables are affective jewellery. Figure 22.7 is an illustration of a piece of affective jewellery, in this instance an earring that also serves to display the wearer's blood volume pressure using photoplethysmography. This involves using an LED to sense the amount of blood flow in the earlobe. From this reading both the heart beat and constriction of the blood vessel can be determined.
In practice, the earring proved to be very sensitive to movement, but future applications might include being able to gauge the wearer's reaction to consumer products. Figure 22.8 is a further example of a system that can sample and transmit biometric data to larger computers for analysis. The data is sent by way of an infra-red (IR) link.

Figure 22.7 The Blood Volume Pressure (BVP) earring (Source: Courtesy of Frank Dabek)

Figure 22.8 Sampling biometric data with a wearable device (Source: Courtesy of Frank Dabek)
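Wearables such as these feed their readings into the pattern-recognition step discussed earlier. As a toy illustration of that step - not the algorithm Picard et al. (2001) actually used - a nearest-centroid rule over four normalized sensor features might look like this; the centroid values are invented for the sketch.

```python
import math

# Invented per-emotion centroids over four normalized features:
# (EMG, skin conductance, blood volume pulse, respiration rate).
# A real system would learn these from labelled training data.
CENTROIDS = {
    "anger": (0.9, 0.8, 0.8, 0.7),
    "fear":  (0.6, 0.9, 0.9, 0.8),
    "calm":  (0.1, 0.2, 0.3, 0.3),
}

def classify(features):
    """Assign the feature vector to the emotion with the nearest centroid."""
    return min(
        CENTROIDS,
        key=lambda label: math.dist(features, CENTROIDS[label]),
    )
```

Note that, as in the original study, this is a forced choice: a feature vector far from every centroid is still assigned the least-bad label, which is one reason forced-choice recognition can confidently report the wrong emotion.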
Challenge 22.3
We have established that affect stems partly from physiological sensations such as increases in pulse rate, perspiration and so on. Given that sensors exist to detect these changes, how could these phenomena be exploited in the design of interactive games? You should consider acceptability to gamers alongside technical feasibility.

Whether computers could ever be said to experience emotions has long been a matter for debate and is largely beyond the scope of this chapter, but in some ways this fascinating question does not fundamentally affect thinking on how to design for affect. We now move on to investigate what it means for a computer - or any other interactive system - to express emotion.

22.4 Expressing emotion

This is the other side of the affective computing equation. As we have seen, humans express emotions through facial expressions, body movements and posture, smaller-scale physiological changes and changes in tone of voice - which can be extended to the tone and style of written communications. With interactive systems, there are several aspects to consider:

• How computers that apparently express emotion can improve the quality and effectiveness of communication between people and technologies.
• How people can communicate with computers in ways that express their emotions.
• How technology can stimulate and support new modes of affective communication between people.

Can computers express emotion?

There is little argument that computers can appear to express emotion. Consider the expressions of the Microsoft Office Assistant - in Figure 22.9 he is ‘sulking’ when ignored by the author. Whether such unsophisticated anthropomorphism enhances the interactive experience is debatable at best.
As well as the irritation provoked in many users, there is a risk that people may expect much more than the system can provide. Many of the more visible outward expressions of emotion introduced in Section 22.3 can be mimicked by computing applications. Even very simple facial models have been found capable of expressing recognizable emotions. A representative instance of this strand of research is reported by Schiano and her colleagues (Schiano et al., 2000). The experiment tested an early prototype of a simple robot with ‘a box-like face containing eyes with moveable lids, tilting eyebrows, and an upper and lower lip which could be independently raised or lowered from the center’. The face was made of metal and had a generally cartoon-like appearance - most of the subtle changes in facial folds and lines that characterize human emotions were missing. Despite these limitations, human observers were able to identify the emotions communicated successfully.

The impact of even limited emotional expression is illustrated again by an experimental application at MIT, the ‘relational agent’ (Bickmore, 2003). This was designed to sustain a long-term relationship with people who were undertaking a programme to enhance exercise levels. The agent asked about, and responded to, their emotions and expressed concern by modifying text and bodily expression where appropriate. The computer did not disguise its limited empathetic skills, nor were people really
convinced of the reality of the ‘feelings’ displayed, but nevertheless the agent was rated significantly higher for likeability, trust, respect and feelings that it cared for them than a standard interactive agent.

Figure 22.9 The Microsoft Office Assistant ‘sulking’

By contrast, ‘Kismet’ (Figure 22.10), an expressive robot developed at MIT, provides a much more complex physical implementation. It is equipped with visual, auditory and proprioceptive (touch) sensory inputs. Kismet can express apparent emotion through vocalization, facial expression, and adjustment of gaze direction and head orientation. (See also Chapter 17 on avatars and companions.)

Figure 22.10 The Kismet robot (Source: Sam Ogden/Science Photo Library)
Affective input to interactive systems

So, if computers can express apparent emotions to humans, how can humans express emotions to computers, aside from swearing or switching off the machine in a fit of pique? We have already seen in Section 22.3 that computers can detect affective states. Interactive systems such as those described in that section generally aim to monitor human affective signs unobtrusively so as to identify current emotions. But what if the human wants to communicate an emotion more actively, perhaps to influence the actions of her character in a game?

The affective, tangible user interface developed in the SenToy project (Paiva et al., 2003) affords an imaginative treatment of this type of input problem. Manipulating the SenToy doll (Figure 22.11) so that it performs pre-specified gestures and movements allows people to modify the ‘emotions’ and behaviour of a character in a game. People can express anger, fear, surprise, sadness, gloating and happiness through gestures which are picked up by the doll's internal sensors and transmitted to the game software. Sadness, for example, is expressed through bending the doll forwards, while shaking it with its arms raised denotes anger. Actions carried out by the game character reflect the emotion detected.

Figure 22.11 The SenToy affective interface (Source: Reprinted from International Journal of Human-Computer Studies, 59(1-2), Paiva, A. et al., SenToy: an affective sympathetic interface. Copyright 2003, with permission from Elsevier)

In preliminary trials with adults and children, sadness, anger and happiness were easily expressed without instruction, while gloating - requiring the doll to point and perform a little dance - was particularly difficult. In playing the game itself, this time with instructions for the gestures, all the emotions except surprise were expressed effectively.
People became very involved with the game and the doll, and generally enjoyed the experience.

Enhancing human affective communication

Researchers have also turned their attention to enhancing emotionally toned communication between people. Developments fuse highly creative conceptual design with (sometimes very simple) technology. Sometimes the idea is to convey a particular emotion - generally a positive one - but more often the aim is to foster emotional bonds through feelings of connection. Like much else in the affective computing field, these innovations are very much in their infancy at the time of writing, with few realized in their final form.

A representative set of examples, designed for ‘telematic emotional communication’, is described by Tollmar and Persson (2002). Rather unusually in this domain, the inspiration behind the ideas comes not only from the designers or technologists, but also from ethnographic studies of households and their use of artefacts to support emotional closeness.

They include ‘6th sense’ (Figure 22.12), a light sculpture which senses body movement in the vicinity. If there is continuous movement for a time, the lamp sends this information to its sister lamp in another household. This lights up, indicating someone's presence in the first household - an unobtrusive way of staying in touch with the movements of a friend or family member.
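The behaviour of ‘6th sense’ reduces to a small piece of logic: signal the sister lamp only once movement has been continuous for some interval. A sketch, assuming a 30-second threshold (Tollmar and Persson's actual timing is not given here):

```python
class PresenceLamp:
    """Signals a paired lamp after sustained movement (a sketch of '6th sense')."""

    def __init__(self, required_seconds=30):
        self.required = required_seconds
        self.movement_started = None  # time when continuous movement began

    def update(self, now, movement_detected):
        """Feed one sensor reading; return True when the sister lamp should light."""
        if not movement_detected:
            # Movement has stopped, so the continuity timer resets.
            self.movement_started = None
            return False
        if self.movement_started is None:
            self.movement_started = now
        return now - self.movement_started >= self.required
```

Requiring sustained rather than momentary movement is what makes the signal unobtrusive: a person walking past the lamp does not light the sister lamp, but someone pottering about the room for half a minute does.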
Figure 22.12 6th Sense (Source: Tollmar and Persson (2002) Understanding remote presence, Proceedings of the Second Nordic Conference on Human-Computer Interaction, Aarhus, Denmark, 19-23 October, NordiCHI '02, vol. 31. ACM, New York, pp. 41-50. © 2002 ACM, Inc. Reprinted by permission, http://doi.acm.org/10.1145/572020.572027)

22.5 Potential applications and key issues for further research

Table 22.2 is a list of potential ‘areas of impact’ for affective computing. ‘Foreground’ applications are those in which the computer takes an active, usually visible, role; in ‘background’ applications the computer is a more backstage presence.

Despite lists such as that in Table 22.2, which identify a compelling range of potential applications, affective computing is still a developing area. There are fundamental issues which remain to be clarified. Perhaps the most salient for interactive systems designers are:

• In which domains does affective capability make a positive difference to human-computer interaction, and where is it irrelevant or even obstructive?
• How precise do we need to be in identifying human emotions - perhaps it is enough to identify a generally positive or negative feeling? What techniques best detect emotional states for this purpose?
• How do we evaluate the contribution of affect to the overall success of a design?
Table 22.2 Potential ‘areas of impact’ for affective computing

Foreground - human-human mediation:
• Conversation - recognizing emotional states
• Wireless-mobile devices - representing and displaying emotional states
• Telephones - speech: synthetic affect and voice
• Video teleconferences - affective iconics

Foreground - human-computer interaction:
• Graphical user interface - adaptive response based on physiological detection
• Wearable computers - remote sensing of physiological states
• Virtual environments - emotional capture and display
• Decision support - affective elements of decision-making

Background - human-human mediation:
• Portholes (video/audio links between offices or other spaces) - affective interchanges, adaptive monitoring
• Electronic badges - affective alert and display systems
• Avatars - creation of personality via synthetic emotional content
• Intelligent agents - social and emotional intelligence

Background - human-computer interaction:
• Smart house technology - sensors and affective architectures
• Ubiquitous computing - affective learning
• Speech recognition - monitoring voice stress
• Gaze systems - movements and emotion detection

Source: Reprinted from International Journal of Human-Computer Studies, 59 (1-2), McNeese, M.D., New visions of human-computer interaction: making affect compute, copyright 2003, with permission from Elsevier

Is affective computing possible or desirable?

FURTHER THOUGHTS

Writing in the 2003 Special Issue on Affective Computing of the International Journal of Human-Computer Studies, Eric Hollnagel argues thus:

Emotions can provide some kind of redundancy that may improve the effectiveness of communication.
The affective modality of communication can furthermore be expressed by different means such as the grammatical structure (a polite request versus an order), the choice of words, or the tone of voice (or choice of colours, depending on the medium or channel). Yet neither of these represents affective computing as such. Instead the style of computing - or rather, the style of communication or interaction - is effectual. It does not try to transmit emotions as such but rather settles for adjusting the style of communication to achieve maximum effectiveness.

In work, people are generally encouraged to be rational and logical rather than affective and emotional. Indeed, every effort in task design, ergonomics, procedures and training goes towards that. From the practical perspective the need is therefore not to emulate emotions but to be able to recognize and control emotions. (This goes for human-human communication as well as human-machine interaction.) In cases where emotions are considered an advantage, they should be amplified. But in cases where they are a disadvantage (which include most practical work situations), they should be dampened.

All work, with or without information technology, aims to produce something in a systematic and replicable manner - from ploughing a field to assembling a machine. Affects and emotions usually do not contribute to the efficiency of that but are more likely to have a negative influence. In contrast, art
does not aim to produce identical copies of the same thing, and emotions or non-logical (not replicable) procedures and thinking are therefore valuable.

In conclusion, affective computing is neither a meaningful concept nor a reasonable goal. Rather than trying to make computers (or computing) affective, we should try to make communication effectual. Rather than trying to reproduce emotions we should try to imitate those aspects of emotions that are known to enhance the effectiveness of communication.

Source: Reprinted from International Journal of Human-Computer Studies, 59 (1-2), Hollnagel, E., Is affective computing an oxymoron?, p. 69, copyright 2003, with permission from Elsevier.

Kristina Höök et al. (2008) express similar views in arguing for an interactionist view of emotion. We should not be trying to guess people's emotions and adapting systems based on this guess; we should be designing systems that let people express emotion when and how they want to.

Summary and key points

In this chapter we have explored the theory of emotions and seen how this has been applied to the developing field of affective computing. We have discussed what is required for technologies to display apparent emotion, to detect and respond to human emotions and to support human affective communication - potentially a very diverse and technically advanced set of capabilities - but we have suggested that an approximate identification and representation of emotion may suffice for many purposes. Applications have been identified which range from affective communication to supporting telemedicine to interacting with games.
Exercises

1 How far is it necessary to understand the theory of human emotions in order to design affective technologies? Illustrate your answer with examples.
2 Develop a storyboard showing the proposed use of an affective operating system designed to respond when it detects frustration and tiredness in its user.

Further reading

International Journal of Human-Computer Studies, no. 59 (2003) - special issue on Affective Computing. Includes review papers, opinion pieces, theoretical treatments and applications and so provides an excellent snapshot of the state of affective computing in the early twenty-first century.

Norman, D.A. (2004) Emotional Design: Why We Love (or Hate) Everyday Things. Basic Books, New York. A very readable account of the relationship between emotions and design.

Picard, R.W. (1997) Affective Computing. MIT Press, Cambridge, MA. A stimulating discussion of the theoretical and technological issues grounded in (then) state-of-the-art research at MIT.
Getting ahead

Brave, S. and Nass, C. (2007) Emotion in Human-Computer Interaction. In Sears, A. and Jacko, J.A. (eds) The Handbook of Human-Computer Interaction: Fundamentals, Growing Technologies and Emerging Applications, 2nd edn. Lawrence Erlbaum Associates, Mahwah, NJ.

Web links

The HUMAINE project at http://emotion-research.net
The accompanying website has links to relevant websites. Go to www.pearsoned.co.uk/benyon

Comments on challenges

Challenge 22.1
There are many possible arguments here. Our view on this is that it is probably acceptable as long as people can choose whether to use the technology, they are aware that the technology has affective aspects and they can stop using it at any point. Among the differences from older media are the interactive nature of new technologies and the possibility that giving technology reactions and expressions which mimic human emotions may mislead people into inappropriately trustful behaviour.

Challenge 22.2
1 The relevance and implication of events for your well-being will probably initially induce fear.
2 The relevance and implication of events for long-term goals will help you understand that this is a temporary thing.
3 How well they can cope with the situation will probably make you jump and then laugh.
4 The significance of the event for their self-concept and social norms will make you angry if you have looked foolish in the presence of others.

Challenge 22.3
It would be possible, for example, to detect that a games player's state of arousal had remained unchanged since starting the game and step up the pace accordingly, or conversely to slow things down to provide a calmer interlude after a sustained period of high arousal readings. I would guess that people might prefer not to wear (and fix, and calibrate) physical sensors, so perhaps non-contact monitoring, e.g.
of facial expression, might be an acceptable solution.
Chapter 23
Cognition and action

Contents
23.1 Human information processing 509
23.2 Situated action 512
23.3 Distributed cognition 514
23.4 Embodied cognition 516
23.5 Activity theory 519
Summary and key points 525
Exercises 525
Further reading 525
Web links 526
Comments on challenges 526

Aims

Perhaps surprisingly, there is no single theory about how people think and reason (cognition) and what relationships there are between thought and actions. In this chapter we look at a number of different views. Cognitive psychology tends to focus on a disembodied view of cognition: 'I think therefore I am' in the famous phrase of René Descartes. Embodied cognition recognizes that we have physical bodies which have evolved and are adapted to a range of activities that take place in the world. Distributed cognition argues that thinking is spread across brains, artefacts and devices and is not simply processed in the brain. Situated action points to the importance of context in deciding what we do, and activity theory focuses on action in pursuit of objectives.

After studying this chapter you should be able to understand:
• Cognitive psychology and in particular the idea of humans as information processors
• The importance of context in the design of interactive systems and as a major part of determining the range and type of actions we take
• The importance of the body to thinking and taking action
• Two more views of cognition and action - distributed cognition and activity theory.
23.1 Human information processing

In 1983, Card et al. published The Psychology of Human-Computer Interaction. In the preface to this, one of the first and certainly the most celebrated books on psychology and HCI, we find this earnest hope expressed:

The domain of concern to us, and the subject of this book, is how humans interact with computers. A scientific psychology should help us in arranging the interface so it is easy, efficient and error free - even enjoyable.
Card et al. (1983), p. vii

The book has at its core the Model Human Processor, which is a simplified model of human information processing from the perspective of (a) the psychological knowledge at that time and (b) a task-based approach to human-computer interaction. Task-based approaches to HCI are concerned with looking at people trying to achieve a specific goal (they are discussed in Chapter 11). The human information processing paradigm characterizes or simplifies people's abilities into three 'blocks' or subsystems: (a) a sensory input subsystem, (b) a central information processing subsystem and (c) a motor output subsystem. This is, of course, remarkably similar to how we generally partition the main elements of a computer. Figure 23.1 is an illustration of this relation between people and computers. In this view of HCI, humans and computers are functionally similar and form a closed loop.

In this model, all of our cognitive abilities have been grouped into a box labelled 'Human Information Processing', which is a great oversimplification. For example, in previous chapters we have discussed the role of memory, attention and affect in understanding how people think and act. In this representation of humans and computers

Figure 23.1 The information processing paradigm (in its simplest form)
the outside world is reduced to stimuli, which can be detected by the senses, so the model effectively decontextualizes human beings. There are more complex and empirically tested models of humans (see Box 23.1) and there are many accounts at the neurological level of how the brain is organized and how it processes signals. As researchers, academics and designers we are interested in understanding and predicting the use of interactive systems and, for many of us, this is best done using an underlying theory such as cognitive psychology. But human beings are very complex, so we have to simplify our view of human cognitive abilities in order to make them manageable.

Cognitive models

Cognitive models or cognitive architectures were once the flagships of both cognitive psychology and HCI. Cognitive models (or architectures) such as SOAR and ACT-R have been developed by research teams to accommodate a range of what Newell has called micro-theories of cognition. ACT-R (http://act-r.psy.cmu.edu/about/) stands for 'The Adaptive Control of Thought - Rational' and is, strictly speaking, a cognitive architecture. ACT-R is broader than any particular theory and can even accommodate multiple theories within its framework. It has been developed to model problem solving, learning and memory. ACT-R looks and behaves like a programming language except that it is based on constructs based on human cognition (or what its creators believe to be the elements of human cognition). Using ACT-R the programmer/psychologist or cognitive scientist can solve or model problems such as logic puzzles, or control an aircraft, and then study the results. These results might give insights into the time to perform tasks and the kinds of errors which people might make in doing so. For example, see Barnard (1985). Other cognitive architectures work in similar ways.
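To make the closed-loop, three-subsystem view concrete, here is a minimal sketch in Python. It is purely illustrative and not from the book or from any cognitive architecture: the function names and the string representations are invented, and each subsystem is caricatured as a single function.

```python
# Illustrative sketch of the HIP paradigm: sensory input, central
# processing and motor output, closed into a loop with the world.

def sense(stimulus: str) -> str:
    """Sensory input subsystem: encode a stimulus as a percept."""
    return f"percept({stimulus})"

def process(percept: str, goal: str) -> str:
    """Central processing subsystem: decide on an action."""
    return f"action-for({goal}, given {percept})"

def act(action: str) -> str:
    """Motor output subsystem: execute the action on the world."""
    return f"world-changed-by({action})"

def closed_loop(stimulus: str, goal: str) -> str:
    # Human and computer form a closed loop: the changed world
    # becomes the next stimulus to be sensed.
    return act(process(sense(stimulus), goal))

print(closed_loop("dialog box appears", "dismiss dialog"))
```

The very linearity of this pipeline is what the rest of the section criticizes: real cognition is not a one-way pass from stimulus to response.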
A seven-stage model of activity

One celebrated psychologist who has been involved in psychology and HCI since the 1970s is Donald Norman. Figure 23.2 is a representation of Norman's seven-stage model of how an individual completes an activity (Norman, 1988). Norman argues that we begin with a goal, e.g. checking sports results on the Web, or phoning a friend. Our next step is to form a set of intentions to achieve this goal, e.g. finding a computer with a browser, or trying to remember which jacket we were wearing when we last used our mobile phone. This is then translated into a sequence of actions which we then execute, e.g. go to a computer lab or Internet cafe, then log on to a PC, double-click on a Web browser, type in the URL, hit return, read sports results. At each step on the way we perceive the new state of the world, interpret what we see, and compare it against what we intended to change. We may have to repeat these actions if our goals were not met.

In Figure 23.2, the gulf of execution refers to the problem of how an individual translates intentions into action. The gulf of evaluation is the converse and refers to how an individual understands, or evaluates, the effects of actions and knows when his or her goals are satisfied. (We encountered these gulfs briefly in Chapter 2.)

Challenge 23.1
Identify instances of the gulf of execution and the gulf of evaluation in devices or systems which you (or other people) have difficulty using.
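Norman's seven stages can be caricatured as a feedback loop that repeats until the goal is satisfied. The sketch below is illustrative only: the stage names follow the model, but the `pursue` function, the `goal_met` test and the example scenario are invented for this sketch.

```python
# The seven stages of Norman's model, as named in the chapter.
STAGES = [
    "form the goal",
    "form the intention",
    "specify the action sequence",
    "execute the action",                 # the gulf of execution
    "perceive the state of the world",
    "interpret the perception",
    "evaluate against the goal",          # the gulf of evaluation
]

def pursue(goal, goal_met, execute, max_cycles=3):
    """Cycle through the seven stages until goal_met() reports success."""
    for cycle in range(1, max_cycles + 1):
        # Stages 1-3: goal, intention, action sequence (implicit here).
        execute()          # Stage 4: crossing the gulf of execution.
        # Stages 5-7: perceive, interpret, evaluate - crossing the
        # gulf of evaluation.
        if goal_met():
            return True    # the intended change was achieved
    return False           # goals still not met; we gave up

# e.g. checking sports results: the page loads on the second attempt.
attempts = {"n": 0}
def try_load_page():
    attempts["n"] += 1

pursue("check sports results",
       goal_met=lambda: attempts["n"] >= 2,
       execute=try_load_page)
```

The point of the two gulfs is that either crossing can fail: `execute` may not do what we intended, and `goal_met` may be hard to judge from what the system shows us.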
Figure 23.2 Norman's seven-stage model of activity (Source: After Norman, 1988)

Why HIP is not enough

While the human information processing (HIP) account of cognition proved to be popular both within psychology and in the early years of HCI, this popularity has diminished dramatically in recent years, for the following reasons:

• It is too simple. We are much more complex and cannot be represented meaningfully as a series of boxes, clouds and arrows. Human memory is not a passive receptacle; it
is not analogous to an SQL database. It is active, with multiple, concurrent and evolving goals, and is multimodal. Visual perception has very little in common with a pair of binocular cameras connected to a computer. Perception exists to guide purposive action.
• HIP arose from laboratory studies. The physical and social contexts of people are many and varied and conspicuous by their absence from these diagrams.
• HIP models assume that we are alone in the world. Human behaviour is primarily social and hardly ever solitary. Work is social, travel is usually social, playing games is often social, writing a document (e-mail, assignment, book, text message, graffiti) is social as it is intended to be read by someone else. Where are these people represented in the block and arrow models of cognition?

These models are very clearly incomplete as they omit important aspects of human psychology such as affect (our emotional response); they also fail to notice that we have bodies.

FURTHER THOUGHTS
Creativity and cognition

Human abilities of creativity offer something of a challenge to cognition. Where do new ideas come from? How do we make the jump in thinking that is characteristic of being creative? There are, of course, many views on this and it is most likely that emotion, social interaction, intention and volition all have a part to play. One distinction that has been made is between convergent and divergent thinking (Guilford, 1967). Convergent thinking is directed at finding the best solution to a problem whereas divergent thinking is concerned with bringing diverse ideas together and exploring many unusual ideas and possibilities. (Many design techniques for encouraging divergent thinking such as brainstorming are described in Chapters 7 and 9.)

23.2 Situated action

The late 1980s and 1990s saw the rise of criticisms of the classic cognitive psychological accounts such as HIP.
For example, Liam Bannon argued for studying people outside the confines of the psychology laboratory, while Lucy Suchman criticized the idea that people follow simple plans in her ground-breaking book Plans and Situated Actions in 1987 (second edition, 2007). This showed that people respond constructively and perhaps unpredictably to real-world situations.

In 1991 Bannon published a paper entitled 'From human factors to human actors' (Bannon, 1991). The paper was a plea to understand the people using collaborative systems as empowered, problem-solving, value-laden, cooperative individuals rather than mere subjects in an applied psychology experiment. In adopting this new perspective we necessarily move out of the laboratory and into complex real-world settings. The argument highlights the differences in perception between treating people as merely a set of cognitive systems and subsystems (which is implied by the term human factors) and respecting people as autonomous participants, actors, with the capacity to govern their own behaviour. From this point Bannon explores the consequences of this change in perspective. He argues that it involves moving from narrow experimental studies of individual people working on a computer system to the social setting of the workplace. This too would require changes in techniques, from a cognitive and experimental
approach to less intrusive techniques with perhaps an emphasis on the observational. Once in the workplace we should study experts and the obstacles they face in improving their practice or competence. There is a need to shift from snapshot studies to extended longitudinal studies. Finally, Bannon argues that we should adopt a design approach that places people at the centre of the design process through participative design approaches. (There is more about participative design in Chapter 7; see also the discussion of collaboration in Chapter 16.)

It is said that Lucy Suchman's Plans and Situated Actions is the book most widely quoted by researchers of collaborative system design. The book is a critique of some of the core assumptions of artificial intelligence (AI) and cognitive science, specifically the role of plans in behaviour, but in doing so it opened the door to ethnomethodology and conversation analysis in HCI. Suchman's starting point - before refuting the planning approach - is to identify the role of planning in AI and the belief of cognitive psychology that this is true of human behaviour too. Simply put, both human and artificially intelligent behaviour can be modelled in terms of the formulation and execution of plans. A plan is a script, a sequence of actions.

'Going for a curry' script

As the national dish of the UK is said to be chicken tikka masala (a kind of creamy curry), let us think about how the British enjoy this dish. The scene is a Saturday night in any city in the UK. Our potential curry enthusiasts (let's call them students for the sake of argument) meet and then drink a great deal of lager which creates an irresistible desire for a curry. The second step is to locate an Indian restaurant. Having gained entry to the restaurant one of the students will ask a waiter for a table. The waiter will guide the party to a table, offering to take their coats as they go.
Next the waiter will give each student a copy of the menu, suggesting that they might like to have a lager while choosing from the menu. The students then decide what they want to eat and order it from the waiter, stopping only to argue over how many poppadums, chapattis or naan breads they want (these are common forms of Indian breads eaten prior to or with a curry). The curry is then served and consumed. Everyone being sated, one of the students asks for the bill. After 20 minutes of heated argument about who had ordered what, the students finally settle the bill and hurry home to ensure that they have a good night's sleep.

Researchers Schank and Abelson (1977) were the first to identify scripts as being a credible means by which we organize our knowledge of the world and, more importantly, as a means of directing our behaviour (i.e. planning). The advantage of scripts is that they can be adapted to other situations. The above Indian restaurant script is readily adapted for use in a Thai, Chinese or Italian restaurant (i.e. find restaurant, get table, read menu, order food, eat food, pay for meal) and easily adapted to, for example, hamburger restaurants (the order of get table and order food is simply reversed).

Plans are formulated through a set of procedures beginning with a goal, successive decomposition into subgoals and into primitive actions. The plan is then executed. A goal is the desired state of the system. (Chapter 25 on navigation also criticizes the traditional view of plans.)

The problems with the planning model as identified by Suchman included the observations that the world is not stable, immutable and objective. Instead it is dynamic and interpreted (by us) and the interpretation is contextual or 'situated'.
Thus plans are not executed but rather they are just one resource which can shape an individual’s behaviour.
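The restaurant script described in the box can be thought of as a small, reusable data structure: a base sequence of actions plus a rule for adapting it to new settings. The sketch below is illustrative only; the `adapt` function and the swap rule are invented for the example, not part of Schank and Abelson's formulation.

```python
# The generic restaurant script from the chapter, as a list of steps.
BASE_SCRIPT = ["find restaurant", "get table", "read menu",
               "order food", "eat food", "pay for meal"]

def adapt(script, swaps=()):
    """Return a copy of the script with the given pairs of steps swapped."""
    adapted = list(script)            # never mutate the base script
    for a, b in swaps:
        i, j = adapted.index(a), adapted.index(b)
        adapted[i], adapted[j] = adapted[j], adapted[i]
    return adapted

# The same script serves a Thai, Chinese or Italian restaurant unchanged...
curry_script = adapt(BASE_SCRIPT)

# ...while a hamburger restaurant simply reverses two steps: you order
# before you find a table.
burger_script = adapt(BASE_SCRIPT, swaps=[("get table", "order food")])
```

Suchman's critique, of course, is precisely that people do not run such structures mechanically; the script is one resource among many that a situation can override.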
Challenge 23.2
How do the current generation of graphical user interfaces support behaviour that does not rely on planning?

23.3 Distributed cognition

On 20 July 1969, astronauts Neil Armstrong and Buzz Aldrin landed on the Moon. At Mission Control, Charlie Duke followed the process closely (along with 600,000,000 other people listening on radio and watching on TV). What follows is a transcript of the last few seconds of the landing of the Lunar Module (LM).

Aldrin: '4 forward. 4 forward. Drifting to the right a little. 20 feet, down a half.'
Duke: '30 seconds.'
Aldrin: 'Drifting forward just a little bit; that's good.'
Aldrin: 'Contact Light.'
Armstrong: 'Shutdown.'
Aldrin: 'Okay. Engine Stop.'
Aldrin: 'ACA out of detent.'
Armstrong: 'Out of detent. Auto.'
Duke: 'We copy you down, Eagle.'
Armstrong: 'Engine arm is off. Houston, Tranquillity Base here. The Eagle has landed.'
Duke: 'Roger, Tranquillity. We copy you on the ground. You got a bunch of guys about to turn blue. We're breathing again. Thanks a lot.'

The question is, who landed the spacecraft? History records that Neil Armstrong was the mission commander while Buzz Aldrin was the LM pilot. While Armstrong operated the descent engine and control thrusters, Aldrin read aloud the speed and altitude of the LM ('4 forward', that is, we are moving forward at 4 feet per second), while 250,000 miles away back on Earth, Duke confirmed the quantity of fuel remaining (30 seconds). So who landed the LM? In a very real sense they all did: it was a joint activity.

Ed Hutchins has developed the theory of distributed cognition to describe situations such as this (Hutchins, 1995). The theory argues that both the cognitive process itself and the knowledge used and generated are often distributed across multiple people, tools and representations.
Everyday examples would include:

• A driver and passenger navigating a foreign city using maps and road signs
• The homely task of shopping with the aid of a list and the reminders presented by the supermarket shelves
• Colleagues rationalizing a project budget using an Excel spreadsheet and some unintelligible printouts from Finance.

Internal and external representation

In distributed cognition, resources include the internal representation of knowledge (human memory, sometimes called knowledge in the head) and external representations (knowledge in the world). This is potentially anything that supports the cognitive activity, but instances would include gestures, the physical layout of objects, notes, diagrams, computer readings and so forth. These are just as much part of the activity
as cognitive processes, not just memory aids (Zhang and Norman, 1994). Hutchins has studied distributed cognition in a range of team-working situations, from Pacific islanders wayfinding between far distant islands to the navigation of US naval ships to aircraft cockpits. In a study of how pilots control approach speeds (aircraft landing speeds), Hutchins (1995) suggests that the cockpit system as a whole in a sense 'remembers' its speed. He argued that the various representations in the cockpit, their physical location and the way in which they are shared between pilots make up the cockpit system as a whole (Figure 23.3).

Figure 23.3 A typical example of distributed cognition in an aircraft cockpit (Source: Mike Miller/Science Photo Library)

The representations employed by the pilots include what they say, charts and manuals, and the cockpit instruments themselves. There is also a whole host of implicit information in aircraft cockpits, such as the relative positions of airspeed indicators and other dials. Hutchins also notes that the various representational states change with time and may even transfer between media in the course of the system's operation. These transformations may be carried out by an individual using a tool (artefact) while at other times representational states are produced or transformed entirely by artefacts.

Different ways in which processes might be distributed

When these principles are applied in the wild, three different kinds of distribution emerge (Chapter 18 discusses distributed information):

• Cognitive processes may be distributed across the members of a social group
• Cognitive processes may involve coordination between internal and external structures
• Processes may be distributed through time in such a way that the products of earlier events can transform the nature of later events.
All in all, distributed cognition offers an excellent means of describing how complex systems operate which is well supported by empirical evidence. However, translating
these descriptions into the design of interactive systems remains problematic. Hollan et al. (2000) showed how insights from a distributed cognition perspective have guided the design of their PAD++ system, but such examples remain few.

23.4 Embodied cognition

James Gibson is best known in HCI design circles as the man who gave us the concept of affordance. An affordance is a resource or support that the environment offers an animal; the animal in turn must possess the capabilities to perceive it and to use it.

The affordances of the environment are what it offers animals, what it provides or furnishes, for good or ill. (Gibson, 1977)

Examples of affordances include surfaces that provide support, objects that can be manipulated, substances that can be eaten and other animals that afford interactions of all kinds. This all seems quite remote from HCI, but if we were able to design interactive systems which immediately presented their affordances then many, if not all, usability issues would be banished: people would perceive the opportunities for action as simply as they recognize they can walk through a doorway. (Affordance is one of the usability principles described in Chapter 4.)

The properties of these affordances for animals are specified in stimulus information. Even if an animal possesses the appropriate attributes and equipment, it may need to learn to detect the information and to perfect the activities that make the affordance useful - or dangerous if ignored. An affordance, once detected, is meaningful and has value for the animal. It is nevertheless objective, inasmuch as it refers to physical properties of the animal's niche (environmental constraints) and to its bodily dimensions and capacities. An affordance thus exists, whether it is perceived or used or not. It may be detected and used without explicit awareness of doing so.
This description was revised in 1986 when Gibson wrote:

An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer. (Gibson, 1986, p. 129)

So affordances are (confusingly) neither and both in the world and in the mind of the observer. Figure 23.4 is an illustration of a door. Opening a door is probably the most widely cited example of an affordance in action. The argument is that we 'see' that we can either push or pull open the door from the affordances of the door itself. This might work well for doors, but does it apply to the design of interactive systems?

Challenge 23.3
Find some more affordances in everyday objects. How is the affordance 'presented'? Also try to identify apparent affordances that are misleading.

Donald Norman, who was instrumental in introducing the concept of affordance to HCI, recognized that Gibson's formulation of the concept needs some revision. He has argued that we need to replace the original biological-environmental formulation
Figure 23.4 An affordance in the world (Source: Phil Turner)

with a definition that is at one remove, namely perceived affordance (Norman, 1988). He has suggested that the concept of affordances can be extended to a weaker formulation: people are said to perceive the intended behaviour of the interface widgets such as the knobs and dials of a range of software applications. These intended and perceived behaviours are usually very simple, including sliding, pressing and rotating. He continues,

real affordances are not nearly as important as perceived affordances: it is perceived affordances that tell the user what actions can be performed on an object and, to some extent, how to do them. [Perceived affordances are] often more about conventions than about reality. (Norman, 1999, p. 123)

He gives a scrollbar as an example of such a convention. Figure 23.5 is a screenshot with a number of perceived affordances present. The slider labelled 'Dock Size' affords sliding; the radio buttons ('Position on screen') afford selecting; but does the check box 'Animate opening applications' really afford checking? What does 'checking' really mean? Are these really affordances or just conventions?

Despite the difficulties in precisely stating what an affordance is, as a concept it is enormously popular amongst researchers. The use of the term affordance in anthropology is not unusual (e.g. Cole, 1996; Wenger, 1998; Hollan et al., 2000). However, what may be surprising is the extravagant use of the term, going well beyond Gibson's modest conceptualization. Cole (1996), for example, identified a range of affordances offered by a variety of mediating artefacts including the life stories of recovering alcoholics in an AA meeting (affording rehabilitation), patients' charts in a hospital setting (affording access to a patient's medical history), poker chips (affording gambling) and 'sexy' clothes (affording gender stereotyping).
Cole notes that mediating artefacts embody their own 'developmental histories' that are a reflection of their use. That is, these artefacts have been manufactured or produced and continue to be used as part of, and in relation to, intentional human actions.
Figure 23.5 A perceived affordance at the user interface

In Where the Action Is Paul Dourish develops his ideas on the foundations of embodied interaction (Dourish, 2001). The embodied interaction perspective considers interaction 'with the things themselves'. Dourish draws on the phenomenological philosophy of such writers as Heidegger, Husserl and Merleau-Ponty and recent developments in tangible computing and social computing to develop a theory of embodied interaction. For Dourish, phenomenology is about the tight coupling of action and meaning.

Enactive interaction

Enactive interaction is direct, natural and intuitive, based on physical and social experiences with the world. Bruner describes three systems or ways of organizing knowledge and three corresponding forms of representation of the interaction with the world: enactive, iconic and symbolic (Bruner, 1966, 1968). Enactive knowledge is constructed on motor skills, enactive representations are acquired by doing, and doing is the means for learning in an enactive context. In order to give rise to believable experiences with enactive interfaces it is necessary to respect certain conditions of the interaction with the real world, such as the role played by action in the shaping of the perceptual content, the role of active exploration and the role of perception in the guidance of action. The close coupling of the perception-action loop is hence a key characteristic of enactive interfaces.

Embodied interaction is concerned with two main features: meaning and coupling. Meaning may be about ontology, inter-subjectivity or intentionality.
Ontology is concerned with how we describe the world, with the entities and relationships with which we interact. Inter-subjectivity is about how meaning can be shared with others. This involves both the communication of meaning from designers to other people, so that the system can reveal its purpose, and the communication between people through the system. The third aspect of meaning is intentionality. This is to do with the directedness of meaning and how it relates one thing to another.