International Journal of Arabic-English Studies (IJAES) Vol. 15, 2015
Zareva and Shehata, At the Intersection of L1 Congruence ...

Appendix C: Receptive test of collocational knowledge

Circle the number of the sentences that do NOT contain collocations (a collocation is an expression where the words frequently occur together, e.g., best friend).

1. I should go to a dentist to fix my artificial teeth.
2. After the death of his son, Mark had a heart attack.
3. Eating soup at the start of a meal fills the stomach.
4. Are you having second thoughts about coming with me to Brighton?
5. The artist was not painting for a wide public.
6. Tom’s wife gave birth to a son yesterday.
7. If you do not take the short cut, the hotel is four miles further down the road.
8. She had forgotten how hot blood could run.
9. It will do you good if you exercise more often.
10. The main political parties in the U.S. are the Democrats and the Republicans.
11. She rarely wears makeup and is usually pretty shy.
12. The robbery took place at about 3:30 a.m. yesterday.
13. Today is your last chance to submit your paper.
14. We think that we still must do an effort to avoid the such mistakes in the future.
15. What difference does it make if your friend does not have a car?
16. Many people die of old age around the world.
17. Last July, Mike made the mistake of going to work on a strike day.
18. The growing generation is the nation’s hope for building its future.
19. That's a horrifying image that doesn't leave the mind easily.
20. Although no executions were ordered until 1980, the state reestablished capital punishment in 1982.
21. The lantern was kicked and the barn caught fire.
22. You need to run more vitamins and minerals in your diet.
23. McDonald's is one of the largest fast food chains in the U.S.
24. Don’t lie, just tell the truth!
25. The 1930s and 1940s are considered the golden age of Hollywood.
26. The boxer gave him a black eye, so he was taken to the hospital.
27. This book describes ten ways to take advantage of the Internet.
28. Coca Cola mainly produces soft drinks rather than juices or water.
29. Such negative publicity puts extra risk into her career as a politician.
30. The term fine arts is used to refer to the visual arts such as painting and architecture.
31. Could you keep an eye on my bag for a while?
32. The heavy traffic made me late for my appointment.
33. Fixing false limbs has now become possible.
34. Inborn abilities always have an effect on what we become.
35. There were heavy rain and strong winds during the afternoon storm.
36. Governments should take the necessary actions to stop the massacre.
37. This tourist speaks broken English.
38. She spends a lot of her time reading.
39. Everyone knows that a little white lie is sometimes necessary in a time of crisis.
40. Parents can play a role in preventing childhood obesity.
41. Politicians are trying to influence the public opinion on the topic.
42. It's true that we gain weight when we eat more than we should.
43. One of the advantages I had in getting this success a little later is that I’d seen the mistakes other people had made.
44. China hopes to grow its middle class to more than half of its total population by 2020.
45. It usually takes time to change laws.
46. It was a quiet residential area with many family homes and a few businesses.
47. Do you think there is a chance that John will change his mind?
48. A wide imagination stimulates the thinking process and the ability to be creative.
49. If you take my advice, you'll stop seeing him.
50. Bureaucracy and red tape are the real problems in this company.
International Journal of Arabic-English Studies (IJAES) Vol. 15, 2014

Contextual Considerations in the Use of Synonymous Verbs: The Case of Cease, End, Finish and Stop

Aziz Thabit Saeed
Sana’a University, Yemen

Abstract: This corpus-based study endeavors to explore the semantic differences among the verbs cease, finish, end and stop when used in different contexts. The author analyzed a set of data that comprised 500 contextualized instances in which these verbs are used. The data were culled from different sources including BNC, novels and spontaneous speech. In addition to the syntactic constraints that govern the use of these verbs, the findings of the study reveal a number of contextual features that distinguish them from one another, including completeness, finality, conclusiveness, animacy, abruptness, gradualness, among others.

Keywords: context, semantics, synonymous verbs, verbs of ending.

1. Introduction

Synonymous terms tend to be one of the rather intricate areas that ESL/EFL learners stumble upon in the process of their learning the language. Many studies have pointed to the difficulty that such learners encounter when dealing with synonyms. Khuvasanond et al. (2010), for instance, found that “synonyms were more difficult for ESL/EFL to produce than antonyms” (p. 180). More often than not, ESL/EFL learners tend to question the significance of having more than one term to express a certain activity and wonder whether there exist some semantic differences among them or not. Unfortunately, teachers, dictionaries, and textbooks do not always offer answers to such learners’ legitimate queries. The terms cease, end, finish and stop are a case in point. They are among the synonymous verbs that are very frequently used in daily communication and which, at times, constitute a difficulty for English learners.
This study, therefore, aims at exploring the semantic differences that can manifest themselves among the verbs cease, finish, end and stop when used in different contexts. More specifically, the study seeks answers to the following two questions:
1. What are the features that differentiate these verbs from one another?
2. To what extent can context help in deciphering the meaning of these synonymous verbs?

2. Review of literature

The notion of synonymy has been the focus of many studies including Lyons (1977), Lyons (1995), Saeed (1979), Palmer (1981), Cruse (1986), Gregory (2000), Kearns (2000), among others. In scrutinizing the myriad of definitions
that this notion has received, one could find that virtually all of them categorize as synonyms any two word forms that share a sense or many senses. Gregory (2000), for instance, maintains that 'if two word forms share at least one word sense, then they are synonyms' (p. 2). On the other hand, none of the definitions associates synonymy with absolute interchangeability. Although Lyons (1995: 60-61) defines synonymy as expressions that have “the same meaning,” the word ‘same’ does not imply identicalness, for he, in the same context, argues that “absolute synonymy is extremely rare” (emphasis added). Saeed (1979: 65) points out that 'true or exact synonyms are rare'. Kearns (2000) holds the same view, arguing: 'true lexical synonymy is rare' (p. 10). Palmer (1981) asserts that 'there are no real synonyms.' He also contends that 'no two words have exactly the same meaning. Indeed, it would seem unlikely that two words with exactly the same meaning would both survive in a language' (p. 89). Palmer ascribes the plethora of synonyms in English to a historical reason, pointing out that English vocabulary 'has come from two different sources, from Anglo-Saxon on the one hand and from French, Latin and Greek on the other' (p. 88). These views attest to the problematic nature of the concept of synonymy. Dictionaries, as will be illustrated below, do not do much to help learners. Palmer maintains that 'dictionaries, unfortunately, (except the very large ones), tell us little about the precise connections between words and their defining synonyms or between the synonyms themselves' (p. 91). One of the approaches that have been resorted to as a way of tackling the problem posed by synonyms is to study them in context. Only through context can one decipher the areas of closeness and areas of overlap in any pair or group of supposed synonyms, and this is, indeed, what many studies have attempted to do.
Some of these studies include Rea (1968), Freed (1979), Riddle (1989), Clift (2003), Saeed and Fareh (2006), Wang (2009), Phoocharoensi (2010), Shen (2010), Chung (2011), among others. Rea (1968) explored the uses of lend and loan in American English, while Freed (1979) investigated, among other things, the differences between start and begin. Riddle (1989) attempted a semantic analysis of three transition expressions that tend to be perceived by learners as synonymous, namely however, nevertheless, and in spite of, demonstrating how a detailed lexical semantic analysis of such highly utilized synonymous transitions could help EFL learners decipher their various meanings and range of uses. Clift (2003) examined the use of actually and in fact. Her data revealed “interactional differences” between these two items, which are virtually always considered synonyms. These differences may, as she puts it, “be identified with reference to the position of each in a turn-at-talk, and the composition of that turn” (p. 182). Saeed and Fareh (2006) explored the contextual factors that govern the use of the verbs steal, rob and burglarize in authentic contexts “in an attempt to identify the semantic and syntactic constraints that differentiate them from one another” (p. 323). Their investigation delineated a number of semantic features that determine which verb to choose in a given context. These features include location of activity, object of activity, and manner of action. Wang (2009) explored “the different meaning and usages between the two Chinese
near-synonyms verbs of running: Ben and Pao…” He identified some semantic patterns that govern the use of these two verbs. Such patterns, as he puts it, can provide non-native speakers of Chinese “with guidelines to use the words appropriately” (p. 399). Phoocharoensi (2010) studied the lexical, syntactic, and stylistic information of five synonyms: ask, beg, plead, request, and appeal. His analysis revealed that 'despite being similar in core meaning, these words in reality differ in some particular details or senses of meaning, connotations, styles, dialects, grammatical patterns, and collocations' (p. 243). Shen (2010) used the synonymous pair glad and happy to investigate Chinese EFL learners' errors in the use of these terms in writing. He states: 'English has many words that are considered synonymous and Chinese EFL (English as Foreign Language) learners tend to mix them up and make certain kinds of errors' (p. 1). Chung (2011) examined, using corpus-based data, the similarities and differences between create and produce. The author found that 'although many of the senses of the two verbs, and even their object types, might seem unrelated, they could be linked through the notion of PRODUCT.' She proposed an explanation 'encompassing the non-discrete semantic features of ‘create’ and ‘produce’ and discussed the literal and/or metaphorical extensions of PRODUCTS of both verbs' (p. 419).

Despite the number of studies on synonymous verbs, none has attempted to explore the distinguishing features that characterize the four synonymous verbs in the study. The only study whose focus was found to be slightly close to that of our study is that of Nagy (2006). Nagy attempted a description of what he calls “aspectualizers,” drawing on Freed's study (1979). He focused on aspectualizers 'expressing initiation (begin vs.
start), continuity (continue, keep, resume and repeat), interruption or cessation (stop, quit, cease) and termination (finish, end and complete).' Nagy's treatment discusses some syntactic and semantic constraints in the use of these aspectualizers. Although the study provides interesting observations related to the use of these aspectualizers, it does not provide a comprehensive account of the features that characterize each group of these aspectualizers. In other words, Nagy's account does not go into enough depth to produce plausible generalizations. Our study differs from Nagy's in that it is, among other things, corpus-based.

One source of the difficulty posed by such a group of synonyms, and almost certainly by similar groups, is that they are underdefined by both monolingual and bilingual dictionaries. Both kinds of dictionaries offer incomplete information on their meaning and use, which does not help the learner to use the terms appropriately in ‘a normal range of context,’ to use Riddle's words (see Riddle, 1989). In fact, many researchers including Wang (2009), Saeed and Fareh (2006) and Riddle (1989), among others, have addressed the incomplete nature of the information offered by dictionaries. In his investigation of the Chinese near-synonymous verbs of running, Wang (2009) states: “[T]he definition given by the dictionary is often circular and far from enough to help distinguish near synonymous verbs” (p. 399). Indeed, circularity is manifest in the treatment that most
monolingual and bilingual dictionaries offer to synonymous terms. Table 1 shows the definitions that three monolingual learners’ dictionaries give to the four synonymous verbs under study. These dictionaries are among the ones frequently used by EFL learners. It is not the intention of the author to compare these dictionaries or doubt their usefulness, but rather to show their treatment of the four verbs in the study.

Table 1. Monolingual learners’ dictionaries’ definitions of cease, finish, end and stop

Cambridge Advanced Learner's Dictionary
  Cease: To stop something
  End: To finish or stop, or to make something finish or stop
  Finish: To complete something or come to the end of an activity
  Stop: To finish; To finish doing something; To end…

The Compact Oxford English Dictionary of Current English
  Cease: -
  End: come or bring to an end; finish.
  Finish: bring or come to an end; end by doing something or…
  Stop: come or bring to an end;… cease or cause to cease moving or operating

Longman Dictionary of English Language and Culture
  Cease: To stop (esp. an activity or state)
  End: To (cause to) finish; come to or bring to an end
  Finish: To come or bring to an end; reach the end of (an action or activity)
  Stop: …to (cause to) end

In examining this table, one finds that virtually all these dictionaries define the four verbs by means of each other. Thus, the Cambridge Advanced Learner's Dictionary uses stop to define both cease and end, uses the word finish to define end and stop, and finally uses the word end to define finish and stop. The treatment that The Compact Oxford English Dictionary of Current English offers is more interesting; this dictionary uses the phrase ‘bring to an end’ as a definition of end, finish and stop and offers no definition of cease. The Longman Dictionary of English Language and Culture uses this same phrase to define end and finish.
Such unhelpful treatment of these synonymous words may contribute to the already existing misconception that learners have, namely that such terms can be used interchangeably. It is true that these dictionaries list other meanings of the words; however, learners usually look at the first one or two definitions. Since the first definitions in these dictionaries convey the impression that these verbs mean the same, confusion on the part of learners is inevitable and a misuse of these verbs is to be expected.
Some of these dictionaries, such as the Longman Dictionary of English Language and Culture, present useful notes on usage; however, such notes, though helpful, are not enough. For instance, this dictionary compares the use of end and finish stating: “when used transitively, finish is much more common than end… when used intransitively, finish is more informal than end, but end is commonly used in writing” (p. 421). Such information, though important, does not help learners grasp the various uses and meanings that these verbs can assume.

Bilingual dictionaries are not any better. In fact, the situation here is even thornier. For instance, the Atlas encyclopedic dictionary: English – Arabic, a dictionary whose electronic version is widely used amongst Arab EFL learners, uses the Arabic translation yatawaqqaf ‘to stop’ as one of the primary meanings for both the verbs stop and cease, and uses the Arabic term yantahī ‘to end’ to define both to end and to finish. Other dictionaries such as Al-Mawrid dictionary: English-Arabic and Al-Mughni Al-'akbar Dictionary, two of the most widely used English-Arabic dictionaries, offer more or less similar treatments. Although monolingual dictionaries help the learner by presenting illustrative examples, bilingual dictionaries hardly ever offer examples.

Teachers and vocabulary textbooks do not always do much to tackle the problem either. Studies that have focused on EFL learners’ acquisition of synonymous terms have shown that learners experience difficulties when learning such terms. Fareh’s study (2007) is an example. Fareh explored EFL Arab learners’ acquisition of verbs of saying: say, speak, tell and talk at the level of recognition and production. His study revealed that EFL learners do encounter a great deal of difficulty when learning these synonymous verbs.
Although his subjects' level of competence in English was quite high (500-550 on the TOEFL), the level of mastery of these verbs was remarkably low. He concluded that both teachers and textbooks fall short of providing learners with an adequate treatment that enables them to use these words accurately. In fact, the problem of misusing synonyms can, at times, be observed at the fairly advanced stages of learning the language. The following sentence was written by an MA student.

*I will end my degree in 2011. Then I will go to the States to study for my PhD.

That an MA student majoring in English commits such an error implies that the problem of synonymy can manifest itself even at the relatively advanced stages of learning the language.

The overlapping nature of the four near synonyms in the study necessitates the use of the notion of context as a vital element that motivates the use of a particular verb rather than its near synonym(s). Context here refers to what surrounds the verb in the study: a word or any entities bigger than a word, subjects, objects, etc. This agrees with Werth’s stance regarding context. Werth (1999: 78-79), cited in Requeju (2007:171), states, “The context of a piece of language (…) is its surrounding environment. But this can include as little as the articulatory movements immediately before and after it, or as much as the whole
universe with its past and future”. In addition, the subcategorization properties and selectional restrictions of the verbs are also important in opting for a particular word as opposed to any of its synonyms. Although the focus of the study is semantic-oriented, it is hard to completely ignore the role that syntactic constraints play in licensing the use of a verb rather than its synonymous counterpart. In this regard, Oliver Sacks, 1993 (cited in Wierzbicka, 1996:22) states: “it is increasingly clear, from studying the natural acquisition of language in the child, and, equally, from the persistent failure of computers to ‘understand’… that syntax cannot be separated from semantics.” Partington (1998:3) argues that “the close correlation between the different senses of a word and the structures in which it appears implies that syntactic form and meaning are interdependent in the sense that each helps define the other”.

3. Methodology

3.1 Data collection

In order to recognize and analyze the semantic differences among cease, end, finish and stop, the researcher collected nearly 500 instances in which these verbs were used. The data in the study were drawn from several sources including the British National Corpus (BNC), a novel, newspaper articles and spontaneous speech. Altogether, four hundred (400) examples were randomly drawn from the BNC (i.e. 100 tokens for each term in the study) and one hundred (100) examples were collected from other sources, 25 for each term. The data drawn from the BNC could have been enough; however, the author thought that drawing some instances from other sources should strengthen the reliability of the findings.
To make sure that the balance granted by the data contained in the BNC is retained, the author extracted the additional 100 examples from similar sources and in a balanced manner: 20 tokens from a novel, 60 from articles of different types (educational, political, literary, etc.) and 20 from spontaneous speech. Each verb of the four in the study received the same number of tokens, i.e., 25 each. For instance, each verb received 5 tokens of the 20 examples extracted from the novel, 15 each from the different types of articles, and 5 each from spontaneous speech. The novel used in the study is The Adventures of Sally, by Pelham Wodehouse, which is an e-book available on the Internet. This novel is among the ones that the author had read in the past, and selecting it as a data source was completely arbitrary. As for the newspaper articles, they come from different sources on the Internet.

The author is aware of the fact that this group of synonyms has other members including terminate, pause, quit, suspend, etc., yet the decision to opt for these four is based on the following:
a. These verbs, particularly end, finish and stop, are among the most frequent ones in daily communication, whether this communication is basic daily exchanges or scholarly discourse. In fact, these three verbs are among the first 400 of the 3,500 most common English words listed in the Macmillan Essential Dictionary. (See also “Most Common English Verbs”).
b. Based on the database of the BNC, the frequency of use of these four words is remarkably higher than the use of any other words in the extended family of these synonyms such as terminate and quit, for instance (see BNC).
c. Words such as terminate do not occur as much in simple everyday exchanges but rather in more formal settings or genres. In The Adventures of Sally, a novel that consists of more than 77 thousand words, there is only a single occurrence of terminate. In this novel, the word stop, on the other hand, is used 45 times as a verb.
d. It is true that occurrences of the word cease are not as frequent as is the case with the other three members, yet including this word in the study stems from the fact that virtually all thesauri cite stop, finish and end as primary synonyms of this word. On the other hand, a number of thesauri do not include the word quit as a primary synonym of any of the four words in the study.
e. Finally, although the verb complete is a synonym of both end and finish, it is not a synonym of cease or stop. Therefore, it was not included in the group in the study, for the four verbs in the study are listed by virtually all thesauri as synonyms to one another (See http://www.thefreedictionary.com/).

In drawing a random sample for the verbs in the study from BNC, the author used all forms of the verb as search items, for instance, stop, stops, to stop, stopping, stopped, etc. The same approach was applied when drawing data from the other sources in the study, i.e., the novel, articles, etc. Then the collected data items were numbered, and the nth item was selected.

3.2 Data analysis

In analyzing the data, the author accorded special attention to context. As indicated above, context plays a vital role in enabling language users to opt for a certain term as opposed to its synonyms.
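The sampling step described in Section 3.1 (searching on all forms of a verb, numbering the hits, and selecting the nth item) can be illustrated with a minimal Python sketch. This is not the author's actual procedure or tooling: the form lists, the function names find_hits and every_nth, and the sampling interval are all assumptions made for illustration, since the paper does not state the value of n or how the search was run.

```python
import re

# Hypothetical form lists. The paper names only the forms of "stop";
# the other three lists are filled in by analogy.
FORMS = {
    "cease": ["cease", "ceases", "ceasing", "ceased"],
    "end": ["end", "ends", "ending", "ended"],
    "finish": ["finish", "finishes", "finishing", "finished"],
    "stop": ["stop", "stops", "stopping", "stopped"],
}


def find_hits(sentences, lemma):
    """Return the sentences containing any listed form of the lemma as a whole word.

    A plain string search cannot tell the verb "end" from the noun "end",
    so a real study would still need manual or tagged filtering.
    """
    alternatives = "|".join(re.escape(form) for form in FORMS[lemma])
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


def every_nth(items, n, start=0):
    """Systematic sample: keep every nth item, beginning at index `start`."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return items[start::n]


if __name__ == "__main__":
    corpus = [
        "The band stopped playing and everybody left.",
        "He checked his stopwatch.",
        "Please stop drinking your orange juice.",
        "She stops at nothing.",
    ]
    hits = find_hits(corpus, "stop")
    for sentence in every_nth(hits, 2):
        print(sentence)
```

Selecting at a fixed interval from a numbered list is usually called systematic sampling; whether it behaves like a random sample depends on the order of the hits, which the paper does not describe.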
Other important aspects that were considered in the analysis include the subcategorization properties and selectional restrictions of the verbs, which also play a significant role in restricting the use of a particular word as opposed to any of its synonyms. When the author completed his analysis, he consulted two colleagues who are native speakers of English to see whether they agreed with his judgments and conclusions. The author gave them the paper to read and asked them to comment on his judgments. Both of them agreed with virtually all the judgments he made based on his analysis of the data. In the few cases where both of them did not find the explanation he offered plausible, the author reexamined the data and attempted to find other explanations. Before embarking on the discussion, it should be made clear that this analysis will not discuss the metaphorical meaning of these verbs nor their collocations or idiomatic usage, i.e. phrasal verbs, prepositional
verbs, and phrasal prepositional verbs, since these constitute separate studies in themselves.

4. Discussion

As indicated above, the data in the study comprised 500 instances exemplifying the use of the four verbs in the study, 125 items representing each. Before discussing the major considerations that govern the use of these verbs, it is worth highlighting briefly the features that these verbs require in their subjects and objects, as the findings revealed. Table 2 summarizes these features.

Table 2. Features that the verbs in the study require in their subjects and objects

Category  Feature            Cease         End           Finish        Stop
Subject   +animate           31 (24.8%)    35 (28%)      83 (66.4%)    102 (81.6%)
Subject   -animate           94 (75.2%)    90 (72%)      42 (33.6%)    23 (18.4%)
Object    +animate           5 (4%)        0 (0%)        20 (16%)      60 (48%)
Object    -animate           14 (11.2%)    64 (51.2%)    68 (54.4%)    39 (31.2%)
Object    +action-oriented   9 (7.5%)      18 (14.4%)    15 (12%)      52 (41.6%)
Object    -action-oriented   0 (0%)        30 (24%)      73 (58.4%)    5 (4%)
Object    No object          106 (84.8%)   61 (48.8%)    37 (29.6%)    26 (20.8%)

As the table shows, the subjects of the verb cease are mostly (-animate), whereas the subjects of the verb stop are mostly (+animate). On the other hand, as many as 106 of the 125 instances exemplifying the use of cease occur with no object, compared with only 26 in the case of stop. The feature of animacy is also important in the case of the verbs end and finish. While 83 of the subjects that the verb finish co-occurs with are +animate, only 35 of the subjects that the verb end co-occurs with are +animate. Sixty-one (61) of the examples that contain the verb end show that the verb is intransitive. We will show the importance of the features animate vs. inanimate and action vs. non-action NP object in determining the use of a verb rather than its synonymous counterparts after highlighting the contextual-oriented factors which emerged as the major governing elements in the use of these verbs.
In the discussion below, the first verbs in the examples (the ones in bold) are the original ones, i.e., the ones used in the data; the other ones are made up for discussion purposes.
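As a reading aid for Table 2, the percentages can be recomputed from the raw counts. The short Python sketch below assumes that every percentage is a share of the 125 instances collected per verb, which is how the surrounding text reads; the names COUNTS and percentages are illustrative and not part of the study.

```python
# Counts transcribed from Table 2; each verb is represented by 125 instances.
TOTAL_PER_VERB = 125
VERBS = ("cease", "end", "finish", "stop")

COUNTS = {
    "subject +animate": (31, 35, 83, 102),
    "subject -animate": (94, 90, 42, 23),
    "object +animate": (5, 0, 20, 60),
    "object -animate": (14, 64, 68, 39),
    "object +action-oriented": (9, 18, 15, 52),
    "object -action-oriented": (0, 30, 73, 5),
    "no object": (106, 61, 37, 26),
}


def percentages(counts, total=TOTAL_PER_VERB):
    """Express each count as a percentage of `total`, rounded to one decimal."""
    return tuple(round(100 * count / total, 1) for count in counts)


if __name__ == "__main__":
    for feature, counts in COUNTS.items():
        cells = "  ".join(
            f"{verb}: {pct:5.1f}%" for verb, pct in zip(VERBS, percentages(counts))
        )
        print(f"{feature:<26}{cells}")
```

Under this assumption the recomputed values agree with the printed ones, with one exception: 9 of 125 works out to 7.2%, whereas the table gives 7.5% for cease with action-oriented objects. The subject rows sum to 125 for each verb, as do the +animate, -animate and no-object rows taken together.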
4.1. Completion of an activity

One of the features that the findings reveal as determinant in distinguishing these verbs from one another is the completion of an activity, as the following example illustrates.

(1) I finished/stopped/*ended painting today. (Walden, 2011)

Although the two verbs finish and stop share the feature of transitivity, among other things, their impact on their objects differs. With the verb finished, the sentence implies the completion of the activity of painting that presumably started at some point in the past, and was completed at some point before the time of speaking. With the verb stop, however, the sentence, given the limited context here, could imply that the painting process was in motion, but for some reason it was stopped, i.e. something caused it to stop before its completion. Therefore, painting has not finished yet, and it is possible that it will resume at some point in the future. Putting this sentence in a bigger context, however, may affect the acceptability of the verb stop.

(2) I thought I'd introduce you to my most recent print that I finished/?stopped/*ended painting today! (Walden, 2011)

Here the additional context makes the use of stop rather improper, for the speaker wants to show the addressee the ‘finished version’ of his or her most recent print, which clearly indicates that the process of painting has been completed. The following example illustrates this point further:

(3) His opponent was one of the Shadow forge figures I finished/*stopped/*ended painting today, bringing my gladiator accumulation up to 67 figures. I think that's it though - I don't have any unpainted gladiator figures around now - I shall have to buy some more. (Hero Defeated, 2014)

In this example, the verb stop is not acceptable, for the painting is definitely completed, and the protagonist now has 'no unpainted gladiator figures,' as his statement affirms.
The verb end is not acceptable, either, for with end, the activity of painting is brought to a close, complete or incomplete, but since we are certain that the painting is complete, using end is not felicitous. Besides, this verb is not acceptable for syntactic reasons, since it, unlike the verbs finish and stop, does not normally select for a gerund. It becomes acceptable in sentence (1) if it is paraphrased to become: I ended the painting, wherein the gerund is replaced by a noun, but here the meaning will be ‘ending the painting process.’ Ending the painting might have been a decision on the part of the speaker, for instance. The sentence I ended the painting could continue as follows:

(4) I ended the painting today. The lady I was working for kept interfering and suggesting modifications. I couldn't take it anymore – I said to her 'that’s it – I end it here.'

Just as the verb finish entails the completion of an activity, it also implies the completion of a process.

(5) You finish your orange juice and we'll go and watch the rest of the race. (BNC)
(6) “At least finish your course, get your degree,” Aunt Kit cried. (BNC)

In these examples, the interlocutors addressed are encouraged to take the time to finish what they are engaged in, i.e., finishing their orange juice or the course. Using the verbs stop and end would not be fitting here, either grammatically or semantically; i.e., we do not stop or end our intake but rather stop consuming it, for 'intake' implies an amount of something, i.e., a static entity, whereas consuming it is a process. (See discussion below about the kind of complement the four verbs in the study require). The sentence becomes appropriate with the verb stop if it is paraphrased to become:

(7) Stop drinking your orange juice...

Here the object is an action-oriented NP, namely drinking (more about the role of the object in verb choice is below). In the case of (7), we know that drinking is not complete. The verb finish, however, is appropriate here, albeit with a different meaning. It indicates the complete consumption of the object – the completion of the whole can or bottle of orange, rather than some of it. The verb end is not appropriate for, as indicated above, the verb cannot be followed by a gerund, and semantically, we do not end our intake. These examples illustrate clearly the fact that the verb finish implies completion of an activity or completion of a process (of something), whereas the verb stop indicates a pause before completing an activity or a process.

An important element in this generalization is the fact that the complement/object has to involve duration. If the activity, action or process does not involve duration, neither stop nor finish can be appropriate. Consider the following example cited in Freed (1979):

*The guest stopped/finished arriving.

Here neither stopped nor finished is appropriate because of the nature of the complement that follows the verbs, which is here an achievement activity.
In this respect, Freed (1979) argues that since achievements “are over as soon as they have begun, they cannot be stopped (or interrupted), and since finish also requires its complement to denote an eventuality with duration, they cannot occur with finish either” (p. 28). Note, however, that this sentence becomes acceptable if the subject is in the plural form.

(8) The guests stopped/ finished arriving at 8:45.

Here the arriving of the ‘guests’ does take some time, and thus both finish and stop can be used.

To sum up, we can say that the verb finish implies completion of an activity or a process that involves duration, whereas the verb stop indicates a pause before completing an activity or a process. As for end, it implies either completion of an activity or a decision to stop for good. Syntactically, the selectional restrictions of the verb and the number of the subject, as in example (8) above, can play a role in the acceptability of a verb.

4.2. Temporary discontinuity

The verb stop implies a momentary pause or temporary discontinuity of an event, after which the activity may resume.
International Journal of Arabic-English Studies (IJAES) Vol. 15, 2014

(9) He has been so busy in the last two or three months that he has been forced to temporarily stop/ ?finish/ *end promoting his products. BNC

In this example, we assume that the protagonist is usually actively engaged in promoting his products, but in the last three months he has been so busy that he was forced to stop temporarily. Once the unexpected circumstances that have made him busy come to an end, he will resume promoting his products. The sentence with the verb finish sounds odd, for things do not finish ‘temporarily’: finishing is once and for all. In fact, even without the adverb ‘temporarily’ in the sentence, the statement remains clumsy, for it will mean that this promotion process comprises some steps that can be carried out over a certain period of time. The verb end in (9) is inappropriate syntactically, since, as indicated above, it does not select for a gerund. However, if the sentence is paraphrased as

(10) He has been so busy in the last two or three months that he has been forced to temporarily end the promotion of his products

it becomes fairly acceptable syntactically, yet remains semantically clumsy due to the presence of the adverb ‘temporarily.’ The use of end becomes appropriate when this adverb is removed from the sentence.

(11) He has been so busy in the last two or three months that he has been forced to end the promotion of his products.

4.3. Abruptness vs. gradualness

Another feature that can help distinguish the verbs in the study from each other is the contrast between abruptness and gradualness. The feature of abruptness can be illustrated by means of the following example:

(12) The band finished /stopped playing and everybody left.

In the case of finished, we are certain that the band played all the pieces of music they had prepared for the occasion and stopped when they had completed their last piece.
With the verb stop, however, something abrupt might have happened that interrupted the music or caused it to stop instantly, which resulted in everybody leaving the venue. In the following example, stop is appropriate.

(13) The band stopped, and a group of the groom's more drunken friends broke into song. BNC

Here, stopping could have been final or just a pause. From the sentence, we feel that the groom's drunken friends might have been so noisy that the band felt that they should stop for a while. The phrase 'broke into song' once the band stopped implies that there was some kind of interruption from these 'drunken friends'.

The feature of gradualness can also distinguish stop from end, as in:

(14) The legal position is clear, so can we please stop/ end/ *finish this nonsense? BNC

In using stop here, the addresser requires that the nonsense terminate immediately and with no delay. However, with the verb end, it is felt that the termination of this nonsense does not have to be abrupt or swift since the crucial
matter is to put an end to it even if this takes time. The verb finish is syntactically possible but sounds odd in this sentence, for it would mean that the speaker is asking the addressees to carry the nonsense through to its completion, which makes no sense. The following is another example that further illustrates the features of abruptness vs. gradualness.

(15) “I would therefore like to finish by thanking you all most sincerely for helping us to write a further chapter in what continues to be a major UK success story.” BNC

The speaker here declares that he wants to finish his speech with a note of thanks, which enables him to approach the final phase of his address. Both finish and end are acceptable, although with end the sentence becomes more acceptable if the verb has an object, i.e., end my speech by… On the other hand, the verb stop will not be appropriate here, for if you stop you freeze. That is why sentences such as the following are anomalous:

(16) *We are stopping dealing with this company.

Here the progressive form sounds odd, for once the decision to stop dealing with the company is taken, it is supposed to be carried out instantaneously. Thus, the feature of gradualness distinguishes finish and end from stop. The verb cease, unlike stop, can indicate gradualness, as in:

(17) [Talking about people leaving a theatre.] Gradually the floor emptied. The shuffling of the feet ceased. Adventures of Sally

Here the shuffling of the feet of the people leaving the theater cannot stop suddenly; it will definitely take time for the sound of the feet to vanish completely. The preceding sentence in the example supports this explanation: “Gradually the floor emptied.”

4.4. Finality and conclusion

Both end and finish, unlike stop, convey a sense of finality, with the verb end also indicating a sense of conclusion. The following example shows how the verb end conveys a strong sense of finality.

(18) 'What's the bad news?'
asked Sally abruptly. She wanted to end the suspense. Adventures of Sally

In this example, end is the best choice, for the speaker wants to bring this suspense to an end. Using stop is acceptable, but in this case the suspense will not stop for good; it may continue after a period of time. Finish is not appropriate in this instance, for it portrays the suspense as something that should continue until it is finished, which is not what the speaker means. End, therefore, is the most appropriate choice, as it means that the requester wants to put an end to this suspense, i.e. to terminate it completely and for good. The fact that the verb end conveys a strong sense of finality and conclusiveness is seen clearly in the following example:

(19) The marriage ended /?stopped /*finished in 1926, and the same year Jane married the writer J. B. Priestley. BNC

In this example, stop is appropriate syntactically, but not very apt semantically, for, as mentioned earlier, stopping something can be temporary. The context of
the sentence here indicates that the separation was final; Jane married somebody else in the same year she was divorced from her first husband. Finish is not appropriate either, for marriage is not a practice or an activity that continues for some time and then comes to a predictable conclusion. End is the only suitable option, for it implies that this marriage was terminated for good. Hence, end collocates naturally with the word marriage. The fact that we say end at the end of stories, chapters, roads, etc. indicates that the verb end denotes a strong sense of finality and conclusiveness, i.e., with end the story or event reaches its closing stage. The following is another example showing that the verb end conveys a sense of conclusion:

(20) THAT HURT! MIKE Dixon shows the effects of the beating he took from Lewis after the referee stopped / ended /?finished / *ceased the fight. BNC

Both end and stop are appropriate here, yet with different meanings. With the verb end, the meaning is that the referee declares the winning fighter victorious, which brings the fight to its end even if the rounds have not finished; this supports our claim that the verb end expresses a sense of conclusion. On the other hand, using the verb stop means that the referee commanded the fighters to stop fighting before the rounds were over. Here the meaning of stop is “to prevent completion of an activity”. The referee may or may not instruct the fighters to resume. The verb finish in (20) is not acceptable, semantically speaking, since the subject (the referee) is not a participant in the activity, and also since this activity (fighting) does not imply a specific amount of work that should be completed in a specific period of time. However, we can say:

(21) The match finished on a Tuesday.
BNC

In this case, we know that the match is over and a winner has probably been declared, which, again, shows that the verb finish conveys a sense of finality. The verb cease is not acceptable in (20), since we do not cease things, they cease, i.e., the verb is intransitive and thus cannot denote a process that can affect a patient.

The sense of finality is also a primary factor in the choice between cease and stop. Consider the following examples:

(22) The music stopped abruptly. Insistent clapping started it again, but Sally moved away to her table, and he followed her like a shadow. Adventures of Sally

(23) The music stopped. There was more clapping, but this time the orchestra did not respond. Gradually the floor emptied. The shuffling of the feet ceased. Adventures of Sally

In the first example, the band stopped playing, but the clapping of the audience motivated them to play some more, which agrees with our explanation above that stop implies possible resumption of a terminated activity. The second example states that the band played some more, responding to the clapping of the audience, and then they finally stopped. There was more clapping from the
audience, but this time the band did not respond. Members of the audience left and none remained in the theater. The narrator did not say “The shuffling of their feet stopped” but rather “ceased,” which means that cease, unlike stop, conveys a strong sense of finality. The seriousness and sense of finality found in the verb cease make it form a collocation with the word fire, as in cease fire. The following example illustrates the features of seriousness and finality in the verb cease.

(24) Without this extensive support, many of our newspapers, for example, would either cost much, much more or cease / stop/*end /*finish to exist. BNC

Both cease and stop are acceptable here, yet the tone of seriousness and finality is stronger with the verb cease. Also, the complement ‘to exist’ in this sentence dictates the choice of the word cease, i.e., cease to exist has become a collocation. Since cease conveys a strong sense of seriousness and finality, it sounds less appropriate than stop in:

(25) The pain stopped /ceased when I took the aspirin.

In this context, the more fitting choice is stop. The verb cease is somewhat appropriate, yet the focus will be on the medication. In other words, with the verb cease the sentence implies how effective the medication – aspirin – is, as it causes the pain to stop forever.

4.5. Animate vs. non-animate agent or object

The type of agent can play a role in the choice of one verb rather than the others.

(26) The coin stopped /*finished /*ended/ ceased rolling.
(27) The champion stopped / finished /*ended/ ceased rolling.

In (27), stop, finish and cease are acceptable, for the agent here (+animate) is in control of the action, a feature that does not exist in example (26), where the agent is (-animate). The inappropriateness of end in this instance is due to selectional restrictions, i.e., the verb does not select for a gerund.
As is the case with the agent, the type of the object (+animate vs. -animate) does, in certain cases, determine the type of verb to use.

(28) Out in the dark cold hall, she stopped /*finished /*ended /*ceased him at the foot of the stairs. BNC

This example shows that the verbs cease, finish and end do not tend to take an animate object.

4.6. Action-oriented vs. non-action oriented object

The type of object can play a role in the choice of the verb.

(29) I finished /*ended /*stopped /*ceased the food in my bowl, feeling quite kindly disposed toward Marcus for giving me real meat. BNC
(30) A police photographer finished /?stopped / ended /?ceased the task of photographing the body. BNC
In example (29), only finished is appropriate, whereas in (30) three are acceptable. The only apparent motivation here is the nature of the object. In the second example, the object is of the type that can be labeled ‘action NP’, similar to what Dowty (1979) calls “incremental theme,” whereas in the first it is not. Thus, while we can finish, end, or stop something that is action-oriented, we cannot do so with non-action-oriented objects. The notion of action NP vs. non-action NP also applies in the case of the subject, as is shown below.

(31) My money has finished /*ended /*stopped /*ceased and my friends have gone. www.5steps2english.com
(32) The game has finished / ended / stopped /?ceased.

Again, the examples here show that the notion of action vs. non-action plays a clear role in the acceptability of the verb; we do not stop, end, or cease non-action-oriented NPs.

5. Conclusion and implications

The results of the analysis have shown that the four verbs in the study share many features that make them a group by themselves, yet each one of them can be distinguished from the other members by means of the meaning and the subtle nuances that it can convey when used in different contexts. Thus, in addition to the syntactic constraints that differentiate them from one another, the findings of the study reveal many distinctive features that characterize each. These features include completeness vs. incompleteness, finality vs. temporariness and abruptness vs. gradualness. Other features that the findings reveal include those that are both semantically and syntactically oriented, motivated by the type of complement/object NP that these verbs require. Such features include animate vs. inanimate and action vs. non-action NP. The main differences and distinctive features that differentiate the four synonymous verbs in the study from each other are summed up in the componential analysis presented in Table 3.
The features presented in this table are not exhaustive; a larger corpus of data is likely to reveal more distinguishing features. In the table, the sign in parentheses indicates that the verb might in certain cases be used in two ways. For instance, the verb end virtually always implies completion, but in rare cases the activity may not be complete. Therefore, the norm for end is denoting completion of an activity and the exception is the opposite.

The findings of this study show that the differences that distinguish a synonym from its counterpart tend to be hard to pinpoint unless sufficient context is provided. Only through contextualized data can language learners recognize the subtly different meanings that distinguish a verb from its close synonyms. As shown in the discussion above, if a context is added, the meaning and/or the use of any one of the verbs in the study changes. This makes it necessary to teach EFL learners both the syntactic and semantic constraints that characterize any set of synonyms. The syntactic features may, to some extent, be acquired via grammar teaching; nonetheless, the semantic
features, which account for most uses, cannot be deciphered except through context – textual and/or situational. Since these differences tend to be very subtle, the information found in dictionaries is not enough; therefore, EFL/ESL teachers and textbook writers should rely, among other things, on corpus data when teaching the meaning and use of synonyms. Aijmer (2009) maintains:

Corpora are invaluable for teachers, in that they can employ them in a number of ways, such as…to create exercises, demonstrate variation in grammar, show how syntactic structures are used to signal differences in meaning and level of style, discuss near-synonyms and collocations… (p. 49).

Table 3. Semantic and syntactic features of the verbs cease, end, finish and stop

Feature Syntactic Semantic Features Features Object Action Animate Action- Gerund Complete Temporary Final Gradual Verb Oriented NP
Cease   -    -    (+)  +    +    -    +    +
End     -    +    -    +    (-)  -    +    +
Finish  -    (+)  +    +    +    -    +    +
Stop    +/-  +    +    -    +    -    -    (+)

Aziz Thabit Saeed
Vice President, Yemeni Scientific and Linguistic Academy
Dean of the College of Languages
Sana'a University, Sana'a-Yemen
Tel: 00967771501505
Email: [email protected]

References

Aijmer, Karin. (2009). Corpora and Language Teaching. Philadelphia: John Benjamins Publishing Co.
Al-Deeb, Ghassan and Adil Al-Masri. (Eds.) (2002). Atlas Encyclopedic Dictionary: English-Arabic (1st edition). Cairo: Atlas Publishing House.
Baalbaki, Munir and Rohi Baalbaki. (Eds.) (1997). Al-Mawrid Dictionary: English-Arabic (2nd edition). Beirut: Dar El-Ilm Lilmalayin.
British National Corpus (BNC). http://www.natcorp.ox.ac.uk/ (Accessed in August 2014).
Cambridge Advanced Learner's Dictionary. http://dictionary.cambridge.org/ (Accessed in August 2014).
Chu, Man-ni. (2005). 'Lexicalization Patterns of Verbs of Hitting in Taiwanese Southern Min.' UST Working Papers in Linguistics (USTWPL), 1(17): 17-30.
Chung, Siaw-Fong. (2011). 'A Corpus-based Analysis of 'Create' and 'Produce.'' Humanities and Social Sciences, 4 (2): 399-425.
Clift, Rebecca. (2003). 'Synonyms in Action: A Case Study.' International Journal of Arabic-English Studies, 3: 167-187.
Compact Oxford English Dictionary of Current English. http://www.askoxford.com/dictionaries/compact_oed/?view=uk (Accessed in August 2014).
Cruse, David. (1986). Lexical Semantics. Cambridge: CUP.
Dowty, David R. (1979). Word Meaning and Montague Grammar: The Semantics of Verbs and Times in Generative Semantics and in Montague's the Proper Treatment of Quantification in Ordinary English (PTQ). Dordrecht, Holland: D. Reidel Publishing Company.
Fareh, Shehdeh. (2006). 'The Acquisition of the Verbs of Saying by Arab EFL Learners.' International Journal of Arabic-English Studies, 7: 137-150.
First Step in Reading Comprehension. http://www.5steps2english.com/forums/first_steps_in_reading_comprehension-t1076.0.html (Accessed in December 2014).
Freed, Alice. (1979). The Semantics of English Aspectual Complementation. Dordrecht, Holland: D. Reidel Publishing Company.
Gregory, Howard. (2000). Semantics. London: Routledge.
Hero Defeated and Some New Gladiator. http://hordesofthethings.blogspot.com/2013/06/hero-defeated-and-some-new-gladiators.html (Accessed in November 2014).
Karmi, Hassan S. (1997). Al-Mughni al-Akbar: A Dictionary of Classical and Contemporary English: English-Arabic. Beirut, Lebanon: Librairie du Liban.
Kearns, Kate. (2000). Semantics. New York: Palgrave.
Khuvasanond, Kirati, Tatiana Sildus, I. Hurford, P. David and Richard P Lipka. (2010). 'Comparative Approaches to Teaching English as a Second Language in the United States and English as a Foreign Language in Thailand.' In LSCAC Proceedings, 175-187.
Learning from a Legacy of Hate. http://www.bsu.edu/learningfromhate/m_o.htm (Accessed in November 2013).
Longman Dictionary of English Language and Culture. (1998). (2nd edn). Harlow: Longman.
Lyons, John. (1995). Linguistic Semantics: An Introduction. Cambridge: CUP.
Lyons, John. (1977). Semantics. Two vols. Cambridge: CUP.
Macmillan Essential Dictionary. (2007). Oxford: Macmillan.
Most Common English Verbs. http://www.acme2k.co.uk/Acme/3star%20verbs.htm (Accessed in January 2014).
Nagy, Tünde. (2006). 'The Semantics of Aspectualizers in English.' A paper presented at the 1st Central European Student Conference in Linguistics. Budapest, Hungary. www.nytud.hu/cescl/proceedings/Tunde_Nagy_CESCL.pdf (Accessed in August 2009).
Palmer, Frank. (1981). Semantics. (2nd edn). Cambridge: CUP.
Partington, Alan. (1998). Patterns and Meaning: Using Corpora for English Language Research and Teaching. Philadelphia: John Benjamins Publishing Co.
Phoocharoensil, Supakorn. (2010). 'A Corpus-Based Study of English Synonyms.' International Journal of Arts and Sciences, 3 (10): 227-245.
Rea, John. (1968). 'Lend and Loan in American English.' American Speech, 43 (1): 65-68. Duke University Press.
Requejo, Maria. (2007). 'The Role of Context in Word Meaning Construction: A Case Study.' International Journal of English Studies, 7: 169-173.
Riddle, Elizabeth. (1989). 'Issues in the acquisition of word meaning: Transition expression.' A paper read at TESOL Convention, San Antonio, 1989, 1-14.
Sacks, Oliver. (1993). 'Making up the Mind.' The New York Review, April 8.
Saeed, Aziz and Shehdeh Fareh. (2006). 'Some contextual considerations in the use of synonymous verbs: The case of steal, rob and burglarize.' Studia Anglica Posnaniensia, 41: 323-336.
Saeed, John. (1979). Semantics. Oxford: Blackwell Publishers.
Shen, Yingying. (2010). 'EFL Learners' Synonymous Errors: A Case Study of Glad and Happy.' Journal of Language Teaching and Research, 1 (1): 1-7.
The Free Dictionary. http://www.thefreedictionary.com/ (Accessed in December 2012).
Walden, Mandy. (2011). http://mandywaldenartistprintmaker.blogspot.com/2011/05/new-l print.html. (Accessed on 20 June 2014).
Wang, Juan. (2009). 'A corpus-based study on the Chinese near-synonymous verbs of running.' Proceedings of the 21st North American Conference on Chinese Linguistics (NACCL). Vol. 2, Bryant University, 399-416.
Werth, Paul. (1999). Text Worlds: Representing Conceptual Space in Discourse. London: Longman.
Wierzbicka, Anna. (1996). Semantics: Primes and Universals. New York: OUP.
Wodehouse, Pelham G. The Adventures of Sally. http://www.gutenberg.org/dirs/7/4/6/7464/7464.txt (Accessed in August 2012).
International Journal of Arabic-English Studies (IJAES) Vol. 15, 2014

Measure Terms in Rural Jordanian Spoken Arabic

Ahmad Mohammad Al-Harahsheh
Yarmouk University, Jordan

Abstract: The use of measure terms can be socially and culturally determined, as every speech society may have its own unique measure terms. This study aims to shed light on the sociolinguistic usage of measure terms in Jordanian Spoken Arabic (JSA). The researcher collected the data from everyday conversations in the rural dialect of the north of Jordan. The participants of the study were 15 men and women who were in their fifties and sixties. The ethnography of communication and Interactional Sociolinguistics (IS) approaches are adopted as the theoretical framework for this study. The study concluded that measure terms in JSA are culturally and socially inherited and transmitted, and that Jordanians tend to use body parts (i.e. finger, hand, foot and leg related expressions) as measure terms for heights, lengths and weights.

Keywords: Interactional Sociolinguistics, ethnography of communication, measure terms, Jordanian Spoken Arabic.

1. Introduction

It is widely agreed that language is a social phenomenon, and that there is a mutual relationship between language and society (Hymes 1964, 1974; Labov 1966, 1972; Gumperz 1971; Halliday 1971; Trudgill 1983, 2006; Hudson 1990; Holmes 2006; Meyerhoff 2006). People use language not only to understand each other's feelings and thoughts, but also to define their relationships with others, as language is the main means of communication between people. The language used by men is different from that used by women, and that used by children is different from that employed by elderly people. Therefore, 'these aspects of language use serve as an emblematic function: they identify the speaker as belonging to a particular group or having a particular social identity' (Guy 1988:37).
Guy (1988:79) argues that there are two particular reasons for viewing the study of linguistic production as the study of speakers: the first is psychological and arises in our 'individualistic culture', since 'language production is a form of behavior', i.e. grammatical knowledge and intellectual capabilities. The second arises in linguistics itself, as linguistics studies the grammars in the minds of speakers, which are responsible for production.

Sociolinguistics encompasses a broad area of research, as it focuses on how interlocutors use language in social settings. It is therefore concerned with who says what, to whom, when and where, in what manner, and under what social circumstances. Sociolinguistics also encompasses the study of
Al-Harahsheh Measure Terms in Rural Jordanian ...

identity, class, solidarity, power, status and gender. In addition, sociolinguistics describes different ways of studying language, and sociolinguists employ 'different methods for collecting and analyzing data' (Meyerhoff 2006:1). According to sociolinguists, people acquire language by a social process rather than a cognitive one (Pinto 2012). That is, speakers learn language from the social environment in which they are raised, through different social events (Meyerhoff 2006). For example, speakers learn communication skills from the surrounding environment or society, i.e. what to say in sad or happy situations.

Halliday (2007a) elucidates the notion of a social man, i.e. the individual in his social environment. Language is the main channel of communication between individuals in any speech community, by which an individual learns the values, the norms, the beliefs and the culture of the society where s/he lives. These matters cannot be learned at school, as the individual learns them independently through the different events s/he encounters in his or her society. In addition, there is a strong relationship between language and the 'social man' (Halliday 2007b). Halliday (2007b) also illustrates that a society consists of relations rather than participants, and these relations define the social roles; an individual can occupy many roles at a time. In addition, the purpose of language is to communicate information among the members of a certain society; the members of the society do not use the language in whatever way they please, as meaning is controlled by certain social norms specific to every speech community (Sanchez 2007). Trudgill (1983:14) demonstrates two particular aspects of language behavior: 'first, the function of language in establishing social relationships and second the role played by language in conveying information about the speakers.'
Besides, language is governed by the social norms, values and structures of a certain society, as 'social structure may either influence or determine linguistic structure and/or behaviour' (Wardhaugh 1998:10-11). In other words, the linguistic varieties, words or styles individuals use to communicate reflect their regional, social and ethnic origin and sometimes their gender. Wardhaugh (1998:10-11) also explains that 'language and society may influence each other'. That is, language and society are interlinked, and they have a reciprocal influence on the use of language in certain social settings, since the choice of varieties, words and linguistic styles is sometimes governed by the social norms of a certain society.

Ethnography literally means ‘a portrait of a people’. Harris and Johnson (2000:4) posited that 'an ethnography is a written description of a particular culture- the customs, beliefs, and behaviours-based on information collected through fieldwork'. Ethnography of communication is concerned with linguistics
as it describes and analyzes language codes (Saville-Troike 2003). According to Saville-Troike (2003:1), 'the uses of language and speech in different societies have patterns in their own which are worthy of ethnographic description comparable to – and intersecting with – patterns in social organization and other cultural domains.' To illustrate, every speech society has its own dialects and social norms of using language in certain social settings. In addition, the use of language portrays the cultural norms of those societies. This is why we have culture-specific terms that are employed within one culture and not another; they can be effortlessly understood only among the speakers of that culture. Therefore, there is a strong correlation between culture and language; studying language means studying the social and cultural norms of its speakers so that we can comprehend the different uses of language in various social settings.

Haugen (1966) explains that language and dialect are ambiguous terms; ordinary speakers use them interchangeably when talking about different linguistic situations.(1) 'Language can be used to refer to either a single linguistic norm or to a group of related norms, and dialect to refer to one of the norms; but themselves are not static' (Wardhaugh 1986:25). By way of illustration, language is codified in books and dictionaries, and is standardized, i.e. it can be the official language of the state, courts and education, while the dialect is not.

Jordan, an Arab country, is located in the Middle East; 97% of the population of Jordan are Muslims, while 3% are Christians. Standard Arabic (SA) is the official language of the state; it is the language of courts, education and media. Jordanian Spoken Arabic (JSA) is a variety of SA with some phonological and syntactic differences.
There are three main dialects of JSA, namely: the rural dialect, spoken in the northern parts of Jordan, especially in towns and villages; the urbanized dialect, spoken in the major cities of Jordan, i.e. Amman, Irbid and Zarqa; and the bedouin dialect, spoken in the eastern and southern cities of Jordan such as Mafraq, Karak, Tafielah, Ma’an and Aqaba. This study is concerned with the rural dialect, especially the dialect spoken in the towns and villages of Irbid City. The use of dialect can be favored in different social settings for affective reasons, as speakers can use dialect to express their ordinary or actual feelings towards certain issues during conversation.

This study is significant and original, as it is the first that tackles this particular issue in JSA. It also shows the relationship between language, culture and society, as Jordanian speakers tend to use familiar expressions derived from their usual conversations as measure terms. Several intriguing sociolinguistic phenomena have not been researched in JSA, such as the employment of measure terms.
Every speech society has its own linguistic repertoires for expressing different linguistic notions such as measure terms. In JSA, speakers use socially and culturally specific words as measure terms. This supports the assumption of Wardhaugh (1998) that language and society may influence each other. This study aims at studying measure terms as a linguistic production of Jordanian speakers from social, cultural and linguistic perspectives. These terms are distinctive, and they are still used, especially by elderly people in the north of Jordan. However, younger generations may not understand these terms because of urbanization and education, as the majority of these terms are not available in SA, and they are regarded as old-fashioned nowadays. The study focuses on the measure terms and their collocations, i.e. certain measure terms collocate only with specific heights and weights.

2. Study questions

Since one of the significant aims of sociolinguistics is to study how and why interlocutors use language to describe different social settings, this paper is designed to study the sociolinguistic usage of measure terms in JSA. This study tries to answer the following questions:

1. How do Jordanians express measure terms?
2. Are measure terms in Jordanian Spoken Arabic socially and culturally inherited and transmitted?

3. Theoretical framework

A theoretical framework is the backbone of any study of a linguistic phenomenon. This study describes the measure terms used in Jordanian society.
Therefore, the theoretical framework draws on Ethnography of Communication, Discourse Analysis, and Interactional Sociolinguistics (IS), which 'is an approach to discourse analysis that has its origin in the search for replicable methods of qualitative analysis that account for our ability to interpret what participants intend to convey in everyday communicative practice' (Gumperz 2001:215). This study concentrates on units of interaction employed by the participants during speech events, as these units can reflect the values and the beliefs of Jordanian society.

4. Methodology
The participants of the current study were 15 men and women in their fifties and sixties in the north of Jordan. The researcher informed them about the purpose of the study in order to elicit the exact measure terms they use when they refer
to weights and heights in different social settings. The researcher observed and quoted the sentences that express measure terms during the conversations. The researcher draws on discourse analysis, ethnography of communication and Interactional Sociolinguistics in analyzing the collected data, and also relies on his own observations and his linguistic and social experience. The measure terms were classified into terms used for amounts (small vs. large; liquid vs. solid) and terms used for weights and heights.

5. Results and discussion
Measure terms are employed in every speech community. There are standard measure terms that are globally agreed upon, such as gram, kilogram, millimetre, centimetre, metre, yard and mile; these are well known to all members of speech communities and are not the target of the current study. However, there are some specific measure terms used exclusively by certain members of a society. For instance, measure terms in JSA are socially and culturally inherited, and transmitted from one generation to another. These terms have been used for many years; they can be classified into two categories: terms used for amounts, i.e., solid and liquid (large and small) amounts, and terms used for heights and lengths. Moreover, they can be grouped into finger, hand, arm, foot and leg related expressions.

5.1. Measure terms used for amounts
In Jordanian society, people have their own special words and expressions for referring to amounts. Some of these terms are lexicalized in Standard Arabic, while the majority are used only in JSA, i.e., they are culture-specific terms, employed and understood only by speakers of Jordanian Arabic. Jordanians tend to employ finger related expressions as measure terms to refer to small amounts of things, especially amounts used in preparing food, such as salt and spices.
These amounts can be measured by what can be held between the forefinger and the thumb. Other native speakers of Arabic may not comprehend these terms. There are two taxonomies of measure terms, which Jordanians employ when referring to solid and liquid amounts.

5.1.1. Measure terms used for small amounts (solid and liquid items)
A. Finger related expressions
Jordanians use fingers as measuring tools to refer to small amounts of solid items such as salt, pepper and spices in general. These amounts can be held between the forefinger and the thumb; they are small but sufficient amounts. It is
hard to find out how many grams these amounts are; both participants are fully aware of these quantities because of the shared knowledge between them. The following terms are synonyms, and they are used interchangeably to refer to the same quantity: katha كتحة, rashshi رشة, guṭmi قطمة, nitfi نتفة, gizʕa قزعة / gabsi قبصة and nugṭa (drop).

(1). Huṭi kathit/ gabsit/ gizʕit/ rashshit/ guṭmit/ nitfit milih (2).
'Give me a pinch of salt.'

Katha/gizʕa/gabsi/rashshi/guṭmi/nitfi milih are the cooking measurements in JSA; they seem synonymous, and they mean a pinch of salt, a small amount that can be held between the forefinger and the thumb. It is a small amount that serves just as a flavour enhancer, or an amount that makes the taste of food acceptable, i.e., the taste is moderate. However, guṭmi قطمة, nitfi نتفة and gizʕa قزعة are smaller than katha, rashshi and gabsi, as they may mean a smidgen, where a smidgen equals ½ a pinch or 1/32 teaspoon; a pinch equals ½ a dash or 1/16 teaspoon; a dash equals 1/8 teaspoon, so 8 dashes equal a teaspoon. A drop (nugṭa) is used only with liquid items such as oil; it may equal 1/64 teaspoon (3).

Understanding what amounts are to be added depends on the mutual relationship and the shared knowledge between the interlocutors, especially their intimacy. To illustrate, the interlocutors are usually a mother and her daughter, and both of them recognize the amount of salt or spices that should be added to each dish. Therefore, both of them are fully aware of these amounts, and the addressee uses exactly the suitable amount. Based on this assumption, these terms are socially and culturally contextualized, so we can consider them culture-specific measure terms. Noticeably, they are mainly used by elderly people who live in rural areas in the northern provinces of Jordan.
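The teaspoon equivalences quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the fractions are the ones given in the text, while the dictionary names, the simplified ASCII spellings of the JSA terms, and the grouping of each term under one English unit are assumptions made for the example. Speakers themselves treat these quantities as approximate and context-dependent.

```python
from fractions import Fraction

# Teaspoon fractions exactly as quoted in the text.
TEASPOON = {
    "dash": Fraction(1, 8),
    "pinch": Fraction(1, 16),
    "smidgen": Fraction(1, 32),
    "drop": Fraction(1, 64),
}

# Hypothetical mapping (for illustration only) from JSA terms, in simplified
# ASCII spelling, to the English unit the text glosses them with.
JSA_GLOSS = {
    "katha": "pinch", "rashshi": "pinch", "gabsi": "pinch",
    "gutmi": "smidgen", "nitfi": "smidgen", "giz'a": "smidgen",
    "nugta": "drop",
}


def ratio(unit_a, unit_b):
    """Return how many of unit_b make up one unit_a."""
    return TEASPOON[unit_a] / TEASPOON[unit_b]


def teaspoons(term):
    """Return the approximate teaspoon fraction glossed for a JSA term."""
    return TEASPOON[JSA_GLOSS[term]]


if __name__ == "__main__":
    print("smidgens per pinch:", ratio("pinch", "smidgen"))
    print("pinches per dash:", ratio("dash", "pinch"))
    print("dashes per teaspoon:", 1 / TEASPOON["dash"])
    print("nugta in teaspoons:", teaspoons("nugta"))
```

Exact fractions are used instead of floating-point numbers so that the halving relations (drop, smidgen, pinch, dash) can be compared without rounding error.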
However, nobody can say how many grams a katha equals; it is semantically, culturally and socially understood as a small but sufficient amount. Rashshi is employed by Jordanians who live in cities rather than by those who live in rural areas. Rashshi and nitfi are also used with liquid items, as in rashshit/ nitfit/ nugṭit zeit (a small amount of oil).

B. Hand related measure terms/ Solid items
1. Iʕra:m عرم / malat ʔIydak ملات يدك / gabda قبضة / hafni حفنة
These terms are measured by using one hand or both hands. They are generally used to refer to solid amounts of things, countable and uncountable. Although these measure terms are synonyms, hafnih حفنة is used to refer to the amount held by both hands in JSA; it may equal a cup of something, about 200 grams. However,
in Standard Arabic, it means a handful. Hafnit tra:b (a handful of soil) is a very popular collocation. These terms are used with solid items to refer to bigger amounts than the aforementioned ones. Hafni حفنة and milʔ ilyad ملء اليد, about half a cup or 100 grams, are Standard Arabic terms, while iʕra:m is colloquial. There are no specific weights for these amounts, as the size of the hand differs from one person to another. These terms are usually employed with legumes and grains such as lentils, beans and peanuts.

(2). Iʕṭi:ni Iʕram/ malat Iydik/ hafnit ʕadas.
'Give me a handful of lentils.'

We can give approximate weights for iʕra:m/ malat Iydik/ hafnih as half a cup or about 100 grams. Still, this approximation is not exact, as the amount may be more or less than 100 grams, but it is known to Jordanians, as both participants mutually understand the intended amount.

2. Kamshi كمشة (a fist or cupped hand of something)
Kamshi is measured by the fist of one's hand; it may equal a cup, and it refers to the amount that one hand can hold when the fingers are clenched. In English, by contrast, the hand is used as an informal unit of distance equal to approximately 10 centimetres or 4 inches (4). Kamshi is employed to refer to certain solid items, countable and uncountable. It is also employed with peanuts, legumes and candies when someone offers them to someone else, especially when these items are contained in a bag or in a can.

(3). Khudhlak kamshit hilu:
'Take a fistful of candies.'

In this example, kamshit (the /t/ sound is added here and in other examples when compounding) refers to the small amount that one can hold when the fingers are clenched. It is also used in everyday conversations to express sarcasm or to debase someone.
For example, when someone describes the childish or strange behaviour of someone, especially a child, s/he may say sarcastically 'gad ilkamshi, w ma hada gadirluh' (He is a very little child and no one can control him). Metaphorically, it means that the child is still too little to behave in this way. It can also be used as an insult, especially when two men are arguing with each other; one of them may say to the other 'kullak gad ilkamshi' كلك قد الكمشة (pragmatically, you are nothing). In addition, it is employed as a verb, especially when someone is chasing a child who did something wrong; s/he says 'kamshtak' (I caught you).
5.1.2. Measure terms used for large amounts (leafy plants)
Jordanians also use specific measure terms to refer to large amounts of leafy plants. They tend to measure these items by both arms, or by the small spans of both hands when they are brought close to each other to form a circle. Consider the following terms:

A. dhumma ضُمة (a bunch)
It is measured by the small spans of both hands when they are brought close to each other to form a circle. This term is used with flowers and leafy plants such as mint and spring onion. It cannot be used for large amounts. It is employed widely in Jordanian society, especially in vegetable and fruit markets, as these items are sold by the bunch rather than by the kilogram.

(4). aʕṭi:ni dhumit bagdu:nis.
'Give me a bunch of parsley.'

The reason for using this term may be that these amounts are lightweight.

B. ʕabṭah عبطة / Arm related expression
This unit is measured by the capacity of both arms of an adult. The term is used to refer to a large amount of something such as plants, wheat, malt and other crops; it is larger than dhumma, and Jordanian speakers clearly differentiate between the two terms. It is almost impossible to guess the weight of a ʕabṭa. However, it is socially well known to the interlocutors.

(5). Aʕṭi:ni ʕabṭit Molokhiyeh.
'Give me an armful of Jew's mallow.'

The listener recognizes the amount of Jew's mallow that the speaker wants; s/he will not give him/her a dhummah (a bunch), for example. As mentioned above, JSA has its own distinctive measure terms for small and large amounts of liquid and solid items.

5.2. Measure terms used for lengths and heights
Jordanians use different measure terms to refer to lengths, widths and heights. These terms can be classified into finger, hand, arm and leg related expressions. They are presented below in ascending order.

5.2.1.
Finger related measure terms
Again, Jordanians use finger related measure terms to express lengths and heights. These measure terms are socially acceptable, and they are employed by
elderly people and people in rural areas. The use of these terms goes back to the early twentieth century. During this period, modern measure units such as the metre were unknown or unpopular. Therefore, Jordanians tended to use parts of the body as measure terms.

A. ʔisbaʕ (a finger)
A term used to refer to the shortest length and height in JSA. It is a traditional unit of distance equal to 2 nails or 4.5 inches (11.43 centimetres) (5). It is usually used to emphasize the shortness of something. This term is also used as a cooking measure term, especially to refer to the amount of water that should be added to prepare a certain dish. It is also used for exaggeration or sarcasm about the length or the height of something or someone.

(6). Howa ṭu:l ʔisbiʕi, ʔaw ma hada gadrluh.
'Lit. He is a finger's length and nobody can overcome him.'

This sentence expresses sarcasm about the height of the intended person; it is a way of debasing him. That is, he is very short and rowdy-dowdy. The expression is widely employed among girls to criticise each other's height, as height is considered a mark of beauty in Jordanian society.

(7). F1: ʕala ʔi:sh shayfih ḥalha, ma hi ṭwl ʔisbiʕi.
'Who does she think she is? She is as tall as my finger.'
F2: ʔah wala hata ʔisbiʕi aṭwal minha.
'You are right, even my finger is taller than her.'

Obviously, in this example, the measure unit ʔisbiʕ (a finger) is used metaphorically to express sarcasm and criticism of a girl's height, i.e., she is very short. A finger may also be used to refer to a person's state of health, especially when s/he is sick or has become skinny. For instance, someone may say to a friend 'sayir wijhak ʕard ʔisbaʕi:n' (Lit. Your face is two fingers in breadth); it means that your face is very thin, as an indication that you are sick or look exhausted.

B.
Fitir (small span) is the distance between the index finger and the thumb. It is about 15 cm, depending on the individual's hand size. Fitir is employed in JSA to refer to short lengths and heights. It is mainly used with cloth, water and land.

(8). giṭʕit igma:sh ṭu:lha fitir.
'A piece of cloth whose length is a small span.'

Fitir denotes a very short length or height in this example. It is noted that Jordanian speakers tend to use specific measure terms to refer to lengths and heights.

C. Shibir (large span)
Shibir is another measure term used to refer to heights and lengths; it is larger than fitir. It is a traditional unit of distance equal to 9 inches (approximately 22.9 centimetres) or 1/4 yard. This distance represents the span of a man's hand with the fingers stretched out as far as possible (6). Shibir used to be a popular measure term in JSA, especially for land (areas) and clothes. Sometimes it is used for sarcasm when referring to someone who is short. Moreover, it is used with liquid items, especially water, as in the popular JSA proverb بغرق بشبر مي bighrag ib shibir may (He sinks in a large span of water). It is a sarcastic expression that refers to those who are unable to manage things and who make many mistakes when they are asked to do things, i.e., those who are unreliable. In addition, it collocates with land, as in shibir ʔard (a span of land).

(9). ana mustahi:l afarriṭ ib shibir min ʔardi.
'For me, it is impossible to give up a span of my land.'

D. Idhra:ʕ (arm's length) is an old measure term in JSA. The length of a human arm is standardized as a unit of distance equal to about 70 centimetres or 28 inches (7). It was used for land, but nowadays it is employed mainly by tailors in Jordan for cloth; although some of them use the metre when tailoring, the idhra:ʕ is still in use.

(10). ʔinta biha:jit ʔrbaʕ ʔadhruʕ igma:sh misha:n tfasi:l thu:b.
'You need four arm's lengths of cloth to have a dress made for you.'

E. Gadam (foot) is another measure term, which was used as a measurement for land, although the metre is employed instead nowadays. A foot is a unit of length equal to 12 inches, taken from the average length of the human foot.
One foot equals 0.3048 metres. Nowadays, it is used to refer to the height and the capacity of refrigerators.

(11). ʔishtari:t thala:ji 16 gadam.
'I bought a 16-foot refrigerator.'
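The metric equivalents quoted for the body-based length terms in this section can be reproduced from the inch values. The sketch below is an illustration only: the inch and centimetre figures are those given in the text, while the dictionary and function names and the simplified ASCII spellings of the terms are assumptions introduced for the example.

```python
INCH_TO_CM = 2.54  # exact by definition of the international inch

# Lengths in inches as quoted in the text; keys use simplified ASCII spellings.
LENGTH_INCHES = {
    "isba (finger)": 4.5,
    "shibir (large span)": 9,
    "idhra (arm's length)": 28,
    "gadam (foot)": 12,
}

# Fitir is quoted only in centimetres and varies with hand size.
LENGTH_CM_ONLY = {"fitir (small span)": 15}


def to_cm(inches):
    """Convert a length in inches to centimetres."""
    return inches * INCH_TO_CM


def all_lengths_cm():
    """Return every term with its approximate length in centimetres."""
    lengths = {term: to_cm(value) for term, value in LENGTH_INCHES.items()}
    lengths.update(LENGTH_CM_ONLY)
    return lengths


if __name__ == "__main__":
    # Print the terms in ascending order of length.
    for term, cm in sorted(all_lengths_cm().items(), key=lambda item: item[1]):
        print(f"{term}: about {cm:.2f} cm")
```

Note that 28 inches converts to 71.12 cm, which the text rounds to "about 70 centimetres"; the other values match the quoted figures after rounding.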
F. Fahjih (sitting or standing with the legs spread wide) is also used for the length of land; it is only used by ordinary people. Officially, the Department of Lands and Survey in Jordan uses the metre as a measure term for lengths and areas. Moreover, fahjih is used in casual conversations to refer to the distance between places. Again, its use here is also sarcastic, as fahji may refer to a distance of three kilometres or more; no one can predict the distance of a fahji.

(12). A: gadi:sh baʕi:dih ilmasafih min hu:n?
'How far is it from here?'
B: ya zalamih kulha fahjih.
'Oh, man! It is a fahjih (not far away).'

In this example, fahji is used as a mitigation to indicate the shortness of the distance. However, the distance can be longer than expected; speaker B does not want to disappoint speaker A.

5.3. Measure terms for unspecified distances
Jordanians use socially specific measure terms to refer to unspecified distances. These terms are usually employed when the speaker is uncertain about the distance to a place. Besides, they may be used ironically, especially when the place is far away and the speaker wants to reassure the listener so that s/he does not think about the length of the distance. The following expressions are the most popular in JSA.

5.3.1. Magraṭ ilʕasa مقرط العصا (literally, the length of a stick; pragmatically, not far away)
This measure term is very popular in Jordanian society, especially among elderly people. It refers to an unspecified distance, particularly when the distance is rather long and the speaker does not want to tell the listener the exact distance in order not to disturb him/her. Surprisingly, no one can tell how long that distance is.

(13). M1: Ya:zalami wi:n rayihi:n? shakloh ilmaka:n baʕi:d.
'Oh, man! Where are we going? It seems that the place is very far from here.'
M2: La ya:zalami, magraṭ ilʕasa.
'No, man! It is not far away from here.'
5.3.2. Farkit kaʕaib فركة كعب (literally, turning a heel; pragmatically, a short distance)
This term is also popular in JSA, as it refers to a short distance. Fahjih can be used interchangeably with this term. It is normally employed to encourage the other party to walk rather than go by car.

(14). M1: Khali:na inru:h bisayarah.
'Let us go by car.'
M2: la yazalmih ma hi Farkit kaʕaib.
'No, man! It is not far away.'

Jordanians may measure the distance to a place by how much time it takes to reach the destination, especially when the speaker is not quite sure about the exact distance from one place to another, or to denote the shortness of that distance. A person may say, biʔimkanak timshi:ha aw bissya:ra kulha btukhidh minnak khamis dag:ayig (You can walk or drive; it takes only five minutes from here to reach your destination).

6. Conclusions
Based on the data analysed above, the employment of body parts as measure terms to refer to heights, lengths and weights is socially and culturally inherited and transmitted from one generation to another in JSA; that is, the usage of these terms is specific to Jordanian society. Jordanians tend to use body parts, such as fingers, arms, feet and legs, as measure terms for heights and weights (liquid or solid amounts). Jordanians have special terms for particular amounts used for cooking measurements, weights and heights, and they employ them distinctly. This leads us to the assumption that there is a strong and mutual relation between language and the society in which it is employed. To illustrate, language can only be understood within its social context and the shared knowledge between the interlocutors. Further studies are recommended, especially in the field of translation, to investigate the translatability of these terms into English or other languages.

Ahmad Mohammad Al-Harahsheh
Translation Department
Yarmouk University
Email: [email protected]
Mobile: 00962779924822

Endnotes
1. Cited in Wardhaugh, Ronald (1986). An Introduction to Sociolinguistics. New York: Basil Blackwell Ltd.
2. kamshi becomes kamshit in compound.
3. See How Many? A Dictionary of Units of Measurement. Available at http://www.unc.edu/~rowlett/units/dictF.html
4. See How Many? A Dictionary of Units of Measurement. Available at http://www.unc.edu/~rowlett/units/dictF.html
5. See How Many? A Dictionary of Units of Measurement. Available at http://www.unc.edu/~rowlett/units/dictF.html
6. See How Many? A Dictionary of Units of Measurement. Available at http://www.unc.edu/~rowlett/units/dictF.html
7. See How Many? A Dictionary of Units of Measurement. Available at http://www.unc.edu/~rowlett/units/dictF.html

References
Gumperz, John. (2001). 'Interactional sociolinguistics: A personal perspective'. In Deborah Schiffrin, Deborah Tannen and Heidi E. Hamilton (eds.), The Handbook of Discourse Analysis, 215-229. Cambridge: Blackwell Publishers.
Guy, Gregory. (1988). 'Language and social class'. In Frederick Newmeyer (ed.), Linguistics: The Cambridge Survey, IV, Language: The Socio-cultural Context, 37-79. Cambridge: CUP.
Halliday, M.A.K. (2007a). 'Language and social man'. In Jonathan Webster (ed.), Language and Society, 65-130. London: Continuum.
Halliday, M.A.K. (2007b). 'Language in social perspective'. In Jonathan Webster (ed.), Language and Society, 43-65. London: Continuum.
Harris, Marvin and Johnson, Orna. (2000). Cultural Anthropology (5th ed.). Needham Heights, MA: Allyn and Bacon.
Holmes, Janet. (2006). Introduction to Sociolinguistics. Edinburgh: Pearson Education Limited.
Hudson, Richard. (1990). Sociolinguistics. Cambridge: Cambridge University Press.
Hymes, Dell. (1964). Language in Culture and in Society. New York: Harper & Row.
Hymes, Dell. (1974). Foundations in Sociolinguistics: An Ethnographic Approach. Philadelphia: University of Pennsylvania Press.
Labov, William. (1966). The Stratification of English in New York City. Washington, DC: Center for Applied Linguistics.
Meyerhoff, Miriam. (2006). Introducing Sociolinguistics. London: Routledge.
Pinto, Sara. (2012). 'Sociolinguistics and translation'. In Yves Gambier and Luc Van Doorslaer (eds.), Handbook of Translation Studies, Vol. III, 156-162. Amsterdam/Philadelphia: John Benjamins.
Sánchez, Maria. (2007). 'Translation and sociolinguistics: Can language translate society?' Babel, 53 (2): 123-131.
Saville-Troike, Muriel. (2003). The Ethnography of Communication: An Introduction (3rd ed.). New York: Blackwell Publishing.
Trudgill, Peter. (1983). Sociolinguistics: An Introduction to Language and Society. Harmondsworth, England: Penguin Books.
Wardhaugh, Ronald. (2006). An Introduction to Sociolinguistics. New York: Basil Blackwell Ltd.
International Journal of Arabic-English Studies (IJAES) Vol. 15, 2014

Corpus Linguistic Tools for Historical Semantics in Arabic

Omaima Ismail, Sane Yagi and Bassam Hammo
University of Jordan, Jordan

Abstract: In this paper, we present a set of corpus linguistic tools for conducting historical semantic research in the Arabic language. We compiled a Historical Arabic Corpus (HAC) that spans more than 1500 years of continuous language use. Drawing on techniques from the field of Natural Language Processing (NLP), the tools we present here have been used to create the HAC and to explore lexical semantic change. The development of these tools is aimed at offering a catalyst to the ambitious goal of compiling an Arabic dictionary on historical principles. The HAC and the tools can also be used for conducting research in a variety of areas of linguistics.

Keywords: Arabic historical corpus, diachronic semantics, etymology, computational lexicography, linguistic tools, NLP resources.

1. Introduction
Corpus Linguistics is a sub-discipline of the scientific study of language that uses machine-aided tools for the compilation, retrieval, and analysis of a large body of classified, machine-searchable texts that are representative of authentic language use. It uses frequencies, collocations, and phrase structures to make generalizations about language use.

A corpus is a database of a large body of classified machine-searchable texts that are selected to be representative of language as spoken or written in a specific geographic region or period of time, by a specific group of users, and/or for a specific function. The texts are often meta-encoded with information about the author, date, place, and medium of publication, genre, language, degree of representativeness, etc. They are also annotated at word and/or sentence level with information about a word’s part of speech (POS), grammatical function, morphological components, prosodic features, etc.
Corpora are the basis of a variety of recent linguistic studies. They are considered a good resource for research in natural language processing, language teaching, machine translation, language engineering, information retrieval, lexical analysis, lexicography, and many other areas (Al-Sulaiti and Atwell, 2006).

Only a few studies have been conducted on Arabic from a historical linguistic perspective. One reason for this is the fact that most NLP tools at the disposal of linguists have been geared to Modern Standard Arabic (MSA). The absence of an Arabic corpus on historical principles is at the root of this research deficiency.

Arabic has one of the longest traditions in lexicography, with the first full-fledged dictionary dating back to 786 C.E. It also has one of the richest bodies of
lexicography with dictionaries that are onomasiological, semasiological, alphabetical, retrogradic, encyclopaedic, terminological, etc. What it does not have is a dictionary that traces the semantic development of words, i.e., an etymological dictionary (a dictionary on historical principles).

Towards the ultimate end of initiating systematic work in the direction of historical linguistic research and lexicography, we have leveraged information technology. This paper aims at describing, previewing, and demonstrating a set of computational tools that would facilitate research in historical semantics and etymological lexicography. It provides a set of tools that were used to create and analyze the Historical Arabic Corpus (HAC) in order to extract historical semantic knowledge about Arabic.

The rest of the paper is organized as follows: Section 2 defines historical semantics and discusses semantic change. Section 3 provides a background on corpus linguistics and reviews some previous work. In Section 4, we describe the research methodology we followed in the development of the corpus tools. Section 5 shows some experiments that were conducted to illustrate the utility of our work. Section 6 concludes and draws a roadmap for future research.

2. Historical semantics and meaning change
Historical semantics is a scientific enterprise that studies diachronic change of meaning. It identifies, describes, and explains this change and attempts to discover the conditions that motivate it, the factors that regulate it, and the mechanisms that propagate it. It also studies the consequences of this change for meaning relations within and across semantic fields. Furthermore, historical semantics studies the conceptual history of a language community by investigating the etymology of words.
Fritz (2012) asserts that historical semantics is also “a research area where fundamental problems of semantics tend to surface and which can be seen as a testing ground for theories of meaning and for methodologies of semantic description” (p.2644).

On the benefit that historical semantics has gained from corpus linguistics, Fritz (2012) maintains that corpus linguistics has inspired historical semantics to reflect on the relationship between the collocations of a word and its senses. Historical corpora facilitate the study of gradual changes in contexts of use and make it possible to discriminate between new senses that are transient and new senses that become firmly established.

Semantic change, or meaning change, denotes the universal tendency of word meanings to become different over time by gaining new senses or losing old ones, replacing default senses, drifting in terms of word prototype, narrowing or widening category boundaries, pejoration or amelioration, and/or bleaching.

As in all languages, Arabic words change meaning over time. Below we present some examples of the major types of semantic shift that have affected Arabic words. These examples have been culled from our Historical Arabic Corpus to illustrate both some types of semantic change and the potential of our compiled corpus. Table 1 shows examples of words which underwent semantic
specialization. We found that many of the cases of semantic change are due to specialization (or semantic narrowing).

Table 1. Examples of words that underwent specialization
Word | Classical meaning | Meaning in modern contexts
أمانة ?amana | Trust | Safe deposit; honesty; integrity; fidelity; confidence
حَرَج haraj | Extreme distress | Shyness; modesty
نجم najm | A star in the sky | A celebrity

Broadening the senses of a word is another type of semantic change. Classical Arabic words that have undergone semantic generalization in MSA are few (cf. Table 2).

Table 2. Examples of words that underwent semantic generalization
Word | Classical meaning | Meaning in modern contexts
رؤيا ru:?ya | Good dream | Vision, trance, dream
حيث haythu | Adverb of place | Adverb (where)

Semantic pejoration (degradation) and semantic amelioration (elevation) also occur in modern Arabic contexts. Tables 3 and 4 illustrate both types.

Table 3. Examples of words that underwent semantic pejoration
Quranic word | Meaning in the Quran | Meaning in modern contexts
حيوان hayawa:n | True eternal life | Animal
إرهاب ?irha:b | Deterrence of aggressors | Indiscriminate assault
Table 4. Examples of words that underwent semantic amelioration
Word | Meaning in the Quran | Meaning in modern contexts
شيخ shaikh | Old man | Leader; prince; religious scholar
مَرَح marah | Falsehood | Delight and joy
زبانية zaba:niya | Guardian spirits of Hell | Accomplices

Now, let us focus on one word, شيخ ‘shaikh’, and inspect how it acquired elevated senses in modern times. Figure 1 shows a snapshot from the concordance of n-gram textual phrases for the word ‘shaikh’ in current newspaper editorials from a variety of Arab countries. Whilst this word meant 'old man' in Quranic Arabic, in Modern Arabic it means 'old man' as well as 'chief', 'religious leader', 'supreme religious leader', and 'scholar'.

Semantic transition is a further type of semantic change, in which the original meaning of a word transforms into an entirely different sense. Table 5 shows a few examples of this type.

Table 5. Examples of words that underwent semantic transition
Quranic word | Original meaning in Quran | Meaning in modern context
جهاز jiha:z | Furniture | Machinery
ناصبة nasiba | Exhausting | Erecting; fraudulent
يشري yashri: | Sells | Buys

Fig 1. A snapshot from the concordance, tracking the word شيخ ‘shaikh’ in MSA
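A concordance view of the kind referred to for ‘shaikh’ lists each occurrence of a keyword together with a window of neighbouring words. The sketch below shows one minimal way to produce such keyword-in-context lines; it is not the authors' tool, and the function name `kwic`, the whitespace tokenization, and the toy English sample sentence are all assumptions made for illustration. Real Arabic text would need proper tokenization and normalization first.

```python
def kwic(text, keyword, window=2):
    """Return (left context, keyword, right context) for every exact match.

    Tokenization is a plain whitespace split, which is a simplifying
    assumption; `window` is the number of tokens kept on each side.
    """
    tokens = text.split()
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), token, " ".join(right)))
    return hits


if __name__ == "__main__":
    # Toy sample (hypothetical), standing in for corpus text.
    sample = "the shaikh of the tribe met the shaikh of the mosque"
    for left, key, right in kwic(sample, "shaikh"):
        # Right-align the left context so the keyword column lines up.
        print(f"{left:>12} [{key}] {right}")
```

Grouping the returned contexts by period or source would be one way to compare how the collocates of a word differ between older and modern texts.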
Table 6 illustrates how morphological change plays an important role in the semantic development of words. In modern contexts, some plural forms of Quranic words are now used as singular and new plural forms have been coined.

Table 6. Examples of Quranic words that underwent morphological change
Quranic word | Meaning in the Quran | Meaning in modern contexts
حاج ‘ha:j’ | Pilgrims | One pilgrim, the plural being حُجّاج ‘huja:j’
نفر ‘nafar’ | A group of people | One person, the plural being أنفار ‘anfa:r’

3. Background and literature review
As corpora are large and structured collections of texts, electronically stored and processed, they are good resources for linguistic research, natural language processing, and data mining. In order to make corpora more useful for linguistic research, they are often annotated with different information depending on the purpose for which the corpus is used. For example, a corpus could be annotated with Part of Speech Tags (POST) and with morphological information such as a word’s lemma, stem, root, morphological pattern, etc.

Corpora are considered fundamental to computational linguistic research and to the study of language use in the real world. They are used in lexicography, translation, language learning and teaching, data mining, etc.

Historical corpora consist of texts from periods that span the entire history of a language or part of it, and they are used by historical linguists to explore the development of the language over time. Such corpora reveal how words changed meaning and how grammatical structures changed. Historical corpora are indispensable for the compilation of historical dictionaries, not only for tracing the changes in a word’s meaning but also for providing quotations that illustrate the senses of a word. Arabic is in dire need of a dictionary on historical principles.

There are many Arabic corpora with a range of structures and annotations.
Ismael, Yagi and Hammo Corpus Linguistic Tools for Historical Semantics ...

In terms of annotation, Al-Sulaiti (2004) developed a corpus of contemporary Arabic, which included modern standard Arabic texts and samples of colloquial varieties. The purpose of this corpus was to enrich resources for teaching and researching Arabic. The corpus contained one million words, marked up with Extensible Markup Language (XML). They were collected mainly from magazines, newspapers, websites, and radio stations. XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. Al-Sulaiti’s XML encoding contains tags for general information about the texts but no tags for morphological or POS information.

Alansary, et al. (2007) built the International Corpus of Arabic (ICA) for the purpose of evaluating methods of information extraction from Arabic documents. They planned for it to contain 100 million words of Modern Standard Arabic, selected from a range of resources. They also built software for the ICA to query the corpus and to insert documents. It can detect the genre and source of inserted documents and it places them in an appropriate hierarchy. The structure of the corpus, however, is not encoded in XML. Alansary, et al. used annotation to add information about the inserted documents but they never used morphological annotation. They later extended their software with a morphological analysis module that utilized the Buckwalter analyzer (Alansary, et al., 2008).

Some other XML structures were used for lexicography purposes, e.g., the iSPEDAL. In the light of their own research, the authors proposed an improved structured electronic dictionary in the form of a relational database or in the form of XML documents; most Arabic dictionaries are found in flat textual form. The system contains such data as affixes, morphological patterns, and derived words, and all words are linked to their roots (Hajjar, et al., 2010).

Attia, et al. (2010) used Arabic corpora of MSA to build a lexicon encoded in Lexical Markup Framework (LMF). Their model was designed to obtain lexical information from corpora automatically for the purpose of constructing a large lexical resource. They also provided a complete description of the inflectional and syntactic behavior of Arabic lexical entries. They developed a web system called AraComLex to help in lexicographic work, which featured morphological patterns, sub-categorization frames, and Arabic lemmas (Attia, et al., 2011).

Another work is the SALAH project. Boella, et al. (2011) proposed a model for segmenting and linguistically analyzing classical Arabic texts of Prophet Muhammad’s traditions (Hadieth) and the narratives on his life and deeds (Sunna). It divides text units into the transmitters’ chain (isnad) and the text content (matn).
The final system outputs an XML format that contains relations among transmitters and a lemmatized text corpus that is used in the automatic generation of concordance texts. The authors suggest that the system be used for information retrieval from Hadieth texts, and for verifying relations between transmitters.

One of the most important Arabic corpora is the Quranic Arabic Corpus, an annotated corpus of Quranic text that uses dependency grammar to provide multiple layers of annotation, including morphological segmentation, part-of-speech tagging, and syntactic analysis. The Quranic corpus is automatically annotated by the Buckwalter Arabic Morphological Analyzer and is then manually verified. The fully annotated corpus can be browsed online, and is encoded in both XML and plain text format (Dukes and Habash, 2010). Another work on the Quran presents a Quranic corpus tagged with personal pronouns and antecedents, which its authors named QurAna. SimQur (Sharaf and Atwell, 2012 a; Sharaf and Atwell, 2012 b) is yet another Quranic corpus, but it is concerned with semantically related verses. These corpora are resources for Quran and Hadieth scholars and students, as well as computational linguists and computer scientists.

The majority of the corpora above, however, either use only simple structures and annotations, or are limited in scope and restricted to primary religious texts (i.e., the Quran and Hadieth). None is concerned with how the Arabic language developed over time.

On the other hand, there are numerous English historical corpora that contain samples of texts from earlier eras. These corpora are used to study language variation in earlier periods of English as well as language change and
development. The Helsinki Corpus (2011), for instance, is a structured multi-genre corpus that includes text samples from Old, Middle, and Early Modern English, organized by period. It can be used for giving general information on the occurrence of forms, structures and lexemes in different periods of English. Another English historical corpus is ARCHER (2014) (A Representative Corpus of Historical English Registers), which is a multi-genre corpus of British and American English covering the period of 1600-1999.

Other languages have historical corpora as well. For example, Sánchez-Marco, et al. (2011) present a general method that adapts existing NLP tools to facilitate dealing with historical varieties of languages. They implemented these tools for Old Spanish and used them to automatically add linguistic information to texts and to annotate them for their historical corpus.

To date, no historical corpora for Arabic have been published, nor have techniques been developed for dealing with historical Arabic texts or annotating them. Most Arabic corpora are created manually or by simple tools that compile texts in an XML format and add annotation as meta-data. Khoja (2009) created a software application that can download RSS feeds to compile a corpus. It uses Arabic blogs of both modern standard and colloquial Arabic. The software converts the blogs into a corpus encoded in XML. It uses such tags for meta-data as author, gender, country, blog URL, etc.

O’Donnell (2008) also created the UAM Corpus Tool, which is an application for annotating texts using different linguistic layers, where the user can define the hierarchy of tags appropriate for the layer. The annotations can be at document layer level (e.g., text type, writer, register, etc.), at semantic-pragmatic level, and at syntactic level (e.g., clause, phrase).
While the central task of the corpus tool is annotation, it also provides other functionalities, such as cross-layer searching, semi-automatic tagging, production of statistical reports, visualization of the tagged corpus, inter-coder reliability statistics, etc. It also enables users to add annotation manually, and it stores the annotation data using XML.

Other tools commonly used with corpora are concordancers, which are used for analyzing corpora and for searching for and retrieving words in context. Concordances have been shown to be an effective aid in the acquisition of a second or foreign language, because they facilitate the learning of vocabulary, collocations, grammar, and writing styles. Linguists can use concordance output to understand language behavior at morphological, lexical, syntactic, and semantic levels. Lexicographers can also use concordance output to identify multiple senses of a word. There are numerous concordancing tools for English and European languages, like MonoConc, WordSmith, WordPilot, etc., and most of them are commercial products. Little work has been done to support Arabic and its unique morphological features.

Anthony (2005) created AntConc, which is a multiplatform, multipurpose freeware corpus analysis toolkit, designed specifically for use in the classroom. It consists of a concordancer, a word and keyword frequency generator, tools for cluster and lexical bundle analysis, and a word distribution plotter. It also offers the choice of simple wildcard searches or regular expression searches. Although it can handle corpus text in UTF-8 encoding, it is not very efficient in handling Arabic texts.
Roberts, et al. (2006) built a concordancer called aConCorde. It supports Arabic, has both Arabic and English interfaces, and can be used on multiple platforms. aConCorde provides an interactive stem-based searching facility and displays Arabic text correctly, but it still has some limitations: users cannot view the full text of a selected item. It is proposed as a data-driven language learning tool for Arabic and as a tool for lexicographers and linguists.

Abbès and Dichy (2008) developed AraConc as interactive software specifically for Arabic. It integrates the Arabic word-form analyzer and generator (MorphArab) that is based on a lexicon generated from the DIINAR.1 knowledge database (DIctionnaire INformatisé de l’ARabe, version 1). The AraConc software allows the building of new lexical resources, and widens the descriptive scope of the analyses used to construct DIINAR.1. AraConc inputs texts, extracts word-forms, updates their occurrences, and sends them one by one to the morphological analyzer (MorphArab). Analyses are then saved together with the position of the word in a document, and dispatched into specific files. The results of the analyses are also stored in a relational structure, which offers users a great number of choices in the grouping of output information and in statistics.

It is evident from the above that the available tools are useful, but there is a need for work that provides well-structured schemas and tools for Arabic corpora, tools that are capable of handling morphological annotation and of building an annotated corpus automatically. Historical corpora for Arabic have not received enough attention from scholars; hence, this research addresses the construction, querying, and analysis of an Arabic historical corpus.
NLP in the Arabic language is still in its initial stage compared to the work in the English language, which has already benefited from extensive research by scholars from all over the world. There are obstacles that slow down progress in Arabic NLP compared to the accomplishments in English and other European languages (Al-Daimi and Abdel-Amir, 1994). These hurdles include:
• Arabic is highly inflectional and derivational, which makes morphological analysis a very complex task.
• The absence of diacritics (which represent short vowels) in the written text creates ambiguity and, therefore, complex morphological rules are required to identify the tokens and parse text material.
• Capitalization is not used in Arabic, which makes it hard to identify proper names, acronyms, and abbreviations.

In addition to these, there is also a lack of free Arabic corpora, lexicons, and sophisticated machine-readable dictionaries, resources that are essential to advancing research in NLP. In the last decade, the volume of Arabic textual data on the web has started to grow and Arabic software for browsing the web is improving. Unfortunately, many of the Classical Arabic texts available on the web were posted as images, which makes them unsuitable for searching and machine processing. There is, however, a gradual increase in the amount of Arabic news material available on the web.

4. Methodology and developed tools

Compiling a historical corpus is not our primary target but rather the development of tools for building such a corpus, tools that can automatically
annotate texts and categorize them per historical era. Here, we will explain our framework, corpus data, the proposed schema, the tools we developed, and our system’s architecture, in addition to the tools’ features and functionalities.

Corpus data

Before collecting the corpus data, we considered the issue of textual representation. The most important requirement was that the data must cover all time periods of Arabic. That is why we classified the corpus data time-wise, starting from pre-Islamic times until the current century, and divided this time span into periods of 100 years each, calling them eras.

We also classified the corpus data into primary and secondary sources on the basis of how representative a text is of its time of authorship. Primary texts are poetry and literary prose, as well as non-fiction that does not comment on texts of older eras. Secondary texts, by contrast, are commentaries on older texts: the commentary is expected to reflect how language was used at the time when the commentator lived, while the text being explained shows how people of older times used the language. Quran exegesis and critical commentary on poetry of older times are two examples of secondary texts.

Another factor we considered is a text’s genre. It is linguistically well established that genre affects the language used in a text; hence, we classified our corpus texts into Literary Prose, Poetry, History, Philosophy, Religion, Science, Thought, Dictionaries, and Others.

In addition to era, genre, and primary/secondary categorization, we collected general information about the texts to be compiled into the corpus, such as document title and author. Table 7 illustrates the variety of texts that make up our corpus on historical principles and how they have been annotated in terms of author, era, genre, and category.
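The era, genre, and category classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and class names are hypothetical, the label format follows the era labels shown in Table 7, and the treatment of boundary years (a year such as 700 is assigned to the era that begins with it) is an assumption, since the paper does not specify it.

```python
from dataclasses import dataclass

# Genre list as given in the text above.
GENRES = ["Literary Prose", "Poetry", "History", "Philosophy", "Religion",
          "Science", "Thought", "Dictionaries", "Others"]


def era_label(year: int) -> str:
    """Map a year of authorship (C.E.) to a 100-year era label.

    Assumption: boundary years belong to the era they open.
    """
    if year < 600:
        return "Before 600 C.E."
    if year >= 2000:
        return "After 2000 C.E."
    start = (year // 100) * 100
    return f"{start}-{start + 100} C.E."


@dataclass
class DocumentMeta:
    """Hypothetical container for the metadata collected per document."""
    title: str
    author: str
    year: int        # approximate year of authorship, C.E.
    genre: str       # one of GENRES
    category: str    # "Primary" or "Secondary"

    @property
    def era(self) -> str:
        return era_label(self.year)


meta = DocumentMeta("Sample title", "Sample author", 1350, "History", "Primary")
print(meta.era)
```

Deriving the era from a year keeps the era labels consistent across documents, although a real corpus would probably need manual dating for texts whose year of authorship is uncertain.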
XML schema

XML is an acronym for Extensible Markup Language. This is an open and popular standard for marking up text in a way that is both machine and human readable. By ‘marking up text’ we mean that the data in the text files is formatted to include meaningful symbols (tags) that indicate what the data represents. We designed a novel XML schema for the corpus, creating tags for each document’s metadata and for a text token’s morphological annotation, and stored each token in a single tag together with its annotation attributes. The annotations we stored with each token are: root, morphological pattern, POST, and light stem.

Corpus Builder

The corpus builder application was developed to compile the corpus and encode it in the proposed XML schema automatically. The corpus builder takes as input a text file encoded in UTF-8 together with its document meta-data, and processes the document. Furthermore, we decided to integrate a stemmer and a POS tagger in our corpus builder in order to create the corpus as required. We adapted Khoja’s Arabic Stemmer (Khoja, 1999) to extract word roots and
morphological patterns and stems. For tagging, we used the Stanford Part-Of-Speech Tagger (Toutanova et al., 2003). The Corpus Builder application:
1. Uploads a document with its meta-data
2. Processes and tokenizes the document’s text
3. Extracts roots of words using Khoja’s modified Arabic Stemmer
4. Retrieves the morphological pattern and light stem for each word in the document
5. Tags each word with its part of speech using the Stanford Part-Of-Speech Tagger
6. Compiles all information into XML
7. Produces a morphologically and syntactically annotated XML file of the processed document.

Table 7. Sample of corpus documents
Document Title | Author | Era | Genre | Category
ديوان امرئ القيس | امْرُؤُ القَيْس | Before 600 C.E. | Poetry | Primary
العين | الفراهيدي | 700-800 C.E. | Dictionaries | Secondary
البخلاء | الجاحظ | 800-900 C.E. | Literary prose | Primary
القانون فى الطب | ابن سينا | 1000-1100 C.E. | Science | Primary
الإحاطة في أخبار غرناطة | لسان الدين بن الخطيب | 1300-1400 C.E. | History | Primary
تفسير الجلالين | جلال الدين المحلّي وجلال الدين السيوطي | 1400-1500 C.E. | Religion | Secondary
آراء أهل المدينة الفاضلة | الفارابي | 1500-1600 C.E. | Philosophy | Primary
الأعمال الشعرية الكاملة لإبراهيم طوقان | إبراهيم طوقان | 1900-2000 C.E. | Poetry | Primary

Figure 2 shows a portion of an annotated XML file, and Table 8 lists the symbols representing the annotations made in steps 2-5 above.
Figure 2: Portion of an XML Corpus File

Table 8. Word tag attributes
Attribute | Description
No | The sequence number of the word in context
V | The word value as a token (the word itself)
R | Root of the word
Ptn | Morphological pattern of the word
POST | Part of speech tag for the word in context
Lem | Light stem of the word
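To make the token-per-tag design concrete, the following is a minimal Python sketch of how a builder might emit one XML element per token carrying the Table 8 attributes. It is not the authors' schema or code: the element names `document` and `w`, the metadata attribute names, and the `placeholder_annotate` function are assumptions made for illustration. In the described system the annotations come from the adapted Khoja stemmer and the Stanford tagger, which are replaced here by an empty stub.

```python
import xml.etree.ElementTree as ET


def placeholder_annotate(token: str) -> dict:
    """Stand-in for the stemmer and POS tagger (hypothetical).

    Returns empty annotations and uses the token itself as its light stem.
    """
    return {"R": "", "Ptn": "", "POST": "", "Lem": token}


def build_document(meta: dict, text: str, annotate=placeholder_annotate) -> ET.Element:
    """Tokenize on whitespace and store each token in one element
    together with its annotation attributes (names follow Table 8)."""
    doc = ET.Element("document", attrib=meta)  # element name is an assumption
    for no, token in enumerate(text.split(), start=1):
        ann = annotate(token)
        ET.SubElement(doc, "w", attrib={
            "No": str(no),        # sequence number of the word in context
            "V": token,           # the word itself
            "R": ann["R"],        # root
            "Ptn": ann["Ptn"],    # morphological pattern
            "POST": ann["POST"],  # part-of-speech tag
            "Lem": ann["Lem"],    # light stem
        })
    return doc


meta = {"title": "Sample", "author": "Unknown", "era": "1300-1400 C.E.",
        "genre": "History", "category": "Primary"}
doc = build_document(meta, "جهاز البيت")
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Keeping the annotations as attributes of a single token element means that a reader of the file can recover the word, its position, and all its annotations from one node, which is what makes later indexing by root or stem straightforward.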
Historical Arabic Concordancing and Searching System (HACSS)

The HACSS was created to facilitate tracing the development of linguistic aspects of Arabic across time. It does not use a database engine but rather XML corpus files and other XML files purposefully created for information retrieval. This system consists of four main modules: term indexer, term search engine, concordancer, and dictionary editor.

Indexer

In Computer Science, an ‘index’ is “a list of keywords associated with a record or document, used especially as an aid in searching for information” (footnote 1). Indexes are used in information retrieval systems to save search time, especially when the number of documents to search through is huge. In full text searching, and when all results are equally needed, a simple inverted index or a Boolean index is enough for searching and retrieving all occurrences of a search term in all documents. The inverted index contains a list of references to documents associated with each term.

In HACSS, we created a set of index files for word stems and another set for roots. Each index file stores the terms that start with one of the letters of the alphabet. A corpus XML document is indexed once by identifying its unique terms (words and roots), then storing each of them in the appropriate index file in accordance with their word-initial letters. If the term already exists in the index file, we add only its new reference.

Search Engine

The HACSS search engine can search for a word or a phrase, a root, or a morphological pattern, and is capable of retrieving words and their contexts from any specified era. The Search Engine provides this set of functionalities:
1- Different types of search:
Searching by Word: The search engine looks for words in their light stem form. It also offers the chance to search for a word by exact or partial matching.
Searching by Root: This will retrieve all the words derived from that root that occurred in the corpus.
Searching by Morphological Pattern: This feature will retrieve all words that were coined in the morphological template of the search term.
2- The system provides a list of eras and a list of genres to search within. The user can search for a term in a specific historical era, or in a specific genre.

Footnote 1: http://www.thefreedictionary.com/indexing

Concordancer

The search engine’s results are displayed by the concordancer. It extracts the matched terms and their immediate contexts and compiles them in the form of a concordance list. The size of the context is determined by the user, who can specify the number of words preceding and following the search term to be extracted and displayed. The concordancer also displays morphological information stored in the corpus with the searched term (i.e. the root, pattern, and POST) in
addition to the source document metadata (i.e. document title, author, genre, and historical era). Figure 3 shows a concordance page for the term ‘Jiha:z’.

Figure 3: A Sample Concordance Page for the Search Term ‘Jiha:z’

Dictionary Editor

The dictionary editor is an interface that the lexicographer interacts with. It fetches sentences from the concordance and plugs them into the example slot in the dictionary entry. It also extracts from the annotations and meta-tags such information as root, morphological pattern, part of speech, text author, and historical era, and plugs them into the appropriate slots in the Dictionary Entry Form. It gives lexicographers the facility to add senses as deemed necessary, and to save the full dictionary entry in a Word document. The dictionary editor is designed to make the work of lexicographers easier and to save time. Figure 4 shows a snapshot of the dictionary editor’s interface.
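The indexer, search engine, and concordancer described in this section can be illustrated with a small in-memory Python sketch. It is a simplification under several assumptions: HACSS stores its indexes in XML files, whereas this sketch uses nested dictionaries bucketed by a term's initial letter; "partial matching" is interpreted here as prefix matching; and all class names, function names, and the transliterated toy tokens are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    v: str    # word as it appears in the text
    lem: str  # light stem
    r: str    # root


@dataclass
class Document:
    title: str
    era: str
    genre: str
    tokens: list


def build_index(docs, key):
    """Inverted index: index[initial letter][term] -> [(doc no, position)]."""
    index = defaultdict(lambda: defaultdict(list))
    for d, doc in enumerate(docs):
        for pos, tok in enumerate(doc.tokens):
            term = getattr(tok, key)
            if term:
                # An existing term only receives a new reference.
                index[term[0]][term].append((d, pos))
    return index


def search(index, term, partial=False):
    """Exact lookup, or prefix lookup when partial=True (an assumption)."""
    if not term:
        return []
    bucket = index.get(term[0], {})
    if not partial:
        return list(bucket.get(term, []))
    return [ref for t, refs in bucket.items() if t.startswith(term) for ref in refs]


def concordance(docs, refs, before=3, after=3, era=None, genre=None):
    """Build concordance lines with a user-chosen context window,
    optionally restricted to one era and/or one genre."""
    lines = []
    for d, pos in refs:
        doc = docs[d]
        if era is not None and doc.era != era:
            continue
        if genre is not None and doc.genre != genre:
            continue
        words = [t.v for t in doc.tokens]
        left = " ".join(words[max(0, pos - before):pos])
        right = " ".join(words[pos + 1:pos + 1 + after])
        lines.append((doc.title, doc.era, left, words[pos], right))
    return lines


# Toy data in Latin transliteration (illustrative only).
docs = [
    Document("A", "1800-1900 C.E.", "History",
             [Token("naqalu", "naqal", "nql"), Token("jihaz", "jihaz", "jhz"),
              Token("al3arus", "3arus", "3rs")]),
    Document("B", "After 2000 C.E.", "Science",
             [Token("aljihaz", "jihaz", "jhz"), Token("alhadmi", "hadmi", "hdm")]),
]
root_index = build_index(docs, "r")
stem_index = build_index(docs, "lem")
for line in concordance(docs, search(root_index, "jhz"), before=1, after=1):
    print(line)
```

Bucketing terms by initial letter mirrors the one-index-file-per-letter layout described above; searching by morphological pattern would follow the same scheme with a third index keyed on the pattern attribute.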
Figure 4: Dictionary Editor’s interface

5. Experiments and results

Here is a demonstration that uses our corpus tools to show word usage and the different senses of some terms that changed over time.

Change over time

Words undergo semantic change over time, and the word ‘Jiha:z’ (جهاز) is no exception. Table 9 below shows sample sentences of this word extracted from our corpus.
Table 9: Senses of “Jiha:z” over time extracted from HAC
Era | Genre | Source | Word in Context | English Meaning
1100-1200 C.E. | Dictionaries | مفردات القرآن للأصفهاني | الجهاز ما يعد من متاع وغيره، (قال تعالى: فلما جهزهم بجهازهم) | Belongings
1300-1400 C.E. | History | الإحاطة في أخبار غرناطة | وأخذ من بنات الدهر حقي جهاز البيت استلب استلابا | Furniture
1800-1900 C.E. | History | عجائب الآثار في التراجم والأخبار للجبرتي | وفرشوها بأنواع الفرش الفاخرة ونقلوا إليها جهاز العروس والصناديق وما قدم إليها من الهدايا | Clothes
1900-2000 C.E. | Thought | موسوعة اليهود واليهودية والصهيونية | حكومة الانتداب البريطاني هناك قررت تجنيدهم داخل الجهاز الحكومي كموظفين حتى يمكنها أن تبقيهم بمعزل | Staff
After 2000 C.E. | Science | تطبيقات البكتريا في المجال الطبي | وتتجاوز أنواعها الموجوده في الجهاز الهضمي أكثر من (3000) نوع من البكتريا | (Digestive) System
After 2000 C.E. | Science | تطبيقات البكتريا في المجال الطبي | يقومون بتسجيل وجود أي أثر للمتفجرات من خلال جهاز حديث تم تصنيعه في معامل أبحاث نيومكسيكو | Apparatus

HACSS is capable of calculating and graphing the frequency of a term’s usage over time. It shows the trend of popularity of a term across all eras. Figures 5 and 6 are charts that plot the chronology of some words in the corpus.

Figure 5: Timeline chart of the word siyasa (سياسة) ‘politics’
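The per-era frequency calculation behind such timeline charts can be sketched as follows. The paper does not state how HACSS computes or normalizes frequencies, so this is an assumption-laden illustration: the function name and data layout are hypothetical, tokens are represented as dictionaries keyed by the Table 8 attribute names, and the relative frequency (hits divided by the number of tokens in the era) is added here because raw counts are hard to compare across eras of different sizes.

```python
from collections import Counter


def frequency_by_era(docs, term, key="Lem"):
    """Count a term per era.

    docs: iterable of (era label, list of token dicts).
    Returns {era: (raw count, relative frequency)} in first-seen era order.
    """
    hits, totals = Counter(), Counter()
    for era, tokens in docs:
        totals[era] += len(tokens)
        hits[era] += sum(1 for tok in tokens if tok.get(key) == term)
    return {
        era: (hits[era], hits[era] / totals[era] if totals[era] else 0.0)
        for era in totals
    }


# Toy data in Latin transliteration (illustrative only).
docs = [
    ("1300-1400 C.E.", [{"Lem": "siyasa"}, {"Lem": "malik"}]),
    ("1900-2000 C.E.", [{"Lem": "siyasa"}, {"Lem": "siyasa"},
                        {"Lem": "hukuma"}, {"Lem": "dawla"}]),
]
trend = frequency_by_era(docs, "siyasa")
for era, (count, relative) in trend.items():
    print(era, count, relative)
```

Passing `key="R"` instead of `key="Lem"` would give the timeline for a root rather than a word, which corresponds to the root-based chart mentioned above.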
Figure 6: Timeline chart of the root ʿtq (عتق) ‘emancipation’

6. Conclusion and future work

The Historical Arabic Corpus (HAC), together with the developed tools, provides a good resource for extracting historical semantic knowledge. HAC will enable linguists to study the Arabic language from a historical perspective, and will facilitate the work of lexicographers.

We have developed a set of tools for the creation and manipulation of a historical Arabic corpus. The corpus builder integrates a stemmer with a tagger to process and annotate documents, and then compiles them into an XML corpus that uses a novel schema. We also created an indexer, a search engine, a concordancer, and a dictionary editor that together facilitate searching and extraction of linguistic knowledge from HAC, and facilitate the compilation of dictionary entries in a hypothetical dictionary on historical principles.

We aim to enhance HAC in the future by rendering more accurate annotation, expanding the corpus so that it becomes more representative of Arabic through time, optimizing and adding functionalities to the search engine and concordancer, and responding to linguists’ needs by offering them more flexibility.

Omaima Ismail
Computer Information Systems Department
University of Jordan, Amman, Jordan

Sane Yagi
Department of Linguistics
University of Jordan, Amman, Jordan

Bassam Hammo
Computer Information Systems Department
University of Jordan, Amman, Jordan