Derivational Morphology in Urdu: A Lexical Morphology Approach

From the theoretical perspective of lexical morphology (LM), this paper analyzes neutral and non-neutral affixes and their general organizational position in the morphology of derived words in Urdu. To test the assumptions of LM, it explores the properties and behavior that Urdu affixes display when they attach to, or are inserted into, roots/bases to produce new words. Nine hundred and eighty sample words were randomly selected from our observations, articles in Urdu newspapers, and Urdu news television channels in Pakistan. While LM is useful for the analysis of neutral and non-neutral affixes, its assumptions concerning the hierarchical organization of affixes in derived word-formations do not correspond with the morphology of words in Urdu. This paper is an initial step toward formulating a theory of the morphology of derived words in Urdu, a language whose derivational morphology has rarely been analyzed theoretically.


Introduction
This paper primarily focuses on whether the general derivational behavior of words in Urdu corresponds to the theoretical assumptions of Lexical Morphology (LM), which was developed from the general derivational and inflectional behavior of words in English. To analyze this issue, the paper attempts to answer the following three questions: (1) Which affixes in Urdu are neutral and which are non-neutral? (2) Are these neutral and non-neutral affixes organized hierarchically (as the theory supposes them to be) in a derived word-formation that contains both types of affixes? (3) Do the general derivational behaviors of words in Urdu pose any challenge to the theoretical assumptions of LM? By analyzing, from the theoretical perspective of LM, the properties of affixes (prefixes, suffixes, infixes, inter-fixes, circumfixes and transfixes) in Urdu and their effect on the consonant, vowel and stress segments of root words in derivational and inflectional processes, this paper identifies the areas that the theory needs to account for if its assumptions are to hold for Urdu.
Urdu is distinct in its processes of word formation as it borrows affixes, roots and stems from Arabic, Persian and native Urdu sources and organizes them into derivatives in a different way from English. "The morphological structures [in Urdu] which are apparent as a whole are also an amalgamation of the morphological structures from these three sources" (Mangrio 2016: 1). Linguists often treat Urdu and Hindi as the same language due to their having very similar phonological processes (ibid). While Sanskrit can be taken as the mother of both languages, Urdu is highly influenced by Persian and Arabic, and Hindi by Sanskrit, so they differ from each other in some lexical, morphological and phonetic elements despite sharing several common features (ibid). In addition to being the national language of Pakistan and widely spoken there, Urdu is used for communication by many in India, Bangladesh, Afghanistan and Nepal, and by South Asian immigrants all over the world. Over the last decade, research on syntactic and lexical aspects of Urdu, and on aspects of morphological integration and code-switching, has gradually developed (Ahmed & Hautli, 2015; Khan, 2020; Malik, 2017; Raza, 2015). However, the lack of research offering a theoretical analysis of the morphology of words in the language is surprising, and it is also an obstacle to this paper's objectives. The paper therefore takes guidance from the theory (LM) as applied to English in previous studies (Katamba 1993, Kaisse & McMahon 2011, Kiparsky 1982). Regarding descriptive work on the language, this paper draws on the works on Urdu by David, Maxwell, Browne and Lynn (2009) and Mangrio (2016). To explore the properties of affixes and their hierarchical organization in a derived word that includes stratum 1 and stratum 2, this paper analyzes a sample of 980 derived words in Urdu.
These sample words were selected randomly from the researchers' own observations (as the researchers are native speakers of Urdu in Pakistan), from articles in Urdu newspapers and Urdu news television channels in Pakistan. While collecting the sample, strenuous effort was made to ensure that it covered a wide variety of affixes that are generally used to create derived words in Urdu. Though we are conscious of the Persian or Arabic or Sanskrit source of the affixes, we do not give this much importance; rather, we focus our efforts on the analysis of their usual effect on the root word within the derivational process and their structural position within the derived word in Urdu. Based on the analyzed properties of these sample affixes and their organizational position in the morphology of the derived words, this paper, in its conclusion, makes some general comments regarding the general morphological structure of derived words in Urdu. However, it does not claim to have covered all the affixes and structural complexities within derivations. Rather, in its significance for future research, it aims to be an initial step in the theoretical analysis of the morphology of derived words while emphasizing the need for further research to ultimately come up with some more inclusive theory to comprehensively describe the morphological structure of derived words in Urdu.

Lexical Morphology/Phonology
The theory of lexical morphology/phonology can be referred to as lexical morphology (LM), lexical phonology (LP), or lexical phonology and morphology (LPM). For this theoretical approach, it is the whole word, rather than the morpheme, that is the key unit of morphological analysis. By focusing on individual words as the unit of analysis, it aligns itself with the word-based models of traditional, pre-structuralist approaches to morphology, and modern word-and-paradigm morphology, in contrast to the morphological models of American structuralists in which the morpheme is the central unit of analysis. On the basis of their phonological behavior, LM groups English affixes in two broad classes: neutral affixes and non-neutral affixes. Neutral affixes have no phonological effect on the base to which they are attached. (Examples from English are used here because the theory focused on English during its development; examples from Urdu are given in the analytical section.) For example, abstract (adj.) [ˈabstrakt] becomes abstractness (n.) [ˈabstraktnəs] through the addition of the suffix [-ness] without undergoing any major change in the consonant, vowel or stress segments. [-ness] is therefore a neutral affix. Non-neutral affixes, however, modify in some way the consonant or vowel segments, or the location of stress, in the base to which they are attached (Katamba, 1993). For example, grammar (n.) [ˈɡramə] becomes grammarian (n.) [ɡrəˈmeːrɪən] through the addition of the suffix [-ian], which causes changes in the vowel segments and stress in the root. [-ian] is, thus, a non-neutral affix. A key assumption of lexical morphology is that the morphological components of a derived word are organized in a series of hierarchical strata (Allen, 1978; Halle & Mohanan, 1985; Katamba, 1993; Kiparsky, 1982).
In a multi-layered derived or inflected word, non-neutral affixes (also called stratum 1 affixes) come closer to the root than neutral ones (also called stratum 2 affixes). This means that stratum 1 affixes appear in the inner layer and stratum 2 affixes in the outer layer of a derived/inflected word that contains both types of affixes. For example, competitiveness (n.) [kəmˈpetɪtɪvnəs] contains [-tive] (a non-neutral affix) closer to the root than [-ness] (a neutral affix). Kiparsky (1982) assumes that all irregular inflection (e.g. see ~ saw (past tense)) and irregular derivation (e.g. long (adj.) ~ length (n.)) belong to stratum 1, regular derivation (e.g. kind (adj.) ~ kindly (adv.)) and compounding to stratum 2, and regular inflection (e.g. walk ~ walked (past tense)) to stratum 3. However, Katamba (1993) reduces the lexical strata to only two by proposing that all irregular inflection and derivation happens at stratum 1 and all regular derivation, inflection and compounding at stratum 2. Another important assumption of the theory is that there is a symbiotic relationship between the morphological and phonological rules of a word's formation. The rules that dictate the way a word is pronounced are inter-related with the rules that dictate the way the same word is structured. The output of each layer of derivation must be a possible word that does not violate the well-formedness constraint of the language. Each layer of derivation also needs to pass through the phonological rules that determine how the resulting word is to be pronounced.
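The two proposals on the number of strata summarized above can be placed side by side in a short sketch. The Python fragment below is only an illustration: the names Process, KIPARSKY_1982, KATAMBA_1993 and stratum_of are hypothetical labels introduced here, and the tables encode nothing beyond the summary given in this section.

```python
from enum import Enum


class Process(Enum):
    """Word-formation process types named in the summary above."""
    IRREGULAR_INFLECTION = "irregular inflection"   # e.g. see ~ saw
    IRREGULAR_DERIVATION = "irregular derivation"   # e.g. long ~ length
    REGULAR_DERIVATION = "regular derivation"       # e.g. kind ~ kindly
    COMPOUNDING = "compounding"
    REGULAR_INFLECTION = "regular inflection"       # e.g. walk ~ walked


# Three-stratum model as summarized from Kiparsky (1982).
KIPARSKY_1982 = {
    Process.IRREGULAR_INFLECTION: 1,
    Process.IRREGULAR_DERIVATION: 1,
    Process.REGULAR_DERIVATION: 2,
    Process.COMPOUNDING: 2,
    Process.REGULAR_INFLECTION: 3,
}

# Two-stratum model as summarized from Katamba (1993).
KATAMBA_1993 = {
    Process.IRREGULAR_INFLECTION: 1,
    Process.IRREGULAR_DERIVATION: 1,
    Process.REGULAR_DERIVATION: 2,
    Process.COMPOUNDING: 2,
    Process.REGULAR_INFLECTION: 2,
}


def stratum_of(process: Process, model: dict) -> int:
    """Return the stratum that the given model assigns to a process type."""
    return model[process]


# Process types whose stratum differs between the two models.
differences = [p for p in Process if KIPARSKY_1982[p] != KATAMBA_1993[p]]
```

In this encoding the two models differ only in where regular inflection is placed, which is the point of Katamba's reduction to two strata.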
The lexical rules of LM require the following to be specified: the class of the bases affected, the affix that is attached, where exactly it is attached, the class to which the resulting word belongs, and the stratum to which the affix belongs, along with its properties. Katamba (1993) quotes critics like Goldsmith (1990) who have challenged the claims of LM, arguing that the same affix can simultaneously belong to two strata. The theory is also criticized for disagreement among its advocates concerning the exact number of strata in a word. Counterevidence to the rule of stratum ordering in a word is another important objection to the theory. Though several aspects of the theory have been challenged by later research, it remains influential "in its legacy of ways of thinking about phonology and in new instantiations that marry it with Optimality Theory [OT]" (Kaisse & McMahon 2011: 1). OT, which focuses centrally on phonology, proposes that the observed forms of language emerge from the optimal satisfaction of competing constraints (McCarthy, 2007). With the purpose of exploring the derivational properties of affixes in Urdu in terms of their being neutral/non-neutral and their usual hierarchical position in a multi-layered derived word, this study limits itself to the application of the concepts of LM. No previous work has been found that applies the theory to the morphology of Urdu words or addresses the resulting complexities. Urdu builds words with a large number and variety of affixes, which come from a variety of linguistic sources. Irregular inflection in English, for example, affects fewer words than it does in Urdu, which borrows infixes and suffixes from both Arabic and Persian sources; this creates a number of irregularly inflected words which may more appropriately be termed derived rather than just inflected, given how considerably their structure differs from that of the root word.
However, identifying the properties of affixes in Urdu regarding their effect on the morphology/phonology of the root to which they are attached, and their hierarchical organization in a multi-layered word in view of the well-formedness constraint, requires the application of both Lexical Morphology and Optimality Theory; future theoretical research on the morphology of Urdu should focus on this. Given the apparently distinctive morphology and phonology of Urdu and the fact that the theory was mostly derived from research into English, it is important to study whether the basic assumptions of LM are confirmed when applied analytically to Urdu. In particular, the two languages differ remarkably in their dependency on infixes and inter-fixes to produce new words. The number of root words that are broken to form new words is comparatively small in English, and such roots behave differently from those in Urdu. Therefore, one of the objectives of this paper is to explore the potential theoretical challenges to LM and to suggest improvements that would allow the theory to encompass Urdu as well.

Methodology
This paper analyzes the morphology of derived words in Urdu by applying the theoretical assumptions of Lexical Morphology. The paper has a two-pronged agenda: to analyze the words according to LM theory and to check whether the claims of LM cover the morphology of derived words in Urdu. A sample of 980 words was randomly selected from the researchers' own observations, from articles in Urdu newspapers and from Urdu news television channels in Pakistan. Safdar is both observer and active participant in the process of data collection, along with Mangrio. Data collection, coding, identification of patterns in the data and analysis occurred continuously and recursively throughout this study. During the study, the questions, as well as their boundaries, had to be adjusted and readjusted to match the emerging patterns in the data. Finally, from the morphological patterns found in the collected data and the theoretical assumptions of LM, this paper, firstly, identifies neutral and non-neutral affixes and analyzes their properties in Urdu. Secondly, the hierarchical positions of neutral and non-neutral affixes in words containing both types are analyzed in accordance with the suppositions of the theory. Thirdly, the challenges that the morphology of Urdu words poses to LM claims are highlighted. Thus, this paper is a descriptive, exploratory and interpretive study.
It is very important to underline that this paper treats as non-neutral only those affixes that have a major effect on the root word; that is, affixes that, in the process of derivation, cause the addition, deletion, replacement or mutation of some consonant/vowel segment, or a shift of stress, in the root/base word. Katamba (1993) has also looked at such drastic changes caused by non-neutral affixes in derived words in his study of lexical morphology.
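The working criterion stated above can be approximated as a comparison of transcriptions. The following Python sketch is a deliberate simplification and not the procedure followed in this study: it handles suffixation only, treats each transcription as a plain string in which ˈ and ˌ mark stress, and uses hypothetical function names (strip_stress, primary_stress_index, suffix_is_neutral). A suffix is counted as neutral only if the segments of the base survive unchanged at the start of the derived form and the primary stress stays where it was.

```python
STRESS_MARKS = "ˈˌ"  # primary and secondary stress marks


def strip_stress(form: str) -> str:
    """Remove stress marks, leaving only the segmental transcription."""
    return "".join(ch for ch in form if ch not in STRESS_MARKS)


def primary_stress_index(form: str) -> int:
    """Position of the first primary stress mark, counted in the
    stress-free string; -1 if the form carries no primary stress mark."""
    idx = form.find("ˈ")
    if idx == -1:
        return -1
    return len(strip_stress(form[:idx]))


def suffix_is_neutral(base: str, derived: str) -> bool:
    """True if the base segments are intact at the start of the derived
    form and the primary stress has not moved (suffixation only)."""
    segments_intact = strip_stress(derived).startswith(strip_stress(base))
    stress_intact = primary_stress_index(base) == primary_stress_index(derived)
    return segments_intact and stress_intact


# Pairs taken from this paper: two English pairs from the theory
# section and one Urdu pair from the stratum ordering discussion.
abstractness = suffix_is_neutral("ˈabstrakt", "ˈabstraktnəs")
grammarian = suffix_is_neutral("ˈɡramə", "ɡrəˈmeːrɪən")
sehatmandi = suffix_is_neutral("ˈsehət̪mənd̪", "sehət̪mənˈd̪i")
```

A string comparison of this kind cannot see prefixes, infixes or base-breaking affixes, and it depends entirely on how stress has been transcribed, so it is at most a first screening step before manual analysis.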

Analysis and Discussion
It can be observed that most affixes in Urdu involve long vowels, are weighty, and tend either to shift the stress or to cause consonant or vowel changes in the base word. Keeping in mind the definitions of neutral and non-neutral affixes, establishing the identity of many affixes and their hierarchical organization is not greatly challenging. However, there are a few affixes that do not exhibit fixed behavior patterns and so need to be examined in the context of their connection to different bases. For example, the suffixes in Table (1) can be called neutral (except [-gi] and [-pən], which display two kinds of behavior; see Table (3)) as they cause no major effect in the root word after their attachment. The suffix [-gi] behaves as neutral when it attaches to a base which does not end in [-ɑ(h)]. Similarly, the suffix [-pən] behaves as neutral when it attaches to a base which does not end in [-ɑ/i]. However, when [-gi] attaches to a base ending in [-ɑ(h)], drastic changes occur, as the final vowels are deleted to build new words. [mənd̪] can all be suffixed to create adjectives of the bases to which they attach without deleting or replacing any morphological component of the bases. Semantic and categorical change in the base word does not by itself make the attached affixes neutral or non-neutral. It is the morphological change in the base that comes through affixation that makes the attached/inserted affix neutral or non-neutral. However, there are other affixes in Table (2) which are not easy to categorize clearly as either neutral or non-neutral because of the complexity of their syllabic stress. These are affixes that take long vowels and can give the impression of stress shift when they are attached to the base. It is difficult to identify and locate the primary stress categorically because of the varying dialects of Urdu, the presence of more than one stressed syllable in a word, and the scarcity of relevant theoretical research on such word structures.
Katamba (1993) found in his analysis of English that neutral or non-neutral affixes share certain properties regarding their effect on the root to which they attach. The affixes in Table (2) share a similar behavioral property when attaching to a root word. They all take a long vowel which attracts syllabic stress. Given this stress shift effect, they can be classified as non-neutral, stratum 1 affixes. When there is more than one syllable of almost maximum weight in a word, the last syllable takes the main word stress (Bernard, 1990, cited in Nayyar, 2002). The suffix [-d̪ɑr] usually attaches to noun bases at stratum 1 to make them into an adjective or another noun. The suffix [-in] pluralizes a noun base, whereas with a comparative adjective it produces a superlative adjective. [-dʒɑt̪], [-ɑn], and [-ɑt̪] usually attach to singular noun bases to pluralize them. The plural marker suffix [-ɑt̪] causes more changes than just attracting the stress in root words ending in [-ɑ(h)], e.g. t̪əbqɑ(h) ~ t̪əbqɑt̪. The aspiration or glottal fricative [-h] is deleted in the process of pluralization. However, the pharyngeal fricative [-ħ] sound (e.g. ɪslɑh ~ ɪslɑhɑt̪) remains and takes on the word stress in the plural noun. The morphological behavior of the nominal marker [-ɪjɑt̪] is also generally predictable. In most cases, it attaches to singular nouns to turn them into the names of some branches of knowledge and causes a change in word stress. However, a noun which ends in a voiceless dental stop [-t̪] undergoes more drastic changes, as the base of the word is broken, e.g. səˈhulət̪ ~ səˌhulɪˈjɑt̪. Though semantically different, the prefixes in Table (2) also exhibit similar morphological and phonological behavior by attracting word stress. On the basis of their properties, all the affixes in Table (3) can be classified as non-neutral, stratum 1 affixes.
The affixes in Table (4) break singular noun bases, mostly to pluralize them or, in some cases, to build another noun without the sense of either singular or plural (e.g. hɑsɪl ~ həsul). Since they create drastic changes in root words, they can be categorized as non-neutral, stratum 1 affixes. Katamba (1993) has analyzed affixes in English that break the base to produce a new word, under the headings of 'ablaut' and 'umlaut'. Ablaut refers to the change in a root vowel (aɪ ~ əʊ) in words like ride [raɪd] to rode [rəʊd]. Umlaut means the fronting of a vowel if the next syllable contains a front vowel. However, the breaking of the base and infixing, circumfixing and transfixing in Urdu are generally performed by affixes borrowed from Arabic sources. What matters here is not the source of these borrowed affixes but the changes they cause; nevertheless, they cannot be analyzed through the ablaut and umlaut approaches because Urdu and English bases behave completely differently in their derivational processes. Moreover, ablaut and umlaut vowel alternation patterns have their origins in old Indo-European linguistic practices, of which English is ultimately a product.
The suffix [-ɪjjət̪] in Table (5) is geminate and clearly attracts word stress. It generally attaches to adjective bases/stems to change them into abstract nouns referring to some condition, state of being or situation. If the stem word ends in [-i], the suffix replaces or shortens it. Therefore, [-ɪjjət̪] is a non-neutral, stratum 1 suffix.
The affix in Table (6) is used as a suffix to produce an adjective from a monosyllabic or disyllabic noun base. It attracts the stress onto the last syllable and is, thus, a non-neutral, stratum 1 suffix.

Stratum Ordering
The following abbreviations are adopted from Katamba (1993). [Abbreviation list not recoverable.]
(b) zɪmɑ(h)-ˈd̪ɑr
(c) zɪmɑ(h)-d̪ɑˈr-i
(d) zɪmɑ(h)-ˈd̪ɑr-ɪˈjɑ̃
According to the basic concept of LM, a derived word made up of neutral and non-neutral affixes takes the non-neutral affix at stratum 1 (i.e. closer to the root) and the neutral one at stratum 2 (i.e. away from the root) in its structural hierarchy. However, in the above examples, the non-neutral suffixes in (b) cause changes in the roots by shifting the stress, and in (c), again, the second suffix [-i], in all the three examples, behaves as a non-neutral suffix by placing the stress of the base word on the last syllable. In a word that contains more than one stress, the last stress is usually given more importance (Bernard, 1990). The examples show how multiple layers of suffixes of the same stratum, preceding or following each other, affect the root/base word. [-i] is placed further from the root when other non-neutral suffixes join the root first. Interestingly, a general pattern can be noted in the multi-suffixed derived words in the data: the non-neutral suffix [-i], rather than some neutral suffix (as the basic assumptions of LM would predict), is the second layer of suffix; with a noun base it converts the base into an adjective and, with an adjective base, it changes the class to that of a noun. The plural marker suffix [-ɪjɑ̃] in (d) above can also be taken as another layer of non-neutral suffix with a strong effect on the base word. If the prefix [ɣer] is added to the above examples, it will be another non-neutral affix because its clear stress pattern marks the negative meaning of the base. Stress on the last syllable (possibly after multiple stresses) may shift more readily onto prefixes with long vowels (like lɑ, bɑ, nɑ, ɣer, etc.), which sometimes act as negative and at other times as positive adjectival markers.
In the above examples, however, it can be argued that there was no stratum 2 affix, which is essential to the hierarchy concept of LM. Therefore, consider the following example:
ˈsehət̪mənd̪
sehət̪mənˈd̪i
This example clearly refutes the LM assumption regarding the hierarchical organization of affixes in a word that consists of both neutral and non-neutral affixes. In it, a stratum 2 (neutral) suffix (i.e., [-mənd̪]) precedes a stratum 1 (non-neutral) suffix (i.e., [-i]). Katamba (1993) has raised similar challenges to LM's presupposition of hierarchy with examples from English. In Urdu, on the other hand, the biggest challenge to LM theory is posed by syllabic stress, which this paper reports in accordance with spoken Urdu in Pakistan and with studies by Nayyar (2002), Bernard (1990) and Katamba (1990); it contradicts LM's central presupposition regarding the hierarchy of affixes in a word with affixes of both strata. This challenge mostly arises when suffixes with long or heavy vowels attach after stratum 2 suffixes; the former then tend to attract the word stress and shorten the preceding long vowels. As seen above, the coming together of one type of suffix in a multi-layered word is not the problem; the issue is that a non-neutral suffix follows a neutral one in the organizational hierarchy, that is, in the suffixes' proximity to the root. LM theory presumes that non-neutral suffixes are closer to the root than neutral ones, which, in the above Urdu examples, can be seen not to happen, especially in cases where the affixes cause a shift in word stress.
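The ordering argument of this section can be restated as a small check. The Python sketch below is illustrative only and rests on stated assumptions: each word is reduced to its suffixes listed from the root outward, each suffix is paired with the stratum assigned to it in this paper, and the function name first_ordering_violation is hypothetical. LM's ordering assumption holds for a word if no stratum 2 suffix stands closer to the root than a stratum 1 suffix.

```python
from typing import List, Optional, Tuple

Suffix = Tuple[str, int]  # (suffix, stratum), listed from the root outward


def first_ordering_violation(suffixes: List[Suffix]) -> Optional[Tuple[Suffix, Suffix]]:
    """Return the first adjacent (inner, outer) pair in which the inner
    suffix has a higher stratum than the outer one, or None if the
    sequence respects the ordering assumed by LM."""
    for inner, outer in zip(suffixes, suffixes[1:]):
        if inner[1] > outer[1]:
            return inner, outer
    return None


# Stratum assignments follow the classifications given in this paper.
competitiveness = [("tive", 1), ("ness", 2)]   # English, from the theory section
zimmadari = [("d̪ɑr", 1), ("i", 1)]            # zɪmɑ(h)-d̪ɑˈr-i
sehatmandi = [("mənd̪", 2), ("i", 1)]          # sehət̪mənˈd̪i

violations = {
    "competitiveness": first_ordering_violation(competitiveness),
    "zimmadari": first_ordering_violation(zimmadari),
    "sehatmandi": first_ordering_violation(sehatmandi),
}
```

On these inputs only the third word yields a violating pair, the neutral [-mənd̪] standing inside the non-neutral [-i]; two adjacent stratum 1 suffixes, as in the second word, do not count as a violation under this check.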

Conclusion
Based on the analysis of the data from the theoretical perspective of lexical morphology, this paper concludes that the morphological structure of derived words in Urdu only partially conforms to the assumptions of Lexical Morphology theory. While the theory is helpful for the analysis of neutral and non-neutral affixes, its assumptions concerning the hierarchical organization of affixes in derived word-formations do not correspond to the morphology of words in Urdu. While word structure in Urdu follows some of the basic suppositions regarding the types of affixes, it poses serious challenges to the other basic assumption regarding the hierarchical organization of non-neutral and neutral affixes in a word that contains both types. LM is useful for identifying the neutrality and non-neutrality of affixes with reference to their behavioral properties. But the hierarchical organization that the theory assumes, based on closeness to the root, is not seen in many multi-layered derived words. These challenges to the theory mainly arise from the long and weighty (stress attracting/shifting) vowels in many suffixes and prefixes. The challenge multiplies when the non-neutral, long front vowel suffix [-i] joins another non-neutral suffix at the base and does not allow any stratum 2 suffix to attach (though another stratum 1 suffix can). Non-neutral suffixes, in many words, are adjacent without any stratum 2 suffix after them; [-i] is usually seen as the second adjacent suffix, operating either as an adjective or noun marker. Many examples from the sample data have multiple layers of suffixes. However, none showed the hierarchical organization predicted by the theory, with neutral suffixes following non-neutral ones. Rather, in words like [sehət̪mənd̪i] or [sehət̪mənd̪ɑnɑ], the reverse of the hierarchical claim central to LM can be seen.
Given the long and heavy vowels, including geminates like [-ɪjjət̪], inherent in many Urdu affixes and the fact that a large number of derived words come through breakage of the root, the assumptions concerning hierarchical strata cannot be applied to Urdu. Therefore, the theory needs to be reviewed according to the properties analyzed in this paper.

Limitations and Implications
Since this paper is one of the few initial steps taken by researchers to investigate, analyze and interpret the morphology of derived words in Urdu from a theoretical perspective with the objective of either supporting or challenging LM theory, several points can further be improved. Large numbers of affixes in Urdu are used to produce new words. This paper has attempted to study the properties and positioning of the most generally used affixes. However, there are many more that need to be added. Based on the assumptions of LM and the patterns generally found regarding the properties and organizational position of the affixes here, a more comprehensive theory is necessary to cover the lexical morphology of Urdu. In further research, the data sample should be increased to cover a larger variety of affixes along with their morphological positioning. The research can also be helpful to see how far LM presuppositions are supported by data from other Indo-Aryan languages.