Tone Recognition of Pahari Language

Salma Asghar; Uzma Anjum; Urooj Akhter

doi:10.32350/jcct.52.04

Salma Asghar, Lecturer, Department of English, Women University of Azad Jammu & Kashmir, Bagh, Azad Kashmir, Pakistan. https://orcid.org/0000-0003-4766-7323
Uzma Anjum, Associate Professor, Department of English, Air University Islamabad, Islamabad, Pakistan.
Urooj Akhter, Lecturer, Department of English, University of Poonch, Rawalakot, Azad Jammu & Kashmir, Pakistan. https://orcid.org/0000-0001-7186-9166

Keywords: random forest, linear mixed effect models, fundamental frequency, first formant, third formant, cepstral peak prominence

Abstract

Pahari is an under-resourced, endangered, and undocumented tonal language spoken in the Pakistan-administered state of Azad Jammu and Kashmir (AJK). Preliminary studies have established that Pahari has three discrete level tones: high, mid, and low. In the current study, tone distribution in monosyllabic words is measured with 45 iterations, consisting of 15 high, 15 mid, and 15 low tones, collected from 5 native speakers of Pahari. An attempt is made to automatically recognize the phonologically contrastive tones of Pahari using Random Forest and Linear Mixed Effect Models, with f0 as a preliminary feature along with duration, intensity, F1, F3, and Cepstral Peak Prominence (CPP). The results showed that the overall accuracy of the Random Forest was higher than that of the Linear Mixed Effect Model. Additionally, mean f0 played a highly significant role in the prediction of tone, while duration, intensity, F1, F3, and CPP played a less significant role.
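
The random forest half of the approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use their data: the tokens are synthetic, the per-tone f0 means and all other feature distributions are invented, and the forest settings (15 trees, depth 5, 4 candidate features per split) are arbitrary assumptions. The sketch trains a small forest on the six acoustic features and uses permutation importance to ask which feature the classifier relies on. The linear mixed effect model is not covered here.

```python
import random
from collections import Counter

FEATURES = ["f0", "duration", "intensity", "F1", "F3", "CPP"]
# Invented mean f0 (Hz) per tone; NOT values from the study.
TONE_F0 = {"high": 220.0, "mid": 180.0, "low": 140.0}


def make_tokens(n_per_tone, rng):
    """Synthetic tokens in which only f0 depends on tone (an assumption)."""
    rows = []
    for tone, f0_mean in TONE_F0.items():
        for _ in range(n_per_tone):
            x = [rng.gauss(f0_mean, 10.0),   # f0 (Hz)
                 rng.gauss(250.0, 30.0),     # duration (ms)
                 rng.gauss(70.0, 3.0),       # intensity (dB)
                 rng.gauss(600.0, 60.0),     # F1 (Hz)
                 rng.gauss(2600.0, 120.0),   # F3 (Hz)
                 rng.gauss(20.0, 2.0)]       # CPP (dB)
            rows.append((x, tone))
    rng.shuffle(rows)
    return rows


def gini(rows):
    """Gini impurity of the tone labels in a set of tokens."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())


def best_split(rows, feat_ids):
    """Lowest weighted-Gini split over the candidate features, or None."""
    best = None
    for f in feat_ids:
        for thr in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= thr]
            right = [r for r in rows if r[0][f] > thr]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, thr, left, right)
    return best


def grow(rows, rng, depth=0, max_depth=5, n_try=4):
    """Grow one tree; each node sees a random subset of n_try features."""
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    if depth >= max_depth or gini(rows) == 0.0:
        return majority
    split = best_split(rows, rng.sample(range(len(FEATURES)), n_try))
    if split is None:
        return majority
    _, f, thr, left, right = split
    return (f, thr,
            grow(left, rng, depth + 1, max_depth, n_try),
            grow(right, rng, depth + 1, max_depth, n_try))


def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf's tone label."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if x[f] <= thr else right
    return node


def fit_forest(rows, rng, n_trees=15):
    """Fit each tree on a bootstrap resample of the training tokens."""
    return [grow([rng.choice(rows) for _ in rows], rng) for _ in range(n_trees)]


def predict(forest, x):
    """Majority vote over the trees."""
    return Counter(predict_tree(t, x) for t in forest).most_common(1)[0][0]


def accuracy(forest, rows):
    return sum(predict(forest, x) == y for x, y in rows) / len(rows)


def permutation_importance(forest, rows, rng):
    """Drop in accuracy when one feature column is shuffled across tokens."""
    base = accuracy(forest, rows)
    drops = {}
    for f, name in enumerate(FEATURES):
        column = [x[f] for x, _ in rows]
        rng.shuffle(column)
        shuffled = [(x[:f] + [v] + x[f + 1:], y) for (x, y), v in zip(rows, column)]
        drops[name] = base - accuracy(forest, shuffled)
    return drops


rng = random.Random(0)
train = make_tokens(30, rng)   # 30 synthetic tokens per tone
test = make_tokens(20, rng)    # held-out synthetic tokens
forest = fit_forest(train, rng)
test_accuracy = accuracy(forest, test)
importance = permutation_importance(forest, test, rng)
print(f"held-out accuracy: {test_accuracy:.2f}")
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:>9}: accuracy drop {drop:+.2f}")
```

Because the synthetic data make f0 the only tone-dependent feature, shuffling f0 should hurt accuracy far more than shuffling any other column. This reflects how the data were constructed and says nothing about Pahari itself. With real recordings, the feature values would come from acoustic measurement, and speaker effects would need separate handling.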
