Tone Recognition of Pahari Language
Abstract
Pahari is an under-resourced, endangered, and undocumented tonal language spoken in the Pakistan-administered state of Azad Jammu and Kashmir (AJK). Preliminary studies have established that Pahari has three discrete level tones: high, mid, and low. In the current study, tone distribution in monosyllabic words is measured with 45 iterations, consisting of 15 high, 15 mid, and 15 low tones, collected from 5 native speakers of Pahari. An attempt has been made to automatically recognize the phonologically contrastive tones of Pahari using Random Forest and linear mixed-effects models, with f0 as a preliminary feature along with duration, intensity, F1, F3, and Cepstral Peak Prominence (CPP). The results showed that the overall accuracy of the Random Forest was higher than that of the linear mixed-effects model. Additionally, mean f0 played a highly significant role in the prediction of tone, while duration, intensity, F1, F3, and CPP played a less significant role.
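As a rough illustration of the Random Forest side of this setup, the following is a minimal sketch in plain Python and not the authors' implementation. It trains a small forest (bootstrap samples, a random feature subset at each split, majority vote) on six features named after those in the abstract. Everything numeric here is an assumption made for the example: the tokens are synthetic, the f0 centres and spreads are invented, and the helper names (`synthetic_tokens`, `build_tree`, `fit_forest`, `predict_forest`) and the settings `n_try`, `max_depth`, and `n_trees` are hypothetical. The linear mixed-effects model is not shown.

```python
import random
from collections import Counter

FEATURES = ["f0", "duration", "intensity", "F1", "F3", "CPP"]
TONES = ["high", "mid", "low"]

def synthetic_tokens(n_per_tone, seed):
    """Invented stand-in data (not the study's recordings): only mean f0 separates the tones."""
    rng = random.Random(seed)
    f0_centre = {"high": 230.0, "mid": 190.0, "low": 150.0}  # Hz, assumed values
    rows, labels = [], []
    for tone in TONES:
        for _ in range(n_per_tone):
            rows.append([
                rng.gauss(f0_centre[tone], 5.0),  # mean f0 (Hz)
                rng.gauss(250.0, 30.0),           # duration (ms)
                rng.gauss(70.0, 3.0),             # intensity (dB)
                rng.gauss(600.0, 50.0),           # F1 (Hz)
                rng.gauss(2600.0, 100.0),         # F3 (Hz)
                rng.gauss(20.0, 2.0),             # CPP (dB)
            ])
            labels.append(tone)
    return rows, labels

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, rng, n_try=3, depth=0, max_depth=8):
    """Grow one tree; each split considers only n_try randomly chosen features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for j in rng.sample(range(len(FEATURES)), n_try):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbouring values
            left = [y for r, y in zip(rows, labels) if r[j] < t]
            right = [y for r, y in zip(rows, labels) if r[j] >= t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return majority
    _, j, t = best
    left_idx = [i for i, r in enumerate(rows) if r[j] < t]
    right_idx = [i for i, r in enumerate(rows) if r[j] >= t]
    children = [
        build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                   rng, n_try, depth + 1, max_depth)
        for idx in (left_idx, right_idx)
    ]
    return (j, t, children[0], children[1])

def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are tone labels, inner nodes are tuples."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] < t else right
    return node

def fit_forest(rows, labels, n_trees=50, seed=42):
    """Fit each tree on a bootstrap sample (drawn with replacement) of the tokens."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# 15 tokens per tone mirrors the 15/15/15 split in the abstract; the data are synthetic.
train_rows, train_labels = synthetic_tokens(15, seed=0)
test_rows, test_labels = synthetic_tokens(15, seed=1)
forest = fit_forest(train_rows, train_labels)
predicted = [predict_forest(forest, row) for row in test_rows]
accuracy = sum(p == y for p, y in zip(predicted, test_labels)) / len(test_labels)
print(f"held-out accuracy on synthetic tokens: {accuracy:.2f}")
```

Because only f0 carries tone information in this synthetic data, the printed accuracy says nothing about Pahari itself; it only shows how such a classifier can be fitted and evaluated on held-out tokens.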
This work is licensed under a Creative Commons Attribution 4.0 International License. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.