UMT Artificial Intelligence Review https://journals.umt.edu.pk/index.php/UMT-AIR <p style="text-align: justify;"><strong>UMT Artificial Intelligence Review (UMT-AIR)</strong> is a double-blind peer-reviewed biannual journal that provides a wide variety of perspectives on the theory and practice of AI. We welcome research papers on foundational and applied work, as well as case studies. UMT-AIR also invites critical analytical studies of AI applications that present an in-depth evaluation of the AI tools and methods being employed.</p> <p style="text-align: justify;"><em>UMT-AIR</em> follows an open-access publishing policy, and the full text of all published articles is available free of charge immediately upon publication of an issue. The journal’s contents are published and distributed under the terms of the <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International</a> (<a href="https://creativecommons.org/licenses/by/4.0/">CC-BY 4.0</a>) license. Submission of work to the journal implies that it is the original, unpublished work of the authors (neither published previously nor accepted or under consideration for publication elsewhere). On acceptance of a manuscript for publication, the corresponding author, on behalf of all co-authors of the manuscript, will sign and submit a completed Copyright and Author Consent Form.</p> Wed, 20 Dec 2023 00:00:00 +0000

Recitation of The Holy Quran Verses Recognition System Based on Speech Recognition Techniques https://journals.umt.edu.pk/index.php/UMT-AIR/article/view/3668 <p>Arabic is the language in which the Holy Quran was revealed to Mohammed (S.A.W). Muslims hold that the Holy Quran has been preserved and has not been tampered with. The Arabic Quran should be read exactly as it is written.
With the spread of Islam and the appearance of errors in Quranic recitation, experts created Tajweed to preserve Allah's revelation. The Holy Quran's authenticity and purity must be protected from erasure or contamination. The current study examined speech recognition techniques used for Quranic recitation along with their strengths and weaknesses. Moreover, it also examined the Quranic text verification paradigm. The development of a computer-aided system for automatically learning the Holy Quran's recitation is a practical learning technique. Computer-Aided Pronunciation Learning (CAPL) has gained popularity in recent years. Moreover, numerous studies have been conducted to improve these methods, especially in second-language instruction. Computer technologies can help language teachers with pronunciation and accent reduction. Computers play an essential role in automated tutoring. With the help of a computer, words can be learned at home. Applying CAPL to the Holy Quran’s recitation is stricter than an ordinary language-learning exercise, where many pronunciations may be acceptable. There is minimal room for variation when reciting the Holy Quran in Arabic. The current study presented a concept for a Quran recitation verification system along with an overview of speech recognition techniques for Quranic recitation.</p> Muhammad Rehan Afzal, Aqib Ali, Wali Khan Mashwani, Sania Anam, Muhammad Zubair, Laraib Qammar Copyright (c) Wed, 20 Dec 2023 00:00:00 +0000

Enhancing Agricultural Pest Management with YOLO V5: A Detection and Classification Approach https://journals.umt.edu.pk/index.php/UMT-AIR/article/view/4387 <p>Due to the growing population and the numerous ecological challenges that affect crop yields, the need to modernize the crop production process has become increasingly critical.
Swiftly managing potential threats to crops can have a substantial impact on overall crop production. Pests represent a significant threat, capable of causing substantial losses if not controlled effectively and in time. In this study, a deep learning-based method for pest identification is introduced. The approach leverages the YOLO (You Only Look Once) single-shot detection (SSD) object recognition algorithm in combination with the pre-trained DARKNET architecture to categorize pests into nine distinct classes. The study utilizes a publicly available dataset sourced from Kaggle, which comprises a total of 7,046 images. The results show an overall accuracy of 83%, with a notably low training and validation loss of 0.02%. Moreover, the model shows a notable improvement in localization results, delivering a precision of 0.83, a recall of 0.83, an mAP-0.5 of 0.833, and an mAP-0.5:0.95 of 0.783.</p> Asif Raza, Muhammad Kashif Shaikh, Osama Ahmed Siddiqui, Asher Ali, Afshan Khan Copyright (c) 2023 Asif Raza, Muhammad Kashif Shaikh, Osama Ahmed Siddiqui, Asher Ali, Afshan Khan https://creativecommons.org/licenses/by/4.0 Wed, 20 Dec 2023 00:00:00 +0000

Factors Behind Virtual Reality (VR) Enactment for Netflix Watching in Pakistan: An Integrated SEM-ANN-based Study https://journals.umt.edu.pk/index.php/UMT-AIR/article/view/3672 <p>From the Netflix perspective, "entertainment is just like you eat buffet." In this regard, Netflix prefers to regularly deploy modern technology solutions that serve its consumers. Considering the potential integration of modern technology in Netflix watching, the current study adopted an <strong>integrated SEM-ANN technique to scrutinize</strong> some basic characteristics of Netflix as dynamic mechanisms of Virtual Reality (VR) adoption in Pakistan.
The current research used a self-proposed study model and randomly selected a sample of <em>n</em> = 400 students from four private-sector universities in Islamabad. The Structural Equation Modelling (SEM) analysis indicated several significant associations. Specifically, social influence, satisfaction, and relevance were notably linked with Netflix usage. Additionally, there was a robust and statistically significant relationship between Netflix usage, perceived ease of use, and behavioral intention. Furthermore, perceived ease of use and behavioral intention demonstrated significant correlations with virtual reality adoption. In the subsequent Artificial Neural Network (ANN) analysis, the model achieved an accuracy of 51.1% during the training phase and 54.4% during the testing phase. In light of these results, while technology adoption in Pakistan may have been relatively gradual, the transition from conventional media to social TV, integrated with VR, marks the inception of a new technological era in the country. This shift holds important implications for the media landscape and consumer behavior. Finally, the study's limitations and contributions are also discussed.</p> Sana Ali, Saadia Pasha, Shazia Hashmat Copyright (c) 2023 Sana Ali, Saadia Pasha, Shazia Hashmat https://creativecommons.org/licenses/by/4.0 Wed, 20 Dec 2023 00:00:00 +0000

Voice Cloning Using Transfer Learning with Audio Samples https://journals.umt.edu.pk/index.php/UMT-AIR/article/view/5500 <p>Voice cloning refers to the artificial replication of a particular human voice. Several deep learning approaches to voice cloning were studied. Based on this review, a cloning system was proposed that creates natural-sounding audio samples from only a few seconds of source speech from the target speaker.
The current study used a transfer learning technique, transferring knowledge from a speaker verification task to multi-speaker text-to-speech synthesis. In zero-shot mode, the system generates speech in the voices of various speakers, including individuals who were not seen during training. The current study used latent embeddings to encode speaker-specific information, enabling additional model parameters to be shared across all speakers. The speaker modelling stage was separated from voice synthesis by training a separate speaker-discriminative encoder network. Because the two networks require distinct types of input, this decoupling enables each to be trained using separate datasets. When employed for zero-shot adaptation to unseen speakers, an embedding-based technique for voice cloning enhances speaker similarity. Furthermore, it reduces computational resource requirements, which may be advantageous for use cases requiring low-resource deployment.</p> Usman Nawaz, Usman Ahmed Raza, Amjad Farooq, Muhammad Junaid Iqbal, Ammara Tariq Copyright (c) 2023 Usman Nawaz, Usman Ahmed Raza, Amjad Farooq, Muhammad Junaid Iqbal, Ammara Tariq https://creativecommons.org/licenses/by/4.0 Wed, 20 Dec 2023 00:00:00 +0000

Determining Urdu News Type from Headline Text Using Deep Learning https://journals.umt.edu.pk/index.php/UMT-AIR/article/view/4668 <p>In recent years, the volume of regional-language data available on the Internet has grown significantly. This growth helps people to express themselves by removing linguistic boundaries. Moreover, the accessibility of news articles on the web provides billions of web users with a source of knowledge. This research offers a classification model for categorizing Urdu news headline text using deep learning (DL) techniques and different word vector embeddings.
To improve the efficacy of various Urdu natural language processing (NLP) applications, this study used two neural word embeddings built with widely used approaches, namely Word2vec and pre-trained fastText. Both intrinsic and extrinsic evaluation methods were used to assess the quality of the resulting neural word embeddings. The study employed a large, new corpus of Urdu text containing 153,050 headlines categorized into 8 classes. Then, text pre-processing techniques and two DL models, namely Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (BiLSTM), were applied. The results were compared across embeddings. It was found that, with the pre-trained fastText embedding, BiLSTM outperformed the other DL models with an accuracy of 93.93%, precision of 93.86%, recall of 93.93%, and F1 score of 93.89%.</p> Umair Arshad, Khawar Iqbal Malik, Hira Arooj, Muhammad Fiaz Copyright (c) 2023 Umair Arshad, Khawar Iqbal Malik, Hira Arooj, Muhammad Fiaz https://creativecommons.org/licenses/by/4.0 Wed, 20 Dec 2023 00:00:00 +0000