Investigating Functional Variation across Academic Disciplines: A Multidimensional Analysis of Research Articles

The current study investigates functional variation in academic writing across six disciplines: three from the sciences (biology, physics, and computer science) and three from the social sciences (linguistics, sociology, and psychology). It also compares language use in science and social sciences research articles (RAs). For this purpose, a corpus of 300 research articles from journals in these six sub-disciplines was compiled. The Multidimensional Analysis Tagger (version 1.3.2) was used to tag and evaluate the RA corpus. The triangulation research method was deployed to analyze the data quantitatively and to interpret the results qualitatively. The findings indicate that there are functional differences in the language employed in academic disciplines, particularly in terms of being informational, non-narrative, and persuasive. Likewise, comparable outcomes were noticed in dimension 3. The results may help researchers, students of different disciplines, and teachers to spot trends across various sub-disciplines.


Introduction
In modern times, English has established itself as the primary language of scientific communication globally. Almost 95% of peer-reviewed international science and technology publications are published in English (Lillis & Curry, 2010). Publishing research articles in well-recognized international journals has become an essential requirement for young researchers and scientists to thrive in the modern academic world (Hyland, 2016). Despite its widespread use, the dominance of English has created difficulties for emerging scientists and researchers, who may not be familiar with the established writing conventions specific to their field.
Research articles (RAs) are one of the numerous forms of academic writing, alongside reviews and technical reports, which Hyland (2000) described as a codification of disciplinary knowledge. Research articles play a major role in disseminating disciplinary knowledge at an international level. However, writing the discussion section of theses, dissertations, and research articles has been challenging for academic writers (Geng & Wharton, 2016). The main problem for novice writers is a lack of understanding of the different forms and functions of this section, as they are less aware of how to use metadiscoursal resources.
A range of topics on the genre of academic research articles and academic language has been addressed in recent years, but the primary focus of scholars has been on the rhetorical structure of the discussion section following the Swales model (Basturkmen, 2012). Scholarly work has also explored the communicative functions of other sections of research articles, including methods, results, and conclusions (Martínez, 2003; Jin, 2021). Conrad (1996), Biber & Finegan (2001), Gray (2013), and Egbert (2015) analyzed research articles using a multidimensional approach, while variation across disciplines has also been taken into consideration (Liu & Xiao, 2022). However, the discourse of science and social sciences in research articles remains under-researched. A comprehensive account of language use in research articles, which clarifies the usefulness of particular features, can help novice writers develop familiarity with linguistic conventions (Hyland, 2000).
The current study aims to examine the functional variation in research articles (RAs) across six sub-disciplines: biology, physics, and computer science from the sciences, and linguistics, sociology, and psychology from the social sciences, based on Biber's (1988) factor analysis. Andrea Nini's Multidimensional Analysis Tagger (v 1.3.2) was used to tag the texts and retrieve co-occurring linguistic features. The current research may help Pakistani researchers in the field of scientific writing to gain knowledge of acceptable grammatical and lexical components for future research.

Objectives
In order to achieve this purpose, the research has set the following objectives:
1. To evaluate the research articles of science and social sciences disciplines on functional dimensions
2. To compare the sub-disciplines of both these genres on functional dimensions

Research Questions
1. What are the distributive patterns of functional lexical resources of research articles in science and social sciences?
2. What kind of variations were observed within the research articles of social sciences?
3. What kind of variations were found within the research articles of sciences?
4. Where does the text type of research articles stand in terms of functional dimensions in comparison with other text types?

Significance of the Study
This study offers a view of the textual varieties that exist in two disciplines of academic writing, namely science and social sciences. The results may be helpful for authors, students of different disciplines, and teachers to spot trends across various sub-disciplines.

Scope of the Study
This study is helpful for students and teachers of different academic disciplines. It explores the prominent co-occurring linguistic features in research articles and their functions. Students and teachers can also see that any type of data (genre) can be analyzed with MAT rather than manually.

Theoretical Framework
In recent years, the genre approach and related concepts such as register, domain, and text type have gained momentum in corpus-based studies. Researchers classify corpus texts into genres and registers to facilitate the analysis of language use in specific contexts. Text categorization allows researchers to determine the types of language they are examining and to delineate the scope of generalization within a particular genre. Thus, it is important to point out the theoretical orientations behind the use of these terms.

Genre, Register, and Style
English language variation has remained an interesting subject. Besides genre, other terms used for language variation are register and style. Crystal (1991) described register as a language variety defined in relation to its use in different social situations, such as legal, religious, scientific, and formal English registers. The term genre is not included in his Dictionary of Linguistics and Phonetics, so he does not attempt to distinguish between these competing terms. However, Crystal and Davy (1969) used the word style in the sense of register and called it a particular way of using language in a particular context. Bhatia (1993) and Swales (1990) explained the difference between genre and register: the former is closely associated with ideology and power due to its connection with social purposes and cultural organization, whereas the latter is concerned with situation or context. Once it is understood how the language system works, the term discourse can be interpreted on various levels. For instance, when the focus is on content, discourse is evaluated as register, which is further discussed in terms of the field, mode, and tenor of discourse (Bhatia, 2008). Registers can also vary from discipline to discipline, such as scientific, legal, or business registers, and functional variation in them can be analyzed on the basis of words and general syntactic features. Secondly, discourse may be constituted, along with register, as a genre, since it is a written unit that has specific communicative meanings and purposes. Bhatia (2008) stated that disciplines are identified by their content and discourse field rather than by a typical configuration of contextual factors including field, mode, and tenor. In brief, registers, genres, and disciplines are associated with each other. Systemic functional grammar may provide a clear idea about genre and register. In Hallidayan grammatical terms, a specific configuration of field, tenor, and mode constitutes a register. It can also be taken as a language variety that has a functional association with specific contextual and situational parameters of variation. The following diagram shows the relationship between language, genre, and register more explicitly.

Figure 1 Metafunctions with Respect to Register and Genre
Register and genre serve as two distinct ways to examine an object. When a text is viewed as language, it is treated as register, while the grouping of a text as a member of a category is characterized as genre. For example, within the legal register, where the focus is on language, the genres may be court debates, wills and testaments, or affidavits, according to category membership. The language of sermons, sports reporting, and experimental research results falls under register. However, literary and non-literary text categories, such as novels, short stories, sonnets, and so forth, fall under genre (Finegan & Biber, 1994).

Genre Analysis
The primary aim of genre analysis is to investigate the overall patterns in a particular genre in association with its form, meaning, and function. Bhatia (2002) explained genre analysis as an investigation of texts that takes into account the cultural context of the speech community. This process helps to understand how members of discourse communities interact with each other to achieve communicative objectives in social, academic, and professional situations. Contrastive genre analysis is another approach, which compares texts written by native and non-native speakers. The concept of contrastive genre analysis was first presented by Kaplan (1966) in an article, after evaluating 600 ESL student essays from different cultural backgrounds.

Volume 5 Issue 1, Spring 2023 Department of Linguistics and Communications

Multidimensional Analysis
Since its development, Multidimensional (MD) Analysis has been used extensively to investigate register variation between different corpora of texts. The presence or absence of register features helps in determining register variation (Biber, 2009). The multivariate approach of MD analysis, exploring sixty-seven lexico-grammatical items in written and spoken corpora, yielded interpretable linguistic dimensions. These dimensions include:
1. Involved vs. Informational
2. Narrative vs. Non-narrative
3. Explicit vs. Implicit
4. Overt Expression vs. Covert
5. Abstract vs. Non-abstract information
6. Online informational elaboration
According to Friginal et al. (2013), the MD analytical approach offers a detailed recognition of the fundamental functional and structural characteristics of the discourse of a given genre. Some studies have also produced new dimensions through new factor analyses; for example, Hardy and Romer (2013) proposed four dimensions. To interpret the functions, the researcher must have a good grip on grammar, especially functional grammar. Biber (2015) summarized that the first and foremost objective of MDA is the identification and interpretation of the major parameters of linguistic variation (the dimensions) in a register, a language, or any domain of discourse. Secondly, each dimension comprises a set of co-occurring linguistic features; registers and discourse domains can therefore be contrasted along these dimensions in terms of functional associations and quantitative similarities or differences. Ultimately, every language possesses its unique set of grammatical and lexico-grammatical characteristics, which means that each language would perceive the text in a distinct manner.
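As a minimal sketch of how a dimension score is typically obtained in Biber-style MD analysis, the Python below standardizes each feature's normalized frequency against reference means and standard deviations, then subtracts the summed z-scores of negatively loaded features from those of positively loaded features. The feature names, the positive/negative split, the reference statistics, and the article frequencies are illustrative placeholders assumed for this example; they are not the published values used by Biber (1988) or by MAT.

```python
# Hypothetical reference statistics (mean, standard deviation) per feature.
# Illustrative placeholders only, not Biber's (1988) published values.
REFERENCE = {
    "private_verbs": (18.0, 10.0),
    "contractions": (13.0, 18.0),
    "nouns": (180.0, 35.0),
    "prepositions": (110.0, 25.0),
}

# Assumed split of features by the sign of their loading on one dimension.
POSITIVE = ["private_verbs", "contractions"]
NEGATIVE = ["nouns", "prepositions"]


def z_score(feature, frequency):
    """Standardize a normalized frequency against the reference statistics."""
    mean, sd = REFERENCE[feature]
    return (frequency - mean) / sd


def dimension_score(frequencies):
    """Sum z-scores of positive features and subtract those of negative ones."""
    positive = sum(z_score(f, frequencies[f]) for f in POSITIVE)
    negative = sum(z_score(f, frequencies[f]) for f in NEGATIVE)
    return positive - negative


# Made-up normalized frequencies for one research article.
article = {
    "private_verbs": 8.0,
    "contractions": 4.0,
    "nouns": 250.0,
    "prepositions": 135.0,
}
score = dimension_score(article)
print(round(score, 2))
```

With these made-up inputs the score is negative, which on the first dimension would place the text toward the informational pole.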

Multiple Correspondence Analysis
Multiple correspondence analysis (MCA) analyzes the relationships among several categorical dependent variables. It is a form of correspondence analysis (CA) that focuses on categorical variables rather than quantitative data. Many recent researchers have adopted this approach, but the current study analyzes the texts only through the multidimensional analysis approach.

Research Background
The current research aims to explore the functional variation in research articles across two academic disciplines, science and social sciences. Jin (2018) studied the discussion sections of research articles in the engineering discipline to explore co-occurring linguistic patterns and variations, using a multidimensional analysis technique to identify the co-occurring features. In another study, Jin (2021) examined the discussion sections of research articles to investigate patterns of linguistic characterization in the chemical engineering discipline. The findings showed that the highlighted linguistic features were associated with style and cultural authority, which had an influence on the writing manuals and teacher instruction.
Another study applied multidimensional analysis to Pakistani research articles to examine linguistic variation across academic disciplines. It explored the language of the academic register of Pakistani research articles using Biber's (1988) five textual dimensions. The findings showed that the language of Pakistani research articles was highly impersonal, non-narrative, explicit, and informational (Rashid & Mehmood, 2019). Significant differences have also been observed in the conclusion sections of natural science and social sciences research articles on D1, D2, and D4 (Liu & Xiao, 2022). The findings revealed that possibility and prediction modals (such as can or will) were frequently used in all six disciplines, pointing to future research directions. However, previous work has mostly addressed individual sections of research articles rather than the article as a whole, which is the main focus of the current work.

Methodology
This section describes the methods used for corpus compilation, the tools, and the analysis procedure. It then covers corpus codification, data collection, and finally data analysis.

Methodological framework
Numerous studies have been carried out on various aspects of research articles over the years (Jin, 2021). However, research on scientific and social scientific discourses has been insufficient. This study examines the functional variation in research articles across six disciplines of social sciences and science-related subjects along five dimensions. To differentiate the two discourses, the researcher applied the Multidimensional Analysis Tagger, which helped to observe the data from various viewpoints. Data were collected from different online journals. Furthermore, triangulation techniques were employed to explore the functional variation in science and social sciences research articles. The interpretation was then made in text form on the basis of the results. The results highlighted variation across different disciplines and fields as well as in their research articles.

Tools for Data Collection
The free online version of the Multidimensional Analysis Tagger (v 1.3.2) by Andrea Nini was used as the research tool; it provides a grammatically annotated version of 67 linguistic features for genre analysis. The tool replicates the tagger introduced by Biber (1988) for variation across speech and writing and employs the Stanford Tagger for part-of-speech tagging in order to identify Biber's (1988) patterns. The corpus of plain text files was tagged with MAT, which automatically produced annotated output files used to identify factors and co-occurring linguistic features. These output files were further analyzed and compared to evaluate the dimension scores of both genres and were graphed to show the variations. Analysis of variance (ANOVA) was also applied to explore the linguistic variation in science and social sciences research articles.
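To make the ANOVA step concrete, the following Python sketch computes a one-way ANOVA F statistic from per-article dimension scores grouped by discipline. It uses only the standard library, and the scores are invented for illustration; the study does not report per-article values or the software used for its ANOVA, so this should be read as one possible way to run the comparison rather than the procedure actually followed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up Dimension 1 scores for a handful of articles per group.
science = [-19.0, -18.0, -17.0]
social = [-15.0, -14.0, -13.0]
f_stat, df_b, df_w = one_way_anova_f([science, social])
print(f_stat, df_b, df_w)
```

A large F relative to the critical value for the given degrees of freedom would suggest that the group means differ on that dimension; obtaining a p-value would require an F distribution table or a statistics package.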

Corpus compilation
In this research, the researcher collected research articles related to science and social sciences in text form from different websites and journals. First, the researcher downloaded the research articles, 150 each for science and social sciences, from the internet in PDF format and converted them to .docx format through an online converter. After sifting, all the .docx files were converted into .txt file format to make them compatible with MAT. The data were arranged in two different files, one for science-related subjects and the other for social sciences-related subjects. Moreover, an Excel file was prepared for further correspondence.
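The bookkeeping described above can be sketched in Python as follows: the plain text files are assumed to sit in one folder per discipline, and a small inventory (group, file name, word count) is written to a CSV file that a spreadsheet program can open. The folder names, file names, and CSV layout are hypothetical choices for this example and are not taken from the study.

```python
import csv
import tempfile
from pathlib import Path


def build_inventory(corpus_root, inventory_path):
    """List every .txt file under each group folder and write a CSV inventory."""
    rows = []
    group_dirs = sorted(p for p in Path(corpus_root).iterdir() if p.is_dir())
    for group_dir in group_dirs:
        for text_file in sorted(group_dir.glob("*.txt")):
            # Whitespace-separated tokens serve as a rough word count.
            words = len(text_file.read_text(encoding="utf-8").split())
            rows.append({"group": group_dir.name, "file": text_file.name, "words": words})
    with open(inventory_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["group", "file", "words"])
        writer.writeheader()
        writer.writerows(rows)
    return rows


# Demonstration on a throwaway directory with two tiny placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "corpus"
    (root / "science").mkdir(parents=True)
    (root / "social_sciences").mkdir(parents=True)
    (root / "science" / "bio_001.txt").write_text("cells divide rapidly", encoding="utf-8")
    (root / "social_sciences" / "ling_001.txt").write_text("speakers vary", encoding="utf-8")
    inventory = build_inventory(root, Path(tmp) / "inventory.csv")
    print(inventory)
```

Keeping a word count per file in the inventory also makes later frequency normalization straightforward.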

Data Type and Sampling
The current study deals with textual data, which were stored in text file format to conduct the multidimensional analysis.

Sampling Criteria
The sampling criteria of the current study were specified solely to explore the functional variation across academic disciplines. The sample contains 300 files, 150 for each discipline, namely science and social sciences.

Sampling Type
A stratified sampling technique was used to collect research articles of science and social sciences from different journals.
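A stratified draw of this kind can be sketched in Python as below, treating each sub-discipline as a stratum and sampling the same number of articles from each. The candidate pool, the article identifiers, and the even split of 50 articles per sub-discipline are assumptions made for illustration; the study states only that 150 articles were collected for each of the two disciplines.

```python
import random


def stratified_sample(pool, per_stratum, seed=0):
    """Draw the same number of items at random from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {name: rng.sample(items, per_stratum) for name, items in pool.items()}


# Hypothetical pool: 80 candidate article IDs per sub-discipline.
SUB_DISCIPLINES = [
    "biology", "physics", "computer_science",
    "linguistics", "sociology", "psychology",
]
pool = {name: [f"{name}_{i:03d}" for i in range(80)] for name in SUB_DISCIPLINES}

# Assumption: an even split of 50 articles per sub-discipline (6 x 50 = 300).
sample = stratified_sample(pool, per_stratum=50)
total = sum(len(ids) for ids in sample.values())
print(total)
```

Sampling within each stratum separately guarantees that no sub-discipline is over- or under-represented in the final corpus.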

Sample Representativeness
The corpus of this study consists of research articles related to science and social sciences subjects.

Results and Findings
In this section, the researcher investigates the functional variation among the six sub-disciplines of science and social sciences along the five textual dimensions introduced by Biber (1988). Table 2 shows the normalized frequency of linguistic features in science and social sciences research articles. Nouns are the most frequently used part of speech in both disciplines, whereas prepositions are less frequently used in social sciences research articles. Additionally, the third person pronoun (TPP3) is used only in social sciences research articles. Because the exact number of words in each file was not known, the researcher normalized the raw frequencies to make the results comparable and generalizable, using this formula: normalized frequency = (tokens / total) × 100
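The normalization formula can be illustrated with a short Python sketch that converts raw feature counts into frequencies per 100 tokens. The feature names and counts below are made up for the example and do not come from Table 2.

```python
def normalize(raw_counts, total_tokens, base=100):
    """Convert raw feature counts to frequencies per `base` tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return {
        feature: count / total_tokens * base
        for feature, count in raw_counts.items()
    }


# Made-up raw counts for one text of 2,000 tokens.
raw = {"nouns": 520, "prepositions": 260, "third_person_pronouns": 10}
normalized = normalize(raw, total_tokens=2000)
print(normalized)
```

Normalizing in this way lets texts of different lengths be compared on the same scale, since each value expresses occurrences per 100 tokens rather than an absolute count.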

Figure 3 Comparison of Sciences and Social Sciences Research Articles on Dimensions
Figure 3 shows that the research articles on science-related topics were more informational, non-narrative, and abstract than those on social sciences. Features such as nouns, prepositions, and attributive adjectives were the prominent markers of informational discourse, while conjunctions and passives were the features of abstract discourse because they present information in a formal or technical way. Prepositions were a very commonly used part of speech in science research articles. The articles on social sciences-related topics were less informational, non-narrative, and non-argumentative. Present tense verbs were a feature of non-narrative discourse. All these features indicated that the research articles related to science subjects were informational. The results showed that variation exists between the research articles of science and social sciences on dimensions 1, 2, 4, and 5, and a minor difference was observed on dimension 3.
The researcher compared the individual disciplines of social sciences on all five dimensions given by Biber (1988). Figure 5 shows the variation within the research articles of social sciences, namely linguistics, sociology, and psychology. The research articles on psychology were more informational, non-narrative, explicit, and impersonal than the linguistics and sociology articles. Nominalizations and wh-clauses were the most common features of explicit discourse, whereas conjunctions and agentless passives were the most common features of abstract discourse. All three social sciences sub-disciplines were less persuasive, but articles on linguistics were the least persuasive of the three. The results showed that variation existed within the research articles of linguistics, sociology, and psychology on all dimensions.

Figure 6 Examples from Article's Text
In this text, nouns are used 20 times, adjectives 5 times, and nominalizations 10 times. The research articles on psychology were more informational, explicit, and impersonal.

Variation Within Research Articles of Sciences According to Biber's (1988) Dimensions
Variation is present within the research articles of science-related subjects, namely biology, physics, and computer science. The research articles on computer science were more explicit, and articles related to physics topics were more abstract.

Figure 7 Variation Within Research Articles of Sciences
Figure 7 shows that the research articles on biology are more informational than those on computer science, while research articles on physics are less informational. The research articles of computer science and physics are non-narrative, and those of biology are less so. Abstract features were observed in all three sub-disciplines of sciences. The researcher found variation within the research articles of science-related topics on the five dimensions. The results differed on dimensions 1, 3, 4, and 5. The results of biology and physics on dimension 2 were the same, which means that the research articles are non-narrative and descriptive in nature.

Types of Text in Both Disciplines
According to Biber (1988), the closest type of text in the sciences corpus was learned exposition, which is a characteristic of official papers and academic prose.On the other hand, the closest type of texts in social sciences research articles was scientific exposition.These text types were typically informational expositions and focused more on conveying information in a formal and very technical way.

Discussion
Factor 1 scores of sciences (-18.34) and social sciences (-14.16) research articles showed that features with negative weights were more frequent in both disciplines. These features, namely nouns, attributive adjectives, and prepositional phrases, occurred with higher frequency in science research articles than in social sciences research articles, which indicated that science research articles were more informationally focused and carefully written, and lacked the involvement that is the function of positively weighted features. The results of the current research were similar to those for the BAWE corpus (Gardner, 2015) on Factor 1.
Scores on Factor 2 indicated that features with negative values occurred more frequently than positive features in the research articles of sciences (-4.08) and social sciences (-2.35), which means the text was non-narrative, expository, and descriptive. Research articles on sciences were more descriptive and had more non-narrative concerns than those on social sciences. It can be said that the focus of research articles in both disciplines was on reporting matters of immediate concern rather than depicting past events. The findings of the current research matched Gardner's (2015) BAWE corpus research on Factor 2.
A cluster of features with positive values occurred in the science and social sciences research articles on Factor 3, indicating referentially explicit discourse. These features mark explicitness in a discourse that is integrated as well as informational. This indicates that both sciences and social sciences research articles utilized relativization to explicitly identify referents within their discourse.
Scores on Factor 4 indicated that research articles of both disciplines were less argumentative and lacked persuasive elements. The features with positive values made a low contribution to the research articles. Since elements of persuasion mark the speaker's personal point of view, and these were almost absent in the current findings, it can be assumed that there was no personal involvement and the text was purely informationally focused.
The co-occurring features on Factor 5 with positive values were more frequently used in sciences (5.04) and social sciences (4.15) research articles. This indicated that the text of the research articles was in a formal style, containing abstract and technical content. Overall, this factor marked that science and social sciences research articles had informational discourse with technical, formal, and abstract features. On the whole, the findings of the current research showed some similarity with Rashid's (2019) work on research articles.
Factor analysis of the sub-disciplines of science research articles showed that they were highly informational, non-narrative, less persuasive, and highly explicit. The following observations were made based on the analysis of research articles:
1. Factor 1 scores of the three sub-disciplines of science, namely biology (-19.35), physics (-18.31), and computer science (-17.37), were negative, which means that these research articles were packed with information. Nouns, prepositional phrases, and attributive adjectives were more frequent in biology research articles, so they were more informational than the other two, with slight variations.
2. Scores on Factor 2 indicated that all three disciplines of sciences lacked the features of narrative discourse, so the text of the research articles was descriptive. Computer science (-4.52) and physics (-4.54) RAs contained more present tense verbs and attributive adjectives than biology (-3.18) research articles. The focus of the research articles was on reporting matters of immediate concern.
3. Dimension scores on Factor 3 revealed that computer science (9.54) research articles were more explicit than the other two sub-disciplines, with little variation. The presence of relative clauses, nominalizations (NOMZ), and phrasal coordination means the discourse of the research articles was integrated and informational.
4. Negative scores on Factor 4 showed that the discourse of the science sub-disciplines was not argumentative; it lacked personal involvement and focused only on conveying information. Computer science (-2.14) research articles contained fewer elements of persuasiveness than physics (-4.58) and biology (-3.63) research articles.
5. Positive values on Factor 5, with slight variations, indicated that biology (4.6), computer science (5.02), and physics (5.5) research articles contained technical, formal, and abstract information.
According to the Multidimensional Analysis Tagger (MAT) results for the three sub-disciplines of social sciences research articles, the articles contain features of informational, non-narrative, explicit, and abstract discourse.
2. All three disciplines of social sciences had non-narrative, descriptive, and expository discourse with slight variations.
5. There are slight differences on Factor 5: research articles in psychology (5.13) contain more technical, formal, and abstract information than research articles in linguistics (3.69) and sociology (3.64).

Conclusion
Research conducted on different disciplines of social sciences and science showed variation among the six sub-disciplines. The research articles of science were more informational, non-narrative, explicit, and abstract. The research articles of social sciences were less informational, non-narrative, and abstract. To further enhance the current research, it is recommended to expand the sample size, incorporate advanced statistical tools, and compare the findings with relevant research studies, including dissertations, while considering the theoretical foundation of the current research.
The current research has some limitations. The corpus of research articles (RAs) used in this study comprised only two major disciplines, science and social sciences, each represented by 150 research articles. To extend the current study, other academic disciplines can also be compared on functional dimensions to understand their discourse and functional implications. Such a study of other academic disciplines may provide more insights into the language of this register.

Limitations
The current research has some limitations. The corpus of research articles (RAs) used in this study covers only two disciplines (sub-genres).

Rashid (2019) employed the multidimensional approach to investigate Pakistani academic journal articles and the linguistic variation in their research sections. Biber's (1988) MD analysis model constituted the theoretical background of the study. The ANOVA results for the Pakistani academic journal articles revealed significant differences among research sections. The results indicated that academic research articles in Pakistani journals tend to be explicit, informational, non-argumentative, and impersonal. In 2022, Liu conducted a multidimensional analysis of the conclusion sections of selected research articles and observed significant differences.

Figure 4 Examples from the Article's Text

Figure 5 Variation within Research Articles of Social Science

Table 1
Number of Research Articles in Terms of Selected Disciplines
Both corpora consisted of 150 text files in the form of research articles. This sample size was large enough to obtain valid results, as this research deals with corpora that consist of millions of words.

Table 2
Normalized Frequency of Sciences and Social Sciences Research Articles