Review | Open Access
Pathogenic SNP Network Enhancing IL-2/JAK-STAT Signaling and T-Cell Responses in Celiac Disease
Celiac disease is an immune-mediated enteropathy caused by abnormal T-cell activation triggered by dietary gluten. The genetic mechanisms that amplify this immune response remain poorly defined. This paper examines four immune-regulatory single nucleotide polymorphisms (SNPs), namely rs11712165 (CD80), rs3184504 (SH2B3), rs243323 (SOCS1), and rs2298428 (UBE2L3), to investigate their possible combined effects on cytokine signaling dysregulation in celiac disease. Using an integrative bioinformatics strategy that combined functional annotation tools, GTEx eQTL expression analysis, KEGG pathway mapping, and drug-gene interaction databases, the study identified a pattern of pathway-level disruption converging on the IL-2/JAK-STAT axis. Increased CD80 expression was associated with stronger T-cell co-stimulation and augmented IL-2 secretion, whereas decreased SH2B3 expression reduced inhibitory adapter activity and facilitated hyper-responsiveness to cytokine signaling. Concurrent reductions in SOCS1 and UBE2L3 expression weakened negative feedback and ubiquitin-mediated signal termination, allowing intracellular signaling to persist beyond normal regulation. Together, these variant-associated changes form a coordinated mechanistic pattern in which IL-2/JAK-STAT signaling is amplified and prolonged, promoting T-cell activation and intestinal inflammation. This model is consistent with recognized immunopathological features of celiac disease, in which excessive IL-2 signaling and dysregulated T-cell responses contribute to mucosal damage. Drug-interaction screening also indicated that several approved therapeutics target CD80 and JAK kinases, which could provide opportunities for pathway-specific interventions.
Overall, this paper illustrates how specific immune-regulatory SNPs may act together to create a pro-inflammatory signaling environment in celiac disease and offers a genetically informed framework for future therapeutic work.
Celiac disease (CeD) is a chronic, immune-mediated enteropathy triggered by dietary gluten in genetically predisposed individuals. It affects approximately 1% of the global population and is characterized by villous atrophy, crypt hyperplasia, and a broad spectrum of gastrointestinal and extra-intestinal manifestations [1]. Although HLA-DQ2/DQ8 haplotypes are required for disease development, their presence is not sufficient to explain disease susceptibility: these haplotypes are widespread in the general population, yet not all carriers develop CeD [2]. This gap motivates the study of non-HLA immune-regulatory genes that control T-cell activation, cytokine signaling, and mucosal inflammation.
Genome-wide association studies have revealed over 41 non-HLA loci associated with CeD. Most of these loci map to immune pathways that mediate cytokine signaling, T-cell growth, and antigen presentation [3, 4]. Some affect genes that directly regulate the IL-2/JAK-STAT axis, one of the key pathways controlling T-cell expansion and effector function. Specifically, variants in CD80, SH2B3, SOCS1, and UBE2L3 have been reported repeatedly in autoimmune conditions, including CeD. These genes mediate co-stimulatory signaling, cytokine receptor sensitivity, SOCS protein-mediated negative feedback, and ubiquitin-mediated turnover of signaling proteins [5, 6]. Functional alteration of these genes can therefore lead to heightened and prolonged T-cell activation.
In CeD, gluten-specific CD4+ T cells release excessive IL-2 upon activation, which drives JAK1/JAK3-mediated phosphorylation of STAT3 and STAT5 and thereby stimulates T-cell proliferation and inflammation [7]. Sustained IL-2/JAK-STAT signaling has been reported to contribute to mucosal injury and immune amplification in CeD, which underscores the importance of upstream genetic regulators of this process [8].
This study uses an integrative bioinformatics analysis to assess the potential of four major SNPs, namely rs11712165 (CD80), rs3184504 (SH2B3), rs243323 (SOCS1), and rs2298428 (UBE2L3), to dysregulate immune signaling. These variants were selected for their high regulatory or structural potential, as indicated by RegulomeDB, GTEx eQTL mapping, I-Mutant structural predictions, and pathway enrichment analyses. Genomic annotation, tissue-specific expression effects, protein-protein interactions, and pathway mapping were combined to build a mechanistic model linking variant-driven gene dysregulation to amplification of IL-2/JAK-STAT signaling.
In addition, because these genes cluster around a therapeutically tractable signaling axis, drug-gene interaction screening was included to identify available pharmacological agents that target this pathway. No direct drug interactions were found for SOCS1 or UBE2L3, but CD80 and the JAK kinases were linked to clinically approved immunomodulatory agents, suggesting potential translational uses of pathway-guided therapy.
Although understanding of genetic susceptibility in celiac disease is growing, the mechanistic interplay between SNP variants and immune signaling pathways is not yet fully understood. In particular, the synergistic effects of regulatory variants that influence cytokine signaling and T-cell activation have not been adequately studied. This study therefore conducted an integrative bioinformatics analysis of four immune-related SNPs (rs11712165, rs3184504, rs243323, and rs2298428) to understand how they might act together to alter IL-2/JAK-STAT signaling, T-cell activation, and inflammatory responses in celiac disease.
This study set out to identify genes and SNPs implicated in celiac disease. Genes associated with celiac disease were retrieved from publicly available databases and genome-wide association studies (GWAS). These genes were selected on the basis of their known functions in immune regulation, intestinal barrier integrity, and the inflammatory response [9, 10]. GWAS data and the scientific literature were then used to retrieve SNPs related to celiac disease. The SNPs were verified against databases including dbSNP and Ensembl with respect to chromosomal location, reference and alternate nucleotides, and associated gene or intergenic region. Only SNPs with confirmed genomic annotation and a well-supported association with celiac disease were included in the final dataset [11].
2.2. Identification and Analysis of Celiac Disease-Associated SNPs
Once the gene set was established, SNPs within or near these genes were identified using genomic databases. Each variant was characterized by its chromosomal location, reference and alternate alleles, and functional region (coding, intronic, or intergenic) [12]. The data were cross-checked against the published literature and the UCSC Genome Browser. Where applicable, the amino acid change corresponding to each SNP was retrieved for later use in functional interpretation and variant-effect prediction; the results of these steps are reported in sequence [13].
2.3. Prediction of Potentially Deleterious SNPs
The identified SNPs were then filtered for possible deleterious effects using computational prediction tools including SIFT, PolyPhen-2, MutationTaster, and LRT (Likelihood Ratio Test) [14-16]. These algorithms estimate the likelihood that a specific amino acid or nucleotide change affects protein function, based on sequence conservation, physicochemical characteristics, and evolutionary background. SNPs predicted to be damaging or potentially damaging by multiple tools were prioritized for further study [17].
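The multi-tool prioritization described above can be sketched as a simple voting scheme. This is only an illustration: the label vocabulary, the `min_votes=2` cutoff (the text says only "multiple tools"), and the three example variants are assumptions, not the study's data or exact rule.

```python
# Labels treated as "damaging" for each tool (assumed vocabulary for this sketch).
DAMAGING_LABELS = {
    "SIFT": {"deleterious"},
    "PolyPhen-2": {"probably damaging", "possibly damaging"},
    "MutationTaster": {"disease causing"},
    "LRT": {"deleterious"},
}

def damaging_votes(calls):
    """Count how many tools label the variant as damaging."""
    return sum(
        1 for tool, label in calls.items()
        if label.lower() in DAMAGING_LABELS.get(tool, set())
    )

def prioritize(variants, min_votes=2):
    """Return variant IDs flagged by at least `min_votes` tools, most votes first."""
    scored = {vid: damaging_votes(calls) for vid, calls in variants.items()}
    kept = [vid for vid, votes in scored.items() if votes >= min_votes]
    return sorted(kept, key=lambda vid: (-scored[vid], vid))

# Hypothetical variants; these calls are placeholders, not results from the study.
example = {
    "variant_A": {"SIFT": "deleterious", "PolyPhen-2": "probably damaging",
                  "MutationTaster": "disease causing", "LRT": "neutral"},
    "variant_B": {"SIFT": "tolerated", "PolyPhen-2": "benign",
                  "MutationTaster": "polymorphism", "LRT": "neutral"},
    "variant_C": {"SIFT": "deleterious", "PolyPhen-2": "possibly damaging",
                  "MutationTaster": "polymorphism", "LRT": "neutral"},
}
print(prioritize(example))  # ['variant_A', 'variant_C']
```

Requiring agreement between tools trades sensitivity for specificity; a stricter or looser `min_votes` would change which variants are carried forward.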
2.4. Functional Analysis of Celiac Disease-Associated SNPs
Functional analysis of the top SNPs was conducted to gain further insight into their biological importance. Each variant was annotated by genomic location (exonic, intronic, or intergenic) and assessed for potential regulatory activity, such as transcription factor binding or enhancer activity. Regulatory potential was scored in RegulomeDB and HaploReg by combining data on epigenetic marks, DNase hypersensitivity sites, and histone modifications. This approach was used to better understand how non-coding variants may lead to the gene expression changes associated with celiac disease [12, 17].
2.5. Structural Stability Analysis of Protein Variants
In addition to functional annotation, nonsynonymous SNPs (nsSNPs) were analyzed to predict their effect on protein structural stability. The change in Gibbs free energy (ΔΔG) between wild-type and mutant proteins was calculated using I-Mutant 3.0 and SNPs3D. Negative ΔΔG values indicate a decrease in stability, that is, the amino acid substitution destabilizes the protein structure and may impair its normal function. This structural stability analysis helped identify the SNPs most likely to cause deleterious conformational changes in disease-relevant proteins [18, 19].
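As a minimal sketch of how ΔΔG values can be interpreted, the snippet below assumes the convention ΔΔG = ΔG(mutant) − ΔG(wild type), under which negative values mean reduced stability, and applies the −1.0 cutoff used later in Section 3.5. The function names and the "mildly destabilizing" middle class are illustrative assumptions; the three example values are taken from Table 3.

```python
def ddg(dg_mutant, dg_wild_type):
    """Free-energy change difference (assumed convention: mutant minus wild type)."""
    return dg_mutant - dg_wild_type

def classify(ddg_value, cutoff=-1.0):
    """Label a ddG value; values below `cutoff` are treated as destabilizing."""
    if ddg_value < cutoff:
        return "destabilizing"
    if ddg_value < 0:
        return "mildly destabilizing"
    return "not destabilizing"

# ddG values as reported in Table 3 of this paper.
reported = {"rs192900921": -5.22, "rs11712165": -0.63, "rs3184504": 1.11}
for snp, value in reported.items():
    print(snp, value, classify(value))
```

Under these assumptions rs192900921 falls in the destabilizing class, rs11712165 is only mildly destabilizing, and rs3184504 is not predicted to reduce stability.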
2.6. Pathway Enrichment Analysis
Enrichr was used to study the collective biological relevance of the identified genes [20]. Pathways enriched in the set of genes carrying deleterious SNPs were identified using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome databases [21, 22]. This step aimed to determine whether these genes converge on shared biological pathways, especially those linked to immune signaling, cytokine activity, or apoptotic mechanisms that are commonly dysregulated in celiac disease.
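The statistic behind this kind of over-representation analysis can be illustrated with a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. The sketch below is not Enrichr's implementation; the function name and all example numbers (universe size, pathway size, overlap) are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_enrichment_p(universe, pathway_size, query_size, overlap):
    """P(X >= overlap) when `query_size` genes are drawn from `universe` genes,
    of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, query_size)
    total = comb(universe, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 10 query genes, a 100-gene pathway, a 20,000-gene background.
p_value = hypergeom_enrichment_p(universe=20000, pathway_size=100,
                                 query_size=10, overlap=3)
print(f"p = {p_value:.3g}")
```

With small gene lists such as the one analyzed here, a pathway can reach nominal significance with only two or three member genes, so enrichment results are best read alongside the underlying gene overlap.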
2.7. Protein-Protein Interaction (PPI) Network Analysis
A protein-protein interaction network of the prioritized genes was constructed with the STRING database to examine their molecular connectivity [23]. The resulting network visualization made it possible to identify central hub proteins and functional modules that may be critical in disease progression. This integrative systems biology approach provided insight into how genetic variants may act together to alter molecular and cellular processes in celiac disease.
2.8. Linkage Disequilibrium Analysis using LDlink (LDproxy Tool)
Each of the four lead variants (rs11712165, rs3184504, rs243323, and rs2298428) was analyzed for linkage disequilibrium (LD) with the LDproxy module of the LDlink web application developed by the National Cancer Institute. The SNPs were queried against the 1000 Genomes Project CEU (Northern and Western European ancestry) population panel. The tool was used to retrieve proxy variants in high LD with the lead SNPs, defined as r² ≥ 0.8. For each proxy, the following information was tabulated: SNP identifier, chromosomal coordinates, r² and D′ values, minor allele frequency, and basic functional annotations. All proxy SNPs identified in this way were listed to support downstream functional and regulatory annotation and to assess whether the observed functional signals could be explained by neighboring variants in linkage [24].
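For readers less familiar with the two LD metrics tabulated above, the following sketch computes r² and |D′| from haplotype and allele frequencies using the standard definitions, then applies the r² ≥ 0.8 proxy filter. The function names, the record layout, and the proxy entries are hypothetical and do not reproduce LDproxy output.

```python
def ld_metrics(p_ab, p_a, p_b):
    """Return (r2, |D'|) given the frequency of the A-B haplotype (p_ab)
    and the allele frequencies p_a and p_b at the two loci."""
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime

def high_ld_proxies(proxies, r2_min=0.8):
    """Keep proxy records whose r2 with the lead SNP meets the threshold."""
    return [p for p in proxies if p["r2"] >= r2_min]

# Hypothetical proxy records (identifiers and values are placeholders).
proxies = [
    {"proxy": "proxy_1", "r2": 0.95, "d_prime": 1.00},
    {"proxy": "proxy_2", "r2": 0.62, "d_prime": 0.90},
    {"proxy": "proxy_3", "r2": 0.80, "d_prime": 0.93},
]
print([p["proxy"] for p in high_ld_proxies(proxies)])  # ['proxy_1', 'proxy_3']
```

Because D′ can equal 1 even when allele frequencies differ substantially, r² is the more conservative choice for defining proxies, which matches the threshold used here.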
2.9. Expression Quantitative Trait Loci (eQTL) Analysis using GTEx v10
The GTEx v10 portal was used to assess the possible effect of each selected SNP on gene expression across human tissues [25]. Each variant was searched individually in the GTEx interface, and all reported tissue-specific normalized effect sizes (NES) were collected. Statistically significant eQTL associations were recorded for each SNP-gene pair according to the significance thresholds provided by the GTEx database. For every tissue with a significant association, the direction of the expression effect (positive or negative NES) was recorded. Tissue distribution plots, where available, were also reviewed to put tissue-specific relevance into perspective. The NES values and tissue associations were organized systematically for later integration with the biological pathway mapping.
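One way to organize the collected NES values is to group them by SNP-gene pair and by direction of effect, as sketched below. The record layout, identifiers, and NES values are placeholders introduced for illustration; they are not GTEx output and the study's actual tabulation may differ.

```python
from collections import defaultdict

def summarize_eqtls(records):
    """Group significant eQTL hits by SNP-gene pair and by sign of the NES."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for rec in records:
        direction = "positive" if rec["nes"] > 0 else "negative"
        summary[(rec["snp"], rec["gene"])][direction].append(
            (rec["tissue"], rec["nes"])
        )
    return dict(summary)

# Placeholder records; identifiers and NES values are illustrative only.
records = [
    {"snp": "snp_1", "gene": "GENE_X", "tissue": "Whole Blood", "nes": 0.40},
    {"snp": "snp_1", "gene": "GENE_X", "tissue": "Spleen", "nes": 0.25},
    {"snp": "snp_2", "gene": "GENE_Y", "tissue": "Whole Blood", "nes": -0.30},
]
for pair, hits in summarize_eqtls(records).items():
    print(pair, hits)
```

Keeping the tissue alongside each NES makes it straightforward to check whether the direction of effect is consistent across immune-relevant tissues before it is mapped onto a pathway.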
2.10. Integrated Mechanistic Pathway and Drug Interaction Analysis
2.10.1. Pathway Mapping and Mechanistic Integration. Pathway mapping with KEGG pathway maps, the Reactome pathway database, and the literature was carried out to contextualize the functional annotations of the SNP-associated genes [21, 22]. The four genes (CD80, SH2B3, SOCS1, and UBE2L3) were individually queried in KEGG and Reactome to identify associated pathways, upstream regulators, downstream signaling components, and other molecular interactions. The extracted pathway information was used to locate each gene within the broader immune and cell-signaling architecture. Based on these curated references, a mechanistic schema was built to visually connect the variant-associated genes with their pathway components. This step was limited to pathway identification and positional mapping; no biological inferences were drawn at this stage.
Figure 1. Bioinformatics workflow of the analysis of a functional SNP and its potential influence on immune signaling pathways in celiac disease.
2.10.2. Drug-Gene Interaction Screening (DGIdb). Drug-gene interaction screening was performed with the Drug-Gene Interaction Database (DGIdb) to find possible therapeutic agents against the genes of interest [26]. The genes (CD80, SH2B3, SOCS1, UBE2L3) were queried in DGIdb, and all drug interactions, interaction types, supporting evidence, and drug approval status were retrieved. DGIdb annotations served as the basis for documented drug mechanisms, with cross-reference to supporting literature where necessary. Genes with no reported drug interactions were carried forward to pathway-level therapeutic exploration. The retrieved data were compiled into a structured dataset for downstream analysis.
2.10.3. Pathway-Level Drug Target Identification. For genes without direct drug interactions, therapeutic candidates were sought at the pathway level to cover the broader signaling processes. DrugBank, KEGG DRUG, and the literature were queried to find approved or investigational agents that act on key pathway components [27]. Drugs acting on cytokine receptors, JAK kinases, IL-2 signaling components, and ubiquitin-proteasome system elements were searched. Drug names, molecular targets, modes of action, and approval status were recorded. These agents were compiled into a reference pool that was then incorporated into the mechanistic framework. This procedure was intended solely to collect pathway-targeting compounds, without interpreting their therapeutic significance.
Literature review and database mining were used to create a comprehensive list of genes related to celiac disease. The genes retrieved had previously been reported in genome-wide association studies (GWAS) and peer-reviewed articles as involved in immune response, intestinal barrier regulation, and inflammation. These genes and their functional relevance were validated using databases such as the GWAS Catalog, the UCSC Genome Browser, and HaploReg v4.2. As a result, a list of immunological and signaling genes, including IL2, SOCS1, UBE2L3, PFKFB3, DAD1, PKIA, ABCC9, and KCNK10, was shortlisted for SNP-level screening.
Table 1. List of SNPs associated with celiac disease from HaploReg v4.2 and UCSC Genome browser.
| rsID | Position | Gene | Ref/Alt | Region |
|---|---|---|---|---|
| rs117128341 | chr8:79449773 | PKIA | G/A | Intronic |
| rs73687528 | chr8:79430748 | PKIA | A/G | Intronic |
| rs79215674 | chr8:79479883 | PKIA | C/T | Intronic |
| rs74450608 | chr8:79587012 | IL7 | T/C | Intronic |
| rs79374792 | chr8:79590478 | IL7 | G/A | Intronic |
| rs117139146 | chr10:6200555 | PFKFB3 | C/T | Intronic |
| rs17810546 | chr3:159665049 | IL12A/SCHIP1 | A/G | Intergenic |
| rs183665868 | chr18:8103224 | PTPRM | G/A | Intronic |
| rs189838725 | chr2:216021219 | ATIC | G/T | Intergenic |
| rs192900921 | chr6:40402691 | LRFN2 | G/A | Intronic |
| rs32723 | chr5:3451905 | LINC01019 | T/G | Intronic |
| rs32726 | chr5:3452577 | LINC01019 | T/C | Intronic |
| rs32727 | chr5:3452885 | LINC01019 | G/C | Intronic |
| rs3748816 | chr1:2526745 | MMEL1 | A/G | Missense |
| rs576626084 | chr2:54974118 | EML6 | C/G | Intergenic |
| rs761616279 | chr12:22005548 | ABCC9 | T/C | Intergenic |
| rs769377780 | chr14:88777583 | KCNK10 | C/T | Intergenic |
| rs5979785 | chrX:12971523 | TLR7 | C/T | Intergenic |
| rs653178 | chr12:112007755 | SH2B3 | C/T | Intronic |
| rs13010713 | chr2:181996044 | ITGA4 | A/G | Intergenic |
| rs13151961 | chr4:123115501 | IL2 | A/G | Intronic |
| rs11712165 | chr3:119118795 | CD80 | T/G | Intronic |
| rs10936599 | chr3:169492100 | MYNN | C/T | Synonymous |
| rs13098911 | chr3:46235200 | CCR1, CCR3 | C/T | Intergenic |
| rs17810546 | chr3:159665049 | IL12A/SCHIP1 | A/G | Intergenic |
| rs2327832 | chr6:137973067 | OLIG3 | A/G | Intergenic |
| rs2298428 | chr22:21982891 | UBE2L3 | C/T | Missense |
| rs3184504 | chr12:111884607 | SH2B3 | T/C | Missense |
| rs76830965 | chr3:159637677 | IL12A/SCHIP1 | C/A | Intergenic |
| rs7616215 | chr3:46205685 | CCR1, CCR3 | C/T | Intergenic |
| rs17264332 | chr6:138005514 | OLIG3 | A/G | Intergenic |
| rs243323 | chr16:11361201 | SOCS1 | A/G | Intergenic |
| rs13132308 | chr4:123551113 | IL2 | A/G | Intergenic |
| rs990171 | chr2:103086769 | IL18R1, IL18RAP | A/C | Intergenic |
| rs551170288 | chr2:60664485 | MIR4432HG, BCL11A | C/G | Intergenic |
| rs780153546 | chr14:22910739 | LOC105370401, DAD1 | C/A | Intergenic |
| rs137888770 | chr15:46555534 | LOC105370802, SEMA6D | G/A | Intergenic |
| rs1002929661 | chr15:55560198 | RAB27A | C/T | Intergenic |
| rs894868996 | chr15:55868822 | PYGO1 | G/A | Intergenic |
| rs945505625 | chr17:60075777 | MED13 | T/C | Intergenic |
After the genes were identified, SNPs associated with the selected celiac disease-related genes were collected by combining literature-based knowledge with bioinformatics databases. The literature review ensured that variants previously reported as clinically or functionally important were included, while additional variants were obtained from databases such as HaploReg v4.2 and dbSNP to keep the catalogue of variants complete. This combined approach allowed both experimentally validated and computationally predicted variants to be included.
A total of 41 SNPs were detected in different genes. All SNPs were mapped to their respective chromosomes, genomic regions, and allele types. The majority lay in intronic or intergenic regions, suggesting that their effects are more likely mediated by regulatory mechanisms than by direct coding alterations. A smaller group of variants, comprising rs3748816 (MMEL1), rs2298428 (UBE2L3), and rs3184504 (SH2B3), fell within exonic regions as missense changes, indicating potential structural effects. Table 1 presents the full set of these SNPs along with their chromosomal location, associated gene, and type of variation. It summarizes the genomic architecture of the analyzed variants, which forms the foundation for the subsequent functional, structural, and pathway-based assessment.
Figure 2. Visual Representation of Transcription Factor Binding Potential and Chromatin Accessibility of SNPs Identified using RegulomeDB. Variants rs243323, rs11712165, rs2298428, and rs3184504 Show the Strongest Regulatory Potential, So These Loci May Affect Transcription Factor Binding and Gene Expression in Immune-related Tissues. These Results Support the Hypothesis that Regulatory SNPs are Involved in the Dysregulation of Immune Signaling in Celiac Disease.
Several in silico tools were used to predict the functional significance of the identified SNPs. RegulomeDB and RegPotential scoring identified several variants with high regulatory potential. SNPs with low RegulomeDB ranks (1f to 3a) showed strong evidence of transcription factor binding or enhancer activity. Notably, rs243323 (SOCS1), rs11712165 (CD80), rs2298428 (UBE2L3), and rs3184504 (SH2B3) were predicted to play critical regulatory roles. Table 2 summarizes the regulatory potential and conservation values, whereas Figure 2 depicts the predicted effects of the SNPs on transcription factor binding and chromatin accessibility. These findings suggest that several variants can alter gene activity by changing promoter accessibility and transcriptional regulation, especially of immune-related genes.
Table 2. Functional analysis of celiac disease-associated SNPs.
| Sr. no. | SNP ID | RegPotential* | Conservation** | RegulomeDB Score*** |
|---|---|---|---|---|
| 1 | rs117128341 | - | - | No Data |
| 2 | rs73687528 | - | - | No Data |
| 3 | rs79215674 | - | - | No Data |
| 4 | rs74450608 | - | - | No Data |
| 5 | rs79374792 | - | - | 6 |
| 6 | rs117139146 | - | - | No Data |
| 7 | rs17810546 | 0 | 0 | 4 |
| 8 | rs183665868 | - | - | No Data |
| 9 | rs189838725 | - | - | No Data |
| 10 | rs192900921 | - | - | 5 |
| 11 | rs32723 | 0.001756 | 0 | 5 |
| 12 | rs32726 | 0 | 0 | 6 |
| 13 | rs32727 | 0 | 0 | No Data |
| 14 | rs3748816 | 0.206919 | 0 | 4 |
| 15 | rs576626084 | - | - | - |
| 16 | rs761616279 | - | - | - |
| 17 | rs769377780 | - | - | - |
| 18 | rs5979785 | 0 | 0 | No Data |
| 19 | rs653178 | 0 | 0 | 5 |
| 20 | rs13010713 | 0 | 0 | 5 |
| 21 | rs13151961 | 0 | 0.001 | 6 |
| 22 | rs11712165 | 0.049487 | 0.847 | 3a |
| 23 | rs10936599 | 0.272578 | 1 | 5 |
| 24 | rs13098911 | NA | 0.001 | 3a |
| 25 | rs17810546 | - | - | 4 |
| 26 | rs2327832 | 0 | 0.066 | 6 |
| 27 | rs2298428 | 0.366729 | 1 | 4 |
| 28 | rs3184504 | 0.287954 | 0.005 | 3a |
| 29 | rs76830965 | - | - | 5 |
| 30 | rs7616215 | NA | 0 | 4 |
| 31 | rs17264332 | 0 | 0 | 6 |
| 32 | rs243323 | 0 | 0.086 | 1f |
| 33 | rs13132308 | 0 | 0.02 | 6 |
| 34 | rs990171 | NA | 0.199 | 5 |
| 35 | rs551170288 | - | - | - |
| 36 | rs780153546 | - | - | - |
| 37 | rs137888770 | - | - | 6 |
| 38 | rs1002929661 | - | - | - |
| 39 | rs894868996 | - | - | - |
| 40 | rs945505625 | - | - | - |
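The rank-based prioritization summarized in Table 2 can be sketched as follows. RegulomeDB ranks are ordinal categories (lower rank means stronger regulatory evidence), so they need an explicit ordering rather than numeric comparison. The helper names and the 3a cutoff are assumptions for this sketch; the scores are a subset copied from Table 2.

```python
# RegulomeDB rank categories, ordered from strongest to weakest evidence.
RANK_ORDER = ["1a", "1b", "1c", "1d", "1e", "1f",
              "2a", "2b", "2c", "3a", "3b", "4", "5", "6", "7"]

def rank_index(score):
    """Position of a rank in RANK_ORDER, or None for missing values."""
    return RANK_ORDER.index(score) if score in RANK_ORDER else None

def top_regulatory(scores, cutoff="3a"):
    """Return SNPs ranked at `cutoff` or better, strongest evidence first."""
    limit = RANK_ORDER.index(cutoff)
    ranked = [(rank_index(s), snp) for snp, s in scores.items()
              if rank_index(s) is not None]
    return [snp for idx, snp in sorted(ranked) if idx <= limit]

# Subset of Table 2 (SNP ID -> RegulomeDB score).
table2_subset = {
    "rs243323": "1f", "rs11712165": "3a", "rs13098911": "3a",
    "rs3184504": "3a", "rs2298428": "4", "rs13151961": "6",
    "rs117128341": "No Data",
}
print(top_regulatory(table2_subset))
# ['rs243323', 'rs11712165', 'rs13098911', 'rs3184504']
```

With this cutoff, rs2298428 (rank 4 in Table 2) is not retained by rank alone, so its prioritization presumably rests on the other evidence reported for it, such as its missense annotation and conservation value.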
Functional analysis also indicates how these variants may contribute to celiac disease. Variants in or near immune-associated genes were predicted to affect key molecular mechanisms, including cytokine signaling, apoptosis, and immune regulation. Variants such as rs243323 (SOCS1) and rs11712165 (CD80) were predicted to be regulatory and might alter transcription factor activity. In contrast, rs2298428 (UBE2L3) and rs3184504 (SH2B3) are coding variants likely to disrupt protein function or signaling pathways. Table 3 summarizes the high-impact SNPs whose predicted impact is structural, regulatory, or both. For example, rs243323 showed high regulatory potential together with a modest destabilizing effect (ΔΔG = −1.04), suggesting a dual influence on gene regulation and protein stability. By comparison, rs192900921 (LRFN2) had a ΔΔG of −5.22, the most destabilizing change among all variants.
3.5. Structural Stability Analysis of Protein Variants
The effects of amino acid substitutions on protein structural stability were estimated with I-Mutant 3.0. The calculated ΔΔG values showed that a number of non-synonymous variants were predicted to destabilize protein structure. Variants with ΔΔG values below −1.0 were considered structurally destabilizing.
Table 3. Summary of High-impact SNPs Associated with Celiac Disease
| SNP ID | Gene | Region | RegulomeDB Score | ΔΔG (I-Mutant) | Functional Category | Predicted Impact |
|---|---|---|---|---|---|---|
| rs243323 | SOCS1 | Intergenic | 1f | −1.04 | Regulatory / Structural | Highest regulatory potential; may influence TF binding and immune-gene regulation |
| rs11712165 | CD80 | Intronic | 3a | −0.63 | Regulatory | Moderate enhancer activity; possible effect on T-cell co-stimulation |
| rs13098911 | CCR1/CCR3 | Intergenic | 3a | −0.33 | Regulatory | May alter promoter accessibility in immune-cell loci |
| rs2298428 | UBE2L3 | Missense | 4 | −1.12 | Structural / Regulatory | Deleterious coding variant; potential effect on ubiquitin-pathway signaling |
| rs3184504 | SH2B3 | Missense | 3a | 1.11 | Regulatory | Reported autoimmune-risk variant; alters cytokine signaling |
| rs192900921 | LRFN2 | Intronic | 5 | −5.22 | Structural | Strongly destabilizing; may disrupt protein folding |
| rs769377780 | KCNK10 | Intergenic | 5 | −3.46 | Structural | Predicted to decrease protein stability; affects ion-channel regulation |
| rs894868996 | PYGO1 | Intergenic | – | −1.87 | Structural | Reduces stability of Wnt-pathway co-activator |
| rs780153546 | DAD1 | Intergenic | – | −1.83 | Structural | Destabilizing variant; may affect apoptosis-related protein |
| rs137888770 | SEMA6D | Intergenic | 6 | −1.32 | Structural | Mild destabilization; may influence neuronal/immune signaling |
| rs551170288 | BCL11A | Intergenic | – | −1.44 | Structural | Slightly destabilizing; may affect transcriptional regulation |
| rs13151961 | IL2 | Intronic | 6 | −1.04 | Structural / Regulatory | Weak regulatory evidence but minor destabilizing effect on cytokine structure |
Table 4. SNPs Involved in Destabilizing Protein Structure
| rsID | Position | Gene/Region | Ref/Alt | Amino Acid Change | Position in Protein | I-Mutant ΔΔG | 3D SNP Score |
|---|---|---|---|---|---|---|---|
| rs117128341 | chr8:79449773 | PKIA | G/A | S/N | 59 | -1.22 | 2.38 |
| rs73687528 | chr8:79430748 | PKIA | A/G | S/N | 59 | -1.22 | 8.69 |
| rs79215674 | chr8:79479883 | PKIA | C/T | S/N | 59 | -1.22 | 1.55 |
| rs117139146 | chr10:6200555 | PFKFB3 | C/T | A/T | 115 | -1.33 | 6.24 |
| rs183665868 | chr18:8103224 | PTPRM | G/A | A/T | 75 | -1.08 | 1.03 |
| rs192900921 | chr6:40402691 | LRFN2 | G/A | S/G | 68 | -5.22 | 1.7 |
| rs761616279 | chr12:22005548 | ABCC9 | T/C | N/D | 8 | -1.42 | 2.4 |
| rs769377780 | chr14:88777583 | KCNK10 | C/T | T/A | 4 | -3.46 | 2.98 |
| rs13151961 | chr4:123115501 | IL2 | A/G | T/A | 23 | -1.04 | 1.24 |
| rs2298428 | chr22:21982891 | UBE2L3, YDJC | C/T | K/E | 20 | -1.12 | 63.92 |
| rs243323 | chr16:11361201 | SOCS1 | A/G | V/I | 13 | -1.04 | 7.76 |
| rs13132308 | chr4:123551113 | IL2 | A/G | T/A | 23 | -1.04 | 1.84 |
| rs551170288 | chr2:60664485 | MIR4432HG, BCL11A | C/G | A/G | 129 | -1.44 | 9.48 |
| rs780153546 | chr14:22910739 | LOC105370401, DAD1 | C/A | I/T | 20 | -1.83 | 0.71 |
| rs137888770 | chr15:46555534 | LOC105370802, SEMA6D | G/A | V/F | 3 | -1.32 | 1.12 |
| rs1002929661 | chr15:55560198 | RAB27A | C/T | S/N | 56 | -1.79 | 39.91 |
| rs894868996 | chr15:55868822 | PYGO1 | G/A | S/A | 11 | -1.87 | 5.8 |
Table 4 summarizes the variants that are predicted to destabilize their proteins. The most disruptive substitutions were noted for rs192900921 (LRFN2, ΔΔG = −5.22), rs769377780 (KCNK10, ΔΔG = −3.46, as listed in Tables 3 and 4), and rs894868996 (PYGO1, ΔΔG = −1.87). Other variants, including rs551170288 (BCL11A) and rs780153546 (DAD1), also showed substantial negative energy shifts, which may indicate changes in protein folding and degradation potential. These results suggest that structural destabilization may be one of the mechanisms of disease susceptibility.
3.6. Pathway Enrichment Analysis
Enrichr pathway enrichment analysis identified biological networks affected by the prioritized genes. Pathways significantly associated with the SNP-bearing genes were determined using the KEGG and Reactome databases. The results showed strong enrichment of immune-related signaling pathways, especially the IL2 signaling pathway and the cytokine-cytokine receptor interaction pathway. The Reactome and KEGG enrichment maps (Figures 3a and 3b, respectively) cluster around immunologic, signaling, and apoptotic processes. The genes contributing to KEGG pathway enrichment were SOCS1, UBE2L3, IL2, PFKFB3, DAD1, ABCC9, PTPRM, KCNK10, SEMA6D, and PKIA. Figure 3a indicates that some of the identified variants could affect immune regulation and cellular signaling events relevant to the pathophysiology of celiac disease. In Figure 3b, the presence of the JAK-STAT signaling pathway and ubiquitin-mediated proteolysis supports the proposed mechanism through which genetic variation influences cytokine signaling and protein degradation. Other pathways, such as fructose and mannose metabolism, ABC transporters, and the intestinal immune network, suggest possible interactions between immune activation and metabolism.
In addition, Table 6 details the components of the IL2 signaling cascade: the cytokine ligand (IL2), receptor subunits (IL2RA, IL2RB, IL2RG), signal transducers (JAK1, JAK3), transcription factors (STAT3, STAT5A/B), and a negative regulator (SOCS1). This pathway is visualized in Figure 4, which highlights the ligand-receptor interactions and downstream signal activation that control T-cell proliferation.
Figure 3. Enrichment of SNP Network Genes by Functional Pathway Analysis. (a) Reactome Pathway Enrichment Showing the Biological Pathways Most Strongly Associated with the Analyzed Gene Set. The Most Enriched Pathways are Neuronal Signaling, Potassium Channel Activity, Cardiac Conduction, and Immune Regulatory Signaling, Including IL-2 Signaling and the RUNX1/FOXP3 Pathway of Regulatory T-Cell Development. (b) KEGG Pathway Analysis Showing the Key Signaling and Metabolic Pathways Associated with the Genes of Interest.
Table 6. Genes Involved in IL2 Signaling Pathway Identified from Enrichment and KEGG Analysis
| Functional Role | Gene Symbol | Description |
|---|---|---|
| Cytokine (ligand) | IL2 | Activates proliferation and differentiation of T cells |
| Receptor subunits | IL2RA, IL2RB, IL2RG | Form the IL2 receptor complex |
| Signal transducers | JAK1, JAK3 | Kinases that activate STAT proteins |
| Transcription factors | STAT3, STAT5A, STAT5B | Induce expression of IL2-responsive genes |
| Negative regulator | SOCS1 | Feedback inhibitor of IL2–JAK–STAT signaling |
A protein-protein interaction (PPI) network was constructed with STRING v12.0 to investigate the molecular connectivity of the prioritized genes. The PPI network is shown in Figure 5; strong links connect SOCS1, IL2, UBE2L3, and DAD1, which sit in the central regulatory cluster. This cluster marks the intersection between immune signaling and apoptotic regulation. Peripheral genes, including PFKFB3, KCNK10, PTPRM, and PKIA, were not directly linked to the cluster but may play a secondary or supportive role in these pathways.
Figure 5. Protein-protein Interaction (PPI) Network of Prioritized Genes Obtained with the Aid of the STRING v12.0 Database
The network plots high-confidence interactions (score ≥ 0.7) between proteins encoded by genes carrying deleterious SNPs associated with celiac disease. Nodes represent proteins, and edges represent predicted functional associations informed by experimental and computational data. The central group of SOCS1, IL2, DAD1, and UBE2L3 indicates strong regulatory relationships linking immune and apoptotic processes, whereas the peripheral nodes (PFKFB3, KCNK10, ABCC9, PTPRM, SEMA6D, and PKIA) suggest a more specific involvement of the corresponding genes in regulating these processes. The central clustering of SOCS1, IL2, and UBE2L3 indicates coordination of cytokine signaling pathways and, therefore, their possible role as functional hubs in immune activation during celiac disease.
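The confidence cutoff described above can be illustrated with a minimal sketch. It assumes the network is available as a list of scored edges; the edge scores below are hypothetical placeholders chosen for illustration and are not values reported in this study, and `high_confidence_degrees` is an illustrative helper rather than part of STRING.

```python
from collections import defaultdict

# Hypothetical (protein_a, protein_b, combined_score) tuples. The scores are
# placeholders for illustration only and are NOT values from this study.
EDGES = [
    ("SOCS1", "IL2", 0.95),
    ("IL2", "UBE2L3", 0.72),
    ("SOCS1", "UBE2L3", 0.71),
    ("DAD1", "UBE2L3", 0.70),
    ("PFKFB3", "IL2", 0.41),
    ("KCNK10", "ABCC9", 0.55),
]


def high_confidence_degrees(edges, min_score=0.7):
    """Count, per protein, the edges whose score meets the cutoff."""
    degree = defaultdict(int)
    for protein_a, protein_b, score in edges:
        if score >= min_score:
            degree[protein_a] += 1
            degree[protein_b] += 1
    return dict(degree)


degrees = high_confidence_degrees(EDGES)
# Rank proteins by number of high-confidence partners (ties broken by name).
hubs = sorted(degrees, key=lambda gene: (-degrees[gene], gene))
print(hubs)
```

Proteins whose edges all fall below the cutoff never enter `degrees`, which is how peripheral nodes end up disconnected from the central cluster in a thresholded view of the network.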
3.8. Linkage Disequilibrium Analysis
The analysis of LD proxies in strong linkage disequilibrium (r² ≥ 0.8) revealed several proxy variants for each of the lead SNPs. For rs11712165 (CD80) and rs3184504 (SH2B3), a number of high-LD proxies were found within regulatory or intronic regions, indicating a conserved haplotype around the functional loci. In contrast, rs243323 (SOCS1) and rs2298428 (UBE2L3) had fewer high-LD partners, indicating more isolated genetic signals. No proxy showed a higher predicted regulatory score than its lead SNP, which supports the lead variants as the most plausible functional candidates. Table 7 presents the proxies and LD metrics in detail.
Table 7. Summary of LD Proxy Results for the Four Lead SNPs, Showing the Number of Identified Proxy Variants (r² ≥ 0.8) and A Brief Description of Observed Regulatory Evidence
| Query SNP | No. of Proxy Variants Identified | Functional Regulatory Evidence Observed |
|---|---|---|
| rs243323 | 2 | Promoter/enhancer peaks in blood/immune tissues |
| rs11712165 | 6 | Promoter and enhancer marks in blood and lymphoid tissues |
| rs2298428 | 1 | Located in active regulatory regions with protein binding peaks in immune cells |
| rs3184504 | 1 | Evidence of enhancer/promoter signals and altered transcription factor motifs |
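For readers unfamiliar with the r² ≥ 0.8 proxy criterion, the following minimal sketch shows how r² between two biallelic loci follows from haplotype and allele frequencies, using the standard definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. The function names and the frequencies in the usage note are hypothetical; this is not the tool used in the study to retrieve proxies.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


def is_high_ld_proxy(p_ab, p_a, p_b, threshold=0.8):
    """True when the pair meets the r^2 cutoff used to call a proxy."""
    return ld_r_squared(p_ab, p_a, p_b) >= threshold
```

As an illustration with made-up frequencies, two loci with allele frequencies of 0.5 each and a shared haplotype frequency of 0.45 give r² of about 0.64, so the second variant would not qualify as a proxy under the 0.8 cutoff.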
The GTEx v10 analysis showed that each lead variant affected the expression of specific genes and that these effects were tissue-specific.
Table 8 summarizes the significant NES values and tissues.
Table 8. GTEx v10 eQTL Analysis of Prioritized Celiac-disease-associated SNPs (rs243323, rs2298428, rs3184504, and rs11712165).
| SNP | Gene | GTEx Effect | Biological Consequence |
|---|---|---|---|
| rs243323 | SOCS1 | ↑ expression | negative feedback regulator; minor compensatory increase |
| rs2298428 | UBE2L3 | ↓ expression | reduced ubiquitination → prolonged kinase signaling |
| rs3184504 | SH2B3 | ↓ expression | less inhibition of cytokine receptor signaling |
| rs11712165 | CD80 | ↑ expression | stronger T-cell co-stimulation → increased IL-2 output |
3.9.1. Integrated Mechanistic Pathway Mapping. Mapping the expression effects of the four SNP-associated genes onto KEGG and Reactome pathways showed that they converge on the IL-2/JAK-STAT signaling axis. High CD80 expression favored amplified T-cell co-stimulation and IL-2 production. Reduced SH2B3 indicated weaker inhibitory regulation of cytokine receptor signaling. Reduced SOCS1 activity indicated weakened feedback inhibition of JAK kinases, whereas reduced UBE2L3 expression suggested a loss of ubiquitin-mediated control of pathway components. Taken together, these results support a single coherent mechanistic model of prolonged IL-2/JAK-STAT activation, as shown in Figure 6.
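The directional reasoning behind this kind of model can be sketched as a simple sign tally. The eQTL directions below are copied from Table 8, in which SOCS1 is listed as increased (described there as a minor compensatory increase) and therefore enters the tally as a dampening term. The pathway roles (+1 for a gene that promotes signaling, −1 for one that restrains it) are a simplifying assumption, and the unweighted sum is an illustration rather than the scoring method used in this study.

```python
# eQTL direction per gene as listed in Table 8: +1 = increased, -1 = decreased.
EQTL_DIRECTION = {"CD80": +1, "SH2B3": -1, "SOCS1": +1, "UBE2L3": -1}

# Assumed role of each gene in IL-2/JAK-STAT signaling (a simplification):
# +1 = promotes signaling, -1 = restrains signaling.
PATHWAY_ROLE = {"CD80": +1, "SH2B3": -1, "SOCS1": -1, "UBE2L3": -1}


def pathway_contributions(direction, role):
    """Sign of each gene's effect on pathway activity (direction * role)."""
    return {gene: direction[gene] * role[gene] for gene in direction}


contributions = pathway_contributions(EQTL_DIRECTION, PATHWAY_ROLE)
net_score = sum(contributions.values())
print(contributions, net_score)
```

Under these assumptions, three of the four variants push the pathway toward higher activity and one pushes against it, giving a net score of +2. An unweighted tally ignores effect sizes, so it only indicates the direction of the combined effect.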
Figure 6. Integrated Model Showing How SNP-associated Changes in CD80, SH2B3, SOCS1, and UBE2L3 Collectively Enhance IL-2/JAK-STAT Signaling, Leading to Sustained T-cell Activation and Intestinal Inflammation
3.10. Drug-Gene and Pathway-Level Therapeutic Associations
DGIdb screening identified several direct drug-gene interactions. The clinically approved CTLA4-Ig agents abatacept and belatacept were linked to CD80, while SH2B3 was linked to ruxolitinib, a JAK1/JAK2 inhibitor. SOCS1 and UBE2L3 had no direct drug interactions and therefore no direct pharmacological targets. Because all four genes converge on activation of the IL-2/JAK axis, pathway-level searches were also performed. These identified clinically approved JAK inhibitors with the potential to dampen the hyperactive JAK-STAT signaling inferred from the mechanistic model, namely tofacitinib, upadacitinib, filgotinib, baricitinib, and ruxolitinib. Table 9 summarizes these drug associations.
Table 9. SNP-based Repositioning Candidates of Gene Dysregulation and Pathway Response
| Node/Gene | SNP Effect | Candidate Drugs | Class | Rationale |
|---|---|---|---|---|
| CD80 ↑ | ↑ co-stimulation, ↑ IL-2 | Abatacept, Belatacept | CTLA4-Ig fusion | Block CD80/CD28, reduce IL-2 secretion |
| SH2B3 ↓ | loss of inhibitory adaptor | Ruxolitinib | JAK1/2 inhibitor | Compensates for increased JAK-STAT signaling |
| IL-2 / JAK1/3 | pathway overactivity (from all SNPs) | Tofacitinib, Upadacitinib, Baricitinib | JAK inhibitors | Directly dampen IL-2/JAK-STAT axis |
| UBE2L3 ↓, SOCS1 ↑ | altered ubiquitination / feedback but no direct drugs | – (pathway-level JAK inhibitors) | – | Inform choice of JAK inhibitors rather than gene-specific drugs |
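The two-tier logic of Table 9 (gene-level drugs where available, pathway-level JAK inhibitors otherwise) can be expressed as a small lookup. This is a minimal sketch that hard-codes the entries of Table 9; it does not query DGIdb, and `candidate_drugs` is a hypothetical helper name.

```python
# Gene-level drug associations as listed in Table 9.
GENE_DRUGS = {
    "CD80": ["Abatacept", "Belatacept"],
    "SH2B3": ["Ruxolitinib"],
    "SOCS1": [],
    "UBE2L3": [],
}

# Pathway-level JAK inhibitors listed in Table 9 for the IL-2 / JAK1/3 node.
PATHWAY_DRUGS = ["Tofacitinib", "Upadacitinib", "Baricitinib"]


def candidate_drugs(gene):
    """Return (level, drugs): gene-level if any exist, else pathway-level."""
    if gene not in GENE_DRUGS:
        raise KeyError(f"{gene} is not one of the four prioritized genes")
    direct = GENE_DRUGS[gene]
    if direct:
        return ("gene", direct)
    return ("pathway", PATHWAY_DRUGS)


for gene in GENE_DRUGS:
    print(gene, candidate_drugs(gene))
```

Keeping the fallback explicit makes it clear which suggestions rest on a direct drug-gene interaction and which rest only on the inferred pathway convergence.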
Celiac disease (CeD) is a complex disease that arises from the interaction of genetic risk variants, defective immune signaling, and gluten-mediated T-cell activation. Although HLA-DQ2/DQ8 molecules are considered the strongest genetic determinants, there is increasing evidence that non-HLA loci play a significant role in disease susceptibility, cytokine responsiveness, and chronic inflammation [28]. The current study offers a mechanistic model linking four SNP-related genes, namely CD80, SH2B3, SOCS1, and UBE2L3, to amplified IL-2/JAK-STAT signaling, which produces a pro-inflammatory T-cell state.
The study found that rs11712165 is associated with markedly increased expression of CD80. This finding agrees with other publications reporting that increased CD80/CD28 co-stimulation augments IL-2 release and lowers the activation threshold of gluten-reactive T cells [29]. The increased supply of IL-2 creates a more permissive cytokine environment in which downstream JAK1/JAK3 stimulation is maintained and phosphorylation of STAT3/STAT5, the key transcription factors of T-cell proliferation and effector differentiation, is enhanced [30].
The SH2B3 missense variant (rs3184504) contributes further to this hyper-responsive state. SH2B3 (also known as LNK) normally inhibits cytokine receptor signaling by suppressing JAK activity, and loss-of-function mutations are closely linked to autoimmune diseases, including type 1 diabetes and rheumatoid arthritis [31]. Lowered expression of SH2B3, as indicated by the eQTL data, increases the efficiency of IL-2 receptor signaling, which compounds the impact of increased CD80-mediated IL-2 secretion.
SOCS1 is one of the main negative regulators of JAK kinases, and variant rs243323 exhibited regulatory features indicative of defective transcriptional output. Loss of SOCS1-mediated feedback has been linked to sustained STAT activation in a number of autoimmune diseases [32]. Similar effects may play a role in CeD by preventing the timely termination of IL-2-induced phosphorylation cascades.
Of special interest is the role of UBE2L3 (rs2298428) in CeD. UBE2L3 encodes an E2 ubiquitin-conjugating enzyme that participates in the ubiquitin-mediated degradation of signaling intermediates. Low UBE2L3 activity has been found to prolong NF-κB and cytokine signaling in lymphocytes [33]. In our data, the combination of destabilizing structural predictions and downregulated expression suggests a failure to terminate immune signaling, which allows JAK-STAT activity to be maintained.
These variants, taken as a unit, indicate a consistent pattern of mechanism:
This type of multilevel dysregulation is consistent with more recent systems immunology findings demonstrating that CeD is defined by sustained IL-2 and IFN-γ signaling loops, which maintain effector T-cell activation in the absence of antigens [36]. The combined model suggests that CeD pathogenesis is not driven by gluten alone. Rather, it is also strongly influenced by genetic variants that enhance immune signaling pathways.
Pathway enrichment consistently found cytokine-cytokine receptor interaction and IL-2/JAK-STAT signaling to be highly overrepresented, in line with other genomic studies that placed cytokine signaling at the center of CeD immunopathology [37]. Protein-protein interaction (PPI) network analysis also demonstrated close functional clustering of SOCS1, IL2, and UBE2L3. This is in line with previously reported networks connecting these proteins to lymphocyte activation and the regulation of apoptosis [38]. These findings support the idea that the analyzed variants do not act in isolation but participate in a molecular circuit that hyperactivates T-cells.
GTEx-derived tissue signatures showed expression effects in blood, lymphoid tissues, and exposed mucosal tissues, mirroring the sites of immune activation in CeD. IL-2/STAT signatures in peripheral blood have been shown to potentially reflect ongoing intestinal inflammation, which suggests that systemic expression results may be valuable for understanding mucosal disease pathophysiology [39].
Although clinical application of these drugs requires experimental validation, the drug-gene mapping identifies clinically relevant targets. CTLA4-Ig agents such as abatacept modulate the CD80/CD28 signaling system and have been shown to be beneficial in other autoimmune conditions [40, 41]. Since CD80 upregulation was found to be genetically reinforced in CeD, its inhibition could theoretically suppress IL-2 production.
The convergence on JAK1/JAK3 is even stronger. Various JAK inhibitors (e.g., tofacitinib, upadacitinib) have been shown to be effective in diseases characterized by exaggerated STAT signaling [42]. Their theoretical applicability to CeD has been suggested but not investigated. Our results provide a genetic rationale for exploring JAK inhibitors in the treatment of refractory or severe forms of CeD.
4.1. Limitations and Future Directions
This review has several limitations that must be taken into consideration. Most non-HLA variants were reported in single studies, which limited the number of SNPs that could be included in meta-analysis and weakened the evidence for many associations. The two variants that could be pooled, rs71810546 and rs990171, showed high heterogeneity: the underlying studies differed in design, sample size, ancestry, and diagnostic criteria, which lowers confidence in the pooled effect estimates. Reporting quality was also limited, with missing genotype information and statistical measures that constrained the analysis and could have introduced reporting bias. In addition, this study examined single-SNP associations and did not examine polygenic interactions, epistasis, or environmental modifiers, which might be important determinants of celiac disease susceptibility.
Future studies need to focus on large, harmonized, multi-ethnic cohorts and standardized reporting practices in order to minimize heterogeneity and enhance reproducibility. Functional genomics, transcriptomic profiling, immune-cell-specific studies, and fine-mapping will need to be combined to elucidate the biological processes affected by these variants. Further genome-wide studies, such as polygenic risk modeling and gene-environment interaction analysis, are required to advance knowledge of non-HLA contributions to celiac disease and to translate genetic findings into clinical practice.
4.2. Conclusion
This genetic analysis showed that four major celiac disease-related variants, namely rs11712165 (CD80), rs3184504 (SH2B3), rs243323 (SOCS1), and rs2298428 (UBE2L3), converge mechanistically to increase IL-2/JAK-STAT signaling, a central axis of T-cell activation and inflammation. Across functional annotation, structural stability prediction, eQTL expression profiling, pathway enrichment, and protein-interaction modeling, each variant made its own directional contribution to a common pathogenic outcome. High levels of CD80 augment IL-2 production, whereas low levels of SH2B3 and SOCS1 undermine vital inhibitory checkpoints. At the same time, reduced UBE2L3 activity impairs ubiquitin-mediated degradation of signaling intermediates, which enables sustained kinase activity. Together, these dysregulated processes create a molecular environment conducive to sustained STAT3/STAT5 activation and chronic intestinal inflammation.
By connecting SNP-level changes to pathway-level dysfunction, the study builds a coherent mechanistic framework that contributes to research on non-HLA genetic predisposition to celiac disease. Notably, the convergence of several variants on a therapeutically targetable signaling cascade points to translational opportunities. Celiac disease currently has no approved immunomodulatory agents, but the observed pathway signatures suggest that existing therapies, including CD80 modulators and JAK inhibitors, may be applicable in severe or refractory disease. Future experimental validation and clinical trials should evaluate such targeted interventions to determine their practicality.
Altogether, these results highlight that celiac disease is not merely caused by gluten exposure but is significantly influenced by a genetic architecture that predisposes people to hyper-activation of the immune system. The mechanistic model provides a basis for further functional research and possibly the development of specific therapeutic approaches to restore immune homeostasis in individuals affected by this condition.
Saba Farooq: conceptualization, writing – original draft, formal analysis, software. Saima Bibi: software, formal analysis. Syeda Marriam Bakhtiar: supervision, writing – review & editing
The authors of the manuscript have no financial or non-financial conflict of interest in the subject matter or materials discussed in this manuscript.
Data supporting the findings of this study will be made available by the corresponding author upon request.
No funding has been received for this research.
The authors did not use any type of generative artificial intelligence software for this research.