Journal article

Low complexity (A/C)GG Repeats and m1A Methylation Sites in 5' UTRs Regulate Gene Expression

Abstract

Repetitive and compositionally biased low-complexity (LC) motifs appear in biological sequences, where they interact with the machinery controlling the abundance of their host molecules. They can have significant impacts on physiological function and act as raw material for the evolution of regulatory motifs. The extent to which LC motifs affect abundance is not known. Even definitions of LC sequences are not well established, let alone which motifs exist in LC sequences and which of those are abundance-associated. To fill these knowledge gaps for post-transcriptional impacts of LC motifs, we integrated data from the GTEx project, PaxDb, and the IGSR. We establish definitions for LC motifs in both RNA and protein sequences. We observed that the presence of LC motifs in the 5' UTR was positively associated with transcript abundance. We present a method to identify abundance-associated motifs de novo and identified trinucleotide repeats of (A/C)GG as the most strongly abundance-associated. We observed that m1A methylation sites were strongly associated with both LC motifs and abundance, an effect that was amplified as methylation signatures from unspecialized RNA-seq increased. Together, our results demonstrate that LC motifs play important roles in regulating gene expression.

Authors

Dickson ZW; Bilodeau M; Ruiz DL; Golding GB

Journal

Genome, Vol. 0, No. ja

Publisher

Canadian Science Publishing

Publication Date

January 9, 2026

DOI

10.1139/gen-2025-0071

ISSN

0831-2796
