Modern critical cyber-physical systems, such as autonomous vehicles, drones, and real-time medical monitoring, demand not only intensive data processing but also strict adherence to real-time performance constraints. These applications often involve continuous or sequential data streams (e.g., images, videos, and sensor readings), which require frequent memory accesses. Despite advances in processing power, Dynamic Random Access Memory (DRAM) accesses incur large and highly variable interference delays. Achieving a tight bound on memory latency therefore remains a significant challenge, yet it is essential for ensuring safe and predictable execution of these critical tasks. To address this bottleneck, we propose InterStellarRT, a novel hardware/software harmony methodology that provides data-aware optimizations across the entire memory hierarchy. Leveraging a software layer that communicates data access patterns to the memory controller, InterStellarRT significantly reduces memory access times and keeps them tightly bounded and predictable. We theoretically analyze the memory latency bound and prove that InterStellarRT provides remarkably tighter bounds on both in-isolation and interference latencies than state-of-the-art real-time systems based on Commercial-Off-The-Shelf (COTS) Double Data Rate 4 (DDR4) memory devices; the approach is also applicable to DDR5. We evaluate InterStellarRT on a RISC-V-based quad-core system modeled in GEM5, with DDR4 modeled in Ramulator. Across benchmarks from the Polybench, LAPACK, Phoenix, and HPCG suites, InterStellarRT achieves, on average, a 3.8× tighter bound for in-isolation memory latency and a 13.5× tighter bound for interference latency under affine workloads; for mixed-affinity workloads, the bounds are 2.15× and 4× tighter, respectively. Moreover, InterStellarRT achieves an average 1.72× end-to-end speedup, a 1.9× bandwidth improvement, and a 14% DRAM energy reduction over the baseline.