Ramp metering (RM) is one of the most effective dynamic traffic control measures for mitigating growing congestion in urban freeway networks. Among the many RM methods available, those based on optimal control theory have shown strong potential for improving freeway performance. However, these algorithms require an accurate traffic model, which limits their applicability in practice. Reinforcement learning (RL) provides the tools to achieve optimal RM control without relying on any traffic model. This paper presents a guideline for designing RL-based RM control systems by testing different state representations, learning methods, action-selection strategies, and reward definitions. A microscopic simulation test bed based on a portion of Highway 401 in Toronto, Canada, is developed to evaluate each of these design parameters and to quantify the performance of various RM control strategies. A comparison of the reinforcement learning ramp-metering (RLRM) algorithm with a modified version of ALINEA demonstrates the potential of RLRM to improve freeway traffic conditions. When applied to the developed case study, the proposed RLRM algorithm and the modified ALINEA reduce total travel time by 40% and 20%, respectively, compared with the no-RM case.
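The design parameters enumerated above (state representation, learning method, action selection, and reward definition) can be illustrated with a minimal tabular Q-learning ramp-metering controller. This is a sketch only, not the paper's calibrated RLRM algorithm: the density bins, candidate metering rates, toy plant dynamics, and travel-time-proxy reward are all illustrative assumptions.

```python
import random
from collections import defaultdict

# Assumed candidate metering rates in veh/h (illustrative, not the paper's values).
ACTIONS = [240, 480, 720, 960]

class QLearningRampMeter:
    """Sketch of a tabular Q-learning ramp-metering controller."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.q = defaultdict(float)   # Q-table keyed by (state, action)
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration probability
        self.rng = random.Random(seed)

    @staticmethod
    def discretize(density):
        """State representation: bin mainline density (veh/km/lane) coarsely."""
        return min(int(density // 10), 6)

    def select_action(self, state):
        """Action selection: epsilon-greedy over the candidate metering rates."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Learning method: one-step Q-learning backup."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

# Toy interaction loop with a crude plant: higher metering rates raise density.
meter = QLearningRampMeter()
density = 30.0
for step in range(2000):
    s = meter.select_action  # noqa: placeholder to keep names visible
    s = meter.discretize(density)
    a = meter.select_action(s)
    density = max(0.0, 0.9 * density + a / 100.0 + meter.rng.uniform(-2.0, 2.0))
    # Reward definition: a travel-time proxy that penalises deviation from an
    # assumed critical density of 25 veh/km/lane.
    reward = -abs(density - 25.0)
    meter.update(s, a, reward, meter.discretize(density))
```

In this sketch the steady-state density under a fixed rate `a` is roughly `a / 10`, so the controller should learn to prefer the lowest rate, which holds density nearest the assumed critical value; in the paper's setting the reward would instead come from measured travel times in the microscopic simulation.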