Technological advances now make it possible to generate diverse, complex and varying sizes of data in a wide range of applications from business to engineering to medicine. In the health sciences, in particular, data are being produced at an unprecedented rate across the full spectrum of scientific inquiry spanning basic biology, clinical medicine, public health and health care systems. Leveraging these data can accelerate scientific advances, health discovery and innovations. However, data are just the raw material required to generate new knowledge, not knowledge on its own, as a pile of bricks would not be mistaken for a building. In order to solve complex scientific problems, appropriate methods, tools and technologies must be integrated with domain knowledge expertise to generate and analyze big data. This integrated interdisciplinary approach is what has become to be widely known as data science. Although the discipline of data science has been rapidly evolving over the past couple of decades in resource-rich countries, the situation is bleak in resource-limited settings such as most countries in Africa primarily due to lack of well-trained data scientists. In this paper, we highlight a roadmap for building capacity in health data science in Africa to help spur health discovery and innovation, and propose a sustainable potential solution consisting of three key activities: a graduate-level training, faculty development, and stakeholder engagement. We also outline potential challenges and mitigating strategies.