Restoring Consistency in Ontological Multidimensional Data Models via Weighted Repairs
Conferences
abstract
High data quality is a prerequisite for accurate data analysis. However, data inconsistencies
often arise in real data, leading to untrusted decision making downstream in the data
analysis pipeline. In this research, we study the problem of detecting and repairing
inconsistencies in the Ontology Multi-dimensional Data Model (OMD). We propose a framework
for data quality assessment and repair for the OMD. We formally define a weight-based
repair-by-deletion semantics, and present an automatic weight generation mechanism
that considers multiple input criteria. Our methods are rooted in multi-criteria decision
making, which accounts for the correlation, contrast, and conflict that may exist among
multiple criteria and is often needed in the data cleaning domain. After weight generation,
we present a dynamic programming based Min-Sum algorithm to identify a minimal-weight
solution. We then apply evolutionary optimization techniques and demonstrate
improved performance using medical datasets, making the approach realizable in practice.
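
To make the idea of a weight-based repair by deletion concrete, the following is a minimal sketch under stated assumptions, not the Min-Sum or evolutionary algorithm from the paper. It assumes each tuple already has a deletion weight and that inconsistencies are given as sets of tuples that cannot all be kept together; the function name `min_weight_repair` and both input structures are hypothetical. It exhaustively searches for the cheapest set of tuples whose deletion resolves every conflict, which is only feasible for very small inputs.

```python
from itertools import combinations


def min_weight_repair(weights, conflicts):
    """Return (deleted_ids, total_weight) for a cheapest deletion set.

    weights:   dict mapping tuple id -> deletion cost (hypothetical input)
    conflicts: list of sets of tuple ids that jointly violate a constraint
    """
    if not conflicts:
        # Nothing is inconsistent, so nothing needs to be deleted.
        return frozenset(), 0

    # Only tuples that take part in some conflict are worth deleting.
    candidates = sorted(set().union(*conflicts))
    best_ids, best_cost = None, float("inf")

    # Brute force: try every subset of the candidate tuples.
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            chosen = set(subset)
            # A repair must remove at least one tuple from every conflict.
            if all(chosen & conflict for conflict in conflicts):
                cost = sum(weights[t] for t in chosen)
                if cost < best_cost:
                    best_ids, best_cost = frozenset(chosen), cost
    return best_ids, best_cost


if __name__ == "__main__":
    # Illustrative weights and conflicts, invented for this sketch.
    example_weights = {"t1": 9, "t2": 4, "t3": 7, "t4": 2}
    example_conflicts = [{"t1", "t2"}, {"t2", "t3"}, {"t3", "t4"}]
    print(min_weight_repair(example_weights, example_conflicts))
```

The exhaustive search is exponential in the number of conflicting tuples, which is one way to see why a dynamic programming or evolutionary approach, as described in the abstract, would be needed at realistic scale.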