摘要:Many modern data analysis algorithms either assume or are considerably more efficient if the distances between the data points satisfy a metric. However, as real data sets are noisy, they often do not possess this fundamental property. For this reason, Gilbert and Jain [A. Gilbert and L. Jain, 2017] and Fan et al. [C. Fan et al., 2018] introduced the closely related sparse metric repair and metric violation distance problems. Given a matrix, representing all distances, the goal is to repair as few entries as possible to ensure they satisfy a metric. This problem was shown to be APX-hard, and an O(OPT^{1/3})-approximation was given, where OPT is the optimal solution size. In this paper, we generalize the problem, by describing distances by a possibly incomplete positively weighted graph, where again our goal is to find the smallest number of weight modifications so that they satisfy a metric. This natural generalization is more flexible as it takes into account different relationships among the data points. We demonstrate the inherent combinatorial structure of the problem, and give an approximation-preserving reduction from MULTICUT, which is hard to approximate within any constant factor assuming UGC. Conversely, we show that for any fixed constant Ï,, for the large class of Ï,-chordal graphs, the problem is fixed parameter tractable, answering an open question from previous work. Call a cycle broken if it contains an edge whose weight is larger than the sum of all its other edges, and call the amount of this difference its deficit. We present approximation algorithms, one depending on the maximum number of edges in a broken cycle, and one depending on the number of distinct deficit values, both quantities which may naturally be small. Finally, we give improved analysis of previous algorithms for complete graphs.