Artificial vs Real

On the use of syntactic distance metrics in evaluating fault mimicking techniques

View project on GitHub

Journal First - TSE 2022

Supplementary data

Empirical evaluation

RQ1 - Results per tool

How semantically and syntactically similar are seeded and real faults?

BLEU Coefficient - Ochiai score

Class granularity level (PiTest - CodeBERT - DeepMutation - IBIR)

Cosine Coefficient - Ochiai score

Class granularity level (PiTest - CodeBERT - DeepMutation - IBIR)

Function granularity level (PiTest - CodeBERT - DeepMutation - IBIR)

Jaccard Coefficient - Ochiai score

Class granularity level (PiTest - CodeBERT - DeepMutation - IBIR)

Function granularity level (PiTest - CodeBERT - DeepMutation - IBIR)
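For readers unfamiliar with the measures named in the result listings above, the following is a minimal sketch of how a Jaccard coefficient, a cosine coefficient, and an Ochiai score could be computed. It is an illustration under stated assumptions, not the evaluation code used in the study: it assumes that syntactic similarity is computed over whitespace-separated code tokens and that the Ochiai score compares the sets of failing tests of a seeded fault and a real fault. The function names and sample inputs are hypothetical.

```python
from collections import Counter
from math import sqrt


def jaccard(tokens_a, tokens_b):
    """Jaccard coefficient over the sets of distinct tokens."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 1.0  # assumption: two empty fragments count as identical
    return len(a & b) / len(a | b)


def cosine(tokens_a, tokens_b):
    """Cosine coefficient over token-frequency vectors."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def ochiai(failing_a, failing_b):
    """Ochiai coefficient between two sets of failing tests."""
    a, b = set(failing_a), set(failing_b)
    if not a or not b:
        return 0.0  # assumption: a fault that fails no test shares no behaviour
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical seeded fault and real fault, tokenised on whitespace.
seeded = "if ( x >= 0 ) return x ;".split()
real = "if ( x > 0 ) return x ;".split()

syntactic_jaccard = jaccard(seeded, real)
syntactic_cosine = cosine(seeded, real)

# Hypothetical failing-test sets for the same two faults.
semantic_ochiai = ochiai({"t1", "t2", "t3"}, {"t2", "t3", "t4", "t5"})
```

In this toy case the two fragments differ by a single token, so both syntactic coefficients are high, while the Ochiai score depends only on how much the failing-test sets overlap. A real pipeline would likely use a language-aware tokenizer rather than whitespace splitting.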

Discussion

Sensitivity of mutants from the same location. Small syntactic changes lead to diverse semantic changes.

Changed lines location (PiTest - CodeBERT - IBIR)

Random lines location (PiTest - CodeBERT - IBIR)

Using Fault Detection Probability instead of Ochiai score for measurement

Support or Contact

See our Git repository for all descriptive statistics.