To compare the differences between the two samples independently of their mean values, it is better to consider the ratio between the pairs of measurements. [4] Log transformation (base 2) of the measurements before the analysis makes it possible to use the standard approach, so the plot is given by (log2 A + log2 B)/2 versus log2 A - log2 B, where A and B denote the paired measurements.

Importantly, there are no uniform criteria for what constitutes acceptable limits of agreement. This is a subjective decision that must be made from a clinical point of view, depends on the variables being measured, and must be defined in advance.

Bland and Altman point out that any two methods designed to measure the same parameter (or property) should correlate well when the group of samples is chosen so that the property varies considerably. A high correlation between two such methods could therefore simply be a sign that a widely spread sample has been chosen. A high correlation does not necessarily mean that there is good agreement between the two methods.

When examining binary results, the diagonal elements of a contingency table give the frequency of agreement. Cohen's kappa is a measure of agreement considered more robust than simple percentage agreement because it takes into account the possibility of agreement occurring by chance. It is given by kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the proportion of agreement expected by chance.

A common question in clinical research is whether a new method of measurement is equivalent to an established one. As a statistical advisor at PHASTAR, I see an increase in studies comparing a new artificial intelligence or machine learning diagnostic tool with an existing tool or with a clinician.
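For the binary case, Cohen's kappa can be computed directly from the four cells of a 2x2 contingency table. The sketch below is a minimal illustration, not a reference implementation: the function name `cohens_kappa`, the cell labels `a` to `d`, and the counts in the example are all assumptions made up for demonstration.

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 contingency table (hypothetical helper).

    a: both methods positive
    b: method 1 positive, method 2 negative
    c: method 1 negative, method 2 positive
    d: both methods negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("contingency table is empty")

    # Observed agreement: the diagonal cells of the table.
    p_o = (a + d) / n

    # Chance agreement: product of the marginal proportions,
    # summed over the "both positive" and "both negative" outcomes.
    p_pos = ((a + b) / n) * ((a + c) / n)
    p_neg = ((c + d) / n) * ((b + d) / n)
    p_e = p_pos + p_neg

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Illustrative counts only (not data from any study).
kappa = cohens_kappa(40, 10, 5, 45)
print(f"kappa = {kappa:.2f}")
```

With these made-up counts the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is noticeably lower than the raw percentage agreement.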

The methodology for analyzing binary data is well established, but the methodology for continuous results is less developed. Here we review the current methodology and show some of the common pitfalls. It should be noted that agreement analysis does not establish the accuracy of the measurement methods; it only shows the extent to which different measurement techniques agree. To properly assess a new measurement method, it is also necessary to consider quantities relating to the validity of the measurements, such as sensitivity, specificity, and positive and negative predictive values. Perfect agreement occurs when Cohen's kappa equals 1, and a value of zero indicates that the agreement is no better than would be expected by chance.

The Bland-Altman plot shows the differences between the pairs of measurements against the mean of each pair and gives an overview of the magnitude of the agreement. We summarize the lack of agreement by calculating the bias, estimated by the mean difference, d, and the standard deviation of the differences, s. If the differences are normally distributed, we would expect most of them to lie between d - 2s and d + 2s. These lower and upper limits are called the "limits of agreement" and allow agreement to be assessed. If differences within these limits are not clinically important, the methods can be used interchangeably. An example of a Bland-Altman plot for hemoglobin is shown in Figure 3.

Bland-Altman plots are also used to investigate a possible relationship between the differences between the measurements and the true value (i.e., proportional bias). The existence of proportional bias indicates that the methods do not agree equally across the range of measurements (i.e., the limits of agreement depend on the actual measurement).
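The bias, the limits of agreement, and a rough check for proportional bias can be sketched in a few lines. This is an illustration under stated assumptions: the helper names `bland_altman` and `proportional_bias_slope` are hypothetical, the multiplier `k=2.0` follows the d - 2s to d + 2s description above (1.96 is also commonly used), and the measurements are invented. Using the least-squares slope of the differences on the pair means as a proportional-bias indicator is one common choice, not necessarily the approach used for Figure 3.

```python
from statistics import mean, stdev


def bland_altman(x, y, k=2.0):
    """Bias and limits of agreement for paired measurements (hypothetical helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [xi - yi for xi, yi in zip(x, y)]        # y-axis of the plot
    means = [(xi + yi) / 2 for xi, yi in zip(x, y)]  # x-axis of the plot
    d = mean(diffs)   # bias: mean difference
    s = stdev(diffs)  # sample standard deviation of the differences
    return {"bias": d, "sd": s, "lower": d - k * s, "upper": d + k * s,
            "means": means, "diffs": diffs}


def proportional_bias_slope(means, diffs):
    """Least-squares slope of the differences regressed on the pair means."""
    mx, my = mean(means), mean(diffs)
    sxx = sum((m - mx) ** 2 for m in means)
    if sxx == 0:
        raise ValueError("pair means are all identical; slope is undefined")
    sxy = sum((m - mx) * (dd - my) for m, dd in zip(means, diffs))
    return sxy / sxx


# Invented measurements from two hypothetical methods.
method_a = [10.0, 12.0, 14.0, 16.0, 18.0]
method_b = [9.0, 12.0, 13.0, 17.0, 17.0]

result = bland_altman(method_a, method_b)
slope = proportional_bias_slope(result["means"], result["diffs"])
print(f"bias = {result['bias']:.2f}, "
      f"limits = ({result['lower']:.2f}, {result['upper']:.2f}), "
      f"slope = {slope:.3f}")
```

A slope far from zero would suggest that the differences change with the magnitude of the measurement. For the ratio-based variant, the same function could be applied to `math.log2` of each measurement, assuming all values are positive.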