This feature has been tested and it works properly.
The small amount of data that you guys collected can't be used as a statistical data.
GKO
Really? Let's find out!
I hypothesize that the observed good form results from medium training are statistically different from those reported.
Test 1: z-test using normal curve
A z-proportions test was used to test whether significantly fewer "good" form results from medium training intensity were observed than are reported (3%). Of the 247 observations, 0 (0%) were good and 247 (100%) were not. This difference is statistically significant (z = -2.7639) at α = 0.05, indicating that fewer good form results were observed than would be expected by chance. Thus, if the true probability of getting a good form result from medium training really were the reported 3%, a result this extreme would turn up less than 0.29% of the time (one-sided p < 0.0029, i.e. 1 - 0.9971).
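For anyone who would rather check the z-test with a few lines of code than with a calculator and a table, here is a minimal sketch. It assumes the standard one-proportion z-test with the reported rate (3%) as the null value; the function name one_proportion_z is my own, and only the Python standard library is used.

```python
import math

def one_proportion_z(successes, n, p0):
    """One-proportion z-test against a hypothesized rate p0.

    Returns the z statistic and the lower-tail (one-sided) p-value.
    """
    p_hat = successes / n                      # observed proportion
    se = math.sqrt(p0 * (1 - p0) / n)          # standard error under the null
    z = (p_hat - p0) / se
    # Standard normal CDF at z, written with the complementary error function
    p_lower = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_lower

# Figures from this post: 0 good results in 247 observations, reported rate 3%
z, p_lower = one_proportion_z(successes=0, n=247, p0=0.03)
print(f"z = {z:.4f}, one-sided p = {p_lower:.5f}")
```

Swapping in your own counts for successes and n lets you rerun the same check on any other training intensity.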
Test 2: chi-square test
A chi-square test was also used to test whether significantly fewer "good" form results from medium training intensity were observed than are reported (3%). Again the difference is statistically significant (χ² = 7.6392, df = 1), yielding the precise p-value of 0.006092.
n = 247 for both tests.
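The chi-square statistic can be reproduced the same way. This is a rough sketch assuming a plain goodness-of-fit test with two categories (good, not good) and no continuity correction; the function name chi_square_gof is hypothetical. The tail probability uses the closed form that only holds for one degree of freedom, so it may not match the output of the online programs to every digit.

```python
import math

def chi_square_gof(observed, expected_probs):
    """Chi-square goodness-of-fit statistic for observed counts.

    expected_probs are the hypothesized category probabilities (summing to 1).
    """
    n = sum(observed)
    expected = [n * p for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Figures from this post: 0 good, 247 not good; reported 3% good, 97% not good
chi2 = chi_square_gof(observed=[0, 247], expected_probs=[0.03, 0.97])

# Upper-tail probability, valid for df = 1 only: P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.4f}, df = 1, p = {p_value:.6f}")
```

With two categories the statistic is just the square of the z statistic from Test 1, which is a quick sanity check that both tests were fed the same numbers.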
Thus we should reject the null hypothesis (there is no difference between observed and reported probabilities) in favor of the alternative (stated above). These results provide very strong evidence that there is at least one bug in this feature.
The z-test is easily done using a calculator and a normal probability table. The chi-square test was done using two separate programs found online here and here, so anyone can run the test for themselves.
However, as Gabriellis pointed out, the fact that there have not been any reported good form results from medium training really speaks for itself!
If someone else can record and post 50-100 results from hard training, we can test whether there is also a bug in good form results from hard training. Because I have made relatively few observations there, the resulting p-value is a little too high to draw conclusions with real confidence (p < 0.08 based on the data in my original post), but with some more data, much stronger evidence could be reported. I just don't have the time tbf.