When fitting a model to a dataset, one often uses the Chi-Square metric to assess the goodness of fit and to decide how plausible it is that the model generated the dataset (or rather, the noise-corrupted measurements of some quantity). The key assumption here is that the measurement errors follow a normal (Gaussian) distribution. The null hypothesis (H0) for such a case is defined as follows: "the given model [and measurement errors] explains the dataset". That is, the residuals of the fit are consistent with the measurement errors, and the minimum Chi-Square value, chisq_min, was drawn from a Chi-Square distribution with N - M degrees of freedom, where N = number of data points being fitted and M = number of parameters in the fit.

If the probability of obtaining a Chi-Square value > chisq_min under H0 is tiny (i.e., < some critical value Pcrit), H0 is rejected: we conclude it is unlikely the model could have generated the dataset, and the model is rejected. Alternatively, a rejection could also mean the measurement uncertainties were simply underestimated, which would inflate the achieved chisq_min value.

A frequentist would interpret the test as follows: accept the model as "correct", i.e., as the one having generated the data (= H0 above), then ask: if the dataset were repeatedly simulated from the model with errors drawn from a normal distribution, how often would the resulting fits yield a Chi-Square exceeding the observed chisq_min?
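As a concrete illustration, below is a minimal Python sketch (assuming numpy and scipy are available) that fits a straight line to simulated data and computes the probability of exceeding chisq_min under H0 with N - M degrees of freedom. The simulated data, the straight-line model, and the choice Pcrit = 0.05 are hypothetical, chosen only to demonstrate the procedure.

    import numpy as np
    from scipy import stats

    # Simulate noise-corrupted measurements of a (hypothetical) straight line.
    rng = np.random.default_rng(42)
    N = 50                                    # number of data points
    x = np.linspace(0.0, 10.0, N)
    sigma = 0.5 * np.ones(N)                  # assumed Gaussian measurement errors
    y = 2.0 * x + 1.0 + rng.normal(0.0, sigma)

    # Weighted least-squares fit of a straight line (M = 2 parameters).
    M = 2
    coeffs = np.polyfit(x, y, deg=1, w=1.0 / sigma)
    y_model = np.polyval(coeffs, x)

    # Minimum Chi-Square; under H0 it is drawn from a Chi-Square
    # distribution with N - M degrees of freedom.
    chisq_min = np.sum(((y - y_model) / sigma) ** 2)
    dof = N - M
    p_value = stats.chi2.sf(chisq_min, dof)   # P(Chi-Square > chisq_min | H0)

    Pcrit = 0.05                              # illustrative critical value
    print(f"chisq_min = {chisq_min:.2f}, dof = {dof}, p = {p_value:.3f}")
    if p_value < Pcrit:
        print("Reject H0: model and/or quoted errors unlikely to explain the data.")
    else:
        print("No grounds to reject H0 at this Pcrit.")

Here scipy.stats.chi2.sf is the survival function, i.e., the probability of exceeding chisq_min under H0; comparing it to Pcrit implements the rejection criterion described above.

-- F. Masci, 08/02/2025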