Provide several metrics to assess the quality of the predictions of a model (see note) against observations.

n_obs(obs)

mean_obs(obs, na.rm = TRUE)

mean_sim(sim, na.rm = TRUE)

sd_obs(obs, na.rm = TRUE)

sd_sim(sim, na.rm = TRUE)

CV_obs(obs, na.rm = TRUE)

CV_sim(sim, na.rm = TRUE)

r_means(sim, obs, na.rm = TRUE)

R2(sim, obs, na.action = stats::na.omit)

SS_res(sim, obs, na.rm = TRUE)

Inter(sim, obs, na.action = stats::na.omit)

Slope(sim, obs, na.action = stats::na.omit)

RMSE(sim, obs, na.rm = TRUE)

RMSEs(sim, obs, na.rm = TRUE)

RMSEu(sim, obs, na.rm = TRUE)

nRMSE(sim, obs, na.rm = TRUE)

rRMSE(sim, obs, na.rm = TRUE)

rRMSEs(sim, obs, na.rm = TRUE)

rRMSEu(sim, obs, na.rm = TRUE)

pMSEs(sim, obs, na.rm = TRUE)

pMSEu(sim, obs, na.rm = TRUE)

Bias2(sim, obs, na.rm = TRUE)

SDSD(sim, obs, na.rm = TRUE)

LCS(sim, obs, na.rm = TRUE)

rbias2(sim, obs, na.rm = TRUE)

rSDSD(sim, obs, na.rm = TRUE)

rLCS(sim, obs, na.rm = TRUE)

MAE(sim, obs, na.rm = TRUE)

ABS(sim, obs, na.rm = TRUE)

MSE(sim, obs, na.rm = TRUE)

EF(sim, obs, na.rm = TRUE)

NSE(sim, obs, na.rm = TRUE)

Bias(sim, obs, na.rm = TRUE)

MAPE(sim, obs, na.rm = TRUE)

FVU(sim, obs, na.rm = TRUE)

RME(sim, obs, na.rm = TRUE)

tSTUD(sim, obs, na.rm = TRUE)

tLimit(sim, obs, risk = 0.05, na.rm = TRUE)

Decision(sim, obs, risk = 0.05, na.rm = TRUE)

Arguments

obs

Observed values

na.rm

Boolean. Remove NA values if TRUE (default)

sim

Simulated values

na.action

A function which indicates what should happen when the data contain NAs.

risk

Risk level (alpha) of the statistical test (default 0.05)

Value

A statistic depending on the function used.

Details

The statistics used to assess model quality can differ between sources. Here is a short description of each statistic and its equation (see the HTML version for the LaTeX rendering). A short base R sketch reproducing some of these computations by hand is given after this list:

  • n_obs(): Number of observations.

  • mean_obs(): Mean of observed values

  • mean_sim(): Mean of simulated values

  • sd_obs(): Standard deviation of observed values

  • sd_sim(): Standard deviation of simulated values

  • CV_obs(): Coefficient of variation of observed values

  • CV_sim(): Coefficient of variation of simulated values

  • r_means(): Ratio between mean simulated values and mean observed values (%), computed as: $$r\_means = \frac{100*\frac{\sum_1^n(\hat{y_i})}{n}}{\frac{\sum_1^n(y_i)}{n}}$$

  • R2(): Coefficient of determination, computed using stats::lm() on obs~sim.

  • SS_res(): residual sum of squares (see notes).

  • Inter(): Intercept of regression line, computed using stats::lm() on sim~obs.

  • Slope(): Slope of regression line, computed using stats::lm() on sim~obs.

  • RMSE(): Root Mean Squared Error, computed as $$RMSE = \sqrt{\frac{\sum_1^n(\hat{y_i}-y_i)^2}{n}}$$ RMSE = sqrt(mean((sim-obs)^2))

  • RMSEs(): Systematic Root Mean Squared Error, computed as $$RMSEs = \sqrt{\frac{\sum_1^n(\tilde{y_i}-y_i)^2}{n}}$$ RMSEs = sqrt(mean((fitted.values(lm(formula=sim~obs))-obs)^2))

  • RMSEu(): Unsystematic Root Mean Squared Error, computed as $$RMSEu = \sqrt{\frac{\sum_1^n(\tilde{y_i}-\hat{y_i})^2}{n}}$$ RMSEu = sqrt(mean((fitted.values(lm(formula=sim~obs))-sim)^2))

  • NSE(): Nash-Sutcliffe Efficiency, alias of EF, provided for user convenience.

  • nRMSE(): Normalized Root Mean Squared Error, also denoted as CV(RMSE), and computed as: $$nRMSE = \frac{RMSE}{\bar{y}}\cdot100$$ nRMSE = (RMSE/mean(obs))*100

  • rRMSE(): Relative Root Mean Squared Error, computed as: $$rRMSE = \frac{RMSE}{\bar{y}}$$

  • rRMSEs(): Relative Systematic Root Mean Squared Error, computed as $$rRMSEs = \frac{RMSEs}{\bar{y}}$$

  • rRMSEu(): Relative Unsystematic Root Mean Squared Error, computed as $$rRMSEu = \frac{RMSEu}{\bar{y}}$$

  • pMSEs(): Proportion of Systematic Mean Squared Error (MSEs) in the Mean Squared Error, computed as: $$pMSEs = \frac{MSEs}{MSE}$$

  • pMSEu(): Proportion of Unsystematic Mean Squared Error (MSEu) in the Mean Squared Error, computed as: $$pMSEu = \frac{MSEu}{MSE}$$

  • Bias2(): Bias squared (1st term of Kobayashi and Salam (2000) MSE decomposition): $$Bias2 = Bias^2$$

  • SDSD(): Difference between sd_obs and sd_sim squared (2nd term of Kobayashi and Salam (2000) MSE decomposition), computed as: $$SDSD = (sd\_obs-sd\_sim)^2$$

  • LCS(): Lack of correlation between observed and simulated values, weighted by their standard deviations (3rd term of Kobayashi and Salam (2000) MSE decomposition), computed as: $$LCS = 2*sd\_obs*sd\_sim*(1-r)$$

  • rbias2(): Relative bias squared, computed as: $$rbias2 = \frac{Bias^2}{\bar{y}^2}$$ rbias2 = Bias^2/mean(obs)^2

  • rSDSD(): Relative difference between sd_obs and sd_sim squared, computed as: $$rSDSD = \frac{SDSD}{\bar{y}^2}$$

  • rLCS(): Relative lack of correlation between observed and simulated values, computed as: $$rLCS = \frac{LCS}{\bar{y}^2}$$

  • MAE(): Mean Absolute Error, computed as: $$MAE = \frac{\sum_1^n(\left|\hat{y_i}-y_i\right|)}{n}$$ MAE = mean(abs(sim-obs))

  • ABS(): Mean Absolute Bias, which is an alias of MAE()

  • FVU(): Fraction of variance unexplained, computed as: $$FVU = \frac{SS_{res}}{SS_{tot}}$$

  • MSE(): Mean Squared Error, computed as: $$MSE = \frac{1}{n}\sum_{i=1}^n(\hat{y_i}-y_i)^2$$ MSE = mean((sim-obs)^2)

  • EF(): Model efficiency, also called Nash-Sutcliffe efficiency (NSE). This statistic is related to the FVU as \(EF = 1-FVU\). It is also related to \(R^2\) because they share the same equation, except that for EF the residual sum of squares is computed relative to the identity function (i.e. the 1:1 line) instead of the regression line. It is computed as: $$EF = 1-\frac{SS_{res}}{SS_{tot}}$$

  • Bias(): Modelling bias, simply computed as: $$Bias = \frac{\sum_1^n(\hat{y_i}-y_i)}{n}$$ Bias = mean(sim-obs)

  • MAPE(): Mean Absolute Percent Error, computed as: $$MAPE = \frac{\sum_1^n(\frac{\left|\hat{y_i}-y_i\right|} {y_i})}{n}$$

  • RME(): Relative mean error, computed as: $$RME = \frac{\sum_1^n(\frac{\hat{y_i}-y_i}{y_i})}{n}$$ RME = mean((sim-obs)/obs)

  • tSTUD(): Student's t test statistic of the mean difference, where M is the vector of differences (sim - obs), computed as: $$tSTUD = \frac{Bias}{\sqrt{\frac{var(M)}{n\_obs}}}$$ tSTUD = Bias/sqrt(var(M)/n_obs)

  • tLimit(): Student's t threshold (critical value), computed using qt(): $$tLimit = qt(1-\frac{\alpha}{2}, df = length(obs)-1)$$ tLimit = qt(1-risk/2, df = length(obs)-1)

  • Decision(): Decision of the Student's t test of the mean difference (can the bias be considered statistically not different from 0 at the given risk level, by default 0.05, i.e. a 5% probability of erroneously rejecting this hypothesis?), computed as: $$Decision = abs(tSTUD) < tLimit$$
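
The decompositions above can be checked by hand. Below is a minimal base R sketch, assuming sim and obs are numeric vectors of equal length with no NA values; the example vectors are arbitrary and used only for illustration. Note that the Kobayashi and Salam (2000) identity holds exactly when the standard deviations are computed with the 1/n (population) formula, which may differ slightly from sd_obs() and sd_sim() if those use the sample (n-1) formula.

# Illustrative sketch (not the package implementation): manual computation
# of a few statistics, assuming no NA values in `sim` and `obs`.
sim <- c(1.2, 2.1, 2.9, 4.2, 5.1)
obs <- c(1.0, 2.0, 3.0, 4.0, 5.0)

n    <- length(obs)
bias <- mean(sim - obs)                    # Bias
mse  <- mean((sim - obs)^2)                # MSE
rmse <- sqrt(mse)                          # RMSE

# Systematic/unsystematic split from the regression of sim on obs
fit    <- stats::lm(sim ~ obs)
sim_lm <- stats::fitted.values(fit)
rmse_s <- sqrt(mean((sim_lm - obs)^2))     # RMSEs
rmse_u <- sqrt(mean((sim_lm - sim)^2))     # RMSEu
all.equal(rmse_s^2 + rmse_u^2, mse)        # MSEs + MSEu = MSE

# Kobayashi and Salam (2000) decomposition: MSE = Bias2 + SDSD + LCS
sd_o  <- sqrt(mean((obs - mean(obs))^2))   # 1/n standard deviations
sd_s  <- sqrt(mean((sim - mean(sim))^2))
r     <- stats::cor(sim, obs)
bias2 <- bias^2                            # Bias2
sdsd  <- (sd_o - sd_s)^2                   # SDSD
lcs   <- 2 * sd_o * sd_s * (1 - r)         # LCS
all.equal(bias2 + sdsd + lcs, mse)

The resulting values can then be compared with the output of the corresponding functions, e.g. RMSE(sim, obs), RMSEs(sim, obs), RMSEu(sim, obs) and MSE(sim, obs).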

Note

\(SS_{res}\) is the residual sum of squares and \(SS_{tot}\) the total sum of squares. They are computed as: $$SS_{res} = \sum_{i=1}^n (y_i - \hat{y_i})^2$$ SS_res = sum((obs-sim)^2) $$SS_{tot} = \sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^2$$ SS_tot = sum((obs-mean(obs))^2) Also, note that \(y_i\) refers to the observed values, \(\hat{y_i}\) to the predicted values, \(\bar{y}\) to the mean of the observed values, and \(\tilde{y_i}\) to the values predicted by the linear regression of sim on obs.
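
As a minimal illustration of these two quantities (assuming, as above, numeric sim and obs vectors without NA values), SS_res, SS_tot and the statistics derived from them can be computed directly:

# Illustrative sketch: sums of squares and derived statistics.
ss_res <- sum((obs - sim)^2)          # residual sum of squares (about the 1:1 line)
ss_tot <- sum((obs - mean(obs))^2)    # total sum of squares
fvu    <- ss_res / ss_tot             # FVU, fraction of variance unexplained
ef     <- 1 - fvu                     # EF (also returned by NSE())

These should match the values returned by FVU(sim, obs) and EF(sim, obs).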

Examples

if (FALSE) {
sim <- rnorm(n = 5, mean = 1, sd = 1)
obs <- rnorm(n = 5, mean = 1, sd = 1)
RMSE(sim, obs)
}
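
As a further sketch (also not run), the same kind of vectors can be passed to several of the functions listed above; the set.seed() call is only added here to make the example reproducible:

if (FALSE) {
set.seed(42)                      # reproducibility of this sketch only
sim <- rnorm(n = 10, mean = 1, sd = 1)
obs <- rnorm(n = 10, mean = 1, sd = 1)
Bias(sim, obs)                    # mean difference between sim and obs
nRMSE(sim, obs)                   # RMSE normalized by the mean of obs (%)
EF(sim, obs)                      # model efficiency (Nash-Sutcliffe)
Decision(sim, obs, risk = 0.05)   # TRUE if the bias is not significantly different from 0
}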