Internal Validation Methods
For internal validation, the simulated data for the common variables (constraints) used are compared with the existing census data, because the census holds complete information about the persons in the simulated area. Several validation methods are possible; the simSALUD framework currently supports the total absolute error (TAE), the standardized absolute error (SAE) and the percentage error (PSAE).
Total Absolute Error (TAE)
The total absolute error (TAE) is the absolute difference between the simulated and the census value for each constraint category in each area, summed over all areas. This method shows how large the difference between the simulated and the actual dataset is. It does not, however, indicate whether any of the differences are statistically significant.
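A minimal sketch of the calculation described above, in Python. The area names, category names and counts are made up for illustration, and the function name is hypothetical; it is not taken from simSALUD. Because the definition can be read per area or overall, the sketch keeps the per-area sums and then adds them up.

```python
# Hypothetical counts: area -> {constraint category -> number of persons}.
census = {
    "area_1": {"male": 120, "female": 130},
    "area_2": {"male": 80, "female": 95},
}
simulated = {
    "area_1": {"male": 118, "female": 133},
    "area_2": {"male": 84, "female": 95},
}


def tae_per_area(census, simulated):
    """Return {area: sum of |simulated - census| over its categories}."""
    return {
        area: sum(abs(simulated[area][cat] - count) for cat, count in cats.items())
        for area, cats in census.items()
    }


per_area = tae_per_area(census, simulated)
total_tae = sum(per_area.values())  # summed over all areas
print(per_area, total_tae)
```

A TAE of 0 would mean that the simulated counts reproduce the census counts exactly for every category and area.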
Total Absolute Error (TAE) Percent of total
Based on the total absolute error (TAE), this method calculates, across all regions, the percentage of regions whose error is smaller than a user-defined threshold. Example: with a user-defined threshold of 10% and a result of Male: 30%, the deviation between the census value and the simulated value for the Male category is below 10% in 30% of all regions.
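The following Python sketch illustrates the idea for one constraint category. It assumes that the deviation of a region is expressed relative to its census value and that no census value is zero; both are assumptions of this example, as are the function name and the numbers.

```python
def percent_below_threshold(census_values, simulated_values, threshold_pct):
    """Share (in %) of regions whose relative deviation is below threshold_pct."""
    below = 0
    for census_value, simulated_value in zip(census_values, simulated_values):
        # Assumption: deviation is measured relative to the census value.
        deviation_pct = abs(simulated_value - census_value) / census_value * 100
        if deviation_pct < threshold_pct:
            below += 1
    return below / len(census_values) * 100


# Hypothetical counts for the category "Male" in four regions.
census_male = [100, 200, 50, 80]
simulated_male = [105, 150, 52, 95]

share = percent_below_threshold(census_male, simulated_male, threshold_pct=10)
print(f"Male: {share:.0f}% of regions deviate by less than 10%")
```

With these numbers the deviations are roughly 5%, 25%, 4% and 19%, so two of the four regions fall below the threshold.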
Standardized Absolute Error (SAE)
The standardized absolute error (SAE) is the TAE divided by the total census count for each area.
Percentage Error (PSAE)
The percentage error (PSAE) is the SAE multiplied by 100, which expresses the SAE as a percentage.
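As a small worked example of the two measures, the Python sketch below derives the SAE and the PSAE for a single area. The function names and the figures (a TAE of 5 for an area with 250 persons in the census) are hypothetical.

```python
def standardized_absolute_error(tae, census_total):
    """SAE: TAE of an area divided by the total census count of that area."""
    return tae / census_total


def percentage_error(sae):
    """PSAE: SAE expressed as a percentage."""
    return sae * 100


area_tae = 5             # hypothetical TAE of one area
area_census_total = 250  # hypothetical census count of that area

sae = standardized_absolute_error(area_tae, area_census_total)
psae = percentage_error(sae)
print(f"SAE = {sae:.3f}, PSAE = {psae:.1f}%")
```

Dividing by the census count makes areas of different size comparable, which the raw TAE is not.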
Independent Samples T-Test
The independent-samples t-test compares the means of two unrelated groups to determine whether they differ significantly.
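The test statistic can be sketched in Python as follows. This example assumes the classic Student form with a pooled variance (equal variances in both groups); the text does not state which variant simSALUD uses. The function name and the two groups are hypothetical. The sketch stops at the t value and the degrees of freedom; a p-value would additionally require the t distribution, for example from a statistics package.

```python
import math
import statistics


def independent_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent samples.

    Assumption: equal variances, so a pooled variance is used.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Two hypothetical, unrelated groups of values.
group_a = [10, 12, 14]
group_b = [11, 13, 15]

t_stat, df = independent_t_statistic(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

A t value close to 0 indicates that the two group means are similar relative to the spread within the groups.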
Correlation Coefficient or Pearson Correlation
The correlation coefficient, or Pearson correlation, measures the strength and the direction of a linear relationship between two variables. In a simple linear regression, its absolute value is the square root of the coefficient of determination.
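For illustration, the coefficient can be computed directly from its textbook definition. The Python sketch below uses a hypothetical function name and made-up census and simulated values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical census and simulated counts for the same regions.
census_values = [100, 200, 50, 80]
simulated_values = [105, 150, 52, 95]

r = pearson_r(census_values, simulated_values)
print(f"r = {r:.3f}")
```

The result lies between -1 and 1: the sign gives the direction of the relationship and the magnitude its strength.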
Linear Regression: Coefficient of determination
The coefficient of determination (R²) indicates how well data points fit a line or curve. In a simple linear regression it is the square of the Pearson correlation coefficient.
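The Python sketch below fits a straight line by ordinary least squares and derives R² from the residuals. The function name and the data points are hypothetical, and the example covers only the simple case of one predictor and a straight line.

```python
def linear_fit_r_squared(xs, ys):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, R²)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # R² = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r_squared = linear_fit_r_squared(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R² = {r_squared:.2f}")
```

For these points the fitted line is approximately y = 2.2 + 0.6x with an R² of about 0.6, meaning the line accounts for roughly 60% of the variation in y.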