This ShinyItemAnalysis module provides an interactive display of Differential Item Functioning in Change (DIF-C) analysis for ordinal items. We use the Micro and Macro measurements of attitudes towards the expulsion of Sudeten Germans after WWII, as described in more detail in Kolek, Šisler, Martinková, and Brom (2021). DIF-C analysis was first described for binary items in Martinková, Hladká, and Potužníková (2020), demonstrating that this more detailed item-level analysis is able to detect between-group differences in pre-post gains even when no difference is observable in gains in total scores. DIF analysis is implemented with generalized logistic regression models in the difNLR package (Hladká & Martinková, 2020). The module is part of the ShinyItemAnalysis package (Martinková & Drabinová, 2018).
DIF analysis may come to a different conclusion than a test of group differences in total scores. Two groups may have the same distribution of total scores, yet some items may function differently for the two groups. Conversely, one of the groups may have a significantly lower total score, and yet there may be no DIF item (Martinková et al., 2017). This section examines the between-group differences in total scores only; further sections are devoted to DIF and DIF-C analysis.
We first examine the two groups (experimental and control) on Pretest. No between-group differences were expected on Pretest.
We now examine the change in attitudes towards expulsion in the two groups, and the between-group differences in this change. We expected the experimental group to be more affected by the game, both in the short term (Pretest - Posttest) and in the long term (Pretest - Delayed Posttest), while the change itself was expected to persist (no difference was expected between Posttest and Delayed Posttest). Note that in their study, Kolek et al. (2021) complement the t tests displayed below with more complex mixed-effect regression models taking respondent characteristics into account.
# load libraries
library(ShinyItemAnalysis)
library(difNLR)
library(ggplot2)
library(moments)

# explore the variables of the dataset (from ShinyItemAnalysis)
names(AttitudesExpulsion)

# convert group variable to integer, assigning '1' to the experimental group
group <- as.numeric(AttitudesExpulsion[, "Group"] == "E")

# total score calculation with respect to group
score <- AttitudesExpulsion$PreMacro # or PreMicro, PostMacro/Micro, DelMacro/Micro
score0 <- score[group == 0] # control group
score1 <- score[group == 1] # experimental group

# summary of total score
tab <- rbind(
  c(
    length(score0), min(score0), max(score0), mean(score0), median(score0),
    sd(score0), skewness(score0), kurtosis(score0)
  ),
  c(
    length(score1), min(score1), max(score1), mean(score1), median(score1),
    sd(score1), skewness(score1), kurtosis(score1)
  )
)
colnames(tab) <- c("N", "Min", "Max", "Mean", "Median", "SD", "Skewness", "Kurtosis")
tab

# create a dataframe for plotting
df <- data.frame(score, group = as.factor(group))

# histogram of total scores with respect to group
ggplot(data = df, aes(x = score, fill = group, col = group)) +
  geom_histogram(binwidth = 1, position = "dodge2", alpha = 0.75) +
  xlab("Total score") +
  ylab("Number of respondents") +
  scale_fill_manual(
    values = c("dodgerblue2", "goldenrod2"),
    labels = c("Control", "Experimental")
  ) +
  scale_colour_manual(
    values = c("dodgerblue2", "goldenrod2"),
    labels = c("Control", "Experimental")
  ) +
  theme_app() +
  theme(legend.position = "left")

# t-test to compare total scores
t.test(score0, score1)
In Pretest, Kolek et al. (2021) assumed the items would function similarly for the experimental and the control group. As expected, no DIF was confirmed in Pretest.
In their study, Kolek et al. (2021) used the group-specific cumulative logit model to detect DIF on Pretest, and DIF-C in Posttest and in Delayed Posttest. They tested the null hypothesis of no DIF/DIF-C against the alternative of any type of DIF/DIF-C (uniform or nonuniform). They used the Pretest total score as a matching criterion and the Benjamini-Hochberg correction for multiple comparisons. Item purification was not applied. Here we offer the DIF/DIF-C analysis with the same settings as in Kolek et al. (2021). You can also change the type of DIF to be tested, the matching criterion, and the parametrization (either IRT or the classical intercept/slope). You can also select a correction method for multiple comparisons and/or item purification.
The probability that respondent \(p\) with the pretest score (matching criterion) \(X_p\) and the group membership variable \(G_p\) obtained at least \(k\) points in item \(i\) is given by the following equation:
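Under the classic intercept/slope parametrization of the cumulative logit model (as used in the difNLR package), this probability can be written as

\[
\mathrm{P}(Y_{pi} \geq k | X_p, G_p) = \frac{\exp(b_{0ik} + b_{1i} X_p + b_{2i} G_p + b_{3i} X_p G_p)}{1 + \exp(b_{0ik} + b_{1i} X_p + b_{2i} G_p + b_{3i} X_p G_p)},
\]

where \(b_{0ik}\) are category-specific intercepts and \(b_{2i}\) and \(b_{3i}\) capture uniform and nonuniform DIF, respectively.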
The probability that respondent \(p\) with the pretest score (matching criterion) \(X_p\) and group membership \(G_p\) obtained exactly \(k\) points in item \(i\) is then given as the difference between the probabilities of obtaining at least \(k\) and \(k + 1\) points:
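In symbols:

\[
\mathrm{P}(Y_{pi} = k | X_p, G_p) = \mathrm{P}(Y_{pi} \geq k | X_p, G_p) - \mathrm{P}(Y_{pi} \geq k + 1 | X_p, G_p).
\]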
This summary table contains the \(\chi^2\)-statistics of the likelihood ratio test, the corresponding \(p\)-values considering the selected correction method, and significance codes. The table also provides estimated parameters of the best-fitting model for each item.
Points represent the observed proportions of the given item score at each level of the matching criterion. Their size reflects the number of respondents in each group who achieved the given level of the matching criterion and selected the given option.
This table summarizes the estimated item parameters together with their standard errors.
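The Pretest DIF analysis can be run with the difORD() function of the difNLR package, analogously to the Posttest code shown later in this module. A minimal sketch, assuming the Pretest Macro items are stored in columns PreMacro_01 to PreMacro_07 of the AttitudesExpulsion dataset (this naming mirrors the PostMacro items and is an assumption):

```r
# load libraries
library(ShinyItemAnalysis)
library(difNLR)

# prepare data; the PreMacro_01..PreMacro_07 column names are an assumption
data <- AttitudesExpulsion[, paste0("PreMacro_0", 1:7)]
group <- as.numeric(AttitudesExpulsion[, "Group"] == "E")
score <- AttitudesExpulsion$PreMacro # DIF matching score (Pretest total)

# DIF with the group-specific cumulative logit model,
# Benjamini-Hochberg correction, no item purification
(fit <- difORD(
  Data = data, group = group, focal.name = 1, model = "cumulative",
  type = "both", match = score, p.adjust.method = "BH", purify = FALSE,
  parametrization = "classic"
))
```

With the settings of Kolek et al. (2021), no item is expected to be flagged on Pretest.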
In Posttest, Kolek et al. (2021) assumed that some, but not necessarily all, items would function differentially for respondents in the experimental and the control group with the same Pretest score. The DIF-C analysis revealed that Item 4 in the Macro and Item 10 in the Micro measurement functioned differentially.
In their study, Kolek et al. (2021) used the group-specific cumulative logit model to detect DIF on Pretest, and DIF-C in Posttest and in Delayed Posttest. They tested the null hypothesis of no DIF/DIF-C against the alternative of any type of DIF/DIF-C (uniform or nonuniform). They used the Pretest total score as a matching criterion and the Benjamini-Hochberg correction for multiple comparisons. Item purification was not applied. Here we offer the DIF/DIF-C analysis with the same settings as in Kolek et al. (2021). You can also change the type of DIF to be tested, the matching criterion, and the parametrization (either IRT or the classical intercept/slope). You can also select a correction method for multiple comparisons and/or item purification.
The probability that respondent \(p\) with the pretest score (matching criterion) \(X_p\) and the group membership variable \(G_p\) obtained at least \(k\) points in item \(i\) is given by the following equation:
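Under the classic intercept/slope parametrization of the cumulative logit model (as used in the difNLR package), this probability can be written as

\[
\mathrm{P}(Y_{pi} \geq k | X_p, G_p) = \frac{\exp(b_{0ik} + b_{1i} X_p + b_{2i} G_p + b_{3i} X_p G_p)}{1 + \exp(b_{0ik} + b_{1i} X_p + b_{2i} G_p + b_{3i} X_p G_p)},
\]

where \(b_{0ik}\) are category-specific intercepts and \(b_{2i}\) and \(b_{3i}\) capture uniform and nonuniform DIF, respectively.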
The probability that respondent \(p\) with the pretest score (matching criterion) \(X_p\) and group membership \(G_p\) obtained exactly \(k\) points in item \(i\) is then given as the difference between the probabilities of obtaining at least \(k\) and \(k + 1\) points:
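In symbols:

\[
\mathrm{P}(Y_{pi} = k | X_p, G_p) = \mathrm{P}(Y_{pi} \geq k | X_p, G_p) - \mathrm{P}(Y_{pi} \geq k + 1 | X_p, G_p).
\]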
This summary table contains the \(\chi^2\)-statistics of the likelihood ratio test, the corresponding \(p\)-values considering the selected correction method, and significance codes. The table also provides estimated parameters of the best-fitting model for each item.
Points represent the observed proportions of the given item score at each level of the matching criterion. Their size reflects the number of respondents in each group who achieved the given level of the matching criterion and selected the given option.
This table summarizes the estimated item parameters together with their standard errors.
# load libraries
library(ShinyItemAnalysis)
library(difNLR)
library(ggplot2)

# prepare data
data <- AttitudesExpulsion[, paste0("PostMacro_0", 1:7)]
group <- as.numeric(AttitudesExpulsion[, "Group"] == "E")
score <- AttitudesExpulsion$PreMacro # DIF matching score

# DIF-C with cumulative logit regression model
(fit <- difORD(
  Data = data, group = group, focal.name = 1, model = "cumulative",
  type = "both", match = score, p.adjust.method = "BH", purify = FALSE,
  parametrization = "classic"
))

# plot cumulative probabilities for item PostMacro_04
plot(fit, item = "PostMacro_04", plot.type = "cumulative")

# plot category probabilities for item PostMacro_04
plot(fit, item = "PostMacro_04", plot.type = "category")

# estimate coefficients for all items with SE
coef(fit, SE = TRUE)
In Delayed Posttest, Kolek et al. (2021) assumed that some, but not necessarily all, items would function differentially for respondents in the experimental and the control group with the same Pretest score. The DIF-C analysis revealed that Items 6 and 10 in the Micro measurement functioned differentially.
In their study, Kolek et al. (2021) used the group-specific cumulative logit model to detect DIF on Pretest, and DIF-C in Posttest and in Delayed Posttest. They tested the null hypothesis of no DIF/DIF-C against the alternative of any type of DIF/DIF-C (uniform or nonuniform). They used the Pretest total score as a matching criterion and the Benjamini-Hochberg correction for multiple comparisons. Item purification was not applied. Here we offer the DIF/DIF-C analysis with the same settings as in Kolek et al. (2021). You can also change the type of DIF to be tested, the matching criterion, and the parametrization (either IRT or the classical intercept/slope). You can also select a correction method for multiple comparisons and/or item purification.
The probability that respondent \(p\) with the pretest score (matching criterion) \(X_p\) and the group membership variable \(G_p\) obtained at least \(k\) points in item \(i\) is given by the following equation:
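Under the classic intercept/slope parametrization of the cumulative logit model (as used in the difNLR package), this probability can be written as

\[
\mathrm{P}(Y_{pi} \geq k | X_p, G_p) = \frac{\exp(b_{0ik} + b_{1i} X_p + b_{2i} G_p + b_{3i} X_p G_p)}{1 + \exp(b_{0ik} + b_{1i} X_p + b_{2i} G_p + b_{3i} X_p G_p)},
\]

where \(b_{0ik}\) are category-specific intercepts and \(b_{2i}\) and \(b_{3i}\) capture uniform and nonuniform DIF, respectively.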
The probability that respondent \(p\) with the pretest score (matching criterion) \(X_p\) and group membership \(G_p\) obtained exactly \(k\) points in item \(i\) is then given as the difference between the probabilities of obtaining at least \(k\) and \(k + 1\) points:
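In symbols:

\[
\mathrm{P}(Y_{pi} = k | X_p, G_p) = \mathrm{P}(Y_{pi} \geq k | X_p, G_p) - \mathrm{P}(Y_{pi} \geq k + 1 | X_p, G_p).
\]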
This summary table contains the \(\chi^2\)-statistics of the likelihood ratio test, the corresponding \(p\)-values considering the selected correction method, and significance codes. The table also provides estimated parameters of the best-fitting model for each item.
Points represent the observed proportions of the given item score at each level of the matching criterion. Their size reflects the number of respondents in each group who achieved the given level of the matching criterion and selected the given option.
This table summarizes the estimated item parameters together with their standard errors.
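The Delayed Posttest DIF-C analysis mirrors the Posttest code shown earlier in this module; a minimal sketch, assuming the Delayed Posttest Macro items are stored in columns DelMacro_01 to DelMacro_07 (this naming mirrors the PostMacro items and is an assumption). The matching criterion remains the Pretest total score:

```r
# load libraries
library(ShinyItemAnalysis)
library(difNLR)

# prepare data; the DelMacro_01..DelMacro_07 column names are an assumption
data <- AttitudesExpulsion[, paste0("DelMacro_0", 1:7)]
group <- as.numeric(AttitudesExpulsion[, "Group"] == "E")
score <- AttitudesExpulsion$PreMacro # DIF matching score (Pretest total)

# DIF-C with cumulative logit regression model
(fit <- difORD(
  Data = data, group = group, focal.name = 1, model = "cumulative",
  type = "both", match = score, p.adjust.method = "BH", purify = FALSE,
  parametrization = "classic"
))

# estimated coefficients with standard errors
coef(fit, SE = TRUE)
```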
ShinyItemAnalysis Modules are developed by the Computational Psychometrics Group supported by the Czech Science Foundation under Grant Number 21-03658S.