Title: | Strategic Selection Estimator |
---|---|
Description: | Provides functions to fit a strategic selection estimator. A strategic selection estimator is an agent error model in which the two random components are not assumed to be orthogonal. In addition, this package provides generic functions to print and plot objects of its class as well as the necessary functions to create tables for LaTeX. There is also a function to create dyadic data sets. |
Authors: | Lucas Leemann |
Maintainer: | Lucas Leemann <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.4 |
Built: | 2025-03-05 05:02:26 UTC |
Source: | https://github.com/lleemann/stratsel |
This package provides functionality to estimate, summarize, plot, predict, and export strategic selection estimates. It allows researchers to incorporate the strategic nature of the data-generating process (DGP) without constraining the errors to be orthogonal. By relaxing this assumption, the estimator becomes a blend of an agent error model and a Heckman selection model.
Lucas Leemann [email protected]
Lucas Leemann. 2014. "Strategy and Sample Selection - A Strategic Selection Estimator", Political Analysis 22: 374-397.
games
# replicate the example from Leemann (2014):
library(memisc)
data(war1800)
## Not run:
out1 <- StratSel(Y ~ s_wt_re1 + revis1 | dem1 + mixed1 | balanc + dem2 + mixed2, data=war1800, corr=TRUE)
## End(Not run)
out2 <- StratSel(Y ~ s_wt_re1 + revis1 | dem1 + mixed1 | balanc + dem2 + mixed2, data=war1800, corr=FALSE)
setStratSelDefault()
## Not run:
z <- mtable(out1,out2)
# toLatex(z) for a LaTeX output or just regular table:
This data is just for illustration. The code to generate it is:
library(mnormt)  # provides rmnorm()
set.seed(124)
n <- 1000
x24 <- cbind(rnorm(n), rnorm(n))
error <- rmnorm(n,c(0,0),matrix(c(1,0.6,0.6,1),2,2))
e24 <- error[,2]
y24.latent <- x24%*%c(1,1) + e24
y2 <- rep(NA,n)
y2[y24.latent>0] <- 1
y2[y24.latent<0] <- 0
mod2 <- glm(y2 ~ x24, family=binomial(link=probit))
p24 <- pnorm(predict(mod2))
x11 <- cbind(rnorm(n, sd=0.2), rnorm(n, sd=0.2))
x14 <- cbind(x24[,2],rnorm(n))
e14 <- error[,1]
y14.latent <- x14%*%c(2,1) * p24 - x11%*%c(1,1) + e14
y1 <- rep(NA,n)
y1[y14.latent>0] <- 1
y1[y14.latent<0] <- 0
Y <- rep(NA,n)
Y[y1==0] <- 1
Y[y1==1&y2==0] <- 3
Y[y1==1&y2==1] <- 4
colnames(x11) <- c("var A", "var B")
colnames(x14) <- c("var C", "var D")
colnames(x24) <- c("var E", "var C")
data.fake <- data.frame(Y,x11,x14,x24)
data(data.fake)
A data frame with 1000 observations on the following 7 variables.
Y
A numeric vector with values 1, 3, and 4 depending on which outcome occurred.
var.A
A numeric vector mimicking an explanatory variable as part of X11.
var.B
A numeric vector mimicking an explanatory variable as part of X11.
var.C
A numeric vector mimicking an explanatory variable as part of X14 and of X24.
var.D
A numeric vector mimicking an explanatory variable as part of X14.
var.E
A numeric vector mimicking an explanatory variable as part of X24.
var.C.1
A numeric vector mimicking an explanatory variable as part of X14 and of X24. Identical to var.C.
Can be independently re-created by anybody.
data(data.fake)
summary(data.fake)
## Not run:
out1 <- StratSel(Y ~ var.A + var.B | var.C + var.D | var.E + var.C, data=data.fake, corr=TRUE)
## End(Not run)
## Not run:
summary(out1)
# True parameters are 1 or 2 except the three constant terms (which are 0).
# The correlation parameter was set to +0.6.
Transform the estimated parameter (θ) back to ρ
The model has a correlation parameter which is estimated and theoretically bounded between -1 and +1. To ensure that the estimate stays within these bounds, a transformation is necessary. Here ρ is the actual correlation coefficient and θ is the parameter we estimate in the model. This parametrization has been worked into the likelihood function and ensures that ρ will be between -1 and +1.
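As an illustration of how an unbounded parameter can be mapped into the interval (-1, +1), the following sketch uses a scaled logistic function. The specific mapping and the helper names `theta_to_rho` and `rho_to_theta` are assumptions chosen for this example; the text here does not confirm that the package uses exactly this form.

```r
# Assumed mapping for illustration only: rho = 2 / (1 + exp(-theta)) - 1.
# Helper names are hypothetical and not part of the package.
theta_to_rho <- function(theta) 2 / (1 + exp(-theta)) - 1

# Inverse of the assumed mapping, useful for checking the round trip.
rho_to_theta <- function(rho) log((1 + rho) / (1 - rho))

theta <- c(-10, -2.35, 0, 2.35, 10)   # any real number is admissible
rho <- theta_to_rho(theta)            # always strictly inside (-1, 1)
print(round(rho, 4))
```

Because the mapping is strictly increasing, an optimizer can search freely over the unbounded parameter without ever producing an inadmissible correlation.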
fetch.rho.b(b)
b |
The vector of estimated coefficients ( |
This function is for internal use but documented as a regular function to enable any user to assess the estimator and its functionality.
The function returns the correct estimate for ρ.
We want to estimate ρ, but because it is theoretically bounded, we estimate θ, which is unbounded and can range from -∞ to +∞.
Lucas Leemann [email protected]
test <- c(1,1,-2.35)
fetch.rho.b(test)
Transform the variance of the estimated parameter (θ) back to the variance of ρ
The model has a correlation parameter which is estimated and theoretically bounded between -1 and +1. To ensure that the estimate stays within these bounds, a transformation is necessary. Here ρ is the actual correlation coefficient and θ is the parameter we estimate in the model. This parametrization has been worked into the likelihood function and ensures that ρ will be between -1 and +1.
The variance-covariance matrix thus contains entries based on θ but not ρ. Hence, this function takes the variance of the transformed correlation parameter (θ) and produces the value correct for ρ.
To create the correct variance of ρ, this function simulates 1,000 θ's and then transforms them to ρ's. The variance of these ρ's is then reported. Note that this means the variance-covariance matrix returned by StratSel is correct in all diagonal and off-diagonal entries for the regular parameters (β), but for the correlation coefficient only the variance is correct. Given that there is no reason to use the full variance-covariance matrix for post-estimation commands, this is not a problem.
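The simulation idea can be sketched in a few lines of base R. The mapping from the unbounded parameter to the correlation (a scaled logistic function), the point estimate, and the variance below are all assumed values for illustration; this is not the package's own implementation of `fetch.rho.v`.

```r
set.seed(2014)

# Assumed mapping from the unbounded parameter to the correlation (illustrative).
theta_to_rho <- function(theta) 2 / (1 + exp(-theta)) - 1

theta_hat <- 0.8    # hypothetical estimate of the unbounded parameter
var_theta <- 0.25   # hypothetical variance of that estimate

# Draw 1,000 values of the unbounded parameter, transform each draw,
# and take the variance of the transformed draws.
theta_draws <- rnorm(1000, mean = theta_hat, sd = sqrt(var_theta))
var_rho_sim <- var(theta_to_rho(theta_draws))

# First-order (delta method) approximation for comparison:
# the derivative of the assumed mapping is (1 - rho^2) / 2.
var_rho_delta <- ((1 - theta_to_rho(theta_hat)^2) / 2)^2 * var_theta

print(c(simulated = var_rho_sim, delta = var_rho_delta))
```

The simulated variance and the delta-method value should be close when the variance of the unbounded parameter is small, and can diverge when the draws reach the flat tails of the mapping.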
fetch.rho.v(v, b)
v |
Variance-covariance matrix based on the regular parameters ( |
b |
Coefficient vector, first |
This function is for internal use but documented as a regular function to enable any user to assess the estimator and its functionality.
Returns the correct variance estimate for the estimate of the correlation coefficient ρ.
Lucas Leemann [email protected]
fetch.rho.v(matrix(c(1,0,0,1),2,2),c(0,0))
fetch.rho.v(matrix(c(1,0,0,2),2,2),c(0,0))
The function creates good starting values based on the supplied data and the model to be estimated. To do so, the function runs two probit models. The first one is fitted only at the lower node of the game tree (see StratSel). It then creates predicted probabilities (p24) to estimate a second probit at the first node, in which the variables that are part of X14 are weighted by p24.
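The two-step idea can be sketched with base R's `glm`. Everything below (the simulated data, the variable names `a`, `c`, `e`, and the ordering of the resulting vector) is an assumption made for illustration; it is not the code of `gen.Startval`.

```r
set.seed(42)
n <- 500

# Hypothetical covariates: a enters X11, c enters X14, e enters X24.
d <- data.frame(a = rnorm(n), c = rnorm(n), e = rnorm(n))
p24_true <- pnorm(d$e)
d$y1 <- as.numeric(d$c * p24_true - d$a + rnorm(n) > 0)  # 1 = player 1 moves right
d$y2 <- as.numeric(d$e + rnorm(n) > 0)                   # 1 = player 2 moves right

# Step 1: probit at the lower node, using only cases where player 2 gets to move.
mod2 <- glm(y2 ~ e, family = binomial(link = "probit"), data = d,
            subset = y1 == 1)
p24 <- predict(mod2, newdata = d, type = "response")

# Step 2: probit at the first node; the X14 terms (constant and c)
# are weighted by the predicted probability p24, the X11 terms enter negatively.
W <- cbind(-1, -d$a, p24, p24 * d$c)
mod1 <- glm(y1 ~ 0 + W, family = binomial(link = "probit"), data = d)

# Collect candidate starting values: X11 block, X14 block, X24 block.
start_values <- unname(c(coef(mod1), coef(mod2)))
print(round(start_values, 2))
```

Starting the optimizer from values like these, rather than from zeros, usually shortens the search because both probit fits already capture the rough scale of each coefficient block.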
gen.Startval(Startval, user.supplied.startval, corr, ys, xs11, xs14, xs24, dim.x11, dim.x14, dim.x24)
Startval |
Optional. A vector of user supplied starting values. |
user.supplied.startval |
Logical. If TRUE this function just returns the vector |
corr |
Logical. Indicates whether the estimated agent error model assumes orthogonal errors ( |
ys |
Vector. The outcome variable which is supplied by the user to StratSel. |
xs11 |
Matrix. Explanatory variables for player 1 and measuring utility from outcome 1. |
xs14 |
Matrix. Explanatory variables for player 1 and measuring utility from outcome 4. |
xs24 |
Matrix. Explanatory variables for player 2 and measuring utility from outcome 4. |
dim.x11 |
Vector. Has two elements for the dimension of X11. |
dim.x14 |
Vector. Has two elements for the dimension of X14. |
dim.x24 |
Vector. Has two elements for the dimension of X24. |
This function is for internal use but documented as a regular function to enable any user to assess the estimator and its functionality.
Vector. Has length of the number of parameters to be estimated.
Lucas Leemann [email protected]
This function extends mtable() to report strategic selection models (StratSel). Together with setStratSelDefault and the mtable command from the memisc package, users can create multi-model tables and export them to LaTeX.
## S3 method for class 'StratSel'
getSummary(obj, alpha = 0.05, ...)
obj |
An object of class |
alpha |
Significance level. |
... |
additional arguments |
Returns a list of objects to be fed to mtable. Do not use this command directly. The command mtable will automatically call this function for an object of the StratSel class.
Lucas Leemann [email protected]
Elff, Martin. (2013). memisc: Tools for Management of Survey Data, Graphics, Programming, Statistics, and Simulation R package version 0.96-7.
data(data.fake)
out1 <- StratSel(Y ~ var.A | var.D | var.E , data=data.fake, corr=FALSE)
out2 <- StratSel(Y ~ var.A | var.C | var.E, data=data.fake, corr=FALSE)
mtable(out1,out2)
StratSel
Generic logLik function for objects of class StratSel.
## S3 method for class 'StratSel'
logLik(object, ...)
object |
An object of class |
... |
additional arguments. |
Lucas Leemann [email protected]
This function calculates the log-likelihood value for an agent error model (belongs to the general class of quantal response models). The underlying formal structure is
        1
       /\
      /  \
     /    \ 2
   u11    /\
         /  \
        /    \
       0     u14
       0     u24
and shows a game where there are two players which move sequentially. Player 1 decides to move left or right and if she does move right player 2 gets to move. The final outcome in this case depends on the move of player 2.
logLikStrat(x11, x14, x24, y, beta)
x11 |
A vector or a matrix containing the explanatory variables used to parametrize |
x14 |
A vector or a matrix containing the explanatory variables used to parametrize |
x24 |
A vector or a matrix containing the explanatory variables used to parametrize |
y |
Vector. Outcome variable which can take values 1, 3, and 4 depending on which outcome occurred. |
beta |
Vector. Coefficients of the model. |
This function provides the likelihood of an agent error model (Signorino, 2003). Note that its derivation assumes that the two errors are independent. Further, as with probit and logit models, one needs to fix the error variance to achieve identification. Signorino uses a different normalization, while logLikStrat uses 1. Hence, the numeric results will differ, but all relevant statistics (predicted probabilities, z-values, ...) will be identical. Finally, u13 and u23 are set to 0 to achieve identification.
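To make the structure of such a likelihood concrete, here is a minimal sketch of a sequential probit log-likelihood for the game tree above. It assumes one standard normal error at each decision node, a constant added to every block of regressors, and coefficients ordered as X11, X14, X24; the function name `loglik_sketch` is hypothetical. It illustrates the general idea and should not be read as the exact formula implemented in `logLikStrat`.

```r
# Sketch only: assumes unit-variance normal errors at each node and u13 = u23 = 0.
loglik_sketch <- function(x11, x14, x24, y, beta) {
  x11 <- cbind(1, x11); x14 <- cbind(1, x14); x24 <- cbind(1, x24)
  k11 <- ncol(x11); k14 <- ncol(x14); k24 <- ncol(x24)
  b11 <- beta[1:k11]
  b14 <- beta[k11 + 1:k14]
  b24 <- beta[k11 + k14 + 1:k24]

  # Player 2 moves right (outcome 4) with probability p24.
  p24 <- pnorm(x24 %*% b24)
  # Player 1 moves right if the expected utility p24 * u14 exceeds u11.
  p_right <- pnorm(p24 * (x14 %*% b14) - x11 %*% b11)

  sum(log(1 - p_right[y == 1])) +
    sum(log(p_right[y == 3] * (1 - p24[y == 3]))) +
    sum(log(p_right[y == 4] * p24[y == 4]))
}

# Tiny check: with all coefficients at zero every move has probability 0.5.
y <- c(1, 3, 4)
x <- matrix(0, nrow = 3, ncol = 1)
ll0 <- loglik_sketch(x, x, x, y, beta = rep(0, 6))
print(ll0)
```

A function of this shape can be handed to `optim` with `control = list(fnscale = -1)` to maximize it.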
Returns a numeric value for the log-likelihood function evaluated at beta.
The log-likelihood function:
whereas
and
Lucas Leemann [email protected]
Curtis S. Signorino. 2003. "Structure and Uncertainty in Discrete Choice Models." Political Analysis 11:316–344.
This function calculates the log-likelihood value for an agent error model (belongs to the general class of quantal response models) with correlated errors. The underlying formal structure is
        1
       /\
      /  \
     /    \ 2
   u11    /\
         /  \
        /    \
       0     u14
       0     u24
and shows a game where there are two players which move sequentially. Player 1 decides to move left or right and if she does move right player 2 gets to move. The final outcome in this case depends on the move of player 2.
logLikStratSel(x11, x14, x24, y, beta)
x11 |
A vector or a matrix containing the explanatory variables used to parametrize |
x14 |
A vector or a matrix containing the explanatory variables used to parametrize |
x24 |
A vector or a matrix containing the explanatory variables used to parametrize |
y |
Vector. Outcome variable which can take values 1, 3, and 4 depending on which outcome occurred. |
beta |
Vector. Coefficients of the model whereas the last element is the correlation coefficient |
This function provides the likelihood of an agent error model (Signorino, 2003) but in addition allows the random components to be correlated and hence can take selection into account. The correlation parameter is re-parameterized (see Note). Further, as with probit and logit models, one needs to fix the error variance to achieve identification; here 1 is chosen, as in a regular probit model. Finally, u13 and u23 are set to 0 to achieve identification.
Returns a numeric value for the log-likelihood function evaluated at beta.
The notation Φ2(a, b, c) indicates a bivariate standard normal cumulative distribution function evaluated at the values a, b, where the two random variables have a correlation of c.
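For readers who want to see what such a bivariate normal probability looks like numerically, the sketch below evaluates it in base R by integrating over the first variable and using the conditional normal distribution of the second. The function name `pbinorm_sketch` is hypothetical, and this is an illustration of the quantity, not the routine the package calls.

```r
# P(Z1 <= a, Z2 <= b) for standard normal Z1, Z2 with correlation rho.
# Uses Z2 | Z1 = z ~ N(rho * z, 1 - rho^2) and integrates over z.
pbinorm_sketch <- function(a, b, rho) {
  integrand <- function(z) {
    dnorm(z) * pnorm((b - rho * z) / sqrt(1 - rho^2))
  }
  integrate(integrand, lower = -Inf, upper = a)$value
}

# With zero correlation the joint probability factorizes.
p_indep <- pbinorm_sketch(0.3, -0.2, 0)
# At the origin the closed form is 1/4 + asin(rho) / (2 * pi).
p_orig <- pbinorm_sketch(0, 0, 0.5)
print(c(p_indep, p_orig))
```

Comparing the value at a nonzero correlation with the product of the two marginal probabilities shows how much the independence assumption of a plain agent error model can matter.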
whereas
and
The re-parametrization is as follows:
Lucas Leemann [email protected]
Lucas Leemann. 2014. "Strategy and Sample Selection - A Strategic Selection Estimator", Political Analysis 22: 374-397.
This function allows the user to create dyadic data sets which can be directed or undirected.
makeDyadic(x, directed = FALSE, show.progress = 5)
x |
The data matrix, in which the first column is the country code and the second column is the time variable. |
directed |
Logical. If |
show.progress |
Logical. The process may take some time depending on the size of the supplied data matrix. This option allows users to receive feedback of how far along the process is at periodical steps. Default is set to 5. |
This function was first written for Simon Collard-Wexler and then later amended for Fabio Wasserfallen.
Returns a data frame with the dyadic data set.
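The difference between directed and undirected dyads can be sketched with a self-merge in base R. The function name `make_dyads_sketch` and the output layout (suffixes `.1` and `.2`) are assumptions for illustration; the column layout actually produced by `makeDyadic` may differ.

```r
# Sketch: pair every unit with every other unit within the same time period.
make_dyads_sketch <- function(x, directed = FALSE) {
  x <- as.data.frame(x)
  names(x)[1:2] <- c("unit", "time")   # first column: unit code, second: time
  pairs <- merge(x, x, by = "time", suffixes = c(".1", ".2"))
  keep <- if (directed) {
    pairs$unit.1 != pairs$unit.2       # A-B and B-A are separate rows
  } else {
    pairs$unit.1 < pairs$unit.2        # each pair appears only once
  }
  pairs[keep, , drop = FALSE]
}

# Four units observed in three periods, with one covariate.
toy <- data.frame(code = rep(1:4, 3), year = rep(1:3, each = 4), v = 1:12)
dy_directed <- make_dyads_sketch(toy, directed = TRUE)
dy_undirected <- make_dyads_sketch(toy, directed = FALSE)
print(c(nrow(dy_directed), nrow(dy_undirected)))
```

With four units there are 12 ordered and 6 unordered pairs per period, so the directed set has twice as many rows as the undirected one.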
Lucas Leemann [email protected]
dataOrig <- matrix(c(
  rep(c(1:4),3), rep(1,4), rep(2,4), rep(3,4),
  rnorm(4,1.5,0.1), rnorm(4,2.5,0.1), rnorm(4,3.5,0.1),
  rnorm(4,4.5,0.1), rnorm(4,5.5,0.1), rnorm(4,6.5,0.1)),12,4)
colnames(dataOrig) <- c("countryCODE", "Year", "Variable 1", "Variable 2")
dataNew <- makeDyadic(dataOrig, directed=TRUE)
Plots predicted probabilities for all three possible outcomes based on an object of class StratSel.
## S3 method for class 'StratSel'
plot(x, profile, x.move, x.range, uncertainty = FALSE, n.sim = 100, ci = 0.95, ylim, xlab, ylab1, ylab2, ylab3, plot.nr, ...)
x |
An object of class |
profile |
Vector. The values of all independent variables including the three constants. |
x.move |
Scalar. Indicates which variable is changing (and displayed on the x-axis). |
x.range |
Vector. A vector with two elements. The |
uncertainty |
Logical. Indicates whether confidence bands should be displayed or not. |
n.sim |
Scalar. If |
ci |
Scalar. Indicates which confidence interval should be plotted, the default is 0.95. |
ylim |
Vector. A vector with two elements defining the range of the plotted y (predicted probability). |
xlab |
String. A label to be used for the x-axis. Will be recycled in all three plots. |
ylab1 |
String. Label for the y-axis of the first plot (predicted probability of outcome 1). |
ylab2 |
String. Label for the y-axis of the second plot (predicted probability of not outcome 1). |
ylab3 |
String. Label for the y-axis of the third plot (predicted probability of outcome 4). |
plot.nr |
Vector. If one does not want to plot all three outcomes, one can use this vector to indicate which plot(s) should be shown. |
... |
Further arguments to be supplied to |
Lucas Leemann [email protected]
data(data.fake)
# Running just an agent error model (note: corr=FALSE) with var.C being
# part of both actors' utilities
out1 <- StratSel(Y ~ var.A + var.B | var.C + var.D | var.E + var.C, data=data.fake, corr=FALSE)
par(mfrow=c(3,1))
plot(out1, profile=c(1,0.2,-0.2,1,0.2,-0.2,1,0.1,-0.3), x.move=c(5,9), x.range=c(-15,15), ci=0.7, uncertainty=TRUE)
StratSel
Class
Prediction function for objects of the StratSel class. Provides either predictions for all observations in a model or for a specified profile. In addition, the function will either predict an outcome or three probabilities (indicating the probability for each outcome).
## S3 method for class 'StratSel'
predict(object, prob = FALSE, profile, ...)
object |
An object of class |
prob |
Logical. If |
profile |
Vector. A vector defining a specific profile for which the prediction is made. |
... |
... |
Either a matrix with dimension n * m, where n is the number of observations in the original model and m is three (for the three possible outcomes), or a vector with n elements indicating the most likely outcome for each observation.
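The relationship between the two return types can be sketched as follows: given a matrix of outcome probabilities, the most likely outcome is the column with the largest value in each row. The probability matrix below is made up for illustration and the column order (outcomes 1, 3, 4) is an assumption, not taken from the package.

```r
# Hypothetical predicted probabilities for three observations;
# columns correspond to outcomes 1, 3, and 4 and each row sums to one.
probs <- rbind(c(0.6, 0.3, 0.1),
               c(0.2, 0.5, 0.3),
               c(0.1, 0.2, 0.7))
colnames(probs) <- c("1", "3", "4")

# Map the position of the largest probability in each row to its outcome code.
outcomes <- c(1, 3, 4)
predicted <- outcomes[apply(probs, 1, which.max)]
print(predicted)
```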
Lucas Leemann [email protected]
data(data.fake)
out1 <- StratSel(Y ~ var.A + var.B | var.C + var.D | var.E + var.C, data=data.fake, corr=FALSE)
predict(out1)
predict(out1, prob=TRUE)
predict(out1, profile=c(1,0.2,0.2,1,0.2,0.2,1,0.2,0.2))
StratSel
Generic print function for objects of class StratSel.
## S3 method for class 'StratSel'
print(x,...)
x |
An object of class |
... |
additional arguments. |
Lucas Leemann [email protected]
StratSel
Function to print the summary output of an object of class StratSel
## S3 method for class 'StratSel'
print.summary(x, ...)
x |
An object of class |
... |
additional arguments. |
Lucas Leemann [email protected]
mtable
Command
Changes the default settings so that the mtable command can be used with StratSel objects.
setStratSelDefault()
Lucas Leemann [email protected]
Elff, Martin. (2013). memisc: Tools for Management of Survey Data, Graphics, Programming, Statistics, and Simulation R package version 0.96-7.
See the mtable command in the memisc package.
This function fits a strategic selection estimator, which is based on an agent error model (belonging to the general class of quantal response models). The underlying formal structure is
        1
       /\
      /  \
     /    \ 2
   u11    /\
         /  \
        /    \
       0     u14
       0     u24
and shows a game where there are two players which move sequentially. Player 1 decides to move left or right and if she does move right player 2 gets to move. The final outcome in this case depends on the move of player 2.
StratSel(formula, corr = TRUE, Startval, optim.method = "BFGS", data, ...)
formula |
The formula has the following form |
corr |
Logical. If |
Startval |
Vector. Allows the user to specify starting values. If there is no user-supplied vector, the function will generate starting values itself. It is strongly recommended to let the function determine the starting values. |
optim.method |
Optimization method to be used by |
data |
an optional data frame, list or environment (or object coercible by |
... |
additional arguments. |
StratSel returns an object of class StratSel for which appropriate plot, print, summary, and predict functions exist.
Lucas Leemann [email protected]
Lucas Leemann. 2014. "Strategy and Sample Selection - A Strategic Selection Estimator", Political Analysis 22: 374-397.
Curtis S. Signorino. 2003. "Structure and Uncertainty in Discrete Choice Models." Political Analysis 11:316–344.
# replicate the example from Leemann (2014):
data(war1800)
## Not run:
out1 <- StratSel(Y ~ s_wt_re1 + revis1 | dem1 + mixed1 | balanc + dem2 + mixed2, data=war1800, corr=TRUE)
## End(Not run)
out2 <- StratSel(Y ~ s_wt_re1 + revis1 | dem1 + mixed1 | balanc + dem2 + mixed2, data=war1800, corr=FALSE)
StratSel
Objects
Summary function for StratSel objects which displays a table of estimated coefficients and their standard errors.
## S3 method for class 'StratSel'
summary(object, ...)
object |
An object of class |
... |
... |
See the StratSel help file for an example.
Lucas Leemann [email protected]
StratSel
Generic vcov function for objects of class StratSel.
## S3 method for class 'StratSel'
vcov(object,...)
object |
An object of class |
... |
additional arguments. |
Lucas Leemann [email protected]
This is a subset (only some variables included) of the data set which is also included in the package games. The data set can also be used to replicate the example that is provided in Leemann (2014) illustrating the strategic selection estimator. It is a data set of militarized international disputes between 1816 and 1899.
data(war1800)
A data frame with 313 observations on the following 10 variables.
esc
a numeric vector
war
a numeric vector
dem1
a numeric vector
mixed1
a numeric vector
dem2
a numeric vector
mixed2
a numeric vector
s_wt_re1
a numeric vector
revis1
a numeric vector
balanc
a numeric vector
Y
a numeric vector
This data set is taken from the package games.
Daniel M. Jones, Stuart A. Bremer and J. David Singer. 1996. "Militarized Interstate Disputes, 1816-1992: Rationale, Coding Rules, and Empirical Patterns." Conflict Management and Peace Science 15(2): 163–213.
Lucas Leemann. 2014. "Strategy and Sample Selection - A Strategic Selection Estimator", Political Analysis 22: 374-397.
data(war1800)
summary(war1800)