Estimating the Logit and Probit Models with Mathematica
Author
Aylin Aktukun
Title
Estimating the Logit and Probit Models with Mathematica
Description
In this work, I introduce the features of my package BinaryResponse.m with a short review of Logit and Probit models. The package calculates the maximum likelihood estimates of parameters, the asymptotic standard errors, t-stats and p-values.
Category
Educational Materials
Keywords
URL
http://www.notebookarchive.org/2018-10-10qu4lt/
DOI
https://notebookarchive.org/2018-10-10qu4lt
Date Added
2018-10-02
Date Last Modified
2018-10-02
File Size
57.46 kilobytes
Supplements
Rights
Redistribution rights reserved




Wolfram Library Archive
Estimating the Logit and Probit Models with Mathematica
L. Aylin Aktukun
Istanbul University
Department of Econometrics
aylin@istanbul.edu.tr
Abstract: In this work, I introduce the features of my package BinaryResponse.m with a short review of Logit and Probit models. The package calculates the maximum likelihood estimates of parameters, the asymptotic standard errors, t-stats and p-values.
Keywords: Binary Response, Logit, Probit, Mathematica 6.0.
1. Introduction
Consider the linear regression model of the form
$$y = X\beta + \epsilon \tag{1}$$
where y is an n×1 vector of responses, X is an n×(k+1) matrix of regressors with full column rank (k+1), β is a (k+1)×1 vector of parameters and ϵ is an n×1 vector of random errors. We write (k+1) rather than k to make explicit the inclusion of an intercept term in the multiple regression. The main objective is to estimate β. Although the ordinary least squares (OLS) method is commonly used for this purpose, it is generally not appropriate when the response variable is discrete. The most commonly encountered type of discrete response variable is a binary one. In a binary response model, the response variable Y takes only two values, 0 or 1, so that
$$\mathrm{Prob}(Y=1) = F(X\beta), \qquad \mathrm{Prob}(Y=0) = 1 - F(X\beta) \tag{2}$$
where F is a function, typically a cumulative distribution function (CDF), that maps Xβ to a probability. One possibility for estimating β is a linear regression, which is called the linear probability model:
$$F(X,\beta) = X\beta \tag{3}$$
However, the linear probability model has a number of weaknesses. Its major shortcoming is that it does not constrain the predicted value of the response variable Y to lie between 0 and 1, so it can also produce negative variances. Another problem is that the model is heteroscedastic. Choosing a continuous distribution for F, however, can overcome these problems. When F is chosen as the cumulative standard normal distribution function, the model is called probit. If the logistic function is preferred for F, the model is termed logit. The details of these models can be found in many econometrics textbooks. Gujarati (1995) provides a good starting point. For theoretical derivations of these two models and of other limited dependent variable models, see, for instance, Greene (1997) and Davidson and MacKinnon (2004).
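The range problem described above can be illustrated with a minimal Python sketch. The index values are hypothetical numbers chosen for illustration, not estimates from any data set; the sketch only compares the three choices of F and the implied Bernoulli variance p(1-p).

```python
import math

def linear_f(z):
    # Linear probability model: F(z) = z, not restricted to [0, 1].
    return z

def logit_f(z):
    # Logistic CDF.
    return 1.0 / (1.0 + math.exp(-z))

def probit_f(z):
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical values of the index X*beta (illustration only).
index_values = [-0.5, 0.2, 0.8, 1.3]

table = []
for z in index_values:
    row = {"index": z}
    for name, F in (("linear", linear_f), ("logit", logit_f), ("probit", probit_f)):
        p = F(z)
        # Store (predicted probability, implied Bernoulli variance p(1-p)).
        row[name] = (p, p * (1.0 - p))
    table.append(row)

for row in table:
    print(row)
```

Whenever the index falls outside [0, 1], the linear specification returns a "probability" outside the unit interval and therefore a negative value of p(1-p), while the logit and probit values always stay strictly between 0 and 1.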
2. Estimation
Maximum likelihood estimation (MLE) is the most common way to estimate binary response models. Because the response variable is discrete, the likelihood function cannot be defined as a joint density function. Instead, it is defined as the probability that the observed values are realized. The loglikelihood function can be written as (see, for instance, Davidson and MacKinnon (2004))
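$$\log L(y,\beta) = \sum_{i=1}^{n}\left[\, y_i \log F(X_i\beta) + (1-y_i)\log\bigl(1-F(X_i\beta)\bigr) \right] \tag{4}$$
where $y_i$ is the i-th response and $X_i$ is the i-th row of X.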
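As a small illustration of Eq. 4, the following Python sketch evaluates the loglikelihood for a given β and a given CDF. The toy data, the trial parameter vector and the function names are hypothetical and serve only to show the arithmetic.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def loglik(beta, rows, y, cdf):
    """Eq. 4: sum of y*log F(x'b) + (1-y)*log(1 - F(x'b)) over observations."""
    total = 0.0
    for x, yi in zip(rows, y):
        z = sum(b * v for b, v in zip(beta, x))  # index x_i * beta
        p = cdf(z)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical toy data: an intercept column and one regressor.
rows = [[1.0, 0.5], [1.0, 1.2], [1.0, 2.0], [1.0, 2.7], [1.0, 3.1], [1.0, 3.8]]
y = [0, 0, 1, 0, 1, 1]

ll_zero = loglik([0.0, 0.0], rows, y, logistic_cdf)    # every F equals 0.5
ll_trial = loglik([-3.0, 1.3], rows, y, logistic_cdf)  # an arbitrary trial beta
print(ll_zero, ll_trial)
```

At β = 0 every fitted probability is 0.5, so the loglikelihood equals n·log(0.5). A parameter vector that fits the data better gives a larger (less negative) value.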
The loglikelihood function is globally concave when log F(Xβ) and log(1-F(Xβ)) are concave functions; see Pratt (1981). For the probit model, F(x) is the cumulative standard normal distribution function:
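$$F(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^{2}/2}\,dt = \frac{1}{2}\left[1 + \operatorname{Erf}\!\left(\frac{x}{\sqrt{2}}\right)\right] \tag{5}$$
and for the logit model F(x) is the cumulative distribution function of the logistic distribution: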
$$F(x) = \int_{-\infty}^{x} \frac{e^{-t}}{(1+e^{-t})^{2}}\,dt = \frac{1}{1+e^{-x}} \tag{6}$$
In general, the logit and probit models give very similar predicted probabilities, and hence the maximized values of their likelihood functions can be very close. However, since the variance of the logistic distribution (π²/3) is larger than that of the standard normal distribution, which is unity, the estimates of the logit model tend to be larger in absolute value than those of the probit model. This difference can be seen from the plots of the cumulative distribution functions of the logistic and standard normal distributions in Fig. 1. Some writers suggest that the logistic CDF should be rescaled; see Amemiya (1981) and Greene (1997) for a discussion of this issue.
Figure 1: Cumulative distribution functions of the logistic and standard normal distributions.
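The comparison shown in the figure can also be made numerically. The sketch below is written in Python rather than Mathematica and is not the notebook's plotting code. It measures the largest gap between the two CDFs on a grid, first for the standard logistic distribution and then for a logistic distribution rescaled to unit variance. The unit-variance scale √3/π is one possible rescaling, used here as an assumption for illustration; the constants proposed in the cited literature may differ.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logistic_cdf(x, scale=1.0):
    # Logistic CDF with the given scale; its variance is scale^2 * pi^2 / 3.
    return 1.0 / (1.0 + math.exp(-x / scale))

# Evaluation grid from -6 to 6 in steps of 0.01.
grid = [i / 100.0 for i in range(-600, 601)]

# Scale that gives the logistic distribution variance 1 (assumption for illustration).
unit_var_scale = math.sqrt(3.0) / math.pi

gap_raw = max(abs(normal_cdf(x) - logistic_cdf(x)) for x in grid)
gap_rescaled = max(abs(normal_cdf(x) - logistic_cdf(x, unit_var_scale)) for x in grid)

print("largest gap, standard logistic:     ", round(gap_raw, 4))
print("largest gap, unit-variance logistic:", round(gap_rescaled, 4))
```

The gap shrinks considerably after rescaling, which is why logit and probit fits usually give similar predicted probabilities even though their coefficients are on different scales.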
3. Inference
Since the first derivative of the loglikelihood function defined in Eq. 4 is non-linear in β, an iterative procedure such as Newton's method is needed to obtain the estimate of β. The standard errors are also non-linear functions, so they are generally computed with the delta method, which is essentially a linear approximation; for a simple definition, see Bain and Engelhardt (1992). The standard errors calculated by the delta method, however, are only asymptotically valid and hence are not exact in finite samples. It can be shown that the delta method gives the following formula for the approximate covariance matrix (see Davidson and MacKinnon (1993)):
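$$\mathrm{Var}(\hat\beta) = \left(X^{\top}\gamma(\hat\beta)\,X\right)^{-1} \tag{7}$$
Here, $\gamma(\hat\beta)$ is an n×n diagonal matrix whose i-th diagonal element is defined as
$$\gamma_i(\hat\beta) = \frac{f^{2}(X_i\hat\beta)}{F(X_i\hat\beta)\bigl(1-F(X_i\hat\beta)\bigr)} \tag{8}$$
where $X_i$ is the i-th row of X and f is the standard normal density function for the probit model and the logistic density function for the logit model.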
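As a worked check of Eq. 8 for the logit model, write $X_i$ for the i-th row of X and $\gamma_i$ for the weight attached to observation i. The derivation below uses only the logistic CDF and density from Eq. 6 and shows that the weight simplifies, because the logistic density satisfies f = F(1-F):

```latex
\begin{aligned}
f(x) &= \frac{e^{-x}}{(1+e^{-x})^{2}}
      = \frac{1}{1+e^{-x}}\cdot\frac{e^{-x}}{1+e^{-x}}
      = F(x)\,\bigl(1-F(x)\bigr),\\[4pt]
\gamma_i(\hat\beta) &= \frac{f^{2}(X_i\hat\beta)}{F(X_i\hat\beta)\bigl(1-F(X_i\hat\beta)\bigr)}
      = F(X_i\hat\beta)\,\bigl(1-F(X_i\hat\beta)\bigr).
\end{aligned}
```

For the logit model the weights in Eq. 7 are therefore just the fitted Bernoulli variances. No comparable simplification holds for the probit model, where both the normal density and the normal CDF remain in the weight.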
4. The package: BinaryResponse
The package BinaryResponse.m should be put in $AddOnsDirectory/Applications of Mathematica's main directory. It is written for Mathematica 6.0. The package BinaryResponse.m can be loaded with:
In[]:=
<<BinaryResponse.m
The package contains two functions, Logit[data] and Probit[data]. As the names suggest, Logit[data] estimates the logit model and Probit[data] estimates the probit model. Both functions give the maximum likelihood estimates of the parameters, the asymptotic standard errors, t-statistics and p-values. After entering your data, type the function and evaluate the cell. The package will then give all the summary statistics for the maximum likelihood estimation. To test the package, we use the data set in Greene (1997); in this example the regressors (GPA, TUCE, PSI) come first and the binary response (GRADE) is the last column of data:
In[]:=
GPA = {2.66, 2.89, 3.28, 2.92, 4.00, 2.86, 2.76, 2.87, 3.03, 3.92, 2.63, 3.32, 3.57, 3.26, 3.53, 2.74, 2.75, 2.83, 3.12, 3.16, 2.06, 3.62, 2.89, 3.51, 3.54, 2.83, 3.39, 2.67, 3.65, 4.00, 3.10, 2.39};
TUCE = {20, 22, 24, 12, 21, 17, 17, 21, 25, 29, 20, 23, 23, 25, 26, 19, 25, 19, 23, 25, 22, 28, 14, 26, 24, 27, 17, 24, 21, 23, 21, 19};
PSI = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
GRADE = {0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1};
data = Transpose[{GPA, TUCE, PSI, GRADE}];
In[]:=
Logit[data]
*** Maximum Likelihood Estimates of Parameters for Logit Model ***
Parameters | MLE Estim. | Asymp. SE. | t Stat | P-Values
In[]:=
Probit[data]
*** Maximum Likelihood Estimates of Parameters for Probit Model ***
Parameters | MLE Estim. | Asymp. SE. | t Stat | P-Values
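For readers who want to cross-check the estimates independently of the package, the following Python sketch fits both models to the same data by Fisher scoring, using the weights of Eq. 8 and the covariance formula of Eq. 7. It is not the code of BinaryResponse.m. The added intercept column, the function names, the starting value β = 0 and the use of the standard normal distribution for two-sided p-values are all assumptions made for this illustration.

```python
import math

# Data set from the text (Greene 1997).
GPA = [2.66, 2.89, 3.28, 2.92, 4.00, 2.86, 2.76, 2.87, 3.03, 3.92, 2.63,
       3.32, 3.57, 3.26, 3.53, 2.74, 2.75, 2.83, 3.12, 3.16, 2.06, 3.62,
       2.89, 3.51, 3.54, 2.83, 3.39, 2.67, 3.65, 4.00, 3.10, 2.39]
TUCE = [20, 22, 24, 12, 21, 17, 17, 21, 25, 29, 20, 23, 23, 25, 26, 19,
        25, 19, 23, 25, 22, 28, 14, 26, 24, 27, 17, 24, 21, 23, 21, 19]
PSI = [0] * 18 + [1] * 14
GRADE = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
         0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1]

# Assumption: an intercept column is prepended (the k+1 convention of Eq. 1).
X = [[1.0, g, float(t), float(p)] for g, t, p in zip(GPA, TUCE, PSI)]

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)

def normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def fit(rows, y, cdf, pdf, tol=1e-10, max_iter=200):
    """Fisher scoring: beta <- beta + (X' gamma X)^(-1) * score."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            z = sum(b * v for b, v in zip(beta, x))
            F = min(max(cdf(z), 1e-12), 1.0 - 1e-12)  # guard against log/division issues
            w = pdf(z) / (F * (1.0 - F))
            gamma = pdf(z) * w                        # Eq. 8: f^2 / (F (1 - F))
            for a in range(k):
                score[a] += x[a] * (yi - F) * w
                for b in range(k):
                    info[a][b] += x[a] * gamma * x[b]
        cov = invert(info)                            # Eq. 7 at the current beta
        step = [sum(cov[a][b] * score[b] for b in range(k)) for a in range(k)]
        if max(abs(s) for s in step) < tol:
            break
        beta = [b + s for b, s in zip(beta, step)]
    se = [math.sqrt(cov[a][a]) for a in range(k)]
    return beta, se

def report(title, beta, se):
    print(title)
    for name, b, s in zip(["Constant", "GPA", "TUCE", "PSI"], beta, se):
        t = b / s
        p = math.erfc(abs(t) / math.sqrt(2.0))  # two-sided p-value, normal (assumption)
        print(f"{name:<9}{b:>10.4f}{s:>10.4f}{t:>9.3f}{p:>9.4f}")

logit_beta, logit_se = fit(X, GRADE, logistic_cdf, logistic_pdf)
probit_beta, probit_se = fit(X, GRADE, normal_cdf, normal_pdf)
report("Logit", logit_beta, logit_se)
report("Probit", probit_beta, probit_se)
```

For the logit model Fisher scoring coincides with Newton's method, because the weights of Eq. 8 reduce to F(1-F). For the probit model the standard errors come from the expected information of Eq. 7 and may differ slightly from those of software that uses the observed Hessian.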
5. Acknowledgements
I thank my colleague Enis Siniksaran for his helpful comments and suggestions.
References
Amemiya, T. (1981) "Qualitative response models: a survey", Journal of Economic Literature, 19, 1483-1536.
Bain, L. and Engelhardt, M. (1992) "Introduction to Probability and Mathematical Statistics", Duxbury Classic Series.
Davidson, R. and MacKinnon, J. G. (2004) "Econometric Theory and Methods", Oxford University Press.
Greene, W. H. (1997) "Econometric Analysis", Prentice Hall, 3rd edition.
Gujarati, D. N. (1995) "Basic Econometrics", McGraw-Hill, 3rd edition.
Pratt, J. W. (1981) "Concavity of the loglikelihood", Journal of the American Statistical Association, 76, 103-106.


Cite this as: Aylin Aktukun, "Estimating the Logit and Probit Models with Mathematica" from the Notebook Archive (2008), https://notebookarchive.org/2018-10-10qu4lt


