Measurement Theory UGA, Homework Problem Number 5

LEGACY CONTENT. If you are looking for Voteview.com, PLEASE CLICK HERE

This site is an archived version of Voteview.com archived from University of Georgia on May 23, 2017. This point-in-time capture includes all files publicly linked on Voteview.com at that time. We provide access to this content as a service to ensure that past users of Voteview.com have access to historical files. This content will remain online until at least January 1st, 2018. UCLA provides no warranty or guarantee of access to these files.

Homework 5, POLS 8505: MEASUREMENT THEORY
Due 21 September 2011

Modify the program you used to answer Problem 2 of Homework 4 to graph the eigenvalues of the Hessian matrix. In particular, use the code shown in the posted example double_center_seven_points_eigen.r -- Simple R Program that Illustrates Double-Centering and Graphs the Eigenvalues of the Double-Centered Matrix (uses seven_points_example.txt). Note that you will need to use the windows() command before the plot of the 12 nations and the plot of the eigenvalues. Remember that the number of eigenvalues is greater than 12! Turn in NEATLY FORMATTED versions of both plots.
Modify the program you used to answer Problem 3 of Homework 4 to graph the eigenvalues of the Hessian matrix.
1. Again, turn in NEATLY FORMATTED versions of both plots.
2. How does the plot of the eigenvalues in this question differ from the first question?

Setting one country at the origin will eliminate two of the zero eigenvalues of the Hessian matrix. Again, because we are working with relational data -- that is, data that can be interpreted as distances between points -- one point can be fixed at the origin without affecting the estimation of the other points because we are trying to reproduce distances. It is also possible to fix one coordinate of a second point at 0.0. The same distances produced by moving all the points vis a vis one another can be generated by fixing one point at the origin and one coordinate of another point and simply moving the remaining points vis a vis the fixed point and the second point which is free to move back and forth on one of the dimensions. In this problem we will fix the U.S.S.R. at the origin as we have done before but we will also fix the U.S.A. second dimension coordinate at 0.0. Download the R program:

metric_mds_nations_ussr_usa.r -- Program to run Metric MDS on a matrix of Similarities/Dissimilarities with one point fixed at the origin

#
# nations_mds_nations_ussr_usa.r -- Does Metric MDS
#
# Wish 1971 nations similarities data
# Title: Perceived similarity of nations
#
# Description: Wish (1971) asked 18 students to rate the global
# similarity of different pairs of nations such as 'France and China' on
# a 9-point rating scale ranging from `1=very different' to `9=very
# similar'. The table shows the mean similarity ratings. 
#
# Brazil      9.00 4.83 5.28 3.44 4.72 4.50 3.83 3.50 2.39 3.06 5.39 3.17
# Congo       4.83 9.00 4.56 5.00 4.00 4.83 3.33 3.39 4.00 3.39 2.39 3.50
# Cuba        5.28 4.56 9.00 5.17 4.11 4.00 3.61 2.94 5.50 5.44 3.17 5.11
# Egypt       3.44 5.00 5.17 9.00 4.78 5.83 4.67 3.83 4.39 4.39 3.33 4.28
# France      4.72 4.00 4.11 4.78 9.00 3.44 4.00 4.22 3.67 5.06 5.94 4.72
# India       4.50 4.83 4.00 5.83 3.44 9.00 4.11 4.50 4.11 4.50 4.28 4.00
# Israel      3.83 3.33 3.61 4.67 4.00 4.11 9.00 4.83 3.00 4.17 5.94 4.44
# Japan       3.50 3.39 2.94 3.83 4.22 4.50 4.83 9.00 4.17 4.61 6.06 4.28
# China       2.39 4.00 5.50 4.39 3.67 4.11 3.00 4.17 9.00 5.72 2.56 5.06
# USSR        3.06 3.39 5.44 4.39 5.06 4.50 4.17 4.61 5.72 9.00 5.00 6.67
# USA         5.39 2.39 3.17 3.33 5.94 4.28 5.94 6.06 2.56 5.00 9.00 3.56
# Yugoslavia  3.17 3.50 5.11 4.28 4.72 4.00 4.44 4.28 5.06 6.67 3.56 9.00
#
rm(list=ls(all=TRUE))
library(MASS)
library(stats)
#
#  &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
#    FUNCTION CALLED BY OPTIM(...) Below
#
# *** Calculate Log-Likelihood -- In this case simple SSE ***
# **************************************************************
fr <- function(zzz){
#
xx[1:18] <- zzz[1:18]
xx[21] <- zzz[19]        
xx[19:20] <- 0.0         USSR Set to Origin
xx[22] <- 0.0            USA 2nd Dimension Coordinate Set to 0.0
xx[23:24] <- zzz[20:21]
#
i <- 1
sse <- 0.0
while (i <= np) {
  ii <- 1
  while (ii <= i) {
   j <- 1
   dist_i_ii <- 0.0
   while (j <= ndim) {
#
#  Calculate distance between points
#
       dist_i_ii <- dist_i_ii + (xx[ndim*(i-1)+j]-xx[ndim*(ii-1)+j])*(xx[ndim*(i-1)+j]-xx[ndim*(ii-1)+j])
       j <- j + 1
       }
      sse <- sse + (sqrt(TT[i,ii]) - sqrt(dist_i_ii))*(sqrt(TT[i,ii]) - sqrt(dist_i_ii))
      ii <- ii + 1
     }
   i <- i + 1
   }
return(sse)
}
#
# &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
#
#  Read Nations Similarities Data
#
T <- matrix(scan("C:/uga_course_homework_5/nations.txt",0),ncol=12,byrow=TRUE)
#
#
np <- length(T[,1])
ndim <- 2
nparam <- np*ndim - 3    The number of parameters is 24-3=21
#
names <- NULL
names[1] <- "Brazil"
names[2] <- "Congo"
names[3] <- "Cuba"
names[4] <- "Egypt"
names[5] <- "France"
names[6] <- "India"
names[7] <- "Israel"
names[8] <- "Japan"
names[9] <- "China"
names[10] <- "USSR"
names[11] <- "USA"
names[12] <- "Yugo"
#
#
# pos -- a position specifier for the text. Values of 1, 2, 3 and 4, 
# respectively indicate positions below, to the left of, above and 
# to the right of the specified coordinates 
#
namepos <- rep(4,np)
#namepos[1] <- 4    # Brazil
#namepos[2] <- 4    # Congo
#namepos[3] <- 4    # Cuba
#namepos[4] <- 4    # Egypt
#namepos[5] <- 4    # France
#namepos[6] <- 4    # India
#namepos[7] <- 4    # Israel
#namepos[8] <- 4    # Japan
#namepos[9] <- 4    # China
#namepos[10] <- 4   # USSR
#namepos[11] <- 4   # USA
#namepos[12] <- 4   # Yugoslavia
#
#  Set Parameter Values
#
zz <- rep(0,np*ndim)
dim(zz) <- c(np*ndim,1)
xx <- rep(0,np*ndim)
dim(xx) <- c(np*ndim,1)
#
# Set up for Double-Centering
#
TT <- rep(0,np*np)
dim(TT) <- c(np,np)
TTT <- rep(0,np*np)
dim(TTT) <- c(np,np)
#
xrow <- NULL
xcol <- NULL
xcount <- NULL
matrixmean <- 0
#
# Transform the Matrix
#
i <- 0
while (i < np) {
  i <- i + 1
  xcount[i] <- i
  j <- 0
  while (j < np) {
     j <- j + 1
#
#  This is the Normal Transformation
#
     TT[i,j] <- ((9 - T[i,j]))**2
#
#  But the R subroutine wants simple Distances!
#
     TTT[i,j] <- (9 - T[i,j])
#
  }
}
#
#  Call Double Center Routine From R Program 
#     cmdscale(....) in stats library
#     The Input data are DISTANCES!!!  Not Squared Distances!!
#     Note that the R routine does not divide
#     by -2.0
#
#
dcenter <- cmdscale(TTT,ndim, eig=FALSE,add=FALSE,x.ret=TRUE)
#
#  returns double-center matrix as dcenter$x if x.ret=TRUE
#
#  Do the Division by -2.0 Here
#
TTT <- (dcenter$x)/(-2.0)
#
#  Perform Eigenvalue-Eigenvector Decomposition of Double-Centered Matrix
#
#  ev$values are the eigenvectors
#  ev$vectors are the eigenvectors
#
ev <- eigen(TTT)
#  The Two Lines Below Put the Eigenvalues in a
#    Diagonal Matrix -- The first one creates an 
#    identity matrix and the second command puts
#    the singular values on the diagonal
#
Lambda <- diag(np)
diag(Lambda) <- ev$val
#
#  Compute U*LAMBDA*U' for check below
#
#
XX <- ev$vec %*% Lambda %*% t(ev$vec)
#
#
# Compute Fit of decomposition -- This is just the sum of squared
#  error -- Note that xfit should be zero!
#
xfit <- sum(abs(TTT-XX))   Note that I have simplified the code here
#
#  Use Eigenvectors for Starts
#
i <- 1
while (i <= np) 
  {
    j <- 1
    while (j <= ndim)
     { 
       zz[ndim*(i-1)+j]=ev$vector[i,j]
       j <- j +1
     }
  i <- i + 1
}
#
#  CONSTRAIN USSR TO THE ORIGIN -- entries 19 and 20
#
zzz <- rep(0,nparam)
dim(zzz) <- c(nparam,1)
zzz[1:18] <- zz[1:18]
zzz[19] <- zz[21]
#
#  CONSTRAIN 2nd DIMENSION USA TO 0.0  entry 22
#
zzz[20:21] <- zz[23:24]
#
# *******************************************
#  DO MAXIMUM LIKELIHOOD MAXIMIZATION HERE
# *******************************************
#
model_nd <- optim(zzz,fr,hessian=TRUE,control=list(maxit=50000))
model_sann <- optim(model_nd$par,fr,method="SANN",hessian=TRUE,control=list(maxit=50000, temp=20))
model <- optim(model_sann$par,fr,method="BFGS",hessian=TRUE)
#
#  Log-Likelihood (inverse -- optim minimizes!!)
#
logLmax <- model$value
#
#  Parameter Estimates
#
rhomax <- model$par
#
# convergence an integer code. 
#     0 indicates successful convergence. 
#         Error codes are
#     1 indicates that the iteration limit maxit had been reached.
#    10 indicates degeneracy of the Nelder–Mead simplex.
#    51 indicates a warning from the "L-BFGS-B" method; see component message for further details.
#    52 indicates an error from the "L-BFGS-B" method; see component message for further details. 
#
nconverge <- model$convergence
#
# counts 
#  A two-element integer vector giving the number of calls to 
#    fn and gr respectively. 
#
ncounts <- model$counts
#
xhessian <- model$hessian
#
#  Check Rank of Hessian Matrix
#
evv <- eigen(xhessian)
#
#  evv$values -- shows the eigenvalues
#
LambdaInv <- diag(nparam)
diag(LambdaInv) <- 1/evv$val
#
#  Compute U*[(LAMBDA)-1]*U' for check below
#
#
XXInv <- evv$vec %*% LambdaInv %*% t(evv$vec) Note that this is the Inverse of the Hessian
#
#
results <- rep(0,nparam*4)
dim(results) <- c(nparam,4)
#
results[,1] <- rhomax
results[,2] <- sqrt(diag(XXInv))            Standard Error
results[,3] <- rhomax/sqrt(diag(XXInv))     T Value
results[,4] <- pt(-abs(results[,3]),((np*(np-1))/2)-nparam-1)*2 Two-tail p-value
#
#
#  Put Coordinates into matrix for plot purposes
#
xmetric <- rep(0,np*ndim)
dim(xmetric) <- c(np,ndim)
#
i <- 1
while (i <= np) 
  {
    j <- 1
    while (j <= ndim)
     { 
       if(i < 10){
              xmetric[i,j] <- rhomax[ndim*(i-1)+j]
       }
       j <- j +1
     }
       if(i==10){
            xmetric[i,1] <- 0.0       Note the Kludge Here
            xmetric[i,2] <- 0.0       This very inelegant
          }
       if(i==11){
            xmetric[i,1] <- rhomax[19]
            xmetric[i,2] <- 0.0
          }
       if(i==12){
            xmetric[i,1] <- rhomax[20]
            xmetric[i,2] <- rhomax[21]
          }
  i <- i + 1
}
#
#  Compute R-Square between True vs. Reproduced Distances
# 
aaa <- NULL
bbb <- NULL
i <- 0
j <- 0
kk <- 0
while (i < np) {
  i <- i + 1
  j <- i
  while (j < np) {
     j <- j + 1
     kk <- kk + 1
     aaa[kk] <- sqrt(TT[i,j])
     sum <- 0.0
     k <- 0
     while (k < ndim){
       k <- k + 1
       sum <- sum + (xmetric[i,k]-xmetric[j,k])*(xmetric[i,k]-xmetric[j,k])
     }
     bbb[kk] <- sqrt(sum)
  }
}
rfit <- cor(aaa,bbb)
xmax <- max(xmetric)
xmin <- min(xmetric)
#
plot(xmetric[,1],xmetric[,2],type="n",asp=1,
       main="",
       xlab="",
       ylab="",
       xlim=c(xmin,xmax),ylim=c(xmin,xmax),font=2)
points(xmetric[,1],xmetric[,2],pch=16,col="red")
axis(1,font=2)
axis(2,font=2,cex=1.2)
#
# Main title
mtext("12 Nations Data",side=3,line=1.50,cex=1.2,font=2)
# x-axis title
#mtext("Liberal-Conservative",side=1,line=2.75,cex=1.2)
# y-axis title
#mtext("Region (Civil Rights)",side=2,line=2.5,cex=1.2)
#
text(xmetric[,1],xmetric[,2],names,pos=namepos,offset=00.00,col="blue",font=2)

Run metric_mds_nations_ussr_usa.r and turn in a NEATLY FORMATTED plot of the nations.
Report xfit and rfit and a NEATLY FORMATTED listing of results.
Add the code to produce a graph of the eigenvalues of the Hessian matrix. Turn in NEATLY FORMATTED versions of this plot with appropriate labeling.

Let A be the matrix of Morse Code two dimensional coordinates from the Lower Half of the Similarities matrix that you analyzed in Question 4 of Homework 1 (recall you plotted these!) after subtracting off the column means, and let B be the matrix of legislator coordinates from the Upper Half of the Similarities Matrix after subtracting off the column means. Note that subtracting off the column means of both matrices centers both at the origin, (0.0, 0.0). Solve for the orthogonal procrustes rotation matrix, T, for B. Namely, we want to minimize:

L(T) = tr(A - BT)(A - BT)'

The solution is:

T = VU' where

A'B = ULV'

where ULV' is the Singular Value Decomposition of A'B (see Borg and Groenen, pp. 430-432).

In R you can perform the decompostion with the svd command. For example:

C <- t(A)%*%B
svddecomp <- svd(C)

svddecomp$u has the matrix U
svddecomp$v has the matrix V
svddecomp$d has the diagonal of L

Note that you can check your work as we discussed in class by doing the following:

D <- diag(svddecomp$d)
U <- svddecomp$u
V <- svddecomp$v
ABCHECK <- U%*%D%*%t(V)
errorcheck <- sum((C-ABCHECK)^2)

Solve for T and turn in a neatly formatted listing. Compute the Pearson r-squares between the corresponding columns of A and B before and after rotating B.

Do a two panel graph with the two plots side-by-side -- A on the left and B on the right. Below is a code fragment showing how to do this:
# par(mfrow=c(1,2)) # # mar # A numerical vector of the form c(bottom, left, top, right) which # gives the number of lines of margin to be specified on the four # sides of the plot. The default is c(5, 4, 4, 2) + 0.1. # Note that you might want to make the right-hand side margin narrower so that the plots are closer together # You may want to experiment with the parameters until the plot looks nice! # This puts more white space # on the Right-Hand-Side Margin # par(mar=c(4.1,5.1,4.1,5.1)) # # plot(dimension1,dimension2,type="n",asp=1, main="", xlab="", ylab="", xlim=c(-2.0,2.0),ylim=c(-2.0,2.0)) points(dimension1,dimension2,pch=16,col="red") axis(1,font=2) axis(2,font=2,cex=1.2) # Main title mtext("Morse Code Signals Lower Half Matrix",side=3,line=1.00,cex=1.2,font=2) # x-axis title mtext("Best Interpretation!!",side=1,line=2.75,cex=1.2) # y-axis title mtext("Best Interpretation!!",side=2,line=2.5,cex=1.2) # # pos -- a position specifier for the text. Values of 1, 2, 3 and 4, # respectively indicate positions below, to the left of, above and # to the right of the specified coordinates # namepos <- rep(4,nrow) # # text(dimension1,dimension2,names,pos=namepos,offset=00.00,col="blue") # # Right hand Plot # plot(dimension11,dimension22,type="n",asp=1, main="", xlab="", ylab="", xlim=c(-2.0,2.0),ylim=c(-2.0,2.0)) points(dimension11,dimension22,pch=16,col="red") axis(1,font=2) axis(2,font=2,cex=1.2) # Main title mtext("Morse Code Signals Upper Half Matrix",side=3,line=1.00,cex=1.2,font=2) # x-axis title mtext("Best Interpretation!!",side=1,line=2.75,cex=1.2) # y-axis title mtext("Best Interpretation!!",side=2,line=2.5,cex=1.2) # # pos -- a position specifier for the text. Values of 1, 2, 3 and 4, # respectively indicate positions below, to the left of, above and # to the right of the specified coordinates # namepos <- rep(4,nrow) # # text(dimension11,dimension22,names,pos=namepos,offset=00.00,col="blue")

Homework 5, POLS 8505: MEASUREMENT THEORYDue 21 September 2011

Homework 5, POLS 8505: MEASUREMENT THEORY
Due 21 September 2011