Linear Regression -- Normal Equation

Computer Methods in Chemical Engineering


Problem Statement: Write a subroutine normal(...) that returns a set of coefficients that best describe the relationship between a set of data representing independent variables X and a set of data representing dependent variables Y. Specifically, the subroutine should calculate the following (which we will derive later in the semester).

a = (X^T X)^{-1} X^T Y
In essence, the subroutine normal will call the following subroutines to handle the above equation:
  1. a subroutine to perform matrix transpose (you figure this out; it should be easy)
  2. a subroutine to perform matrix multiplication (see product.htm)
  3. a subroutine to perform matrix inverse (from one of the libraries)
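
The three building blocks and the way normal combines them can be sketched as follows. This is a minimal illustration in Python, not the course's reference solution: matrices are assumed to be stored as lists of rows, the names transpose, product, and inverse are illustrative, and the Gauss-Jordan inverse is only a stand-in for whichever library routine you actually use.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def product(a, b):
    """Return the matrix product a*b (inner dimensions must agree)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting.

    Stand-in for the library routine mentioned in the assignment.
    """
    n = len(a)
    # Augment a with the identity matrix: [a | I].
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Choose the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (near-zero pivot)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


def normal(x, y):
    """Return a = (X^T X)^{-1} X^T Y as an n-by-1 matrix."""
    xt = transpose(x)
    return product(inverse(product(xt, x)), product(xt, y))


# Small check on made-up data that lies exactly on y = 1 + 2*t.
x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # column of ones, then t
y = [[1.0], [3.0], [5.0], [7.0]]
a = normal(x, y)
print(a)  # approximately [[1.0], [2.0]]
```

Forming the inverse explicitly follows the equation as written; for ill-conditioned data, solving the linear system directly is usually the more robust choice.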
Provide a main program that does the following:
  1. read the X and Y matrices from a file
  2. call the normal subroutine to find the set of coefficients a
  3. print out the answer
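
One possible shape for the main program is sketched below. Several things here are assumptions rather than facts about the course files: the layout of the data file (whitespace-separated numbers, Y in the first column, X in the remaining columns, including a column of ones for the intercept), the helper name read_xy, and the choice to solve (X^T X) a = X^T Y by elimination instead of calling a separate inverse routine, which keeps the example short and self-contained.

```python
import os
import tempfile


def read_xy(path, y_col=0):
    """Read whitespace-separated numeric rows; column y_col is Y, the rest is X."""
    x, y = [], []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and comment lines
            values = [float(v) for v in fields]
            y.append([values[y_col]])
            x.append(values[:y_col] + values[y_col + 1:])
    return x, y


def normal(x, y):
    """Solve (X^T X) a = X^T Y by Gauss-Jordan elimination (no explicit inverse)."""
    n = len(x[0])
    # Build the augmented system [X^T X | X^T Y].
    aug = [[sum(r[i] * r[j] for r in x) for j in range(n)]
           + [sum(r[i] * yr[0] for r, yr in zip(x, y))] for i in range(n)]
    for c in range(n):
        # Partial pivoting: bring the largest entry in column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    # The last column of the reduced system holds the coefficients.
    return [[row[n]] for row in aug]


def main(path):
    """Read X and Y, compute the coefficients, and print them."""
    x, y = read_xy(path)
    a = normal(x, y)
    for i, (coef,) in enumerate(a, start=1):
        print(f"a{i}={coef: .4E}")
    return a


# Demo on a throwaway file of made-up numbers: y, a column of ones, then t.
demo = "1.0 1.0 0.0\n3.0 1.0 1.0\n5.0 1.0 2.0\n7.0 1.0 3.0\n"
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as f:
    f.write(demo)
coefs = main(f.name)
os.remove(f.name)
```

For the real test case, main would be called with the path to ranking.dat once its actual column layout is confirmed.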
With this problem, you practice combining several subroutines to perform a given task.

Test case: Data from ranking.dat, the US News and World Report ranking of engineering schools in the US. Treat the column of overall scores as Y (a 50x1 matrix) and the remaining columns of statistical facts as X (a 50x6 matrix). The resulting coefficients are:
  a1=-9.2026E+00 intercept
  a2= 4.2360E-03 total enrollment
  a3= 9.9039E-02 research $
  a4= 1.9728E+00 student/faculty ratio
  a5=-1.2351E-01 acceptance rate
  a6= 1.1434E-01 quantitative GRE score
Some examples of the types of problems this program applies to:
  1. how reactor productivity depends on temperature, pressure, pH, flow rate, operator experience, etc.
  2. how the national departmental ranking depends on the size of the faculty, the number of various degrees awarded, the number of publications, research dollars, size of endowment, budget, year of founding, etc.
  3. how a person's weight depends on the daily consumption of various types of foods, the extent of exercise, income, race, etc.
  4. how a student's grade depends on the number of hours of study, a person's IQ, sex, income, age, weekly beer consumption, etc.
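
For any of these problems, the first step is to arrange the observations into the X and Y matrices. The sketch below shows one way to do that for numeric factors, with a leading column of ones so that the first coefficient is the intercept. The function name design_matrix, the field names, and the reactor numbers are all made up for illustration.

```python
def design_matrix(records, factors, response):
    """Return (X, Y): X has a leading column of ones, then one column per factor."""
    x = [[1.0] + [float(rec[name]) for name in factors] for rec in records]
    y = [[float(rec[response])] for rec in records]
    return x, y


# Made-up reactor runs; the field names and values are purely illustrative.
runs = [
    {"temperature": 300.0, "pressure": 1.0, "productivity": 12.1},
    {"temperature": 310.0, "pressure": 1.5, "productivity": 14.8},
    {"temperature": 320.0, "pressure": 1.2, "productivity": 15.9},
]
x, y = design_matrix(runs, ["temperature", "pressure"], "productivity")
print(x[0])  # [1.0, 300.0, 1.0]
print(y[0])  # [12.1]
```

The resulting X and Y can be passed directly to a normal(...) routine that expects lists of rows.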

Solution:


Return to Computer Methods in Chemical Engineering (ENCH250)

Forward comments to:
Nam Sun Wang
Department of Chemical & Biomolecular Engineering
University of Maryland
College Park, MD 20742-2111
301-405-1910 (voice)
301-314-9126 (FAX)
e-mail: nsw@umd.edu
©1996-2006 by Nam Sun Wang