Asking for help with statistics. Help explain the method and some tables.

Question

I need to present a paper on the Technology Acceptance Model tomorrow morning, and it involves some statistical methods. The non-statistical part of the paper is not hard, so I thought I could finish on time. I did not expect that it would explain nothing about its statistics! The paper seems to presume the reader already knows the method. I am not majoring in statistics, but I think I still need to explain the statistics in this paper to the audience, so I came here to ask for help.

The major problem is that I don't understand what the symbols in the following tables mean: $R^2$, $\beta$ and $p$. Is this some sort of factor analysis? That's the best I can guess, but what does "regression" mean? How come this involves regression? "Perceived Usefulness", "Perceived Ease of Use", etc., are factors of a customer's intention to use a technology.

[Image: the tables from the paper]

score 3 · Accepted Answer · answered Oct 09 '14 at 02:10

You would probably get a better response on stats.SE rather than math.SE.

That said, if you don't understand why regression was used, and the meaning of $R^2$, $\beta$, and $p$, then I will be completely honest: you lack the essential ability to interpret this analysis, and it is unlikely you will be able to explain the significance of the statistical model and its interpretation to an audience by tomorrow morning. The nature of your questions is too broad to be adequately explained in the scope of a single answer here--you would, in essence, need to take a course in applied statistical inference with a focus on simple linear regression.

In any case, here are the high-level answers to your questions, but don't expect more detailed explanation: as I said, what you are asking to understand is enough to cover an entire semester's worth of statistics.

Linear regression is used here to model a response variable under the influence of various predictor variables, or "covariates." Two separate regressions are performed: one with intention to use as the response, and another with perceived usefulness as the response. In performing such a regression analysis, we do not infer causation, only correlation: if the degree of association between the predictor(s) and the response is nontrivial, this simply demonstrates a statistically significant association between the two.

The stronger this association between the predictors and the response, the higher the value of $R^2$, which is what we call the coefficient of determination. $R^2$ is a number between $0$ and $1$, and represents the proportion of the variance of the response that can be explained by the variation in the covariates in the model. If $R^2 = 1$, the model is "perfect"--the predictors explain 100% of the variance of the response, with no error. Thus $R^2$ is a measure of the strength of association. However, there is the problem of overfitting--that is to say, if we introduce many covariates into our model to attempt to explain a finite set of observations in our data, we can create a model that has a very high $R^2$ but has little inferential utility, because the model is in some sense contrived to fit the data we observed, and not the underlying true association. For this reason, we speak of an adjusted $R^2$, sometimes written $R^2_\mathrm{adj}$, in which we "penalize" the use of too many covariates. Note that $R^2_\mathrm{adj} \le R^2$.
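To make the two quantities concrete, here is a minimal sketch in plain Python. The data are made up for illustration, the function names are my own, and the adjustment shown, $R^2_\mathrm{adj} = 1 - (1 - R^2)\frac{n-1}{n-m-1}$ for $n$ observations and $m$ covariates, is the usual textbook form; I am only assuming the paper uses the same one.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(y, fitted):
    """Proportion of the variance of y explained by the fitted values."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)                # total
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, m):
    """Penalize R^2 for using m covariates on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)


# Made-up data: y is roughly 2 * x plus a little noise.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
r2 = r_squared(y, fitted)
r2_adj = adjusted_r_squared(r2, n=len(y), m=1)

print("R^2 =", round(r2, 4), " adjusted R^2 =", round(r2_adj, 4))
```

With one strong predictor the two values are nearly equal; the gap widens as covariates are added without a matching improvement in fit.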

This adjustment arises in the context of model selection, in which we decide which candidate covariates to include so that the model "best" represents the association between predictor(s) and response. The underlying assumption is that of parsimony--that is, we consider as superior the model that does not merely fit best, but is also simple. The $p$-values attached to the coefficient estimates $\beta$ serve this purpose: they indicate which covariates contribute meaningfully to the fit.

Suppose our regression equation has the form $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_m X_m + \varepsilon,$$ where $Y$ is the response and $X_1, X_2, \ldots, X_m$ are $m$ candidate covariates we consider for inclusion in the model. $\varepsilon$ is assumed to be a normally distributed random variable with zero mean and is what we call the unexplained error (unexplained in the sense that the value of the response is not purely determined by the covariates). With some statistical assumptions, we can take a set of $n$ observations $\{(Y_i, X_{i1}, X_{i2}, \ldots, X_{im})\}_{i=1}^n$ of the response and the covariates, and fit the model such that a certain quantity is minimized (for ordinary least squares, the sum of squared differences between the observed and fitted responses), which results in a set of estimates for the unknown coefficients $\beta_0, \beta_1, \ldots, \beta_m$.

The results of this fitting process are the $\beta$s that are shown in the table, where each $\beta$ corresponds to the covariate/predictor in that row. The number of stars after a $\beta$ represents the significance ($p$-value) of that predictor in the fit of the model: the more stars, the smaller the $p$-value. So if there are three stars, that covariate demonstrates strong statistical significance in the model fit, and is an important component of the model.
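If it helps to see the fitting step spelled out, the following is a rough sketch in plain Python of an ordinary least squares fit with two covariates. Everything in it is an assumption for illustration: the data are simulated (only `x1` truly affects the response), the function names are mine, the $p$-values use a large-sample normal approximation rather than the exact $t$ distribution a statistics package would use, and the star thresholds are one common convention that may not match the paper's table.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(rows, y):
    """Least-squares estimates, standard errors and approximate p-values."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    n, k = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(beta[a] * X[i][a] for a in range(k)) for i in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # estimated error variance
    se = [math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    # Two-sided p-value, normal approximation (reasonable only for large n).
    p = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta, se)]
    return beta, se, p


def stars(p_value):
    """Assumed convention: *** p < 0.001, ** p < 0.01, * p < 0.05."""
    return "***" if p_value < 0.001 else "**" if p_value < 0.01 \
        else "*" if p_value < 0.05 else ""


# Simulated data: the true model is Y = 1 + 2 * x1 + error; x2 has no effect.
rng = random.Random(0)
rows, y = [], []
for _ in range(200):
    x1, x2 = rng.uniform(0, 10), rng.uniform(0, 10)
    rows.append([x1, x2])
    y.append(1.0 + 2.0 * x1 + rng.gauss(0, 1))

beta, se, p = ols(rows, y)
for name, b, pv in zip(["intercept", "x1", "x2"], beta, p):
    print(name, round(b, 3), stars(pv))
```

The printed rows play the same role as the rows of the paper's tables: one estimated $\beta$ per predictor, with stars marking how small its $p$-value is.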

Lastly, there is a discussion of interactions. This was not included in the above description of the regression model. An interaction between two covariates, say $X_i$ and $X_j$, means that there is a statistically significant term of the form $\beta_{ij} X_i X_j$ in the model formed by the product of the values of those two covariates. Interactions can be of higher order as well; e.g., $X_i X_j X_k$ indicates a three-way interaction between covariates. One example of an interaction might be how age might interact with smoking status in predicting lung function. While older individuals are expected to have worse lung function than young individuals, and smokers are also expected to have worse lung function than non-smokers, the combination of these two factors (old and smoker) has a much more detrimental impact on lung function than either of the two factors taken alone can explain. Note that interactions can also have a counteracting rather than synergistic effect.
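The lung function example can be put into numbers with a small sketch. The scores below are invented purely for illustration, and the variable names are my own. With two yes/no covariates and one interaction term, the least-squares coefficients can be read directly off the four group means, so no fitting routine is needed here.

```python
# Invented mean lung-function scores for the four groups (illustration only).
cell_mean = {
    (0, 0): 100.0,  # young non-smoker
    (1, 0): 90.0,   # old non-smoker
    (0, 1): 85.0,   # young smoker
    (1, 1): 60.0,   # old smoker
}

# Model: score = b0 + b_old*old + b_smoker*smoker + b_inter*old*smoker
b0 = cell_mean[(0, 0)]
b_old = cell_mean[(1, 0)] - b0        # effect of age alone
b_smoker = cell_mean[(0, 1)] - b0     # effect of smoking alone
additive_guess = b0 + b_old + b_smoker
b_inter = cell_mean[(1, 1)] - additive_guess  # what the main effects miss


def predict(old, smoker):
    """Predicted score from the model including the interaction term."""
    return b0 + b_old * old + b_smoker * smoker + b_inter * old * smoker


print("age effect:", b_old)
print("smoking effect:", b_smoker)
print("additive prediction for old smokers:", additive_guess)
print("interaction:", b_inter)
```

In this made-up case the two main effects alone would predict a score of 75 for old smokers, while the group mean is 60; the interaction coefficient of -15 is that extra drop. A positive coefficient here would correspond to the counteracting case.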

That's about as brief and non-rigorous as I can make my explanation. For more detail on such short notice, you will need to look up other online resources.

Thank you so much! I understand most of your reply! – Ralph B., Oct 09 '14 at 04:41
