Questions tagged [regression]

This tag is for questions on (linear or nonlinear) regression, which is a way of describing how one variable, the outcome, is numerically related to predictor variables. The outcome is also referred to as $~Y~$, the dependent variable, or the response, and is plotted on the vertical axis (ordinate) of a graph.

Regression is a statistical method used in finance, investing and other disciplines that attempts to determine the strength of the relationship between one dependent variable (usually denoted by $~Y~$) and a series of other variables (known as independent variables).

Types of regression:

  • Linear regression
  • Logistic regression
  • Polynomial regression
  • Stepwise regression
  • Ridge regression
  • Lasso regression
  • ElasticNet regression

The two basic types of regression are simple linear regression (one predictor) and multiple linear regression (several predictors).

The general form of each type of regression is:

  • Linear regression: $~Y = a + b~X + u~$
  • Multiple regression: $~Y = a + b_1~X_1 + b_2~X_2 + b_3~X_3 + ... + b_t~X_t + u~$

Where:

  • $Y =~$ the variable that you are trying to predict (dependent variable).
  • $X =~$ the variable that you are using to predict Y (independent variable).
  • $a =~$ the intercept.
  • $b =~$ the slope.
  • $u =~$ the regression residual (error term), the part of $Y$ not explained by the predictors.
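
The simple linear form above can be illustrated with a minimal sketch, assuming ordinary least squares is used to estimate $a$ and $b$ (the tag description itself does not prescribe an estimation method). The function name `fit_simple_linear` and the data values are made up for illustration only.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares estimates (a, b) for Y = a + b X + u.

    Assumption: least squares is the fitting criterion; the names here
    are hypothetical and the data below is invented for illustration.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    return a, b


# Invented example data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_simple_linear(xs, ys)
# Residuals are the observed values minus the fitted values a + b x.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print("intercept a =", a)
print("slope b =", b)
print("residuals =", residuals)
```

With an intercept in the model, the least squares residuals sum to zero up to rounding, which is a quick sanity check on any fit of this form.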

There are multiple benefits of using regression analysis. They are as follows:

$1.~$ It indicates whether there are significant relationships between the dependent variable and the independent variables.

$2.~$ It indicates the strength of the impact of multiple independent variables on a dependent variable.

Reference:

https://en.wikipedia.org/wiki/Regression_analysis

This tag often goes along with the tag.

2700 questions
1
vote
2 answers

Linear trend has to pass through a point

I need to interpolate a linear trend surface through a number of points but with the condition that the surface has to pass exactly through one of them. Can somebody give me any advice?
fparaggio
  • 113
1
vote
1 answer

Determining t values from R output

I have the following R output with several values omitted. I am trying to find the t values of the two estimates, $\widehat{B}_0$ and $\widehat{B}_1$: > summary(lm(y ~ x)) Coefficients: Estimate Std. Error t value Pr(>|t|) …
1
vote
0 answers

Finding the best fitting square

I have to find the best fitting square using the total least squares method. First we had to find the best fitting rectangle using the following equations: s1: $c1+ax+by=0$, $a^2 +b^2 =1$ s2: $c2−bx+ay=0$, s3: $c3+ax+by=0$, s4:…
Peng Nhao
  • 11
  • 2
1
vote
0 answers

Another interpretation for $y$

I calculated a regression line with the formula $y = 9.8x - 2.666$, where $y =$ calories burned and $x =$ time (in minutes). John plans to exercise for $10$ minutes today. Predict the total number of calories burned. I found the answer to be $95.333$…
1
vote
3 answers

Finding a function's coefficients from rounded values

I have a floored quadratic function with unknown coefficients of the form: $$y=\lfloor a(x+b)^2 \rfloor$$ I also have some (about a thousand) pairs of integer values, for example these: $\begin{matrix} x & 280 & 281 & 282 & 283 & 284 & 285 & 286 &…
Džuris
  • 2,590
1
vote
0 answers

SPSS Selecting cases (outliers) separately for each split file

I divided my dataset into 10 groups (because I have 10 different industries). Now, in SPSS one can use "select cases" to get rid of outliers. This is fine. However, I want to apply a different "select cases" rule for each of the 10 groups because it…
Steven
  • 11
1
vote
1 answer

Given a dataset is it possible to decide which curve best fits the data?

I have a dataset containing weights and heights of 1000 people. I used simple linear regression to arrive at an equation. I wonder why I cannot try to fit an ellipse or a parabola or some other curve. Is there a theory which helps to decide whether a…
1
vote
1 answer

Prediction intervals around a regression line

I have a set of observations (x,y). I want to use x values to predict y. I plot a simple regression and this gives me an equation y = mx+c. This is the thin black line. How do I construct confidence intervals around the value of y for any given x?…
BYZZav
  • 125
1
vote
0 answers

What is the meaning of errors of separate variables in a fit?

When doing a least-squares fit of a two-parameter function (e.g. $y=a+bx$) with specialised software like Origin or gnuplot, one gets errors for the resulting $a$ and $b$. What do these errors mean exactly?
texnic
  • 155
1
vote
0 answers

T-Value And Significance (SLR in R)

I am currently trying to figure out the output I got from the summary-command when I do a linear regression in R. I get 2 values that I do not understand, first: the t-value. I do understand that it uses a t-test, testing the null hypothesis H0:…
lisa
  • 111
1
vote
2 answers

regression (using 3D points cloud dataset)

I have a dataset of trajectories. These trajectories are represented in 3D space (x,y,z). All trajectories of this dataset are similar in their shape, but they are not exactly the same, I mean, there is some variation along the points. The…
Medf
1
vote
0 answers

Show that $SS_\text{reg}=y^T(X(X^T X)^{-1}X^T-X_2(X_2^T X_2)^{-1}X_2^T)y$

If $X$ is the design matrix, show that: $$SS_\text{reg}=y^T(X(X^T X)^{-1}X^T-X_2(X_2^T X_2)^{-1}X_2^T)y$$ where $X_2$ is an $n \times 1$ matrix with entries equal to one. So far what I have done is that I know $SS_\text{reg}=\sum_{i=1}^n…
1
vote
1 answer

Constrained curve fitting

We have a series of observations $(x_i,y_i)$ where $y_i \in \{0,1\}$ and $x_i \in (0,1)$. I want to fit a curve $y = f(x)$ to these observations in order to interpret $f(x)$ as a probability given some arbitrary metric $x$. How can I do this and under…
1
vote
2 answers

Regression without linearity

Given two independent, standard normally distributed random variables $x,y\sim \mathcal{N}(0,1)$, I would like to do a univariate linear regression without intercept, $Y = X \cdot \beta + \epsilon$. R gives me the estimate $\beta = 0$: n <- 10000 …
PT272
  • 309
1
vote
1 answer

Interpret overall fitness in prediction

I came across this equation and would like to learn more about how to interpret it. $d$ is defined as the actual value, and $\hat{d}$ is defined as the predicted value. Why does this equation (a) divide the sum of least square error by the sum of $d_{i,j}^2$ …
twfx
  • 143