# Tests for differences among nested models

When you fit a linear or logistic regression model, you have to decide how many parameters to include. Usually you either start with a few and add new ones, or start with all of them and remove some. In both cases you are choosing between successive models that fit inside one another like Russian dolls: the parameters of the smaller model are a subset of the parameters of the larger one (nested models).

In these cases, there are two different approaches to decide which model is better:

• Logistic regression, or models fitted by maximum likelihood estimation: in these cases you use one of the likelihood ratio, Wald, or Lagrange multiplier (score) tests.

http://www.ats.ucla.edu/stat/mult_pkg/faq/general/nested_tests.htm

• Ordinary regression (OLS): you use ANOVA (partial F-tests).

http://www.stat.columbia.edu/~martin/W2024/R6.pdf

http://stats.stackexchange.com/questions/16493/difference-between-confidence-intervals-and-prediction-intervals
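To make the two approaches concrete, here is a minimal sketch of each, using only the Python standard library. It is not taken from the links above. The function names and all the numbers are made up for illustration. The likelihood ratio part assumes the simplest possible nested pair, an intercept-only logistic model against a model with one binary predictor, where the maximum likelihood fits are just the observed proportions. The partial F part compares an intercept-only OLS model against a straight line.

```python
import math

def lr_test_binary_predictor(events0, n0, events1, n1):
    """Likelihood ratio test: intercept-only logistic model vs. one binary predictor.

    Assumes 0 < events < n in each group so that every logarithm is defined.
    """
    def loglik(events, n, p):
        # Binomial log-likelihood (without the constant term) at probability p
        return events * math.log(p) + (n - events) * math.log(1 - p)

    p_pooled = (events0 + events1) / (n0 + n1)  # MLE under the smaller model
    ll_reduced = loglik(events0, n0, p_pooled) + loglik(events1, n1, p_pooled)
    ll_full = loglik(events0, n0, events0 / n0) + loglik(events1, n1, events1 / n1)
    stat = max(2.0 * (ll_full - ll_reduced), 0.0)  # guard against rounding below 0
    p_value = math.erfc(math.sqrt(stat / 2.0))     # upper tail of chi-square, 1 df
    return stat, p_value

def partial_f(rss_reduced, rss_full, params_reduced, params_full, n):
    """Partial F statistic for two nested OLS fits on the same n observations."""
    numerator = (rss_reduced - rss_full) / (params_full - params_reduced)
    denominator = rss_full / (n - params_full)
    return numerator / denominator

# Made-up data: 10 events out of 40 without the factor, 22 out of 40 with it.
lr_stat, lr_p = lr_test_binary_predictor(10, 40, 22, 40)
print("LR statistic:", lr_stat, "p-value:", lr_p)

# Made-up data for the OLS case: intercept only vs. intercept plus slope.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9]
n = len(y)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
rss_reduced = sum((yi - y_bar) ** 2 for yi in y)  # residual SS, intercept only
rss_full = rss_reduced - sxy ** 2 / sxx           # residual SS, fitted line
f_stat = partial_f(rss_reduced, rss_full, 1, 2, n)
print("partial F:", f_stat)
```

The F statistic would then be compared with an F distribution with (1, n - 2) degrees of freedom. The standard library has no F distribution, so in practice a statistics package would do that step.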

# Relation between multiple regression, ANOVA and t-test

This explanation is great:

http://mathforum.org/library/drmath/view/56651.html

And this:

http://www.theanalysisfactor.com/why-anova-and-linear-regression-are-the-same-analysis/

This site is good at comparing the techniques:

http://www.strath.ac.uk/aer/materials/4dataanalysisineducationalresearch/unit6/analysisofvarianceanova/

http://www.strath.ac.uk/aer/materials/4dataanalysisineducationalresearch/unit6/anovaandregression/

http://www.strath.ac.uk/aer/materials/4dataanalysisineducationalresearch/unit6/t-testsandanova/
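The link between the t-test and ANOVA can be checked numerically: with only two groups, the square of the pooled-variance t statistic equals the one-way ANOVA F statistic. The sketch below is my own illustration of that identity, with made-up data and hypothetical function names, and it assumes the equal-variance (Student) form of the t-test.

```python
from statistics import mean

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((v - ma) ** 2 for v in a)
    ss_b = sum((v - mb) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up measurements for two groups.
group_a = [5.1, 4.9, 6.2, 5.8, 5.5]
group_b = [6.4, 7.1, 6.8, 7.5, 6.9]

t = pooled_t(group_a, group_b)
f = anova_f([group_a, group_b])
print(t ** 2, f)  # the two values agree up to floating-point rounding
```

With three or more groups there is no single t statistic to square, which is why ANOVA is the natural generalisation.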

I extracted the following explanation from the site above. It notes that an ANOVA result is global and gives no information about which particular group or groups differ:

“One important thing to note about the F-test (in ANOVA) is that it is a global  test. What that means is that if we find a significant difference (p-value <.05) all we know is that overall there is a significant difference somewhere in the comparisons between the three groups. But, we don’t know where exactly the significance lies.

It could be that the means of all three groups differ significantly from one another, or it could be that Finnish students differ from Scottish and Flemish students, or that Finnish students differ from Scottish students, but not from Flemish students, or another combination of differences may have led to the overall significant difference. Clearly, that is a bit frustrating, and we will want to find a way of telling us which countries are significantly different.”

I also take this explanation from:

http://www.allanalytics.com/author.asp?section_id=1413&doc_id=252823

“Regression and ANOVA always give exactly the same R2, which measures the extent to which the variation in all the independent variables together explains the variation in the dependent variable (close to 0 percent means only random connection; close to 100 percent means the independent variables explain nearly everything). This is because ANOVA asks, “How much do differences in category make a difference in result?” and regression asks, “How much does category matter at all?” Both are forms of the same question.

Regression and ANOVA yield different F-statistics. The F-statistic is a ratio of variances from which a probability can be calculated that this situation is not the null hypothesis. (The null hypothesis is the hypothesis that the independent variables don’t matter.)

But the null hypothesis of regression and the null hypothesis of ANOVA are different.

Regression solves for the linear equation that minimizes the sum of the squared errors; for each dummy variable it assigns a coefficient, i.e. a number by which it is multiplied. Obviously, if a coefficient is zero, then the variable drops out of the equation and doesn’t have any effect at all. So for regression, the F-statistic tests how likely it is that the coefficient is not zero (against the null hypothesis that the coefficient is zero and there is no effect).

ANOVA uses the categories to split the overall population into sub-populations (what we call “segments” in marketing and “test groups” in industrial quality control), and then tests against the null hypothesis that the subpopulations all have the same average value of the dependent variable. The F-statistic tests the probability that the means differ only by chance.

That difference in the null hypothesis is a difference in the actual question the procedure answers. Just remember, regression asks, “Do the categories have an effect?” and ANOVA asks “Is the effect significantly different across categories?”
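The claim about identical R2 can be verified with a small sketch. It fits a regression on dummy variables by solving the normal equations directly, then compares the result with the ANOVA decomposition of the same data. The groups, the numbers and the helper names `solve` and `ols` are all made up for illustration, and a real analysis would use a statistics package rather than this hand-written solver.

```python
from statistics import mean

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(x_rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up scores for three groups.
groups = {
    "A": [12.0, 14.0, 11.0, 13.0],
    "B": [15.0, 17.0, 16.0, 18.0],
    "C": [11.0, 10.0, 12.0, 9.0],
}

# Regression view: intercept plus dummy variables for B and C (A is the reference).
x_rows, y = [], []
for name, values in groups.items():
    for v in values:
        x_rows.append([1.0, float(name == "B"), float(name == "C")])
        y.append(v)
beta = ols(x_rows, y)
fitted = [sum(b * xi for b, xi in zip(beta, row)) for row in x_rows]
grand = mean(y)
ss_total = sum((v - grand) ** 2 for v in y)
ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
r2_regression = 1 - ss_resid / ss_total

# ANOVA view: between-group sum of squares as a share of the total.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
r2_anova = ss_between / ss_total

print("coefficients:", beta)
print("R2 regression:", r2_regression, "R2 ANOVA:", r2_anova)
```

The intercept comes out as the mean of reference group A, and each dummy coefficient is the difference between that group's mean and the mean of A. In a one-way layout like this, the regression's overall F-test (all dummy coefficients zero at once) is the same statistic as the ANOVA F. The individual coefficient tests are what compare each group with the reference.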

Look at these videos as well: