
On Thu, 27 Nov 2003 04:50:34 +0900 (KST), wuzzy <[EMAIL PROTECTED]> wrote:

> if you used dummy-variables to categorize 4 variables with 5
> layers, you should have less than 20 vars, 4(n-1)=16. You can
> definitely drop categories since eg., category 2 should not affect the
> comparison between 1 and 4 if 1 is the control..

I agree that there should be 16 dummies.

I disagree with arbitrarily dropping variables -- whether they are
dummy variables or not. That is poor advice, since dropping variables
is always potentially a bad practice in inference. Regression without
all the "relevant" variables is biased, remember? How bad it is, here,
depends on the definition of the categories and on what conclusions
are being assayed.

Even when the situation may justify it, the simple advice is overly
hasty when you don't know how the dummy variables are encoded --
and what seems apparent, here, is that the person asking the question
knows little about that, too.

What is the purpose of dropping the category? Sometimes it makes sense
to combine categories, sometimes not. Are you trying to form efficient
estimates? Or could you be trying to steal degrees of freedom in order
to come up with a statistical test that reads as nominally significant
at the fixed test level?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."
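
To make the dummy count and the encoding concern concrete, here is a minimal Python sketch. It assumes reference-cell (0/1) coding with category 1 as the control, and the data values are invented purely for illustration; the helper names `n_dummies` and `contrast_vs_baseline` are hypothetical. With only one factor in the model, each dummy's coefficient equals that category's mean minus the mean of all cases whose dummies are all zero, so the sketch computes those means directly rather than fitting a regression.

```python
from statistics import mean


def n_dummies(n_factors, n_levels):
    """Dummies needed under reference-cell coding: one per non-reference level."""
    return n_factors * (n_levels - 1)


# Hypothetical outcome values for one 5-level factor (made-up numbers).
y_by_category = {
    1: [10.0, 12.0],  # control
    2: [20.0, 22.0],
    3: [13.0, 15.0],
    4: [16.0, 18.0],
    5: [11.0, 13.0],
}


def contrast_vs_baseline(data, target, baseline_categories):
    """Mean of `target` minus the pooled mean of every category that is
    coded all-zero, i.e. the group the intercept actually represents."""
    pooled = [y for c in baseline_categories for y in data[c]]
    return mean(data[target]) - mean(pooled)


total = n_dummies(4, 5)

# All four dummies kept: the baseline is category 1 alone.
full = contrast_vs_baseline(y_by_category, 4, [1])

# Dummy for category 2 dropped: its cases are now coded all-zero too,
# so they are silently pooled into the baseline.
dropped = contrast_vs_baseline(y_by_category, 4, [1, 2])

print(total, full, dropped)
```

Under this coding, dropping the dummy for category 2 does not remove those cases; it folds them into the control group, so the "4 versus 1" coefficient becomes "4 versus 1 and 2 pooled". Other coding schemes (effect coding, for instance) would behave differently, which is why the encoding has to be known before giving advice.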