Confusion with Hypothesis Testing and other statistical thingies

Nesh

I've been doing some statistics recently and I am a little confused.

For years I've been doing hypothesis testing with regressions, so I have more or less forgotten how to do a simple hypothesis test.

So here is what happened.

I took a population and calculated its mean and S.E.

Then took a sample and did the same thing.

Now what I wanted to test is whether the sample's mean could plausibly have come from the population (t-test).
I used MegaStat for the hypothesis test. As the hypothesized mean I used the sample's mean, and as the data range I used the population's data.

Now when I opened my book today I noticed the opposite. The examples used the population's mean as the hypothesized value (H0: mean = population mean, H1: mean is different) and the data range was that of the sample.
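
To make the textbook setup concrete, here is a minimal sketch of a one-sample t-test in which the population mean is the hypothesized value and the sample supplies the data. The cost figures, the population mean of 450, and the helper name one_sample_t are all made up for illustration; they are not from the actual car data.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for H0: mean == hypothesized_mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)        # sample SD, n - 1 in the denominator
    standard_error = sample_sd / math.sqrt(n)   # S.E. of the sample mean
    t_stat = (sample_mean - hypothesized_mean) / standard_error
    return t_stat, n - 1

# Hypothetical annual costs for a sample of 10 cars (made-up numbers).
sample_costs = [510, 495, 530, 480, 505, 520, 490, 515, 500, 525]
population_mean = 450  # hypothetical population mean, plays the role of H0

t_stat, df = one_sample_t(sample_costs, population_mean)

# 2.262 is the two-sided 5% critical value of Student's t with 9 degrees of freedom.
reject_h0 = abs(t_stat) > 2.262
print(t_stat, df, reject_h0)
```

Note that the sample's spread and size drive the standard error here, which is why the sample, not the population, is the data range in the book's examples.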

So I wonder if my hypothesis testing was wrong :???:

Also another question. Let's say you take random samples of unequal size (for example the first sample has 4 obs, the second has 10, etc., with no observation appearing in more than one sample) and then you find all of their means. If you then take the mean of all those sample means, would it still equal the population mean, or would it fail to do so only when my samples are biased, meaning they were grouped?

Here is what I did. I took a population of tested cars' annual costs. The cars were divided into types.

What I wanted to test is whether annual costs are grouped by car type. If I could show there was some form of bias, then I would automatically have shown that annual costs differ by vehicle type too.

So I separated the annual costs by car type and used each type as a sample. I found the mean of each sample, then the mean of these means, and compared it with the population mean.

They were different, but of course I am not sure whether this is because each sample had a different number of observations.
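
The effect of unequal sample sizes on a mean of means can be checked with a small sketch. The groups and costs below are made up for illustration. When the groups together cover the whole population, weighting each group mean by its number of observations recovers the population mean exactly, while the plain (unweighted) mean of means generally does not.

```python
import statistics

# Hypothetical annual costs grouped by car type (made-up numbers).
costs_by_type = {
    "compact": [300, 320, 310, 330],
    "sedan": [400, 420, 410, 430, 440, 400, 410, 420, 430, 440],
    "suv": [600, 620],
}

# The "population" here is simply every observation pooled together.
all_costs = [cost for group in costs_by_type.values() for cost in group]
population_mean = statistics.mean(all_costs)

group_means = {name: statistics.mean(group) for name, group in costs_by_type.items()}

# Plain mean of the group means: every group counts equally, whatever its size.
unweighted = statistics.mean(list(group_means.values()))

# Size-weighted mean of the group means: each group counts in proportion to its n.
weighted = sum(
    len(group) * group_means[name] for name, group in costs_by_type.items()
) / len(all_costs)

print(population_mean, unweighted, weighted)
```

So a gap between the mean of means and the population mean can come from the unequal group sizes alone, and by itself it says nothing about whether the groups differ.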

Then I ran the hypothesis tests with each type's mean as H0 (as described above), and all of them fell in the rejection region for the population. I took that as a possible indication that there was a bias, that annual costs per type were very different, and thus that they were grouped too.

Is my work totally wrong? Half wrong? Or ok? :???:
 
Could you tell us some more about what your data looks like? Life cycle data with irregular sample intervals? Random independent observations? In other words: is a 'sample' several observations of the same unit, or is it observations of different units sharing a trait on another variable?

In general, it sounds like you're describing stratification. So, if you want to, say, compare costs across brands of car, your universe of observations should preferably be grouped by brand before sampling, to ensure that you get enough observations for each subgroup. Then the baseline would be the mean of the draw from each stratum. If you wanted to compare a car to the average of cars, then the draw should be entirely random and the baseline would be the mean of your draw from the entire population.
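
As a rough sketch of the stratified draw described above: group the records first, then sample within each group and use the per-group means as baselines. The records, the group names, the per_group size, and the helper stratified_sample are all hypothetical and only illustrate the idea.

```python
import random
import statistics

def stratified_sample(records, per_group, seed=0):
    """Group (car_type, cost) records by type, then draw per_group costs from each group."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = {}
    for car_type, cost in records:
        strata.setdefault(car_type, []).append(cost)
    # Sample without replacement inside each stratum (capped at the stratum size).
    return {
        car_type: rng.sample(costs, min(per_group, len(costs)))
        for car_type, costs in strata.items()
    }

# Hypothetical universe: 20 compact cars and 8 SUVs with made-up annual costs.
records = [("compact", 300 + 5 * i) for i in range(20)]
records += [("suv", 600 + 10 * i) for i in range(8)]

draw = stratified_sample(records, per_group=5)

# One baseline per stratum: the mean of the draw from that stratum.
baseline = {car_type: statistics.mean(costs) for car_type, costs in draw.items()}
print(baseline)
```

Sampling within each group guarantees that small groups (here the SUVs) still contribute enough observations, which a single random draw from the pooled data would not.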
 
I'll go into more detail as soon as I finish my stupid final exams, which should be on Friday.
 