Welcome to this presentation on the independent samples t-test. This is Dr. Amanda Rockinson-Szapkiw, and in this tutorial we’re going to talk about the independent samples t-test. We’re going to define it, talk about when to use it, talk about the logic behind it and the statistics behind it, and then talk a little bit about how to report an APA results section as well as how to set up a file in SPSS. So let’s get started.

The independent samples t-test, also sometimes referred to as simply a t-test or a t-test for independent means, is an extremely common hypothesis test in social science research. It’s often used in group comparison studies when a researcher desires to compare the mean scores of two different groups. This is often an experimental and a control group, but not always, and we’ll get into that later. Because the experimental and control groups are independent, that is, they contain separate individuals who receive separate treatment conditions, the scores of the individuals in the two groups have nothing to do with each other. It follows then that the mean scores for these two groups are independent from each other, and that’s why we call this an independent samples t-test.

For the independent samples t-test, the researcher needs to have two types of variables. First, he or she needs one categorical independent variable with two levels or two groups. Second, the researcher needs one dependent variable that’s a continuous variable. And remember, in an independent samples t-test, different from a one-sample t-test, the population mean and standard deviation are not known.

Before we discuss the independent samples t-test further, it’s really important that you have a clear understanding of variables, so let’s take a moment and review the definitions of the independent and dependent variable. Remember, the independent variable is the treatment or the intervention variable; it’s usually that which the researcher manipulates
or measures, whereas the dependent variable is the response variable. It’s often said that the dependent variable depends upon the independent variable; it’s the observed aspect of the behavior of an organism that has been stimulated.

Let’s take a moment and review variables in the context of an actual scenario. In social science research, a researcher may want to know how two groups differ from one another. Let’s say it’s a he. He can study two groups after the fact, or he can decide to conduct an experiment by dividing a group of participants into separate groups; in this case we’re going to talk about two groups. He treats them each differently, and then he measures the groups on a specified variable after the treatment to see if there is a difference between the groups. In other words, the researcher divides the participants into groups based on the independent or grouping variable. An example of this may be the type of course that students are enrolled in, or the type of course that he wants to assign the students to. All students enrolled in an online course could be one level or one group, while all the students enrolled in a residential course could be another level or another group. So here we see the independent variable, that which is manipulated or that which was previously manipulated: it’s type of course, and it has two levels or two groupings. Just as an FYI, levels is just another word that we often use for groupings or groups in research. So the independent variable is the type of course, and it has two levels, residential and online.

Then we have the outcome variable, or the variable that the researcher hopes the independent variable will affect. It’s called the dependent variable or the test variable. Let’s say here that the researcher is looking to see if the outcome will be different depending on the group: those who take online courses versus those who take them in residence. Another way of saying this is that
the outcome variable or score is dependent on the level of the independent variable.

So now that we have a good understanding of independent and dependent variables, let’s go back to talking about independent samples t-tests. Thus far we’ve defined an independent samples t-test, and we’ve also talked about what’s needed for an independent samples t-test. Now let’s take a look at an example of when a researcher may use this type of t-test. Consider a study in which an educational researcher decides to conduct a causal comparative study to examine if the type of course, or the medium in which students take a statistics course, influences their learning. Here we see that the medium is either online or residential, so there’s

two different groups, and the dependent variable is learning, let’s say operationally defined as course points ranging from zero to a thousand. More precisely, there’s the independent variable, and that again is type of course, and there are two groups. The experimental group consists of graduate students who participate in a new online statistics course, and the control group consists of graduate students who participate in a residential statistics course. Let’s say it’s a well-designed research study, so both groups cover the same exact material and they’re very similar in demographics. At this point the researcher would ask the question: is there a statistically significant difference in students’ course points in their statistics course based on the type of course they take, residential or online? And the null hypothesis simply states almost the exact same thing: there’s no statistically significant difference in students’ course points in a statistics course based on the type of course they take. Here we see that, because the educational researcher is comparing two groups on one dependent variable, and, let’s say, he or she has no information about the effect of the type of course medium on the general population (after all, this is a new course) and thus can’t run a one-sample t-test, the correct hypothesis testing procedure in this case would be a t-test for independent means.

Let’s take a look at one other example. We just looked at an example in which the researcher conducted a causal comparative study; that is, he or she looked at something after the fact, after the students had enrolled in the courses. Oftentimes the t-test is appropriate for a causal comparative study, but it can also be appropriate for use in an experimental study. Consider a study in which an educational researcher decides to conduct an experimental study by applying media richness theory to his online class. Now, to give us a little basis here, media richness theory is based on the assumption that the use
of rich media as compared to lean media results in more effective communication. Theorists really argue that the more rich a communication medium is, the more ambiguity and uncertainty are reduced. Considering media richness theory criteria in an online environment, it would go something like this: using a video conferencing system for communication, in which you’d meet online using video, is more media rich than, let’s say, asynchronous communication such as a discussion board. Thus, if we apply this theory, it would imply that video conferencing, when compared to discussion boards, may result in more effective communication when discussing complex, ambiguous learning topics and assignments. Thus, video conferencing may be better for learning complex material than a discussion board.

So let’s say that the researcher wants to test this. The researcher decides to randomly assign graduate students taking a statistics course to one of two groups, the experimental or the control group. Here the experimental group, we’re going to say, consists of graduate students who participate in the online statistics course using the video conferencing medium, whereas the control group consists of graduate students who participate in the online statistics course and use a discussion forum. Again, because it’s a well-designed research study, both groups cover the same material, have the same assignments, and have similar demographics. And because the students are randomly assigned, according to Campbell and Stanley (1963) we can assume that the groups are equivalent, so no pretest is needed and we don’t have to add a covariate into our analysis. We’ll get into that in another tutorial when we talk about ANCOVA. Here the research question that the researcher proposes is this: is there a statistically significant difference in graduate students’ course points based on the type of communication medium they use in their
online statistics course? The null hypothesis simply states that there is no statistically significant difference in graduate students’ course points based on the type of communication medium used to communicate in their online course. Again, here we see that, because the educational researcher is comparing two groups, the group that’s using the discussion board versus the group that’s using video conferencing, and comparing the difference between those groups on one dependent variable, the correct procedure is the t-test for independent means.

So the null hypothesis that’s being tested in an independent samples t-test simply states that there will be no difference between the experimental population’s mean and the control population’s mean, or that the difference between the means will be equal to zero.
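The null hypothesis just described can be written compactly in symbols. The notation is an assumption made for illustration and is not taken from the slides: \mu_1 stands for the population mean of the experimental group and \mu_2 for the population mean of the control group. Both forms below say the same thing.

```latex
H_0:\ \mu_1 = \mu_2
\qquad\Longleftrightarrow\qquad
H_0:\ \mu_1 - \mu_2 = 0
```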

In our example, again: there will be no statistically significant difference in graduate students’ course points based on the type of communication medium they use to communicate in their online courses. Now let’s take a look at the alternative hypothesis. The research or alternative hypothesis will state that the means are different, that is, that the difference between them is not equal to zero and is large enough to be significant. Stated in words, the alternative or research hypothesis for the example that we’ve been discussing is this: there is a statistically significant difference in graduate students’ course points based on the type of communication medium they use to communicate in their online courses.

Now let’s take a moment to summarize what we just talked about. An independent samples t-test is appropriate when comparing two groups on one dependent variable; that is, there’s one independent variable with two groups and one dependent variable. Let’s take a look at this diagram. The two groups of the independent variable are formed either on pre-existing conditions or through random assignment. In the first example that we looked at, the researcher examined the difference between two groups based on pre-existing conditions: whether the students enrolled in an online course or a residential course. In the second example, the researcher randomly assigned the students to either a video conferencing online course or a discussion board online course. Let’s talk about the experiment for a moment. The researcher randomly assigned the students to either a treatment or a control group. Then each group was treated by the researcher: either they received the treatment or they didn’t receive the treatment, or they received different levels of treatment. Again, going back to our example, one group got video conferencing and one group used the discussion board. After the two groups are treated, both are
measured on a specified outcome; again, in our example we used course points. At this point the t-test for independent means can be conducted to compare the mean scores of the treatment and the control group to determine if a statistically significant difference exists between their course points.

Now that we understand when an independent samples t-test is used, let’s look at the logic behind the analysis. Because the t-test for independent means looks at two completely independent groups, like an experimental and a control group, it’s necessary to treat each group as coming from a completely independent population. To better understand this, let’s take a look at the diagram here. This is a diagram often seen in statistics books. Let’s start with the top row. On the top row you see the two separate, independent population distributions, one representing the population from which the treatment group is selected and one representing the population from which the control group is selected. Then in the middle row we see that each of these populations has its own distribution of means. Up to this point the distributions are completely independent of one another, and you can see that in these two rows.

Now we’re really going to get to the core of the independent samples t-test, so let’s move to the third row. The point of any study with an experimental and a control group is to compare the two groups to each other on the dependent variable to determine if they’re statistically significantly different. So what should our comparison distribution be? Because we’re looking at a difference between means, we have to use a third distribution, called the distribution of differences between the means, for our comparison, and this is pictured in the third row. This distribution represents all possible differences between the mean of the treatment group and the mean of the control group if the researcher were to conduct the study
numerous times. Let me say that again: this distribution represents all the possible differences between the mean of the treatment group and the mean of the control group if the researcher were to conduct the study numerous times. The test is carried out by calculating the actual difference between the means of the two groups in the study (the samples represented at the bottom of the diagram) and comparing this difference to our distribution of differences between the means. So this is the logic behind the analysis.

So, following this logic, the null hypothesis for the independent samples t-test states, as we’ve previously said, that there’s no difference between the means of the two groups. The null hypothesis first assumes that the means of the two independent populations are equal, or that the difference between them is zero.
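The idea of a distribution of differences between the means can be made concrete with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_mean_differences`, the population mean of 900, the standard deviation of 20, and the number of repeated studies are all invented for illustration and do not come from the tutorial. Both groups are drawn from the same population, so the null hypothesis is true by construction.

```python
import random
import statistics


def simulate_mean_differences(n_per_group, n_studies, mu=900.0, sigma=20.0, seed=0):
    """Repeat a two-group study many times when the null hypothesis is true.

    Both groups are drawn from the same normal population (mean mu, standard
    deviation sigma), so any difference between the group means is chance.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    differences = []
    for _ in range(n_studies):
        treatment = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        control = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        # One entry in the distribution of differences between the means.
        differences.append(statistics.mean(treatment) - statistics.mean(control))
    return differences


# Hypothetical settings: 10 participants per group, 5000 repeated studies.
diffs = simulate_mean_differences(n_per_group=10, n_studies=5000)
print("mean of the differences:", round(statistics.mean(diffs), 2))
print("standard deviation of the differences:", round(statistics.stdev(diffs), 2))
```

Under these assumptions the differences should pile up around zero, and their spread plays the role of the standard deviation of the distribution of differences that the t score is later divided by.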

Next, it states that the means of the two distributions of sample means are equal, or that the difference between them is also zero. So when we construct the distribution of differences between the means, the mean of that distribution will be zero: if there is no difference between the two populations, then the differences between the sample means will average out to zero.

Now that we’ve looked at the basic concepts behind an independent samples t-test, what it does, and the logic behind it, let’s look at a data set that we could analyze using this type of t-test. Our example earlier mentioned a study designed to examine the effects of an online communication medium aimed at improving learning. Remember, the researcher randomly assigned the participants to either the treatment group that participated in video conferencing or the control group that participated in the discussion board, and at the end of the course the researcher gathered the course points for both groups. Here you can see what the data look like after the researcher collected the scores of 10 participants from each group. Remember that these scores are from 20 different individuals altogether, and that the scores of the treatment group are independent of the scores of the control group. As you can see, we can compute the mean score for each group, and then we can find the crucial difference between the means; you can see it here, it’s 18.9. By the way, this is all the information that you will need to run the independent samples t-test in SPSS. However, in order for you to really understand how the test works and what these values represent, so that you can properly analyze and interpret the SPSS output, we need to take a look at actually conducting the analysis, so that’s what we’re going to look at next.

Here you see the formula to calculate the t score. You find this value by subtracting the mean of one group from the mean of the other group and then dividing by the standard
deviation of the distribution of differences between the means, which we’ll look at in a moment. In our study of learning differences based on the communication medium used in the online class, we would find the S difference for the sample and then compute the t score by calculating the mean of the treatment group minus the mean of the control group, divided by the S difference. There are, however, a few steps we need to take to get to this point, so let’s walk through this analysis step by step.

Here you see the notation that we will be using as we talk about the different formulas for calculating the independent samples t-test. I do want to make a quick note here: the capital N refers to the sample size for the entire study, whereas the lowercase n’s usually refer to the treatment and control groups, or the different groups within the study. However, sometimes you’ll see formulas, and you may even see some in these slides, where capital N is used in both cases, so I just want to make you aware of that as we continue.

The first step in conducting an independent samples t-test is to estimate the population variance. You must do this for each of the two groups, so you’ll actually have two variances. The degrees of freedom in this formula is the number of participants in the particular group minus one. For example, for the data we just looked at a few slides back, this would be 10 minus 1, which equals 9, for each of the groups. Then, to find the estimated population standard deviation, you simply take the square root of the variance for each of the populations.

Here you’ll see a list of items that need to be reported in an APA results section for an independent samples t-test: the descriptive statistics (the mean and the standard deviation), the number in the sample, the number in each group, the degrees of freedom, the t value, the significance level, and the effect size and power.

Next, find the pooled estimate of the population variance. This brings us to an important assumption of the t-test
analysis: that the two population distributions have the same variance. Using this assumption, we can use the estimated variances of our two samples from step one to compute an overall variance based on both samples. In the formula here, S squared pooled represents the pooled estimate of the population variance, df 1 is the degrees of freedom from the first sample, such as the treatment group, and df total is the degrees of freedom from both samples together, both groups together. In the last part of the formula we see the

same thing repeated for the second sample or second group, for example the control group, and then we add these values together.

Now, as we talk about step 3, I want you to think back to the diagram of distributions from a few slides earlier. This third step involves that second row: we have to find the variance of the two distributions of means. We do this for each group separately. To find the variance of the distribution of means for group 1, for example the treatment group, we divide the pooled estimate found in step 2 by the number of participants in that particular group; even though here it’s denoted by capital N 1, it should probably more appropriately be lowercase n 1. Then you repeat these steps for the second group, for example the control group, only you divide by the number of participants in that particular group. Here you’ll notice that if the number of participants in each group is the same, the two variances computed will actually be equal.

Now we are ready to find the main value that we will use to compute our t statistic. Step 4 involves finding the variance and the standard deviation of the comparison distribution, the distribution of differences between the means. This is the distribution on row 3 of the diagram that we looked at earlier. The variance of the distribution of differences between means is called the S squared difference, and it is found by simply adding the variances from step 3 together. Once we’ve calculated this value, we can take the square root to find the standard deviation of the distribution of differences between the means, called the S difference. This last value is the one that we will use to compute our t score. It’s used in the formula that we looked at earlier and that we’ll look at again in step 6.

Next, it’s time to find the critical or cutoff t value using the t table. For an independent samples t-test, the degrees of freedom that you should look up on the table is the total degrees of freedom, and this is found by adding the
degrees of freedom from both of your groups, df 1 plus df 2. So going back to our example, remember that df 1 was 9, that’s 10 minus 1, and we also had 10 in our control group, so df 2 is 10 minus 1, or 9. So 9 plus 9 equals 18; that would be the total degrees of freedom here. Remember, when you are using the table, to identify whether your test is a one-tailed test or a two-tailed test; if you need a refresher on this, you can go back and listen to the tutorial on hypothesis testing and when you use a one-tailed or two-tailed test. You also need to be prepared to identify the probability or alpha level that you’ll be using, and again, for the most part in social science research, this is .05.

Now it’s time to calculate the t score, also sometimes referred to as the t value or the t statistic. This should look familiar; we’ve already looked at it, but let’s go ahead and review it. You find this value by subtracting the mean of one group from the mean of the other group and then dividing by the standard deviation of the distribution of differences between the means, which, remember, we found in step 4. In our study of the learning differences based on the communication medium used in the online class, remember, we would find the S difference for the sample and then compute the t score by calculating the mean of the treatment group minus the mean of the control group, divided by the S difference.

Now we’ve come to the point where we can make a decision about the null hypothesis, now that we have the critical t value and we’ve computed a t value for our study. Really, the question we want to ask here is: is the t value small or large relative to the expected distribution of t? Let’s talk a little bit about this; in fact, we’re going to talk about it over the next few slides. When the null hypothesis is true, most values of t should be near zero. We define the values of t that are far away from zero
by looking at the critical values of t, usually the values that correspond to the bottom 2.5% and the top 2.5% of the area in the distribution, or the most extreme 5 percent. Again, when the t value is far away from zero, its corresponding p value will likely be small, and therefore the t value will be significant. Let’s talk about this a little bit further. So here,

if the null hypothesis is supported, or we fail to reject the null hypothesis in a study, what we can expect is that the means of the two samples or two groups should be close together; that is, the mean difference should be near zero and the value of the t ratio should be small, or close to zero. On the other hand, if the null hypothesis is not supported, or it’s rejected, the mean difference between the two groups should be large and the value of the t ratio should be far from zero. Now here it’s important to note that the value of t can be either positive or negative depending on which group mean happens to be larger. However, the t ratio, whether it’s positive or negative, will be large in absolute value if we reject the null hypothesis. So let’s say that we calculated a t value for our example and looked up the critical t value, and our critical t value was 1.96. We would know that if the t value that we calculated for our experiment was less than negative 1.96 or more than 1.96, we would reject the null hypothesis; that is, we would say that there was a statistically significant difference between our two groups. However, if the t value that we calculated was between these two numbers, then we would say that we failed to reject the null hypothesis and there was not a statistically significant difference between the means of the two groups.

So at this point we’ve really talked about the t value in calculating whether there’s a statistically significant difference. But after we determine that a statistically significant difference exists, a researcher really wants to compute the effect size to determine the practical significance of the results. Remember, an effect size is a unit-free or standardized index of the strength of association between two variables. Here what we’re going to look at is Cohen’s d. Cohen’s d tells us how far apart the sample means were
in the number of standard deviations. So let me just give you a quick example of practical significance versus statistical significance. Let’s say that in our two groups, remember, our online video group versus our online discussion board group, the mean course points for the video group was 990, whereas the mean for the discussion board group was 980. Even though there was a statistically significant difference, what we know by just looking at the scores is that they aren’t very far apart, and therefore maybe practically the intervention didn’t make much of a difference. However, if we look at this same example and our video group had mean course points of 990 and our discussion board group had a mean of, let’s say, 900, there’s a 90-point difference, so our practical significance is larger.

So how do we calculate the effect size? We can estimate the effect size of our study after the study is completed by using figures from our results. We do this by using the formula here. The estimated effect size, or Cohen’s d, equals the actual mean of group 1, the treatment group, minus the actual mean of group 2, the control group, divided by the pooled estimate of the population standard deviation. If you remember back to the different calculations we did, this comes from step 2 of our analysis: it is the square root of the pooled variance. We divide by S pooled because it’s our best estimate of the overall population standard deviation, or sigma. Once you’ve calculated Cohen’s d (you can also use eta squared, but for our purposes here we’re using Cohen’s d), you can use Cohen’s standards or conventions to decide whether the effect size is small, medium, or large. So, going back to the example I gave earlier, if, let’s say, the groups differed by 90 points, we might find that the effect size is large, for example at .8. Now, the effect size is calculated after the t value is calculated,
or is often calculated after you have the results. However, there are some things that we need to consider

examining prior to calculating the t value; that is, we need to do assumption testing. Remember, parametric analyses require that assumptions be met in order to ensure the robustness of the results. For an independent samples t-test, here’s a list of the assumptions that need to be met.

First of all, there’s normality. Normality assumes that the population distributions are normal, and that’s each population distribution, so normality should be examined for both groups individually. The t-test is really robust in the face of moderate violations of this assumption, especially when the researcher is using a two-tailed test and the samples are not that small, usually more than thirty, as most statistics books say. To check for normality before doing a t-test, researchers have a few different options. One is to create a histogram and examine normality; also, statistics such as the Shapiro-Wilk or Kolmogorov-Smirnov tests can be used. On the histogram, really what you’re looking for is a symmetrical, bell-shaped curve, and you can see an example of a nice normal distribution here. When you’re conducting the normality tests, nonsignificant results, that is, significance levels above .05, indicate tenability of the assumption, or that normality can be assumed. So you want to look at normality.

The second assumption that a researcher wants to analyze when doing a t-test is homogeneity of variance, or equal variance. Equal variance assumes that the population distributions have the same variance. Remember, we talked about this earlier when we were talking about the calculations: if this assumption is violated, the averaging of the two variances is really futile. If it is violated, especially in a t-test, you can use modified statistical procedures. In SPSS you’ll note that there’s actually an alternative t value, and even in most statistical textbooks you’ll notice that there’s actually
two different ways to calculate t values based on whether or not equal variance can be assumed. Homogeneity of variance is usually tested using Levene’s test for equality of variances, and again, a significance level larger than .05 indicates that equal variances can be assumed, and a significance value of less than .05 indicates that equal variances cannot be assumed and that the assumption is not tenable. So really what you’re looking for here is a Levene’s significance level above .05 for this assumption to be tenable.

Additionally, what you want to make sure of with an independent samples t-test, and we talked about this earlier, is that the scores of the two samples are completely independent of each other. Remember, there are two separate groups. If the scores are related and you’re looking at one group, then you’re better off using a paired samples t-test.

And finally, in an independent samples t-test, and this is more of a practical matter, you need to make sure that your n for each of your groups is large enough to have adequate statistical power. Remember, statistical power is the probability of rejecting the null hypothesis when it is false, and generally what you’re looking for here is reasonably high power, usually around .7 or ideally .8. So you want a large enough sample size for each of your groups. These are things that a researcher wants to make sure of prior to conducting the t-test.

So now we’ve talked about the assumption testing that’s going to be done prior to calculating the t value, calculating the t value, and then also calculating the effect size. Let’s talk about what needs to be reported in an APA results section. Let’s take a look at a few of these items a little bit closer and how they should be reported. When reporting results for an independent samples t-test in an APA results section, you need to state the
statistical results: the t value, the significance level, and the effect size, as you can see in the first two lines here. After stating the actual statistical results, you need to make sure that you communicate what those results mean. So let’s go back to the example that we’ve been using of our online students participating in an

online statistics course using either video conferencing or a discussion board. Here you’ll see that the researcher found a statistically significant difference because the significance level is less than .05, and by inspecting the means the researcher was able to determine which group scored higher. So he or she may communicate the following: students who used video conferencing for online communication had statistically significantly higher course points than those who used a discussion board for online communication. This can be seen with the means here; the one group had a higher mean than the other. Since the Cohen’s d is .1, the effect size was small. And then let’s say the power was .9. Here the researcher would communicate that a power of 90 percent, or .9, indicates that if a true difference of this size exists and the study were conducted ten times, it would be expected to detect that difference about nine times. And that brings me to the last point here: in an APA results section, the decision about the null hypothesis should be stated, so the t-test provided evidence to reject the null hypothesis.

Now, as we wrap up our discussion, we’re going to end with putting together an SPSS file for an independent samples t-test. We end this discussion about the independent samples t-test talking about SPSS because, even though it’s important to understand the logic behind and the formulas for the independent samples t-test, the reality is that most researchers in the social science fields use SPSS to calculate their statistics for them, so it’s important that you set up the file correctly. Here we see an example of what a file setup for an independent samples t-test will look like in SPSS. Here you can see that the grouping variable, or the independent variable, is on the left. The one indicates
individuals who were in group one, the video conferencing or treatment group, and the two indicates that those are the scores for the individuals in group two, the discussion board or control group. The individual scores on the dependent or test variable are entered into the next column; here we see it under course points. Here we can see the data for a portion of our sample from the treatment and from the control group. In the variable view dialog box you can label the values, so one can be treatment and two can be control, or one can be video conferencing and two can be discussion board. That’s whenever you bring up the variable view in SPSS, and when we get into SPSS you can look more at that. What’s nice about that is, if you label your variables, SPSS will automatically use these labels in your output, which can really be helpful as you clarify your results. Texts and manuals on SPSS provide more information and walk you through step-by-step examples; there are also some tutorials here that will walk you through SPSS.

So this concludes our tutorial on the independent samples t-test. Let’s take just a moment and review what we talked about. We talked about the definition of an independent samples t-test, when to use an independent samples t-test, the logic behind the t-test, and the formulas for the t-test. We talked a little bit about assumption testing and writing an APA results section, and then finally we ended here with setting up our SPSS file.
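To tie the file layout and the calculation steps together, here is a minimal sketch in Python rather than SPSS. Everything specific in it is an assumption made for illustration: the twenty course-point scores are invented (they are not the data from the slides, so the mean difference is not 18.9), and the function name `independent_t` is hypothetical. The rows mirror the file layout described above, with a group code in the first column and course points in the second. The cutoff of 2.101 is the standard t table value for 18 degrees of freedom, two-tailed, at the .05 level.

```python
import math
import statistics

# HYPOTHETICAL data in the long layout described for SPSS: (group code, course points).
# 1 = video conferencing (treatment), 2 = discussion board (control).
rows = [
    (1, 912), (1, 885), (1, 940), (1, 876), (1, 903),
    (1, 921), (1, 894), (1, 930), (1, 889), (1, 910),
    (2, 880), (2, 865), (2, 901), (2, 872), (2, 894),
    (2, 858), (2, 887), (2, 905), (2, 869), (2, 879),
]
treatment = [points for group, points in rows if group == 1]
control = [points for group, points in rows if group == 2]


def independent_t(group1, group2):
    """Follow the tutorial's steps for a pooled-variance independent samples t-test."""
    n1, n2 = len(group1), len(group2)
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2
    # Step 1: estimated population variance for each group (divides by n - 1).
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    # Step 2: pooled variance, each group weighted by its share of the degrees of freedom.
    var_pooled = (df1 / df_total) * var1 + (df2 / df_total) * var2
    # Step 3: variance of each distribution of means.
    var_m1 = var_pooled / n1
    var_m2 = var_pooled / n2
    # Step 4: standard deviation of the distribution of differences between the means.
    s_difference = math.sqrt(var_m1 + var_m2)
    # Step 6: t score (step 5, looking up the critical value, is done outside this function).
    mean_difference = statistics.mean(group1) - statistics.mean(group2)
    t = mean_difference / s_difference
    # Effect size: Cohen's d divides by the pooled standard deviation.
    cohens_d = mean_difference / math.sqrt(var_pooled)
    return {
        "df_total": df_total,
        "var_pooled": var_pooled,
        "s_difference": s_difference,
        "mean_difference": mean_difference,
        "t": t,
        "cohens_d": cohens_d,
    }


result = independent_t(treatment, control)
T_CRITICAL = 2.101  # step 5: t table value for df = 18, two-tailed, alpha = .05
reject_null = abs(result["t"]) > T_CRITICAL

print("df:", result["df_total"])
print("t:", round(result["t"], 2))
print("Cohen's d:", round(result["cohens_d"], 2))
print("reject the null hypothesis:", reject_null)
```

With these invented scores the t value falls beyond the cutoff, so the null hypothesis would be rejected. The sketch assumes equal variances, as the pooled formula does; it does not run the normality or Levene's checks discussed earlier.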