· Analysis of variance (ANOVA) assesses whether the means of two or more groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means (quantitative variables) of groups (categorical variables). The null hypothesis is that there is no difference in the mean of the quantitative variable across groups (categorical variable), while the alternative is that there is a difference.

· A Chi-Square Test of Independence compares frequencies of one categorical variable for different values of a second categorical variable. The null hypothesis is that the relative proportions of one variable are independent of the second variable; in other words, the proportions of one variable are the same for different values of the second variable. The alternative hypothesis is that the relative proportions of one variable are associated with the second variable. Note: although it is possible to run large Chi-Square tables (e.g. 5 x 5, 4 x 6, etc.), the test is really only interpretable when your response variable has 2 levels (see Graphing decisions flow chart in bivariate graphing chapter).

· A correlation coefficient assesses the degree of linear relationship between two variables. It ranges from +1 to -1. A correlation of +1 means that there is a perfect, positive, linear relationship between the two variables. A correlation of -1 means there is a perfect, negative, linear relationship between the two variables. In both cases, knowing the value of one variable, you can perfectly predict the value of the second. Note: Two 3+ level categorical variables can be used to generate a correlation coefficient if the categories are ordered and the average (i.e. mean) can be interpreted. The scatterplot, on the other hand, will not be useful. In general, the scatterplot is not useful for discrete variables (i.e. those that take on a limited number of values). When we square r, it tells us the proportion of the variability in one variable that is described by variation in the second variable (aka RSquare or Coefficient of Determination).

 

· Please note: If you have a quantitative explanatory variable and a categorical response, you will eventually be using logistic regression. For now, categorize your explanatory variable and use a chi-square test as explained above.
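The note above suggests categorizing a quantitative explanatory variable before running a chi-square test. The following is a minimal Python sketch of that step, not part of the assignment's SPSS workflow. It assumes the five ordered cigarettes-per-day categories used in the post hoc examples later in this document (1 to 5, 6 to 10, 11 to 15, 16 to 20, >20); the function name and the sample values are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the first four categories; anything above 20 falls in the last one.
# Cut points are assumed from the cigarettes-per-day example in this document.
UPPER_BOUNDS = [5, 10, 15, 20]
LABELS = ["1 to 5", "6 to 10", "11 to 15", "16 to 20", ">20"]


def categorize(cigarettes_per_day):
    """Return the ordered category label for a daily cigarette count."""
    if cigarettes_per_day < 1:
        raise ValueError("expected at least 1 cigarette per day")
    # bisect_left finds the first upper bound that is >= the value,
    # so 10 lands in "6 to 10" and 11 lands in "11 to 15".
    return LABELS[bisect_left(UPPER_BOUNDS, cigarettes_per_day)]


# Hypothetical values, for illustration only.
sample = [3, 10, 11, 20, 35]
categories = [categorize(value) for value in sample]
print(categories)
```

Keeping the cut points in one list makes it easy to check that the categories are mutually exclusive and cover every value.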

 

This assignment requires you to run the appropriate test, post the syntax used, and interpret your findings. In addition, use post hoc tests if appropriate. Please see the samples below for guidance in writing statistical findings.
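Post hoc tests compare pairs of groups, so the number of comparisons grows quickly with the number of groups. The assignment does not say which post hoc procedure to use; the Python sketch below assumes a Bonferroni adjustment, which is only one common option, and shows how the per-comparison alpha shrinks. The function name is hypothetical.

```python
from itertools import combinations


def bonferroni_alpha(n_groups, alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison alpha)."""
    n_pairs = len(list(combinations(range(n_groups), 2)))
    return n_pairs, alpha / n_pairs


# Five groups, as in the five cigarettes-per-day categories used in the examples.
n_pairs, adjusted_alpha = bonferroni_alpha(5)
print(n_pairs, adjusted_alpha)
```

With five groups there are 10 pairwise comparisons, so under this assumption each comparison would be judged against .05 / 10 rather than .05.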

 

 

Example Writeup:

 

· Example of how to write results for ANOVA:

· When examining the association between current number of cigarettes smoked (quantitative response) and past year nicotine dependence (categorical explanatory), an Analysis of Variance (ANOVA) revealed that among daily, young adult smokers (my sample), those with nicotine dependence reported smoking significantly more cigarettes per day (Mean=14.6, s.d. ±9.15) compared to those without nicotine dependence (Mean=11.4, s.d. ±7.43), F(1, 1313)=44.68, p=.0001.

· Post hoc ANOVA results: ANOVA revealed that among daily, young adult smokers (my sample), number of cigarettes smoked per day (collapsed into 5 ordered categories, which is the categorical explanatory variable) and number of nicotine dependence symptoms (quantitative response variable) were significantly associated, F (4, 1308)=11.79, p=.0001. Post hoc comparisons of mean number of nicotine dependence symptoms by pairs of cigarettes per day categories revealed that those individuals smoking more than 10 cigarettes per day (i.e. 11 to 15, 16 to 20 and >20) reported significantly more nicotine dependence symptoms compared to those smoking 10 or fewer cigarettes per day (i.e. 1 to 5 and 6 to 10). All other comparisons were statistically similar.

 

· Chi-Square Test of Independence

· When examining the association between lifetime major depression (categorical response) and past year nicotine dependence (categorical explanatory), a chi-square test of independence revealed that among daily, young adult smokers (my sample), those with past year nicotine dependence were more likely to have experienced major depression in their lifetime (36.2%) compared to those without past year nicotine dependence (12.7%), X2 =88.60, 1 df, p=.0001.

· Post hoc Chi-Square results: A Chi-Square test of independence revealed that among daily, young adult smokers (my sample), number of cigarettes smoked per day (collapsed into 5 ordered categories) and past year nicotine dependence (binary categorical variable) were significantly associated, X2 =45.16, 4 df, p=.0001. Post hoc comparisons of rates of nicotine dependence by pairs of cigarettes per day categories revealed that higher rates of nicotine dependence were seen among those smoking more cigarettes, up to 11 to 15 cigarettes per day. In comparison, prevalence of nicotine dependence was statistically similar among those groups smoking 11 to 15, 16 to 20, and > 20 cigarettes per day.

· Correlation

· Among daily, young adult smokers (my sample), the correlation between number of cigarettes smoked per day (quantitative) and number of nicotine dependence symptoms experienced in the past year (quantitative) was 0.17 (p=.0001), suggesting that only 3% (i.e. 0.17 squared) of the variance in number of current nicotine dependence symptoms can be explained by number of cigarettes smoked per day.
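The correlation write-up above relies on two calculations: Pearson's r and its square. The Python sketch below shows both using only the standard library. The paired scores are hypothetical and exist only to exercise the function; the last line repeats the arithmetic from the write-up (0.17 squared).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores, for illustration only.
cigarettes = [5, 10, 12, 20, 25]
symptoms = [1, 3, 2, 4, 3]
r = pearson_r(cigarettes, symptoms)
print(round(r, 3), round(r ** 2, 3))

# The reported correlation of 0.17 explains about 3% of the variance.
print(round(0.17 ** 2, 2))
```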

 

Sample Submission:

ANOVA

 

In looking at the question assessing whether getting pregnant now would be one of the worst things, we saw no significant difference between men (M = 1.64, SD = .97) and women (M = 1.69, SD = 1.01), F(1, 4425) = 2.10, p = .148, partial eta squared = .000.

 

 

 

 

Univariate Analysis of Variance

 

 

 

Notes

Output Created: 10-APR-2024 14:19:24
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
  Active Dataset: DataSet1
  Filter: H1RP1<5.5 (FILTER)
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 4427
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on all cases with valid data for all variables in the model.
Syntax
  UNIANOVA H1RP1 BY BIO_SEX
    /METHOD=SSTYPE(3)
    /INTERCEPT=INCLUDE
    /PRINT ETASQ DESCRIPTIVE
    /CRITERIA=ALPHA(.05)
    /DESIGN=BIO_SEX.
Resources
  Processor Time: 00:00:00.06
  Elapsed Time: 00:00:00.07

 

 

 

Between-Subjects Factors

                         N
BIOLOGICAL SEX-W1   1    2197
                    2    2230

Descriptive Statistics
Dependent Variable: S8Q1 PREGNANT NOW ONE OF THE WORST-W1

BIOLOGICAL SEX-W1   Mean   Std. Deviation   N
1                   1.64   .965             2197
2                   1.69   1.007            2230
Total               1.67   .986             4427

Tests of Between-Subjects Effects
Dependent Variable: S8Q1 PREGNANT NOW ONE OF THE WORST-W1

Source            Type III Sum of Squares   df     Mean Square   F           Sig.   Partial Eta Squared
Corrected Model   2.040a                    1      2.040         2.097       .148   .000
Intercept         12279.740                 1      12279.740     12621.511   .000   .740
BIO_SEX           2.040                     1      2.040         2.097       .148   .000
Error             4305.178                  4425   .973
Total             16590.000                 4427
Corrected Total   4307.218                  4426

a. R Squared = .000 (Adjusted R Squared = .000)

 

 

 

 

 

 

 

In analyzing the question about getting pregnant now not being so bad, there was a significant difference between men (M = 4.23, SD = .99) and women (M = 4.17, SD = 1.03), where men thought that getting pregnant now would not be as bad as women thought, F(1, 4425) = 4.26, p = .039, partial eta squared = .001.

 

 

 

Univariate Analysis of Variance

 

 

 

Notes

Output Created: 10-APR-2024 14:33:07
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
  Active Dataset: DataSet1
  Filter: H1RP2<5.5 (FILTER)
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 4427
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on all cases with valid data for all variables in the model.
Syntax
  UNIANOVA H1RP2 BY BIO_SEX
    /METHOD=SSTYPE(3)
    /INTERCEPT=INCLUDE
    /PRINT ETASQ DESCRIPTIVE
    /CRITERIA=ALPHA(.05)
    /DESIGN=BIO_SEX.
Resources
  Processor Time: 00:00:00.09
  Elapsed Time: 00:00:00.09

 

 

 

Between-Subjects Factors

                         N
BIOLOGICAL SEX-W1   1    2197
                    2    2230

Descriptive Statistics
Dependent Variable: S8Q2 PREGNANT NOW NOT SO BAD-W1

BIOLOGICAL SEX-W1   Mean   Std. Deviation   N
1                   4.23   .989             2197
2                   4.17   1.027            2230
Total               4.20   1.009            4427

Tests of Between-Subjects Effects
Dependent Variable: S8Q2 PREGNANT NOW NOT SO BAD-W1

Source            Type III Sum of Squares   df     Mean Square   F           Sig.   Partial Eta Squared
Corrected Model   4.334a                    1      4.334         4.259       .039   .001
Intercept         78000.884                 1      78000.884     76647.787   .000   .945
BIO_SEX           4.334                     1      4.334         4.259       .039   .001
Error             4503.116                  4425   1.018
Total             82504.000                 4427
Corrected Total   4507.451                  4426

a. R Squared = .001 (Adjusted R Squared = .001)

 

 

 

 

 

DEMO FROM SEX WORKER SAMPLE

 

 

 

Univariate Analysis of Variance

 

 

 

Notes

Output Created: 15-APR-2024 14:18:08
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav
  Active Dataset: DataSet2
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 63
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on all cases with valid data for all variables in the model.
Syntax
  UNIANOVA firstsexrecoded BY condition
    /METHOD=SSTYPE(3)
    /INTERCEPT=INCLUDE
    /PRINT ETASQ DESCRIPTIVE
    /CRITERIA=ALPHA(.05)
    /DESIGN=condition.
Resources
  Processor Time: 00:00:00.02
  Elapsed Time: 00:00:00.01

 

 

 

[DataSet2] C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav

 

 

 

Between-Subjects Factors

                   Value Label   N
Sample type   1    cohort        30
              2    prost         28

Descriptive Statistics
Dependent Variable: firstsexrecoded

Sample type   Mean     Std. Deviation   N
cohort        1.8667   .86037           30
prost         2.1429   .70523           28
Total         2.0000   .79472           58

Tests of Between-Subjects Effects
Dependent Variable: firstsexrecoded

Source            Type III Sum of Squares   df   Mean Square   F         Sig.    Partial Eta Squared
Corrected Model   1.105a                    1    1.105         1.773     .188    .031
Intercept         232.829                   1    232.829       373.645   <.001   .870
condition         1.105                     1    1.105         1.773     .188    .031
Error             34.895                    56   .623
Total             268.000                   58
Corrected Total   36.000                    57

a. R Squared = .031 (Adjusted R Squared = .013)

 

 

 

Looking at the first sexual experience variable, there was no significant difference between the cohort sample (M = 1.87, SD = .86) and the sex worker sample (M = 2.14, SD = .71), F(1, 56) = 1.77, p = .188, partial eta squared = .031.
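As a check on how the F statistic relates to the descriptive statistics, a one-way ANOVA can be recomputed from each group's N, mean, and standard deviation. The Python sketch below does this for the sex worker demo, using the values from the Descriptive Statistics table for firstsexrecoded. It is an illustration of the arithmetic, not a description of what SPSS does internally, and the function name is hypothetical.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA from summary data.

    groups: list of (n, mean, sd) tuples, one per group.
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups variation: group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups variation: recovered from each group's standard deviation.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df": (df_between, df_within),
        "F": f_stat,
    }


# (N, Mean, Std. Deviation) for the cohort and sex worker groups.
result = one_way_anova_from_summary([(30, 1.8667, 0.86037), (28, 2.1429, 0.70523)])
print(result["df"], round(result["F"], 2))
```

The recomputed sums of squares should agree with the condition and Error rows of the Tests of Between-Subjects Effects table to within rounding of the reported means and standard deviations.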

 

 

 

DEMO SEX WORKER CHI SQUARE

 

 

 

Crosstabs

 

 

 

Notes

Output Created: 15-APR-2024 14:33:53
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav
  Active Dataset: DataSet2
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 63
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each table are based on all the cases with valid data in the specified range(s) for all variables in each table.
Syntax
  CROSSTABS
    /TABLES=condition BY sex_o
    /FORMAT=AVALUE TABLES
    /STATISTICS=CHISQ
    /CELLS=COUNT
    /COUNT ROUND CELL.
Resources
  Processor Time: 00:00:00.00
  Elapsed Time: 00:00:00.01
  Dimensions Requested: 2
  Cells Available: 524245

 

 

 

[DataSet2] C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav

 

 

 

Case Processing Summary

                                   Valid           Missing         Total
                                   N    Percent    N    Percent    N    Percent
Sample type * sexual orientation   63   100.0%     0    0.0%       63   100.0%

Sample type * sexual orientation Crosstabulation
Count

              sexual orientation
Sample type   hetero   bi   homo   Total
cohort        30       2    0      32
prost         20       9    2      31
Total         50       11   2      63

Chi-Square Tests

                               Value    df   Asymptotic Significance (2-sided)
Pearson Chi-Square             8.441a   2    .015
Likelihood Ratio               9.588    2    .008
Linear-by-Linear Association   8.058    1    .005
N of Valid Cases               63

a. 2 cells (33.3%) have expected count less than 5. The minimum expected count is .98.

 

 

 

 

In our study we found a greater prevalence of heterosexuality in the cohort sample (94%; 30 of 32) than in the sex worker sample (65%; 20 of 31), chi-square = 8.44, df = 2, p = .015.
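The Pearson chi-square in the sex worker demo can be reproduced by hand from the crosstabulation counts: compute each cell's expected count from the row and column totals, then sum (observed - expected)^2 / expected. The Python sketch below does this with the standard library only. The function name is hypothetical, and the p-value line uses a closed form that holds only when df = 2.

```python
import math


def chi_square_independence(table):
    """Return (chi-square, df, minimum expected count) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df, min_expected


# Rows: cohort, prost. Columns: hetero, bi, homo (counts from the crosstabulation).
counts = [[30, 2, 0], [20, 9, 2]]
chi2, df, min_expected = chi_square_independence(counts)

# For df = 2 only, the upper-tail chi-square probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)
print(round(chi2, 3), df, round(p_value, 3), round(min_expected, 2))
```

The minimum expected count is worth printing alongside the statistic, since the SPSS footnote flags cells with expected counts below 5.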

 

 

 

 

 

 

Univariate Analysis of Variance

 

 

 

Notes

Output Created: 17-APR-2024 14:12:57
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
  Active Dataset: DataSet1
  Filter: H1RP5<5.5 (FILTER)
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 4393
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on all cases with valid data for all variables in the model.
Syntax
  UNIANOVA H1RP5 BY BIO_SEX
    /METHOD=SSTYPE(3)
    /INTERCEPT=INCLUDE
    /PRINT ETASQ DESCRIPTIVE
    /CRITERIA=ALPHA(.05)
    /DESIGN=BIO_SEX.
Resources
  Processor Time: 00:00:00.08
  Elapsed Time: 00:00:00.08

 

 

 

[DataSet1] C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav

 

 

 

Between-Subjects Factors

                         N
BIOLOGICAL SEX-W1   1    2178
                    2    2215

Descriptive Statistics
Dependent Variable: S8Q5 RISK OF PREGNANCY W/O PROTECTION-W1

BIOLOGICAL SEX-W1   Mean   Std. Deviation   N
1                   3.12   .999             2178
2                   3.27   .990             2215
Total               3.20   .997             4393

Tests of Between-Subjects Effects
Dependent Variable: S8Q5 RISK OF PREGNANCY W/O PROTECTION-W1

Source            Type III Sum of Squares   df     Mean Square   F           Sig.    Partial Eta Squared
Corrected Model   24.738a                   1      24.738        25.024      <.001   .006
Intercept         44927.538                 1      44927.538     45447.377   .000    .912
BIO_SEX           24.738                    1      24.738        25.024      <.001   .006
Error             4340.775                  4391   .989
Total             49314.000                 4393
Corrected Total   4365.513                  4392

a. R Squared = .006 (Adjusted R Squared = .005)

 

 

 

In looking at men's and women's attitudes about the risk of unprotected sex, we found that men (M = 3.12, SD = 1.00) were less concerned about the risks than were women (M = 3.27, SD = .99), F(1, 4391) = 25.02, p < .001, partial eta squared = .006.
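Each ANOVA write-up in this submission reports partial eta squared next to F. For a single effect, it can be recovered from the F statistic and its degrees of freedom as (F x df_effect) / (F x df_effect + df_error). The Python sketch below applies that formula to the four F tests reported in the ANOVA tables above; the function name and the dictionary layout are hypothetical.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# (F, df effect, df error) as printed in the Tests of Between-Subjects Effects tables.
reported_tests = {
    "H1RP1": (2.097, 1, 4425),
    "H1RP2": (4.259, 1, 4425),
    "firstsexrecoded": (1.773, 1, 56),
    "H1RP5": (25.024, 1, 4391),
}

effect_sizes = {
    name: round(partial_eta_squared(*values), 3)
    for name, values in reported_tests.items()
}
print(effect_sizes)
```

Seeing the formula makes it clearer why a highly significant F in a sample of several thousand can still correspond to a very small effect size.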

 

DATA FROM APRIL 17TH CLASS SESSION

 

 

Crosstabs

 

 

 

Notes

Output Created: 17-APR-2024 14:20:11
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
  Active Dataset: DataSet1
  Filter: H1RP5<5.5 (FILTER)
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 4393
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each table are based on all the cases with valid data in the specified range(s) for all variables in each table.
Syntax
  CROSSTABS
    /TABLES=BIO_SEX BY H1NM5
    /FORMAT=AVALUE TABLES
    /STATISTICS=CHISQ
    /CELLS=COUNT
    /COUNT ROUND CELL.
Resources
  Processor Time: 00:00:00.08
  Elapsed Time: 00:00:00.09
  Dimensions Requested: 2
  Cells Available: 524245

 

 

 

Case Processing Summary

                                                Valid             Missing         Total
                                                N      Percent    N    Percent    N      Percent
BIOLOGICAL SEX-W1 * S12Q5 BIO MOM DISABLED-W1   4393   100.0%     0    0.0%       4393   100.0%

BIOLOGICAL SEX-W1 * S12Q5 BIO MOM DISABLED-W1 Crosstabulation
Count

                    S12Q5 BIO MOM DISABLED-W1
BIOLOGICAL SEX-W1   0     1    7      8    Total
1                   282   18   1878   0    2178
2                   254   24   1936   1    2215
Total               536   42   3814   1    4393

Chi-Square Tests

                               Value    df   Asymptotic Significance (2-sided)
Pearson Chi-Square             3.890a   3    .274
Likelihood Ratio               4.280    3    .233
Linear-by-Linear Association   1.571    1    .210
N of Valid Cases               4393

a. 2 cells (25.0%) have expected count less than 5. The minimum expected count is .50.

 

 

When looking at men's and women's responses to whether their mom is disabled, we found no significant difference between men (86% reported no disabilities) and women (87% reported no disabilities), chi-square = 3.89, df = 3, p = .274.
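The CROSSTABS syntax above requests counts only (/CELLS=COUNT), so the percentages quoted in the write-up have to be worked out from the counts. The Python sketch below computes row percentages from the crosstabulation. The 86% and 87% figures correspond to response code 7 in the table; what that code means is taken from the write-up, not verified here. The function and variable names are hypothetical.

```python
def row_percentages(row):
    """Convert one row of counts to percentages of the row total."""
    total = sum(row)
    return [100 * count / total for count in row]


response_codes = [0, 1, 7, 8]
# Counts by BIOLOGICAL SEX-W1 (1 and 2) from the crosstabulation.
counts_by_sex = {
    1: [282, 18, 1878, 0],
    2: [254, 24, 1936, 1],
}

percent_by_sex = {sex: row_percentages(row) for sex, row in counts_by_sex.items()}
code_7 = response_codes.index(7)
for sex, percents in percent_by_sex.items():
    print(sex, round(percents[code_7]))
```

Reporting the count and row total next to each percentage lets a reader confirm which cell a figure came from.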

 

 

 

 

Correlations

 

 

 

Notes

Output Created: 17-APR-2024 14:32:21
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
  Active Dataset: DataSet1
  Filter: H1RP5<5.5 (FILTER)
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 4393
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each pair of variables are based on all the cases with valid data for that pair.
Syntax
  CORRELATIONS
    /VARIABLES=H1RP1 H1RP2
    /PRINT=TWOTAIL NOSIG FULL
    /MISSING=PAIRWISE.
Resources
  Processor Time: 00:00:00.14
  Elapsed Time: 00:00:00.08

 

 

 

Correlations

                                                              S8Q1      S8Q2
S8Q1 PREGNANT NOW ONE OF THE WORST-W1   Pearson Correlation   1         -.492**
                                        Sig. (2-tailed)                 <.001
                                        N                     4393      4393
S8Q2 PREGNANT NOW NOT SO BAD-W1         Pearson Correlation   -.492**   1
                                        Sig. (2-tailed)       <.001
                                        N                     4393      4393

**. Correlation is significant at the 0.01 level (2-tailed).

 

 

 

Next, I correlated the "pregnant now would be one of the worst things" variable with the "pregnant now would not be so bad" variable; these variables were highly negatively correlated, r = -.492, p < .001.

 

 

 

 

Correlations

 

 

 

Notes

Output Created: 17-APR-2024 14:37:06
Comments:
Input
  Data: C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
  Active Dataset: DataSet1
  Filter: H1RP5<5.5 (FILTER)
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 4393
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each pair of variables are based on all the cases with valid data for that pair.
Syntax
  CORRELATIONS
    /VARIABLES=H1RP1 H1RP3
    /PRINT=TWOTAIL NOSIG FULL
    /MISSING=PAIRWISE.
Resources
  Processor Time: 00:00:00.08
  Elapsed Time: 00:00:00.13

 

 

 

Correlations

                                                              S8Q1     S8Q3
S8Q1 PREGNANT NOW ONE OF THE WORST-W1   Pearson Correlation   1        .200**
                                        Sig. (2-tailed)                <.001
                                        N                     4393     4393
S8Q3 WILL SUFFER IF HIV POSITIVE-W1     Pearson Correlation   .200**   1
                                        Sig. (2-tailed)       <.001
                                        N                     4393     4393

**. Correlation is significant at the 0.01 level (2-tailed).

 

 

Next, I correlated the "pregnant now would be one of the worst things" variable with the "will suffer if HIV positive" variable; these variables were moderately positively correlated, r = .200, p < .001.
