Determine what the appropriate statistical test is for your main two variables of interest

· Analysis of variance (ANOVA) assesses whether the means of two or more groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means (quantitative variables) of groups (categorical variables). The null hypothesis is that there is no difference in the mean of the quantitative variable across groups (categorical variable), while the alternative is that there is a difference.

· A Chi-Square Test of Independence compares frequencies of one categorical variable for different values of a second categorical variable. The null hypothesis is that the relative proportions of one variable are independent of the second variable; in other words, the proportions of one variable are the same for different values of the second variable. The alternative hypothesis is that the relative proportions of one variable are associated with the second variable. Note: although it is possible to run large Chi-Square tables (e.g. 5 x 5, 4 x 6, etc.), the test is really only interpretable when your response variable has 2 levels (see Graphing decisions flow chart in bivariate graphing chapter).

· A correlation coefficient assesses the degree of linear relationship between two variables. It ranges from +1 to -1. A correlation of +1 means that there is a perfect, positive, linear relationship between the two variables. A correlation of -1 means there is a perfect, negative, linear relationship between the two variables. In both cases, knowing the value of one variable, you can perfectly predict the value of the second. When we square r, it tells us the proportion of the variability in one variable that is described by variation in the second variable (aka RSquare or Coefficient of Determination). Note: Two categorical variables with 3 or more levels each can be used to generate a correlation coefficient if the categories are ordered and the average (i.e. mean) can be interpreted. The scatterplot, on the other hand, will not be useful; in general, the scatterplot is not useful for discrete variables (i.e. those that take on a limited number of values).

· Please note: If you have a quantitative explanatory variable and a categorical response, you will eventually be using logistic regression. For now, categorize your explanatory variable and use a chi-square test as explained above.
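The selection rules above can be summarized as a small lookup. The following Python sketch is only an illustration of that guidance; the function name `choose_test` and its string labels are hypothetical and not part of the assignment or of SPSS.

```python
def choose_test(explanatory: str, response: str) -> str:
    """Suggest a bivariate test from the two variable types.

    Each argument must be 'categorical' or 'quantitative'.
    The mapping mirrors the guidance in the bullets above.
    """
    kinds = {"categorical", "quantitative"}
    if explanatory not in kinds or response not in kinds:
        raise ValueError("variable types must be 'categorical' or 'quantitative'")

    if explanatory == "categorical" and response == "quantitative":
        return "ANOVA"
    if explanatory == "categorical" and response == "categorical":
        return "chi-square test of independence"
    if explanatory == "quantitative" and response == "quantitative":
        return "correlation"
    # Quantitative explanatory with categorical response: logistic regression
    # comes later, so for now categorize the explanatory variable.
    return "chi-square test of independence (after categorizing the explanatory variable)"


if __name__ == "__main__":
    print(choose_test("categorical", "quantitative"))
    print(choose_test("quantitative", "categorical"))
```

For instance, a categorical explanatory variable such as nicotine dependence paired with a quantitative response such as cigarettes smoked per day maps to ANOVA.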

The requirement of this assignment is to: Run the appropriate test, post the syntax used, and interpret your findings. In addition, use post-hoc tests if appropriate. Please see the samples below for guidance in writing statistical findings.

Example Writeup:

· Example of how to write results for ANOVA:

· When examining the association between current number of cigarettes smoked (quantitative response) and past year nicotine dependence (categorical explanatory), an Analysis of Variance (ANOVA) revealed that among daily, young adult smokers (my sample), those with nicotine dependence reported smoking significantly more cigarettes per day (Mean=14.6, s.d. ±9.15) compared to those without nicotine dependence (Mean=11.4, s.d. ±7.43), F(1, 1313)=44.68, p=.0001.

· Post hoc ANOVA results: ANOVA revealed that among daily, young adult smokers (my sample), number of cigarettes smoked per day (collapsed into 5 ordered categories, which is the categorical explanatory variable) and number of nicotine dependence symptoms (quantitative response variable) were significantly associated, F (4, 1308)=11.79, p=.0001. Post hoc comparisons of mean number of nicotine dependence symptoms by pairs of cigarettes per day categories revealed that those individuals smoking more than 10 cigarettes per day (i.e. 11 to 15, 16 to 20 and >20) reported significantly more nicotine dependence symptoms compared to those smoking 10 or fewer cigarettes per day (i.e. 1 to 5 and 6 to 10). All other comparisons were statistically similar.

· Chi-Square Test of Independence

· When examining the association between lifetime major depression (categorical response) and past year nicotine dependence (categorical explanatory), a chi-square test of independence revealed that among daily, young adult smokers (my sample), those with past year nicotine dependence were more likely to have experienced major depression in their lifetime (36.2%) compared to those without past year nicotine dependence (12.7%), χ² = 88.60, 1 df, p = .0001.

· Post hoc Chi-Square results: A chi-square test of independence revealed that among daily, young adult smokers (my sample), number of cigarettes smoked per day (collapsed into 5 ordered categories) and past year nicotine dependence (binary categorical variable) were significantly associated, χ² = 45.16, 4 df, p = .0001. Post hoc comparisons of rates of nicotine dependence by pairs of cigarettes per day categories revealed that higher rates of nicotine dependence were seen among those smoking more cigarettes, up to 11 to 15 cigarettes per day. In comparison, prevalence of nicotine dependence was statistically similar among those groups smoking 11 to 15, 16 to 20, and > 20 cigarettes per day.
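The write-up above does not say which post hoc procedure was used. One common option, assumed here purely for illustration, is to run a separate chi-square test for each pair of categories and judge each against a Bonferroni-adjusted significance level. The sketch below only counts the pairs and computes that adjusted level; the function name `bonferroni_alpha` is hypothetical.

```python
from itertools import combinations


def bonferroni_alpha(n_groups: int, alpha: float = 0.05):
    """Return (number of pairwise comparisons, adjusted alpha per comparison).

    Assumes every pair of groups is compared once and that the
    Bonferroni correction (alpha divided by the number of tests) is used.
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    pairs = list(combinations(range(n_groups), 2))
    return len(pairs), alpha / len(pairs)


if __name__ == "__main__":
    # Five ordered cigarettes-per-day categories, as in the example above.
    n_comparisons, adjusted = bonferroni_alpha(5)
    print(n_comparisons, round(adjusted, 4))
```

With five categories there are 10 pairwise comparisons, so under this assumption each pairwise test would be evaluated at .05 / 10 = .005 rather than .05.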

· Correlation

· Among daily, young adult smokers (my sample), the correlation between number of cigarettes smoked per day (quantitative) and number of nicotine dependence symptoms experienced in the past year (quantitative) was 0.17 (p=.0001), suggesting that only 3% (i.e. 0.17 squared) of the variance in number of current nicotine dependence symptoms can be explained by number of cigarettes smoked per day.
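The step from r to "percent of variance explained" is just squaring. The Python sketch below shows that arithmetic for the reported r = 0.17, together with a plain-Python Pearson correlation for readers who want to see how r itself is computed. The function name `pearson_r` and the small data lists are hypothetical illustrations, not data from the sample.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Reported correlation from the example write-up.
    r_reported = 0.17
    print(round(r_reported ** 2, 4))  # proportion of variance explained

    # Hypothetical, perfectly linear data: r should be +1 and -1.
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
    print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

Squaring 0.17 gives 0.0289, which rounds to the 3% quoted in the write-up.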

Sample Submission:

ANOVA

In looking at the question assessing whether getting pregnant now would be one of the worst things, we saw no difference between men (M = 1.64, SD = .97) and women (M = 1.69, SD = 1.01), F(1, 4425) = 2.10, p = .148, partial eta squared = .000.

Univariate Analysis of Variance

Notes
Output Created	10-APR-2024 14:19:24
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
	Active Dataset	DataSet1
	Filter	H1RP1<5.5 (FILTER)
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	4427
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics are based on all cases with valid data for all variables in the model.
Syntax	UNIANOVA H1RP1 BY BIO_SEX /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=BIO_SEX.
Resources	Processor Time	00:00:00.06
	Elapsed Time	00:00:00.07

Between-Subjects Factors
	N
BIOLOGICAL SEX-W1	1	2197
	2	2230

Descriptive Statistics
Dependent Variable: S8Q1 PREGNANT NOW ONE OF THE WORST-W1
BIOLOGICAL SEX-W1	Mean	Std. Deviation	N
1	1.64	.965	2197
2	1.69	1.007	2230
Total	1.67	.986	4427

Tests of Between-Subjects Effects
Dependent Variable: S8Q1 PREGNANT NOW ONE OF THE WORST-W1
Source	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
Corrected Model	2.040a	1	2.040	2.097	.148	.000
Intercept	12279.740	1	12279.740	12621.511	.000	.740
BIO_SEX	2.040	1	2.040	2.097	.148	.000
Error	4305.178	4425	.973
Total	16590.000	4427
Corrected Total	4307.218	4426
a. R Squared = .000 (Adjusted R Squared = .000)

In analyzing the question about getting pregnant now not being so bad, there was a significant difference between men (M = 4.23, SD = .99) and women (M = 4.17, SD = 1.03), where men thought that getting pregnant now would not be as bad as women thought, F(1, 4425) = 4.26, p = .039, partial eta squared = .001.

Univariate Analysis of Variance

Notes
Output Created	10-APR-2024 14:33:07
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
	Active Dataset	DataSet1
	Filter	H1RP2<5.5 (FILTER)
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	4427
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics are based on all cases with valid data for all variables in the model.
Syntax	UNIANOVA H1RP2 BY BIO_SEX /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=BIO_SEX.
Resources	Processor Time	00:00:00.09
	Elapsed Time	00:00:00.09

Between-Subjects Factors
	N
BIOLOGICAL SEX-W1	1	2197
	2	2230

Descriptive Statistics
Dependent Variable: S8Q2 PREGNANT NOW NOT SO BAD-W1
BIOLOGICAL SEX-W1	Mean	Std. Deviation	N
1	4.23	.989	2197
2	4.17	1.027	2230
Total	4.20	1.009	4427

Tests of Between-Subjects Effects
Dependent Variable: S8Q2 PREGNANT NOW NOT SO BAD-W1
Source	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
Corrected Model	4.334a	1	4.334	4.259	.039	.001
Intercept	78000.884	1	78000.884	76647.787	.000	.945
BIO_SEX	4.334	1	4.334	4.259	.039	.001
Error	4503.116	4425	1.018
Total	82504.000	4427
Corrected Total	4507.451	4426
a. R Squared = .001 (Adjusted R Squared = .001)

DEMO FROM SEX WORKER SAMPLE

Univariate Analysis of Variance

Notes
Output Created	15-APR-2024 14:18:08
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav
	Active Dataset	DataSet2
	Filter	<none>
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	63
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics are based on all cases with valid data for all variables in the model.
Syntax	UNIANOVA firstsexrecoded BY condition /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=condition.
Resources	Processor Time	00:00:00.02
	Elapsed Time	00:00:00.01

[DataSet2] C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav

Between-Subjects Factors
	Value Label	N
Sample type	1	cohort	30
	2	prost	28

Descriptive Statistics
Dependent Variable: firstsexrecoded
Sample type	Mean	Std. Deviation	N
cohort	1.8667	.86037	30
prost	2.1429	.70523	28
Total	2.0000	.79472	58

Tests of Between-Subjects Effects
Dependent Variable: firstsexrecoded
Source	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
Corrected Model	1.105a	1	1.105	1.773	.188	.031
Intercept	232.829	1	232.829	373.645	<.001	.870
condition	1.105	1	1.105	1.773	.188	.031
Error	34.895	56	.623
Total	268.000	58
Corrected Total	36.000	57
a. R Squared = .031 (Adjusted R Squared = .013)

Looking at the first sexual experience variable, there was no significant difference between the cohort sample (M = 1.87, SD = .86) and the sex worker sample (M = 2.14, SD = .71), F(1, 56) = 1.77, p = .188, partial eta squared = .031.
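As a check on how the F statistic relates to the descriptive statistics, the following Python sketch recomputes a one-way ANOVA from the group means, standard deviations, and Ns printed in the SPSS output above. It is a minimal illustration, not the SPSS procedure itself; the function name `anova_from_summary` is hypothetical.

```python
def anova_from_summary(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (mean, standard deviation, n) tuples, one per group.
    Returns (F, df_between, df_within, eta_squared).
    """
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / total_n

    # Between-groups: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
    # Within-groups: pooled spread inside each group.
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # With a single factor, eta squared equals partial eta squared.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


if __name__ == "__main__":
    # (mean, SD, N) for the cohort and sex worker samples, from the table above.
    f_stat, df1, df2, eta = anova_from_summary(
        [(1.8667, 0.86037, 30), (2.1429, 0.70523, 28)]
    )
    print(round(f_stat, 2), df1, df2, round(eta, 3))
```

The result agrees with the Tests of Between-Subjects Effects table to rounding: F(1, 56) of about 1.77 and an effect size of about .031.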

DEMO SEX WORKER CHI SQUARE

Crosstabs

Notes
Output Created	15-APR-2024 14:33:53
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav
	Active Dataset	DataSet2
	Filter	<none>
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	63
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics for each table are based on all the cases with valid data in the specified range(s) for all variables in each table.
Syntax	CROSSTABS /TABLES=condition BY sex_o /FORMAT=AVALUE TABLES /STATISTICS=CHISQ /CELLS=COUNT /COUNT ROUND CELL.
Resources	Processor Time	00:00:00.00
	Elapsed Time	00:00:00.01
	Dimensions Requested	2
	Cells Available	524245

[DataSet2] C:UsersjeegshDesktopLehman – Stats – Spring 2024Prostitution – class data.sav

Case Processing Summary
	Cases
	Valid	Missing	Total
	N	Percent	N	Percent	N	Percent
Sample type * sexual orientation	63	100.0%	0	0.0%	63	100.0%

Sample type * sexual orientation Crosstabulation
Count
	sexual orientation	Total
	hetero	bi	homo
Sample type	cohort	30	2	0	32
	prost	20	9	2	31
Total	50	11	2	63

Chi-Square Tests
	Value	df	Asymptotic Significance (2-sided)
Pearson Chi-Square	8.441a	2	.015
Likelihood Ratio	9.588	2	.008
Linear-by-Linear Association	8.058	1	.005
N of Valid Cases	63
a. 2 cells (33.3%) have expected count less than 5. The minimum expected count is .98.

In our study we found a greater prevalence of heterosexuality in the cohort sample (30 of 32, 94%) than in the sex worker sample (20 of 31, 65%), chi-square = 8.44, df = 2, p = .015.
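To show where the chi-square value comes from, the Python sketch below recomputes the Pearson chi-square from the observed counts in the crosstabulation above. It is an illustrative re-derivation rather than the SPSS algorithm; the function name `pearson_chi_square` is hypothetical. The p-value line uses a closed form that holds only when df = 2, as it does for this 2 x 3 table.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square for a table of observed counts (list of rows).

    Returns (chi_square, degrees_of_freedom, minimum_expected_count).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_square = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df, min_expected


if __name__ == "__main__":
    # Rows: cohort, sex worker sample. Columns: hetero, bi, homo.
    counts = [[30, 2, 0], [20, 9, 2]]
    chi_square, df, min_expected = pearson_chi_square(counts)
    # For df = 2 only, the upper-tail p-value is exactly exp(-chi_square / 2).
    p_value = math.exp(-chi_square / 2)
    print(round(chi_square, 3), df, round(p_value, 3), round(min_expected, 2))
```

The computed statistic, degrees of freedom, p-value, and minimum expected count agree with the Chi-Square Tests table and its footnote to rounding.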

Univariate Analysis of Variance

Notes
Output Created	17-APR-2024 14:12:57
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
	Active Dataset	DataSet1
	Filter	H1RP5<5.5 (FILTER)
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	4393
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics are based on all cases with valid data for all variables in the model.
Syntax	UNIANOVA H1RP5 BY BIO_SEX /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT ETASQ DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=BIO_SEX.
Resources	Processor Time	00:00:00.08
	Elapsed Time	00:00:00.08

[DataSet1] C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav

Between-Subjects Factors
	N
BIOLOGICAL SEX-W1	1	2178
	2	2215

Descriptive Statistics
Dependent Variable: S8Q5 RISK OF PREGNANCY W/O PROTECTION-W1
BIOLOGICAL SEX-W1	Mean	Std. Deviation	N
1	3.12	.999	2178
2	3.27	.990	2215
Total	3.20	.997	4393

Tests of Between-Subjects Effects
Dependent Variable: S8Q5 RISK OF PREGNANCY W/O PROTECTION-W1
Source	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
Corrected Model	24.738a	1	24.738	25.024	<.001	.006
Intercept	44927.538	1	44927.538	45447.377	.000	.912
BIO_SEX	24.738	1	24.738	25.024	<.001	.006
Error	4340.775	4391	.989
Total	49314.000	4393
Corrected Total	4365.513	4392
a. R Squared = .006 (Adjusted R Squared = .005)

In looking at men’s and women’s attitudes about the risk of unprotected sex, we found that men (M = 3.12, SD = 1.00) were less concerned about the risks than were women (M = 3.27, SD = .99), F(1, 4391) = 25.02, p < .001, partial eta squared = .006.

DATA FROM APRIL 17TH CLASS SESSION

Crosstabs

Notes
Output Created	17-APR-2024 14:20:11
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
	Active Dataset	DataSet1
	Filter	H1RP5<5.5 (FILTER)
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	4393
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics for each table are based on all the cases with valid data in the specified range(s) for all variables in each table.
Syntax	CROSSTABS /TABLES=BIO_SEX BY H1NM5 /FORMAT=AVALUE TABLES /STATISTICS=CHISQ /CELLS=COUNT /COUNT ROUND CELL.
Resources	Processor Time	00:00:00.08
	Elapsed Time	00:00:00.09
	Dimensions Requested	2
	Cells Available	524245

Case Processing Summary
	Cases
	Valid	Missing	Total
	N	Percent	N	Percent	N	Percent
BIOLOGICAL SEX-W1 * S12Q5 BIO MOM DISABLED-W1	4393	100.0%	0	0.0%	4393	100.0%

BIOLOGICAL SEX-W1 * S12Q5 BIO MOM DISABLED-W1 Crosstabulation
Count
	S12Q5 BIO MOM DISABLED-W1	Total
	0	1	7	8
BIOLOGICAL SEX-W1	1	282	18	1878	0	2178
	2	254	24	1936	1	2215
Total	536	42	3814	1	4393

Chi-Square Tests
	Value	df	Asymptotic Significance (2-sided)
Pearson Chi-Square	3.890a	3	.274
Likelihood Ratio	4.280	3	.233
Linear-by-Linear Association	1.571	1	.210
N of Valid Cases	4393
a. 2 cells (25.0%) have expected count less than 5. The minimum expected count is .50.

When looking at men’s and women’s responses to whether their biological mother is disabled, we found no difference between men (86% reported no disabilities) and women (87% reported no disabilities), chi-square = 3.89, df = 3, p = .274.

Correlations

Notes
Output Created	17-APR-2024 14:32:21
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
	Active Dataset	DataSet1
	Filter	H1RP5<5.5 (FILTER)
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	4393
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics for each pair of variables are based on all the cases with valid data for that pair.
Syntax	CORRELATIONS /VARIABLES=H1RP1 H1RP2 /PRINT=TWOTAIL NOSIG FULL /MISSING=PAIRWISE.
Resources	Processor Time	00:00:00.14
	Elapsed Time	00:00:00.08

Correlations
	S8Q1 PREGNANT NOW ONE OF THE WORST-W1	S8Q2 PREGNANT NOW NOT SO BAD-W1
S8Q1 PREGNANT NOW ONE OF THE WORST-W1	Pearson Correlation	1	-.492**
	Sig. (2-tailed)		<.001
	N	4393	4393
S8Q2 PREGNANT NOW NOT SO BAD-W1	Pearson Correlation	-.492**	1
	Sig. (2-tailed)	<.001
	N	4393	4393
**. Correlation is significant at the 0.01 level (2-tailed).

The next analysis that I ran correlated the "pregnant now would be one of the worst things" variable with the "pregnant now would not be so bad" variable; these variables were highly negatively correlated, r = -.492, p < .001.

Correlations

Notes
Output Created	17-APR-2024 14:37:06
Comments
Input	Data	C:UsersjeegshDesktopLehman – Stats – Spring 2024Public dataAdd_Health_Wave_I.sav
	Active Dataset	DataSet1
	Filter	H1RP5<5.5 (FILTER)
	Weight	<none>
	Split File	<none>
	N of Rows in Working Data File	4393
Missing Value Handling	Definition of Missing	User-defined missing values are treated as missing.
	Cases Used	Statistics for each pair of variables are based on all the cases with valid data for that pair.
Syntax	CORRELATIONS /VARIABLES=H1RP1 H1RP3 /PRINT=TWOTAIL NOSIG FULL /MISSING=PAIRWISE.
Resources	Processor Time	00:00:00.08
	Elapsed Time	00:00:00.13

Correlations
	S8Q1 PREGNANT NOW ONE OF THE WORST-W1	S8Q3 WILL SUFFER IF HIV POSITIVE-W1
S8Q1 PREGNANT NOW ONE OF THE WORST-W1	Pearson Correlation	1	.200**
	Sig. (2-tailed)		<.001
	N	4393	4393
S8Q3 WILL SUFFER IF HIV POSITIVE-W1	Pearson Correlation	.200**	1
	Sig. (2-tailed)	<.001
	N	4393	4393
**. Correlation is significant at the 0.01 level (2-tailed).

The next analysis that I ran correlated the "pregnant now would be one of the worst things" variable with the "will suffer if HIV positive" variable; these variables were moderately positively correlated, r = .200, p < .001.
