Experimental Data Analysis Practice and Assignment
We will use Stata 15 as the analysis software for this course.
You can find Stata 15 under Programes instal·lats => Matemàtiques.
We will open the main program and a new do-file editor to log the code of our analyses.
We start by importing into Stata the Excel file independence.xls, which you can download from the course's Aula Virtual to the remote desktop.
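The import can also be scripted in the do-file. This is a minimal sketch, assuming the file sits in Stata's current working directory and that the first row of the sheet holds the variable names (neither is stated in this handout). The log file name practice.log is a hypothetical choice.

```stata
* Keep a text log of the session (file name is illustrative)
log using practice.log, replace text

* Import the Excel sheet: firstrow reads variable names from row 1,
* clear drops any data already in memory
import excel "independence.xls", firstrow clear

* List the imported variable names and storage types
describe
```

Variable names in Stata are case-sensitive, so later commands should use the exact spelling that describe reports. The log can be closed with log close at the end of the session.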
Independence Test
Statistics => Summaries, tables and tests => Frequency tables => Two-way table with measures of association.
Select the variables to test: Aveart1 in rows and Aveart4 in columns, and select Pearson chi-squared as the statistic. Alternatively, type: tab Aveart1 Aveart4, chi2
H0: the variables are independent. P-value = 0.233. Result: H0 is not rejected.
We can repeat the test for the other pairs of variables: avpri, avdif, avloc, ...
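Repeating the test for every pair can be done in a loop. This is a sketch that assumes the remaining variables follow the same naming pattern as Aveart1 and Aveart4 (a stem followed by t1 or t4); the exact names and capitalization should be checked with describe first.

```stata
* Pearson chi-squared test of independence for each T1/T4 pair
* (variable stems are assumed from the names used in this handout)
foreach stem in Avear Avdif Avloc Avpri {
    tabulate `stem't1 `stem't4, chi2
}
```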
Normality Test
We can also check whether a variable is normally distributed.
Graphics => Histogram: select Aveart1 as the main variable, then select the normal density plot in the Density plots tab. We could also add a kernel density plot.
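The same plot can be produced from the command line. This is a one-line sketch, assuming the variable is stored as Aveart1 with this capitalization:

```stata
* Histogram of Aveart1 with a fitted normal curve and a kernel density overlay
histogram Aveart1, normal kdensity
```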
Statistics => Summaries, tables and tests => Distributional Plots and tests => Skewness and kurtosis or Shapiro-Wilk: Select aveart1, avdift1, avloct1 and avprit1.
H0: the variable is normally distributed. Result: H0 is not rejected.
Repeat the normality tests with aveart2, avdift2, avloct2 and avprit2.
H0: the variables are normally distributed. Result: H0 is not rejected.
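The two menu entries correspond to the commands sktest and swilk. This is a sketch, assuming the variables are spelled in lowercase as written above; if the imported names are capitalized (for example Aveart1), the spelling must be adjusted because Stata is case-sensitive.

```stata
* Skewness-kurtosis test: H0 is that the variable is normally distributed
sktest aveart1 avdift1 avloct1 avprit1

* Shapiro-Wilk test on the same variables
swilk aveart1 avdift1 avloct1 avprit1

* Repeat for the second set of variables
sktest aveart2 avdift2 avloct2 avprit2
swilk aveart2 avdift2 avloct2 avprit2
```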
Since normality is not rejected, we can use parametric statistics with these variables.
For the other variables, most normality tests are rejected (the exceptions are avpr2t1 and avpr3t4). Therefore, for these price variables at different degrees of differentiation, we will use non-parametric statistics.
Comparing means under normality
Statistics => Summaries, tables and tests => Classical tests of hypotheses => t-test
There are several useful options: one-sample, two-sample using groups, two-sample using variables, and paired.
Given the structure of our data, we choose "two-sample using variables" and select Aveart1 and Aveart4.
With the two-sided test we cannot reject the null hypothesis of equal means. If we had an ex ante hypothesis of greater earnings in T1, we could reject it using the one-tailed test result.
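The dialog choice "two-sample using variables" corresponds to the following command. This is a sketch that assumes equal variances, which is the default when no further option is given.

```stata
* Two-sample t test using variables (unpaired, equal variances assumed)
ttest Aveart1 == Aveart4, unpaired
```

Stata reports the two-sided p-value (Ha: diff != 0) alongside both one-sided p-values (Ha: diff < 0 and Ha: diff > 0), with diff defined as mean(Aveart1) - mean(Aveart4). The hypothesis of greater earnings in T1 corresponds to Ha: diff > 0.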
The t-test for two independent samples
Assumptions to check beforehand: normality, independence, and whether the variances are equal or different.
Variables to test: Avdift1 vs Avdift4, Avloct1 vs Avloct4 and Avprit1 vs Avprit4.
Result: averages are significantly higher in T1 than in T4 for prices, with no significant differences for locations and differentiation.
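Checking the variance assumption and running the remaining contrasts could look like this. It is a sketch only: using sdtest to choose between the equal-variance and unequal-variance versions is one common approach and is not prescribed by this handout.

```stata
* Variance-ratio test, to decide between equal and unequal variances
sdtest Avprit1 == Avprit4

* t test assuming equal variances
ttest Avprit1 == Avprit4, unpaired

* t test allowing unequal variances (an option if sdtest rejects equality)
ttest Avprit1 == Avprit4, unpaired unequal

* Remaining contrasts
ttest Avdift1 == Avdift4, unpaired
ttest Avloct1 == Avloct4, unpaired
```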
IMP! Alternative data configuration
Normally we do not have the data organized in different variable names (columns) for different treatments. Instead, it is standard to have one variable name and then an auxiliary variable indicating the treatment to which each datapoint (row) belongs.
File => import => excel => comparisons_independent_data.xls.
T-test using groups: Avear, Avdiff, Avloc and Avpri with Treatment as grouping variable. Same results!
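With the data in this long layout, the test uses the by() option instead of two variable names. This is a sketch, assuming the file is in the working directory, the first row holds the variable names, and Treatment takes exactly two values (ttest with by() requires two groups).

```stata
* Load the long-format data (one outcome column plus a treatment column)
import excel "comparisons_independent_data.xls", firstrow clear

* Two-sample t test using groups, once per outcome variable
foreach v in Avear Avdiff Avloc Avpri {
    ttest `v', by(Treatment)
}
```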
Comparison of medians without assuming normality
To test for differences in prices at each level of differentiation we will use non-parametric statistics, because normality was rejected for these variables.
We will use the Mann-Whitney U test, also known as the Wilcoxon rank-sum test.
If the samples were paired, we would use the Wilcoxon matched-pairs signed-rank test.
Mann-Whitney U test for 2 samples
Statistics => Summaries, tables and tests => nonparametric tests => Wilcoxon rank-sum test.
Test prices at each level of differentiation separately, starting with Avpri0, with Treadi0 as the grouping variable.
Repeat with Avpri1, Avpri2 and Avpri3, changing the grouping variable accordingly each time!
With differentiation level 1, MEDIAN prices are significantly greater in treatment 1 than in treatment 4; there are no significant differences at the other differentiation levels.
To inspect the medians by treatment: by Treatdi1, sort : summarize Avpri1, detail
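The four rank-sum tests can be run in one loop. This is a sketch that assumes the grouping variables are named Treatdi0 to Treatdi3, following the spelling Treatdi1 used in the summarize command above. The handout also writes Treadi0, so the names should be checked against the dataset.

```stata
* Wilcoxon rank-sum (Mann-Whitney) test at each differentiation level;
* the grouping variable changes with the level
forvalues k = 0/3 {
    ranksum Avpri`k', by(Treatdi`k')
}
```

For paired samples the corresponding command is signrank, written as signrank x = y for two hypothetical paired variables x and y.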
Correlation Analysis
Correlation analysis is useful, especially before running a regression: Pearson (assumes normality) or Spearman (non-parametric).
Statistics => Summaries, tables and tests => Summary and descriptive statistics => Pairwise Correlations
Statistics => Summaries, tables and tests => nonparametric tests => Spearman Correlation.
Correlate Avear, Avdiff, Avpri using both methods.
We can display the number of observations (N) and the significance level, and apply the Bonferroni correction!
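Both correlation menus map to short commands. This is a sketch, assuming the long-format dataset with Avear, Avdiff and Avpri is in memory:

```stata
* Pearson pairwise correlations with N, p-values and Bonferroni adjustment
pwcorr Avear Avdiff Avpri, obs sig bonferroni

* Spearman rank correlations with the same information
spearman Avear Avdiff Avpri, stats(rho obs p) bonferroni
```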
Result: Average Price and Average Earnings are highly and significantly correlated.
As a consequence we would not like to use both as explanatory variables in a regression, to avoid multicollinearity problems.
But we could use for instance prices to explain earnings in a regression:
regress Avear Avpri Avdiff Avloc
We could also separate our analysis by treatment.
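Separating the analysis by treatment could look like this. It is a sketch that assumes the long-format dataset with a Treatment variable is in memory. The estat vif step is an added collinearity check and is not part of the handout.

```stata
* Pooled regression, followed by variance inflation factors
regress Avear Avpri Avdiff Avloc
estat vif

* The same regression estimated separately for each treatment
bysort Treatment: regress Avear Avpri Avdiff Avloc
```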
You should use Stata 15 to perform the kinds of analyses that we have seen in this practice session and upload the results as a PDF via the Aula Virtual link. Please also upload the .do file.
Use the following dataset, which you can download from the Aula Virtual: File => import => excel => data intelligence and risk aversion test.xls.
The maximum grade will be 2.5 points out of the total 10 points of the course.