http://www3.nd.edu/~tweninge/data/reddit_post_manipulation_data.csv
http://www3.nd.edu/~tweninge/data/reddit_report.Rmd
These are the statistics generated from the Reddit post manipulation data linked above.
Number of posts, N=
## [1] 93019
Control:
## [1] 31225
Positive Treatment:
## [1] 30998
Negative Treatment:
## [1] 30796
All scores:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     -67       1       2      30       8    4350
Control Scores:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     -67       1       2      29       8    4280
Positive Treatment Scores (without adjustment):
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     -59       1       2      32       9    4350
Negative Treatment Scores (without adjustment):
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     -26       0       1      28       7    3420
Boxplot Comparison:
Boxplot Comparison (log):
## [1] "Skewness (log) by Treatment:"
## [1] "Control = 1.14299655650167"
## [1] "Positive = 1.08944528778635"
## [1] "Negative = 1.09341028291602"
## [1] "Kurtosis (log) by Treatment:"
## [1] "Control = 3.97168065319083"
## [1] "Positive = 3.94333809954101"
## [1] "Negative = 3.88151202033085"
## [1] "Boxplot Comparison (log) of Scores by Treatment"
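The skewness and kurtosis figures above can be reproduced in spirit with moment-based estimators. The following is a minimal sketch, not the report's own code: it assumes the non-excess (Pearson) definition of kurtosis, in which a normal distribution scores 3, and it uses a simulated vector named `scores` as a hypothetical stand-in for one group's positive scores. How the report treated zero or negative scores before taking logs is not shown, so the sketch avoids them.

```r
# Moment-based skewness and non-excess kurtosis in base R.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

set.seed(1)
# Hypothetical heavy-tailed, strictly positive scores (not the Reddit data).
scores <- ceiling(rlnorm(10000, meanlog = 1, sdlog = 1.5))
log_scores <- log(scores)

cat("Skewness (raw):", skewness(scores), "\n")
cat("Skewness (log):", skewness(log_scores), "\n")
cat("Kurtosis (log):", kurtosis(log_scores), "\n")
```

The log transform pulls in the right tail, which is why the log-scale skewness is far smaller than the raw-scale value.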
Line graph Comparison:
Outlier Elimination:
## [1] "Skewness for all scores = 11.2295865064024"
## [1] "Kurtosis for all scores = 149.824273669107"
## [1] "Skewness for all scores (with outliers eliminated) = 6.4830125926408"
## [1] "Kurtosis for all scores (with outliers eliminated) = 54.9451510876046"
Shapiro Test Control:
##
## Shapiro-Wilk normality test
##
## data: sample(nothingScore, 5000)
## W = 0.1511, p-value < 2.2e-16
Shapiro Test Positive Treatment:
##
## Shapiro-Wilk normality test
##
## data: sample(upvoteScore, 5000)
## W = 0.1693, p-value < 2.2e-16
Shapiro Test Negative Treatment:
##
## Shapiro-Wilk normality test
##
## data: sample(downvoteScore, 5000)
## W = 0.153, p-value < 2.2e-16
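The `sample(..., 5000)` in each call above is there because R's `shapiro.test()` accepts at most 5000 observations. A minimal sketch of the same pattern follows; the variable name `nothingScore` is taken from the output, but the values here are simulated and purely hypothetical.

```r
set.seed(42)
# Hypothetical stand-in for the control scores (simulated, not the Reddit data).
nothingScore <- round(rexp(31225, rate = 1 / 29))

# shapiro.test() requires between 3 and 5000 observations, hence the subsample.
res <- shapiro.test(sample(nothingScore, 5000))
print(res)
```

A W statistic far below 1 together with a very small p-value, as in the report, indicates a strong departure from normality.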
KS Test Control:
##
## One-sample Kolmogorov-Smirnov test
##
## data: nothingScore
## D = 0.6226, p-value < 2.2e-16
## alternative hypothesis: two-sided
KS Test Positive Treatment:
##
## One-sample Kolmogorov-Smirnov test
##
## data: upvoteScore
## D = 0.6523, p-value < 2.2e-16
## alternative hypothesis: two-sided
KS Test Negative Treatment:
##
## One-sample Kolmogorov-Smirnov test
##
## data: downvoteScore
## D = 0.5128, p-value < 2.2e-16
## alternative hypothesis: two-sided
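A one-sample Kolmogorov-Smirnov test compares the empirical distribution of a sample with a fully specified reference distribution. The report does not show which reference distribution it passed to `ks.test()`, so the sketch below assumes a normal distribution with the sample's own mean and standard deviation, and it runs on simulated data.

```r
set.seed(7)
# Hypothetical right-skewed scores (simulated, not the Reddit data).
x <- rexp(2000, rate = 1 / 29)

# Assumption: the reference is a normal with the sample's mean and sd.
res <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
print(res)
```

Estimating the reference parameters from the same sample makes the nominal p-value approximate, which matters little when D is as large as the values reported above.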
Fit to distribution:
## [1] "Normal:"
## [1] "Control"
## [1] "AIC: 325792.703578005"
## [1] "BIC: 325808.907844949"
## [1] "LL: -162894.351789002"
## [1] "Positive"
## [1] "AIC: 337728.745256123"
## [1] "BIC: 337745.009527972"
## [1] "LL: -168862.372628062"
## [1] "Negative"
## [1] "AIC: 278863.29341751"
## [1] "BIC: 278879.167068734"
## [1] "LL: -139429.646708755"
## [1] "Exponential:"
## [1] "Control"
## [1] "AIC: 226366.207162895"
## [1] "BIC: 226374.309296367"
## [1] "LL: -113182.103581447"
## [1] "Positive"
## [1] "AIC: 236484.318007565"
## [1] "BIC: 236492.450143489"
## [1] "LL: -118241.159003782"
## [1] "Negative"
## [1] "AIC: 196098.068459806"
## [1] "BIC: 196106.005285419"
## [1] "LL: -98048.0342299032"
## [1] "Pareto:"
## [1] "Control"
## [1] "AIC: 69915.6180434678"
## [1] "BIC: 69923.7201769401"
## [1] "LL: -34956.8090217339"
## [1] "Positive"
## [1] "AIC: 76304.3996584613"
## [1] "BIC: 76312.5317943858"
## [1] "LL: -38151.1998292307"
## [1] "Negative"
## [1] "AIC: 61716.7897895901"
## [1] "BIC: 61724.7266152022"
## [1] "LL: -30857.394894795"
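The reported values are consistent with AIC = 2k - 2LL, using k = 2 parameters for the normal fit and k = 1 for the exponential and Pareto fits. For example, the Control normal fit gives 2(2) + 2(162894.35) = 325792.70. The sketch below shows one way to compute these criteria by maximum likelihood in base R on simulated data; the helper `ic()` and the vector `x` are hypothetical names, and the report's own fitting code may differ.

```r
set.seed(3)
# Hypothetical positive scores (simulated, not the Reddit data).
x <- rexp(5000, rate = 1 / 30)
n <- length(x)

# Normal: MLE of the mean and sd (sd uses the 1/n variance), k = 2.
mu <- mean(x)
sigma <- sqrt(mean((x - mu)^2))
ll_norm <- sum(dnorm(x, mu, sigma, log = TRUE))

# Exponential: MLE rate = 1 / mean, k = 1.
ll_exp <- sum(dexp(x, rate = 1 / mean(x), log = TRUE))

# Information criteria from a log-likelihood, parameter count, and sample size.
ic <- function(ll, k, n) {
  c(AIC = 2 * k - 2 * ll, BIC = k * log(n) - 2 * ll, LL = ll)
}
fits <- rbind(Normal = ic(ll_norm, 2, n), Exponential = ic(ll_exp, 1, n))
print(fits)
```

Lower AIC and BIC indicate a better trade-off between fit and complexity, which is why the Pareto fit is preferred in the report.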
Fit to Pareto:
##
## Call:
## glm(formula = y ~ log(1:max(i)), family = quasipoisson())
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -5.923 -0.629 -0.423 -0.332 6.939
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.08635 0.00687 1322 <2e-16 ***
## log(1:max(i)) -1.45422 0.00289 -503 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasipoisson family taken to be 0.6669)
##
## Null deviance: 251865.1 on 4284 degrees of freedom
## Residual deviance: 2793.1 on 4283 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 5
##
## Call: glm(formula = y ~ log(1:max(i)), family = quasipoisson())
##
## Coefficients:
## (Intercept) log(1:max(i))
## 9.09 -1.45
##
## Degrees of Freedom: 4284 Total (i.e. Null); 4283 Residual
## Null Deviance: 252000
## Residual Deviance: 2790 AIC: NA
##
## Call:
## glm(formula = y ~ log(1:max(i)), family = quasipoisson())
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -9.674 -0.714 -0.493 -0.390 6.215
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.87178 0.00818 1085 <2e-16 ***
## log(1:max(i)) -1.42186 0.00331 -430 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasipoisson family taken to be 0.7778)
##
## Null deviance: 198956.5 on 3418 degrees of freedom
## Residual deviance: 2687.5 on 3417 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 5
##
## Call: glm(formula = y ~ log(1:max(i)), family = quasipoisson())
##
## Coefficients:
## (Intercept) log(1:max(i))
## 8.87 -1.42
##
## Degrees of Freedom: 3418 Total (i.e. Null); 3417 Residual
## Null Deviance: 199000
## Residual Deviance: 2690 AIC: NA
##
## Call:
## glm(formula = y ~ log(1:max(i)), family = quasipoisson())
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -22.757 -0.707 -0.481 -0.381 11.060
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.05276 0.00757 1196 <2e-16 ***
## log(1:max(i)) -1.41477 0.00299 -474 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasipoisson family taken to be 0.8083)
##
## Null deviance: 251384.9 on 4345 degrees of freedom
## Residual deviance: 3851.8 on 4344 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 5
##
## Call: glm(formula = y ~ log(1:max(i)), family = quasipoisson())
##
## Coefficients:
## (Intercept) log(1:max(i))
## 9.05 -1.41
##
## Degrees of Freedom: 4345 Total (i.e. Null); 4344 Residual
## Null Deviance: 251000
## Residual Deviance: 3850 AIC: NA
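With a log link, the model in the calls above reads log E[y] = a + b log(i), so E[y] = exp(a) * i^b: a power law whose exponent is the slope (about -1.4 in each fit). The sketch below illustrates this on simulated counts. It assumes that `y` holds the number of posts observed at each score value `i`, which the output suggests but does not show.

```r
set.seed(11)
i <- 1:500                                        # hypothetical score values
y <- rpois(length(i), lambda = 8000 * i^(-1.45))  # simulated counts per score

# Quasi-Poisson regression of counts on log(score); the slope estimates the
# power-law exponent and the intercept estimates log(8000).
fit <- glm(y ~ log(i), family = quasipoisson())
print(coef(fit))
```

The quasi-Poisson family estimates a dispersion parameter instead of fixing it at 1, which is also why the summaries above report `AIC: NA`.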
## [1] "KS Test -- Positive Treatment vs Control p= 0"
## [1] "KS Test -- Negative Treatment vs Control p= 0"
## [1] "Sanity Check -- KS Test -- Control vs Control p= 0.99588096530338"
## [1] "Wilcox Test -- Positive Treatment vs Control p= 5.86264455955917e-53"
## [1] "Wilcox Test -- Negative Treatment vs Control p= 7.83462479119778e-73"
## [1] "Sanity Check -- Wilcox Test -- Control vs Control p= 0.621666799285642"
## [1] "T Test -- Log Positive Treatment vs Log Control p= 1.68975176107589e-20"
## [1] "T Test -- Log Negative Treatment vs Log Control p= 1.68557267325307e-09"
## [1] "Sanity Check -- T Test -- Log Control vs Log Control p= 0.506586669634357"
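The three comparisons above (two-sample KS, Wilcoxon rank-sum, and a t test on log scores) can be sketched as follows. The data are simulated, the variable names are hypothetical, and the sanity check is assumed to compare two random halves of the control group; the report does not show how its "Control vs Control" check was built.

```r
set.seed(5)
# Hypothetical positive scores; the treated group is shifted upward.
control <- rlnorm(3000, meanlog = 1.0, sdlog = 1.2)
treated <- rlnorm(3000, meanlog = 1.3, sdlog = 1.2)

p_ks <- ks.test(treated, control)$p.value
p_wx <- wilcox.test(treated, control)$p.value
p_t  <- t.test(log(treated), log(control))$p.value

# Sanity check (assumed design): two random halves of the control group
# are drawn from the same distribution, so no difference is expected.
idx <- sample(length(control), length(control) / 2)
p_sanity <- ks.test(control[idx], control[-idx])$p.value

print(c(KS = p_ks, Wilcox = p_wx, T = p_t, Sanity = p_sanity))
```

A large sanity-check p-value, like those reported above, suggests that the tests are not flagging differences that arise from sampling alone.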
## [1] "KS Test -- Positive Treatment vs Control at 0 sec - D = 0.086927836982747 p = 0"
## [1] "KS Test -- Positive Treatment vs Control at 30 sec - D = 0.0868038249225742 p = 0"
## [1] "KS Test -- Positive Treatment vs Control at 60 sec - D = 0.0875402520359694 p = 0"
## [1] "KS Test -- Positive Treatment vs Control at 300 sec - D = 0.0825614363984992 p = 0"
## [1] "KS Test -- Positive Treatment vs Control at 600 sec - D = 0.0821767185437847 p = 0"
## [1] "KS Test -- Positive Treatment vs Control at 1800 sec - D = 0.0865049398796312 p = 0"
## [1] "KS Test -- Positive Treatment vs Control at 3600 sec - D = 0.0782516860489065 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 0 sec - D = 0.118812991237424 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 30 sec - D = 0.110131960763721 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 60 sec - D = 0.110522258386419 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 300 sec - D = 0.11244828603096 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 600 sec - D = 0.119906905396027 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 1800 sec - D = 0.0969316008492061 p = 0"
## [1] "KS Test -- Negative Treatment vs Control at 3600 sec - D = 0.0992090647728554 p = 0"
## [1] "Wilcox Test -- Positive Treatment vs Control at 0 sec - p = 6.09216002326674e-14"
## [1] "Wilcox Test -- Positive Treatment vs Control at 30 sec - p = 1.36078023236168e-18"
## [1] "Wilcox Test -- Positive Treatment vs Control at 60 sec - p = 6.23268794568268e-18"
## [1] "Wilcox Test -- Positive Treatment vs Control at 300 sec - p = 1.83359295515348e-13"
## [1] "Wilcox Test -- Positive Treatment vs Control at 600 sec - p = 2.18364887990707e-11"
## [1] "Wilcox Test -- Positive Treatment vs Control at 1800 sec - p = 6.22235743880408e-15"
## [1] "Wilcox Test -- Positive Treatment vs Control at 3600 sec - p = 8.55656029924071e-12"
## [1] "Wilcox Test -- Negative Treatment vs Control at 0 sec - p = 3.28974328705637e-22"
## [1] "Wilcox Test -- Negative Treatment vs Control at 30 sec - p = 2.53183805069274e-17"
## [1] "Wilcox Test -- Negative Treatment vs Control at 60 sec - p = 4.52130963522907e-24"
## [1] "Wilcox Test -- Negative Treatment vs Control at 300 sec - p = 2.57964251980208e-21"
## [1] "Wilcox Test -- Negative Treatment vs Control at 600 sec - p = 9.97773954203552e-27"
## [1] "Wilcox Test -- Negative Treatment vs Control at 1800 sec - p = 4.26017895022146e-15"
## [1] "Wilcox Test -- Negative Treatment vs Control at 3600 sec - p = 1.52093006947079e-11"
## [1] "T Test -- Log Positive Treatment vs Log Control at 0 sec - p = 1.26564116427947e-07"
## [1] "T Test -- Log Positive Treatment vs Log Control at 30 sec - p = 1.20667230619452e-06"
## [1] "T Test -- Log Positive Treatment vs Log Control at 60 sec - p = 1.62710683848065e-07"
## [1] "T Test -- Log Positive Treatment vs Log Control at 300 sec - p = 4.37299441131072e-06"
## [1] "T Test -- Log Positive Treatment vs Log Control at 600 sec - p = 0.000561252607330452"
## [1] "T Test -- Log Positive Treatment vs Log Control at 1800 sec - p = 2.52817614638093e-07"
## [1] "T Test -- Log Positive Treatment vs Log Control at 3600 sec - p = 2.77337105484995e-05"
## [1] "T Test -- Log Negative Treatment vs Log Control at 0 sec - p = 0.00515517796907591"
## [1] "T Test -- Log Negative Treatment vs Log Control at 30 sec - p = 0.00115130587497559"
## [1] "T Test -- Log Negative Treatment vs Log Control at 60 sec - p = 0.0768726272685686"
## [1] "T Test -- Log Negative Treatment vs Log Control at 300 sec - p = 0.00180717685086556"
## [1] "T Test -- Log Negative Treatment vs Log Control at 600 sec - p = 0.0414153750128376"
## [1] "T Test -- Log Negative Treatment vs Log Control at 1800 sec - p = 0.0303008389080154"
## [1] "T Test -- Log Negative Treatment vs Log Control at 3600 sec - p = 3.20358247499746e-07"
Summary Statistics:
##      action       subreddit    N  score    sd     se     ci
##  1 Downvote       AskReddit 1150  1.790 12.02 0.3545 0.6956
##  2  Nothing       AskReddit 1131  3.200 25.76 0.7658 1.5026
##  3   Upvote       AskReddit 1186  5.347 32.00 0.9293 1.8232
##  4 Downvote           funny  878  7.481 46.50 1.5694 3.0803
##  5  Nothing           funny  984  7.842 41.54 1.3244 2.5989
##  6   Upvote           funny  942 10.510 52.68 1.7164 3.3685
##  7 Downvote   AdviceAnimals  651  8.542 37.02 1.4510 2.8492
##  8  Nothing   AdviceAnimals  734  9.811 44.45 1.6406 3.2208
##  9   Upvote   AdviceAnimals  698 13.910 58.55 2.2163 4.3514
## 10 Downvote            pics  550  6.055 31.80 1.3558 2.6632
## 11  Nothing            pics  561 10.282 49.71 2.0987 4.1223
## 12   Upvote            pics  545  9.251 35.08 1.5025 2.9514
## 13 Downvote leagueoflegends  554  3.859 42.35 1.7993 3.5342
## 14  Nothing leagueoflegends  517  4.965 44.06 1.9377 3.8067
## 15   Upvote leagueoflegends  531  6.414 40.46 1.7560 3.4496
## 16 Downvote          gaming  406 11.852 60.31 2.9932 5.8841
## 17  Nothing          gaming  405 14.146 59.01 2.9325 5.7648
## 18   Upvote          gaming  403 14.538 67.60 3.3672 6.6195
## 19 Downvote          videos  424  4.349 34.11 1.6567 3.2564
## 20  Nothing          videos  407  5.902 36.12 1.7905 3.5199
## 21   Upvote          videos  360 12.522 60.45 3.1862 6.2660
## 22 Downvote             aww  323 16.653 54.66 3.0415 5.9837
## 23  Nothing             aww  305 24.492 74.60 4.2718 8.4061
## 24   Upvote             aww  304 20.178 53.93 3.0931 6.0867
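The columns in this table are consistent with se = sd / sqrt(N) and ci = se * qt(0.975, N - 1), the half-width of a 95% confidence interval. For row 1, 12.02 / sqrt(1150) is approximately 0.3545. The sketch below computes the same columns per group in base R; the data frame `d` and the helper `summarise_group()` are hypothetical, and the report may have used a different helper.

```r
set.seed(9)
# Hypothetical data in the shape implied by the table's columns.
d <- data.frame(
  action    = rep(c("Downvote", "Nothing", "Upvote"), each = 200),
  subreddit = rep(c("AskReddit", "funny"), times = 300),
  score     = round(rexp(600, rate = 1 / 8))
)

# Per-group count, mean, sd, standard error, and CI half-width.
summarise_group <- function(x, conf = 0.95) {
  n  <- length(x)
  se <- sd(x) / sqrt(n)
  c(N = n, score = mean(x), sd = sd(x), se = se,
    ci = se * qt(1 - (1 - conf) / 2, df = n - 1))
}

out <- aggregate(score ~ action + subreddit, data = d, FUN = summarise_group)
out <- do.call(data.frame, out)   # flatten the matrix column from aggregate()
names(out) <- sub("^score\\.", "", names(out))
print(out)
```

With standard deviations this large relative to the means, many of the per-subreddit intervals (score plus or minus ci) overlap across actions.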