1. Please explain what motivates your interest in advocacy on behalf of farm animals and how you have pursued this interest in volunteer activities or other actions.

As a young researcher focused on biological and data sciences, I want to use my skills in data analysis to advance an urgent and underappreciated cause: animal advocacy. My commitment to veganism began when I went vegetarian my freshman year of college, the first time in my life I had given any real thought to the food I put on my plate. Though I made the change for environmental reasons, it started a larger shift in my worldview: while I had always been unsettled by animal cruelty, I now began to think critically about food itself.

Within a handful of months, I was vegan. I have come to view animal rights as an intersectional issue. As more animals suffer on factory farms, the industry consolidates into a handful of large companies that, in turn, have the power to influence policy. This industrial complex has produced not only more animal suffering but also dangerous working conditions, pollution, public health crises, and food insecurity, all of which disproportionately impact poor communities. I believe that effective animal advocacy is not separate from these issues; addressing them in tandem builds a strong foundation for sincere advocacy.

I have only begun my advocacy work, but I believe I am poised to do more. As an undergraduate, I conducted independent research on animal agriculture; as I learned skills in data analysis and hypothesis testing, I turned them toward animal advocacy. I have highlighted this work on my website, including the finding that industrial animal agriculture disproportionately pollutes poor communities. I also joined a housing co-op that cooks communal vegan meals nightly, and I have found that in my social circles, the simple act of cooking a vegan meal with friends can go a very long way. By combining my professional aspirations as a data scientist with my vegan convictions and lifestyle, I can help advance this important cause.

2. A certain packaged food product was advertised in four different but very similar regions. The amounts spent on advertising the product through TV ads in these regions were $2K, $5K, $6K and $8K. The amounts spent on advertising the same product during the same period through radio ads in these same regions were $3K, $4K, $9K and $12K, respectively. During this period, the number of units of the product purchased in these same regions were 271, 440, 635 and 787, respectively. Assume there is no overlap between the people reached through the TV ads and those reached through the radio ads. What is your best estimate of the number of units of the product purchased in a certain fifth and very similar region where the amounts spent on advertising it were $10K on TV and $15K on radio? State your assumptions and explain briefly how you obtained your answer.

I estimate that approximately 948 units (point estimate 948.21) will be purchased in the fifth region.

To reach this estimate, I assume that the relationships between advertising spending (for both TV and radio) and product purchases are linear. I make this assumption because there are not enough data to reliably model alternative relationships. Radio yields the most successful linear model, fitting the data well (R^2 = 0.9545). Using a linear model also assumes the market has not reached advertising saturation, so that each advertising dollar grants the product additional exposure. Lastly, I assume there is no interaction between TV and radio spending.

Under these assumptions, I fit linear regressions predicting sales from TV spending alone, from radio spending alone, and from both together (a multiple regression). The best-fitting model was used for the final estimate.

It seems that the linear model for radio alone is the strongest predictor of sales, with the greatest statistical significance (p = 0.023) and the lowest standard error on the slope (8.01). Using radio advertising spending alone to estimate product sales gives an expected 948.21 units. As it is unlikely that TV advertising has zero impact on sales, this estimate is almost certainly imperfect; with more data, a model capable of reliably estimating the effects of both variables could provide a more accurate sales estimate.

library(bbmle)  # provides AICtab for model comparison
## Loading required package: stats4
# Make vectors from data
dollarsTV <- c(2,5,6,8)
dollarsRadio <- c(3,4,9,12)
purchasesCount <- c(271,440,635,787)

# Make linear models between spending and purchases
m.radio <- lm(purchasesCount ~ dollarsRadio)  # radio
m.TV <- lm(purchasesCount ~ dollarsTV)        # TV
m.both <- lm(purchasesCount ~ dollarsTV + dollarsRadio)  # both

# Create statistical summaries of these relationships
stats.radio <- summary(m.radio)
stats.TV <- summary(m.TV)
stats.both <- summary(m.both)

# Lowest standard error - radio
radioError <- stats.radio$coefficients[2,2]
radioError
## [1] 8.00685
# Based on p-values, R2, and standard error, radio alone is the best fit.

# Compare models to estimate weights
m.compare <- AICtab(m.both, m.TV, m.radio, weights = TRUE)

# Estimate sales based on $10K TV and $15K radio spending
# y = mx + b, x = 15
estimate <- coef(m.radio)[2]*15 + coef(m.radio)[1]
estimate
## dollarsRadio 
##      948.213
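
The slope's standard error alone understates the uncertainty of a forecast for a new region. As a sanity check using the same fitted model, R's predict() returns the identical point estimate along with a 95% prediction interval, which is very wide given only four observations:

# Same point estimate, plus a 95% prediction interval for a new region
# (roughly 553 to 1343 units; wide because the model is fit to n = 4)
predict(m.radio, newdata = data.frame(dollarsRadio = 15),
        interval = "prediction")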

4. A large grocery store chain has bragged that more than 20% of its customers regularly bring their own tote bags for grocery purchases. Your hypothesis is that the grocery store’s claim is correct. You decide to verify the claim but, unfortunately, you have only enough time to conduct a very small experiment. You interview four customers, chosen randomly, and you find that one of these four people regularly brings her own tote bags. You want to conclude that the result of your experiment is consistent with your hypothesis since one out of four is better than 20%. What is the p value of the outcome of your experiment? Explain briefly how you arrived at your answer.

The p-value is 0.401.

I tested the hypothesis that more than 20% of customers regularly bring their own tote bags using an upper-tail test of a population proportion. From the sample size (4), sample proportion (0.25), and hypothesized value (0.2), I calculated a Z test statistic, which I then used to compute the upper-tail p-value.

# Set values
n <- 4        #sample size
pbar <- 1/n   # sample proportion 
p0 <- .2       # hypothesized value 

# Get Z statistic, then p-value
z <- (pbar-p0)/sqrt(p0*(1-p0)/n) # test statistic
pvalue <- pnorm(z, lower.tail=FALSE)  #pvalue
pvalue
## [1] 0.4012937
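
With only four observations, the normal approximation behind the Z statistic is rough. As a cross-check (not part of the calculation above), an exact binomial test of the same one-sided hypothesis gives a considerably larger p-value:

# Exact binomial cross-check: P(X >= 1) for X ~ Binomial(4, 0.2)
binom.test(x = 1, n = 4, p = 0.2, alternative = "greater")$p.value
## [1] 0.5904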

5. Give two specific examples of something you choose to believe (but which many people do not believe) because of evidence-based reasoning. Alternately, these can also be examples of something you choose to not believe (but which many people do believe) because of evidence-based reasoning.

I believe that the growing human population is not a meaningful concern in addressing climate change, because the populations that are growing are not the ones driving emissions. While population growth is associated with an increase in global carbon emissions, energy sources are becoming more renewable, and innovations in agricultural science allow more people to be fed than ever before. Furthermore, the wealthiest percentiles of the human population are responsible for a disproportionate share of CO2 emissions, as shown in “Extreme Carbon Inequality,” a 2015 Oxfam report. Since the poorest 50% of the population contributes about 10% of global CO2 emissions, and since the poorest countries have the most rapidly growing populations, it is hard to credit the increasing population as a larger climate threat than the top 10% of the global population, who contribute roughly 49% of emissions.

Additionally, attempts at population control have historically targeted vulnerable populations. Rather than focusing climate change efforts on castigating poorer populations, we should aim to change the practices and policies of wealthy countries and industries.

A second thing I believe is that, as a general rule, addressing social issues with punitive measures rather than support systems largely perpetuates inequality. The policies behind mass incarceration have sharpened divides of race and wealth in the United States, and the lack of social services for individuals with mental illness leads them to be disproportionately incarcerated. Similar criminalization policies have contributed to the opioid crisis that so strongly impacts Philadelphia. Turning a new leaf, Philadelphia’s Mayor and District Attorney are pursuing the opening of a safe injection facility that will almost certainly save lives and mitigate harm.

6. It has been alleged that a certain coin used in a game is strongly biased in such a way that, when tossed, the probability with which it comes up with heads is 0.8. I decide to run an experiment in which I will toss the coin 10 times. If I get 8 or more heads, I will conclude that the allegation has merit. What is the statistical power of my experiment? Briefly explain how you obtained your answer.

The statistical power of your experiment is 55.47%.

To reach this value, I first determined the significance threshold by calculating a p-value from an upper-tail test of a population proportion, where the null hypothesis is that the coin is fair and 8 of 10 flips come up heads. I then calculated power using this p-value as the significance threshold. The Cohen’s h effect size was computed from the hypothesized proportion of 0.8 and the null proportion of 0.5.

# Set values
n = 10              #sample size
pbar = 8/n          # sample proportion 
p0 = .5             # hypothesized value 
z = (pbar-p0)/sqrt(p0*(1-p0)/n)  # test statistic

pvalue = pnorm(z, lower.tail=FALSE) # alpha value
pvalue
## [1] 0.02888979
library(pwr)

# Calculate power using effect size, significance level, n, and an upper tail hypothesis
test <- pwr.p.test(h = ES.h(p1 = 0.8, p2 = 0.5),
           sig.level = pvalue, 
           n = n,
           alternative = "greater")

test$power
## [1] 0.5547069
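
Because the decision rule is fixed in advance (conclude the allegation has merit on 8 or more heads out of 10), power can also be computed exactly as the probability of that outcome under the alleged bias. This exact binomial calculation, shown here as a cross-check, gives a noticeably higher value than the approximation above:

# Exact power: P(X >= 8) for X ~ Binomial(10, 0.8)
pbinom(7, size = 10, prob = 0.8, lower.tail = FALSE)
## [1] 0.6777995
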
7. You are on your first day of work in a retail store where the store owner claims that, given any customer who visits the store, the probability that the customer will make a purchase is greater than 75%. Since you started working, only one customer has visited the store and you observe that this customer did make a purchase. Given the store owner’s claim and your observation of that one customer, what would you say is the probability that the store owner’s claim is correct? Briefly explain how you obtained your answer and state any assumptions you make.

Using an upper-tail test of a population proportion, with the null hypothesis that any given customer has at most a 75% chance of making a purchase, I calculated a p-value of 0.28 for this observation.

While 0.28 is the probability of seeing this result if the store owner’s claim is not correct, the probability that the claim itself is correct cannot be directly evaluated. Hypothesis testing is limited to determining how probable the observed result would be if the null hypothesis (that there is at most a 75% chance a given customer makes a purchase) were true, so I would say the probability that the store owner’s claim is correct cannot be determined from this experiment alone.

pbar <- 1             # sample proportion 
p0 <- .75             # hypothesized value 
n <- 1                # sample size 
z <- (pbar-p0)/sqrt(p0*(1-p0)/n)  # test statistic 

pvalue <- pnorm(z, lower.tail=FALSE) 
pvalue
## [1] 0.2818514
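
Assigning a probability to the claim itself would require Bayesian reasoning, and therefore a prior over the purchase probability. Purely as an illustration (the uniform prior is my assumption, not something the problem supplies): with a Uniform(0, 1) prior and one observed purchase, the posterior is Beta(2, 1), giving

# P(p > 0.75) under the Beta(2, 1) posterior (uniform prior, 1 success)
1 - pbeta(0.75, shape1 = 2, shape2 = 1)
## [1] 0.4375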

8. As a research associate working for Humane League Labs, what is an example of a research question you would like to answer? Please motivate your interest in this research question and thoroughly describe a study you might use to answer the question.

My independent research and my motivation for farm animal advocacy both rest on a big assumption: that people will care more about animal welfare if the prevalence of factory farming is addressed as an intersectional issue. To examine whether this approach is effective, I would like to test the hypothesis that a leaflet making an intersectional case for animal welfare is more effective at decreasing animal product consumption than no leaflet.

A leaflet would be created that discusses animal agriculture with respect to the inequalities the industry exacerbates: the health of workers, the health of poorer communities, the prevalence of food insecurity, environmental racism, and the increasing hardships of independent farmers.

Participants would be recruited outside of university dining halls. After completing a food frequency questionnaire, each would be assigned either to the treatment group, which receives the leaflet, or to the control group, which does not. Two months later, participants would be contacted through the email address provided in the questionnaire and asked to complete the same questionnaire. Social desirability bias would be mitigated by not reminding participants of the leaflet in the second questionnaire.

A power analysis can determine the sample size needed to detect the smallest effect size of interest, which should be small for exploratory research such as this. I suggest an alpha of 0.05, a small effect size of h = 0.2 (comparable to a Cohen’s d of 0.2), and 90% power. For this two-group comparison, the analysis suggests roughly 429 participants per group, or 858 in total, at follow-up. Using a conservatively estimated 24% email survey response rate, about 3,576 participants should be recruited at intake, 1,788 per group.

This study would be pre-registered, and all data and materials would be made available online. Limitations include the self-reporting and recall biases inherent to the unvalidated food frequency questionnaire, as well as any remaining social desirability bias.

# Two-sample comparison, so pwr.2p.test (n is per group)
test <- pwr.2p.test(h = 0.2,
           sig.level = 0.05, 
           power = 0.90,
           alternative = "greater")

test$n
## [1] 428.1924
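
The intake target then follows from the per-group sample size and the assumed 24% response rate; a quick sketch of that arithmetic:

# Intake needed so that ~429 per group remain after an assumed 24% response rate
perGroup <- ceiling(test$n)   # 429 per group at follow-up
ceiling(perGroup / 0.24)      # per group at intake
## [1] 1788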

9. A researcher is comparing two new interventions using a randomized trial where subjects will be randomly assigned to either of the interventions and a single continuous outcome measured. Based on the expected Cohen’s d effect size, significance level (0.05) and sample size, the researcher anticipates the experiment will have power of 90%. The variance of the two treatments is known to be homogenous based on prior research. After running a t-test, the researcher finds a p-value of 0.14 and concludes the interventions are equally effective and neither intervention should be preferred over the other. How would you assess the researcher’s analysis and conclusions thus far? Are they correct? Do you need additional information from the researcher to make an assessment? Is there additional analysis you would recommend? Please describe your assessment and recommendations in sufficient detail for the researcher to implement them.

I would ask the researcher for the expected effect size used in the power analysis, along with the justification for that estimate. If the expected effect size was larger than the true effect of the difference between the interventions, the study likely had too few subjects to detect a genuine relationship even if one exists. If the research explores previously unstudied relationships, I would hope a small effect size (e.g., d = 0.2) was used in the power calculation. On the other hand, if one treatment must be demonstrated to be a certain degree more effective than the other (e.g., if it is more expensive and a company must show it is financially worthwhile), this should be reflected in a power analysis conducted with a correspondingly larger effect size.

While the p-value of 0.14 does not cross the defined significance threshold, meaning the observed difference could plausibly be due to chance, it does not show that neither intervention is preferable. If one of the two interventions must be carried out, if the expected effect size was justified, if the two interventions have equal costs, and if one showed possibly stronger efficacy, that intervention might as well be the one chosen. If the effect size was not justified, the study should be rerun entirely; if the researcher lacks the resources for a new experiment, the existing results should be examined with a post-hoc sensitivity analysis (the smallest effect the completed study could reliably detect). If the researcher does rerun the study, the results of the first study should also be reported.
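
A sensitivity analysis along these lines can be run in R. In this sketch the sample size of 100 per group is a hypothetical placeholder and should be replaced with the researcher’s actual n; no output is shown because the result depends on that value:

library(pwr)

# Smallest effect size detectable with 90% power given the completed
# study's sample size (n = 100 per group is a placeholder)
pwr.t.test(n = 100, sig.level = 0.05, power = 0.90,
           type = "two.sample", alternative = "two.sided")$d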

The importance of specifying a smallest effect size of interest is well described in the following paper:

Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701–710. https://doi.org/10.1002/ejsp.2023

10. For a certain research experiment, you have gathered 50 men and 50 women into a focus group. As part of this experiment, you will create 4 subgroups of 25 people each by assigning the gathered people to subgroups randomly and independent of gender. For example, you may end up with 13, 11, 16 and 10 women in the subgroups; in this instance, the number of women in the subgroup with the largest number of women is 16. Using Monte Carlo simulation, estimate the expected number of women in the subgroup with the largest number of women. Report your estimate, correct to at least 2 decimal places, in the field below.

My estimate is approximately 15.06; the simulation is reproduced below:

trial <- function(){
  # 4 subgroups of 25 people each are formed from 50 men and 50 women
  # Returns the number of women in the subgroup with the most women
  
  # Count women and men
  women <- 50
  men <- 50
  total <- women+men
  
  # Create a randomly ordered sample of the women and men
  # Women are coded as 1 so that they can be summed easily
  random <- sample(c(rep(1, women), rep(0, men)))
  
  # Arrange into a 25x4 matrix (one column per subgroup), count the women in each subgroup, and return the max
  max(colSums(matrix(random, ncol=4, byrow=TRUE)))
}

# Conduct n simulations and average the maxima to estimate the expected value
n <- 50000  # number of simulations (arbitrarily chosen)
largestGroup <- mean(replicate(n, trial()))
largestGroup
## [1] 15.05892
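
To confirm that two decimal places are stable at this number of simulations, the Monte Carlo standard error of the estimate can be checked. A quick sketch (its exact value varies from run to run, so no output is shown):

# Monte Carlo standard error: sd of the simulated maxima / sqrt(n)
draws <- replicate(n, trial())
sd(draws) / sqrt(n)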

15.05892