MethodAtlas
Replication | 120 minutes

Replication Lab: Gift Exchange and Worker Effort

Replicate a seminal field experiment on gift exchange in labor markets. Test whether generous wages increase worker effort, examine how the effect diminishes over time, check randomization balance, estimate ITT effects, and compute Lee bounds for differential attrition.

Overview

In this replication lab, you will reproduce the main findings from a landmark field experiment that tested one of the central predictions of behavioral economics:

Gneezy, Uri, and John A. List. 2006. "Putting Behavioral Economics to Work: Testing for Gift Exchange in Labor Markets Using Field Experiments." Econometrica 74(5): 1365–1384.

Gneezy and List hired workers for two tasks: (1) entering library data and (2) door-to-door fundraising. In each task, workers were randomly assigned to either a control group (paid the advertised wage) or a treatment group (surprised with a higher wage on the day of work). The key question: does the "gift" of a higher wage increase worker effort, as predicted by Akerlof's (1982) gift exchange theory?

Why this paper matters: It provided one of the first clean field-experimental tests of gift exchange. The headline finding was nuanced: workers initially reciprocated the gift with higher effort, but the effect disappeared within a few hours. This challenged the strong predictions of gift exchange theory and demonstrated the importance of measuring treatment effects over time.

What you will do:

  • Simulate data matching the published experimental design and results
  • Check randomization balance across treatment and control
  • Estimate the intent-to-treat (ITT) effect
  • Test whether the treatment effect diminishes over time
  • Assess differential attrition and compute Lee (2009) bounds
  • Compare your results to the published findings

Step 1: Simulate the Field Experiment Data

The library task involved hiring workers to enter data from books into a spreadsheet. Workers were paid $12/hour (control) or $20/hour (treatment, announced as a surprise on the first day). Output was measured as the number of books catalogued per work period.

library(estimatr)
library(modelsummary)

set.seed(2006)
n_workers <- 80
n_periods <- 6

treatment <- rbinom(n_workers, 1, 0.5)
age <- round(pmin(pmax(rnorm(n_workers, 22, 3), 18), 35))
female <- rbinom(n_workers, 1, 0.55)
gpa <- round(pmin(pmax(rnorm(n_workers, 3.2, 0.5), 1.5), 4.0), 2)
prior_exp <- rbinom(n_workers, 1, 0.30)

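# Build the worker-period panel one row at a time
rows <- list()
k <- 1
for (i in 1:n_workers) {
  # Worker-level baseline productivity, drawn once per worker so that a
  # worker's periods are correlated (this is why Step 3 clusters by worker)
  base <- rnorm(1, 50, 12)
  for (t in 1:n_periods) {
    # Gift effect: large in periods 1-2, smaller in 3-4, zero in 5-6
    te <- 0
    if (treatment[i] == 1) {
      if (t <= 2) te <- rnorm(1, 12, 4)
      else if (t <= 4) te <- rnorm(1, 4, 3)
      else te <- rnorm(1, 0, 2)
    }
    learning <- 2 * log(t)
    output <- max(0, base + te + learning + rnorm(1, 0, 6))
    # Attrition from period 4 on; a worker who drops out does not return
    if (t >= 4) {
      ap <- ifelse(treatment[i] == 1, 0.03, 0.06)
      if (runif(1) < ap) break
    }
    rows[[k]] <- data.frame(worker_id = i, period = t,
      treatment = treatment[i], output = round(output, 1),
      age = age[i], female = female[i], gpa = gpa[i],
      prior_exp = prior_exp[i])
    k <- k + 1
  }
}
df <- do.call(rbind, rows)
cat("Observations:", nrow(df), "\n")
# Mean output by treatment group
tapply(df$output, df$treatment, mean)
# Mean output by treatment group (rows) and period (columns)
tapply(df$output, list(df$treatment, df$period), mean)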

Expected output:

Sample summary:

| Statistic | Value |
| --- | --- |
| Workers | 80 (approx. 40 treated, 40 control) |
| Total worker-period observations | ~460–480 (after attrition) |
| Work periods | 6 (each ~90 minutes) |

Mean output by treatment group:

| Group | Mean Output (books per period) |
| --- | --- |
| Control | ~52 |
| Treatment | ~57 |

Mean output by group and period:

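| Period | Control | Treatment | Difference |
| --- | --- | --- | --- |
| 1 | ~50 | ~62 | ~12 |
| 2 | ~52 | ~63 | ~11 |
| 3 | ~53 | ~57 | ~4 |
| 4 | ~54 | ~57 | ~3 |
| 5 | ~54 | ~54 | ~0 |
| 6 | ~55 | ~55 | ~0 |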

The treatment effect is clearly visible in the early periods (1–2) but fades by periods 5–6, matching the published finding of a temporary gift exchange effect.
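The pattern in these tables follows directly from the parameters in the simulation code. As a compact summary (the symbols $Y_{it}$ for output of worker $i$ in period $t$, $D_i$ for the gift-wage indicator, and $\tau_t$ for the period-specific effect are shorthand introduced here, not notation from the paper), the expected output implied by the code is approximately:

```latex
\mathbb{E}[Y_{it} \mid D_i] \approx 50 + 2\ln t + \tau_t D_i,
\qquad
\tau_t =
\begin{cases}
12 & t \in \{1, 2\} \\
4  & t \in \{3, 4\} \\
0  & t \in \{5, 6\}
\end{cases}
\qquad
\bar{\tau} = \frac{1}{6}\sum_{t=1}^{6} \tau_t = \frac{12 + 12 + 4 + 4 + 0 + 0}{6} \approx 5.3
```

This ignores the floor at zero output and the small reweighting caused by attrition, so the simulated means will differ slightly. The average $\bar{\tau}$ is the quantity that a pooled ITT regression should recover, up to sampling noise.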


Step 2: Check Randomization Balance

Before estimating treatment effects, verify that randomization achieved covariate balance across treatment and control groups.

# Balance table
worker_df <- df[!duplicated(df$worker_id), ]
vars <- c("age", "female", "gpa", "prior_exp")

cat("=== Randomization Balance ===\n")
for (v in vars) {
tt <- t.test(worker_df[[v]] ~ worker_df$treatment)
cat(v, ": Control=", round(tt$estimate[1], 3),
    " Treat=", round(tt$estimate[2], 3),
    " p=", round(tt$p.value, 3), "\n")
}

Expected output:

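| Variable | Control Mean | Treatment Mean | Difference | p-value |
| --- | --- | --- | --- | --- |
| age | 22.1 | 22.3 | 0.2 | 0.78 |
| female | 0.54 | 0.56 | 0.02 | 0.83 |
| gpa | 3.18 | 3.22 | 0.04 | 0.72 |
| prior_exp | 0.28 | 0.32 | 0.04 | 0.65 |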

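All p-values are well above 0.05, so there is no evidence of imbalance on observable characteristics, which is what successful randomization should produce. With only 80 workers, some chance differences between groups are expected, but none of them is statistically significant. Keep in mind that a non-significant test does not prove balance; it only fails to detect imbalance.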
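Variable-by-variable t-tests can be complemented with two common summaries: standardized mean differences, which do not depend on sample size, and a joint F-test of whether the covariates together predict assignment. The sketch below uses base R only and a hypothetical worker-level data frame `w` generated on the spot so that it runs on its own; in the lab you would pass `worker_df` instead. The helper name `std_diff` is an illustration, not part of any package.

```r
# Hypothetical worker-level data (stand-in for worker_df from the lab)
set.seed(1)
n <- 80
w <- data.frame(
  treatment = rep(c(0, 1), each = n / 2),
  age       = round(rnorm(n, 22, 3)),
  female    = rbinom(n, 1, 0.55),
  gpa       = rnorm(n, 3.2, 0.5),
  prior_exp = rbinom(n, 1, 0.30)
)

# Standardized mean difference: (treated mean - control mean) / pooled SD
std_diff <- function(x, d) {
  m1 <- mean(x[d == 1])
  m0 <- mean(x[d == 0])
  s  <- sqrt((var(x[d == 1]) + var(x[d == 0])) / 2)
  (m1 - m0) / s
}

covs <- c("age", "female", "gpa", "prior_exp")
smd  <- sapply(covs, function(v) std_diff(w[[v]], w$treatment))

# Joint test: regress assignment on all covariates and read off the overall F-test
fit     <- lm(treatment ~ age + female + gpa + prior_exp, data = w)
fstat   <- summary(fit)$fstatistic
joint_p <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))

print(round(smd, 3))
cat("Joint F-test p-value:", round(joint_p, 3), "\n")
```

A frequently used rule of thumb treats absolute standardized differences below roughly 0.1 to 0.25 as small, though with 40 workers per arm larger values can arise by chance.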


Step 3: Estimate the Intent-to-Treat Effect

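# ITT with standard errors clustered by worker (the unit of randomization)
m1 <- lm_robust(output ~ treatment, data = df,
                clusters = worker_id, se_type = "CR2")
m2 <- lm_robust(output ~ treatment + age + female + gpa + prior_exp,
                data = df, clusters = worker_id, se_type = "CR2")
# Add period fixed effects to absorb the common learning trend
m3 <- lm_robust(output ~ treatment + age + female + gpa + prior_exp +
                  factor(period),
                data = df, clusters = worker_id, se_type = "CR2")

modelsummary(list("No controls" = m1, "+ Demographics" = m2,
                  "+ Period FE" = m3),
             coef_map = c("treatment" = "Gift wage"),
             stars = c('*' = 0.1, '**' = 0.05, '***' = 0.01))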
Requires: modelsummary

Expected output:

| Model | ITT Estimate | Clustered SE | p-value |
| --- | --- | --- | --- |
| No controls | ~5.5 | ~2.8 | ~0.05 |
| + Demographics | ~5.4 | ~2.7 | ~0.05 |
| + Period FE | ~5.4 | ~2.6 | ~0.04 |

The overall (pooled) ITT effect is approximately 5–6 additional books per period, representing roughly a 10% increase. However, this average masks the important time dynamics: the effect is concentrated in the early periods.

Published overall effect: approximately 5–6 books in the library task. Note that pooling across all periods dilutes the large initial effect with the near-zero late effect.
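Because treatment is assigned at the worker level, a simple cross-check on the clustered regression is to collapse the panel to one mean per worker and compare those means across groups. The sketch below illustrates this with base R on a small hypothetical balanced panel (the names `panel` and `wm` and all parameter values are assumptions made for the example). In a balanced panel the collapsed estimate equals the pooled OLS estimate exactly; with attrition, as in the lab data, the two differ slightly because the collapsed version weights workers equally rather than worker-periods.

```r
# Hypothetical balanced panel: 40 workers, 6 periods, true effect of 5 books
set.seed(2)
n_workers <- 40
n_periods <- 6
d <- rep(c(0, 1), each = n_workers / 2)   # worker-level assignment
a <- rnorm(n_workers, 50, 12)             # worker-level baseline productivity

panel <- expand.grid(period = 1:n_periods, worker_id = 1:n_workers)
panel$treatment <- d[panel$worker_id]
panel$output <- a[panel$worker_id] + 5 * panel$treatment +
  rnorm(nrow(panel), 0, 6)

# Collapse to one row per worker, then run a plain two-group regression
wm  <- aggregate(output ~ worker_id + treatment, data = panel, FUN = mean)
fit <- lm(output ~ treatment, data = wm)
est <- unname(coef(fit)["treatment"])
se  <- summary(fit)$coefficients["treatment", "Std. Error"]

# Pooled OLS point estimate on the full panel, for comparison
pooled <- unname(coef(lm(output ~ treatment, data = panel))["treatment"])

cat("Worker-level estimate:", round(est, 2), " SE:", round(se, 2), "\n")
cat("Pooled OLS estimate:  ", round(pooled, 2), "\n")
```

The worker-level standard error is based on 40 independent observations, which is the effective sample size that clustering is meant to respect.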


Step 4: Test Whether the Effect Diminishes Over Time

This step tests the paper's most important finding: the gift exchange effect is temporary.

# Treatment effect by period
cat("=== Effect by Period ===\n")
for (t in 1:6) {
sub <- df[df$period == t, ]
m <- lm_robust(output ~ treatment, data = sub, se_type = "HC1")
cat("Period", t, ": Effect =", round(coef(m)["treatment"], 2),
    " SE =", round(m$std.error["treatment"], 2), "\n")
}

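# Interaction test: does the effect differ in late periods (4-6)?
# The period fixed effects already absorb the main effect of `late`,
# so only its interaction with treatment enters the model.
df$late <- as.integer(df$period >= 4)
m_int <- lm_robust(output ~ treatment + treatment:late + factor(period),
                   data = df, clusters = worker_id, se_type = "CR2")
summary(m_int)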

Expected output:

Treatment effect by period:

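| Period | Control Mean | Treatment Mean | Difference | SE |
| --- | --- | --- | --- | --- |
| 1 | ~50 | ~62 | ~12.0 | ~3.5 |
| 2 | ~52 | ~63 | ~11.0 | ~3.5 |
| 3 | ~53 | ~57 | ~4.0 | ~3.5 |
| 4 | ~54 | ~57 | ~3.0 | ~3.5 |
| 5 | ~54 | ~54 | ~0.5 | ~3.5 |
| 6 | ~55 | ~55 | ~0.0 | ~3.5 |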

Early vs. late interaction test:

| Component | Coefficient | Clustered SE | p-value |
| --- | --- | --- | --- |
| Treatment (early periods 1–3) | ~9.0 | ~3.0 | < 0.01 |
| Treatment x Late (periods 4–6) | ~-8.0 | ~3.5 | ~0.02 |

The treatment effect decays sharply: approximately 12 additional books in period 1, falling to near zero by periods 5–6. The interaction term (Treatment x Late) is negative and statistically significant, indicating that the effect in periods 4–6 is about 8 books smaller than in periods 1–3, which leaves a late-period effect close to zero. This decay pattern matches the published finding of an initial increase of roughly 25% that fades to zero within a few hours.
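The period-by-period loop can also be written as a single regression with one treatment coefficient per period, which makes it easier to test hypotheses across periods later. The sketch below is a minimal base R illustration on hypothetical data whose assumed true effects mirror the lab's design (12, 12, 4, 4, 0, 0); the object names `panel`, `tau`, and `eff` are introduced here for the example. Note that plain `lm` standard errors are not clustered, so for inference on the lab data the same formula would be passed to `lm_robust` with `clusters = worker_id`.

```r
# Hypothetical panel with effects that decay across periods
set.seed(3)
n_workers <- 60
n_periods <- 6
d   <- rep(c(0, 1), each = n_workers / 2)
tau <- c(12, 12, 4, 4, 0, 0)   # assumed true effect in each period

panel <- expand.grid(period = 1:n_periods, worker_id = 1:n_workers)
panel$treatment <- d[panel$worker_id]
panel$output <- 50 + 2 * log(panel$period) +
  tau[panel$period] * panel$treatment + rnorm(nrow(panel), 0, 10)

# Period dummies plus a separate treatment coefficient for every period.
# Omitting the treatment main effect makes R estimate all six interactions.
fit <- lm(output ~ factor(period) + factor(period):treatment, data = panel)
eff <- coef(fit)[grep(":treatment", names(coef(fit)), fixed = TRUE)]

print(round(eff, 2))

# Each coefficient equals the raw difference in means for that period
p1 <- panel[panel$period == 1, ]
raw_diff_p1 <- mean(p1$output[p1$treatment == 1]) -
  mean(p1$output[p1$treatment == 0])
```

Because the model is fully saturated in period and treatment, the six coefficients reproduce the six differences in means from the loop; the gain is that they now live in one model object.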

Concept Check

Why does the gift exchange effect diminish over time? Select the most plausible explanation from the behavioral economics literature.

Concept Check

In the Gneezy and List experiment, all workers assigned to the treatment group actually received the higher wage (perfect compliance). In this case, what is the relationship between the ITT and the ATE?


Step 5: Assess Differential Attrition and Lee Bounds

If workers in the control group drop out at higher rates, the remaining control workers may be positively selected (only the most motivated stay), biasing the treatment effect downward.

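# Attrition by treatment: number of periods each worker is observed
periods_obs <- aggregate(period ~ worker_id + treatment, data = df,
                         FUN = length)
cat("=== Attrition ===\n")
# Mean number of periods observed, by treatment group
tapply(periods_obs$period, periods_obs$treatment, mean)

# Share of workers observed in all 6 periods, by treatment group
tapply(periods_obs$period == 6, periods_obs$treatment, mean)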

Expected output:

Attrition rates:

| Group | Completed All 6 Periods | Mean Periods Observed | Attrition Rate (per period, periods 4+) |
| --- | --- | --- | --- |
| Treatment | ~90–95% | ~5.8 | ~3% |
| Control | ~85–90% | ~5.6 | ~6% |

Lee (2009) bounds for late periods (4–6):

| Estimate | Value |
| --- | --- |
| Naive ATE (late periods) | ~1.5 |
| Trim fraction | ~0.03 |
| Lee lower bound | ~-0.5 |
| Lee upper bound | ~3.0 |

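Differential attrition is present: from period 4 onward, control workers drop out at a slightly higher rate (~6% per period vs. ~3% for treatment). If the least motivated control workers are the ones who leave, the remaining control group is positively selected, which would bias the estimated treatment effect downward. Lee bounds address this by trimming the group with higher retention until both groups retain the same share, once from the top and once from the bottom of the outcome distribution. The Lee bounds for late periods bracket zero, consistent with the finding that the gift exchange effect has dissipated by that point.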
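The code above reports attrition but does not compute the bounds themselves. The following is a minimal base R sketch of the trimming logic, not the lab's own implementation: the function name `lee_bounds` and its arguments are hypothetical, it handles only the case where the treated group has the higher retention rate, and it trims a whole number of observations rather than interpolating at a quantile. It also omits standard errors. To apply it to the lab data you would first build the full worker-by-period grid for periods 4–6 so that missing worker-periods appear as unobserved rows.

```r
# Lee-style trimming bounds (sketch).
# y: outcome (NA where unobserved), d: 0/1 treatment, observed: logical
lee_bounds <- function(y, d, observed) {
  q1 <- mean(observed[d == 1])   # retention rate, treated
  q0 <- mean(observed[d == 0])   # retention rate, control
  stopifnot(q1 >= q0)            # sketch covers only this direction
  p  <- (q1 - q0) / q1           # share of treated outcomes to trim

  y1 <- sort(y[d == 1 & observed])
  y0 <- y[d == 0 & observed]
  n1 <- length(y1)
  k  <- floor(p * n1)            # number of treated observations to drop

  lower <- mean(y1[seq_len(n1 - k)]) - mean(y0)   # drop the k largest
  upper <- mean(y1[(k + 1):n1]) - mean(y0)        # drop the k smallest
  c(trim = p, lower = lower, upper = upper,
    naive = mean(y1) - mean(y0))
}

# Hypothetical example: no true effect, more attrition in the control group
set.seed(4)
n <- 400
d <- rep(c(0, 1), each = n / 2)
y <- 50 + rnorm(n, 0, 10)
observed <- runif(n) > ifelse(d == 1, 0.05, 0.15)
y[!observed] <- NA

b <- lee_bounds(y, d, observed)
print(round(b, 3))
```

By construction the naive difference in means always lies between the two bounds, and the interval widens as the gap in retention rates grows.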


Step 6: Compare with Published Results

cat("=== Comparison with Gneezy & List (2006) ===\n")
cat("Published: ~25% initial increase, fading to 0\n")
early <- df[df$period <= 2, ]
cat("Our initial effect:",
  round((mean(early$output[early$treatment==1]) /
         mean(early$output[early$treatment==0]) - 1) * 100, 1), "%\n")

Expected output:

| Finding | Published (Gneezy & List 2006) | Our Replication |
| --- | --- | --- |
| Initial effect (periods 1–2, % increase) | ~25% | ~22–28% |
| Late effect (periods 5–6, % increase) | ~0% | ~0–2% |
| Effect fades over time? | Yes | Yes |
| N workers | 19 | 80 |

The key qualitative findings match: (1) the gift wage produces a large initial increase in effort (~25%), (2) the effect fades to near zero within a few hours, and (3) gift exchange works in the short run but is not sustained. Our larger sample (80 vs. 19 workers) provides more statistical power to detect the temporal dynamics.


Summary

Our simulated replication reproduces the central findings of Gneezy and List (2006):

  1. Gift exchange produces an initial burst of effort. Workers who receive an unexpectedly high wage increase their output by approximately 25% in the first work periods.

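  2. The effect is temporary. In the simulated data the treatment effect shrinks to about a third of its initial size in periods 3–4 and is near zero by periods 5–6, mirroring the published result that the effect faded within a few hours. This finding is the paper's key contribution.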

  3. Implications for theory. Strong versions of gift exchange theory (Akerlof 1982) predict a permanent effort increase. The data appear more consistent with a weaker version: gifts may trigger short-run reciprocity that fades as the higher wage becomes the new reference point.

  4. Methodological lessons. This paper illustrates the importance of (a) measuring treatment effects over time rather than only at a single point, (b) checking randomization balance, and (c) accounting for attrition in field experiments.


Extension Exercises

  1. Quantile treatment effects. Does the gift wage increase output at the median differently than at the 10th or 90th percentile? Estimate quantile regressions by period.

  2. Worker fixed effects. Add worker fixed effects to exploit within-worker variation over time. How does the period-by-treatment interaction change?

  3. Permutation inference. With small samples, asymptotic inference may be unreliable. Implement a Fisher randomization test of the sharp null hypothesis of no effect for the pooled treatment effect, re-assigning treatment at the worker level.

  4. Power analysis. The original experiment had only 19 workers (10 treatment, 9 control). Calculate the minimum detectable effect size at 80% power. Was the study adequately powered to detect the published effect?

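  5. Structural break test. Instead of assuming the effect ends at a specific period, use a structural break test to identify when the gift exchange effect disappears. A standard Chow test requires the break period to be specified in advance, so consider a sup-Wald-type test that searches over candidate break periods.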
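As a starting point for the permutation-inference exercise, the sketch below shows the basic mechanics of randomization inference in base R. It uses hypothetical outcomes for 19 workers (10 treated, 9 control, the group sizes of the original library task) with one outcome per worker; the function names `diff_means` and `ri_pvalue`, the assumed effect of 8, and the number of draws are all illustrative choices. For the lab's panel data you would shuffle treatment across workers, not across worker-period rows, and recompute the pooled estimate each time.

```r
# Test statistic: difference in mean outcomes between treated and control
diff_means <- function(y, d) mean(y[d == 1]) - mean(y[d == 0])

# Monte Carlo randomization p-value under the sharp null of no effect.
# Shuffling d keeps the number of treated units fixed.
ri_pvalue <- function(y, d, reps = 5000) {
  obs  <- diff_means(y, d)
  perm <- replicate(reps, diff_means(y, sample(d)))
  # Two-sided p-value with the usual +1 correction for the observed draw
  (1 + sum(abs(perm) >= abs(obs))) / (1 + reps)
}

# Hypothetical data: one outcome per worker
set.seed(5)
d <- c(rep(1, 10), rep(0, 9))
y <- 40 + 8 * d + rnorm(length(d), 0, 10)

obs_effect <- diff_means(y, d)
p_value    <- ri_pvalue(y, d)

cat("Observed difference:", round(obs_effect, 2), "\n")
cat("Randomization p-value:", round(p_value, 3), "\n")
```

With 19 workers there are 92,378 possible assignments, so full enumeration is also feasible; random draws are used here only to keep the example short.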