# «Abstract We use the cross-section of index funds to assess the extent of skill in active mutual funds. First, we apply methods designed to ...»

We tabulate the percentiles of the bootstrapped and actual distributions of gross t(α) performance for index funds in Table 2. The tabulated results support the graphical evidence across all of the benchmark models. In particular, the empirical distribution of t(α) for index funds outperforms the bootstrap distribution above the 50th percentile for all models. As in Fama and French (2010), we also report the fraction of bootstrap runs in which a given percentile of the simulated distribution falls below the empirical percentile value. Fama and French (2010) discuss how this measure can be used for informal inference concerning the likelihood of observing the difference in performance between the simulated and actual data. In particular, if the bootstrapped and actual distributions are equal at a given percentile (i.e., no skill), the likelihood value should be 0.5.

Interestingly, the likelihood statistic can be biased away from 0.5 at extreme percentiles even for distributions with no skill (i.e., alpha of zero). This bias decreases with the number of funds in the cross-section (not with the number of bootstrap samples), which is a possible concern given the small size of the index fund cross-section. We use Monte Carlo simulations to establish appropriate critical values for the likelihood statistics under the null of no differences in the distributions (i.e., no skill).18 Our estimated likelihoods for percentiles where the actual estimate exceeds the average bootstrap value (bold entries in Table 2) are generally well in excess of these critical values. Thus, the actual distribution of index fund performance significantly outperforms the zero-skill bootstrap distribution.
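As a minimal sketch of the likelihood statistic described above, the Python below computes the fraction of simulation runs in which a given percentile of the simulated t(α) distribution falls below the actual percentile, and then runs a toy Monte Carlo for a critical value under the null. The function names, the i.i.d. standard normal draws for zero-skill t-statistics, and the trial counts are illustrative assumptions, not the simulation design behind Table 2.

```python
import random


def percentile(values, pct):
    """Percentile (pct in [0, 100]) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def likelihood_statistic(actual_t, simulated_runs, pct):
    """Fraction of simulation runs whose pct-th percentile falls below the actual one."""
    actual_value = percentile(actual_t, pct)
    below = sum(1 for run in simulated_runs if percentile(run, pct) < actual_value)
    return below / len(simulated_runs)


def null_critical_value(n_funds, pct, n_trials=200, n_runs=200, level=95.0, seed=0):
    """Toy Monte Carlo critical value for the likelihood statistic under no skill.

    Assumption: zero-alpha t-statistics are i.i.d. standard normal draws.
    """
    rng = random.Random(seed)

    def draw_cross_section():
        return [rng.gauss(0.0, 1.0) for _ in range(n_funds)]

    stats = []
    for _ in range(n_trials):
        pseudo_actual = draw_cross_section()  # a zero-skill "actual" sample
        runs = [draw_cross_section() for _ in range(n_runs)]
        stats.append(likelihood_statistic(pseudo_actual, runs, pct))
    # Upper quantile of the statistic across zero-skill trials
    return percentile(stats, level)
```

For example, `null_critical_value(n_funds=50, pct=95.0)` returns a simulated critical value for a hypothetical cross-section of 50 funds; an estimated likelihood above it would be unusual under this toy null.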

Table 3 reports the same tests for our actively managed sample. The results look quite similar to both Fama and French (2010) and the index funds. The actual percentiles are greater than the bootstrapped percentiles above the median.

18 Details of the Monte Carlo results are available on request.

Kosowski, Timmermann, Wermers and White (2006) and Fama and French (2010) interpret these results as evidence of skill. A similar interpretation of the index fund results implies that some index funds are skilled as well. The other possibility is that the tests are confounded by errors in measuring performance through the use of benchmark models. As such, our comparison between the active and index space in Section 4 can potentially further our understanding of skill relative to a comparison of the bootstrapped distribution alone.

### 3.2. The proportion of skilled, zero-alpha, and unskilled funds

In a similar vein to the bootstrap tests, Barras, Scaillet and Wermers (2010) use a variant of the false discovery rate (FDR) estimation developed by Storey (2002) to estimate the fractions of funds in the cross-section that are skilled, unskilled, and zero-alpha. The technique controls for false discoveries of mutual fund skill, i.e., mutual funds exhibiting significant alphas by luck alone. If one assumes that funds are drawn from one of three populations (skilled, unskilled, and zero-alpha), the cross-sectional distribution of t-statistics for risk-adjusted alphas will be a mixture distribution. The right tail of this mixture distribution will contain both skilled funds and lucky zero-alpha funds. Using a critical value for the t-statistics alone will falsely attribute skill to these lucky zero-alpha funds.

The FDR technique uses the fact that the p-values for both skilled and unskilled funds will be located disproportionately close to zero, while the p-values for zero-alpha funds will be uniformly distributed from zero to one. Intuitively, the more the mass in the p-value distribution close to zero differs from the uniform level of the distribution close to one, the lower the proportion of zero-alpha funds in the cross-section. The additional mass on the left side of the p-value distribution is due to skilled (positive-alpha) funds, unskilled (negative-alpha) funds, or both. The fractions of the population containing these types of funds can be estimated using the sign of the t(α) for each fund.

We estimate the proportions of unskilled, zero-alpha, and skilled funds ($\hat{\pi}^{-}$, $\hat{\pi}_{0}$, $\hat{\pi}^{+}$) in the index fund population using the Barras et al. (2010) methodology.19 The results using gross returns are shown in Table 4. For purposes of comparison, we also report estimates for the active fund sample. Surprisingly, the estimated proportion of skilled funds ($\hat{\pi}^{+}$) in the index fund population is at least as large as the estimated proportion of skilled funds in the active fund population in all benchmark models except the market model. Under a standard Fama-French-Carhart four-factor model, we find that 16% of index funds are classified as skilled, compared to 10% of active funds. The results provide little evidence that the active fund population contains skilled funds at a higher frequency than the supposedly unskilled group of index funds. If anything, the evidence suggests that a smaller fraction of active funds than of index funds are skilled, even before fees are considered.
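The estimation logic described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the full Barras et al. (2010) procedure: p-values come from a two-sided normal approximation, the thresholds default to the fixed values 0.5 and 0.35 mentioned in the paper's footnote, and the three estimates are not forced to sum to one. All function names are hypothetical.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value for a t-statistic (assumption: normal approximation)."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


def estimate_fund_proportions(t_stats, lam=0.5, gamma=0.35):
    """Return (pi_minus, pi_zero, pi_plus) from a cross-section of t(alpha) values."""
    n = len(t_stats)
    if n == 0:
        raise ValueError("need at least one fund")
    p_values = [two_sided_p_value(t) for t in t_stats]

    # p-values above lam are treated as coming from zero-alpha funds only;
    # since those p-values are uniform, rescale the count by 1 / (1 - lam).
    pi_zero = min(1.0, sum(1 for p in p_values if p > lam) / (n * (1.0 - lam)))

    # Share of funds significant at level gamma, split by the sign of t(alpha).
    sig_plus = sum(1 for t, p in zip(t_stats, p_values) if p < gamma and t > 0) / n
    sig_minus = sum(1 for t, p in zip(t_stats, p_values) if p < gamma and t < 0) / n

    # Zero-alpha funds expected to look significant in each tail by luck alone.
    lucky_per_tail = pi_zero * gamma / 2.0

    pi_plus = max(0.0, sig_plus - lucky_per_tail)
    pi_minus = max(0.0, sig_minus - lucky_per_tail)
    return pi_minus, pi_zero, pi_plus
```

Subtracting the expected share of lucky zero-alpha funds from each tail is what distinguishes this estimate from simply counting funds whose t-statistics exceed a critical value.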

### 3.3. Persistence of α

Empirical studies of mutual fund performance often point to the (lack of) persistence of risk-adjusted performance as evidence of the (lack of) skill for managers (e.g., Carhart (1997)). In this section, we evaluate the persistence of performance for index funds and compare it to that of active funds. Specifically, we estimate Fama-French-Carhart alphas for half-decade subsamples (i.e., 2000-2004, 2005-2009, etc.) and sort funds into quintiles based on the alphas in each period. If relative performance persists, then transition matrices of the alpha quintiles should be disproportionately populated along the diagonals. If funds are truly skilled, we should expect persistence in the top quintile in particular. If there is no persistence and performance rankings are random from one period to the next, we should see uniform transition probabilities of 20% across the entire matrix.
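A minimal sketch of the quintile transition matrix described above is given below. It assumes that alphas are supplied as dictionaries keyed by a fund identifier and that quintiles are formed only among funds present in both periods; how the paper treats funds that enter or exit between periods is not stated here, so that choice is an assumption, as are the function names.

```python
def quintile_ranks(alphas):
    """Map each fund to a quintile by alpha rank (0 = lowest, 4 = highest)."""
    ordered = sorted(alphas, key=alphas.get)
    n = len(ordered)
    return {fund: min(4, rank * 5 // n) for rank, fund in enumerate(ordered)}


def transition_matrix(alphas_first, alphas_second):
    """Entry [i][j]: share of period-1 quintile-i funds found in quintile j in period 2."""
    common = [fund for fund in alphas_first if fund in alphas_second]
    first = quintile_ranks({fund: alphas_first[fund] for fund in common})
    second = quintile_ranks({fund: alphas_second[fund] for fund in common})
    counts = [[0] * 5 for _ in range(5)]
    for fund in common:
        counts[first[fund]][second[fund]] += 1
    return [
        [count / sum(row) if sum(row) else 0.0 for count in row]
        for row in counts
    ]


def average_diagonal(matrix):
    """Mean probability of staying in the same quintile (0.20 under no persistence)."""
    return sum(matrix[i][i] for i in range(5)) / 5.0
```

With perfectly persistent rankings the matrix is the identity and `average_diagonal` returns 1.0; values near 0.20 correspond to the no-persistence null.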

We present gross alpha transition matrices for index and active funds in Table 5.

19 $\lambda^{*} \in [0, 1]$ denotes the threshold above which p-values are assumed to be generated from zero-alpha funds only (i.e., funds with alpha p-values greater than $\lambda^{*}$ consist solely of zero-alpha funds). $\gamma^{*}$ denotes the significance level used for determining the critical t-value used to estimate the fraction of lucky zero-alpha funds incorrectly identified as possessing skill (or lack of skill). We fix $\lambda^{*}$ at 0.5 and $\gamma^{*}$ at 0.35 to put active and index funds on equal footing, but our results are qualitatively unchanged if we follow the selection algorithms for $\lambda^{*}$ and $\gamma^{*}$ used by Barras et al. (2010).

Panels 5a and 5b show transitions from 2000-2004 to 2005-2009, and Panels 5c and 5d show transitions from 2005-2009 to 2010-2013. For the first transition, there is weak evidence of persistence in skill for the top active funds; 24.5% of funds in the top quintile remain in that quintile. This is only slightly above the null of 20%. In contrast, for those index funds in the top quintile in the first period, over 40% remain in that quintile.

Moving to Panels 5c and 5d, we see stronger evidence of persistence for index funds and less evidence for actively managed funds over the later time period. The transition probabilities for the active funds are generally close to the null of 20%. However, we see significant persistence along the entire diagonal for index funds. Across the two periods, the average diagonal transition probability for index funds is about 30% compared to about 20% for active funds. Overall, the results provide little evidence that the top-performing active funds possess skill in the form of persistent alpha. If anything, passive funds display more persistent performance, even before fees.20

### 3.4. The Flow-Performance Relationship

A number of papers have documented a positive relationship between net fund flows and lagged performance in active mutual funds (e.g., Sirri and Tufano (1998), Chevalier and Ellison (1997)). A leading explanation for this relationship is that investors rationally update their beliefs about manager skill based on past performance (Berk and Green (2004)). We assess the flow-performance relationship in the context of our unskilled group of index funds. By construction, investors should not attempt to update their beliefs about manager talent for these funds. If the flow-performance relationship is due to rational learning by investors about managers' stock-picking abilities, index fund flows should not be responsive to past performance.

20 Elton, Gruber and Busse (2004) show significant persistence in S&P 500 index fund net returns. They find that much of this persistence is driven by fees. Our results indicate a significant amount of persistence in the broader index fund space, even for gross returns.

We examine the relationship between net fund flows and lagged performance for both active and index funds. As is standard in the literature, we measure new money growth as:

$$\mathrm{Flow}_{it} = \frac{\mathrm{TNA}_{it} - \mathrm{TNA}_{i,t-1}\,(1 + r_{it})}{\mathrm{TNA}_{i,t-1}}$$

where $\mathrm{TNA}_{it}$ is the total net assets under management by fund i in month t. Flows are winsorized at the 1% level.
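The new money growth measure can be illustrated with a short sketch. Two points are assumptions here rather than statements from the paper: `returns[t]` is taken to be fund i's return over month t, and "winsorized at the 1% level" is read as clamping both tails at the 1st and 99th percentile order statistics. The function names are hypothetical.

```python
def new_money_growth(tna, returns):
    """Flow_t = (TNA_t - TNA_{t-1} * (1 + r_t)) / TNA_{t-1} for t = 1, ..., T-1.

    tna[t] is total net assets at the end of month t; returns[t] is the
    return earned over month t (returns[0] is unused).
    """
    if len(tna) != len(returns):
        raise ValueError("tna and returns must have the same length")
    flows = []
    for t in range(1, len(tna)):
        grown_assets = tna[t - 1] * (1.0 + returns[t])  # assets absent any flows
        flows.append((tna[t] - grown_assets) / tna[t - 1])
    return flows


def winsorize(values, fraction=0.01):
    """Clamp each tail at the order statistic `fraction` of the way in."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    lower, upper = ordered[k], ordered[len(ordered) - 1 - k]
    return [min(max(v, lower), upper) for v in values]
```

For instance, a fund that grows from 100 to 110 in assets while earning a 5% return has new money growth of about 5%, since only 5 of the 10 units of asset growth came from returns.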

Table 6 presents panel regressions of net fund flows on lagged returns and an interaction of lagged returns with an index fund indicator variable. We use gross excess returns and benchmark-adjusted returns, controlling for log total net assets and a fund's expense ratio.21 Each regression contains year and fund fixed effects. As has been widely documented in the literature for active funds, new money growth is positively correlated with lagged fund performance under every performance measure. We show that this effect exists for a broad set of index funds as well.22 The interaction term estimates are positive and statistically significant using all benchmark-adjusted returns. An increase in Fama-French-Carhart abnormal performance of 10 basis points (bps) per month is associated with increased flows of 3.5 bps of assets under management for index funds. This is significantly greater than the 1.9 bps increase in assets for active funds. In general, the results indicate that new money growth in index funds is more sensitive to past performance than it is in active funds.23 This result is inconsistent with investors rationally updating about fund skill if index funds are assumed to have no skill. As such, we provide empirical evidence consistent with Choi, Laibson and Madrian (2010), who present experimental evidence that investors chase past performance even within S&P 500 funds if presented with differently framed information. On the other hand, the results could be consistent with investors rationally updating about which investment strategies outperform and which funds have the best execution. Nonetheless, it remains puzzling that index fund flows would be more responsive to past returns than active fund flows.

21 Using preferences revealed by mutual fund flows, Berk and van Binsbergen (2014a) argue that CAPM is closest to the true model. To be consistent with our previous tests, we use abnormal returns from each benchmark model.

22 Elton, Gruber and Busse (2004) show a flow-performance relationship among S&P 500 funds using post-expense performance, but do not relate this to active funds.

23 While index mutual funds and ETFs are used for sector exposure, we note that we have excluded sector-specific funds from our sample, so this does not drive the result.

## 4. Controlling for Luck Using the Performance Distribution of Index Funds

The surprising finding that index funds appear skilled leads us to propose a new method to disentangle skill from luck. In this section, we assess the extent of skill among actively managed funds by using the cross-section of index fund performance as a distribution of performance measures under the null of no skill. These tests are similar in spirit to the bootstrap tests of Kosowski et al. (2006) and Fama and French (2010), but we use the traded index fund distribution rather than the bootstrapped distribution as our counterfactual, no-skill distribution. We assess differences between active and passive fund performance at various points of the distribution using quantile regressions. Finally, we use (second-order) stochastic dominance tests to assess whether the extent of any active fund skill (or lack thereof) in aggregate is sufficient to induce a risk-averse investor to choose an active fund rather than an index fund.
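To make the second-order stochastic dominance criterion concrete, the sketch below checks the standard condition on two samples: A dominates B if the integrated empirical CDF of A never exceeds that of B. This is a descriptive sample check under that textbook definition, not the statistical test used in the paper, and the function names are hypothetical.

```python
def lower_partial_moment(sample, threshold):
    """Empirical E[max(threshold - X, 0)], the integral of the CDF up to threshold."""
    return sum(max(threshold - x, 0.0) for x in sample) / len(sample)


def second_order_dominates(sample_a, sample_b, tol=1e-12):
    """True if sample_a second-order stochastically dominates sample_b.

    Both integrated CDFs are piecewise linear with kinks only at observed
    values, so comparing them on the pooled sample points is sufficient.
    """
    grid = sorted(set(sample_a) | set(sample_b))
    return all(
        lower_partial_moment(sample_a, x) <= lower_partial_moment(sample_b, x) + tol
        for x in grid
    )
```

For example, `second_order_dominates([-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0])` holds because the second sample is a mean-preserving spread of the first, which a risk-averse investor would not prefer.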

### 4.1. Cumulative distribution functions

In Figure 2, we plot the cumulative distribution functions (CDFs) of gross alphas for index and active funds under the various benchmark models. We also display the returns in excess of the S&P 500 since this is a common benchmark used for equity fund performance in practice. We expect the priors of most readers to be that index funds have a narrower benchmark-adjusted return distribution, with a possible negative level shift relative to active funds for before-fee returns. Absent a level shift, a narrower distribution for index fund alphas corresponds to a CDF that is to the right (left) of the active CDF below (above) the median.
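The geometric claim about a narrower distribution can be checked with a small sketch using empirical CDFs. The alphas below are hypothetical numbers chosen so that both groups share a median of zero and the index fund values are half as dispersed; they are not data from the paper.

```python
def empirical_cdf(sample, x):
    """Share of observations in sample that are less than or equal to x."""
    return sum(1 for value in sample if value <= x) / len(sample)


def cdf_gap(index_alphas, active_alphas, x):
    """Index CDF minus active CDF at alpha level x.

    A negative gap means the index CDF lies to the right of the active CDF
    at x; a positive gap means it lies to the left.
    """
    return empirical_cdf(index_alphas, x) - empirical_cdf(active_alphas, x)


# Hypothetical alphas: same median (zero), index funds half as dispersed.
active = [float(a) for a in range(-10, 11)]
index = [0.5 * a for a in active]

below_median_gap = cdf_gap(index, active, -5.0)  # negative: index CDF to the right
above_median_gap = cdf_gap(index, active, 5.0)   # positive: index CDF to the left
```

Adding a negative level shift to the index alphas would move the entire index CDF to the left, which is the second part of the prior described above.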

We focus on gross returns to abstract away from the bargaining process between investors and fund managers. Panel 2a shows the surprising result that the CDFs of index and active fund returns in excess of the S&P 500 are remarkably similar above the median. The largest differences in the distributions are in the left half, where index funds exhibit alphas much closer to zero than those of the worst-performing active funds. When adjusted for systematic risk using benchmark models (Panels 2b-2f), the poorer performance of below-median active funds remains. The similar performance of the top-performing funds generally survives the benchmark model adjustments. The largest above-median differences appear in the 70th to 90th percentiles, where active funds have slightly higher alphas. We explicitly test the statistical and economic significance of these differences in Section 4.2.