# «Abstract We use the cross-section of index funds to assess the extent of skill in active mutual funds. First, we apply methods designed to ...»

Figure 3 plots the CDFs of alpha t-statistics for index and active funds. As discussed in Kosowski et al. (2006) and Fama and French (2010), t-statistics may be a better representation of skill because they adjust the alpha estimate for the residual variance a fund takes in order to earn that alpha (as well as for statistical issues related to diﬀering sample size across funds). There are some diﬀerences here across benchmark models, but we again see very few diﬀerences between active and passive funds in the right tail. Even after adjusting for diﬀerences in precision of alpha estimates due to residual variance diﬀerences, the top-performing funds appear quite similar in the portion of the distribution where one would expect to ﬁnd the most skilled funds. The outperformance of index funds below the median is weaker under the t-statistic measure, but the results suggest they are at least as good as the active funds.
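To make the role of residual variance concrete, the following is a minimal sketch of how an alpha and its t-statistic could be computed for a single fund. It assumes a one-factor market model and simulated returns purely for illustration; the function name `alpha_tstat` and all numbers are hypothetical and are not taken from the paper's equation 1 or data.

```python
import numpy as np

def alpha_tstat(fund_excess, factors):
    """OLS of fund excess returns on benchmark factors; returns (alpha, t(alpha))."""
    y = np.asarray(fund_excess, dtype=float)
    F = np.asarray(factors, dtype=float)
    if F.ndim == 1:
        F = F[:, None]
    X = np.column_stack([np.ones(len(y)), F])      # intercept column = alpha
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # classical OLS covariance
    alpha = float(beta[0])
    return alpha, float(alpha / np.sqrt(cov[0, 0]))

# Two simulated funds with the same true alpha (20 bps per month, hypothetical)
# but very different residual volatility.
rng = np.random.default_rng(0)
mkt = rng.normal(0.005, 0.04, 120)
quiet = 0.002 + 1.0 * mkt + rng.normal(0, 0.001, 120)
noisy = 0.002 + 1.0 * mkt + rng.normal(0, 0.03, 120)

a_quiet, t_quiet = alpha_tstat(quiet, mkt)
a_noisy, t_noisy = alpha_tstat(noisy, mkt)
```

In this sketch both funds have the same true alpha, yet the low-residual-risk fund receives a far larger t-statistic, which is the sense in which t(α) penalizes alpha earned with more residual variance.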

**4.2. Quantile Regressions**

To statistically test for differences in these distributions, we examine the distribution of alphas and t-statistics for each risk model and measure differences between index funds and actively managed funds at various points in the distribution. To do this, we use a quantile regression approach. Following Angrist and Pischke (2008), we examine distribution effects by finding the conditional quantile function that solves the following:

$$Q_\tau(y_i \mid X_i) = \arg\min_{q(X)} E\left[\rho_\tau\left(y_i - q(X_i)\right)\right] \qquad (2)$$

where $\rho_\tau(\mu) = (\tau - \mathbf{1}(\mu \le 0))\,\mu$ for quantile $\tau$ and $y_i$ is a risk-adjusted fund performance measure estimated from equation 1, either $\alpha_i$ or $t(\alpha_i)$. We estimate $q(X_i)$ as a linear function of covariates:

$$q(X_i) = \beta_0 + \beta_1\, Index_i$$

where $Index_i$ takes the value of one if fund $i$ is an index fund.

Using this approach, we test the diﬀerences between the index fund performance and the actively managed fund performance at various points of their respective distributions. We analyze the 1st, 5th, 25th, 50th, 75th, 95th, and 99th percentiles of the benchmark-adjusted return distribution. Statistical signiﬁcance is determined by calculating bootstrapped standard errors.
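As a rough illustration of this procedure, the sketch below minimizes the check loss directly. With a single indicator regressor the objective separates by group, so the constant is the active-fund quantile and the indicator coefficient is the difference between the index-fund and active-fund quantiles. The function names, the toy alphas, and the stratified bootstrap scheme are assumptions made for the example; the paper does not specify its bootstrap design.

```python
import numpy as np

def check_loss(u, tau):
    """Sum of rho_tau(u) = (tau - 1(u <= 0)) * u over observations."""
    u = np.asarray(u, dtype=float)
    return float(np.sum((tau - (u <= 0)) * u))

def sample_quantile(y, tau):
    """Minimize the check loss over the data points (a minimizer lies on one)."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - q, tau) for q in y]
    return float(y[int(np.argmin(losses))])

def quantile_reg_on_indicator(y, is_index, tau):
    """Return (constant, coefficient on the index-fund indicator)."""
    y = np.asarray(y, dtype=float)
    is_index = np.asarray(is_index, dtype=bool)
    const = sample_quantile(y[~is_index], tau)        # active-fund quantile
    coef = sample_quantile(y[is_index], tau) - const  # index minus active
    return const, coef

def bootstrap_se(y, is_index, tau, n_boot=200, seed=0):
    """Bootstrap standard error of the indicator coefficient.

    Assumption: funds are resampled with replacement within each group.
    """
    y = np.asarray(y, dtype=float)
    is_index = np.asarray(is_index, dtype=bool)
    rng = np.random.default_rng(seed)
    active, index = y[~is_index], y[is_index]
    draws = []
    for _ in range(n_boot):
        a = rng.choice(active, size=active.size, replace=True)
        b = rng.choice(index, size=index.size, replace=True)
        draws.append(sample_quantile(b, tau) - sample_quantile(a, tau))
    return float(np.std(draws, ddof=1))

# Hypothetical alphas: five active funds followed by three index funds.
alphas = np.array([-4.0, -2.0, 0.0, 2.0, 4.0, -1.0, 0.0, 1.0])
is_index = np.array([False] * 5 + [True] * 3)

const_q10, coef_q10 = quantile_reg_on_indicator(alphas, is_index, tau=0.1)
const_q50, coef_q50 = quantile_reg_on_indicator(alphas, is_index, tau=0.5)
se_q10 = bootstrap_se(alphas, is_index, tau=0.1)
```

In the toy data the medians coincide while the index funds sit higher at the 10th percentile, which mimics the qualitative pattern a positive indicator coefficient in the left tail would capture.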

**4.2.1. Gross alphas**

We again start by examining gross alphas. In Table 7, we present results from the quantile regressions of benchmark-adjusted returns, before fees, on an index fund indicator variable. Each panel of the table presents the active fund performance (the constant) and the difference in alpha distributions between index funds and active funds (the coefficient on $Index_i$) under each of the different benchmark models. Each column represents a different quantile of the distribution in ascending order. For example, the middle column, Q50, represents the median active fund's benchmark-adjusted return (Constant coefficient) and the difference in medians across the two distributions (Index coefficient). The results are consistent with the visual evidence provided by the CDFs.

The simplest benchmark adjustment considers fund returns in excess of the S&P 500 return. As in the CDF plot, we see almost no difference in passive and active fund performance at or above the median. The median fund in our sample beats the S&P 500 by 11 bps per month before fees. Below the median, however, index funds statistically and economically outperform corresponding active funds. It is worth noting that this simple benchmark adjustment gives active managers credit for well-known strategies that most studies generally do not consider skill (e.g., value or momentum). It is therefore striking that even under the model that should bias most in favor of active managers, we see no evidence that the best funds outperform the best passive alternatives.

For purposes of discussion, we focus generally on the results from the distribution of four-factor (Fama-French-Carhart) alphas. These results are presented in the third panel of Table 7. Median risk-adjusted performance for actively managed funds is approximately one basis point per month. At the median, there is no economic or statistical diﬀerence between index funds and active funds; the coeﬃcient on the index fund indicator is zero. Even before fees, median risk-adjusted performance is similar across the two groups.

The left tail of the distribution is where we observe the largest diﬀerences. Under the four-factor model, we estimate that for the very worst funds (Q01), index funds outperform active funds. While the estimated performance of active funds in this quantile is -78 basis points per month, the index funds are only losing half that amount. This estimate is large economically and is statistically signiﬁcant at the 1% level. As we move along the distribution, the performance diﬀerence in favor of index funds gets smaller economically, but is still large relative to the active fund performance at that point in the distribution. At the 25th percentile, active funds lose 11 basis points per month, while index funds lose three basis points.

Perhaps these differences in the left tail are not surprising. If one believes that index funds merely track passive portfolios, then we might expect active managers, either due to poor talent or bad luck, to do worse on the down side just by virtue of the fact that they are picking stocks. However, this would suggest that the active managers should then perform better in the right tail of the distribution. We see relatively little evidence of this. At the 99th percentile, there is no statistically significant difference between the active and the passive funds. Even if we lack power to detect skill in the tail of the distribution, the magnitude of the effect is economically insignificant relative to the estimated performance of funds in that part of the distribution (the point estimate of the difference is zero basis points).[24] This is surprising. Given the evidence of Kosowski et al. (2006), Fama and French (2010), and Barras, Scaillet and Wermers (2010), who find small but significant talent in the best funds, it is remarkable that this talent does not seem to exceed the performance of the index funds with the largest estimated alphas.

[24] It is worth noting that we have power to detect differences in the left tail of the distribution, so it is reasonable that we do not lack power on the right side.

We do estimate some diﬀerences in the performance of index funds at the 95th and 75th percentiles. In a few models, the index funds perform statistically signiﬁcantly worse than the active funds at the 95th percentile. However, the magnitudes are model dependent and in one case, the estimated diﬀerence even ﬂips signs. At the 75th percentile, there is consistent evidence in favor of the active funds. However, the magnitudes of this eﬀect are small at around four basis points a month. This advantage is roughly half the size of the underperformance of the active funds at the 25th percentile. Overall, this suggests that any small advantage the active managers may have in the right shoulder of the distribution is more than oﬀset by their poor performance in the left side of the distribution.

**4.2.2. Gross t(α)**

In Table 8, we turn our attention to the distribution of t-statistics. There are some subtle differences relative to the alphas. While we still see little compelling evidence to suggest that active managers outperform, the advantage that index funds have in the left tail of the distribution is reduced when performance is adjusted for estimation precision. This is primarily due to the fact that the poor performance of the worst index funds, while smaller in magnitude, is very precisely estimated. Therefore, adjusting for residual risk, the index funds' advantage in the left tail is smaller. In the right half of the distribution, the magnitudes of the differences are small, and the magnitude and signs of the estimates differ depending on the benchmark model choice. Overall, it would be difficult to conclude that the actively managed funds perform better in the right tail of the distribution.

**4.2.3. Gross Dollar Returns**

Berk and van Binsbergen (2014b) argue that benchmark-adjusted alphas are poor measures for mutual fund skill and that skill should be measured as the total dollar value extracted from the market by fund managers. Specifically, they calculate skill as

$$Skill_i = \frac{1}{T_i} \sum_{t=1}^{T_i} q_{it-1}\, \alpha_{it}$$

where $q_{it-1}$ is the lagged real assets under management, $\alpha_{it}$ is the gross benchmark-adjusted performance of fund $i$ in month $t$, and $T_i$ is the number of monthly observations for fund $i$. Given the size disparity between active and passive funds, this measure could lead to different conclusions relative to our previous tests.

We calculate Skilli for active and index funds using our set of benchmark models and report the results in Table 9. Our skill measures are denoted in 2013 dollars.
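The sketch below illustrates why a dollar-based measure can rank funds differently from alpha. It assumes the skill measure is the time-series average of lagged real assets multiplied by monthly gross benchmark-adjusted performance, in the spirit of Berk and van Binsbergen's value-added measure. The function name `dollar_skill` and the fund sizes and alphas are hypothetical.

```python
import numpy as np

def dollar_skill(lagged_real_aum, gross_alpha):
    """Average monthly dollar value added: mean over t of q_{i,t-1} * alpha_{it}."""
    q = np.asarray(lagged_real_aum, dtype=float)
    a = np.asarray(gross_alpha, dtype=float)
    if q.shape != a.shape:
        raise ValueError("AUM and alpha series must have the same length")
    return float(np.mean(q * a))

months = 12
# Hypothetical small fund: $100 million in assets, 10 bps of alpha every month.
small_fund = dollar_skill(np.full(months, 100e6), np.full(months, 0.0010))
# Hypothetical large fund: $5 billion in assets, 1 bp of alpha every month.
large_fund = dollar_skill(np.full(months, 5e9), np.full(months, 0.0001))
```

Under these made-up inputs the small fund has ten times the alpha, yet the large fund extracts roughly five times as many dollars per month, which is why the size disparity between active and passive funds could in principle change the conclusions.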

The results are qualitatively the same as using time-series alphas to measure skill.

The distribution of the dollar amount extracted from the market by active funds overlaps substantially with the performance distribution of index funds in the right tail, while index funds outperform active funds in the left tail.[25] In our sample, the median active fund loses $27,000 per month to the market on a benchmark-adjusted basis (Fama-French-Carhart), before fees. The point estimate for the median index fund is slightly higher, a difference of $52,000 per month. In the tails, the distribution of dollar value returns is narrower for index funds, although the results are only statistically significant in the left tail, where the differences are also economically large. This outperformance by index funds is significant at the 75th percentile and below.

[25] Berk and van Binsbergen (2014b) report the distribution of their value added measure for active and index funds in Table 7 of their paper. Consistent with the main results of our paper, their active and index fund distributions exhibit remarkably similar levels of dispersion from the 5th to 95th percentile. Unlike our sample, they find that index fund dollar returns are lower at the 1st and 99th percentiles, presumably due to different sample periods and filters.

**4.3. Stochastic Dominance Tests**

Our quantile regression results are useful in assessing how percentiles of the index and active fund distributions compare, but assessing the aggregate amount of skill in the active space is challenging due to multiple comparison issues. To overcome these issues, we ask whether there exists enough skill (positive or negative) in the active fund space to warrant a risk-averse investor choosing an active fund over an index fund. We answer this question by testing the null that active funds stochastically dominate index funds and vice versa. We can easily reject the nulls that either distribution dominates the other in a first-order sense, so we focus our discussion on second-order stochastic dominance.

Our tests of stochastic dominance follow the bootstrap-based test of Barrett and Donald (2003), which is based on Kolmogorov-Smirnov tests comparing the distributions at all points. The intuition of the test of the null that distribution G second-order stochastically dominates distribution F is that a number of bootstrapped draws from F generates a distribution of CDFs that can be compared to the empirical distribution G. The deviations between F and G can be compared to those found between F and the bootstrapped CDFs to determine the likelihood of observing the deviations between F and G by chance.
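The following is a simplified sketch of a Kolmogorov-Smirnov-type bootstrap test for second-order dominance. It is written in the spirit of this class of tests and is not a replication of the Barrett and Donald (2003) procedure or of the paper's implementation. The recentered resampling scheme, the evaluation grid, the function names, and the simulated samples are all assumptions made for the example.

```python
import numpy as np

def integrated_cdf(sample, grid):
    """Integral of the empirical CDF up to z, computed as mean of max(z - x, 0)."""
    s = np.asarray(sample, dtype=float)
    z = np.asarray(grid, dtype=float)
    return np.maximum(z[:, None] - s[None, :], 0.0).mean(axis=1)

def sosd_pvalue(g, f, n_boot=200, n_grid=100, seed=0):
    """Bootstrap p-value for H0: G second-order stochastically dominates F.

    Under H0 the integrated CDF of G lies weakly below that of F everywhere,
    so a large positive supremum of (I_G - I_F) is evidence against H0.
    """
    g = np.asarray(g, dtype=float)
    f = np.asarray(f, dtype=float)
    rng = np.random.default_rng(seed)
    grid = np.linspace(min(g.min(), f.min()), max(g.max(), f.max()), n_grid)
    scale = np.sqrt(len(g) * len(f) / (len(g) + len(f)))
    i_g, i_f = integrated_cdf(g, grid), integrated_cdf(f, grid)
    stat = scale * np.max(i_g - i_f)
    exceed = 0
    for _ in range(n_boot):
        gb = rng.choice(g, size=len(g), replace=True)
        fb = rng.choice(f, size=len(f), replace=True)
        # Recentre each resampled curve around its own empirical counterpart.
        diff = (integrated_cdf(gb, grid) - i_g) - (integrated_cdf(fb, grid) - i_f)
        exceed += int(scale * np.max(diff) > stat)
    return float(stat), exceed / n_boot

# Simulated performance: a tight "index" sample and a dispersed, lower-mean
# "active" sample (hypothetical numbers).
rng = np.random.default_rng(1)
index_perf = rng.normal(0.0, 1.0, 1000)
active_perf = rng.normal(-0.5, 3.0, 1000)

_, p_index_dominates = sosd_pvalue(index_perf, active_perf)
_, p_active_dominates = sosd_pvalue(active_perf, index_perf)
```

With these simulated inputs one would expect a large p-value for the null that the tight sample dominates and a small p-value for the reverse null, mirroring the two-directional way the test is applied in the text.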

Table 10 displays p-values associated with the tests of stochastic dominance for the distributions of alpha and t(α). For alphas, we cannot reject the null that index funds dominate active funds, but we can strongly reject the null that active funds dominate index funds for all benchmark models. When using t(α), we again cannot reject the null that index funds dominate active funds for any benchmark model. For four of the six benchmarks, we can again reject the null that active funds dominate index funds.

Economically, the results indicate that the magnitude of any active fund outperformance relative to index funds is insufficient to outweigh the active funds' underperformance relative to the worst index funds. A risk-averse investor facing a random draw from the distribution of active funds versus the distribution of index funds should prefer the index fund lottery based on benchmark-adjusted performance.
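To illustrate the economic interpretation, the sketch below compares the expected utility of a random draw from two hypothetical performance distributions with equal means but different dispersion. The exponential (CARA) utility function, the risk-aversion parameter, and the outcome values are assumptions chosen only for illustration; second-order dominance implies the same ranking for any increasing, concave utility.

```python
import numpy as np

def expected_cara_utility(outcomes, risk_aversion=1.0):
    """Expected utility of an equally likely draw under u(x) = -exp(-a * x)."""
    x = np.asarray(outcomes, dtype=float)
    return float(np.mean(-np.exp(-risk_aversion * x)))

# Hypothetical benchmark-adjusted outcomes with the same mean of zero.
index_outcomes = [-1.0, 0.0, 1.0]
active_outcomes = [-4.0, -2.0, 0.0, 2.0, 4.0]

eu_index = expected_cara_utility(index_outcomes)
eu_active = expected_cara_utility(active_outcomes)
```

Because the dispersed lottery adds downside outcomes that the risk-averse investor weights more heavily than the matching upside, its expected utility is lower even though the means coincide.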

**5. Should All Active Funds Be Compared To Index Funds?**

We have shown that the distribution of benchmark-adjusted performance of index funds is remarkably similar to that of the population of active funds. If an investor faces a random draw from index funds versus active funds, the results suggest index funds provide an investment opportunity set that is at least as good as active funds.

Some readers may argue that this is not a fair comparison. One concern may be that investors need not randomly choose between active funds. Researchers have identiﬁed fund characteristics associated with better ex-post performance, and perhaps these active funds should be compared to index funds. Another concern may be that the appropriate comparison should be within a given benchmark, e.g. S&P 500 funds, rather than across all funds. In this section, we address these potential concerns and their implications.

**5.1. Index Funds vs. High Active Share/Return Gap Active Funds**

As discussed in the introduction, a strand of the mutual fund evaluation literature has focused on identifying characteristics of funds that are correlated with ex-post performance. In this section, we test how the performance of index funds compares to that of active funds selected based on these characteristics. We split the cross-section of active funds using two characteristics: Active Share (Cremers and Petajisto (2009)) and Return Gap (Kacperczyk, Sialm and Zheng (2008)). We compare the performance of active funds in the top quartile of these measures over our sample period to the performance of index funds. The results using the Fama-French-Carhart model are summarized in Figure 4.