Are ETFs A Better Benchmark?

Jocelyn Gilligan, CFA, CIPM
Partner
June 28, 2024
15 min
Using Exchange-Traded Funds (ETFs) as benchmarks instead of traditional indices has become a common practice among investors and fund managers. ETFs offer practical advantages, such as reflecting real-world trading costs and incorporating management fees and tax considerations. These aspects make ETFs a more accurate and accessible benchmark, as they are an actual investable alternative to the strategy being assessed.

However, this approach is not without its drawbacks. Understanding both the advantages and disadvantages of using ETFs as benchmarks is crucial for making informed investment decisions and ensuring accurate performance comparisons.

This article discusses the pros and cons of using an ETF as a benchmark and considerations for making an informed decision on how to go about selecting one that is meaningful.

The Advantages:

Using an ETF as a benchmark rather than the underlying index has several advantages. These include:

Cost:

The decision to use an ETF rather than an actual index as a benchmark often stems from the cost of using index performance data. Index providers typically charge licensing fees for access to their indices, and these fees can be cost-prohibitive for some firms, especially smaller ones or those with limited resources.

ETFs offer a more accessible and cost-effective alternative, as they provide readily available, real-time performance data and can be traded easily on stock exchanges and accessed by anyone. By using an ETF as a benchmark, firms can circumvent the barriers to entry associated with marketing index performance directly, allowing them to still compare performance against a relevant benchmark.

Practical Investment Comparison:

ETFs represent actual investment vehicles that investors can buy and sell, thus providing a more practical and realistic performance comparison. Indices, on the other hand, are theoretical constructs that do not account for real-world trading costs, whereas ETFs do. Additionally, ETFs are traded on stock exchanges and can be bought and sold throughout the trading day at market prices, unlike indices which cannot be directly traded.

Incorporation of Costs:

ETFs include trading and management expenses and other costs associated with managing the pool of securities. When using an ETF as a benchmark, you get a more accurate reflection of the net returns an investor would actually receive after these costs. In addition, ETF performance considers the costs of buying and selling the underlying assets, including bid-ask spreads and any market impact, which indices do not.

Dividend Reinvestment:

ETFs may account for the reinvestment of dividends, providing a more accurate measure of total return. Indices often do not factor in the practical aspects of dividend reinvestment, such as timing delays, transaction costs, and tax implications, leading to a potentially less realistic depiction of investment returns.

Tax Considerations:

ETFs may have different tax treatments and efficiencies compared to the theoretical index performance. Using an ETF as a benchmark will reflect these considerations, providing a potentially more relevant comparison for taxable investors.

Replication and Tracking Error:

ETFs can exhibit tracking error, which is the deviation of the ETF's performance from the index it seeks to replicate. While tracking error may be perceived as a limitation, it also reflects the real-world challenges and frictions involved in managing an investment portfolio. Thus, using an ETF as a benchmark captures this aspect of real-world performance, which acknowledges the practical complexities of investing and enhances transparency and accountability in investment decision making.

Transparency and Real-time Data:

ETFs provide real-time pricing information throughout trading hours, allowing investors to monitor and compare performance continuously as market conditions fluctuate. This real-time data enables more informed and timely decision-making, as investors can react instantly to market events, manage risks more effectively, and capitalize on opportunities as they arise.

Advantages Summary

In summary, using an ETF as a benchmark provides a less costly, more realistic, practical, and accurate measure of investment performance that includes real-world considerations like costs, liquidity, tax implications, and dividend reinvestment, which are not fully captured by indices. ETFs are a true investable alternative, while indices are not directly investable.

The Disadvantages:

While using an ETF as a benchmark has several advantages, there are also some potential drawbacks to consider:

Downside of Tracking Error:

ETFs may not perfectly track their underlying indices due to various factors such as imperfect replication methods, sampling techniques, and management decisions. This tracking error can result from differences in timing, costs, and portfolio composition between the ETF and its benchmark index.

This deviation can lead to discrepancies when comparing the ETF's performance to the actual index and can affect investors' expectations, portfolio management decisions, and performance evaluations. Thus, it is prudent to evaluate and monitor tracking error of ETFs when they are used as a benchmark.
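As a rough sketch of what that monitoring could look like, the Python below computes two common measures from periodic returns: tracking difference (the gap in cumulative return) and tracking error (the annualized standard deviation of the periodic return differences). The function names and the monthly figures are hypothetical, chosen only for illustration.

```python
import math
import statistics


def tracking_difference(etf_returns, index_returns):
    """Cumulative ETF return minus cumulative index return."""
    if len(etf_returns) != len(index_returns):
        raise ValueError("return series must cover the same periods")
    etf_growth = math.prod(1.0 + r for r in etf_returns)
    index_growth = math.prod(1.0 + r for r in index_returns)
    return etf_growth - index_growth


def tracking_error(etf_returns, index_returns, periods_per_year=12):
    """Annualized sample standard deviation of the periodic return differences."""
    if len(etf_returns) != len(index_returns):
        raise ValueError("return series must cover the same periods")
    differences = [e - i for e, i in zip(etf_returns, index_returns)]
    return statistics.stdev(differences) * math.sqrt(periods_per_year)


# Hypothetical monthly returns for an ETF and the index it tracks.
etf_monthly = [0.0100, -0.0200, 0.0150]
index_monthly = [0.0102, -0.0199, 0.0153]

print(f"Tracking difference: {tracking_difference(etf_monthly, index_monthly):.4%}")
print(f"Tracking error (annualized): {tracking_error(etf_monthly, index_monthly):.4%}")
```

A persistent negative tracking difference with a low tracking error usually points to a steady cost drag, while a high tracking error suggests the ETF's holdings or timing diverge from the index.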

Tracking Method: Full Replication vs. Sampling

ETFs employ different replication strategies to track their underlying indices, with some opting for full replication, while others utilize sampling techniques. These differences can lead to varying levels of tracking error and performance differences from the underlying index.

Full replication involves holding all of the securities in the index in the same proportions as they are weighted in the index, aiming to closely mirror its performance. In contrast, sampling techniques involve holding a representative subset of securities that capture the overall characteristics of the index.

While full replication theoretically offers the closest tracking to the index, it can be more costly and logistically challenging, especially for indices with a large number of securities. Sampling, while potentially more cost-effective and manageable, introduces the risk of tracking error, as the subset of securities may not perfectly reflect the index's performance.

Non-Comparable Expense Ratios:

ETFs incur management fees, which reduce returns over time. While these fees are part of the real-world costs, they can make the ETF's performance look worse compared to the theoretical performance of the index, especially when compounded over time. This may be problematic when using an ETF as a comparison tool: the expense ratio drags down the ETF benchmark's performance, making the strategy appear to have performed better than it would have against the actual index. This has the potential to influence investment decisions and performance evaluations. To address this concern, the GIPS standards now require firms that use an ETF as a benchmark to disclose the ETF’s expense ratio.
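To put rough numbers on the compounding effect, here is a minimal sketch. It assumes, purely for illustration, that the ETF earns the index return less its expense ratio every year with no other tracking differences; the 8% return and 0.20% expense ratio are hypothetical inputs, not data for any real fund.

```python
def cumulative_return(annual_return, years):
    """Compound a constant annual return over a number of years."""
    return (1.0 + annual_return) ** years - 1.0


def benchmark_gap(index_annual, expense_ratio, years):
    """Cumulative index return minus cumulative ETF return.

    Simplifying assumption: the ETF earns the index return less its
    expense ratio every year, with no other tracking differences.
    """
    etf_annual = index_annual - expense_ratio
    return cumulative_return(index_annual, years) - cumulative_return(etf_annual, years)


# Hypothetical inputs: an 8% annual index return and a 0.20% expense ratio.
for horizon in (1, 5, 10):
    gap = benchmark_gap(0.08, 0.0020, horizon)
    print(f"{horizon:>2} years: ETF benchmark trails the index by {gap:.2%}")
```

The gap is the amount by which a strategy's excess return would be flattered when measured against the ETF rather than the index over the same horizon.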

Many active managers might argue that it’s “unfair” that the SEC requires them to compare net returns against an index that has no fees or expenses. However, if the strategy’s goal is to beat the index with active management, the manager should be doing this even after fees, otherwise passive investing (with lower fees) is a better option.

Liquidity Constraints:

Some ETFs may suffer from lower liquidity, leading to wider bid-ask spreads and higher trading costs, especially for large transactions. This can affect the ETF's performance and make it less ideal as a benchmark.

Selection Dilemma

Multiple ETFs may track the same index, each with different structures, expense ratios, and tracking accuracy (e.g., check out the differences between SPY, IVV, VOO, SPLG). Choosing the most appropriate ETF as a benchmark should therefore involve consideration of factors such as cost-effectiveness, liquidity, tracking error, and the strategy’s specific investment objectives. Some due diligence should be done to ensure that the selected ETF aligns closely with the desired index and makes sense for the investment strategy.

Some firms have made it a habit to mix the use of different ETFs in factsheets, often because their data sources lack all the data needed for one ETF. While it may seem like it’s all the same, for many of the reasons discussed in this post, not all ETFs are created equal. We do not recommend mixing benchmarks, even when using actual indices (e.g., comparing performance returns to the Russell 1000 Growth, but then showing other statistics like sectors compared to the S&P 500). Similarly, we wouldn’t recommend doing that with ETFs either (e.g., comparing performance returns to IVV but using sector information from SPY). Mixing benchmark information in factsheets is messy and likely to be questioned by regulators, especially when doing so makes strategy performance look better.

Regulatory and Structural Issues:

ETFs are subject to evolving regulatory oversight that might affect their operations, costs and performance as benchmarks. This is not the case for indices.

In addition, the structural differences between ETFs, particularly regarding whether they are physically backed or use synthetic replication through derivatives, can significantly impact their risk profile and performance relative to their underlying indices.

Physically backed ETFs typically hold the actual securities that comprise the index they track, aiming to replicate its performance as closely as possible. In contrast, synthetic ETFs use derivatives, such as swaps, to replicate the index's returns without owning the underlying assets directly. While synthetic replication can offer cost and operational advantages, it also introduces counterparty risk, as the ETF relies on the financial stability of the swap provider.

As a result, it’s best to consider the structure of the ETF before using it as a benchmark.

Market Influences:

ETFs can trade at prices above (premium) or below (discount) their net asset value (NAV), which can introduce short-term performance differences that are not reflective of the underlying index performance.

These premiums and discounts arise due to supply and demand dynamics in the market, as well as factors such as investor sentiment, liquidity, and trading volume. These fluctuations can affect the ETF's reported returns and introduce discrepancies when comparing its performance to the benchmark index. Therefore, investors must consider the impact of these premiums and discounts on the ETF's short-term performance and recognize that these variances may not accurately represent the true performance of the underlying index.

When material differences in price vs. NAV exist, some firms believe that the NAV is a better representation of the fair value rather than the price and have used NAV for performance calculations. Please note that when this is done, it is important to document how fair value is determined and if the performance is based on the change in NAV or change in trading price.
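A small sketch shows how the choice between price and NAV can change the reported benchmark return. The function names and the values are hypothetical, and distributions are ignored to keep the arithmetic simple.

```python
def premium_discount(market_price, nav):
    """Premium (positive) or discount (negative) of price relative to NAV."""
    return market_price / nav - 1.0


def period_return(begin_value, end_value):
    """Simple return over one period, ignoring distributions."""
    return end_value / begin_value - 1.0


# Hypothetical values for one ETF over one period.
begin_nav, begin_price = 100.00, 100.10
end_nav, end_price = 102.00, 101.80

print(f"Starting premium/discount: {premium_discount(begin_price, begin_nav):+.2%}")
print(f"Ending premium/discount:   {premium_discount(end_price, end_nav):+.2%}")
print(f"Return based on NAV:       {period_return(begin_nav, end_nav):.2%}")
print(f"Return based on price:     {period_return(begin_price, end_price):.2%}")
```

Here the ETF moves from a small premium to a small discount, so the price-based return is lower than the NAV-based return even though the underlying holdings are identical.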

Currency Risk:

Investors utilizing ETFs tracking international indices face the added complexity of currency fluctuations, which can significantly influence the ETF's performance. When investing in foreign ETFs, investors are exposed to currency risk, as fluctuations in exchange rates between the ETF's base currency and the currencies of the underlying index's constituents can impact returns. Currency movements can either enhance or detract from the ETF's performance, depending on whether the base currency strengthens or weakens relative to the underlying currencies.

Consequently, currency risk should be considered when using international ETFs as benchmarks.
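The currency effect compounds with the local return rather than simply adding to it. As a minimal sketch with hypothetical numbers (the function name is illustrative):

```python
def base_currency_return(local_return, fx_return):
    """Combine a local-currency return with the change in the exchange rate.

    fx_return is the change in the value of the foreign currency, measured
    in the investor's base currency, over the same period.
    """
    return (1.0 + local_return) * (1.0 + fx_return) - 1.0


# Hypothetical period: the foreign holdings gain 5% in local terms.
local = 0.05
print(f"Foreign currency weakens 3%:     {base_currency_return(local, -0.03):.2%}")
print(f"Foreign currency strengthens 3%: {base_currency_return(local, 0.03):.2%}")
```

When comparing a strategy to an international ETF, it helps to confirm that both returns are expressed in the same base currency and that any hedging policy is comparable.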

Dividend Handling:

The handling of dividends by ETFs, whether they are paid out to investors or reinvested back into the fund, can have a notable impact on their total return compared to the index they track. Indices typically assume continuous reinvestment of dividends without considering real-world frictions such as transaction costs or timing delays associated with reinvestment. In contrast, ETFs may adopt different dividend distribution policies based on investor preferences and fund objectives.

ETFs that reinvest dividends back into the fund can potentially enhance their total return over time by capitalizing on the power of compounding. However, this approach may result in tracking errors if the reinvestment process incurs costs or timing discrepancies that deviate from the index's assumed reinvestment.

ETFs that distribute dividends to investors as cash payments may offer more immediate income but could lag behind the index's total return if investors do not reinvest these dividends efficiently. Therefore, the dividend handling policy adopted by an ETF can significantly influence its performance relative to the index and should be carefully considered.
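The difference between price return and total return can be sketched as follows. The example assumes each dividend is reinvested at the period-end price with no cost or delay, which is the frictionless assumption an index typically makes; the prices and dividends are hypothetical.

```python
def price_return(prices):
    """Return from price change only, ignoring dividends."""
    return prices[-1] / prices[0] - 1.0


def total_return(prices, dividends):
    """Link period returns, treating each dividend as reinvested at period end.

    dividends[k] is the per-share dividend paid during the period that runs
    from prices[k] to prices[k + 1]. Reinvestment is assumed to be costless.
    """
    if len(dividends) != len(prices) - 1:
        raise ValueError("need exactly one dividend entry per period")
    growth = 1.0
    for begin, end, dividend in zip(prices, prices[1:], dividends):
        growth *= (end + dividend) / begin
    return growth - 1.0


# Hypothetical period-end prices and per-share dividends for two periods.
prices = [100.0, 102.0, 101.0]
dividends = [0.5, 0.5]

print(f"Price return: {price_return(prices):.2%}")
print(f"Total return: {total_return(prices, dividends):.2%}")
```

Whichever series is used for the ETF benchmark, it should match the basis on which the strategy's own returns are calculated.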

Lack of Historical Data:

Some ETFs, especially newer ones, may not have a long track record. This can make historical performance comparisons less reliable or comprehensive. Without an extensive performance history, sufficient data may be lacking to assess an ETF's performance across various market conditions and economic cycles, making it challenging to gauge its potential risks and returns accurately.

Strategies that existed long before an ETF was created to track the comparable index may not have ETF benchmark data for their full history. Many firms need to use multiple benchmarks to cover the entire period, because for strategies that go way back, an ETF may not exist back to inception. Be sure to include rationale in your documentation for benchmark selection so that it is clear when and why a benchmark was selected for the given time periods.
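One way to keep that history auditable is to build the benchmark series with the source recorded for every period. The sketch below is illustrative only: the function names, period labels, and returns are hypothetical, and periods are assumed to be "YYYY-MM" strings so that they sort chronologically.

```python
def splice_benchmark(periods, prior_returns, etf_returns, etf_start):
    """Build one benchmark series from two sources.

    Periods before etf_start use prior_returns; periods from etf_start
    onward use etf_returns. Each row keeps a source label so the change
    can be documented. A missing period raises KeyError instead of
    being skipped silently.
    """
    spliced = []
    for period in periods:
        if period < etf_start:
            spliced.append((period, prior_returns[period], "prior benchmark"))
        else:
            spliced.append((period, etf_returns[period], "ETF"))
    return spliced


def linked_return(spliced):
    """Geometrically link the periodic returns in a spliced series."""
    growth = 1.0
    for _, period_return, _ in spliced:
        growth *= 1.0 + period_return
    return growth - 1.0


# Hypothetical monthly data: the ETF's history begins in 2010-01.
periods = ["2009-11", "2009-12", "2010-01", "2010-02"]
prior_returns = {"2009-11": 0.010, "2009-12": 0.020}
etf_returns = {"2010-01": -0.010, "2010-02": 0.015}

spliced = splice_benchmark(periods, prior_returns, etf_returns, "2010-01")
for period, value, source in spliced:
    print(f"{period}  {value:+.2%}  {source}")
print(f"Linked benchmark return: {linked_return(spliced):.2%}")
```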

Conclusion:

In conclusion, using ETFs as benchmarks offers practical benefits, potentially making them a more accurate and accessible measure of investment performance compared to traditional indices since they are an actual investable alternative to hiring an active manager. However, these benefits do not come without shortcomings. By carefully evaluating these factors and considering the specifics of the ETFs selected for each strategy, managers can effectively use ETFs as benchmarks to assess and monitor investment strategies. In understanding these factors, an ETF may actually be a better comparison tool for your strategy than the underlying index.

Every Spring, the performance measurement community gathers for PMAR: The Performance Measurement, Attribution & Risk Conference, hosted by TSG. This year marked the twenty-fourth annual, and I left thinking about it differently than I have in years past.

Most years, the themes evolve gradually. This year, I felt like the ground was moving.

The theme nobody put on the agenda but ran underneath nearly every session was the pace of change. Specifically, what artificial intelligence is about to do to our work. And while I came away energized, I also came away with a healthy dose of "we (as in everyone) are not ready for how fast this is coming."

Here's what stayed with me.

AI Was the Undercurrent of the Whole Event

The session titled "AI, Anxiety, and Opportunity: What Performance Professionals Need to Know" was, predictably, one of the most sought-after sessions of the conference. The panel, which included practitioners from across the industry, did a nice job naming both sides of the coin: the anxiety of not knowing what your job looks like in five years, and the opportunity sitting right in front of us if we lean in.

Here's my honest read of the room, though. The mood was optimistic. Maybe a little too optimistic. There was a comfortable assumption that AI will mostly handle the tedious parts and leave the interesting work to us. Or that AI won’t take your job, someone that knows AI will. I'm not sure it'll be that tidy.

From what we're already seeing in our own work and across the firms we serve, the capabilities are advancing faster than most people can comprehend. The days where “our industry is just slower to adapt” are gone. Just last week, Anthropic released Fable 5, and before it was shut down (temporarily?), we played around with it a little. Its capabilities are dumbfounding. I don't think it will be long before these conferences look drastically different. Different sessions, different vendors, maybe a different sense of what the job even is. That's not a doom prediction. It's just a reason to pay closer attention than feels comfortable.

Separating Skill From Luck Just Got Harder and More Important

One of my favorite sessions was Michael Ervolini's "You Can't Find Skill in Returns: Distinguishing Performance From the Decisions That Generate Them." It's a deceptively simple premise: returns tell you what happened, not whether the manager was actually good. A great number can come from a great decision, or from luck. A bad number can hide genuine skill.

What I appreciate about PMAR is that the community keeps bringing fresh perspectives to this old, hard problem: how do we actually evaluate skill versus luck? It's a question that never fully resolves, and every year someone pushes the thinking forward.

It struck me that this question gets more important in an AI world, not less. As machines take over more of the calculation and even some of the decision-making, our value shifts toward judgment – knowing which decisions deserved credit, which results were noise, and what a number actually means in context. That's the kind of discernment a model can assist with but can't own. For more from Mr. Ervolini, here's a link to his latest book Skill vs. Luck.

The GIPS Challenges That Keep Coming Back

I'm biased here, but the "Common GIPS Challenges and How to Avoid Them" session was a highlight for us, in part because our own Matthew Deatherage, CFA, CIPM, was on the panel alongside peers from TSG, MassPRIM, and Strategic Investment Group.

What I always find striking about this topic is how consistent the challenges are. Firms pursuing compliance with the Global Investment Performance Standards (GIPS®)* tend to stumble on the same handful of issues year after year, and almost all of them are avoidable with the right foundation in place. That's a big part of why we do what we do at Longs Peak: helping firms get ahead of those pitfalls instead of discovering them during verification or, worse, during a regulatory exam.

Matt is a familiar face on these panels, and it's great to have our perspective in the mix. But the takeaway that stuck with me tied right back to the AI thread running through the whole conference.

Across several different panels, presenters talked about feeding the GIPS standards into their own AI models to churn out GIPS reports. And here's the thing: anyone can do that. You can drop the standards into a model in minutes. What a model can't do is provide critical judgment about how a principles-based framework should be applied to your specific facts and circumstances, or about whether those GIPS reports and statistics were calculated correctly. The GIPS standards aren't a checklist; they're a set of principles that require interpretation, and interpretation is exactly where experience earns its keep.

I'm not saying don't use AI to help build a framework. Use it. But like any model, if you don't really know what you're asking it to do, the output won't save you. Simply asking a model to "make my firm GIPS compliant" isn't going to make it so. At least not yet!

And there's one problem every performance professional already knows AI hasn't solved: data. As they say, garbage in, garbage out. Meaningful performance lives and dies on clean, well-organized data, and no software tool or AI model fixes messy inputs alone. At Longs Peak, we have spent the last 10 years working with clients to improve data quality through data integrity testing. For us, these AI models have only expanded what’s possible. We know one thing for sure: setting these tools up with the proper context (i.e., knowing what to look for) and then evaluating that context on an ongoing basis may turn out to be the most crucial piece of it all.

CFA Institute Is Listening on the CIPM

A session I didn't expect to find as interesting as I did was "CIPM Through the Practitioner Lens," facilitated by Rob Langrick of CFA Institute. Rather than simply presenting at the room, CFA Institute came to listen and gather candid feedback on the CIPM designation: where it's delivering value, where it's falling short, and how it should evolve to stay relevant to the work we actually do day to day.

The audience didn't hold back, and there were some genuinely thoughtful suggestions including how the code of ethics will evolve in this new AI era, some recommendations on reformatting the exam to break it into smaller chunks (going into greater detail on each) as well as adding a CIPM group within the CFA societies to encourage further connection. It was refreshing to see CFA Institute putting real energy behind a credential that so many of us have invested in and want to see grow in value. Given the pace of change in our field, willingness to adapt feels necessary. For anyone interested in contributing ideas to the CIPM, you can use this link to provide feedback.

A Quick Word on the Trivia

I'd be remiss not to mention that Performance Trivia got a much-needed upgrade this year. In past years, only a handful of contestants got to play while the rest of us watched (though in fairness, not all of us were clamoring for the spotlight). The new format this time allowed everyone to participate (without taking center stage), and it was a lot more fun for it. A small change, but it captured something I value about this community: it's competitive, but it's also genuinely collegial and prides itself on memorizing quirky names and vintage formulas.

Before PMAR Even Started: Women in Performance Measurement

For me, the week actually started the day before the conference, at the Women in Performance Measurement (WiPM) gathering, an event created just for the women in our industry. It's one of my favorite parts of this trip every year, and not only because the conversation is good. There's something energizing about being in a room full of women who do this work, comparing notes and reconnecting.

Fittingly, AI came up here too, though in a much more hands-on way than it would on the main stage. Practitioners shared real use cases, both personal and professional: the small ways AI is already saving them time day to day, and the bigger experiments they're running at their firms. It was practical, curious, and refreshingly free of hype.

We were also lucky to have a guest speaker, Lidia Arshavsky, who spoke on executive presence. She broke down how executive presence actually gets evaluated inside organizations (the signals people pick up on, often without realizing it) and offered practical recommendations for strengthening your own. It was the kind of talk that's useful no matter where you are in your career.

It was a great way to kick off PMAR, and an even better way to reconnect with women I only get to see a few times a year. Sometimes the most valuable part of a conference happens in these opportunities to network and reconnect within our niche performance community. A big thank you to TSG, which donated the space for this event and has done so for many years.

What AI Can't Take From Us

The conference's forward-looking sessions, including "Innovative Ways to Present Performance: Dashboards & Analytics," got me thinking. The tools are evolving so quickly and so much of the analysis, presentation, and reporting can now be automated. I am left wondering how long the traditional use of software in our space will last in its current form.

When the capabilities advancing fastest don’t always come from the established vendors, who benefits? My hope is that everyone does. That these tools level a playing field that used to tilt heavily toward the largest institutions, give smaller firms the ability to deliver high-caliber analytics previously out of reach, and push the whole field toward better solutions. That makes for a more competitive space and ultimately a clearer picture for investors to evaluate their options.

That's the optimistic case, and I believe it. But it only holds if we stay clear-eyed about where our own value comes from, and that's the note I want to leave you on. The pace of change is a reason to focus, not to panic. The things that make us valuable are the things AI can't take: consciousness, judgment, and the human-in-the-loop accountability that clients ultimately trust. Machines will calculate faster and present prettier. They won't sit across the table from a client and take responsibility for what a number actually means.

So, by all means, get curious about the tools (Claude seemed to be most people’s favorite – mine as well). Experiment. Don't be the individual or firm that gets left behind. But anchor yourself in the part of this work that's irreplaceably human, because that's the part that was always the point.

See you at PMAR 2027. I suspect it'll look a little different.

________________________________________

GIPS® is a registered trademark owned by CFA Institute. CFA Institute does not endorse or promote this organization, nor does it warrant the accuracy or quality of the content contained herein.

Most firms that decide to pursue compliance with the Global Investment Performance Standards (GIPS®) already understand why it matters. They know institutional investors and consultants often expect it. They understand the credibility that comes from standardized, transparent performance reporting. And they recognize that a strong performance reporting process can improve internal consistency well beyond marketing.

What many firms struggle with is not the “why,” but the “how.”

The GIPS standards can be especially intimidating at first glance. There are detailed requirements, technical terminology, and a long list of considerations that touch everything from performance to operations, compliance, and marketing. For firms approaching GIPS compliance for the first time, it is easy to feel like they need to solve everything at once.

Whether working independently or with external GIPS support, becoming GIPS compliant is significantly more manageable when approached through a structured process.

At a high level, implementing the GIPS standards comes down to four major phases:

  1. Define the firm and scope the universe of portfolios
  2. Build policies and procedures
  3. Construct composites and calculate performance
  4. Create GIPS Reports and establish ongoing monitoring controls

The order matters more than many firms realize. When the foundation is built correctly at the beginning, the downstream work becomes significantly easier. For a broader overview of what the GIPS standards are and why firms choose to comply, see our earlier post, What Are the GIPS Standards?

Phase 1: Define the Firm and Scope Your Universe

Before constructing composites or calculating returns, firms first need to define the “firm” for the purpose of claiming compliance with the GIPS standards. This sounds simple, but it is often one of the most important decisions in the entire implementation process.

The GIPS standards require compliance on a firm-wide basis. Firms cannot selectively apply compliance to only their best-performing strategies or business lines. The firm definition determines which portfolios fall within the scope of compliance and ultimately impacts composite construction, total firm assets, disclosures, and marketing claims.

For smaller organizations, this step is often straightforward. The legal entity, branding, regulatory registration, and operational structure are usually aligned. In those situations, the firm definition may be relatively easy to document.

For larger organizations with complex legal structures or ones that operate under multiple brands, the analysis can become significantly more complex.

The following questions should be considered:

  • How is the firm held out to the public?
  • Do affiliates or subsidiaries share investment personnel or investment decision-making?
  • How are the various entities registered and branded relative to one another? 
  • How are investment strategies actually managed across entities?

The answers to these questions matter because the firm definition affects everything that follows. For complex organizations, it is worth investing real time here and involving compliance and legal teams before any other work begins.

Once the firm is defined, the next step is performing a full inventory of assets (or portfolios) that fall within the defined firm. That includes discretionary accounts, non-discretionary accounts, pooled funds, terminated portfolios, and any other assets managed by the firm over the entire period for which the firm will claim compliance.

One of the most common implementation mistakes is discovering late in the process that certain portfolios were overlooked or incorrectly categorized. Taking the time upfront to fully scope the universe of portfolios prevents significant cleanup work later on.

Phase 2: Build Your GIPS Standards Policies & Procedures Manual

The next step is building the firm’s GIPS standards policies and procedures manual, often referred to as the GIPS standards “P&P.”

The GIPS standards require firms to document the policies and procedures used to comply with all applicable requirements. But beyond satisfying the standards themselves, strong documentation creates consistency across operations, compliance, marketing, and portfolio management teams. Your GIPS standards P&P becomes the operational blueprint for how your firm calculates performance and maintains GIPS compliance. When drafted thoughtfully, ongoing maintenance becomes manageable. Firms that rush this phase often find themselves cleaning up problems indefinitely.

A well-designed P&P typically addresses the following:

  • Firm definition
  • Definition of discretion
  • Composite construction rules
  • Treatment of significant cash flows and composite minimums
  • Calculation methodologies
  • Fair valuation hierarchy
  • Error correction procedures
  • Books and records retention
  • GIPS Report distribution policies
  • Benchmark selection and changes
  • Fee schedules and policies for the use of actual or model fees

The definition of discretion deserves particular attention. Under the GIPS standards, discretion is not the same thing as having legal discretion documented in the investment management agreement. A client may impose restrictions that prevent full implementation of the strategy, and if those restrictions are significant enough, the portfolio should be classified as non-discretionary under the GIPS standards. Firms should establish objective criteria that can be applied consistently across portfolios and clearly document those criteria within their P&P. This determination has a direct impact on composite construction, as only portfolios deemed discretionary may be included in composites, while non-discretionary portfolios must be excluded.

Calculation methodology should also be clearly documented within the P&P. Firms should define how external cash flows are handled, the methodology used to asset-weight portfolios within composites, and whether any composites are subject to minimum asset levels or significant cash flow policies. For a more detailed discussion of large cash flow policies versus significant cash flow policies, and why both matter, see our post Large vs. Significant Cash Flows: What’s the Difference?
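As one illustration of asset-weighting, the sketch below weights each portfolio's period return by its beginning-of-period value. This is only one common approach; a firm's P&P might instead adjust the weights for external cash flows. The function name and the portfolio figures are hypothetical.

```python
def composite_return(portfolios):
    """Asset-weighted composite return using beginning-of-period values.

    portfolios is a list of (beginning_value, period_return) pairs for the
    portfolios included in the composite for the period.
    """
    total_beginning_value = sum(value for value, _ in portfolios)
    if total_beginning_value <= 0:
        raise ValueError("composite needs positive beginning assets")
    weighted_sum = sum(value * period_return for value, period_return in portfolios)
    return weighted_sum / total_beginning_value


# Hypothetical composite with two portfolios for a single month.
members = [
    (1_000_000.0, 0.020),  # smaller portfolio, 2.0% return
    (3_000_000.0, 0.010),  # larger portfolio, 1.0% return
]
print(f"Composite return: {composite_return(members):.2%}")
```

Because the larger portfolio carries three times the weight, the composite return sits closer to 1.0% than to the simple average of the two returns.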

Finally, firms often do not devote enough attention to developing their error correction policy during implementation. The GIPS standards require firms to establish materiality thresholds in advance that determine what actions must be taken when an error is identified. The time to think through that process is before an error occurs, not in the middle of responding to one.
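Setting thresholds in advance means the response to an error can be looked up rather than debated. The sketch below shows the idea with a tiered policy keyed to the size of a return error. The tiers, threshold values, and action wording are hypothetical placeholders, not the standards' language; each firm's own policy defines what applies.

```python
# Hypothetical policy: upper bound on the absolute return error (as a
# decimal) for each tier, paired with the action the firm would take.
HYPOTHETICAL_TIERS = [
    (0.0005, "take no action"),
    (0.0025, "correct without disclosure"),
    (0.0100, "correct and disclose, no redistribution"),
]
MATERIAL_ACTION = "correct, disclose, and redistribute the corrected report"


def error_correction_action(reported_return, corrected_return, tiers=HYPOTHETICAL_TIERS):
    """Return the policy action for the gap between reported and corrected returns."""
    error = abs(corrected_return - reported_return)
    for upper_bound, action in tiers:
        if error < upper_bound:
            return action
    # Anything at or above the highest bound is treated as material.
    return MATERIAL_ACTION


# Hypothetical errors of different sizes.
print(error_correction_action(0.0520, 0.0522))
print(error_correction_action(0.0500, 0.0520))
print(error_correction_action(0.0500, 0.0800))
```

A real policy would likely set separate thresholds for different statistics and time periods, but the principle is the same: the decision rule exists before the error does.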

We often find that this is the phase where firms realize that implementing GIPS compliance is not just a performance reporting exercise. It frequently exposes inconsistencies in operational workflows, account coding, historical records, or portfolio classifications. That is not necessarily a bad thing. It allows you to intentionally strengthen processes and reporting before those issues surface in higher-stakes situations such as verification, a regulatory examination, or investor due diligence.

One of the biggest hidden benefits of the GIPS compliance implementation process is that it forces firms to formalize processes that may have evolved informally over time. Many firms come out of the GIPS compliance implementation process with cleaner data, stronger internal controls, and more consistency across teams.

The key is to make the policies practical. The best GIPS standards P&Ps are not written purely for regulators or verifiers. They are designed to reflect what the firm actually does in practice and to provide internal teams with a framework they can follow consistently.

Phase 3: Construct Composites and Calculate Performance

Once policies and procedures are in place, firms can begin constructing composites and calculating performance. This is the phase most firms associate with GIPS compliance because it is where the visible performance reporting work happens. At this point, much of the difficult decision-making should already be complete. Composite construction may sound straightforward conceptually, but this is the phase where most firms get stuck in the implementation process.

Under the GIPS standards, discretionary portfolios must be grouped into composites based on similar investment mandates. The goal is to ensure firms present strategy-level performance fairly and consistently instead of selectively highlighting individual account results. The construction process typically has four steps:

  1. Identify every portfolio that meets the composite definition
  2. Determine the correct dates each portfolio should be in and out of the composite
  3. Asset-weight the portfolio-level monthly returns to produce composite-level performance
  4. Calculate all required composite-level statistics, including internal dispersion and the three-year annualized ex-post standard deviation of both the composite and the benchmark
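Steps 3 and 4 above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: it weights by beginning-of-period market value (one common approach), measures internal dispersion as the equal-weighted standard deviation of annual portfolio returns, and uses the population standard deviation throughout. The function names and sample figures are hypothetical, and a firm's documented methodology may differ.

```python
import math
import statistics


def composite_return(beginning_values, returns):
    """Asset-weight portfolio returns by beginning-of-period market value."""
    total = sum(beginning_values)
    if total <= 0:
        raise ValueError("total beginning value must be positive")
    return sum(v * r for v, r in zip(beginning_values, returns)) / total


def internal_dispersion(annual_portfolio_returns):
    """Equal-weighted standard deviation of annual portfolio returns."""
    return statistics.pstdev(annual_portfolio_returns)


def three_year_annualized_std_dev(monthly_returns):
    """Standard deviation of 36 monthly returns, annualized by sqrt(12)."""
    if len(monthly_returns) != 36:
        raise ValueError("expected exactly 36 monthly returns")
    return statistics.pstdev(monthly_returns) * math.sqrt(12)


# Illustrative data: two portfolios in the composite for one month.
month_return = composite_return([1_000_000, 3_000_000], [0.02, 0.01])
print(f"Composite monthly return: {month_return:.4%}")

# Illustrative annual returns for portfolios in the composite all year.
print(f"Internal dispersion: {internal_dispersion([0.10, 0.12]):.4%}")
```

The same three_year_annualized_std_dev function would be run on both the composite's and the benchmark's trailing 36 monthly returns.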

It sounds simple, but it’s not always easy. The data work alone is substantial. Before anything meaningful can be built, the underlying portfolio-level data must be reviewed to ensure it is reconciled and accurate. That foundation matters because everything downstream depends on it.

The historical composite membership analysis adds another layer of complexity. Strategies evolve. Clients add or remove restrictions. Portfolios change mandates. When building a composite with a ten-year history, the question is not simply which portfolios belong in the composite today. It is which portfolios belonged in the composite during each period in the historical record, and why. Working through that analysis portfolio by portfolio and period by period requires both strong documentation and sound judgment.

The good news is that there are tools available to help firms identify historical performance outliers to help test if composite inclusion was accurate. When a portfolio within a composite posts a return that deviates meaningfully from its peers during a given period, that deviation can be identified for further research. Was there a client-imposed restriction during that period that prevented full implementation of the strategy (i.e., making it non-discretionary)? Or was the deviation driven by a large cash flow, different starting position, security-specific activity, or simply the normal variation expected across portfolios managed within the same strategy? The answer determines whether the portfolio appropriately belongs in the composite for that period, and the wrong conclusion in either direction can undermine the integrity of the track record.
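A simple version of this kind of outlier screen can be sketched as follows. The approach (distance from the composite's median return), the tolerance of one percentage point, the portfolio IDs, and the function name flag_outliers are all assumptions made for illustration; commercial tools may use different tests.

```python
import statistics


def flag_outliers(returns_by_portfolio: dict[str, float],
                  tolerance: float = 0.01) -> list[str]:
    """Return portfolio IDs whose period return differs from the
    composite median by more than the tolerance (hypothetical: 1 point)."""
    median = statistics.median(returns_by_portfolio.values())
    return sorted(
        pid for pid, r in returns_by_portfolio.items()
        if abs(r - median) > tolerance
    )


# Illustrative monthly returns for portfolios in one composite.
monthly_returns = {"P1": 0.021, "P2": 0.019, "P3": 0.020, "P4": -0.015}
print(flag_outliers(monthly_returns))
```

A flag is only a prompt for research. Whether the flagged portfolio stays in the composite for that period still depends on the reason for the deviation.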

It is also important to be direct about the limits of technology in this process. Software can identify anomalies, efficiently perform calculations, and output results once the inputs are correct, but no system makes judgment calls. Determining how composites should be defined, how discretion should be applied, what constitutes a significant restriction, and how to evaluate edge cases in historical membership all require experienced professionals who understand both the investment strategies and the GIPS standards. The policies documented in Phase 2 provide the framework, but applying that framework consistently across real portfolios and real historical circumstances requires thoughtful human judgment at every step.

Perhaps most importantly, composite construction should not be treated purely as an operations exercise. If composites are built solely around how data happens to be structured within the portfolio accounting system, without input from the investment team and without alignment with sales and marketing, the result may be operationally convenient but commercially ineffective. Bringing together operations, performance, investment professionals, and sales and marketing teams early in the process is essential. That collaboration is what ultimately produces results that are not only GIPS compliant, but also meaningful, defensible, and aligned with how the firm communicates its investment approach.

Phase 4: Create GIPS Reports and Go Live

The final phase of becoming compliant is creating the GIPS Composite Report(s). The GIPS Report is the firm’s external-facing proof of compliance and is a presentation that must be provided to every prospective client. Getting to this stage is what most people think of as going live, but in practice it is a milestone, not the finish line.

GIPS Reports require more than simply presenting returns in a table. Firms must include required statistics, disclosures, benchmark information, and firm-level details necessary for prospective clients to properly interpret the results. The disclosures are especially important because they provide context investors need to understand how the performance was calculated and what the results represent. They are very specific and must be in sync with what is documented in the P&P and what was performed to construct the composites. For more specifics on what is required, see our previous post, How to Update Your GIPS Reports for the 2020 GIPS Standards.

Once the GIPS Reports are complete and all requirements have been met, the final administrative step to claiming compliance is submitting the GIPS Compliance Notification Form to CFA Institute. This must be filed before compliance can be claimed, and it must be renewed annually.

Ongoing Maintenance: Where Most Firms Struggle

Getting to compliance is an achievement. Staying there requires consistent operational discipline.

The most common breakdowns in ongoing maintenance are rarely dramatic failures. More often, they are small process gaps that compound over time. Portfolios that should have been added to composites were omitted or added late. Significant cash flows were not handled in accordance with the composite’s policy. A new strategy was launched without formally determining whether it warranted the creation of a new composite.

To avoid these breakdowns, firms should incorporate GIPS compliance procedures into their regular monthly or quarterly performance processes rather than treating compliance as an annual reporting exercise. Clear ownership should be assigned internally and firms should establish a recurring compliance calendar that includes monthly composite reviews and annual GIPS Report updates. The GIPS standards policy manual should also be reviewed at least annually to confirm it continues to reflect how the firm actually operates. For a deeper look at what effective ongoing governance looks like in practice, see our post What Good GIPS Compliance Governance Looks Like in Practice.

Should You Get Verified?

Verification is not required. Firms that are early in the process sometimes view this as a bigger decision than it needs to be. If a firm chooses to pursue verification, it must be performed by an independent third party, but the decision itself is entirely voluntary.

That said, verification is generally worthwhile, particularly for firms competing for institutional mandates where GIPS compliance is often considered table stakes. According to eVestment, two out of three searches conducted in their database by investors or consultants exclude firms that are not GIPS compliant, so the marketing benefit can be meaningful. Verification also provides an added level of assurance that the firm’s policies and procedures have been designed in accordance with the GIPS standards and implemented consistently across the organization. Operationally, it can create valuable discipline as well, since the expectation of independent review encourages firms to maintain strong processes throughout the year.

For firms that are newer to compliance or navigating budget constraints, the best path is often to first establish a solid compliance foundation and then pursue verification when the timing makes sense. We have also worked alongside nearly every major verifier in the industry and can provide practical insight into how different firms approach the process, what working styles may align best with your organization, and how to evaluate which verifier may be the best fit for your needs. For a more detailed walkthrough of the process, see our series on How to Survive a GIPS Verification.

Final Thoughts

Becoming GIPS compliant can feel overwhelming at the start, especially when going through it for the first time. But it does not have to be. Firms that approach implementation as a structured operational framework, rather than a collection of isolated technical requirements, typically find the process much more manageable.

The goal is not simply to produce a compliant presentation. The real value comes from building a repeatable, transparent, and defensible framework for calculating and presenting investment performance.

When implemented thoughtfully, the GIPS standards do more than support marketing efforts. They help firms improve consistency, strengthen controls, and communicate performance results with greater credibility and transparency. And in an environment where investors continue to demand greater transparency and comparability, those operational improvements can provide a meaningful competitive advantage, strengthening both credibility and investor confidence.

Mission-driven institutions are entrusted with something larger than capital. They are entrusted with purpose.

Endowments, foundations, and long-term investment pools exist to support education, healthcare, research, environmental initiatives, religious or cultural programs, community development, and countless other causes—often for generations.

That long-term horizon changes how investment performance should be reported. When an institution thinks in decades instead of quarters, investment performance is not just about what happened recently; it is about whether the portfolio is structured to sustain spending, preserve purchasing power, and remain aligned with its mission through full market cycles.

Many institutions rely entirely on their investment managers to calculate and present investment performance. That’s common, but it’s not always sufficient.

Performance Oversight Is Not the Same as Performance Results

Investment managers are responsible for generating returns. Boards and oversight committees are responsible for evaluating those results.

Those responsibilities are distinct.

Oversight is a fiduciary duty. It is not passive, and it cannot rely solely on the information created by the party being evaluated. Effective oversight requires independence, consistency, and clarity.

When the same party both manages assets and determines how performance is calculated and presented, the lines between management and oversight can blur—even when intentions are sound and calculations are technically accurate.

In some situations, reporting may not be:

  • Consistent across managers
  • Based on uniform calculation methodologies
  • Presented in a format designed for governance review
  • Structured to facilitate long-term policy evaluation

Consider a board reviewing results from three different managers. Each reports strong performance, but one calculates returns net-of-fees, another presents gross results, and a third uses slightly different valuation timing.

At first glance, the numbers appear comparable. In reality, they may not be measuring the same thing.
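A short worked example shows why the basis matters. The figures, the fee rate, and the fee convention (an annual fee charged on ending value) are hypothetical assumptions for illustration, as is the function name gross_to_net.

```python
def gross_to_net(gross_return: float, annual_fee: float) -> float:
    """Convert a gross annual return to net, assuming (hypothetically)
    the fee is charged once a year on the ending value."""
    return (1 + gross_return) * (1 - annual_fee) - 1


# Illustrative figures only.
manager_a_net = 0.080                        # reported net-of-fees
manager_b_gross = 0.085                      # reported gross-of-fees
manager_b_net = gross_to_net(manager_b_gross, annual_fee=0.0075)

print(f"Manager A (net): {manager_a_net:.2%}")
print(f"Manager B (net): {manager_b_net:.2%}")
```

Under these assumptions, Manager B's higher headline number falls below Manager A's once both are stated net of fees. Differences in valuation timing would need a similar adjustment before the results could be compared.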

Some larger institutions maintain internal performance teams or engage independent performance professionals to standardize reporting, organize data across managers, and present results in accordance with established best practices—often aligning reporting with their Investment Policy Statement and/or recognized frameworks such as the Global Investment Performance Standards (GIPS® standards).

But many of these organizations operate lean. They may not have dedicated performance measurement expertise or the infrastructure required to consolidate, normalize, and present results in a governance-ready format.

In those cases, boards are often reviewing manager-produced materials that were designed primarily for client communication—not institutional oversight. Performance reporting for these institutions should be designed to serve the governing body—not simply to showcase results.

Why This Matters for Mission-Based Institutions

Boards of endowments and foundations are often composed of dedicated volunteers, philanthropists, community leaders, and subject-matter experts. They bring vision, experience, and commitment to the institution’s mission—but not always a deep understanding of investment management and reporting.

That makes investment performance clarity essential. When reporting is unclear, oversight weakens—not because trustees lack commitment, but because the information is not presented in a way that supports meaningful evaluation.

When reporting is structured and tied directly to policy benchmarks, risk parameters, and spending objectives, trustees know what questions to ask. Conversations remain focused on long-term sustainability and mission impact.

A Practical Framework for Strong Performance Reporting

Boards of mission-driven institutions operate at the governance level and should evaluate their reporting structure against four questions:

1. Is performance calculated independently?

Independent calculation or oversight reduces potential conflicts and strengthens fiduciary governance. In institutional investing, separating portfolio management from performance oversight is widely viewed as a best practice.

2. Is the methodology consistent across managers?

Multi-manager portfolios require uniform return calculation, fee treatment, and valuation policies to ensure comparability. Without consistency, “relative performance” becomes difficult to interpret.

One practical way institutions address this challenge is by complying with and requiring their managers to comply with the GIPS® standards.

The GIPS standards are a globally recognized framework administered by CFA Institute designed to promote fair representation and full disclosure in the calculation and presentation of investment performance.

Endowments and foundations that adopt the GIPS standards for their own performance calculations—and require the same of the managers they hire—send a powerful message to their boards and stakeholders that the institution is committed to transparency in how results are calculated and presented.  

3. Is reporting aligned with policy benchmarks?

Boards should see performance relative to long-term policy objectives, not just absolute returns. And this information should be shown at the level at which it is managed. Simply reporting that “the portfolio returned 8%” does not answer the real governance question.

A portfolio can have a positive year and still fail to meet its strategic role within the overall allocation.

For example:

  • Did the equity allocation meet its return objective relative to its benchmark?
  • Did the diversifying strategies provide the downside protection they were intended to deliver?
  • Did fixed income serve its role as a stabilizer?
  • Did alternative investments justify their complexity and liquidity constraints?

Even if the overall portfolio met its expected return, boards should understand how it got there. Reviewing performance by allocation allows boards to evaluate whether each segment is fulfilling its mandate, not just whether the total return looks acceptable.
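As a rough sketch of reviewing performance by allocation, the example below compares each segment's return with its policy benchmark and lists the segments that lagged. The segment names, returns, and the use of a simple arithmetic difference are hypothetical choices for illustration.

```python
# Illustrative (return, policy benchmark return) pairs for one year.
segments = {
    "Equity": (0.120, 0.140),
    "Fixed income": (0.030, 0.025),
    "Alternatives": (0.060, 0.060),
}


def excess_returns(data: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Arithmetic excess return of each segment over its benchmark."""
    return {name: round(ret - bench, 4) for name, (ret, bench) in data.items()}


excess = excess_returns(segments)
lagging = [name for name, value in excess.items() if value < 0]

for name, value in excess.items():
    print(f"{name}: {value:+.2%}")
print("Segments below benchmark:", lagging)
```

In this made-up case the equity allocation posted a positive return but still trailed its benchmark, which is exactly the kind of detail a total-portfolio figure can hide.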

When reported this way, it becomes easier to see where the portfolio is meeting expectations and where it may be falling short.

4. Is communication designed for governance?

Once performance is aligned to policy benchmarks, reporting should help trustees interpret what the results mean without requiring them to operate at the manager or security-selection level.

Reports should help answer key questions:

  • Are we meeting long-term objectives?
  • How are managers performing relative to their mandates?
  • Is risk aligned with the investment policy?
  • Are we preserving capital appropriately given our spending needs?
  • Did managers follow investment guidelines that align with our institution’s mission?

If any of these areas underperform, governance-level reporting should prompt clear, high-level discussion: Why did this occur? Was the result consistent with expectations? What steps, if any, are being considered to address issues going forward? If shortfalls persist, boards may need to evaluate whether the strategy or manager remains appropriate.

This kind of oversight strengthens outcomes by reinforcing accountability. Performance reporting should be communicated in plain language and simplify complex data into clear actionable insight. When this occurs, it enables boards to move from procedural review toward informed, effective governance.

From Calculation to Communication

Accurate returns are the starting point. Clear communication is the outcome.

When performance calculation, oversight, and presentation are thoughtfully structured, board discussions become more strategic and less reactive. Boards gain confidence in their oversight, managers operate within clearer expectations, and the institution stays focused on its purpose.

A Closing Thought

Mission-driven institutions think in decades, not quarters. Their performance reporting should reflect that same discipline. Investment oversight is not just about generating returns; it is about ensuring those returns are measured, understood, and aligned with the institution’s long-term purpose.

Clear reporting strengthens governance.
Strong governance protects sustainability.
And sustainability protects the mission.