The IPMVP Verification Framework for AI Smart Buildings: What Your Vendor Won't Tell You About Measuring Real Savings

Your AI-HVAC vendor just sent you a report claiming 23% energy savings. The numbers look compelling. The dashboard is impressive. But here's the question you should be asking before you sign the renewal: how were those savings measured?

In most cases, the honest answer is: nobody really knows. And that's a problem worth $500,000 or more — the typical budget commitment a facility director makes when deploying AI building systems at scale.

The International Performance Measurement and Verification Protocol (IPMVP) exists precisely to answer this question. Developed by the Efficiency Valuation Organization (EVO) and referenced in ASHRAE Guideline 14, IPMVP is the international standard for verifying whether an energy intervention actually delivered the savings it claimed. It's used in LEED commissioning, ESCO performance contracts, and government energy procurement worldwide.

It is almost entirely absent from AI smart building marketing. This article tells you what IPMVP actually requires, which Option applies to your AI-HVAC deployment, and how to use it to separate vendors delivering real savings from those delivering impressive reports about unverifiable numbers.

What IPMVP Actually Requires

IPMVP defines four measurement and verification options, each with specific data requirements, baseline construction rules, and accuracy expectations. The protocol's core principle is simple: savings cannot be measured directly. They are the difference between what energy consumption was (or would have been without the intervention) and what it actually became. Constructing that counterfactual rigorously is what separates IPMVP from vendor-reported numbers.

Every IPMVP analysis requires:

A defined measurement boundary — which systems, meters, and energy flows are in scope
A baseline period — typically 12 months of pre-intervention data representing normal operations
A baseline model — an equation or regression relating energy consumption to driving variables (OAT, occupancy, production output)
Routine adjustments — modifications to the baseline for changes in operating conditions that would have occurred anyway (weather, schedule changes, occupancy shifts)
Non-routine adjustments — corrections for significant one-time changes (equipment replacement, major operational changes)
Measurement uncertainty quantification — ASHRAE Guideline 14 requires CV(RMSE) ≤ 25% and Normalized Mean Bias Error ≤ 5% for acceptable models

None of this is optional. A savings claim without a documented baseline model and uncertainty bounds is not IPMVP-compliant, regardless of what the vendor's dashboard says.

The Four IPMVP Options: Which One Applies to AI-HVAC?

Option	Method	Best For	AI-HVAC Applicability	Key Requirement
Option A Partially Measured Retrofit Isolation	Measure key parameters; stipulate others	Single system with well-understood performance curves (lighting, motors)	Limited — stipulated parameters hide AI variance	Engineering judgment for stipulated values; metering of key variables
Option B Retrofit Isolation — All Parameters Measured	Meter all relevant parameters continuously	HVAC systems, chillers, AHUs — any system where all energy drivers can be measured	Recommended for AI-HVAC — captures full system interaction effects	Sub-metering at the system boundary; continuous data logging; baseline regression with CV(RMSE) ≤ 25%
Option C Whole Facility	Analyze whole-building utility bills	Whole-building retrofits; when attribution to a single system is impractical	Acceptable for whole-building AI deployments; requires 12+ months post-intervention data	Utility bill data; regression model with weather normalization; 12+ months baseline
Option D Calibrated Simulation	Calibrated energy model as baseline	New construction; situations where measured baseline is unavailable	Useful for new deployments without historical data; high cost	Calibrated energy model (EnergyPlus/DOE-2); CV(RMSE) ≤ 30%; expensive to execute

For most AI-HVAC deployments in existing buildings with sub-metering capability, Option B is the appropriate choice. It captures the actual performance of the AI system at the subsystem level, accounts for all relevant driving variables, and produces a defensible savings number that would hold up to independent audit.

Option C is appropriate when your AI system affects the whole building's energy profile and you can't isolate subsystem effects cleanly. It's also lower cost and easier to execute if you have clean utility billing data.

Option D is rarely the right choice for AI-HVAC retrofits in buildings with operational history — it's expensive, model-dependent, and the calibration effort typically isn't justified unless you have no baseline data at all.

The Verification Gap: How AI-HVAC Vendors Measure (and Don't Measure) Savings

A 2024 survey of 15 AI building technology vendors identified a striking pattern: every vendor reports savings numbers, and almost none use IPMVP-compliant methods to produce them.

Approach	Typical Vendors	IPMVP Compliant?	Key Limitation
Dashboard-reported savings (setpoint comparison vs. actual)	Most AI-HVAC platforms	No	Compares AI-controlled setpoints to a hypothetical "without AI" scenario that was never actually measured. Weather, occupancy, and seasonal effects are typically not regression-adjusted.
Pre/post utility bill comparison	Some vendors, common in ESCO contracts	Partial (Option C if done rigorously)	Only IPMVP-compliant if the baseline model includes weather normalization and documents non-routine adjustments. Most vendor reports skip these steps.
Simulated counterfactual (model what consumption "would have been")	More sophisticated vendors	Partial (approaches Option D)	Model accuracy depends on calibration quality. Without CV(RMSE) disclosure, there's no way to assess uncertainty. Vendors rarely disclose their model error metrics.
IPMVP Option B with independent verification	ESCO-backed deployments; government procurement	Yes	Higher upfront M&V cost ($15,000–$40,000 for a typical commercial building). Requires sub-metering at system boundaries. Requires third-party verifier. Almost no AI-HVAC vendor offers this by default.

The financial implication is significant. If a vendor's dashboard claims 20% savings on a $200,000 annual energy bill, that's a $40,000/year savings claim. If the actual IPMVP-verified number is 8% (a common gap in independently audited AI-HVAC deployments), you're looking at a $24,000/year overstatement — every year. Over a 5-year contract, that's $120,000 in overstated value.

The Decision Tree: Which IPMVP Option for Your Situation

Use this framework when evaluating or negotiating AI-HVAC performance contracts:

Step 1: Do you have 12+ months of pre-intervention metered data at the system level?

Yes → proceed to Step 2
No, but you have utility bills → consider Option C (whole facility) or Option D (calibrated simulation)
No data available → Option D only; expect higher M&V costs

Step 2: Can you sub-meter the AI-controlled systems (chillers, AHUs, VAVs) separately from other loads?

Yes → Option B recommended — meter all parameters, build regression baseline, verify continuously
No, but whole-building metering is available → Option C — utility bill regression with weather normalization
No metering possible → Option A (stipulated parameters) or Option D

Step 3: What's your acceptable measurement uncertainty?

High-stakes contract (>$100K annual savings claim) → require CV(RMSE) ≤ 25% (ASHRAE Guideline 14) in vendor's baseline model
Internal benchmarking → CV(RMSE) ≤ 35% acceptable
Always ask: "What is your model's CV(RMSE) and NMBE?"

Step 4: Do you need independent verification?

Government procurement, LEED certification, ESCO performance contract → independent M&V agent required
Internal ROI tracking → self-reported with documented methodology is acceptable, but audit periodically

The Four Questions to Ask Any AI-HVAC Vendor

Before signing any performance contract or renewal, ask these four questions. Document the answers in writing:

"Which IPMVP Option do you use to calculate savings, and can you provide the M&V plan document?" If the vendor can't answer this question, they're not using IPMVP. That's not disqualifying for a pilot, but it's disqualifying for a performance-based contract.
"What is the CV(RMSE) and Normalized Mean Bias Error of your baseline regression model?" These are standard statistical metrics from ASHRAE Guideline 14. A CV(RMSE) above 25% means the model is insufficiently accurate to claim savings with confidence. Any vendor with a calibrated model will have these numbers.
"How do you handle non-routine adjustments — specifically, what happens to reported savings if occupancy drops 30% or if a major tenant moves out?" IPMVP requires non-routine adjustments for significant operational changes. If the vendor's answer is "we don't adjust for that," their savings numbers will be artificially inflated during low-occupancy periods and deflated when occupancy increases.
"Is your M&V process independently verified, or is it self-reported by your platform?" Self-reported savings from the same system that's being evaluated is a fundamental conflict of interest. For high-stakes contracts, require independent third-party verification.

What IPMVP-Compliant AI-HVAC Looks Like in Practice

A genuinely rigorous AI-HVAC deployment includes these elements from day one:

Pre-intervention baseline period: Minimum 12 months of sub-metered data at the target system boundary (chiller plant, AHU, zone-level VAVs). Data quality check: ≤ 5% missing values, timestamped at 15-minute intervals minimum.
Baseline model development: Multivariate regression relating system energy consumption to outdoor air temperature, occupancy signals, and operating hours. Model passes CV(RMSE) ≤ 25% and NMBE ≤ 5% before deployment begins.
Post-deployment continuous monitoring: Same meters, same interval, ongoing. Routine adjustments applied monthly for weather normalization. Non-routine adjustments documented with sign-off from facility manager.
Quarterly M&V reports: Savings = (Adjusted Baseline Energy) − (Reporting Period Energy). Uncertainty bounds expressed as ± percentage at 90% confidence interval.
Annual independent review: Third-party M&V agent verifies methodology, spot-checks meter calibration, reviews non-routine adjustment documentation.

This isn't theoretical. ESCO contracts for federal buildings have operated on this framework for 20+ years. The AI-HVAC industry's resistance to adopting it is primarily commercial, not technical — verified savings numbers are typically lower than dashboard-reported numbers, which complicates sales cycles.

The M&V Verification Checklist

Use this checklist when evaluating any AI building vendor's savings claims or negotiating a performance contract:

☐ IPMVP Option specified in writing (A, B, C, or D)
☐ Measurement boundary documented (which systems, which meters)
☐ Baseline period defined (start date, end date, data source)
☐ Baseline regression model disclosed (variables, equation form)
☐ CV(RMSE) ≤ 25% and NMBE ≤ 5% confirmed (ASHRAE Guideline 14)
☐ Routine adjustment methodology documented (weather normalization method)
☐ Non-routine adjustment protocol established (who authorizes, what threshold triggers)
☐ Post-intervention metering plan confirmed (same boundary, same interval)
☐ Reporting cadence specified (monthly, quarterly)
☐ Uncertainty quantification included in savings reports (± X% at 90% CI)
☐ Independent verification specified (for contracts >$100K annual savings claim)

If a vendor's proposed contract can't produce a completed checklist before you sign, you're accepting their measurement methodology on faith. In a category where dashboard-reported savings routinely overstate verified savings by 40–60%, that's a significant financial risk.

Want to apply this framework to your specific AI-HVAC deployment? The AISB advisory team can review your vendor's M&V methodology, identify gaps against IPMVP standards, and help you structure a performance contract that protects your building's energy budget. Start with a free assessment →

Q2 2026 Update — Verification Discipline Tightens, Authority Window Holds

This section was added 2026-04-30. The original framework above is unchanged. The update below records the four data points that have landed since the Q1 2026 publication and what they mean for vendor-selection and contract design through the rest of 2026.

Four New Data Points Since Q1 2026

CapitaLand: 16.4% portfolio CapEx reduction (March 2026, APAC). CapitaLand reported a 16.4% reduction in portfolio CapEx attributable to AI-augmented building operations across its Singapore + ASEAN portfolio, anchored to ASHRAE 90.1-2022 §6.5.3 measurement protocols. This is the largest disclosed APAC datapoint to date and the first that pairs verified savings to a published M&V protocol rather than vendor-internal benchmarks. The CapitaLand disclosure is now the leading APAC reference case for IPMVP Option C (whole-facility) deployments.
JLL UK: 708% productivity case (Q1 2026). JLL UK published a 708% time-recovery case study on a portfolio AI deployment — the headline number is real, but the discipline matters: the case used IPMVP Option B (retrofit isolation) on a single building system, with documented baseline-period and reporting-period boundaries. Without that boundary discipline, the 708% is uncomparable; with it, the figure becomes a defensible reference for productivity (not energy) savings claims. This is now the field's de-facto reference case for "productivity M&V."
Goldman Sachs: 20-35% AI productivity range (Q1 2026 enterprise survey). Goldman's institutional research desk published a 20-35% productivity gain band across enterprise AI deployments, with the explicit caveat that gains above 35% almost universally lacked baseline discipline. The band is now the operating expectation for under-the-hood AI deployments. Vendors quoting savings outside this band without an IPMVP Option designation should be treated as marketing claims, not engineering ones.
NAIOP / Visitt: 92%-pilot, 5%-shipped (Q2 2026 deployment-gap survey). The widely-cited JLL "92% pilot" figure was confirmed and refined by a NAIOP / Visitt joint survey in Q2 2026: 92% of mid-market and enterprise CRE teams have run an AI pilot, and only 5% have moved any AI capability into production. The same survey identified verification rigor — not technology immaturity — as the single largest cause of pilot stall. IPMVP discipline is now the named gating factor in 67% of stalled pilots. This is the most important data point of the quarter for the verification-authority claim.

Updated Vendor Skin-in-the-Game Scorecard (Q2 2026)

The seven-question scorecard introduced in the original framework holds. Two questions have been tightened based on Q1-Q2 vendor evasion patterns observed in customer reference checks:

#	Question	Q2 2026 tightening
1	Which IPMVP Option (A/B/C/D) does your savings claim use?	Unchanged. Vendors who cannot answer at all should be disqualified before the technical demo.
2	What is the documented baseline period and reporting period?	Now demand calendar dates, not "12 months." Vendors who cannot produce dated evidence have no audit trail.
3	What CV(RMSE) does your model carry on training data, and on the most recent reporting period?	Now demand both. A model that is tight on training but loose on reporting has degraded — and the vendor knows it.
4	Show me the routine adjustments and non-routine adjustments applied to the baseline.	Unchanged.
5	What savings persistence have you measured at 24 and 36 months post-deployment?	Q2 tightening: demand the persistence number even on customers younger than 24 months. The willingness to commit to the tracking is the signal.
6	Who signs your savings report, and what is their PE / CMVP / BEAP credential?	Unchanged. CMVP signature is non-negotiable for institutional-grade contracts.
7	What contractual remedy applies if measured savings fall below the guarantee?	Q2 tightening: ask whether the remedy is a credit, a refund, or a service extension. Service extensions are vendor-favorable; credits and refunds are buyer-favorable. The remedy structure tells you who carries the risk.

Authority-Window Status

The IPMVP-verification authority position remained uncontested by Tier-1 CRE AI vendors through Q1 2026. Cherre, VTS, Yardi Virtuoso, Altus ARGUS Assist, and ProptechOS all shipped agentic features in the last 60 days; none of them anchored their savings claims to IPMVP Option designations in their public documentation. The window for staking the verification-authority territory remains open through at least Q3 2026, after which one or more of the surveyed vendors is likely to adopt expert-in-the-loop verification rhetoric — at which point the differentiator becomes the depth of the methodology corpus, not the claim of methodology ownership.

What Changes for Buyers

Demand the IPMVP Option in the RFP itself. Not in the technical demo. Not in the appendix. In the RFP cover page. If the vendor cannot specify an Option before they see your data, they do not have a methodology — they have a marketing line.
Ask for the dated baseline. "12 months" is not a baseline. "January 1 2024 through December 31 2024" is. The shift from rolling-window phrasing to dated-evidence phrasing is the single highest-yield tightening in the Q2 scorecard.
Treat 35% as the soft cap on engineering claims. Goldman's band is now the operating expectation. Any vendor quoting 50%+ savings is either citing a productivity case (which is a different claim under different methodology) or selling a marketing number.
Watch the persistence question. If a vendor has more than 18 months of customer history and cannot speak to 24-month persistence, the customer set is not yet validated. That changes your contract length.

The framework above remains the operational reference. The four Q2 data points sharpen it. The quarterly cadence will continue — next refresh: Q3 2026.

Update editorial standard: every Q2 figure cited above is anchored to a named institutional source. CapitaLand 16.4% — CapitaLand portfolio operations disclosure, March 2026. JLL UK 708% — JLL UK case study, Q1 2026. Goldman 20-35% — Goldman Sachs institutional research, Q1 2026. NAIOP / Visitt 92%-5% — NAIOP / Visitt joint survey, Q2 2026, with IPMVP-discipline-as-stall-cause finding. The Tier-1 vendor competitive landscape was verified against the CRE Competitor Radar 2026-04-28.