How to Read a DMARC Aggregate Report XML

DMARC aggregate reports are the foundation for understanding your email ecosystem's deliverability and security posture. They provide a high-level overview of all mail claiming to be from your domain, detailing how much passed DMARC, how much failed, and crucially, why. The catch? These reports arrive as XML files, usually compressed (typically gzip or zip), and a busy domain can receive hundreds of them daily from various Mailbox Providers (MBPs).
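Because the attachments are usually compressed, the first practical step is getting at the XML itself. The following is a minimal sketch, assuming the attachment bytes are already in memory and that a zip archive holds a single report; the function name read_report_bytes is made up for illustration.

```python
import gzip
import io
import zipfile


def read_report_bytes(payload: bytes) -> bytes:
    """Return raw XML bytes from a gzip, zip, or uncompressed payload."""
    if payload[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(payload)
    if payload[:4] == b"PK\x03\x04":  # zip local file header
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # Assumption: the archive contains exactly one XML report.
            return archive.read(archive.namelist()[0])
    # Otherwise assume the payload is already plain XML.
    return payload
```

Sniffing the leading bytes rather than trusting the file extension is a design choice here, since attachment names are not always reliable.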

Manually sifting through these XML files can feel like trying to drink from a firehose. However, understanding their structure is fundamental to debugging DMARC alignment issues, identifying unauthorized senders, and ensuring your legitimate mail reaches its destination. This article will break down the anatomy of a DMARC aggregate report XML, helping you interpret its often-cryptic contents and pinpoint exactly what needs fixing.

Understanding the Structure of a DMARC Aggregate Report

A DMARC aggregate report is a structured XML document. The schema can be verbose, but under the root <feedback> element it consistently contains three main elements that provide context and detail:

  • <report_metadata>: Information about the report itself, such as who generated it and the time range it covers.
  • <policy_published>: Your DMARC policy as it was published in DNS at the time the mail was evaluated. This is critical context.
  • <record>: The core of the report. Each <record> element details a set of emails that share the same source_ip and DMARC evaluation outcome. A single report can contain many <record> elements.
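To make that layout concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The sample report is invented for illustration (the organization, addresses, and IPs are placeholders), and it assumes the report carries no XML namespace, which may not hold for every provider.

```python
import xml.etree.ElementTree as ET

# Invented, heavily trimmed sample report (illustration only).
SAMPLE_REPORT = """
<feedback>
  <report_metadata>
    <org_name>Example MBP</org_name>
    <report_id>1234567890</report_id>
  </report_metadata>
  <policy_published>
    <domain>example.com</domain>
    <p>none</p>
  </policy_published>
  <record>
    <row><source_ip>192.0.2.10</source_ip><count>120</count></row>
  </record>
  <record>
    <row><source_ip>203.0.113.7</source_ip><count>3</count></row>
  </record>
</feedback>
"""

root = ET.fromstring(SAMPLE_REPORT)

# List the direct children of the root element in document order.
top_level = [child.tag for child in root]
print(top_level)

# One report can carry many <record> elements.
record_count = len(root.findall("record"))
print(record_count)
```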

Let's dissect these elements.

Diving into <report_metadata>

This section provides general information about the report. It's usually straightforward:

  • <org_name>: The name of the Mailbox Provider (e.g., Google, Microsoft, Yahoo) that generated the report.
  • <email>: The email address associated with the reporting organization.
  • <report_id>: A unique identifier for this specific report.
  • <date_range>: Contains <begin> and <end> timestamps that mark the period this report covers, expressed as Unix epoch time (seconds since 1970-01-01 UTC). You'll need to convert these to human-readable dates to understand the reporting period.

This metadata helps you track reports, understand their source, and correlate them with your internal logs if necessary.
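Converting the <begin> and <end> values is a one-liner per timestamp. The sketch below assumes a namespace-free <report_metadata> element; the sample values and the helper name report_period are invented for illustration.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Invented sample metadata (illustration only).
METADATA_XML = """
<report_metadata>
  <org_name>Example MBP</org_name>
  <email>dmarc-reports@example.net</email>
  <report_id>1234567890</report_id>
  <date_range>
    <begin>1704067200</begin>
    <end>1704153599</end>
  </date_range>
</report_metadata>
"""


def report_period(metadata: ET.Element) -> tuple[datetime, datetime]:
    """Return the report's begin and end as timezone-aware UTC datetimes."""
    begin = int(metadata.findtext("date_range/begin"))
    end = int(metadata.findtext("date_range/end"))
    return (
        datetime.fromtimestamp(begin, tz=timezone.utc),
        datetime.fromtimestamp(end, tz=timezone.utc),
    )


begin, end = report_period(ET.fromstring(METADATA_XML))
print(begin.isoformat(), "to", end.isoformat())
# prints "2024-01-01T00:00:00+00:00 to 2024-01-01T23:59:59+00:00"
```

Keeping the datetimes in UTC avoids off-by-a-day confusion when you compare reports against logs recorded in a local time zone.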

Decoding <policy_published>

This element is often overlooked but provides crucial context. It lists the DMARC policy (p=, sp=, adkim=, aspf=, pct=, fo=) that was active for your domain when the emails in the report were processed. This is important because your DMARC policy might have changed since then.

Key sub-elements here include:

  • <domain>: The domain for which the policy was published (your domain).
  • <adkim>: Your DKIM alignment mode (relaxed r or strict s).
  • <aspf>: Your SPF alignment mode (relaxed r or strict s).
  • <p>: Your DMARC policy for the organizational domain (none, quarantine, or reject).
  • <sp>: Your DMARC policy for subdomains.
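  • <pct>: The percentage of failing mail to which the receiver is asked to apply your policy. If pct is less than 100, only that share of failing mail is subject to the quarantine or reject action.
  • <fo>: Failure reporting options, often called forensic options (e.g., 0, 1, d, s).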

Understanding adkim and aspf is paramount, as they dictate how DMARC alignment is evaluated, which we'll cover next.
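As a rough illustration, the sub-elements above can be pulled into a small dictionary. This sketch assumes a namespace-free <policy_published> element and, when an element is missing or empty, falls back to the DMARC defaults for the corresponding DNS tag (relaxed alignment, pct of 100, sp inheriting p). Whether a given reporter omits elements is an assumption; summarize_policy is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented sample policy (illustration only); <sp> is deliberately omitted.
POLICY_XML = """
<policy_published>
  <domain>example.com</domain>
  <adkim>r</adkim>
  <aspf>s</aspf>
  <p>quarantine</p>
  <pct>50</pct>
</policy_published>
"""

ALIGNMENT_MODES = {"r": "relaxed", "s": "strict"}


def summarize_policy(policy: ET.Element) -> dict:
    """Turn <policy_published> into a readable summary with defaults."""
    p = policy.findtext("p")
    return {
        "domain": policy.findtext("domain"),
        # Missing or empty alignment modes are treated as relaxed.
        "dkim_alignment": ALIGNMENT_MODES[policy.findtext("adkim") or "r"],
        "spf_alignment": ALIGNMENT_MODES[policy.findtext("aspf") or "r"],
        "policy": p,
        # Without an explicit <sp>, subdomains inherit the main policy.
        "subdomain_policy": policy.findtext("sp") or p,
        "pct": int(policy.findtext("pct") or 100),
    }


summary = summarize_policy(ET.fromstring(POLICY_XML))
print(summary)
```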

The Core: The <record> Element

This is where the bulk of the actionable information resides. Each <record> element describes a group of emails that shared a common DMARC evaluation outcome.
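Since each record stands for a group of messages rather than a single one, a useful first pass is to total the message counts per sending IP and applied action. The following is a sketch under the same assumptions as before (no XML namespace, invented sample data); messages_by_source is a hypothetical helper name.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Invented sample records (illustration only).
REPORT_XML = """
<feedback>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>120</count>
      <policy_evaluated><disposition>none</disposition></policy_evaluated>
    </row>
  </record>
  <record>
    <row>
      <source_ip>203.0.113.7</source_ip>
      <count>3</count>
      <policy_evaluated><disposition>quarantine</disposition></policy_evaluated>
    </row>
  </record>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>5</count>
      <policy_evaluated><disposition>none</disposition></policy_evaluated>
    </row>
  </record>
</feedback>
"""


def messages_by_source(root: ET.Element) -> Counter:
    """Sum <count> per (source_ip, disposition) pair across all records."""
    totals = Counter()
    for row in root.findall("record/row"):
        key = (
            row.findtext("source_ip"),
            row.findtext("policy_evaluated/disposition"),
        )
        totals[key] += int(row.findtext("count"))
    return totals


totals = messages_by_source(ET.fromstring(REPORT_XML))
for (ip, disposition), count in totals.most_common():
    print(ip, disposition, count)
```

Sorting by volume puts your largest senders at the top, which is usually where to start when checking whether a source is legitimate.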

Inside each <record>, you'll typically find:

<row>

This element aggregates data for a specific source_ip and policy evaluation outcome.

  • <source_ip>: The IP address from which the emails were sent. This is invaluable for identifying legitimate and unauthorized senders.
  • <count>: The number of emails processed from this source_ip with this specific outcome.
  • <policy_evaluated>: This is the heart of the DMARC decision.
    • <disposition>: The DMARC action taken (none, quarantine, or reject). This is based on