DMARC aggregate reports (rua reports) arrive in operator mailboxes daily, sometimes hourly. The reports are XML files describing how receivers handled mail claiming to be from the operator’s domain. Most operators forward them to a third-party service (DMARC Analyzer, EasyDMARC, dmarcian, Postmark’s DMARC service) for parsing and visualization.

The third-party services produce useful dashboards. They also obscure information that the raw XML contains. For operators who want deeper insight into their authentication behavior, parsing the XML directly produces visibility that dashboards do not provide.

This post covers what the DMARC aggregate report XML actually contains, what the fields mean operationally, and what you can find when you parse the reports yourself.

The basic structure

DMARC aggregate reports follow the XML schema defined in Appendix C of RFC 7489. The structure is consistent across receivers (with minor variations).

A report is a single XML file. The top-level element is <feedback>. The file contains:

<report_metadata>: information about the report itself (who sent it, when, the time range it covers, contact information).

<policy_published>: the DMARC policy the receiver observed at the time of the report.

<record>: one or more records describing specific sending patterns observed. Each record aggregates mail with the same source IP, authentication results, and header From domain.

The report file is typically gzipped and emailed to the address specified in the operator’s DMARC rua parameter.

Report metadata

The <report_metadata> section provides report-level context.

<report_metadata>
  <org_name>google.com</org_name>
  <email>noreply-dmarc-support@google.com</email>
  <report_id>1234567890</report_id>
  <date_range>
    <begin>1645833600</begin>
    <end>1645920000</end>
  </date_range>
</report_metadata>

The fields:

org_name: the receiver organization sending the report. Common values: google.com, outlook.com, yahoo.com, apple.com, comcast.net, various corporate mail servers.

email: contact address for inquiries about the report.

report_id: unique identifier from the receiver. Useful for correlating multiple reports from the same receiver.

date_range: Unix timestamps for the report’s coverage window. Typical reports cover 24 hours, but some receivers send more frequent reports.

The metadata identifies the report’s origin and scope. For multi-receiver analysis, the org_name field becomes the primary grouping dimension.
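The date_range timestamps are easier to reason about once converted to UTC. A minimal sketch, using the values from the sample above; describe_window is a hypothetical helper name, not part of any DMARC library:

```python
from datetime import datetime, timezone

def describe_window(begin_ts, end_ts):
    """Convert a report's date_range (Unix seconds) to UTC datetimes and a length in hours."""
    begin = datetime.fromtimestamp(int(begin_ts), tz=timezone.utc)
    end = datetime.fromtimestamp(int(end_ts), tz=timezone.utc)
    hours = (end - begin).total_seconds() / 3600
    return begin, end, hours

# Values taken from the sample <date_range> above
begin, end, hours = describe_window("1645833600", "1645920000")
print(begin.isoformat(), end.isoformat(), hours)
```

A window noticeably shorter than 24 hours is a hint that the receiver reports more often than daily.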

Policy published

The <policy_published> section documents the DMARC policy the receiver observed when handling the mail.

<policy_published>
  <domain>example.com</domain>
  <adkim>r</adkim>
  <aspf>r</aspf>
  <p>quarantine</p>
  <sp>quarantine</sp>
  <pct>100</pct>
</policy_published>

The fields:

domain: the sender domain whose mail is being reported.

adkim: DKIM alignment mode. r for relaxed (the organizational domains must match, so subdomains align), s for strict (the domains must match exactly).

aspf: SPF alignment mode. Same r or s values.

p: the policy for the sender domain. none for monitoring, quarantine for spam folder, reject for outright rejection.

sp: the policy for subdomains. Defaults to p if not specified.

pct: percentage of mail the policy applies to. Used during gradual policy progression.

The policy section confirms what the receiver thinks the policy is. Discrepancies between this and the actual DNS record indicate DNS caching or recent policy changes that have not propagated.
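Checking for such discrepancies can be automated by comparing the reported policy against the text of your published record. The sketch below assumes you already have the TXT record as a string (the DNS lookup is left out) and applies the RFC 7489 defaults for omitted tags. The function names are hypothetical.

```python
def parse_dmarc_record(txt):
    """Split a DMARC TXT record such as 'v=DMARC1; p=reject' into a tag dict."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags

def effective_policy(tags):
    """Fill in defaults for omitted tags: sp falls back to p, alignment to relaxed, pct to 100."""
    p = tags.get("p", "")
    return {
        "p": p,
        "sp": tags.get("sp", p),
        "adkim": tags.get("adkim", "r"),
        "aspf": tags.get("aspf", "r"),
        "pct": str(tags.get("pct", "100")),
    }

def policy_differences(reported, dns_txt):
    """Return {field: (reported_value, live_value)} for every field that differs."""
    seen = effective_policy(reported)
    live = effective_policy(parse_dmarc_record(dns_txt))
    return {k: (seen[k], live[k]) for k in live if seen[k] != live[k]}

# The receiver reported quarantine, but the record now says reject
reported = {"p": "quarantine", "sp": "quarantine", "adkim": "r", "aspf": "r", "pct": "100"}
diffs = policy_differences(reported, "v=DMARC1; p=reject; rua=mailto:dmarc@example.com")
print(diffs)
```

A difference that persists across several days of reports is more likely a stale or conflicting DNS record than ordinary caching.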

Records: the operational data

The <record> elements contain the actual operational data. Each record aggregates mail with consistent characteristics over the report period.

<record>
  <row>
    <source_ip>192.0.2.1</source_ip>
    <count>523</count>
    <policy_evaluated>
      <disposition>none</disposition>
      <dkim>pass</dkim>
      <spf>pass</spf>
    </policy_evaluated>
  </row>
  <identifiers>
    <header_from>example.com</header_from>
  </identifiers>
  <auth_results>
    <dkim>
      <domain>example.com</domain>
      <result>pass</result>
      <selector>mail</selector>
    </dkim>
    <spf>
      <domain>example.com</domain>
      <result>pass</result>
    </spf>
  </auth_results>
</record>

The fields in detail:

Row section

source_ip: the IP address that sent the mail to the receiver.

count: number of messages aggregated in this record. Records aggregate messages with matching characteristics (same IP, same authentication results, same From domain).

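policy_evaluated.disposition: the action the receiver applied under DMARC. Values: none (no DMARC action taken), quarantine (treated as suspicious, usually sent to spam), reject (refused). A disposition of none does not guarantee inbox placement; the receiver's other filters still apply.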

policy_evaluated.dkim: whether the message passed DKIM with a signing domain that aligns with header_from.

policy_evaluated.spf: whether the message passed SPF with a domain that aligns with header_from.

Identifiers section

header_from: the domain in the From header of the messages.

Auth results section

dkim.domain: the domain the DKIM signature claimed to be from.

dkim.result: pass, fail, neutral, none, policy, temperror, permerror.

dkim.selector: the DKIM selector used. Useful for identifying which key signed.

spf.domain: the domain SPF validated against.

spf.result: pass, fail, neutral, none, softfail, temperror, permerror.
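The two layers are easy to confuse: auth_results holds the raw verification outcome, while policy_evaluated holds the aligned verdict that DMARC actually uses. DMARC passes when either aligned check passes. A minimal sketch of that logic, assuming each record has been flattened into a dict with the illustrative keys shown:

```python
def explain_record(rec):
    """Return (dmarc_pass, notes) for one flattened record.

    rec is assumed to hold the aligned verdicts from <policy_evaluated>
    plus the raw per-mechanism results from <auth_results>.
    """
    dkim_aligned = rec["dkim_aligned"] == "pass"
    spf_aligned = rec["spf_aligned"] == "pass"
    notes = []

    # Raw DKIM pass without alignment: the signature verified, but for another domain
    passing_dkim = sorted({d["domain"] for d in rec["dkim_results"] if d["result"] == "pass"})
    if not dkim_aligned and passing_dkim:
        notes.append("DKIM verified for " + ", ".join(passing_dkim)
                     + " but not aligned with " + rec["header_from"])

    # Raw SPF pass without alignment: the envelope domain differs from header_from
    passing_spf = sorted({s["domain"] for s in rec["spf_results"] if s["result"] == "pass"})
    if not spf_aligned and passing_spf:
        notes.append("SPF passed for " + ", ".join(passing_spf)
                     + " but not aligned with " + rec["header_from"])

    return dkim_aligned or spf_aligned, notes

example = {
    "header_from": "example.com",
    "dkim_aligned": "fail",
    "spf_aligned": "pass",
    "dkim_results": [{"domain": "esp-mail.example.net", "result": "pass", "selector": "s1"}],
    "spf_results": [{"domain": "example.com", "result": "pass"}],
}
verdict, notes = explain_record(example)
print(verdict, notes)
```

The unaligned-pass case is common with third-party senders that sign with their own domain rather than yours.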

What the data actually tells you

The aggregate report data answers specific operational questions when analyzed properly.

Which IPs are sending mail claiming to be from your domain

The source_ip field reveals all IPs that sent mail to receivers using your domain in the From header. This includes:

Your authorized sending infrastructure (these should pass authentication)
Sources you authorized but did not know about (third-party tools you forgot)
Unauthorized sources (spoofing attempts or compromised accounts)

The first category should produce passing authentication. The second category needs investigation (do you actually want this source sending as your domain?). The third category needs immediate attention (spoofing of your domain).
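Sorting sources into these categories starts with a list of the networks you know about. A minimal sketch using the standard ipaddress module; the networks listed are documentation ranges standing in for your real infrastructure:

```python
import ipaddress

# Placeholder networks; replace with the ranges your infrastructure actually uses
AUTHORIZED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_authorized(source_ip, networks=AUTHORIZED_NETWORKS):
    """True if source_ip falls inside any known sending network."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in net for net in networks)

for candidate in ("192.0.2.1", "198.51.100.7", "2001:db8::1"):
    print(candidate, is_authorized(candidate))
```

Anything that comes back False belongs in the second or third category and is worth a closer look.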

Authentication failure patterns

The policy_evaluated.dkim and policy_evaluated.spf fields show which authentication methods are failing for which sources.

Common failure patterns:

One source consistently failing SPF: SPF record needs updating
One source consistently failing DKIM: DKIM configuration issue at that source
All sources from one ISP failing SPF: the provider changed its outbound IP ranges and the SPF record has not caught up
Particular subdomains failing: subdomain authentication setup gap

The patterns inform specific remediation.
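Spotting these patterns is mostly a matter of summing message counts per source. A sketch under the assumption that records are dicts with source_ip, count, dkim_aligned, and spf_aligned keys (illustrative names):

```python
from collections import defaultdict

def failure_summary(records):
    """Sum message counts per source IP, split by which aligned check failed."""
    summary = defaultdict(lambda: {"total": 0, "dkim_fail": 0, "spf_fail": 0, "dmarc_fail": 0})
    for rec in records:
        stats = summary[rec["source_ip"]]
        n = rec["count"]
        dkim_ok = rec["dkim_aligned"] == "pass"
        spf_ok = rec["spf_aligned"] == "pass"
        stats["total"] += n
        if not dkim_ok:
            stats["dkim_fail"] += n
        if not spf_ok:
            stats["spf_fail"] += n
        if not (dkim_ok or spf_ok):
            # DMARC fails only when neither aligned check passes
            stats["dmarc_fail"] += n
    return dict(summary)

sample = [
    {"source_ip": "192.0.2.1", "count": 500, "dkim_aligned": "pass", "spf_aligned": "pass"},
    {"source_ip": "192.0.2.1", "count": 20, "dkim_aligned": "pass", "spf_aligned": "fail"},
    {"source_ip": "198.51.100.7", "count": 5, "dkim_aligned": "fail", "spf_aligned": "fail"},
]
summary = failure_summary(sample)
print(summary)
```

A source with spf_fail close to total but dmarc_fail at zero is relying on DKIM alone, which is worth knowing before a key rotation.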

Receiver-specific behavior

Different receivers send reports with different patterns. Google reports include extensive detail. Microsoft reports are typically shorter but consistent. Some receivers send sparse reports. The differences reveal receiver-specific handling.

For operators sending to multiple receivers, the per-receiver analysis helps understand receiver-specific delivery patterns.

Volume verification

The count field shows how much mail each receiver observed from each source. Comparing this to your sending logs verifies that mail is actually reaching the intended receivers.

Discrepancies (you sent more than the receiver reports) may indicate:

Mail rejected before reaching the receiver (intermediate servers)
Mail spam-filtered before reaching the receiver
Reporting omissions at the receiver

Forwarding pattern detection

Records with passing DKIM but unfamiliar source IPs often indicate forwarding. A DKIM signature survives forwarding as long as the message is not modified, while SPF usually fails because the forwarder's IP is not in your SPF record.

The pattern is informative for understanding where your mail actually flows.
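One rough way to surface forwarding candidates is to filter for records where aligned DKIM passed, aligned SPF did not, and the source IP is not one of yours. This is a heuristic sketch, not a definitive test; the key names and the known_ips set are assumptions:

```python
def likely_forwarded(records, known_ips):
    """Records whose aligned DKIM passed although the mail came from an IP we do not operate."""
    return [
        rec for rec in records
        if rec["dkim_aligned"] == "pass"
        and rec["spf_aligned"] != "pass"
        and rec["source_ip"] not in known_ips
    ]

sample = [
    {"source_ip": "192.0.2.1", "count": 500, "dkim_aligned": "pass", "spf_aligned": "pass"},
    {"source_ip": "203.0.113.9", "count": 4, "dkim_aligned": "pass", "spf_aligned": "fail"},
    {"source_ip": "198.51.100.7", "count": 5, "dkim_aligned": "fail", "spf_aligned": "fail"},
]
candidates = likely_forwarded(sample, known_ips={"192.0.2.1"})
print([rec["source_ip"] for rec in candidates])
```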

What the third-party dashboards typically show vs hide

Third-party DMARC services provide dashboards that summarize the data. The summaries are useful but lose information.

What dashboards show well

Aggregate authentication pass/fail rates over time.

Top sending sources by volume.

Top receivers by report volume.

Trend graphs for various metrics.

What dashboards typically hide

Specific failure reasons in detail. The dashboard shows “DKIM fail” but not which selector and signing domain were involved, or whether the result was a fail, a temperror, or a permerror.

Receiver-specific behavior variations. The dashboard aggregates across receivers in ways that hide receiver-specific patterns.

Time-of-day patterns. The dashboard typically rolls everything into daily aggregates, even for receivers that send reports covering shorter windows.

Source IP grouping by sender infrastructure. The dashboard treats each IP individually rather than grouping by sender system.

Rare event detail. Low-frequency events (a few failures from an unknown source) get lost in aggregation rather than highlighted.

For operators wanting deeper analysis, the raw XML provides information the dashboards do not surface.

Parsing the reports yourself

The infrastructure needed to parse DMARC aggregate reports yourself is modest.

Receiving the reports

The DMARC rua parameter specifies the address receiving reports. The address should be a real mailbox capable of receiving the report volume.

For low-volume domains, the volume is modest (a few reports per day). For high-volume domains, the volume can be significant (dozens to hundreds per day).

Decoding the format

Reports arrive as email attachments. The attachment is typically gzipped XML. Some receivers use ZIP rather than gzip. Decoding takes only a few lines:

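import gzip
import zipfile
import xml.etree.ElementTree as ET

def parse_dmarc_report(file_path):
    """Read a DMARC aggregate report (.gz, .zip, or plain .xml) and return the XML root."""
    if file_path.endswith('.gz'):
        with gzip.open(file_path, 'rb') as f:
            xml_data = f.read()
    elif file_path.endswith('.zip'):
        # ZIP attachments normally hold a single XML file; read the first member
        with zipfile.ZipFile(file_path) as archive:
            xml_data = archive.read(archive.namelist()[0])
    else:
        with open(file_path, 'rb') as f:
            xml_data = f.read()

    # Reports come from third parties; consider a hardened parser for untrusted XML
    root = ET.fromstring(xml_data)
    return root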

The XML parsing is standard. The schema is well-documented in RFC 7489.
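Before any of that, the attachment has to come out of the email. A minimal sketch using the standard email package, assuming you have the raw message bytes (from a mailbox file, IMAP fetch, or similar). The suffix list and function name are illustrative choices:

```python
import gzip
from email import message_from_bytes, policy
from email.message import EmailMessage

REPORT_SUFFIXES = (".xml", ".gz", ".zip")

def report_attachments(raw_message):
    """Return [(filename, payload_bytes)] for every part that looks like a DMARC report."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    # walk() also covers messages where the report is the only body part
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(REPORT_SUFFIXES):
            found.append((filename, part.get_payload(decode=True)))
    return found

# Build a small test message standing in for a real report email
demo = EmailMessage()
demo["Subject"] = "Report domain: example.com"
demo.set_content("Aggregate report attached.")
demo.add_attachment(gzip.compress(b"<feedback/>"), maintype="application",
                    subtype="gzip", filename="report.xml.gz")

attachments = report_attachments(demo.as_bytes())
print(attachments[0][0], gzip.decompress(attachments[0][1]))
```

The payload bytes can be written to disk and handed to the decoding step, or decompressed in memory.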

Extracting useful data

For each report, extract relevant fields into structured form:

def extract_records(root):
    """Flatten each <record> into a dict; missing optional elements become None."""
    records = []
    for record in root.findall('record'):
        row = record.find('row')
        identifiers = record.find('identifiers')
        auth_results = record.find('auth_results')

        # Aligned verdicts and disposition from the <row> section
        data = {
            'source_ip': row.findtext('source_ip'),
            'count': int(row.findtext('count')),
            'disposition': row.findtext('policy_evaluated/disposition'),
            'dkim_aligned': row.findtext('policy_evaluated/dkim'),
            'spf_aligned': row.findtext('policy_evaluated/spf'),
            'header_from': identifiers.findtext('header_from'),
        }

        # Raw authentication results; a record may carry zero or more <dkim> entries
        dkim_results = []
        spf_results = []
        if auth_results is not None:
            for dkim in auth_results.findall('dkim'):
                dkim_results.append({
                    'domain': dkim.findtext('domain'),
                    'result': dkim.findtext('result'),
                    'selector': dkim.findtext('selector'),  # optional in the schema
                })
            for spf in auth_results.findall('spf'):
                spf_results.append({
                    'domain': spf.findtext('domain'),
                    'result': spf.findtext('result'),
                })

        data['dkim_results'] = dkim_results
        data['spf_results'] = spf_results
        records.append(data)

    return records

The extracted data can be stored in a database for further analysis.
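One practical wrinkle: a report that declares a default XML namespace will make un-namespaced lookups such as findall('record') return nothing. A hedged sketch that strips namespaces before extraction; the namespace URI in the sample is the target namespace from the RFC 7489 schema and is used here for illustration:

```python
import xml.etree.ElementTree as ET

def strip_namespaces(root):
    """Rewrite '{namespace}tag' to 'tag' in place so plain tag names match."""
    for element in root.iter():
        if isinstance(element.tag, str) and element.tag.startswith("{"):
            element.tag = element.tag.split("}", 1)[1]
    return root

sample = (b'<feedback xmlns="http://dmarc.org/dmarc-xml/0.1">'
          b'<record><row><count>3</count></row></record></feedback>')
root = ET.fromstring(sample)

before = len(root.findall("record"))   # namespaced tags do not match "record"
after = len(strip_namespaces(root).findall("record"))
print(before, after)
```

Running the strip step unconditionally is harmless for reports that carry no namespace.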

Storage considerations

DMARC reports accumulate over time. A high-volume domain might receive 100+ reports per day, each with dozens to hundreds of records.

For long-term storage, structured database tables work well:

Reports table (named report_metadata in the queries below): one row per report (metadata)
Records table: one row per record (operational data), plus the report date for time filtering
Linked by report_id

Indexes on common query fields (source_ip, header_from, date) support fast lookups.
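As one possible shape for those tables, here is a sketch using SQLite from the standard library. The table and column names are assumptions chosen to resemble the description above, and the dialect differs from the SQL queries shown in this post.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS report_metadata (
    report_id TEXT PRIMARY KEY,
    org_name  TEXT NOT NULL,
    begin_ts  INTEGER NOT NULL,
    end_ts    INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS records (
    report_id    TEXT NOT NULL REFERENCES report_metadata(report_id),
    source_ip    TEXT NOT NULL,
    "count"      INTEGER NOT NULL,
    disposition  TEXT,
    dkim_aligned TEXT,
    spf_aligned  TEXT,
    header_from  TEXT
);
CREATE INDEX IF NOT EXISTS idx_records_source_ip ON records(source_ip);
CREATE INDEX IF NOT EXISTS idx_records_header_from ON records(header_from);
CREATE INDEX IF NOT EXISTS idx_report_begin ON report_metadata(begin_ts);
"""

def init_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def store_report(conn, report_id, org_name, begin_ts, end_ts, records):
    """Insert one report and its records in a single transaction."""
    with conn:
        conn.execute("INSERT INTO report_metadata VALUES (?, ?, ?, ?)",
                     (report_id, org_name, begin_ts, end_ts))
        conn.executemany(
            "INSERT INTO records VALUES (?, ?, ?, ?, ?, ?, ?)",
            [(report_id, r["source_ip"], r["count"], r["disposition"],
              r["dkim_aligned"], r["spf_aligned"], r["header_from"]) for r in records],
        )

conn = init_db()
store_report(conn, "1234567890", "google.com", 1645833600, 1645920000, [
    {"source_ip": "192.0.2.1", "count": 523, "disposition": "none",
     "dkim_aligned": "pass", "spf_aligned": "pass", "header_from": "example.com"},
])
total = conn.execute('SELECT SUM("count") FROM records WHERE source_ip = ?',
                     ("192.0.2.1",)).fetchone()[0]
print(total)
```

Report IDs are only unique per receiver, so a production schema might key on the pair of org_name and report_id instead.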

Storage volume after one year of operation for a moderate-volume domain: typically 500MB-5GB of structured data.

Analysis patterns

Common analysis queries:

Authentication success rate by source IP:

SELECT source_ip, 
       SUM(CASE WHEN dkim_aligned = 'pass' OR spf_aligned = 'pass' THEN count ELSE 0 END) as passing,
       SUM(count) as total
FROM records
WHERE report_date >= NOW() - INTERVAL '30 days'
GROUP BY source_ip;

Unauthorized source detection:

SELECT source_ip, SUM(count) as message_count
FROM records
WHERE source_ip NOT IN (SELECT authorized_ip FROM authorized_sending_ips)
  AND report_date >= NOW() - INTERVAL '7 days'
GROUP BY source_ip
ORDER BY message_count DESC;

Per-receiver authentication patterns:

SELECT report_metadata.org_name,
       records.dkim_aligned,
       records.spf_aligned,
       SUM(records.count) as message_count
FROM records JOIN report_metadata ON records.report_id = report_metadata.report_id
GROUP BY report_metadata.org_name, records.dkim_aligned, records.spf_aligned;

The queries reveal operational patterns that aggregate dashboards do not surface.

What we find when we parse reports for customers

Our customer base shows specific patterns when reports are analyzed in detail.

Unauthorized senders are common

Approximately 60-80% of our customers have at least one unauthorized sending source visible in their DMARC reports during any 30-day window.

The sources include:

Forgotten third-party tools that customers signed up for years ago
Marketing platforms used by other teams within the organization
Vendors who set up email integration without notifying the email team
Spoofing attempts (rare but real)

The detection of unauthorized senders is one of the most valuable outcomes of DMARC report analysis.

Configuration drift is common

Many customers have configuration drift over time. SPF records become incomplete as new sending sources are added. DKIM signatures become inconsistent as infrastructure changes. The drift produces gradually worsening authentication outcomes.

Detailed analysis surfaces drift early, while the fix is still small. Without analysis, drift accumulates until it produces visible deliverability problems.

Receiver-specific issues exist

Different receivers handle the same authentication setup with different results. The reasons vary: different validation strictness, different intermediate caching, different reputation models.

Identifying receiver-specific patterns informs receiver-specific remediation when needed.

Specific selector issues

DKIM selector usage patterns reveal interesting operational information. Some sources use specific selectors; others rotate selectors; others use ESP-managed selectors.

Tracking selector usage helps identify when key rotations happen and whether the rotations are clean.
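Selector tracking can be as simple as recording when each selector was first and last seen. A sketch assuming each observation pairs a report's begin timestamp with a flattened record (illustrative structure and names):

```python
def selector_timeline(observations):
    """Map (domain, selector) -> (first_seen_ts, last_seen_ts, total_messages)."""
    timeline = {}
    for report_ts, rec in observations:
        for dkim in rec["dkim_results"]:
            key = (dkim["domain"], dkim["selector"])
            first, last, total = timeline.get(key, (report_ts, report_ts, 0))
            timeline[key] = (min(first, report_ts), max(last, report_ts), total + rec["count"])
    return timeline

observations = [
    (1645833600, {"count": 10, "dkim_results": [
        {"domain": "example.com", "selector": "mail", "result": "pass"}]}),
    (1645920000, {"count": 5, "dkim_results": [
        {"domain": "example.com", "selector": "mail", "result": "pass"}]}),
    (1645920000, {"count": 7, "dkim_results": [
        {"domain": "example.com", "selector": "mail2", "result": "pass"}]}),
]
timeline = selector_timeline(observations)
print(timeline)
```

During a clean rotation, the old selector's last-seen date stops advancing shortly after the new selector first appears.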

Volume calibration

The volume reported by receivers gives operational calibration. If you think you sent 100K messages to Gmail but Gmail reports 80K, the 20% gap needs explanation. The explanation might be:

Messages rejected before reaching Gmail
Messages spam-filtered before reaching reporting infrastructure
Sampling differences in Gmail’s reporting

The calibration discipline catches issues that pure send-side logging cannot detect.
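The arithmetic is simple enough to automate per receiver. A sketch using the numbers from the example above; the 5% tolerance is an arbitrary assumption, and the input dicts stand in for your send logs and summed report counts:

```python
def volume_gaps(sent, reported, tolerance=0.05):
    """Return {receiver: (missing_count, missing_fraction)} where the gap exceeds tolerance."""
    gaps = {}
    for receiver, sent_count in sent.items():
        if sent_count <= 0:
            continue
        seen = reported.get(receiver, 0)
        fraction = (sent_count - seen) / sent_count
        if fraction > tolerance:
            gaps[receiver] = (sent_count - seen, fraction)
    return gaps

sent = {"google.com": 100_000, "outlook.com": 40_000}
reported = {"google.com": 80_000, "outlook.com": 39_500}
gaps = volume_gaps(sent, reported)
print(gaps)
```

Here only the Gmail gap is flagged; the 500-message difference at the second receiver is about 1% and falls under the tolerance.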

Time pattern detection

Where receivers send reports more often than daily, hourly granularity in analysis surfaces time patterns that daily aggregates miss. Some sources only send during business hours. Some sources send constantly. The patterns inform operational understanding.

What we have built for customer analysis

We have built infrastructure to parse and analyze customer DMARC reports.

Centralized report processing

Customer DMARC reports come to our managed report collection. We decode, parse, and store the reports automatically.

For customers using our infrastructure for mail, the report collection is part of our managed services. The customer does not need separate DMARC report infrastructure.

Customer-facing dashboards

We provide customers with monthly summary dashboards showing:

Authentication success rates
Unauthorized source alerts
Configuration drift indicators
Volume patterns

The dashboards focus on actionable information rather than vanity metrics.

Alert-based notification

When the analysis surfaces concerning patterns (sudden authentication degradation, new unauthorized source, etc.), customers receive immediate notification.

The alert thresholds are calibrated to surface meaningful issues without false positives.

Detailed investigation support

For customers facing specific deliverability issues, we provide deeper investigation using the report data. Specific source IPs, specific failure patterns, specific receiver behaviors all become queryable for diagnostic purposes.

Historical archive

The historical archive of reports supports trend analysis over months and years. Customers can see patterns they might not notice in shorter windows.

Recommendations based on our customer analysis experience:

Send reports somewhere parseable

Even if not analyzing in detail initially, send DMARC reports to a destination where they can be analyzed later. Third-party services with raw data access are better than services that only provide dashboards.

Maintain historical archive

Reports compound in value with historical context. Maintain at least 12 months of reports.

Look beyond authentication pass rates

The pass rate is the headline metric but not the only valuable one. Source IP analysis, receiver-specific patterns, and volume calibration all provide operational value.

Investigate unexpected sources

Any source that should not be sending as your domain warrants investigation. Either authorize the source, remediate the gap, or take action against the unauthorized use.

Catch configuration drift early

Periodic analysis catches drift while the fix is still small. Waiting for visible deliverability problems makes remediation more difficult.

Combine with other monitoring

DMARC reports are one data source. Combining them with sender-side logging, receiver dashboards (Postmaster Tools, SNDS), inbox placement testing, and customer feedback produces a more complete operational picture.

Don’t drown in the data

For most operations, monthly review with daily alerting for specific patterns is sufficient. The reports do not need to be reviewed daily by hand.

What does not work in DMARC report analysis

Some approaches we have seen that do not produce value.

Treating reports as a compliance checkbox

Some operators send DMARC reports to a parsing service and consider the work done. The reports contain information that requires actual review and action to produce value. Mere collection produces no benefit.

Optimizing for high pass rates without investigation

A 99% pass rate looks good in dashboards. The 1% failures may contain the operationally meaningful information. Optimizing for the headline metric while ignoring the failure patterns misses value.

Treating reports as a security tool only

DMARC reports support security (spoofing detection) and operational (delivery patterns) analysis. Treating them as security-only misses the operational insight.

Excessive granularity

Tracking every individual report event is overwhelming and not productive. Aggregating to meaningful groupings (source, receiver, time period) produces actionable information without noise.
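One way to get a meaningful grouping for sources is to roll individual IPs up to their surrounding network. A sketch with the standard ipaddress module; the /24 and /48 prefix lengths are arbitrary assumptions, and real sender infrastructure does not always follow such boundaries:

```python
import ipaddress
from collections import Counter

def network_key(source_ip, v4_prefix=24, v6_prefix=48):
    """Return the enclosing network for an IP as a string, e.g. '192.0.2.0/24'."""
    ip = ipaddress.ip_address(source_ip)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))

def volume_by_network(records):
    """Sum message counts per enclosing network."""
    totals = Counter()
    for rec in records:
        totals[network_key(rec["source_ip"])] += rec["count"]
    return totals

sample = [
    {"source_ip": "192.0.2.1", "count": 500},
    {"source_ip": "192.0.2.17", "count": 20},
    {"source_ip": "2001:db8::1", "count": 3},
]
totals = volume_by_network(sample)
print(totals)
```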

The longer-term operational value

DMARC report parsing produces value that compounds over time.

The historical archive enables trend analysis. Specific changes (infrastructure migrations, key rotations, new sender deployments) can be correlated with their impacts.

The pattern detection catches issues earlier. Operators with established report parsing infrastructure identify issues weeks before operators without the infrastructure.

The operational understanding improves. Repeated exposure to report data builds operator intuition for what normal looks like and what abnormal looks like.

The compliance discipline strengthens. Operators who actively analyze DMARC reports maintain better email infrastructure than operators who do not.

The cost of the analysis infrastructure is bounded. The benefit accumulates indefinitely. The investment justifies itself for operators serious about email infrastructure quality.

The honest assessment

DMARC report parsing is operationally valuable but underutilized. Most operators send reports to third-party dashboards and move on. The reports contain information that warrants deeper analysis.

For operators with technical capability, parsing reports themselves provides access to information that dashboards summarize away. The infrastructure to parse reports is bounded engineering. The ongoing analysis is bounded operational discipline.

For operators without technical capability, working with services that provide raw data access (rather than dashboard-only services) preserves the option for deeper analysis. The choice of DMARC service affects what analysis is possible.

For operators reading this with reports going to dashboards they rarely look at: the value is in the analysis, not the collection. Either invest in actual analysis (yourself or through your service provider) or stop pretending to do DMARC monitoring.

The reports are sent daily by major receivers. The data is operationally meaningful. The information is actionable when properly analyzed. The choice to extract the value is the operator’s. For our customer base, we extract the value as part of standard managed services. The customers benefit from the analysis we do; the customers can dig deeper when specific situations warrant.

For other operators: the work to extract value from DMARC reports is bounded. The benefit is sustainable improvement in email infrastructure operations. The investment is worth making for operators serious about long-term operational quality.

Daily reports, structured data, queryable patterns, actionable insights. The information is there. The discipline to extract it is the operator’s choice. The customers we work with who have chosen to extract the insight are operating with better infrastructure than the customers who have not. The pattern is observable, repeatable, and operationally meaningful. The investment in DMARC report analysis pays back over time in ways that more visible operational practices often do not.