DMARC Report Parser Open Source
DMARC is a critical email authentication protocol designed to protect your domain from impersonation and phishing. It builds on SPF (Sender Policy Framework) and DKIM (DomainKeys Identified Mail) by providing a framework for domain owners to instruct recipient mail servers on how to handle unauthenticated emails and, crucially, to receive reports on email authentication failures.
These reports, known as aggregate (RUA) reports, are XML files sent by recipient mail servers. They contain invaluable data: which IPs are sending mail purporting to be from your domain, whether SPF and DKIM passed, and, most importantly for troubleshooting, whether they aligned with your From domain. However, these reports are notoriously difficult to read and analyze manually. They're large, verbose, and come from many different sources, often zipped. This is where DMARC report parsers come in.
Why Open Source DMARC Parsers?
For engineers, the appeal of open source is strong. It offers transparency, control, and often, cost savings. When dealing with something as sensitive as email security and data about your sending infrastructure, being able to inspect the code, understand how data is processed, and even contribute fixes or custom features is a significant advantage.
An open-source DMARC parser allows you to:
- Own your data: Store your DMARC report data in your own infrastructure, giving you full control over privacy and retention.
- Customize: Adapt the parser to your specific needs, integrate it with existing monitoring systems, or add custom reporting features.
- Audit: Examine the parsing logic to ensure accuracy and understand exactly how results are being interpreted.
- Avoid vendor lock-in: Should a commercial service no longer meet your needs, you have a self-hosted alternative.
While commercial DMARC services offer convenience and advanced features, understanding and potentially deploying an open-source solution provides a deeper insight into the underlying mechanics.
Understanding DMARC Alignment Failures
Before diving into tools, let's clarify the core problem DMARC helps you identify: alignment failures.
DMARC requires that the domain used in the From header (the one users see) aligns with a domain authenticated by SPF or DKIM. A message passes DMARC if at least one of the two mechanisms both passes and aligns. This alignment can be either "strict" or "relaxed."
- SPF Alignment: For SPF to align, the domain in the Return-Path header (also known as the mfrom or envelope sender) must match the domain in the From header.
  - Strict alignment: The domains must be an exact match. E.g., example.com in Return-Path and example.com in From.
  - Relaxed alignment: The Return-Path domain can be a subdomain of the From domain (more precisely, the two must share the same organizational domain). E.g., bounce.example.com in Return-Path aligns with example.com in From.
- DKIM Alignment: For DKIM to align, the domain specified in the d= tag of the DKIM signature must match the domain in the From header.
  - Strict alignment: The domains must be an exact match. E.g., d=example.com aligns with From: user@example.com.
  - Relaxed alignment: The d= domain can be a subdomain of the From domain (again, the two must share the same organizational domain). E.g., d=marketing.example.com aligns with From: user@example.com.
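The strict and relaxed rules above can be sketched in a few lines of Python. This is an illustration rather than a reference implementation: the function names are made up for this example, and the organizational domain is approximated as the last two labels. Real implementations consult the Public Suffix List, so this shortcut gives wrong answers for suffixes such as co.uk. The mode values "s" and "r" mirror the values of the aspf and adkim tags in a DMARC record.

```python
def organizational_domain(domain: str) -> str:
    """Approximate the organizational domain as the last two labels.

    ASSUMPTION: a real implementation would use the Public Suffix List;
    this shortcut is wrong for multi-label suffixes such as co.uk.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_aligned(auth_domain: str, from_domain: str, mode: str = "r") -> bool:
    """Check whether an authenticated domain aligns with the From domain.

    mode "s" means strict (exact match); mode "r" means relaxed
    (both names share the same organizational domain).
    """
    auth = auth_domain.lower().rstrip(".")
    header_from = from_domain.lower().rstrip(".")
    if mode == "s":
        return auth == header_from
    return organizational_domain(auth) == organizational_domain(header_from)


print(is_aligned("bounce.example.com", "example.com"))           # True (relaxed)
print(is_aligned("bounce.example.com", "example.com", "s"))      # False (strict)
print(is_aligned("transactional-email.com", "yourdomain.com"))   # False
```

The same function serves both checks: pass the Return-Path domain for SPF alignment, or the d= domain for DKIM alignment.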
Why do these fail? The most common reason is using third-party email sending services (marketing platforms, transactional email APIs, CRMs).
Example 1: SPF Alignment Failure with a Marketing Platform
You use marketing.thirdparty.com to send emails from yourdomain.com.
* Your From header is user@yourdomain.com.
* The marketing platform, by default, might set the Return-Path to bounces@marketing.thirdparty.com for bounce tracking.
* SPF is evaluated against the Return-Path domain, not the From domain. So regardless of what your yourdomain.com SPF record includes, SPF passes because the sending IP is authorized for marketing.thirdparty.com.
* However, marketing.thirdparty.com does not align with yourdomain.com in the From header. DMARC fails for SPF.
Example 2: DKIM Alignment Failure with a Transactional Email Service
You use api.transactional-email.com to send order confirmations from yourdomain.com.
* Your From header is orders@yourdomain.com.
* The service signs the email with its own DKIM key, so the d= tag in the signature is d=transactional-email.com.
* DKIM passes because transactional-email.com is a valid signer for that message.
* But transactional-email.com does not align with yourdomain.com in the From header. DMARC fails for DKIM.
DMARC reports highlight these exact scenarios, showing you which sources are failing and why.
Open Source DMARC Parsers: Tools and Approaches
Setting up your own open-source DMARC parser typically involves:
- Configuring your DMARC record: Add rua=mailto:your_email@yourdomain.com to receive aggregate reports. Point this email address to a mailbox that your parser can access.
- Report ingestion: A script or application that fetches the XML reports (often gzipped) from the mailbox.
- Parsing: Extracting relevant data from the XML: source IP, SPF/DKIM results, alignment status, message counts, policy applied, etc.
- Storage: Saving the parsed data into a database (e.g., PostgreSQL, MySQL, SQLite).
- Reporting/Visualization: Tools to query and visualize the stored data.
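As a small illustration of the first step, here is a sketch that splits a DMARC TXT record into its tags and pulls out the report addresses. The record value and the dmarc-reports@yourdomain.com mailbox are invented for the example, and parse_dmarc_record is a hypothetical helper, not part of any particular tool. The record itself is published as a TXT record at _dmarc.yourdomain.com.

```python
def parse_dmarc_record(txt: str) -> dict:
    """Split a DMARC TXT record into a {tag: value} dictionary."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    return tags


# Hypothetical record value, as it might be published at _dmarc.yourdomain.com
record = "v=DMARC1; p=none; rua=mailto:dmarc-reports@yourdomain.com; adkim=r; aspf=r"
tags = parse_dmarc_record(record)

# rua may list several comma-separated URIs; keep only the mailbox part.
# This ignores optional size limits such as "!10m" that a URI may carry.
rua_addresses = [
    uri.strip().removeprefix("mailto:")
    for uri in tags.get("rua", "").split(",")
    if uri.strip()
]
print(tags["p"], rua_addresses)
```

A check like this is handy before wiring up ingestion, since a typo in the rua tag means no reports ever arrive.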
Let's look at some concrete examples:
1. dmarcian-community/dmarc-report-parser (Python)
This project, available on GitHub, is a Python-based tool designed to parse DMARC XML reports and store them in a database. It's relatively straightforward and a good starting point for engineers comfortable with Python.
How it works:
The core of this tool is a Python script that takes a DMARC XML file as input. It uses Python's xml.etree.ElementTree or similar libraries to navigate the XML structure, extract the relevant fields (like source_ip, policy_evaluated details including spf_aligned and dkim_aligned), and then inserts this data into a database.
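To make the parsing step concrete, here is a minimal sketch using xml.etree.ElementTree. It is not the project's actual code: parse_records and the sample report are invented for illustration. The element names follow the aggregate report schema from the DMARC specification, where the aligned results appear as dkim and spf under policy_evaluated; the dictionary keys dkim_aligned and spf_aligned are simply the names used here for those values. The sketch also assumes the report has no XML namespace, which does not hold for every reporter.

```python
import xml.etree.ElementTree as ET

# Invented sample report, trimmed to the fields used below
SAMPLE_REPORT = """<feedback>
  <report_metadata>
    <org_name>receiver.example</org_name>
    <report_id>sample-1</report_id>
  </report_metadata>
  <policy_published>
    <domain>yourdomain.com</domain>
    <p>none</p>
  </policy_published>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>12</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>fail</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>yourdomain.com</header_from>
    </identifiers>
  </record>
</feedback>
"""


def parse_records(xml_text: str) -> list[dict]:
    """Extract one dictionary per <record> element of an aggregate report."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.findall("record"):
        rows.append({
            "source_ip": record.findtext("row/source_ip"),
            "count": int(record.findtext("row/count", default="0")),
            "disposition": record.findtext("row/policy_evaluated/disposition"),
            # policy_evaluated holds the DMARC-aligned results
            "dkim_aligned": record.findtext("row/policy_evaluated/dkim") == "pass",
            "spf_aligned": record.findtext("row/policy_evaluated/spf") == "pass",
            "header_from": record.findtext("identifiers/header_from"),
        })
    return rows


records = parse_records(SAMPLE_REPORT)
print(records[0])
```

Each dictionary maps naturally onto one database row, which is why most parsers stop at this shape before handing off to storage.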
Basic Setup & Usage (Conceptual): You'd typically set up a cron job or a dedicated service to:
1. Fetch new DMARC reports from an IMAP mailbox.
2. Decompress them (if gzipped or zipped).
3. Pass each XML file to the parser script.
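The decompression step might look like the sketch below. It assumes you already have the attachment as bytes (fetching from IMAP is left out), and extract_report_xml is a hypothetical helper name. The format is detected from the leading magic bytes rather than the file name, and a zip archive is assumed to hold a single report.

```python
import gzip
import io
import zipfile


def extract_report_xml(payload: bytes) -> bytes:
    """Return raw XML bytes from a gzip, zip, or uncompressed attachment."""
    if payload[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(payload)
    if payload[:4] == b"PK\x03\x04":  # zip local file header
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # ASSUMPTION: one report per archive; take the first .xml member
            for name in archive.namelist():
                if name.lower().endswith(".xml"):
                    return archive.read(name)
        raise ValueError("zip archive contains no .xml member")
    return payload  # treat anything else as already-uncompressed XML


xml_bytes = b"<feedback></feedback>"
print(extract_report_xml(gzip.compress(xml_bytes)) == xml_bytes)
```

In production you would also want a size limit on the decompressed output, since the attachments come from third parties.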
# Example (simplified) of how you might run it after setup
# Assuming you have a report.xml file
python -m dmarc_report_parser --file report.xml --db-connection "sqlite:///dmarc_data.db"
The output is not directly a UI but data in your database. You'd then use SQL queries to analyze this data. For instance, to find all sources that failed DKIM alignment for your domain:
SELECT
    source_ip,
    COUNT(*) AS total_failures
FROM
    dmarc_records
WHERE
    dkim_aligned = 0 AND header_from = 'yourdomain.com'
GROUP BY
    source_ip
ORDER BY
    total_failures DESC;
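One caveat worth illustrating: each row of an aggregate report carries its own message count, so COUNT(*) counts report rows rather than messages. The sketch below runs a variant of the query with Python's sqlite3 module against an in-memory table. The schema is an assumption made for this example (in particular the message_count column), not the actual schema of any tool, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ASSUMED schema for illustration; a real parser's tables will differ
conn.execute(
    """CREATE TABLE dmarc_records (
           source_ip TEXT,
           header_from TEXT,
           message_count INTEGER,
           dkim_aligned INTEGER,
           spf_aligned INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO dmarc_records VALUES (?, ?, ?, ?, ?)",
    [
        ("192.0.2.10", "yourdomain.com", 12, 0, 1),
        ("192.0.2.10", "yourdomain.com", 3, 0, 0),
        ("198.51.100.7", "yourdomain.com", 40, 1, 1),
        ("203.0.113.5", "yourdomain.com", 5, 0, 1),
    ],
)

# Report both the number of failing rows and the number of failing messages
query = """
SELECT source_ip,
       COUNT(*) AS failing_rows,
       SUM(message_count) AS failing_messages
FROM dmarc_records
WHERE dkim_aligned = 0 AND header_from = ?
GROUP BY source_ip
ORDER BY failing_messages DESC
"""
results = conn.execute(query, ("yourdomain.com",)).fetchall()
for source_ip, failing_rows, failing_messages in results:
    print(source_ip, failing_rows, failing_messages)
```

Passing the domain as a bound parameter rather than interpolating it into the SQL string keeps the query safe if the value ever comes from user input.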
Strengths: Python's ecosystem, ease of scripting, direct database interaction.
Weaknesses: Lacks a built-in web UI, requires manual integration with visualization tools (like Grafana, Metabase, or custom dashboards), ongoing maintenance of the script and its dependencies. Scaling for very high volumes of reports might require more robust queuing and processing.
2. DMARC-Reports (PHP/Laravel)
The `DMARC-Reports/DMARC-