The Compliance Reality Check
Picture this: You’re sitting in an audit session, feeling as prepared as you could be. Minutes in, the auditor starts digging deeper into your processes. Suddenly, you realize there’s a gap. Your stomach drops, your heart rate spikes, cold sweat breaks out. Maybe you manage to pull through, maybe you don’t.
Or perhaps you’re doing some routine investigation and discover personal data where it absolutely shouldn’t be. Credit card numbers in log files. Bank account details in development databases. Social security numbers in analytics systems. The incredulity hits first (“This can’t be!”). Then anger, then panic as you realize the magnitude of what you’ve found.
These scenarios aren’t hypothetical. They’re the reality of how most organizations experience compliance today. We live in constant fear of what we might discover next, what auditors might find, what gaps exist that we haven’t identified yet.
But here’s what’s particularly frustrating: we’re still approaching compliance like it’s 1999.
One thought that changes how we look at compliance
Let me share a perspective that I hope changes how you think about compliance:
An audit is just a test plan we run once a year, by hand, with dozens of people.
Read that again. We’re running critical tests, once a year, manually. Think about what would happen if we applied this same approach to software testing. Our applications would be catastrophically unreliable. We’d have constant outages, data corruption, security breaches. No engineering team would accept shipping code with such poor testing practices.
Yet somehow, when it comes to compliance, which often governs our most critical security and operational requirements, we have accepted this antiquated approach as normal.
The 1999 Software Testing Parallel
The challenges we face with compliance today sound remarkably familiar:
- Process-oriented colleagues who love manual screenshots, spreadsheets, and email workflows.
- Late issue discovery: problems are found weeks or months before audits, completely disconnected from the development flow.
- Reactive and disruptive processes that pull engineers away from product work for compliance theater.
- Siloed knowledge: only compliance people understand compliance requirements.
- Constant fear and uncertainty: "Are we still compliant? What failed now?"
If you traveled back to 1999 and attended a software engineering conference, you’d hear identical complaints about software testing. The same frustrations, the same inefficiencies, the same fear-driven culture.
We solved those testing problems through automation and continuous integration. The transformation was remarkable. We went from:
- Manual testing that happened prior to a release.
- Last-minute surprises and fire-fighting.
- Fear of deploying because “we’re not sure what might break”.
- Testing as a separate, disconnected activity.
To:
- Automated tests that run thousands of times per day.
- Immediate feedback on every code change.
- Confidence in our systems and ability to deploy fearlessly.
- Testing as an integrated part of the development process.
The same transformation is possible and necessary for compliance.
The Engineering Solution
1. Compliance as Product Features
The first step is to stop treating compliance as “non-functional requirements.” In the cloud era, compliance requirements are functional features you can verify through APIs, configurations, and code.
Write user stories to clarify what the compliance requirement actually expects:
As a security officer investigating data protection controls,
I want a real-time inventory of all databases with their encryption status,
So that I can verify our data-at-rest is encrypted to protect against threats like:
- Unauthorized filesystem access (insider threats, compromised accounts)
- Physical theft of storage devices or backups
- Cloud storage misconfigurations exposing raw database files
- Decommissioned hardware containing unwiped sensitive data
Context: Multiple ISO 27001:2022 controls:
- A.5.9 (Inventory of assets):
Real-time database inventory maintains accurate asset records
- A.5.12 (Classification of information):
Knowing which databases contain sensitive data
- A.8.24 (Use of cryptography):
Verifying encryption implementation and effectiveness
- A.8.16 (Monitoring activities):
Continuous verification demonstrates ongoing control monitoring
This user story approach provides:
- Clear scope: What exactly needs to be verified
- Compliance context: Why this requirement exists
- Measurable outcome: How success is defined
- Multi-control traceability: How one implementation supports multiple ISO 27001 controls
- Interconnected compliance: Shows how asset management, monitoring, and cryptography work together
Getting started with existing systems: If your product is already running, you don’t need to define every scenario at once. Pick the one compliance scenario that’s causing the most pain: maybe you’re spending hours every quarter gathering database encryption evidence, or your security team is constantly chasing down unencrypted S3 buckets. Automate that one pain point first, measure the time savings, then iterate. This approach builds momentum and demonstrates value.
2. Automating your compliance test suite
Once you have clarity on what you need to implement and why (through user stories), you can start automating those tests.
For new products: Apply Test-Driven Development (TDD) principles to compliance. Write your compliance tests first, then build your system to pass them. This approach means you’re compliance-ready from day one, no last-minute scrambling for approvals or manual verification. When your test suite is green, that becomes your compliance evidence.
For existing systems: Start with your biggest pain points and work backwards. Pick the most time-consuming manual checks and automate those first. You will likely discover that controls you thought were in place have gaps: because you were never testing them, they weren’t applied consistently everywhere. Keep fixing those issues until your compliance foundation stabilizes.
Regardless of which approach you choose, your compliance test suite should focus on the what, not the how, making your tests more resilient to changes while giving engineering teams the flexibility to choose their own implementation methods. For example, testing “S3 buckets must enforce HTTPS” is more durable than testing “S3 bucket policy must contain specific JSON strings” - the first survives CloudFormation migrations, the second breaks.
Here’s a concrete example:
test_suite: "data_protection"
description: "Verify data protection controls across infrastructure"
tests:
  - name: "database_encryption_at_rest"
    requirement: "All production databases encrypted at rest"
    standards:
      - "ISO 27001:2022 - A.5.12 (Classification of information)"
      - "ISO 27001:2022 - A.8.24 (Use of cryptography)"
      - "ISO 27001:2022 - A.8.16 (Monitoring activities)"
    validation:
      - check: aws_rds_encryption_enabled
      - check: aws_dynamodb_encryption_enabled
      - check: mongodb_encryption_enabled
    failure_action: "block_deployment"
  - name: "s3_bucket_https_enforcement"
    requirement: "All S3 buckets exposed to the internet use HTTPS"
    standards:
      - "ISO 27001:2022 - A.8.24 (Use of cryptography)"
      - "ISO 27001:2022 - A.8.20 (Networks security)"
      - "ISO 27001:2022 - A.8.9 (Configuration management)"
    validation:
      - check: aws_s3_bucket_uses_https
    failure_action: "warn_and_track"
Notice how these tests:
- Don’t care about implementation: Whether you use AWS KMS, CloudFormation, or Terraform doesn’t matter
- Are continuously verifiable: Can run against live infrastructure at any time
- Provide clear traceability: Each test links back to specific compliance requirements
- Handle exceptions gracefully: Different failure actions for different risk levels
- Start with warnings: New controls begin as “warn_and_track” to give teams time to adapt
The check functions (like aws_rds_encryption_enabled) are simple API calls that verify the current state without caring how it was configured.
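As a sketch of what such a check function might look like, here is a hypothetical Python version of `aws_rds_encryption_enabled`. `DBInstances` and `StorageEncrypted` are real fields in the AWS RDS `DescribeDBInstances` API response; splitting the evaluation from the API call (which in practice would be a boto3 `rds.describe_db_instances()` request) is an assumption made here so the logic is testable offline:

```python
def unencrypted_rds_instances(describe_response: dict) -> list[str]:
    """Return identifiers of RDS instances without encryption at rest.

    StorageEncrypted reflects the current state regardless of how it was
    configured (console, CloudFormation, Terraform, ...), which is exactly
    the "test the what, not the how" property described above.
    """
    return [
        db["DBInstanceIdentifier"]
        for db in describe_response.get("DBInstances", [])
        if not db.get("StorageEncrypted", False)
    ]


def aws_rds_encryption_enabled(describe_response: dict) -> bool:
    """The check passes only when every instance is encrypted."""
    return not unencrypted_rds_instances(describe_response)
```

In production the `describe_response` argument would be fed from the live AWS API, so the same function serves both as a CI gate and as continuously collected audit evidence.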
Something interesting happens at this stage. As you keep automating compliance checks, you layer security control upon security control, strengthening your security posture. You gradually build a wall of controls: if one fails, others are still there to keep your business secure. Suddenly, compliance isn’t just about getting a certificate; it reinforces your whole approach to security.
3. CI/CD Integration: testing your compliance on every deployment
Once you have your compliance test suite, integrate it into your CI/CD pipelines. This way, if you deploy 1000 times a day, you’ll check your compliance posture just as many times and catch issues early when they’re easier and cheaper to fix.
Real-world example from payments: Many financial companies monitor logs in real-time to detect credit card number leakage, generating thousands of alerts per day - almost always false positives. This reactive approach is expensive, disruptive, and still misses issues.
Here’s how to shift from reactive monitoring to proactive prevention. This example shows the compliance checks that run before deployment:
# Instead of monitoring logs for card numbers after they're written...
compliance_check: "prevent_card_data_logging"
pipeline_stage: "pre_deployment"
validations:
  - name: "authorized_loggers_only"
    description: "Only approved logging libraries allowed"
    check: "scan_for_unauthorized_loggers"
    failure_action: "block_build"
  - name: "automatic_data_masking"
    description: "PCI-scope data automatically masked"
    check: "verify_masking_rules_applied"
    patterns:
      - credit_card_numbers
      - social_security_numbers
      - bank_account_numbers
    whitelist_patterns:
      - transaction_ids
      - timestamp_formats
      - order_numbers
    failure_action: "block_build"
Let’s break down what each validation does:
Authorized loggers only: Scans your codebase to ensure developers only use approved logging libraries that have built-in data masking capabilities. This prevents accidental logging of sensitive data through unauthorized libraries.
Automatic data masking validation: Verifies that your masking rules are properly applied to sensitive data patterns while allowing whitelisted patterns (like transaction IDs or timestamps) that might look similar but aren’t actually sensitive. This catches cases where developers might bypass or misconfigure masking while reducing false positives.
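A minimal sketch of the card-number part of that masking validation might look like the following. The whitelist regexes (a `txn_` prefix for transaction IDs, an `order-` prefix for order numbers) are invented formats for illustration, and the card pattern is deliberately simplistic, not PCI-grade:

```python
import re

# Card-number-like token: 13-16 digits, optionally separated by spaces/hyphens.
# Illustrative only; a real scanner would add Luhn checks and issuer prefixes.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

# Whitelisted look-alikes (formats assumed for this sketch).
WHITELIST = [
    re.compile(r"\btxn_\d+\b"),          # transaction_ids
    re.compile(r"\border-\d{13,16}\b"),  # order_numbers
]


def find_unmasked_card_data(line: str) -> bool:
    """True if a log line contains a card-like token that is not whitelisted."""
    # Remove whitelisted tokens first so they cannot trigger a false positive.
    for allowed in WHITELIST:
        line = allowed.sub("", line)
    return bool(CARD_PATTERN.search(line))
```

Note that a timestamp like `2024-01-15 12:30:45` never matches: the colons break the digit run before it reaches 13 digits, which is how pattern design (rather than alert triage) eliminates a whole class of false positives.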
This approach:
- Prevents issues at the source instead of detecting them after they occur
- Scales automatically with your development velocity
- Provides immediate feedback to developers
- Eliminates most false positives through intelligent pattern matching
Now, when you look at your ongoing log monitoring, you can focus on very specific areas of the logs, like API requests and responses, while relegating the rest of the checks to your build stage.
4. Add observability to your process compliance
For process-heavy requirements that can’t be automated (like approval workflows, training completion, or incident response), treat them as an observability problem. Just like you monitor system reliability with SLIs and SLOs, you can monitor process reliability with compliance indicators.
Here’s how to build process observability:
1. Define your Service Level Indicators (SLIs) - the specific metrics that indicate process health:
process_slis:
  - name: "vulnerability_remediation_time"
    metric: "time_from_discovery_to_fix_in_days"
    measurement: "p95_across_rolling_30_days"
  - name: "code_review_coverage"
    metric: "percentage_of_changes_with_two_reviewers"
    measurement: "count_over_rolling_7_days"
  - name: "incident_response_time"
    metric: "time_from_alert_to_first_human_response"
    measurement: "p90_across_rolling_24_hours"
2. Set your Service Level Objectives (SLOs) - the targets that define “good enough”:
process_slos:
  - sli: "vulnerability_remediation_time"
    target: "< 15 days for 95% of critical vulnerabilities"
  - sli: "code_review_coverage"
    target: "> 99% of production changes"
    exceptions: "emergency_deployments_with_post_review"
  - sli: "incident_response_time"
    target: "< 30 minutes for 90% of critical alerts"
3. Build observability dashboards that show current performance against targets:
- Traffic lights: Green when SLO is met, amber when close to breach, red when breached
- Trend analysis: Is performance improving or degrading over time?
- Error budgets: How much compliance “debt” can you afford before taking action?
- Leading indicators: Early warning signals before SLO breaches occur
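The traffic-light evaluation can be sketched in a few lines. This follows the `vulnerability_remediation_time` definitions above (p95 over a rolling window, target < 15 days); the 80% "amber" margin is an assumption added for illustration:

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile (pct in 0..100) over a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def slo_status(remediation_days: list[float],
               target_days: float = 15.0,
               amber_margin: float = 0.8) -> str:
    """Traffic-light status for the remediation-time SLO.

    green: comfortably under target; amber: close to breach (within the
    assumed 80% margin); red: SLO breached.
    """
    p95 = percentile(remediation_days, 95)
    if p95 < target_days * amber_margin:
        return "green"
    if p95 < target_days:
        return "amber"
    return "red"
```

The amber band is what turns the dashboard into a leading indicator: it fires while you still have error budget left, before the SLO is actually breached.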
5. Risk-Focused Alerting: Evolution, Not Enumeration
The traditional compliance approach generates one alert per issue: “Database X is unencrypted,” “User Y hasn’t completed training,” “Service Z failed vulnerability scan.” This doesn’t scale. With hundreds of services and thousands of compliance checks, you end up drowning in alerts that don’t help you understand your actual risk posture.
The problem with individual alerts: They create a detailed to-do list for engineering teams without context about what actually matters. Engineers spend time fixing low-risk issues while high-risk problems go unaddressed.
The risk evolution approach: Instead of alerting on individual compliance violations, track how your overall risk profile is changing and alert on trends that indicate deteriorating security posture.
Instead of 600 individual vulnerability alerts like this:
❌ CVE-2024-1234 found in service-auth-api (Critical)
❌ CVE-2024-1234 found in service-payment-processor (Critical)
❌ CVE-2024-1234 found in service-user-management (Critical)
❌ CVE-2024-1234 found in service-analytics-engine (Critical)
❌ CVE-2024-1234 found in service-notification-hub (Critical)
... (595 more alerts for the same CVE across different services)
You get one strategic alert like this:
🔴 Security Posture Alert: New Critical Vulnerability
Risk Level: High (new critical CVE affecting 600 services)
Issue Summary:
• CVE-2024-1234 discovered in logging library used across platform
• Affects 600 services (95% of production environment)
• 15-day remediation SLA now at risk due to scope
Business Impact: Average resolution rate is 75 vulnerabilities per week.
At current capacity, estimated resolution time is 8 weeks, which would breach
the 15-day SLA and keep the company exposed to a critical vulnerability for
an extended period, increasing the chances of a threat actor exploiting it
Recommended Response:
Coordinate resolution with the impacted teams (Alpha, Beta and Gamma teams).
This approach focuses on risk direction rather than individual items:
- Trend-based alerting: Alert when risk is increasing, not on every individual finding
- Business impact: Connect compliance metrics to actual business risk
- Let teams prioritize: Provide risk context, let engineering teams decide implementation details
- Strategic focus: Security teams focus on risk assessment, not micromanaging fixes
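The aggregation behind that single strategic alert can be sketched as follows. The 75-fixes-per-week capacity and 15-day SLA come from the example above; the shape of a "finding" record is an assumption for this sketch:

```python
import math
from collections import Counter


def aggregate_findings(findings: list[dict],
                       fixes_per_week: int = 75,
                       sla_days: int = 15) -> list[dict]:
    """Collapse per-service findings into one alert per CVE.

    Each alert carries the affected-service count, a capacity-based
    remediation estimate, and a flag for whether the SLA is at risk,
    ordered by blast radius (most affected services first).
    """
    per_cve = Counter(f["cve"] for f in findings)
    alerts = []
    for cve, count in per_cve.most_common():
        weeks_to_fix = math.ceil(count / fixes_per_week)
        alerts.append({
            "cve": cve,
            "affected_services": count,
            "estimated_weeks": weeks_to_fix,
            "sla_at_risk": weeks_to_fix * 7 > sla_days,
        })
    return alerts
```

Run against the scenario above (one CVE hitting 600 services), this reproduces the 8-week estimate from the alert text and flags the SLA breach, while 600 individual findings collapse into a single actionable item.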
Implementation example: Payment Processing Compliance
A payments company needed to maintain PCI DSS compliance across microservices. Traditional approach: quarterly manual reviews of each service, spreadsheet tracking, lots of screenshots.
Engineering approach:
pci_compliance_suite:
  cardholder_data_isolation:
    tests:
      - verify_no_card_data_in_logs
      - verify_encrypted_data_transmission
      - verify_secure_key_management
  access_controls:
    tests:
      - verify_least_privilege_access
      - verify_mfa_enforcement
      - verify_session_management
  network_security:
    tests:
      - verify_firewall_rules
      - verify_network_segmentation
Outcome: Engineers have a clear understanding of the PCI DSS requirements and treat compliance as another feature in their daily work. Failures are detected early and addressed proactively, as soon as issues appear, instead of being discovered weeks or months later.
Changes in Team Behavior
Organizations that implement this approach typically observe these changes:
Engineering Teams
- Audit preparation time reduces as evidence is continuously collected
- Compliance issues are detected during development rather than during manual assessments
- Engineers gain familiarity with compliance requirements through daily interaction
- Development patterns adjust to incorporate compliance checks
Security Teams
- Focus changes from individual infractions to evaluating risk evolution
- Cross-team collaboration improves as security checks are embedded in development processes (the DevSecOps approach)
- Decision-making relies on continuous data rather than periodic reports
Operational Changes
- Audit cycles cause less organizational disruption because teams already manage compliance as part of their daily routine
- Compliance costs shift from labor-intensive evidence gathering to tooling and automation
- Issue resolution time decreases thanks to active monitoring, rather than waiting for the next audit to surface problems
- Knowledge retention improves as compliance understanding becomes embedded in daily work
Working with Auditors
The automated approach changes how you interact with auditors in two key ways:
From Spreadsheets to Automation
Instead of manually filling compliance spreadsheets during audit prep, you provide auditors with:
- Automated reports showing current compliance status
- Clear methodology documentation explaining how your automated tests work
From Screenshots to Asset Inventories
Replace ad-hoc screenshots of configurations with:
- Real-time asset inventories showing all systems and their compliance status
- Standardized reports generated from your compliance test suite
- Audit trails showing when controls were implemented and how they’ve been maintained
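One way to produce such a standardized report is to render the results of the compliance test suite into a timestamped machine-readable document. This is a hypothetical sketch; the report shape is invented for illustration, not a format any auditor mandates:

```python
import json
from datetime import datetime, timezone


def compliance_report(results: dict[str, bool]) -> str:
    """Serialize check results into an audit-ready JSON report.

    `results` maps a check name (e.g. "database_encryption_at_rest")
    to its pass/fail status from the latest test-suite run.
    """
    return json.dumps({
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "summary": {
            "total": len(results),
            "passed": sum(results.values()),  # True counts as 1
        },
        "checks": [
            {"name": name, "status": "pass" if ok else "fail"}
            for name, ok in sorted(results.items())
        ],
    }, indent=2)
```

Because the report is generated on demand from live checks, every audit request becomes a function call rather than a screenshot-gathering exercise.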
Bringing Auditors Along
Start conversations with your auditors 6-12 months before the audit. Explain that you’re moving from manual evidence collection to automated compliance monitoring. Most experienced auditors understand that automated validation is more reliable than point-in-time manual checks; they just need to understand your methodology and be able to validate your approach.
The Vision: Audit-Ready by Default
Imagine this scenario one year from now:
You’re at your desk, enjoying your morning coffee, when a colleague from the security team stops by. They’re smiling as they congratulate you, your company just successfully completed its annual compliance audit.
You pause, confused. You didn’t even know you were being audited.
But then you smile back. Of course you passed. Your systems are audit-ready by default. Every compliance requirement is continuously validated, every process is monitored in real-time, every change is automatically verified against your compliance test suite.
This isn’t fantasy. It’s the north star of applying proven engineering practices to compliance.
This is what painless compliance looks like. This is what a thousand audits a day enables.