Why privacy matters for security ML
Security telemetry (logs, user behaviour, device signals) is powerful for detecting threats, but it often contains sensitive personal data. Privacy-preserving ML lets teams build effective detectors while sharply reducing how much raw data ever leaves the device or network where it was collected.
Key approaches
Federated learning
Models train on-device and only send model updates (not raw data) to a central aggregator. This reduces the need to centralise user logs while still benefiting from diverse data.
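A minimal sketch of that flow, using a toy logistic-regression step and NumPy; the function names (`local_update`, `federated_average`) and the three-client round are illustrative, not a real framework API:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One gradient step of logistic regression on a client's local
    data. Only the resulting weight delta leaves the device."""
    preds = 1.0 / (1.0 + np.exp(-X @ weights))
    grad = X.T @ (preds - y) / len(y)
    return -lr * grad  # an update, never the raw (X, y)

def federated_average(updates):
    """Server-side FedAvg: average the client deltas without ever
    seeing any client's raw data."""
    return np.mean(updates, axis=0)

# Hypothetical round with three clients and synthetic data.
rng = np.random.default_rng(0)
w = np.zeros(4)
clients = [(rng.normal(size=(20, 4)), rng.integers(0, 2, size=20))
           for _ in range(3)]
updates = [local_update(w, X, y) for X, y in clients]
w = w + federated_average(updates)
```

In production this loop runs over many rounds with client sampling and weighting by local dataset size; the sketch keeps only the core idea that the aggregator consumes deltas, not data.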
Differential privacy
Calibrated noise is added to model updates or query results so that the influence of any single data point is mathematically bounded, meaning individual records cannot be reliably reconstructed or inferred from the output. Combined with federated learning, this provides strong, quantifiable privacy assurances.
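The standard recipe (as in DP-SGD-style training) is: clip each update to bound its sensitivity, then add Gaussian noise scaled to that bound. A sketch, with the function name and the `noise_multiplier` default chosen here for illustration:

```python
import numpy as np

def dp_sanitise(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip an update's L2 norm to bound one client's influence,
    then add Gaussian noise calibrated to the clip norm."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=update.shape)
    return clipped + noise
```

The privacy budget (epsilon) then follows from the noise multiplier, the clipping norm, and how many rounds each client participates in; accounting for that budget is the part that needs the most care in practice.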
Secure aggregation
Cryptographic techniques let servers combine model updates from many devices without seeing any single device's contribution in cleartext.
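The core trick can be illustrated with pairwise additive masks that cancel in the sum: each pair of clients shares a random mask, one adds it and the other subtracts it, so the server sees only masked vectors yet recovers the exact total. This toy version uses a shared seed for brevity; a real protocol (e.g. the Bonawitz et al. design) derives the masks from pairwise key agreement and handles dropouts:

```python
import numpy as np

def pairwise_mask(n_clients, dim, seed=42):
    """Return a function giving client i's net mask. Masks are built so
    that summing all clients' masks yields exactly zero."""
    rng = np.random.default_rng(seed)
    shared = {(i, j): rng.normal(size=dim)
              for i in range(n_clients) for j in range(i + 1, n_clients)}
    def mask_for(i):
        m = np.zeros(dim)
        for (a, b), v in shared.items():
            if a == i:
                m += v        # i adds the mask it shares with a later peer
            elif b == i:
                m -= v        # and subtracts the one from an earlier peer
        return m
    return mask_for

def secure_sum(updates):
    """What the server computes: it only ever sees masked updates,
    but the pairwise masks cancel in the total."""
    mask_for = pairwise_mask(len(updates), updates[0].shape[0])
    masked = [u + mask_for(i) for i, u in enumerate(updates)]
    return np.sum(masked, axis=0)
```

Any single masked vector looks like random noise to the server; only the aggregate is meaningful, which is exactly the property federated learning needs.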
Use-cases in security
- On-device phishing detection that improves from signals across users without sending message contents to a central server.
- Anomaly detection for login patterns where raw logs remain on-premises and only anonymised model improvements are shared.
- Malware classification improvements where telemetry stays local and only aggregated learning occurs.
Practical considerations
Privacy-preserving techniques reduce risk but are not a silver bullet: they require careful engineering (noise calibration, privacy budgeting, dropout handling), ongoing auditing, and clear user consent. Organisations should prioritise transparent policies and independent verification of privacy claims over marketing labels.
Where this fits with Esrok
This post supports our AI + security pillar and links naturally to privacy guidance: Privacy, and to authentication discussions like Passkeys explained, which reduce credential exposure.