Is nasty noise actually harder than malicious noise?
G. Blanc and Y. Huang and T. Malkin and R. Servedio
37th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2026, to appear.


Abstract:

We consider the relative abilities and limitations of computationally efficient algorithms for learning in the presence of noise, under two well-studied and challenging adversarial noise models for learning Boolean functions: malicious noise, in which an adversary can arbitrarily corrupt a random subset of examples given to the learner; and nasty noise, in which an adversary can arbitrarily corrupt an adversarially chosen subset of examples given to the learner. We consider both the distribution-independent and fixed-distribution settings. Our main results highlight a dramatic difference between these two settings:

  (1) For distribution-independent learning, we prove a strong equivalence between the two noise models: If a class ${\cal C}$ of functions is efficiently learnable in the presence of $\eta$-rate malicious noise, then it is also efficiently learnable in the presence of $\eta$-rate nasty noise.
  (2) In sharp contrast, for the fixed-distribution setting we show an arbitrarily large separation: Under a standard cryptographic assumption, for any value $r$, no matter how large, there exists a concept class for which there is a ratio of $r$ between the rate $\eta_{\mathrm{malicious}}$ of malicious noise that polynomial-time learning algorithms can tolerate and the rate $\eta_{\mathrm{nasty}}$ of nasty noise that such algorithms can tolerate.
  (3) To offset the negative result given in (2) for the fixed-distribution setting, we define a broad and natural class of algorithms, namely those that ignore contradictory examples (ICE); a hypothetical illustration of such filtering is sketched below this list. We show that for these algorithms, malicious noise and nasty noise are equivalent up to a factor of two in the noise rate: Any efficient ICE learner that succeeds with $\eta$-rate malicious noise can be converted to an efficient learner that succeeds with $\eta/2$-rate nasty noise. We further show that this factor of two is necessary, again under a standard cryptographic assumption.
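
All three results turn on the distinction described above: under malicious noise the corrupted subset of examples is random, while under nasty noise it is chosen by the adversary after seeing the sample. The minimal Python sketch below is not from the paper, and every name in it (draw_x, adversary_pair, adversary_rewrite) is hypothetical; it only illustrates the two corrupted example oracles for a Boolean target function f.

    import random

    def malicious_sample(f, draw_x, eta, m, adversary_pair):
        """eta-rate malicious noise: a RANDOM subset of the m examples is corrupted.
        Independently with probability eta, the learner receives an arbitrary labeled
        pair of the adversary's choosing instead of a clean example."""
        sample = []
        for _ in range(m):
            if random.random() < eta:
                sample.append(adversary_pair())      # arbitrary (x, y) chosen by the adversary
            else:
                x = draw_x()
                sample.append((x, f(x)))             # clean labeled example
        return sample

    def nasty_sample(f, draw_x, eta, m, adversary_rewrite):
        """eta-rate nasty noise: the adversary sees the entire clean sample and then
        corrupts an ADVERSARIALLY CHOSEN subset of up to eta*m examples, replacing
        each with an arbitrary labeled pair."""
        clean = [(x, f(x)) for x in (draw_x() for _ in range(m))]
        return adversary_rewrite(clean, int(eta * m))  # may rewrite any eta*m of the entries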

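The abstract does not spell out the ICE condition formally. One natural reading of "ignoring contradictory examples" is that the learner first discards any labeled example whose point also appears in the sample with the opposite label; the hypothetical Python filter below illustrates that reading only, and the paper's actual definition may differ.

    from collections import defaultdict

    def drop_contradictory(sample):
        """Hypothetical preprocessing step: remove every (x, y) for which the sample
        also contains (x, 1 - y), i.e. the same point with conflicting Boolean labels.
        Points x are assumed hashable (e.g. tuples of bits)."""
        labels_seen = defaultdict(set)
        for x, y in sample:
            labels_seen[x].add(y)
        return [(x, y) for (x, y) in sample if len(labels_seen[x]) == 1]
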
    As a key ingredient in our proofs, we show that it is possible to efficiently amplify the success probability of nasty noise learners in a black-box fashion. Perhaps surprisingly, this was not previously known; the result turns out to be non-obvious, and we believe it may be of independent interest.
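
For context, the standard noise-free recipe for amplifying a PAC learner's success probability is to run it several times on fresh data and keep the hypothesis with the lowest empirical error on a held-out validation sample, roughly as in the hypothetical sketch below. Under nasty noise the validation sample itself can be adversarially corrupted, which is one reason a black-box analogue is not obvious; the sketch is not the paper's construction.

    def amplify_confidence(learner, draw_sample, k, train_size, val_size):
        """Naive confidence amplification in the noise-free setting (all names hypothetical):
        run the learner k times on independent training samples, then return the candidate
        hypothesis with the smallest empirical error on a fresh validation sample. Under
        nasty noise the validation data can also be corrupted, so this template does not
        directly apply."""
        candidates = [learner(draw_sample(train_size)) for _ in range(k)]
        validation = draw_sample(val_size)
        def empirical_error(h):
            return sum(h(x) != y for x, y in validation) / len(validation)
        return min(candidates, key=empirical_error)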

