
The theoretical diversity of peptide libraries is astronomical. We define a practical lower bound as $Y \geq 1/\left(s \times a \times \prod_{k=1}^{m} f_k\right)$. A literature survey estimates that a diversity of ≈1.6 × 10⁵ (m ≈ 4, a ≈ 1, s ≈ 1) is required to include one specific binder.
ABSTRACT
Molecular library display systems utilizing phage, bacteria, and genetic material are powerful tools for identifying target-specific peptides. Libraries of sufficient diversity are required to isolate target-specific peptides. Although the evolution of molecular display techniques, from phage display to mRNA display, has substantially expanded achievable library diversities, the minimum diversity required to reliably isolate target-specific peptides remains unclear. Here, we propose a straightforward equation to estimate the minimum diversity (Y). Y is defined by three experimentally accessible parameters: (i) the number of important amino acids (m, consensus motif residues whose substitution markedly reduces binding), (ii) the number of independent binding sites (s) on the target molecule, and (iii) the arrangement factor (a) that counts the possible positional permutations of the motif within the random region. By analyzing 35 previously reported target-specific peptides, we found that the average value of m was ≈4, whereas a and s were most frequently ≈1. These representative values imply a practical lower-bound benchmark of diversity, Y ≥ 1.6 × 10⁵, for random peptide libraries in screening. Our findings will aid researchers in rationalizing the design and construction of peptide libraries, facilitating efficient identification of high-affinity, target-specific peptides.
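As a rough illustration of how the benchmark follows from the equation, the sketch below evaluates Y = 1/(s × a × ∏ f_k) for the representative values m ≈ 4, a ≈ 1, s ≈ 1. The abstract does not define f_k; here it is assumed to be the probability that a random library position carries the required residue, taken as 1/20 for a uniform mix of the 20 canonical amino acids. The function name `minimum_diversity` and its parameter names are hypothetical, chosen only for this example.

```python
from math import prod


def minimum_diversity(residue_frequencies, sites=1, arrangements=1):
    """Estimate the lower-bound library diversity Y = 1 / (s * a * prod(f_k)).

    residue_frequencies: one f_k per important residue (length m).
        ASSUMPTION: f_k is the per-position probability of the required
        amino acid; the source text does not define it explicitly.
    sites: number of independent binding sites on the target (s).
    arrangements: arrangement factor of the motif in the random region (a).
    """
    if sites < 1 or arrangements < 1:
        raise ValueError("sites and arrangements must be at least 1")
    if not all(0 < f <= 1 for f in residue_frequencies):
        raise ValueError("each frequency must lie in the interval (0, 1]")
    return 1.0 / (sites * arrangements * prod(residue_frequencies))


# Representative case: m = 4 important residues, each assumed to occur
# with frequency 1/20 (uniform over 20 amino acids), with s = 1 and a = 1.
uniform_frequencies = [1 / 20] * 4
y_min = minimum_diversity(uniform_frequencies, sites=1, arrangements=1)
print(f"Estimated minimum diversity: {y_min:.1e}")
```

Under these assumptions the estimate equals 20⁴ = 1.6 × 10⁵, matching the benchmark above. Passing measured per-position frequencies instead of the uniform 1/20 would adapt the estimate to libraries whose codon scheme does not encode all amino acids equally.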

