Each individual harbors an enormous diversity of unique TCR sequences. This diversity is necessary for protective immunity, because effective T-cell responses must be mounted against an unpredictable set of target epitopes derived from pathogens or tumor cells. TCR diversity is restricted by thymic selection processes, including the requirement to recognize MHC, and by the limited space available for maintaining T cells in an individual. Epitope diversity is restricted in a similar manner: by the capacity to bind to a specific MHC, by chemical stability, and by the limited number of key residues in an epitope that are exposed to TCR recognition. Current estimates of TCR and epitope diversity predict a high likelihood of mounting an effective immune response against any given target. In other words, there is a high probability of finding a high-avidity clonotype in a naïve repertoire for any specificity.
New technologies have enabled access to large-scale TCR sequencing data and will continue to improve. However, screening the functionality of hundreds of thousands of TCRs in a high-throughput manner is still not feasible. We aim to understand how TCR specificity can be predicted efficiently from the TCR sequence alone, and to combine this with the identification of high-avidity clonotypes. TCR sequence data from naïve T-cell repertoires could serve as an invaluable source for the rapid identification of high-avidity TCRs for cellular therapy. To achieve this goal, we are generating large-scale TCR libraries of experimentally validated epitope specificities to elucidate TCR recognition in different contexts.