Chinese Chemical Letters 2025, Vol. 36 Issue (8): 110546
Site-specific protein labeling: Recent progress
Yiming Ma, Yuanbo Wang, Fang Wang, Sheng Lu*, Xiaoqiang Chen*     
State Key Laboratory of Materials-Oriented Chemical Engineering, College of Chemical Engineering, Jiangsu National Synergetic Innovation Center for Advanced Materials (SICAM), Nanjing Tech University, Nanjing 211800, China
Abstract: Site-specific protein labeling plays important roles in drug discovery and in illuminating biological processes at the molecular level. However, it is challenging to label proteins with high specificity without affecting their structures and biochemical activities. Over the last few years, a variety of promising strategies have been devised to address these challenges, including those that involve the introduction of small-size peptide tags or unnatural amino acids (UAAs), chemical labeling of specific protein residues, and affinity-driven labeling. This review summarizes recent developments in site-specific protein labeling using genetic encoding- and chemistry-based methods, and discusses future issues that need to be addressed by researchers in this field.
Keywords: Protein labeling    Site specific labeling    Chemical labeling    Enzyme-mediated labeling    Peptide tags    
1. Introduction

A significant need exists in several fields related to in vivo imaging [1-3], chemotherapeutics [4, 5], drug release [6, 7] and materials science [8] for efficient chemical and biological methods for site-specific labeling of proteins with small-size tags such as affinity probes, fluorophores and potent cytotoxins. In particular, these methods are key to the development of highly specific techniques to create third-generation antibody-drug conjugates (ADCs) that have improved therapeutic indexes owing to control over homogeneity, drug-to-antibody ratios and drug-conjugation sites [9, 10]. Moreover, specific protein labeling techniques are indispensable components of state-of-the-art microscopy techniques, including single-molecule fluorescence resonance energy transfer (smFRET) [11], stimulated emission depletion (STED) microscopy and single-molecule tracking [12], and single-molecule localization microscopy (SMLM) [13], that are utilized to elucidate molecular mechanisms of biological processes and understand complex biological systems. Furthermore, the advent of super-resolution microscopy, which enables imaging in the low nanometer range, has led to an increased demand for small-sized labeling protocols that locate fluorophores closer to sites of interest.

Fluorescent proteins (FPs), which are robust and universally applicable, have been employed in approaches for site-specific protein labeling. However, the sizes of FPs (~30 kDa) are often comparable to or even greater than those of proteins of interest (POIs). Thus, incorporation of FPs can perturb the folding, function or even localization of the POI. In addition, the limited fusion sites (usually at the N-terminus or C-terminus) and relatively poor resistance to photobleaching limit the application of this method. Overall, an ideal labeling strategy needs to have high site selectivity and use small-sized tags that minimally disrupt the function and localization of the POI. Furthermore, a high-quality technique for this purpose should be simply and rapidly executable, sufficiently versatile to accommodate a range of labeling groups, and capable of generating highly stable tagged POIs. Finally, the protein labeling method should not create a toxic burden on the cell or interfere with the key cellular process being investigated.

The criteria outlined above have spurred searches for alternative strategies for site-specific protein labeling. These efforts have led to the development of novel self-labeling peptide tags that have a wide range of applications for monitoring local protein conformational changes. Another new method utilizes enzyme-mediated peptide tagging for selective introduction of desired functional groups at specific positions. At present, the smallest tags available for this purpose are unnatural amino acids (UAAs) incorporated by using amber stop codons or related techniques [14, 15]. Moreover, interesting and technically simple chemical labeling methods that specifically target amino acid residues in proteins have been devised and widely utilized. In addition, new affinity-driven and ligand-directed labeling protocols that are independent of tag type have been described.

In the review below, we discuss the current state and future directions of investigations targeted at new protocols for site-specific protein labeling. As this field continues to develop, we anticipate that the summary will provide insight that will facilitate the development of approaches that are optimal for specific applications.

2. Site-specific protein labeling based on genetic encoding

To date, the most widely used strategy for protein labeling relies on peptide or enzyme tags [16, 17]. The fusion of POIs with engineered, self-modifying enzymes such as SNAP- [17, 18], CLIP- [19] or Halo-tags [20] significantly extends the scope of customized labeling by enabling incorporation of reporter units with unique spectroscopic signatures. However, labeling with these intact proteins results in a massive increase in size (20–40 kDa), which can substantially perturb the function of the POI and restricts fusion sites to N- or C-terminal positions. Genetic fusion with small-size, self-labeling tags and enzyme-mediated peptide tags is an attractive and straightforward alternative approach for protein labeling, and it enables introduction of a wide variety of reporter groups. Moreover, the amber stop codon or related techniques can be used in association with UAAs to introduce very small labels. In this section, we focus on progress made in methods that introduce labeling sites by genetic encoding, with special emphasis on applications.

2.1. Self-labeling peptide tags

Self-labeling peptide tags contain unique sequences and can be categorized as metal ion-dependent or small molecule-dependent. For metal ion-containing labeling systems, selectivity is ensured through the mutual recognition between electron-deficient metal ions and electron-rich ligands. The other type of self-labeling peptide tag relies on enhanced reactivity between specific amino acids and reagents, which is mediated by the peptide sequences themselves. The small sizes (0.6–6 kDa) and flexible incorporation profiles of these tags are advantageous because they minimize disturbance of the structures and properties of POIs.

One of the shortest peptide tags of this type comprises the tetracysteine (TC)-containing motif CCXXCC (preferentially CCPGCC) [16]. The TC-tag is genetically fused to the terminal or internal positions of the protein, where it is specifically recognized by and covalently bonds to a membrane-permeable fluorescein derivative containing two As(Ⅲ) substituents, fluorescein arsenical hairpin binder (FlAsH) (Fig. S1A in Supporting information). In subsequent studies, new ligands, such as the biarsenical resorufin arsenical hairpin binder (ReAsH) (Fig. S1B in Supporting information) and the rhodamine-derived bisboronic acid reagent (RhoBo) (Fig. S1C in Supporting information), as well as new TC motifs, were developed for protein labeling in live cells [21]. However, the cytotoxicity of the arsenic probes, along with the uncertainty that arises when thiol compounds disrupt the interaction between TC and arsenic, limits applications of this strategy.

To circumvent this drawback, Chen et al. developed a fluorescein-based protein marker (FL-DT) containing bis(2-cyclopentenone) [22]. FL-DT reacts rapidly with a specific di-cysteine (Cys)-containing peptide tag (CPGC) via a Michael addition process (Fig. 1A). This strategy has been employed to determine the total amount of protein expressed in an Escherichia coli (E. coli) system in a rapid and highly specific manner.

Fig. 1. Self-labeling peptide tags. (A) Biscysteine-FL-DT tag. (B) DBCO tag. (C) π-clamp tag. (D) Hexahistidine-trisNTA tag.

Another type of labeling method accomplishes site-specific labeling by modulating the reactivity of peptides. For example, Pentelute et al. designed a seven-residue peptide tag (DBCO-tag, LCYPWVY), which selectively reacts with aza-dibenzocyclooctyne (DBCO) probes through thiol-yne coupling (Fig. 1B) [23]. Fusion of DBCO-tag with POIs enables site-selective conjugation with fluorescent probes, affinity tags and cytotoxic drug molecules.

Recently, Pentelute et al. screened a peptide library and identified the π-clamp sequence (FCPF) [24, 25], wherein the reactivity of a Cys thiol is fine-tuned to facilitate site-specific conjugation with perfluoroaromatic reagents. Building upon this strategy, Bernardes and his team devised a homobifunctional linking approach for the incorporation of perfluoroaromatic reagents [26]. This approach involves the use of pentafluorophenyl (PFP) sulfonamide as a linker to couple with the Cys residue within the π-clamp sequence, thereby enabling the site-specific production of either homodimeric or heterodimeric protein complexes (Fig. 1C). In addition, the π-clamp Cys residue was found to exhibit site-selectivity towards a Cys residue in Asp-Cys-Glu in the second protein domain. This method is expected to be applicable to the preparation of bivalent, biparatopic and bispecific ADCs as potentially powerful therapeutics.

His-tag labels, consisting of 6–12 histidine residues, are commonly used for purifying and detecting recombinant proteins by binding to metal chelators like nickel-nitrilotriacetic acid (Ni-NTA) [27, 28]. Because this interaction has a low affinity and a fast off-rate, only transient labeling occurs. To enhance affinity, multivalent chelator head groups (MCHs) such as trivalent NTAs (trisNTA) have been developed [29, 30]. In a study by Tampé and colleagues, cyclic trisNTA showed the highest affinity and stability for labeling His-tagged proteins compared to linear and dendritic forms (Fig. 1D) [31]. The authors developed a new labeling method using glyoxal, which allows precise coupling of His-tagged proteins with fluorescent cyclic trisNTA at picomolar concentrations, providing a high-affinity, non-interfering labeling technique that aids in understanding cell structure and dynamics.
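The tag sequences quoted in this section are short enough to be located in a construct by simple pattern matching. The following is a minimal sketch of such a check, not a tool from the cited work: the function name `find_tags`, the reading of "X" as "any residue", and the demo construct are all assumptions made for illustration.

```python
import re

# Tag motifs quoted in this section. "X" in CCXXCC is read here as
# "any residue" (an assumption); "." is the regex wildcard for that.
TAG_PATTERNS = {
    "tetracysteine (CCXXCC)": r"CC..CC",
    "di-cysteine (CPGC)": r"CPGC",
    "DBCO-tag (LCYPWVY)": r"LCYPWVY",
    "pi-clamp (FCPF)": r"FCPF",
    "His-tag (6-12 His)": r"H{6,12}",
}

def find_tags(sequence):
    """Return {tag name: [(start, matched text), ...]} using 0-based positions."""
    hits = {}
    for name, pattern in TAG_PATTERNS.items():
        found = [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]
        if found:
            hits[name] = found
    return hits

# Hypothetical construct: N-terminal His6, internal pi-clamp, C-terminal CCPGCC.
demo = "MHHHHHHGSAKTFCPFGSLEDKAGSCCPGCC"
for name, found in find_tags(demo).items():
    print(name, found)
```

Because CPGC is a substring of CCPGCC, the di-cysteine pattern also reports a hit inside the tetracysteine tag; a real screening script would need to resolve such overlapping matches.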

2.2. Enzyme-mediated peptide-labeling tags

Enzymes are powerful tools for protein labeling owing to their high specificities and mild reaction conditions. Enzyme-mediated peptide tagging refers to the use of short peptide sequences that serve as substrates for enzyme-catalyzed modification. In the enzymatic labeling approach, a recombinantly expressed target protein, equipped with a peptide recognition motif (4–15 amino acids), binds an appropriate enzyme (such as a transferase or ligase). The bound enzyme then promotes a reaction that leads to covalent attachment of a functionalized substrate to a specific amino acid in the POI. Compared with fluorescence experiments utilizing self-labeling peptide tags, those that employ the enzyme-mediated labeling strategy to introduce fluorophores are not limited by background signals arising from non-specific fluorophore binding or by the cytotoxicity associated with arsenic-containing reagents. A panel of enzymes has been exploited to covalently attach functionalized probes to amino acids within recognition tags. The relevant enzyme methods are discussed in Section 2.2 in Supporting information and summarized in Table S1 (Supporting information).

Over the years, sortase-mediated ligation has proven to be a useful protein conjugation technique [32-35]. Sortases are grouped into six different families (A–F) according to their structures and substrate identities. The most extensively studied enzyme of this type is Staphylococcus aureus Sortase A (SaSrtA), which is a Ca²⁺-dependent transpeptidase that cleaves the amide bond between the threonine and glycine (Gly) residues of proteins with C-terminal LPXTG recognition motifs. Recently, Clubb et al. reported that the pilus-specific sortase from Corynebacterium diphtheriae (CdSrtA) can be used to link a peptide to a pilin motif (PM) fusion protein via a specific lysine (Lys, K)-isopeptide bond [36]. Rational mutagenesis created CdSrtA3M, a highly activated Cys transpeptidase that catalyzes in vitro isopeptide bond formation (Fig. 2A), crosslinking adjacent SpaA proteins through their N- and C-terminal domains with peptide modification yields greater than 95%.
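The cleavage position in the LPXTG motif can be made concrete with a string-level sketch. It only tracks sequences, not chemistry or kinetics. The source states that SaSrtA cleaves between Thr and Gly; the joining of the LPXT-terminated fragment to a nucleophile beginning with Gly is the commonly described outcome of sortase-mediated ligation and is treated here as an assumption, as are the function name `sortase_ligate` and both example sequences.

```python
import re

def sortase_ligate(substrate, nucleophile):
    """String-level model of SaSrtA transpeptidation (illustrative only).

    The LPXTG motif is cut between Thr and Gly; the fragment ending in LPXT
    is joined to a nucleophile that is assumed to start with Gly.
    Returns (ligation product, released C-terminal fragment).
    """
    matches = list(re.finditer(r"LP.TG", substrate))
    if not matches:
        raise ValueError("no LPXTG motif found")
    if not nucleophile.startswith("G"):
        raise ValueError("nucleophile is assumed to start with Gly")
    motif = matches[-1]          # use the motif closest to the C-terminus
    cut = motif.start() + 4      # index of the Gly in LPXTG
    return substrate[:cut] + nucleophile, substrate[cut:]

# Hypothetical POI with a C-terminal LPETG motif followed by a His6 tag,
# and a hypothetical GGGK probe peptide as the nucleophile.
product, released = sortase_ligate("MKTAYIAKQRLPETGGHHHHHH", "GGGK")
print(product)   # MKTAYIAKQRLPETGGGK
print(released)  # GGHHHHHH
```

In this model the His6 tag leaves with the released fragment, which is one reason constructs are often designed with the purification tag downstream of the recognition motif.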

Fig. 2. Examples of labeling methods based on enzyme-mediated peptide-labeling tags. (A) Strategy for linking peptides to proteins via formation of a specific Lys-isopeptide bond using CdSrtA3M. (B) Strategy for utilizing engineered OaAEP1 [C247A] to irreversibly incorporate diverse, commercially available amines at C-terminal asparagines. (C) Strategy for involving LACE to site-specifically label folded proteins at internal Lys residues. (D) Strategy for using OaAEP1 to site-specifically label target proteins with an isopeptide-linked glycylglycine moiety.

While sortase-mediated ligation has proven to be a valuable tool for protein labeling, recent advances have also highlighted the potential of other ligase classes, particularly asparaginyl endopeptidases (AEPs), for similar applications due to their exceptional catalytic efficiencies and minimal (-NGL-) recognition elements [37]. The engineered Oldenlandia affinis AEP (OaAEP1) [C247A], produced in E. coli [38], is highly specific for its cognate recognition sequence yet displays inherent promiscuity toward nucleophilic peptides [39]. It enables site-specific sequential ligation without unwanted reactions associated with recognition of a reconstituted ligation product [39]. Ploegh et al. used a promiscuous asparaginyl ligase to modify peptides and proteins at a C-terminal asparagine through reactions with diverse, commercially available nonpeptidyl amines (Fig. 2B) [40], achieving incorporation levels over 90% with low enzyme quantities (0.002 equiv.). This approach facilitates three- to four-component reactions with a minimal (single-residue) footprint in the final product and allows for straightforward dual labeling of protein termini by leveraging the orthogonality of transpeptidation and aminolysis reactions.

Many current protein labeling protocols focus on terminal positions and often use non-peptidic metabolites or large domains. In contrast, methods that modify internal sites offer more flexibility in terms of the locations and number of modifications. In this regard, Bode developed a chemoenzymatic method, referred to as Lys acylation using conjugating enzymes (LACE), to site-specifically modify folded proteins at internal Lys residues [41]. LACE utilizes a minimal genetically encoded tag (four residues, IKXE) recognized by the E2 small ubiquitin-like modifier (SUMO)-conjugating enzyme Ubc9 (Ube2I), and the smallest tag requires only two specific residues adjacent to the acceptor Lys. Ubc9 recognizes and modifies Lys embedded in a consensus SUMOylation motif, even in the absence of E3 ligases (Fig. 2C). This approach enables isopeptide formation using just Ubc9 in a programmable manner without requiring E1 and E3 enzymes.

In another design, Lang and co-workers combined OaAEP1-mediated transpeptidation with genetic code expansion to site-specifically incorporate an isopeptide-linked glycylglycine moiety (GGisoK) into POIs (Fig. 2D) [42]. It was shown that OaAEP1 enables site-specific internal labeling and user-defined conjugation of GGisoK-bearing POIs with a variety of NGL-bearing probes and proteins, leaving a minimal peptidic footprint (N-terminal Gly-glyoxamide (NGG)) in the ligation product. OaAEP1 labeling can create novel protein architectures and conjugates with a native isopeptide bond, such as linking ubiquitin (Ub) or Ub-like (Ubl) proteins to POIs with just one point mutation needed in the linker. This approach complements Bode's method [41], which requires up to three point mutations for Ubc9-mediated ubiquitylation.

2.3. Incorporation of UAAs and bioorthogonal labeling of proteins

The emergence of robust methods to expand the genetic code has enabled the site-specific incorporation of UAAs into proteins while reducing functional disruptions. Amber codon suppression employs orthogonal translational machinery to achieve this [43]. This protocol involves using an engineered aminoacyl-tRNA synthetase (aaRS)/tRNA pair to incorporate UAAs at specific stop codons (mainly UAG, UAA, UGA). Efficient variants such as NES PylRS and designed tRNAPyl variants have been created for this purpose (Fig. S2A in Supporting information) [44]. This strategy has been used to label various proteins, including cytoskeletal and viral proteins, and has allowed the genetic encoding of over 200 distinct UAAs in both prokaryotes and eukaryotes [45, 46]. These UAAs include fluorescent UAAs (discussed in Section 2.3.1 in Supporting information) [47-49], clickable UAAs with bioorthogonal handles [50, 51], photo-responsive UAAs for regulation of protein function by light [52] and post-translationally modified (PTM) UAAs [53, 54].
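The logic of amber suppression can be illustrated with a toy translation routine in which the amber codon (UAG, written TAG in DNA letters) is read as a UAA instead of a stop. This is a minimal sketch rather than a model of any specific aaRS/tRNA system: the symbol "Z" for the UAA, the function name `translate`, and the example coding sequence are assumptions, and suppression is treated as 100% efficient.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code in DNA letters (T in place of U); "*" marks a stop.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds, amber_suppression=False, uaa_symbol="Z"):
    """Translate a coding sequence; optionally decode amber (TAG) as a UAA."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon == "TAG" and amber_suppression:
            protein.append(uaa_symbol)  # orthogonal pair reads through amber
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break                       # ordinary termination
        protein.append(residue)
    return "".join(protein)

# Hypothetical gene: Met-Ala-(amber)-Lys-Cys followed by an ochre stop (TAA).
cds = "ATGGCTTAGAAATGCTAA"
print(translate(cds))                          # MA
print(translate(cds, amber_suppression=True))  # MAZKC
```

Without the orthogonal pair the amber codon terminates translation and a truncated product results; with it, the full-length protein carries the UAA at the chosen position, while the ochre codon still ends translation.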

The combination of genetically encoded UAAs and bioorthogonal reactions constitutes a powerful tool for protein labeling. The approach provides precise control over the labeling site at virtually any desired location in POIs, with a wide selection of fluorescent reporters [55]. To exploit these advantages for designing methods for protein labeling in living systems, it is important to develop UAAs that carry bioorthogonal functional groups (e.g., azide, alkyne, alkene, tetrazine), which, once incorporated into the target protein, can be modified using a diverse array of conjugation reactions. The most common types of bioorthogonal reactions designed for this purpose and related information are provided in Section 2.3.2 (Supporting information).

Recently, the superiority of this strategy has been demonstrated by its application to the design of various site-specifically-modified protein therapeutics. For example, receptor-biased PEGylation of azide-bearing amino acid incorporated interleukin-2 via a copper-free click reaction has found utility in the development of a therapeutic for autoimmune diseases and for antitumor therapy [56].

Extracellular labeling of transmembrane AMPA receptor (AMPAR) regulatory proteins in living neurons using click chemistry also highlights the high potential of this conjugation method [57]. In one effort, Spiegel et al. combined incorporation of UAAs with ultrafast bioorthogonal strain-promoted inverse electron demand Diels-Alder cycloaddition (SPIEDAC) reactions in the design of a minimally invasive approach for protein labeling in living primary neurons [58]. The versatility of this approach was demonstrated by its use in advanced microscopy studies of fixed and living neurons, dual-color pulse–chase click labeling and super-resolution imaging of neurofilament light chain (NFL), the smallest of the three major neurofilament subunits. Incorporation of UAAs in mammalian cells has provided opportunities to perform direct super-resolved imaging of native human proteins and smFRET measurements [59-61]. Liu et al. established an RNA editing mediated non-canonical amino acid protein labeling (RENNAPT) system, integrating RNA editing with UAA techniques for site-specific labeling in living cells. This system uses trans-cyclooctene lysine (TCO-K), which reacts with SiR dye for super-resolution imaging [62]. The resulting protein and tracker showed high colocalization under the live-cell high intelligent and sensitive structured illumination microscope (HIS-SIM), confirming the high specificity of the RENNAPT system.

Advanced methods for protein labeling with UAAs have been extended to grafting two or more functional groups onto proteins as part of personalized and precision medicine applications. Recently, Liu et al. described a method for producing protein dual and multiconjugates by site-specific incorporation of UAAs containing mutually orthogonal and bioorthogonal azide and tetrazine reaction handles [63]. This strategy allows simultaneous introduction of different functional groups (such as fluorescent probes or drugs) into proteins under physiological conditions, enhancing the development of protein-based therapeutics. In addition to the introduction of two UAAs with different functional groups to achieve dual labeling, sequential dual labeling of the same UAA has also seen a breakthrough. As an example, Liu and co-workers identified and characterized a novel host-guest pair between synthetic naphthotubes and a UAA containing phenyltetrazine derivatives [64]. Naphthotubes recognize and bind phenyltetrazine with low micromolar to submolar binding affinity, while competitive small molecules with higher affinity can restore the reactivity of phenyltetrazine, thereby achieving reversible regulation. First, the exposed tetrazine residues are complexed within the trans naphthalene tube, so that the semi-buried tetrazine residue undergoes the inverse-electron-demand Diels–Alder (IEDDA) reaction with the marker. Subsequently, upon addition of the competitive guest, the trans naphthalene tube binds this guest with higher affinity, releasing the exposed tetrazine amino acids to participate in a new IEDDA reaction. This strategy has been applied to protein site modification and labeling in living cells and mammals, as well as the preparation of long-acting fluorescent antibodies and fluorescent ADCs.

The presented examples show that currently available labeling methods based on peptide tags and genetically encoded UAAs have already had a considerable impact on the field of biocatalysis. Owing to significant advances in reducing label size and in improving and developing bioorthogonal chemistry, these strategies are now widely used for site-specific labeling with outstandingly rapid reaction kinetics in a variety of biological contexts, e.g., test tubes, cells, tissues and animals. However, they can involve extensive modification of the protein expression conditions to generate the modified substrate protein, so that the number of variables in the already complex problem of enzyme design increases. Moreover, these methods are laborious and require specifically engineered cells as well as a wide selection of binding sites [65].

Site-specific protein labeling that relies on genetic encoding and the incorporation of peptide tags is a prevalent technique. Self-labeling peptide tags combine induced labeling with small tag sizes, minimizing disruptions to receptors. Furthermore, enzymatic reactions, characterized by inherent specificity and typically mild reaction conditions, offer another powerful auxiliary tool in this process. However, many challenges still need to be addressed. For example, some labeling procedures or labeling reactions are incompatible with living systems, which hinders their use for labeling and imaging POIs in their native cellular environment. In addition, the insufficient diversity of enzymatic reactions often restricts the variety of reporter groups that can be used.

3. Chemical labeling of native amino acid residues

Chemical labeling of proteins with minimal functional and structural perturbation and high site selectivity has proven to be exceptionally useful in developing tools to study the biological properties of proteins and to generate therapeutic protein conjugates [66-68]. Compared to methods involving UAAs or peptide sequences, selective labeling of native residues in POIs does not require tedious biological operations. This approach focuses on the reactivity, accessibility, and abundance of specific amino acid side chains, primarily targeting Cys sulfhydryl and Lys ε-amine groups. Advances have also been made in engineering other residues for specific labeling, which are discussed in Section 3.3 in Supporting information. Owing to the limited stability of biomolecules, chemical reactions employed for protein labeling must be biocompatible and proceed rapidly under mild conditions. In this section, recently reported strategies for labeling naturally occurring amino acids in proteins in a site-selective fashion with retention of structure and activity are discussed.

3.1. Site-specific Cys labeling

Targeting Cys residues, whether native or incorporated, is advantageous for selective protein labeling due to the unique nucleophilic property of the sulfhydryl group and their low abundance (only 2.3% genome-wide) on protein surfaces. Maleimides are commonly used to alkylate Cys [69-71], but the resulting thioether linkages are unstable in the presence of external thiols, potentially causing toxicity in ADCs. To address this, alternative Cys labeling strategies have emerged [72-74], one of which is selective Cys arylation. As an example, Pentelute et al. showed that arylpalladium reagents promote arylation of Cys-containing peptides and proteins (Fig. 3A) [75]. In this process, the oxidative addition complex (OAC) formed between a 1,4-dihaloarene and Pd(0) reacts with a Cys thiol to form an intermediate that undergoes oxidative addition to generate a Pd-protein OAC, which couples with various nucleophiles, including Cys-containing proteins. Notably, the process can be carried out under aqueous conditions, and the synthesized Pd-protein OACs are stable. Thus, this strategy can be employed to generate stably linked protein therapeutics.

Fig. 3. Site-specific Cys labeling methods. (A) Cys labeling via palladium-protein OAC. (B) General method for dual labeling of protein Cys residues with chlorotetrazines. (C) Substituted diethynyl phosphinates as reagents for selective thiol–thiol bioconjugation and rebridging of native disulfides and application to forming therapeutic antibodies.

In addition to labeling a single Cys residue with specific functional elements, this approach accommodates doubly reactive Cys-specific reagents to form the corresponding doubly labeled peptides or proteins. For example, Goncolves et al. utilized dichlorotetrazine as a trivalent platform to assemble Cys-linked doubly modified proteins (Fig. 3B) [76]. The process developed for this purpose takes place in two steps, the first being a highly selective reaction of a monosubstituted chlorotetrazine with the thiol of a Cys residue in the POI. Next, the tetrazine ring reacts with strained dienophiles under mild conditions to produce a doubly modified protein in a site-specific manner. This approach has been applied to dual labeling of albumin with a macrocyclic chelator for nuclear imaging and a fluorescent probe for fluorescence imaging.

Dual protein labeling often faces issues with unstable linkages under reducing conditions or in the presence of excess thiols. In a previous investigation aimed at circumventing this issue, Hackenberger et al. employed unsaturated phosphonamidates and phosphonothioates for stable Cys modification [77-79]. More recently, this group described a novel technique for selective protein labeling and disulfide rebridging that is based on bis-reactive unsaturated P(V) compounds (Fig. 3C) [80]. Diethynyl phosphinates were utilized as bisfunctional electrophiles in rapid and selective reactions with sulfhydryl groups in aqueous media. In addition, this method was found to be suitable for generating functional protein conjugates for intracellular delivery. The resulting constructs exhibit outstanding stability in the presence of excess small thiols, both in human serum and in living cells. Applying this method to trastuzumab gives a precise antibody-cargo ratio of 4 and effective targeting of HER-2 positive cells. Another labeling method based on disulfide rebridging was devised by Weil's group [81]. In this approach, a tetrazine group is installed into disulfide-containing peptides and proteins without a genetic code expansion (GCE) operation and with high rates, efficiencies and site selectivity (Fig. S3 in Supporting information).
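The value of a precise antibody-cargo ratio can be seen by comparing a homogeneous conjugate with a heterogeneous mixture that happens to have a similar average. The sketch below computes the abundance-weighted mean cargo number per antibody. The function name `average_ratio` and the heterogeneous distribution are hypothetical illustrations, not data from the cited studies.

```python
def average_ratio(distribution):
    """Abundance-weighted mean cargo number per antibody.

    `distribution` maps cargo count per antibody -> relative abundance
    (abundances need not sum to 1; they are normalized here).
    """
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return sum(n * w for n, w in distribution.items()) / total

# Homogeneous conjugate: every antibody carries exactly four cargo molecules.
rebridged = {4: 1.0}

# Hypothetical heterogeneous conjugate (illustrative numbers only).
heterogeneous = {0: 0.05, 2: 0.25, 4: 0.40, 6: 0.20, 8: 0.10}

print(f"{average_ratio(rebridged):.2f}")
print(f"{average_ratio(heterogeneous):.2f}")
```

The two averages are close, yet only 40% of the species in the hypothetical mixture actually carry four cargo molecules, which is why the average alone does not describe conjugate homogeneity.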

3.2. Site-specific Lys labeling

While the amino acid Cys contains the most nucleophilic side chain at physiological pH, some pitfalls exist in utilizing it as a residue for selective protein labeling. Specifically, Cys residues are often buried within folded proteins and can form disulfide bonds critical for stability. Modifying these bonds may disrupt protein function, and creating mutants with non-native Cys can affect folding and activity.

In contrast, Lys residues are much more abundant (5.9% of all sites) and possess a versatile ε-amino group that is active in various biological processes and present at many active and allosteric sites. Thus, Lys is a routine conjugating site for protein chemical labeling, modification, and regulation of the structure and functions of proteins. Lys residues are also involved in numerous chemical pathways for catalysis by enzymes that regulate various biological processes. Together, these properties make Lys residues a desirable target for covalent protein labeling. Proteome profiling by using activated esters such as N-hydroxysuccinimide (NHS)-ester [82, 83] and STPyne [84] has identified > 9000 ligandable Lys, but the formed conjugates have poor hydrolytic stability [83]. In addition, the relatively low nucleophilicity of Lys ε-amine group also challenges the development of selective labeling methods.
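The selectivity problem posed by these abundances can be put in rough numbers. The sketch below applies the frequencies quoted in this review (2.3% for Cys, 5.9% for Lys) to a hypothetical protein length and also counts both residues in a hypothetical sequence. Treating the quoted frequencies as uniform per-residue probabilities is a simplifying assumption, and the function names are illustrative.

```python
CYS_FRACTION = 0.023  # frequency quoted for Cys in this review
LYS_FRACTION = 0.059  # frequency quoted for Lys in this review

def expected_sites(length, fraction):
    """Expected number of residues of one type, assuming uniform frequency."""
    return length * fraction

def observed_sites(sequence):
    """Count Cys (C) and Lys (K) in a one-letter protein sequence."""
    seq = sequence.upper()
    return {"Cys": seq.count("C"), "Lys": seq.count("K")}

length = 300  # hypothetical protein length
print(f"expected Cys: {expected_sites(length, CYS_FRACTION):.1f}")
print(f"expected Lys: {expected_sites(length, LYS_FRACTION):.1f}")
print(observed_sites("MKTAYIAKQRCLPETGK"))  # hypothetical sequence
```

On these assumptions a Lys-reactive reagent faces roughly two to three times as many candidate sites as a Cys-reactive one, before accessibility and local reactivity are taken into account.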

Reports of specific reagents that preferentially label one of multiple Lys residues provide a starting point for developing general protein labeling methods [82, 85-90]. For example, Rai et al. devised a linchpin-directed modification (LDMK-K) that enables precise labeling of a single Lys (Fig. 4A) [91]. This methodology uses FK1-spacer-FK2 reagents in which the functional group FK1 (o-hydroxyaldehyde core) undergoes a rapid, reversible, chemoselective reaction with all available Lys residues, and the FK2 moiety (acylating group) then reacts with a single Lys in an intramolecular and irreversible fashion. High selectivity is retained even in the presence of an N-terminal amine and up to 19 Lys residues, and across proteins with a wide range of structural complexity. Thus, engineering the relative reactivity of FK1 and FK2 can be employed to control the overall labeling process. Moreover, the FK1 moiety can be utilized for the precise and parallel introduction of NMR, affinity and fluorophore tags. In addition, the method was successfully applied to create an ADC that showed a 45% growth inhibition rate on HER-2 positive SKBR-3 cells, outperforming the commercial drug Kadcyla, and had minimal impact on HER-2 negative MDA-MB-231 cells, highlighting its specificity.

Fig. 4. Representative site-specific Lys labeling methods. (A) Labeling of a single Lys using LDMK-K technology. Rx indicates the labels from oxime derivative of the LDM reagent and x varies as per the LDM reagent. (B) Labeling of a single Lys using LDMC-K technology. (C) Strategy for Cys-directed Lys site-selective stapling and single-site labeling. (D) Strategy for stapling at Lys and Tyr or Arg with formaldehyde.

Progress has been made in labeling single sites within complex biological systems by leveraging low-frequency residues to direct the modification of high-frequency sites. One example is Cys-based chemoselective linchpin-directed site-selective modification of a Lys residue in a protein (LDMC-K) (Fig. 4B) [92]. A nitroolefin (FC) and an acylating group (FK) were utilized for chemoselective reactions with Cys and Lys residues, respectively, with tetrafluorophenoxide serving as the leaving group that is displaced by Lys. β-Lactoglobulin A (BLGA), which possesses one free Cys and fifteen Lys residues, was selected as the model protein to validate the feasibility of the method. The reaction achieved over 99% conversion, labeling only one proximal Lys while leaving the others unaffected. A sequential one-pot protocol was developed to reverse the formation of the nitroolefin thio-Michael adduct, generating aldehydes and oximes for further modifications. In a rigorous test, human serum albumin (HSA), containing 57 Lys residues, was selectively labeled in a mixture with other proteins and cell lysates. An ADC (trastuzumab-DM1) synthesized using this method showed high efficiency and specificity in inhibiting HER-2 positive breast cancer cell proliferation, significantly outperforming Kadcyla.

Cooperative stapling of Lys with another amino acid is a supplementary approach to this general labeling technology. This method, which involves a direct cyclization reaction between two adjacent side chains, using either a separate bifunctional linker or a linchpin to bridge the side chains, has proven to be particularly effective in modulating the structures and properties of peptides used for drug development [93-96]. Inspired by this, Xiong et al. developed a visible-light-driven and Cys-directed Lys site-selective stapling approach that utilizes cleavable Cys anchoring (Fig. 4C) [97]. Specifically, the reagent contains an aryl thioether group, which acts as a cleavable anchor specifically for Cys. This allows the modified Cys to either remain as an anchor or revert to its natural state. The o-nitrobenzyl alcohol group reacts with amines when activated by ultraviolet (UV) light but remains inactive during Cys labeling. This design enables selective modification at Lys sites, with the Cys anchor directing labeling preferentially to the nearest Lys residue and thereby reducing the chances of inaccurate labeling.

However, present strategies for selectivity control in the stapling of native peptides rely predominantly on Cys side chains, which poses a major challenge in controlling the reaction sequence and positional selectivity. Chen and co-workers described a new method for stapling native peptides at Lys residues using formaldehyde and the "cooperation" of nearby Tyr or arginine (Arg) residues [98]. This method facilitates control of positional selectivity for peptides with multiple reaction sites and thus broadens the applicability of stapling (Fig. 4D).

The chemical labeling of natural amino acids offers a diverse selection of approaches for targeted labeling of protein sites. To facilitate comparison, Table S2 (Supporting information) summarizes the reaction parameters associated with these methods.

3.3. Labeling N- or C-terminal residues

Although numerous reactions have been developed for targeting internal amino acid residues, they are limited by the high abundance of these residues in proteins, which makes site-specific labeling a great challenge. In contrast, almost all single-chain naturally occurring proteins possess one N- and one C-terminus. Thus, methods that target these terminal sites are versatile for comprehensive labeling and chemical modification of POIs. Also, labeling N- and C-terminal positions is less likely to interfere with the endogenous activity of the protein, and few steric or conformational restrictions exist to accessing these sites.
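The abundance argument above can be made concrete with a small counting sketch. The function name `candidate_sites` and the toy sequence are illustrative assumptions and do not come from the cited studies; the sketch also assumes a single-chain protein with free termini and ignores solvent accessibility and disulfide bonding.

```python
from collections import Counter


def candidate_sites(sequence: str) -> dict:
    """Count potential labeling sites in a single-chain protein sequence.

    Assumes one-letter amino acid codes and one free N- and C-terminus
    (hypothetical helper for illustration only).
    """
    counts = Counter(sequence.upper())
    return {
        "Lys (side-chain amine)": counts["K"],
        "Cys (side-chain thiol)": counts["C"],
        "Asp + Glu (side-chain carboxylate)": counts["D"] + counts["E"],
        "N-terminus": 1,
        "C-terminus": 1,
    }


if __name__ == "__main__":
    # Toy sequence, not a real protein of interest.
    toy = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    for site, n in candidate_sites(toy).items():
        print(f"{site}: {n}")
```

Even for a short sequence, side-chain amines and carboxylates typically outnumber the single N- and C-terminal sites, which is why terminal labeling is attractive for site specificity.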

3.3.1. Labeling N-terminal residues

N-terminal positions are commonly accessible for chemical modification because they are typically solvent exposed, and the labeling process often takes place with minimal disruption of the overall protein structure [99]. In addition, the average pKa value of the N-terminal α-ammonium group (pKa = 6–8) is substantially lower than that of Lys ε-ammonium groups (pKa = 10.5) owing to inductive effects of the nearby amide carbonyl group [100], a property which facilitates site-specific modification at these positions.
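The consequence of this pKa difference can be estimated with the Henderson-Hasselbalch relation. The following is a minimal sketch, assuming a representative N-terminal pKa of 7.0 (the midpoint of the 6–8 range quoted above) and treating the fraction of free (unprotonated) amine as a rough proxy for the reactive population; intrinsic nucleophilicity and local protein environment are not modeled.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an amine present as the neutral free base at a given pH.

    Henderson-Hasselbalch: [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PKA_N_TERMINUS = 7.0  # assumption: midpoint of the 6-8 range in the text
PKA_LYS = 10.5        # value quoted in the text

if __name__ == "__main__":
    for ph in (6.0, 7.0, 8.0):
        f_nt = fraction_deprotonated(ph, PKA_N_TERMINUS)
        f_lys = fraction_deprotonated(ph, PKA_LYS)
        print(f"pH {ph:.1f}: N-terminus {f_nt:.3f}, Lys {f_lys:.5f}, "
              f"ratio {f_nt / f_lys:.0f}")
```

Under these assumptions, at near-neutral pH a much larger fraction of the N-terminal amine than of any Lys ε-amine is in the free-base form, which is the basis of the pH-controlled N-terminal labeling strategies.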

Several strategies have been developed for N-terminal protein labeling via pH-controlled processes, such as using NHS esters or aldehydes [101-103], and incorporating click handles like azides [104] or alkynes [105]. However, these methods can require extensive optimization and may lead to off-target labeling. An alternative approach focuses on N-terminal Cys (N-Cys) residues, which are more accessible than internal Cys residues and whose modification does not disrupt essential disulfide bonds. Labeling techniques for N-Cys include native chemical ligation (NCL) [106] and a condensation reaction with 2-cyanobenzothiazole (CBT) [107], though these have limited biological applications due to low reaction rates and specific pH requirements. Recent advancements with thiazolidino boronate (TzB) complexes formed from 2-formyl-phenylboronic acid (2-FPBA) improve selectivity and stability (Fig. 5A) [108, 109]. Additionally, a rapid reaction between 2-benzylacrylaldehyde (BAA) and terminal 1,2-aminothiol groups can efficiently label large peptides and proteins, aiding in cyclic peptide synthesis (Fig. 5B) [110].

Fig. 5. Strategies for selective labeling of the N-terminal positions of proteins. N-Cys labeling via (A) 2-FPBA, (B) BAA, (C) TAMM, (D) CPO, (E) IBH.

A novel bioorthogonal reaction between 2-((alkylthio)(aryl)methylene)malononitrile (TAMM) and an N-Cys group has been developed recently for site-specific protein modification and peptide cyclization (Fig. 5C) [111]. Notably, this process has also been exploited in the construction of cyclic peptide libraries displayed on phage surfaces. Recently, the Bernardes group developed an efficient N-Cys labeling technique using cyclopropenone (CPO) reagents, which react with 1,2-aminothiols under mild conditions to form a stable 1,4-thiazepan-5-one moiety (Fig. 5D) [112]. This reaction occurs rapidly (67 L mol−1 s−1 at 37 ℃), minimizing undesired interactions with internal Cys or biological thiols. CPO reagents are compatible with reducing agents such as dithiothreitol (DTT), which is often required for maintaining the reactivity of the cysteine residue and avoiding protein dimerization by disulfide bond formation. This method has been successfully applied to site-specific N-Cys labeling and to the creation of dual protein conjugates, including the dimer of nIL2, showcasing its potential for precise bioconjugate construction.
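To give a feel for what a second-order rate constant of 67 L mol−1 s−1 means in practice, the sketch below converts it into a labeling half-life under pseudo-first-order conditions. The reagent concentrations are illustrative assumptions, not values from the cited work, and the model assumes the reagent is in large excess over the protein so that its concentration stays effectively constant.

```python
import math

K2_CPO = 67.0  # L mol^-1 s^-1 at 37 C, as quoted in the text


def half_life_s(k2: float, reagent_conc_m: float) -> float:
    """Half-life (s) of unlabeled protein under pseudo-first-order conditions.

    k_obs = k2 * [reagent]; t_1/2 = ln(2) / k_obs.
    """
    return math.log(2.0) / (k2 * reagent_conc_m)


def fraction_labeled(k2: float, reagent_conc_m: float, t_s: float) -> float:
    """Fraction of protein labeled after t_s seconds (same assumptions)."""
    return 1.0 - math.exp(-k2 * reagent_conc_m * t_s)


if __name__ == "__main__":
    # Hypothetical reagent concentrations chosen only for illustration.
    for conc in (10e-6, 100e-6, 1e-3):
        t_half = half_life_s(K2_CPO, conc)
        print(f"[reagent] = {conc * 1e6:7.0f} uM -> t1/2 = {t_half:8.1f} s")
```

Under these assumptions, the half-life scales inversely with reagent concentration, so a tenfold increase in reagent shortens the labeling half-life tenfold.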

More recently, Kalia and co-workers developed a Baylis Hillman orchestrated protein aminothiol labeling (BHoPAL) platform that uses an isatin-derived Baylis Hillman (IBH) adduct for rapid and stable 1,2-aminothiol derivatization, which is effective both in vitro and under live-cell conditions (Fig. 5E) [113]. Additionally, they developed a lipoic acid ligase-based method that allows for the introduction of a 1,2-aminothiol moiety at any site in proteins, not just at N-Cys. The approach has been successfully tested on protein mixtures and live cells, enabling simultaneous dual labeling of proteins.

Other N-terminal residues can be targets of appropriately designed reagents. This is discussed in Section 3.4.1 (Supporting information).

3.3.2. Labeling C-terminal residues

Strategies for direct and selective transformations and labeling of protein C-terminal carboxylates are underexplored [114-116]. One difficulty in labeling a POI at its C-terminus is the lower reactivity of the carboxylate group relative to that of amine groups. Furthermore, selective labeling is complicated by the presence of abundant carboxylates from aspartate and glutamate, which have similar reactivity. While some methods involve introducing reactive moieties through chemical synthesis or genetic code expansion, many existing strategies rely on amidation or thioester-forming reactions [117, 118]. Recently, photocatalytic redox-promoted decarboxylation methods have been developed, validated with small peptides, and shown to have potential use in a wide range of applications owing to the mild conditions employed to generate radical intermediates [119, 120] in numerous processes including arylation [121, 122], reduction [123, 124], allylation [125], cyanation [126], and Giese coupling [127, 128]. The latter process was employed by the MacMillan group in developing a strategy that exploits the differences in oxidation potentials of internal vs. C-terminal carboxylates to bring about C-terminal carboxylic acid targeted bioconjugation (Fig. 6A) [127]. In another effort, Waser and coworkers developed a direct decarboxylative alkynylation at the C-termini of POIs that uses hypervalent iodine reagents (Fig. 6B) [129]. Recently, Anslyn et al. successfully applied photocatalytic C-terminal decarboxylative alkylation to peptide mass spectrometry and single-molecule protein sequencing, which can be widely applied to analyze proteomes [130]. This group successfully labeled C-terminal residues in peptides from bovine serum albumin and yeast/human cell extracts using the Michael acceptor 3-methylene-2-norbornanone (NB) (Fig. 6C). Notably, this group also demonstrated the utility of the decarboxylative alkylation process in single-molecule proteomics analysis by fluorosequencing, a technique expected to gain wide use in proteomics research.

Fig. 6. Strategies for selective labeling of C-terminal positions of proteins. (A) C-terminus labeling via visible-light-mediated single-electron transfer. (B) C-terminus labeling via photoredox-catalyzed decarboxylative alkynylation using ethynylbenziodoxolone (EBX) reagents. (C) C-terminus labeling via photoredox-catalyzed decarboxylative alkylation using NB reagents.

The above examples show that chemical methods for site-specific labeling of natural amino acids have played an important role in research and development across various fields, spanning from biological imaging to pharmaceutical manufacturing. Nevertheless, a key bottleneck for the approach is the occurrence of a certain level of nonspecific reactions. Therefore, it is imperative to conduct quantitative evaluations of the specificity and bioorthogonality of these reactions to gain a deeper understanding of the factors that contribute to off-target reactions. Despite its immaturity for complex biological settings, the chemical labeling approach still holds immense potential in the pharmaceutical sector, particularly in the construction of ADCs, owing to the flexibility in chemical structures.

4. Affinity-driven and ligand-directed labeling

Although enzymatic, bio-engineering and chemical techniques have been extensively explored in the context of protein labeling and a variety of other biochemical and pharmaceutical applications, each has specific limitations, such as the requirement for complex genetic manipulation, the occurrence of heterogeneous labeling and low labeling efficiencies. In sharp contrast, affinity-guided reactions based on proximity-governed chemical processes have served as powerful strategies for site-specific labeling of native proteins and antibodies. In these methods, the POI selectively binds the ligand portion of a probe that also contains a reactive center. Interactions between the protein and ligand position the reactive moiety of the probe in the proper location for reaction with a complementary reactive center in the POI to produce a covalent adduct. The labeling process usually occurs at a position that is not involved in ligand recognition by the protein. The binding substances utilized in this approach include small molecules [131, 132], aptamers [133], peptides [134, 135] and proteins [136, 137]. To date, this method has been widely applied to site-specific antibody labeling with drugs to generate highly effective ADCs, with examples including an Fc-targeted peptide for site-specific toxin assembly [138] and the AJICAP™ method for adding thiol functional groups to immunoglobulin G (IgG) [139, 140]. Recently, Tang and Huang et al. reported a novel, traceless strategy that simplifies the synthesis of site-specific ADCs with enhanced efficiency by using reagents composed of optimized Fc targeting ligands, thioester bridges, and therapeutic payloads [141]. However, conventional affinity-driven labeling processes often lead to irreversible inhibition of the function of the protein, because the active site architecture and functionality are significantly altered.

To address this issue, Hamachi's group developed ligand-directed (LD) chemistry for labeling endogenous proteins in living systems, involving a wide range of reactive groups, which is discussed in Section 4.1 (Supporting information). Although LD chemistry methods overcome the problem of inhibition of protein function, some challenges still need to be addressed. For example, the large sizes of the reactive groups in the reagent and/or linkers hinder labeling of residues close to the active site of the POI. In addition, the residues in the protein that serve as complementary reactive centers have been identified only empirically, and thus the sites amenable to the specific types of chemistry needed for labeling are uncertain. Lastly, some of the processes employed in this approach take place slowly and yield adducts that are structurally complex and have low stability in the cellular environment.

Hence, a need exists to develop new ligand-directed chemistries using simple and small reactive groups that can be positioned in desired locations and label specific amino acids selectively. In studies aimed at this goal, London developed a covalent ligand directed release (CoLDR) site-specific labeling strategy, which enables installation of a variety of functional tags on a target protein and release of the directing ligand [142]. Previously, London demonstrated that α-substituted methacrylamides can act as electrophiles, undergoing a conjugate addition-elimination reaction with thiols to release the substituent at the α′-position [143]. In the current study, the acrylamide's position in the reagent was selected so that the targeting ligand is the α′-substituent in the Michael addition adduct. Bruton's tyrosine kinase (BTK) was used as a model protein, with its known inhibitor ibrutinib serving as the ligand for selective labeling of a noncatalytic Cys (Fig. 7A). This allows for the removal of the ligand and frees the protein's active site. Applications of this strategy include assessing BTK's half-life and degradation profile using gel fluorescence. This method still has some limitations, such as restricted universality due to the specific interaction between the ligand and receptor, and targetability limited to noncatalytic Cys residues. However, studies leading to its development have uncovered a new generation of protein proximity inducers, which is an important addition to the toolbox of chemical biology.

Fig. 7. Site-specific protein labeling through affinity-driven and ligand-directed methods. (A) Schematic representation of the reaction of a target Cys with a substituted α-methacrylamide through CoLDR chemistry. (B) Schematic representation of aptamers modified with cleavable electrophiles that provide transfer of chemical motifs to a protein.

Recently, Deiters described the first example of an aptamer containing cleavable electrophiles that serves as an affinity ligand for transfer of chemical motifs to a protein (Fig. 7B) [144]. Aptamers are short, single-stranded nucleic acids, generated through powerful selection strategies, that bind to target proteins with high specificities and affinities. For execution of the protocol, thrombin-binding aptamer (TBA) was modified to incorporate tosyl and N-acyl sulfonamide leaving groups, and biotin and coumarin were used as transferable handles to label thrombin. Protein labeling employing this strategy occurs with complete specificity in that only Lys at the aptamer binding interface is labeled, even in complex mixtures like bovine serum albumin and human plasma. This method can be extended by using a variety of aptamers and, consequently, nucleic acid-based probe technology can be employed in the modification, detection and inhibition of various proteins.

However, only a few examples of the application of affinity guided protein labeling have been presented thus far. In addition, establishment of a synthetic receptor library whose members can selectively bind to specific pockets or surfaces in a POI in complex biological environments would be of great significance in enabling use of this approach in protein research, biotechnology and drug development.

5. Conclusion and future direction

In this review, we have summarized the significant progress made in recent years in site-specific protein labeling through various biological and chemical methods. Evaluating the merits and limitations, labeling positions, effects on protein structure and function, as well as the reaction kinetics of diverse labeling techniques will aid in selecting the optimal labeling methods that meet specific needs. By leveraging these advanced technologies, various protein conjugates have been created, which can be used not only for monitoring protein transport, assembly, and conformational changes, but also for investigating protein-protein interactions, developing bioreactive materials, and constructing next-generation ADCs with enhanced pharmacokinetics and efficacy. In this regard, chemical or enzymatic methods offer precise modification of specific amino acid or sugar residues, resulting in the production of homogeneous ADCs that exhibit superior overall pharmacological characteristics compared to heterogeneous ADCs. Moreover, accurate drug-antibody ratios and coupling sites contribute to maintaining the pharmacokinetic properties of the antibody while achieving anti-tumor efficacy. They also significantly impact the stability of the ADCs, which ultimately determines the rate of drug release in the circulation and at tumor sites. However, there are still challenges to overcome, such as the instability of ADCs in blood circulation, which leads to premature release of payloads, posing potential risks. Additionally, coupling reactions require precise control and consistency to ensure optimal performance. Addressing these issues will pave the way for significant clinical and commercial success in the field of ADCs. For all the aforementioned methods, a long-term challenge that demands attention stems from the lack of specificity and efficiency in labeling. In other words, some labeling protocols or reactions are still not compatible with living systems.
Bearing this in mind, the integration of other technologies is crucial. For example, the development of gene editing tools like CRISPR/Cas9 provides powerful means for inserting tags into endogenous sites and facilitates the use of short tags as POI markers at natural expression levels in cell culture models. New labeling methods, such as protein trans-splicing technology mediated by split inteins, have the advantages of specificity, universality, and simplicity, enabling protein labeling to go beyond traditional amino acid-specific labeling and opening up new multi-site-specific protein labeling techniques. In addition, computer-simulated labeling reactions and computer-aided probe design deserve attention. By predicting the physicochemical attributes of amino acids or protein secondary structures, along with the transition state conformations of key enzymatic catalytic groups and substrates, the workload of establishing and screening a library of labeled sites is significantly reduced, and precision is enhanced. At the same time, the data gathered by combining optical microscopy with advanced measuring instruments such as liquid chromatography-tandem mass spectrometry (LC-MS/MS) should facilitate (1) the theoretical identification of specific amino acids accountable for enzymatic reactivity in native proteins, (2) in silico simulations of protein labeling reactions in environments resembling those within cells, and (3) computer-assisted rational design of molecular probes. These efforts will ultimately lay the groundwork for the development of novel labeling techniques.

We anticipate that future research on site-specific protein labeling will focus on precisely controlling the number and location of reactions to install a variety of functional entities. For broader protein labeling technology, existing research has provided a variety of chemoselective methods, and progress has been made in site-selective methods. The future direction may point to the realization of protein-specific labeling in crowded, multimolecular biosystems. For instance, endogenous protein labeling in living cells is considered an exciting challenge, albeit an extremely difficult one. Furthermore, the development of more powerful enzyme-mediated labeling schemes or more biocompatible ligation reactions will enable the entire labeling process to be carried out in living cells or even in animals. Significant efforts in these directions will trigger a paradigm shift in this field and lead to a new frontier in proteomics research. Although there are still numerous challenges to address, progress in this field is rapid, and it is realistic to expect that many of these issues will be solved in the near future.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Yiming Ma: Writing – review & editing, Writing – original draft, Resources. Yuanbo Wang: Writing – original draft. Fang Wang: Supervision. Sheng Lu: Supervision. Xiaoqiang Chen: Visualization, Validation, Supervision, Resources, Project administration.

Acknowledgments

This work is supported by the National Key R&D Program of China (No. 2021YFC2103600), the National Natural Science Foundation of China (Nos. 22278224, 22478191), and the Project of Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) and the State Key Laboratory of Materials-Oriented Chemical Engineering (No. KL21-08).

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.cclet.2024.110546.

References
[1]
A.W. Woodham, S.H. Zeigler, E.L. Zeyang, et al., Nat. Methods 17 (2020) 1025-1032. DOI:10.1038/s41592-020-0934-5
[2]
X. Li, H. Yang, Y. Teng, et al., Chin. Chem. Lett. 33 (2022) 4223-4228.
[3]
H.C. Gong, Y.H. Zhang, Y. Gao, et al., Chin. Chem. Lett. 34 (2023) 108329.
[4]
P. Strop, K. Delaria, D. Foletti, et al., Nat. Biotechnol. 33 (2015) 694-696. DOI:10.1038/nbt.3274
[5]
S. Ha, J.C. Zhu, H. Xiang, et al., Chin. Chem. Lett. 35 (2024) 109192.
[6]
P. Agarwal, C.R. Bertozzi, Bioconjug. Chem. 26 (2015) 176-192. DOI:10.1021/bc5004982
[7]
T.Q. Wang, Y.N. Fu, S.J. Sun, et al., Chin. Chem. Lett. 34 (2023) 107508.
[8]
Y.X. Chen, G. Triola, H. Waldmann, Acc. Chem. Res. 44 (2011) 762-773. DOI:10.1021/ar200046h
[9]
P.J. Carter, G.A. Lazar, Nat. Rev. Drug. Discov. 17 (2018) 197-223. DOI:10.1038/nrd.2017.227
[10]
A. Beck, L. Goetsch, C. Dumontet, Nat. Rev. Drug. Discov. 16 (2017) 315-337. DOI:10.1038/nrd.2016.268
[11]
R. Roy, S. Hohng, T. Ha, Nat. Methods 5 (2008) 507-516. DOI:10.1038/nmeth.1208
[12]
M.J. Rust, M. Bates, X. Zhuang, Nat. Methods 3 (2006) 793-796. DOI:10.1038/nmeth929
[13]
T.A. Klar, S. Jakobs, M. Dyba, et al., Proc. Natl. Acad. Sci. U. S. A. 97 (2000) 8206-8210.
[14]
J.H. Zhang, H.X. Xu, B.S. Wang, et al., Chin. Chem. Lett. 34 (2023) 107871.
[15]
Z. Dai, L.Z. Tan, Y.Y. Su, et al., Chin. Chem. Lett. 35 (2024) 109121.
[16]
B.A. Griffin, S.R. Adams, R.Y. Tsien, Science 281 (1998) 269-272.
[17]
A. Keppler, S. Gendreizig, T. Gronemeyer, et al., Nat. Biotechnol. 21 (2003) 86-89.
[18]
S. Leng, Q.L. Qiao, Y. Gao, et al., Chin. Chem. Lett. 10 (2017) 1911-1915.
[19]
A. Gautier, A. Juillerat, C. Heinis, et al., Chem. Biol. 15 (2008) 128-136.
[20]
G.V. Los, L.P. Encell, M.G. McDougall, et al., ACS Chem. Biol. 3 (2008) 373-382. DOI:10.1021/cb800025k
[21]
S.R. Adams, R.E. Campbell, L.A. Gross, et al., J. Am. Chem. Soc. 124 (2002) 6063-6076.
[22]
S.Y. Zheng, W.M. Shao, S. Lu, et al., AIChE J. 68 (2022) e17912.
[23]
C. Zhang, P. Dai, A.A. Vinogradov, et al., Angew. Chem. Int. Ed. 57 (2018) 6459-6463. DOI:10.1002/anie.201800860
[24]
C. Zhang, M. Welborn, T. Zhu, et al., Nat. Chem. 8 (2016) 120-128. DOI:10.1038/nchem.2413
[25]
P. Dai, J.K. Williams, C. Zhang, et al., Sci. Rep. 7 (2017) 7954.
[26]
R.J. Taylor, M. Aguilar Rangel, M.B. Geeson, et al., J. Am. Chem. Soc. 144 (2022) 13026-13031. DOI:10.1021/jacs.2c04747
[27]
E. Hochuli, H. Döbeli, A. Schacher, J. Chromatogr. A 411 (1987) 177-184.
[28]
S.H. Uchinomiya, H. Nonaka, S.H. Fujishima, et al., Chem. Commun. (2009) 5880-5882. DOI:10.1039/b912025d
[29]
A. Guesdon, F. Bazilel, R.M. Buey, et al., Nat. Cell. Biol. 18 (2016) 1102-1108. DOI:10.1038/ncb3412
[30]
K. Gatterdam, E.F. Joest, V. Gatterdam, et al., Angew. Chem. Int. Ed. 57 (2018) 12395-12399. DOI:10.1002/anie.201802746
[31]
V. Glembockyte, R. Wieneke, K. Gatterdam, et al., J. Am. Chem. Soc. 140 (2018) 11006-11012. DOI:10.1021/jacs.8b04681
[32]
M. Cong, S. Tavakolpour, L. Berland, et al., Bioconjug. Chem. 32 (2021) 2397-2406. DOI:10.1021/acs.bioconjchem.1c00442
[33]
H.E. Morgan, Z.L.P. Arnott, T.P. Kaminski, et al., Bioconjug. Chem. 33 (2022) 2341-2347. DOI:10.1021/acs.bioconjchem.2c00411
[34]
T. Tanaka, T. Yamamoto, S. Tsukiji, et al., ChemBioChem 9 (2008) 802-807. DOI:10.1002/cbic.200700614
[35]
C. Zuo, R. Ding, X. Wu, et al., Angew. Chem. Int. Ed. 61 (2022) e202201887.
[36]
S.A. McConnell, B.R. Amer, J. Muroski, et al., J. Am. Chem. Soc. 140 (2018) 8420-8423. DOI:10.1021/jacs.8b05200
[37]
R. Yang, Y.H. Wong, G.K.T. Nguyen, et al., J. Am. Chem. Soc. 139 (2017) 5351-5358. DOI:10.1021/jacs.6b12637
[38]
K.S. Harris, T. Durek, Q. Kaas, et al., Nat. Commun. 6 (2015) 10199.
[39]
F.B.H. Rehm, T.J. Harmand, K. Yap, et al., J. Am. Chem. Soc. 141 (2019) 17388-17393. DOI:10.1021/jacs.9b09166
[40]
F.B.H. Rehm, T.J. Tyler, K. Yap, et al., J. Am. Chem. Soc. 143 (2021) 19498-19504. DOI:10.1021/jacs.1c08976
[41]
R. Hofmann, G. Akimoto, T.G. Wucherpfennig, et al., Nat. Chem. 12 (2020) 1008-1015. DOI:10.1038/s41557-020-0528-y
[42]
M. Fottner, J. Heimgärtner, M. Gantz, et al., J. Am. Chem. Soc. 144 (2022) 13118-13126. DOI:10.1021/jacs.2c02191
[43]
L. Wang, A. Brock, B. Herberich, et al., Science 292 (2001) 498-500. DOI:10.1126/science.1060077
[44]
I. Nikic, G.E. Girona, J.H. Kang, et al., Angew. Chem. Int. Ed. 55 (2016) 16172-16176. DOI:10.1002/anie.201608284
[45]
A. Dumas, L. Lercher, C.D. Spicer, et al., Chem. Sci. 6 (2015) 50-69.
[46]
C.C. Liu, A.V. Mack, M.L. Tsao, et al., Proc. Natl. Acad. Sci. U. S. A. 105 (2008) 17688-17693. DOI:10.1073/pnas.0809543105
[47]
J. Wang, J. Xie, P.G. Schultz, J. Am. Chem. Soc. 128 (2006) 8738-8739. DOI:10.1021/ja062666k
[48]
L. Wang, A. Brock, P.G. Schultz, J. Am. Chem. Soc. 124 (2002) 1836-1837.
[49]
P. Cheruku, J.H. Huang, H.J. Yen, et al., Chem. Sci. 6 (2015) 1150-1158.
[50]
C.S. McKay, M.G. Finn, Chem. Biol. 21 (2014) 1075-1101.
[51]
K. Lang, L. Davis, J.W. Chin, Methods Mol. Biol. 1266 (2015) 217-228. DOI:10.1007/978-1-4939-2272-7_15
[52]
A.C. Kneuttinger, K. Straub, P. Bittner, et al., Cell Chem. Biol. 26 (2019) 1501-1514.
[53]
J. Zang, Y. Chen, C. Liu, et al., Nat. Struct. Mol. Biol. 30 (2023) 62-71. DOI:10.1038/s41594-022-00866-9
[54]
H. Xiao, W. Xuan, S. Shao, et al., ACS Chem. Biol. 10 (2015) 1599-1603. DOI:10.1021/cb501055h
[55]
T. Peng, H.C. Hang, J. Am. Chem. Soc. 138 (2016) 14423-14433. DOI:10.1021/jacs.6b08733
[56]
Y.X. Li, Y.Y. Su, H.Y. Wang, et al., J. Am. Chem. Soc. 146 (2024) 26884-26896. DOI:10.1021/jacs.4c07958
[57]
D. Bessa-Neto, G. Beliu, A. Kuhlemann, et al., Nat. Commun. 12 (2021) 6715.
[58]
A. Arsic, C. Hagemann, N. Stajkovic, et al., Nat. Commun. 13 (2022) 314.
[59]
W. Liu, A. Brock, S. Chen, et al., Nat. Methods 4 (2007) 239-244. DOI:10.1038/nmeth1016
[60]
R. Serfling, I. Coin, Methods Enzymol. 580 (2016) 89-107.
[61]
R. Serfling, C. Lorenz, M. Ezel, et al., Nucleic Acids Res. 46 (2018) 1-10. DOI:10.1093/nar/gkx1156
[62]
M. Hao, X.Y. Ling, et al., Nat. Chem. Biol. 20 (2024) 721-731. DOI:10.1038/s41589-023-01533-w
[63]
Y. Wang, J. Zhang, B. Han, et al., Nat. Commun. 14 (2023) 974. DOI:10.2991/978-94-6463-034-3_100
[64]
W.B. Cao, H.Y. Wang, M. Quan, et al., Chem 9 (2023) 2881-2901.
[65]
K. Lang, J.W. Chin, Chem. Rev. 114 (2014) 4764-4806. DOI:10.1021/cr400355w
[66]
Y. Wang, Z. Li, F. Mo, et al., Chem. Soc. Rev. 52 (2023) 1068-1102. DOI:10.1039/d2cs00142j
[67]
T. Tamura, I. Hamachi, J. Am. Chem. Soc. 141 (2019) 2782-2799. DOI:10.1021/jacs.8b11747
[68]
P. Gao, J.Y. Chen, P. Sun, et al., Chin. Chem. Lett. 34 (2023) 108296.
[69]
B.Q. Shen, K. Xu, L. Liu, et al., Nat. Biotechnol. 30 (2012) 184-189. DOI:10.1038/nbt.2108
[70]
S.B. Gunnoo, A. Madder, ChemBioChem 17 (2016) 529-553. DOI:10.1002/cbic.201500667
[71]
J.M.J.M. Ravasco, H. Faustino, A. Trindade, et al., Chem. Eur. J. 25 (2019) 43-59. DOI:10.1002/chem.201803174
[72]
V. Laserna, A. Istrate, K. Kafuta, et al., Bioconjug. Chem. 32 (2021) 1570-1575. DOI:10.1021/acs.bioconjchem.1c00317
[73]
N. Forte, M. Livanos, E. Miranda, et al., Bioconjug. Chem. 29 (2018) 486-492. DOI:10.1021/acs.bioconjchem.7b00795
[74]
E.A. Hull, M. Livanos, E. Miranda, et al., Bioconjug. Chem. 25 (2014) 1395-1401. DOI:10.1021/bc5002467
[75]
H.H. Dhanjee, A. Saebi, I. Buslov, et al., J. Am. Chem. Soc. 142 (2020) 9124-9129. DOI:10.1021/jacs.0c03143
[76]
C. Canovas, M. Moreau, C. Bernhard, et al., Angew. Chem. Int. Ed. 57 (2018) 10646-10650. DOI:10.1002/anie.201806053
[77]
A.L. Baumann, S. Schwagerus, K. Broi, J. Am. Chem. Soc. 142 (2020) 9544-9552. DOI:10.1021/jacs.0c03426
[78]
M.A. Kasper, M. Glanz, A. Stengl, et al., Angew. Chem. Int. Ed. 58 (2019) 11625-11630. DOI:10.1002/anie.201814715
[79]
M.A. Kasper, M. Glanz, A. Oder, et al., Chem. Sci. 10 (2019) 6322-6329. DOI:10.1039/c9sc01345h
[80]
C.E. Stieger, L. Franz, F. Korlin, et al., Angew. Chem. Int. Ed. 60 (2021) 15359-15364. DOI:10.1002/anie.202100683
[81]
L. Xu, M. Raabe, M.M.M. Zegota, et al., Org. Biomol. Chem. 18 (2020) 1140-1147. DOI:10.1039/c9ob02687h
[82]
C.C. Ward, J.I. Kleinman, D.K. Nomura, ACS Chem. Biol. 12 (2017) 1478-1483. DOI:10.1021/acschembio.7b00125
[83]
M.J. Matos, B.L. Oliveira, N. Martínez-Sáez, et al., J. Am. Chem. Soc. 140 (2018) 4004-4017. DOI:10.1021/jacs.7b12874
[84]
S.M. Hacker, K.M. Backus, M.R. Lazear, et al., Nat. Chem. 9 (2017) 1181-1190. DOI:10.1038/nchem.2826
[85]
M. Chilamari, L. Purushottam, V. Rai, Chem. Eur. J. 23 (2017) 3819-3823. DOI:10.1002/chem.201605938
[86]
M. Chilamari, N. Kalra, S. Shukla, et al., Chem. Commun. 54 (2018) 7302-7305. DOI:10.1039/c8cc03311k
[87]
G.H. Pham, W. Ou, B. Bursulaya, et al., ChemBioChem 19 (2018) 799-804. DOI:10.1002/cbic.201700611
[88]
S.M. Sarrett, C. Rodriguez, G. Rymarczyk, et al., Bioconjug. Chem. 33 (2022) 1750-1760. DOI:10.1021/acs.bioconjchem.2c00354
[89]
D. Hwang, K. Tsuji, H. Park, et al., Bioconjug. Chem. 30 (2019) 2889-2896. DOI:10.1021/acs.bioconjchem.9b00609
[90]
L.H. Liu, R. Chen, G. Xue, et al., Chin. Chem. Lett. 35 (2024) 108455.
[91]
S.R. Adusumalli, D.G. Rawale, K. Thakur, et al., Angew. Chem. Int. Ed. 59 (2020) 10332-10336. DOI:10.1002/anie.202000062
[92]
N.C. Reddy, R. Molla, P.N. Joshi, et al., Nat. Commun. 13 (2022) 6038.
[93]
J. Ceballos, E. Grinhagena, G. Sangouard, et al., Angew. Chem. Int. Ed. 60 (2021) 9022-9031. DOI:10.1002/anie.202014511
[94]
M. Todorovic, K.D. Schwab, J. Zeisler, et al., Angew. Chem. Int. Ed. 58 (2019) 14120-14124. DOI:10.1002/anie.201906514
[95]
K. Kubota, P. Dai, B.L. Pentelute, et al., J. Am. Chem. Soc. 140 (2018) 3128-3133. DOI:10.1021/jacs.8b00172
[96]
Y. Zhang, Q. Zhang, C.T.T. Wong, et al., J. Am. Chem. Soc. 141 (2019) 12274-12279. DOI:10.1021/jacs.9b03623
[97]
J. Li, Q.L. Hu, Z. Song, et al., Sci. China Chem. 65 (2022) 1356-1361.
[98]
B. Li, H. Tang, A. Turlik, et al., Angew. Chem. Int. Ed. 60 (2021) 6646-6652. DOI:10.1002/anie.202016267
[99]
E. Jacob, R. Unger, Bioinformatics 23 (2007) 225-230.
[100]
C.B. Rosen, M.B. Francis, Nat. Chem. Biol. 13 (2017) 697-705. DOI:10.1038/nchembio.2416
[101]
J. Yu, D. Shen, H. Zhang, et al., Bioconjug. Chem. 29 (2018) 1016-1020. DOI:10.1021/acs.bioconjchem.8b00086
[102]
X. Shi, Y. Jung, L.J. Lin, et al., Nat. Methods 9 (2012) 499-503. DOI:10.1038/nmeth.1954
[103]
M. Djalo, M.J.S.A. Silva, H. Faustino, et al., Chem. Commun. 58 (2022) 7928-7931. DOI:10.1039/d2cc02204d
[104]
N. Inoue, A. Onoda, T. Hayashi, Bioconjug. Chem. 30 (2019) 2427-2434. DOI:10.1021/acs.bioconjchem.9b00515
[105]
H.Y. Shiu, T.C. Chan, C.M. Ho, et al., Chem. Eur. J. 15 (2009) 3839. DOI:10.1002/chem.200800669
[106]
P.E. Dawson, T.W. Muir, I. Clarklewis, et al., Science 266 (1994) 776-779. DOI:10.1126/science.7973629
[107]
H. Ren, F. Xiao, K. Zhan, et al., Angew. Chem. Int. Ed. 48 (2009) 9658-9662. DOI:10.1002/anie.200903627
[108]
A. Bandyopadhyay, S. Cambray, J.M. Gao, Chem. Sci. 7 (2016) 4589-4593.
[109]
H. Faustino, M.J.S.A. Silva, L.F. Veiros, et al., Chem. Sci. 7 (2016) 5052-5058.
[110]
Y. Wu, C. Li, S. Fan, et al., Bioconjug. Chem. 32 (2021) 2065-2072. DOI:10.1021/acs.bioconjchem.1c00378
[111]
X. Zheng, Z. Li, W. Gao, et al., J. Am. Chem. Soc. 142 (2020) 5097-5103. DOI:10.1021/jacs.9b11875
[112]
A. Istrate, M.B. Geeson, C.D. Navo, et al., J. Am. Chem. Soc. 144 (2022) 10396-10406. DOI:10.1021/jacs.2c02185
[113]
M.H. Mir, S. Parmar, C. Singh, et al., Nat. Commun. 15 (2024) 859.
[114]
B. Peschke, S. Bak, Peptides 30 (2009) 689-698.
[115]
W. Duan, G. Xu, Methods Mol. Biol. 1574 (2017) 135-144. DOI:10.1007/978-1-4939-6850-3_10
[116]
G. Xu, S.B.Y. Shin, S.R. Jaffrey, ACS Chem. Biol. 6 (2011) 1015-1020. DOI:10.1021/cb200164h
[117]
L. Yi, H. Sun, Y.W. Wu, et al., Angew. Chem. Int. Ed. 49 (2010) 9417-9421. DOI:10.1002/anie.201003834
[118]
B. Wu, H.J. Wijma, L. Song, et al., ACS Catal. 6 (2016) 5405-5414. DOI:10.1021/acscatal.6b01062
[119]
C. Bottecchia, T. Noël, Chem. Eur. J. 25 (2019) 26-42. DOI:10.1002/chem.201803074
[120]
C. Hu, Y. Chen, Tetrahedron Lett. 56 (2015) 884-888.
[121]
K. Maeda, H. Saito, K. Osaka, et al., Tetrahedron 71 (2015) 1117-1123.
[122]
A. Lipp, G. Lahm, T. Opatz, J. Org. Chem. 81 (2016) 4890-4897. DOI:10.1021/acs.joc.6b00715
[123]
C. Cassani, G. Bergonzini, C.J. Wallentin, Org. Lett. 16 (2014) 4228-4231. DOI:10.1021/ol5019294
[124]
T. Itou, Y. Yoshimi, K. Nishikawa, et al., Chem. Commun. 46 (2010) 6177-6179. DOI:10.1039/c0cc01464h
[125]
S.B. Lang, K.M. O'Nele, J.T. Douglas, et al., Chem. Eur. J. 21 (2015) 18589-18593. DOI:10.1002/chem.201503644
[126]
F. Le Vaillant, M.D. Wodrich, J. Waser, Chem. Sci. 8 (2017) 1790-1800.
[127]
S. Bloom, C. Liu, D.K. Kölmel, et al., Nat. Chem. 10 (2018) 205-211. DOI:10.1038/nchem.2888
[128]
D.C. Marcote, R. Street-Jeakings, E. Dauncey, et al., Org. Biomol. Chem. 17 (2019) 1839-1842. DOI:10.1039/c8ob02702a
[129]
M. Garreau, F. Le Vaillant, J. Waser, Angew. Chem. Int. Ed. 58 (2019) 8182-8186. DOI:10.1002/anie.201901922
[130]
L. Zhang, B.M. Floyd, M. Chilamari, et al., ACS Chem. Biol. 16 (2021) 2595-2603. DOI:10.1021/acschembio.1c00631
[131]
T. Tamura, Z. Song, K. Amaike, et al., J. Am. Chem. Soc. 139 (2017) 14181-14191. DOI:10.1021/jacs.7b07339
[132]
T. Tamura, T. Ueda, T. Goto, et al., Nat. Commun. 9 (2018) 1870.
[133]
C. Cui, H. Zhang, R. Wang, et al., Angew. Chem. Int. Ed. 56 (2017) 11954-11957. DOI:10.1002/anie.201706285
[134]
D. Yuan, Y. Zhang, K.H. Lim, et al., J. Am. Chem. Soc. 144 (2022) 18494-18503. DOI:10.1021/jacs.2c07594
[135]
T. Lee, J.H. Kim, S.J. Kwon, et al., J. Med. Chem. 65 (2022) 5751-5759. DOI:10.1021/acs.jmedchem.2c00084
[136]
E. von Witting, S. Hober, S. Kanje, Bioconjug. Chem. 32 (2021) 1515-1524.
[137]
C. Yu, J. Tang, A. Loredo, et al., Bioconjug. Chem. 29 (2018) 3522-3526. DOI:10.1021/acs.bioconjchem.8b00680
[138]
S. Kishimoto, Y. Nakashimada, R. Yokota, et al., Bioconjug. Chem. 30 (2019) 698-702. DOI:10.1021/acs.bioconjchem.8b00865
[139]
K. Yamada, N. Shikida, K. Shimbo, et al., Angew. Chem. Int. Ed. 58 (2019) 5592-5597. DOI:10.1002/anie.201814215
[140]
T. Fujii, Y. Matsuda, T. Seki, et al., Bioconjug. Chem. 34 (2023) 728-738.
[141]
Y. Zeng, W. Shi, Q. Dong, et al., Angew. Chem. Int. Ed. 61 (2022) e202204132.
[142]
R.N. Reddi, A. Rogel, E. Resnick, et al., J. Am. Chem. Soc. 143 (2021) 20095-20108. DOI:10.1021/jacs.1c06167
[143]
R.N. Reddi, E. Resnick, A. Rogel, et al., J. Am. Chem. Soc. 143 (2021) 4979-4992. DOI:10.1021/jacs.0c10644
[144]
Y. Tivon, G. Falcone, A. Deiters, Angew. Chem. Int. Ed. 60 (2021) 15899-15904. DOI:10.1002/anie.202101174