Cyclodipeptides: An Overview of Their Biosynthesis and Biological Activity
Awdhesh Kumar Mishra 1,†, Jaehyuk Choi 1,†, Seong-Jin Choi 2 and Kwang-Hyun Baek 1,*
Department of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 38541, Korea
Department of Biotechnology, Daegu Catholic University, Gyeongsan 38430, Korea
Correspondence: email: Tel.: +82-53-810-3029; Fax: +82-53-810-4769
These authors contributed equally to this work.
Received: 27 September 2017 / Accepted: 19 October 2017 / Published: 23 October 2017
Abstract: Cyclodipeptides (CDPs) represent a diverse family of small, highly stable, cyclic peptides that are produced as secondary functional metabolites or side products of protein metabolism by bacteria, fungi, and animals. They are widespread in nature and exhibit a broad variety of biological and pharmacological activities. CDP synthases (CDPSs) and non-ribosomal peptide synthetases (NRPSs) catalyze the biosynthesis of the CDP core structure, which is further modified by tailoring enzymes often associated with CDP biosynthetic gene clusters. In this review, we provide a comprehensive summary of CDP biosynthetic pathways and modifying enzymes. We also discuss the biological properties of some known CDPs and their possible applications in metabolic engineering.
Keywords: aa-tRNA synthetase; cyclodipeptides; cyclodipeptide synthase; non-ribosomal peptide synthetase; tailoring enzyme
1. Introduction
Natural peptide products are one of the most dynamic sources of medicinally significant compounds. Cyclic dipeptides or cyclodipeptides (CDPs), also called 2,5-diketopiperazines, are the smallest cyclic peptides frequently found in nature, and are mainly synthesized by microorganisms. CDPs are a class of cyclic organic compounds in which the two nitrogen atoms of a six-membered piperazine ring form amide linkages. The core structure of CDPs is a scaffold generated by the condensation of two α-amino acids. CDPs are named using the three-letter code for each of the two amino acids, plus a prefix designating the absolute configuration, e.g., cyclo(l-Xaa-l-Yaa). CDPs can adopt both cis and trans configurations, but cis configurations predominate. Various amino acid modifications confer diversified chemical and biological functions. CDPs exhibit better biological activity than their linear counterparts due to their higher stability, protease resistance, and conformational rigidity, all factors that increase their ability to interact specifically with biological targets [4,5]. They constitute a large class of secondary metabolites produced by bacteria, fungi, plants, and animals [1,2,6,7]. The available data indicate that approximately 90% of CDP producers are bacterial. CDPs and their derivatives exhibit a broad range of biological activities, such as bacterial quorum sensing, and antibacterial, antimicrobial, anticancer, and radical-scavenging properties. They have also been developed to carry biologically active molecules across the blood-brain barrier [1,8,9].
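The naming convention described above can be illustrated with a minimal Python sketch. The function name `cdp_name` and its parameters are hypothetical and introduced only for illustration; the sketch assumes that each residue is given as a three-letter code and that the configuration prefix is written in lowercase (l or d), as elsewhere in this review.

```python
def cdp_name(first, second, first_config="l", second_config="l"):
    """Build a cyclo(...) name from two three-letter residue codes.

    Hypothetical helper: it only formats a name and does not check
    that the codes correspond to real amino acids.
    """
    for code in (first, second):
        if len(code) != 3 or not code.isalpha():
            raise ValueError(f"expected a three-letter code, got {code!r}")
    for config in (first_config, second_config):
        if config not in ("l", "d"):
            raise ValueError(f"configuration must be 'l' or 'd', got {config!r}")
    # Three-letter codes are conventionally written with a leading capital.
    first, second = first.capitalize(), second.capitalize()
    return f"cyclo({first_config}-{first}-{second_config}-{second})"


print(cdp_name("His", "Pro"))                    # an all-l cyclodipeptide
print(cdp_name("Phe", "Pro", first_config="d"))  # mixed d/l configuration
```

The configuration prefixes are kept as separate arguments because the two residues need not share the same absolute configuration.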
The CDP scaffold can be synthesized either by purely chemical means, using different solid phases or reflux conditions in solution [1,10], or more naturally by biosynthetic enzymes called non-ribosomal peptide synthetases (NRPSs) and CDP synthases (CDPSs) [7,11] (Figure 1). Common chemical synthesis of CDPs includes the condensation of individual amino acids at high temperature. Dipeptides substituted with an amine at one terminus and an ester at the other can also spontaneously cyclize to form a 2,5-diketopiperazine (2,5-DKP). However, conditions must be optimized carefully in order to force the cyclization reaction and to limit racemization. This is the procedure most commonly used for the chemical synthesis of CDPs. Cyclization of amino dipeptide esters can also be carried out under thermal conditions, normally by refluxing them in high-boiling solvents such as toluene or xylene for 24 h. In addition, CDPs are often products of unwanted side reactions or degradation products of oligo- and polypeptides in processed food and beverages [2,12]. They are frequently formed during chemical degradation in products such as roasted coffee, stewed beef, and beer. Non-enzymatic processes can also lead to the formation of functional CDPs in various organisms; for example, cyclo(l-His-l-Pro) is found throughout the central nervous systems of mammals. Cyclo(l-His-l-Pro) was the first active CDP to be identified and was detected in human urine in 1965. In this review, we will highlight the CDP biosynthetic machinery and the associated modifying enzymes crucial for their biological activities.
2. Biological Mechanisms of CDP Formation
CDPs are commonly synthesized from amino acids by various organisms, including mammals, and are considered secondary functional metabolites or side products of terminal peptide cleavage. Several CDP biosynthetic pathways have been elucidated, and in general, they can be classified into non-enzymatic and enzymatic pathways.
2.1. Non-Enzymatic Pathways of CDP Formation
Cyclo(His-Pro) is an endogenous cyclic dipeptide that exists throughout the central nervous systems of various organisms, including mammals, and plays roles in a number of regulatory processes. In mammals, cyclo(His-Pro) is derived from the precursor to thyrotropin-releasing hormone (TRH, pGlu-His-Pro). The TRH precursor, called TRH-Gly (pGlu-His-Pro-Gly), is first cleaved by pyroglutamate aminopeptidase, producing His-Pro-Gly, which is then non-enzymatically cyclized to cyclo(His-Pro). The proline induces constraints that promote the cis-conformation of the peptide bond between the histidine and the proline, thereby facilitating the cyclization that generates the CDP scaffold. This mammalian CDP exerts a cytoprotective effect through NF-κB (nuclear factor kappa-light-chain-enhancer of activated B cells) and Nrf2 (nuclear factor erythroid 2-related factor 2) signaling.
2.2. Enzymatic Pathways of CDP Formation
CDPs are commonly considered to be secondary metabolites. Some protease enzymes, such as dipeptidyl peptidases, cleave the terminal ends of proteins to generate dipeptides, which can naturally cyclize to form CDPs. Two unrelated biosynthetic enzyme families catalyze the formation of CDPs: NRPSs and CDPSs.
2.2.1. NRPS-Mediated CDP Biosynthesis
CDP scaffolds can be synthesized by one or more specialized NRPSs, either through dedicated biosynthetic pathways or through the premature release of dipeptidyl intermediates during the chain elongation process. The NRPS genes for a given peptide are usually organized in one operon in prokaryotes, and in a gene cluster in eukaryotes. NRPSs are large modular enzymes, which simultaneously act as a template and as biosynthetic machinery. Each module is responsible for the incorporation of one amino acid into the final peptide, and can be further subdivided into the catalytic domains responsible for specific synthetic steps during peptide synthesis. Each NRPS module consists of three essential domains: an adenylation (A) domain; a thiolation (T) domain, also termed the peptidyl carrier protein (PCP) domain, which is post-translationally modified with a 4′-phosphopantetheinyl (4′-Ppant) arm; and a condensation (C) domain. The domains are separated by short spacer regions of approximately 15 amino acids. The A domain selects, activates, and loads the monomer onto the PCP domain. Here, the thiol group of the 4′-Ppant arm of the T domain mediates the nucleophilic attack on the adenylated amino acid. Subsequent peptide bond formation between two adjacent T-bound aminoacyl intermediates is catalyzed by the C domains. Another essential NRPS catalytic unit is the thioesterase (TE) domain, which is located at the C-terminus and catalyzes peptide release by either hydrolysis or macrocyclization. In addition, modification domains can be integrated into NRPS modules at different locations to modify the incorporated amino acids. Epimerization and N-methyltransferase domains are examples, which catalyze the generation of D- and methylated amino acids, respectively [20,21].
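The modular logic described above can be sketched as a small data model in Python. This is a toy illustration under stated assumptions, not a biochemical simulation: the class `Module`, the function `assemble`, and the domain labels "E" (epimerization) and "MT" (N-methyltransferase) are hypothetical names chosen for the sketch, and every module is assumed to carry the A, T, and C domains named in the text.

```python
from dataclasses import dataclass

# Domains the text describes as necessary in each module.
REQUIRED_DOMAINS = frozenset({"A", "T", "C"})


@dataclass(frozen=True)
class Module:
    amino_acid: str      # monomer selected and activated by the A domain
    domains: frozenset   # domain labels present in this module

    def loaded_residue(self):
        """Return the residue as it would be presented on the T domain."""
        missing = REQUIRED_DOMAINS - self.domains
        if missing:
            raise ValueError(f"module lacks required domains: {sorted(missing)}")
        residue = self.amino_acid
        if "MT" in self.domains:   # hypothetical label for N-methyltransferase
            residue = "N-Me-" + residue
        if "E" in self.domains:    # hypothetical label for epimerization
            residue = "d-" + residue
        return residue


def assemble(modules, release="cyclization"):
    """Join residues in module order; the terminal TE domain releases the product."""
    if "TE" not in modules[-1].domains:
        raise ValueError("the last module needs a TE domain for product release")
    chain = "-".join(m.loaded_residue() for m in modules)
    return f"cyclo({chain})" if release == "cyclization" else chain


two_module_nrps = [
    Module("Phe", frozenset({"C", "A", "T", "E"})),
    Module("Pro", frozenset({"C", "A", "T", "TE"})),
]
print(assemble(two_module_nrps))
```

Modeling each module as an immutable record mirrors the idea that the order of modules acts as the template for the order of residues in the product.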
NRPSs rely not only on the 20 canonical amino acids, but also use several different building blocks, including non-proteinogenic amino acids, and this contributes to the structural diversity of non-ribosomal peptides and their differential biological activities. CDPs synthesized by NRPSs can be further modified by tailoring enzymes, usually encoded by genes clustered with the NRPS genes. The majority of known NRPS-derived CDPs are produced by fungi, whereas few bacteria are recognized as NRPS-derived CDP producers.
Many CDPs can be formed by dedicated NRPS pathways, such as brevianamide F, erythrochelin, ergotamine, roquefortine C, acetylaszonalenin, thaxtomin A, gliotoxin, and sirodesmin PL. In a few cases, CDPs can be formed by NRPSs during the synthesis of longer peptides, as truncated side products, as in the biosynthesis of cyclo(d-Phe-l-Pro) and cyclomarazine A [30,31].
2.2.2. CDPS-Mediated CDP Biosynthesis
CDPSs are a new family of tRNA-dependent peptide bond-forming enzymes that do not activate amino acids themselves but instead use pre-charged aminoacyl-tRNAs. CDPSs share a common architecture reminiscent of the catalytic domain of class-Ic aminoacyl-tRNA synthetases (aaRSs), for example TyrRS and TrpRS. Both CDPSs and class-Ic aaRSs comprise well-conserved Rossmann fold domains along with a helical connective polypeptide 1 (CP1) subdomain. However, class-Ic aaRSs possess signature motifs involved in ATP binding (HIGH and KMSKS sequences) that are not present in CDPSs. In addition, CDPSs do not possess a distinct tRNA-binding domain, but rather contain a large patch of positively charged residues located in helix α4, which are important for the binding of aminoacyl-tRNA substrates. All these observed differences between CDPSs and their ancestral aaRSs result in unique enzymes for CDP biosynthesis.
CDPSs use aminoacyl-tRNAs as substrates to catalyze the formation of CDP peptide bonds [11,33,34,35], diverting two aminoacyl-tRNAs from their essential role in ribosomal protein synthesis and catalyzing the formation of the two peptide bonds required for CDP formation. The synthesis process is initiated by the binding of the first aminoacyl substrate, likely involving ionic interactions between the negatively charged ribose–phosphate tRNA backbone and the positive charges in helix α4 [32,37]. Hence, by using aminoacyl-tRNAs as substrates, CDPSs represent a direct link between primary and secondary metabolism. The catalytic mechanism of CDPSs can be described using a ping–pong model (Figure 2). All CDPSs possess two surface-accessible pockets that contain active site residues important for substrate selection and catalysis. The different aminoacyl binding sites for the two aa-tRNA substrates are termed pocket 1 (P1) and pocket 2 (P2). Upon specific recognition of the first substrate, the first aminoacyl group is transferred to the conserved serine residue of P1. Here, interaction between the tRNA moiety and basic residues in the α4 helix generates an aminoacyl–enzyme intermediate [34,38]. In the meantime, the aminoacyl moiety of the second aa-tRNA interacts with P2 through the α6–α7 loop. Ultimately, the aminoacyl–enzyme intermediate reacts with the second aa-tRNA to generate a dipeptidyl–enzyme intermediate, which undergoes intramolecular cyclization through the involvement of a conserved tyrosine, leading to the CDP scaffold as the final product. These CDPs can also be modified by closely associated tailoring enzymes.
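The order of events in the ping–pong model can be summarized as a small state machine. The Python sketch below is a simplification under stated assumptions: the class `CDPSModel` and its method names are hypothetical, substrate selectivity in P1 and P2 is ignored, and both residues are assumed to be l-amino acids delivered by charged tRNAs.

```python
class CDPSModel:
    """Toy state machine for the ping-pong mechanism described in the text."""

    def __init__(self):
        self.state = "free enzyme"
        self.acyl_groups = []

    def bind_aminoacyl_trna(self, amino_acid):
        if self.state == "free enzyme":
            # First substrate: the aminoacyl group is transferred to the
            # conserved serine of pocket 1 (P1).
            self.state = "aminoacyl-enzyme"
        elif self.state == "aminoacyl-enzyme":
            # Second substrate (accommodated in P2) reacts with the
            # aminoacyl-enzyme intermediate.
            self.state = "dipeptidyl-enzyme"
        else:
            raise RuntimeError("the dipeptidyl-enzyme must cyclize before binding again")
        self.acyl_groups.append(amino_acid)

    def cyclize(self):
        if self.state != "dipeptidyl-enzyme":
            raise RuntimeError("cyclization requires a dipeptidyl-enzyme intermediate")
        first, second = self.acyl_groups
        # Intramolecular cyclization releases the CDP and frees the enzyme.
        self.acyl_groups = []
        self.state = "free enzyme"
        return f"cyclo(l-{first}-l-{second})"


enzyme = CDPSModel()
enzyme.bind_aminoacyl_trna("Phe")
enzyme.bind_aminoacyl_trna("Leu")
print(enzyme.cyclize())
```

The explicit state checks capture the defining feature of a ping–pong mechanism: the second substrate can react only after the first has been covalently loaded onto the enzyme.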
Approximately 163 putative CDPS genes have been identified so far, and of these, 150 are reported in bacteria, distributed among six phyla (Actinobacteria, Bacteroidetes, Chlamydiae, Cyanobacteria, Firmicutes, and Proteobacteria). Most known CDPSs are found in Actinobacteria, with 77 CDPSs reported to date. Twelve CDPSs are distributed among four eukaryotic phyla (Ascomycota, Annelida, Ciliophora, and Cnidaria), and one archaeal (Haloterrigena hispanica) CDPS has also been reported [7,11,39]. Some bacterial CDPS pathways have been fully characterized, such as those producing albonoursin in Streptomyces noursei, pulcherrimin in Bacillus subtilis, and mycocyclosin in Mycobacterium tuberculosis [11,40].
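As a quick consistency check on the counts quoted above, the following Python sketch sums the per-domain numbers and converts them to percentages. Only the counts come from the text; the variable names are illustrative.

```python
# Putative CDPS genes per domain of life, as quoted in the text.
cdps_counts = {"bacteria": 150, "eukaryotes": 12, "archaea": 1}
actinobacterial_cdps = 77

total = sum(cdps_counts.values())

# Share of each domain among all putative CDPS genes, in percent.
shares = {
    domain: round(100 * count / total, 1)
    for domain, count in cdps_counts.items()
}

# Share of Actinobacteria among the bacterial CDPSs, in percent.
actinobacteria_share = round(100 * actinobacterial_cdps / cdps_counts["bacteria"], 1)

print(total)
print(shares)
print(actinobacteria_share)
```

The three categories add up to the stated total of 163, and the bacterial share of roughly 92% is in line with the earlier statement that approximately 90% of CDP producers are bacterial.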
3. Comparison of CDPS- and NRPS-Dependent Pathways
Both the CDPS and NRPS systems are used to synthesize CDP metabolites in nature. CDPSs are small enzymes (~26 kDa), while NRPSs are large modular enzymes (>100 kDa). This size difference probably reflects the different strategies used to activate the amino acid carboxyl group required for peptide bond formation: NRPSs use A and PCP domains to recognize and activate amino acids in the form of PCP-bound aminoacyl thioesters, whereas CDPSs hijack aminoacyl-tRNAs, thereby eliminating the need to activate amino acids. The substrates of CDPSs are therefore limited to the 20 l-amino acids charged on tRNAs, whereas the range of amino acids that can be incorporated by NRPSs is much wider, including non-proteinogenic amino acids such as anthranilic acid in the synthesis of acetylaszonalenin, and 4-nitrotryptophan in thaxtomin biosynthesis [26,41]. Moreover, NRPS substrates can be altered on the enzyme by accessory domains, which introduce chemical modifications such as methylation (methylation domains, thaxtomin synthetase) or configuration changes (epimerization domains, erythrochelin synthetase), while in CDPS pathways, chemical modifications can only be introduced after CDP formation. Hence, wider structural complexity is found in CDPs synthesized via NRPS pathways (e.g., roquefortine, sirodesmin, ergotamine) than in those synthesized via CDPS pathways. Moreover, NRPS-dependent pathways are prevalent in bacteria and fungi (but have not yet been identified in plants or animals), while CDPS-dependent pathways have been identified in bacteria (Bacillus sp., Pseudomonas sp.), fungi (Gibberella zeae, Fusarium oxysporum), protozoa (Ichthyophthirius multifiliis), and animals (Nematostella vectensis, Platynereis dumerilii).
4. CDP-Tailoring Enzymes and Their Functions
Tailoring enzymes that specifically modify CDP-containing natural products are usually associated with biosynthetic enzymes. Putative tailoring enzymes that modify the initially assembled CDP scaffold can be found in almost all NRPS and CDPS gene clusters, and are responsible for installing functional groups crucial for the biological activities of CDPs. In CDPS-dependent pathways, a large variety of different modification enzymes are found in close association with the respective CDPS genes [7,11], including different types of oxidoreductases, hydrolases, transferases, and ligases. The most prevalent putative tailoring enzymes in CDPS clusters are cyclic dipeptide oxidases (CDOs). CDOs are composed of two distinct small subunits that assemble into an apparent megadalton protein complex. Depending on the substrate, the CDO can sequentially perform one or two dehydrogenation reactions. The precise reaction mechanism has not been elucidated, although three different scenarios have been proposed: direct dehydrogenation, α-hydroxylation followed by loss of water, and imine formation with subsequent rearrangement to the enamine. Other known oxidoreductases in CDPS clusters include at least seven distinct P450 enzymes, five different types of α-ketoglutarate/FeII-dependent oxygenases, and three distinct flavin-containing monooxygenases.
In addition to oxidoreductases, a large number of different C-, N-, and O-methyltransferases, α/β-hydrolases, peptide ligases, and acyl-CoA transferases have been found in CDPS gene clusters, in which transcription factors belonging to the LuxR and MarR families, among others, are also observed. These transcription factors are usually involved in regulating various processes in response to environmental stimuli such as toxic chemicals and antibiotics, which may hint at the biological functions of CDPS-dependent modified CDPs. Regarding NRPS-dependent pathways, a similar variety of modification enzymes has been reported, and again, enzymes that modulate the oxidation of the CDP scaffold and side chains are the most numerous. One distinguishing feature of fungal NRPS gene clusters is the prevalence of different prenyltransferases, which perform prenylations and reverse prenylations at various positions of tryptophan-containing CDP scaffolds. Judging by the diverse set of putative modification enzymes found within NRPS and CDPS gene clusters (Table 1), it is assumed that highly modified CDPs represent a diverse family of microbial natural products with varied functions.
Both chemical synthesis and enzyme-catalyzed assembly are valid ways of providing suitable substrates for CDP tailoring enzymes. When using chemically synthesized substrates, CDP modification enzymes can be employed in chemoenzymatic and cell-free in vitro settings, as well as in feeding experiments, while whole-cell in vivo biosynthesis based on in situ substrate generation by NRPS or CDPS enzymes represents an alternative approach to obtain modified CDPs.
5. Rational Design of CDPs for Structurally Diverse Peptides
Two methods are currently in common use for rationally altering the structure of natural peptides. First, the peptide backbone itself can be altered by changing the identity, number, or connectivity of the constitutive amino acids. Second, tailoring enzymes can be introduced to catalyze specific chemical modifications of already assembled peptide scaffolds, leading to the synthesis of functionally altered peptides.
CDPs are normally quite limited with regard to peptide backbone modification, as they are composed of only two amino acid residues arranged with a predefined connectivity. Hence, the only possible diversification method is the alteration of monomer identity. Moreover, CDPS-derived CDPs have an additional limitation: because CDPSs use charged tRNAs as substrates, their products can contain only the 20–22 proteinogenic amino acids. CDPS specificity mainly depends on the identity of the aminoacyl moiety bound to the tRNA. Hence, diversification of the CDP scaffold can be achieved by changing either the building block carried by a certain tRNA or the sequence of a tRNA that is specific for a particular amino acid. In vitro transcription or standard mutagenesis techniques can be employed to introduce small specific sequence changes into tRNAs that do not affect their overall structure or aminoacylation.
Additionally, altering the proteinogenic amino acids loaded onto specific tRNAs to produce non-proteinogenic amino acids could be used to produce CDPs containing non-standard monomers. For this, a residue-specific incorporation strategy can be employed by omitting the natural amino acid of choice in the growth medium while providing a non-canonical analog. By combining this with the use of auxotrophs as expression hosts, it is possible to obtain high-level replacement [54
The diversity of fungi is reflected in the variety of fungal metabolites, but it seems that certain groups are able to produce more metabolites than others. For example, Frisvad42 showed that species of Aspergillus, Penicillium, and Talaromyces are particularly productive organisms for secondary metabolites. A comparison with other genera shows that most secondary metabolites have been reported from Aspergillus (1984), from Penicillium (1338), and from Talaromyces (316). Two other common genera, Fusarium (507) and Trichoderma (438), produce fewer secondary metabolites.
Frisvad42 preferred the term exometabolites for secondary metabolites and defined this term as small molecules produced during morphological and chemical differentiation that are outwardly directed, that is, secreted or deposited in or on the cell wall, and accumulated. This contrasts with endometabolites (primary metabolites), which fluctuate in concentration and are either transformed into other endometabolites or feed into exometabolites, exoproteins, exopolysaccharides, or morphological structures. While endometabolites can be found for almost all species of fungi, exometabolites, exoproteins, and exopolysaccharides are taxonomically delimited and produced in species-specific profiles. Some metabolites can occur as both endo- and exometabolites, for example, citric acid.
The biosynthetic pathways involved are also diverse, including polyketides, sesquiterpenes and diterpenes, diketopiperazines, cyclic peptides, β-lactams, and combinations of these pathways. Many of these compounds have biological activity that may be harmful, such as mycotoxins and phytotoxins, or beneficial, such as antibiotics and other pharmaceuticals.
Toxins in Food
There is a vast literature on mycotoxins, and numerous monographs have been published.43–49 A wealth of information is available about the fungal toxins produced in food. Many books and papers have been published on the occurrence, toxicity, and detection of these compounds. Wu et al.50 recently reviewed the public health impacts of food-borne mycotoxins. Although there are approximately 400 compounds described and considered to be toxic, the most important mycotoxins known today are (1) aflatoxins, which cause liver cancer and have also been implicated in child growth impairment and acute toxicoses; (2) fumonisins, which have been associated with esophageal cancer and neural tube defects; (3) deoxynivalenol and other trichothecenes, which are immunotoxic and cause gastroenteritis; and (4) ochratoxin A, which has been associated with renal diseases.
Toxins in Indoor Environments
There are many reports on the occurrence of mycotoxins in the indoor environment. Although species of the indoor mycobiota have the potential to produce toxic metabolites, much of the information in many publications or on the internet is not correct. The reported data mostly refer to species that grow on food (and can produce toxins on specific substrates); it is important to know, however, whether the same species can produce toxic metabolites when grown on building materials. Nielsen and Frisvad51 have reported that the number of species producing toxins in the indoor environment is actually small. They also explained that mycotoxin production on materials occurs at high water activity (aw > 0.9 on the material surface), but significant mycotoxin production will occur only above an aw of 0.95.
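The two water-activity thresholds mentioned above can be expressed as a simple classification rule. The Python sketch below is only an illustration: the function name `mycotoxin_production` and the three labels it returns are hypothetical, and only the cut-offs of 0.9 and 0.95 come from the text.

```python
def mycotoxin_production(aw):
    """Classify expected mycotoxin production on a material surface.

    aw is the water activity on the material surface (between 0 and 1).
    The returned labels are illustrative, not an established scale.
    """
    if not 0.0 <= aw <= 1.0:
        raise ValueError("water activity must lie between 0 and 1")
    if aw > 0.95:
        return "significant"  # above the higher threshold
    if aw > 0.9:
        return "possible"     # production occurs, but not at significant levels
    return "unlikely"         # below the high-water-activity range


for value in (0.85, 0.92, 0.97):
    print(value, mycotoxin_production(value))
```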
Sorensen et al.52 found that the conidia of Stachybotrys chartarum contain trichothecene mycotoxins. In view of the potent toxicity of the trichothecenes, the inhalation of aerosols containing high concentrations of these conidia is considered to be a potential hazard to health. However, exposure is highest from dry materials and decaying biomass. Therefore, the worst-case scenario is consecutive water damage, in which large quantities of biomass and mycotoxins are formed, followed by desiccation of the biomass. In such a situation, many conidia and small fungal fragments will become aerosolized and will be deposited all over the building, including the building envelope.
Xerophilic species are common indoor fungi,20 and these molds are not known to produce important toxins in food. However, the metabolites they produce when growing in indoor environments have not been thoroughly investigated. Slack et al.53 reported that Eurotium species could produce neoechinulin A and B, epiheveadride, flavoglaucin, auroglaucin, and isotetrahydroauroglaucin as major metabolites. These compounds possess toxic properties, but their relevance to human exposure is not yet known. Furthermore, Desroches et al.54 have found that Wallemia strains from the built environment in Canada can produce a number of metabolites, including the known compound walleminone and a new compound, wallimidione (1-benzylhexahydroimidazo[1,5-a]pyridine-3,5-dione). Based on an in silico analysis, wallimidione is likely to be the most toxic of the metabolites reported to date from W. sebi.