
Evolutionary specialization of the nuclear targeting apparatus

1997, Proceedings of the National Academy of Sciences of the United States of America

The α- and β-karyopherins (Kaps), also called importins, mediate the nuclear transport of proteins. All α-Kaps contain a central domain composed of eight approximately 40 amino acid, tandemly arranged, armadillo-like (Arm) repeats. The number and order of these repeats have not changed since the common origin of fungi, plants, and mammals. Phylogenetic analysis suggests that the various α-Kaps fall into two groups, α1 and α2. Whereas animals encode both types, the yeast genome encodes only an α1-Kap. The β-Kaps are characterized by 14–15 tandemly arranged HEAT motifs. We show that the Arm repeats of α-Kaps and the HEAT motifs of β-Kaps are similar, suggesting that the α-Kaps and β-Kaps (and for that matter, all Arm and HEAT repeat-containing proteins) are members of the same protein superfamily. Phylogenetic analysis indicates that there are at least three major groups of β-Kaps, consistent with their proposed cargo specificities. We present a model in which an α-independent β-Kap progenitor gave rise to the α-dependent β-Kaps and the α-Kaps.

Proc. Natl. Acad. Sci. USA Vol. 94, pp. 13738–13742, December 1997. Evolution

(nuclear transport/α and β karyopherins/Armadillo and HEAT motifs)

HARMIT S. MALIK, THOMAS H. EICKBUSH, AND DAVID S. GOLDFARB*

Department of Biology, University of Rochester, Rochester, NY 14627

Communicated by Masayasu Nomura, University of California, Irvine, CA, October 2, 1997 (received for review August 1, 1997)

The nuclear pore complex is remarkable for its capacity to catalyze the bidirectional translocation of diverse macromolecules (1–3). In contrast, the α and β karyopherin (here designated α/β Kap) families of nuclear targeting factors [also called α/β importin, pore-targeting apparatus, nuclear localization signal (NLS) receptor-P97 complex, or Kap60/95] appear to have become specialized to mediate the transport of selected kinds of NLS cargo (4). The prototypic β-Kap95 factors of yeast and humans function as heterodimers with α-Kap60, an NLS receptor, to mediate the docking of NLS cargo at the cytoplasmic face of the nuclear pore complex. α-Kap60 mediates the import of a wide range of karyophilic proteins that contain simian virus 40 T-antigen-like or nucleoplasmin-like NLS sequences (1–3). In contrast to the α-Kap-dependent β-Kap95, yeast β-Kap104 (5) and its human homologue transportin (here called human Kap104) (6–8) are α-Kap-independent transport factors that exhibit both the NLS-binding and nucleoporin-binding activities needed to import an abundant group of M9 signal-containing heterogeneous nuclear ribonucleoproteins. Recent results suggest that yeast β-Kap121 and β-Kap123 (9) and the human β-Kap121 homologue (10) are also α-Kap-independent transport factors that mediate the import of ribosomal proteins. β-Kap95, -Kap104, -Kap121, and -Kap123 also have been termed β1–4, respectively (8, 10).

The in vivo substrate ranges and binding properties of the different NLS receptors for their cognate NLSs remain inadequately defined. For example, it is still unclear whether the different α-Kap and β-Kap NLS receptors have discrete or overlapping NLS-cargo specificities. Although little is known about the secondary structure of the β-Kaps, an earlier report has identified these proteins as containing HEAT motifs (11) (the HEAT acronym is based on proteins in which these repeats were originally identified, ref. 11). These motifs are 38–45 amino acids in length and have been proposed to characterize proteins involved in transport processes. HEAT motifs appear in both eukaryotic and prokaryotic proteins (11).

The complexity of the extant α/β Kap nuclear targeting apparatus raises interesting evolutionary questions about the origin and function of these genetically diverse gene families. This apparatus must have evolved in the earliest nucleated eukaryotes given their need to compartmentalize proteins to either the nucleus or cytoplasm. Here, we have investigated the evolutionary relationships among these factors by using molecular phylogenetic techniques. This analysis suggests that the various α- and β-Kaps have evolved from a common ancestor. Three major specialization events have occurred during the evolution of these eukaryotic gene families. First, the α-Kaps arose together with the α-dependent β-Kap from an α-independent β-Kap progenitor. Second, the α-Kaps divided into two groups, the α1-Kaps and α2-Kaps. Finally, gene duplication and functional specialization have occurred within the α2-Kap family.

RESULTS AND DISCUSSION

Phylogeny of the α-Kaps. Phylogenetic analysis was first used to establish the relationship between different α-Kaps. α-Kaps are ≈60-kDa polypeptides previously characterized as being composed of eight tandem 38–45 residue Armadillo (Arm) repeats (identified in Drosophila armadillo protein) flanked by ≈100-aa N- and C-terminal domains (12, 13). Alignment of all available α-Kap-like sequences revealed extensive similarity within the Arm repeats and the N- and C-terminal domains (Fig. 1). Because the α-Kaps contain internal tandem repeats, the alignment in Fig. 1 cannot be used in constructing a phylogeny unless it can be unambiguously determined that the order of individual repeats within the domain has been conserved. Otherwise, nonhomologous repeats would be aligned, leading to incorrectly inferred phylogenies. A phylogenetic analysis of each of the eight separate Arm repeats from three highly divergent α-Kaps is shown in Fig. 2 and indicates that the order of the eight Arm repeats has been conserved from yeast to humans. For example, Arm repeat no. 1 in the yeast α-Kap is more similar to repeat no. 1 in the two human α-Kaps than to any other Arm repeat in the yeast and human proteins. As shown in Fig. 2, a similar conclusion can be made for each of the eight Arm repeats.
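The logic of the repeat-order test can be illustrated with a simplified sketch. The analysis in Fig. 2 relies on Neighbor-Joining trees with bootstrap support; the code below substitutes a much cruder proxy, pairwise percent identity, and asks whether each repeat's best match in another protein is the repeat at the same position. The sequences are made-up toy strings rather than real Arm repeats, and the function names `percent_identity`, `best_match_positions`, and `order_is_conserved` are hypothetical.

```python
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions over the shared length of two pre-aligned repeats."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def best_match_positions(repeats_x: List[str], repeats_y: List[str]) -> List[int]:
    """For each repeat in protein X, return the index of its most similar repeat in protein Y."""
    best = []
    for rx in repeats_x:
        scores = [percent_identity(rx, ry) for ry in repeats_y]
        best.append(scores.index(max(scores)))
    return best


def order_is_conserved(repeats_x: List[str], repeats_y: List[str]) -> bool:
    """True when repeat i of X is most similar to repeat i of Y for every i."""
    return best_match_positions(repeats_x, repeats_y) == list(range(len(repeats_x)))


# Toy data (hypothetical): three short "repeats" per protein instead of eight real Arm repeats.
toy_yeast = ["AAAAAAAA", "CCCCCCCC", "DDDDDDDD"]
toy_human = ["AAAAAAAC", "CCCCCCCD", "DDDDDDDA"]

print(best_match_positions(toy_yeast, toy_human))
print(order_is_conserved(toy_yeast, toy_human))
```

With real data, the inputs would be the eight Arm repeats excised from an alignment such as the one in Fig. 1, and a tree-based test with bootstrap resampling would be the more rigorous choice.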
We conclude that the order of the Arm repeats has been conserved in all known α-Kaps. This result predicts that the number and order of the Arm repeats of the extant α-Kaps are similar to those of the progenitor α-Kap. The precise conservation of Arm repeat order further suggests that the eight repeats are not functionally interchangeable. This hypothesis may be tested by constructing recombinant α-Kaps with shuffled repeat units.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. Abbreviations: Kap, karyopherin; Arm, armadillo; NLS, nuclear localization signal. *To whom reprint requests should be addressed. e-mail: Goldfarb@Nucleus.Biology.Rochester.edu. © 1997 by The National Academy of Sciences 0027-8424/97/9413738-5$2.00/0. PNAS is available online at http://www.pnas.org.

FIG. 1. α-Kap alignment. Complete ORFs from the various α-Kaps were aligned by using the CLUSTAL V package of programs (27) and high gap penalties. * and • below the alignment indicate identities and similarities, respectively, among all α-Kaps. The boxes indicate the most recent definition of the Arm repeats (12, 13), and the horizontal arrows indicate the HEAT motifs. The PID (protein identification) numbers of each sequence are indicated at the beginning of the alignment. Note that the Caenorhabditis elegans α-Kap, a composite of several expressed sequence tags, has a considerably longer N terminus, which may be an artifact.

The Arm repeat domains of all available α-Kap sequences were used to construct a phylogeny by the Neighbor-Joining method (Fig. 3). Based on the phylogeny shown in Fig. 3, we propose that the members of the group containing the yeast SRP1-encoded α-Kap be referred to as "α1-Kaps" (or α1-importins), and that the members of the other groups be called "α2-Kaps" (or α2-importins). These two groups are likely to reflect a functional divergence of α-Kaps, because each group is broadly distributed in different phyla. Note, however, that within the α2 clade shown in Fig. 3 there are three distinct sub-clades, each as different from one another as they are from the α1 group. We have designated these three clades as subdivisions of a single α2 group. This designation was done in recognition of the fact that yeast contains only an α1-Kap (see below). Thus, the α2A, α2B, and α2C Kaps all are likely to have arisen from an archetypal α1-Kap. Alternatively, we could have just as easily designated the three α2 clades as α2, α3, and α4. Previous number designations of α-Kaps have been made somewhat arbitrarily, for example, according to the order of their submission to the database (see Fig. 1). The previous nomenclature incorrectly implies that the human and mouse α2-Kaps are more closely related to the yeast α1-Kap than are the human and mouse α1-Kaps. The approach we have taken to assigning α-Kap nomenclature is more consistent with the derived α-Kap phylogeny shown in Fig. 3.

FIG. 2. Conservation of the order of Arm repeats within the α-Kaps. The eight individual Arm repeats from yeast α1-Kap, human α1-Kap, and human α2B-Kap were aligned, and a phylogenetic relationship was derived by using the Neighbor-Joining method (28). The tree shown here is a 50% consensus tree with bootstraps represented as a percentage of 1,000 replications.

FIG. 3. Phylogeny of the α-Kaps. The alignment of the Arm repeat-containing domains of the various α-Kaps was used to construct a phylogeny by the Neighbor-Joining method (28). Shown here is the consensus tree of 1,000 bootstrap replications. The tree was midpoint-rooted between the α1- and α2-Kaps. The phylogenetic relationship of the proteins based on the Maximum Parsimony algorithm, by using the PAUP package of programs (30), also was determined and gave the identical topology. Nomenclatures based both on the submitted names as well as a proposed scheme are indicated.

The α-Kap phylogeny is notable for the existence of only a single α1-Kap and no α2-Kap gene in Saccharomyces cerevisiae (SRP1) as well as multiple α1- and α2-Kaps in humans. Because of the availability of the entire yeast genome sequence, it is unlikely that another α-Kap exists in yeast. This fact indicates that the yeast α1-Kap must be capable of performing all of the transport duties required of an α-Kap in this single-celled organism. The α2-Kap family has undergone more extensive specialization than the α1-Kaps. Three highly divergent clades of the α2-Kaps, designated α2A, α2B, and α2C, may serve the distinct developmental, tissue-type specific, and NLS cargo-specific functions of more complex organisms. This hypothesis is supported by the observations that the expression of Drosophila α2A-Kap (Pendulin: refs. 14 and 15) and mouse α2B-Kap (16) is developmentally regulated and that α1- and α2B-Kaps in humans differ in their in vitro NLS binding properties (17). Even within the three clades of the α2-Kaps there has been significant sub-specialization. Two similar α2B-Kaps identified in Xenopus (18) and the human alpha3/Qip1 (α2C-Kap clade) are examples of this specialization.

Future sequencing of α-Kap genes from divergent species should help further resolve the various α-Kap groups and provide insight into when these groups appeared in the eukaryotic lineage. For example, are α2-Kaps restricted to metazoans? Do they occur in vascular plants? [To date, only α1-Kap expressed sequence tags have been reported in plants (not shown).] Do mammalian equivalents of the Drosophila α2A-Kap (Pendulin) exist?
Repeat Structure and Phylogeny of the β-Kaps. β-Kaps in yeast fall into four major classes: the α-dependent β-Kap95p, the heterogeneous nuclear ribonucleoprotein-transporting β-Kap104p, and the ribosomal protein-transporting factors β-Kap121p and β-Kap123p. As mentioned earlier, β-Kaps are characterized by internal HEAT motifs. At the amino acid sequence level, HEAT motifs are significantly more degenerate than Arm repeats, but each tandem repeat appears to represent a conserved secondary structure motif consisting of antiparallel α-helices with short connecting loops (11, 19). In the β-Kaps, specific HEAT repeats have been proposed to bear Ran-binding determinants (see ref. 20). We have attempted to construct a comprehensive map of the HEAT and non-HEAT-containing domains of these β-Kaps. As shown in Fig. 4, the β-Kaps are composed predominantly of 14–15 tandem HEAT repeats, flanked by 100–150 amino acid arms (the KAP95s are missing the C-terminal arms). The HEAT repeats are in tandem except for the interruption of 1–3 non-HEAT sequence or "partial" HEAT sequence-containing regions. Most of the length variation of the different β-Kaps is derived from non-HEAT motif expansion domains in the larger β-Kaps.

FIG. 4. Distribution of HEAT motifs in the β-Kaps. HEAT motifs are indicated as boxes, and non-HEAT-containing sequences are represented by thick lines. The entire human and yeast ORFs were aligned by CLUSTAL V (27) in pairs. A window of about 40 amino acids, matching the HEAT consensus (11), then was used to match the ORF pairs with a high gap penalty, to score for individual HEAT motifs. The alignment windows were evaluated individually on the basis of matches to the residues identified previously as being important for the secondary structure. Mismatches of one important residue per α-helix were allowed as long as the secondary structure prediction was unaffected. Secondary structure predictions based on the SSPRED method (29) were used to confirm the primary sequence alignments. The darkly shaded boxes represent HEAT motifs that are unambiguously homologous in the various β-Kaps (data not shown). The dashed boxes represent HEAT motifs that are not close matches to the universal HEAT consensus (Fig. 7) but are predicted to assume a similar secondary structure (11, 19, 20). Each HEAT repeat is bounded by numbers above and below the box that indicate the beginning and ending amino acid for that particular repeat. Accession numbers/protein identification numbers are indicated below the protein names.

Because of the high sequence degeneracy among HEAT motifs, it is difficult to match homologous HEAT repeats among the β-Kaps. However, we were able to unambiguously identify three sets of homologous HEAT repeats among the four β-Kap groups: the first, sixth, and seventh (Fig. 4, dark boxes). An approximately 79-residue sequence (Fig. 5) containing the sixth and seventh HEAT repeats, which may contain part of a Ran-binding determinant (20), was used to construct a phylogeny of the four β-Kap groups (Fig. 6). An alignment of the entire protein sequences based on their similar domain structures (Fig. 4) yielded a similar tree (not shown). Not included in this analysis (because they were missing the sixth and seventh HEAT repeats) are a number of expressed sequence tags from vertebrates and invertebrates that fall in the Kap95, Kap104, or Kap121 lineages. No expressed sequence tags homologous to yeast Kap123p were identified. Thus, KAP123, which is nonessential for growth, may be unique to yeast or may be expressed at low levels in metazoans. The tree in Fig. 6 suggests that the Kap95p, Kap104p, and Kap121/123p groups are very old and about equally divergent, consistent with their proposed functions in defining eukaryotic nuclear transport pathways (1–10).
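Distance methods such as Neighbor-Joining begin with a matrix of pairwise distances computed from an alignment like the approximately 79-residue block described above. As a minimal sketch, the code below computes the uncorrected p-distance (the fraction of differing, ungapped sites) for every pair of sequences. The three aligned toy sequences and their labels are invented for illustration, and the text does not state which distance measure was used, so the choice of p-distance here is an assumption.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str, gap: str = "-") -> float:
    """Fraction of aligned, ungapped sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for all unordered pairs of sequences in the alignment."""
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(alignment, 2)
    }


# Toy alignment (hypothetical): short invented strings, not real HEAT repeats.
toy_alignment = {
    "toyKap95": "LLEALVKLLS",
    "toyKap104": "LLDALVRLLS",
    "toyKap121": "ILDAL-RMLS",
}

for pair, d in distance_matrix(toy_alignment).items():
    print(pair, round(d, 3))
```

A tree-building routine would take such a matrix as input; corrected distances that account for multiple substitutions are commonly preferred for divergent protein families.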
Although it is not possible to root this tree, the most parsimonious explanation to account for two of three major branches being α-Kap independent is that the Kap95p ancestor arose from an α-Kap-independent progenitor.

FIG. 6. Phylogeny of the β-Kaps. Neighbor-Joining trees (28) based on the alignment in Fig. 5, with bootstrap values of 1,000 replications. The phylogenetic relationship of the proteins based on Maximum Parsimony (30) differs in that the human Kap121 falls outside a branch containing both the yeast Kap121 and yeast Kap123.

Alignment of Arm and HEAT Repeats. If, as we postulate, α-dependence is a derived character, then it is likely that the α-Kaps would have appeared in tandem with the α-dependent β-Kaps. This simultaneous appearance could have occurred by a genetic event that separated the NLS-binding domain from the nuclear pore complex docking domain of an α-independent β-Kap-like ancestor into separate genes. Evidence that the ancestral α-KAP gene arose from a β-KAP progenitor is that the α-Kap Arm and β-Kap HEAT repeats can be aligned by shifting by 10 residues the currently accepted definition of the tandem Arm repeat junctions (refs. 12 and 13; Fig. 7). Indeed, the definition of the central domain of α-Kaps as containing HEAT motifs is better for two reasons. First, the relatively poor first Arm repeat becomes an excellent match to a HEAT repeat (Figs. 1 and 7). Second, the proposed secondary structure of the HEAT repeat-containing proteins suggests they are in a repeating helical packing structure (19). If, as we suggest, this structure is present in α-Kaps and other Arm repeat-containing proteins (19, 21), the currently defined (12, 13) Arm repeat junction interrupts one of the two helical secondary structures found in every repeat (Figs. 1 and 7). The original definition (22) of the Arm repeat junction is in accordance with the HEAT junction, and correctly positions the border of the sequence outside of the two helical regions. It should be noted that the phylogenetic analysis of the α-Kaps that used HEAT junctions yields the same tree topologies as shown in Figs. 2 and 3, which are based on the Arm junctions (data not shown).

The finding that Arm and HEAT repeats define the same two-helical secondary structure suggests that Arm and HEAT repeat-containing proteins belong to the same protein superfamily and may share analogous functions. Both domains appear to function as scaffolds to bind various protein ligands. For example, the Arm repeats of β-catenin are required for binding to α-catenin, cadherin, and adenomatous polyposis coli (23). The HEAT motifs of the A subunit of PP2A associate with tumor antigens and with the B and C subunits of PP2A (19). The same is likely to be true for the Arm and HEAT motifs of α-Kaps and β-Kaps, respectively, which form a heterodimer in the targeting of NLS-bearing substrates to the nuclear pore complex.

Model for the Evolution of the Kaps. Based on these results we propose the following model for the evolution of β and α/β Kap-mediated nuclear transport. Both β and α/β systems likely evolved from an ancient α-Kap-independent β-Kap progenitor. The β-KAP95 gene family arose at about the time of the separation between the β-KAP104 and β-KAP121/123 families. Recently, additional genes with partial homology to the β-Kaps have been reported (24, 25), suggesting further specialization within this family. We propose that the ancestral α-KAP gene evolved in tandem with the β-KAP95 progenitor. Because yeast has a single α1-Kap and the α2-Kaps appear to be more adapted to specialized functions, we propose that the first α-Kap was probably of the α1 type.
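The parsimony reasoning behind this model can be made explicit with a small count. The sketch below assumes, purely for illustration, that the three major β-Kap groups radiate from a single ancestor (a star topology, in keeping with their being about equally divergent) and that α-dependence is a binary character. It then counts how many state changes each candidate ancestral state would require. The names `tip_states` and `changes_required` are hypothetical, and because the real tree could not be rooted, this is a simplification rather than the analysis performed in the paper.

```python
from typing import Dict

# Character states for the three major beta-Kap groups, as described in the text:
# True = requires an alpha-Kap adapter, False = alpha-Kap independent.
tip_states: Dict[str, bool] = {
    "Kap95": True,
    "Kap104": False,
    "Kap121/123": False,
}


def changes_required(ancestral_state: bool, tips: Dict[str, bool]) -> int:
    """Number of lineages whose state differs from the ancestor on a star tree."""
    return sum(1 for state in tips.values() if state != ancestral_state)


independent_ancestor = changes_required(False, tip_states)
dependent_ancestor = changes_required(True, tip_states)

print("alpha-independent ancestor:", independent_ancestor, "change(s)")
print("alpha-dependent ancestor:", dependent_ancestor, "change(s)")
```

Under these assumptions an α-independent ancestor needs one change (a gain of α-dependence in the Kap95 lineage), whereas an α-dependent ancestor needs two independent losses.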
A second major phase of evolution within the Kaps, the appearance and subspecialization of the α2-Kaps, occurred after the separation of the yeast and human lineages to serve the specialized targeting and nuclear functions of early metazoan cell types associated with development (26). For example, the loss of the Drosophila α2-Kap Pendulin is an embryonic lethal mutation that causes defects in the control of cell proliferation and malignancy of hematopoietic organs (14, 15).

FIG. 5. Alignment of conserved sixth and seventh HEAT motifs of β-Kap sequences. * and • at the bottom of the alignment represent identities and similarities among the β-Kap sequences, respectively. Putative Ran-binding determinants (20) are highlighted in bold.

FIG. 7. Alignment of Arm and HEAT motifs. HEAT motifs from the various β-Kaps and Arm repeat-containing segments from yeast α1-Kap and Drosophila Armadillo (β-catenin) were aligned to the universal HEAT consensus (11). p and h indicate polar and hydrophobic residues, and HHH indicates α-helical secondary structure prediction. The arrow indicates the Arm repeat junction identified in Fig. 1 (12). Numbers at the beginning and end of each amino acid stretch represent the number of amino acids to the end of the ORF. Numbers in parentheses represent the number of amino acids between the consecutive HEAT repeats.

Note Added in Proof. Recently, the structure of the Arm-repeat domain of murine β-catenin was solved (31). The junctions between the 12 secondary structure motifs were found to be inconsistent with the Arm repeat definition. These authors proposed an alternative repeat junction that is in agreement with the HEAT motif definitions that we suggest apply to the α-Kaps and β-catenins (Fig. 7).

This work was supported by National Institutes of Health Public Health Service Grants GM40362 (D.S.G.) and GM42790 (T.H.E.).

1. Gorlich, D. S. & Mattaj, I. W. (1996) Science 271, 1513–1518.
2. Nigg, E. A. (1997) Nature (London) 386, 779–787.
3. Corbett, A. H. & Silver, P. A. (1997) Microbiol. Mol. Biol. Rev. 61, 193–211.
4. Goldfarb, D. S. (1997) Curr. Biol. 7, 14–17.
5. Aitchison, J. D., Blobel, G. & Rout, M. P. (1996) Science 274, 624–627.
6. Pollard, V. W., Michael, W. M., Nakielny, S., Siomi, M. C., Wang, F. & Dreyfuss, G. (1996) Cell 86, 985–994.
7. Nakielny, S., Siomi, M. C., Siomi, H., Michael, W. M., Pollard, V. & Dreyfuss, G. (1996) Exp. Cell Res. 229, 261–266.
8. Bonifaci, N., Moroianu, J., Radu, A. & Blobel, G. (1997) Proc. Natl. Acad. Sci. USA 94, 5055–5060.
9. Rout, M. P., Blobel, G. & Aitchison, J. D. (1997) Cell 89, 715–725.
10. Yaseen, N. R. & Blobel, G. (1997) Proc. Natl. Acad. Sci. USA 94, 4451–4456.
11. Andrade, M. A. & Bork, P. (1995) Nat. Genet. 11, 115–116.
12. Peifer, M., Berg, S. & Reynolds, A. B. (1994) Cell 76, 789–791.
13. Kussel, P. & Frasch, M. (1995) J. Cell Biol. 129, 1491–1507.
14. Kussel, P. & Frasch, M. (1995) Mol. Gen. Genet. 248, 351–363.
15. Torok, I., Strand, D., Schmitt, R., Tick, G., Torok, T., Kiss, I. & Mechler, B. M. (1995) J. Cell Biol. 129, 1473–1489.
16. Prieve, M. G., Guttridge, K. L., Munguia, J. E. & Waterman, M. L. (1996) J. Biol. Chem. 271, 7654–7658.
17. Nadler, S. G., Tritschler, D., Haffar, O. K., Blake, J., Bruce, A. G. & Cleaveland, J. S. (1997) J. Biol. Chem. 272, 4310–4315.
18. Gorlich, D., Prehn, S., Laskey, R. A. & Hartmann, E. (1994) Cell 79, 767–778.
19. Ruediger, R., Hentz, M., Fait, J., Mumby, M. & Walter, G. (1994) J. Virol. 68, 123–129.
20. Moroianu, J., Blobel, G. & Radu, A. (1996) Proc. Natl. Acad. Sci. USA 93, 6572–6576.
21. Hirschl, D., Bayer, P. & Muller, O. (1996) FEBS Lett. 383, 31–36.
22. Yano, R., Oakes, M., Yamaghishi, M., Dodd, J. A. & Nomura, M. (1992) Mol. Cell. Biol. 12, 5640–5651.
23. Peifer, M. (1995) Trends Cell Biol. 5, 224–229.
24. Fornerod, M., van Deursen, J., van Baal, S., Reynolds, A., Davis, D., Murti, K. G., Fransen, J. & Grosveld, G. (1997) EMBO J. 16, 807–816.
25. Gorlich, D., Dabrowski, M., Bischoff, F. R., Kutay, U., Bork, P., Hartmann, E., Prehn, S. & Izaurralde, E. (1997) J. Cell Biol. 138, 65–80.
26. Gorlich, D., Kraft, R., Kostka, S., Vogel, F., Hartmann, E., Laskey, R. A., Mattaj, I. W. & Izaurralde, E. (1996) Cell 87, 21–32.
27. Higgins, D. G., Bleasby, A. J. & Fuchs, R. (1992) Comput. Appl. Biosci. 8, 189–191.
28. Saitou, N. & Nei, M. (1987) Mol. Biol. Evol. 4, 406–425.
29. Mehta, P. K., Heringa, J. & Argos, P. (1995) Protein Sci. 4, 2517–2525.
30. Swofford, D. L. (1991) Illinois Natural History Survey, Champaign, IL.
31. Huber, A. H., Nelson, W. J. & Weis, W. I. (1997) Cell 90, 871–882.