Advanced Dairy Chemistry

Published by BiotAU website, 2021-11-21 15:21:43

13 Interspecies Comparison of Milk Proteins: Quantitative Variability and Molecular Diversity
P. Martin et al.

Besides this wide quantitative variability, it is worth noting that, between species, a high rate of sequence divergence generally occurs in orthologous gene products. Hence, the casein fraction of milk is a complex and specific system which deserves to be considered in terms of diversity, particularly in light of the growing number of biologically active peptides derived from milk proteins, including caseins, identified during the last 20 years (Clare and Swaisgood 2000; Meisel 2005).

In addition, post-translational processing, such as phosphorylation, glycosylation and limited proteolysis by plasmin, increases the heterogeneity of this system, which is complicated even more by the occurrence of genetic variants. The primary focus of this chapter will be interspecies comparisons in terms of quantitative and structural variability. Nevertheless, within-species variability will also be considered when it is a species-specific feature. Particular attention will be paid to discrete phosphorylation and exon-skipping events which contribute to protein diversity and evolution and very likely to specific micellar organisation.

Fig. 13.1 Concentration (g/L) of milk proteins in eight species (from Jenness 1974; Holt and Jenness 1987; Grabowski et al., 1991; Martin 1996). Values shown (g/L): lagomorphs 200; rat 95; ewe 61; sow 44; camel 36; cow 33; goat 30; mare 20; woman 10.

Fig. 13.2 Comparative properties of milk caseins for nine species. Numbers correspond to mean values from several data sets in the literature (Grabowski et al., 1991; Ribadeau Dumas and Brignon 1993; Martin and Grosclaude 1993; Kappeler et al., 1998; Ginger and Grigor 1999; Miranda et al., 2004). The percentage of αs1-casein given for sow’s milk corresponds to αs-caseins (αs1 + αs2). Legend: κ, αs2, β, αs1. Species shown: rabbit, rat, pig, ovine, bovine, caprine, horse, human, camel (vertical axis ticks 0 to 90).

13.2.1 The Casein Gene Locus (CSN) and Quantitative Variability

Caseins are present in the milk of all mammals. However, their total concentration and their relative proportions are largely species dependent. The species studied so far produce more or less large quantities of milks whose protein content ranges between 10 and 200 g/kg (Fig. 13.1). Human milk has one of the lowest protein contents (10 g/kg), whereas that of rabbit milk is undoubtedly one of the highest (200 g/kg). Amongst dairy ruminants, with more than 50 g/kg, sheep milk has the highest total protein content. Beyond this large variability in the milk protein content, there are large differences in the relative proportions of caseins between species (Fig. 13.2). Thus, in human milk, β-casein is, by far, the main casein component. Conversely, αs1-casein predominates in rabbit milk. The milk of
the rat, porcine and bovine species contains approximately the same proportion of β- and αs1-caseins. Today, human milk remains the only thoroughly studied milk in which αs2-casein has not been found. The presence of two different αs2-caseins has been detected recently in equidae milk (Martin et al., unpublished).

[Figure 13.3: schematic maps of the casein locus on a 0 to 300 kb scale. Genes labelled per locus: human (HSA 4): CSN1S1, CSN2, STATH, HTN3, HTN1, CSN1S2a, CSN1S2b, ODAM, CSN3; horse (ECA 3): CSN1S1, CSN2, CSN1S2a, CSN1S2b, ODAM, CSN3; mouse (MMU 5): Csn1s1, Csn2, Csn1s2a, Csn1s2b, Odam, Csn3; cattle (BTA 6): CSN1S1, CSN2, STATH, CSN1S2, ODAM, CSN3; opossum (MDO 5): CSN1, CSN2, ODAM, CSN3; platypus (OAN): CSN1, CSN2, CSN2b, CSN3. Intergenic distances are not reproduced here.]

Fig. 13.3 Evolution of the casein locus organisation. The casein loci from platypus (Ornithorhynchus anatinus), opossum (Monodelphis domestica), cattle (Bos taurus), mouse (Mus musculus), horse (Equus caballus) and human (Homo sapiens) genomes are drawn approximately to scale in order to underline the expansion of this locus during the course of mammalian evolution (adapted from Lefèvre et al., 2009; Warren et al., 2008, taking into account additional genomic information from the NCBI). Genes are depicted by arrow boxes, giving the orientation of transcription. Empty boxes represent putative genes based on similarity, of which the expression remains to be demonstrated. Intergenic region sizes are given in kb. The human and horse loci have approximately the same size (320/330 kb), whereas the cattle locus (250 kb) is ca. 80 kb shorter. Whilst the marsupial (opossum) locus is close in size to the cattle locus, on the other hand, the monotreme (platypus) locus is significantly smaller (less than 150 kb) with a duplication of CSN2 (grey arrow), which occurred recently in this lineage. STATH (statherin) and HTN3 and HTN1 (histatins) are genes having a common origin and encoding salivary proteins that protect teeth by regulating the spontaneous precipitation of calcium phosphate salts on enamel surface (Kawasaki and Weiss 2003). ODAM is a gene highly conserved across species encoding the odontogenic ameloblast-associated protein, a tooth-associated epithelia protein that probably plays a role in odontogenesis, possibly incorporated into the enamel matrix at the end of the mineralisation process (Kestler et al., 2008), but also conspicuous by its expression in several epithelial tissues (Moffatt et al., 2008). CSN1S1, CSN1S2, CSN2 and CSN3 are genes encoding αs1-, αs2-, β- and κ-casein, respectively

Comparative analysis shows that the casein locus organisation (Fig. 13.3) is highly conserved between species, even in mammals of ancient lineage such as monotremes (platypus), in which casein genes are tightly clustered together in the genome, as they are in placental mammals (Warren et al., 2008). The genomic organisation of the platypus casein locus has been elucidated and compared with other mammalian genomes, including the marsupial opossum and several eutherians (Lefèvre et al., 2009). Whereas the physical linkage of casein genes has been confirmed in platypus, a recent duplication of CSN2 was observed in the monotreme lineage, as opposed to more ancient duplications of CSN1S2 in the eutherian lineage, whilst marsupials possess only single copies of α- and β-casein-encoding genes. Another striking
feature is the close proximity between CSN1 and the main CSN2. The lineage-specific gene duplications that have occurred within the casein locus of monotremes and eutherians but not marsupials, which may have lost part of the ancestral casein locus, emphasise the independent selection on milk provision strategies to the young, most likely linked to different developmental strategies (Lefèvre et al., 2009).

The four (or five) genes are confined to a 250–350 kb region on chromosome 6 in cattle and goats (Threadgill and Womack 1990; Hayes et al., 1993) and arranged in the order αs1, β, αs2 and κ. In the goat, loci encoding αs1- and β-caseins were shown to be ca. 12 kb apart and convergently transcribed (Leroux and Martin, 1996). These results were confirmed in cattle, for which the genomic organisation of the casein gene locus was determined (Rijnkels et al., 2003). Despite some differences in the distance separating casein genes and their numbers, the overall organisation of the locus is fairly well conserved, and the presence of dominant cis-acting regulatory elements, required for the high-level coordinate expression of the casein genes, is suspected in the αs1-/β-region (Rijnkels et al., 1997).

Indeed, all four genes are coordinately expressed at high levels in a tissue- and stage-specific fashion. The three genes encoding the “calcium-sensitive” caseins (αs1, αs2 and β), which are related through evolution, share common regulatory motifs in the proximal 5′-flanking region (Groenen et al., 1993). Although the organisation of the 5′-flanking region of the κ-casein gene is different (Coll et al., 1995), its expression pattern seems to be similar to that of the other casein genes. There is more and more evidence demonstrating that a common set of transcription factors is required in most mammalian species for the expression of milk protein genes. The mechanisms controlling milk protein gene expression, especially pertaining to the behaviour of Stat5, in the cow are significantly different from the mouse (Wheeler et al., 1997). More precisely, the different organisation of the hormone response regions of casein genes, from the binding of factors such as Stat5 and C/EBP, in different mammalian species, apparently does not result in fundamental differences in their responsiveness to lactogenic hormones, at least in transfected cell lines. The species-specific arrangement of transcription factor binding sites in the β-casein gene appears to be crucial for the strength and stage at which this gene is expressed in different species, including human, rodents and ruminants (Winklehner-Jennewein et al., 1998). For example, the bovine, but not the mouse, β-casein gene is strongly induced shortly before parturition. This difference in stage-specific expression was recapitulated in the expression of a bovine β-casein transgene (including 16 kb of 5′- and 8 kb of 3′-flanking regions) in transgenic mice, thus indicating that cis-acting sequences might be, at least in part, responsible for species-specific expression patterns (Rijnkels et al., 1995).

Nevertheless, transcription is not the only level at which regulation of gene expression may occur. In the following, we will see that there are many other factors acting at the post-transcriptional level, including messenger RNA stability and processing, as well as translational regulation (Bevilacqua et al., 2006; Rhoads and Grudzien-Nogalska, 2007). The protein-coding regions of most vertebrate genes, including those encoding caseins in mammals, are split. Most eukaryotic messenger RNAs are thus transcribed as precursors (pre-messengers) containing intervening sequences (introns) which have to be removed to generate mature and functional mRNAs. The process of intron removal and exon joining (splicing) is a major function ensured, in the nucleus, by a large multicomponent complex (five small nuclear RNAs and more than 50 proteins), called the spliceosome, assembled in a stepwise pathway. This accurate mechanism is governed by a set of rather strict rules to achieve high fidelity and efficiency in splicing. However, casein splice variants are widespread across species. A dysfunction of this machinery may have dramatic biological consequences by modifying the message and accordingly the primary structure of the protein. This is well exemplified in mare’s milk, in which a low-molecular-weight β-casein variant, showing an internal deletion of 132 amino acid residues, has been characterised as arising from a cryptic splice site usage occurring
within exon 7, during the course of primary transcripts processing (Miclo et al., 2007). Such deviant splicing behaviour might be regulated by an intronic splicing enhancer, sometimes located far away from the splicing site, as was shown for the gene encoding β-casein in mare mammary gland (Lenasi et al., 2006).

13.2.2 Primary Structure of Caseins: Comparison Across Species

Since the elucidation of the primary structure of bovine αs1-casein by Mercier et al. (1971), the complete amino acid or nucleotide (cDNA, gene) sequence of the four (five) caseins has been determined in a number of species (including human, horse, ruminants and rodents). Multiple alignments, which help to define functional domains, performed with sequences available today, can reach highly informative levels, provided that the structural intron/exon organisation of the gene is taken into account. Indeed, in such a way, multiple alignments, in which gaps were introduced to maximise the alignment, reveal the conserved regions but also highlight their evolutionary pathways. This is particularly true for the αs-caseins, the genes for which comprise up to about 20 exons.

13.2.2.1 αs1-Casein
A multiple alignment of αs1-casein from 12 species is presented in Fig. 13.4. Even after tuning the alignment to take into account the exon modular splitting derived from known gene structural organisations (Koczan et al., 1991; Leroux et al., 1992; Jolivet et al., 1992), there are few segments, even short ones, of amino acid identity across the 12 species. Conversely, such a method of alignment immediately indicates the occurrence of insertion/deletion events. Exon skipping, first found in goat αs1-casein (Leroux et al., 1992) and in human β-casein (Menon et al., 1992a, b; Martin and Leroux, 1992), was shown to be responsible for such events and for the apparent relatively high structural divergence observed between αs1-caseins from different species, as well as for their wide variability in size. αs1-Casein ranges from 183 to 199 amino acid residues, in guinea pig and cattle, respectively, whereas, due to the insertion of a tandem repeated hexapeptide sequence (QASLAQ), the protein is significantly larger (from 280 to about 300 amino acid residues) in mouse (Hennighausen et al., 1982) and rat (Hobbs and Rosen, 1982). This sequence was shown to correspond to a short “virtual exon” occurring within intron 13 of the bovine gene and surrounded by quite perfect consensus splice sequences (Martin et al., 1996). In addition, the same short sequence is recognised as an exon in the porcine αs1-casein mRNA (Alexander et al., 1992).

The three hydrophobic domains identified in the bovine molecule, spanning residues 1–44, 90–113 and 132–199, are more or less well conserved between species. The most highly conserved region, except the signal peptide, remains, however, the multiple phosphorylation site, encoded by the 3′ end of exon 9. This SerP cluster is confined within a sequence (encoded by exons 7–10) carrying a high net negative charge (7 SerP, 3 Asp and 8 Glu, in bovine), at the natural pH of milk, whilst the remainder of the molecule is, under such conditions, essentially uncharged. These features are rather well conserved in the 12 compared species, exemplified herein by the third hydrophobic domain (residues 132–199), corresponding to the seventeenth exon, probably being one of the most conserved parts of the molecule. αs1-Casein does not contain any cysteine or cystine, except in rodents (Hobbs and Rosen, 1982; Grusby et al., 1990) and humans (Rasmussen et al., 1995; Johnsen et al., 1995; Martin et al., 1996), a feature which is usually found in αs2-casein. In this connection, it is worth noting that, at least in human milk, in which the presence of αs2-casein still remains to be demonstrated (Rijnkels et al., 2003), αs1-casein is capable of forming disulphide-linked heteromultimers with κ-casein, which contains only one cysteinyl residue (Rasmussen et al., 1999).

13.2.2.2 β-Casein
With 209–217 residues, in cattle and pig, respectively, without any cysteinyl residue, this casein is the most hydrophobic of the four caseins. It is especially rich in proline and displays, in all species, an amphipathic structure with a single multiple phosphorylation
site, located in the N-terminal part of the molecule. At the pH of milk, the N-terminal sequence (30 residues) is highly negatively charged (an average of 11 negative charges, 7 Glu and 4 SerP, in ruminants concentrated in the 21 first N-terminal amino acid residues), whereas the rest of the peptide chain, which is highly hydrophobic, has no net charge. Such a feature explains the property of β-casein which allows for micellar aggregates to be formed in solution.

Fig. 13.4 Multiple alignment of the amino acid sequence of αs1-casein from 12 eutherian species. Abbreviations and accession numbers are given in parentheses: cow (bov, M38641), water buffalo (buf, AJ005430), sheep (she, X03237), goat (goa, X59836), camel (cam, AJ012628), pig (pig, X54973), rabbit (rab, X13042), guinea pig (cav, X00938), rat (rat, J00710), mouse (mou, M36780), horse (hor, NM_001081883) and human (hum, X98084). Peptide sequences are split into blocks of amino acid residues to visualise the exonic modular structure of the protein as deduced from known splice junctions of the bovine (Koczan et al., 1991), goat (Leroux et al., 1992) and rabbit (Jolivet et al., 1992) genes. Exon numbering (in bold) is that of the ruminants genes. Additional exons are numbered in single quotes and double quotes (in italics). Large green boxes within blocks depict species-specific constitutively outspliced exons. Red boxes under exonic blocks identify highly conserved amino acid residues (>9/12) between species. Guinea pig αs1-casein is the sequence of casein B characterised by Hall et al. (1984b). Italics correspond to the signal peptides, of which the cleavage site is indicated by the vertical blue arrow. Spaced dashes are inserted gaps introduced to maximise the alignment. Asterisk refers to the basic motifs of tandem hexapeptide repeats occurring in the rat and mouse αs1-caseins. A monotreme (platypus: pla) CSN1 casein sequence (FJ548613) is also given for comparison. Underlined sequence represents part of the protein falling in an unresolved platypus genome sequence (Lefèvre et al., 2009)

Eleven of the same 12 species, except the guinea pig, of which the β-casein sequence is not available, are compared and aligned in Fig. 13.5. There is no evidence in the extensive literature related to guinea pig caseins for the existence of β-casein. However, there is no irrefutable evidence for its absence from guinea pig milk. Again, as previously mentioned for αs1-caseins, the conservation of the leader peptide is strikingly notable. Moreover, with ca. 80% homology
between αs1- and β-casein signal peptides, the evolutionary relationship between the genes encoding calcium-sensitive caseins is further substantiated. The close proximity of the main α- and β-casein proteins in monotremes strongly supports this statement.

Fig. 13.5 Multiple alignment of the amino acid sequence of β-casein from 11 eutherians. Abbreviations and accession numbers are given in parentheses: cow (bov, M15132), water buffalo (buf, AJ005165), sheep (she, X16482), goat (goa, AH001195), camel (cam, AJ012630), pig (pig, X54974), rabbit (rab, X13043), rat (rat, J00711), mouse (mou, X04490), human (hum, X17070) and horse (hor, NM_001081852, Q9GKK3 on Expasy UniProtKB). Two marsupial, tammar wallaby (wal, X54715) and possum (pos, AF128397), as well as a monotreme (platypus: pla, FJ548612) sequences are also given. Peptide sequences are split into blocks of amino acid residues to visualise the exonic modular structure of the protein as deduced from known splice junctions of the rat (Jones et al., 1985), the bovine (Bonsing et al., 1988), sheep (Provot et al., 1995), the rabbit (Thépot et al., 1991) and the human (Hansson et al., 1994) genes. Exon numbering (in bold) is that of ruminants genes. Additional exons are numbered in single quotes and double quotes (in italics) for platypus sequence. Large blue boxes, within blocks depict species-specific constitutively outspliced exons. Red and black boxes, between eutherian and marsupial sequences, identify highly conserved amino acid residues (>10/14) between species and anchoring points of marsupial sequences, respectively. Italics correspond to the signal peptides, of which the cleavage site is indicated by the vertical blue arrow. Spaced dashes are inserted gaps introduced to maximise the alignment. Underlined and bold amino acids in the marsupial sequences depict duplications and the basic motif of the tandem octapeptide repeats found in marsupials, respectively

On the other hand, at the level of the mature proteins, the poly-phosphorylated region (encoded by exon 4) is no longer the only region showing a clear conservation between species. Indeed, all along the large and mainly hydrophobic sequence, encoded by exon 7 (more than 160 amino acid residues), a rather high level
of homology (30%) is observed. This ratio is quite good, given the number of sequences aligned, having a number of isolated amino acid residues (Q, P, K and L) conserved. Furthermore, most substitutions tend to be conservative (Holt and Sawyer, 1988).

Albeit less frequently, probably owing to its less split genomic organisation, exon skipping also occurs during the course of the processing of primary transcripts from the β-casein gene, in humans (Menon et al., 1992a, b; Martin and Leroux 1992) and horse (Miranda et al., 2004; Lenasi et al., 2006; Miclo et al., 2007). Despite apparent high dissimilarity with those of eutherian species, two marsupial (tammar wallaby and brushtail possum) sequences (Collet et al., 1992; Ginger et al., 1999) and one monotreme (platypus) sequence (Lefèvre et al., 2009) have been included in the alignment (Fig. 13.5). Interestingly, this attempt strongly suggests again an outsplicing of exon 3, as reported for humans. Elsewhere, tammar β-casein and that isolated from the milk of the common brush-tailed possum, which are by far larger than the others (270 amino acid residues for the mature polypeptide chain vs. 209 in cattle), display a tandemly repeated (16 or 17 times) octapeptide sequence (RESLLAHE) in the C-terminal part of the molecule. This strongly supports the notion that the gene encoding β-casein might have grown through intragenic duplication, possibly coupled with (or followed by) changes of splice sites, before being subjected subsequently to duplications, giving rise to the cognate genes that have then evolved divergently.

Considered together, these observations support the hypothesis (Jones et al., 1985) of a presumed primitive and common ancestral gene resulting from the recruitment into a functional gene with a minimum of five exons: the first and the last corresponding to the 5′ and 3′ non-coding regions; the second encoding the signal peptide; the third, a highly hydrophilic region including a multiple phosphorylation site; and the penultimate coding for a hydrophobic sequence required to ensure aggregation properties, which in turn is essential for casein micelle formation. Sequence similarities between the first exons of the genes encoding the 3 calcium-sensitive caseins confirm such an assumption.

13.2.2.3 αs2-Casein
αs2-Casein was the last bovine casein to be sequenced (Brignon et al., 1977). Of the calcium-sensitive caseins, it is the most highly phosphorylated. αs2-Casein occurs in milk in several forms that differ in the level of phosphorylation (10–13 phosphate groups/molecule). The peptide chain is 207 amino acid residues long, and the phosphate groups are clustered in three regions of the molecule (7–31, 55–66 and 129–143). Peptide segments spanning residues 68–125 and the C-terminal part of the protein are predominantly hydrophobic (Holt and Sawyer 1988). Given that human milk does not appear to contain αs2-like casein and the equine sequence is still not available, amino acid sequence comparisons are restricted to only 10 eutherian species (Fig. 13.6).

The structural organisation of the αs2-casein gene, first determined for the bovine species (Groenen et al., 1993), provides evidence that genes encoding αs2- and β-caseins are more closely related to each other than to the αs1-casein gene. However, analyses of interspecies relationships performed at the transcript level show that αs2-caseins have diverged through extensive sequence rearrangements and a high level of nucleotide substitution (Stewart et al., 1987). A tandem repeat was first detected in the amino acid sequence of bovine αs2-casein (Brignon et al., 1977). On the basis of the gene sequence (Groenen et al., 1993), the large internal repeat was precisely extended to codons 43–124 and 125–204, which resulted in the formation of exons 12–16 by a duplication of exons 7–11. In addition, from both amino acid and nucleotide sequence comparisons, it is still evident that, with up to 60% similarity, exons 3–5 also arise from a duplication event of exons 8–10. Therefore, it can be hypothesised that this gene has been subjected to two successive duplications of a 5-exon module followed by the loss of one upstream (exon 7/12) and one downstream (exon 11/16) exon.

The same observation can be made for the other artiodactyls, the camel, the rat and the guinea pig sequences, whilst for the mouse and the rabbit, the
situation is not so clear and complicated by the existence of two (ε/δ and γ or αs2-a and b, respectively) αs2-casein genes (Hennighausen et al., 1982; Sasaki et al., 1993; Dawson et al., 1993).

Fig. 13.6 Multiple alignment of the amino acid sequence of αs2-casein from ten mammalian species. Abbreviations and accession numbers are given in parentheses: cow (bov, M16644), water buffalo (buf, AJ005431), sheep (she, X03238), goat (goa, X65160), camel (cam, AJ012629), pig (pig, X54975), rabbit (rab a, X76907; rab b, X76909), mouse (mou ε, J00379; mou γ, D10215), guinea pig (cav, X00374) and rat (rat γ, J00712). Two sequences are given for rabbit (a and b) and mouse (δ/ε and γ). The sequence for the rat and the guinea pig corresponds to the γ-casein (Hobbs et al., 1982) and to casein A (Hall et al., 1984a), respectively. Peptide sequences are split into blocks of amino acid residues to visualise the exonic modular structure of the protein, as deduced from known splice junctions of the bovine gene (Groenen et al., 1995). Additional exons are numbered in single quotes and double quotes (in italics). Large blue boxes, within blocks depict species-specific constitutively outspliced exons. Red boxes identify highly conserved amino acid residues (>10/12) between sequences. Italics correspond to the signal peptides, of which the cleavage site is indicated by the vertical blue arrow. Spaced dashes are inserted gaps introduced to maximise the alignment

In these species, the αs2-like caseins are shorter than the others since they are 143, 184, 180 and 182 amino acid residues long, respectively, and all the duplicated sequences were apparently lost in mouse ε/δ-casein. Thus, there are marked differences in size. Here too, deleted segments, emphasised by multiple alignments based on the structure of the bovine gene, correspond to exon sequences outspliced from the mature messengers during the processing of the primary transcripts. Camel αs2-casein, with 178 amino acid residues, lacks two internal stretches of 9 and 15 amino acid residues, very likely corresponding to exons 8 and 10. It now remains to demonstrate the presence, at the genomic level, of the relevant nucleotide sequence (encoding the missing peptide segment) and to characterise mutations responsible for this double exon-skipping event, as had been done for the
bovine αs2-casein D variant (Bouniol et al., 1993). On the other hand, as previously mentioned for αs1-casein, additional exons may also break out from intron sequences. The presence of an extra peptide sequence (IQSGEELST), between exon 11 and 12 in pig αs2-casein, could be due to this kind of event. Similarly, one can anticipate the probable existence of an additional exon sequence within intron 14 of the gene encoding αs2-casein b in the rabbit genome.

One of the main biochemical features of αs2-casein, usually underlined and plausibly having an important functional role, is its ability to form disulphide bridges, owing to the presence of two cysteinyl residues at positions 36 and 40, in the mature bovine peptide chain. These residues are encoded by exon 6, the sequence of which is rather well conserved, when present, with both cysteinyl residues at the same position. In contrast, in rat (γ), mouse (ε/δ and γ) and rabbit αs2-casein, which lack this exon, both cysteinyl residues are obviously removed. Nevertheless, with a single and two contiguous cysteinyl residues in the middle of the peptide sequence encoded by exon 11 in rat and mouse γ-caseins, respectively, and since rat, mouse and human αs1-caseins contain at least one cysteinyl residue, there is no eutherian milk in which the αs-caseins are all devoid of cysteinyl residues.

13.2.2.4 κ-Casein
Glycosylated at various levels, κ-casein is highly heterogeneous, soluble in the presence of calcium and differs considerably in structure from the calcium-sensitive caseins. The functional duality of κ-casein, which is to interact hydrophobically with the other caseins and at the same time provide a hydrophilic and negatively charged surface on the micelle to stabilise the colloidal suspension, is strikingly reflected by its amphipathic primary structure. Its hydrophilic and flexible C-terminal part (caseinomacropeptide or CMP) is cleaved specifically by chymosin (between residues Phe105 and Met106, in ruminants), thus leading to the destabilisation of the micelle, to which the highly hydrophobic and insoluble N-terminal part (para-κ-casein) remains anchored. Two cysteine residues are found in the para-κ-casein region.

Although it belongs to the same chromosomal casein locus, the gene encoding κ-casein does not share any common structural organisation scheme with the other casein genes. It is thought to be evolutionarily related to the γ-fibrinogen gene (Jollès et al., 1978; Alexander et al., 1988), which encodes a protein similarly involved in a clotting process (blood), following a limited proteolytic cleavage.

Interspecies comparison reveals that the κ-casein gene is identically organised in ruminants (Alexander et al., 1988; Persuy et al., 1995), rabbit (Baranyi et al., 1996) and humans (Edlund et al., 1996). The transcription unit invariably comprises 5 exons, three of which are small (65, 62 and 33 nucleotides in bovine), encoding the 5′-untranslated region (5′-UTR) (exon 1) and the signal peptide (exons 2 and 3), which is longer (21 vs. 15 residues in the calcium-sensitive caseins). Therefore, the majority of the mature protein sequence (160 amino acid residues) is encoded by exon 4, whilst the last exon encodes the 3′-untranslated region.

Multiple alignments from the same 12 species, considered in this chapter, are shown in Fig. 13.7. Gaps have been introduced to maximise similarity, taking into account comparisons performed at the nucleotide level of exon 4 (Cronin et al., 1996; Gatesy et al., 1996) for higher ruminants as well as 21 other species including cetaceans, hippo, deer, giraffe, tapir and zebra.

13.2.3 Molecular Diversity of Caseins: Interspecies Variability

In addition to differences in primary structure across species, examined above, which reflect changes at the genomic level within coding sequences and/or flanking sequences (splice site consensus sequences), further sources of variation occur at a post-transcriptional level. They affect mainly the processing of the primary transcripts and post-translational modifications (PTM) such as phosphorylation (all caseins) and glycosylation (κ-casein). The extent of this variability and the complexity of the specific pattern within each species provide further criteria of distinctiveness between species.

Fig. 13.7 Multiple alignment of the amino acid sequence of κ-casein from 12 mammalian species. Abbreviations and accession numbers are given in parentheses: cow (bov, M36641), water buffalo (buf, AJ011387), sheep (she, X51822), goat (goa, X60763), camel (cam, Y10082), pig (pig, X51977), rabbit (rab, Z18243), guinea pig (cav, X56020), rat (rat, K02598), mouse (mou, M10114), horse (hor, NM_001081884 (NCBI) and P82187 (UniProtKB, Expasy)) and human (hum, M73628). Peptide sequences are split into blocks of amino acid residues to visualise the exonic modular structure of the protein, as deduced from known splice junctions of the bovine (Alexander et al., 1988), the rabbit (Baranyi et al., 1996) and the human (Edlund et al., 1996) genes. Red boxes identify highly conserved amino acid residues (>10/12) between species. Italics correspond to the signal peptides, of which the cleavage site is indicated by the vertical blue arrow. Spaced dashes are inserted gaps introduced to maximise the alignment. The chymosin-sensitive bond is indicated by the green vertical arrow. X corresponds to the basic motif of the repeated sequence (underlined) in the guinea pig. The platypus (pla, FJ548626) sequence is also given, taking into account the structural organisation of the gene (Lefèvre et al., 2009)

13.2.3.1 Defects in the Processing of Primary Transcripts: Splice Variants

Two kinds of events may arise during the processing of the primary transcripts, both leading to a shortening of the peptide chain length. The first event, referred to as cryptic splice site usage, primarily due to a “slippage” of the splicing machinery, is induced by a “favourable” junction sequence. As far as caseins are concerned, this defect in accuracy leads to the loss of the first codon (usually a CAG) of the 3′ exon. The second event, which is particularly well exemplified by small ruminants, gives rise to a casual alternative exon
skipping (sometimes referred to as non-allelic exon skipping). It is thought to be caused by weaknesses in the consensus sequences, either at the 5′ and/or 3′ splice junctions or at the branch point, or both.

Casual Usage of Cryptic Splice Sites

The casual deletion of a glutaminyl residue (Gln78 or Q78), first detected in goat αs1-casein (Leroux et al., 1992), seems to be a rather frequent phenomenon, occurring in most of the species examined so far. This codon skipping, which is likely due to an erroneous 3′ cryptic splice site usage when exons 10 and 11 are joined, has been found in the four major ruminant species (Ferranti et al., 1997, 1999). It is worth noting that this kind of event may play a functional role in the structure and stability of casein micelles, since Gln78 is located at the junction between the polar cluster of phosphoseryl residues and the hydrophobic domain of the protein. Interestingly, such a cryptic splice site event occurs not only in ruminants. In the human αs1-casein transcript, in which exon 11 is lacking, a glutaminyl residue (Q37 in the mature protein), encoded by the first codon of exon 6′, an additional exon found in the intron bridging exon 6 to exon 7 (Martin and Leroux, unpublished results), was also shown to be casually absent (Johnsen et al., 1995; Martin et al., 1996).

Examples of glutaminyl residue insertion/deletion in protein, due to cryptic splice site usage, are well-documented (Condorelli et al., 1994; Vogan et al., 1996; Hayashi et al., 1997), and they have been shown recently to occur also with other calcium-sensitive casein pre-messengers and for species other than humans and ruminants. Boumahrou et al. (2011) reported in mice the loss of the first codon (CAG) in exons 6 and 8¹ of the Csn2 (Q26²) and Csn1s1 (Q44²) genes, respectively. This loss in accuracy of the splicing machinery would be due to the nucleotide sequence at the intron-exon junction. The mechanism by which the 3′ splice site AG is accurately and efficiently identified involves a 5′-to-3′ scanning process. The first AG downstream from the branch point-polypyrimidine tract is preferentially selected. A second AG, competitive with the proximal one, can be used alternatively (Smith et al., 1993). Starting the exon sequence with a CAG (coding for a glutaminyl residue) would be a facilitating situation. The short size of the intron might be an enhancing factor. Indeed, introns 6 and 10, involved in human and ruminant genes encoding αs1-casein, are 150 bp (Martin et al., 1996) and 100 bp long (Koczan et al., 1991; Leroux et al., 1992), respectively. Likewise, in mice, upstream introns are 81 (Csn2, exon 6) and 84 bp (Csn1s1, exon 8) long, respectively.

“Species-Specific” Casual Exon Skipping

Structural characterisation of caseins and/or analyses of relevant mRNA have enabled the identification, in the four ruminant species, of multiple forms of αs1-casein. However, the extent of this heterogeneity depends on the species. Whilst αs1-casein phenotypes consist of a mixture of two forms (199 and 198 amino acid residues) in cattle and water buffalo, due to the alternative deletion of Q78, there are, in sheep and goats, at least seven molecular forms which differ in their peptide chain length, regardless of genetic polymorphism (Ferranti et al., 1997). The main component corresponds to the 199-residue form initially described in goat milk (Brignon et al., 1989). The others, in lower amounts, are shorter forms of αs1-casein differing in deleted sequences (residues 110–117 and/or 141–148). Genomic and mRNA analyses demonstrated that these forms originated from exon-skipping events affecting exon 13 (encoding peptide 110–117) and/or exon 16 (encoding peptide 141–148) during the processing of the primary transcripts. Deletions of peptide 110–117, which contains 4 charged residues (SP115, E110, E117 and K114), and of peptide 141–148, which contains only one (E141), produce proteins with a different net charge. The protein lacking sequence 141–148 is the only one to date identified and localised in isoelectric focusing in sheep (Chianese et al., 1996).

¹ Exon 8 if we adopt, for comparative purposes, a generic exon numbering valid for all species for which the CSN1S1 gene sequence is known. Otherwise it is exon 7 in the mouse gene.
² Referring to the mature protein sequence.
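The effect of using the distal AG of a leading CAG can be shown with a toy sketch. It is only an illustration under stated assumptions: the exon and intron sequences, the reduced codon table and the function names `splice` and `translate` are hypothetical and do not reproduce any casein gene sequence.

```python
# Toy model of cryptic 3' splice-site usage at an exon starting with CAG.
# All sequences below are invented for illustration only.

# Reduced codon table: just the codons used in this example.
CODON_TABLE = {"TCT": "S", "GAA": "E", "CAG": "Q", "AAA": "K", "CTG": "L"}

def translate(seq):
    """Translate an in-frame nucleotide sequence codon by codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def splice(upstream_exon, intron, downstream_exon, use_cryptic_site=False):
    """Join two exons across an intron.

    With use_cryptic_site=True, the AG of the leading CAG of the downstream
    exon is used as acceptor instead of the AG that ends the intron, so the
    first three nucleotides (CAG) of the downstream exon are lost.
    """
    if not intron.endswith("AG"):
        raise ValueError("intron must end with the canonical AG acceptor")
    if use_cryptic_site:
        if not downstream_exon.startswith("CAG"):
            raise ValueError("no distal AG directly after the junction")
        downstream_exon = downstream_exon[3:]
    return upstream_exon + downstream_exon

upstream = "TCTGAA"               # hypothetical exon end: Ser-Glu
intron = "GTAAGTCTTTTCTTTTAG"     # hypothetical short intron ending in AG
downstream = "CAGAAACTG"          # hypothetical exon start: Gln-Lys-Leu

normal = translate(splice(upstream, intron, downstream))
cryptic = translate(splice(upstream, intron, downstream, use_cryptic_site=True))
print(normal, cryptic)
```

Because exactly three nucleotides are removed, the reading frame is preserved and the two products differ only by the single glutaminyl residue, which matches the 199- and 198-residue forms described for cattle and water buffalo.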

The alternative splicing of exon 13 and/or 16 reported for small ruminant species has not been detected in cattle and water buffalo (Ferranti et al., 1999). Such differences in the processing of αs1-casein pre-messengers in these closely related species are amazing and hard to explain, since none of the mutations (substitution or deletion) identified between small ruminants and cattle affects consensus splice sites (Leroux 1992). However, according to Passey et al. (1996), a substitution within the donor splice site could be responsible for this casual skipping, estimated to affect 20% of the total ovine αs1-casein mRNA, through the formation of an inhibitory RNA secondary structure. Long and short variants of αs1-casein which differ by the presence or absence of a stretch of 8 amino acid residues encoded by exon 16 have been observed in camel milk also (Kappeler et al., 1998).

This phenomenon is clearly not restricted to ruminants. The existence of three αs1-casein transcripts has been reported in human mammary tissue (Johnsen et al., 1995; Martin et al., 1996). This heterogeneity is due to a differential splicing of exon 7 (bovine gene numbering) and to the usage of a cryptic splice site. Likewise, porcine αs1-casein also shows such heterogeneity (Alexander et al., 1992) with multiple forms differing by internal deletion. However, exons (12 and 13′, using the bovine gene numbering) showing such a casual alternative splicing are different from those reported for the other species. Exon-skipping events affecting exons 9, 10, 16.3,³ 16.14³ and 17 have been reported recently in mice (Boumahrou et al., 2011).

³ Refers to tandem repeats of exon 16, using the generic exon numbering valid for all species.

Multiple forms arising from casual alternative splicing have been reported also in ovine αs2-casein (Boisnard et al., 1991). Two non-allelic forms of αs2-casein differing by an internal deletion of nine amino acid residues at positions 34–42 in the peptide chain have been found in ovine milk. Analysis of the products obtained by reverse transcription of mRNAs has shown greater heterogeneity of αs2-casein transcripts. In addition to the expected deletion of codons (34–42) affecting 30–40% of mRNA, another structural difference involving an internal stretch of 44 nucleotides in the 5′-UTR has been reported subsequently to be caused by casual exon skipping.

Exon skipping has therefore to be considered as a frequent event, mainly in the case of αs1- and αs2-casein genes, for which the coding region is divided into many short exons. However, since these pioneering works, many additional examples have been found, and the existence of an intronic cis-element (intronic splicing enhancer) increasing the inclusion of “weak” exons or influencing cryptic splice site usage has been reported in the equine β-casein gene (Lenasi et al., 2006). Do those deletions in calcium-sensitive caseins simply reflect the lack of accuracy of an intricate processing mechanism whenever mutations induce conformational modifications of pre-mRNA, preventing or enhancing the normal progress of events? Notwithstanding, these phenomena are mainly responsible for the great complexity of casein composition.

Genetic Polymorphisms Increase Casein Heterogeneity in Peptide Chain Length

In addition to casual exon skipping, genetic polymorphism of milk proteins may sharply increase the heterogeneity of caseins in milk. Studies performed on goat milk have reported extensive genetic polymorphism of αs1-casein with at least 15 alleles at the goat αs1-casein (CSN1S1) locus, distributed in seven different classes of protein variants associated with four levels of expression (Bevilacqua et al., 2002). αs1-Casein A, B, C and E variants differ from each other in amino acid substitutions, whilst αs1-casein variants F and G, which are associated with a low level of protein synthesis, are internally deleted (Martin 1993; Martin and Leroux 1994). The establishment of the overall organisation of the goat αs1-casein gene (19 exons scattered along 17 kb), the characterisation of allele F (Leroux et al., 1992) at the genomic level as well as the analysis of their transcription products demonstrated that the internal deletion of 37 amino acid residues, occurring in variant F, arises from the outsplicing of three consecutive exons (9, 10 and 11), skipped en bloc during the processing of the primary transcripts

(Fig. 13.9). Furthermore, the CSN1S1*F allele was shown to yield multiple alternatively spliced transcripts, amongst which were transcripts lacking 24 nucleotide-long sequences encoded by exons 13 and 16 (Leroux et al., 1992). By comparison with the non-defective CSN1S1*A, B and C alleles, a reduction in the amount of mRNA, due to mRNA decay and therefore accounting for lower αs1-casein content in milk, was observed. A single point deletion in exon 12 of the αs1-casein gene, leading to truncated proteins and hence a low content of αs1-casein in the milk, has been described as being unique to the Norwegian goat population (Hayes et al., 2006).

Likewise, a single-nucleotide deletion resulting in a premature stop codon is associated with a marked reduction in the amount and an extensive heterogeneity of transcripts from the goat β-casein (CSN2) null allele in Créole and Pyrenean breeds (Persuy et al., 1996, 1999). These authors have shown the occurrence of multiple and shorter transcripts which differ from their full-length counterparts in large nucleotide stretches that were missing in exon 7. Four in-frame nonsense codons, due to a one-nucleotide deletion, were found in the CSN2*0 allele and the cognate mRNA. Another β-casein null allele identified in a Neapolitan goat breed (Chianese et al., 1993) was shown to differ from the wild type by a transition C→T, affecting codon 157 in exon 7 (Rando et al., 1996). The resulting premature termination codon is associated with a tenfold decrease in β-casein mRNAs. However, data are lacking about the possible occurrence of multiple mRNAs.

As regards both CSN1S1*F (as well as the Norwegian CSN1S1 allele) and CSN2*0, mutations are responsible for the existence of premature stop codons, associated with a decrease in the relevant level of transcripts and responsible for the presence of multiple forms of messengers, due to alternative splicing. Many reports (reviewed by Valentine 1998) have drawn attention to a possible relationship between nonsense codons and exon skipping. Indeed, some genes containing premature stop codons express alternatively spliced mRNA in which the exon containing the nonsense codon has been skipped. Amongst the hypotheses proposed to explain such a safeguard mechanism, one can mention “nuclear scanning” which recognises nonsense codons and then has an effect on exon definition (Zhang et al., 1998a, b). This raises the question: How is a normal termination codon (which does not usually mediate a reduction in the abundance of mRNA) distinguished from a premature stop codon? A rule has been proposed for the position of the termination codon. According to Nagy and Maquat (1998), it must be located less than 50–55 nucleotides upstream from the 3′-most exon-exon junction. The normal termination codon in αs1-casein, as well as in β-casein transcripts, is in part or fully encoded by the penultimate exon (exons 18 and 8, respectively) at 43 and 36 nucleotides upstream from the last exon-exon junction, respectively, thus conforming perfectly with the stated rule. Conversely, stop codons identified both with CSN1S1*F and the French and Italian CSN2*0 alleles are located well beyond the 55 nucleotide limit. Therefore, they could be suspected to mediate mRNA decay and promote the occurrence of multiple forms of transcripts.

13.2.3.2 Differences in Post-translational Modifications

Our present knowledge of casein heterogeneity is rather advanced for a large number of species, including cattle, goats and even for, until now, less thoroughly investigated species such as sheep, humans or horse. With the growing resolving power of 2D-electrophoretic techniques, the development of immunochemical procedures, coupled with gel electrophoresis and the increasing usage of mass spectrometry-based proteomics, we now have a clear vision of the complexity of the casein fraction in most of the species studied so far. Genetic polymorphisms remain one of the factors determining casein heterogeneity through alterations in electrical charge, molecular weight and hydrophobicity of proteins. However, other factors such as PTM, including phosphorylation and glycosylation (κ-casein), also contribute significantly. These factors will be examined below.
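The positional rule for termination codons discussed in Sect. 13.2.3.1 can be written as a single comparison. The sketch below is illustrative only: it assumes the upper bound (55 nucleotides) of the quoted 50–55 range as the limit, the function name `read_as_normal_stop` is hypothetical, and the 300-nucleotide distance is an invented example of a stop codon "well beyond" the limit, not a measured value.

```python
# Sketch of the positional rule attributed in the text to Nagy and Maquat (1998).
# LIMIT_NT uses the upper bound of the 50-55 nt range; this choice is an assumption.
LIMIT_NT = 55

def read_as_normal_stop(distance_upstream_nt, limit_nt=LIMIT_NT):
    """Return True when the stop codon lies less than limit_nt nucleotides
    upstream of the 3'-most exon-exon junction."""
    return distance_upstream_nt < limit_nt

examples = {
    "alpha-s1-casein normal stop (43 nt, from the text)": 43,
    "beta-casein normal stop (36 nt, from the text)": 36,
    "premature stop (300 nt, hypothetical value)": 300,
}

for label, distance in examples.items():
    if read_as_normal_stop(distance):
        verdict = "conforms to the rule"
    else:
        verdict = "beyond the limit, candidate for mRNA decay"
    print(f"{label}: {verdict}")
```

Under these assumptions the two normal casein termination codons satisfy the rule, whereas any stop codon placed farther than the limit from the last junction does not.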

Phosphorylation

Phosphorylation of caseins is a post-translational event occurring in the Golgi apparatus and catalysed by specific kinase(s) that recognise an amino acid triplet where the determinants are dicarboxylic residues (mainly Glu) or phosphoseryl residues (Mercier 1981). The occurrence of the tripeptide sequences Ser-X-Glu/SerP is a necessary but not a sufficient condition for phosphorylation of caseins to occur. Possible factors of constraint, such as different intrinsic properties of both phosphate acceptor residues and acidic determinants, the characteristics of the local environment, secondary structure and steric hindrance, or an insufficient available pool of kinase(s), may explain incomplete phosphorylation.

Indeed, unlike milk from various ruminant species, human and equine milks display complex phosphorylation patterns (Poth et al., 2008; Matéos et al., 2009). Whilst bovine β- and αs1-caseins exist predominantly as single phosphoforms, containing 5 and 8 phosphate groups, respectively, other mammals, in contrast, have more variable phosphorylation forms. For instance, equine and human β-caseins have variable phosphorylation levels with 3–7 and 0–5 phosphates, respectively (Girardet et al., 2006; Greenberg et al., 1984). This notion, however, has to be considered cautiously since in ewe’s milk, αs1-casein has been reported to exist as multiple phosphoforms with 7–11 phosphate groups (Mamone et al., 2003). This is also true for β-casein (2–7 phosphates).

Our current knowledge on the phosphorylation level of the main 4 caseins from 12 mammals is gathered in Table 13.2. The greatest amount of data refers to the four widely studied ruminant species, whilst only limited information is currently available regarding the other species, with the exception of human and horse. Experimental data are compared to the theoretical number of sites expected on the basis of Mercier’s rule.

Five variants of ovine αs1-casein (A to E) have been described so far, associated with quantitative variation in casein content (Chianese et al., 1996). The primary structure of three of them, A, C and D (formerly called Welsh variant), has been determined. They differ from each other by a few amino acid substitutions and the degree of phosphorylation (Ferranti et al., 1995). Differences between the three genetic variants, A, C and D, are “silent” substitutions that affect the degree of protein phosphorylation: Variant C differs from variant A for the substitution Ser13→Pro, which determines the loss of a phosphate group at site 12 of the peptide chain, PSer12→Ser; a further substitution, PSer68→Asn, causes the disappearance of the phosphate group on both phosphorylated residues, Ser64 and Ser66, in variant D which is widespread in Italian breeds (Russo and Davoli 1983).

As for other species, ovine αs2-casein appears to be the most heterogeneous fraction due to its high degree of multiphosphorylation, with 9–12 phosphate groups (Mamone et al., 2003).

In sows’ milk, polymorphism of αs2-casein consists of a fast-migrating band with a minor satellite band which is absent from some samples (Erhardt 1989). The author suggested that this was determined by the incomplete phosphorylation of potentially phosphorylable sites which are saturated only in the case of bovine and water buffalo αs0-casein.

Camel milk caseins are less phosphorylated than bovine caseins (Kappeler et al., 1998). Six putative phosphorylation sites (Ser at positions 18, 68 and 70–73) have been identified in camel αs1-casein, with a possible incomplete saturation. Although there are four predicted phosphorylation sites in the β-casein peptide sequence, from molecular mass measurements (MALDI-MS), it was observed that the most frequent form has only three phosphate groups. One deletion that shortens camel αs2-casein, likely due to the skipping of exon 8, is responsible for the loss of the phosphorylated serine cluster Ser56-Ser57-Ser58.

Two phosphorylation sites had been identified to date (SerP151 and SerP168) out of five potentially phosphorylable sites (also including Ser127, Thr135 and Thr137) in the bovine κ-casein peptide chain. Holland et al. (2006), using a proteomic approach (2D-electrophoresis, combined with mass spectrometry) to analyse the casein fraction of milk from a single cow, homozygous for the B variant of κ-casein, have characterised 17 isoforms with different PTM and were able to identify a previously unrecognised site (Thr166) that could be phosphorylated or glycosylated.
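The Ser-X-Glu/SerP triplet described under Phosphorylation lends itself to a small sequence scan for putative sites. The sketch below is an assumption-laden illustration rather than a validated predictor: the function name `putative_phosphoserines` is hypothetical, only Ser acceptors and Glu determinants are considered, the test peptide is a short illustrative sequence modelled on a casein phosphoseryl cluster, and, since the motif is necessary but not sufficient, the output lists candidate sites only.

```python
# Scan a one-letter amino acid sequence for putative sites of the form
# Ser-X-Glu or Ser-X-SerP (Mercier's rule as summarised in the text).

def putative_phosphoserines(seq, acidic="E"):
    """Return 1-based positions of Ser residues whose +2 neighbour is either
    an acidic determinant (Glu by default) or a Ser already flagged as a
    putative phosphoserine. The scan repeats until no new site is found,
    so that secondary (Ser-X-SerP) sites are picked up."""
    sites = set()            # 0-based indices of putative phosphoserines
    changed = True
    while changed:
        changed = False
        for i, residue in enumerate(seq[:-2]):
            if residue != "S" or i in sites:
                continue
            plus2 = seq[i + 2]
            if plus2 in acidic or (plus2 == "S" and (i + 2) in sites):
                sites.add(i)
                changed = True
    return sorted(i + 1 for i in sites)

peptide = "ESLSSSEESITR"     # illustrative peptide, not a reference sequence
print(putative_phosphoserines(peptide))
```

Counts obtained this way correspond to the "putative" figure of a putative/effective pair such as those in Table 13.2; the effective number has to come from experimental data.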

Table 13.2 Main structural features of the caseins from 12 placental mammalian species

| Casein | Feature | Cattle | Water buffalo | Sheep | Goat | Pig | Camel | Horse | Rat | Mouse | Guinea pig | Rabbit | Human |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CSN1S1 | Mature protein (n) | 199 | 199 | 199 | 199 | 191 | 215 | 197 | 187 | 187 | 179 | 197 | 170 |
| CSN1S1 | Signal peptide (n) | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 |
| CSN1S1 | Phosphorylation sites (p/a) | 9/9 | 8/8 | 10/10 | 11/11 | 7/? | 6/? | 10/? | 3/? | 4/? | 12/? | 9/? | 6/0–8 |
| CSN2 | Mature protein (n) | 209 | 209 | 207 | 207 | 217 | 217 | 226 | 216 | 214 | / | 222 | 212 |
| CSN2 | Signal peptide (n) | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | / | 15 | 15 |
| CSN2 | Phosphorylation sites (p/a) | 6/5 | 5/5 | 6/6 | 6/6 | 5/? | 4/? | 9/7 | 11/? | 9/? | / | 4/? | 6/6 |
| CSN1S2 | Mature protein (n) | 207 | 207 | 208 | 208 | 220 | 178 | / | 163 | 128–169 | 209 | 166 | / |
| CSN1S2 | Signal peptide (n) | 15 | 15 | 15 | 15 | 15 | 15 | / | 15 | 15–15 | 15 | 15 | / |
| CSN1S2 | Phosphorylation sites (p/a) | 17/? | 17/? | 17/13 | 16/? | 18/? | 9/? | / | 14/? | 6/?–16/? | 20/? | 7/? | / |
| CSN3 | Mature protein (n) | 169 | 169 | 171 | 171 | 167 | 162 | 185 | 159 | 160 | 213 | 160 | 162 |
| CSN3 | Signal peptide (n) | 21 | 21 | 21 | 21 | 21 | 20 | 21 | 21 | 21 | 21 | 21 | 20 |
| CSN3 | Phosphorylation sites (p/a) | 5/3 | 5/3 | 5/3 | 6/3 | ?/? | ?/? | 2S+6T/? | 2/? | 2/? | 9/? | 5/? | 5/? |

For each species, the number of amino acid residues of the mature chain and of the signal peptide, as well as the number of phosphorylation sites (putative/effective), are indicated

Glycosylation

Ovine κ-casein is O-glycosylated, and Thr residues at positions 156, 158 and 159 have been proposed as putative glycosylation sites (Fiat and Jolles 1989). The oligosaccharide units of ovine κ-casein contain both N-acetyl and N-glycolyl neuraminic acids at all stages of lactation (Jollès and Fiat 1979; Soulier et al., 1980; Soulier and Gaye 1981). The disaccharide, Gal β(1→3) GalNAc, and the tetrasaccharide, Gal β(1→3) [Gal β(1→4) GlcNAc β(1→6) GalNAc], occur in mature ovine κ-casein, whereas defined tetra- and penta-saccharide structures are present in κ-casein from colostrum, indicating an evolution of the sugar moiety as a function of the time after parturition. In a comparative study, the caseinoglycopeptide from ewes’ milk was shown to have greater antithrombotic activity than that of the cow (Bal dit Solier et al., 1996).

The carbohydrate content of κ-caseinoglycopeptide is significantly higher in human (55%) than in bovine (ca. 10%). The monosaccharides, Gal, GalNAc and NeuAc, are common to the κ-casein from both species, whereas Fuc and GlcNAc are specific to human κ-casein (van Halbeek et al., 1985). An increasing number of oligosaccharide structures for human κ-caseinoglycopeptide is available (Saito et al., 1988; Saito and Itoh 1992). Amongst these oligosaccharides, GlcNAc β(1→6) GalNAc and GalNAc β(1→4) GlcNAc β(1→6) GalNAc represent novel types of core structures for mucin-type carbohydrate chains (Fiat and Jolles 1989).

Although the primary structure of κ-casein from several species is available, glycosylation sites are essentially documented for bovine and human κ-caseins (Pisano et al., 1994). Ser residues were not glycosylated in bovine or human κ-casein (Fiat et al., 1980), whereas nine out of ten putative Thr-containing consensus sequences of human κ-casein are actually glycosylated. Potentially, bovine κ-casein could have up to 12 NeuAc residues if all six glycosylation sites (only five in the B variant) were modified with the major tetrasaccharide NeuAcα(2–3)Galβ(1–3)[NeuAcα(2–6)]GalNAc, identified in κ-casein. However, using 2D-electrophoresis coupled with mass spectrometry, only fragment masses corresponding to a maximum of four NeuAc residues were observed in the multiply glycosylated forms (Holland et al., 2004a, b). On the other hand, the core disaccharide, Galβ(1–3)GalNAc, to which the NeuAc residues are attached, appeared to be relatively stable, and fragment ions with up to three disaccharides were observed allowing the number of oligosaccharides attached to be determined.

13.2.4 Impact on Micelle Organisation

Although it is not within the scope of this chapter, one cannot ignore the effect of casein structure and diversity on the characteristics and behaviour of the casein micelle. Interspecies comparison, as well as genetic polymorphisms (Martin et al., 1999), must be considered as a valuable tool for probing the overall organisation of the casein micelle and extending our understanding of the mechanisms involved in its formation. The relative proportions of caseins, which are a specific trait, and their intrinsic characteristics (Table 13.2), essentially determined by their primary structure, are amongst the many factors that will determine, in each species, the average size (diameter and size distribution) of the casein micelle, its surface charge and hydrodynamic radius, its hydration and mineral content.

Several experiments (Heth and Swaisgood 1982; Donnelly et al., 1984; Dalgleish et al., 1989) have led to the conclusion that the average size of the micelle increases as the proportion of κ-casein decreases. This finding is confirmed through interspecies comparisons, using the freeze-fracture technique (Buchheim et al., 1989). Camel and human milks, with a low (3.5%, Kappeler et al., 1998) and a high (17%, Dev et al., 1994; Miranda et al., 2000) content in κ-casein, respectively, display larger (up to 600 nm, with an average around 350 nm, as for the llama) and smaller (64 nm) casein micelles, although the higher mineral content of camel milk might also play a significant role in this regard. It seems that the casein micelles in milk from different species have a similar ultrastructure but with considerable differences in the size distribution. Bloomfield (1979) provided the

most complete theory for the origin of a broad size distribution of micelles using the model of Slattery and Evard (1973), based on the variations in subunit composition. It is clear that the occurrence of multiple molecular species of each casein, differing in their length, their sequence and their level of phosphorylation (e.g. human casein micelle; Dev et al., 1994), further complicated by genetic polymorphisms more or less pronounced within each species (e.g. the goat), is not without any consequence, as far as this issue is concerned.

In contrast, the relationship between micellar size and the proportions of the “calcium-sensitive” caseins is less clearly established. A few contradictory data are available for bovine casein micelles. For example, Davies and Law (1983) found that αs1-casein is present in about the same proportion in micelle fractions isolated by differential centrifugation, whereas αs2-casein increased slightly with ease of sedimentation. Conversely, Donnelly et al. (1984), using a chromatographic approach, reported that both αs-caseins are present in greater proportions in the larger micelles. Once again, goats, with the complex (including quantitative) genetic polymorphism described at several casein loci (namely, αs1, β and κ), provide a valuable tool to address this issue. Indeed, deficiency of αs1-casein, together with changes in the primary structure, has been shown to be responsible for the variability in micellar diameter (Remeuf 1993; Grosclaude et al., 1994; Pierre et al., 1995). One should therefore no longer address “the” micelle structure as a singular issue but rather consider that the broad size distribution generally observed might reflect the extensive diversity of molecular species arising from the expression of each of the four (or five) casein genes.

13.3 Whey Proteins

There are a significant number of proteins in the whey, synthesised in the mammary tissue or not, for which the functions are not fully understood. We will discuss briefly below variation across species, taking into account quantitative as well as structural and biological aspects of selected major whey proteins.

Variation in the milk protein gene copy number potentially contributes to the diversity of milk protein composition (Lemay et al., 2009). This is particularly well exemplified by the gene encoding β-lactoglobulin (BLG) which is one of the major whey proteins of ruminant species, apparently absent in human, camel, rabbit and rodents (Sawyer 2003). Surprisingly, another major whey protein, the whey acidic protein or WAP, is frequently found instead of BLG. However, the presence of WAP in human milk has still to be demonstrated. On the other hand, both proteins are found in marsupial and monotreme milks (Fig. 13.8). Up to now, swine is the only eutherian species for which WAP has been identified together with BLG (Simpson et al., 1998).

13.3.1 β-Lactoglobulin: A Singular and Enigmatic Whey Protein

BLG has been studied extensively across species and a comprehensive review was published a few years ago (Sawyer 2003; see also Chap. 7). Despite relatively weak sequence homologies, multiple alignments reveal some striking features and short peptide sequences precisely conserved across species, including marsupials (Fig. 13.9). The genomic organisation of the gene encoding BLG is highly conserved across species, with seven exons, encompassing a ca. 5 kb genomic segment, located on chromosome 11 in cattle. With the release and the assembly of the Bos taurus genome, it appeared that the gene encoding BLG is duplicated in cattle as it is in dog and horse genomes. The duplicated gene, first described in cattle as a pseudogene (Passey and MacKinlay 1995), shows similarities to BLG-II genes identified in the horse and cat (Lear et al., 1999; Pena et al., 1999). However, there is no evidence for its expression in the bovine mammary gland (Lemay et al., 2009), thus being without any effect on the concentration of BLG in bovine milk. On the other hand, mutations within BLG-I

gene, including its promoter region, seem to impact BLG expression level.

Fig. 13.8 Mammalian phylogeny and the presence of WAP/BLG in milk. Genera are given in red. NWM corresponds to New World Monkeys, whereas OWM are related to Old World Monkeys. Separate symbols indicate the presence of WAP in milk, the presence of β-lactoglobulin, the occurrence of several copies of the gene encoding β-lactoglobulin in the genome, and the absence of β-lactoglobulin

Fig. 13.9 Amino acid residues conserved in β-lactoglobulin sequences of platypus, tammar wallaby, brushtail possum, horse, cattle, sheep, goat and pig

The amino acid sequence and 3-dimensional structure of BLG show that this protein belongs to the widely diverse lipocalin superfamily, the members of which share relatively low sequence similarity but have a highly conserved exon/intron structure and three-dimensional protein folding. Most of them bind small hydrophobic ligands and thus may act as specific transporters, as does

serum retinol binding protein. Bovine BLG binds a wide range of ligands, but this may not be the reason for its presence in milk. The structure and physicochemical properties of the protein have been reviewed by Kontopidis et al. (2004). The apparent ability of the binding site to accommodate a wide range of ligands may point to a possible physiological function. However, by considering the lipocalin family, in general, and the species distribution of BLG in particular, some speculation can be made. It has been reported as being implicated in hydrophobic ligand transport and uptake, enzyme regulation and the neonatal acquisition of passive immunity. However, these functions do not appear to be consistent between species. Sequence comparisons amongst members of the lipocalin family reveal that glycodelin (also known as PP14, placental protein 14 or PAEP, progestogen-associated endometrial protein), found in the human endometrium during early pregnancy, is the most closely related to BLG. Although the function of glycodelin is not fully elucidated, it appears to have essential roles in regulating a uterine environment suitable for pregnancy and possibly to have effects on the immune system and/or to be involved in differentiation (Kontopidis et al., 2004).

Several polymorphic variants of BLG are known in cattle (Farrell et al., 2004; Table 13.1), but the most frequent two (A and B) were shown to be associated with differences in milk protein yield and composition. These variants differ by two amino acid substitutions in the polypeptide chain arising from two single-nucleotide substitutions in the BLG-I gene: Asp 64 (GAT)→Gly (GGT) and Val 118 (GTC)→Ala (GCC). The latter T→C transition creates a HaeIII restriction site, thus enabling a restriction fragment length polymorphism analysis at the BLG locus (Medrano and Aquilar-Cordova 1990).

Quantitative effects of these common variants on milk composition and cheesemaking properties have been reported (Aleandri et al., 1990). Allele B of BLG is associated with high casein and fat contents in cows’ milk, whilst Holstein cows with AA genotype at the BLG locus were shown to produce milk containing more whey and total proteins than those of the other genotypes. This question and the impact of milk protein variants, including BLG, on milk composition have been studied extensively (for a review, see Martin et al., 2002 and Chap. 15). Two studies dealing with this issue in cattle have been published recently (Heck et al., 2009; Hallén et al., 2008). In addition, a higher expression of allele A has been described in heterozygous (AB) animals (Graml et al., 1989). This differential allelic expression has been explained by nucleotide differences in the promoter regions associated with these two alleles. Wagner et al. (1994) identified 14 single-nucleotide polymorphisms (SNP) within the 5′-flanking region and two in the 5′-UTR of exon 1 of the bovine BLG gene. Some of them are located in potential binding sites for trans-acting factors or in the 5′-UTR. Sequences of the 5′-flanking regions and BLG genotypes suggest that alleles A or B in the coding regions were connected with distinct promoter variants. Such intragenic haplotype associations may explain the observed differences in the effects of A or B variants of BLG on milk production traits, particularly on BLG synthesis (A>B) in heterozygous cows (Graml et al., 1989).

By sequence analysis of the 5′-flanking regions of the milk protein-encoding genes, altogether 65 variable sites have been revealed by Geldermann et al. (1996). Sixty of these sites were base substitutions, and five were deletions/insertions. About 50% of the variable sites were located in potential protein binding sites, identified by computer-aided analysis. In cell culture tests, the investigated promoter variants led to different reporter gene expression. In the case of the BLG encoding gene, the promoter variant of the BLG*A allele produced up to 3.5 times greater expression of a reporter gene than the promoter associated with the BLG*B allele. Folch et al. (1999) also showed differential expression of a reporter gene fused to bovine BLG*A or B promoters in transiently transfected HC11 cells, the A promoter driving more efficient expression of the reporter than the B (57% vs. 43%).

More recently, Braunschweig and Leeb (2006) have shown the existence of a C to A transversion at position 215 bp upstream of the translation initiation site (g.-215C>A), segregating perfectly with a differential phenotypic expression of two

BLG*B alleles (B and B*). The sequence of the BLG*B allele in the region of the mutation is highly conserved amongst four related ruminant species. The mutation site corresponds to a putative consensus-binding sequence for transcription factors c-Rel and Elk-1. These results support the hypothesis according to which sequence variation within the promoter of the BLG gene is probably one of the factors responsible for differences in BLG content in milk.

13.3.2 Whey Acidic Protein (WAP)

Whey acidic protein has been identified in the milk of only a few mammalian species, including mouse, rat, rabbit, camel, pig, tammar wallaby, brushtail possum, echidna and platypus, but it is absent from ruminant milks due to a frameshift mutation in the WAP encoding gene (Hajjoubi et al., 2006). The three ruminant WAP sequences have the same deletion of a single nucleotide at the end of the first exon when compared with the pig sequence. Due to the induced frameshift, the putative proteins encoded by these sequences do not harbour the features of a usual WAP protein with two four-disulfide core (4-DSC) domains, each of approximately 50 amino acids and containing eight cysteine residues in a conserved arrangement (Hennighausen and Sippel 1982; Ranganathan et al., 2000). Moreover, RT-PCR experiments have shown that these sequences are not transcribed. This loss of functionality of the gene in ruminants raises the question of the biological role of the WAP.

The WAP proteins share limited amino acid sequence identities with the exception of these cysteine residues (at least one, usually two 4-DSC domains in eutherian and even three in metatherian mammals) and positional conservation of several proline (P), glutamic acid (E), aspartic acid (D) and lysine (K) residues (Simpson and Nicholas 2002). Unlike the eutherian WAP sequences, marsupial WAPs display a conserved motif (KXGXCP) at the beginning of each 4-DSC domain (Fig. 13.10). However, currently no functional significance has been ascribed to this motif, although it is proposed to be important for correct folding of the protein (Ranganathan et al., 2000). The presence of 4-DSC domain sequences on chromosome 20 (WFDC2 or HE4 protein) within the human genome raises the possibility (not yet demonstrated) that a secreted WAP protein may be present in human milk.

WFDC2/HE4 protein is a small secretory protein shown to function as an anti-proteinase (protease inhibitor) involved in the innate immune defence of multiple epithelia (Bingle et al., 2006). The relevant gene is highly expressed in pulmonary epithelial cells and in saliva, and was also found to be expressed in some ovarian cancers and epididymis.

Fig. 13.10 Schematic representation of the structure and evolution of the whey acidic protein (WAP) gene (adapted from Sharp et al., 2007). (a) The ancestral progenitor is depicted by six exons (boxes) numbered from 1 to 6. Coloured boxes represent exons (2–5) encoding 4-DSC domains, whilst black boxes represent exons encoding the signal peptide (SP) and the N-terminal part (N) of the mature protein (exon 1) and the C-terminal part (C-ter) of the protein (exon 6). In eutherian WAP, 4-DSC domains (DI and DIIa) are encoded by exons 3 and 4. The two 4-DSC domains of echidna WAP are encoded by exons 2 (DIII) and 4 (DIIa). Marsupial and platypus WAPs comprise three 4-DSC domains in different configurations: DIII-DI-DIIb and DIII-DIIa-DIIb, respectively. Evolution of WAP genes in mammalian species would have proceeded step by step by loss of exon to lead to the present WAP genes in monotremes (platypus and echidna), marsupial and placental mammals. (b) Alignment of eutherian, marsupial and monotreme WAP sequences shows conservation of protein structure. The 4-DSC domains represented by 8 cysteine residues (C) are highlighted with a pink background in each domain (exon). Highly conserved residues are highlighted with a yellow background. Mouse (P01173), rat (P01174), pig (O46655), camel (P09837) and rabbit (P09412) for eutherian sequences have been aligned with brush-tailed possum (Q95JH3), tammar wallaby (Q9N0L8), platypus (A7J9L3) and echidna (A7J9L2) to maximise similarity within exons depicted by boxes for which the colour code of (a) has been retained
[Fig. 13.10a, 4-DSC domain order by lineage: ancestral DIII-DI-DIIa-DIIb; platypus (monotreme) DIII-DIIa-DIIb; echidna (monotreme) DIII-DIIa; marsupial DIII-DI-DIIb; eutherian DI-DIIa]

The organisation of eutherian WAP genes is highly conserved and composed of four exons with exon 1 encoding the 5′-UTR, signal peptide and first 8–10 amino acids of the mature protein. Exons 2 and 3 encode the two 4-DSC domains,

and exon 4 encodes the last 8–10 amino acids of the protein and the 3′-UTR. Whilst the size of each exon remains rather conserved between species, intron size varies considerably. Exon 3, encoding 4-DSC domain II, has the highest degree of sequence conservation between species. It was proposed to be the primordial domain, with domain I likely to have arisen by intragenic duplication (Simpson and Nicholas 2002).

A third 4-DSC domain encoded by an additional exon has been identified in marsupial WAP, as well as in platypus (Sharp et al., 2007; Topcic et al., 2009), whereas the WAP gene structure of echidna is different and closer to that of the WFDC2 gene, with only two 4-DSC domains (Fig. 13.10). It is possible that domain III of the marsupial WAP gene may be the ancestral gene, which was subsequently lost during evolution in eutherian species. Sharp et al. (2007) suggest that the evolution of the WAP gene in the mammalian lineage may be either through exon loss from an ancient ancestor or by rapid evolution via the process of exon shuffling.

Whereas eutherian WAP is expressed in the mammary gland throughout lactation, marsupial WAP is expressed only during mid-late lactation. This transient expression pattern in marsupials has to be correlated with a short gestation giving birth to an immature young followed by a long lactation during which milk progressively changes in composition to suit the requirements of the developing young. This suggests that WAP may play a role in the development of the mammary gland or influence development of the young (Sharp et al., 2007). Interestingly, tammar mammary gland was shown to express strongly a second WAP-like protein (WFDC2) during pregnancy, and at a reduced level in early lactation before it disappears in mid-late lactation. These different temporal expression patterns of WAP and WFDC2 suggest they play complementary roles.

13.3.3 α-Lactalbumin

Amongst the main milk proteins, α-lactalbumin (LALBA) is so far the only one with enzyme-related activity. Together with β-1,4-galactosyltransferase, it forms the lactose synthase complex which catalyses the formation of lactose from glucose and UDP-galactose (Brew and Hill 1975) in the Golgi apparatus of MEC. α-Lactalbumin was shown to be a calcium metalloprotein, in which the calcium ion has an unusual role in folding and structure (Hiraoka et al., 1980).

Most of the molecular structure and function was known and extensively reviewed by Brew (2003; see also Chap. 8), but this protein has experienced a renewal of interest with “HAMLET” (human alpha-lactalbumin made lethal to tumour cells), a partially unfolded α-lactalbumin, which acquires, when it binds oleic acid, a tumoricidal function (Pettersson-Kastberg et al., 2009).

α-Lactalbumin is present in the milk of almost all species of mammals, except some Otariidae (Arctocephalus pusillus: Cape fur seal), the milk of which is rich in fat (more than 20% long-chain fatty acids, mainly unsaturated) and devoid of lactose and α-lactalbumin (Dosako et al., 1983). The female fur seal modulates its lactation by turning milk production “on” and “off” without regression and involution of the mammary gland (Sharp et al., 2006). After undergoing a perinatal fast of 2–3 days suckling pups on shore, the mother leaves her young and the colony to forage at sea for 3 weeks to replenish body stores, during which time her mammary gland remains active without initiating involution, demonstrating an apoptotic function for α-lactalbumin (Sharp et al., 2008). This apoptotic potential, which is consistent with observations made on α-lactalbumin-deficient mice (Stinnakre et al., 1994), has to be compared with the tumoricidal function of HAMLET.

The gene structure and the protein sequence are highly conserved across species, with more divergence in rodents than in primates (Brew 2003). Due to its prominent role in milk synthesis, α-lactalbumin is considered to be a valuable genetic marker for milk production traits in cattle. However, α-lactalbumin appears rather weakly polymorphic in cattle (3 variants) and sheep (2 variants). After screening at the protein, as well as the nucleotide level, few mutations have been found, mainly within the regulatory sequences of the gene.

Bleck and Bremel (1993b) sequenced the 5′-flanking region of the α-lactalbumin gene in cattle. Three SNPs occurring at positions +15, +21 and +54 relative to the mRNA transcription start point were identified within a ca. 2-kb fragment including 1,952 bp of 5′-flanking region and 66 bp of the protein-coding region. The +15 and +21 variations occurred in the 5′-UTR of the mRNA, whereas the +54 polymorphism is a silent mutation in the signal peptide-coding region of the gene. A transition A→G at position +15 was shown to occur only in the Holstein breed (Bleck and Bremel 1993a) and to be associated with an increased milk yield. Cows with the A allele of the LALBA gene had higher milk yield, protein yield and fat yield; the B allele was associated with higher percentage of protein and fat. These data suggest that although not located in the gene promoter, this SNP potentially alters α-lactalbumin expression at the translational level and may be associated with differences in milk yield.

In addition to SNP/+15, a second SNP (also a transition A→G), located at position −1,689 from the transcription start point, was identified (Voelker et al., 1997). The allele showing an A at position −1,689 was designated as allele A, and that with a G at this position was designated allele B. The −1,689 and +15 polymorphisms were compared within the Holstein population to determine their linkage relationship. In this study, the +15 A variant was always linked to variant A at −1,689. These results suggest the existence of a haplotype A (+15A and −1,689A) associated with higher milk, protein and fat yields.

13.3.4 Lysozyme

Lysozyme is a bactericidal enzyme, structurally related to α-lactalbumin, sharing 40% similarity (Qasba and Kumar 1997). Ranging in molecular mass between 14 and 18 kDa, this enzyme, also called β-1,4-N-acetylmuramidase, cleaves a glycosidic linkage in the peptidoglycan component of bacterial cell walls, resulting in a loss of cell wall integrity and cell lysis. The concentration of lysozyme in milk varies from 1 to 3 mg/L in bovine milk to 400 (in human) and even 800 mg/L in mares’ milk (Farkye 2003; Miranda et al., 2004).

13.3.5 Lactoferrin

Lactoferrin (LTF) is of mammary origin and is found in the milk of most species (Schanbacher et al., 1993; see also Chap. 10). LTF is an iron-binding glycoprotein with a molecular mass around 80 kDa, belonging to the transferrin family, that is expressed and secreted by epithelial cells and found in the secondary granules of neutrophils from which it is released in infected tissues and blood during the inflammatory process. Initially described as an iron-binding molecule with bacteriostatic properties, LTF is now known to be a multifunctional or multitasking protein with multiple biological activities (Ward et al., 2002; Vogel et al., 2002). It is a major component of the innate immune system of mammals. Its protective effects range from direct antimicrobial activities against a large range of microorganisms including bacteria, viruses, fungi and parasites, to anti-inflammatory and anti-cancer activities. Whilst iron chelation is central to some of the biological functions of LTF, other activities involve interactions of LTF with molecular and cellular components of both hosts and pathogens (Legrand et al., 2008).

The internal structure of LTF is highly conserved and is dedicated to binding iron. On the other hand, the external structure (its molecular surface) is much more variable across species, making it more difficult to identify functionally important sites. Recent work shows that the cationic N terminus and associated lactoferricin domain on the N-lobe of LTF, in addition to its role in antibacterial activity and probable role in DNA binding, is also involved in complex formation with other proteins. Finally, it may be time to re-examine the importance of glycosylation, given the growing evidence that many pathogens depend on binding to glycans for pathogenesis (Baker and Baker 2009).

The overall structural organisation of the human, mouse, cattle, dog and horse LTF genes

is rather well conserved, at least in terms of size and number of exons (n = 17). Indeed, LTF is encoded by an approximately 30-kb gene (ranging in size between 23.5 kb in mice and 33.4 kb in cattle), located on chromosome 3 in human, 9 in mice and 22 in cattle (Le Provost et al., 1994).

A total of 60 LTF nucleotide sequences with the complete coding regions (CDS) and corresponding amino acids belonging to 11 species were analysed recently, and differences within and across species were studied (Kang et al., 2008). The length of the LTF cDNA with the complete CDS varies greatly, from 2,055 to 2,190 bp, due to deletion, insertion and stop codon mutation, resulting in elongation. Observed genetic diversity was higher across species than within species, and Sus scrofa had more polymorphisms than any other species. Novel amino acid variation sites were detected within several species (8 in Homo sapiens, 6 in Mus musculus, 6 in Capra hircus, 10 in Bos taurus and 20 in Sus scrofa), illustrating functional variation.

13.4 Milk Fat Globule Membrane Proteins

Fat is present in milk as droplets of apolar lipids surrounded by a complex membrane, derived from the MEC, called the MFGM. MFGM has a complex tripartite structure comprising a monolayer membrane derived from the endoplasmic reticulum (ER) surrounded by a bilayer membrane arising from the plasma membrane of the MEC. Hence, the composition of MFGM reflects those of endoplasmic reticulum and plasma membranes. Using high-performance liquid chromatography (HPLC) coupled to tandem mass spectrometry (MS) applied to one-dimensional SDS-PAGE fractionated samples, Reinhardt and Lippolis (2006) identified more than 120 proteins in bovine MFGM with diverse functions such as trafficking, signalling or immune response. Although MFGM proteins represent only 1% of total milk proteins by weight, they possess essential roles in nutritional or technological properties of MFGM (Dewettinck et al., 2008). A great variability is observed in protein abundance. For instance, butyrophilin (BTN) accounts for up to 40% of the total protein content in the bovine MFGM. Roughly, MFGM material can be resolved by SDS-PAGE into eight protein bands corresponding to MUC-1, fatty acid synthase (FAS), xanthine oxidoreductase (XOR), MUC-15/PAS III, CD36, BTN, lactadherin (LDH) and adipophilin (ADRP). Major MFGM proteins have been reviewed extensively (Mather 2000; Keenan and Mather 2006). We will therefore focus on MFGM proteins for which recent advances have been made, with special attention paid to structural and functional differences across species.

13.4.1 Mucins

Mucins are large proteins containing more than 50% O-glycans by weight which are present at the interface between epithelia and their extracellular environment. The extracellular part of the protein contains a domain of tandemly repeated 20-amino acid motifs known as PTS regions, which are proline, threonine and/or serine-enriched regions containing numerous O-glycosylation sites. Due to the anti-adhesive properties of O-glycans, mucins are involved in protection against infections caused by either viral or bacterial agents (Schroten 1998; Patton 1999; Dewettinck et al., 2008). The MUC family contains more than 20 members. To date, only three mucins have been more or less extensively characterised in milk: MUC-X, MUC-1 and MUC-15 (formerly known as PASIII). MUC-X is a poorly characterised high molecular mass mucin which has been reported to be homologous to MUC-4. The presence of MUC-4 has been confirmed in human milk (Patton 2001; Zhang et al., 2005). A gene predicted to encode a mucin 4-like protein, highly homologous to the human, dog and mouse counterparts, has been found on chromosome 1 in the Bos taurus genome (NCBI, GeneID: 786701). However, efforts need to be made to characterise MUC-4 in milk across species at the protein level. We will therefore focus on two well-described mucins in milk: MUC-1 and, more recently, MUC-15.
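The notion of a PTS region, a stretch enriched in proline, threonine and/or serine, can be made concrete with a sliding-window scan. The sketch below is only an illustration under stated assumptions: the 20-residue window merely echoes the length of the tandem repeat motif, the 50% cut-off is an arbitrary choice rather than a published criterion, and `toy_sequence` is an invented peptide, not a real mucin sequence.

```python
def pts_fraction(window: str) -> float:
    """Fraction of residues in `window` that are Pro (P), Thr (T) or Ser (S)."""
    return sum(residue in "PTS" for residue in window) / len(window)


def pts_rich_windows(sequence: str, window_size: int = 20, threshold: float = 0.5):
    """Return (start, fraction) for every window whose P/T/S fraction
    reaches `threshold`. Both defaults are illustrative assumptions."""
    hits = []
    for start in range(len(sequence) - window_size + 1):
        fraction = pts_fraction(sequence[start:start + window_size])
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


# Invented toy peptide (hypothetical): a P/T/S-rich 20-mer between two
# 10-residue flanks that contain no P, T or S at all.
toy_sequence = "MKLLVAGAEW" + "PTSAPTTSPSGTSAPPTSTA" + "LKEVDGRLIQ"

hits = pts_rich_windows(toy_sequence)
for start, fraction in hits:
    print(f"window starting at index {start}: P/T/S fraction {fraction:.2f}")
```

On a real sequence, the flagged windows would only be candidate O-glycosylation-rich regions; whether individual Ser/Thr residues actually carry glycans cannot be inferred from composition alone.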

13.4.1.1 MUC-1

MUC-1 is undoubtedly the best characterised milk mucin. Bovine MUC-1 is a protein of 580 amino acid residues including a signal peptide of 22 residues. General features of the protein are a large extracellular region (467 amino acids) with partially conserved tandem repeats (20 amino acids each), a membrane-proximal SEA module which is a 120-amino acid domain frequently associated with heavily O-glycosylated proteins, a transmembrane region and a short (70 amino acids) cytoplasmic tail (Pallesen et al., 2001). The primary sequence of bovine MUC-1 has a relatively low level of homology to those for human and mouse MUC-1, with similarities of 52% and 46%, respectively. However, these values increase to 77% and 79% when the amino acid sequence of the cytoplasmic tail of bovine MUC-1 is compared to its human and murine counterparts, thus suggesting a key function for this region. Indeed, the cytoplasmic part of MUC-1 has been shown to be involved in numerous intracellular signalling pathways (Singh and Hollingsworth 2006).

Because each codominant allele may contain a variable number of repeats encoding the 20-amino acid motif, different sizes of MUC-1 are observed by SDS-PAGE. Heterozygous individuals display two bands for MUC-1 on SDS-PAGE, whereas a single band is observed for homozygous individuals. MUC-1 polymorphism has been evidenced for human, chimpanzee, horse, cat and dog mucins (Spicer et al., 1991). In contrast, the polymorphic nature of the gene has been lost in the mouse and other rodents. The number of tandem repeats truly represents a matter of interspecies differences. In humans, the number of tandem repeats varies from 21 to 125, with 41 and 85 repeats being the most frequent in the Northern European population (Gendler et al., 1990). As a consequence, the apparent molecular mass in SDS-PAGE for human MUC-1 ranges between 240 and 450 kDa, whereas that for bovine MUC-1 seems to be considerably lower (Pallesen et al., 2001). PCR analysis of genomic DNA from 630 individuals identified nine allelic variants spanning 7–23 VNTR units, each encoding 20 amino acids, in Holstein-Friesian cattle (Sando et al., 2009). Three alleles, containing 11, 14 and 16 VNTR units, respectively, were predominant. In addition, a polymorphism in one of the VNTR units has the potential to introduce a unique site for N-linked glycosylation. MUC-1 appears highly glycosylated, primarily with O-linked sialylated T-antigen [Neu5Ac(α2-3)-Gal(β1-3)-GalNAcα1] and, to a lesser extent, with N-linked oligosaccharides, which together account for approximately 60% of the apparent mass of the protein (Sando et al., 2009).

We recently confirmed the polymorphic aspect of MUC-1 in goat milk (presence of one or two equally PAS stained bands for MUC-1) previously demonstrated by Campana et al. (1992). In addition, we also confirmed that in comparison to its bovine counterpart, goat mucin is a considerably larger protein (Cebo et al., 2009).

Polymorphism was shown to be variable amongst species. In the goat, where high homology was observed between VNTR repeats, 15 alleles were identified (Sacchi et al., 2004). In contrast, in the ovine species where average homology between repeats was lower, only four alleles could be identified. Thus, the conservation between repeats seems to be positively correlated with the degree of polymorphism observed (Rasero et al., 2007). Additional evidence for the relationship between homology of repeats and degree of polymorphism exists in mice. Indeed, murine Muc-1, which displays only 75% homology in the repetitive domain, is not polymorphic (Spicer et al., 1991).

13.4.1.2 MUC-15

Separation of MFGM proteins by SDS-PAGE followed by PAS staining reveals the existence of another heavily glycosylated protein, MUC-15, previously known as PASIII in bovine MFGM (Mather 2000; Keenan and Mather 2006). Although the extracellular part of the protein lacks the typical tandem repeats which are hallmarks of mucins, MUC-15 contains regions rich in proline, threonine and serine residues with several potential glycosylation sites and therefore

belongs to the mucin family (Pallesen et al., 2002). Recently, the same authors confirmed the presence of MUC-15 orthologs in ewe and goat milks by purification and N-terminal sequencing (Pallesen et al., 2008). By western blotting using antibodies raised against a 15-amino acid region conserved in human, mouse and bovine MUC-15, a 130 kDa band was observed for bovine, caprine, ovine, porcine and buffalo milks, whereas a higher molecular mass (150 kDa) band was observed for human milk (Pallesen et al., 2008). Because the calculated molecular weight deduced from primary sequence is 36,294 Da for human MUC-15 and 35,715 Da for its bovine counterpart, it is likely that discrepancies in SDS-PAGE mobilities observed for MUC-15 proteins are due to different glycosylation patterns. Accordingly, alignment of amino acid sequences of human, bovine, mouse, rat and chimpanzee MUC-15 showed between 55 and 98% similarity (Fig. 13.11). The region showing the lowest conservation between species corresponds to the extracellular part of the protein. The cytoplasmic tail of MUC-15 is more conserved, thus suggesting the existence of a functional domain as a common feature. Indeed, structural motifs linking MUC-15 to the Ras intracellular signalling pathway were identified as previously shown for MUC-1 (Singh and Hollingsworth, 2006; Pallesen et al., 2008).

13.4.2 Non-mucin Proteins

13.4.2.1 Butyrophilin

BTN belongs to the B7/BTN-like proteins, a subset of the immunoglobulin superfamily. The main features of BTN are an extracellular part containing two Ig-like domains, a short transmembrane region and a long carboxy-terminal cytoplasmic domain called the B30.2 domain. Interestingly, the BTN gene cluster is located close to the leukocyte antigen class I genes on human chromosome 6, thus linking BTN to other proteins involved in the immune response (Rhodes et al., 2001). The sequence of bovine BTN displays 71%, 84% and 97% homology with mouse, human and goat sequences, respectively. However, sequence homologies of the B30.2 domain are considerably higher across species, thus suggesting a conserved functional role for this region. BTN has been shown to be essential for the regulation of milk lipid droplet secretion, since lactation was severely compromised in mice with an ablated Btn1a1 gene (Ogg et al., 2006). Two models are currently proposed for milk fat secretion. The prevailing model holds that a supramolecular complex between BTN, XOR and adipophilin at the surface of lipid droplets may initiate the budding of lipid droplets at the apical plasma membrane and their release as fat globules into milk (Mather and Keenan 1998). This model is currently challenged by several studies suggesting that BTN homophilic interactions solely orchestrate fat globule extrusion from mammary cells (Robenek et al., 2006). However, direct evidence of binding of BTN to XOR through the conserved B30.2 domain has been recently reported (Jeong et al., 2009). Arguing for the existence of a high degree of sequence homology between B30.2 domains from different species, the authors demonstrated that the binding was species independent, since xanthine oxidase from mice binds to the B30.2 domain of bovine or human BTN (Jeong et al., 2009). Recently, the existence of polymorphism in the BTN gene was suggested (Bhattacharya et al., 2007). Comparisons of DNA sequences of exon 8 from the sheep, cow and buffalo BTN genes revealed the existence of two alleles A and B, and three corresponding genotypes (AA, BB and AB) in the considered species. Interestingly, these authors suggested a relationship between genotypes and levels of milk fat secretion and/or size of fat globules across species. However, some differences exist at the molecular level for BTN across species (Fig. 13.12). We have recently shown that BTN from bovine and caprine milks displays different apparent mobilities in SDS-PAGE after PAS staining or immunoblotting with specific antibodies (Cebo et al., 2009). Since the molecular weights deduced from the primary sequences of bovine (accession number P18892) and caprine (accession number A3EY52) BTN are quite similar (59 kDa), we hypothesised that different apparent molecular weights in SDS-PAGE

are due to differences in carbohydrate contents. We showed that BTN either from cow or goat milk does not contain O-glycans, but large amounts of (α2-6)-linked sialic acids carried by N-linked carbohydrates (Cebo et al., 2009).

13.4.2.2 Lactadherin

The complete amino acid sequences of lactadherin (LDH) are available for bovine (formerly known as PAS 6/7 glycoprotein), human (breast epithelial antigen, BA46), murine (milk fat globule EGF factor 8, MFG-E8), rat and porcine (sperm surface protein P47) proteins (Fig. 13.13). With only a 12-amino acid long sequence located in the C-terminal part of the protein, the full-length sequence for caprine LDH is still missing. Sequence homologies are high across species with values ranging between 61 and 94%. General features of LDH are the presence of two EGF-like domains in the N-terminal part of

Fig. 13.11 Multiple alignments of the amino acid sequences of MUC-15. Murine (Q8C6Z1), rat (Q5XHX5), bovine (Q8MI01) and human (Q8N387) MUC-15 were aligned using the ClustalW2 program at the EBI site (http://www.ebi.ac.uk/tools/clustalw2). Consensus symbols denoting the degree of conservation observed in each column are “ * ”, residues identical in all sequences in the alignment; residues for which conserved “ : ” or semi-conserved “ . ” substitutions have been observed. Accession numbers are given in parentheses. Solid line: transmembrane region; dashed line: missing residues in the alternatively spliced variant (secreted MUC-15, MUC-15/S). N-Glycosylation sites are indicated in bold (Pallesen et al., 2002)

Fig. 13.12 Across species comparison of butyrophilin (BTN) amino acid sequences. Bovine (P18892), caprine (A3EY52), human (Q13410) and murine (Q62556) butyrophilin were aligned using the ClustalW2 program at the EBI site (http://www.ebi.ac.uk/tools/clustalw2). Accession numbers are given in parentheses. Locations of extracellular Ig-like and cytoplasmic B30.2 functional domains are indicated

Fig. 13.13 Alignment of the amino acid sequences of bovine (Q95114), porcine (P79385), murine (P21956), rat (P70490) and human (Q08431) lactadherin using the ClustalW2 program at the EBI site (http://www.ebi.ac.uk/tools/clustalw2). Accession numbers are given in parentheses. Locations of EGF-like and F5/8 type C functional domains are indicated. The RGD (Arginine-Glycine-Aspartic acid) adhesive sequence is indicated in bold

the protein with an Arginine-Glycine-Aspartic acid (RGD) sequence in the second EGF-like domain, and of two C-like domains of about 150 amino acids called F5/8 type C or C1/C2-like domains also present in coagulation factors V and VIII. The C-terminal domain of the second F5/8 repeat has been shown to be responsible for membrane binding through a phosphatidylserine-binding motif (Foster et al., 1990). The RGD sequence is a cell-adhesion motif able to bind to integrins (Dong et al., 1995). However, some differences have been observed between species. Human LDH, characterised by two proteins of 50 and 30 kDa, does not contain the first EGF-like domain. The 50-kDa protein is the full-length protein also known as breast carcinoma protein BA46 that is highly expressed in human breast tumours. The 30-kDa protein is a truncated form of BA46 consisting of the C-terminal factor V/VIII-like domain which appears to anchor BA46 to the MFGM (Giuffrida et al., 1998). Bovine LDH, also known as PAS 6/7, consists of two polypeptides staining well with Coomassie blue and the PAS reagent. These bands correspond to protein isoforms produced by alternative splicing, a long isoform consisting of 427 amino acids and a short isoform arising from an internal truncation of 52 amino acids starting from position 169 and extending to position 220. This is the consequence of an exon-skipping event occurring during the course of the pre-messenger RNA maturation process (Hvarregaard et al., 1996). Although limited information is available on its sequence, we have shown that LDH from goat milk consists of a single 55-kDa protein, by contrast to bovine LDH for which two polypeptide chains of about 52 and 50 kDa in 6% SDS-PAGE are easily identified by peptide mass fingerprinting MALDI-TOF analysis (Cebo et al., 2009). This difference from bovine LDH may be related to a singular secretion mode hypothesised in the goat species (Neveu et al., 2002). In mice, two protein variants produced by alternative splicing of the same premature mRNA have been evidenced: a long 61-kDa isoform, expressed predominantly in the mammary gland, and a short 53-kDa isoform expressed ubiquitously in various tissues including lung, liver, small intestine, testis and mammary gland. The expression of the long variant increases remarkably in late gestation and during lactation. Interestingly, the long variant contains a proline/threonine (Pro-Thr)-rich 37-amino acid domain containing multiple O-linked glycan chains which may be functionally important for secretion of milk fat globules from MEC (Oshima et al., 1999). Finally, a new family of zona pellucida-binding proteins homologous to bovine LDH and murine MFG-E8 is growing with the isolation of a 47-kDa protein from porcine sperm. Using antibodies raised against bovine LDH, proteins of apparent molecular mass similar to that of P47 protein were also detected in porcine milk (Ensslin et al., 1998).

13.4.2.3 Adipophilin

Complete amino acid sequences are available for bovine, porcine, human and mouse adipophilin (Fig. 13.14). Sequence similarities are high between species, thus suggesting conserved functions for this protein. The N-terminal region of adipophilin contains a sequence motif that is shared by other proteins, namely, perilipin and TIP-47 proteins. They define a new family of lipid droplet-associated proteins called PAT, which is an acronym for perilipin, adipophilin and TIP-47 proteins. Two PAT subdomains are described. The PAT-1 domain (~100 amino acids) defines the high identity N-terminal region, whereas the PAT-2 domain refers to the more distal region of lesser similarity in PAT proteins (Lu et al., 2001; Miura et al., 2002). Interactions between BTN, XOR and adipophilin are supposed to be involved in milk lipid droplet secretion in the MEC (Mather and Keenan 1998). Conversely, the fat content of milk from ADPH null mice was comparable to that of wild-type mice. However, it has been shown that ADPH null mice display an N-truncated form of adipophilin that retains the ability to promote the secretion of lipid droplets in milk (Russell et al., 2008). Thus, the PAT domain of adipophilin is not directly involved in its physiological function. Levels of transcripts were lower in ADPH null mice than in wild-type mice. By contrast, amounts of ADPH proteins were comparable in mutant

Fig. 13.14 Alignment of the amino acid sequences of bovine (Q9TUM6), porcine (Q4PLW0), human (Q99541) and murine (P43883) adipophilin using the ClustalW2 program at the EBI site (http://www.ebi.ac.uk/tools/clustalw2). Accession numbers are given in parentheses. Location of the PAT-1 domain sharing 40% identity with other murine lipid droplet-associated proteins, namely, perilipin and TIP-47, is indicated. The more distal PAT-2 sequence (~120 amino acids) can also be aligned but more poorly (Lu et al., 2001)
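As an illustration of what a figure such as "40% identity" means for two aligned sequences, the following is a minimal sketch of one common way to compute percent identity from two rows of an alignment. It is not the procedure used by the authors or by ClustalW2. The function name `percent_identity`, the choice to ignore gapped columns, and the toy sequences are all assumptions made for this example; other conventions for the denominator exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise or multiple alignment.

    Assumption: columns where either row carries a gap are ignored, so the
    denominator is the number of residue-versus-residue columns only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns with a residue in both rows
    identical = 0  # compared columns where the residues match

    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy rows for illustration only, not real adipophilin sequences.
row_1 = "MKT-ALVQ"
row_2 = "MRTQALIQ"
print(f"{percent_identity(row_1, row_2):.1f}% identity")
```

Applied to rows exported from an alignment of the four accessions listed in the caption, the same function would give pairwise values for any chosen region, for instance the first ~100 aligned columns for PAT-1.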

and null mice thus suggesting the N-terminal region of ADPH was involved in the stability of the protein (Orlicky et al., 2008). The PAT region of adipophilin was also shown to control the access of TIP-47 to the cytoplasmic lipid droplet. However, it is not clear if the PAT is directly or indirectly linked to the multiple cellular functions observed (Orlicky et al., 2008).

13.4.3 Glycosylation as a Factor of Variability of MFGM Proteins Through Species

Glycosylation variations through species were initially reported for MUC-1, most probably because of its high carbohydrate content. Indeed, monosaccharide composition of MUC-1 from bovine milk suggested profound differences with MUC-1 from human milk (Pallesen et al., 2001). A more recent study firmly established the species-dependent nature of carbohydrate structures found in glycoproteins from milk. Glycosylation of MFGM proteins from eight species (human, cow, goat, sheep, pig, horse, dromedary and rabbit) was investigated by using lectins and carbohydrate-specific antibodies (Gustafsson et al., 2005). Large-scale techniques now available (i.e. glycoproteomics) confirmed the differences in the nature of glycans found on bovine and human MFGM glycoproteins (Wilson et al., 2008). Bovine O-linked oligosaccharides were reported to present mono- and disialylated core 1 oligosaccharides (Galβ1-3GalNAc), whilst O-glycans from human milk had core type 2 oligosaccharides (Galβ1-3(GlcNAcβ1-6)GalNAc). Interestingly, the Lewis b epitope, which has been shown to be a target for Helicobacter pylori bacteria, was present in human but not in bovine MFGM proteins. More generally, because the extreme diversity of glycans found on MFGM proteins is thought to prevent the attachment of various pathogenic organisms to intestinal mucosa, it may be hypothesised that bovine milk will provide a different protection against pathogens than human milk. A striking demonstration has been recently reported for LDH. It was found that human, but not bovine, LDH inhibits rotavirus infections in vitro. Indeed, pre-incubation of virus with LDH significantly reduced infection, whereas pre-incubation of LDH with host cells did not show any effects on rotavirus infection. This suggests that LDH acts as a decoy receptor during the course of infection (Kvistgaard et al., 2004).

Alongside the nature of carbohydrates found on MFGM proteins, a quantitative aspect of health benefits provided by carbohydrates present on milk glycoproteins must be considered. Indeed, a relationship between the variable number of tandem repeats (VNTR), which are domains containing numerous O-glycosylation sites, and the resistance to Helicobacter pylori infections has already been demonstrated in humans for MUC-1 and MUC-6 (Nguyen et al., 2006; Costa et al., 2008). Hence, species showing shorter VNTR regions are supposed to be more susceptible to these bacteria than those presenting larger polymorphic domains, which inhibit adhesion of pathogens to host cells more efficiently. The higher number of tandem repeats found on MUC-1 from human milk compared to MUC-1 from bovine milk may explain the lower incidence of infectious diseases in breastfed infants (Schroten 1998).

13.5 Concluding Remarks

The past 10 years have seen a fantastic breakthrough in the knowledge of genome structure and organisation. New insights and clues to better understand mechanistic details involved in the regulation and variability of gene expression have been provided already and are still expected. Data now available on the architecture of the casein locus in several species, including monotremes and marsupials, will contribute to our understanding of the mechanisms responsible for variations and heterogeneity in milk casein composition. However, it is perhaps in the functional field that we might progress significantly in the near future.

Factors of variability of MFGM proteins have been evidenced, both intraspecies (i.e. existence of polymorphisms) and interspecies, with PTM such as glycosylation pointed out as a main factor

Fig. 13.15 Representative pattern of milk fat globule membrane (MFGM) proteins in SDS-PAGE. MFGM proteins were separated on 6% SDS-PAGE and stained with (a) Coomassie or (b) PAS reagent (staining of glycoproteins). MFGM proteins are from bovine (lanes 1–2) and caprine (lanes 3–4) milks. FAS fatty acid synthase; XOR xanthine oxidoreductase; BTN butyrophilin; LDH lactadherin; ADPH adipophilin. Positions of protein standards (kDa) are indicated to the left of the panel (adapted from Cebo et al., 2009)

for the observed molecular diversity of MFGM glycoproteins (Fig. 13.15). Large-scale technologies now available, like proteomics or glycoproteomics, generate a huge amount of data on MFGM proteins. However, efforts now need to be made to go further in the understanding of biological mechanisms underlying this molecular variability.

Development of instrumental techniques has played a key role in these breakthroughs, particularly in the field of PTM. Although we now know many of the details of casein structure, a number of questions remain unanswered in our understanding of the biogenesis of casein micelles. How do caseins interact between themselves and with colloidal calcium phosphate? At what stage are they modified? How is this process influenced by (or influences) the cellular pathway of protein folding and assembly?

Although PTM and genetic polymorphisms were for a long time considered as the most potent factors capable of generating multiple protein products starting from a single gene, it is now obvious that alternative splicing is responsible for a considerable proportion of proteomic complexity in mammals. It is clear that this process must be intimately related to the great diversity and heterogeneity of caseins as well as to their evolutionary pathway. Like frameshift mutations, which deeply change the nature of the message and/or lead to premature termination (linked to mRNA decay), such mechanisms, by promoting deletion or addition of a protein domain through exon skipping or cryptic splice site usage, undoubtedly provide a real plasticity to gene information. With at least a total of 21 coding exons that can be, according to the species, constitutively included or skipped, the gene encoding αs1-casein is one of the most impressive examples in this regard.

Obviously, such a wide structural diversity is unlikely to be without consequences for the characteristics and the properties of casein micelles, particularly if one considers the possible unique function that seems to be played by αs1-casein in micelle assembly, transport and secretion (Chanat et al., 1999). In MEC of small ruminants, αs1-casein appears to be a complex mixture of more or less internally deleted proteins. The occurrence of genetic polymorphisms disturbing the splicing machinery adds further to the complexity of the casein fraction. With up to 40 variants of αs1-casein produced in the milk of a single goat, heterozygous A/F at the relevant locus, the secretion pathway may be dramatically disturbed, with an impact on milk composition and quality, including modifications in fat structure and composition (Chilliard et al., 2006) as well as in its susceptibility to lipolysis (Lamberet et al., 1996).

Notwithstanding, the growing number of casein genes displaying such complex patterns of splicing, thus increasing the coding capacity of genes, supports the notion that the extreme protein isoform diversity generated from a single gene can no longer be considered as an epiphenomenon. A parsimonious vision of this issue addresses the following question: Does this convey any biological significance? Important new insights are expected in this field in the near future.

Acknowledgements This chapter was modified from P. Martin, C. Cebo, G. Miranda (2011). Inter-species comparison of milk proteins: quantitative variability and molecular diversity. In Encyclopedia of Dairy Sciences, 2nd Edition, J.W. Fuquay, P.F. Fox and P.L.H. McSweeney (eds). Elsevier, Amsterdam, pp. 821–842, with permission.

References

Aleandri, R., Buttazzoni, L.G., Schneider, J.C., Caroli, A. and Davoli, R. (1990). The effects of milk protein polymorphisms on milk components and cheese-producing ability. J. Dairy Sci. 73, 241–255.
Alexander, L.J., Das Gupta, N.A. and Beattie, C.W. (1992). The sequence of porcine αs1-casein cDNA. Anim. Genet. 23, 365–367.
Alexander, L.J., Stewart, A.F., MacKinlay, A.G., Kapelinskaya, T.V., Tkach, T.M. and Gorodesky, S.I. (1988). Isolation and characterization of the bovine kappa-casein gene. Eur. J. Biochem. 178, 395–401.
Baker, E.N. and Baker, H.M. (2009). A structural framework for understanding the multifunctional character of lactoferrin. Biochimie 91, 3–10.
Bal dit Solier, C., Drouet, L., Pignaud, G., Chevallier, C., Caen, J., Fiat A.-M., Izquierdo, C. and Jollès, P. (1996). Effect of kappa-casein split peptides on platelet aggregation and on thrombus formation in the guinea pig. Thromb. Res. 81, 427–437.
Baranyi, M., Aszodi, A., Devinoy, E., Fontaine, M.L., Houdebine, L.M. and Bosze, Z. (1996). Structure of the rabbit κ-casein encoding gene: expression of the cloned gene in the mammary gland of transgenic mice. Gene 174, 27–34.
Bevilacqua, C., Helbling, J.C., Miranda, G. and Martin, P. (2006). Translational efficiency of casein transcripts in the mammary tissue of lactating ruminants. Reprod. Nutr. Dev. 46, 567–578.
Bhattacharya, T.K., Sheikh, F.D., Sukla, S., Kumar, P. and Sharma, A. (2007). Differences of ovine butyrophilin gene (exon 8) from its bovine and bubaline counterpart. Small Ruminant Res. 69, 198–202.
Bingle, L., Cross, S.S., High, A.S., Wallace, W.A., Rassl, D., Yuan, G., Hellstrom, I., Campos, M.A. and Bingle, C.D. (2006). WFDC2 (HE4): a potential role in the innate immunity of the oral cavity and respiratory tract and the development of adenocarcinomas of the lung. Respir. Res. 7, 61–70.
Bleck, G.T. and Bremel, R.D. (1993a). Correlation of the α-lactalbumin (+15) polymorphism to milk production and milk composition of Holsteins. J. Dairy Sci. 76, 2292–2298.
Bleck, G.T. and Bremel, R.D. (1993b). Sequence and single base polymorphisms of the bovine α-lactalbumin 5' flanking region. Gene 126, 213–218.
Bloomfield, V.A. (1979). Association of protein. J. Dairy Res. 46, 241–252.
Boisnard, M., Hue, D., Bouniol, C., Mercier, J.-C. and Gaye, P. (1991). Multiple mRNA species code for two non-allelic forms of ovine αs2-casein. Eur. J. Biochem. 201, 633–641.
Bonsing, J., Ring, J.M., Stewart, A.F. and MacKinlay, A.G. (1988). Complete nucleotide sequence of the bovine beta-casein gene. Aust. J. Biol. Sci. 41, 527–537.
Bouniol, C., Printz, C. and Mercier, J.-C. (1993). Bovine αs2-casein D is generated by exon VIII skipping. Gene 128, 289–293.
Braunschweig, M.H. and Leeb, T. (2006). Aberrant low expression level of bovine beta-lactoglobulin is associated with a C to A transversion in the BLG promoter region. J. Dairy Sci. 89, 4414–4419.
Brew, K. (2003). α-Lactalbumin, in, Advanced Dairy Chemistry, Proteins, part A, Vol. 1, 3rd edn., P.F. Fox and P.L.H. McSweeney, eds., Kluwer Academic/Plenum Publishers, New York. pp. 387–419.
Brew, K. and Hill, R.L. (1975). Lactose biosynthesis. Rev. Physiol. Biochem. Pharmacol. 72, 105–158.

Brignon, G., Mahé, M.F., Grosclaude, F. and Ribadeau Dumas, B. (1989). Sequence of caprine αs1-casein and characterization of those of its genetic variants which are synthesized at a high level, αs1-CN A, B and C. Protein Seq. Data Anal. 2, 181–188.
Brignon, G., Ribadeau Dumas, B., Mercier, J.C., Pélissier, J.P. and Das, B.C. (1977). Complete amino acid sequence of bovine alpha S2-casein. FEBS Lett. 76, 274–279.
Buchheim, W., Lund, S. and Scholtissek, K. (1989). Comparative studies on the structure and size of casein micelles in the milk of different species, Kieler Milchwirtschaftliche Forschungsberichte 41, 253–266.
Campana, W.M., Josephson, R.V. and Patton, S. (1992). Presence and genetic polymorphism of an epithelial mucin in milk of the goat (Capra hircus). Comp. Biochem. Physiol. 103B, 261–266.
Cebo, C., Caillat, H., Bouvier, F. and Martin, P. (2009). Major proteins of the goat milk fat globule membrane. J. Dairy Sci. 93, 868–876.
Ceriotti, G., Chessa, S., Bolla, P., Budelli, E., Bianchi, L., Duranti, E. and Caroli, A. (2004). Single nucleotide polymorphisms in the ovine casein genes detected by polymerase chain reaction-single strand conformation polymorphism. J. Dairy Sci. 87, 2606–2613.
Chanat, E., Martin, P. and Ollivier-Bousquet, M. (1999). Alpha(S1)-casein is required for the efficient transport of beta- and kappa-casein from the endoplasmic reticulum to the Golgi apparatus of mammary epithelial cells. J. Cell Sci. 112, 3399–412.
Chianese, L., Garro, G., Mauriello, R., Laezza, P., Ferranti, P. and Addeo, F. (1996). Occurrence of five αs1-casein variants in ovine milk. J. Dairy Res. 63, 49–59.
Chianese, L., Garro, G., Nicola, M.A., Mauriello, R., Ferranti, P., Pizzano, R., Cappuccio, U., Laezza, P., Addeo, F., Ramunno, L., Rando, A. and Rubino, R. (1993). The nature of β casein heterogeneity in caprine milk. Lait 73, 533–547.
Chilliard, Y., Rouel, J. and Leroux, C. (2006). Goat's alpha s1 casein genotype influences its milk fatty acid composition and delta-9 desaturation ratios. Anim. Feed Sci. Technol. 131, 474–487.
Clare, D.A. and Swaisgood, H.E. (2000). Bioactive milk peptides: a prospectus. J. Dairy Sci. 83, 1187–1195.
Coll, A., Folch, J.M. and Sanchez, A. (1995). Structural features of the 5' flanking region of the caprine kappa-casein gene. J. Dairy Sci. 78, 973–977.
Collet, C., Joseph, R. and Nicholas, K.J. (1992). Molecular characterization and in vitro hormonal requirements for expression of two casein genes from a marsupial. Mol. Endocrinol. 8, 13–20.
Condorelli, G., Bueno, R. and Smith, R.J. (1994). Two alternative splice forms of the human insulin-like growth factor I receptor have distinct biological activities and internalization kinetics. J. Biol. Chem. 269, 8510–8516.
Costa, N., Mendes, N., Marcos, N., Reis, C., Caffrey, T., Hollingsworth, M. and Santos-Silva, F. (2008). Relevance of MUC1 mucin variable number of tandem repeats polymorphism in H pylori adhesion to gastric epithelial cells. World J. Gastroenterol. 14, 1411–1414.
Cronin, M.A., Stuart, R., Pierson, B.J. and Patton, J.C. (1996). κ-Casein gene phylogeny of higher ruminants (Pecora artiodactyla). Mol. Phylogenet. Evol. 6, 295–311.
Dalgleish, D.G., Horne, D.S. and Law, A.J.R. (1989). Size-related differences in bovine casein micelles. Biochim. Biophys. Acta 991, 383–387.
Dalgleish, D.G., Spagnuolo, P.A. and Goff, H.D. (2004). A possible structure of the casein micelle based on high-resolution scanning electron microscopy. Int. Dairy J. 14, 1025–1031.
Davies, D.T. and Law, A.J.R. (1983). Variation in the protein composition of bovine casein micelles and serum casein in relation to micellar size and milk temperature. J. Dairy Res. 56, 727–735.
Dawson, S.P., Wilde, C.J., Tighe, P.J. and Mayer, R.J. (1993). Characterization of two novel casein transcripts in rabbit mammary gland. Biochem. J. 296, 777–784.
Dev, B.C., Sood, S.M., DeWind, S. and Slattery, C.W. (1994). κ-Casein and β-caseins in human milk micelles: structural studies. Arch. Biochem. Biophys. 314, 329–336.
Dewettinck, K., Rombaut, R., Thienpont, N., Le, T.T., Messens, K. and Van Camp, J. (2008). Nutritional and technological aspects of milk fat globule material. Int. Dairy J. 18, 436–457.
Dong, L.-J., Hsieh, J.-C. and Chung A.E. (1995). Two distinct cell attachment sites in entactin are revealed by amino acid substitutions and deletion of the RGD sequence in the cysteine-rich epidermal growth factor repeat 2. J. Biol. Chem. 270, 15838–15843.
Donnelly, W.J., McNeill, G.P., Buchheim, W. and McGann, T.C.A. (1984). A comprehensive study of the relationship between size and protein composition in natural bovine casein micelles. Biochim. Biophys. Acta 789, 136–143.
Dosako, S., Taneya, S., Kimura, T., Ohmori, T., Daikoku, H., Suzuki, N., Sawa, J., Kano, K. and Katayama, S. (1983). Milk of northern fur seal: composition, especially carbohydrate and protein. J. Dairy Sci. 66, 2076–2083.
Edlund, A., Johansson, T., Ledvik, B. and Hansson, L. (1996). Structure of the human kappa-casein gene. Gene 174, 65–69.
Ensslin, M., Vogel, T., Calvete, J.J., Thole, H.H., Schmidtke, J., Matsuda, T. and Töpfer-Petersen, E. (1998). Molecular cloning and characterization of P47, a novel boar sperm-associated zona pellucida-binding protein homologous to a family of mammalian secretory proteins. Biol. Reprod. 58, 1057–1064.
Erhardt, G. (1989). Isolierung und charaktrisierung von caseinfraktionen sowie deren genetische varianten in schweinamilch. Milchwissenschaft 44, 17–20.
Farkye, N.Y. (2003). Other enzymes, in, Advanced Dairy Chemistry, Vol. 1 - Proteins, 3rd edn., part A, P.F. Fox and P.L.H. McSweeney eds., Kluwer Academic/Plenum Publishers, New York. pp. 571–603.

Farrell, H.M., Jr., Jimenez-Flores, R., Bleck, G.T., Brown, E.M., Butler, J.E., Creamer, L.K., Hicks, C.L., Hollar, C.M., Ng-Kwai-Hang, K.F. and Swaisgood, H.E. (2004). Nomenclature of the proteins of cow's milk - Sixth Revision. J. Dairy Sci. 87, 1641–1674.
Ferranti, P., Lilla, S., Chianese, L. and Addeo, F. (1999). Alternative nonallelic deletion is constitutive of ruminant αs1-casein. J. Protein Chem. 18, 595–602.
Ferranti, P., Addeo, F., Malorni, A., Chianese, L., Leroux, C. and Martin P. (1997). Differential splicing of pre-messenger RNA produces multiple forms of goat αs1-casein. Eur. J. Biochem. 249, 1–7.
Ferranti, P., Malorni, A., Nitti, G., Laezza, P., Pizzano, R., Chianese, L. and Addeo, F. (1995). Primary structure of ovine αs1-casein: localization of phosphorylation sites and characterization of genetic variants. J. Dairy Res. 62, 281–296.
Fiat, A.-M. and Jolles, P. (1989). Caseins of various origin and biologically active casein peptides and oligosaccharides: structural and physiological aspects. Mol. Cell. Biochem. 87, 5–30.
Fiat, A.-M., Jolles, J., Aubert, J.-P., Loucheux-Lefèbre, M.-H. and Jolles, P. (1980). Localisation and importance of the sugars part of human casein. Eur. J. Biochem. 111, 333–339.
Folch, J.M., Dovc, P. and Medrano, J.F. (1999). Differential expression of bovine β-lactoglobulin A and B promoter variants in transiently transfected HC11 cells. J. Dairy Res. 66, 537–544.
Foster, P.A., Fulcher, C.A., Houghten, R.A. and Zimmerman, T.S. (1990). Synthetic factor VIII peptides with amino acid sequences contained within the C2 domain of factor VIII inhibit factor VIII binding to phosphatidylserine. Blood 75, 1999–2004.
Gatesy, J., Hayashi, C., Cronin, M.A. and Arctander, P. (1996). Evidence from milk casein genes that cetaceans are close relatives of hippopotamid artiodactyls. Mol. Biol. Evol. 13, 954–963.
Geldermann, H., Gogol, J., Kock, M. and Tacea, G. (1996). DNA variants within the 5' flanking region of bovine milk protein encoding genes. J. Anim. Breed. Genet. 113, 261–267.
Gendler, S.J., Lancaster, C.A., Taylor-Papadimitriou, J., Duhig, T., Peat, N., Burchell, J., Pemberton, L., Lalani, E.N. and Wilson, D. (1990). Molecular cloning and expression of human tumor-associated polymorphic epithelial mucin. J. Biol. Chem. 265, 15286–15293.
Ginger, M.R. and Grigor, M.R. (1999). Comparative aspects of milk caseins. Comp. Biochem. Physiol. 124, 133–145.
Giuffrida, M.G., Cavaletto, M., Giunta, C., Conti, A. and Godovac-Zimmermann, J. (1998). Isolation and characterization of full and truncated forms of human breast carcinoma protein BA46 from human milk fat globule membranes. J. Protein Chem. 17, 143–148.
Grabowski, H., Le Bars, D., Chene, N., Attal, J., Malienou-Ngassa, R., Puissant, C. and Houdebine, L.M. (1991). Rabbit whey acid protein concentration in milk, serum, mammary gland extract and culture medium. J. Dairy Sci. 74, 4143–4150.
Graml, R., Weiss, G., Buchberger, J. and Pirchner, F. (1989). Different rates of synthesis of whey protein and casein by alleles of the β-lactoglobulin and αs1-casein locus in cattle. Genet. Sel. Evol. 21, 547–554.
Groenen, M.A.M., Dijkhof, R.J.M., Verstege, A.J.M. and van der Poel, J.J. (1993). The complete sequence of the gene encoding bovine alpha-s2-casein. Gene 123, 187–193.
Grosclaude, F., Ricordeau, G., Martin, P., Remeuf, F., Vassal, L. and Bouillon, J. (1994). Du gène au fromage: le polymorphisme de la caséine αs1 caprine, ses effets, son évolution. Product. Anim. 7, 3–19.
Grusby, M.J., Mitchell, S.C., Nabavi, N. and Glimcher, L.H. (1990). Casein expression in cytotoxic T lymphocytes. Proc. Natl. Acad. Sci. U.S.A. 87, 6897–6901.
Gustafsson, A., Kacskovics, I., Breimer, M.E., Hammarstrom, L. and Holgersson, J. (2005). Carbohydrate phenotyping of human and animal milk glycoproteins. Glycoconjugate J. 22, 109–118.
Hajjoubi, S., Rival-Gervier, S., Hayes, H., Floriot, S., Eggen, A., Piumi, F., Chardon, P., Houdebine, L.M. and Thépot, D. (2006). Ruminants genome no longer contains whey acidic protein gene but only a pseudogene. Gene 370, 104–112.
Hall, L., Laird, J.E., Pascall, J.C. and Craig, R.K. (1984a). Guinea-pig casein A cDNA. Nucleotide sequence analysis and comparison of the deduced protein sequence with that of bovine alpha s2 casein. Eur. J. Biochem. 138, 585–589.
Hall, L., Laird, J.E. and Craig, R.K. (1984b). Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA. Biochem J. 222, 561–570.
Hallén, E., Wedholm, A., Andrén, A. and Lundén, A. (2008). Effect of beta-casein, kappa-casein and beta-lactoglobulin genotypes on concentration of milk protein variants. J. Anim. Breed Genet. 125, 119–129.
Hansson, L., Edlund, A., Johansson, T., Hernell, O., Strömqvist, M., Lindqvist, S., Lönnerdal, B. and Bergström, S. (1994). Structure of the human β-casein gene. Gene 139, 193–199.
Hayashi, Y., Ohmori, S., Ito, T. and Seo, H. (1997). A splicing variant of steroid receptor coactivator-1 (SRC-1E): the major isoform of SRC-1 to mediate thyroid hormone action. Biochem. Biophys. Res. Commun. 236, 83–87.
Hayes, H., Petit, E., Bouniol, C. and Popescu, P. (1993). Localisation of the alpha-S2-casein gene (CASAS2) to the homologous cattle, sheep and goat chromosomes 4 by in situ hybridization. Cytogenet. Cell. Genet. 64, 282–285.
Hayes, B., Hagesaether, N., Adnøy, T., Pellerud, G., Berg, P.R. and Lien, S. (2006). Effects on production traits of haplotypes among casein genes in Norwegian goats and evidence for a site of preferential recombination. Genetics 174, 455–464.

Heck, J.M., Schennink, A., van Valenberg, H.J., Bovenhuis, H., Visker, M.H., van Arendonk, J.A. and van Hooijdonk, A.C. (2009). Effects of milk protein variants on the protein composition of bovine milk. J. Dairy Sci. 92, 1192–1202.
Hennighausen, L.G. and Sippel, A.E. (1982). Mouse whey acidic protein is a novel member of the family of 'four-disulfide core' proteins. Nucleic Acids Res. 10, 2677–2684.
Hennighausen, L.G., Steudle, A. and Sippel, A.E. (1982). Nucleotide sequence of cloned cDNA coding for mouse ε-casein. Eur. J. Biochem. 126, 569–572.
Heth, A.A. and Swaisgood, H.E. (1982). Examination of casein micelle structure by a method for reversible covalent immobilization. J. Dairy Sci. 65, 2047.
Hiraoka, Y., Segawa, T., Kuwajima, K., Sugai, S. and Murai, N. (1980). α-Lactalbumin: a calcium metalloprotein. Biochem. Biophys. Res. Commun. 95, 1098–1104.
Hobbs, A.A. and Rosen, J.M. (1982). Sequence of rat α- and γ-casein mRNAs: evolutionary comparison of the calcium-dependent rat casein multigene family. Nucleic Acid Res. 10, 8079–8098.
Holland, J.W., Deeth, H.C. and Alewood, P.F. (2004a). Proteomic analysis of kappa-casein micro-heterogeneity. Proteomics 4, 743–752.
Holland, J.W., Deeth, H.C. and Alewood, P.F. (2004b). Resolution and characterisation of multiple isoforms of bovine kappa-casein by 2-DE following a reversible cysteine-tagging enrichment strategy. Proteomics 6, 3087–3095.
Holt, C. (1985). The size distribution of bovine casein micelles: A review. Food Microstructure 4, 1–10.
Holt, C. (1992). Structure and stability of bovine casein micelles. Advances Prot. Chem. 43, 63–151.
Holt, C. and Jenness, R. (1987). Interrelationships of constituents and partition of salts in milk samples from eight species. Comp Biochem Physiol A Comp Physiol. 77, 275–282.
Holt, C. and Sawyer, L. (1988). Primary and predicted secondary structures of the caseins in relation to their biological functions. Prot. Eng. 2, 251–259.
Hvarregaard, J., Andersen M.H., Berglund, L., Rasmussen, J.T. and Petersen, T.E. (1996). Characterization of glycoprotein PAS-6/7 from membranes of bovine milk fat globules. Eur. J. Biochem. 240, 628–636.
Jenness, R. (1974). Proceedings: biosynthesis and composition of milk. J Invest Dermatol. 63, 109–118.
Jeong, J., Rao, A.U., Xu, J., Ogg, S.L., Hathout, Y., Fenselau, C. and Mather, I.H. (2009). The PRY/SPRY/B30.2 domain of butyrophilin 1A1 (BTN1A1) binds to xanthine oxidoreductase: implications for the function of BTN1A1 in the mammary gland and other tissues. J. Biol. Chem. 284, 22444–22456.
Johnsen, L.B., Rasmussen, L.K., Petersen, T.E. and Berglund, L. (1995). Characterization of three types of human αs1-casein mRNA transcripts. Biochem. J. 309, 237–242.
Jollès P. and Fiat, A.-M. (1979). The carbohydrate portions of milk glycoprotein. J. Dairy Res. 46, 187–191.
Jollès, P., Loucheux-Lefebvre, M.H. and Henschen, A. (1978). Structural relatedness of kappa-casein and fibrinogen gamma-chain. J. Mol. Evol. 11, 271–277.
Jolivet, G., Devinoy, E., Fontaine, M.L. and Houdebine, L.M. (1992). Structure of the gene encoding rabbit alpha S1-casein. Gene 113, 257–262.
Jones, W.K., Yu-Lee, L.Y., Clift, S.M., Brown, T.L. and Rosen, J.M. (1985). The rat casein multigene family. Fine structure and evolution of the β-casein gene. J. Biol. Chem. 260, 7042–7050.
Kang, J.F., Li, X.L., Zhou, R.Y., Li, L.H., Feng, F.J. and Guo, X.L. (2008). Bioinformatics analysis of lactoferrin gene for several species. Biochem Genet. 46, 312–322.
Kappeler, S., Farah, Z. and Puhan, Z. (1998). Sequence analysis of Camelus dromedarius milk casein. J. Dairy Res. 65, 209–222.
Kawasaki, K. and Weiss, K.M. (2003). Mineralized tissue and vertebrate evolution: the secretory calcium-binding phosphoprotein gene cluster. Proc. Natl. Acad. Sci. U.S.A. 100, 4060–4065.
Keenan, T.W. and Mather, I.H. (2006). Intracellular origin of milk fat globules and the nature of the milk fat globule membrane, in, Advanced Dairy Chemistry - 2. Lipids, 3rd edn., P.F. Fox and P.L.H. McSweeney, eds., Springer Science+Business Media, LLC, New York. pp. 137–171.
Kestler, D.P., Foster, J.S., Macy, S.D., Murphy, C.L., Weiss, D.T. and Solomon, A. (2008). Expression of odontogenic ameloblast-associated protein (ODAM) in dental and other epithelial neoplasms. Mol Med. 14, 318–326.
Koczan, D., Hobom, G. and Seyfert, H.M. (1991). Genomic organization of the bovine αs1-casein gene. Nucleic Acids Res. 18, 5591–5596.
Kontopidis, G., Holt, C. and Sawyer, L. (2004). Invited review: beta-lactoglobulin: binding properties, structure, and function. J Dairy Sci. 87, 785–796.
Kvistgaard, A.S., Pallesen, L.T., Arias, C.F., Lopez, S., Petersen, T.E., Heegaard, C.W. and Rasmussen, J.T. (2004). Inhibitory effects of human and bovine milk constituents on rotavirus infections. J. Dairy Sci. 87, 4088–4096.
Lamberet, G., Degas, C., Delacroix-Buchet, A. and Vassal, L. (1996). Effect of characters linked to A and F caprine αs1-casein alleles on goat flavour: cheesemaking with protein-fat exchange. Lait 76, 349–361.
Lear, T.L., Brandon, R., Masel, A., Bell, K. and Bailey, E. (1999). Horse alpha-1-antitrypsin, beta-lactoglobulins 1 and 2, and transferrin map to positions 24q15-q16, 28q18-qter, 28q18-qter and 16q23, respectively. Chromosome Res. 7, 667.
Lefèvre, C.M., Sharp, J.A. and Nicholas K.R. (2009). Characterisation of monotreme caseins reveals lineage specific expansion of an ancestral casein locus in mammals. Reprod Fertil Dev. 21, 1015–1027.

Legrand, D., Pierce, A., Elass, E., Carpentier, M., Mariller, C. and Mazurier, J. (2008). Lactoferrin structure and functions. Adv. Exp. Med. Biol. 606, 163–194.
Lemay, D.G., Lynn, D.J., Martin, W.F., Neville, M.C., Casey, T.M., Rincon, G., Kriventseva, E.V., Barris, W.C., Hinrichs, A.S., Molenaar, A.J., Pollard, K.S., Maqbool, N.J., Singh, K., Murney, R., Zdobnov, E.M., Tellam, R.L., Medrano, J.F., German, J.B. and Rijnkels, M. (2009). The bovine lactation genome: insights into the evolution of mammalian milk. Genome Biol. 10(4) R43.
Lenasi, T., Peterlin, B.M. and Dovc, P. (2006). Distal regulation of alternative splicing by splicing enhancer in equine beta-casein intron 1. RNA 12, 498–507.
Le Provost, F., Nocart, M., Guerin, G. and Martin, P. (1994). Characterization of the goat lactoferrin cDNA: assignment of the relevant locus to bovine U12 synteny group. Biochem. Biophys. Res. Commun. 203, 1324–1332.
Leroux, C. and Martin, P. (1996). The caprine αs1- and β-casein genes are 12-kb apart and convergently transcribed. Anim. Genet. 27, 93.
Leroux, C. (1992). Analyse du polymorphisme du gène caprin codant la caséine αs1 et des produits de sa transcription. Application au développement d'une procédure de typage précoce des animaux, PhD. Thesis, Université d'Orsay-Paris XI.
Leroux, C., Mazure, N. and Martin, P. (1992). Mutation away from splice site recognition sequences might cis-modulate alternative splicing of goat αs1-casein transcript. Structural organization of the relevant gene. J. Biol. Chem. 267, 6147–6157.
Martin, P. (1993). Polymorphisme génétique des lactoprotéines caprines. Lait 73, 511–532.
Martin, P. and Grosclaude, F. (1993). Improvement of milk protein quality by gene technology. Livestock Prod. Sci. 35, 95–115.
Martin, P. and Leroux, C. (1992). Exon-skipping is responsible for the 9 amino acid residue deletion occurring near the N-terminal of human β-casein. Biochem. Biophys. Res. Commun. 183, 750–757.
Martin, P. and Leroux, C. (1994). Characterization of a further αs1-casein variant generated by exon skipping. Proc. 24th Int. Soc. Anim. Genet. Conf., Prague, Abstract E43, 88.
Martin, P., Brignon, G., Furet, J.-P. and Leroux, C. (1996). The gene encoding αs1-casein is expressed in human mammary epithelial cells during lactation. Lait 76, 523–535.
Martin, P., Ollivier-Bousquet, M. and Grosclaude, F. (1999). Genetic polymorphism of caseins: a tool to investigate casein micelle organization. Int. Dairy J. 9, 163–171.
Martin, P., Szymanowska, M., Zwierzchowski, L. and Leroux, C. (2002). The impact of genetic polymorphisms on the protein composition of ruminants milks. Reprod. Nutr. Dev. 42, 433–459.
Mather, I.H. (2000). A review and proposed nomenclature for major proteins of the milk-fat globule membrane. J. Dairy Sci. 83, 203–247.
Mather, I.H. and Keenan, T.W. (1998). Origin and secretion of milk lipids. J Mammary Gland Biol. Neopl. 3, 259–273.
McMahon, D.J. and Oommen, B.S. (2008). Supramolecular structure of the casein micelle. J. Dairy Sci. 91, 1709–1721.
Medrano, J.F. and Aquilar-Cordova, E. (1990). Polymerase chain reaction amplification of bovine β-lactoglobulin genomic sequences and identification of genetic variants by RFLP analysis. Anim. Biotech. 1, 73–77.
Meisel, H. (2005). Biochemical properties of peptides encrypted in bovine milk proteins. Curr. Med. Chem. 12, 1905–1919.
Menon, R.S., Chang, Y.F., Jeffers, K.F., Jones, C. and Ham, R. (1992). Regional localization of human β-casein gene (CSN2) to 4pter-q21. Genomics 13, 225–226.
Menon, R.S., Chang, Y.F., Jeffers, K.F. and Ham, R.G. (1992). Exon-skipping in human β-casein. Genomics 12, 13–17.
Mercier, J.-C. (1981). Phosphorylation of casein. Present evidence for an amino acid triplet code post-translationally recognized by specific kinases. Biochimie 63, 1–17.
Mercier, J.-C., Grosclaude, F. and Ribadeau Dumas, B. (1971). Structure primaire de la caséine αs1 bovine. Séquence complète. Eur. J. Biochem. 23, 41–51.
Miclo, L., Girardet, J.M., Egito, A.S., Mollé, D., Martin, P. and Gaillard, J.L. (2007). The primary structure of a low-Mr multiphosphorylated variant of beta-casein in equine milk. Proteomics 7, 1327–1335.
Miranda, G., Mahé, M.F., Leroux, C. and Martin, P. (2000). Proteomic tools to characterise the protein fraction of equine milk. Milk Protein Conference, 30th March-2nd April 2000, Vinstra, Norway.
Miranda, G., Mahé, M.F., Leroux, C. and Martin, P. (2004). Proteomic tools to characterize the protein fraction of Equidae milk. Proteomics 4, 2496–2509.
Moffatt, P., Smith, C.E., St-Arnaud, R. and Nanci, A. (2008). Characterization of Apin, a secreted protein highly expressed in tooth-associated epithelia. J Cell Biochem. 103, 941–956.
Nagy, E. and Maquat, L.E. (1998). A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance. Trends Biochem. Sci. 23, 198–199.
Neveu, C., Riaublanc, A., Miranda, G., Chich, J.-F. and Martin, P. (2002). Is the apocrine milk secretion process observed in the goat species rooted in the perturbation of the intracellular transport mechanism induced by defective alleles at the alpha(s1)-Cn locus? Reprod. Nutr. Dev. 42, 163–172.
Nguyen, T., Janssen, M., Gritters, P., te Morsche, R., Drenth, J., van Asten, H., Laheij, R. and Jansen, J. (2006). Short mucin 6 alleles are associated with H pylori infection. Gastroenterology 12, 6021–6025.
Ogg, S.L., Weldon, A.K., Dobbie, L., Smith, A.J. and Mather, I.H. (2006). Expression of butyrophilin (Btn1a1) in lactating mammary gland is essential for the regulated secretion of milk-lipid droplets. Proc. Natl. Acad. Sci. U.S.A. 101, 10084–10089.

Oshima, K., Aoki, N., Negi, M., Kishi, M., Kitajima, K. and Matsuda, T. (1999) Lactation-dependent expression of an mRNA splice variant with an exon for a multiply O-glycosylated domain of mouse milk fat globule glycoprotein MFG-E8. Biochem. Biophys. Res. Commun. 254, 522–528.
Pallesen, L.T., Andersen, M.H., Nielsen, R.L., Berglund, L., Petersen, T.E., Rasmussen, L.K. and Rasmussen, J.T. (2001) Purification of MUC1 from bovine milk-fat globules and characterization of a corresponding full-length cDNA clone. J. Dairy Sci. 84, 2591–2598.
Pallesen, L.T., Berglund, L., Rasmussen, L.K., Petersen, T.E. and Rasmussen, J.T. (2002) Isolation and characterization of MUC15, a novel cell membrane-associated mucin. Eur. J. Biochem. 269, 2755–2763.
Pallesen, L.T., Pedersen, L.R.L., Petersen, T.E., Knudsen, C.R. and Rasmussen, J.T. (2008) Characterization of human mucin (MUC15) and identification of ovine and caprine orthologs. J. Dairy Sci. 91, 4477–4483.
Passey, R.J. and MacKinlay, A.G. (1995) Characterisation of a second, apparently inactive, copy of the bovine β-lactoglobulin gene. Eur. J. Biochem. 233, 736–743.
Passey, R., Glenn, W. and MacKinlay, A. (1996) Exon skipping in the ovine alpha S1-casein gene. Comp. Biochem. Physiol. B Biochem. Mol. Biol. 114, 389–394.
Patton, S. (1999) Some practical implications of the milk mucins. J. Dairy Sci. 82, 1115–1117.
Patton, S. (2001) MUC1 and MUC-X, epithelial mucins of breast and milk. Adv. Exp. Med. Biol. 501, 35–45.
Pena, R.N., Sánchez, A., Coll, A. and Folch, J.M. (1999) Isolation, sequencing and relative quantitation by fluorescent-ratio PCR of feline β-lactoglobulin I, II, and III cDNAs. Mamm. Genome 10, 560–564.
Persuy, M.A., Printz, C., Medrano, J.F. and Mercier, J.C. (1999) A single nucleotide deletion resulting in a premature stop codon is associated with marked reduction of transcripts from a goat beta-casein null allele. Animal Genet. 30, 444–451.
Persuy, M.A., Printz, C., Medrano, J.F. and Mercier, J.C. (1996) One mutation might be responsible for the absence of beta-casein in two breeds of goats. Animal Genet. 27, 96.
Persuy, M.A., Legrain, S., Printz, C., Stinnakre, M.G., Lepourry, L., Brignon, G. and Mercier, J.C. (1995) High-level, stage- and mammary-tissue-specific expression of a caprine kappa-casein-encoding minigene driven by a beta-casein promoter in transgenic mice. Gene 165, 291–296.
Pettersson-Kastberg, J., Aits, S., Gustafsson, L., Mossberg, A., Storm, P., Trulsson, M., Persson, F., Mok, K.H. and Svanborg, C. (2009) Can misfolded proteins be beneficial? The HAMLET case. Ann. Med. 41, 162–176.
Pierre, A., Michel, F. and Le Graet, Y. (1995) Variation in size of goat milk casein micelles related to casein genotype. Lait 75, 489–502.
Pisano, A., Packer, N.H., Redmond, J.W., Williams, K.L. and Gooley, A.A. (1994) Characterization of O-linked glycosylation motifs in the glycopeptide domain of bovine κ-casein. Glycobiology 4, 837–1994.
Provot, C., Persuy, M.-A. and Mercier, J.-C. (1995) Complete sequence of the ovine β-casein-encoding gene and interspecies comparison. Gene 154, 259–263.
Qasba, P.K. and Kumar, S. (1997) Molecular divergence of lysozymes and α-lactalbumin. Crit. Rev. Biochem. Mol. Biol. 32, 255–306.
Rando, A., Pappalardo, M., Capuano, M., Di Gregorio, P. and Ramunno, L. (1996) Two mutations might be responsible for the absence of beta-casein in goat milk. Animal Genet. 27, 31.
Ranganathan, S., Simpson, K.J., Shaw, D.C. and Nicholas, K.R. (2000) The whey acidic protein family: a new signature motif and three-dimensional structure by comparative modeling. J. Mol. Graph. Model. 17, 106–113.
Rasero, R., Bianchi, L., Cauvin, E., Maione, S., Sartore, S., Soglia, D. and Sacchi, P. (2007) Analysis of the sheep MUC1 gene: Structure of the repetitive region and polymorphism. J. Dairy Sci. 90, 1024–1028.
Rasmussen, L.K., Johnsen, L.B., Tsiora, A., Sorensen, E.S., Thomsen, J.K., Nielsen, N.C., Jakobsen, H.J. and Petersen, T.E. (1999) Disulphide-linked caseins and casein micelles. Int. Dairy J. 9, 215–218.
Rasmussen, L.K., Due, H.A. and Petersen, T.E. (1995) Human αs1-casein: purification and characterization. Comp. Biochem. Physiol. 111B, 75–81.
Remeuf, F. (1993) Influence du polymorphisme génétique de la caséine αs1 caprine sur les caractéristiques physico-chimiques et technologiques du lait. Lait 73, 549–557.
Reinhardt, T.A. and Lippolis, J. (2006) Bovine milk fat globule membrane proteome. J. Dairy Res. 73, 406–416.
Rhoads, R.E. and Grudzien-Nogalska, E. (2007) Translational regulation of milk protein synthesis at secretory activation. J. Mam. Gland Biol. Neopl. 12, 283–292.
Rhodes, D.A., Stammers, M., Malcherek, G., Beck, S. and Trowsdale, J. (2001) The cluster of BTN genes in the extended major histocompatibility complex. Genomics 71, 351–362.
Ribadeau Dumas, B. and Brignon, G. (1993) Les protéines du lait de différentes espèces, in: Progrès en Pédiatrie 10. Allergies Alimentaires, J. Navarro and Schmitz, J., eds., Doin, Paris, France. pp. 27–39.
Rijnkels, M., Kooiman, P.M., Krimpenfort, P.J.A., de Boer, H.A. and Pieper, F.R. (1995) Expression analysis of the individual bovine beta-, alpha s2- and kappa-casein genes in transgenic mice. Biochem. J. 311, 929–937.
Rijnkels, M., Kooiman, P.M., de Boer, H.A. and Pieper, F.R. (1997) Organisation of the bovine casein gene locus. Mamm. Genome 8, 148–152.
Rijnkels, M., Elnitski, L., Miller, W. and Rosen, J.M. (2003) Multispecies comparative analysis of mammalian-specific genomic domain encoding secretory proteins. Genomics 82, 417–432.
Robenek, H., Hofnagel, O., Buers, I., Lorkowski, S., Schnoor, M., Robenek, M.J., Heid, H., Troyer, D. and Severs, N.J. (2006) Butyrophilin controls milk fat globule secretion. Proc. Nat. Acad. Sci. U.S.A. 103, 10385–10390.

Russo, V. and Davoli, R. (1983) Polymorphism of ovine and caprine milk proteins. Proc. of Vth National Congress S.I.P.A.O.C. (Italian Society for the Pathology and Rearing of Goats and Ewes) Acireale, 9–11 December, Italy. pp. 541–555.
Sacchi, P., Caroli, A., Cauvin, E., Maione, S., Sartore, S., Soglia, D. and Rasero, R. (2004) Analysis of the MUC1 gene and its polymorphism in Capra hircus. J. Dairy Sci. 87, 3017–3021.
Saito, T. and Itoh, T. (1992) Variations and distributions of O-glycosidically linked sugar chains in bovine κ-casein. J. Dairy Sci. 75, 1768–1774.
Saito, T., Itoh, T. and Adachi, S. (1988) Chemical structure of neutral sugar chains isolated from human mature milk κ-casein. Biochim. Biophys. Acta 964, 213–220.
Sando, L., Pearson, R., Gray, C., Parker, P., Hawken, R., Thomson, P.C., Meadows, J.R., Kongsuwan, K., Smith, S. and Tellam, R.L. (2009) Bovine Muc1 is a highly polymorphic gene encoding an extensively glycosylated mucin that binds bacteria. J. Dairy Sci. 92, 5276–5291.
Sasaki, T., Sasaki, M. and Enami, J. (1993) Mouse γ-casein cDNA: PCR cloning and sequence analysis. Zool. Sci. 10, 65–72.
Sawyer, L. (2003) β-lactoglobulin, in Advanced Dairy Chemistry, Proteins, Part A, Vol. 1, 3rd edn., P.F. Fox and P.L.H. McSweeney, eds., Kluwer Academic/Plenum Publishers, New York. pp. 319–386.
Schanbacher, F.L., Goodman, R.E. and Talhouk, R.S. (1993) Bovine mammary lactoferrin: implications from messenger ribonucleic acid (mRNA) sequence and regulation contrary to other milk proteins. J. Dairy Sci. 76, 3812–3831.
Schmidt, D.G. (1982) Association of caseins and casein micelle structure, in Developments in Dairy Chemistry, Vol. 1. P.F. Fox, ed., Applied Science, London. pp. 61–86.
Schroten, H. (1998) The benefits of human milk fat globule against infection. Nutrition 14, 52–53.
Sharp, J.A., Cane, K.N., Lefevre, C., Arnould, J.P. and Nicholas, K.R. (2006) Fur seal adaptations to lactation: insights into mammary gland function. Curr. Top Dev. Biol. 72, 275–308.
Sharp, J.A., Lefèvre, C. and Nicholas, K.R. (2007) Molecular evolution of monotreme and marsupial whey acidic protein genes. Evol. Dev. 9, 378–392.
Sharp, J.A., Lefèvre, C. and Nicholas, K.R. (2008) Lack of functional alpha-lactalbumin prevents involution in Cape fur seal and identifies the protein as an apoptotic milk factor in mammary gland involution. BMC Biol. 6, 48.
Simpson, K.J. and Nicholas, K.R. (2002) The comparative biology of whey proteins. J. Mamm. Gland Biol. Neopl. 7, 313–326.
Singh, P.K. and Hollingsworth, M.A. (2006) Cell surface-associated mucins in signal transduction. Trends Cell Biol. 16, 467–476.
Slattery, C.W. and Evard, R. (1973) A model for the formation and structure of casein micelles from subunits of variable composition. Biochim. Biophys. Acta 317, 529–538.
Smith, C.W., Chu, T.T. and Nadal-Ginard, B. (1993) Scanning and competition between AGs are involved in 3′ splice site selection in mammalian introns. Mol. Cell. Biol. 13, 4939–4952.
Soulier, S., Sarfati, R.S. and Szabo, L. (1980) Structure of the asialyl oligosaccharide chains of kappa-casein isolated from ovine colostrum. Eur. J. Biochem. 108, 465–472.
Soulier, S. and Gaye, P. (1981) Enzymatic O-glycosylation of κ-caseinomacropeptide by ovine mammary Golgi membranes. Biochimie 63, 619–628.
Spicer, A.P., Parry, G., Patton, S. and Gendler, S.J. (1991) Molecular cloning and analysis of the mouse homologue of the tumor-associated mucin, MUC1, reveals conservation of potential O-glycosylation sites, transmembrane, and cytoplasmic domains and a loss of minisatellite-like polymorphism. J. Biol. Chem. 266, 15099–15109.
Stewart, A.F., Bonsing, J., Beattie, C.W., Shah, F., Willis, I.M. and MacKinlay, A.G. (1987) Complete nucleotide sequences of bovine αs2- and β-casein cDNAs: comparisons with related sequences in other species. Mol. Biol. Evol. 4, 231–241.
Stinnakre, M.-G., Vilotte, J.-L., Soulier, S. and Mercier, J.-C. (1994) Creation and phenotypic analysis of alpha-lactalbumin-deficient mice. Proc. Natl. Acad. Sci. U.S.A. 91, 6544–6548.
Thépot, D., Devinoy, E., Fontaine, M.-L. and Houdebine, L.-M. (1991) Structure of the gene encoding rabbit β-casein. Gene 97, 301–306.
Threadgill, D.W. and Womack, J.E. (1990) Genomic analysis of the major bovine casein genes. Nucleic Acids Res. 18, 6935–6942.
Topcic, D., Auguste, A., De Leo, A.A., Lefèvre, C., Digby, M.R. and Nicholas, K.R. (2009) Characterization of the tammar wallaby (Macropus eugenii) whey acidic protein gene; new insight into the function of the protein. Evol. Dev. 11, 363–375.
Valentine, C.R. (1998) The association of the nonsense codons with exon skipping. Mutation Res. 411, 87–117.
van Halbeek, H., Vliegenthart, J.F.G., Fiat, A.-M. and Jollès, P. (1985) Isolation and structural characterisation of the smaller-size oligosaccharide from desialylated human κ-casein. Establishment of a novel type of core for a mucin-type carbohydrate chain. FEBS Lett. 187, 81–88.
Voelker, G.R., Bleck, G.T. and Wheeler, M.B. (1997) Single-base polymorphisms within the 5′ flanking region of the bovine α-lactoalbumin gene. J. Dairy Sci. 80, 194–197.
Vogan, K.J., Underhill, D.A. and Gros, P. (1996) An alternative splicing event in the Pax-3 paired domain identifies the linker region as a key determinant of paired domain DNA-binding activity. Mol. Cell Biol. 12, 6677–6686.
Vogel, H.J., Schibli, D.J., Jing, W., Lohmeier-Vogel, E.M., Epand, R.F. and Epand, R.M. (2002) Towards a structure-function analysis of bovine lactoferricin and related tryptophan- and arginine-containing peptides. Biochem. Cell Biol. 80, 49–63.

Walstra, P. (1990) On the stability of casein micelles. J. Dairy Sci. 73, 1965–1979.
Ward, P.P., Uribe-Luna, S. and Conneely, O.M. (2002) Lactoferrin and host defense. Biochem. Cell Biol. 80, 95–102.
Warren, W.C., Hillier, L.W., Marshall Graves, J.A., Birney, E., Ponting, C.P., Grützner, F., Belov, K., Miller, W., Clarke, L., Chinwalla, A.T., Yang, S.P., Heger, A., Locke, D.P., Miethke, P., Waters, P.D., Veyrunes, F., Fulton, L., Fulton, B., Graves, T., Wallis, J., Puente, X.S., López-Otín, C., Ordóñez, G.R., Eichler, E.E., Chen, L., Cheng, Z., Deakin, J.E., Alsop, A., Thompson, K., Kirby, P., Papenfuss, A.T., Wakefield, M.J., Olender, T., Lancet, D., Huttley, G.A., Smit, A.F., Pask, A., Temple-Smith, P., Batzer, M.A., Walker, J.A., Konkel, M.K., Harris, R.S., Whittington, C.M., Wong, E.S., Gemmell, N.J., Buschiazzo, E., Vargas Jentzsch, I.M., Merkel, A., Schmitz, J., Zemann, A., Churakov, G., Kriegs, J.O., Brosius, J., Murchison, E.P., Sachidanandam, R., Smith, C., Hannon, G.J., Tsend-Ayush, E., McMillan, D., Attenborough, R., Rens, W., Ferguson-Smith, M., Lefèvre, C.M., Sharp, J.A., Nicholas, K.R., Ray, D.A., Kube, M., Reinhardt, R., Pringle, T.H., Taylor, J., Jones, R.C., Nixon, B., Dacheux, J.L., Niwa, H., Sekita, Y., Huang, X., Stark, A., Kheradpour, P., Kellis, M., Flicek, P., Chen, Y., Webber, C., Hardison, R., Nelson, J., Hallsworth-Pepin, K., Delehaunty, K., Markovic, C., Minx, P., Feng, Y., Kremitzki, C., Mitreva, M., Glasscock, J., Wylie, T., Wohldmann, P., Thiru, P., Nhan, M.N., Pohl, C.S., Smith, S.M., Hou, S., Nefedov, M., de Jong, P.J., Renfree, M.B., Mardis, E.R. and Wilson, R.K. (2008) Genome analysis of the platypus reveals unique signatures of evolution. Nature 453 (7192), 175–183.
Wheeler, T.T., Kuys, Y.M., Broadhurst, M.M. and Molenaar, A.J. (1997) Mammary STAT5 abundance and activity are not altered with lactation state in cows. Mol. Cell. Endocrinol. 133, 141–149.
Wilson, N.L., Robinson, L.J., Donnet, A., Bovetto, L., Packer, N.H. and Karlsson, N.G. (2008) Glycoproteomics of milk: differences in sugar epitopes on human and bovine milk fat globule membranes. J. Proteome Res. 7, 3687–3696.
Winklehner-Jennewein, P., Geymayer, S., Lechner, J., Welte, T., Hanson, L., Geley, S. and Doppler, W. (1998) A distal enhancer region in the human β-casein gene mediates the response to prolactin and glucocorticoid hormones. Gene 217, 127–139.
Zeder, M.A. (2008) Domestication and early agriculture in the Mediterranean Basin: origins, diffusion, and impact. Proc. Natl. Acad. Sci. USA. 105, 11597–11604.
Zhang, J., Perez, A., Yasin, M., Soto, P., Rong, M., Theodoropoulos, G., Carothers Carraway, C.A. and Carraway, K.L. (2005) Presence of MUC4 in human milk and at the luminal surfaces of blood vessels. J. Cell. Physiol. 204, 166–177.
Zhang, J., Sun, X., Qian, Y., LaDuca, J.P. and Maquat, L.E. (1998a) At least one intron is required for the nonsense-mediated decay of triosephosphate isomerase mRNA: a possible link between nuclear splicing and cytoplasmic translation. Mol. Cell Biol. 18, 5272–5283.
Zhang, J., Sun, X., Qian, Y., LaDuca, J.P. and Maquat, L.E. (1998b) Intron function in the nonsense-mediated decay of beta-globin mRNA: indications that pre-mRNA splicing in the nucleus can influence mRNA translation in the cytoplasm. RNA 4, 801–815.

14 Genetics and Biosynthesis of Milk Proteins

J.-L. Vilotte, E. Chanat, F. Le Provost, C.B.A. Whitelaw, A. Kolb, and D.B. Shennan

J.-L. Vilotte • F. Le Provost
UMR1313 Génétique Animale et Biologie Intégrative, Institut National de la Recherche Agronomique, INRA, 78352 Jouy-en-Josas Cedex, France

E. Chanat
UR1196 Génomique et physiologie de la lactation, Institut National de la Recherche Agronomique, INRA, 78352 Jouy-en-Josas Cedex, France

C.B.A. Whitelaw
Division of Molecular Biology, Roslin Institute (Edinburgh), Roslin, Midlothian, EH25 9PS, UK

A. Kolb
Metabolic Health Theme, Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, AB21 9SB, UK

D.B. Shennan
Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, G1 1XW, UK

P.L.H. McSweeney and P.F. Fox (eds.), Advanced Dairy Chemistry: Volume 1A: Proteins: Basic Aspects, 4th Edition, DOI 10.1007/978-1-4614-4714-6_14, © Springer Science+Business Media New York 2013

14.1 Introduction

During lactation, mammary epithelial cells secrete large quantities of milk proteins. More than 90 % of these proteins are derived from the transcription of a few tissue-specific genes, the expression of which is under a complex multi-hormonal regulation that involves both transcriptional and post-transcriptional mechanisms. Furthermore, to fulfil its bioreactor activity, the mammary gland needs an optimal supply of amino acids as well as efficient translation and transport machineries during lactation.

The previous edition of this chapter (Vilotte et al., 2002) described in detail the hormonal regulation of milk protein gene expression, their mRNA and gene structures, their co- and post-translational modifications and the transport and secretion of milk proteins. The aim of this revision is to summarize briefly our knowledge on the structure of the milk protein genes and to put into context the rapid growth of information on the regulatory elements involved in controlling the expression of these genes. We will also focus on the amino acid supply to the mammary gland and on the intracellular routing and sorting of milk proteins in mammary cells. However, other important topics will not be discussed. The widespread presence of casein variants will be covered in Chap. 15, while the practical applications of these studies for the dairy field will be described in Chap. 16. Similarly, global analysis of genome evolution with regard to the mammary gland, as described in Lemay et al. (2009), will be discussed in other chapters of this book.

14.2 Structure of Milk Protein Genes

The major milk protein genes have been sequenced in several species. Overall, the mosaic structure of these genes has been well conserved during evolution, and observed species differences in the length of their transcription unit can often be attributed to the occurrence of repetitive DNA within some introns, mainly artiodactyl

retroposons. Similarly, deletions or insertions of amino acids between caseins from different species or variants appear to occur mostly by exon skipping. These genes share the canonical structure of tissue-specific eukaryotic genes. Beyond these basic similarities, these genes differ substantially in their genomic organization.

14.2.1 The Casein-Encoding Genes

14.2.1.1 General Organization of the Gene Cluster

Classical genetic studies have demonstrated that the four casein genes are closely linked, and a general organization of the gene cluster was deduced (reviewed by Grosclaude, 1979). Since then, structural analysis of the gene cluster at the DNA level in various species has confirmed and refined the protein data (Fig. 14.1; Tomlinson et al., 1996; George et al., 1997; Fujiwara et al., 1997; Rijnkels et al., 1997a,b,c; 2002; Lefèvre et al., 2009). The overall size of the casein locus varies from around 50 kb in opossum to 370 kb in humans, but the order and the orientation of the genes within the cluster are conserved (Fig. 14.1). In mice, the γ- and δ-casein genes, which are linked within 60 kb, encode an αs2-like protein, and sequence analysis suggests that the ancestral αs2-casein gene duplicated at the time of radiation between rodents and Artiodactyla (George et al., 1997; Rijnkels et al., 1997a). A similar duplication of the αs2-casein gene is also suspected to have occurred in rabbits (Dawson et al., 1993). A recent duplication of β-casein was evidenced in the monotreme lineage, and the duplication of the α-casein that occurred in the eutherian lineage is absent in marsupials (Lefèvre et al., 2009).

Fig. 14.1 Overall organization of the casein locus in various species (platypus, opossum, bovine, murine and human). Green boxes represent the transcription unit of the genes. White boxes: putative genes. When available, orientation of the gene transcription unit is indicated by an arrow. Origins of the data are mentioned in the text

Interestingly, the casein locus encompasses several other genes, the expression profile of which

may not be restricted to the mammary gland. Inactivation of the β-casein gene (Kumar et al., 1994) did not prevent expression of the remaining caseins within the locus, suggesting that they are expressed independently of each other.

The casein locus has been localized to chromosome 6 in the cow, sheep and goat (Threadgill and Womack, 1990; Hayes et al., 1992; Hayes et al., 1993a; Gallagher et al., 1994); 5 in mice (Geissler et al., 1988); 12 in rabbits (Gellin et al., 1985); 4 in humans; 3 in chimpanzees (McConkey et al., 1996); and 5 in opossum (Lefèvre et al., 2009).

14.2.1.2 Individual Gene Structures

Internal homologies within αs1- and αs2-casein proteins suggested that the cognate genes evolved through intragenic duplications, a result confirmed at the DNA level by the observed duplication of intron-exon-intron stretches. The ubiquity of the major phosphorylation site and the striking homology of signal peptides of αs1-, αs2- and β-casein proteins indicated a possible common origin of these calcium-sensitive casein-encoding genes. This hypothesis was further substantiated by the identification of common sequence motifs in the proximal 5′-flanking region and the similar structural organization of the first four exons. Despite its location within the casein gene cluster, the κ-casein gene appears to have no evolutionary relationship with the calcium-sensitive casein-encoding genes. It was recently hypothesized that the milk casein genes have evolved from genes involved in tooth development before the origin of mammals (Kawasaki et al., 2011).

The structure of the αs1-casein gene has been partially analysed in rat (Yu-Lee et al., 1986) and fully determined in cow (Koczan et al., 1991), goat (Leroux et al., 1992) and rabbit (Jolivet et al., 1992), or could be deduced from several other sequenced genomes, the same being true for the other individual milk protein genes. The transcription unit of the gene is split into 19 exons and spans around 17.5 kb in ruminants (Fig. 14.2).

Fig. 14.2 Organization of the bovine casein-encoding genes. Exons are represented by boxes. White boxes: untranslated regions, black boxes: coding frame. Exons are not to scale and their sizes are indicated as base pairs. Two numbers are indicated below exons that comprise both untranslated and coding sequences. Exon I′ from the αs2-casein gene corresponds to a partially skipped exon in the sheep species. Origins of the data are mentioned in the text
Exon sizes shown in the figure (bp, 5′ to 3′; scale bar 1 kb):
αs1-casein: 53, 12/51, 33, 39, 24, 24, 24, 24, 33, 24, 54, 42, 24, 42, 27, 24, 155, 1/43, 385
αs2-casein: 44, 44 (exon I′), 12/51, 27, 21, 42, 27, 27, 27, 24, 45, 123, 27, 27, 24, 45, 120, 12/33, 266
β-casein: 44, 12/51, 27, 27, 24, 42, 498, 6/36, 322
κ-casein: 65, 5/57, 33, 483/34, 171
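The exon sizes given for the bovine β-casein gene in Fig. 14.2 offer a simple way to see why exon skipping can delete a few residues without disrupting the rest of the protein. The sketch below is only a reading aid, not an analysis taken from the chapter: the function names are hypothetical, and it assumes that the exons lying between the two exons marked with a pair of numbers are entirely coding.

```python
# Exon sizes (bp) of the bovine beta-casein gene, 5' to 3', as read from Fig. 14.2.
# A tuple marks an exon containing both untranslated and coding sequence.
# Assumption: exons lying between the two split exons are entirely coding.
BETA_CASEIN_EXONS = [44, (12, 51), 27, 27, 24, 42, 498, (6, 36), 322]


def internal_coding_exons(exons):
    """Return (exon_number, size_bp) for the exons between the two split exons."""
    split = [i for i, exon in enumerate(exons) if isinstance(exon, tuple)]
    first, last = split[0], split[-1]
    return [(i + 1, exons[i]) for i in range(first + 1, last)]


def skipping_effect(size_bp):
    """Describe what skipping one fully coding exon does to the reading frame."""
    residues, remainder = divmod(size_bp, 3)
    if remainder == 0:
        return {"in_frame": True, "residues_lost": residues}
    return {"in_frame": False, "residues_lost": None}


for number, size in internal_coding_exons(BETA_CASEIN_EXONS):
    print(number, size, skipping_effect(size))
```

Under these assumptions, every internal coding exon of the β-casein gene is a multiple of 3 bp, so skipping any one of them would remove a whole number of residues (9 for a 27-bp exon) while leaving the downstream reading frame intact.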

The structure of the αs2-casein gene has been described in cow (Groenen et al., 1993). The gene transcription unit is split into 18 exons and spans 18.5 kb (Fig. 14.2). The first intron contains a non-coding exon (exon I′ in Fig. 14.2) that is known to be retained in 4 % of the ovine mRNA (Boisnard et al., 1991). In this species, exon VI is also partially skipped. Sequence comparisons suggest that this gene is more closely related to the β-casein gene than it is to the αs1-casein gene (Groenen et al., 1993).

The structure of the β-casein gene is known in mouse (Yoshimura and Oka, 1989), cow (Bonsing et al., 1988), rat (Jones et al., 1986), rabbit (Thépot et al., 1991), goat (Roberts et al., 1992), human (Hansson et al., 1994) and sheep (Provot et al., 1995). Its transcription unit is composed of 9 exons and spans between 8 and 10 kb according to differences between species in the length of intronic sequences (Fig. 14.2). Exon III is skipped in humans, leading to the deletion of 9 amino acids in the mature protein.

Characterisation of the κ-casein gene has been reported in cow (Alexander et al., 1988; Kapelinskaia et al., 1989), human (Edlund et al., 1996) and rabbit (Baranyi et al., 1996). The transcription unit of this gene comprises 5 exons, and its length varies from 7.5 kb in rabbit to 12.5 kb in cow (Fig. 14.2). This gene is evolutionarily related to fibrinogens (Jollès et al., 1974). Indeed, a 24 bp sequence located at the 5′ end of exon IV of the gene was found to be similar to the end of exon II of the γ-fibrinogen gene, suggesting that it represents an exon from the ancestral gene (Alexander et al., 1988).

14.2.2 The Major Whey Protein-Encoding Genes

14.2.2.1 The β-Lactoglobulin-Encoding Gene and Pseudogenes

The structure of the β-lactoglobulin-encoding gene has been reported in sheep (Ali and Clark, 1988; Harris et al., 1988), cow (Alexander et al., 1993), goat (Folch et al., 1994), tammar wallaby (Collet and Joseph, 1995) and horse (Lear et al., 1998). The structure of this gene, with seven exons and a transcription unit length of around 4.8 kb, was conserved during evolution (Fig. 14.3). In dogs and horses, two functional and closely linked genes are present in the genome (Halliday et al., 1990; Lear et al., 1998). In cats, a third functional gene is present (Halliday et al., 1990; Pena et al., 1999).

In ruminants, a β-lactoglobulin pseudogene has been identified (Passey and MacKinlay, 1995; Folch et al., 1996), while in cow and goat a gene conversion event has occurred. The bovine and caprine pseudogenes contain seven exons, and the ancestral protein that they encode is related to the monomeric β-lactoglobulin II protein. The bovine pseudogene is located 14 kb 5′ from the functional gene, in a similar orientation (Passey and MacKinlay, 1995).

β-Lactoglobulin belongs to the lipocalin protein family (Flower, 1996, for review). Comparison of the structure of various lipocalin genes with that of the β-lactoglobulin gene has revealed striking similarities, confirming further their evolutionary relationship (Ali and Clark, 1988). The β-lactoglobulin gene(s) and pseudogene were assigned to chromosome 11 in cow and goat, 3 in sheep (Hayes and Petit, 1993c; Folch et al., 1996) and 28 in horse (Lear et al., 1998).

14.2.2.2 The α-Lactalbumin Gene and Pseudogenes

The α-lactalbumin gene has been sequenced in rat (Qasba and Safaya, 1984), cow (Vilotte et al., 1987), human (Hall et al., 1987), guinea pig (Laird et al., 1988), goat (Vilotte et al., 1991), mouse (Vilotte and Soulier, 1992), tammar wallaby (Collet and Joseph, 1995) and otariid and phocid seals (Sharp et al., 2008). In eutherians, the gene is composed of four exons and its transcription unit is about 2 kb in length (Fig. 14.3). It shares the same structural organization with the lysozyme gene, corroborating the hypothesis of a common ancestor (Qasba and Safaya, 1984). The structure of the tammar wallaby α-lactalbumin gene appears different with the occurrence of a putative 5′-untranslated first exon (Collet and Joseph, 1995). In ruminants, the occurrence of related sequences, probably pseudogenes, has been reported (Soulier et al., 1989; Vilotte et al.,

1991; 1993). All α-lactalbumin-related sequences are closely linked (Hayes et al., 1993b; Gallagher et al., 1993) and located 3′ to the functional gene (quoted in Stinnakre et al., 1999). The α-lactalbumin-encoding gene (and related sequences) has been assigned to human chromosome 12 (Davies et al., 1987), sheep chromosome 3, bovine and goat chromosomes 5 (Hayes et al., 1993b) and pig chromosome 5 (Rohrer et al., 1997). In Cape fur seals, the gene appears to be transcriptionally silenced; comparative analysis of the proximal promoter sequence revealed some differences, but none appears to be responsible (Sharp et al., 2008).

14.2.2.3 The Whey Acidic Protein-Encoding Gene

The whey acidic protein (WAP) gene was originally thought to be present only in rodents (Campbell et al., 1984) but has been sequenced in various eutherians (Thépot et al., 1990; Rival-Gervier et al., 2003), marsupials and monotremes (Topcic et al., 2009, for review). The 2 kb transcription unit is composed of four exons in eutherians and of five exons in marsupials and monotremes (Fig. 14.3). In ruminants, the WAP gene is characterized by a nucleotide deletion at the end of exon one and a lack of detectable transcription. It thus appeared to be a pseudogene (Hajjoubi et al., 2006). The functional role for WAP in milk is unknown, although it bears a similarity to a family of protease inhibitors; however, its absence in WAP-deficient mice leads to nutritional deficiencies in the offspring which appear to be unrelated to its activity as a protease inhibitor (Triplett et al., 2005). Both WAP and protease inhibitors of the Kunitz family are characterized by highly conserved cysteine residues located in two protein domains (Hennighausen and Sippel, 1982). The murine WAP gene has been assigned to chromosome 11 (Gupta et al., 1982).

Fig. 14.3 Organization of the major whey protein-encoding genes. Exons are represented by boxes. White boxes: untranslated regions, black boxes: coding frame, grey box: exon 2 from marsupials and monotremes whey acidic protein-encoding gene. Exons are not to scale and their sizes are indicated as base pairs. Two numbers are indicated below exons that comprise both untranslated and coding sequences. Origins of the data are mentioned in the text
Exon sizes shown in the figure (bp, 5′ to 3′; scale bar 1 kb):
Bovine β-lactoglobulin: 45/90, 140, 74, 111, 105, 17/25, 183
Bovine α-lactalbumin: 27/133, 159, 76, 61/269
Rabbit WAP: 46/82, 126, 159, 17/113
Tammar WAP: 23/47, 147, 167, 149, 66/125
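The exon sizes shown in Fig. 14.3 can be tallied to compare the whey protein genes at a glance. The following sketch is illustrative only: the helper name is hypothetical, and it assumes that the first split exon is listed as untranslated/coding and the last split exon as coding/untranslated, which the figure legend does not state explicitly.

```python
# Exon sizes (bp), 5' to 3', as read from Fig. 14.3.
# A tuple marks an exon containing both untranslated and coding sequence.
WHEY_GENE_EXONS = {
    "bovine beta-lactoglobulin": [(45, 90), 140, 74, 111, 105, (17, 25), 183],
    "bovine alpha-lactalbumin": [(27, 133), 159, 76, (61, 269)],
    "rabbit WAP": [(46, 82), 126, 159, (17, 113)],
    "tammar WAP": [(23, 47), 147, 167, 149, (66, 125)],
}


def summarize(exons):
    """Return the exon count and coding length implied by one row of the figure.

    Assumption: the first split exon is (untranslated, coding), the last split
    exon is (coding, untranslated), and the exons between them are fully coding.
    """
    split = [i for i, exon in enumerate(exons) if isinstance(exon, tuple)]
    first, last = split[0], split[-1]
    coding_bp = exons[first][1] + sum(exons[first + 1:last]) + exons[last][0]
    return {
        "exons": len(exons),
        "coding_bp": coding_bp,
        "whole_codons": coding_bp % 3 == 0,
    }


for gene, exons in WHEY_GENE_EXONS.items():
    print(gene, summarize(exons))
```

With this reading, each of the four coding regions comes out as a whole number of codons (537 bp, or 179 codons, for bovine β-lactoglobulin), and the exon counts agree with the text: seven for β-lactoglobulin, four for α-lactalbumin and rabbit WAP, and five for tammar WAP.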

436 J.-L. Vilotte et al.

14.3 Milk Protein-Encoding Gene Expression and Regulation

14.3.1 Tissue Specificity and Developmental Regulation

The major milk protein genes are defined as mammary-specific and developmentally regulated expressed genes. As such, they represent markers of mammary differentiation. The amount of milk protein mRNA in mammary epithelial cells increases steadily from mid-pregnancy to lactation, although at different rates according to the different genes (Harris et al., 1990; Robinson et al., 1995). This is due to an increase in the transcription rate of these genes as well as a stabilization of the transcripts (Guyette et al., 1979). An observed asynchrony of mammary epithelial cell maturation during pregnancy (Robinson et al., 1995) is paralleled by heterogeneous expression during lactation of the major milk protein genes in sheep and cattle (Molenaar et al., 1992). This heterogeneous pattern of expression was not observed in lactating mouse mammary glands (Dobie et al., 1996). Nevertheless, a short closure of lactating murine mammary gland resulted in local perturbation of milk protein gene expression, leading to the appearance of a mosaic pattern (Faerman et al., 1995). These results strongly suggest that the regulation of the expression of the major milk protein genes in mammary epithelial cell is under a complex regulation that could involve both a graded and a binary mechanism.

Over the last few years, the concept that the expression of the milk protein genes is restricted to the mammary gland has been questioned. Expression of α-lactalbumin in the rat epididymis was reported (Qasba et al., 1983) and denied (Moore et al., 1990; Tang, 1993). The murine α-lactalbumin gene was also reported to be expressed, alongside the β-casein gene, in the sebaceous glands during lactation (Maschio et al., 1991), but this observation has not been confirmed (Persuy et al., 1992; Vilotte and Soulier, 1992). RT-PCR experiments have suggested that the rat α-lactalbumin gene is also expressed at low levels in the brain of some lactating animals (Fujiwara et al., 1999), and casein mRNA expression has been observed in cytotoxic T-lymphocyte-derived cell lines (Grusby et al., 1990). Occurrence of casein-like immunoreactive substances in diverse organs, including the thymus, was reported in the rat (Onoda and Inano, 1997). More recently, promiscuous milk protein gene expression was observed in medullary thymic epithelial cells, probably in relation with the development of central T cell tolerance (Derbinski et al., 2008). It is of interest to note that many of the tissues where ectopic expression is observed contain some of the transcription factors required for mammary expression of the milk protein genes, for example, in T cells, interleukin-2 activates STAT5 (Gilmour et al., 1995).

14.3.2 Hormonal Regulation and Identification of cis-Regulatory Elements

Expression of the major milk protein-encoding genes is under a complex multi-hormonal regulation, resulting from the interplay of steroid and polypeptide hormones. In addition, local growth factors and cell-cell and cell-substratum interactions are also involved. Since many reviews have already focused on this topic, including the previous edition of this chapter (Vilotte et al., 2002), we will only give a brief survey here.

Schematically, lactogenic hormones, such as insulin, prolactin and glucocorticoids, activate the transcription of the major milk protein genes whereas other hormones, such as progesterone, inhibit this activation in the early stages of pregnancy to favour cell proliferation over cell differentiation. It should be noted that in many systems it is difficult to separate direct induction of milk protein transcription from indirect differentiation-related events. As already mentioned, transcription activation of the major milk protein genes is not concomitant during pregnancy, perhaps due to the presence of various hormonal micro-environments and/or different responses to this environment displayed by different milk protein genes. For example, expression of calcium-sensitive casein genes that are activated at

14 Genetics and Biosynthesis of Milk Proteins 437

mid-pregnancy relies on prolactin and is increased by the synergetic action of glucocorticoids (Kabotyanski et al., 2006, 2009), while the WAP promoter, a gene expressed late in pregnancy, is reciprocally regulated by these two hormones. In addition, whereas the glucocorticoid induction of the WAP gene expression is rapid, its action on the β-casein gene occurs only with a significant time-lag and requires de novo protein synthesis. Expression of the β-lactoglobulin gene, although displaying a similar temporal expression profile to the caseins, appears to be less dependent on lactogenic hormones. Finally, the α-lactalbumin gene, although displaying a temporal expression profile similar to WAP, is induced by prolactin in the presence of low concentrations of glucocorticoids whereas high concentrations of glucocorticoids inhibit its expression, at least in eutherians (Funder, 1989). Induction of the α-lactalbumin gene in the absence of prolactin could be observed in the pregnant murine mammary explants in the presence of insulin and cortisol (Warner et al., 1993), while in marsupials, α-lactalbumin gene expression depends only on prolactin (Collet et al., 1990). The mechanistic rationale for these intriguing differences has not been identified. Finally, it was recently shown that beside transcriptional regulation and transcript stabilization, lactogenic hormones are also involved in the translational regulation of milk protein synthesis (Rhoads and Grudzien-Nogalska, 2007, for review).

Regulation of mammary gene expression is also controlled by the epithelial cell basement membrane. For example, laminin can induce the expression of α-lactalbumin, αs1-casein and β-casein by 160-fold (Aggeler et al., 1988; Blum et al., 1989). The differences observed between milk protein genes with regard to hormonal induction are also evident in their varying requirement for a basement membrane. Transcriptional control of the β-casein promoter appears less dependent on the three-dimensional structure of the mammary epithelial cells than does the WAP promoter, although both of them are sensitive to extracellular-matrix (ECM) components (Lin et al., 1995). At least for the β-lactoglobulin gene, regulation of expression by the ECM occurs through activation of STAT5 (Streuli et al., 1995), and this may occur through an ECM-dependent modulation of protein-tyrosine phosphatase activity (Edwards et al., 1998). Cell-substratum components as well as glucocorticoids can, at least for the casein genes, also act at the post-transcriptional level (Eisenstein and Rosen, 1988).

Beside these differences, milk protein genes share specific hormonal responses. For example, in most species, progesterone inhibits milk protein gene expression. Although the exact mechanism is still unclear, it has been reported that progesterone might repress expression of the long form of the prolactin receptor mRNA in the mammary gland (Mizoguchi et al., 1997) and/or directly repress the prolactin/STAT5-mediated transcription at the milk protein gene promoter level (Buser et al., 2007).

Transfection and transgenic studies have revealed that the promoters of most of the major milk protein-encoding genes are responsive to lactogenic hormones sufficiently to target mammary-specific expression of reporter genes in transgenic animals. However, these promoters cannot sustain full expression on their own (compare Webster et al., 1995 with Whitelaw et al., 1992). Indeed, intragenic sequences of some of the major milk protein-encoding genes were shown to be able to contribute to their hormonal regulation (Lee et al., 1989), and important regulatory elements have been identified within the introns (Kang et al., 1998; Kolb, 2003) and/or in the 3′ untranslated region (UTR) and flanking regions (Dale et al., 1992). Furthermore, and as already mentioned, milk protein genes are also regulated at a post-transcriptional level.

Sequence comparison of a particular gene between several species or from different milk protein genes has led to the identification of conserved DNA motifs that were suspected to be involved in the control of the gene transcription. A classical example is the high conservation of sequence elements in the proximal 5′-flanking region (−200/+1) of the calcium-sensitive casein genes (Rosen, 1987; Kolb, 2002). These early identified elements were subsequently found to
be recognized by regulatory nuclear factors. Consensus sequences recognized by effectors known to be involved in lactogenesis were also identified within the major milk protein gene promoter sequences by computer searches. However, evidence for the presence of a cis-regulatory element within a DNA fragment came from transfection experiments in cell cultures, transgenic studies, DNAse I protection, footprinting and/or gel shift assays. Site-directed mutagenesis of the identified binding sites and observations, either in cell culture or in transgenics, of the consequences of such mutations on the promoter transcriptional regulation was sometimes performed to further define the functional role of these elements.

14.3.3 Transcriptional Control of Milk Protein Genes

Binding sites for several transcription factors have been identified within the promoters of most of the major milk protein-encoding genes, such as binding sites for OCT-1, NF-1, C/EBP, STAT5, GR, Ets-1 and YY1 (see Rosen et al., 1996, for review). Other DNA elements have been shown to interact with yet unidentified effectors, such as the negative regulatory elements of the WAP promoter (Kolb et al., 1994) and of the β-casein promoter (Lee and Oka, 1992; Altiok and Groner, 1993, 1994). Most of these sequences appear to be clustered within short DNA fragments of several hundred base pairs in length that encompassed both positive and negative regulatory elements. Such composite response elements have been identified in the proximal 5′-flanking regions of the β-lactoglobulin gene (region −406/+1; Watson et al., 1991), of the calcium-sensitive casein genes (region −200/+1; Rosen et al., 1986; 1998 for review), in more distal regions of the bovine (BCE-1 element: region −1613/−1562; Schmidhausser et al., 1992; Myers et al., 1998) and human (region −4700/−4550; Winklehner-Jennewein et al., 1998) β-casein genes, of the rabbit αs1-casein gene (region −3442/−3118; Pierre et al., 1994) and of the rat WAP gene (region −949/−720; Li and Rosen, 1994a, 1995; Raught et al., 1995). A cooperation of distal and proximal promoter elements is required to achieve both maximum expression and maximum hormone responsiveness of the murine β-casein gene in mouse HC11 cells (Robinson and Kolbs, 2009). Thus, it appears that the transcriptional regulation of the major milk protein genes is under a combinatorial control with the binding of multiprotein complexes that can either repress or activate gene expression (see Wolberger, 1998, for review).

Differences in the composition of these composite regulatory elements may explain the observed hormonal differences in the transcriptional developmental regulation between the various major milk protein genes. In the WAP gene, for example, an Ets-1 binding site located at −110 appears to be important for the stage-specific transcriptional activation of the gene but not for its stable expression during lactation (McKnight et al., 1995). Activation of promoters by transcription factors can be mediated by the relief of the binding of transcription repressors through both the competitive binding of these activators and the hormonal regulation of the expression or activation of these factors (Schmitt-Ney et al., 1991). Thus, some negative binding factors appear to be present in the mammary gland only during pregnancy but not during lactation (Lee and Oka, 1992). Some of them mediate the inhibitory action of progesterone. Similarly, expression of C/EBPβ protein isoforms that are essential both for mammogenesis and lactogenesis (Robinson et al., 1998; Seagroves et al., 1998) is regulated during pregnancy and lactation. The ratio between LIP, a dominant-negative transcriptional repressor, and LAP, which is an activator of transcription, is high during the pregnancy stage and decreases during lactation due, in part, to the inhibition of LIP expression by glucocorticoids (Raught et al., 1995). Expression of C/EBPα is also increased during lactation, possibly following stimulation by some ECM components (Raught et al., 1995). The action of another factor, the transcription factor YY1, that represses expression of the β-casein promoter by binding to it at position −120/−110 is counteracted by its replacement from its binding site by the
prolactin-activated STAT5 protein (see below), which in turn positively regulates the promoter activity (Meier and Groner, 1994). A more surprising example of regulation via binding of negative regulatory factors is given by the two proteins that bind the upper strand of the β-casein promoter at position −221/−170 during pregnancy and involution (Altiok and Groner, 1994). During lactation, a molecule inhibits the binding of these factors to the gene promoter, and it is suspected that this molecule could be the β-casein mRNA itself that possesses high-affinity binding sites for the two proteins in its 5′ UTR (Altiok and Groner, 1994).

Much has been discovered about how milk protein genes are regulated, and the involvement of many transcription factors has been described. Notwithstanding all this information, the identification of STAT5 as the end point of prolactin signalling in the mammary gland heralded a new era in our understanding of mammary gene regulation. All the more so, since the STAT proteins are central to all cytokine responses (Heim, 1999). Much of the work on STAT proteins has been pioneered by studies involving mammary genes (Schmitt-Ney et al., 1991; Watson et al., 1991).

14.3.4 Prolactin Signal Transduction

In the mammary gland, prolactin induction results in the expression of the milk protein genes. This occurs through a rapid but transient signalling transduction pathway (Heim, 1999). First, prolactin-induced dimerisation of its receptor causes trans-phosphorylation of the kinase which is constitutively associated with the cytoplasmic domain of the receptor. The kinase is called JAK2 (for janus kinase 2). The activated JAK2 phosphorylates the receptor creating a docking site for STAT5 through its SH2 domain. Subsequent phosphorylation and dimerisation of STAT5 result in its translocation to the nucleus where it binds to GAS (γ-interferon activation sequences) elements in target genes, for example, β-lactoglobulin. Many other proteins are associated with this pathway, for example, MAPK and SOCS, resulting in a logistical problem for the cell to manage if it is to maintain tight regulation of the signal. It does this by balancing the activation signal with the generation of factors that inhibit the signal (Starr and Hilton, 1999; Tomic et al., 1999).

STAT5 is an important positive transcription factor in the transcriptional regulation of milk protein genes (see Barash, 2006, for review). It is involved in the transduction of the prolactin signal. Functional STAT5 binding sites have been identified in the promoter region of almost all major milk protein-encoding genes, with the potential exception of the κ-casein-encoding gene (Adachi et al., 1996). Within calcium-sensitive casein and β-lactoglobulin composite elements, the occurrence of multiple STAT5 binding sites is observed. These STAT5 binding sites were shown to be essential to confer prolactin transcriptional stimulation to the linked promoter (Schmitt-Ney et al., 1991; Demmer et al., 1995; Jolivet et al., 1996; Soulier et al., 1999). Furthermore, several experiments suggest that STAT5 effects are limited to the modulation of expression level, but are not involved in determining the tissue specificity of expression. For example, mutation of one or several of the STAT5 binding sites within the β-lactoglobulin promoter did not affect the tissue-specific expression of this gene in transgenic mice (Burdon et al., 1994).

The STAT5 transcription factor actually consists of two proteins, STAT5a and STAT5b. These highly similar proteins are encoded by two genes, with differences in their binding affinities due to a single amino acid substitution (Boucheron et al., 1998). The role of these proteins has been studied using gene knockout approaches, with STAT5a emerging as the major factor required for milk protein gene expression. STAT5a-deficient mice exhibit defective mammary gland development (Liu et al., 1997). As one might expect, given the different response of the various milk protein genes to hormones, STAT5a knockout mice express the milk protein genes to different levels. Essentially normal levels of β-casein and α-lactalbumin mRNA levels are detected, with only WAP mRNA levels showing

