Scholarly article on topic 'Single-Cell Transcript Profiles Reveal Multilineage Priming in Early Progenitors Derived from Lgr5+ Intestinal Stem Cells'

Academic journal
Cell Reports

Abstract of research paper on Biological sciences, author of scientific article — Tae-Hee Kim, Assieh Saadatpour, Guoji Guo, Madhurima Saxena, Alessia Cavazza, et al.

Summary Lgr5+ intestinal stem cells (ISCs) drive epithelial self-renewal, and their immediate progeny—intestinal bipotential progenitors—produce absorptive and secretory lineages via lateral inhibition. To define features of early transit from the ISC compartment, we used a microfluidics approach to measure selected stem- and lineage-specific transcripts in single Lgr5+ cells. We identified two distinct cell populations, one that expresses known ISC markers and a second, abundant population that simultaneously expresses markers of stem and mature absorptive and secretory cells. Single-molecule mRNA in situ hybridization and immunofluorescence verified expression of lineage-restricted genes in a subset of Lgr5+ cells in vivo. Transcriptional network analysis revealed that one group of Lgr5+ cells arises from the other and displays characteristics expected of bipotential progenitors, including activation of Notch ligand and cell-cycle-inhibitor genes. These findings define the earliest steps in ISC differentiation and reveal multilineage gene priming as a fundamental property of the process.



Tae-Hee Kim, Assieh Saadatpour,

Guoji Guo.....Stuart H. Orkin,

Guo-Cheng Yuan, Ramesh A. Shivdasani

Correspondence (G.-C.Y.), (R.A.S.)

In Brief

Characterizing the earliest cells to exit in vivo stem-cell compartments is a challenge. Kim et al. demonstrate multilineage priming—co-expression of markers for both the absorptive and secretory daughter lineages—in the earliest progeny of Lgr5+ intestinal crypt stem cells.


• Single-cell analyses of intestinal Lgr5+ cells identify two discrete populations

• Both pools express stem-cell genes, but only one activates terminal cell markers

• Multilineage-primed Lgr5+ cells express features of bipotential progenitors

• A suite of informatics tools reveals that these progenitors originate in Lgr5+ ISCs

Kim et al., 2016, Cell Reports 16, 1-8, August 23, 2016 © 2016 The Author(s).



Tae-Hee Kim,1,2,8,9,10 Assieh Saadatpour,3,8 Guoji Guo,4,11 Madhurima Saxena,1,2 Alessia Cavazza,1,2 Niyati Desai,5 Unmesh Jadhav,1,2 Lan Jiang,3 Miguel N. Rivera,5 Stuart H. Orkin,4,6,7 Guo-Cheng Yuan,3,6,* and Ramesh A. Shivdasani1,2,6,12,*

1Department of Medical Oncology and Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA 02215, USA
2Department of Medicine, Harvard Medical School, Boston, MA 02215, USA
3Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
4Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston Children's Hospital and Harvard Medical School, Boston, MA 02215, USA
5Department of Pathology and Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
6Harvard Stem Cell Institute, Cambridge, MA 02138, USA
7Howard Hughes Medical Institute, Boston, MA 02115, USA
8Co-first author
9Present address: Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
10Present address: Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
11Present address: Center of Stem Cell and Regenerative Medicine, Zhejiang University, Hangzhou, Zhejiang 310058, China
12Lead Contact
*Correspondence: (G.-C.Y.), (R.A.S.)




Cell turnover in the small bowel relies on pools of 12-15 Wnt-responsive Lgr5+ intestinal stem cells (ISCs) that lie at the base of each intestinal crypt and replicate daily to produce new ISCs and transit-amplifying (TA) progenitors (Barker et al., 2007). Other cells present near crypt tier 4 express a combination of Bmi1, mTert, and Hopx (Barker et al., 2012) and may represent Paneth cell precursors that are recruited into the stem-cell pool upon epithelial injury (Buczacki et al., 2013). Both Lgr5+ ISCs and TA cells replicate briskly, albeit at different rates, and TA cells quickly adopt a single fate (absorptive or secretory), whereas ISCs stay multipotent; the basis for these cardinal differences is unknown. In another self-renewing tissue, blood cell progenitors simultaneously activate genes specific to each daughter lineage before distinct cell types are specified, a phenomenon known as multilineage priming (Hu et al., 1997; Miyamoto et al., 2002). Because absorptive and secretory fates are determined by lateral inhibition, a means for reciprocal cell specification (Pellegrinet et al., 2011; Stamataki et al., 2011), it is unclear whether the progeny of Lgr5+ ISCs traverse a similar phase. Lateral inhibition likely occurs in intestinal bipotential progenitors (IBPs), which have never been captured and may represent the earliest, albeit transient, progeny of Lgr5+ ISCs.

Lgr5+ cells show a range of GFP signals in Lgr5Gfp mice (Barker et al., 2007), and cells at the center of the crypt base produce larger clones than cells located at the periphery (Ritsma et al., 2014). Not all Lgr5+ cells spawn functional clones in vivo (Kozar et al., 2013), and some of them correspond to non-cycling Paneth-cell precursors (Buczacki et al., 2013). Although these observations suggest that early progenitors might arise among Lgr5+ cells, a recent single-cell mRNA study (Grün et al., 2015) reported that Lgr5hi cells are homogeneous, possibly because the method has low sensitivity for transcripts expressed at low

This is an open access article under the CC BY license.

[Figure 1 image; recoverable axis labels: tSNE 1; normalized log2 expression level]

Figure 1. Targeted mRNA Profiles Identify Two Populations of Lgr5+ Intestinal Crypt Cells

(A) Flow cytometry plot, showing the gates applied to isolate Lgr5hi (green) cells.

(B) Heatmap display of k-means (k = 2) clustering of Ct values from 183 mRNAs (x axis; five genes are represented by two primer sets each) in 192 single Lgr5+ intestinal crypt cells (y axis). Blue represents absent to low, and yellow to amber represent increasing, transcript levels. Genes are ordered by hierarchical clustering with the average linkage method and Euclidean distance. A block of genes that best distinguishes the two cell populations, including most mature villus markers, is boxed.

(C) Violin plots showing differential expression of representative stem (Lgr5 and Olfm4) and differentiated (Apoa1 and Muc2) cell markers in all cells in populations 1 (P1; blue) and 2 (P2; green).

(D) t-SNE analysis of the qRT-PCR data, demonstrating discrete Lgr5+ cell populations (blue and green); overlaid colors are from the adjoining k-means clusters. See also Figure S1 and Tables S1 and S2.

abundance. To overcome this limitation, we measured 185 transcripts for selected stem cell and lineage-specific markers in single GFP+ (Lgr5+) intestinal crypt cells isolated from the same Lgr5GFP mice (Barker et al., 2007). We identified a distinct population that expresses slightly reduced levels of known ISC transcripts and co-expresses markers of mature secretory cells and enterocytes. Immunofluorescence and single-molecule mRNA in situ hybridization (ISH) confirmed the presence of these cells in vivo, and analysis of transcript networks indicates that they represent early ISC-derived bipotential progenitors.


We used microfluidic qRT-PCR following targeted pre-amplification of 185 genes from defined categories (Table S1), including genes previously identified as Lgr5+ cell specific (Kim et al., 2014; Munoz et al., 2012); targets of various signaling pathways; markers specific to mature enterocytes or secretory cells (Kim et al., 2014); and tissue-restricted transcription factors. To ensure reproducibility and RNA quality, we assessed three housekeeping genes (Actb, Gapdh, and Hprt) and used two separate primer pairs to measure five genes. From Lgr5Gfp mice (Barker et al., 2007), we captured crypt epithelial cells that showed strong GFP fluorescence in flow cytometry (Figure 1A) but might, nevertheless, include LGR5+ cells on the verge of ISC exit. Fluorescence microscopy and direct visualization verified the recovery of dilute, viable GFP+ singlets (Figure S1A). Following reverse transcription with primers specific to the selected genes and PCR amplification of cDNA, we excluded wells that gave cycle threshold (Ct) values <13 in qRT-PCR for Actb, further eliminating possible rare doublets. Different primers for each of five selected genes gave concordant results (Table S1), indicating a robust protocol.

We measured the levels of all 185 genes in 192 cells captured on 2 separate days and pooled the data for subsequent analyses (Table S2); two genes, Zg16 and Ido1, gave no signal in any cell and were excluded from the analysis. k-means clustering of the RNA data, using the Silhouette measure (Kaufman and Rousseeuw, 1990) to identify the best k (Figure S1B), revealed two distinct cell populations that were roughly equal in size (Figure 1B) and expressed similar levels of markers historically assigned to quiescent ISCs (Figure S1C). The salient differences between these two populations were a modestly higher (2- to 8-fold) expression of ISC markers, such as Lgr5 and Olfm4, in one pool and an 8- to >100-fold higher expression of many genes in the other (Figures 1B and 1C); adjusted p (padj), <10^-7 to <10^-5. After confirming efficient qPCR by selected primer pairs, we estimated copy numbers of some of the latter mRNAs at 3% to 8% of Hprt copies (Figure S1D). Cells isolated on different days were similarly distributed in the two pools, and, to verify the results from k-means clustering, we used t-distributed Stochastic Neighbor Embedding (t-SNE) (van der Maaten and Hinton, 2008). The two cell populations identified by k-means clustering remained distinct on a t-SNE map (blue and green dots in Figure 1D), and the high concordance of RNA profiles in each group (Figure 1B), together with the absence of outliers in t-SNE, strongly supports the absence of cell doublets.
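The selection of k by the Silhouette measure can be illustrated with a small, self-contained sketch. It is not the authors' MATLAB code: the toy two-dimensional "cells", the deterministic farthest-point seeding, the use of plain Euclidean distance in the silhouette, and the candidate range k = 2 to 4 are all assumptions made for brevity.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm; farthest-point seeding is an assumption for reproducibility."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def mean_silhouette(points, labels):
    """Average silhouette value; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.sqrt(sq_dist(p, q)))
        own = dists.pop(labels[i], None)
        if own is None or not dists:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                          # mean distance within own cluster
        b = min(sum(d) / len(d) for d in dists.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy data: two well-separated groups of four hypothetical cells each.
cells = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
best_k = max(range(2, 5), key=lambda k: mean_silhouette(cells, kmeans(cells, k)))
print(best_k)
```

On real data the same loop would run over the z-scored expression matrix and the full range of candidate k values described in the Computational Analyses section.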

Among the 185 genes we interrogated, 35 genes discriminated the two cell populations without ambiguity (ΔCt > 3, padj < 10^-6; Figure 1B; shaded in Table S1), and 31 of these transcripts were higher in population 2. Weighted gene co-expression network analysis (WGCNA) (Zhang and Horvath, 2005) revealed two specific, highly coordinated gene modules in this population (Figure 2A), compared to the modest connectivity of expressed genes in population 1 (Figure S2A), and the transcripts elevated in population 2 overlapped significantly with these modules (Figure 2B). Eighteen of the 27 common genes represented secretory or enterocyte-specific markers (Figure 2C) that were not mutually exclusive but appeared at similar levels in nearly every cell in population 2 and were virtually absent in the other cells (Figures 1B, 2C, and 2D). The simultaneous expression of different lineage programs is reminiscent of multilineage priming in blood progenitors (Hu et al., 1997; Miyamoto et al., 2002), and the lack of any instance of unilineage expression suggests that population 2 may represent IBPs. Single-cell latent variable modeling (scLVM) (Buettner et al., 2015) attributed only 12.2% of the variation to cell replication, and transcript profiles were very similar before and after correcting for cell-cycle effects (Figure S2B). Cell-cycle-related transcripts that were increased in IBPs included both positive and negative regulators of the cell cycle, and Pcna, Mki67, and targets of Wnt signaling were expressed at comparable levels (Figure S2C). Thus, the distinct mRNA profiles do not trivially reflect differential mitotic activity, and both populations seem to include cycling cells.
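The two-part criterion for discriminating genes (ΔCt > 3 and padj < 10^-6) can be sketched as follows. The gene names, mean Ct values, and raw p values are invented for illustration, and applying the ΔCt threshold to the absolute difference is an assumption, made because the discriminating genes include transcripts that are higher in either population. The adjustment is the standard Benjamini-Hochberg step-up procedure, written out in plain Python rather than taken from the authors' R code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical genes: (mean Ct in population 1, mean Ct in population 2, raw p value).
genes = {
    "GeneA": (27.0, 20.0, 1e-9),
    "GeneB": (22.0, 21.0, 1e-8),
    "GeneC": (26.0, 22.5, 1e-3),
    "GeneD": (25.0, 21.5, 2e-8),
}
names = list(genes)
padj = benjamini_hochberg([genes[g][2] for g in names])

# Keep genes that pass both thresholds (absolute delta-Ct is an assumption).
discriminating = [
    g for g, q in zip(names, padj)
    if abs(genes[g][0] - genes[g][1]) > 3 and q < 1e-6
]
print(discriminating)
```

GeneB fails on effect size and GeneC fails on adjusted significance, so only genes that satisfy both conditions are retained.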

Superficially, the presence of numerous candidate IBPs among Lgr5+ cells contrasts with recent evidence of population homogeneity by single-cell mRNA sequencing (mRNA-seq) (Grün et al., 2015). One explanation is that Grün et al. examined cells with higher GFP levels than we did. Thus, our population 1 might represent homogeneous GFPhi ISCs, whereas population 2 may contain cells with modestly lower Lgr5 mRNA (Figure 1C) and protein levels, i.e., cells leaving the ISC compartment. Another explanation is the low sensitivity of single-cell RNA sequencing (RNA-seq) for low-abundance transcripts, and, indeed, the method did not reliably capture genes that distinguish ISCs from IBPs in our qRT-PCR study. Although a few lineage markers, such as Defa5, Muc2, and Ang4, were detected in some cells, most markers were not (Figure S2D). Nevertheless, to exclude the possibility that our qRT-PCR signals are spurious, we performed bulk (ensemble) RNA-seq analysis on triplicate samples of Lgr5hi cells, sorted using the same parameters as in our single-cell analysis, and also queried bulk RNA data from Lgr5hi cells profiled on microarrays (Munoz et al., 2012). Every lineage marker we detected in single cells was represented among the >11,000 genes identified in these ensemble studies (Figure S3A), compared to <4,000 genes in the single-cell mRNA-seq study (Grün et al., 2015).

In light of the multilineage profiles of putative IBPs, transcripts specific to enterocytes or secretory cells might persist in specified progenitors of the other type. This was, indeed, evident in ensemble analysis of the respective purified progenitors (Figure S3A); e.g., whereas high Alpi levels are restricted to enterocytes in vivo (Tetteh et al., 2016), levels ~10-fold lower than those found in bulk villus cells are equally abundant in both enterocyte and secretory progenitors. Conversely, we detected many secretory genes in enterocyte progenitors. Because this Atoh1-null population categorically lacks secretory cells (Kim et al., 2014; Yang et al., 2001), genes from this lineage were likely activated in a preceding cell generation, IBP. Together, these observations imply that the earliest cells to leave the ISC compartment activate genes of both intestinal lineages, at levels that elude detection at the current resolution of single-cell RNA-seq.

To confirm our findings by independent methods, first, we used single-molecule mRNA ISH with branched DNA (bDNA) signal amplification (Player et al., 2001). Probes for the villus cell markers Alpi, Chga, Neurog3, and Cck gave the expected signals in most (enterocyte) or few (enteroendocrine) wild-type mouse villus cells, respectively, with weaker signals in crypt epithelium and virtually none in the lamina propria; conversely, Lgr5 probes carrying a different chromophore stained only crypt base columnar cells (Figure S3B). We detected low levels of mature villus cell marker mRNAs in up to 24.7% of Lgr5-expressing cells (Figures 3A and 3B; Figure S3C), greatly exceeding the background of red signals and compatible with the different sensitivities of single-cell qRT-PCR and single-mRNA ISH to detect transcripts of low abundance. Second, we used Atoh1Gfp knockin mice (Rose et al., 2009) to examine protein levels of ATOH1, a transcription factor whose RNA is restricted to the pool of putative IBPs (Figure S3D). After verifying ATOH1/GFP expression in lysozyme+ Paneth cells and occasional secretory progenitors positioned higher than crypt tier 5 (red arrow in Figure 3C), we restricted attention to Lgr5+ cells in the crypt base (open arrows, Figure 3C; n = 454), which showed distinct populations of ATOH1+ and ATOH1- nuclei (filled or open arrows, respectively, in Figures 3D, 3E, and S3E). As protein expression must trail new transcripts, the fraction of ATOH1+ cells (23.7%; Figure 3F) is compatible with that detected by qRT-PCR (47.9%). mRNA ISH and ATOH1/GFP stains did not localize lineage-marker-expressing Lgr5+ cells to high crypt tiers, which suggests that cell heterogeneity may originate, perhaps stochastically, among ISCs at the crypt bottom and that cells with this feature preferentially exit the ISC compartment. The cells we regard as IBPs may,

[Figure 2B, Venn diagram labels: 76 genes (high connectivity); ΔCt > 3, padj < 10^-6]

[Figure 2C table; lineage markers are highlighted in the original figure]

Gene      ΔCt (Pop1 - Pop2)   Connectivity Pop1   Connectivity Pop2
Lifr      7.49                0.4                 5.5
Muc2      7.37                0.6                 8.7
Rep15     7.36                0.5                 4.9
Spdef     7.31                0.3                 6.1
Apoa1     7.24                0.3                 6.6
Tdgf1     7.03                0.2                 7
Defa5     6.72                4.3                 9
Nupr1     6.71                0.8                 6.8
Dct       6.39                0.1                 6.8
Cdkn1a    6.39                4.6                 4.1
Cck       6.36                0.04                3.3
Chga      6.23                4.7                 6.4
Alpi      6.19                0.1                 4.7
Atoh1     6.08                0.6                 2
Ang4      6                   4.1                 8.8
Kit       5.83                1.2                 1.7
Pax4      5.82                4.7                 3.3
Ccnb1     5.8                 2.3                 6.2
Treh      5.67                0.4                 4.5
Tlr1      5.51                0.3                 6.8
Hes3      5.48                0.2                 1.9
Neurog3   5.43                5.6                 2.5
Bmp4      5.3                 0.3                 1.2
Lct       5.27                0.4                 5.2
Plk1      5.26                2.1                 4.8
Dll4      5.16                1.1                 1.2
Casp12    4.97                0.1                 6.4
Cdkn2b    4.86                0.1                 2.8
Rrm2      4.3                 2                   0.9
Lipe      4.09                0.1                 1.6
Clca3     3.38                0.1                 5

Figure 2. Candidate IBPs Show Multilineage Priming

(A) Results of WGCNA, showing network modules of genes that are strongly co-expressed across the cells in population 2. In contrast, population 1 showed limited connectivity (Figure S2A).

(B) Overlap of 31 genes showing differentially high expression in IBP with 76 genes showing high network connectivity (see Supplemental Experimental Procedures).

(C) ΔCt and connectivity values for the 31 genes that best distinguish population 2; lineage-specific markers are labeled green.

(D) Violin plots showing highly differential expression of markers of each major terminal intestinal cell type in all cells in populations 1 (P1; blue) and 2 (P2; green): Lct and Treh (enterocytes), Cck (endocrine), Spdef (goblet and other secretory cells), and Defa5 (Paneth cells).

See also Supplemental Experimental Procedures and Figure S2.

however, correspond to a GFPlow population in vivo (Basak et al., 2014), and their property of multilineage priming is significant, regardless of the precise crypt location.

To examine further the relationship between populations 1 and 2, we considered that any transition among them is likely not abrupt; rather, transcripts from one cell state might decline, while those from the other begin to accumulate. The foregoing cluster analysis (Figure 1B), which is discrete, would fail to detect such a transition, but the non-branching structure of the t-SNE map (Figure 1D) permits the use of principal curves to infer cell trajectories (Hastie and Stuetzle, 1989). We derived such a principal curve, then divided all the cells into ten groups according to the inferred pseudo-time (Marco et al., 2014), and identified 28 cells at the boundary between the two major populations (Figure 4A). Average expression of each of the 183 genes in the ten groups of cells revealed 66 genes that discriminate between ISCs and IBPs (denoted by a box on the cluster dendrogram and heatmap in Figure 4B) and, as expected, include nearly every gene that had shown high ΔCt values (Figure 4C). Whereas ISCs and IBPs expressed uniformly higher levels of different subsets in this gene group, the 28 boundary cells varied in expression (Figure 4C), with declining average levels of stem cell markers, such as Lgr5, and concomitant increase of mature markers (Figure 4D). Average expression values were similar for different numbers of bins. For example, using eight bins instead of ten, the histogram of cell numbers identified 12 boundary cells, and mean expression over these 12 cells was highly correlated (R^2 = 0.95) with that in the 28 cells identified

[Figure 3A image panels: Lgr5 with Cck; Lgr5 with Neurog3]

[Figure 3B table]

Marker    Experiment 1          Experiment 2
          % DBL+    % Bkgd      % DBL+    % Bkgd
Alpi      16.8      2.4         24.7      1.6
ChgA      13.5      1.4         13        1.3
Neurog3   18.8      2.3         17.4      3.1
Cck       16.9      1.2         14.3      0.2

using ten bins. Gradual accumulation of terminal cell markers, as revealed in boundary cells, strongly suggests a cell transition from ISCs to putative IBPs.
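The binning step in this analysis can be sketched in a few lines, assuming that each cell already has a pseudotime value from a fitted principal curve. The pseudotime and expression numbers below are hypothetical, and the sketch covers only the equal-width partition and the per-bin averages; it does not fit the curve or identify boundary cells.

```python
def bin_by_pseudotime(pseudotime, n_bins=10):
    """Assign each cell to one of n_bins equal-width bins along pseudotime."""
    lo, hi = min(pseudotime), max(pseudotime)
    width = (hi - lo) / n_bins
    bins = []
    for t in pseudotime:
        b = int((t - lo) / width) if width > 0 else 0
        bins.append(min(b, n_bins - 1))  # the maximum value falls in the last bin
    return bins

def mean_expression_per_bin(expression, bins, n_bins=10):
    """Average expression in each bin; None marks a bin with no cells."""
    totals = [0.0] * n_bins
    counts = [0] * n_bins
    for value, b in zip(expression, bins):
        totals[b] += value
        counts[b] += 1
    return [totals[b] / counts[b] if counts[b] else None for b in range(n_bins)]

# Hypothetical pseudotime values and log2 levels of a stem-cell marker.
pseudotime = [0.0, 0.5, 1.0, 4.5, 5.5, 9.0, 9.5, 10.0]
stem_marker = [8.0, 8.0, 7.0, 5.0, 3.0, 1.0, 1.0, 0.0]

bins = bin_by_pseudotime(pseudotime)
means = mean_expression_per_bin(stem_marker, bins)
print(bins)
print(means)
```

Rerunning with a different `n_bins` (for example, eight) gives the kind of robustness check described above, where per-bin averages are compared across bin counts.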

The high census of IBPs suggests that they are distinct from the small, label-retaining fraction of Lgr5+ cells. Transcripts recently assigned to the latter—Nfatc3, Nfat5, and Cd82 (Buczacki et al., 2013)—were essentially similar in ISCs and IBPs (Figure S3F) and may increase only in Paneth-cell precursors. Notably, and in line with recent evidence for extreme plasticity in crypts (Kim et al., 2014; Tian et al., 2011; van Es et al., 2012), IBPs may be unstable cells that revert to ISCs as readily as they differentiate into absorptive or secretory cells. The latter event occurs as some cells use DLL1 or DLL4 to signal to Notch receptors on their neighbors (Pellegrinet et al., 2011; Stamataki et al., 2011). Because lateral inhibition requires equipotent cells to deliver or respond to Notch signals, increased expression of these ligands

Figure 3. Expression of Lineage Markers in Lgr5+ Crypt Base Cells In Vivo

(A) Representative images of single-molecule mRNA ISH for Alpi, ChgA, Cck, Neurog3 (red), and Lgr5 (blue), showing red and blue signals in the same crypt base cells. Cells with arrows pointing to co-expressed red and blue dots are magnified in the respective insets. Scale bars, 15 μm.

(B) Fraction of double-positive (DBL+, red and blue) cells and background (Bkgd) of extra-epithelial cells with red dots in intestines from four mice in two experiments.

(C) Immunostaining of Atoh1Gfp/Gfp crypts with lysozyme (red) and GFP (green) antibody (Ab) and DAPI nuclear stain (blue). GFP (ATOH1) was present in lysozyme+ Paneth cells (P) at the crypt base and in occasional TA cells (red arrow); only slim columnar cells wedged between Paneth cells (white arrows) were assessed further.

(D) Absence (open arrows) or presence (filled arrows) of ATOH1 in a representative z-section of three consecutive crypts, with fluorescence channels separated for clarity.

(E) Magnified view of a single crypt, showing that ATOH1 signals in some putative IBP are similar to those in neighboring Paneth (P) cells. Open arrows, absence of ATOH1; filled arrows, presence of ATOH1.

(F) Fraction of ATOH1/GFP+ cells among 454 columnar DAPI+ nuclei in tiers 0-3 of Atoh1Gfp/Gfp mouse crypts.

See also Figure S3.

is one feature expected in IBPs. Indeed, average Dll1 mRNA is higher in IBPs, and Dll4 increases substantially in most of these cells (Figure 4E).

In summary, microfluidic qRT-PCR reveals a distinct cell population that seems to represent the earliest progeny of Lgr5+ ISCs: putative IBPs with multilineage priming and modestly reduced Lgr5/GFP expression. Although multilineage priming was originally inferred from bulk cell populations (Hu et al., 1997; Miyamoto et al., 2002), recent studies suggest that single blood progenitors express genes exclusive to one lineage or another (Paul et al., 2015; Perie et al., 2015). In contrast, our analysis revealed no cell expressing genes specific to just one intestinal lineage (Figure 1), and enterocyte progenitors continue to express secretory genes (Figure S3A); these findings likely reflect features particular to lineage specification by lateral inhibition. Levels of certain TF mRNAs (Atoh1, Spdef, Pax4, and Tbx3) first rise in IBPs, where they may initiate the lineage-affiliated programs. Although equal expression of Mki67 and Pcna in ISCs and IBPs supports the idea that all crypt cells other than Paneth cells and their precursors replicate, high mRNA levels of cell-cycle inhibitors Cdkn1a, 1b, 2a, and 2b in IBPs (Figure S2C) suggest that they, or their immediate progeny, may replicate more slowly than ISCs or TA cells.

Figure 4. Evidence that Lgr5+ ISCs Transition into the IBP Population

(A) Principal curve analysis (black curve) projected on the t-SNE map from Figure 1D reveals the relationship of the two populations, based on the proximity of gene expression, as a non-branching curve. The 28 boundary cells—determined by partitioning of the principal curve into ten bins of equal distance—are now represented in pink. The graph indicates cell numbers in each bin; blue and green denote ISCs and IBPs, respectively.

(B) Heatmap of the global analysis (183 genes × 192 single cells; red indicates high expression, and green indicates low expression) partitioned in ten bins according to the aforementioned principal curve analysis. 66 transcripts denoted by a dotted box provide discrimination.

(C) The latter transcripts include nearly every gene that distinguished populations 1 and 2 by ΔCt (Figure 1B), and the dotted box in (B) is here expanded and rotated 90° to show the trajectory of expression in ISCs (blue), boundary cells (pink), and IBPs (green). Diff. Exp., different expression; Princ., principal.

(D) Average levels of representative IBP-enriched (Lifr, Muc2, Dct, and Kit), ISC-enriched (Lgr5, Agr3, and Sema4d), and Actb mRNAs in cell groups defined by distance along the principal curve.

(E) Violin plots for expression of Notch ligand genes Dll1 and Dll4 in all ISCs and IBPs.

Despite clear differences in gene activity, IBPs are unlikely to show different behaviors than ISCs by lineage tracing or in organoids, where even ISCs and specified progenitors are difficult to distinguish (Buczacki et al., 2013; Tetteh et al., 2016; van Es et al., 2012). Moreover, no Cre driver or surface marker is likely expressed exclusively in IBPs, i.e., not also in ISCs or specified progenitors. Thus, our targeted single-cell analysis, reinforced by localization of transcripts in vivo, reveals features of a crucial and transient cell population that is likely difficult to isolate or to characterize by other means.


Intestines harvested from Lgr5GFP mice (Barker et al., 2007) were washed with PBS. Villi were scraped away using coverslips, and the crypt epithelium was collected by shaking in 5 mM EDTA for 1 hr at 4°C (Kim et al., 2014). Single cells were obtained on 2 separate days by digestion in 5x TrypLE (Invitrogen) for 1 hr at 37°C and verified by fluorescence microscopy. GFPhi cells were sorted into individual wells in 96-well plates using a BD FACSAria II sorter (Becton Dickinson). Cells from one of the two isolations were also examined visually in microfluidic channels. Animals were handled according to protocols approved and monitored by the Animal Care and Use Committee of the Dana-Farber Cancer Institute.

Single-Cell Gene Expression Analysis by Microfluidic qRT-PCR

The pre-amplification solution in 96 wells included 5 μl of a master mix containing 2.5 μl CellsDirect reaction mix (Invitrogen), 0.5 μl primer pool (0.1 mM [Table S1], synthesized at Bioneer), 0.1 μl reverse transcriptase (RT)/Taq polymerase (Invitrogen), and 1.9 μl nuclease-free water. Lysed cells were treated with this mix at 50°C for 1 hr, followed by inactivation of RT, activation of Taq at 95°C for 3 min, and 20 cycles of sequence-specific cDNA amplification (15 s denaturation at 95°C, 15 min annealing and elongation at 60°C). Amplified single-cell cDNAs were first tested in control qRT-PCR reactions for Actb, and samples giving Ct values between 13 and 17 were selected for subsequent analysis with the full primer pools, Universal PCR Master Mix (Applied Biosystems), and EvaGreen Binding Dye (Biotium), using the 96 × 96 Dynamic Array on the BioMark System (Fluidigm). Table S2 lists the Ct values for each gene in each cell, calculated using BioMark Real-Time PCR Analysis software (Fluidigm).
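The Actb quality-control step can be illustrated with a minimal sketch that keeps only wells whose control Ct falls between 13 and 17. The well identifiers and Ct values are invented, and treating both limits as inclusive is an assumption, because the text does not state how values exactly at the limits were handled.

```python
# Hypothetical Actb Ct values per sorted well (illustrative numbers, not study data).
actb_ct = {"A01": 12.4, "A02": 14.9, "A03": 16.8, "A04": 19.5, "A05": 13.0}

def passes_actb_qc(ct, low=13.0, high=17.0):
    """Return True when the control Ct lies in the accepted window (inclusive bounds assumed)."""
    return low <= ct <= high

# Wells retained for analysis with the full primer pools.
selected_wells = sorted(well for well, ct in actb_ct.items() if passes_actb_qc(ct))
print(selected_wells)
```

A Ct below the window indicates more template than expected from one cell, which is why the lower bound also serves as a guard against doublets.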

Computational Analyses

mRNA levels were estimated by subtracting the Ct values from the background level of 28 (start of the tail of the distribution in the histogram of Ct values), which approximates log2 gene expression levels. We conducted k-means clustering in MATLAB using the squared Euclidean distance of normalized data (z scores). To determine the optimal k, we applied every value from 2 to 20, assessed the average Silhouette value (Kaufman and Rousseeuw, 1990) for each clustering result (Figure S1B), and selected k = 2, which gave the largest mean Silhouette value. Differentially expressed genes were identified using a two-sided Wilcoxon-Mann-Whitney rank-sum test implemented in the "coin" package in R. Differences between populations were determined by subtracting mean Ct values (equivalent to log2 expression levels). The p values were adjusted for multiple testing (Benjamini and Hochberg, 1995). Violin plots were generated in R using the package "vioplot." For t-SNE analysis (van der Maaten and Hinton, 2008), we used the MATLAB toolbox for dimensionality reduction. The pseudotime of individual cells was estimated as previously described (Marco et al., 2014), fitting a principal curve (Hastie and Stuetzle, 1989) to the single-cell expression data. We used the R package "princurve," with the options "smoother = lowess" and "maxit = 200." Heatmaps (Figure 2) were prepared with the MultiExperiment Viewer, using the Euclidean distance and average linkage as parameters for unsupervised hierarchical clustering of genes. Latent variable modeling and analysis of co-expression gene networks are described in the Supplemental Experimental Procedures.
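The conversion from Ct values to approximate log2 expression, followed by z-score normalization, can be sketched as below. The Ct values are hypothetical. Flooring expression at zero for Ct values above the background and using the sample standard deviation for the z scores are both assumptions, since the text specifies neither.

```python
from statistics import mean, stdev

BACKGROUND_CT = 28.0  # background level stated in the text

def ct_to_log2_expression(ct, background=BACKGROUND_CT):
    """Approximate log2 expression as background minus Ct, floored at 0 (assumption)."""
    return max(background - ct, 0.0)

def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]

# Hypothetical Ct values for one gene across four cells.
ct_values = [20.0, 24.0, 28.0, 30.0]
expression = [ct_to_log2_expression(ct) for ct in ct_values]
normalized = z_scores(expression)
print(expression)
print(normalized)
```

Because one PCR cycle corresponds to roughly a doubling of product, a difference of 3 in Ct between two populations corresponds to about an 8-fold difference in transcript level.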

Analysis of Public mRNA-Seq Data

Processed mRNA-seq data on 192 isolated Lgr5+ mouse intestinal cells (Grün et al., 2015) were obtained from GEO: GSE62270 (accession file GSE62270_data_counts_Lgr5SC.txt.gz). Violin plots for genes relevant to our study were generated using the Vioplot2 function in R. The accession number for the ensemble RNA-seq is GEO: GSE71713.

Single-mRNA ISH with bDNA Amplification

Intestines from C57BL/6J mice were fixed overnight in 4% paraformaldehyde, embedded in paraffin, and cut in 5-μm sections. ISH was performed twice on two intestines each, using Quantigene ViewRNA probes (Affymetrix) for two-color ISH, as described in the Supplemental Experimental Procedures. Between 320 and 460 Lgr5+ crypt base cells were counted in at least 50 crypts from each mouse (n = 4). Cells were scored as double positive (DBL+) when at least one dot for a mature-cell marker mRNA (red) was present in a cell expressing Lgr5 mRNA (blue dots). Background signals were estimated from counts of red dots in 370 to 440 nucleated sub-epithelial cells for each mature-cell marker in each sample.
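The scoring rule described above reduces to a simple proportion, sketched here with invented dot counts. The function name and the toy cell lists are hypothetical; real counts would come from the 320 to 460 Lgr5+ cells and 370 to 440 sub-epithelial cells scored per sample.

```python
def percent_positive(red_dot_counts, min_dots=1):
    """Percentage of cells carrying at least min_dots red (mature-marker) dots."""
    positive = sum(1 for n in red_dot_counts if n >= min_dots)
    return 100.0 * positive / len(red_dot_counts)

# Hypothetical red-dot counts per cell.
lgr5_cells = [0, 2, 0, 0, 1, 0, 0, 0, 3, 0]   # cells that also show blue Lgr5 dots
subepithelial_cells = [0] * 19 + [1]          # nucleated sub-epithelial cells

dbl_positive = percent_positive(lgr5_cells)       # % DBL+
background = percent_positive(subepithelial_cells)  # % Bkgd
print(dbl_positive, background)
```

Comparing the two percentages for each marker is what supports the statement that DBL+ fractions greatly exceed the background of red signals.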


Supplemental Information includes Supplemental Experimental Procedures, three figures, and two tables and can be found with this article online at


T.-H.K. and R.A.S. conceived the study; T.-H.K., G.G., M.S., N.D., and U.J. performed experiments; A.S. performed computational analyses; A.S., T.-H.K., M.S., A.C., L.J., G.-C.Y., and R.A.S. analyzed data; S.H.O. and M.N.R. supervised portions of the study; G.-C.Y. supervised computational analyses; and R.A.S. provided overall supervision. R.A.S., A.S., T.-H.K., and G.-C.Y. drafted the manuscript, with input from all authors.


This work was supported by grants R01DK081113 (including a supplement from the Office of the Director [NIH Common Fund]), U01DK103152 (the Intestinal Stem Cell Consortium of the NIDDK and NIAID), K99DK095983 (to T.-H.K.), F32DK103453 (to U.J.), and P50CA127003; an American-Italian Cancer Foundation fellowship (to A.C.); and funds from the Harvard Stem Cell Institute (to G.-C.Y. and S.H.O.), Affymetrix (to N.D. and M.N.R.), and the Lind family (to R.A.S.). We thank L. Deary and D. Breault for help with microscopy.

Received: February 12, 2016 Revised: April 18, 2016 Accepted: July 20, 2016 Published: August 11, 2016


Barker, N., van Es, J.H., Kuipers, J., Kujala, P., van den Born, M., Cozijnsen, M., Haegebarth, A., Korving, J., Begthel, H., Peters, P.J., and Clevers, H. (2007). Identification of stem cells in small intestine and colon by marker gene Lgr5. Nature 449, 1003-1007.

Barker, N., van Oudenaarden, A., and Clevers, H. (2012). Identifying the stem cell of the intestinal crypt: strategies and pitfalls. Cell Stem Cell 11, 452-460.

Basak, O., van den Born, M., Korving, J., Beumer, J., van der Elst, S., van Es, J.H., and Clevers, H. (2014). Mapping early fate determination in Lgr5+ crypt stem cells using a novel Ki67-RFP allele. EMBO J. 33, 2057-2068.

Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. B 57, 289-300.

Buczacki, S.J., Zecchini, H.I., Nicholson, A.M., Russell, R., Vermeulen, L., Kemp, R., and Winton, D.J. (2013). Intestinal label-retaining cells are secretory precursors expressing Lgr5. Nature 495, 65-69.

Buettner, F., Natarajan, K.N., Casale, F.P., Proserpio, V., Scialdone, A., Theis, F.J., Teichmann, S.A., Marioni, J.C., and Stegle, O. (2015). Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat. Biotechnol. 33, 155-160.

Grün, D., Lyubimova, A., Kester, L., Wiebrands, K., Basak, O., Sasaki, N., Clevers, H., and van Oudenaarden, A. (2015). Single-cell messenger RNA sequencing reveals rare intestinal cell types. Nature 525, 251-255.

Hastie, T., and Stuetzle, W. (1989). Principal curves. J. Am. Stat. Assoc. 84, 502-516.

Hu, M., Krause, D., Greaves, M., Sharkis, S., Dexter, M., Heyworth, C., and Enver, T. (1997). Multilineage gene expression precedes commitment in the hemopoietic system. Genes Dev. 11, 774-785.

Kaufman, L., and Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis (John Wiley & Sons).

Kim, T.H., Li, F., Ferreiro-Neira, I., Ho, L.L., Luyten, A., Nalapareddy, K., Long, H., Verzi, M., and Shivdasani, R.A. (2014). Broadly permissive intestinal chromatin underlies lateral inhibition and cell plasticity. Nature 506, 511-515.

Kozar, S., Morrissey, E., Nicholson, A.M., van der Heijden, M., Zecchini, H.I., Kemp, R., Tavare, S., Vermeulen, L., and Winton, D.J. (2013). Continuous clonal labeling reveals small numbers of functional stem cells in intestinal crypts and adenomas. Cell Stem Cell 13, 626-633.

Marco, E., Karp, R.L., Guo, G., Robson, P., Hart, A.H., Trippa, L., and Yuan, G.C. (2014). Bifurcation analysis of single-cell gene expression data reveals epigenetic landscape. Proc. Natl. Acad. Sci. USA 111, E5643-E5650.

Miyamoto, T., Iwasaki, H., Reizis, B., Ye, M., Graf, T., Weissman, I.L., and Akashi, K. (2002). Myeloid or lymphoid promiscuity as a critical step in hematopoietic lineage commitment. Dev. Cell 3, 137-147.

Muñoz, J., Stange, D.E., Schepers, A.G., van de Wetering, M., Koo, B.K., Itzkovitz, S., Volckmann, R., Kung, K.S., Koster, J., Radulescu, S., et al. (2012). The Lgr5 intestinal stem cell signature: robust expression of proposed quiescent '+4' cell markers. EMBO J. 31, 3079-3091.

Paul, F., Arkin, Y., Giladi, A., Jaitin, D.A., Kenigsberg, E., Keren-Shaul, H., Winter, D., Lara-Astiaso, D., Gury, M., Weiner, A., et al. (2015). Transcriptional heterogeneity and lineage commitment in myeloid progenitors. Cell 163, 1663-1677.

Pellegrinet, L., Rodilla, V., Liu, Z., Chen, S., Koch, U., Espinosa, L., Kaestner, K.H., Kopan, R., Lewis, J., and Radtke, F. (2011). Dll1- and dll4-mediated notch signaling are required for homeostasis of intestinal stem cells. Gastroenterology 140, 1230-1240.e1, 7.

Perie, L., Duffy, K.R., Kok, L., de Boer, R.J., and Schumacher, T.N. (2015). The branching point in erythro-myeloid differentiation. Cell 163, 1655-1662.

Player, A.N., Shen, L.P., Kenny, D., Antao, V.P., and Kolberg, J.A. (2001). Single-copy gene detection using branched DNA (bDNA) in situ hybridization. J. Histochem. Cytochem. 49, 603-612.

Ritsma, L., Ellenbroek, S.I., Zomer, A., Snippert, H.J., de Sauvage, F.J., Simons, B.D., Clevers, H., and van Rheenen, J. (2014). Intestinal crypt homeostasis revealed at single-stem-cell level by in vivo live imaging. Nature 507, 362-365.

Rose, M.F., Ren, J., Ahmad, K.A., Chao, H.T., Klisch, T.J., Flora, A., Greer, J.J., and Zoghbi, H.Y. (2009). Math1 is essential for the development of hindbrain neurons critical for perinatal breathing. Neuron 64, 341-354.

Stamataki, D., Holder, M., Hodgetts, C., Jeffery, R., Nye, E., Spencer-Dene, B., Winton, D.J., and Lewis, J. (2011). Delta1 expression, cell cycle exit, and commitment to a specific secretory fate coincide within a few hours in the mouse intestinal stem cell system. PLoS ONE 6, e24484.

Tetteh, P.W., Basak, O., Farin, H.F., Wiebrands, K., Kretzschmar, K., Begthel, H., van den Born, M., Korving, J., de Sauvage, F., van Es, J.H., et al. (2016). Replacement of lost Lgr5-positive stem cells through plasticity of their enterocyte-lineage daughters. Cell Stem Cell 18, 203-213.

Tian, H., Biehs, B., Warming, S., Leong, K.G., Rangell, L., Klein, O.D., and de Sauvage, F.J. (2011). A reserve stem cell population in small intestine renders Lgr5-positive cells dispensable. Nature 478, 255-259.

van der Maaten, L.J.P., and Hinton, G.E. (2008). Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579-2605.

van Es, J.H., Sato, T., van de Wetering, M., Lyubimova, A., Nee, A.N., Gregorieff, A., Sasaki, N., Zeinstra, L., van den Born, M., Korving, J., et al. (2012). Dll1+ secretory progenitor cells revert to stem cells upon crypt damage. Nat. Cell Biol. 14, 1099-1104.

Yang, Q., Bermingham, N.A., Finegold, M.J., and Zoghbi, H.Y. (2001). Requirement of Math1 for secretory cell lineage commitment in the mouse intestine. Science 294, 2155-2158.

Zhang, B., and Horvath, S. (2005). A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, e17.