ReplicationDomain - Documentation


Replication Timing Analysis

Replication timing data were obtained by hybridizing early and late replication intermediates to Nimblegen oligonucleotide arrays, as described in Hiratani et al. [PLoS Biology (2008) 6: e245]. Briefly, replication intermediates are prepared from cells that are first pulse-labeled with BrdU and then sorted into early and late stages of S-phase by flow cytometry, followed by anti-BrdU immunoprecipitation of the BrdU-substituted (nascent) replication intermediates that were synthesized either early or late during S-phase. After unbiased amplification of the recovered DNA, the samples are differentially labeled with Cy3 and Cy5 and hybridized to Nimblegen CGH arrays containing one oligonucleotide probe every 5.8 kb across the mouse genome (Nimblegen, 2006-07-26_MM8_WG_CGH). Raw data from two independent biological replicates, in which the early- and late-replicating DNA were labeled reciprocally with Cy3 and Cy5 (“dye switch”), are loess-normalized and scaled to have the same median absolute deviation using the limma package (R/Bioconductor), and then averaged. Finally, the data are smoothed with a weighted moving average (loess: local polynomial smoothing). Further details can be found in Hiratani et al. [PLoS Biology (2008) 6: e245].
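The scaling, averaging and smoothing steps described above can be illustrated with a minimal Python sketch. This is not the limma pipeline used for the published data: it skips the loess normalization step, uses the mean of the replicate MADs as the common scale (an assumption), and substitutes a plain centered moving average for loess smoothing. The function names and the example log2(early/late) values are hypothetical.

```python
import numpy as np

def mad(x):
    """Median absolute deviation about the median."""
    return np.median(np.abs(x - np.median(x)))

def scale_to_common_mad(replicates):
    """Rescale each replicate so that all share the same MAD.

    Assumption: the common target is the mean of the replicate MADs.
    """
    mads = np.array([mad(r) for r in replicates])
    target = mads.mean()
    return [r * (target / m) for r, m in zip(replicates, mads)]

def moving_average(values, window=5):
    """Centered, unweighted moving average (a stand-in for loess).

    Near the ends of the array the window simply shrinks.
    """
    half = window // 2
    out = np.empty(len(values), dtype=float)
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out[i] = values[lo:hi].mean()
    return out

# Hypothetical log2(early/late) ratios for the same probes, ordered by
# chromosomal position. Both replicates are already expressed as early/late,
# i.e. the dye-switch array has been re-oriented beforehand.
rep1 = np.array([1.0, 0.8, 0.9, -0.2, -1.0, -1.1])
rep2 = np.array([0.5, 0.4, 0.6, 0.0, -0.6, -0.5])

scaled = scale_to_common_mad([rep1, rep2])
averaged = np.mean(scaled, axis=0)
smoothed = moving_average(averaged, window=3)
```

In this sketch, positive smoothed values correspond to early-replicating probes and negative values to late-replicating probes.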

Transcription Analysis (Steady-State Transcript Levels)

This is a standard Affymetrix GeneChip analysis of steady-state transcript levels from total RNA, performed in triplicate on cells grown under conditions identical to those used to evaluate replication timing (GeneChip Mouse Genome 430 2.0, analyzing 45,037 single-copy murine genes and EST clusters). Each data set was first subjected to Affymetrix quality control measures using GeneChip® Operating Software (GCOS) according to the manufacturer’s protocols. All three data sets passed this quality control test and were normalized with the Probe Logarithmic Intensity Error (PLIER) algorithm, developed by Affymetrix for calculating probe signals. For each Affymetrix “probe set,” the signal intensities of the three biological replicates were averaged (i.e., average intensity). Genes are often represented by multiple probe sets. In such cases, the one with the highest total intensity (i.e., the sum of the ESC and NPC/EBM9 average intensities) was defined as the representative probe set, and the other probe sets were not used. We did so because these highest-intensity probe sets were empirically the most consistent with reverse transcriptase (RT)-PCR analysis and can be defined objectively. Present (transcriptionally active) and absent (inactive) calls are generated by MAS5.0 (Affymetrix) per replicate per probe set, which results in multiple present–absent calls for a given gene [= 3 × (total number of probe sets for a gene)]. We defined “present” genes as those with more than 50% of all their probe set calls being present. A total of 15,143 (81%) of the 18,679 RefSeq genes for which replication-timing ratios were obtained were represented on the Affymetrix GeneChip microarrays and were assigned transcription levels and present–absent calls. The transcription array results were validated by their agreement with previously published transcription analyses performed under the same conditions. Further details can be found in Hiratani et al. [PLoS Biology (2008) 6: e245].
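The two selection rules above (representative probe set by highest total intensity, and “present” genes by a greater-than-50% majority of calls) can be sketched in Python as follows. This is an illustration only, not the code used to generate the database: the function names, probe set IDs, gene names and values are hypothetical, the calls are shown for a single cell state, and any call other than "P" is assumed to count as not present.

```python
from collections import defaultdict

def representative_probe_sets(avg_intensity):
    """Pick, per gene, the probe set with the highest total intensity.

    avg_intensity maps probe_set -> (gene, ESC average, NPC/EBM9 average).
    Returns a dict mapping gene -> representative probe_set.
    """
    best = {}
    for probe_set, (gene, esc, npc) in avg_intensity.items():
        total = esc + npc
        if gene not in best or total > best[gene][1]:
            best[gene] = (probe_set, total)
    return {gene: probe_set for gene, (probe_set, _) in best.items()}

def present_genes(calls, probe_to_gene):
    """Call a gene present if more than 50% of all its calls are "P".

    calls maps probe_set -> list of per-replicate calls, e.g. ["P", "A", "P"].
    Calls from every probe set of a gene are pooled before the vote.
    """
    tally = defaultdict(lambda: [0, 0])  # gene -> [present calls, total calls]
    for probe_set, replicate_calls in calls.items():
        gene = probe_to_gene[probe_set]
        tally[gene][0] += sum(c == "P" for c in replicate_calls)
        tally[gene][1] += len(replicate_calls)
    return {gene: present / total > 0.5
            for gene, (present, total) in tally.items()}

# Hypothetical example: GeneA has two probe sets, GeneB has one.
intensities = {
    "ps_1": ("GeneA", 120.0, 80.0),
    "ps_2": ("GeneA", 300.0, 250.0),
    "ps_3": ("GeneB", 40.0, 35.0),
}
calls = {
    "ps_1": ["P", "A", "A"],
    "ps_2": ["P", "P", "P"],
    "ps_3": ["A", "P", "A"],
}
probe_to_gene = {ps: gene for ps, (gene, _, _) in intensities.items()}

representatives = representative_probe_sets(intensities)
present = present_genes(calls, probe_to_gene)
```

Here GeneA is represented by ps_2 and is called present (4 of 6 calls), while GeneB is called absent (1 of 3 calls).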

Definitions of Data Sets and Data Entry Terms

Data sets are defined by a combination of 14 data entry terms, as described below. Upon uploading data sets, users can either select terms from the dropdown list, or create a new term by filling in the blank.

 

  • Species: Supported species are: Mus musculus, Homo sapiens, Drosophila and Schizosaccharomyces pombe. Contact us to create a new species page.

  • Company: Microarray product supplier name.

  • Chip ID: This is the unique identifier for each data set. While a “Chip ID” normally represents a single replicate experiment (e.g. one microarray hybridization), most data sets currently displayed on the Data Display Page are averages of multiple replicates. Therefore, we have re-defined Chip ID as a string of characters combining the individual “Chip ID” numbers and a description of the data set identity. The Chip ID is mainly useful for communicating comments regarding a particular experiment. On the Data Display Page, however, the Chip ID for each data set is also a link to the "Data Set Details" (also accessible from the "Database" link on the main menu). The Chip ID is also useful for identifying data sets when downloading entire data sets through the "Download Data" link in the main menu.

  • Build: This indicates the version of the genomic sequence information that was used to assemble the microarray chip in the particular experiment (for example, mm7 and mm8 for the mouse). Builds change slightly as sequence information is updated, so the exact base pair position of any given DNA sequence will change as the sequence information is annotated. The build indicated for each data set is the build used for the chromosomal coordinates of the probes on the particular array type used.

  • Order ID: For Nimblegen data sets only.

  • Cell Line: This indicates the name of the cell line employed. At present, D3, TT2, and 46C are three different established mouse embryonic stem cell (mESC) lines. References for these cell lines are as follows:
     
    Mouse Embryonic Stem Cell line D3:
    Doetschman TC, Eistetter H, Katz M, Schmidt W, Kemler R: The in vitro development of blastocyst-derived embryonic stem cell lines: formation of visceral yolk sac, blood islands and myocardium. J Embryol Exp Morphol 1985, 87:27-45
     
    Mouse Embryonic Stem Cell line 46C:
    Ying QL, Stavridis M, Griffiths D, Li M, Smith A: Conversion of embryonic stem cells into neuroectodermal precursors in adherent monoculture. Nat Biotechnol 2003, 21(2):183-186.
     
    Mouse Embryonic Stem Cell line TT2:
    Yagi T, Tokunaga T, Furuta Y, Nada S, Yoshida M, Tsukada T, Saga Y, Takeda N, Ikawa Y, Aizawa S: A novel ES cell line, TT2, with high germline-differentiating potency. Anal Biochem 1993, 214(1):70-76
     

  • Differentiation State: This is the tissue or tissue type represented by the cell line used and grown under the indicated conditions. Currently, there are four such differentiation states:
     
    ESC
    Undifferentiated ES cells
     
    NPC/EBM9
    The 9th day of differentiation following an established neural differentiation protocol that differentiates ESCs via embryoid bodies to Sox1-positive NPCs in conditioned medium.

    Reference: Rathjen J, Haines BP, Hudson KM, Nesci A, Dunn S, Rathjen PD. Directed differentiation of pluripotent cells to neural lineages: homogeneous formation and differentiation of a neurectoderm population. Development. 2002 Jun;129(11):2649-61.
     
    NPC/ASd6
    The 6th day of differentiation following an established neural differentiation protocol that differentiates ESCs to Sox1-positive NPCs in monolayer cultures using defined medium.

    Reference: Ying QL, Stavridis M, Griffiths D, Li M, Smith A: Conversion of embryonic stem cells into neuroectodermal precursors in adherent monoculture. Nat Biotechnol 2003, 21(2):183-186.
     
    iPS
    "Induced Pluripotent Stem Cells" re-programmed from tail-tip fibroblasts derived from a 129xBL-6 hybrid strain of mice to the pluripotent state as described.

    Reference: Hanna J, Wernig M, Markoulaki S, Sun CW, Meissner A, Cassady JP, Beard C, Brambrink T, Wu LC, Townes TM et al: Treatment of sickle cell anemia mouse model with iPS cells generated from autologous skin. Science 2007, 318(5858):1920-1923.
     

     

  • Array Design Name: Microarray supplier and catalog number.

  • Data Type: Indicates the property being measured in the indicated experiment. At present, replication timing and transcription data are shown. In the future, data for other genome-wide properties of chromosomes may be displayed. Contact us to add new types of data such as ChIP-chip or ChIP-Seq.

  • Reference: Data sets to be displayed publicly must include a reference.

  • Comments: We provide detailed microarray design information here, but any additional comments can also be added.

  • Present or Absent Column: When uploading transcription data sets that contain present-absent calls, specify that here.

  • Partial Dataset: Data sets that do not contain information for all chromosomes.

  • Data Security Level: Users can select Public, Private, or Über Private. Users can make their published or “in press” data sets publicly available by selecting “Public” and providing a reference under the entry term, "Reference." Private data sets are viewable by all registered users with a ReplicationDomain account, while Über Private data sets are viewable only by the user who uploaded the data set.

  • Data Starts on Line: The line of the file on which the data begin. Data usually start on line 2, with line 1 holding the column names, but files without column names are also acceptable (i.e. data start on line 1).
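
As an illustration of the "Data Starts on Line" setting, the following Python sketch splits a file into an optional header and its data rows. It is not the site's upload code: the function name, the tab delimiter, the column names and the rule that line 1 is treated as the header whenever data start after line 1 are all assumptions made for the example.

```python
import csv
import io

def read_data_rows(text, data_starts_on_line=2, delimiter="\t"):
    """Split file contents into (header, rows) using a 1-based start line.

    Assumption: if the data start after line 1, line 1 holds the column
    names; if the data start on line 1, there is no header.
    """
    lines = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header = lines[0] if data_starts_on_line > 1 else None
    rows = lines[data_starts_on_line - 1:]
    return header, rows

# Hypothetical file with column names on line 1 (data start on line 2).
with_header = "chr\tposition\tvalue\nchr1\t3000000\t0.5\n"
header, rows = read_data_rows(with_header, data_starts_on_line=2)

# Hypothetical file without column names (data start on line 1).
without_header = "chr1\t3000000\t0.5\n"
no_header, bare_rows = read_data_rows(without_header, data_starts_on_line=1)
```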