Dataset of target mass spectrometric proteome profiling for human chromosome 18

CC BY
Academic journal
Data in Brief

Abstract of research paper on Chemical sciences, author of scientific article — Ekaterina V. Ilgisonis, Arthur T. Kopylov, Victor G. Zgoda

Abstract Proteome profiling is a type of quantitative analysis that reveals the level of protein expression in a sample. Proteome profiling using selected reaction monitoring is an approach adopted by the Chromosome-centric Human Proteome Project (C-HPP). Here we describe the dataset generated in the course of the pilot phase of the Russian part of C-HPP, which focused on human Chr 18 proteins. Proteome profiling was performed using selected reaction monitoring with stable isotope-labeled standards (SRM/SIS) for plasma, liver tissue and HepG2 cells. The dataset includes both positive and negative results of protein detection. These data were partly discussed in recent publications, “Chromosome 18 Transcriptome Profiling and Targeted Proteome Mapping in Depleted Plasma, Liver Tissue and HepG2 Cells” [1] and “Chromosome 18 transcriptoproteome of liver tissue and HepG2 Cells and targeted proteome mapping in depleted plasma: Update 2013” [2], support the accompanying publication “State of the Chromosome 18-centric HPP in 2016: Transcriptome and Proteome Profiling of Liver Tissue and HepG2 Cells” [3], and are deposited at ProteomeXchange via the PASSEL repository with the dataset identifier PASSEL: PASS00697 for liver and the HepG2 cell line.


Data Article

Dataset of target mass spectrometric proteome profiling for human chromosome 18

Ekaterina V. Ilgisonis *, Arthur T. Kopylov, Victor G. Zgoda

Orekhovich Institute of Biomedical Chemistry, Moscow, Russia


© 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Contents lists available at ScienceDirect

Data in Brief

journal homepage: www.elsevier.com/locate/dib


Article history: Received 8 June 2016 Received in revised form 30 June 2016 Accepted 19 July 2016 Available online 26 July 2016

* Corresponding author. E-mail address: ilgisonis.ev@gmail.com (E.V. Ilgisonis).

http://dx.doi.org/10.1016/j.dib.2016.07.034

2352-3409/© 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Specifications Table

Subject area: Biology

More specific subject area: Targeted mass-spectrometric proteome profiling of liver and HepG2 cell line

Type of data: Figure, table, raw files (.d), Skyline files (.sky)

How data was acquired, data format, experimental factors and experimental features: Proteome profiling was performed using stable isotope-labeled standards (SRM/SIS) for liver tissue and HepG2 cells. Trypsin digestion was used. Digested samples were separated using the HPLC Agilent 1290 system including pump and autosampler. Internal standards were produced using Overture (Protein Technologies, USA) or Hamilton Microlab STAR devices. The quantitative SRM analysis was performed using an Agilent 6495 Triple Quadrupole (Agilent, USA) equipped with a Jet Stream ionization source.

Data source location: Institute of Biomedical Chemistry, Moscow, Russia

Data accessibility: Data are available within this article and at ProteomeXchange via PASSEL (http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD004407).

Value of the data

• This data characterizes the diversity of chromosome 18 protein species in liver tissue and HepG2 cell line using SRM.

• This data could be of interest to laboratories studying protein reference levels and cross-tissue biological variability of proteome.

• This data could be useful for protein, peptide and transition selection for SRM-assay development.

• The dataset may be used as a test case for automated SRM data processing algorithms.

1. Data

This dataset describes the conditions of proteome profiling of liver tissue and the HepG2 cell line. The targeted protein list included 268 proteins of chromosome 18. Data were automatically processed to quantify proteins in the biosample. The dataset includes raw data, a transition list, Skyline files and sample preparation instructions, all available in PASSEL, as well as 2 figures and a Supplementary table with protein copy numbers in liver tissue and the HepG2 cell line.

2. Experimental design, materials and methods

2.1. Sample preparation

The trypsin digestion of liver tissue and cell lysates was performed as described in Ponomarenko et al. [2].

2.2. Peptide synthesis

The peptides were produced by solid-phase peptide synthesis on the Overture (Protein Technologies, USA) or Hamilton Microlab STAR devices according to the method published in Hood et al. [4]. Isotope-labeled leucine (Fmoc-Leu-OH-13C6,15N) was used instead of unlabeled leucine (Fmoc-Leu-OH) for the synthesis of isotope-labeled peptides [5].

2.3. Transition list

The list of peptides for 268 chromosome 18 proteins was generated manually using data on the occurrence of proteotypic peptides in the proteomic repositories GPMdb, ProteinAtlas and PRIDE, together with MaRiMba criteria (the protocol was described earlier in Supplementary note 2, Zgoda et al. [1]). For each protein, one "best-flyer" peptide was chosen. For each peptide, the 3 most intense transitions [6] were chosen using previous research results.

All 268 peptides were distributed over 3 SRM-assays (A-C) in equal parts according to their calculated retention time to avoid interference.
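The text does not spell out how the peptides were split between the assays. One possible reading, sketched below purely as an assumption, is to sort peptides by calculated retention time and deal them out in turn, so that the assays are of near-equal size and peptides that are neighbours in retention time land in different assays. The function name, peptide labels and retention times are hypothetical and are not taken from the dataset.

```python
from typing import Dict, Sequence


def assign_assays(
    retention_times: Dict[str, float],
    assays: Sequence[str] = ("A", "B", "C"),
) -> Dict[str, str]:
    """Assign peptides to assays in round-robin order of retention time.

    This is an illustrative assumption, not the authors' documented procedure.
    """
    # Sort peptide identifiers by their calculated retention time.
    ordered = sorted(retention_times, key=lambda pep: retention_times[pep])
    # Deal peptides out in turn: neighbours in retention time get different assays.
    return {pep: assays[i % len(assays)] for i, pep in enumerate(ordered)}


# Hypothetical peptides with made-up retention times (minutes).
example = {"pep1": 5.2, "pep2": 5.4, "pep3": 5.9, "pep4": 7.1, "pep5": 7.3, "pep6": 8.0}
assignment = assign_assays(example)
```

With 268 peptides and 3 assays, such a split cannot be exactly equal; a round-robin deal gives assay sizes that differ by at most one.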

LC-SRM analysis was performed as described earlier in Supplementary note 2, Zgoda et al. [1]. Each SRM experiment was repeated in 3 technical runs. Each transition peak was characterized by the following variables: retention time, peak height and SIS/endogenous peak area ratio. No manual inspection was performed, either to find transitions similar to those of the target peptides or to confirm detected peptides.

2.4. How to use data

The dataset comprises several file types (Fig. 1). To reuse the dataset and extract relevant information from it, one can install Skyline [7], a freely available, open-source Windows client application (for each biosample there is one Skyline file, which includes the transitions and technical run information). Skyline can open the raw data and visualize (Fig. 2) and analyze SRM data. Proprietary software (Agilent MassHunter Workstation Software) can also be used for data visualization.

File type | HepG2 cell line | Liver tissue
Raw data from the instrument (.d format) | 1 sample, 3 technical runs for each SRM-assay (A, B, C) | 1 sample, 3 technical runs for each SRM-assay (A, B, C)
Skyline files (.sky) | 1 per sample, includes 268 peptides, 268 SIS, 3 technical runs for each SRM-assay (A, B, C) | 1 per sample, includes 268 peptides, 268 SIS, 3 technical runs for each SRM-assay (A, B, C)
Transition list (merged assays A, B, C) | 3 SRM assays (A, B, C) for 268 endogenous peptides and 268 SIS | 3 SRM assays (A, B, C) for 268 endogenous peptides and 268 SIS
Sample preparation protocol

Fig. 1. Chromosome 18 proteome profiling dataset scheme.

Fig. 2. Screenshot of raw data visualization for HepG2 cell line, SRM-assay C using Skyline.

All raw files are named using the following template:

1) Chromosome number: X18;

2) Sample type : HLV (liver tissue) or HPG (HepG2 cell line);

3) SRM-assay id: A, B or C; and

4) Technical run number: r001, r002, r003.

For example, a raw file is named X18HPG_C-r002.d (Fig. 2). Skyline files are also named using these keys.
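The naming template can be decoded programmatically when batch-processing the raw files. The sketch below is a minimal parser; the exact separators (an underscore before the assay id and a hyphen before the run number) are inferred from the single example name X18HPG_C-r002.d and are therefore an assumption, and the function and class names are hypothetical.

```python
import re
from typing import NamedTuple

# Sample-type codes as listed in the naming template.
SAMPLE_TYPES = {"HLV": "liver tissue", "HPG": "HepG2 cell line"}

# Pattern inferred from the example "X18HPG_C-r002.d" (assumed separators).
_RAW_NAME = re.compile(
    r"^X(?P<chrom>\d+)(?P<sample>HLV|HPG)_(?P<assay>[ABC])-r(?P<run>\d{3})\.d$"
)


class RawFileName(NamedTuple):
    chromosome: int
    sample_type: str
    assay: str
    run: int


def parse_raw_name(name: str) -> RawFileName:
    """Split a raw file name into chromosome, sample type, assay and run."""
    match = _RAW_NAME.match(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed template: {name!r}")
    return RawFileName(
        chromosome=int(match["chrom"]),
        sample_type=SAMPLE_TYPES[match["sample"]],
        assay=match["assay"],
        run=int(match["run"]),
    )


parsed = parse_raw_name("X18HPG_C-r002.d")
```

Names that deviate from the assumed pattern raise a `ValueError` rather than being silently mis-parsed, which is convenient when checking a downloaded file set for completeness.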

2.5. Quantification

Calibration curves were obtained for each of the desired peptides using mixtures of purified synthetic non-labeled peptides in the concentration range of 100-0.1 fmole/ml, to which the corresponding isotopically labeled standards (SIS) were added at a concentration of 2 fmole/ml. All calibration curves were linear in the range of 100-0.1 fmole/ml and showed a coefficient of linear regression equal to 0.95.

Prior to sample processing, the performance of the LC-SRM platforms used was validated by obtaining the calibration curves of the corresponding set of SIS and synthetic non-labeled peptides. Moreover, after five LC-SRM runs we verified the relevance of the calibration by analyzing one of the calibration peptide solutions at 10 fmole/ml.

The detection limit was defined as the lowest concentration determined on the linear part of calibration curve. It varies for different peptides in the range from 100 amole/ml to 200 amole/ml.
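As an illustration of how the linearity of a calibration curve might be checked, the sketch below fits an ordinary least-squares line to peak area ratio against concentration and reports the Pearson correlation coefficient. The text does not state which regression statistic was reported or which software was used, so this is only an assumed procedure; the calibration points are idealized, made-up values (non-labeled/SIS ratio for a SIS spike of 2 fmole/ml), not measurements from the dataset.

```python
from math import sqrt
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation coefficient.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Idealized, hypothetical calibration points: concentration in fmole/ml,
# and non-labeled/SIS peak area ratio for a SIS spike of 2 fmole/ml.
concentrations = [0.1, 1.0, 10.0, 100.0]
area_ratios = [c / 2.0 for c in concentrations]
slope, intercept, r = fit_line(concentrations, area_ratios)
```

Under this sketch, a detection limit in the sense used above would be the lowest concentration that still lies on the fitted linear part of the curve.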

Labeled (SIS)/target peptide peak area ratios were used to calculate the concentration of the targeted peptide in a sample. Peak area ratios were obtained using Skyline software.

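$$C_{\mathrm{pept}} = C_{\mathrm{lab}} \cdot \frac{S_{\mathrm{pept}}}{S_{\mathrm{lab}}}$$

where $C_{\mathrm{pept}}$ is the target peptide concentration, $C_{\mathrm{lab}}$ is the labeled peptide (SIS) concentration (see Quantification), $S_{\mathrm{pept}}$ is the area of the target peptide peak, and $S_{\mathrm{lab}}$ is the area of the labeled peptide peak.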
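The calculation can be sketched in a few lines. The function below applies the formula directly; averaging over the three technical runs is an assumption added for illustration, since the text does not say how the runs were combined. The peak areas are made-up values, and the SIS concentration of 2 fmole/ml is used only as an example input.

```python
from statistics import mean
from typing import Sequence, Tuple


def target_concentration(c_lab: float, s_pept: float, s_lab: float) -> float:
    """Cpept = Clab * Spept / Slab (result is in the units of c_lab)."""
    if s_lab <= 0:
        raise ValueError("labeled peptide peak area must be positive")
    return c_lab * s_pept / s_lab


def mean_over_runs(c_lab: float, areas: Sequence[Tuple[float, float]]) -> float:
    """Average the concentration over technical runs (assumed, not from the source).

    Each item of `areas` is (target peak area, labeled peak area) for one run.
    """
    return mean(target_concentration(c_lab, s_pept, s_lab) for s_pept, s_lab in areas)


# Hypothetical peak areas for three technical runs of one peptide.
runs = [(1500.0, 3000.0), (1600.0, 3200.0), (1400.0, 2800.0)]
c_pept = mean_over_runs(2.0, runs)  # SIS at 2 fmole/ml, illustrative value
```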

All calculated target peptide copy numbers for liver tissue and HepG2 are listed in the Supplementary Table 1.

Conflicts of interest

Acknowledgments

This work was supported by Russian Science Foundation (#15-15-30041).

Transparency document. Supplementary material

Transparency data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.dib.2016.07.034.

Appendix A. Supplementary material

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.dib.2016.07.034.

References

[1] V.G. Zgoda, A.T. Kopylov, O.V. Tikhonova, et al., Chromosome 18 transcriptome profiling and targeted proteome mapping in depleted plasma, liver tissue and HepG2 cells, J. Proteome Res. 12 (1) (2013) 123-134.

[2] E.A. Ponomarenko, A.T. Kopylov, A.V. Lisitsa, et al., Chromosome 18 transcriptoproteome of liver tissue and HepG2 Cells and targeted proteome mapping in depleted plasma: update 2013, J. Proteome Res. 13 (1) (2014) 183-190.

[3] E.V. Poverennaya, A.T. Kopylov, E.A. Ponomarenko, et al., State of the chromosome 18-centric HPP in 2016: transcriptome and proteome profiling of liver tissue and HepG2 cells, J. Proteome Res. (2016) (submitted for publication).

[4] C.A. Hood, G. Fuentes, H. Patel, K. Page, M. Menakuru, J.H. Park, Fast conventional Fmoc solid-phase peptide synthesis with HCTU, J. Pept. Sci. 14 (1) (2008) 97-101.

[5] D. Fekkes, State-of-the-art of high-performance liquid chromatographic analysis of amino acids in physiological samples, J. Chromatogr. B. Biomed. Appl. 682 (1) (1996) 3-22.

[6] C. Ludwig, M. Claassen, A. Schmidt, R. Aebersold, Estimation of absolute protein quantities of unlabeled samples by selected reaction monitoring mass spectrometry, Mol. Cell. Proteom. [Internet] 11 (3) (2012) (M111.013987).

[7] B. MacLean, D.M. Tomazela, N. Shulman, et al., Skyline: an open source document editor for creating and analyzing targeted proteomics experiments, Bioinformatics [Internet] 26 (7) (2010) 966-968.