Cell culture
Primary human hepatocytes (PHH; Lot 005 and Lot 201678901, Novabiosis) were used to generate ProliHHs in this study. Cryopreserved PHH were thawed and seeded into Matrigel (Corning)-coated 6-well culture plates at 200,000 viable cells per well. After 6 h, the medium was replaced with HM, prepared according to a previously published protocol [15]. HM consisted of 500 ml Advanced DMEM/F-12 (Life Technologies) supplemented with 1× N2 supplement (100× stock; Life Technologies), 1× B27 supplement minus vitamin A (50× stock; Life Technologies), 1 mM N-acetyl-cysteine (Sigma-Aldrich), 10 mM nicotinamide (Solarbio), 2 ng/ml recombinant human FGF10 (Peprotech), 50 ng/ml recombinant human EGF (Peprotech), 25 ng/ml recombinant human HGF (Peprotech), 10 nM human [Leu15]-gastrin I (Sigma-Aldrich), 5 μM A 83-01 (Tocris Bioscience), 10 μM Rho kinase inhibitor Y-27632 (Selleck), 50 ng/ml Wnt3a protein (stemimmune LLC) and 1% fetal bovine serum (Ausbian). The culture medium was changed every 3 days. After 7 days of culture in HM under 2% O2 hypoxia, PHH were successfully converted into ProliHHs. Cells were washed with PBS and trypsinized for passaging when they reached 90% confluence. ProliHHs were maintained at 37℃ in a hypoxic incubator (5% CO2, 2% O2). The morphology of ProliHHs (P0, P1 and P4) was examined by phase-contrast microscopy after 1 day of culture. HepG2 cells were obtained from ATCC. 293FT cells were provided by Professor Lijian Hui (State Key Laboratory of Cell Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences). Hepatoblast cells were provided by Professor Xin Cheng (State Key Laboratory of Cell Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, University of Chinese Academy of Sciences, Chinese Academy of Sciences). Primary rat hepatocytes were isolated from Sprague-Dawley (SD) rats according to a previously described protocol [32].
RT-qPCR analysis
ProliHHs (P0, P1 and P4) were maintained on Matrigel-coated 12-well plates at 250,000 viable cells per well. After 24 h, total RNA was extracted with TRIzol reagent (Life Technologies) and EZ-10 spin columns and collection tubes (Sangon Biotech). RNA (500 ng) was reverse-transcribed to cDNA using Hifair® III 1st Strand cDNA Synthesis SuperMix (Yeasen Biotech). cDNA was amplified with Hieff® qPCR SYBR Green Master Mix (Yeasen Biotech) on an Applied Biosystems 7500 Fast Real-Time PCR System (Thermo Fisher Scientific). Primer sequences are listed in Table S1. Relative mRNA levels were normalized to GAPDH. Each sample was run in triplicate. Data were analyzed with GraphPad Prism 8.0 software. Results are presented as means ± SD. One-way ANOVA was used for statistical analysis; ns p ≥ 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001.
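The text states only that relative mRNA levels were normalized to GAPDH; the exact formula is not given. The following is a minimal sketch, assuming the widely used 2^-ΔΔCt calculation and assuming one group (for example P0) serves as the reference. The function name and the Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def relative_expression(ct_target, ct_gapdh, ct_target_ref, ct_gapdh_ref):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumption: the 2^-ddCt method is used; the source only says that
    mRNA levels were normalized to GAPDH. Each argument is a list of
    technical-replicate Ct values (three per sample in the text).
    """
    # dCt: target Ct minus GAPDH Ct, within each group
    d_ct_sample = mean(ct_target) - mean(ct_gapdh)
    d_ct_reference = mean(ct_target_ref) - mean(ct_gapdh_ref)
    # ddCt: sample dCt minus reference-group dCt
    dd_ct = d_ct_sample - d_ct_reference
    return 2 ** (-dd_ct)


# Hypothetical Ct values: the target amplifies one cycle earlier in the sample
fold = relative_expression(
    ct_target=[24.0, 24.0, 24.0],
    ct_gapdh=[18.0, 18.0, 18.0],
    ct_target_ref=[25.0, 25.0, 25.0],
    ct_gapdh_ref=[18.0, 18.0, 18.0],
)
print(fold)  # 2.0, i.e. twofold higher than the reference group
```

A one-cycle decrease in ΔCt corresponds to a twofold increase in relative expression, which is why the result above is 2.0.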
Reactive oxygen species measurements
ProliHHs (P0, P1 and P4) were seeded on Matrigel-coated 96-well plates at 50,000 viable cells per well and incubated overnight. Cells were washed with serum-free medium and treated with 10 μmol/L DCFH-DA (Beyotime Biotechnology) for 20 min at 37℃. Cells were then washed 3 times with serum-free medium, and fluorescence intensity was measured at 488 nm excitation and 525 nm emission with an automatic microplate reader (BioTek).
Triglyceride measurements
ProliHHs (P0, P1 and P4) were cultured on Matrigel-coated 96-well plates at 50,000 viable cells per well. Cells were washed with PBS and the supernatant was removed. Then, 10 μl RIPA lysis buffer (Beyotime Biotechnology) was added to each well for 10 min. The protein content of the lysate was determined with a Take3 Micro-Volume Plate (BioTek), and triglyceride (TG) levels were determined with a Triglyceride assay kit (Nanjing Jiancheng Bioengineering Institute) according to the manufacturer’s instructions.
Hydroxyproline measurements
ProliHHs (P0, P1 and P4) were maintained on Matrigel-coated 12-well plates at 1,000,000 viable cells per well. After 24 h, the supernatant was collected and the hydroxyproline concentration was measured with a Hydroxyproline assay kit (Nanjing Jiancheng Bioengineering Institute) according to the manufacturer’s instructions.
Raman spectroscopy measurements
ProliHHs (P0, P1 and P4) were seeded on an 8-Well Chamber Raman Scattering Microslide (D-BAND) at a density of 100,000 cells/ml and cultured for 24 h. Cells were washed with PBS, fixed with 4% paraformaldehyde (Beyotime Biotechnology) for 15 min and washed 3 times with sterile water. The cells on the Raman Scattering Microslide were then air-dried. Raman measurements were performed with a confocal Raman imaging system (alpha 300 R, WITec) equipped with a 532 nm laser, an 1800 grooves/mm grating (BLZ = 500 nm) and a CCD camera. Raman spectra (from 300 to 1800 cm−1) were collected through a 100× objective (N.A. = 0.9) with a laser power of 9 mW, an integration time of 10 s and 1 accumulation. Calibration was performed using a silicon plate, whose characteristic peak is located at 520.7 cm−1. For each cell, Raman spectra were acquired at random positions within the cytoplasm (n = 5) and on the periphery (n = 5), selected from the brightfield image. A representative image showing the different laser focusing locations is provided in Additional file 1: Figure S1. For each cell type, Raman spectra (n = 600 in total) were acquired from 20 single cells per batch in 3 replicate batches. All spectra were preprocessed by cosmic-ray removal, baseline correction and area normalization.
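The sampling scheme and the final preprocessing step described above can be illustrated with a short sketch. It checks that 10 spectra per cell, 20 cells per batch and 3 batches give the stated 600 spectra, and it shows one common form of area normalization (dividing by the trapezoidal area under the spectrum). The source does not state which integration rule was used, so the trapezoidal rule, the function names and the toy spectrum are all assumptions. Cosmic-ray removal and baseline correction are not shown.

```python
def trapezoid_area(x, y):
    """Area under y(x) by the trapezoidal rule (assumed integration rule)."""
    return sum(
        (x[i + 1] - x[i]) * (y[i + 1] + y[i]) / 2.0
        for i in range(len(x) - 1)
    )


def area_normalize(wavenumbers, intensities):
    """Scale a baseline-corrected spectrum so that its total area equals 1."""
    area = trapezoid_area(wavenumbers, intensities)
    if area == 0:
        raise ValueError("spectrum has zero area; cannot normalize")
    return [value / area for value in intensities]


# Sampling scheme from the text: 5 cytoplasm + 5 periphery spectra per cell,
# 20 single cells per batch, 3 replicate batches per cell type.
spectra_per_cell = 5 + 5
spectra_per_cell_type = spectra_per_cell * 20 * 3
print(spectra_per_cell_type)  # 600

# Toy spectrum (hypothetical values) over the 300 to 1800 cm^-1 window
wavenumbers = [300.0, 1050.0, 1800.0]
intensities = [1.0, 3.0, 1.0]
normalized = area_normalize(wavenumbers, intensities)
```

After normalization every spectrum has unit area, so band areas computed later are expressed as fractions of the total signal rather than as raw counts.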
Raman data analysis
All Raman data analyses were performed in R 3.6.3 with in-house scripts. Linear discriminant analysis (LDA) and principal component analysis (PCA) were used to reduce data dimensionality and visualize the classification. Raman band areas were integrated for semiquantitative analysis of the associated biomolecules. Results are presented as medians. Student's t-test was used for statistical analysis; ns p ≥ 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001. Machine learning models were built to rapidly identify cells. Of the data, 75% was used as a training set to build the models and 25% as a test set to evaluate them. Models were trained with tenfold cross-validation repeated five times. First, single classifiers were built using k-nearest neighbor (KNN), linear discriminant analysis (LDA), partial least-squares regression (PLS), linear support vector machine (Linear-SVM), radial basis function kernel support vector machine (RBF-SVM) and random forest (RF). To improve prediction accuracy, all single classifiers were then stacked into a two-layer model using the gradient boosting machine (GBM) algorithm. The features of the second layer are the outputs of the single models (KNN, LDA, PLS, Linear-SVM, RBF-SVM and RF), following the protocol of Hsu et al. [33].
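The resampling scheme described above (a 75%/25% split followed by tenfold cross-validation repeated five times on the training set) can be sketched as follows. The original analysis was done in R with in-house scripts that are not shown, so this standard-library Python version is only an illustration. The function names, the random seeds and the total of 1800 spectra (three groups of 600) are assumptions, and the sketch does not stratify by class.

```python
import random


def split_train_test(n_samples, train_fraction=0.75, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_kfold(indices, n_folds=10, n_repeats=5, seed=0):
    """Yield (fit, validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        # Deal the shuffled indices into n_folds nearly equal folds
        folds = [shuffled[i::n_folds] for i in range(n_folds)]
        for held_out in range(n_folds):
            validation = folds[held_out]
            fit = [
                idx
                for fold_id, fold in enumerate(folds)
                if fold_id != held_out
                for idx in fold
            ]
            yield fit, validation


# Hypothetical data set size: 3 groups x 600 spectra
train_idx, test_idx = split_train_test(1800)
resamples = list(repeated_kfold(train_idx))
print(len(train_idx), len(test_idx), len(resamples))  # 1350 450 50
```

In a stacked model of the kind described, each single classifier would be fitted on the `fit` indices of every resample, and its predictions on the held-out `validation` indices would become the input features of the second-layer GBM. The 25% test set stays untouched until the final evaluation.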