Read a GSE series matrix file (contains expression data)
read_gse_matrix(matrix_file)
matrix_file: string. Path to the series matrix file.
A tibble, or NULL. The first variable is ID_REF (probe ID); the others are the gene expression values of each sample.
For now, we assume that the only special value in the input file is '' (an empty field; you may search for \\t\\t to find them), i.e., there are no Inf, NaN, NA, etc. We also do not collect sample metadata.
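The assumption about empty fields can be checked on a file before reading it. The following is a minimal sketch in base R, not the implementation used by read_gse_matrix(): the three input lines are a made-up fragment, and the names lines, has_empty and df are hypothetical. It flags lines that contain two consecutive tabs and shows that base R reads such an empty numeric field as NA.

```r
# Hypothetical fragment of a series matrix table:
# probe 21 has no value for sample GSM1.
lines <- c(
  "ID_REF\tGSM1\tGSM2",
  "20\t-3.88\t-4.00",
  "21\t\t-9.61"
)

# Two consecutive tabs mark an empty field between two columns.
has_empty <- grepl("\t\t", lines, fixed = TRUE)

# Base R parses the empty numeric field as NA; ID_REF is kept as character.
df <- read.delim(
  text = lines,
  colClasses = c("character", "numeric", "numeric")
)
```

Note that an empty field in the last column is followed by a line break rather than a tab, so searching for two consecutive tabs alone would not find it.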
Other read raw data: parse_gse_soft(), read_gse_soft()
read_gse_matrix(system.file('extdata/GSE51280_series_matrix.txt.gz', package = 'rGEO'))
#> # A tibble: 123 × 25
#> ID_REF GSM1241791 GSM1241792 GSM1241793 GSM1241794 GSM1241795 GSM1241796
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 20 -3.88 -4.00 -3.56 -2.54 -3.98 -4.20
#> 2 21 -8.82 -9.61 -8.94 -5.44 -8.36 -8.95
#> 3 22 -3.65 -5.80 -1.00 -3.54 -2.81 -5.58
#> 4 23 -4.92 -7.61 -5.88 -4.60 -4.00 -4.75
#> 5 24 -3.51 -4.12 -3.92 -4.30 -4.31 -5.04
#> 6 25 -3.36 -3.90 -5.60 -2.52 -4.19 -4.12
#> 7 26 -6.06 -6.38 -7.17 -4.71 -6.41 -5.98
#> 8 27 -5.40 -5.01 -6.53 -4.74 -6.29 -6.22
#> 9 28 -2.77 -3.20 -1.29 -4.07 -3.16 -2.98
#> 10 29 -3.38 -2.26 -4.26 -2.42 -4.33 -4.51
#> # … with 113 more rows, and 18 more variables: GSM1241797 <dbl>,
#> # GSM1241798 <dbl>, GSM1241799 <dbl>, GSM1241800 <dbl>, GSM1241801 <dbl>,
#> # GSM1241802 <dbl>, GSM1241803 <dbl>, GSM1241804 <dbl>, GSM1241805 <dbl>,
#> # GSM1241806 <dbl>, GSM1241807 <dbl>, GSM1241808 <dbl>, GSM1241809 <dbl>,
#> # GSM1241810 <dbl>, GSM1241811 <dbl>, GSM1241812 <dbl>, GSM1241813 <dbl>,
#> # GSM1241814 <dbl>