Proportion estimation function for the multi-subject case, applying tree-guided deconvolution

SCDC_prop_subcl_marker(
  bulk.eset,
  sc.eset,
  ct.varname,
  fl.varname,
  sample,
  ct.sub = NULL,
  ct.fl.sub,
  iter.max = 3000,
  nu = 1e-04,
  epsilon = 0.001,
  weight.basis = T,
  truep = NULL,
  select.marker = T,
  markers = NULL,
  marker.varname = NULL,
  allgenes.fl = F,
  pseudocount.use = 1,
  LFC.lim = 0.5,
  ct.cell.size = NULL,
  fl.cell.size = NULL,
  ...
)
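
A call might be assembled as in the sketch below. The object names `bulk` and `sc`, the column names "cluster", "metacluster" and "sample", and the cell-type labels are all hypothetical placeholders; only the argument names come from the usage above. The call itself runs only when the SCDC package is installed and both ExpressionSet objects exist in the session.

```r
# Hypothetical labels; replace with the values found in your own data.
cell_types    <- c("alpha", "beta", "acinar", "ductal")
meta_clusters <- c("endocrine", "exocrine")

# Arguments that do not depend on the ExpressionSet objects.
call_args <- list(
  ct.varname    = "cluster",      # assumed column holding cell types
  fl.varname    = "metacluster",  # assumed column holding meta-clusters
  sample        = "sample",       # assumed column holding subject IDs
  ct.sub        = cell_types,
  ct.fl.sub     = meta_clusters,
  select.marker = TRUE,
  LFC.lim       = 0.5
)

# `bulk` and `sc` are placeholders for ExpressionSet objects prepared elsewhere.
if (requireNamespace("SCDC", quietly = TRUE) && exists("bulk") && exists("sc")) {
  fit <- do.call(
    SCDC::SCDC_prop_subcl_marker,
    c(list(bulk.eset = bulk, sc.eset = sc), call_args)
  )
}
```

Arguments left out of the list keep the defaults shown in the usage.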

Arguments

bulk.eset

ExpressionSet object for bulk samples

sc.eset

ExpressionSet object for single cell samples

ct.varname

variable name for 'cell types'

fl.varname

variable name for first-level 'meta-clusters'

sample

variable name for subject/samples

ct.sub

a subset of cell types that are selected to construct basis matrix

ct.fl.sub

'cell types' for first-level 'meta-clusters'
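
The relationship between cell types and first-level meta-clusters can be illustrated with a small phenotype table. This is a sketch under assumptions: the data frame `pheno`, its column names, and the labels are invented for illustration, and the check at the end reflects the general idea of a tree (each cell type has one parent), not a validation step confirmed to exist in the package.

```r
# Hypothetical phenotype table for single cells.
pheno <- data.frame(
  cluster     = c("alpha", "beta", "beta", "acinar", "ductal", "alpha"),
  metacluster = c("endocrine", "endocrine", "endocrine",
                  "exocrine", "exocrine", "endocrine"),
  stringsAsFactors = FALSE
)

ct_sub    <- unique(pheno$cluster)      # candidate value for ct.sub
ct_fl_sub <- unique(pheno$metacluster)  # candidate value for ct.fl.sub

# In a tree, every cell type should belong to exactly one meta-cluster.
n_parents <- tapply(pheno$metacluster, pheno$cluster,
                    function(x) length(unique(x)))
stopifnot(all(n_parents == 1))
```

Here "cluster" would be passed as ct.varname and "metacluster" as fl.varname.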

iter.max

the maximum number of iterations in WNNLS

nu

a small constant to facilitate the calculation of variance

epsilon

a small constant used as the convergence criterion
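
How iter.max and epsilon bound an iterative fit can be sketched with a stand-in solver. The code below is not the package's WNNLS routine: it uses a simple multiplicative-update non-negative least squares scheme, and the function name `solve_nn`, the toy basis matrix, and the stopping rule (sum of absolute changes below epsilon) are assumptions made for illustration.

```r
# Stand-in solver (NOT the package's WNNLS code): multiplicative updates keep
# the estimate non-negative; the loop stops on convergence or at iter.max.
solve_nn <- function(B, y, w = rep(1, nrow(B)), iter.max = 3000, epsilon = 0.001) {
  p <- rep(1 / ncol(B), ncol(B))                 # start from equal proportions
  converged <- FALSE
  for (iter in seq_len(iter.max)) {
    num   <- as.vector(crossprod(B, w * y))               # t(B) W y
    den   <- as.vector(crossprod(B, w * as.vector(B %*% p)))  # t(B) W B p
    p_new <- p * num / den
    converged <- sum(abs(p_new - p)) < epsilon   # assumed stopping rule
    p <- p_new
    if (converged) break
  }
  list(prop = p / sum(p), iterations = iter, converged = converged)
}

# Toy basis: 6 genes (rows) by 3 cell types (columns), all entries positive.
B <- matrix(c(10, 1, 1,
               9, 1, 2,
               1, 10, 1,
               2, 8, 1,
               1, 1, 10,
               1, 2, 9), nrow = 6, byrow = TRUE)
p_true <- c(0.5, 0.3, 0.2)
y <- as.vector(B %*% p_true)                     # noise-free bulk profile

fit <- solve_nn(B, y)
```

A smaller epsilon demands a tighter fit at the cost of more iterations; iter.max caps the work when the criterion is never met.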

weight.basis

logical, use basis matrix adjusted by MVW, default is T.

truep

true cell-type proportions for bulk samples if known

select.marker

logical, select marker genes to perform deconvolution in tree-guided steps. Default is T.

markers

A set of marker genes, supplied manually, to be used in deconvolution. If NULL, then

marker.varname

variable name of cluster groups when selecting marker genes. If NULL, then use ct.varname.

allgenes.fl

logical, use all genes in the first-level deconvolution

pseudocount.use

a constant number used when selecting marker genes, default is 1.

LFC.lim

a log fold change threshold used to select the genes that are passed to Wilcoxon's test.
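
The roles of pseudocount.use and LFC.lim can be sketched for a single gene. This is an illustration of the general filter-then-test idea, not the package's internal code: the toy expression values, the use of the natural log, and the exact form of the fold change are assumptions.

```r
# Toy expression values for one gene: cells in the target cluster vs. all others.
in_cluster  <- c(12, 15, 9, 14, 11, 13)
out_cluster <- c(2, 0, 3, 1, 2, 0, 1)

pseudocount.use <- 1
LFC.lim <- 0.5

# Log fold change of mean expression; the pseudocount keeps log() finite
# when a group has zero mean. Natural log is an assumption here.
lfc <- log(mean(in_cluster) + pseudocount.use) -
  log(mean(out_cluster) + pseudocount.use)

# Only genes that pass the threshold go on to the rank-sum test.
p_value <- NA_real_
if (lfc > LFC.lim) {
  p_value <- wilcox.test(in_cluster, out_cluster, exact = FALSE)$p.value
}
```

Raising LFC.lim reduces the number of genes tested, which speeds up marker selection but may discard weaker markers.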

ct.cell.size

default is NULL, which means the "library size" is calculated from the data. Users can specify a vector of cell size factors corresponding to ct.sub, based on prior knowledge. The vector must be named: names(ct.cell.size) should not be NULL.

fl.cell.size

default is NULL, similar to ct.cell.size. This is for first-level 'meta-clusters'.
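
A named size-factor vector might be prepared as follows. The labels and numeric values are hypothetical, and the check that the names cover every entry of ct.sub is an assumption that goes slightly beyond the documented requirement that the names not be NULL. The same pattern would apply to fl.cell.size with meta-cluster labels.

```r
ct_sub <- c("alpha", "beta", "delta")

# Hypothetical size factors; real values should come from prior knowledge.
ct_cell_size <- c(beta = 1.2, alpha = 1.0, delta = 0.8)

# Documented requirement: the vector must carry names.
stopifnot(!is.null(names(ct_cell_size)))
# Assumed additional check: every selected cell type has a size factor.
stopifnot(all(ct_sub %in% names(ct_cell_size)))

# Reorder to match ct.sub so values and cell types line up.
ct_cell_size <- ct_cell_size[ct_sub]
```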

Value

Estimated proportions, basis matrix, and predicted gene expression levels for bulk samples