delnx.pp.size_factors
- delnx.pp.size_factors(adata, method='normed_sum', layer=None, obs_key_added='size_factors')
Compute size factors for (single-cell) RNA-seq normalization.
This function calculates sample/cell-specific normalization factors (size factors) to account for differences in sequencing depth and technical biases between samples. The computed size factors can be used to normalize counts for visualization or as offset terms in statistical models for differential expression analysis.
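As an illustration of the two uses mentioned above, the following sketch divides toy counts by per-cell size factors and derives log offsets for a count model. It is not delnx code: `normalize_counts`, the toy counts, and the size factor values are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math

def normalize_counts(counts, size_factors):
    """Divide each cell's counts by that cell's size factor (rows are cells)."""
    return [[value / sf for value in row] for row, sf in zip(counts, size_factors)]

# Toy data (hypothetical): cell 0 was sequenced at a quarter of the depth of cell 1.
counts = [[10, 0, 30], [40, 0, 120]]
size_factors = [0.5, 2.0]  # geometric mean is 1.0

# Depth-corrected counts, e.g. for visualization.
normalized = normalize_counts(counts, size_factors)

# For a count model with a log link, the offset is the log of the size factor.
offsets = [math.log(sf) for sf in size_factors]
```

After correction both cells show the same expression profile, which is the intended effect when the only difference between them is sequencing depth.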
- Parameters:
adata (AnnData) – Annotated data matrix containing expression data.
method (str, default="normed_sum") – Method used to compute size factors:
  - “normed_sum”: Library size normalization based on the total counts per sample. Simple and efficient; works with sparse matrices.
  - “ratio”: DESeq2-style median-of-ratios size factors, robust to differential expression between samples. Requires dense matrices.
  - “poscounts”: Positive counts method with geometric mean normalization. Requires dense matrices.
layer (str, optional) – Layer in adata.layers to use for size factor calculation. If None, uses adata.X. Should contain raw (unlogged) counts.
obs_key_added (str, default="size_factors") – Key in adata.obs where the computed size factors will be stored.
- Returns:
None. Updates adata in place and sets the following field:
- adata.obs[obs_key_added]: Size factors for each cell.
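To make the difference between “normed_sum” and “ratio” concrete, here is a minimal sketch of the two underlying ideas: library-size factors and the DESeq2-style median of ratios. It is an assumption-laden illustration, not delnx's implementation: the function names `library_size_factors` and `median_of_ratios` are hypothetical, the inputs are plain lists (rows are cells or samples, columns are genes), and the “poscounts” variant is not covered.

```python
import math
from statistics import median

def library_size_factors(counts):
    """Total counts per cell, rescaled so the geometric mean is 1."""
    totals = [sum(row) for row in counts]
    log_mean = sum(math.log(t) for t in totals) / len(totals)
    return [t / math.exp(log_mean) for t in totals]

def median_of_ratios(counts):
    """Median-of-ratios size factors in the style of DESeq2."""
    n_genes = len(counts[0])
    # Keep only genes that are positive in every sample, so the log is defined.
    usable = [g for g in range(n_genes) if all(row[g] > 0 for row in counts)]
    if not usable:
        raise ValueError("every gene has at least one zero count")
    # Per-gene log geometric mean across samples (the pseudo-reference).
    log_ref = {
        g: sum(math.log(row[g]) for row in counts) / len(counts) for g in usable
    }
    # Size factor: median over genes of the ratio to the reference, in log space.
    return [
        math.exp(median([math.log(row[g]) - log_ref[g] for g in usable]))
        for row in counts
    ]

# Toy data (hypothetical): the first three genes differ by a factor of 2
# between the two samples; the last gene has the same raw count in both.
counts = [[10, 20, 30, 100], [20, 40, 60, 100]]
lib = library_size_factors(counts)
ratio = median_of_ratios(counts)
```

In this toy case the median of ratios recovers the twofold depth difference carried by the majority of genes, while the library-size factors are pulled toward 1 by the single high-count gene.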
Examples
Calculate library size normalization (default):
>>> import scanpy as sc
>>> import delnx as dx
>>> adata = sc.read_h5ad("counts.h5ad")
>>> dx.pp.size_factors(adata, method="normed_sum")
Calculate DESeq2-style median-of-ratios size factors:
>>> from scipy import sparse
>>> # Requires a dense matrix
>>> if sparse.issparse(adata.X):
...     adata.X = adata.X.toarray()
>>> dx.pp.size_factors(adata, method="ratio", obs_key_added="ratio_factors")
Use size factors for normalization in differential expression analysis:
>>> # Compute DE with size factors as offset
>>> results = dx.tl.de(adata, condition_key="treatment", size_factor_key="size_factors")
Notes
- Size factors are scaled to have a geometric mean of 1.0 across all samples.
- Methods “ratio” and “poscounts” require dense matrices; use “normed_sum” for sparse data.
- A warning is raised if size factors cannot be computed for some cells.
- Zero or invalid size factors are replaced with a small value (0.001).
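The scaling and replacement rules in these notes can be sketched as a small post-processing step. This is only one plausible reading: the helper `finalize_size_factors` is hypothetical, and both the order of operations and the choice to compute the geometric mean over valid values only are assumptions, not confirmed delnx behaviour.

```python
import math

def finalize_size_factors(raw, floor=0.001):
    """Rescale valid factors to geometric mean 1; replace invalid ones with `floor`."""
    def is_valid(x):
        return math.isfinite(x) and x > 0

    valid = [x for x in raw if is_valid(x)]
    if not valid:
        raise ValueError("no valid size factors to rescale")
    # Geometric mean of the valid factors, computed in log space.
    scale = math.exp(sum(math.log(x) for x in valid) / len(valid))
    # Assumption: invalid entries (zero, negative, NaN, inf) get the floor value.
    return [x / scale if is_valid(x) else floor for x in raw]

# Toy input (hypothetical): two usable factors, one zero, one NaN.
final = finalize_size_factors([2.0, 8.0, 0.0, float("nan")])
```

Here the two valid factors have a geometric mean of 4, so they become approximately 0.5 and 2.0, and the zero and NaN entries are set to 0.001.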