delnx.pp.size_factors

delnx.pp.size_factors(adata, method='normed_sum', layer=None, obs_key_added='size_factors')

Compute size factors for (single-cell) RNA-seq normalization.

This function calculates sample/cell-specific normalization factors (size factors) to account for differences in sequencing depth and technical biases between samples. The computed size factors can be used to normalize counts for visualization or as offset terms in statistical models for differential expression analysis.

Parameters:
  • adata (AnnData) – Annotated data matrix containing expression data.

  • method (str, default="normed_sum") –

    Method to compute size factors:

    • “normed_sum”: Library size normalization based on the total counts per sample. Simple and efficient, works with sparse matrices.

    • “ratio”: DESeq2-style median-of-ratios size factors, robust to differential expression between samples. Requires dense matrices.

    • “poscounts”: Positive counts method with geometric mean normalization. Requires dense matrices.

  • layer (str, optional) – Layer in adata.layers to use for size factor calculation. If None, uses adata.X. Should contain raw (unlogged) counts.

  • obs_key_added (str, default="size_factors") – Key in adata.obs where the computed size factors will be stored.

Returns:

None. Updates adata in place and sets the following field:

  • adata.obs[obs_key_added]: Size factors for each cell.

Examples

Calculate library size normalization (default):

>>> import scanpy as sc
>>> import delnx as dx
>>> adata = sc.read_h5ad("counts.h5ad")
>>> dx.pp.size_factors(adata, method="normed_sum")
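
For intuition, library-size factors can be sketched in a few lines of NumPy. This is a minimal sketch rather than delnx's implementation: the helper name `library_size_factors` is hypothetical, the toy matrix is made up, and scaling to a geometric mean of 1 is assumed from the Notes section.

```python
import numpy as np


def library_size_factors(counts):
    """Hypothetical helper: per-cell total counts scaled to a geometric mean of 1."""
    # Sum over genes (columns). The same call also works on a SciPy sparse
    # matrix, which is why this approach does not need a dense copy.
    totals = np.asarray(counts.sum(axis=1)).ravel().astype(float)
    # Assumes every cell has at least one count, so all logs are finite.
    return totals / np.exp(np.mean(np.log(totals)))


# Toy data: 3 cells x 4 genes with total counts 4, 8 and 16.
counts = np.array([[1, 0, 3, 0],
                   [2, 0, 6, 0],
                   [4, 0, 12, 0]])
sf = library_size_factors(counts)

# Dividing each cell's row by its size factor puts cells on a common scale.
normalized = counts / sf[:, None]
```

After this normalization every cell in the toy matrix has the same total, which is the effect library-size normalization aims for.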

Calculate DESeq2-style median-of-ratios size factors:

>>> from scipy import sparse
>>> # The "ratio" method requires a dense matrix
>>> if sparse.issparse(adata.X):
...     adata.X = adata.X.toarray()
>>> dx.pp.size_factors(adata, method="ratio", obs_key_added="ratio_factors")
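
The median-of-ratios idea behind the "ratio" method can be sketched in plain NumPy. This follows the general DESeq2-style recipe (compare each cell to a per-gene geometric-mean reference and take the median ratio); the helper name `median_of_ratios` and the toy matrix are hypothetical, and delnx's own implementation may differ in details such as gene filtering and final scaling.

```python
import numpy as np


def median_of_ratios(counts):
    """Hypothetical helper: median-of-ratios size factors for a dense cells x genes array."""
    counts = np.asarray(counts, dtype=float)
    # Keep only genes with a positive count in every cell, so all logs are finite.
    expressed = np.all(counts > 0, axis=0)
    log_counts = np.log(counts[:, expressed])
    # Log geometric mean of each gene across cells acts as a pseudo-reference sample.
    log_reference = log_counts.mean(axis=0)
    # A cell's size factor is the median ratio of its counts to the reference.
    return np.exp(np.median(log_counts - log_reference, axis=1))


# Toy data: cell 2 is sequenced twice as deeply as cell 1, cell 3 four times.
# The last gene is zero in two cells and is therefore excluded.
counts = np.array([[10, 20, 30, 0],
                   [20, 40, 60, 5],
                   [40, 80, 120, 0]])
sf = median_of_ratios(counts)
```

Because the median ignores a minority of strongly changed genes, the estimate is less affected by differential expression than a plain sum. With very sparse single-cell counts, few genes may be positive in every cell, which limits how many genes contribute to the estimate in this sketch.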

Use size factors for normalization in differential expression analysis:

>>> # Compute DE with size factors as offset
>>> results = dx.tl.de(adata, condition_key="treatment", size_factor_key="size_factors")
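
To illustrate what using size factors as an offset means: in a Poisson model with a log link, log E[y_i] = log(s_i) + β, where s_i is the size factor of cell i. For an intercept-only model the maximum-likelihood estimate of exp(β) is sum(y) / sum(s), so a two-group fold change can be worked out by hand. The values below are made up, the helper `poisson_rate` is hypothetical, and this is not a description of how `dx.tl.de` fits its models.

```python
import numpy as np

# Toy counts for one gene in six cells, with made-up per-cell size factors.
y = np.array([2, 4, 8, 6, 12, 24])
size_factors = np.array([0.5, 1.0, 2.0, 0.5, 1.0, 2.0])
treated = np.array([False, False, False, True, True, True])


def poisson_rate(y, s):
    """Hypothetical helper: MLE of exp(beta) in log E[y_i] = log(s_i) + beta."""
    return y.sum() / s.sum()


rate_control = poisson_rate(y[~treated], size_factors[~treated])
rate_treated = poisson_rate(y[treated], size_factors[treated])
log2_fold_change = np.log2(rate_treated / rate_control)
```

The offset keeps the model on the raw count scale while accounting for depth, instead of dividing the counts before fitting.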

Notes

  • Size factors are scaled to have a geometric mean of 1.0 across all samples.

  • Methods “ratio” and “poscounts” require dense matrices; use “normed_sum” for sparse data.

  • A warning is issued if size factors cannot be computed for some cells.

  • Zero or invalid size factors are replaced with a small value (0.001).
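
The handling of invalid values described in these notes can be sketched as follows. The helper name `sanitize_size_factors`, the warning text, and the exact definition of "invalid" (non-finite or non-positive) are assumptions made for illustration; whether delnx applies the replacement before or after rescaling is not stated here.

```python
import warnings

import numpy as np


def sanitize_size_factors(sf, floor=0.001):
    """Hypothetical helper: replace zero or non-finite size factors with a small value."""
    sf = np.asarray(sf, dtype=float).copy()
    # Treat NaN, infinite, zero and negative values as invalid.
    invalid = ~np.isfinite(sf) | (sf <= 0)
    if invalid.any():
        warnings.warn(f"Size factors could not be computed for {invalid.sum()} cells.")
        sf[invalid] = floor
    return sf


raw = np.array([0.5, np.nan, 2.0, 0.0])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clean = sanitize_size_factors(raw)
```

Replacing invalid entries with a small positive value keeps later steps, such as dividing by the size factor or taking its logarithm for an offset, well defined.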