delnx.tl.auroc

delnx.tl.auroc(adata, condition_key, reference=None, mode='all_vs_all', layer=None, min_samples=2, batch_size=2048, verbose=False)

Calculate Area Under the Receiver Operating Characteristic (AUROC) between condition levels.

This function computes, for every feature, an AUROC value for each compared pair of condition levels. AUROC quantifies how well a feature’s expression distinguishes between two conditions, giving a measure of the feature’s discriminative power that does not depend on any specific threshold.
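To make the definition concrete, here is a minimal NumPy sketch of AUROC for a single feature. It is not delnx's implementation: `auroc_pairwise` is a hypothetical helper that uses the standard probabilistic reading of AUROC, namely the chance that a randomly drawn test sample has a higher value than a randomly drawn reference sample, with ties counted as one half.

```python
import numpy as np


def auroc_pairwise(test_values, ref_values):
    """AUROC for one feature (hypothetical helper, not part of delnx).

    Equals the fraction of (test, reference) sample pairs in which the
    test value is larger, counting ties as half a win.
    """
    test = np.asarray(test_values, dtype=float)[:, None]  # shape (n_test, 1)
    ref = np.asarray(ref_values, dtype=float)[None, :]    # shape (1, n_ref)
    greater = (test > ref).sum()
    ties = (test == ref).sum()
    return (greater + 0.5 * ties) / (test.size * ref.size)


# Every test value exceeds every reference value: perfect separation.
separated = auroc_pairwise([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
# Identical values in both groups: no discriminative power.
identical = auroc_pairwise([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
# Swapping the two groups mirrors the value around 0.5.
swapped = auroc_pairwise([1.0, 2.0, 3.0], [5.0, 6.0, 7.0])
```

Under this definition, swapping the two groups turns a value `a` into `1 - a`, which is why values below 0.5 can be read as higher expression in the reference group.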

Parameters:
  • adata (AnnData) – AnnData object containing expression data and metadata.

  • condition_key (str) – Column name in adata.obs containing condition labels.

  • reference (str | tuple[str, str] | None (default: None)) – Reference condition for comparisons, specified as:
      - Single string: reference condition for all comparisons
      - Tuple (reference, comparison): specific pair to compare
      - None: automatically determined based on the mode parameter

  • mode (Literal['all_vs_ref', 'all_vs_all', '1_vs_1', 'continuous'] (default: 'all_vs_all')) – Comparison strategy:
      - "all_vs_ref": Compare all condition levels against the reference level
      - "all_vs_all": Compare all pairs of condition levels
      - "1_vs_1": Compare only reference vs comparison (requires a tuple reference)

  • layer (str | None (default: None)) – Layer in adata.layers to use for expression data. If None, uses adata.X.

  • min_samples (int (default: 2)) – Minimum number of samples required per condition level. Comparisons with fewer samples are skipped.

  • batch_size (int (default: 2048)) – Number of features to process per batch. Adjust based on available memory and dataset size.

  • verbose (bool (default: False)) – Whether to print progress information.
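Because condition levels with fewer than min_samples samples are skipped, it can help to check group sizes before calling the function. The sketch below is an illustration only: it uses a small stand-in DataFrame in place of adata.obs, and the column name "condition" and its labels are made up for the example.

```python
import pandas as pd

# Stand-in for adata.obs; the column name and labels are hypothetical.
obs = pd.DataFrame(
    {"condition": ["control"] * 3 + ["treated"] * 2 + ["rare"]}
)

min_samples = 2  # same meaning as the min_samples parameter above

# Count samples per condition level and flag levels below the threshold.
counts = obs["condition"].value_counts()
too_small = sorted(counts[counts < min_samples].index)
```

Here `too_small` lists the levels whose comparisons would be expected to be skipped under the rule described for min_samples.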

Return type:

DataFrame

Returns:

A pd.DataFrame containing AUROC results, with columns:
  - "feature": Feature/gene names
  - "test_condition": Test condition label
  - "ref_condition": Reference condition label
  - "auroc": AUROC values (0.5 = random, 1 = perfect separation with higher values in the test condition)
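As a rough sketch of working with the returned table, the snippet below ranks features by how far their AUROC lies from 0.5. The DataFrame is a hand-written stand-in with the documented columns; the gene names, condition labels, AUROC values, and the 0.3 cutoff are all made up for illustration.

```python
import pandas as pd

# Hand-written stand-in for the output of dx.tl.auroc (values are made up).
results = pd.DataFrame(
    {
        "feature": ["GeneA", "GeneB", "GeneC", "GeneD"],
        "test_condition": ["treated"] * 4,
        "ref_condition": ["control"] * 4,
        "auroc": [0.95, 0.52, 0.10, 0.70],
    }
)

# Distance from 0.5 measures discriminative power in either direction.
results["effect"] = (results["auroc"] - 0.5).abs()

# Keep strongly discriminative features (arbitrary cutoff), strongest first.
top = results[results["effect"] >= 0.3].sort_values("effect", ascending=False)
top_features = top["feature"].tolist()
```

Ranking by distance from 0.5 keeps features that are higher in either condition; filtering on `auroc` directly would keep only one direction.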

Examples

Basic usage for all pairwise comparisons:

>>> import scanpy as sc
>>> import delnx as dx
>>> adata = sc.read_h5ad("dataset.h5ad")
>>> results = dx.tl.auroc(adata, condition_key="cell_type")

Looking at specific condition comparisons:

>>> # Compare only CD4+ T cells vs CD8+ T cells
>>> results = dx.tl.auroc(adata, condition_key="cell_type", reference=("CD4+ T", "CD8+ T"), mode="1_vs_1")
>>> # Compare all cell types against a reference type
>>> results = dx.tl.auroc(adata, condition_key="cell_type", reference="B cells", mode="all_vs_ref")

Notes

  • AUROC values range from 0 to 1, where:
      - 0.5 indicates the feature cannot distinguish between conditions (random)
      - Values >0.5 indicate higher expression in the test condition
      - Values <0.5 indicate higher expression in the reference condition

  • The implementation uses JAX for accelerated computation and processes features in batches to handle large datasets efficiently.
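The idea of processing features in batches can be sketched in plain NumPy. This is an assumption-laden illustration, not the library's JAX code: `auroc_batched` is a hypothetical helper that takes dense (samples × features) arrays for the two groups and compares every test sample with every reference sample one block of feature columns at a time, so that intermediate arrays stay bounded by the batch size.

```python
import numpy as np


def auroc_batched(X_test, X_ref, batch_size=2048):
    """Per-feature AUROC over column batches (hypothetical, NumPy only)."""
    n_test, n_features = X_test.shape
    n_ref = X_ref.shape[0]
    out = np.empty(n_features)
    for start in range(0, n_features, batch_size):
        stop = min(start + batch_size, n_features)
        t = X_test[:, start:stop]  # (n_test, b)
        r = X_ref[:, start:stop]   # (n_ref, b)
        # Broadcast to (n_test, n_ref, b) and count wins and ties per feature.
        greater = (t[:, None, :] > r[None, :, :]).sum(axis=(0, 1))
        ties = (t[:, None, :] == r[None, :, :]).sum(axis=(0, 1))
        out[start:stop] = (greater + 0.5 * ties) / (n_test * n_ref)
    return out


rng = np.random.default_rng(0)
X_ref = rng.normal(size=(20, 5))
X_test = rng.normal(size=(30, 5))
X_test[:, 0] += 100.0  # feature 0 is far higher in the test group

small_batches = auroc_batched(X_test, X_ref, batch_size=2)
one_batch = auroc_batched(X_test, X_ref, batch_size=5)
```

The batch size changes only peak memory use, not the result. The pairwise comparison shown here scales with n_test × n_ref per feature, so it is suitable as an illustration rather than for large datasets.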