delnx.tl.de
delnx.tl.de(adata, condition_key=None, formula=None, contrast=None, reference=None, covariate_keys=None, method='lr', layer=None, multitest_method='fdr_bh', batch_size=2048, maxiter=100, verbose=True)
Differential expression testing for non-count data.
General-purpose DE function for log-normalized, scaled, or binary data. Builds a design matrix from formula or condition_key and tests a single contrast. For multiple comparisons, call this function once per contrast, or use grouped() for per-group analysis.
For count data, use nb_fit() + nb_test() instead. For fast cluster markers, use rank_de().
Parameters:
  - adata (AnnData) – Annotated data object containing expression data and metadata.
  - condition_key (str | None (default: None)) – Column name in adata.obs containing condition labels. Internally builds "~ condition_key" for the design matrix. Mutually exclusive with formula.
  - formula (str | None (default: None)) – R-style formula for the design matrix (e.g., "~ treatment + batch"). Parsed by patsy. Mutually exclusive with condition_key.
  - contrast (str | int | None (default: None)) – Coefficient to test. Supports shorthand:
      - Level name: "drugA" (resolved via condition_key).
      - Bracket shorthand: "treatment[drugA]" (for multi-covariate formulas).
      - Full patsy name: "treatment[T.drugA]" (always works).
      - Integer index or None (last coefficient).
  - reference (str | None (default: None)) – Reference level for categorical conditions. This level becomes the intercept in treatment coding.
  - covariate_keys (list[str] | None (default: None)) – Columns in adata.obs to include as covariates. Only used with condition_key; when using formula, include the covariates in the formula directly.
  - method (str (default: 'lr')) – Statistical method for testing:
      - "lr": Logistic regression with likelihood ratio test. Recommended for log-normalized single-cell data.
      - "anova": ANOVA based on linear model. Recommended for log-normalized or scaled data.
      - "anova_residual": Linear model with residual F-test.
      - "binomial": Binomial GLM likelihood ratio test. For binary data (e.g., ATAC-seq).
  - layer (str | None (default: None)) – Layer in adata.layers containing expression data. If None, uses adata.X.
  - multitest_method (str (default: 'fdr_bh')) – Method for multiple testing correction (see statsmodels.stats.multitest.multipletests()).
  - batch_size (int (default: 2048)) – Number of features to process per batch.
  - maxiter (int (default: 100)) – Maximum number of optimization iterations.
  - verbose (bool (default: True)) – Whether to print progress messages.
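The contrast shorthands listed for the contrast parameter can be pictured as a lookup against the design-matrix column names. The following is a minimal sketch of that idea, not delnx's actual resolution logic: resolve_contrast is a hypothetical helper, and the column names are assumed to follow patsy's treatment-coding convention ("factor[T.level]").

```python
from typing import Optional, Sequence, Union


def resolve_contrast(
    contrast: Union[str, int, None],
    column_names: Sequence[str],
    condition_key: Optional[str] = None,
) -> str:
    """Map a contrast shorthand to a full column name (hypothetical helper)."""
    # None selects the last coefficient.
    if contrast is None:
        return column_names[-1]
    # An integer selects a coefficient by position.
    if isinstance(contrast, int):
        return column_names[contrast]
    # A full patsy name, or a continuous covariate such as "age", matches directly.
    if contrast in column_names:
        return contrast
    # Bracket shorthand: "treatment[drugA]" -> "treatment[T.drugA]".
    if contrast.endswith("]") and "[" in contrast:
        factor, level = contrast[:-1].split("[", 1)
        candidate = f"{factor}[T.{level}]"
        if candidate in column_names:
            return candidate
    # Bare level name: "drugA" -> "<condition_key>[T.drugA]".
    if condition_key is not None:
        candidate = f"{condition_key}[T.{contrast}]"
        if candidate in column_names:
            return candidate
    raise ValueError(
        f"Cannot resolve contrast {contrast!r}; available: {list(column_names)}"
    )


# Assumed column names for "~ treatment + batch" with reference level "control".
columns = ["Intercept", "treatment[T.drugA]", "treatment[T.drugB]", "batch[T.b2]"]
resolved = [
    resolve_contrast("drugA", columns, condition_key="treatment"),
    resolve_contrast("treatment[drugA]", columns),
    resolve_contrast("treatment[T.drugA]", columns),
    resolve_contrast(1, columns),
]
```

In this sketch all four spellings resolve to the same coefficient, which is why the full patsy name is the safest choice when a formula contains several categorical terms.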
Return type:
  DataFrame
Returns:
  pd.DataFrame. Results with columns:
  - feature: Gene/feature names
  - log2fc: Log2 fold change (coefficient / log(2), clipped to [-10, 10])
  - coef: Model coefficient (log scale)
  - stat: Test statistic (LR chi-squared or F-statistic)
  - pval: Raw p-value
  - padj: Adjusted p-value
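To make the relationship between the result columns concrete, here is a small standalone sketch of how log2fc follows from coef and how padj follows from pval under the default 'fdr_bh' correction. It is an illustration under stated assumptions, not the library's code: coef_to_log2fc and benjamini_hochberg are hypothetical names, the coefficient is assumed to be on the natural-log scale, and the adjustment is a plain-Python version of the Benjamini-Hochberg procedure.

```python
import math


def coef_to_log2fc(coef: float, limit: float = 10.0) -> float:
    """Convert a natural-log-scale coefficient to log2 and clip to [-limit, limit]."""
    return max(-limit, min(limit, coef / math.log(2)))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# A coefficient of log(2) is a log2 fold change of 1; extreme values are clipped.
log2fc = [coef_to_log2fc(c) for c in (math.log(2), -math.log(4), 50.0)]
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

The clipping only affects the reported log2fc; in this sketch the coef value itself is left untouched, matching the separate coef column in the results.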
Examples
Simple condition comparison:
>>> results = dx.tl.de(adata, condition_key="treatment", reference="control",
...                    contrast="treatment[T.drugA]")
Formula-based with covariates:
>>> results = dx.tl.de(adata, formula="~ treatment + batch",
...                    contrast="treatment[T.drugA]")
Continuous covariate:
>>> results = dx.tl.de(adata, formula="~ age + sex", contrast="age")
Binomial for binary ATAC data:
>>> results = dx.tl.de(adata, condition_key="treatment", reference="control",
...                    contrast="treatment[T.drugA]", method="binomial",
...                    layer="binary")
Notes
For count data (RNA-seq), use nb_fit() + nb_test() instead. For fast cluster markers, use rank_de().
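Because each call tests a single contrast, comparing several levels against the same reference means looping over contrasts. The sketch below shows one way to organize that loop. run_contrasts is a hypothetical helper, not part of delnx; it accepts the DE function as an argument, so it makes no assumptions beyond the contrast keyword documented above.

```python
from typing import Any, Callable, Dict, Iterable


def run_contrasts(
    de_fn: Callable[..., Any],
    adata: Any,
    contrasts: Iterable[str],
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call a DE function once per contrast and collect results by contrast name."""
    results = {}
    for contrast in contrasts:
        # Each call tests exactly one coefficient of the same design.
        results[contrast] = de_fn(adata, contrast=contrast, **kwargs)
    return results


# Intended usage, assuming `import delnx as dx` and an AnnData object `adata`:
# per_contrast = run_contrasts(
#     dx.tl.de,
#     adata,
#     ["treatment[T.drugA]", "treatment[T.drugB]"],
#     formula="~ treatment + batch",
# )
```

Keeping the results in a dictionary keyed by contrast name leaves each returned table unchanged, so the caller can decide later whether and how to combine them.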