delnx.tl.de

delnx.tl.de(adata, condition_key=None, formula=None, contrast=None, reference=None, covariate_keys=None, method='lr', layer=None, multitest_method='fdr_bh', batch_size=2048, maxiter=100, verbose=True)

Differential expression testing for non-count data.

General-purpose DE function for log-normalized, scaled, or binary data. Builds a design matrix from formula or condition_key and tests a single contrast. For multiple comparisons, call this function once per contrast or use grouped() for per-group analysis.

For count data, use nb_fit() + nb_test() instead. For fast cluster markers, use rank_de().

Parameters:
  • adata (AnnData) – Annotated data object containing expression data and metadata.

  • condition_key (str | None (default: None)) – Column name in adata.obs containing condition labels. Internally builds "~ condition_key" for the design matrix. Mutually exclusive with formula.

  • formula (str | None (default: None)) – R-style formula for the design matrix (e.g., "~ treatment + batch"). Parsed by patsy. Mutually exclusive with condition_key.

  • contrast (str | int | None (default: None)) –

    Coefficient to test. Supports shorthand:

    • Level name: "drugA" (resolved via condition_key).

    • Bracket shorthand: "treatment[drugA]" (for multi-covariate formulas).

    • Full patsy name: "treatment[T.drugA]" (always works).

    • Integer index or None (last coefficient).

  • reference (str | None (default: None)) – Reference level for categorical conditions. Under treatment coding, this level is the baseline represented by the intercept, and the coefficients of the other levels are estimated relative to it.

  • covariate_keys (list[str] | None (default: None)) – Columns in adata.obs to include as covariates. Only used with condition_key; when using formula, include the covariates in the formula directly.

  • method (str (default: 'lr')) –

    Statistical method for testing:

    • "lr": Logistic regression with likelihood ratio test. Recommended for log-normalized single-cell data.

    • "anova": ANOVA based on linear model. Recommended for log-normalized or scaled data.

    • "anova_residual": Linear model with residual F-test.

    • "binomial": Binomial GLM likelihood ratio test. For binary data (e.g., ATAC-seq).

  • layer (str | None (default: None)) – Layer in adata.layers containing expression data. If None, uses adata.X.

  • multitest_method (str (default: 'fdr_bh')) – Method for multiple testing correction (see statsmodels.stats.multitest.multipletests()).

  • batch_size (int (default: 2048)) – Number of features to process per batch.

  • maxiter (int (default: 100)) – Maximum number of optimization iterations.

  • verbose (bool (default: True)) – Whether to print progress messages.
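
The condition_key and reference parameters describe a treatment-coded design matrix. The following is a minimal sketch of what such a matrix looks like, written in plain Python for illustration. The helper treatment_design is hypothetical and is not part of delnx; it assumes patsy-style column names ("Intercept", "key[T.level]") and patsy's default of using the first level in sorted order as the reference.

```python
def treatment_design(labels, key, reference=None):
    """Build an intercept-plus-indicators design matrix (treatment coding).

    Illustrative helper only; not part of delnx.
    """
    levels = sorted(set(labels))
    if reference is None:
        # Assumption: default reference is the first level in sorted order.
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"reference {reference!r} not found in {key!r}")

    # Every non-reference level gets its own 0/1 indicator column.
    others = [lv for lv in levels if lv != reference]
    columns = ["Intercept"] + [f"{key}[T.{lv}]" for lv in others]
    rows = [[1] + [int(lab == lv) for lv in others] for lab in labels]
    return columns, rows


labels = ["control", "drugA", "drugB", "drugA"]
columns, rows = treatment_design(labels, "treatment", reference="control")

print(columns)
for lab, row in zip(labels, rows):
    print(lab, row)
```

The reference level has no column of its own: cells in that level are described by the intercept alone, which is why a contrast such as "treatment[T.drugA]" reads as "drugA relative to the reference".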

Return type:

DataFrame

Returns:

A pd.DataFrame of results with columns:

  • feature: Gene/feature names

  • log2fc: Log2 fold change (coefficient / log(2), clipped to [-10, 10])

  • coef: Model coefficient (log scale)

  • stat: Test statistic (LR chi-squared or F-statistic)

  • pval: Raw p-value

  • padj: Adjusted p-value
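
The relationships between these columns can be sketched with the standard library alone. This is an illustration under stated assumptions, not the delnx implementation: it assumes coef is on the natural-log scale (as the "coefficient / log(2)" formula suggests), that an LR statistic for a single tested coefficient is compared against a chi-squared distribution with 1 degree of freedom, and that padj uses the Benjamini-Hochberg procedure implied by the default 'fdr_bh'. All function names are hypothetical.

```python
import math


def log2fc_from_coef(coef, limit=10.0):
    """Convert a natural-log-scale coefficient to a clipped log2 fold change."""
    return max(-limit, min(limit, coef / math.log(2)))


def lr_pvalue_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    # For 1 df, P(X > x) equals P(|Z| > sqrt(x)) for a standard normal Z.
    return math.erfc(math.sqrt(stat / 2.0))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


coefs = [math.log(2), 20.0, -0.5]
log2fcs = [log2fc_from_coef(c) for c in coefs]

pvals = [0.01, 0.04, 0.03, 0.005]
padj = bh_adjust(pvals)

print(log2fcs)
print(padj)
```

The second coefficient shows the effect of clipping: a very large estimate, which typically arises when a feature is absent from one group, is capped at 10 rather than reported as an extreme fold change.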

Examples

Simple condition comparison:

>>> results = dx.tl.de(adata, condition_key="treatment", reference="control",
...                    contrast="treatment[T.drugA]")

Formula-based with covariates:

>>> results = dx.tl.de(adata, formula="~ treatment + batch",
...                    contrast="treatment[T.drugA]")

Continuous covariate:

>>> results = dx.tl.de(adata, formula="~ age + sex", contrast="age")

Binomial for binary ATAC data:

>>> results = dx.tl.de(adata, condition_key="treatment", reference="control",
...                    contrast="treatment[T.drugA]", method="binomial",
...                    layer="binary")
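
The examples above all pass the full patsy coefficient name. As a rough sketch of how the shorthand forms listed under contrast could map onto design-matrix column names, the hypothetical resolver below handles each documented case. It is not delnx's actual resolution logic, and the column names used in the demonstration are assumed for illustration.

```python
def resolve_contrast(contrast, columns, condition_key=None):
    """Map a contrast shorthand to a design-matrix column name (hypothetical)."""
    if contrast is None:
        return columns[-1]        # None: last coefficient
    if isinstance(contrast, int):
        return columns[contrast]  # integer index
    if contrast in columns:
        return contrast           # full patsy name or continuous term

    candidates = []
    if contrast.endswith("]") and "[" in contrast:
        # Bracket shorthand: "treatment[drugA]" -> "treatment[T.drugA]"
        term, level = contrast[:-1].split("[", 1)
        candidates.append(f"{term}[T.{level}]")
    if condition_key is not None:
        # Bare level name: "drugA" -> "treatment[T.drugA]"
        candidates.append(f"{condition_key}[T.{contrast}]")

    for name in candidates:
        if name in columns:
            return name
    raise KeyError(f"could not resolve contrast {contrast!r}; available: {columns}")


# Assumed column names for a "~ treatment + batch" design.
columns = ["Intercept", "treatment[T.drugA]", "treatment[T.drugB]", "batch[T.b2]"]

by_level = resolve_contrast("drugA", columns, condition_key="treatment")
by_bracket = resolve_contrast("treatment[drugA]", columns)
by_full_name = resolve_contrast("treatment[T.drugA]", columns)
by_default = resolve_contrast(None, columns)

print(by_level, by_bracket, by_full_name, by_default)
```

Note that with a multi-covariate formula the last coefficient may belong to a covariate rather than the condition of interest, so naming the contrast explicitly is the safer choice.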

Notes

For count data (RNA-seq), use nb_fit() + nb_test() instead. For fast cluster markers, use rank_de().