deconvatac.tl#

Submodules#

Classes#

Sampler

Class to sample cells and clusters from a given dataset.

Functions#

cell2location(adata_spatial, adata_ref, ...[, ...])

Run Cell2Location

tangram(adata_spatial, adata_ref, labels_key[, ...])

Run Tangram

destvi(adata_spatial, adata_ref[, labels_key, ...])

Run DestVI

jsd(→ float)

Compute Jensen-Shannon divergence on true and predicted cell type proportions.

rmse(→ float)

Compute RMSE on true and predicted cell type proportions.

rctd(adata_spatial, adata_ref, labels_key[, ...])

Run RCTD

conway_maxwell_poisson(lambda_, nu)

Sample from the Conway-Maxwell-Poisson distribution.

generate_spatial_data(→ [anndata.AnnData, muon.MuData])

Generate spatial data.

spatialdwls(adata_spatial, adata_ref, labels_key[, ...])

Run SpatialDWLS

Package Contents#

deconvatac.tl.cell2location(adata_spatial, adata_ref, N_cells_per_location, detection_alpha, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=30000, max_epochs_ref=None, return_adatas=False, plots=True, results_path='./cell2location_results', setup_ref_kwargs={}, train_ref_kwargs={}, setup_spatial_kwargs={}, train_spatial_kwargs={})#

Run Cell2Location

Parameters#

adata_spatial: AnnData

AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.

adata_ref: AnnData

AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.

N_cells_per_location: float

Expected cell number per location.

detection_alpha: float

Regularisation of per-location normalisation.

labels_key: str

Cell type key in adata_ref.obs for label information.

layer_spatial: str

Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.

layer_ref: str

Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.

use_gpu: bool

Whether to use the GPU.

max_epochs_spatial: int

Number of epochs for the spatial mapping model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).

max_epochs_ref: int

Number of epochs for the reference model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).

return_adatas: bool

Whether to return the AnnData objects with the deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).

plots: bool

Whether to plot QC and ELBO plots.

results_path: str

Path to save estimated cell type abundances to.

setup_ref_kwargs: dict

Parameters for cell2location.models.RegressionModel.setup_anndata()

train_ref_kwargs: dict

Parameters for cell2location.models.RegressionModel.train()

setup_spatial_kwargs: dict

Parameters for cell2location.models.Cell2location.setup_anndata()

train_spatial_kwargs: dict

Parameters for cell2location.models.Cell2location.train()

Returns#

  • Saves ‘q05_cell_abundance_w_sf’ and ‘means_cell_abundance_w_sf’ as CSV files to results_path.

  • If return_adatas=True, returns a tuple (adata_spatial, adata_ref) with the deconvolution results saved in them.
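
The default epoch count quoted for max_epochs_spatial and max_epochs_ref when they are None can be evaluated by hand for a given dataset size. The following is a minimal sketch of that expression in plain Python; the helper name default_max_epochs is hypothetical and is not part of deconvatac, and which dataset n_cells refers to is not stated in this page.

```python
def default_max_epochs(n_cells: int) -> int:
    """Evaluate min(round((20000 / n_cells) * 400), 400) for a dataset size.

    Hypothetical helper for illustration only; it mirrors the expression
    quoted in the parameter descriptions, not deconvatac's internals.
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return min(round((20000 / n_cells) * 400), 400)


# Small datasets are capped at 400 epochs; larger ones get fewer epochs.
for n_cells in (5_000, 20_000, 80_000):
    print(n_cells, default_max_epochs(n_cells))
```

Under this expression the epoch count only drops below 400 once n_cells exceeds 20000.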

deconvatac.tl.tangram(adata_spatial, adata_ref, labels_key, run_rank_genes=False, layer_rank_genes=None, num_epochs=1000, device='cpu', return_adatas=False, result_path='./tangram_results', **kwargs)#

Run Tangram

Parameters#

adata_spatial: AnnData

AnnData of the spatial data.

adata_ref: AnnData

AnnData of the reference data.

labels_key: str

Cell type key in adata_ref.obs for label information.

run_rank_genes: bool

If True, runs sc.tl.rank_genes_groups on the reference AnnData, followed by tg.pp_adatas. If False, runs only tg.pp_adatas with all peaks of the input AnnData objects, so both are expected to be already filtered to highly variable features (HVFs).

layer_rank_genes: str

Only used if run_rank_genes is True. Layer to use for sc.tl.rank_genes_groups.

num_epochs: int

Number of epochs.

device: str or torch.device

Which device to use.

return_adatas: bool

Whether to return the AnnData objects with the deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).

results_path: str

Path to save estimated cell type abundances to.

**kwargs:

Parameters for tangram.mapping_utils.map_cells_to_space()

Returns#

  • Saves the estimated proportions as a CSV file to results_path.

  • If return_adatas=True, returns a tuple (adata_spatial, adata_ref) with the deconvolution results saved in them.

deconvatac.tl.destvi(adata_spatial, adata_ref, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=2000, max_epochs_ref=300, return_adatas=False, plots=True, results_path='./destvi_results', model_ref_kwargs={}, train_ref_kwargs={}, model_spatial_kwargs={}, train_spatial_kwargs={})#

Run DestVI

Parameters#

adata_spatial: AnnData

AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.

adata_ref: AnnData

AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.

labels_key: str

Cell type key in adata_ref.obs for label information.

layer_spatial: str

Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.

layer_ref: str

Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.

use_gpu: bool

Whether to use the GPU.

max_epochs_spatial: int

Number of epochs for the stLVM.

max_epochs_ref: int

Number of epochs for the scLVM.

return_adatas: bool

Whether to return the AnnData objects with the deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).

plots: bool

Whether to plot ELBO plots and UMAP of scLVM latent space.

results_path: str

Path to save estimated cell type abundances to.

model_ref_kwargs: dict

Parameters for scvi.model.CondSCVI()

train_ref_kwargs: dict

Parameters for scvi.model.CondSCVI.train()

model_spatial_kwargs: dict

Parameters for scvi.model.DestVI.from_rna_model()

train_spatial_kwargs: dict

Parameters for scvi.model.DestVI.train()

Returns#

  • Saves the estimated proportions as a CSV file to results_path.

  • If return_adatas=True, returns a tuple (adata_spatial, adata_ref) with the deconvolution results saved in them.

deconvatac.tl.jsd(true: pandas.DataFrame | numpy.ndarray, predicted: pandas.DataFrame | numpy.ndarray) → float#

Compute Jensen-Shannon divergence on true and predicted cell type proportions.

Parameters#

true

True cell type proportions.

predicted

Predicted cell type proportions.

Returns#

Jensen-Shannon divergence.
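
To make the metric concrete, here is a minimal standalone sketch of the Jensen-Shannon divergence between proportion vectors, written in plain Python. It is not deconvatac's implementation: the use of the natural logarithm, the per-spot computation followed by a mean over spots, and all function names are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def mean_jsd(true, predicted):
    """Average the per-spot divergence over all spots (rows). Assumed aggregation."""
    per_spot = [jensen_shannon_divergence(t, p) for t, p in zip(true, predicted)]
    return sum(per_spot) / len(per_spot)


# Two spots, two cell types: the first spot is predicted completely wrong,
# the second one exactly right.
true = [[1.0, 0.0], [0.5, 0.5]]
predicted = [[0.0, 1.0], [0.5, 0.5]]
print(mean_jsd(true, predicted))
```

With the natural logarithm the divergence of a single spot lies between 0 (identical proportions) and ln 2 (disjoint cell types).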

deconvatac.tl.rmse(true: pandas.DataFrame | numpy.ndarray, predicted: pandas.DataFrame | numpy.ndarray) → float#

Compute RMSE on true and predicted cell type proportions.

Parameters#

true

True cell type proportions.

predicted

Predicted cell type proportions.

Returns#

Root mean squared error.
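
As an illustration of the metric, the sketch below computes a root mean squared error over all entries of two spots-by-cell-types proportion tables in plain Python. The name rmse_sketch and the choice to pool all entries into one mean are assumptions; the library function may aggregate differently.

```python
import math


def rmse_sketch(true, predicted):
    """Root mean squared error over every (spot, cell type) entry.

    Hypothetical helper: inputs are nested lists with one row per spot.
    """
    squared_errors = [
        (t - p) ** 2
        for true_row, predicted_row in zip(true, predicted)
        for t, p in zip(true_row, predicted_row)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


true = [[0.5, 0.5], [1.0, 0.0]]
predicted = [[0.25, 0.75], [1.0, 0.0]]
print(rmse_sketch(true, predicted))
```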

deconvatac.tl.rctd(adata_spatial, adata_ref, labels_key, doublet_mode='full', r_lib_path=None, results_path='./rctd_results', create_rctd_kwargs={})#

Run RCTD

Parameters#

adata_spatial: AnnData

AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Raw counts are expected in .X.

adata_ref: AnnData

AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Raw counts are expected in .X.

labels_key: str

Cell type key in adata_ref.obs for label information.

doublet_mode: str [“doublet”, “full”]

Mode in which to run RCTD: ‘doublet’ (at most one or two cell types per pixel) or ‘full’ (no restriction on the number of cell types).

r_lib_path: str

Path to R library in which RCTD is installed.

results_path: str

Path to save estimated cell type abundances to.

create_rctd_kwargs: dict

Parameters for create.RCTD().

Returns#

  • Saves the estimated proportions as a CSV file to results_path.

deconvatac.tl.conway_maxwell_poisson(lambda_, nu)#

Sample from the Conway-Maxwell-Poisson distribution.
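
For orientation, the Conway-Maxwell-Poisson distribution has probability mass P(X = k) proportional to lambda^k / (k!)^nu, so nu = 1 recovers the Poisson distribution and nu > 1 gives under-dispersed counts. The sketch below samples from it in plain Python by normalising the mass over a truncated support. It is not the library's code: the function names, the truncation at max_count, and the assumption that lambda_ and nu carry this standard meaning are all illustrative.

```python
import math
import random


def cmp_pmf(lambda_, nu, max_count=200):
    """Truncated Conway-Maxwell-Poisson probabilities for k = 0..max_count."""
    # Work in log space: log weight = k * log(lambda) - nu * log(k!).
    log_weights = [
        k * math.log(lambda_) - nu * math.lgamma(k + 1)
        for k in range(max_count + 1)
    ]
    shift = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def sample_cmp(lambda_, nu, size, rng=None, max_count=200):
    """Draw `size` counts from the truncated distribution."""
    rng = rng or random.Random()
    pmf = cmp_pmf(lambda_, nu, max_count)
    return rng.choices(range(max_count + 1), weights=pmf, k=size)


# With nu = 1 this behaves like a Poisson distribution with rate lambda_.
draws = sample_cmp(3.0, 1.0, size=10, rng=random.Random(0))
print(draws)
```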

class deconvatac.tl.Sampler(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int, n_regions: int, region_type: str = 'stripes', cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, balance: str | None = 'balanced')#

Class to sample cells and clusters from a given dataset.

reference#
cell_type_key#
num_spots#
obs#
n_regions#
region_type = 'stripes'#
init_sample_prob(balance='unbalanced')#

Initialize the sample probabilities based on cell type counts.

Returns#

None

define_regions(used_clusters)#

Define the regions to sample from.

Parameters#

used_clusters: dict

A dictionary containing the clusters to be used in each region.

Returns#

None

Raises#

ValueError

If the region_type parameter is not one of [‘stripes’, ‘circles’, ‘gradient_number’, ‘gradient_celltype’]

stripe_regions()#

Define regions as stripes.

Returns#

None

circle_regions()#

Define regions in circles.

Returns#

None

gradient_number_regions()#

Define regions as a gradient in cell number.

Returns#

None

gradient_celltype_regions(used_clusters)#

Define regions as a gradient in cell type.

Parameters#

used_clusters: list

The clusters to be used in the gradient cell type regions.

Returns#

None

sample_data()#

Sample data from the given dataset.

Returns#

tuple

A tuple of expression and density arrays.

sample_spots(params)#

Sample cells based on given parameters.

Parameters#

params: np.ndarray

Array of cell and cluster counts.

Returns#

tuple

A tuple of expression and density arrays.

Raises#

None

get_coords()#

Get the coordinates for the spots.

Returns#

tuple

A tuple of X and Y coordinates.

deconvatac.tl.generate_spatial_data(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int = 1024, n_regions: int = 5, balance: str | None = None, cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, **kwargs) → [anndata.AnnData, muon.MuData]#

Generate spatial data.

Parameters#

reference: Union[mu.MuData, ad.AnnData]

The reference dataset used for generating spatial data.

cell_type_key: str

The key in the reference dataset that specifies the cell type information.

num_spots: int, optional

The number of spots (locations) to generate, by default 1024.

n_regions: int, optional

The number of spatial regions to generate, by default 5.

balance: str, optional

The balancing method to use for generating spatial data, by default None.

cell_number_mean: Union[int, list], optional

The mean number of cells per spot, by default 6.

cell_number_nu: Union[float, list], optional

The dispersion parameter for the negative binomial distribution used to model cell numbers, by default 20.0.

cell_type_number: Union[int, list], optional

The number of cell types to generate, by default 4.

**kwargs

Additional keyword arguments.

Returns#

Union[ad.AnnData, mu.MuData]

The generated spatial data.

Notes#

This function generates spatial data based on a reference dataset. It uses a sampling approach to generate synthetic spatial data with specified characteristics such as cell type composition, cell numbers, and spatial organization.

The generated spatial data is returned as an AnnData object if the reference dataset is an AnnData object, or as a MuData object if the reference dataset is a MuData object.
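
The sampling idea described in the notes can be illustrated with a toy example: for each spot, pick a few cell types, draw cells of those types from the reference, sum their counts, and record the resulting cell type proportions as ground truth. The sketch below uses plain Python lists instead of AnnData or MuData, a fixed number of cells per spot, and no spatial regions. All names in it are hypothetical and it does not reproduce the Sampler class.

```python
import random
from collections import Counter


def simulate_spots(cells, num_spots, cells_per_spot, types_per_spot, seed=0):
    """Toy spot simulation.

    cells: list of (cell_type, counts) pairs, where counts is a list of
    per-feature counts. Returns one (expression, proportions) pair per spot.
    """
    rng = random.Random(seed)
    by_type = {}
    for cell_type, counts in cells:
        by_type.setdefault(cell_type, []).append(counts)
    all_types = sorted(by_type)
    n_features = len(cells[0][1])

    spots = []
    for _ in range(num_spots):
        # Restrict this spot to a random subset of cell types.
        chosen = rng.sample(all_types, k=min(types_per_spot, len(all_types)))
        expression = [0] * n_features
        composition = Counter()
        for _ in range(cells_per_spot):
            cell_type = rng.choice(chosen)
            counts = rng.choice(by_type[cell_type])
            # Aggregate the sampled cell's counts into the spot profile.
            expression = [e + c for e, c in zip(expression, counts)]
            composition[cell_type] += 1
        proportions = {t: composition[t] / cells_per_spot for t in all_types}
        spots.append((expression, proportions))
    return spots


reference_cells = [
    ("A", [1, 0, 2]),
    ("A", [2, 1, 0]),
    ("B", [0, 3, 1]),
    ("C", [1, 1, 1]),
]
for expression, proportions in simulate_spots(reference_cells, 3, 6, 2):
    print(expression, proportions)
```

The recorded proportions play the role of the true cell type proportions against which deconvolution output can be scored.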

deconvatac.tl.spatialdwls(adata_spatial, adata_ref, labels_key, cluster_key='leiden', n_cell=50, tfidf=True, r_lib_path=None, results_path='./spatialdwls_results')#

Run SpatialDWLS

Parameters#

adata_spatial: AnnData

AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Normalized counts are expected to be saved in .X. Names of observations and features are expected as row names of adata_spatial.obs and adata_spatial.var.

adata_ref: AnnData

AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Normalized counts are expected to be saved in .X. Names of observations and features are expected as row names of adata_ref.obs and adata_ref.var.

labels_key: str

Cell type key in adata_ref.obs for label information.

cluster_key: str

Cluster key in adata_spatial.obs for cluster information.

n_cell: int

Number of cells per spot.

tfidf: bool

Whether the normalized counts are TF-IDF-normalized.

r_lib_path: str

Path to R library.

results_path: str

Path to save estimated cell type abundances to.

Returns#

  • Saves the estimated proportions as a CSV file to results_path.