deconvatac.tl#

Submodules#

Classes#

Sampler

Class to sample cells and clusters from a given dataset.

Functions#

`cell2location`(adata_spatial, adata_ref, ...[, ...])	Run Cell2Location
`tangram`(adata_spatial, adata_ref, labels_key[, ...])	Run Tangram
`destvi`(adata_spatial, adata_ref[, labels_key, ...])	Run DestVI
`jsd`(→ float)	Compute Jensen-Shannon divergence on true and predicted cell type proportions.
`rmse`(→ float)	Compute RMSE on true and predicted cell type proportions.
`rctd`(adata_spatial, adata_ref, labels_key[, ...])	Run RCTD
`conway_maxwell_poisson`(lambda_, nu)	Sample from the Conway-Maxwell-Poisson distribution.
`generate_spatial_data`(→ [anndata.AnnData, muon.MuData])	Generate spatial data.
`spatialdwls`(adata_spatial, adata_ref, labels_key[, ...])	Run SpatialDWLS

Package Contents#

deconvatac.tl.cell2location(adata_spatial, adata_ref, N_cells_per_location, detection_alpha, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=30000, max_epochs_ref=None, return_adatas=False, plots=True, results_path='./cell2location_results', setup_ref_kwargs={}, train_ref_kwargs={}, setup_spatial_kwargs={}, train_spatial_kwargs={})#

Run Cell2Location

Parameters#

adata_spatial: AnnData: AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.
adata_ref: AnnData: AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
N_cells_per_location: float: Expected cell number per location.
detection_alpha: float: Regularisation of per-location normalisation.
labels_key: str: Cell type key in adata_ref.obs for label information.
layer_spatial: str: Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.
layer_ref: str: Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.
use_gpu: bool: Whether to use the GPU.
max_epochs_spatial: int: Number of epochs for the spatial mapping model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).
max_epochs_ref: int: Number of epochs for the reference model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).
return_adatas: bool: Whether to return the AnnData objects with deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).
plots: bool: Whether to plot QC and ELBO plots.
results_path: str: Path to save estimated cell type abundances to.
setup_ref_kwargs: dict: Parameters for cell2location.models.RegressionModel.setup_anndata()
train_ref_kwargs: dict: Parameters for cell2location.models.RegressionModel.train()
setup_spatial_kwargs: dict: Parameters for cell2location.models.Cell2location.setup_anndata()
train_spatial_kwargs: dict: Parameters for cell2location.models.Cell2location.train()
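
The fallback given above for max_epochs_spatial and max_epochs_ref can be worked through in plain Python. This is only a sketch of the arithmetic: `default_max_epochs` is a hypothetical helper name that does not exist in deconvatac, and it assumes that n_cells means the number of observations in the AnnData being trained on.

```python
def default_max_epochs(n_cells: int) -> int:
    """Heuristic from the parameter description:
    min(round((20000 / n_cells) * 400), 400).
    Hypothetical helper, not part of deconvatac."""
    return min(round((20000 / n_cells) * 400), 400)


# Small datasets are capped at 400 epochs; larger ones train for fewer epochs.
for n_cells in (5_000, 20_000, 100_000):
    print(n_cells, default_max_epochs(n_cells))
```

Under this formula the epoch count only drops below 400 once the dataset has more than 20000 observations.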

Returns#

Saves ‘q05_cell_abundance_w_sf’ and ‘means_cell_abundance_w_sf’ as CSV files to results_path.
If return_adatas=True, returns a tuple (adata_spatial, adata_ref) containing the deconvolution results.

deconvatac.tl.tangram(adata_spatial, adata_ref, labels_key, run_rank_genes=False, layer_rank_genes=None, num_epochs=1000, device='cpu', return_adatas=False, result_path='./tangram_results', **kwargs)#

Run Tangram

Parameters#

adata_spatial: AnnData: AnnData of the spatial data.
adata_ref: AnnData: AnnData of the reference data.
labels_key: str: Cell type key in adata_ref.obs for label information.
run_rank_genes: bool: If True, runs sc.tl.rank_genes_groups on the reference AnnData, followed by tg.pp_adatas. If False, only runs tg.pp_adatas with all peaks of the input AnnData objects, which are therefore expected to be already filtered to highly variable features (HVFs).
layer_rank_genes: str: Only used if run_rank_genes is True. Layer to use for sc.tl.rank_genes_groups.
num_epochs: int: Number of epochs.
device: str or torch.device: Which device to use.
return_adatas: bool: Whether to return the AnnData objects with deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).
results_path: str: Path to save estimated cell type abundances to.
**kwargs: Parameters for tangram.mapping_utils.map_cells_to_space()

Returns#

Saves estimated proportions as a CSV file to results_path.
If return_adatas=True, returns a tuple (adata_spatial, adata_ref) containing the deconvolution results.

deconvatac.tl.destvi(adata_spatial, adata_ref, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=2000, max_epochs_ref=300, return_adatas=False, plots=True, results_path='./destvi_results', model_ref_kwargs={}, train_ref_kwargs={}, model_spatial_kwargs={}, train_spatial_kwargs={})#

Run DestVI

Parameters#

adata_spatial: AnnData: AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.
adata_ref: AnnData: AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
labels_key: str: Cell type key in adata_ref.obs for label information.
layer_spatial: str: Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.
layer_ref: str: Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.
use_gpu: bool: Whether to use the GPU.
max_epochs_spatial: int: Number of epochs for the stLVM.
max_epochs_ref: int: Number of epochs for the scLVM.
return_adatas: bool: Whether to return the AnnData objects with deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).
plots: bool: Whether to plot ELBO plots and UMAP of scLVM latent space.
results_path: str: Path to save estimated cell type abundances to.
model_ref_kwargs: dict: Parameters for scvi.model.CondSCVI()
train_ref_kwargs: dict: Parameters for scvi.model.CondSCVI.train()
model_spatial_kwargs: dict: Parameters for scvi.model.DestVI.from_rna_model()
train_spatial_kwargs: dict: Parameters for scvi.model.DestVI.train()

Returns#

Saves estimated proportions as a CSV file to results_path.
If return_adatas=True, returns a tuple (adata_spatial, adata_ref) containing the deconvolution results.

deconvatac.tl.jsd(true: pandas.DataFrame | numpy.ndarray, predicted: pandas.DataFrame | numpy.ndarray) → float#

Compute Jensen-Shannon divergence on true and predicted cell type proportions.

Parameters#

true: True cell type proportions.
predicted: Predicted cell type proportions.

Returns#

Jensen-Shannon divergence.
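
For orientation, the quantity can be sketched in plain Python. This is not the deconvatac implementation: the function names are hypothetical, and the choice of base-2 logarithms and of averaging the per-spot divergences into one float are assumptions. Some libraries report the Jensen-Shannon distance (the square root of the divergence) instead. Each row is taken to be one spot's proportions, summing to 1.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd_one_spot(p, q):
    """Jensen-Shannon divergence between two proportion vectors."""
    # m is the midpoint distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def mean_jsd(true, predicted):
    """Average the per-spot divergences (the aggregation is an assumption)."""
    per_spot = [jsd_one_spot(t, p) for t, p in zip(true, predicted)]
    return sum(per_spot) / len(per_spot)


true = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]]
predicted = [[0.4, 0.4, 0.2], [0.8, 0.1, 0.1]]
print(mean_jsd(true, predicted))
```

With base-2 logarithms the divergence is bounded between 0 (identical proportions) and 1 (no shared cell types).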

deconvatac.tl.rmse(true: pandas.DataFrame | numpy.ndarray, predicted: pandas.DataFrame | numpy.ndarray) → float#

Compute RMSE on true and predicted cell type proportions.

Parameters#

true: True cell type proportions.
predicted: Predicted cell type proportions.

Returns#

Root mean squared error.
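
A minimal plain-Python sketch of the metric follows. It assumes the error is pooled over every spot and cell type entry before taking the square root; whether deconvatac pools this way or averages per spot is not stated here, and `rmse_sketch` is a hypothetical name.

```python
import math


def rmse_sketch(true, predicted):
    """Root mean squared error over all (spot, cell type) entries.
    Hypothetical helper; pooling over all entries is an assumption."""
    squared_errors = [
        (t - p) ** 2
        for true_row, pred_row in zip(true, predicted)
        for t, p in zip(true_row, pred_row)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


true = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]]
predicted = [[0.4, 0.4, 0.2], [0.8, 0.1, 0.1]]
print(rmse_sketch(true, predicted))
```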

deconvatac.tl.rctd(adata_spatial, adata_ref, labels_key, doublet_mode='full', r_lib_path=None, results_path='./rctd_results', create_rctd_kwargs={})#

Run RCTD

Parameters#

adata_spatial: AnnData

AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Raw counts are expected in .X.

adata_ref: AnnData

AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Raw counts are expected in .X.

labels_key: str

Cell type key in adata_ref.obs for label information.

doublet_mode: str [“doublet”, “full”]

Mode in which to run RCTD: ‘doublet’ (at most 1-2 cell types per pixel) or ‘full’ (no restrictions on the number of cell types).

r_lib_path: str

Path to R library in which RCTD is installed.

results_path: str

Path to save estimated cell type abundances to.

create_rctd_kwargs: dict

Parameters for create.RCTD().

Returns#

Saves estimated proportions as a CSV file to results_path.

deconvatac.tl.conway_maxwell_poisson(lambda_, nu)#

Sample from the Conway-Maxwell-Poisson distribution.
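
As background, the Conway-Maxwell-Poisson distribution has probability mass proportional to lambda^k / (k!)^nu, so nu = 1 gives the ordinary Poisson distribution and nu > 1 gives less spread. The sketch below samples from it by inverse-CDF lookup over a truncated support. It is an illustration only: `cmp_pmf`, `sample_cmp`, and the `max_k` truncation are hypothetical, and how deconvatac parameterizes lambda_ (for example relative to cell_number_mean) is not stated in this reference.

```python
import math
import random


def cmp_pmf(lambda_, nu, max_k=200):
    """Truncated PMF with P(X = k) proportional to lambda_**k / (k!)**nu.
    Requires lambda_ > 0; max_k is an assumed truncation point."""
    # Work in log space so large factorials do not overflow.
    log_w = [k * math.log(lambda_) - nu * math.lgamma(k + 1) for k in range(max_k + 1)]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def sample_cmp(lambda_, nu, rng, max_k=200):
    """Draw one value by walking the cumulative distribution."""
    u = rng.random()
    cumulative = 0.0
    for k, p in enumerate(cmp_pmf(lambda_, nu, max_k)):
        cumulative += p
        if u <= cumulative:
            return k
    return max_k  # only reached through floating-point rounding


rng = random.Random(0)
print([sample_cmp(6.0, 1.0, rng) for _ in range(10)])
```

The truncation is harmless when the mass beyond max_k is negligible, which holds for moderate lambda_ and nu >= 1; for nu well below 1 a larger max_k would be needed.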

class deconvatac.tl.Sampler(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int, n_regions: int, region_type: str = 'stripes', cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, balance: str | None = 'balanced')#

Class to sample cells and clusters from a given dataset.

reference#

cell_type_key#

num_spots#

obs#

n_regions#

region_type = 'stripes'#

init_sample_prob(balance='unbalanced')#

Initialize the sample probabilities based on cell type counts.

Returns#

None

define_regions(used_clusters)#

Define the regions to sample from.

Parameters#

used_clusters: dict: A dictionary containing the clusters to be used in each region.

Returns#

None

Raises#

ValueError: If the region_type parameter is not one of [‘stripes’, ‘circles’, ‘gradient_number’, ‘gradient_celltype’]

stripe_regions()#

Define regions as stripes.

Returns#

None

circle_regions()#

Define regions as circles.

Returns#

None

gradient_number_regions()#

Define regions as a gradient.

Returns#

None

gradient_celltype_regions(used_clusters)#

Define regions as a gradient.

Parameters#

used_clusters: list: The clusters to be used in the gradient cell type regions.

Returns#

None

sample_data()#

Sample data from the given dataset.

Returns#

tuple: A tuple of expression and density arrays.

sample_spots(params)#

Sample cells based on given parameters.

Parameters#

params: np.ndarray: Array of cell and cluster counts.

Returns#

tuple: A tuple of expression and density arrays.

Raises#

None

get_coords()#

Get the coordinates for the spots.

Returns#

tuple: A tuple of X and Y coordinates.

deconvatac.tl.generate_spatial_data(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int = 1024, n_regions: int = 5, balance: str | None = None, cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, **kwargs) → [anndata.AnnData, muon.MuData]#

Generate spatial data.

Parameters#

reference: Union[mu.MuData, ad.AnnData]: The reference dataset used for generating spatial data.
cell_type_key: str: The key in the reference dataset that specifies the cell type information.
num_spots: int, optional: The number of spots (locations) to generate, by default 1024.
n_regions: int, optional: The number of spatial regions to generate, by default 5.
balance: str, optional: The balancing method to use for generating spatial data, by default None.
cell_number_mean: Union[int, list], optional: The mean number of cells per spot, by default 6.
cell_number_nu: Union[float, list], optional: The dispersion parameter for the negative binomial distribution used to model cell numbers, by default 20.0.
cell_type_number: Union[int, list], optional: The number of cell types to generate, by default 4.
**kwargs: Additional keyword arguments.

Returns#

Union[ad.AnnData, mu.MuData]: The generated spatial data.

Notes#

This function generates spatial data based on a reference dataset. It uses a sampling approach to generate synthetic spatial data with specified characteristics such as cell type composition, cell numbers, and spatial organization.

The generated spatial data is returned as an AnnData object if the reference dataset is an AnnData object, or as a MuData object if the reference dataset is a MuData object.
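
To give a feel for what a sampling approach of this kind involves, the toy sketch below builds one synthetic spot by drawing cells from a labelled reference and summing their feature counts, while recording the true cell type proportions. It is a simplified illustration and not the deconvatac algorithm: `simulate_spot`, the dictionary-based reference, and the uniform choice of cell types are all assumptions, and region structure and balancing are ignored.

```python
import random
from collections import Counter


def simulate_spot(cells_by_type, n_cells, n_types, rng):
    """Toy spot simulation (hypothetical helper, not part of deconvatac).

    cells_by_type maps a cell type name to a list of per-cell count vectors.
    Returns the summed count profile and the true cell type proportions.
    """
    # Restrict the spot to a random subset of cell types.
    allowed = rng.sample(sorted(cells_by_type), k=min(n_types, len(cells_by_type)))
    # Assign a cell type to each of the n_cells cells in the spot.
    picked_types = [rng.choice(allowed) for _ in range(n_cells)]

    n_features = len(next(iter(cells_by_type.values()))[0])
    profile = [0] * n_features
    for cell_type in picked_types:
        cell = rng.choice(cells_by_type[cell_type])  # one reference cell of that type
        profile = [a + b for a, b in zip(profile, cell)]

    counts = Counter(picked_types)
    proportions = {t: c / n_cells for t, c in counts.items()}
    return profile, proportions


reference = {"A": [[1, 0], [2, 0]], "B": [[0, 1]], "C": [[1, 1]]}
profile, proportions = simulate_spot(reference, n_cells=6, n_types=2, rng=random.Random(0))
print(profile, proportions)
```

The recorded proportions are what metrics such as jsd and rmse would later compare against a method's predictions.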

deconvatac.tl.spatialdwls(adata_spatial, adata_ref, labels_key, cluster_key='leiden', n_cell=50, tfidf=True, r_lib_path=None, results_path='./spatialdwls_results')#

Run SpatialDWLS

Parameters#

adata_spatial: AnnData

AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Normalized counts are expected to be saved in .X. Names of observations and features are expected as row names of adata_spatial.obs and adata_spatial.var.

adata_ref: AnnData

AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Normalized counts are expected to be saved in .X. Names of observations and features are expected as row names of adata_ref.obs and adata_ref.var.

labels_key: str

Cell type key in adata_ref.obs for label information.

cluster_key: str

Cluster key in adata_spatial.obs for cluster information.

n_cell: int

Number of cells per spot.

tfidf: bool

Whether the normalized counts are TF-IDF-normalized.

r_lib_path: str

Path to R library.

results_path: str

Path to save estimated cell type abundances to.

Returns#

Saves estimated proportions as a CSV file to results_path.