deconvatac.tl#
Submodules#
Classes#
- Sampler: Class to sample cells and clusters from a given dataset.
Functions#
- cell2location: Run Cell2Location
- tangram: Run Tangram
- destvi: Run DestVI
- jsd: Compute Jensen-Shannon divergence on true and predicted cell type proportions.
- rmse: Compute RMSE on true and predicted cell type proportions.
- rctd: Run RCTD
- conway_maxwell_poisson: Sample from the Conway-Maxwell-Poisson distribution.
- generate_spatial_data: Generate spatial data.
- spatialdwls: Run SpatialDWLS
Package Contents#
- deconvatac.tl.cell2location(adata_spatial, adata_ref, N_cells_per_location, detection_alpha, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=30000, max_epochs_ref=None, return_adatas=False, plots=True, results_path='./cell2location_results', setup_ref_kwargs={}, train_ref_kwargs={}, setup_spatial_kwargs={}, train_spatial_kwargs={})#
Run Cell2Location
Parameters#
- adata_spatial: AnnData
AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.
- adata_ref: AnnData
AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
- N_cells_per_location: float
Expected cell number per location.
- detection_alpha: float
Regularisation of per-location normalisation.
- labels_key: str
Cell type key in adata_ref.obs for label information
- layer_spatial: str
Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.
- layer_ref: str
Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.
- use_gpu: bool
Whether to use the GPU.
- max_epochs_spatial: int
Number of epochs for the spatial mapping model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).
- max_epochs_ref: int
Number of epochs for the reference model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).
- return_adatas: bool
Whether to return the AnnData objects with deconvolution results, as a tuple (adata_spatial, adata_ref).
- plots: bool
Whether to plot QC and ELBO plots.
- results_path: str
Path to save estimated cell type abundances to.
- setup_ref_kwargs: dict
Parameters for cell2location.models.RegressionModel.setup_anndata()
- train_ref_kwargs: dict
Parameters for cell2location.models.RegressionModel.train()
- setup_spatial_kwargs: dict
Parameters for cell2location.models.Cell2location.setup_anndata()
- train_spatial_kwargs: dict
Parameters for cell2location.models.Cell2location.train()
Returns#
Saves ‘q05_cell_abundance_w_sf’ and ‘means_cell_abundance_w_sf’ as CSV files to results_path.
If return_adatas=True, returns the tuple (adata_spatial, adata_ref) containing the deconvolution results.
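The epoch default quoted for max_epochs_spatial and max_epochs_ref, np.min([round((20000 / n_cells) * 400), 400]), applies only when None is passed; per the signature, max_epochs_spatial otherwise defaults to 30000. The sketch below evaluates that formula in plain Python to show how it scales with dataset size. The helper name default_max_epochs is hypothetical and not part of deconvatac.

```python
def default_max_epochs(n_cells: int) -> int:
    """Evaluate the documented heuristic min(round(20000 / n_cells * 400), 400).

    Hypothetical helper for illustration only; it mirrors the formula quoted
    in the parameter descriptions and is not a deconvatac function.
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return min(round((20000 / n_cells) * 400), 400)


# Small datasets hit the cap of 400 epochs; larger ones get fewer epochs.
for n_cells in (5_000, 20_000, 100_000):
    print(n_cells, default_max_epochs(n_cells))
```

Under this formula the epoch count stays at the cap of 400 up to 20,000 cells and shrinks in inverse proportion beyond that.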
- deconvatac.tl.tangram(adata_spatial, adata_ref, labels_key, run_rank_genes=False, layer_rank_genes=None, num_epochs=1000, device='cpu', return_adatas=False, result_path='./tangram_results', **kwargs)#
Run Tangram
Parameters#
- adata_spatial: AnnData
AnnData of the spatial data.
- adata_ref: AnnData
AnnData of the reference data.
- labels_key: str
Cell type key in adata_ref.obs for label information
- run_rank_genes: bool
If True, runs sc.tl.rank_genes_groups on the reference AnnData, followed by tg.pp_adatas. If False, only runs tg.pp_adatas with all peaks of the input AnnData objects, so both are expected to be already filtered by highly variable features (HVFs).
- layer_rank_genes: str
Layer to use for sc.tl.rank_genes_groups. Only used if run_rank_genes is True.
- num_epochs: int
Number of epochs.
- device: string or torch.device
Which device to use.
- return_adatas: bool
Whether to return the AnnData objects with deconvolution results, as a tuple (adata_spatial, adata_ref).
- result_path: str
Path to save estimated cell type abundances to.
- **kwargs:
Parameters for tangram.mapping_utils.map_cells_to_space()
Returns#
Saves estimated proportions as a CSV file to result_path.
If return_adatas=True, returns the tuple (adata_spatial, adata_ref) containing the deconvolution results.
- deconvatac.tl.destvi(adata_spatial, adata_ref, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=2000, max_epochs_ref=300, return_adatas=False, plots=True, results_path='./destvi_results', model_ref_kwargs={}, train_ref_kwargs={}, model_spatial_kwargs={}, train_spatial_kwargs={})#
Run DestVI
Parameters#
- adata_spatial: AnnData
AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.
- adata_ref: AnnData
AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
- labels_key: str
Cell type key in adata_ref.obs for label information
- layer_spatial: str
Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.
- layer_ref: str
Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.
- use_gpu: bool
Whether to use the GPU.
- max_epochs_spatial: int
Number of epochs for the stLVM.
- max_epochs_ref: int
Number of epochs for the scLVM.
- return_adatas: bool
Whether to return the AnnData objects with deconvolution results, as a tuple (adata_spatial, adata_ref).
- plots: bool
Whether to plot ELBO plots and UMAP of scLVM latent space.
- results_path: str
Path to save estimated cell type abundances to.
- model_ref_kwargs: dict
Parameters for scvi.model.CondSCVI()
- train_ref_kwargs: dict
Parameters for scvi.model.CondSCVI.train()
- model_spatial_kwargs: dict
Parameters for scvi.model.DestVI.from_rna_model()
- train_spatial_kwargs: dict
Parameters for scvi.model.DestVI.train()
Returns#
Saves estimated proportions as a CSV file to results_path.
If return_adatas=True, returns the tuple (adata_spatial, adata_ref) containing the deconvolution results.
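Several of the wrappers on this page require adata_spatial and adata_ref to share the same feature space. The sketch below shows one way to verify this before calling a deconvolution method. It works on plain sequences of feature names; with AnnData objects one would pass adata_spatial.var_names and adata_ref.var_names. The helper check_shared_features is hypothetical, and whether deconvatac also requires identical feature order is an assumption, so the order check is reported rather than enforced.

```python
def check_shared_features(spatial_features, ref_features):
    """Raise if the two feature sets differ; return True if the order also matches.

    Hypothetical pre-flight check, not part of deconvatac.
    """
    spatial_features = list(spatial_features)
    ref_features = list(ref_features)

    only_spatial = sorted(set(spatial_features) - set(ref_features))
    only_ref = sorted(set(ref_features) - set(spatial_features))
    if only_spatial or only_ref:
        raise ValueError(
            f"Feature spaces differ: {len(only_spatial)} feature(s) only in the "
            f"spatial data, {len(only_ref)} only in the reference data."
        )
    # Same set of features; report whether they are also in the same order.
    return spatial_features == ref_features


# Toy peak names standing in for adata.var_names.
peaks_spatial = ["chr1:100-600", "chr1:900-1400", "chr2:50-550"]
peaks_ref = ["chr1:100-600", "chr1:900-1400", "chr2:50-550"]
print(check_shared_features(peaks_spatial, peaks_ref))
```

If the order differs, reindexing one object by the other's feature names before running the method is a common remedy.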
- deconvatac.tl.jsd(true: pandas.DataFrame | numpy.ndarray, predicted: pandas.DataFrame | numpy.ndarray) → float#
Compute Jensen-Shannon divergence on true and predicted cell type proportions.
Parameters#
- true
True cell type proportions.
- predicted
Predicted cell type proportions.
Returns#
Jensen-Shannon divergence.
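To make the metric concrete, the following is a minimal pure-Python sketch of the Jensen-Shannon divergence between proportion vectors, averaged over spots. It is not deconvatac's implementation: the use of the natural logarithm, the averaging over rows, and returning the divergence rather than its square root (the Jensen-Shannon distance) are all assumptions, and the names jsd_row and mean_jsd are hypothetical.

```python
import math


def jsd_row(p, q):
    """Jensen-Shannon divergence (natural log) between two proportion vectors.

    Both vectors are renormalised to sum to 1; each must have a positive sum.
    """
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a == 0 contribute nothing.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_jsd(true, predicted):
    """Average the per-spot divergence over all rows (an assumed aggregation)."""
    return sum(jsd_row(t, p) for t, p in zip(true, predicted)) / len(true)


true = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]]
predicted = [[0.4, 0.4, 0.2], [0.8, 0.1, 0.1]]
print(mean_jsd(true, predicted))
```

With the natural logarithm the value lies between 0 (identical proportions) and ln 2 (disjoint support).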
- deconvatac.tl.rmse(true: pandas.DataFrame | numpy.ndarray, predicted: pandas.DataFrame | numpy.ndarray) → float#
Compute RMSE on true and predicted cell type proportions.
Parameters#
- true
True cell type proportions.
- predicted
Predicted cell type proportions.
Returns#
Root mean squared error.
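As an illustration of the metric, here is a minimal pure-Python sketch of the root mean squared error over a spots-by-cell-types table of proportions. Averaging over all entries at once is an assumption; deconvatac may aggregate per spot or per cell type instead. The name rmse_sketch is hypothetical.

```python
import math


def rmse_sketch(true, predicted):
    """Root mean squared error over all entries of two equally shaped tables."""
    if len(true) != len(predicted):
        raise ValueError("true and predicted must have the same number of rows")
    squared_errors = []
    for row_true, row_pred in zip(true, predicted):
        if len(row_true) != len(row_pred):
            raise ValueError("rows must have the same number of cell types")
        squared_errors.extend((t - p) ** 2 for t, p in zip(row_true, row_pred))
    return math.sqrt(sum(squared_errors) / len(squared_errors))


true = [[0.5, 0.5], [1.0, 0.0]]
predicted = [[0.5, 0.5], [0.0, 1.0]]
print(rmse_sketch(true, predicted))
```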
- deconvatac.tl.rctd(adata_spatial, adata_ref, labels_key, doublet_mode='full', r_lib_path=None, results_path='./rctd_results', create_rctd_kwargs={})#
Run RCTD
Parameters#
- adata_spatial: AnnData
AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Raw counts are expected in .X.
- adata_ref: AnnData
AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Raw counts are expected in .X.
- labels_key: str
Cell type key in adata_ref.obs for label information
- doublet_mode: str [“doublet”, “full”]
Mode in which to run RCTD: ‘doublet’ (at most 1-2 cell types per pixel) or ‘full’ (no restrictions on number of cell types).
- r_lib_path: str
Path to R library in which RCTD is installed.
- results_path: str
Path to save estimated cell type abundances to.
- create_rctd_kwargs: dict
Parameters for create.RCTD().
Returns#
Saves estimated proportions as a CSV file to results_path.
- deconvatac.tl.conway_maxwell_poisson(lambda_, nu)#
Sample from the Conway-Maxwell-Poisson distribution.
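For orientation, the Conway-Maxwell-Poisson distribution has probability mass P(X = k) proportional to lambda^k / (k!)^nu. With nu = 1 it reduces to the Poisson distribution, and nu > 1 gives counts that are less dispersed than Poisson. The sketch below samples from it by normalising the mass over a truncated support. It is an illustrative stand-in, not deconvatac's implementation; the names cmp_pmf and sample_cmp and the truncation bound max_k are assumptions.

```python
import math
import random


def cmp_pmf(lambda_, nu, max_k=200):
    """Probability mass of the CMP distribution, truncated to 0..max_k.

    Weights are computed in log space to avoid overflow in k!.
    """
    if lambda_ <= 0 or nu < 0:
        raise ValueError("lambda_ must be > 0 and nu must be >= 0")
    log_w = [k * math.log(lambda_) - nu * math.lgamma(k + 1) for k in range(max_k + 1)]
    top = max(log_w)
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def sample_cmp(lambda_, nu, size=1, rng=None, max_k=200):
    """Draw `size` samples by weighted choice over the truncated support."""
    rng = rng or random.Random()
    pmf = cmp_pmf(lambda_, nu, max_k)
    return rng.choices(range(max_k + 1), weights=pmf, k=size)


rng = random.Random(0)
print(sample_cmp(6.0, 1.0, size=10, rng=rng))   # Poisson-like counts
print(sample_cmp(6.0, 20.0, size=10, rng=rng))  # strongly under-dispersed counts
```

The truncation is adequate when the mass beyond max_k is negligible, which holds for moderate lambda_ and nu >= 1; small nu with large lambda_ would need a larger bound.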
- class deconvatac.tl.Sampler(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int, n_regions: int, region_type: str = 'stripes', cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, balance: str | None = 'balanced')#
Class to sample cells and clusters from a given dataset.
- reference#
- cell_type_key#
- num_spots#
- obs#
- n_regions#
- region_type = 'stripes'#
- init_sample_prob(balance='unbalanced')#
Initialize the sample probabilities based on cell type counts.
Returns#
None
- define_regions(used_clusters)#
Define the regions to sample from.
Parameters#
- used_clusters: dict
A dictionary containing the clusters to be used in each region.
Returns#
None
Raises#
- ValueError
If the region_type parameter is not one of [‘stripes’, ‘circles’, ‘gradient_number’, ‘gradient_celltype’]
- gradient_celltype_regions(used_clusters)#
Define regions as a gradient.
Parameters#
- used_clusters: list
The clusters to be used in the gradient cell type regions.
Returns#
None
- sample_data()#
Sample data from the given dataset.
Returns#
- tuple
A tuple of expression and density arrays.
- deconvatac.tl.generate_spatial_data(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int = 1024, n_regions: int = 5, balance: str | None = None, cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, **kwargs) → [anndata.AnnData, muon.MuData]#
Generate spatial data.
Parameters#
- reference: Union[mu.MuData, ad.AnnData]
The reference dataset used for generating spatial data.
- cell_type_key: str
The key in the reference dataset that specifies the cell type information.
- num_spots: int, optional
The number of spots (locations) to generate, by default 1024.
- n_regions: int, optional
The number of spatial regions to generate, by default 5.
- balance: str, optional
The balancing method to use for generating spatial data, by default None.
- cell_number_mean: Union[int, list], optional
The mean number of cells per spot, by default 6.
- cell_number_nu: Union[float, list], optional
The dispersion parameter for the negative binomial distribution used to model cell numbers, by default 20.0.
- cell_type_number: Union[int, list], optional
The number of cell types to generate, by default 4.
- **kwargs
Additional keyword arguments.
Returns#
- Union[ad.AnnData, mu.MuData]
The generated spatial data.
Notes#
This function generates spatial data based on a reference dataset. It uses a sampling approach to generate synthetic spatial data with specified characteristics such as cell type composition, cell numbers, and spatial organization.
The generated spatial data is returned as an AnnData object if the reference dataset is an AnnData object, or as a MuData object if the reference dataset is a MuData object.
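A call to generate_spatial_data might be wrapped as in the sketch below. The defaults are copied from the documented signature; the helper names simulation_kwargs and simulate_spots are hypothetical, and the import is deferred so the snippet can be read and loaded without deconvatac installed. Actually running simulate_spots requires deconvatac and a reference AnnData or MuData object with a cell type column.

```python
# Defaults copied from the documented signature of generate_spatial_data.
DOCUMENTED_DEFAULTS = {
    "num_spots": 1024,
    "n_regions": 5,
    "balance": None,
    "cell_number_mean": 6,
    "cell_number_nu": 20.0,
    "cell_type_number": 4,
}


def simulation_kwargs(**overrides):
    """Merge user overrides into the documented defaults (hypothetical helper)."""
    return {**DOCUMENTED_DEFAULTS, **overrides}


def simulate_spots(reference, cell_type_key, **overrides):
    """Call deconvatac.tl.generate_spatial_data with merged keyword arguments.

    Hypothetical wrapper; the import is deferred until the function is called.
    """
    from deconvatac.tl import generate_spatial_data

    return generate_spatial_data(reference, cell_type_key, **simulation_kwargs(**overrides))


# Example: fewer spots and more cells per spot than the defaults.
print(simulation_kwargs(num_spots=256, cell_number_mean=10))
```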
- deconvatac.tl.spatialdwls(adata_spatial, adata_ref, labels_key, cluster_key='leiden', n_cell=50, tfidf=True, r_lib_path=None, results_path='./spatialdwls_results')#
Run SpatialDWLS
Parameters#
- adata_spatial: AnnData
AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Normalized counts are expected to be saved in .X. Names of observations and features are expected as row names of adata_spatial.obs and adata_spatial.var.
- adata_ref: AnnData
AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Normalized counts are expected to be saved in .X. Names of observations and features are expected as row names of adata_ref.obs and adata_ref.var.
- labels_key: str
Cell type key in adata_ref.obs for label information.
- cluster_key: str
Cluster key in adata_spatial.obs for cluster information.
- n_cell: int
Number of cells per spot.
- tfidf: bool
Whether the normalized counts are TFIDF-normalized.
- r_lib_path: str
Path to R library.
- results_path: str
Path to save estimated cell type abundances to.
Returns#
Saves estimated proportions as a CSV file to results_path.