deconvatac.tl
=============

.. py:module:: deconvatac.tl


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/deconvatac/tl/cell2location/index
   /autoapi/deconvatac/tl/destvi/index
   /autoapi/deconvatac/tl/metrics/index
   /autoapi/deconvatac/tl/rctd/index
   /autoapi/deconvatac/tl/simulate/index
   /autoapi/deconvatac/tl/spatialdwls/index
   /autoapi/deconvatac/tl/tangram/index


Classes
-------

.. autoapisummary::

   deconvatac.tl.Sampler


Functions
---------

.. autoapisummary::

   deconvatac.tl.cell2location
   deconvatac.tl.tangram
   deconvatac.tl.destvi
   deconvatac.tl.jsd
   deconvatac.tl.rmse
   deconvatac.tl.rctd
   deconvatac.tl.conway_maxwell_poisson
   deconvatac.tl.generate_spatial_data
   deconvatac.tl.spatialdwls


Package Contents
----------------

.. py:function:: cell2location(adata_spatial, adata_ref, N_cells_per_location, detection_alpha, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=30000, max_epochs_ref=None, return_adatas=False, plots=True, results_path='./cell2location_results', setup_ref_kwargs={}, train_ref_kwargs={}, setup_spatial_kwargs={}, train_spatial_kwargs={})

   Run Cell2Location

   Parameters
   -----------

   adata_spatial : AnnData
       AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.
   adata_ref : AnnData
       AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
   N_cells_per_location : float
       Expected cell number per location.
   detection_alpha : float
       Regularisation of per-location normalisation.
   labels_key : str
       Cell type key in adata_ref.obs for label information
   layer_spatial : str
       Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.
   layer_ref : str
       Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.
   use_gpu : bool
       Whether to use the GPU.
   max_epochs_spatial: int
       Number of epochs for the spatial mapping model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).
   max_epochs_ref: int
       Number of epochs for the reference model. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).
   return_adatas: bool
       Whether to return the AnnData objects with deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).
   plots: bool
       Whether to plot QC and ELBO plots.
   results_path: str
       Path to save estimated cell type abundances to.
   setup_ref_kwargs: dict
       Parameters for cell2location.models.RegressionModel.setup_anndata()
   train_ref_kwargs: dict
       Parameters for cell2location.models.RegressionModel.train()
   setup_spatial_kwargs: dict
       Parameters for cell2location.models.Cell2location.setup_anndata()
   train_spatial_kwargs: dict
       Parameters for cell2location.models.Cell2location.train()


   Returns
   --------

   - Saves 'q05_cell_abundance_w_sf' and 'means_cell_abundance_w_sf' as CSV files to results_path.
   - If return_adatas=True, returns a tuple (adata_spatial, adata_ref) with saved deconvolution results.
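
   The epoch heuristic quoted in the parameter descriptions above can be sketched as a small helper. This is only an illustration of the documented formula; the name ``default_max_epochs`` is hypothetical and not part of deconvatac, and how the package applies the heuristic internally is not shown here.

   .. code-block:: python

      import numpy as np


      def default_max_epochs(n_cells: int) -> int:
          """Documented heuristic: min(round((20000 / n_cells) * 400), 400)."""
          if n_cells <= 0:
              raise ValueError("n_cells must be positive")
          # Small datasets are capped at 400 epochs; larger ones get fewer.
          return int(np.min([round((20000 / n_cells) * 400), 400]))


      print(default_max_epochs(5_000))    # capped at the upper limit
      print(default_max_epochs(100_000))  # fewer epochs for a larger dataset

   Under this formula, any dataset with at most 20000 observations trains for 400 epochs, and the epoch count shrinks in inverse proportion to dataset size beyond that.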


.. py:function:: tangram(adata_spatial, adata_ref, labels_key, run_rank_genes=False, layer_rank_genes=None, num_epochs=1000, device='cpu', return_adatas=False, result_path='./tangram_results', **kwargs)

   Run Tangram

   Parameters
   -----------

   adata_spatial : AnnData
       AnnData of the spatial data.
   adata_ref : AnnData
       AnnData of the reference data.
   labels_key : str
       Cell type key in adata_ref.obs for label information
   run_rank_genes: bool
       If True, runs sc.tl.rank_genes_groups on the reference AnnData, followed by tg.pp_adatas.
       If False, only runs tg.pp_adatas with all peaks of the input AnnDatas, so both inputs are expected to be already filtered by highly variable features (HVFs).
   layer_rank_genes: str
       Layer to use for sc.tl.rank_genes_groups. Only used if run_rank_genes is True.
   num_epochs: int
       Number of epochs.
   device : string or torch.device
       Which device to use.
   return_adatas: bool
       Whether to return the AnnData objects with deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).
   result_path: str
       Path to save estimated cell type abundances to.
   **kwargs:
       Parameters for tangram.mapping_utils.map_cells_to_space()

   Returns
   --------

   - Saves estimated proportions as a CSV file to result_path.
   - If return_adatas=True, returns a tuple (adata_spatial, adata_ref) with saved deconvolution results.


.. py:function:: destvi(adata_spatial, adata_ref, labels_key=None, layer_spatial=None, layer_ref=None, use_gpu=True, max_epochs_spatial=2000, max_epochs_ref=300, return_adatas=False, plots=True, results_path='./destvi_results', model_ref_kwargs={}, train_ref_kwargs={}, model_spatial_kwargs={}, train_spatial_kwargs={})

   Run DestVI

   Parameters 
   -----------

   adata_spatial : AnnData
       AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. 
   adata_ref : AnnData 
       AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
   labels_key : str
       Cell type key in adata_ref.obs for label information
   layer_spatial : str
       Layer of adata_spatial to use for deconvolution. If None, uses adata_spatial.X.
   layer_ref : str
       Layer of adata_ref to use for deconvolution. If None, uses adata_ref.X.
   use_gpu : bool
       Whether to use the GPU.
   max_epochs_spatial: int
       Number of epochs for the stLVM.
   max_epochs_ref: int 
       Number of epochs for the scLVM. 
   return_adatas: bool 
       Whether to return the AnnData objects with deconvolution results. If True, returns a tuple (adata_spatial, adata_ref).
   plots: bool 
       Whether to plot ELBO plots and UMAP of scLVM latent space.
   results_path: str
       Path to save estimated cell type abundances to. 
   model_ref_kwargs: dict
       Parameters for scvi.model.CondSCVI()
   train_ref_kwargs: dict
       Parameters for scvi.model.CondSCVI.train()
   model_spatial_kwargs: dict
       Parameters for scvi.model.DestVI.from_rna_model()
   train_spatial_kwargs: dict
       Parameters for scvi.model.DestVI.train()

       
   Returns
   --------

   - Saves estimated proportions as a CSV file to results_path.
   - If return_adatas=True, returns a tuple (adata_spatial, adata_ref) with saved deconvolution results.


.. py:function:: jsd(true: Union[pandas.DataFrame, numpy.ndarray], predicted: Union[pandas.DataFrame, numpy.ndarray]) -> float

   Compute Jensen-Shannon divergence on true and predicted cell type proportions.

   Parameters
   ----------
   true
       True cell type proportions.
   predicted
       Predicted cell type proportions.

   Returns
   -------
   Jensen-Shannon divergence.
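
   For orientation, a minimal NumPy sketch of the Jensen-Shannon divergence between proportion matrices follows. It is not the package's implementation: the function name is hypothetical, and the row normalisation, the base-2 logarithm, and the averaging over spots are assumptions made for this illustration.

   .. code-block:: python

      import numpy as np


      def jensen_shannon_divergence(true, predicted) -> float:
          """Mean per-row Jensen-Shannon divergence (base 2) of two proportion arrays."""
          p = np.atleast_2d(np.asarray(true, dtype=float))
          q = np.atleast_2d(np.asarray(predicted, dtype=float))
          if p.shape != q.shape:
              raise ValueError("true and predicted must have the same shape")
          # Assumption: each row is one spot; renormalise so rows sum to 1.
          p = p / p.sum(axis=1, keepdims=True)
          q = q / q.sum(axis=1, keepdims=True)
          m = 0.5 * (p + q)

          def kl(a, b):
              # Kullback-Leibler divergence per row, treating 0 * log(0) as 0.
              terms = np.zeros_like(a)
              mask = a > 0
              terms[mask] = a[mask] * np.log2(a[mask] / b[mask])
              return terms.sum(axis=1)

          per_spot = 0.5 * kl(p, m) + 0.5 * kl(q, m)
          return float(per_spot.mean())


      true = np.array([[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]])
      predicted = np.array([[0.4, 0.4, 0.2], [0.8, 0.1, 0.1]])
      print(jensen_shannon_divergence(true, predicted))

   With a base-2 logarithm the value lies between 0 (identical proportions) and 1 (non-overlapping proportions).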


.. py:function:: rmse(true: Union[pandas.DataFrame, numpy.ndarray], predicted: Union[pandas.DataFrame, numpy.ndarray]) -> float

   Compute RMSE on true and predicted cell type proportions.

   Parameters
   ----------
   true
       True cell type proportions.
   predicted
       Predicted cell type proportions.

   Returns
   -------
   Root mean squared error.
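
   As a point of reference, the root mean squared error can be sketched in a few lines of NumPy. The function name is hypothetical, and averaging over all entries of the proportion matrix (rather than per spot or per cell type) is an assumption of this sketch, not a statement about the package's implementation.

   .. code-block:: python

      import numpy as np


      def root_mean_squared_error(true, predicted) -> float:
          """RMSE over all entries of two equally shaped proportion arrays."""
          t = np.asarray(true, dtype=float)
          p = np.asarray(predicted, dtype=float)
          if t.shape != p.shape:
              raise ValueError("true and predicted must have the same shape")
          # Square the element-wise errors, average them, then take the root.
          return float(np.sqrt(np.mean((t - p) ** 2)))


      true = np.array([[0.5, 0.5], [1.0, 0.0]])
      predicted = np.array([[0.4, 0.6], [0.9, 0.1]])
      print(root_mean_squared_error(true, predicted))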


.. py:function:: rctd(adata_spatial, adata_ref, labels_key, doublet_mode='full', r_lib_path=None, results_path='./rctd_results', create_rctd_kwargs={})

   Run RCTD

   Parameters 
   -----------

   adata_spatial : AnnData
       AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref. Raw counts are expected in .X.
   adata_ref : AnnData 
       AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial. Raw counts are expected in .X.
   labels_key : str
       Cell type key in adata_ref.obs for label information
   doublet_mode: str ["doublet", "full"]
       Mode in which to run RCTD: 'doublet' (at most 1-2 cell types per pixel) or 'full' (no restriction on the number of cell types).
   r_lib_path : str
       Path to R library in which RCTD is installed.   
   results_path : str
       Path to save estimated cell type abundances to. 
   create_rctd_kwargs : dict 
       Parameters for create.RCTD().
       
   Returns
   --------

   - Saves estimated proportions as csv-file to results_path.


.. py:function:: conway_maxwell_poisson(lambda_, nu)

   Sample from the Conway-Maxwell-Poisson distribution.
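
   The Conway-Maxwell-Poisson distribution has probability mass proportional to ``lambda_**k / (k!)**nu`` for ``k = 0, 1, 2, ...``; ``nu = 1`` recovers the Poisson distribution and ``nu > 1`` gives under-dispersed counts. The sketch below shows one possible way to draw samples by normalising a truncated probability mass function. It is not the package's implementation: the name ``sample_cmp`` and the ``size``, ``max_count`` and ``rng`` arguments are hypothetical.

   .. code-block:: python

      import numpy as np


      def sample_cmp(lambda_, nu, size=1, max_count=1000, rng=None):
          """Draw Conway-Maxwell-Poisson samples from a truncated, normalised PMF."""
          if lambda_ <= 0 or nu < 0:
              raise ValueError("lambda_ must be > 0 and nu must be >= 0")
          rng = np.random.default_rng(rng)
          k = np.arange(max_count + 1)
          # log(k!) as a cumulative sum of logs, with log(0!) = 0.
          log_factorial = np.concatenate(([0.0], np.cumsum(np.log(k[1:]))))
          # Unnormalised log-weights: k * log(lambda) - nu * log(k!).
          log_w = k * np.log(lambda_) - nu * log_factorial
          # Subtract the maximum before exponentiating for numerical stability.
          w = np.exp(log_w - log_w.max())
          pmf = w / w.sum()
          return rng.choice(k, size=size, p=pmf)


      counts = sample_cmp(lambda_=4.0, nu=1.0, size=10, rng=0)
      print(counts)

   Truncating at ``max_count`` is an approximation that is reasonable when the probability mass beyond the cut-off is negligible, which holds for ``nu >= 1`` and moderate ``lambda_``.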


.. py:class:: Sampler(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int, n_regions: int, region_type: str = 'stripes', cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, balance: Optional[str] = 'balanced')

   Class to sample cells and clusters from a given dataset.


   .. py:attribute:: reference


   .. py:attribute:: cell_type_key


   .. py:attribute:: num_spots


   .. py:attribute:: obs


   .. py:attribute:: n_regions


   .. py:attribute:: region_type
      :value: 'stripes'



   .. py:method:: init_sample_prob(balance='unbalanced')

      Initialize the sample probabilities based on cell type counts.

      Returns
      -------
      None



   .. py:method:: define_regions(used_clusters)

      Define the regions to sample from.

      Parameters
      ----------
      used_clusters : dict
          A dictionary containing the clusters to be used in each region.

      Returns
      -------
      None

      Raises
      ------
      ValueError
          If the region_type parameter is not one of ['stripes', 'circles', 'gradient_number', 'gradient_celltype']



   .. py:method:: stripe_regions()

      Define regions as stripes.

      Returns
      -------
      None



   .. py:method:: circle_regions()

      Define regions in circles.

      Returns
      -------
      None



   .. py:method:: gradient_number_regions()

      Define regions as a gradient.

      Returns
      -------
      None



   .. py:method:: gradient_celltype_regions(used_clusters)

      Define regions as a gradient.

      Parameters
      ----------
      used_clusters : list
          The clusters to be used in the gradient cell type regions.

      Returns
      -------
      None



   .. py:method:: sample_data()

      Sample data from the given dataset.

      Returns
      -------
      tuple
          A tuple of expression and density arrays.



   .. py:method:: sample_spots(params)

      Sample cells based on given parameters.

      Parameters
      ----------
      params : np.ndarray
          Array of cell and cluster counts.

      Returns
      -------
      tuple
          A tuple of expression and density arrays.

      Raises
      ------
      None



   .. py:method:: get_coords()

      Get the coordinates for the spots.

      Returns
      -------
      tuple
          A tuple of X and Y coordinates.



.. py:function:: generate_spatial_data(reference: [muon.MuData, anndata.AnnData], cell_type_key: str, num_spots: int = 1024, n_regions: int = 5, balance: Optional[str] = None, cell_number_mean: [int, list] = 6, cell_number_nu: [float, list] = 20.0, cell_type_number: [int, list] = 4, **kwargs) -> [anndata.AnnData, muon.MuData]

   Generate spatial data.

   Parameters
   ----------
   reference : Union[mu.MuData, ad.AnnData]
       The reference dataset used for generating spatial data.
   cell_type_key : str
       The key in the reference dataset that specifies the cell type information.
   num_spots : int, optional
       The number of spots (locations) to generate, by default 1024.
   n_regions : int, optional
       The number of spatial regions to generate, by default 5.
   balance : str, optional
       The balancing method to use for generating spatial data, by default None.
   cell_number_mean : Union[int, list], optional
       The mean number of cells per spot, by default 6.
   cell_number_nu : Union[float, list], optional
       The dispersion parameter for the negative binomial distribution used to model cell numbers, by default 20.0.
   cell_type_number : Union[int, list], optional
       The number of cell types to generate, by default 4.
   **kwargs
       Additional keyword arguments.

   Returns
   -------
   Union[ad.AnnData, mu.MuData]
       The generated spatial data.

   Notes
   -----
   This function generates spatial data based on a reference dataset. It uses a sampling approach to generate
   synthetic spatial data with specified characteristics such as cell type composition, cell numbers, and spatial
   organization.

   The generated spatial data is returned as an `AnnData` object if the reference dataset is an `AnnData` object,
   or as a `MuData` object if the reference dataset is a `MuData` object.


.. py:function:: spatialdwls(adata_spatial, adata_ref, labels_key, cluster_key='leiden', n_cell=50, tfidf=True, r_lib_path=None, results_path='./spatialdwls_results')

   Run SpatialDWLS

   Parameters 
   -----------

   adata_spatial : AnnData
       AnnData of the spatial data, filtered by highly variable features. Feature space needs to be the same as the one of adata_ref.
       Normalized counts are expected in .X. Names of observations and features are expected as row names of adata_spatial.obs and adata_spatial.var.
   adata_ref : AnnData
       AnnData of the reference data, filtered by highly variable features. Feature space needs to be the same as the one of adata_spatial.
       Normalized counts are expected in .X. Names of observations and features are expected as row names of adata_ref.obs and adata_ref.var.
   labels_key : str
       Cell type key in adata_ref.obs for label information.
   cluster_key : str
       Cluster key in adata_spatial.obs for cluster information.
   n_cell : int
       Number of cells per spot.
   tfidf : bool
       Whether the normalized counts are TFIDF-normalized.
   r_lib_path : str
       Path to R library.  
   results_path : str
       Path to save estimated cell type abundances to. 

       
   Returns
   --------

   - Saves estimated proportions as csv-file to results_path.


