neuralee.embedding package
class neuralee.embedding.FCLayers(di, do)
Bases: torch.nn.modules.module.Module
Default nn structure class.
Parameters:
- di (int) – Input feature size.
- do (int) – Output feature size.
For how to define a custom nn Module, see: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-custom-nn-modules
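As an illustration of the custom-module pattern the tutorial describes, here is a minimal sketch of a network that maps `di` input features to `do` outputs. The class name `TinyEmbeddingNet`, the single hidden layer, and its width of 32 are assumptions made for this example; they are not the architecture of FCLayers, which this page does not specify.

```python
import torch
from torch import nn


class TinyEmbeddingNet(nn.Module):
    """Hypothetical custom network: di input features -> do output features."""

    def __init__(self, di, do, hidden=32):
        super().__init__()
        # The hidden width (32) is an arbitrary choice for illustration.
        self.net = nn.Sequential(
            nn.Linear(di, hidden),
            nn.ReLU(),
            nn.Linear(hidden, do),
        )

    def forward(self, y):
        # y has shape (batch, di); the result has shape (batch, do).
        return self.net(y)


model = TinyEmbeddingNet(di=50, do=2)
out = model(torch.randn(8, 50))
```

A module of this shape could presumably be passed as the `net` argument of `fine_tune`, provided its input size matches the dataset's feature size and its output size matches `d`.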
class neuralee.embedding.NeuralEE(dataset, d=2, lam=1, device=None)
Bases: object
NeuralEE class.
Parameters:
- dataset (neuralee.dataset.GeneExpressionDataset) – GeneExpressionDataset instance.
- d (int) – dimension of the low-dimensional embedding.
- lam – trade-off factor of the elastic embedding function.
- device (torch.device) – device to operate on. If None, set to torch.device('cpu').
D
Feature size.
EE(size=1.0, maxit=200, tol=1e-05, frequence=None, aff='ea', perplexity=30.0)
Free elastic embedding (no mapping).
Fast training of nonlinear embeddings using the spectral direction for the Elastic Embedding (EE) algorithm.
Reference:
Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings. http://faculty.ucmerced.edu/mcarreira-perpinan/papers/icml12.pdf
Parameters:
- size (int or percentage) – subsample size of the entire dataset to embed. If subsampling is used, the affinity is recalculated on the subsample.
- maxit (int) – maximum number of iterations for EE.
- tol – minimum relative distance between consecutive X.
- frequence (int) – frequency at which to display intermediate results. If None, nothing is displayed.
- aff ({'ea', 'x2p'}) – if subsampled, the affinity used to calculate attractive weights.
- perplexity – if subsampled, the perplexity defined in the elastic embedding function.
Returns: embedding results. 'X': embedding coordinates; 'e': embedding loss; 'sub_samples': if subsampled, the subsample information.
Return type: dict
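To make the roles of `lam` and the embedding loss more concrete, the following sketch evaluates the elastic embedding objective as it is usually written in the EE literature: an attractive term weighted by `Wp` plus `lam` times a repulsive term weighted by `Wn`. The function name `ee_loss` and the toy data are assumptions for illustration; the library's own 'e' value may be computed or scaled differently.

```python
import numpy as np


def ee_loss(X, Wp, Wn, lam=1.0):
    """Elastic embedding objective for coordinates X of shape (N, d).

    E(X) = sum_nm Wp[n, m] * ||x_n - x_m||^2
           + lam * sum_nm Wn[n, m] * exp(-||x_n - x_m||^2)
    """
    # Squared Euclidean distance between every pair of embedded points.
    diff = X[:, None, :] - X[None, :, :]
    sqd = np.sum(diff ** 2, axis=-1)
    attractive = np.sum(Wp * sqd)          # pulls similar points together
    repulsive = np.sum(Wn * np.exp(-sqd))  # pushes all points apart
    return attractive + lam * repulsive


# Toy example: three points in 2-D with uniform off-diagonal weights.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
Wp = np.ones((3, 3)) - np.eye(3)
Wn = np.ones((3, 3)) - np.eye(3)
loss_small = ee_loss(X, Wp, Wn, lam=1.0)
loss_large = ee_loss(X, Wp, Wn, lam=10.0)
```

A larger `lam` puts more weight on the repulsive term, so the same coordinates yield a higher loss and the optimizer is pushed towards more spread-out embeddings.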
fine_tune(optimizer=None, size=1.0, net=None, frequence=50, verbose=False, maxit=500, calculate_error=None, pin_memory=True, aff='ea', perplexity=30.0, save_embedding=None)
NeuralEE method.
It supports incremental learning: if a pre-trained nn is provided, it is fine-tuned further.
Parameters:
- optimizer (torch.optim) – optimizer for training the neural network. If None, set to torch.optim.Adam(lr=0.01).
- size (int or percentage) – subsample size of the entire dataset to embed. If subsampling is used, the affinity is recalculated on the subsample.
- net (torch.nn.Module) – the nn instance used as the embedding function. If None and self.net already exists, self.net is fine-tuned; if not None, net is assigned to self.net and fine-tuned; otherwise a new FCLayers instance is used.
- frequence (int) – frequency at which to compare and save intermediate results.
- verbose (bool) – whether to show the verbose training loss.
- maxit (int) – maximum number of iterations for NeuralEE.
- calculate_error ({None, 'cpu', 'cuda'}) – where to calculate the error. If the number of samples is large, set to None to avoid running out of memory on 'cuda' or 'cpu'.
- pin_memory (bool) – whether to pin data in GPU memory to save transfer time; whether this is feasible depends on your GPU memory.
- aff ({'ea', 'x2p'}) – if subsampled, the affinity used to calculate attractive weights.
- perplexity – if subsampled, the perplexity defined in the elastic embedding function.
- save_embedding (str) – path at which to save intermediate results, at the interval given by frequence. If None, nothing is saved.
Returns: embedding results. 'X': embedding coordinates; 'e': embedding loss; 'sub_samples': if subsampled, the subsample information.
Return type: dict
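The `perplexity` parameter controls how many effective neighbours each sample has when affinities are recalculated. The sketch below shows the generic idea: for one sample, search for a Gaussian kernel precision `beta` such that the entropy of the resulting affinities equals log(perplexity). This is an illustration of perplexity calibration in general, not the library's 'ea' or 'x2p' code; the function name `row_affinities` and the random data are assumptions.

```python
import numpy as np


def row_affinities(sq_dists, perplexity=30.0, tol=1e-5, max_steps=100):
    """Gaussian affinities for one sample, calibrated to a target perplexity.

    sq_dists holds squared distances from the sample to every other sample.
    """
    target = np.log(perplexity)
    beta, lo, hi = 1.0, 0.0, np.inf
    for _ in range(max_steps):
        # Shift by the minimum distance for numerical stability.
        p = np.exp(-beta * (sq_dists - sq_dists.min()))
        p /= p.sum()
        entropy = -np.sum(p * np.log(p + 1e-12))
        if abs(entropy - target) < tol:
            break
        if entropy > target:
            # Distribution too flat: increase beta to sharpen the kernel.
            lo = beta
            beta = beta * 2.0 if np.isinf(hi) else (beta + hi) / 2.0
        else:
            # Distribution too peaked: decrease beta to widen the kernel.
            hi = beta
            beta = (beta + lo) / 2.0
    return p


rng = np.random.default_rng(0)
d2 = rng.random(200) * 10.0  # placeholder squared distances
p = row_affinities(d2, perplexity=30.0)
effective_perplexity = np.exp(-np.sum(p * np.log(p + 1e-12)))
```

Higher perplexity values spread each sample's affinity over more neighbours, which tends to emphasise global structure over local detail.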
labels
Returns: label vector.
Return type: numpy.ndarray
map(samples={}, calculate_error=None)
Map samples directly via the learned nn.
Parameters:
- samples (dict) – 'Y': samples to be mapped into low-dimensional coordinates. 'labels': sample labels; None is acceptable. 'Wp': attractive weights on the samples; None is acceptable if the error need not be calculated. 'Wn': repulsive weights on the samples; None is acceptable if the error need not be calculated. If an empty dict, the training data is mapped.
- calculate_error ({None, 'cpu', 'cuda'}) – where to calculate the error. If the number of samples is large, set to None to avoid running out of memory on 'cuda' or 'cpu'.
Returns: embedding results. 'X': embedding coordinates; 'e': embedding loss.
Return type: dict
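The `samples` dict for `map` can be assembled as in the sketch below, which fills the four documented keys and leaves the optional ones as None. The helper `make_map_samples` is hypothetical, and the nested lists only stand in for whatever array or tensor type the library expects for 'Y', which this page does not state.

```python
def make_map_samples(Y, labels=None, Wp=None, Wn=None):
    """Hypothetical helper: build the `samples` dict with the documented keys."""
    return {'Y': Y, 'labels': labels, 'Wp': Wp, 'Wn': Wn}


# Placeholder data: 3 samples with 4 features each.
new_samples = make_map_samples([[0.1, 0.2, 0.3, 0.4],
                                [0.5, 0.6, 0.7, 0.8],
                                [0.9, 1.0, 1.1, 1.2]])

# With a trained NeuralEE instance `NEE` (not constructed here), the call
# would follow the documented signature:
#   results = NEE.map(samples=new_samples, calculate_error=None)
#   coords = results['X']
```

Leaving 'Wp' and 'Wn' as None matches the case where the error need not be calculated, so `calculate_error` is kept at None as well.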