neuralee.embedding package

class neuralee.embedding.FCLayers(di, do)[source]

Bases: torch.nn.modules.module.Module

Default neural network structure class, used as the embedding function when no custom net is supplied.

Parameters:
  • di (int) – Input feature size.
  • do (int) – Output feature size.

For how to define a custom nn Module, see: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-custom-nn-modules. A minimal sketch follows the forward() method below.

forward(y)[source]
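
As a sketch of the interface FCLayers implements, the following hypothetical module maps di input features to do outputs; the layer widths and activation are illustrative assumptions, not the actual FCLayers architecture:

    import torch.nn as nn


    class MyEmbeddingNet(nn.Module):
        """Hypothetical custom module with the same (di, do) interface as FCLayers."""

        def __init__(self, di, do):
            super().__init__()
            # Illustrative architecture; the actual FCLayers structure may differ.
            self.layers = nn.Sequential(
                nn.Linear(di, 100),
                nn.ReLU(),
                nn.Linear(100, do),
            )

        def forward(self, y):
            # Map input features y of shape (n, di) to coordinates of shape (n, do).
            return self.layers(y)

Any module with this interface can be passed to NeuralEE.fine_tune via its net parameter (documented below).
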
class neuralee.embedding.NeuralEE(dataset, d=2, lam=1, device=None)[source]

Bases: object

NeuralEE class.

Parameters:
  • dataset (neuralee.dataset.GeneExpressionDataset) – the dataset to be embedded.
  • d (int) – dimension of the low-dimensional embedding.
  • lam – trade-off factor of the elastic embedding function.
  • device (torch.device) – device chosen to operate on. If None, set as torch.device('cpu').
D

Feature size.
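
A minimal construction sketch, assuming GeneExpressionDataset accepts a raw expression matrix (its exact signature is documented in neuralee.dataset):

    import numpy as np
    import torch
    from neuralee.dataset import GeneExpressionDataset
    from neuralee.embedding import NeuralEE

    # Toy expression matrix, 1000 cells x 50 genes (illustrative data only).
    data = np.random.rand(1000, 50)

    # The GeneExpressionDataset constructor call is an assumption here;
    # consult the neuralee.dataset documentation for the exact signature.
    dataset = GeneExpressionDataset(data)

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    ne = NeuralEE(dataset, d=2, lam=1, device=device)
    print(ne.D)  # feature size, 50 for this toy matrix
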

EE(size=1.0, maxit=200, tol=1e-05, frequence=None, aff='ea', perplexity=30.0)[source]

Free Elastic Embedding (no mapping).

Fast training of nonlinear embeddings using the spectral direction for the Elastic Embedding (EE) algorithm.

Reference:

Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings. http://faculty.ucmerced.edu/mcarreira-perpinan/papers/icml12.pdf

Parameters:
  • size (int or float) – subsample size of the entire dataset to embed, either an absolute count (int) or a fraction of the dataset (float). If subsampled, the affinity will be recalculated on the subsamples.
  • maxit (int) – maximum number of iterations for EE.
  • tol – minimum relative distance between consecutive iterates of X, used as the convergence criterion.
  • frequence (int) – frequency at which to display iteration results. If None, nothing is displayed.
  • aff ({'ea', 'x2p'}) – if subsampled, the affinity used to calculate attractive weights.
  • perplexity – if subsampled, the perplexity defined in the elastic embedding function.
Returns:

embedding results. 'X': embedding coordinates; 'e': embedding loss; 'sub_samples': if subsampled, subsample information.

Return type:

dict
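
A usage sketch for EE, continuing the construction example above; the subsample size and display frequency are arbitrary illustrative choices:

    # Run free elastic embedding on a 10% subsample (no mapping is learned).
    results = ne.EE(size=0.1, maxit=200, frequence=50, aff='ea', perplexity=30.0)

    X = results['X']              # embedding coordinates
    e = results['e']              # embedding loss
    sub = results['sub_samples']  # subsample information, present because size < 1
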

fine_tune(optimizer=None, size=1.0, net=None, frequence=50, verbose=False, maxit=500, calculate_error=None, pin_memory=True, aff='ea', perplexity=30.0, save_embedding=None)[source]

NeuralEE method.

It supports incremental learning: the neural network can be fine-tuned if a pre-trained network is offered.

Parameters:
  • optimizer (torch.optim) – optimizer for training the neural network. If None, torch.optim.Adam with lr=0.01 is used.
  • size (int or float) – subsample size of the entire dataset to embed, either an absolute count (int) or a fraction of the dataset (float). If subsampled, the affinity will be recalculated on the subsamples.
  • net (torch.nn.Module) – the nn instance used as the embedding function. If not None, net is fine-tuned and stored as self.net; if None and self.net already exists, the existing self.net is fine-tuned; otherwise self.net is initialized as an FCLayers instance.
  • frequence (int) – frequency at which to compare and save iteration results.
  • verbose (bool) – whether to show the training loss verbosely.
  • maxit (int) – maximum number of iterations for NeuralEE.
  • calculate_error ({None, 'cpu', 'cuda'}) – device on which to calculate the error. If the number of samples is large, set to None to avoid running out of memory on 'cuda' or 'cpu'.
  • pin_memory (bool) – whether to keep the data pinned in GPU memory to save transfer time; whether this is feasible depends on your GPU memory.
  • aff ({'ea', 'x2p'}) – if subsampled, the affinity used to calculate attractive weights.
  • perplexity – if subsampled, the perplexity defined in the elastic embedding function.
  • save_embedding (str) – path at which to save iteration results according to frequence. If None, results are not saved.
Returns:

embedding results. 'X': embedding coordinates; 'e': embedding loss; 'sub_samples': if subsampled, subsample information.

Return type:

dict
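
A usage sketch for fine_tune illustrating incremental learning, continuing the example above; all hyperparameter values shown are illustrative:

    # First call: no pre-trained net exists, so an FCLayers instance is
    # created and trained (optimizer defaults to Adam with lr=0.01).
    results = ne.fine_tune(size=1.0, maxit=500, frequence=50, verbose=True)

    # Later call: self.net now exists, so it is fine-tuned incrementally.
    results = ne.fine_tune(maxit=100)

    X = results['X']  # embedding coordinates produced by the learned mapping
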

labels

Returns:

label vector.

Return type:

numpy.ndarray

map(samples={}, calculate_error=None)[source]

Direct mapping via the learned nn.

Parameters:
  • samples (dict) – 'Y': samples to be mapped into low-dimensional coordinates; 'labels': sample labels (None is acceptable); 'Wp': attractive weights on the samples (None is acceptable if the error need not be calculated); 'Wn': repulsive weights on the samples (None is acceptable if the error need not be calculated). If an empty dict, mapping is performed on the training data.
  • calculate_error ({None, 'cpu', 'cuda'}) – device on which to calculate the error. If the number of samples is large, set to None to avoid running out of memory on 'cuda' or 'cpu'.
Returns:

embedding results. 'X': embedding coordinates; 'e': embedding loss.

Return type:

dict
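
A usage sketch for map, continuing the example above; whether 'Y' must be supplied as a torch tensor is an assumption here:

    import numpy as np
    import torch

    # Map 100 unseen samples through the learned network. 'Wp' and 'Wn'
    # are None, so the embedding loss cannot be calculated.
    new_samples = {
        'Y': torch.from_numpy(np.random.rand(100, 50)).float(),
        'labels': None,
        'Wp': None,
        'Wn': None,
    }
    out = ne.map(samples=new_samples)
    X_new = out['X']

    # With the default empty dict, map() embeds the training data.
    out_train = ne.map()
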