Laplacian Score

laplacian_scores(feature_arr: ndarray[Any, dtype[float64]], distance_matrix: ndarray[Any, dtype[float64]], epsilon: float, permutations: int, covariates: Optional[ndarray[Any, dtype[float64]]], return_random_laplacians: bool) → dict[str, numpy.ndarray[Any, numpy.dtype[numpy.float64]]]
Parameters
  • feature_arr (ndarray[Any, dtype[float64]]) – An array of shape (N, num_features), where N is the number of nodes in the graph, and num_features is the number of features. Each column represents a feature on N elements. Columns should be preprocessed to remove constant features.

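  • distance_matrix (ndarray[Any, dtype[float64]]) – The pairwise distance matrix in vector form, that is, a one-dimensional array of the pairwise distances between the N nodes rather than a square (N, N) array.

  • epsilon (float) – Two nodes of the graph are connected if their distance is less than epsilon.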

  • permutations (int) – The number of random permutations \(\sigma\) of the nodes of the graph to generate. For each permutation \(\sigma\), the Laplacian scores of the permuted features \(f \circ \sigma\) are computed. These additional Laplacian scores are used to perform a non-parametric permutation test, which returns a p-value representing the chance that the Laplacian score would be as high for a randomly selected permutation of the feature.

  • covariates (Optional[ndarray[Any, dtype[float64]]]) – (optional) array of shape (N, num_covariates), or simply (N,), where N is the number of nodes in the graph, and num_covariates is the number of covariates

  • return_random_laplacians (bool) – if True, the output dictionary will contain all of the generated laplacians. This will likely be the largest object in the dictionary.
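
To make the roles of distance_matrix, epsilon and permutations concrete, here is a minimal NumPy sketch. It is not the library's implementation. It assumes that the vector-form distances follow the row-major upper-triangle order (the order used by scipy.spatial.distance.squareform), that edges carry 0/1 weights, that the score is the common Laplacian score definition (a degree-weighted Rayleigh quotient of the graph Laplacian), and that lower scores count as more significant. All function names below are hypothetical.

```python
import numpy as np


def condensed_to_adjacency(distance_vector, n, epsilon):
    """Build a 0/1 adjacency matrix from a vector-form distance matrix.

    Assumption: entries are ordered (0,1), (0,2), ..., (n-2,n-1).
    """
    adjacency = np.zeros((n, n), dtype=float)
    rows, cols = np.triu_indices(n, k=1)
    connected = distance_vector < epsilon  # strict inequality, as documented
    adjacency[rows[connected], cols[connected]] = 1.0
    return adjacency + adjacency.T  # symmetrize


def laplacian_score_sketch(feature_arr, adjacency):
    """Laplacian score of each column of feature_arr, shape (num_features,)."""
    degree = adjacency.sum(axis=1)
    laplacian = np.diag(degree) - adjacency
    # Subtract the degree-weighted mean of each feature.
    centered = feature_arr - (degree @ feature_arr) / degree.sum()
    numerator = np.sum(centered * (laplacian @ centered), axis=0)
    denominator = np.sum(degree[:, np.newaxis] * centered**2, axis=0)
    return numerator / denominator  # undefined for constant features


def permutation_p_values(feature_arr, adjacency, permutations, rng):
    """Compare observed scores with scores of randomly permuted features."""
    observed = laplacian_score_sketch(feature_arr, adjacency)
    n, num_features = feature_arr.shape
    random_scores = np.empty((permutations, num_features))
    for k in range(permutations):
        sigma = rng.permutation(n)  # random relabeling of the nodes
        random_scores[k] = laplacian_score_sketch(feature_arr[sigma], adjacency)
    # Assumed convention: count permutations scoring at or below the observed value.
    hits = np.sum(random_scores <= observed, axis=0)
    return observed, (1 + hits) / (1 + permutations), random_scores


# Six points on a line; epsilon = 1.5 connects only consecutive points.
positions = np.arange(6.0)
rows, cols = np.triu_indices(6, k=1)
distance_vector = np.abs(positions[rows] - positions[cols])
adjacency = condensed_to_adjacency(distance_vector, 6, epsilon=1.5)

# Column 0 varies smoothly along the path, column 1 alternates between 0 and 1.
features = np.column_stack([positions, positions % 2])
scores, p_values, random_scores = permutation_p_values(
    features, adjacency, permutations=200, rng=np.random.default_rng(0)
)
print(scores)    # the smooth feature gets the lower score
print(p_values)
```

The division in laplacian_score_sketch is why constant columns must be removed beforehand: a constant feature has a zero denominator after centering.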

Returns

A pair of dictionaries (feature_data, other). All values in feature_data are of shape (num_features,).

  • feature_data['feature_laplacians'] := the laplacian scores of f, shape (num_features,)

  • feature_data['laplacian_p_values'] := the p-values from the permutation test, shape (num_features,)

  • feature_data['laplacian_q_values'] := the q-values from the permutation test, shape (num_features,)

  • (Optional, if covariates is not None) (for i in range(1, covariates.shape[0])) feature_data['beta_i'] := the p-value that beta_i is not zero for that feature; see p. 228, 'Applied Linear Statistical Models', Kutner, Nachtsheim, Neter, Li. Shape (num_features,)

  • (Optional, if covariates is not None) feature_data['regression_coefficients_fstat_p_values'] := the p-value that not all beta_i are zero, using the F-statistic; see p. 226, 'Applied Linear Statistical Models', Kutner, Nachtsheim, Neter, Li. Shape (num_features,)

  • (Optional, if covariates is not None) feature_data['laplacian_p_values_post_regression'] := the p-value of the residual laplacian of the feature once the covariates have been regressed out.

  • (Optional, if covariates is not None) feature_data['laplacian_q_values_post_regression'] := the q-values from the permutation test on the residuals after the covariates have been regressed out, shape (num_features,)

  • (Optional, if covariates is not None) other['covariate_laplacians'] := the laplacian scores of the covariates, shape (num_covariates,) (if a matrix of covariates was supplied, else this entry will be absent)

  • (Optional, if return_random_laplacians is True) other['random_feature_laplacians'] := the matrix of randomly generated feature laplacians, shape (permutations, num_features).

  • (Optional, if covariates is not None and return_random_laplacians is True) other['random_covariate_laplacians'] := the matrix of randomly generated covariate laplacians, shape (permutations, num_covariates)
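
The post-regression entries refer to a feature's residual once the covariates have been regressed out. As an illustration only, the sketch below computes such residuals with an ordinary least-squares fit that includes an intercept. Both the intercept and the helper name regress_out_covariates are assumptions, not details confirmed by this reference.

```python
import numpy as np


def regress_out_covariates(feature_arr, covariates):
    """Return (residuals, beta) from an OLS fit of each feature on the covariates.

    feature_arr has shape (N, num_features); covariates has shape
    (N, num_covariates) or simply (N,).
    """
    covariates = np.asarray(covariates, dtype=float)
    if covariates.ndim == 1:
        covariates = covariates[:, np.newaxis]  # treat (N,) as a single covariate
    n = covariates.shape[0]
    # Design matrix: a column of ones (intercept, assumed) followed by the covariates.
    design = np.column_stack([np.ones(n), covariates])
    beta, *_ = np.linalg.lstsq(design, feature_arr, rcond=None)
    residuals = feature_arr - design @ beta
    return residuals, beta


rng = np.random.default_rng(0)
covariate = np.arange(8.0)
features = np.column_stack([
    3.0 + 2.0 * covariate,   # fully explained by the covariate
    rng.normal(size=8),      # unrelated noise
])
residuals, beta = regress_out_covariates(features, covariate)
print(beta[:, 0])                # approximately [3, 2]
print(np.abs(residuals[:, 0]).max())  # numerically zero
```

A feature that the covariates explain completely leaves an essentially constant residual, so its residual Laplacian score would not be meaningful.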

Return type

dict[str, numpy.ndarray[Any, numpy.dtype[numpy.float64]]]
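
This reference does not state how the q-values under Returns are derived from the p-values. One common choice is the Benjamini-Hochberg adjustment, sketched below purely as an assumption to show how a vector of q-values of shape (num_features,) can arise from a vector of p-values of the same shape. The function name is hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(adjusted, 1.0)
    return q


q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(q_values)
```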