Package csb :: Package bio :: Package utils
[frames] | no frames]

Package utils

source code

Computational utility functions.

This module defines a number of low-level, numerical, high-performance utility functions like rmsd for example.

Functions
(n,3) numpy.array
average_structure(X)
Calculate an average structure from an ensemble of structures (i.e.
source code
tuple
bfit(X, Y, n_iter=10, distribution='student', em=False, full_output=False)
Robust superposition of two coordinate arrays.
source code
(d,) numpy.array
center_of_mass(x, m=None)
Compute the mean of a set of (optionally weighted) points.
source code
numpy array
deg(x)
Convert an array of torsion angles in radians to torsion degrees ranging from -180 to 180.
source code
(m,) numpy.array
distance(X, Y)
Distance between X and Y along the last axis.
source code
numpy array
distance_matrix(X, Y=None)
Calculates a matrix of pairwise distances
source code
(m,) numpy.array
distance_sq(X, Y)
Squared distance between X and Y along the last axis.
source code
iterable
find_pairs(cutoff, X, Y=None)
Find pairs with euclidean distance below cutoff.
source code
tuple
fit(X, Y)
Return the translation vector and the rotation matrix minimizing the RMSD between two sets of d-dimensional vectors, i.e.
source code
(n,3) numpy.array
fit_transform(X, Y, fit=<function fit at 0x7fe0c56e4c08>, *args)
Return Y superposed on X.
source code
tuple
fit_wellordered(X, Y, n_iter=None, n_stdv=2, tol_rmsd=0.5, tol_stdv=0.05, full_output=False)
Match two arrays onto each other by iteratively throwing out highly deviating entries.
source code
(d,d) numpy.array
inertia_tensor(x, m=None)
Compute the inertia tensor of a set of (optionally weighted) points.
source code
bool
is_mirror_image(X, Y)
Check if two configurations X and Y are mirror images (i.e.
source code
tuple
probabilistic_fit(X, Y, w=None, niter=10)
Generate a superposition of X, Y where:
source code
numpy array
rad(x)
Convert an array of torsion angles in torsion degrees to radians.
source code
(d,) numpy.array
radius_of_gyration(x, m=None)
Compute the radius of gyration of a set of (optionally weighted) points.
source code
float
rmsd(X, Y)
Calculate the root mean squared deviation (RMSD) using Kabsch' formula.
source code
float
rmsd_cur(X, Y)
Calculate the RMSD of two conformations as they are (no fitting is done).
source code
tuple
scale_and_fit(X, Y, check_mirror_image=False)
Return the translation vector, the rotation matrix and a global scaling factor minimizing the RMSD between two sets of d-dimensional vectors, i.e.
source code
(d,d) numpy.array
second_moments(x, m=None)
Compute the tensor second moments of a set of (optionally weighted) points.
source code
float
tm_score(x, y, L=None, d0=None)
Evaluate the TM-score of two conformations as they are (no fitting is done).
source code
tuple
tm_superimpose(x, y, fit_method=<function fit at 0x7fe0c56e4c08>, L=None, d0=None, L_ini_min=4, iL_step=1)
Compute the TM-score of two protein coordinate vector sets.
source code
float
torsion_rmsd(x, y)
Compute the circular RMSD of two phi/psi angle sets.
source code
numpy.array
transform(Y, R, t, s=None, invert=False)
Transform Y by rotation R and translation t.
source code
tuple
wfit(X, Y, w)
Return the translation vector and the rotation matrix minimizing the weighted RMSD between two sets of d-dimensional vectors, i.e.
source code
float
wrmsd(X, Y, w)
Calculate the weighted root mean squared deviation (wRMSD) using Kabsch' formula.
source code
tuple
xfit(X, Y, n_iter=10, seed=False, full_output=False)
Maximum likelihood superposition of two coordinate arrays.
source code
Variables
  __package__ = 'csb.bio.utils'
Function Details

average_structure(X)

source code 

Calculate an average structure from an ensemble of structures (i.e. X is a rank-3 tensor: X[i] is a (N,3) configuration matrix).

Parameters:
  • X (numpy array) - m x n x 3 input vector
Returns: (n,3) numpy.array
average structure

bfit(X, Y, n_iter=10, distribution='student', em=False, full_output=False)

source code 

Robust superposition of two coordinate arrays. Models non-rigid displacements with outlier-tolerant probability distributions.

Parameters:
  • X (numpy.array) - (n, 3) input vector
  • Y (numpy.array) - (n, 3) input vector
  • n_iter (int) - number of iterations
  • distribution (str) - student or k
  • em (bool) - use maximum a posteriori probability (MAP) estimator
  • full_output (bool) - if true, return ((R, t), scales)
Returns: tuple

center_of_mass(x, m=None)

source code 

Compute the mean of a set of (optionally weighted) points.

Parameters:
  • x (numpy.array) - array of rank (n,d) where n is the number of points and d the dimension
  • m (numpy.array) - rank (n,) array of masses / weights
Returns: (d,) numpy.array
center of mass

deg(x)

source code 

Convert an array of torsion angles in radians to torsion degrees ranging from -180 to 180.

Parameters:
  • x (numpy array) - array of angles
Returns: numpy array

distance(X, Y)

source code 

Distance between X and Y along the last axis.

Parameters:
  • X (numpy array) - m x n input vector
  • Y (numpy array) - m x n input vector
Returns: (m,) numpy.array
scalar or array of length m

distance_matrix(X, Y=None)

source code 

Calculates a matrix of pairwise distances

Parameters:
  • X (numpy array) - m x n input vector
  • Y (numpy array) - k x n input vector or None, which defaults to Y=X
Returns: numpy array
m x k distance matrix

distance_sq(X, Y)

source code 

Squared distance between X and Y along the last axis. For details, see distance.

Returns: (m,) numpy.array
scalar or array of length m

find_pairs(cutoff, X, Y=None)

source code 

Find pairs with euclidean distance below cutoff. Either between X and Y, or within X if Y is None.

Uses a KDTree and thus is memory efficient and reasonable fast.

Parameters:
  • Y ((k,n) numpy.array)
  • cutoff (float)
  • X ((m,n) numpy.array)
Returns: iterable
set of index tuples

fit(X, Y)

source code 

Return the translation vector and the rotation matrix minimizing the RMSD between two sets of d-dimensional vectors, i.e. if

>>> R,t = fit(X,Y)

then

>>> Y = dot(Y, transpose(R)) + t

will be the fitted configuration.

Parameters:
  • X (numpy array) - (n, d) input vector
  • Y (numpy array) - (n, d) input vector
Returns: tuple
(d, d) rotation matrix and (d,) translation vector

fit_transform(X, Y, fit=<function fit at 0x7fe0c56e4c08>, *args)

source code 

Return Y superposed on X.

Parameters:
  • Y ((n,3) numpy.array)
  • X ((n,3) numpy.array)
  • fit (function)
Returns: (n,3) numpy.array

fit_wellordered(X, Y, n_iter=None, n_stdv=2, tol_rmsd=0.5, tol_stdv=0.05, full_output=False)

source code 

Match two arrays onto each other by iteratively throwing out highly deviating entries.

(Reference: Nilges et al.: Delineating well-ordered regions in protein structure ensembles).

Parameters:
  • X (numpy array) - (n, d) input vector
  • Y (numpy array) - (n, d) input vector
  • n_stdv - number of standard deviations above which points are considered to be outliers
  • tol_rmsd - tolerance in rmsd
  • tol_stdv - tolerance in standard deviations
  • full_output - also return full history of values calculated by the algorithm
Returns: tuple

inertia_tensor(x, m=None)

source code 

Compute the inertia tensor of a set of (optionally weighted) points.

Parameters:
  • x (numpy.array) - array of rank (n,d) where n is the number of points and d the dimension
  • m (numpy.array) - rank (n,) array of masses / weights
Returns: (d,d) numpy.array
inertia tensor

is_mirror_image(X, Y)

source code 

Check if two configurations X and Y are mirror images (i.e. their optimal superposition involves a reflection).

Parameters:
  • X (numpy array) - n x 3 input vector
  • Y (numpy array) - n x 3 input vector
Returns: bool

probabilistic_fit(X, Y, w=None, niter=10)

source code 

Generate a superposition of X, Y where:

   R ~ exp(trace(dot(transpose(dot(transpose(X-t), Y)), R)))
   t ~ N(t_opt, 1 / sqrt(N))
Returns: tuple

rad(x)

source code 

Convert an array of torsion angles in torsion degrees to radians.

Parameters:
  • x (numpy array) - array of angles
Returns: numpy array

radius_of_gyration(x, m=None)

source code 

Compute the radius of gyration of a set of (optionally weighted) points.

Parameters:
  • x (numpy.array) - array of rank (n,d) where n is the number of points and d the dimension
  • m (numpy.array) - rank (n,) array of masses / weights
Returns: (d,) numpy.array
center of mass

rmsd(X, Y)

source code 

Calculate the root mean squared deviation (RMSD) using Kabsch' formula.

Parameters:
  • X (numpy array) - (n, d) input vector
  • Y (numpy array) - (n, d) input vector
Returns: float
rmsd value between the input vectors

rmsd_cur(X, Y)

source code 

Calculate the RMSD of two conformations as they are (no fitting is done). For details, see rmsd.

Returns: float
rmsd value between the input vectors

scale_and_fit(X, Y, check_mirror_image=False)

source code 

Return the translation vector, the rotation matrix and a global scaling factor minimizing the RMSD between two sets of d-dimensional vectors, i.e. if

>>> R, t, s = scale_and_fit(X, Y)

then

>>> Y = s * (dot(Y, R.T) + t)

will be the fitted configuration.

Parameters:
  • X (numpy array) - (n, d) input vector
  • Y (numpy array) - (n, d) input vector
Returns: tuple
(d, d) rotation matrix and (d,) translation vector

second_moments(x, m=None)

source code 

Compute the tensor second moments of a set of (optionally weighted) points.

Parameters:
  • x (numpy.array) - array of rank (n,d) where n is the number of points and d the dimension
  • m (numpy.array) - rank (n,) array of masses / weights
Returns: (d,d) numpy.array
second moments

tm_score(x, y, L=None, d0=None)

source code 

Evaluate the TM-score of two conformations as they are (no fitting is done).

Parameters:
  • x (numpy array) - 3 x n input array
  • y (numpy array) - 3 x n input array
  • L (int) - length for normalization (default: len(x))
  • d0 (float) - d0 in Angstroms (default: calculate from L)
Returns: float
computed TM-score

tm_superimpose(x, y, fit_method=<function fit at 0x7fe0c56e4c08>, L=None, d0=None, L_ini_min=4, iL_step=1)

source code 

Compute the TM-score of two protein coordinate vector sets. Reference: http://zhanglab.ccmb.med.umich.edu/TM-score

Parameters:
  • x (numpy.array) - 3 x n input vector
  • y (numpy.array) - 3 x n input vector
  • fit_method (function) - a reference to a proper fitting function, e.g. fit or fit_wellordered
  • L (int) - length for normalization (default: len(x))
  • d0 (float) - d0 in Angstroms (default: calculate from L)
  • L_ini_min (int) - minimum length of initial alignment window (increase to speed up but loose precision, a value of 0 disables local alignment initialization)
  • iL_step (int) - initial alignment window shift (increase to speed up but loose precision)
Returns: tuple
rotation matrix, translation vector, TM-score

torsion_rmsd(x, y)

source code 

Compute the circular RMSD of two phi/psi angle sets.

Parameters:
  • x (array) - query phi/psi angles (Nx2 array, in radians)
  • y (array) - subject phi/psi angles (Nx2 array, in radians)
Returns: float

transform(Y, R, t, s=None, invert=False)

source code 

Transform Y by rotation R and translation t. Optionally scale by s.

>>> R, t = fit(X, Y)
>>> Y_fitted = transform(Y, R, t)
Parameters:
  • Y (numpy.array) - (n, d) input vector
  • R (numpy.array) - (d, d) rotation matrix
  • t (numpy.array) - (d,) translation vector
  • s (float) - scaling factor
  • invert (bool) - if True, apply the inverse transformation
Returns: numpy.array
transformed input vector

wfit(X, Y, w)

source code 

Return the translation vector and the rotation matrix minimizing the weighted RMSD between two sets of d-dimensional vectors, i.e. if

>>> R,t = fit(X,Y)

then

>>> Y = dot(Y, transpose(R)) + t

will be the fitted configuration.

Parameters:
  • X (numpy array) - (n, d) input vector
  • Y (numpy array) - (n, d) input vector
  • w (numpy array) - input weights
Returns: tuple
(d, d) rotation matrix and (d,) translation vector

wrmsd(X, Y, w)

source code 

Calculate the weighted root mean squared deviation (wRMSD) using Kabsch' formula.

Parameters:
  • X (numpy array) - (n, d) input vector
  • Y (numpy array) - (n, d) input vector
  • w (numpy array) - input weights
Returns: float
rmsd value between the input vectors

xfit(X, Y, n_iter=10, seed=False, full_output=False)

source code 

Maximum likelihood superposition of two coordinate arrays. Works similar to Theseus and to bfit.

Parameters:
  • X (numpy.array) - (n, 3) input vector
  • Y (numpy.array) - (n, 3) input vector
  • n_iter (int) - number of EM iterations
  • full_output (bool) - if true, return ((R, t), scales)
  • seed (bool)
Returns: tuple