molml.utils module¶
A collection of assorted utility functions.
-
class
molml.utils.
LazyValues
(connections=None, coords=None, numbers=None, elements=None, unit_cell=None)¶ Bases:
object
An object to store molecule graph properties in a lazy fashion.
This object defers the computation of molecule graph properties until they are first needed. The prime example is the computation of connections.
Parameters: - connections : dict, key->list of keys, default=None
A dictionary edge table with all the bidirectional connections.
- numbers : array-like, shape=(n_atoms, ), default=None
The atomic numbers of all the atoms.
- coords : array-like, shape=(n_atoms, 3), default=None
The xyz coordinates of all the atoms (in angstroms).
- elements : array-like, shape=(n_atoms, ), default=None
The element symbols of all the atoms.
- unit_cell : array-like, shape=(3, 3), default=None
An array of unit cell basis vectors, where the vectors are columns.
Attributes: - connections : dict, key->list of keys
A dictionary edge table with all the bidirectional connections. If the initialized value for this was None, then this will be computed from the coords and numbers/elements.
- numbers : array, shape=(n_atoms, )
The atomic numbers of all the atoms. If the initialized value for this was None, then this will be computed from the elements.
- coords : array, shape=(n_atoms, 3)
The xyz coordinates of all the atoms (in angstroms).
- elements : array, shape=(n_atoms, )
The element symbols of all the atoms. If the initialized value for this was None, then this will be computed from the numbers.
- unit_cell : array, shape=(3, 3)
An array of unit cell basis vectors, where the vectors are columns.
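The lazy behaviour described above can be sketched with a cached property. This is only an illustration of the idea, not molml's implementation: the class name `LazySketch`, the small `SYMBOL_TO_NUMBER` table, and the `n_lookups` counter are all hypothetical.

```python
# Hypothetical illustration of lazy evaluation; not molml's actual code.
SYMBOL_TO_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}  # tiny illustrative table


class LazySketch:
    def __init__(self, numbers=None, elements=None):
        self._numbers = numbers
        self._elements = elements
        self.n_lookups = 0  # counts how often the derivation actually runs

    @property
    def numbers(self):
        # Derive the atomic numbers from the symbols only on first access.
        if self._numbers is None:
            if self._elements is None:
                raise ValueError("No elements available to derive numbers.")
            self.n_lookups += 1
            self._numbers = [SYMBOL_TO_NUMBER[e] for e in self._elements]
        return self._numbers


mol = LazySketch(elements=["O", "H", "H"])
first = mol.numbers   # computed here
second = mol.numbers  # served from the cached value
```

The same pattern would extend to more expensive properties such as connections, where skipping the computation when it is never requested matters most.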
-
connections
¶
-
coords
¶
-
elements
¶
-
fill_in_crystal
(self, radius=None, units=None)¶ Duplicate the atoms to form a crystal.
Parameters: - radius : float, default=None
Specifies the radius of unit cell points to include
- units : list or int, default=None
Specifies the number of unit cells to include on each axis. These will all be equal if it is an int.
Raises: - ValueError
If radius and units are both None, or if both are given.
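As a rough sketch of what duplicating atoms over unit cells can look like when `units` is given, the helper below translates every atom by integer combinations of the basis vectors. The function name `tile_unit_cell` is hypothetical, and the choice of offsets running from `-u` to `+u` on each axis is an assumption; the library may use a different range.

```python
import itertools

import numpy as np


def tile_unit_cell(coords, unit_cell, units):
    """Hypothetical helper: copy atoms over a grid of unit-cell translations."""
    coords = np.asarray(coords, dtype=float)
    unit_cell = np.asarray(unit_cell, dtype=float)
    if isinstance(units, int):
        units = [units] * 3  # same count on every axis
    # Assumption: offsets run from -u to +u on each axis.
    ranges = [range(-u, u + 1) for u in units]
    tiles = []
    for offset in itertools.product(*ranges):
        # Basis vectors are columns, so unit_cell @ offset is the translation.
        shift = unit_cell @ np.array(offset, dtype=float)
        tiles.append(coords + shift)
    return np.vstack(tiles)


cell = 2.0 * np.eye(3)
atoms = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
crystal = tile_unit_cell(atoms, cell, units=1)
row = tile_unit_cell(atoms, cell, units=[1, 0, 0])
```

With `units=1` the sketch produces 27 copies of the cell; with `units=[1, 0, 0]` it extends the cell along the first basis vector only.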
-
numbers
¶
-
unit_cell
¶
-
molml.utils.
cosine_decay
(R, r_cut=6.0)¶ Compute the cosine cutoff values for all the distances.
The cutoff is defined as
\[\begin{split}f_{R_{c}}(R_{ij}) = \begin{cases} 0.5 \left( \cos\left( \frac{\pi R_{ij}}{R_c} \right) + 1 \right), & R_{ij} \le R_c \\ 0, & \text{otherwise} \end{cases}\end{split}\]
Parameters: - R : array, shape=(N_atoms, N_atoms)
A distance matrix for all the atoms (scipy.spatial.cdist)
- r_cut : float, default=6.
The maximum distance allowed for atoms to be considered local to the “central atom”.
Returns: - values : array, shape=(N_atoms, N_atoms)
The new distance matrix with the cutoff function applied
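The cutoff formula above translates almost directly into NumPy. The following is a minimal sketch written from the formula alone; `cosine_decay_sketch` is a hypothetical name and the code is not taken from the library.

```python
import numpy as np


def cosine_decay_sketch(R, r_cut=6.0):
    """Apply 0.5 * (cos(pi * R / r_cut) + 1) inside the cutoff, 0 outside."""
    R = np.asarray(R, dtype=float)
    values = 0.5 * (np.cos(np.pi * R / r_cut) + 1.0)
    # Distances beyond the cutoff contribute nothing.
    return np.where(R <= r_cut, values, 0.0)


R = np.array([[0.0, 3.0], [3.0, 9.0]])
decayed = cosine_decay_sketch(R)
```

The value is 1 at zero distance, falls to 0.5 at half the cutoff, and reaches 0 at the cutoff itself, so no discontinuity appears as an atom crosses `r_cut`.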
-
molml.utils.
deslugify
(string)¶ Convert a string to a feature name and its parameters.
Parameters: - string : str
The slug string to extract values from.
Returns: - name : str
The name of the class corresponding to the string.
- final_params : dict
A dictionary of the feature parameters.
-
molml.utils.
get_angles
(coords)¶ Get the angles between all triples of coords.
The resulting values are in \([0, \pi]\), and all invalid values are NaN.
Parameters: - coords : numpy.array, shape=(n_atoms, n_dim)
An array of all the coordinates.
Returns: - res : numpy.array, shape=(n_atoms, n_atoms, n_atoms)
An array of the angles of all triples.
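One way to compute such an array is sketched below. The index convention is an assumption: here entry `[i, j, k]` holds the angle at atom `j` between atoms `i` and `k`, and only triples where `i` or `k` coincides with `j` come out as NaN. The function name `angles_sketch` is hypothetical, and the library may order indices or treat degenerate triples differently.

```python
import numpy as np


def angles_sketch(coords):
    """Angle at vertex j between atoms i and k, stored at [i, j, k]."""
    coords = np.asarray(coords, dtype=float)
    # diff[i, j] is the vector pointing from atom j to atom i.
    diff = coords[:, None, :] - coords[None, :, :]
    norms = np.linalg.norm(diff, axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        unit = diff / norms[:, :, None]  # zero-length vectors become NaN
        # cos[i, j, k] is the dot product of unit(j->i) and unit(j->k).
        cos = np.einsum("ijd,kjd->ijk", unit, unit)
        # Clip to guard against round-off just outside [-1, 1].
        return np.arccos(np.clip(cos, -1.0, 1.0))


coords = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
angles = angles_sketch(coords)
```

For this right-angle arrangement, `angles[0, 1, 2]` is the angle at the origin atom, and entries such as `angles[1, 1, 2]` are NaN because the vertex coincides with one of the end points.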
-
molml.utils.
get_bond_type
(element1, element2, dist)¶ Get the bond type between two elements based on their distance.
If there is no bond, return None.
Parameters: - element1 : str
The element of the first atom
- element2 : str
The element of the second atom
- dist : float
The distance between the two atoms
Returns: - key : str
The type of the bond
-
molml.utils.
get_connections
(elements1, coords1, elements2=None, coords2=None)¶ Return a dictionary edge list.
If two sets of elements and coordinates are given, then they will be treated as two disjoint sets of atoms.
Each value is a tuple of the index of the connecting atom and the bond order as a string, where the bond order is one of [‘1’, ‘Ar’, ‘2’, ‘3’].
Note: If two sets are given, this returns only the connections from the first set to the second. This is in contrast to returning connections from both directions.
Parameters: - elements1 : list
All the elements in set 1.
- coords1 : array, shape=(n_atoms, 3)
The coordinates of the atoms in set 1.
- elements2 : list, default=None
All the elements in set 2.
- coords2 : array, shape=(n_atoms, 3), default=None
The coordinates of the atoms in set 2.
Returns: - connections : dict, int->dict
Maps each atom index to the atoms it is connected to and the corresponding bond types.
-
molml.utils.
get_coulomb_matrix
(numbers, coords, alpha=1, use_decay=False)¶ Return the coulomb matrix for the given coords and numbers.
\[\begin{split}C_{ij} = \begin{cases} \frac{Z_i Z_j}{\| r_i - r_j \|^\alpha} & i \neq j\\ \frac{1}{2} Z_i^{2.4} & i = j \end{cases}\end{split}\]
Parameters: - numbers : array-like, shape=(n_atoms, )
The atomic numbers of all the atoms
- coords : array-like, shape=(n_atoms, 3)
The xyz coordinates of all the atoms (in angstroms)
- alpha : number, default=1
The exponent applied to the distance in the coulomb matrix.
- use_decay : bool, default=False
This setting defines an extra decay for the values as they get further away from the “central atom”. This is to alleviate issues that arise as atoms enter or leave the cutoff radius.
Returns: - top : array, shape=(n_atoms, n_atoms)
The coulomb matrix
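The formula above can be evaluated with a few lines of NumPy. This is a sketch based only on the equation, under the assumption that no two atoms share a position; `coulomb_matrix_sketch` is a hypothetical name, and the `use_decay` option is not covered.

```python
import numpy as np


def coulomb_matrix_sketch(numbers, coords, alpha=1):
    """Z_i * Z_j / |r_i - r_j|**alpha off the diagonal, 0.5 * Z_i**2.4 on it."""
    numbers = np.asarray(numbers, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all atoms.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    with np.errstate(divide="ignore"):
        # The diagonal divides by zero here and is overwritten below.
        C = np.outer(numbers, numbers) / dist ** alpha
    np.fill_diagonal(C, 0.5 * numbers ** 2.4)
    return C


numbers = [1, 8]
coords = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
C = coulomb_matrix_sketch(numbers, coords)
```

For this two-atom example the off-diagonal entries are 1 * 8 / 2 = 4, and the diagonal holds 0.5 * Z^2.4 for each atom.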
-
molml.utils.
get_depth_threshold_mask_connections
(connections, min_depth=0, max_depth=numpy.inf)¶ Get the depth threshold mask from connections.
Parameters: - connections : dict, index->list of indices
A dictionary that contains lists of all connected atoms.
- min_depth : int, default=0
The minimum depth to allow in the masking
- max_depth : int, default=numpy.inf
The maximum depth to allow in the masking
Returns: - mask : numpy.array, shape=(len(connections), len(connections))
A mask of all the atoms that are less than or equal to max_depth away.
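Once graph distances between atoms are known, the thresholding step itself is a pair of comparisons. The sketch below starts from a precomputed distance matrix rather than from `connections`, and treats both bounds as inclusive; the function name `depth_mask_sketch` and the inclusive lower bound are assumptions.

```python
import numpy as np


def depth_mask_sketch(graph_dist, min_depth=0, max_depth=np.inf):
    """True where min_depth <= graph distance <= max_depth."""
    graph_dist = np.asarray(graph_dist, dtype=float)
    return (graph_dist >= min_depth) & (graph_dist <= max_depth)


# Graph distances for a three-atom chain 0 - 1 - 2.
chain_dist = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]])
near = depth_mask_sketch(chain_dist, max_depth=1)
bonded_only = depth_mask_sketch(chain_dist, min_depth=1, max_depth=1)
```

Here `near` keeps each atom and its direct neighbours, while `bonded_only` drops the diagonal and keeps only directly bonded pairs.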
-
molml.utils.
get_dict_func_getter
(d, label='')¶
-
molml.utils.
get_element_pairs
(elements)¶ Extract all the element pairs in a molecule.
Parameters: - elements : list
All the elements in the molecule
Returns: - value : list
All the element pairs in the molecule
-
molml.utils.
get_graph_distance
(connections)¶ Compute the graph distance between all pairs of atoms using the Floyd-Warshall algorithm.
Parameters: - connections : dict, index->list of indices
A dictionary that contains lists of all connected atoms.
Returns: - dist : numpy.array, shape=(len(connections), len(connections))
The graph distance between all pairs of atoms
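For reference, a compact Floyd-Warshall pass over an edge table can be written as follows. The sketch assumes the keys of `connections` are the integers 0 to n-1, that every bond counts as distance 1, and that unreachable pairs are left at infinity; `graph_distance_sketch` is a hypothetical name rather than the library's code.

```python
import numpy as np


def graph_distance_sketch(connections):
    """All-pairs shortest path lengths (in bonds) via Floyd-Warshall."""
    n = len(connections)
    dist = np.full((n, n), np.inf)
    np.fill_diagonal(dist, 0.0)
    for i, neighbors in connections.items():
        for j in neighbors:
            dist[i, j] = 1.0  # every bond has unit length
    for k in range(n):
        # Allow paths that pass through atom k as an intermediate stop.
        dist = np.minimum(dist, dist[:, k, None] + dist[None, k, :])
    return dist


# A three-atom chain 0 - 1 - 2 plus an isolated atom 3.
connections = {0: [1], 1: [0, 2], 2: [1], 3: []}
dist = graph_distance_sketch(connections)
```

The two ends of the chain come out two bonds apart, and the isolated atom stays at infinite distance from everything else.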
-
molml.utils.
get_index_mapping
(values, depth, add_unknown)¶ Determine the ordering and mapping of feature groups.
Parameters: - values : list
A list of possible values.
- depth : int
The number of elements to use from each value in values.
- add_unknown : bool
Whether or not to include an extra collector for unknown values.
Returns: - map_func : function(key)->int
A function that gives the mapping index for a given key.
- length : int
The length of the mapping values.
- both : bool
Indicates whether both values are needed in a loop (A, B) vs (B, A).
-
molml.utils.
get_smoothing_function
(key)¶
-
molml.utils.
get_spacing_function
(key)¶
-
molml.utils.
lerp_smooth
(x)¶
-
molml.utils.
load_json
(f)¶ Load the model data from a json file
Parameters: - f : str or file descriptor
The path to load the data from, or a file descriptor to read it from.
Returns: - obj : Transformer
The transformer object.
-
molml.utils.
multi_beta
(f)¶
-
molml.utils.
needs_reversal
(chain)¶ Determine if the chain needs to be reversed.
This ensures that the chains are in a canonical ordering.
Parameters: - chain : tuple
A tuple of elements to treat as a chain
Returns: - needs_flip : bool
Whether or not the chain needs to be reversed
-
molml.utils.
sort_chain
(chain)¶ Sort a chain from the inside out.
Parameters: - chain : tuple
A tuple of elements to treat as a chain
Returns: - chain : tuple
The sorted chain