sgdml.utils package

Submodules

sgdml.utils.desc module

sgdml.utils.desc.init(n_atoms)
sgdml.utils.desc.pbc_diff(u, v, size)

Compute the difference of two vectors while applying the minimum-image convention as a periodic boundary condition.

Parameters:
  • u (numpy.ndarray) – First vector.
  • v (numpy.ndarray) – Second vector.
  • size (float) – Edge length of (cubic) unit cell.
Returns:

Difference between two vectors u - v.

Return type:

numpy.ndarray
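
The minimum-image convention can be illustrated with a short sketch. This is not the library's implementation: the function name min_image_diff is hypothetical, and the code assumes a cubic cell with edge length size, as in the parameter description above.

```python
import numpy as np


def min_image_diff(u, v, size):
    """Difference u - v, wrapped to the nearest periodic image (cubic cell)."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Shift each component by a whole number of cell lengths so that it
    # ends up in the interval [-size/2, size/2].
    return diff - size * np.rint(diff / size)


u = np.array([0.9, 0.0, 0.0])
v = np.array([0.1, 0.0, 0.0])
wrapped = min_image_diff(u, v, size=1.0)
```

In this example the raw difference along x is 0.8, but the nearest periodic image of v lies on the other side of u, so the wrapped x component has magnitude 0.2 and a negative sign.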

sgdml.utils.desc.perm(perm)

Convert atom permutation to descriptor permutation.

A permutation of N atoms is converted to a permutation that acts on the corresponding descriptor representation. Applying the converted permutation to a descriptor is equivalent to permuting the atoms first and then generating the descriptor.

Parameters: perm (numpy.ndarray) – Array of size N containing the atom permutation.
Returns: Array of size N(N-1)/2 containing the corresponding descriptor permutation.
Return type: numpy.ndarray
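
The equivalence stated above can be checked with a small sketch. It rests on two assumptions that the text does not confirm: descriptor entries are ordered like the strict lower triangle of the N x N pair matrix, and the descriptor is the inverse pairwise distance. The names atom_perm_to_desc_perm and toy_desc are hypothetical.

```python
import numpy as np


def toy_desc(r):
    """Toy descriptor (assumption): inverse distance for each pair i > j."""
    rows, cols = np.tril_indices(len(r), -1)
    return 1.0 / np.linalg.norm(r[rows] - r[cols], axis=1)


def atom_perm_to_desc_perm(perm):
    """Map a permutation of N atoms to a permutation of the N(N-1)/2 pairs."""
    perm = np.asarray(perm)
    n = len(perm)
    rows, cols = np.tril_indices(n, -1)
    # Symmetric lookup table: position of the unordered pair (i, j)
    # inside the flattened descriptor.
    pair_idx = np.zeros((n, n), dtype=int)
    pair_idx[rows, cols] = np.arange(len(rows))
    pair_idx[cols, rows] = pair_idx[rows, cols]
    # Pair (i, j) of the permuted molecule is pair (perm[i], perm[j])
    # of the original one.
    return pair_idx[perm[rows], perm[cols]]


rng = np.random.default_rng(0)
r = rng.normal(size=(4, 3))       # 4 atoms, one row per atom
perm = np.array([2, 0, 3, 1])     # atom permutation
desc_perm = atom_perm_to_desc_perm(perm)

permute_then_describe = toy_desc(r[perm])
describe_then_permute = toy_desc(r)[desc_perm]
```

Under these assumptions, both orders of operation give the same descriptor, which is the property the converted permutation is meant to have.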
sgdml.utils.desc.r_to_d_desc(r, pdist, ucell_size=None)

Generate the Jacobian of the descriptor for a set of atom positions in Cartesian coordinates. This method can apply the minimum-image convention as a periodic boundary condition for distances between atoms, given the edge length of the (cubic) unit cell.

Parameters:
  • r (numpy.ndarray) – Array of size 1 x 3N containing the Cartesian coordinates of each atom.
  • pdist (numpy.ndarray) – Array of size N x N containing the Euclidean distance (2-norm) for each pair of atoms.
  • ucell_size (float, optional) – Edge length of the (cubic) unit cell.
Returns:

Array of size N(N-1)/2 x 3N containing all partial derivatives of the descriptor.

Return type:

numpy.ndarray

sgdml.utils.desc.r_to_d_desc_op(r, pdist, F_d, ucell_size=None)

Compute vector-matrix product with descriptor Jacobian.

The descriptor Jacobian is generated and applied directly, without storing it. This method can apply the minimum-image convention as a periodic boundary condition for distances between atoms, given the edge length of the (cubic) unit cell.

Parameters:
  • r (numpy.ndarray) – Array of size 1 x 3N containing the Cartesian coordinates of each atom.
  • pdist (numpy.ndarray) – Array of size N x N containing the Euclidean distance (2-norm) for each pair of atoms.
  • F_d (numpy.ndarray) – Array of size N(N-1)/2.
  • ucell_size (float, optional) – Edge length of the (cubic) unit cell.
Returns:

Array of size 3N containing the dot product of F_d and the descriptor Jacobian.

Return type:

numpy.ndarray
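
The idea of applying the Jacobian without storing it can be sketched as follows. This is an illustration, not the library's code: it assumes an inverse-distance descriptor with lower-triangle pair ordering, ignores periodic boundary conditions, and uses the hypothetical names explicit_jacobian and jacobian_vec_product.

```python
import numpy as np


def explicit_jacobian(r):
    """Full N(N-1)/2 x 3N Jacobian of d_k = 1 / |r_i - r_j| (pairs i > j)."""
    n = len(r)
    rows, cols = np.tril_indices(n, -1)
    diff = r[rows] - r[cols]
    dist = np.linalg.norm(diff, axis=1)
    grad_i = -diff / dist[:, None] ** 3       # derivative w.r.t. atom i
    jac = np.zeros((len(rows), n, 3))
    k = np.arange(len(rows))
    jac[k, rows] = grad_i
    jac[k, cols] = -grad_i                    # derivative w.r.t. atom j
    return jac.reshape(len(rows), 3 * n)


def jacobian_vec_product(r, f_d):
    """Compute f_d @ J by accumulating per-pair terms; J is never built."""
    n = len(r)
    rows, cols = np.tril_indices(n, -1)
    diff = r[rows] - r[cols]
    dist = np.linalg.norm(diff, axis=1)
    contrib = -(f_d / dist ** 3)[:, None] * diff
    out = np.zeros((n, 3))
    np.add.at(out, rows, contrib)             # contribution to atom i
    np.add.at(out, cols, -contrib)            # contribution to atom j
    return out.ravel()


rng = np.random.default_rng(1)
r = rng.normal(size=(5, 3))                   # 5 atoms
f_d = rng.normal(size=10)                     # one entry per atom pair
via_matrix = f_d @ explicit_jacobian(r)
via_operator = jacobian_vec_product(r, f_d)
```

Both routes give the same vector of size 3N. The operator form only keeps arrays with one row per atom pair, rather than the full N(N-1)/2 x 3N matrix.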

sgdml.utils.desc.r_to_desc(r, pdist)

Generate descriptor for a set of atom positions in Cartesian coordinates.

Parameters:
  • r (numpy.ndarray) – Array of size 3N containing the Cartesian coordinates of each atom.
  • pdist (numpy.ndarray) – Array of size N x N containing the Euclidean distance (2-norm) for each pair of atoms.
Returns:

Descriptor representation as a 1D array of size N(N-1)/2.

Return type:

numpy.ndarray
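
A minimal sketch of how an N x N distance matrix reduces to a 1D array of size N(N-1)/2. The inverse-distance form and the lower-triangle ordering are assumptions made for illustration, and desc_from_pdist is a hypothetical name.

```python
import numpy as np


def desc_from_pdist(pdist):
    """Keep each atom pair once (strict lower triangle) and invert it."""
    rows, cols = np.tril_indices(pdist.shape[0], -1)
    return 1.0 / pdist[rows, cols]


# Three atoms, one row per atom.
r = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
# Pairwise Euclidean distances (N x N, zeros on the diagonal).
pdist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
desc = desc_from_pdist(pdist)
```

For three atoms the result has 3(3-1)/2 = 3 entries, one per pair; the zero diagonal of the distance matrix is never touched.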

sgdml.utils.io module

sgdml.utils.io.dataset_md5(dataset)
sgdml.utils.io.generate_xyz_str(r, z, e=None, f=None, lattice=None)
sgdml.utils.io.model_file_name(task_or_model, is_extended=False)
sgdml.utils.io.read_xyz(file_path)
sgdml.utils.io.task_file_name(task)
sgdml.utils.io.train_dir_name(dataset, n_train, use_sym, use_cprsn, use_E, use_E_cstr)
sgdml.utils.io.write_geometry(filename, r, z, comment_str='')
sgdml.utils.io.z_str_to_z(z_str)
sgdml.utils.io.z_to_z_str(z)

sgdml.utils.perm module

sgdml.utils.perm.complete_group(perms)
sgdml.utils.perm.inv_perm(perm)
sgdml.utils.perm.share_array(arr_np, typecode)
sgdml.utils.perm.sync_mat(R, z, max_processes=None)

sgdml.utils.ui module

sgdml.utils.ui.fail_str(str)
sgdml.utils.ui.filter_file_type(dir, type, md5_match=None)
sgdml.utils.ui.gray_str(str)
sgdml.utils.ui.green_back_str(str)
sgdml.utils.ui.info_str(str)
sgdml.utils.ui.is_dir_with_file_type(arg, type, or_file=False)

Validate directory path and check if it contains files of the specified type.

Parameters:
  • arg (str) – File path.
  • type ({‘dataset’, ‘task’, ‘model’}) – Possible file types.
  • or_file (bool) – If arg contains a file path, act like it’s a directory with just a single file inside.
Returns:

Tuple of directory path (as provided) and a list of contained file names of the specified type.

Return type:

(str, list of str)

Raises:
  • ArgumentTypeError – If the provided directory path does not lead to a directory.
  • ArgumentTypeError – If directory contains unreadable files.
  • ArgumentTypeError – If directory contains no files of the specified type.
sgdml.utils.ui.is_file_type(arg, type)

Validate file path and check if the file is of the specified type.

Parameters:
  • arg (str) – File path.
  • type ({‘dataset’, ‘task’, ‘model’}) – Possible file types.
Returns:

Tuple of file path (as provided) and data stored in the file. The returned instance of NpzFile class must be closed to avoid leaking file descriptors.

Return type:

(str, dict)

Raises:
  • ArgumentTypeError – If the provided file path does not lead to a NpzFile.
  • ArgumentTypeError – If the file is not readable.
  • ArgumentTypeError – If the file is of wrong type.
  • ArgumentTypeError – If path/fingerprint is provided, but the path is not valid.
  • ArgumentTypeError – If fingerprint could not be resolved.
  • ArgumentTypeError – If multiple files with the same fingerprint exist.
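
The note about closing the NpzFile can be illustrated with NumPy alone. The sketch below is not the library's validation logic: the helper load_npz_of_type is hypothetical, and the 'type' entry used to tag the file is an assumption made for the example.

```python
import os
import tempfile

import numpy as np


def load_npz_of_type(path, expected_type):
    """Load all arrays from an .npz file and check a type tag."""
    # The context manager closes the underlying file handle on exit,
    # so no file descriptor is leaked.
    with np.load(path, allow_pickle=False) as npz:
        data = {key: npz[key] for key in npz.files}
    # Assumption: the file stores its type in a 'type' entry.
    file_type = data['type'].item() if 'type' in data else None
    if file_type != expected_type:
        raise ValueError('{} is not a {} file'.format(path, expected_type))
    return path, data


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.npz')
    np.savez(demo_path, type=np.array('dataset'), z=np.array([8, 1, 1]))
    loaded_path, loaded = load_npz_of_type(demo_path, 'dataset')
```

Copying the arrays into a plain dict inside the with block means the data stays usable after the file has been closed.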
sgdml.utils.ui.is_lattice_supported(lat)
sgdml.utils.ui.is_strict_pos_int(arg)

Validate strictly positive integer input.

Parameters: arg (str) – Integer as string.
Returns: Parsed integer.
Return type: int
Raises: ArgumentTypeError – If integer is not > 0.
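
A validator of this kind is typically passed to argparse as a type callable. The following is a minimal sketch of that pattern, not the library's code; the names strict_pos_int and --n_train are chosen for illustration.

```python
import argparse


def strict_pos_int(arg):
    """Parse a string as an integer and require it to be > 0."""
    value = int(arg)  # a ValueError here is also reported by argparse
    if value <= 0:
        raise argparse.ArgumentTypeError('{} is not > 0'.format(arg))
    return value


parser = argparse.ArgumentParser()
parser.add_argument('--n_train', type=strict_pos_int)
args = parser.parse_args(['--n_train', '200'])
```

With this setup, argparse calls the validator on the raw string and turns an ArgumentTypeError into a usage error message for the user.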
sgdml.utils.ui.is_task_dir_resumeable(train_dir, train_dataset, test_dataset, n_train, n_test, sigs, gdml)

Check if a directory contains task and/or model files that match the configuration of a training process specified in the remaining arguments.

Check whether the training and test datasets in each task match train_dataset and test_dataset, whether the number of training and test points matches, and whether the choices for the kernel hyper-parameter \(\sigma\) are contained in the list sigs. Also check whether the existing tasks/models contain symmetries and whether that is consistent with the flag gdml. This function is useful for determining whether a training process can be resumed using the existing files.

Parameters:
  • train_dir (str) – Path to training directory.
  • train_dataset (dataset) – Dataset from which training points are sampled.
  • test_dataset (dataset) – Dataset from which test points are sampled (may be the same as train_dataset).
  • n_train (int) – Number of training points to sample.
  • n_test (int) – Number of test points to sample.
  • sigs (list of int) – List of \(\sigma\) kernel hyper-parameter choices (usually the hyper-parameter search grid).
  • gdml (bool) – If True, don’t include any symmetries in model (GDML), otherwise do (sGDML).
Returns:

False, if any of the files in the directory do not match the training configuration.

Return type:

bool

sgdml.utils.ui.is_valid_file_type(arg_in)
sgdml.utils.ui.parse_list_or_range(arg)

Parses a string that represents either an integer or a range in the notation <start>:<step>:<stop>.

Parameters: arg (str) – Integer or range string.
Returns: Parsed integer or list of integers.
Return type: int or list of int
Raises: ArgumentTypeError – If input can neither be interpreted as an integer nor a valid range.
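
The notation can be made concrete with a small parser sketch. This is an illustration under stated assumptions, not the library's implementation: the stop value is treated as inclusive, only non-negative integers are accepted, and parse_int_or_range is a hypothetical name.

```python
import argparse
import re


def parse_int_or_range(arg):
    """Parse '<int>' or '<start>:<step>:<stop>' (stop assumed inclusive)."""
    if re.fullmatch(r'\d+', arg):
        return int(arg)
    match = re.fullmatch(r'(\d+):(\d+):(\d+)', arg)
    if match:
        start, step, stop = map(int, match.groups())
        if step > 0:
            values = list(range(start, stop + 1, step))
            if values:
                return values
    raise argparse.ArgumentTypeError(
        '{} is neither an integer nor a valid range'.format(arg)
    )


single = parse_int_or_range('7')
grid = parse_int_or_range('10:10:50')
```

A range such as 10:10:50 then expands to a list that can serve as a hyper-parameter search grid, while a plain integer is returned unchanged.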
sgdml.utils.ui.pass_str(str)
sgdml.utils.ui.progr_bar(current, total, disp_str='', sec_disp_str=None)

Print progress bar.

Example: [ 45%] Task description (secondary string)

Parameters:
  • current (int) – Number of items already processed.
  • total (int) – Total number of items.
  • disp_str (str, optional) – Task description.
  • sec_disp_str (str, optional) – Additional string shown in gray.
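
The layout of the example line can be reproduced with plain string formatting. The sketch below only builds the text and returns it; it does not print in place or apply the gray color. Whether the parentheses are added by the function or supplied by the caller is an assumption, and progress_line is a hypothetical name.

```python
def progress_line(current, total, disp_str='', sec_disp_str=None):
    """Build a line such as '[ 45%] Task description (secondary string)'."""
    percent = 100 * current // total          # integer percentage
    line = '[{:3d}%] {}'.format(percent, disp_str)
    if sec_disp_str is not None:
        # Assumption: the secondary string is wrapped in parentheses here.
        line += ' ({})'.format(sec_disp_str)
    return line


line = progress_line(45, 100, 'Task description', 'secondary string')
```

The fixed width of three characters for the percentage keeps the task description aligned as the value grows from one to three digits.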
sgdml.utils.ui.progr_toggle(is_done, disp_str='', sec_disp_str=None)

Print progress toggle.

Example (not done): [ .. ] Task description (secondary string)

Example (done): [DONE] Task description (secondary string)

Parameters:
  • is_done (bool) – Whether the task is done.
  • disp_str (str, optional) – Task description.
  • sec_disp_str (str, optional) – Additional string shown in gray.
sgdml.utils.ui.underline_str(str)
sgdml.utils.ui.warn_str(str)
sgdml.utils.ui.white_back_str(str)
sgdml.utils.ui.white_bold_str(str)
sgdml.utils.ui.yes_or_no(question)

Ask for yes/no user input on a question.

Any response besides y yields a negative answer.

Parameters: question (str) – User question.
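
The rule that only y counts as a positive answer can be sketched as below. This is not the library's code: the prompt suffix, the whitespace stripping, and the injectable ask parameter (added so the function can be exercised without a terminal) are all assumptions, and ask_yes_or_no is a hypothetical name.

```python
def ask_yes_or_no(question, ask=input):
    """Return True only if the reply is exactly 'y'."""
    reply = ask('{} (y/n): '.format(question))
    # Anything else, including an empty reply, counts as "no".
    return reply.strip() == 'y'


# Scripted replies stand in for interactive input.
confirmed = ask_yes_or_no('Overwrite existing model?', ask=lambda prompt: 'y')
declined = ask_yes_or_no('Overwrite existing model?', ask=lambda prompt: '')
```

Defaulting to a negative answer is the conservative choice for prompts that guard destructive actions such as overwriting files.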

Module contents