# sgdml.utils package

## sgdml.utils.desc module

sgdml.utils.desc.init(n_atoms)
sgdml.utils.desc.pbc_diff(u, v, size)

Compute the difference of two vectors while applying the minimum-image convention as a periodic boundary condition.

Parameters:

- u (numpy.ndarray) – First vector.
- v (numpy.ndarray) – Second vector.
- size (float) – Edge length of (cubic) unit cell.

Returns: Difference between two vectors u - v.

Return type: numpy.ndarray

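
The minimum-image convention can be illustrated with a short NumPy sketch. This is not the library's implementation: `pbc_diff_sketch` is a hypothetical name, and the rounding-based wrap is one common way to apply the convention in a cubic cell.

```python
import numpy as np


def pbc_diff_sketch(u, v, size):
    """Difference u - v under the minimum-image convention (cubic cell)."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Subtract the nearest whole number of cell lengths from each component,
    # which maps every component into [-size / 2, size / 2].
    return diff - size * np.rint(diff / size)


u = np.array([0.9, 0.0, 0.0])
v = np.array([0.1, 0.0, 0.0])
# The plain difference is 0.8 along x; the nearest periodic image is closer.
wrapped = pbc_diff_sketch(u, v, size=1.0)
```

With a unit cell of edge length 1.0, the x component wraps to roughly -0.2 because the periodic image of `v` on the other side of `u` is closer than `v` itself.
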
sgdml.utils.desc.perm(perm)

Convert atom permutation to descriptor permutation.

A permutation of N atoms is converted to a permutation that acts on the corresponding descriptor representation. Applying the converted permutation to a descriptor is equivalent to permuting the atoms first and then generating the descriptor.

Parameters:

- perm (numpy.ndarray) – Array of size N containing the atom permutation.

Returns: Array of size N(N-1)/2 containing the corresponding descriptor permutation.

Return type: numpy.ndarray

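
The equivalence described above can be checked with a small sketch. It rests on two assumptions that this page does not state: the descriptor holds one inverse distance per atom pair, and the pairs follow the lower-triangular order of `numpy.tril_indices`. The names `inv_dist_desc` and `desc_perm_sketch` are hypothetical.

```python
import numpy as np


def inv_dist_desc(r):
    """Illustrative descriptor: inverse distances in lower-triangular pair order."""
    n = r.shape[0]
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    return 1.0 / dist[np.tril_indices(n, k=-1)]


def desc_perm_sketch(atom_perm):
    """Map a permutation of N atoms to a permutation of N(N-1)/2 pair entries."""
    atom_perm = np.asarray(atom_perm)
    n = len(atom_perm)
    rows, cols = np.tril_indices(n, k=-1)
    # Symmetric lookup table: atom pair (i, j) -> position in the descriptor.
    pair_idx = np.zeros((n, n), dtype=int)
    pair_idx[rows, cols] = np.arange(len(rows))
    pair_idx[cols, rows] = pair_idx[rows, cols]
    # Permute the atoms in the lookup table, then read it in descriptor order.
    return pair_idx[np.ix_(atom_perm, atom_perm)][rows, cols]


r = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
atom_perm = np.array([2, 0, 3, 1])
d_perm = desc_perm_sketch(atom_perm)

# Permuting the descriptor matches building it from the permuted atoms.
same = np.allclose(inv_dist_desc(r)[d_perm], inv_dist_desc(r[atom_perm]))
```
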
sgdml.utils.desc.r_to_d_desc(r, pdist, ucell_size=None)

Generate the Jacobian of the descriptor for a set of atom positions in Cartesian coordinates. This method can apply the minimum-image convention as a periodic boundary condition for distances between atoms, given the edge length of the (cubic) unit cell.

Parameters:

- r (numpy.ndarray) – Array of size 1 x 3N containing the Cartesian coordinates of each atom.
- pdist (numpy.ndarray) – Array of size N x N containing the Euclidean distance (2-norm) for each pair of atoms.
- ucell_size (float, optional) – Edge length of the (cubic) unit cell.

Returns: Array of size N(N-1)/2 x 3N containing all partial derivatives of the descriptor.

Return type: numpy.ndarray

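
As a rough illustration of the output shape and content, the sketch below builds such a Jacobian under the assumption (not stated on this page) that each descriptor entry is the inverse distance of one atom pair, ordered as in `numpy.tril_indices`. The function name is hypothetical and periodic boundary conditions are left out.

```python
import numpy as np


def desc_jacobian_sketch(r, pdist_mat):
    """Jacobian of an inverse-distance descriptor, shape (N(N-1)/2, 3N)."""
    r = np.asarray(r, dtype=float).reshape(-1, 3)
    n = r.shape[0]
    rows, cols = np.tril_indices(n, k=-1)
    # d(1/d_ij)/d r_i = -(r_i - r_j) / d_ij**3; the sign flips for r_j.
    grad = -(r[rows] - r[cols]) / pdist_mat[rows, cols][:, None] ** 3
    jac = np.zeros((len(rows), n, 3))
    k = np.arange(len(rows))
    jac[k, rows] = grad
    jac[k, cols] = -grad
    return jac.reshape(len(rows), 3 * n)


r = np.array([[0.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 2.0, 0.0, 0.3, 0.4, 1.7]])
pos = r.reshape(-1, 3)
pdist_mat = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
jac = desc_jacobian_sketch(r, pdist_mat)  # 6 pair entries x 12 coordinates
```

Each row has only six non-zero entries, since an inverse distance depends on the coordinates of just the two atoms in that pair.
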
sgdml.utils.desc.r_to_d_desc_op(r, pdist, F_d, ucell_size=None)

Compute vector-matrix product with descriptor Jacobian.

The descriptor Jacobian is generated and applied directly, without storing it. This method can apply the minimum-image convention as a periodic boundary condition for distances between atoms, given the edge length of the (cubic) unit cell.

Parameters:

- r (numpy.ndarray) – Array of size 1 x 3N containing the Cartesian coordinates of each atom.
- pdist (numpy.ndarray) – Array of size N x N containing the Euclidean distance (2-norm) for each pair of atoms.
- F_d (numpy.ndarray) – Array of size N(N-1)/2.
- ucell_size (float, optional) – Edge length of the (cubic) unit cell.

Returns: Array of size 3N containing the dot product of F_d and the descriptor Jacobian.

Return type: numpy.ndarray

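
The idea of applying the Jacobian without storing it can be sketched as follows. As before, this assumes an inverse-distance descriptor in `numpy.tril_indices` order, which this page does not confirm; the function name is hypothetical and periodic boundary conditions are omitted.

```python
import numpy as np


def desc_jacobian_vjp_sketch(r, pdist_mat, F_d):
    """Product F_d @ J for an inverse-distance descriptor, without forming J."""
    r = np.asarray(r, dtype=float).reshape(-1, 3)
    n = r.shape[0]
    rows, cols = np.tril_indices(n, k=-1)
    # Weighted gradient of each pair entry with respect to its first atom.
    contrib = -F_d[:, None] * (r[rows] - r[cols]) / pdist_mat[rows, cols][:, None] ** 3
    out = np.zeros((n, 3))
    # Scatter-add: every pair touches exactly two atoms, with opposite signs.
    np.add.at(out, rows, contrib)
    np.add.at(out, cols, -contrib)
    return out.ravel()


r = np.array([[0.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 2.0, 0.0, 0.3, 0.4, 1.7]])
pos = r.reshape(-1, 3)
pdist_mat = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
F_d = np.arange(1.0, 7.0)  # one weight per atom pair
out = desc_jacobian_vjp_sketch(r, pdist_mat, F_d)  # length 3N = 12
```

The temporary arrays here scale with the number of pairs times three, rather than with the full N(N-1)/2 x 3N Jacobian.
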
sgdml.utils.desc.r_to_desc(r, pdist)

Generate descriptor for a set of atom positions in Cartesian coordinates.

Parameters:

- r (numpy.ndarray) – Array of size 3N containing the Cartesian coordinates of each atom.
- pdist (numpy.ndarray) – Array of size N x N containing the Euclidean distance (2-norm) for each pair of atoms.

Returns: Descriptor representation as 1D array of size N(N-1)/2.

Return type: numpy.ndarray
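
A descriptor of size N(N-1)/2 holds one entry per unordered atom pair. The sketch below assumes, without confirmation from this page, that each entry is the inverse of the pair distance and that pairs follow the order of `numpy.tril_indices`; the function name is hypothetical.

```python
import numpy as np


def r_to_desc_sketch(pdist_mat):
    """Flatten the strict lower triangle of a distance matrix into 1 / d_ij."""
    n = pdist_mat.shape[0]
    return 1.0 / pdist_mat[np.tril_indices(n, k=-1)]


pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
pdist_mat = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
desc = r_to_desc_sketch(pdist_mat)  # pairs (1, 0), (2, 0), (2, 1)
```

For these three atoms the pair distances are 1, 2 and sqrt(5), so the descriptor has three entries.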

## sgdml.utils.io module

sgdml.utils.io.dataset_md5(dataset)
sgdml.utils.io.generate_xyz_str(r, z, e=None, f=None, lattice=None)
sgdml.utils.io.model_file_name(task_or_model, is_extended=False)
sgdml.utils.io.read_xyz(file_path)
sgdml.utils.io.task_file_name(task)
sgdml.utils.io.train_dir_name(dataset, n_train, use_sym, use_cprsn, use_E, use_E_cstr)
sgdml.utils.io.write_geometry(filename, r, z, comment_str='')
sgdml.utils.io.z_str_to_z(z_str)
sgdml.utils.io.z_to_z_str(z)

## sgdml.utils.perm module

sgdml.utils.perm.complete_group(perms)
sgdml.utils.perm.inv_perm(perm)
sgdml.utils.perm.share_array(arr_np, typecode)
sgdml.utils.perm.sync_mat(R, z, max_processes=None)

## sgdml.utils.ui module

sgdml.utils.ui.fail_str(str)
sgdml.utils.ui.filter_file_type(dir, type, md5_match=None)
sgdml.utils.ui.gray_str(str)
sgdml.utils.ui.green_back_str(str)
sgdml.utils.ui.info_str(str)
sgdml.utils.ui.is_dir_with_file_type(arg, type, or_file=False)

Validate directory path and check if it contains files of the specified type.

Parameters:

- arg (str) – File path.
- type ({‘dataset’, ‘task’, ‘model’}) – Possible file types.
- or_file (bool) – If arg contains a file path, act like it’s a directory with just a single file inside.

Returns: Tuple of directory path (as provided) and a list of contained file names of the specified type.

Return type: (str, list of str)

Raises:

- ArgumentTypeError – If the provided directory path does not lead to a directory.
- ArgumentTypeError – If directory contains unreadable files.
- ArgumentTypeError – If directory contains no files of the specified type.

sgdml.utils.ui.is_file_type(arg, type)

Validate file path and check if the file is of the specified type.

Parameters:

- arg (str) – File path.
- type ({‘dataset’, ‘task’, ‘model’}) – Possible file types.

Returns: Tuple of file path (as provided) and data stored in the file. The returned instance of NpzFile class must be closed to avoid leaking file descriptors.

Return type: (str, dict)

Raises:

- ArgumentTypeError – If the provided file path does not lead to a NpzFile.
- ArgumentTypeError – If the file is not readable.
- ArgumentTypeError – If the file is of wrong type.
- ArgumentTypeError – If path/fingerprint is provided, but the path is not valid.
- ArgumentTypeError – If fingerprint could not be resolved.
- ArgumentTypeError – If multiple files with the same fingerprint exist.

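
The note about closing the NpzFile can be illustrated with plain NumPy. The sketch does not call `is_file_type`; it writes a throwaway archive whose keys (`type`, `R`) are made up for the example, and reads it back through a context manager so the file handle is released.

```python
import os
import tempfile

import numpy as np

# Write a small archive to a temporary location (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), 'example.npz')
np.savez(path, type='d', R=np.zeros((2, 3, 3)))

# NpzFile supports the context-manager protocol; leaving the block closes it.
with np.load(path) as data:
    file_type = str(data['type'])
    n_geometries = data['R'].shape[0]
```

When the NpzFile is obtained from another function instead, calling its `close()` method once the data has been copied out has the same effect.
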
sgdml.utils.ui.is_lattice_supported(lat)
sgdml.utils.ui.is_strict_pos_int(arg)

Validate strictly positive integer input.

Parameters:

- arg (str) – Integer as string.

Returns: Parsed integer.

Return type: int

Raises:

- ArgumentTypeError – If integer is not > 0.

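
Validators of this shape are typically passed to `argparse` as a `type=` callable. The sketch below is a stand-in rather than the library function: `strict_pos_int_sketch`, the argument name `n_train` and the error message are assumptions made for the example.

```python
import argparse


def strict_pos_int_sketch(arg):
    """Parse a string as an integer and reject values that are not > 0."""
    value = int(arg)  # a non-numeric string raises ValueError here
    if value <= 0:
        raise argparse.ArgumentTypeError('must be an integer > 0')
    return value


parser = argparse.ArgumentParser()
parser.add_argument('n_train', type=strict_pos_int_sketch)
args = parser.parse_args(['200'])  # args.n_train is the integer 200
```

`argparse` turns both `ArgumentTypeError` and `ValueError` raised by a `type=` callable into a usage error, so invalid input stops the command-line program with a message instead of a traceback.
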
sgdml.utils.ui.is_task_dir_resumeable(train_dir, train_dataset, test_dataset, n_train, n_test, sigs, gdml)

Check if a directory contains task and/or model files that match the configuration of a training process specified in the remaining arguments.

Check whether the training and test datasets in each task match train_dataset and test_dataset, whether the numbers of training and test points match, and whether the choices for the kernel hyper-parameter $$\sigma$$ are contained in the list sigs. Also check whether the existing tasks/models contain symmetries and whether that is consistent with the gdml flag. This function is useful for determining whether a training process can be resumed using the existing files.

Parameters:

- train_dir (str) – Path to training directory.
- train_dataset (dataset) – Dataset from which training points are sampled.
- test_dataset (test_dataset) – Dataset from which test points are sampled (may be the same as train_dataset).
- n_train (int) – Number of training points to sample.
- n_test (int) – Number of test points to sample.
- sigs (list of int) – List of $$\sigma$$ kernel hyper-parameter choices (usually: the hyper-parameter search grid).
- gdml (bool) – If True, don’t include any symmetries in model (GDML), otherwise do (sGDML).

Returns: False, if any of the files in the directory do not match the training configuration.

Return type: bool

sgdml.utils.ui.is_valid_file_type(arg_in)
sgdml.utils.ui.parse_list_or_range(arg)

Parses a string that represents either an integer or a range in the notation <start>:<step>:<stop>.

Parameters:

- arg (str) – Integer or range string.

Return type: int or list of int

Raises:

- ArgumentTypeError – If input can neither be interpreted as an integer nor a valid range.

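
A minimal sketch of such a parser is shown below. It is not the library's code: the function name is hypothetical, and treating `<stop>` as inclusive is an assumption, since the description above does not say whether the end point belongs to the range.

```python
import argparse
import re


def parse_list_or_range_sketch(arg):
    """Return an int for '10' or a list of ints for '<start>:<step>:<stop>'."""
    if re.fullmatch(r'\d+', arg):
        return int(arg)
    match = re.fullmatch(r'(\d+):(\d+):(\d+)', arg)
    if match:
        start, step, stop = (int(group) for group in match.groups())
        if step > 0:
            # Assumption: the stop value is part of the range.
            values = list(range(start, stop + 1, step))
            if values:
                return values
    raise argparse.ArgumentTypeError(
        "'{}' is neither an integer nor a valid range".format(arg)
    )


single = parse_list_or_range_sketch('10')
grid = parse_list_or_range_sketch('2:2:8')
```

Under the inclusive-stop assumption, `'2:2:8'` expands to the values 2, 4, 6 and 8, while a bare `'10'` stays a single integer.
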
sgdml.utils.ui.pass_str(str)
sgdml.utils.ui.progr_bar(current, total, disp_str='', sec_disp_str=None)

Print progress bar.

Example: [ 45%] Task description (secondary string)

Parameters:

- current (int) – How many items already processed?
- total (int) – Total number of items?
- disp_str (str, optional) – Task description.
- sec_disp_str (str, optional) – Additional string shown in gray.

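
The line format in the example above can be reproduced with ordinary string formatting. This sketch only builds the string and does not print or redraw anything; the function name is hypothetical, and using the ANSI escape code 90 for gray text is an assumption about how the secondary string might be colored.

```python
def progr_bar_str_sketch(current, total, disp_str='', sec_disp_str=None):
    """Build a '[ 45%] description' style progress line."""
    percent = int(100 * current / total)
    line = '[{:3d}%] {}'.format(percent, disp_str)
    if sec_disp_str is not None:
        # Assumption: gray is rendered with the ANSI 'bright black' code.
        line += ' \x1b[90m{}\x1b[0m'.format(sec_disp_str)
    return line


line = progr_bar_str_sketch(45, 100, 'Task description')
```

The percentage is right-aligned in a field of width three, so the bracketed part keeps the same width from 0% to 100%.
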
sgdml.utils.ui.progr_toggle(is_done, disp_str='', sec_disp_str=None)

Print progress toggle.

Example (not done): [ .. ] Task description (secondary string)

Example (done): [DONE] Task description (secondary string)

Parameters:

- is_done (bool) – Task done?
- disp_str (str, optional) – Task description.
- sec_disp_str (str, optional) – Additional string shown in gray.

sgdml.utils.ui.underline_str(str)
sgdml.utils.ui.warn_str(str)
sgdml.utils.ui.white_back_str(str)
sgdml.utils.ui.white_bold_str(str)
sgdml.utils.ui.yes_or_no(question)

Ask for yes/no user input on a question.

Any response besides y yields a negative answer.

Parameters:

- question (str) – User question.
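
The rule that only `y` counts as a positive answer can be sketched as below. The function name, the prompt suffix and the injectable `read` callable (used here so the example runs without a terminal) are assumptions, as is ignoring case and surrounding whitespace.

```python
def yes_or_no_sketch(question, read=input):
    """Return True only if the reply is 'y'; everything else counts as no."""
    reply = read(question + ' (y/n): ')
    # Assumption: case and surrounding whitespace are ignored.
    return reply.strip().lower() == 'y'


# Simulated replies instead of real keyboard input.
confirmed = yes_or_no_sketch('Overwrite existing model?', read=lambda prompt: 'y')
declined = yes_or_no_sketch('Overwrite existing model?', read=lambda prompt: 'yes')
```

Treating every other reply, including `yes` or an empty line, as a refusal is the safe default for destructive actions such as overwriting files.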