sitator.site_descriptors package

Submodules

sitator.site_descriptors.SOAP module

class sitator.site_descriptors.SOAP.SOAP(tracer_atomic_number, environment=None, soap_mask=None, backend=None)

Bases: object

Abstract base class for computing SOAP vectors in a SiteNetwork.

SOAP computations are not thread-safe; use one SOAP object per thread.

Parameters:
  • tracer_atomic_number (int) – The atomic number of the tracer.
  • environment (list) – The atomic numbers or atomic symbols of the environment to consider. For example, for Li2CO3, it can be set to ['O'] or [8] for oxygen only, or to ['C', 'O'] / ['C', 8] / [6, 8] if carbon and oxygen together form the environment. Defaults to None, in which case all non-mobile atoms are considered regardless of species.
  • soap_mask – Which atoms in the SiteNetwork’s structure to use in SOAP calculations. Can be either a boolean mask ndarray or a tuple of species. If None, the entire static_structure of the SiteNetwork will be used. Mobile atoms cannot be used for the SOAP host structure. Even if they are not masked out, species not listed in environment are not taken into account. For best performance, set both environment and soap_mask correctly.
  • soap_params (dict) – Any custom SOAP parameters. Defaults to {}.
  • backend (func) – A function that can be called with sn, soap_mask, tracer_atomic_number, environment_list as parameters, returning a function that, given the current soap structure along with tracer atoms, returns SOAP vectors in a numpy array. (i.e. its signature is soap(structure, positions)). The returned function can also have a property, n_dim, giving the length of a single SOAP vector.
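
The backend contract described above can be sketched as follows. This is only an illustration, not sitator's implementation: toy_backend, its distance-histogram "descriptor", the cutoff value, and the representation of structures as lists of (x, y, z) tuples are all hypothetical, and plain Python lists stand in for the numpy array a real backend would return.

```python
import math


def toy_backend(sn, soap_mask, tracer_atomic_number, environment_list):
    """Hypothetical backend factory following the documented call signature.

    The arguments are accepted but ignored here; a real backend would use
    them to configure the SOAP calculation.
    """
    n_dim = 4      # length of one descriptor vector (chosen for illustration)
    cutoff = 4.0   # neighbor cutoff distance (chosen for illustration)

    def soap(structure, positions):
        # Assumption: `structure` is a list of host (x, y, z) positions and
        # `positions` is a list of tracer (x, y, z) positions.
        vectors = []
        for p in positions:
            hist = [0.0] * n_dim
            for h in structure:
                d = math.dist(p, h)
                if d < cutoff:
                    # Count the neighbor in its distance bin.
                    hist[int(d / cutoff * n_dim)] += 1.0
            vectors.append(hist)
        return vectors  # one vector of length n_dim per tracer position

    soap.n_dim = n_dim  # the optional attribute mentioned in the docs
    return soap


host = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
soap = toy_backend(None, None, 3, [8])
vectors = soap(host, [(1.0, 0.0, 0.0)])
```

The point of the factory pattern is that any per-system setup happens once in the outer call, while the returned function is invoked repeatedly with changing structures and tracer positions.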
backend_dscribe()
backend_quip(quip_path='quip')
get_descriptors(stn)

Get the descriptors.

Parameters:stn (SiteTrajectory or SiteNetwork) –
Returns:An array of descriptor vectors and an equal length array of labels indicating which descriptors correspond to which sites.
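
As a rough illustration of the return value described above, the sketch below groups descriptor vectors by their site label. The helper group_by_site and the sample data are hypothetical, and plain lists stand in for the arrays that get_descriptors returns.

```python
from collections import defaultdict


def group_by_site(descriptors, labels):
    """Collect the descriptor vectors belonging to each site label."""
    if len(descriptors) != len(labels):
        raise ValueError("descriptors and labels must have equal length")
    grouped = defaultdict(list)
    for vec, site in zip(descriptors, labels):
        grouped[site].append(vec)
    return dict(grouped)


# Made-up data: three descriptor vectors, two of which belong to site 0.
descriptors = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.3]]
labels = [0, 0, 1]
by_site = group_by_site(descriptors, labels)
```

A site may appear several times in the labels, which is why the two arrays have equal length rather than one entry per site.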
class sitator.site_descriptors.SOAP.SOAPCenters(tracer_atomic_number, environment=None, soap_mask=None, backend=None)

Bases: sitator.site_descriptors.SOAP.SOAP

Compute the SOAPs of the site centers in the fixed host structure.

Requires a SiteNetwork as input.

class sitator.site_descriptors.SOAP.SOAPDescriptorAverages(*args, **kwargs)

Bases: sitator.site_descriptors.SOAP.SOAP

Compute many instantaneous SOAPs for each site, and then average them in SOAP space.

Computes the SOAP descriptors for mobile particles assigned to each site, in the host structure as it was at that moment. Those descriptor vectors are then averaged in SOAP space to give the final SOAP vectors for each site.

This method often performs better than SOAPSampledCenters on more dynamic systems, but requires significantly more computation.

Parameters:
  • stepsize (int) – Stride (in frames) when computing SOAPs. Default 1.
  • averaging (int) – Number of SOAP vectors to average for each output vector.
  • avg_descriptors_per_site (int) – Can be specified instead of averaging. Specifies the average number of averaged SOAP vectors to compute for each site. This does not guarantee that number of SOAP vectors for any given site; rather, it provides a trajectory-size-agnostic way to specify approximately how many descriptors are desired.
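
The interplay of stepsize and averaging can be sketched for the vectors of a single site as below. This is a simplified illustration under stated assumptions, not sitator's code: both helper functions are hypothetical, incomplete trailing groups are simply dropped, and the conversion from avg_descriptors_per_site to an averaging value is one plausible choice that the documentation does not confirm.

```python
def average_descriptors(vectors, stepsize=1, averaging=2):
    """Stride over `vectors`, then average consecutive groups of `averaging`."""
    strided = vectors[::stepsize]
    averaged = []
    # Assumption: a trailing group with fewer than `averaging` vectors is dropped.
    for start in range(0, len(strided) - averaging + 1, averaging):
        chunk = strided[start:start + averaging]
        averaged.append([sum(col) / averaging for col in zip(*chunk)])
    return averaged


def averaging_from_target(n_vectors_total, n_sites, avg_descriptors_per_site):
    """Hypothetical conversion: pick `averaging` so that, on average,
    each site ends up with about `avg_descriptors_per_site` output vectors."""
    return max(1, n_vectors_total // (n_sites * avg_descriptors_per_site))


# Made-up instantaneous SOAP vectors recorded at one site.
site_vectors = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0], [9.0, 8.0]]
result = average_descriptors(site_vectors, stepsize=1, averaging=2)
```

Because the number of vectors recorded at a site depends on how often it was occupied, a fixed averaging value yields different output counts per site, which is why avg_descriptors_per_site is only an average.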
class sitator.site_descriptors.SOAP.SOAPSampledCenters(*args, **kwargs)

Bases: sitator.site_descriptors.SOAP.SOAPCenters

Compute the SOAPs of representative points for each site, as determined by sampling_transform.

Takes either a SiteNetwork or SiteTrajectory as input; requires that sampling_transform produce a SiteNetwork where site_types indicates which site in the original SiteNetwork/SiteTrajectory it was sampled from.

Typical sampling transforms are sitator.misc.NAvgsPerSite (for a SiteTrajectory) and sitator.misc.GenerateAroundSites (for a SiteNetwork).

get_descriptors(stn)

Get the descriptors.

Parameters:stn (SiteTrajectory or SiteNetwork) –
Returns:An array of descriptor vectors and an equal length array of labels indicating which descriptors correspond to which sites.

Module contents

class sitator.site_descriptors.SiteCoordinationEnvironment(guess_ionic_bonds=True, full_chemenv_site_types=False, **kwargs)

Bases: object

Determine site types based on local coordination environments.

Determine site types using the method from the following paper:

David Waroquiers, Xavier Gonze, Gian-Marco Rignanese, Cathrin Welker-Nieuwoudt, Frank Rosowski, Michael Goebel, Stephan Schenk, Peter Degelmann, Rute Andre, Robert Glaum, and Geoffroy Hautier

Statistical analysis of coordination environments in oxides

Chem. Mater., 2017, 29 (19), pp 8346–8360, DOI: 10.1021/acs.chemmater.7b02766

as implemented in pymatgen’s pymatgen.analysis.chemenv.coordination_environments.

Adds three site attributes:
  • coordination_environments: The name of the coordination environment,
    as returned by pymatgen. Example: "T:4" (tetrahedral, coordination of 4).
  • site_type_confidences: The ce_fraction of the best match chemical
    environment (from 0 to 1).
  • coordination_numbers: The coordination number of the site.
Parameters:
  • guess_ionic_bonds (bool) – If True, uses pymatgen’s bond valence analysis to guess valences and considers only ionic bonds for neighbor analysis. Otherwise, or if the analysis fails, all bonds are considered.
  • full_chemenv_site_types (bool) – If True, sitator site types on the final SiteNetwork will be assigned based on unique chemical environments, including shape. If False, they will be assigned solely based on coordination number. Either way, both sets of information are included in the SiteNetwork; this option only changes which one determines the site_types.
  • **kwargs – passed to compute_structure_environments.
run(sn)
Parameters:sn (SiteNetwork) –
Returns:sn, with type information.
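
The effect of full_chemenv_site_types can be illustrated with a small sketch. The function assign_site_types is hypothetical, and numbering types by order of first appearance is an assumption made for the example; only the choice of key (environment name versus coordination number) comes from the description above.

```python
def assign_site_types(coordination_environments, coordination_numbers,
                      full_chemenv_site_types=False):
    """Map each site to an integer type based on the chosen key."""
    if full_chemenv_site_types:
        keys = coordination_environments   # distinguishes shape, e.g. "T:4" vs "S:4"
    else:
        keys = coordination_numbers        # only the number of neighbors matters
    type_ids = {}
    site_types = []
    for key in keys:
        # Assumption: types are numbered in order of first appearance.
        if key not in type_ids:
            type_ids[key] = len(type_ids)
        site_types.append(type_ids[key])
    return site_types


# Made-up sites: two tetrahedral, one square-planar, one octahedral.
envs = ["T:4", "S:4", "O:6", "T:4"]
cns = [4, 4, 6, 4]
by_number = assign_site_types(envs, cns, full_chemenv_site_types=False)
by_shape = assign_site_types(envs, cns, full_chemenv_site_types=True)
```

With coordination numbers alone, the tetrahedral and square-planar sites share a type; with full environments they are separated.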
class sitator.site_descriptors.SiteTypeAnalysis(descriptor, min_pca_variance=0.9, min_pca_dimensions=2, n_site_types_max=20)

Bases: object

Cluster sites into types using a continuous descriptor and Density Peak Clustering.

Computes descriptor vectors, processes them with Principal Component Analysis, and then clusters using Density Peak Clustering.

Parameters:
  • descriptor (object) – Must implement get_descriptors(st|sn), which returns an array of descriptor vectors of dimension (M, n_dim) and an array of length M indicating which descriptor vectors correspond to which sites in site_network (or site_traj.site_network).
  • min_pca_variance (float) – The minimum proportion of the total variance that the taken principal components of the descriptor must explain.
  • min_pca_dimensions (int) – Force taking at least this many principal components.
  • n_site_types_max (int) – Maximum number of clusters. Must be set reasonably for the automatic selection of cluster number to work.
plot_clustering(fig=None, ax=None, **kwargs)
plot_dpc_decision_plot(ax=None, **kwargs)
plot_dpc_delta_density(ax=None, **kwargs)
plot_dvecs(fig=None, ax=None, **kwargs)
plot_voting(fig=None, ax=None, **kwargs)
run(descriptor_input, **kwargs)
Parameters:descriptor_input (SiteNetwork or SiteTrajectory) –
Returns:SiteNetwork
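
How min_pca_variance and min_pca_dimensions might jointly determine the number of retained principal components is sketched below. The function n_components_to_keep is hypothetical, and it assumes the explained-variance ratios are already computed and sorted in decreasing order; the class's actual selection logic may differ.

```python
def n_components_to_keep(explained_variance_ratio,
                         min_pca_variance=0.9, min_pca_dimensions=2):
    """Return how many leading components satisfy both constraints."""
    cumulative = 0.0
    n = 0
    for ratio in explained_variance_ratio:
        # Stop once enough variance is explained AND enough dimensions are kept.
        if cumulative >= min_pca_variance and n >= min_pca_dimensions:
            break
        cumulative += ratio
        n += 1
    return n


# One dominant component: the dimension floor forces a second one.
dominant = n_components_to_keep([0.95, 0.03, 0.02])
# Variance spread out: three components are needed to reach 90%.
spread = n_components_to_keep([0.5, 0.3, 0.15, 0.05])
```

The dimension floor matters when a single component explains almost everything, since clustering in one dimension can hide structure that a second component would reveal.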
class sitator.site_descriptors.SiteVolumes(error_on_insufficient_coord=True)

Bases: object

Compute the volumes of sites.

Parameters:error_on_insufficient_coord (bool) – To compute an ideal site volume (compute_volumes()), at least 4 coordinating atoms (because we are in 3D space) must be specified in vertices. If True, an error will be raised when a site with fewer than four vertices is encountered; if False, a volume of 0 and a surface area of NaN will be returned.
compute_accessable_volumes(st, n_recenterings=8)

Computes the volumes of convex hulls around all positions associated with a site.

Uses the shift-and-wrap trick for dealing with periodicity, so sites that take up the majority of the unit cell may give bogus results.

Adds the accessable_site_volumes attribute to the SiteNetwork.

Parameters:
  • st (SiteTrajectory) –
  • n_recenterings (int) – How many different recenterings to try. The algorithm recenters around n of the points and takes the minimal resulting volume, which handles cases where recentering around a single outlier gives very bad results.
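
The shift-and-wrap idea with several recenterings can be sketched as follows. This is a simplified stand-in, not the method's implementation: it assumes fractional coordinates in a unit cubic cell, uses an axis-aligned bounding box in place of the convex hull, and picks the first n_recenterings points as centers, since how the centers are actually chosen is not stated. All function names are hypothetical.

```python
def bounding_box_volume(points):
    """Volume of the axis-aligned box around `points` (hull stand-in)."""
    volume = 1.0
    for axis in range(3):
        coords = [p[axis] for p in points]
        volume *= max(coords) - min(coords)
    return volume


def shift_and_wrap(frac_points, center):
    """Shift so `center` sits at the middle of the cell, then wrap into [0, 1)."""
    return [tuple((p[axis] - center[axis] + 0.5) % 1.0 for axis in range(3))
            for p in frac_points]


def minimal_volume(frac_points, n_recenterings=8):
    """Try several recenterings and keep the smallest resulting volume."""
    centers = frac_points[:n_recenterings]  # assumption: first n points
    return min(bounding_box_volume(shift_and_wrap(frac_points, c))
               for c in centers)


# Made-up positions of one site that straddles the periodic boundary in x.
points = [(0.95, 0.50, 0.50), (0.05, 0.60, 0.55), (0.98, 0.55, 0.60)]
naive = bounding_box_volume(points)   # spans almost the whole cell in x
best = minimal_volume(points)         # compact after recentering
```

Wrapping only works when the point cloud fits well inside one cell after recentering, which is why sites spanning most of the unit cell can give bogus results.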
compute_volumes(sn)

Computes the volume of the convex hull defined by each site’s static vertices.

Requires vertex information in the SiteNetwork.

Adds the site_volumes and site_surface_areas attributes.

Volumes can be NaN for degenerate hulls/point sets on which QHull fails.

Parameters:sn (SiteNetwork) –
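
The requirement of at least four vertices, and the behavior controlled by error_on_insufficient_coord, can be illustrated for the simplest case, where exactly four vertices form a tetrahedron. The function below is a hypothetical sketch: it handles only that minimal case and does not compute general convex hulls, for which a hull routine such as QHull is needed.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tetrahedron_volume_and_area(vertices, error_on_insufficient_coord=True):
    """Volume and surface area for a site with exactly four vertices."""
    if len(vertices) < 4:
        if error_on_insufficient_coord:
            raise ValueError("at least 4 vertices are needed to enclose a volume in 3D")
        return 0.0, float("nan")
    if len(vertices) > 4:
        raise NotImplementedError("general point sets need a convex hull routine")
    a, b, c, d = vertices
    # Volume from the scalar triple product of three edge vectors.
    volume = abs(_dot(_sub(b, a), _cross(_sub(c, a), _sub(d, a)))) / 6.0
    # Surface area as the sum of the four triangular faces.
    area = 0.0
    for p, q, r in [(a, b, c), (a, b, d), (a, c, d), (b, c, d)]:
        normal = _cross(_sub(q, p), _sub(r, p))
        area += 0.5 * math.sqrt(_dot(normal, normal))
    return volume, area


unit = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
volume, area = tetrahedron_volume_and_area(unit)
too_few = tetrahedron_volume_and_area(unit[:3], error_on_insufficient_coord=False)
```

Three or fewer points always lie in a plane, so they cannot enclose any volume; four is the smallest vertex count that can.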
run(st)

For backwards compatibility.