light_pfp_data.sample.crystal.sample_compress(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, supercell: Tuple[int, int, int] = (1, 1, 1), min_scale: float = 0.95, max_scale: float = 1.05, interval: float = 0.01, opt: bool = False, md: bool = False, sampling_temp: Union[List[float], float] = 500.0, sampling_steps: Union[List[int], int] = 1000, sampling_interval: Union[List[int], int] = 100, timestep: float = 1.0, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Get structures by uniformly compressing and stretching the input structure.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (1, 1, 1).

  • min_scale (float, optional) – Compress the cell length to this ratio at minimum. Defaults to 0.95.

  • max_scale (float, optional) – Stretch the cell length to this ratio at maximum. Defaults to 1.05.

  • interval (float, optional) – Generate structures between min_scale and max_scale with this interval.
    Defaults to 0.01.

  • opt (bool, optional) – Optimize the structure after compression and stretch.
    Defaults to False.

  • md (bool, optional) – Whether to run an MD simulation to collect more structures. Defaults to False.

  • sampling_temp (Union[List[float], float], optional) – MD simulation temperature, in K. Defaults to 500.0.

  • sampling_steps (Union[List[int], int], optional) – Number of MD steps. Defaults to 1000.

  • sampling_interval (Union[List[int], int], optional) – Collect a training structure every this many steps. Defaults to 100.

  • timestep (float, optional) – Time step in fs. Defaults to 1.0.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
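
As a rough illustration of how min_scale, max_scale and interval could define the sampled scale factors, the sketch below builds the grid in plain Python. It assumes that both end points are included and that the factor applies to every cell length, so the volume changes by its cube. The helper name scale_factors is hypothetical and not part of light_pfp_data; the library's own grid may differ.

```python
def scale_factors(min_scale=0.95, max_scale=1.05, interval=0.01):
    """Evenly spaced cell-length scale factors (hypothetical helper).

    Assumption: both end points are included; the library may differ.
    """
    # Count the steps with round() so floating-point drift cannot drop the last point.
    n_steps = round((max_scale - min_scale) / interval)
    return [round(min_scale + i * interval, 10) for i in range(n_steps + 1)]


factors = scale_factors()
# A uniform scale s on every cell length changes the cell volume by s**3.
volume_ratios = [s**3 for s in factors]
print(len(factors), factors[0], factors[-1])
```

With the default arguments this grid spans a volume change of roughly -14% to +16%, which is worth keeping in mind when choosing min_scale and max_scale.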

light_pfp_data.sample.crystal.sample_deformed(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, supercell: Tuple[int, int, int] = (2, 2, 2), min_strain: float = -0.01, max_strain: float = 0.01, interval: float = 0.005, opt: bool = False, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate deformed structures by applying strain to the input structure.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (2, 2, 2).

  • min_strain (float, optional) – Minimum strain applied to the lattice. Defaults to -0.01.

  • max_strain (float, optional) – Maximum strain applied to the lattice. Defaults to 0.01.

  • interval (float, optional) – Generate structures between min_strain and max_strain with this interval.
    Defaults to 0.005.

  • opt (bool, optional) – Optimize the structure after deformation.
    Defaults to False.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
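
To make the strain parameters concrete, the sketch below builds the strain grid implied by min_strain, max_strain and interval, then applies a uniaxial strain along x to a cubic cell as cell · (I + ε). Both helpers are hypothetical, and which strain components the library actually applies is not stated in this reference; uniaxial strain is chosen here only as the simplest case.

```python
def strain_values(min_strain=-0.01, max_strain=0.01, interval=0.005):
    """Evenly spaced strain values, end points included (an assumption)."""
    n_steps = round((max_strain - min_strain) / interval)
    return [round(min_strain + i * interval, 10) for i in range(n_steps + 1)]


def apply_strain(cell, strain):
    """Return cell @ (I + strain) for 3x3 nested lists whose rows are lattice vectors."""
    deform = [
        [(1.0 if i == j else 0.0) + strain[i][j] for j in range(3)]
        for i in range(3)
    ]
    return [
        [sum(cell[r][k] * deform[k][c] for k in range(3)) for c in range(3)]
        for r in range(3)
    ]


cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
strains = strain_values()
# One deformed cell per strain value, stretching or compressing along x only.
deformed_cells = [
    apply_strain(cubic, [[e, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    for e in strains
]
print(strains)
```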

light_pfp_data.sample.crystal.sample_displaced(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, delta: float = 0.1, supercell: Tuple[int, int, int] = (3, 3, 3), n_sample: Optional[int] = None, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate structures by displacing one atom.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • delta (float, optional) – Displacement distance in angstrom. Defaults to 0.1.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (3, 3, 3).

  • n_sample (Optional[int], optional) – The number of structures to be generated.
    If None, each atom will be displaced along x, y and z. Defaults to None.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
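
The n_sample=None case ("each atom will be displaced along x, y and z") can be pictured with the following sketch, which produces three structures per atom. It works on plain coordinate lists rather than ASE Atoms objects, the function name is hypothetical, and it assumes displacement in the positive direction only; the library may also displace in the negative direction or restrict the set of atoms.

```python
def single_atom_displacements(positions, delta=0.1):
    """Return one copy of `positions` per (atom, axis) pair with that coordinate shifted."""
    displaced = []
    for atom_index in range(len(positions)):
        for axis in range(3):
            # Copy so that every structure differs from the input in exactly one coordinate.
            new_positions = [list(p) for p in positions]
            new_positions[atom_index][axis] += delta
            displaced.append(new_positions)
    return displaced


positions = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
structures = single_atom_displacements(positions)
print(len(structures))
```

Under these assumptions the number of structures grows as three times the atom count, so a large supercell without n_sample can produce many calculations.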

light_pfp_data.sample.crystal.sample_md(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, supercell: Tuple[int, int, int] = (3, 3, 3), sampling_temp: Union[List[float], float] = 500.0, sampling_steps: Union[List[int], int] = 10000, sampling_interval: Union[List[int], int] = 100, sampling_pressure: Optional[Union[List[float], float]] = None, timestep: float = 1.0, structure_type: int = 5, ensemble: str = 'npt', show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate structures by running MD simulation.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (3, 3, 3).

  • sampling_temp (Union[List[float], float], optional) – MD simulation temperature, in K. Defaults to 500.0.

  • sampling_steps (Union[List[int], int], optional) – MD steps. Defaults to 10000.

  • sampling_interval (Union[List[int], int], optional) – Collect a training structure every this many steps. Defaults to 100.

  • timestep (float, optional) – Time step in fs. Defaults to 1.0.

  • structure_type (int, optional) – The structure type. Defaults to 5.

  • ensemble (str, optional) – The MD ensemble. Supported values are “nvt” and “npt”. Defaults to “npt”.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
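
The relationship between sampling_steps, sampling_interval and timestep can be estimated with a small calculation. The sketch below assumes that list arguments describe consecutive MD stages matched element by element, with scalars reused for every stage; this reference does not spell out that behavior, so treat it as a guess. The helper md_stages is hypothetical.

```python
def md_stages(sampling_temp=500.0, sampling_steps=10000,
              sampling_interval=100, timestep=1.0):
    """Summarize each MD stage: temperature, collected frames, simulated time."""

    def as_list(value):
        return list(value) if isinstance(value, (list, tuple)) else [value]

    temps = as_list(sampling_temp)
    steps = as_list(sampling_steps)
    intervals = as_list(sampling_interval)
    n_stages = max(len(temps), len(steps), len(intervals))

    def broadcast(values):
        # Assumption: a scalar (length-1 list) is reused for every stage.
        if len(values) == 1:
            return values * n_stages
        if len(values) != n_stages:
            raise ValueError("list arguments must have the same length")
        return values

    return [
        {
            "temperature_K": t,
            "frames": s // i,                  # one frame every `i` steps
            "time_ps": s * timestep / 1000.0,  # timestep is given in fs
        }
        for t, s, i in zip(broadcast(temps), broadcast(steps), broadcast(intervals))
    ]


default_stage = md_stages()[0]
two_stages = md_stages(sampling_temp=[300.0, 600.0])
print(default_stage)
```

With the default arguments this gives about 100 collected structures from 10 ps of simulation; whether the initial frame is also stored is not covered by this estimate.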

light_pfp_data.sample.crystal.sample_rattle(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, stdev: float = 0.1, n_sample: int = 10, supercell: Tuple[int, int, int] = (3, 3, 3), max_forces: Optional[float] = None, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate structures by randomly displacing all atoms.
The random displacement follows a normal distribution.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • stdev (float, optional) – The standard deviation of the normal distribution.
    Defaults to 0.1.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (3, 3, 3).

  • n_sample (int, optional) – The number of structures to be generated. Defaults to 10.

  • max_forces (Optional[float], optional) – The maximum allowed force, in eV/angstrom. Structures whose forces exceed this value are discarded.
    Defaults to None.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
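
The rattling step and the max_forces filter can be sketched in plain Python as follows. Both helpers are hypothetical. The sketch assumes that stdev is in the same length unit as the coordinates and that "maximum force" means the largest per-atom force norm; the library may instead compare individual force components.

```python
import math
import random


def rattle(positions, stdev=0.1, n_sample=10, seed=0):
    """Return n_sample copies of `positions` with Gaussian noise on every coordinate."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        [[x + rng.gauss(0.0, stdev) for x in atom] for atom in positions]
        for _ in range(n_sample)
    ]


def exceeds_max_force(forces, max_forces):
    """True if any per-atom force norm is larger than max_forces (None disables the check)."""
    if max_forces is None:
        return False
    return any(math.sqrt(sum(c * c for c in f)) > max_forces for f in forces)


positions = [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]]
samples = rattle(positions)
print(len(samples), exceeds_max_force([[3.0, 4.0, 0.0]], max_forces=4.9))
```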

light_pfp_data.sample.crystal.sample_substitution(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, n_sample: int, elements: List[int], possibility: List[float], supercell: Tuple[int, int, int] = (1, 1, 1), opt: bool = False, md: bool = False, sampling_temp: Union[List[float], float] = 500.0, sampling_steps: Union[List[int], int] = 1000, sampling_interval: Union[List[int], int] = 100, timestep: float = 1.0, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate structures by substituting atoms in the input structure.
The substitution is done by randomly selecting atoms and replacing them
with the specified elements.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • n_sample (int) – The number of structures to be generated.

  • elements (List[int]) – List of element numbers to substitute.

  • possibility (List[float]) – List of probabilities for each element.
    The sum of the probabilities should be 1.0.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (1, 1, 1).

  • opt (bool, optional) – Optimize the structure after substitution.
    Defaults to False.

  • md (bool, optional) – Whether to run an MD simulation to collect more structures. Defaults to False.

  • sampling_temp (Union[List[float], float], optional) – MD simulation temperature, in K. Defaults to 500.0.

  • sampling_steps (Union[List[int], int], optional) – Number of MD steps. Defaults to 1000.

  • sampling_interval (Union[List[int], int], optional) – Collect a training structure every this many steps. Defaults to 100.

  • timestep (float, optional) – Time step in fs. Defaults to 1.0.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
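
Since elements and possibility must line up and the probabilities must sum to 1.0, it can help to check them before calling the sampler. The sketch below validates the two lists and draws a replacement element for each chosen site with weighted random choice. Both functions are hypothetical, and how the library picks which sites to substitute is not described here, so the caller passes the site indices explicitly.

```python
import math
import random


def check_substitution_inputs(elements, possibility):
    """Raise ValueError unless the lists match in length and the probabilities sum to 1."""
    if len(elements) != len(possibility):
        raise ValueError("elements and possibility must have the same length")
    if not math.isclose(sum(possibility), 1.0, abs_tol=1e-8):
        raise ValueError("possibility must sum to 1.0")


def substitute(numbers, site_indices, elements, possibility, seed=0):
    """Return a copy of `numbers` with the given sites replaced by weighted random draws."""
    check_substitution_inputs(elements, possibility)
    rng = random.Random(seed)
    new_numbers = list(numbers)
    for index in site_indices:
        new_numbers[index] = rng.choices(elements, weights=possibility, k=1)[0]
    return new_numbers


# Four Fe atoms (atomic number 26); sites 0 and 2 become Ni (28) or Co (27).
original = [26, 26, 26, 26]
substituted = substitute(original, [0, 2], elements=[28, 27], possibility=[0.5, 0.5])
print(substituted)
```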

light_pfp_data.sample.crystal.sample_surface(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, supercell: Tuple[int, int, int] = (1, 1, 1), max_index: int = 2, min_slab_size: float = 10.0, min_vacuum_size: float = 10.0, opt: bool = True, md: bool = False, sampling_temp: Union[List[float], float] = 500.0, sampling_steps: Union[List[int], int] = 1000, sampling_interval: Union[List[int], int] = 100, timestep: float = 1.0, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate low-Miller-index surface structures of the input structure.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (1, 1, 1).

  • max_index (int, optional) – Maximum Miller index to go up to. Defaults to 2.

  • min_slab_size (float, optional) – Minimum thickness of the slab, in angstrom. Defaults to 10.0.

  • min_vacuum_size (float, optional) – Minimum thickness of the vacuum layer, in angstrom. Defaults to 10.0.

  • opt (bool, optional) – Optimize the surface structures.
    Defaults to True.

  • md (bool, optional) – Whether to run an MD simulation to collect more structures. Defaults to False.

  • sampling_temp (Union[List[float], float], optional) – MD simulation temperature, in K. Defaults to 500.0.

  • sampling_steps (Union[List[int], int], optional) – Number of MD steps. Defaults to 1000.

  • sampling_interval (Union[List[int], int], optional) – Collect a training structure every this many steps. Defaults to 100.

  • timestep (float, optional) – Time step in fs. Defaults to 1.0.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
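
To get a feel for how max_index bounds the number of candidate surfaces, the sketch below enumerates every Miller index (h, k, l) whose components lie in [-max_index, max_index] and share no common factor. This is a symmetry-unaware upper bound: crystal symmetry usually makes many of these planes equivalent, and whether and how the library reduces the list is not stated in this reference. The function name is hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd


def miller_indices(max_index=2):
    """All primitive (h, k, l) with |h|, |k|, |l| <= max_index, excluding (0, 0, 0)."""
    indices = []
    for hkl in product(range(-max_index, max_index + 1), repeat=3):
        if hkl == (0, 0, 0):
            continue
        # Skip multiples such as (2, 0, 0), which describe the same plane family as (1, 0, 0).
        if reduce(gcd, hkl) == 1:
            indices.append(hkl)
    return indices


candidates = miller_indices(max_index=2)
print(len(candidates))
```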

light_pfp_data.sample.crystal.sample_vacancy(input_structure: Atoms, calculator: Calculator, dataset: H5DatasetWriter, supercell: Tuple[int, int, int] = (3, 3, 3), n_vacancy: int = 1, n_sample: int = 10, opt: bool = False, md: bool = False, sampling_temp: Union[List[float], float] = 500.0, sampling_steps: Union[List[int], int] = 1000, sampling_interval: Union[List[int], int] = 100, timestep: float = 1.0, show_progress_bar: bool = False, executor: Optional[ThreadPoolExecutor] = None, num_threads: int = 8, max_retries: int = 0, pbar: Optional[tqdm_asyncio] = None) → List[Future]

Generate structures with vacancies.

Parameters
  • input_structure (Atoms) – The input structure.

  • calculator (Calculator) – The ASE calculator for energy calculation.

  • dataset (H5DatasetWriter) – The dataset.

  • supercell (Tuple[int, int, int], optional) – Expand the input structure into a supercell of this size. Defaults to (3, 3, 3).

  • n_vacancy (int, optional) – The number of vacancies to be created in one structure. Defaults to 1.

  • n_sample (int, optional) – The number of structures to be generated. Defaults to 10.

  • opt (bool, optional) – Optimize the vacancy structures.
    Defaults to False.

  • md (bool, optional) – Whether to run an MD simulation to collect more structures. Defaults to False.

  • sampling_temp (Union[List[float], float], optional) – MD simulation temperature, in K. Defaults to 500.0.

  • sampling_steps (Union[List[int], int], optional) – Number of MD steps. Defaults to 1000.

  • sampling_interval (Union[List[int], int], optional) – Collect a training structure every this many steps. Defaults to 100.

  • timestep (float, optional) – Time step in fs. Defaults to 1.0.

  • show_progress_bar (bool, optional) – Show progress bar. Defaults to False.

  • executor (ThreadPoolExecutor, optional) – Thread pool executor for parallel calculation. Defaults to None.

  • num_threads (int, optional) – Max number of threads to use for executor if no executor is passed. Defaults to 8.

  • max_retries (int, optional) – Max retries for PFP calculation. Defaults to 0.
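
The interplay of n_vacancy and n_sample can be illustrated by choosing which atoms to remove. The sketch below works on atom indices only and assumes that vacancy sites are drawn uniformly at random without replacement within each structure; the function name is hypothetical and the library's selection rule may differ.

```python
import random


def vacancy_samples(n_atoms, n_vacancy=1, n_sample=10, seed=0):
    """Return n_sample lists of surviving atom indices, each missing n_vacancy atoms."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    samples = []
    for _ in range(n_sample):
        removed = set(rng.sample(range(n_atoms), n_vacancy))
        samples.append([i for i in range(n_atoms) if i not in removed])
    return samples


# A 2-atom cell expanded to a (3, 3, 3) supercell contains 2 * 27 = 54 atoms.
n_atoms = 2 * 3 * 3 * 3
kept = vacancy_samples(n_atoms, n_vacancy=2, n_sample=5)
print([len(indices) for indices in kept])
```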