DFT data used in training MolE8 chemical ML models
Publication date
2021-11-09
Creators
Ermanis, Kristaps
Goodman, Jonathan M.
Description
===============================================================
Data for paper
"MolE8: Finding DFT Potential Energy Surface Minima Values from
Force-Field Optimised Organic Molecules with New Machine
Learning Representations"
Sanha Lee, Kristaps Ermanis* and Jonathan M. Goodman*
Yusuf Hamied Department of Chemistry, University of Cambridge,
Lensfield Road, Cambridge, CB2 1EW
and
School of Chemistry, University of Nottingham,
University Park Nottingham, Nottingham, NG7 2RD
===============================================================
This dataset contains Gaussian DFT optimization and frequency
calculation output files for all of the molecules used in the
training of the MolE8 representations and machine learning
methods.
The dataset is divided into 7 parts to keep the archive file sizes
manageable. Each folder contains data for around 8000 molecules.
The data include the geometry optimization *a.out files, the frequency
calculation *f.out files and *.sdf files of the optimized structures
for wider compatibility with visualization software.
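As a rough illustration of the naming scheme above, the following Python sketch groups file names by molecule so that the optimization, frequency and structure files of one molecule can be found together. It assumes that each name is a molecule ID followed directly by `a.out`, `f.out` or `.sdf`; the exact layout of the .sdf names is not stated here, so that part is a guess. The function names are hypothetical and not part of the dataset.

```python
from collections import defaultdict
from pathlib import Path

# Suffix -> role, following the description above.
# Assumption: the text before the suffix is the molecule ID.
SUFFIXES = {"a.out": "optimization", "f.out": "frequency", ".sdf": "structure"}


def group_by_molecule(filenames):
    """Group file names by molecule ID (the part before the suffix)."""
    groups = defaultdict(dict)
    for name in filenames:
        base = Path(name).name  # ignore any directory components
        for suffix, role in SUFFIXES.items():
            if base.endswith(suffix):
                groups[base[: -len(suffix)]][role] = name
                break
    return dict(groups)


def incomplete(groups):
    """Return the molecule IDs that lack at least one of the three file types."""
    return sorted(mol for mol, files in groups.items() if len(files) < len(SUFFIXES))
```

To apply it to an extracted archive, pass the names from a directory listing, for example `group_by_molecule(p.name for p in Path("part1").iterdir())`, where `part1` is a placeholder for wherever the archive was unpacked.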
Part 1 contains structure files up to 009999A1*
Part 2 contains structure files up to 019999A1*
Part 3 contains structure files up to 021988A1*
Part 4 contains structure files up to 39997A1*
Part 5 contains structure files up to 49999A1*
Part 6 contains structure files up to 59999A1*
Part 7 contains structure files up to 69125A1*
All structures in these folders have been optimized, and their frequencies
calculated, at the B3LYP/6-31G(2df,p) level in the gas phase.
All of the files can be opened in any text editor. Gaussian output structures
can be viewed, and the frequency modes visualised, in GaussView, Avogadro, Jmol
and most other molecular viewers/editors. *.sdf files can be viewed in
essentially all 3D molecular editors and viewers.
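Because the Gaussian output files are plain text, values can also be extracted with a short script rather than a viewer. The sketch below, which is only an illustration and not part of the dataset or the paper's workflow, reads the last `SCF Done:` line that Gaussian prints and returns the method label and the energy in Hartree. The function name is hypothetical, and the regular expression assumes the usual `SCF Done:  E(METHOD) =  value  A.U.` layout.

```python
import re

# Matches lines such as "SCF Done:  E(RB3LYP) =  <energy>  A.U. after N cycles".
SCF_LINE = re.compile(r"SCF Done:\s+E\((\S+?)\)\s*=\s*(-?\d+\.\d+)")


def final_scf_energy(text):
    """Return (method, energy in Hartree) from the last SCF Done line, or None."""
    matches = SCF_LINE.findall(text)
    if not matches:
        return None
    # An optimization prints one SCF Done line per step; the last is the final geometry.
    method, energy = matches[-1]
    return method, float(energy)
```

A typical call would read one optimization file into a string, for example with `Path(name).read_text(errors="replace")`, and pass it to `final_scf_energy`. Dedicated parsers exist for Gaussian output and may be preferable for anything beyond a quick check.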
Subjects
- Machine learning
- Chemistry, Organic
- Density functionals
- Computational chemistry
- DFT, Gaussian, organic chemistry, machine learning, MolE8, neural networks, kernel ridge regression
- Physical sciences::Chemistry::Organic chemistry
- Q Science::QD Chemistry::QD241 Organic chemistry
- Q Science::QD Chemistry::QD450 Physical and theoretical chemistry
Divisions
- University of Nottingham, UK Campus::Faculty of Science::School of Chemistry
Deposit date
2021-11-09
Data type
Gaussian 16 DFT software output files
Contributors
- Lee, Sanha
Funders
- Other
- Leverhulme Trust
- Isaac Newton Trust
- Trinity College, University of Cambridge
Grant number
- ECF-2017-255
- 17.08(d)
Data collection method
Gaussian 16 DFT software
Resource languages
- en