QM9 Dataset Download
If you are using the DeepChem library, you don't need to download anything manually. You can fetch a cleaned version directly from Python.
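As a minimal sketch of the DeepChem route, the snippet below wraps the MoleculeNet loader `dc.molnet.load_qm9`. The wrapper name `load_qm9_with_deepchem` is hypothetical. It assumes DeepChem is installed and that the first call may download and cache the data. The loader's default featurizer and transformers differ between DeepChem versions, so treat the returned labels as possibly normalized until you have checked.

```python
def load_qm9_with_deepchem(splitter="random"):
    """Fetch QM9 through DeepChem's MoleculeNet loader (hypothetical wrapper).

    DeepChem is imported lazily, so defining this function does not require
    the library; calling it does.
    """
    import deepchem as dc  # assumption: DeepChem is installed

    # load_qm9 returns (task names, (train, valid, test), transformers).
    tasks, (train, valid, test), transformers = dc.molnet.load_qm9(splitter=splitter)
    return tasks, train, valid, test, transformers


# Usage (may trigger a download on first run):
#   tasks, train, valid, test, transformers = load_qm9_with_deepchem()
#   print(tasks)
```

Keeping the returned `transformers` around matters if you later want to map model predictions back to the original units.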
Downloading the QM9 dataset is the first step toward building models that can predict how a molecule behaves without needing expensive lab equipment or time-consuming DFT simulations. Whether you use the raw files from Figshare or the streamlined loaders in PyG and DeepChem, QM9 remains the gold standard for testing new molecular architectures.
The raw data often uses Hartree for energies and Bohr radii for lengths. Most ML workflows prefer electronvolts (eV) and Angstroms. Always check whether your loader has already performed these conversions.
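If your loader has not converted the units, the conversion is a single multiplication. The sketch below uses the CODATA 2018 factors; the function names are illustrative and not part of any loader.

```python
# CODATA 2018 conversion factors.
HARTREE_TO_EV = 27.211386245988    # 1 Hartree expressed in electronvolts
BOHR_TO_ANGSTROM = 0.529177210903  # 1 Bohr radius expressed in Angstroms


def hartree_to_ev(energy_hartree: float) -> float:
    """Convert an energy from Hartree to electronvolts."""
    return energy_hartree * HARTREE_TO_EV


def bohr_to_angstrom(length_bohr: float) -> float:
    """Convert a length from Bohr radii to Angstroms."""
    return length_bohr * BOHR_TO_ANGSTROM


# Example: an orbital energy of -0.25 Hartree is roughly -6.80 eV.
homo_ev = hartree_to_ev(-0.25)
```

Apply the conversion exactly once. Converting values that a loader has already converted is an easy mistake to make and produces errors of a factor of about 27.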
For the most control, you can download the raw XYZ files from the repository where the authors originally hosted them. Source: Figshare (QM9)
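If you go the raw-file route, you will need a small parser. The sketch below assumes the layout described in the dataset's readme: an atom count, a property line starting with `gdb` and the molecule index, one line per atom, a line of vibrational frequencies, a SMILES line, and an InChI line. It also assumes some numbers use `*^` in place of `e` for exponents. The names `parse_qm9_xyz` and `Qm9Molecule` are hypothetical, and the sample values are made up to match the layout rather than copied from the dataset.

```python
from dataclasses import dataclass

# Assumed order of the 15 scalar properties on the second line of each file.
PROPERTY_NAMES = ["A", "B", "C", "mu", "alpha", "homo", "lumo", "gap",
                  "r2", "zpve", "U0", "U", "H", "G", "Cv"]


def _to_float(token: str) -> float:
    """Parse a number, accepting the '*^' exponent notation."""
    return float(token.replace("*^", "e"))


@dataclass
class Qm9Molecule:
    index: int
    elements: list
    coords: list       # one (x, y, z) tuple per atom
    properties: dict   # property name -> value, in the file's own units
    smiles: str


def parse_qm9_xyz(text: str) -> Qm9Molecule:
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    header = lines[1].split()  # split() handles tabs and spaces alike
    index = int(header[1])
    values = [_to_float(t) for t in header[2:2 + len(PROPERTY_NAMES)]]
    elements, coords = [], []
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        elements.append(parts[0])
        coords.append(tuple(_to_float(p) for p in parts[1:4]))
    # The line after the atoms holds frequencies; the SMILES line follows it.
    smiles = lines[3 + n_atoms].split()[0]
    return Qm9Molecule(index, elements, coords,
                       dict(zip(PROPERTY_NAMES, values)), smiles)


# Illustrative methane-like record (made-up values).
SAMPLE = """5
gdb 1 157.7 157.7 157.7 0.0 13.21 -0.3877 0.1171 0.5048 35.36 0.0447 -40.48 -40.48 -40.48 -40.50 6.469
C 0.0 0.0 0.0 -0.54
H 0.63 0.63 0.63 0.13
H -0.63 -0.63 0.63 0.13
H -0.63 0.63 -0.63 0.13
H 0.63 -0.63 -6.3*^-1 0.13
1341.3 1341.3 1341.4 1562.7 1562.7 3038.3 3151.6 3151.7 3151.7
C C
InChI=1S/CH4/h1H4 InChI=1S/CH4/h1H4
"""

molecule = parse_qm9_xyz(SAMPLE)
```

Parsing the files yourself keeps every value in its original unit, which is the main reason to prefer this route when you need full control.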
Once you complete your QM9 dataset download, you’ll have access to 19 different properties for each molecule, including:
- Dipole moment (μ)
- Isotropic polarizability (α)
- Highest Occupied Molecular Orbital (HOMO) energy
- Lowest Unoccupied Molecular Orbital (LUMO) energy
- Gap (HOMO-LUMO difference)
- Heat capacity at 298.15 K (Cv)
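Because the gap is defined as the LUMO energy minus the HOMO energy, it doubles as a quick sanity check on whatever you loaded. The sketch below uses illustrative values in Hartree, not values copied from the dataset, and the function names are hypothetical.

```python
def homo_lumo_gap(homo: float, lumo: float) -> float:
    """Return the HOMO-LUMO gap in the same unit as the inputs."""
    return lumo - homo


def gap_is_consistent(homo: float, lumo: float, reported_gap: float,
                      tol: float = 1e-4) -> bool:
    """Check that a reported gap matches LUMO - HOMO within a tolerance."""
    return abs(homo_lumo_gap(homo, lumo) - reported_gap) <= tol


# Illustrative values in Hartree (made up for the example).
example_gap = homo_lumo_gap(-0.3877, 0.1171)
```

If the check fails for many molecules, the HOMO, LUMO and gap columns have probably been read in the wrong order or in mixed units.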
What makes QM9 special is that it provides molecular properties calculated using Density Functional Theory (DFT). It’s essentially the "MNIST" of molecular property prediction.
About 3,000 molecules in the dataset failed certain consistency tests in the original study. Most modern loaders (like the ones in DeepChem or PyG) allow you to filter these out automatically.
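If you work from the raw files, you have to do this filtering yourself. The sketch below assumes you have a plain-text list of the flagged molecules in which every data row begins with the molecule's integer index; the sample text and the function names are hypothetical.

```python
def parse_excluded_indices(text: str) -> set:
    """Collect the integer index that starts each data row.

    Rows whose first token is not a plain integer (headers, separators,
    blank lines) are skipped. A header that starts with a digit would be
    misread, so inspect the real file before relying on this.
    """
    excluded = set()
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].isdigit():
            excluded.add(int(tokens[0]))
    return excluded


def drop_flagged(molecules: list, excluded: set) -> list:
    """Keep only molecules whose 'index' is not in the excluded set."""
    return [m for m in molecules if m["index"] not in excluded]


# Made-up exclusion list and molecule records for illustration.
SAMPLE_LIST = """\
Hypothetical header describing the file
index  note
    2  failed consistency check
    5  failed consistency check
"""

molecules = [{"index": i} for i in range(1, 7)]
kept = drop_flagged(molecules, parse_excluded_indices(SAMPLE_LIST))
```

Filter before you split the data into training and test sets, so that flagged molecules cannot end up in either one.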
In this guide, we’ll break down exactly how to handle a QM9 dataset download, what’s inside the files, and the best ways to load the data into your project.