HT Model Dataset


This page hosts the dataset used in the manuscript "HT Model: Using the Molecular Transformer for predicting hydrotreating reactions".

The dataset is a collection of hydrotreating reactions compiled from experimental data reported in 41 peer-reviewed literature sources. In a hydrotreating reaction, a chemical compound reacts with hydrogen gas in the presence of a catalyst to remove heteroatoms (such as sulfur, nitrogen, or oxygen) or to convert specific functional groups.
The dataset contains reactions both with and without reaction conditions, that is, the parameters under which a reaction takes place, such as temperature and pressure.
For each reaction, the dataset includes both SMILES and SELFIES representations. SMILES (Simplified Molecular Input Line Entry System) and SELFIES (SELF-referencIng Embedded Strings) are two popular notations used to represent chemical structures in a compact and standardized format. 
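To make the reaction representation concrete, here is a minimal sketch of how one reaction record with optional conditions might be handled in Python. It assumes the standard reaction SMILES layout `reactants>agents>products`, with molecules separated by `.`; the dataset's actual file layout and column names are not described here, so the `Reaction` class, the `parse_reaction_smiles` helper, the field names, and the example reaction (hydrodesulfurization of thiophene, written without stoichiometry) are all illustrative assumptions rather than content taken from the dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reaction:
    """One reaction record; all field names are hypothetical, not the dataset's."""
    reactants: List[str]
    agents: List[str]
    products: List[str]
    temperature_c: Optional[float] = None  # None when no conditions are given
    pressure_bar: Optional[float] = None   # None when no conditions are given


def parse_reaction_smiles(rxn: str,
                          temperature_c: Optional[float] = None,
                          pressure_bar: Optional[float] = None) -> Reaction:
    """Split a reaction SMILES of the form 'reactants>agents>products'."""
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two '>' separators in {rxn!r}")

    def molecules(side: str) -> List[str]:
        # Molecules on one side are separated by '.'; drop empty fragments.
        return [m for m in side.split(".") if m]

    reactants, agents, products = (molecules(p) for p in parts)
    return Reaction(reactants, agents, products, temperature_c, pressure_bar)


# Illustrative example only: thiophene + hydrogen -> butane + hydrogen sulfide.
# The temperature value is a placeholder, not a measured condition.
example = parse_reaction_smiles("c1ccsc1.[HH]>>CCCC.S", temperature_c=300.0)
print(example.reactants)     # ['c1ccsc1', '[HH]']
print(example.products)      # ['CCCC', 'S']
print(example.pressure_bar)  # None
```

Keeping the conditions as optional fields lets one record type cover reactions both with and without reported conditions. Converting between SMILES and SELFIES would require a dedicated cheminformatics library and is left out of this sketch.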
The dataset was created to train a predictive model based on the Molecular Transformer architecture.
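Sequence models such as the Molecular Transformer read a reaction as a sequence of tokens, so SMILES strings are typically split into atom-level tokens before training. The sketch below uses a regular expression of the kind commonly used for this purpose; whether the HT Model used exactly this preprocessing is an assumption, and the function name `tokenize_smiles` and the example reaction are illustrative only.

```python
import re

# Atom-level SMILES tokenizer pattern: bracket atoms, two-letter halogens,
# organic-subset atoms, bonds, branches, ring closures, and the '>' separator.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|\/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> str:
    """Return the SMILES string as space-separated tokens."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Guard against silently dropping characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize {smiles!r} without losing characters")
    return " ".join(tokens)


# Illustrative example only: thiophene + hydrogen -> butane + hydrogen sulfide.
print(tokenize_smiles("c1ccsc1.[HH]>>CCCC.S"))
# c 1 c c s c 1 . [HH] > > C C C C . S
```

The round-trip check makes the tokenizer fail loudly on unsupported characters instead of producing a truncated input sequence.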