Check out this publication by Maddy and the Coley, Reisman, and Sigman labs on dataset design for building chemical reactivity models!