DragonBoard 820c or HiKey 970 for an Image Processing Neural Network

Hello, I am working on running real-time detection with a neural network, and am looking at running OpenCV with YOLO, TensorFlow, or Caffe on one of these boards. The model will be trained on an external computer and run on the single-board computer. Has anyone done this? I would like to exceed 15 frames per second of real-time detection using the on-board GPU of one of these chips. In the past I have run the model on an NVIDIA GPU with yolo-tiny, and was able to exceed 30 frames per second using CUDA. Is there a way to utilize the on-board GPU of either of these boards, rather than installing the non-GPU (CPU-only) versions of these frameworks?
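
For reference, here is a rough sketch of how I would measure the frame rate on the board. It assumes OpenCV's dnn module is available and that the board's GPU driver exposes OpenCL, which I have not confirmed for either board. The helper names (measure_fps, make_opencl_yolo) and the 416 input size are my own placeholders, not anything board-specific.

```python
import time


def measure_fps(infer, frames):
    """Run infer(frame) on every frame and return the average frames per second."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        infer(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def make_opencl_yolo(cfg_path, weights_path, size=416):
    """Build an inference callable that asks OpenCV's dnn module for the OpenCL target.

    Assumption: the GPU driver on the board provides OpenCL. If it does not,
    OpenCV may fall back to the CPU.
    """
    import cv2  # imported here so measure_fps can be used without OpenCV installed

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL)
    out_names = net.getUnconnectedOutLayersNames()

    def infer(frame):
        # Scale pixels to [0, 1], resize to the network input, and swap BGR to RGB.
        blob = cv2.dnn.blobFromImage(
            frame, 1 / 255.0, (size, size), swapRB=True, crop=False
        )
        net.setInput(blob)
        return net.forward(out_names)

    return infer
```

The idea would be to pass the callable from make_opencl_yolo, together with a list of captured frames, to measure_fps and compare the result against the 15 frames per second target.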
Thank you.