Executing inference workloads on the NPU (HiKey 970)

I have installed Debian as the OS on a HiKey 970 board. How can a neural network workload be executed on the NPU? Are there specific API functions for this? I have also read that the NPU has a memory constraint and will not execute if the size exceeds 100 MB. How can the NPU be specified as the compute device for certain layers?