Collected all cudaMalloc calls in one place and restructured min/max to match sum, along with other housekeeping
Seems to handle mixed data types correctly
Changed kernalDims to uint