Initial Project Ideas and Information:
One of the possible course projects will be part of an ongoing research effort to evaluate and integrate the TensorFlow deep learning framework with the MPI/OpenMP/CUDA HPC software stack.
Integrating TensorFlow with HPC software stack
- Synergy of OpenMP and TensorFlow through an asynchronous thread pool for the CPU. TensorFlow uses the Eigen (https://eigen.tuxfamily.org/) thread pool and its work-stealing task scheduler to schedule tensor tasks on CPU cores. See https://bitbucket.org/eigen/eigen/src/8d1ccfd9c5a0bf1a662bd0fe35858f4e527d2bac/unsupported/Eigen/CXX11/?at=default for how the thread pool is implemented and for its API. The project will implement the Eigen thread pool API on top of the OpenMP thread pool, so that TensorFlow runs its CPU tasks on OpenMP threads. There are two possible approaches: 1. Use the LLVM OpenMP tasking API. 2. Take an idle thread out of the OpenMP thread pool and let that thread run the scheduling loop of the Eigen asynchronous thread pool.
- GPU runtime integration with OpenMP. This requires looking at how IBM implements OpenMP target offloading and deciding whether we should leverage our HOMP runtime for it.
- TensorFlow GPU scheduling
- TensorFlow with the Slurm workload manager
  - https://github.com/jhollowayj/tensorflowslurmmanager/
  - https://github.com/tensorflow/tensorflow/issues/1686
- TensorFlow with MPI
  - http://research.baidu.com/bringing-hpc-techniques-deep-learning/
  - https://github.com/baidu-research/tensorflow-allreduce
  - https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/mpi
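
For the first approach in the OpenMP thread pool idea (backing a thread pool API with OpenMP tasking), the following is a minimal sketch only. It assumes the pool interface boils down to a Schedule(std::function<void()>) call, which is how we read the Eigen thread pool API; the names OmpTaskPool and run_demo are hypothetical and appear in neither Eigen nor TensorFlow. It uses only standard OpenMP pragmas and does not include Eigen itself.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical adapter (not Eigen's or TensorFlow's class): exposes a
// Schedule() call shaped like a thread pool API and maps it to an OpenMP task.
class OmpTaskPool {
 public:
  // Assumption: the caller is inside an OpenMP parallel region, so the
  // threads of that team can pick up and execute the task.
  void Schedule(std::function<void()> fn) {
    // fn is copied into the task, so it outlives this function call.
    #pragma omp task firstprivate(fn)
    { fn(); }
  }
};

// Schedules n small closures and returns the sum 1 + 2 + ... + n that they
// accumulate. Built without OpenMP support, the pragmas are ignored and each
// closure simply runs inline inside Schedule().
int run_demo(int n) {
  std::atomic<int> sum{0};
  OmpTaskPool pool;
  #pragma omp parallel
  {
    // One thread produces the tasks; the rest of the team executes them.
    #pragma omp single
    {
      for (int i = 1; i <= n; ++i) {
        pool.Schedule([&sum, i] { sum.fetch_add(i); });
      }
    }
    // All tasks are guaranteed to have finished by the barrier that ends
    // the parallel region, so sum is complete after it.
  }
  return sum.load();
}
```

Calling run_demo(100) schedules 100 closures and returns their accumulated sum. A real adapter would also need a long-lived parallel region (or the second approach, a dedicated scheduling thread), since TensorFlow schedules work asynchronously rather than from inside a single enclosing region; this sketch does not address that.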