To improve memory allocation performance, many TensorFlow users often use tcmalloc instead of the default malloc() implementation, as tcmalloc suffers less from fragmentation when allocating and deallocating large objects (such as many tensors). Some memory-intensive TensorFlow programs have been kn...