Mohit Kumar
Researcher / Consultant / Trainer
Programming is more than just typing.

About

I am fascinated by Computer Science, mathematics, the Universe, and related things. I like to understand things and the way they work under the hood, and occasionally I like to explain the things I understand using a first-principles approach. More formally, I am a Researcher, Trainer, and design consultant for Artificial Intelligence (deep learning) based systems. Microarchitecture-based optimizations are my specialization, specifically the Intel microarchitectures from P5 to Skylake, and, on the vector side, SIMD and Nvidia GPUs. My microarchitecture knowledge lets me see the complete stack. For the last 5 years I have been working on optimizing TensorFlow, and models built on TensorFlow, on GPUs and CPUs; a typical example is micro-optimization for the Haswell microarchitecture. My current and major research interests include attention-based models, Google's Neural Machine Translator, Google's BERT, and DeepMind's DNC.
This blog is for my general musings on Artificial Intelligence, but every once in a while I'll give in to the temptation of showing hardware optimization using x86 assembly.
I am also a feminist, an aspiring guitarist and cook, and I am fascinated with the idea of breathing right.

Research Interests

Deep-Breathe

I feel that the barrier to entry for Deep Learning is very high, especially for the models required for Natural Language Processing. Neural Machine Translation, for example, uses concepts like LSTMs, Bidirectional LSTMs, Multi-Layered LSTMs, Attention, and so on. None of these is easy to understand by itself; imagine the plight of a student when these concepts are strung together into a Neural Machine Translation or Google's BERT based system. I have seen Neural Machine Translation based systems grossly underperform simply because most of the hyperparameters were not understood at all. Deep-Breathe is a complete, pure-Python implementation of these models, especially, but not limited to, a Neural Machine Translator. In fact, the scripts have been written to compare the weights of the TensorFlow implementation and the Deep-Breathe one after a certain number of iterations. Hopefully, this will go a long way toward lowering the barrier to entry. Deep-Breathe is already being used internally by many organizations.
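To make the weight comparison concrete, here is a minimal sketch of that kind of check. It is only an illustration under my own assumptions, not Deep-Breathe's actual scripts: the function name `compare_weights`, the tolerance `atol`, and the weight names in the example are hypothetical, and it assumes each implementation's weights have been dumped into a name-to-NumPy-array dictionary after the same number of training iterations.

```python
# Sketch only: compare weights dumped from a TensorFlow model and from a
# pure-Python implementation after the same number of iterations.
# Names and tolerances here are hypothetical, not Deep-Breathe's API.
import numpy as np


def compare_weights(tf_weights, db_weights, atol=1e-5):
    """Print the max absolute difference per weight tensor present in both
    dictionaries; return True if every shared tensor is within `atol`."""
    all_close = True
    for name in sorted(set(tf_weights) & set(db_weights)):
        a, b = np.asarray(tf_weights[name]), np.asarray(db_weights[name])
        if a.shape != b.shape:
            print(f"{name}: shape mismatch {a.shape} vs {b.shape}")
            all_close = False
            continue
        diff = float(np.max(np.abs(a - b)))
        status = "OK" if diff <= atol else "DIVERGED"
        print(f"{name}: max |diff| = {diff:.2e} [{status}]")
        all_close = all_close and diff <= atol

    # Flag tensors that exist in only one of the two implementations.
    missing = set(tf_weights) ^ set(db_weights)
    if missing:
        print("only in one implementation:", ", ".join(sorted(missing)))
    return all_close


if __name__ == "__main__":
    # Hypothetical example: LSTM kernels that agree, biases that have drifted.
    rng = np.random.default_rng(0)
    kernel = rng.standard_normal((8, 16))
    tf_w = {"lstm/kernel": kernel, "lstm/bias": np.zeros(16)}
    db_w = {"lstm/kernel": kernel.copy(), "lstm/bias": np.full(16, 1e-3)}
    compare_weights(tf_w, db_w)
```

In practice, a per-tensor maximum absolute difference like this is usually enough to spot which layer's initialization or update rule has diverged between the two implementations.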

GitHub