Summary
- 6+ years of experience as an ML/DL Engineer.
- Highly skilled in data-driven machine intelligence, specializing in Machine Learning, Deep Learning, and Advanced Analytics, with a proven track record of delivering impactful end-to-end solutions in industry.
- Delivered high-impact AI solutions, including innovative models for wake-up word detection, speech command detection, coreference resolution, query completion, information extraction from queries, and natural language generation.
- Proficient in cutting-edge tools and technologies such as TensorFlow, PyTorch, Transformers, and ONNX, with a strong command of Python and data visualization libraries, utilizing Docker and Kubernetes as model development platforms.
- Excelled in end-to-end model development for both on-device and cloud-based applications, including transitioning models from PyTorch to TensorFlow Lite for deployment on edge devices.
- Proficient in Python programming with a focus on high-quality, PEP 8-compliant code.
- Hold a Master’s degree in Big Data Analytics (Data Science) with strong academic performance, including topping NPTEL’s Machine Learning course and having research accepted at prestigious conferences such as ICON-2020.
- Skilled in diverse domains such as NLP, SLP, and CV, with hands-on experience in natural language understanding and generation, keyword spotting, image captioning, and video captioning & analytics.
- Gained experience in multiple roles, including Research Engineer, NLP Engineer, and Research Intern, showcasing adaptability and expertise in both development and research environments.
- Proficient in advanced AI and Machine Learning techniques including classification, clustering, feature extraction, and regression; skilled in Deep Learning methodologies such as RNNs, CNNs, encoder-decoder architectures, attention mechanisms, and transformers; extensive experience in NLP tasks including document classification, clustering, summarization, co-reference resolution, question answering, text culture analysis, natural language understanding, and generation.
- Experienced in Speech and Language Processing, and Computer Vision applications including keyword spotting, wake-up word detection, speech command identification, as well as image and video analytics; skilled in data analytics with expertise in data visualization, descriptive and exploratory data analysis (DDA & EDA), and surrogate modeling.
Technical Skills
Coding: Python3
Learning: GCP, C/C++, R, SQL, HTML
Libraries: TensorFlow, PyTorch, Transformers, Keras, ONNX, NumPy, Pandas, OpenCV, Scikit-Learn, Matplotlib, SciPy, Librosa, NLTK
OS/Environment: Linux, Kubernetes, Docker, Conda, Jupyter, Git, Jira
Machine Learning (ML): Linear Regression, Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machines (SVM), K-Means Clustering, Decision Tree, Random Forest, Principal Component Analysis (PCA), LDU Decomposition, Factor Analysis
Natural Language Processing (NLP): Long Short-Term Memory (LSTM), BERT, DistilBERT, ALBERT, RoBERTa, T5, G2T, GPT-2
Speech Language Processing (SLP): Gated Recurrent Units (GRU), Speech Language Understanding (SLU), Wav2Vec, HE-KWS, Depthwise Separable Convolution Neural Network (DS-CNN), BC-ResNet, Whisper
Computer Vision (CV): ResNet, VGG16, VGG19, Inception Net, Region-based Convolutional Neural Network (R-CNN)
Core Competencies & Soft Skills
- Machine Learning
- AI Model Building
- Deep Learning
- Data Visualization
- Data Analysis and Reporting
- Computer Vision
- Natural Language Processing
- Statistical Modeling
- Speech Recognition
- Research Methodology
- Problem-solving
- Analytical Thinking
- Collaboration
- Communication
- Time Management
Certifications
- NOC23-CS77 Computer Vision | NPTEL (Issued Nov 2023)
- Machine Learning Deep Learning Model Deployment | Udemy (Issued Sep 2022)
- NOC19-CS57 Applied Natural Language Processing | NPTEL (Issued Dec 2019)
- NOC19-CS14 Machine Learning for Engineering and Science Applications | NPTEL (Issued May 2019)
- NOC19-CS35 Machine Learning, ML - Course Topper | NPTEL (Issued May 2019)
Work Experience
Timeline: Feb '21 – Present
Role: Research Engineer | Regular - LGSI AI Solution
Key Result Areas:
- Conducted research and development in AI and Machine Learning (ML) to advance technology and solve complex problems. This includes designing, developing, and optimizing algorithms for AI solutions and creating data-driven intelligence models for smart features in Home Appliances and Access Hub products.
- Analyzed large datasets to derive actionable insights and improve model performance.
- Prepared comprehensive documents and reports on research findings, methodologies, and progress.
- Collaborated with cross-functional teams to integrate research outcomes into practical applications.
Timeline: Nov '20 – Feb '21
Role: NLP Engineer | MRC - LGSI AI-2 Team
Duru Cooperation, Pvt. Ltd., Bangalore (LG Soft India, Pvt. Ltd.)
Key Result Areas:
- Investigated and implemented advanced techniques for representing input in Natural Language Generation (NLG), focusing on Abstract Representations (ARs), Meaning Representations (MRs), Relational Representations (RRs), with an emphasis on Triplet Relation Representations (TRRs) and Knowledge Graphs (KGs).
- Developed and optimized models to translate structured, sparse information into ARs, MRs, and TRRs. Used TreeNLG data to derive abstract representations that capture information and meaning effectively and adequately.
- Collaborated closely with cross-functional team members to understand the application of NLP across use cases.
Timeline: Aug '20 – Nov '20
Role: AI Engineer | Contract - LGSI AI-2 Team
Key Result Areas:
- Worked on fine-tuning DL-based Natural Language Generation models, with a focus on generating short answers.
- Fine-tuned an ALBERT-based Question Answering system on the SQuAD v1.1 and v2.0 datasets, designed to answer user queries using Product Manual Paragraphs. Further improved system performance significantly by replacing ALBERT with T5.
Timeline: Jan '20 – Jul '20
Role: Project Linked Person | JRF/Intern - CSE Department
Indian Institute of Technology, Patna
Key Result Areas:
- Studied state-of-the-art publications related to Natural Language Understanding, Coreference Resolution, Named Entity Recognition, and Extractive Summarization.
- Enabled the application of cutting-edge models for Coreference Resolution, including developing a lightweight version to enhance model efficiency.
- Under the guidance of the research supervisor, addressed the challenge of making query contexts actionable and developed a novel Deep Learning model that achieves performance levels close to human annotation.
- Managed and analyzed research data with high accuracy, ensuring the reliability of outcomes. Prepared and presented findings in academic papers & technical reports, contributing to the dissemination of knowledge within the academic community.
Timeline: Jun '19 – Dec '19
Role: Research Intern | Intern - ECSU Department
Indian Statistical Institute, Kolkata
Key Result Areas:
- Reviewed recent research papers on Image Classification, Object Localization, and Image and Video Captioning.
- Developed a Deep Learning model for Video Captioning using VGG16, LSTM, and sequential token mapping techniques.
- Trained and evaluated the model with the MSVD dataset to assess its performance.
Projects Worked On
Audio Classification, Wake-Up Word Identification, AI Solution:
Timeline: May '23 – Present
Key Result Areas:
- Identifying and processing a specific set of Wake-up Words from continuous audio streams for home appliances and app control hubs with limited computational resources.
- Developing and optimizing a lightweight CNN-based model for high performance, tailored to the GSC and industry-specific datasets.
- Designed and implemented Sliding Window Audio Processing for real-time inference on streaming audio inputs. Applied advanced decision strategies to maintain a high True Positive Rate relative to the False Acceptance Rate.
Audio Classification, Speech Command Detection, AI Solution:
Timeline: Oct '22 – Apr '23
Key Result Areas:
- Enabled a model to identify intent keys from spoken audio commands for home appliances with low-capability neural processing engines.
- Developed a high-performing End-to-End Speech Language Understanding model using PyTorch, optimized for detecting various intents from FSC and industry-specific datasets.
- Converted the PyTorch model to TensorFlow (TF) using ONNX and subsequently to TensorFlow Lite (TF-Lite) for deployment on edge devices.
- Implemented a GRU from scratch using matrix operations (based on primitive operations) to eliminate layer dependency for on-device inference. This inference-GRU directly uses the trained weights of the layer-GRU, and its output matches that of the layer-GRU to within 1e-7.
Data-To-Text for Stock Domain, AI Solution:
Timeline: Jan '22 – Sep '22
Key Result Areas:
- Extracted essential information from user queries and fetched stock data from the Yahoo Finance API to generate natural language responses about trading prices and statistical information.
- Enabled models for key tasks including Information Extraction, Information Presentation, Raw Data Retrieval, Natural Language Understanding (NLU), and Text Generation.
- Automated SQL data population using Python to streamline data management processes.
Data-To-Text for Weather Domain, AI Solution:
Timeline: Nov '20 – Dec '21
Key Result Areas:
- Developed solutions for generating emotionally engaging natural language responses to weather-related queries.
- Engaged in Information Extraction, Interpretation, Presentation, and Natural Language Generation (NLG), including model development for Named Entity Recognition (NER), Date-Time Expression Extraction, and Time-Zone & Latitude-Longitude Mapping.
- Implemented an NLG model based on the TreeNLG dataset to enhance text generation capabilities.
Making User Queries Complete, Context-Aware, and Executable:
Timeline: Jan '20 – Jun '20
Key Result Areas:
- Developed a framework to address the problem of input representation for deep learning models from contextually silent queries.
- Implemented a solution using a transformer-based Deep Learning model that segments information to enhance query reference resolution and informativeness, achieving a METEOR score of 64% in evaluating model predictions against human annotations.
- Focused on making user queries reference-resolved, informative, comprehensible, and actionable.
Video Captioning Using Deep Neural Networks:
Timeline: Jun '19 – Dec '19
Key Result Areas:
- Developed a model for generating natural language captions for MSVD (Microsoft Video Description) videos using deep neural networks, specifically VGG16 and LSTM architectures.
- Utilized key-frame features and a Word Alignment Vector, created as word embeddings for n-gram alignment. Designed the model to decode sequences of words into captions by encoding video features and word vectors with the Word Alignment Vector.
- The model generates variable-length captions for a given video. Evaluated the model's performance against human-annotated captions, achieving a METEOR score of 30%.
Achievements
- Received the Spotlight Award in June 2022 for contributions to Natural Language Processing on the Data-to-Text project for the Stock Domain.
- Honored with the High Achiever Award in the Project Category for the "Audio Classification, On-Device Speech Command Detection" project.