Summary
- Results-driven Data Science Lead with 8+ years of experience developing and deploying AI solutions, including sophisticated chatbots built on AWS and GCP services.
- Proficient in Python and R, with hands-on experience in Databricks, AWS Bedrock, and Claude GenAI modelling.
- Skilled in vector embedding techniques and knowledge graphs, employing advanced methods like Retrieval-Augmented Generation (RAG) to enhance model performance.
- Committed to continuous learning and collaboration, aiming to contribute to innovative projects that drive efficiency and improve user experiences in a dynamic tech environment.
Technical Skills
Programming Languages: Python (Data Structures/DS/Algorithms), R
Machine Learning Frameworks: TensorFlow, PyTorch, Scikit-Learn
Generative AI Frameworks: Hugging Face Transformers, OpenAI GPT, LLM, RAG, Llama, Mistral AI, Streamlit, Clarifai, AssemblyAI, LangChain, DeepFace, MediaPipe, OpenAI Libraries
Data Processing and Visualization: Pandas, NumPy, Matplotlib/Seaborn
Cloud Platforms:
- AWS: S3, EC2, SageMaker, Lambda
- Google Cloud: BigQuery, TensorFlow Extended (TFX), AI Platform
- Azure: Azure ML and Cognitive Services
DevOps and MLOps:
- Docker: Containerization for consistent environments
- Kubernetes: Orchestration of containerized applications
- MLflow: Managing the ML lifecycle
- Kubeflow: Kubernetes-native platform for deploying ML pipelines
Databases:
- SQL Databases: MySQL, PostgreSQL
- NoSQL Databases: MongoDB, Cassandra
- Data Lakes: Apache Hadoop, AWS Lake Formation
Natural Language Processing (NLP)/Computer Vision (CV): NLTK, spaCy, BERT, OpenCV, YOLO, VGG/ResNet
Model Deployment: Flask/FastAPI services, Streamlit
Version Control and Collaboration: Git/GitHub, GitLab, Bitbucket
Additional Tools: Jupyter Notebooks, VS Code, Anaconda
Projects
AWS Lex/Slack Chatbot:
Skills/Domain: Python, AWS Lambda, AWS Lex, Snowflake, AWS EC2, AWS SES, AWS VPC, LLM, RAG, PostgreSQL
Contribution to Project:
- Designed and implemented a sophisticated chatbot using AWS Lex, Azure GenAI (OpenAI) capabilities with Llama, and GCP for the database.
- Utilized AWS Lambda for serverless backend processing to manage chatbot logic and integrations.
- Used Llama and Azure GenAI capabilities to make the chatbot more user-friendly and domain-specific.
- Integrated AWS Bedrock to further enhance the chatbot's GenAI capabilities.
- Employed Retrieval-Augmented Generation (RAG) techniques to enhance the chatbot's response accuracy and relevance.
- Utilized Docker to containerize the application, ensuring consistency across different environments.
- Managed Docker images and deployments using Amazon Elastic Container Registry (ECR).
- Leveraged AWS CodeCommit for version control and collaborative development.
- Successfully deployed the chatbot on Slack, enhancing internal communication and support mechanisms.
Resource Attrition Model:
Skills/Domain: Python, R, Snowflake, Classification, RandomForest, XGBoost
Contribution to Project:
- Built, as part of a team, a working model that supports upper management in resource management.
Predictive Health Pipeline:
Skills/Domain: Python, AWS, Snowflake, Regression, SVM, Hybrid-regression
Contribution to Project:
- Developed a regression model that helps identify the current health status of an ongoing business deal.
HSBC:
Role: Data Science - NLP (Natural Language Processing)
Skills/Domain: RASA, AWS, Docker, Kubernetes, Python, BERT, Transformers, MS Copilot
Contribution to Project:
- Managed a group of four NLP data science engineers to build a chatbot using RASA with BERT transformers.
MAF:
Role: Advanced Data Analytics
Skills/Domain: Airflow, Python, AWS, Superset, SQL, R, Git
Contribution to Project:
- Generated reports, charts, and insights that leaders can use to make better-informed decisions for future business growth.
POC-DAMAC:
Role: Data Lead
Skills/Domain: R&D, Python, SQL, AWS, Neural Networks, Data Science, Machine Learning, Statistical Analysis, Analytics, R, Excel
Contribution to Project:
- Devised a roadmap for the client DAMAC to help optimise property prices.
- Assessed and presented the potential and shortcomings of the available data and data science approaches for the price optimisation tool.
- Generated valuable insights for the client.
Beverly Jeans:
Role: Senior Data Scientist
Skills/Domain: Regression, Gurobi, SQL, Excel, PySpark, Azure, AWS
Contribution to Project:
- Worked with a team of 15 people to generate a working model for promotion optimization.
JCPenney:
Role: Data Scientist Lead
Skills/Domain: NLP.x, MSSQL, SharePoint API
Contribution to Project:
- Extracted information from image description data using NLP techniques such as TF-IDF, stemming, and lemmatization, combined with Naïve Bayes and BERT.
- Led a team of 4 people to gather data and develop a product.
JoAnn:
Role: Data Scientist Lead
Skills/Domain: CNN, ResNet50, Azure, PySpark, Elasticsearch, OpenAI, Azure OpenAI
Contribution to Project:
- Designed and deployed an image attribute tagging product for the US-based retail client JoAnn using ResNet50, Pandas, Keras, and related tools.
- Led a team to research and develop an image attribute generation tool.