I'm a Master's in Applied Artificial Intelligence graduate from the University of Ottawa, looking for full-time Data Scientist or Machine Learning Engineer roles. Before my Master's, I worked as a Programmer Analyst at Cognizant Technology Solutions for 3.5 years, where I served as a Data Scientist and Python Developer.
I have a keen interest in extracting value from data and applying Machine Learning techniques to support data-driven recommendations and decisions.
As a Data Science professional with almost 2 years of industry experience, I've worked on Data Wrangling, Feature Extraction, Machine Learning model building, and optimization, among other areas. With tremendous enthusiasm for Machine Learning, I aspire to join an organization where I can fully leverage my skills and experience to make a significant contribution, while continuing my learning journey toward becoming an AI Solutions Architect.
The Disaster Tweet Classification application classifies whether a tweet reports an actual disaster. The application processes the text to remove noise and engineers features from the tweets. A custom classifier built by stacking Naive Bayes, Logistic Regression, and SVM classifiers is used for prediction. A Docker-containerized Streamlit web application serves the inferences and is deployed on AWS Elastic Beanstalk.
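A minimal sketch of the stacking setup, using Scikit-learn's StackingClassifier; the sample tweets, TF-IDF feature extraction, and the choice of Logistic Regression as the meta-learner are illustrative assumptions, not the exact production pipeline:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import StackingClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Toy training data (1 = disaster, 0 = not a disaster)
tweets = [
    "Forest fire near La Ronge Sask Canada",
    "Residents asked to shelter in place after explosion",
    "Earthquake hits the city, buildings collapsed",
    "I love this sunny day at the beach",
    "Just watched a great movie tonight",
    "My cat is sleeping on the couch",
]
labels = [1, 1, 1, 0, 0, 0]

# Base learners are stacked; a Logistic Regression meta-learner
# combines their out-of-fold predictions.
stack = StackingClassifier(
    estimators=[("nb", MultinomialNB()), ("svm", LinearSVC())],
    final_estimator=LogisticRegression(),
    cv=3,  # small cv because the toy dataset is tiny
)
model = make_pipeline(TfidfVectorizer(stop_words="english"), stack)
model.fit(tweets, labels)
prediction = model.predict(["Huge flood warning issued downtown"])[0]
```

In practice the base learners would each see the engineered text features, and the trained pipeline object is what the Streamlit app loads to serve predictions.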
The Loan Defaulter Predictor application analyses a loan applicant's data and, based on Lending Club's data, predicts how likely the applicant is to default on the loan. The application lets the user choose between a LightGBM model and a TensorFlow-based Neural Network for prediction. A CI/CD pipeline integrated with GitHub automates code integration and deployment, containerizing the ML application with Docker and deploying it as a Streamlit web app.
Developed a customer base identification solution using the K-means clustering algorithm, analysing demographic and customer population data with the Scikit-learn and Pandas libraries. Normalized the data and applied Principal Component Analysis (PCA) to reduce dimensionality and improve the interpretability of results, then clustered the data, using the Silhouette score to choose the optimal number of clusters.
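The normalize-reduce-cluster pipeline above can be sketched end to end; the synthetic data with three latent groups and the candidate range of k are assumptions made for illustration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic demographic-style data with three latent customer groups
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 4)) for c in (0, 3, 6)])

X_scaled = StandardScaler().fit_transform(X)          # normalize features
X_pca = PCA(n_components=2).fit_transform(X_scaled)   # reduce dimensionality

# Choose the number of clusters by maximizing the Silhouette score
scores = {}
for k in range(2, 6):
    cluster_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_pca)
    scores[k] = silhouette_score(X_pca, cluster_labels)
best_k = max(scores, key=scores.get)
```

On real demographic data the PCA components would also be inspected directly, since they are what make the resulting segments interpretable.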
Built a Deep Learning system in Python with the PyTorch library that takes an image and generates a caption describing it. Developed an Encoder-Decoder network: a CNN and an embedding layer constitute the Encoder, extracting features from images that are fed to the Decoder. The Decoder, consisting of LSTM and fully connected layers, analyzes the features and generates a caption describing the image in a sentence that matches closely with human perception.
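A toy version of that Encoder-Decoder wiring in PyTorch; the tiny CNN, the vocabulary size, and the layer widths are placeholder assumptions (a real captioner would use a pretrained backbone and a learned vocabulary):

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    """Toy encoder-decoder: a CNN encodes the image into a feature
    vector; an LSTM decoder generates caption tokens from it."""
    def __init__(self, vocab_size=100, embed_dim=32, hidden_dim=64):
        super().__init__()
        # Encoder: small CNN standing in for a pretrained backbone
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, embed_dim),
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).unsqueeze(1)   # (B, 1, E)
        embeds = self.embed(captions)               # (B, T, E)
        # The image feature acts as the first "token" the decoder sees
        inputs = torch.cat([feats, embeds], dim=1)  # (B, T+1, E)
        out, _ = self.lstm(inputs)
        return self.fc(out)                         # (B, T+1, vocab)

model = CaptionModel()
images = torch.randn(2, 3, 32, 32)                  # batch of 2 RGB images
captions = torch.randint(0, 100, (2, 5))            # 5 ground-truth tokens each
logits = model(images, captions)
```

Training would apply cross-entropy between these logits and the shifted caption tokens; at inference the decoder is unrolled one token at a time.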
Developed a dynamic machine learning model that retrains itself on online streaming data delivered through Kafka. The model uses a sliding-window technique: it retrains whenever the F1 score falls below a set threshold, and continues to do so until the end of the data stream. Data pre-processing tasks such as cleansing, upsampling, downsampling, and scaling were also performed.
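The sliding-window retraining loop can be sketched with a plain Python iterable standing in for the Kafka consumer; the window size, F1 threshold, Logistic Regression learner, and the synthetic drifting stream are all illustrative assumptions:

```python
from collections import deque
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def run_stream(stream, window=200, threshold=0.8):
    """Consume (features, label) pairs; retrain on the sliding window
    whenever the rolling F1 score drops below the threshold."""
    buf = deque(maxlen=window)                       # sliding window of samples
    preds, truth = deque(maxlen=window), deque(maxlen=window)
    model, retrains = None, 0
    for x, y in stream:
        if model is not None:                        # score incoming samples
            preds.append(int(model.predict([x])[0]))
            truth.append(y)
        buf.append((x, y))
        if len(buf) == window:
            # Retrain on the first full window, and again whenever the
            # rolling F1 falls below the threshold (e.g. concept drift).
            if model is None or (
                len(preds) == window
                and f1_score(truth, preds, zero_division=0) < threshold
            ):
                X, Y = zip(*buf)
                model = LogisticRegression().fit(X, Y)
                preds.clear(); truth.clear()
                retrains += 1
    return retrains

rng = np.random.default_rng(1)
def drifting_stream(n=1000):
    """Synthetic stream whose label rule flips halfway (concept drift)."""
    for i in range(n):
        x = rng.normal(size=2)
        yield x, int((x[0] > 0) ^ (i > n // 2))

retrains = run_stream(drifting_stream())
```

With a real Kafka deployment, the `for x, y in stream` loop would iterate over deserialized consumer records instead of the synthetic generator.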
Based on census data, this project identifies people whose income exceeds $50,000/year. These targeted candidates are then contacted for donations.