Hi, I'm Venkatesh, a passionate Data Scientist with a solid foundation in programming, analytics, and artificial intelligence. I hold a B.Tech in Electrical and Electronics Engineering, where I developed a strong problem-solving mindset and a love for technology-driven insights. I specialize in Machine Learning, Deep Learning, and Natural Language Processing (NLP), building intelligent models that uncover patterns, predict outcomes, and automate decision-making. My work includes sentiment analysis, text summarization, time series forecasting, and classification/regression models applied across real-world datasets. I’ve built advanced NLP systems using BART, Hugging Face Transformers, and PyTorch. During my internships and projects, I developed solutions such as a Mobiles Discount Prediction Streamlit app, a Digital Music Hybrid Recommender, Electrical Tools Object Detection, and a Festival Sales Analysis based on customer demographics, performing end-to-end data analysis, model training, and business insights generation. I’ve also created a Sentiment Analyzer NLP Streamlit app using modern NLP frameworks and evaluation metrics like BLEU and ROUGE, showcasing a deep understanding of natural language models.
My technical expertise covers Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch, Matplotlib, Seaborn, and Plotly. I’m skilled in SQL, Power BI, and building ETL data pipelines for analytics workflows. I have hands-on experience in web scraping and data extraction using BeautifulSoup and Selenium, automating data collection from e-commerce and job portals for research and dashboards. Beyond modeling, I focus on model deployment and real-world application. I deploy ML and NLP models using Streamlit, both locally and on cloud platforms like Streamlit Cloud and Hugging Face Spaces, for interactive demo apps. I excel in data visualization and data storytelling, turning complex data into actionable insights using interactive dashboards and visual reports in Power BI and Streamlit. I share my knowledge through my YouTube channel, Venkatesh’s Data Lab, where I explain data science projects, coding workflows, and visualization techniques for learners. My long-term goal is to contribute to the growth of intelligent data systems and build scalable AI applications that bridge data insights with impactful real-world outcomes.
This project is an end-to-end data product that scrapes, validates, and merges mobile phone listings from Flipkart and Amazon into a unified dataset, culminating in a deployed Streamlit application that predicts the final discount price using an optimized ensemble machine learning model.
App Link Source Code
This project implemented an Object Detection pipeline using YOLOv8 on a custom 500-image dataset of 10 electrical tool classes (labeled via Roboflow), culminating in a Streamlit application capable of real-time image and video inference.
App Link Source Code
This is a Deep Learning NLP Text Classification project that trains an LSTM model on a mental health sentiment dataset, applying rigorous text preprocessing and feature engineering, culminating in a Streamlit application for real-time prediction and classification of emotional states.
App Link Source Code
This project implements a Hybrid Recommender System for Amazon Digital Music data, integrating content-based (TF-IDF), collaborative (SVD), and popularity methods, and features a deployed Streamlit application with user-adjustable weighting for dynamic model tuning.
App Link Source Code
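To illustrate the user-adjustable weighting in the hybrid recommender above, here is a minimal sketch of how three score sources could be blended. It is not the project's actual code: the function names, the min-max normalization step, and the default weights are assumptions made for illustration, and the per-item scores are assumed to have been computed already by the TF-IDF, SVD, and popularity components.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the three signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # A constant signal carries no ranking information.
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def hybrid_scores(
    content: Dict[str, float],
    collaborative: Dict[str, float],
    popularity: Dict[str, float],
    weights: Tuple[float, float, float] = (0.4, 0.4, 0.2),  # hypothetical defaults
) -> Dict[str, float]:
    """Blend three per-item score dictionaries with user-chosen weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("At least one weight must be positive.")
    sources = [min_max_normalize(s) for s in (content, collaborative, popularity)]
    items = set().union(*sources)
    # Items missing from a source contribute 0 for that source.
    return {
        item: sum(w * src.get(item, 0.0) for w, src in zip(weights, sources)) / total
        for item in items
    }


def top_n(scores: Dict[str, float], n: int = 5) -> List[str]:
    """Return the n highest-scoring items, breaking ties alphabetically."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]
```

In a Streamlit app, the three weights would typically come from sliders, so changing a slider re-ranks the recommendations without retraining any model.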
This project performs end-to-end Sales Analysis using EDA and advanced Power BI visualizations on customer demographics and shopping habits, deriving actionable insights regarding product performance, discount effectiveness, and the impact of festivals on sales.
Source Code
This project implemented an end-to-end Deep Learning Image Classification pipeline using a CNN-ANN model to accurately classify 7 categories of industrial electrical and mechanical components from a custom 3500+ image dataset, complete with a deployed Streamlit application for real-time inference.
App Link Source Code
This project created a Deep Learning Image Similarity Search web application using Streamlit, leveraging MobileNetV2 for feature extraction and Cosine Similarity on a custom dataset to retrieve the Top 5 visually similar images in real time.
App Link Source Code
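The retrieval step of the similarity search above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the project's implementation: the feature vectors are assumed to have been extracted beforehand (for example by MobileNetV2), and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    gallery: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (gallery index, similarity) pairs for the k most similar vectors."""
    scored = [(idx, cosine_similarity(query, vec)) for idx, vec in enumerate(gallery)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

For a larger gallery, the same computation is usually vectorized by normalizing all feature vectors once and taking a single matrix-vector product.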
This project developed a Streamlit web application for an ATS Checker that analyzes uploaded resumes, compares them against a job description, and provides a quantified ATS compatibility score to help users optimize their application materials.
App Link Source Code
An end-to-end deep learning project that classifies sports videos using frame-based analysis with a pretrained VGG model, leveraging computer vision and transfer learning for accurate, automated sports recognition and analytics.
Source Code
Core programming language for all data science workflows including data analysis, automation, and machine learning model development.
Experience building predictive models using regression, classification, clustering, and ensemble methods with Scikit-learn.
Hands-on experience with neural networks, CNNs, and RNNs using TensorFlow and PyTorch for AI applications.
Implemented NLP models for sentiment analysis, summarization, and query-based text generation using BERT and BART.
Explored LLMs, prompt engineering, and fine-tuning using Hugging Face and OpenAI APIs to build intelligent applications.
Developed interactive dashboards to visualize key metrics and business insights for data-driven decision-making.
Skilled in database creation, SQL queries, joins, and integrating data pipelines with Python and analytics tools.
Strong command of Excel for data analysis, pivot tables, Power Query, and quick visual insights.
Processed and transformed messy data into structured datasets using Python libraries like Pandas and NumPy.
Applied statistical concepts like distributions, correlation, and hypothesis testing for analytical insights.
Performed univariate, bivariate, and multivariate analysis to uncover patterns and data insights.
Created advanced visualizations using Matplotlib, Seaborn, Plotly, and Power BI for impactful storytelling.
Used analytical reasoning to identify data-driven solutions and optimize business or model outcomes.
Evaluated models using metrics like RMSE, ROC AUC, Precision, Recall, and F1-Score to verify predictive performance.
Transformed and created features to boost model performance and extract more predictive insights.
Deployed models using Streamlit locally and on the cloud for real-time accessibility and interactivity.
Built ETL pipelines, automated data workflows, and integrated structured and unstructured data sources.
Turned complex datasets into actionable stories using visual reports and dashboard presentations.
Collected and analyzed large-scale data using BeautifulSoup and Selenium for insights and automation.
The ability to deploy, scale, and manage machine learning models and data pipelines on cloud platforms, which is essential for real-world, production-level data science solutions.
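For readers less familiar with the evaluation metrics named in the skills above, the following is a minimal plain-Python sketch of how Precision, Recall, F1-Score, and RMSE are computed. The function names are hypothetical, and in practice a library such as Scikit-learn would normally be used instead.

```python
import math
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("Inputs must be non-empty and of equal length.")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Precision and recall matter most for imbalanced classification problems, while RMSE applies to regression outputs such as a predicted price.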