Hello, I'm Yamin Hossain

A passionate data scientist with expertise in machine learning, statistical analysis, and data visualization. Turning complex data into actionable insights.

Profile
About Me

Who I Am

I'm an aspiring data scientist with a passion for extracting meaningful insights from complex datasets. With a strong foundation in statistics, programming, and machine learning, I transform raw data into actionable business intelligence.

Machine Learning

Building predictive models and algorithms to solve complex problems.

Data Analysis

Extracting insights from large datasets using statistical methods.

Data Visualization

Creating compelling visual stories from complex data.

Research

Exploring new methodologies and staying current with the latest advancements.

Skills

My Expertise

A combination of technical proficiency and soft skills that enable me to deliver comprehensive data science solutions.

Technical Skills

Python: 90%
R: 70%
SQL: 85%
MATLAB: 60%
TensorFlow/PyTorch: 80%
Scikit-Learn: 85%
Pandas/NumPy: 90%
Tableau/Power BI: 80%
Big Data (Hadoop): 65%

Soft Skills

Problem Solving

Critical Thinking

Communication

Teamwork

Attention to Detail

Time Management

Adaptability

Continuous Learning

Featured Projects

AI Agents/Automation

FinanceAI
A comprehensive financial analysis platform built with Next.js that provides AI-powered insights for stocks, cryptocurrencies, and forex markets. The platform offers real-time market data, technical analysis, and personalized financial advice.
Next.js
Tailwind CSS
Shadcn UI
Chart.js
Lucide Icons
Twelve Data API
Reddit API
Agents
Cortex AI: Multi-Model Insights Hub
Developed a RAG-powered platform with multiple LLMs (DeepSeek, Qwen, LLaMA, OpenAI), achieving 98% retrieval accuracy and 40% lower response latency.
LangChain
ChatGroq API
LLM
Streamlit
MultiRAG
Search Agents
ReelSense AI: Intelligent Media Discovery & Recommendation
Created a Flask-based movie and TV show recommendation app using the TMDb API, featuring genre-based browsing and an AI-powered agent. Users can also track their favorite movies and TV shows.
Flask
TMDb API
Jinja2
Render
Multi-model
News API
PostgreSQL
Cloudinary
AI Automation Workflow Suite
A collection of production-ready n8n automation workflows powered by LLMs, designed to handle intelligent job-matching, personalized email generation, and daily AI-assisted reminders. The suite showcases end-to-end automation—including data extraction, multi-agent reasoning, third-party API integrations, and workflow orchestration—built for real-world productivity use cases.
n8n
Automation
AI Agents
Gemini API
Google Sheets
Telegram API
Notion API
API Orchestration
Real-Time Retail Forecasting with RAG-Powered AI
A production-grade MLOps ecosystem combining ML forecasting with RAG-based AI analytics. Features real-time data streaming every 10 minutes, nightly model retraining, incremental vector DB generation for 3M+ records, and an AI-powered Data Analyst capable of answering natural-language questions about sales, trends, and stores. Includes what-if simulations and a premium glassmorphism dashboard UI.
Streamlit
Python
XGBoost
Prophet
Redis
ChromaDB
Sentence Transformers
RAG
Groq API
Serverless MLOps
GitHub Actions
MLflow
Hugging Face Hub

Machine Learning/Deep Learning

Art Intelligence Hub
A multi-page application that showcases advanced deep learning models for art analysis. It includes two main tools: an Artwork Authenticity Classifier that distinguishes AI-generated from human-made art, and an Art Style Ensemble Classifier that identifies the artistic style of a painting.
Keras
OpenCV
Grad-CAM
Transfer Learning
Ensemble Classifier
AI Content Classifier: Production MLOps Pipeline
A production-ready MLOps pipeline that automatically classifies Reddit content using advanced multi-label machine learning with enterprise-level automation and continuous learning.
content-classification
reddit-api
mlops-workflow
end-to-end-machine-learning
Sleep Disorder Detection App
Built a Streamlit-based Sleep Disorder Detection App using machine learning to predict sleep conditions like insomnia and sleep apnea based on user inputs. Integrated interactive analytics, personalized sleep tips, and deployed the app on Streamlit Cloud for accessible, data-driven sleep insights.
Scikit-learn
XGBoost
Streamlit
Matplotlib/Seaborn/Plotly

Data Analysis

Supply Chain Analysis Dashboard
Developed a Streamlit-based dashboard to visualize key supply chain metrics like production volumes, stock levels, revenue, and costs.
Python
Pandas/NumPy
Matplotlib/Plotly/Seaborn
Streamlit Cloud
DVD Rental Analytics (Customer Behavior) Dashboard
A polished, local-first Streamlit dashboard for exploring the DVD Rental CSV dataset. Auto-load CSVs into an in-memory DuckDB and explore 25 built-in analytics queries, interactive Plotly visualizations, a saved-queries panel, and an Advanced SQL explorer.
Streamlit
DuckDB
Pandas
Plotly
SQL
Saved Queries
Local-first
NYC Subway Ridership Analysis Dashboard
An interactive web dashboard built with Streamlit to analyze and visualize live, hourly ridership data from the New York City MTA subway system.
nyc-opendata
Sodapy
DataGov
Power BI
dashboard-application
Cardiac Care Performance
This project presents a comprehensive data analysis and interactive dashboard focused on Cardiac Surgery and Percutaneous Coronary Interventions (PCI) performance by hospital, spanning from 2008 onwards.
Healthcare data
DataGov
Feature Engineering
Plotly-express
tableau-public
Customer Insights and Regional Sales Dashboard
This dashboard provides an interactive and comprehensive analysis of customer behavior, regional sales trends, and revenue insights.
Sale-analysis
DuckDB
Feature Engineering
Customer Segmentation
Sales-dashboard
Avocado Shiny Dashboard
Developed a Shiny-based Avocado Dashboard to analyze California avocado production and value using interactive visualizations. Deployed on a Shiny server, enabling real-time insights, with support for local deployment using Uvicorn.
Shiny for Python
Pandas/NumPy
Shiny Server/Uvicorn
Matplotlib/Seaborn/Plotly
Netflix Insights Genre Trends Analysis
Conducted an in-depth analysis of the Netflix dataset using SQL to extract insights on genres, languages, countries, and IMDb score trends. Utilized advanced SQL queries, statistical analysis, and visualized findings with Tableau and Looker Studio to explore content distribution and directorial performance.
SQL (MySQL)
SQL Aggregations
Tableau/Looker Studio
Pandas
Analyzing Employee’s Performance for HR Analytics
Performed an in-depth HR analytics study using SQL to analyze employee performance, covering demographics, training effectiveness, KPI achievements, and service duration. Visualized key insights with Power BI dashboards, enabling data-driven decision-making for workforce optimization.
SQL (Microsoft SQL Server)
SQL Aggregations
Power BI
Pandas
Research

Publications

My contributions to academic research and scientific literature in data science and machine learning.

Machine Learning for Computer Vision: A Comprehensive Overview
Abdullah Ibnah Hasan, Nayem Uddin Prince, Md. Kamrul Siam, Md. Maruf Miah, Md. Miskat Hossain Siddique, Yamin Hossain, Md. Habibur Rahman
2025
International Conference On Data Mining And Information Security (ICDMIS 2024), pp. 93–104

This paper focuses on detecting sports types from images, which can be used for action recognition. Detecting small objects like athletes and sports equipment is challenging due to their varying colors, appearances, and distance from the camera. CNNs are the most common detection tool for sports, but they struggle with accuracy under varying angles and lighting conditions.

Neural network
Image processing
Computer vision
Feature extraction
Machine learning
Pattern recognition
DOI: 10.1007/978-981-96-6053-7_7
Improving Sports Categorization in WiFi-Blocked Areas: Modifying YOLO Network for All-Inclusive Sports Image Identification
Md. Habibur Rahman, Samar Das, Mahtab Uddin, Aminul Islam, Yamin Hossain, Abdullah Hafez Nur, Md. Abdullah
2025
4th International Conference on Machine Learning, IoT and Big Data (ICMIB 2024), pp. 229–242

This paper focuses on detecting sports types from images, which can be used for action recognition. Detecting small objects like athletes and sports equipment is challenging due to their varying colors, appearances, and distance from the camera. CNNs are the most common detection tool for sports, but they struggle with accuracy under varying angles and lighting conditions.

Deep Learning
Object detection
YOLO
CNN
DOI: 10.1007/978-981-96-3797-3_19
YOLOv8 Image Processing for Evaluation of Stability Algorithms Based on Neural Networks: A Sports Use Case
Md. Habibur Rahman, A. S. M. Mohiul Islam, Abdullah Ibnah Hasan, Mahtab Uddin, Ashek Ahmed, Asif Ahammad Miazee, Yamin Hossain
2024
International Conference on Inventive Communication and Computational Technologies (ICICCT 2024), pp. 613–622

Sports image classification is a complex problem because many different sports are involved. Existing approaches show subpar detection performance and struggle with feature recognition.

Deep learning
YOLOv8
Neural network
Sports Image Classification
Computer vision
DOI: 10.1007/978-981-97-7710-5_46
Qualifications

Education & Certifications

Education

Bachelor of Science in Computer Science
Vellore Institute of Technology, Andhra Pradesh

CGPA: 8.36 / 10 (Indian Grading System)

2021 - 2025

Focused on programming fundamentals, algorithms, and data structures. Minor in Mathematics. Specialization in data analytics.

Key Courses:

Data Structures & Algorithms
Probability & Statistics
Linear Algebra
Operating Systems
Object-Oriented Programming
Database Management System
Computer Architecture
Computer Networks
Software Engineering
Optimization Techniques
Data Warehousing & Data Mining
Mining Massive Datasets
NoSQL Database
Application of Data Analytics

Club Activities & Extracurriculars

  • LIT-DAC Club (Live Innovate and Transform - Data Analytics)

    Marketing Team Member, 2022 - 2023

    Promoted seminars and events, increasing participation by 40%. Showcased marketing and public speaking skills, creating an engaging environment.

  • CSI VITAP CHAPTER

    Management Team, 2021 - 2022

    Organized tech events and led a team to enhance community service and academic initiatives, fostering unity among 30+ members.

Certifications

  • AI Agents Fundamentals

    Hugging Face, 2025

  • Applications of AI for Predictive Maintenance

    NVIDIA, 2024

  • Building Transformer-Based Natural Language Processing Applications

    NVIDIA, 2024

Continuous Learning

I'm committed to continuous learning and staying updated with the latest advancements in data science and AI.

Currently enrolled in specialized courses on deep learning, natural language processing, and computer vision to expand my skill set and tackle more complex data challenges.

Get in Touch

Contact Me

Have a project in mind or want to discuss a data science opportunity? Feel free to reach out!

Contact Information
Feel free to reach out through any of these channels.

Email

issan.yamin@gmail.com

robinmill4d@gmail.com

Phone

+91 (630) 101-1373

+880 (197) 794-0357

Location

Andhra Pradesh, India

Dhaka, Bangladesh