I'm a data scientist in practice: previously a full-stack developer (Laravel & Vue.js), now focused on building end-to-end ML pipelines, working with messy real-world data, and extracting insights that actually matter.
My engineering background means I don't just build notebooks; I build structured, reproducible, deployable data pipelines.
romel = {
    "focus": ["Machine Learning", "Data Analysis", "Geospatial Data", "Time Series"],
    "background": ["Full Stack Web Dev", "ERP/CRM Systems", "REST APIs"],
    "currently": "Building ML projects on real-world datasets",
    "strength": "Developer mindset applied to data science problems",
    "contact": "romelhasan741@gmail.com",
}

District-level crop yield prediction using satellite data + stacking ensemble
- Integrated BBS agricultural data with Google Earth Engine satellite covariates (CHIRPS rainfall, MODIS LST, EVI)
- Built a Stacking Ensemble (CatBoost + Random Forest) achieving R² = 0.8481
- Used SHAP analysis to identify district_capacity as the dominant predictive feature
- Generated an interactive Leaflet.js choropleth map for 2024 district-level predictions
Python · CatBoost · scikit-learn · SHAP · Google Earth Engine · Leaflet.js
View Repository
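
For readers unfamiliar with the R² figure quoted for this project, here is a minimal sketch of how the coefficient of determination is computed. The function name `r_squared` and the yield values are hypothetical and chosen only for illustration; they are not taken from the project's code or data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted yields, for illustration only.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
score = r_squared(observed, predicted)
print(f"R^2 = {score:.4f}")
```

A score of 1 means the predictions match the observations exactly, while 0 means the model does no better than always predicting the mean.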
End-to-end ML pipeline on 1.58 million crime records
- Cleaned and engineered features from 1.58M LAPD records (26 columns → structured ML-ready dataset)
- Grouped 200+ crime types into 8 categories as the classification target
- Handled severe class imbalance with SMOTE and trained an XGBoost classifier, reaching 69% accuracy
- Built with proper software structure: src/, notebooks/, tests/, main.py
Python · XGBoost · SMOTE · pandas · scikit-learn · imbalanced-learn
View Repository
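
The SMOTE step mentioned in this project can be illustrated with a small from-scratch sketch. This is not the imbalanced-learn implementation the project lists, only the core idea under simplifying assumptions: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. The function name `smote_sketch` and the sample points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D minority-class samples, for illustration only.
minority_points = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.0, 3.0), (2.5, 3.5)]
new_points = smote_sketch(minority_points, n_new=10)
print(len(new_points), "synthetic samples created")
```

In practice, oversampling of this kind is normally applied only to the training split, so that the evaluation data contains real records only.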
Data Science & ML
Web Development (Background)
Tools & Infrastructure
Contact: romelhasan741@gmail.com
