Available · B.Tech CSE · 2026

UPAYAN CHAKRABORTY

Data Analyst · ML Engineer · Software Developer


About

15+
Datasets Analyzed
39+
Plant Diseases Detected
85%+
Model Accuracy
400K+
Dataset Samples

I'm a final-year Computer Science student at MCKV Institute of Engineering, passionate about turning raw data into actionable intelligence. My work spans machine learning, data pipelines, and AI-driven applications.

From building CNN-powered agriculture tools to crafting curriculum for data science workshops, I thrive at the intersection of technical precision and clear communication.

MCKV Institute of Engineering
B.Tech — Computer Science · Howrah, WB
7.34 GPA
Vivekananda Academy (CBSE)
Class 12 Science (PCM) · Shrirampur, WB
71.33%
Techno India Group Public School
Class 10 · Chinsurah, WB
83.83%

Experience

Mar 2026 – Present
AI Data Annotation Intern · Remote
Utah Tech Labs · Remote
  • Annotating and labelling diverse data types — text, image, audio, and video — in strict accordance with project-specific guidelines to build high-quality training datasets.
  • Ensuring accuracy and quality of annotated datasets through systematic review and cross-validation of annotations.
  • Reviewing and validating annotated data to maintain consistency, resolve ambiguities, and uphold data integrity standards.
  • Maintaining strict confidentiality of data and project information across all assigned tasks and deliverables.
  • Consistently meeting productivity and quality benchmarks set by the team to support timely AI model development.
Data Annotation · Quality Assurance · AI Training Data
Jan 2026 – Mar 2026
Spring Intern, Advanced Data Science · ISI Kolkata
IDEAS-TIH @ Indian Statistical Institute · Kolkata, West Bengal
  • Trained at the Institute of Data Engineering, Analytics and Science Foundation — Technology Innovation Hub at Indian Statistical Institute, Kolkata, gaining hands-on exposure to cutting-edge AI topics.
  • Large Language Models: Prompt engineering fundamentals and real-world applications of LLMs across diverse domains.
  • Deep Learning: Neural network fundamentals, backpropagation and activation functions, and Convolutional Neural Networks (CNN).
  • Computer Vision: Image processing basics, image classification, and object detection overview.
  • Agentic AI: Autonomous AI agents, workflow automation, and practical agent design patterns.
  • Developed the OCEAN Personality Classification Model as the capstone project for the internship.
LLMs · Deep Learning · CNNs · Computer Vision · Agentic AI
Aug 2025 – Jan 2026
Data Analyst & ML Content Writer · Remote
MacroEdtech™ · Pune, Maharashtra
  • Executed end-to-end data pipelines for 15+ real-world datasets, performing rigorous data preprocessing, EDA, and feature engineering to ensure model readiness.
  • Developed and evaluated ML models using both supervised and unsupervised learning techniques, optimizing performance metrics across diverse analytical problems.
  • Authored technical documentation and structured curriculum translating complex data science and ML concepts into accessible educational content for broad audiences.
Jan 2025 – Mar 2025
AI & Data Analytics Intern · Top 15
Edunet Foundation × Microsoft · Bengaluru, Karnataka
  • Worked with Power BI features including DAX functions, Power Query, and report automation, improving data transformation speed by 45%. Also worked with Microsoft Azure.
  • Enhanced data storytelling skills via visualization best practices, increasing user engagement with reports by 30%.
  • Ranked among the top 15 students in the program cohort.

Projects

01 — Featured
AgroCare: Plant Disease Detection & Crop Recommendation
An AI-driven agriculture solution integrating plant disease detection and smart crop recommendation. Deployed a Flask web app using CNNs (PyTorch) to identify 39+ plant diseases with treatment suggestions. Integrated IoT-based soil and environmental sensors with ML models for real-time crop recommendations. The system bridges AI with agriculture to help farmers make data-informed decisions.
PyTorch · CNN · Flask · IoT · ML · Agriculture AI
02
OCEAN Personality Classification Model
A multi-output supervised learning framework for Big Five (OCEAN) personality classification using Random Forest ensembles (n_estimators = 200) on the IPIP-NEO 120-item dataset with 400K+ samples. Achieved ~85–90% classification accuracy across all five personality traits.
Random Forest · Python · scikit-learn · Machine Learning
Highlights
Key Technical Achievements
Hyperparameter optimization via RandomizedSearchCV (n_iter=10, cv=3) · Comprehensive feature engineering on ~160 variables · Median imputation + StandardScaler normalization · Stratified 80:20 train-test split · Feature importance analysis for interpretability.
RandomizedSearchCV · Feature Engineering · Model Evaluation
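For readers curious how the steps listed above fit together, here is a minimal scikit-learn sketch. It is an illustration on synthetic data, not the project's actual code. Only the settings named above (n_estimators = 200, n_iter = 10, cv = 3, median imputation, StandardScaler, an 80:20 stratified split) come from the project description. The helper names, the search space, the binary high/low trait labels, and the choice to stratify on the first trait only are assumptions made to keep the example small and runnable.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

TRAITS = ["O", "C", "E", "A", "N"]  # Big Five trait labels


def make_synthetic_data(n_samples=300, n_features=20, seed=0):
    """Hypothetical stand-in for the real questionnaire data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Assumption: one binary (high/low) label per trait, each driven by one feature.
    y = np.column_stack([(X[:, i] > 0).astype(int) for i in range(len(TRAITS))])
    # Blank out about 5% of the values so the imputer has work to do.
    X[rng.random(X.shape) < 0.05] = np.nan
    return X, y


def build_search(seed=0):
    """Median imputation, scaling, and a 200-tree forest under a randomized search."""
    pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=seed)),
    ])
    # Assumed search space; the project description does not list one.
    param_distributions = {
        "forest__max_depth": [None, 5, 10, 20],
        "forest__min_samples_leaf": [1, 2, 5],
        "forest__max_features": ["sqrt", "log2"],
    }
    return RandomizedSearchCV(
        pipeline, param_distributions, n_iter=10, cv=3, random_state=seed
    )


def run_sketch(seed=0):
    X, y = make_synthetic_data(seed=seed)
    # 80:20 split, stratified on the first trait only (a simplification).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y[:, 0], random_state=seed
    )
    search = build_search(seed=seed)
    search.fit(X_train, y_train)  # the forest handles all five outputs at once
    predictions = search.predict(X_test)
    per_trait_accuracy = (predictions == y_test).mean(axis=0)
    importances = search.best_estimator_.named_steps["forest"].feature_importances_
    return per_trait_accuracy, importances


if __name__ == "__main__":
    accuracy, importances = run_sketch()
    for trait, score in zip(TRAITS, accuracy):
        print(f"{trait}: {score:.2f}")
    print("Most important feature index:", int(np.argmax(importances)))
```

Wrapping the imputer and scaler in the same pipeline as the forest means each cross-validation fold fits them on its own training portion, which avoids leaking test statistics into the search.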
Tech Stack

Skills

Languages
Python · R · Java · JavaScript · HTML · CSS
Frameworks & Libraries
ReactJS · Flask · PyTorch · scikit-learn · Pandas · NumPy
Data & Analytics
Tableau · Power BI · MySQL · MS Access · Google Sheets · MS Excel
Cloud & Tools
Microsoft Azure · Power Query · DAX · Git · Jupyter
ML / AI Domains
Supervised Learning · Unsupervised Learning · CNNs · EDA · Feature Engineering · Data Pipelines
Languages Spoken
Bengali (Fluent) · English (Fluent) · Hindi (Fluent)

Activities

Startup Club
The Startup Club of MCKVIE
Communications Team Member · MCKVIE, Howrah
Mar 2025 – Mar 2026
Heritage Club
Heritage Club of MCKVIE
Committee Member · MCKVIE, Howrah
Jan 2025 – Present

Connect With Me

I'm currently open to internships, freelance work, and full-time opportunities. Feel free to reach out!