I am a recent graduate from the Iran University of Science and Technology (IUST), where I earned my B.Sc. in Computer Engineering with a passion for tackling complex challenges in machine learning and computer vision. My academic journey has been driven by a deep curiosity and a love for solving what I call “big problems.”
This passion was truly ignited during my undergraduate thesis on object insertion using diffusion models. Discovering this fascinating and rapidly evolving field felt like finding a new frontier. It presented a steep learning curve that I embraced, reinforcing my belief that challenging problems are the greatest catalysts for growth. This experience set me on my current research path, where I aim to not only understand and apply these powerful models but also to contribute to their advancement.
As a research assistant at IUST’s CV Lab, I put this philosophy into practice by developing a novel, training-free framework for object removal and insertion in indoor scenes. This work achieved excellent results by specifically addressing key challenges in generative AI, such as high computational costs and data scarcity. I am currently preparing a first-author manuscript detailing these findings for publication.
Looking ahead, I am excited to begin my graduate studies and continue exploring machine learning. While my core experience is in computer vision, I am particularly drawn to the synergy between different AI domains. I am eager to tackle challenges in advanced generative modeling, from 2D and 3D vision to video processing. I am also deeply interested in the intersection of vision and language: how machines can understand visual data, and how multimodal models can be used to automate the complex, human-centric evaluation of generative vision tasks.
Sep 2020 -- Sep 2025
B.Sc. in Computer Engineering
CGPA: 3.98 (19.19) out of 4 (20.00, Iranian scale)
Taken Courses:
Achievements:
Ario Mosallah Nejad High School
Sep. 2017 -- Jul. 2020
Diploma in Mathematics and Physics Discipline
CGPA: 19.8 out of 20
October 2024 - Present
Big LaMa) to generate a coarse prior, which then provides strong geometric and semantic guidance to a final diffusion-based refinement stage (SDXL).
InstantMesh), enabling the projection of 3D assets into the scene with physically plausible scale, perspective, and occlusion based on depth estimation.
Big LaMa while improving semantic consistency (ReMOVE) by 8.5% over Stable Diffusion XL Inpainting.
MimicBrush by producing better object fidelity (DreamSim, 2.5% improvement) and reducing background distortion (LPIPS) by 55%.
Sep. 2025 - Present
Sep. 2024 - Jan. 2025
Sep. 2024 - Jan. 2025
Sep. 2024 - Jan. 2025
Sep. 2024 - Jan. 2025
Feb. 2024 - Jul. 2024
Sep. 2023 - Jan. 2024
Feb. 2023 - Jul. 2023
Feb. 2023 - Jul. 2023
Feb. 2023 - Jul. 2023
Feb. 2023 - Jul. 2023
Sep. 2022 - Jan. 2023
Sep. 2022 - Jan. 2023
Feb. 2022 - Jul. 2022
Feb. 2022 - Jul. 2022
Developed a real-time facial spoofing detection system by comparing a classical machine learning pipeline (combining feature extractors like LBP with PCA and an SVM) against a MobileNetV2-based deep learning model. Trained on the combined LCC FASD and CASIA datasets, the deep learning approach achieved 97.4% accuracy and a 91.7% F1-score. The MobileNetV2 model also demonstrated superior robustness and generalization, outperforming the classical pipeline by over 20 percentage points in accuracy when evaluated on the unseen iBeta video dataset, effectively highlighting the advantages of modern deep learning techniques.
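To illustrate the kind of texture feature used in the classical branch, here is a minimal sketch of the basic 8-neighbour LBP code for the centre pixel of a 3x3 grayscale patch. The function name `lbp_code`, the clockwise neighbour ordering, and the `>=` threshold convention are assumptions made for illustration; the project's actual extractor settings are not stated above.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code for the centre pixel of a 3x3 patch.

    `patch` is a 3x3 nested list of grayscale intensities.
    """
    center = patch[1][1]
    # Clockwise neighbour order starting at the top-left corner
    # (an assumed convention; other orderings are equally valid).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[row][col] >= center:
            code |= 1 << bit
    return code
```

In a full pipeline, a histogram of these codes over image regions would typically form the feature vector passed on to PCA and the SVM.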
This project involved a detailed evaluation of large language models using a custom-built Retrieval-Augmented Generation (RAG) framework. I developed an end-to-end pipeline, vectorizing over 235,000 Q&A pairs from the AI Medical Chatbot dataset with the all-MiniLM-L6-v2 model for context retrieval. An automated evaluation framework, using gte-large-en-v1.5 to generate semantic similarity scores, was created to quantitatively assess query relevance and document faithfulness. This system was used to benchmark Mistral-7B, Llama-3-8B, and Qwen2-7B, identifying Mistral-7B as the top-performing model with a 10% higher composite score and superior inference speed.
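The retrieval and scoring steps both rest on comparing embedding vectors. The sketch below shows cosine similarity and a top-k lookup over plain Python lists. It assumes the embeddings have already been computed; `cosine_similarity` and `top_k` are hypothetical helper names, and the project's actual vector store and scoring formula are not described above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]
```

The same similarity function can score a generated answer against the retrieved context, which is one plausible way to turn "faithfulness" into a number.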
Developed a multi-class emotion classification model for Persian text by benchmarking several fine-tuned RoBERTa variants on the ArmanEmo dataset. To handle the dataset’s over 7,000 noisy social media texts, I engineered a robust preprocessing pipeline featuring text normalization, diacritic removal, and artifact handling. The final model, XLM-RoBERTa-Large, was selected for its superior performance, achieving a 75.3% Macro F1-Score (a more than 13x improvement over the baseline) and ranking as a top project in its course. The project concluded with a detailed error analysis that identified key failure modes, such as sentence ambiguity and word-level biases.
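A preprocessing step of this kind can be sketched with the standard library alone. The rules below (strip URLs and mentions, drop Arabic-script diacritics and tatweel, map Arabic yeh and kaf to their Persian forms, collapse whitespace) are an assumed, simplified rule set for illustration; the project's actual pipeline is not listed above, and `normalize` is a hypothetical name.

```python
import re

# Combining marks U+064B..U+0652 (fathatan through sukun) plus tatweel U+0640.
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Social-media artifacts: links and @mentions.
_URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
_WHITESPACE = re.compile(r"\s+")


def normalize(text):
    """Apply a small, assumed set of Persian text clean-up rules."""
    text = _URL_OR_MENTION.sub(" ", text)
    text = _DIACRITICS.sub("", text)
    # Unify Arabic yeh/kaf with the Persian code points.
    text = text.replace("\u064A", "\u06CC").replace("\u0643", "\u06A9")
    return _WHITESPACE.sub(" ", text).strip()
```

Unifying character variants before tokenization keeps visually identical words from being split into different tokens.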
Developed a Persian extractive question-answering system by benchmarking and fine-tuning ParsBERT and XLM-RoBERTa-Large models on the PersianQA dataset. A robust text normalization pipeline using Parsivar was implemented to handle language-specific complexities. To provide a comprehensive assessment, a dual evaluation framework was created, combining standard metrics (Exact Match and F1-score) with a custom semantic similarity score based on all-MiniLM-L6-v2 embeddings. After fine-tuning, the XLM-RoBERTa-Large model was identified as the top performer, achieving an 85.15% F1-score and a 70.96% Exact Match score, outperforming the fine-tuned ParsBERT (72.67% F1, 57.84% EM). Both models surpassed the original XLM-RoBERTa (84.81% F1, 70.40% EM) and ParsBERT (70.06% F1, 53.55% EM) baselines reported by the dataset’s authors.
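For reference, the two standard extractive-QA metrics can be sketched as follows. This is a SQuAD-style formulation with plain whitespace tokenization, which is an assumption; the normalization applied before scoring in the project (for example via Parsivar) is not reproduced here.

```python
from collections import Counter


def exact_match(prediction, reference):
    """1.0 if the whitespace-tokenized answers are identical, else 0.0."""
    return float(prediction.split() == reference.split())


def token_f1(prediction, reference):
    """Token-level F1 between a predicted and a reference answer span."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Exact Match is all-or-nothing, while F1 gives partial credit for overlapping spans, which is why the F1 figures above sit well above the EM figures.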
Developed a Persian Automatic Speech Recognition (ASR) system by benchmarking three models (Whisper-Large-V3, Seamless-M4T-V2-Large, and stt_fa_fastconformer_hybrid_large) on a Persian speech dataset sourced from YouTube. The FastConformer model was subsequently fine-tuned using the NVIDIA NeMo toolkit. This process substantially improved its performance, reducing the Word Error Rate (WER) from 0.74 to 0.37 and the Character Error Rate (CER) from 0.59 to 0.19. The resulting model outperformed the best baseline, Whisper-Large-V3 (WER of 0.56), demonstrating the effectiveness of language-specific adaptation for speech recognition tasks.
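WER and CER are both edit distances normalized by reference length, computed over words and characters respectively. The sketch below is a plain-Python version for a single utterance pair; the helper names are illustrative, no text normalization is applied, and the reference is assumed to be non-empty.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Because insertions count as errors, both rates can exceed 1.0 when the hypothesis is much longer than the reference.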
Developed a comprehensive system to assist university students with course selection, architected on a scalable backend using Django and PostgreSQL and deployed in a production-ready Docker environment with automated CI/CD. A key component was a multithreaded web crawler that automated the ingestion of over 1,000 classes per cycle, achieving a more than 11x speedup and bypassing logins with a KNN-based CAPTCHA solver that reached 96% accuracy. The system featured social functionalities like real-time chat, a BERT-based semantic search engine for content discovery, and instant course availability notifications via Email, SMS, and Telegram, all accessible through a RESTful API.
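The classification step of a KNN-based character recognizer can be sketched in a few lines. This is a generic k-nearest-neighbours majority vote over feature vectors; how CAPTCHA characters were segmented and turned into features in the project is not stated above, so `knn_predict` and its input format are assumptions for illustration.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distances are
    squared Euclidean, which preserves the nearest-neighbour ordering.
    """
    def squared_distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(features, query))

    neighbours = sorted(train, key=squared_distance)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

For example, with flattened glyph pixels as features, each segmented CAPTCHA character would be classified independently and the results concatenated.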
Developed a reinforcement learning agent to solve the classic Mountain Car problem by implementing and comparing on-policy (SARSA) and off-policy (Q-Learning) algorithms with linear function approximation. A key innovation in this project was engineering a compact feature set and applying L2 regularization to stabilize learning. This optimization proved highly effective, dramatically accelerating convergence for the SARSA agent from over 1000 episodes down to just 2 episodes.
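The difference between the two algorithms comes down to the bootstrap target in an otherwise identical semi-gradient update. The sketch below assumes a linear action-value function `q(s, a) = w · x(s, a)` and an L2 penalty applied as weight decay; the function names, hyperparameter defaults, and feature representation are hypothetical, not the project's actual configuration.

```python
def q_value(weights, features):
    """Linear action-value estimate: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def sarsa_target(weights, next_features_by_action, next_action):
    """On-policy: bootstrap from the action the agent will actually take."""
    return q_value(weights, next_features_by_action[next_action])


def q_learning_target(weights, next_features_by_action):
    """Off-policy: bootstrap from the greedy (maximum-value) next action."""
    return max(q_value(weights, f) for f in next_features_by_action.values())


def td_update(weights, features, reward, next_q, alpha=0.1, gamma=1.0, l2=0.0):
    """One semi-gradient TD step with an optional L2 (weight decay) term."""
    td_error = reward + gamma * next_q - q_value(weights, features)
    return [w + alpha * (td_error * x - l2 * w)
            for w, x in zip(weights, features)]
```

Passing `sarsa_target(...)` or `q_learning_target(...)` as `next_q` switches between the two algorithms; for a terminal transition, `next_q` would be 0.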
Designed and implemented a VHDL-based hardware accelerator for CNNs, optimizing performance for real-time applications. Conducted extensive testing to ensure accuracy and reliability. This project was inspired by “RASHT: A Partially Reconfigurable Architecture for Efficient Implementation of CNNs,” published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2022.
Developed a compiler using ANTLR and Python to automatically generate Python code for PyQt QLCDNumber widgets from a custom XML input. This tool streamlines the UI development process by translating a simple markup language directly into a functional GUI component.
Developed a desktop library management system using C# and WPF for the front-end and SQL Server for the back-end database. The application features distinct user roles (Administrator and Standard User), each with a unique dashboard and permissions tailored to their responsibilities.
Designed and implemented a VHDL simulation of a memory hierarchy featuring a multi-level cache. The system was driven by a simplified CPU model responsible for fetching and executing instructions, generating realistic memory access patterns to validate the cache’s performance.
Implemented and benchmarked multiple Constraint Satisfaction Problem (CSP) algorithms within a Sudoku solver. This project compares a standard backtracking approach against versions enhanced with Arc Consistency (AC-3) and the Minimum Remaining Values (MRV) heuristic. Each algorithm was evaluated on puzzles of varying difficulty by measuring execution time and the number of nodes expanded.
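As a compact illustration of the MRV-enhanced variant, the sketch below solves a 9x9 grid in place by always branching on the empty cell with the fewest legal values. It omits AC-3 and the node-counting instrumentation, and the function names are illustrative rather than taken from the project.

```python
def candidates(grid, row, col):
    """Values 1-9 not already used in the cell's row, column, or 3x3 box."""
    used = set(grid[row]) | {grid[i][col] for i in range(9)}
    box_row, box_col = 3 * (row // 3), 3 * (col // 3)
    used |= {grid[i][j]
             for i in range(box_row, box_row + 3)
             for j in range(box_col, box_col + 3)}
    return [value for value in range(1, 10) if value not in used]


def solve(grid):
    """Backtracking search with MRV; 0 marks an empty cell.

    Fills `grid` in place and returns True if a solution was found.
    """
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return True
    # MRV: pick the empty cell with the fewest remaining legal values.
    row, col = min(empties, key=lambda rc: len(candidates(grid, *rc)))
    for value in candidates(grid, row, col):
        grid[row][col] = value
        if solve(grid):
            return True
        grid[row][col] = 0  # undo and try the next value
    return False
```

A cell with zero candidates is chosen immediately under MRV, so dead ends are detected without exploring any further assignments.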
Developed a Tic-Tac-Toe AI agent using the Monte Carlo Tree Search (MCTS) algorithm. The agent builds a search tree over thousands of iterations; each iteration involves a process of UCB1-based selection, node expansion, random simulation, and backpropagation of results. The project includes a playable game loop for a human vs. AI match and a suite of unit tests to verify the agent’s logic.
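The selection step can be sketched with the standard UCB1 formula: average reward plus an exploration bonus that shrinks as a child is visited more. The exploration constant `sqrt(2)` and the `(wins, visits)` child representation below are assumptions for illustration, not details from the project.

```python
import math


def ucb1(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: exploitation term plus exploration bonus."""
    if child_visits == 0:
        return math.inf  # always try unvisited children first
    exploitation = child_wins / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration


def select_child(children, parent_visits):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (wins, visits) pairs.
    """
    return max(range(len(children)),
               key=lambda i: ucb1(children[i][0], children[i][1], parent_visits))
```

Selection repeats this choice from the root down until it reaches a node with unexpanded moves, where expansion and the random simulation take over.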
Implemented a solution for the N-Queens problem using the Hill Climbing with Random Restart algorithm. The solver applies local search, starting from a random board state, to iteratively minimize the number of queen conflicts based on a heuristic function. A random-restart mechanism is integrated to escape local minima, with the solver’s performance benchmarked by its success rate and execution time over multiple trials.
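A minimal sketch of this approach follows, using one queen per column and the number of attacking pairs as the heuristic. The steepest-descent move rule, the step and restart limits, and the fixed seed are assumptions chosen to keep the example short and reproducible.

```python
import random


def conflicts(board):
    """Number of attacking queen pairs; board[col] is the queen's row."""
    n = len(board)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if board[i] == board[j] or abs(board[i] - board[j]) == j - i)


def hill_climb(n, rng, max_steps=1000):
    """Steepest-descent search from a random board; None at a local minimum."""
    board = [rng.randrange(n) for _ in range(n)]
    for _ in range(max_steps):
        current = conflicts(board)
        if current == 0:
            return board
        best, best_move = current, None
        for col in range(n):
            original = board[col]
            for row in range(n):
                if row == original:
                    continue
                board[col] = row
                score = conflicts(board)
                if score < best:
                    best, best_move = score, (col, row)
            board[col] = original  # restore before trying the next column
        if best_move is None:
            return None  # no single move improves: stuck in a local minimum
        board[best_move[0]] = best_move[1]
    return None


def solve_n_queens(n, seed=0, max_restarts=1000):
    """Restart hill climbing from fresh random boards until one succeeds."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        board = hill_climb(n, rng)
        if board is not None:
            return board
    return None
```

Counting how many restarts `solve_n_queens` needs across many seeds is one simple way to estimate the success rate mentioned above.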
July 2022 - October 2022
Tehran, Iran
Digikala Group is a leading e-commerce organization with a strong presence in multiple online industries, including consumer goods, fashion and apparel, e-books, content publishing, digital advertising, big data, fintech, FMCG, and logistics. The company operates through its subsidiaries, including Digikala, DIGISTYLE, Fidibo, and Digipay, which together represent a significant portion of Iran’s online retail market share.
November 2024 - February 2025
Tehran, Iran
Atieh Dadeh Pardaz is a pioneering Telecom VAS, corporate mobile applications and bespoke IT software provider offering practical solutions based on the global digital trends and technologies since 2002.
Facial Recognition System | TensorFlow | Convolutional Neural Network | Deep Learning | Object Detection and Segmentation
Gated Recurrent Unit (GRU) | Recurrent Neural Network | Natural Language Processing | Long Short Term Memory (LSTM) | Attention Models
TensorFlow | Deep Learning | Hyperparameter Tuning | Mathematical Optimization
Decision-Making | Machine Learning | Deep Learning | Inductive Transfer | Multi-Task Learning
Artificial Neural Network | Backpropagation | Python Programming | Deep Learning | Neural Network Architecture
Anomaly Detection | Unsupervised Learning | Reinforcement Learning | Collaborative Filtering | Recommender Systems
TensorFlow | Advice for Model Development | Artificial Neural Network | XGBoost | Tree Ensembles
Linear Regression | Regularization to Avoid Overfitting | Logistic Regression for Classification | Gradient Descent | Supervised Learning
Eigenvalues And Eigenvectors | Linear Equation | Determinants | Machine Learning | Linear Algebra
Deep Learning | Machine Learning
Data Analysis | Python Programming | Database (DBMS) | Data Visualization
Python Programming | Database (DBMS) | SQLite | SQL
Computer Programming | Computer Programming Tools | Programming Principles | Python Programming | Data Structures | Web Development | Computer Networking | HTML and CSS
Python Syntax And Semantics | Data Structure | Tuple | Python Programming
Python Syntax And Semantics | Basic Programming Language | Computer Programming | Python Programming
Algorithms | Data Structures | Graph Theory | Theoretical Computer Science | Computer Programming
Algorithms | Computer Programming | Data Structures | Theoretical Computer Science | Problem Solving | Computer Programming Tools | Mathematical Theory & Analysis | Programming Principles | Mathematics
Algorithms | Computer Programming | Problem Solving | Theoretical Computer Science | Critical Thinking | Computer Programming Tools | Data Structures | Programming Principles | Software Testing