Security & Privacy • Program Analysis
Abdul Haddi Amjad is a Ph.D. candidate in Computer Science at Virginia Tech, advised by Dr. Muhammad Ali Gulzar, and a member of the ProperData research group. His research applies techniques from software debugging and testing to security and privacy problems on the web. He has twice received the GRDP and John Lee Pratt Graduate Fellowships from Virginia Tech's Department of Computer Science. His work earned the Distinguished Artifact Award at ACM CCS 2024, and an undergraduate student he mentored received the CCI People's Choice Award. He also co-authored the Privacy and JavaScript chapters of the Web Almanac 2024.
For more: CV
For more: Research and Teaching Statement
On the job market!
Uncovering the Usage and Privacy Risks of HTTP Custom Headers
Abdul Haddi Amjad, Umar Iqbal, Muhammad Ali Gulzar
Currently under submission
Unintended Privacy Risks of Using Assistive Technology in Web Applications
Abdul Haddi Amjad, Bless Jah, Muhammad Ali Gulzar
Currently under submission
How Accurately Do Large Language Models Understand Code?
Sabaat Haroon, Ahmad Khan, Ahmad Humayun, Waris Gill, Abdul Haddi Amjad, Ali R. Butt, Mohammad T. Khan, Muhammad Ali Gulzar
Currently under submission
Large Language Models (LLMs) are increasingly being used in post‑development tasks such as code repair and testing. A key factor in the successful completion of these tasks is a model's ability to deeply understand code. However, the extent to which LLMs truly understand code remains largely unevaluated. Quantifying code comprehension is challenging due to its abstract nature and the lack of a standardized metric. Before LLMs, comprehension was typically assessed through developer surveys, an approach that is not feasible for evaluating LLMs. Existing LLM benchmarks focus primarily on code generation, which differs fundamentally from code comprehension; moreover, fixed benchmarks quickly become obsolete and unreliable as they inevitably leak into training data. This paper presents the first large‑scale empirical investigation into the ability of LLMs to understand code. Inspired by mutation testing, we use an LLM's ability to find faults as a proxy for deep code understanding. We inject faults into real‑world programs and ask LLMs to localize them, then apply semantic‑preserving mutations to verify robustness. We evaluate nine LLMs on 600,010 debugging tasks across 670 Java and 637 Python programs and find that LLMs lose the ability to debug the same bug in 78% of programs when mutations are applied, indicating shallow understanding and reliance on non‑semantic features. We also find that LLMs understand code earlier in a program better than code later in it, suggesting that current tokenization and modeling approaches overlook program semantics.
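For illustration, the sketch below shows the kind of semantic‑preserving mutation the abstract describes: identifiers in a program containing an injected fault are renamed, so the behavior (and the bug) is unchanged while the surface features an LLM might latch onto are not. The names (BUGGY_SRC, RenameIdentifiers, mutate) and the off‑by‑one fault are hypothetical, not taken from the paper's actual evaluation harness.

```python
import ast

# Hypothetical example program with an injected fault.
BUGGY_SRC = """
def average(values):
    total = 0
    for v in values:
        total += v
    return total / (len(values) + 1)  # injected fault: off-by-one denominator
"""

class RenameIdentifiers(ast.NodeTransformer):
    """Semantic-preserving mutation: rename local identifiers to opaque names."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

    def visit_arg(self, node):
        if node.arg in self.mapping:
            node.arg = self.mapping[node.arg]
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # rename names inside the function body first
        if node.name in self.mapping:
            node.name = self.mapping[node.name]
        return node

def mutate(src, mapping):
    """Parse the source, apply the renaming mutation, and unparse it back."""
    tree = ast.parse(src)
    tree = RenameIdentifiers(mapping).visit(tree)
    return ast.unparse(tree)

if __name__ == "__main__":
    renamed = mutate(BUGGY_SRC, {"average": "f", "values": "xs", "total": "acc", "v": "x"})
    print(renamed)
    # Both the original and mutated versions would then be given to an LLM with a
    # fault-localization prompt, and the answers compared for consistency.
```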
Accessibility Issues in Ad‑Driven Web Applications
Abdul Haddi Amjad, Muhammad Danish, Bless Jah, Muhammad Ali Gulzar
IEEE International Conference on Software Engineering (ICSE), 2025
Blocking Tracking JavaScript at the Function Granularity
Abdul Haddi Amjad, Shaoor Munir, Zubair Shafiq, Muhammad Ali Gulzar
ACM Conference on Computer and Communications Security (CCS), 2024
ACM CCS 2024 Distinguished Artifact Award
Blocking JavaScript without Breaking the Web: An Empirical Investigation
Abdul Haddi Amjad, Zubair Shafiq, Muhammad Ali Gulzar
Proceedings on Privacy Enhancing Technologies Symposium (PETS), 2023
TrackerSift: Untangling Mixed Tracking and Functional Web Resources
Abdul Haddi Amjad, Danial Saleem, Zubair Shafiq, Muhammad Ali Gulzar
Internet Measurement Conference (IMC), 2021