AdFlush: A Real-World Deployable Machine Learning Solution for Effective Advertisement and Web Tracker Prevention

Research article · Open Access · Artifacts Available / v1.1

Authors (all at Sungkyunkwan University, Suwon, Republic of Korea):
* Kiho Lee (ORCID 0000-0002-8713-3863)
* Chaejin Lim (ORCID 0009-0009-3547-4904)
* Beomjin Jin (ORCID 0000-0002-3237-0327)
* Taeyoung Kim (ORCID 0000-0001-7505-7931)
* Hyoungshick Kim (ORCID 0000-0002-1605-3866)

WWW '24: Proceedings of the ACM on Web Conference 2024, May 2024, Pages 1902-1913
https://doi.org/10.1145/3589334.3645698
Published: 13 May 2024
ABSTRACT

Conventional ad blocking and tracking prevention tools often fall short in addressing web content manipulation. Machine learning approaches have been proposed to enhance detection accuracy, yet aspects of practical deployment have frequently been overlooked. This paper introduces AdFlush, a novel machine learning model for real-world browsers. To develop AdFlush, we evaluated the effectiveness of 883 features, ultimately selecting 27 key features for optimal performance. We tested AdFlush on a dataset of 10,000 real-world websites, achieving an F1 score of 0.98, thereby outperforming AdGraph (F1 score: 0.93), WebGraph (F1 score: 0.90), and WTAgraph (F1 score: 0.84). Additionally, AdFlush significantly reduces computational overhead, requiring 56% less CPU and 80% less memory than AdGraph. We also assessed AdFlush's robustness against adversarial manipulations, demonstrating superior resilience with F1 scores ranging from 0.89 to 0.98, surpassing the performance of AdGraph and WebGraph, which recorded F1 scores between 0.81 and 0.87. A six-month longitudinal study confirmed that AdFlush maintains a high F1 score above 0.97 without the need for retraining, underscoring its effectiveness.
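Since every comparison in the abstract is reported as an F1 score, a minimal sketch of how that metric is computed from classification counts may help. The function name and the counts below are illustrative assumptions, not figures or code from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall.

    tp: ad/tracker requests correctly flagged (true positives)
    fp: benign requests wrongly flagged (false positives)
    fn: ad/tracker requests that were missed (false negatives)
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen for illustration only (not from the paper).
print(round(f1_score(tp=980, fp=20, fn=20), 2))  # prints 0.98
```

Because F1 penalizes both missed trackers and wrongly blocked benign content, it is a common single-number summary for blocking classifiers where both kinds of error matter.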
Supplemental Material
* rfp2363.mp4: supplemental video (MP4, 45.7 MB)

Index Terms
* Security and privacy > Software and application security > Web application security
Published in
WWW '24: Proceedings of the ACM on Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334

* General Chairs: Tat-Seng Chua (National University of Singapore), Chong-Wah Ngo (Singapore Management University)
* Proceedings Chair: Roy Ka-Wei Lee (Singapore University of Technology and Design)
* Program Chairs: Ravi Kumar (Google), Hady W. Lauw (Singapore Management University)

Copyright (c) 2024 ACM. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States

Author Tags: ad blocking, machine learning, web security, web tracking
Qualifiers: research-article
Conference Acceptance Rates: overall acceptance rate 1,899 of 8,196 submissions, 23%