Safeguarding User Data: Building a United Front Against Unauthorised Scraping
The Problem Space
User data has become a valuable commodity which threat actors seek and platforms protect. Threat actors have turned to automated mass collection of user data to create and sell datasets, replicate existing legitimate webpages, or exploit information for purposes such as stalking or surveillance. In order to raise awareness about the importance of safeguarding data, it is valuable to understand the rise of unauthorised scraping and its impact.
Defining Unauthorised Scraping
‘Authorised scraping’ is the automated collection of data with express permission. ‘Unauthorised scraping’ is the automated collection of data in violation of a platform’s Terms of Service. It typically involves data that a user shares with other users, or data that becomes accessible because a user unwittingly shares access to their account. Because this data is already visible to other users or reachable through the account, unauthorised scraping is not considered a breach of a platform’s security protections. It nonetheless creates the possibility of data misuse. Given this threat, it is important to highlight its implications and raise awareness around safeguarding data and user protection.
How Scraped Data is Used
Demand for data that informs marketing, business development, and personal targeting has increased significantly over the past decade, fuelling a growing market for user data. At the same time, companies have limited the supply of data by restricting access to it in order to protect against misuse. As a result, there has been an unprecedented rise in the number of unauthorised scraping incidents, with negative implications for both companies and users.
Threat actors engage in unauthorised scraping for personal and financial gain. Some scrape to build datasets and databases of aggregated user information that can be bought, sold, or posted online by third parties for profit. Depending on the nature of the scraped data, it can be used to facilitate phishing or spamming attacks, plant spyware, or steal credentials to further exploit individuals. Threat actors can also use data scraped without authorisation to create clone sites, which impersonate legitimate webpages.
In addition, they can aggregate scraped data into datasets for sale on data broker websites or for targeted advertising and marketing purposes. Often, legitimate businesses or researchers are not aware that the services they rely on use data scraped without authorisation. Threat actors also access user data for its political value, using targeted datasets for purposes such as reconnaissance or surveillance. Hostile nation states can also take advantage of such data for their own gain. It is important to note that not all instances of unauthorised scraping lead to these impacts.
The Impact of Unauthorised Scraping
The impacts of unauthorised scraping are far-reaching. Both unauthorised scraping and the subsequent use of the data decrease public trust and threaten industry reputations. They can also lead to system slowdowns, increased costs, and the loss of control over data. For users, unauthorised scraping reduces control over their information and can lead to spamming, fraudulent communication, identity targeting, surveillance, and unexpected disclosures of content intended to be temporary.
Combating Unauthorised Scraping
Currently, there are no industry standards for combating unauthorised scraping. A recent study conducted by NewtonX highlighted that nearly 90% of experts surveyed believe unauthorised scraping prevention is either important or very important, but only 42% of respondents have established strategies to address the practice. To address these gaps, NewtonX concluded that effectively tackling unauthorised scraping requires a collaborative, multi-stakeholder effort. While there is no single approach to combating unauthorised scraping, there is an array of practices that companies use to mitigate it. Consequently, there is a demonstrable need to foster public-private dialogue and to address the current lack of industry-wide collaboration.
About The Mitigating Unauthorized Scraping Alliance
The Mitigating Unauthorized Scraping Alliance (MUSA) brings together industry members to address these challenges and offer a unified front against unauthorised scraping and data misuse. MUSA is working with member companies and experts to publish industry-aligned practices for mitigating unauthorised scraping, with the goals of making unauthorised scraping more difficult across member platforms, reducing the attack vectors available to threat actors, and serving as a resource for media and policymaker engagement.
MUSA provides insight, knowledge, and expertise to the public on unauthorised scraping by hosting public education events, such as an International Data Privacy Day Panel Event on January 31, 2023, and by publishing a monthly newsletter highlighting news and events related to unauthorised scraping.
If you would like to learn more about MUSA and stay informed about unauthorised scraping, visit our website and connect on LinkedIn. If you are interested in joining a diverse group of industries and experts in combating unauthorised scraping and want to get involved with MUSA, contact us or fill out the Membership Inquiry Form.
techUK - Getting Regulation Right for a Digital Society
Visit our Digital Regulation Hub to learn more or to register for regular updates.
techUK forums give members opportunities to showcase the ways in which they are helping to improve privacy and uphold data protection rights. Our working groups, networks, and events - including our annual Digital Ethics Summit and Tech Policy Conference - enable cross-sector collaboration and are crucial sources of insight and thought leadership. To see how we can support your policy work, get in touch by completing the ‘contact us’ form on our Digital Regulation Hub.