A database breach has exposed the profile data of nearly 235 million TikTok, Instagram, and YouTube users.
The profile data of nearly 235 million TikTok, Instagram, and YouTube users was exposed in a database breach. The data had been gathered through a practice known as web scraping, in which a company accesses a service’s web interface and automatically collects the data it finds there.
This is different from a hack, which involves breaking into a system to access data that is not supposed to be publicly accessible. Web scraping, by contrast, only accesses public data.
For example, an automated system can access a series of YouTube channels, collecting the username, photo and number of followers of the channel owner. An entire database of these records becomes a privacy issue despite the fact that the data itself is public.
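As a rough illustration of the kind of collection described above, here is a minimal sketch using only Python's standard library. The HTML snippet, the class names (`channel`, `username`, `avatar`, `followers`) and the helper `scrape_profile` are all hypothetical, made up for this example. Real services use different markup, and no network access is performed here.

```python
from html.parser import HTMLParser

# Hypothetical markup for one public profile page (an assumption for
# illustration; real pages look different and change often).
SAMPLE_PAGE = """
<div class="channel">
  <span class="username">example_channel</span>
  <img class="avatar" src="https://example.com/avatar.png">
  <span class="followers">12345</span>
</div>
"""


class ProfileParser(HTMLParser):
    """Collects username, photo URL and follower count from one page."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._field = None  # name of the text field being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if tag == "img" and css_class == "avatar":
            self.record["photo"] = attrs.get("src")
        elif tag == "span" and css_class in ("username", "followers"):
            self._field = css_class

    def handle_data(self, data):
        if self._field and data.strip():
            self.record[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_profile(html):
    """Parse one page of (hypothetical) profile HTML into a record."""
    parser = ProfileParser()
    parser.feed(html)
    parser.close()
    record = parser.record
    if "followers" in record:
        record["followers"] = int(record["followers"])
    return record


# A scraper repeats this step over many pages, building up a database.
database = [scrape_profile(page) for page in [SAMPLE_PAGE]]
```

Each individual record holds nothing that was not already visible on the page. The privacy concern comes from repeating the step across millions of pages and storing the results together.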
Once the data has been collected into a database, it is normally expected to be protected. But TNW reports that a database of 235 million records was found on the web without password protection.
The extracted data had four main data sets with details of millions of users of the aforementioned platforms. It contained information such as the profile name, full name, profile picture, age, gender and statistics of followers […]
Bob Diachenko, the principal investigator for security firm Comparitech, found three identical copies of the database on Aug. 1. According to Diachenko and his team, the data belonged to a now-defunct company called Deep Social.
When the researchers contacted the company, the request was forwarded to the Hong Kong-based company Social Data, which acknowledged the breach and closed access to the database. However, Social Data denied having ties to Deep Social.
Database collects public information from users
Comparitech detailed that each record contained some or all of the following:
Name of profile
Full real name
Whether the profile belongs to a company or has ads
Statistics on follower engagement, including:
– Number of followers
– Participation rate
– Followers growth rate
– Audience gender
– Audience age
– Location of the audience
Timestamp of the last post
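To make the shape of such a record concrete, here is a hedged sketch of how the fields listed above might be modeled. The class names, field names and types are assumptions for illustration only; the actual schema of the exposed database is not described in the report. Every field is optional because each record contained only "some or all" of them.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AudienceStats:
    # Follower engagement statistics; all types here are assumed.
    followers: Optional[int] = None
    participation_rate: Optional[float] = None
    follower_growth_rate: Optional[float] = None
    audience_gender: Optional[dict] = None
    audience_age: Optional[dict] = None
    audience_location: Optional[dict] = None


@dataclass
class ProfileRecord:
    # Any field may be missing from a given record.
    profile_name: Optional[str] = None
    full_name: Optional[str] = None
    is_company_or_has_ads: Optional[bool] = None
    stats: Optional[AudienceStats] = None
    last_post_timestamp: Optional[str] = None


def present_fields(record):
    """Return the names of the fields that are actually filled in."""
    return [f.name for f in fields(record) if getattr(record, f.name) is not None]


# Made-up example record with only two of the fields populated.
record = ProfileRecord(
    profile_name="example_profile",
    stats=AudienceStats(followers=12345),
)
print(present_fields(record))  # ['profile_name', 'stats']
```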
Furthermore, about 20% of the records sampled contained a telephone number or an email address. As TNW points out, this type of data can be used for spam, but also for phishing attempts.
Web-scraping is generally prohibited by the terms and conditions of the services in question, but a California court ruled last year that it is not illegal. In many cases, that legality can be a good thing.
For example, CityMapper is a very popular application that works out the fastest way to get from A to B in a city, using live traffic and public transport data to do it. These days, most public transportation companies make that data available through an API, but in the early days it was only available on the web. Web scraping by the early CityMapper precursors offered a practical way to make that data more usable.
Web scraping can still be useful today, when companies put useful data on the web but don’t make it available through an API. Price comparison services, for example, often still rely on web-scraping. But scraping of personal data is another matter, and the courts may need to distinguish between the two types of use.