Analytics company Alteryx leaks sensitive info on “virtually every American” household

Alteryx, a California-based analytics company, has been implicated in the leak of a database containing the personal information of more than 123 million American households.

CNET reports that the database contained

…248 different data fields covering a wide variety of specific personal information, including address, age, gender, education, occupation and marital status. Other fields included mortgage and financial information, phone numbers and the number of children in the household.

The database was apparently a combination of public information gathered from the Census Bureau and more sensitive information provided by Alteryx partner Experian. Experian is a large credit bureau and a direct competitor of Equifax.

According to DarkReading.com, the cause of the database leak was an improperly secured Amazon Web Services S3 storage bucket. The bucket that stored the sensitive information was set to allow any authenticated AWS user access to the database. As a result, anyone with a free AWS account could have downloaded the information.
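To make the misconfiguration concrete, here is a minimal sketch of how such a setting could be spotted. It assumes the access was granted through a bucket ACL (the reports do not spell out the exact mechanism) and that the ACL is available as a dict shaped like the response of boto3's `get_bucket_acl`. The function name `find_broad_grants` and the sample ACL are hypothetical and for illustration only.

```python
# Group URIs that S3 ACLs use for broad, non-account-specific grantees.
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
BROAD_GROUPS = {AUTHENTICATED_USERS, ALL_USERS}


def find_broad_grants(acl):
    """Return (group_uri, permission) pairs for grants to broad groups.

    `acl` is assumed to be a dict with a top-level "Grants" list, as in
    the response of boto3's `get_bucket_acl`.
    """
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in BROAD_GROUPS:
            findings.append((grantee["URI"], grant.get("Permission")))
    return findings


# Hypothetical ACL resembling the misconfiguration described above:
# the owner has full control, and any authenticated AWS user can read.
example_acl = {
    "Owner": {"ID": "owner-id-placeholder"},
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id-placeholder"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": AUTHENTICATED_USERS},
         "Permission": "READ"},
    ],
}

for uri, permission in find_broad_grants(example_acl):
    print(f"Broad grant: {permission} to {uri}")
```

In practice the dict would likely come from something like `boto3.client("s3").get_bucket_acl(Bucket=...)`. Bucket policies can also open up access, and this sketch does not inspect them.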

In a statement to Forbes, Alteryx downplayed the gravity of the situation, claiming:

The information in the file does not pose a risk of identity theft to any consumers.

Nonetheless, security researchers at UpGuard, who discovered the vulnerability, point out that

While the spreadsheet uses anonymized record IDs to identify households, the other information in the fields – as well as another spreadsheet in the bucket, to be discussed shortly – are sufficiently detailed as to be not merely often identifying, but with a high degree of specificity.

Finally, the concentration of publicly and commercially-gleaned data about tens of millions of American households, and the exposure of this data to anyone with a free AWS account entering a URL, shows just how devastating an exposure can be at an enormous scale. The data exposed in this bucket would be invaluable for unscrupulous marketers, spammers, and identity thieves, for whom this data would be largely reliable and, more importantly, varied.