Data Security

Let’s put an end to data leaks


As cyber threats evolve, businesses have remained primarily focused on perimeter security, often overlooking potential vulnerabilities within their web application programming interfaces (APIs).

Yes, we’re talking about firewalls, web application firewalls (WAFs), and a newer security category, API security, which has become a must-have today. Regulations and compliance will require companies to deploy them, sure, and some might even feel more secure for it. We are not against these tools; on the contrary. But it is a mistake to think companies are immune to external attacks because they have these rings of protection. Not to mention that these tools are often poorly configured for protecting the business and its customers.

The main issue with all these tools is that their core is based on automatic detection. Detection-based security tools have a playbook of attacks and patterns that they look for, and anything that gets flagged gets blocked. They all share the same inherent flaw: they look for bad actors who try to misuse the system by bypassing protection mechanisms, but not for those who use valid but insecure functionality. Ouch. Therefore, we cannot fully depend on them. What happens when a real attack takes place and they miss it? How would a security team risk-manage that?

Security threat modeling has become a critical step when teams design a system with security in mind, aka security by design. The claim that perimeter security based on detection tools is good enough no longer holds.

Sometimes the requests to the web server are legit and these automatic tools aren’t going to flag and report them. And then the security team gets left in the dark. Good luck protecting customers. Remember that web APIs are meant to receive requests and send data back. With all due respect to WAFs and API security – the best security has proper access checks and user input validation where due.

Many security pros imagine attackers to be as sophisticated as everybody loves to portray them, but that is not the case with simple data scraping through APIs.

Duolingo's recent data leak incident underscores this oversight. The root of such vulnerabilities often lies in software coding—specifically, the absence of stringent access checks crucial for safeguarding user privacy. This isn't a new concern.

In fact, the Open Web Application Security Project (OWASP) has long cautioned against such issues. Yet, as the Duolingo incident illustrates, even major companies can sometimes neglect these warnings. We’re talking about proper data access checks within the API’s implementation of Duolingo’s system. Something that no automatic tool would know to prevent in real time.

Security pros know the issue as broken access checks, and OWASP keeps changing its name because the acronyms, like BOLA (Broken Object Level Authorization) and IDOR (Insecure Direct Object Reference), aren’t easy to memorize or understand quickly. It means that an attacker can ask an API for access to another object in the system, one that the attacker doesn’t necessarily own. And that shouldn’t happen. We often see URLs in the browser with odd-looking parameters, something like this fake link:

https://acme.com/system1/webservice?op=fetch-doc&docid=A

Looking carefully at the URL, users will see a parameter called ‘docid’ that receives a document identifier. When the web application receives this request, it will want to return the requested document. However, what happens if document A doesn’t belong to the attacker? Logically, a security pro will know that the system shouldn’t return this document, since returning it would be an error.
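The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Document` type, the in-memory `DOCUMENTS` store, and the `fetch_doc` handler are hypothetical names invented here, not taken from any real system. The point is that the handler compares the document's owner to the authenticated caller before returning anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    owner_id: str
    body: str


# Hypothetical in-memory store standing in for a real database.
DOCUMENTS = {
    "A": Document("A", owner_id="alice", body="Alice's contract"),
    "B": Document("B", owner_id="bob", body="Bob's invoice"),
}


class AccessDenied(Exception):
    """Raised when the caller may not see the requested document."""


def fetch_doc(caller_id: str, doc_id: str) -> Document:
    """Return the document only if the authenticated caller owns it."""
    doc = DOCUMENTS.get(doc_id)
    # Treat "missing" and "not yours" the same way, so a caller
    # cannot probe which identifiers exist.
    if doc is None or doc.owner_id != caller_id:
        raise AccessDenied(doc_id)
    return doc
```

With this in place, `fetch_doc("alice", "A")` returns Alice's document, while `fetch_doc("bob", "A")` raises `AccessDenied`, even though both requests look equally valid to a detection-based tool.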

When developers forget to write code that covers logical access checks, we can’t blame them. It might sound easy, but it’s not: big systems sometimes have hundreds of APIs, and the parameters and options are not always straightforward to decipher. Either way, access checks are critical for preserving customers’ privacy and confidentiality.
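One possible way to make the check harder to forget across many endpoints is to write it once and attach it to every handler. The Python sketch below assumes a hypothetical `OWNERS` table and a hypothetical `requires_owner` decorator; the handler names are illustrative only, and a real system would look up ownership in its own data store.

```python
from functools import wraps

# Hypothetical ownership table: object identifier -> owning user.
OWNERS = {"A": "alice", "B": "bob"}


class AccessDenied(Exception):
    """Raised when the caller does not own the requested object."""


def requires_owner(handler):
    """Run the ownership check before the wrapped handler is called."""
    @wraps(handler)
    def wrapper(caller_id, object_id, *args, **kwargs):
        owner = OWNERS.get(object_id)
        # Unknown objects are rejected the same way as foreign ones.
        if owner is None or owner != caller_id:
            raise AccessDenied(object_id)
        return handler(caller_id, object_id, *args, **kwargs)
    return wrapper


@requires_owner
def fetch_doc(caller_id, object_id):
    return f"contents of {object_id}"


@requires_owner
def delete_doc(caller_id, object_id):
    return f"deleted {object_id}"
```

The design choice here is that a handler without the decorator stands out in code review, which is easier to spot than a missing `if` statement buried inside one of hundreds of handlers.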

Today, a single API capable of exposing data, out of hundreds of APIs, is enough, and game over: customer data is exposed! Rest assured that cyber criminals will spot that API sooner or later, if it exists.

Such access checks are what stand between secure software and an attacker leaking other customers’ information without having legitimate access to it.

The good news: solutions already exist, in the form of tools that enforce rigorous access controls at both the data and code levels. Implementing these products isn't just about technical upgrades. It’s about fostering a heightened awareness of evolving cyber threats and being proactive in defense. In the end, it’s all about data access controls, which have become the most important mechanism to have as a holistic software layer for protecting customer data. Otherwise, we will keep seeing more Duolingo-style incidents.

Gil Dabah, co-founder and CEO, Piiano
