Denormalize Access Control

Access control is both immensely useful and incredibly broken.

Access control, or the ability to constrain access to data and programs in a shared system, is the only way that we, as users of shared systems, can maintain our identities, personal security, and privacy. Shared systems, including databases, file servers, social networking sites, virtualized computing systems, vendor accounts, control panels, management tools, and so forth, all need robust, flexible, granular, and scalable access control tools.

Contemporary access control tools--access control lists (ACLs) and access control groups--indeed the entire conceptual framework for managing access to data and resources, don’t work. From a theoretical perspective, ACLs that express a relationship between users or groups of users and data or resources represent a parsimonious solution to the “access control problem”: if properly deployed, only those with access grants will have access to a given resource.

In practice, these kinds of relationships do not work. Typically, relationships between data and users are rich and complex, and different users need to be able to do different things with different resources. Some users need “read only” access, others need partial read access, and some need read and write access but only to a subset of a resource. While ACL systems can impose these kinds of restrictions, the access control abstraction doesn’t match the data abstraction or the real-world relationships that it supposedly reflects.

Compounding this problem are two important factors:

  1. Access control needs change over time in response to social and cultural shifts among the users and providers of these resources.
  2. There are too many pieces of information or resources in any potential shared system to allocate access on a per-object or per-resource basis, and the volume of objects and resources is only increasing.

Often many objects or resources have the same or similar access control patterns, which leads to the “group” abstraction. Groups make it possible to describe a specific access control pattern that applies to a number of objects, and connect this pattern with specific resources.

Conceptual deficiencies:

  • There’s a volume problem. Access control data represents a many-to-many-to-many relationship. There are many different users and (nested) groups, many different kinds of access controls that systems can grant, and many different (nested) resources. This would be unmanageably complex without the possibility for nesting, but nesting means that the relationships between resources and between groups and users are also important. With the possibility for nesting, access control becomes impossible to reason about.

  • ACLs and group-based access control don’t account for the fact that access must be constantly evolving, and current systems don’t contain support for ongoing maintenance. (We need background threads that go through and validate access control consistency.) Also, all access control grants must have some capacity for automatic expiration.

  • Access control requirements and possibilities shift as data becomes more or less structured, and as data use patterns change. The same conceptual framework that works well for access control in the context of the data stored in a relational database doesn’t work so well when the data in question is a word processing document, an email folder, or a spreadsheet.

    The fewer people who need access to a single piece of data, the simpler the access control system can be. While this seems self-evident, it also means that access control systems are difficult to test in the really large, complex systems in which they’re used.

  • Group-based access control systems, in effect, normalize data about access control, in an effort to speed up data access times. While this performance is welcome, in most cases granting access via groups leads to an overly liberal distribution of access control rights. At once, it’s too difficult to understand “who has access to what” and too easy to add people to groups that give them more access than they need.
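To make the “who has access to what” complaint concrete, here is a minimal sketch of what answering that question looks like when groups can nest. The `groups` mapping and the `expand_members` helper are hypothetical names invented for illustration; they don’t describe any particular ACL product.

```python
def expand_members(group, groups, seen=None):
    """Resolve a (possibly nested) group to the set of users it contains.

    `groups` is an assumed mapping of group name -> list of members,
    where a member is either a user name or another group name.
    """
    seen = set() if seen is None else seen
    if group in seen:
        # Nesting can form cycles; without this guard we would loop forever.
        return set()
    seen.add(group)
    users = set()
    for member in groups.get(group, ()):
        if member in groups:
            # The member is itself a group, so another look-up is needed.
            users |= expand_members(member, groups, seen)
        else:
            users.add(member)
    return users


groups = {
    "staff": ["alice", "contractors"],
    "contractors": ["bob", "interns"],
    "interns": ["carol", "staff"],  # a cycle back to "staff"
}
print(sorted(expand_members("staff", groups)))
```

Even in this toy case, nobody looking at the “staff” entry alone can tell that carol has access. You have to walk the whole structure, and every level of nesting adds another look-up.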

So the solution:

  1. Denormalize all access control data,
  2. don’t grant access to groups, and
  3. forbid inheritance.

This is totally counter to the state of the art. In most ways, normalized access control data, with role/group-based access control, and complex inheritance are the gold standard. Why would it work?

  • If you have a piece of data, you will always be able to determine who has access to it, without needing to do another look-up.

  • If you can deactivate credentials, then a background process can go through and remove access without causing a large security problem. (For partial removes, you would freeze an account, let the background process modify access control and then unfreeze the account.)

    The downside is that, potentially, in a large system, it may take a rather long time for access grants to propagate to users. Locking user accounts makes the system secure/viable, but doesn’t make the process any quicker.

    As an added bonus, these processes could probably be independent and wouldn’t require any sort of shared state or lock, which means many such operations could run in parallel, and they could stop and restart at will.

  • The inheritance option should be fuzzy. Some sort of “bucket-based” access control should be possible, if there’s a lot of data with the same access control rules and users.

    Once things get more complex, buckets are the wrong metaphor, and you should use granular controls everywhere.
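A rough sketch of what this could look like, assuming each record simply carries its own per-user grants and every grant has an expiry time. The `Grant` and `Record` types and the `can`, `who_has_access`, and `sweep` helpers are all made up for this example; it is an illustration of the idea, not a finished design.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Grant:
    """One user's access to one record: no groups, no inheritance."""
    permissions: frozenset  # e.g. frozenset({"read", "write"})
    expires_at: float       # Unix timestamp; every grant expires (assumption)


@dataclass
class Record:
    """A piece of data that stores its access control data alongside it."""
    payload: str
    grants: dict = field(default_factory=dict)  # user id -> Grant


def can(record, user, permission, now=None):
    """Answer an access question from the record alone, with no other look-up."""
    now = time.time() if now is None else now
    grant = record.grants.get(user)
    return (grant is not None
            and now < grant.expires_at
            and permission in grant.permissions)


def who_has_access(record, now=None):
    """List the users holding an unexpired grant on this record."""
    now = time.time() if now is None else now
    return sorted(u for u, g in record.grants.items() if now < g.expires_at)


def sweep(records, revoked_users=frozenset(), now=None):
    """One pass of a background process: drop expired grants and every
    grant held by a deactivated user. Returns the number of grants removed."""
    now = time.time() if now is None else now
    removed = 0
    for record in records:
        for user in list(record.grants):
            if user in revoked_users or record.grants[user].expires_at <= now:
                del record.grants[user]
                removed += 1
    return removed
```

Because `sweep` only touches the records it is handed, several sweeps could run over disjoint slices of the data without sharing state, which is the property described above. Freezing the account before the sweep starts is left out of the sketch.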

Problems/Conclusion:

  • Denormalization might fix the problems with ACLs and permissions systems, but it doesn’t fix the problems with distributed identity management.

    As a counterpoint, this seems like a cryptography management problem.

  • Storing access control information with data means that it’s difficult to take a user and return a list of what that user’s credentials have access to.

    In truth, centralized ACL systems are subject to this flaw as well.

  • A huge part of the problem with centralized ACL derives from nesting, and the fact that we tend to model/organize data in tree-like structures that often run counter to the organization of access control rights. As a result, access control tools must be arbitrary.
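The reverse look-up problem is easy to see in a small sketch. Assume, for illustration only, that `records` maps a record id to that record’s embedded grants (user id to a set of permissions). Both function names are hypothetical.

```python
from collections import defaultdict


def accessible_by_scan(records, user):
    """Full scan: visit every record and check its embedded grants."""
    return sorted(rid for rid, grants in records.items() if user in grants)


def build_user_index(records):
    """Derive a user -> record ids index from the per-record grants.

    The index is derived data; the grants stored with each record
    remain the source of truth and the index can be rebuilt at any time.
    """
    index = defaultdict(set)
    for rid, grants in records.items():
        for user in grants:
            index[user].add(rid)
    return index


records = {
    "doc-1": {"alice": {"read"}},
    "doc-2": {"alice": {"read", "write"}, "bob": {"read"}},
}
print(accessible_by_scan(records, "alice"))
print(sorted(build_user_index(records)["bob"]))
```

The scan costs time proportional to the number of records, which is the difficulty described above. The index answers the question quickly, but it is a second copy of the access data that some background process has to keep in sync.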

Security Isn't a Technological Problem

Security, of technological resources, isn’t a technological problem. The security of technological resources and information is a problem with people.

There.

That’s not a very groundbreaking conclusion, but I think that the effects of what this might mean for people doing security[1] may be more startling.

Beyond a basic standard of “writing and using quality software” and following sane administration practices, the way to resolve security issues is to fix the way people use technology and understand the implications of that use.

There are tools that help control user behavior to greater or lesser degrees, things like permissions control, management, auditing, and encryption, but they’re just tools: they don’t solve the human problems and the policy/practice issues that are the core of best security practice. Teaching people how their technology works, what’s possible and what’s not possible, and finally how to control their own data and resources is the key to increasing and providing security services to everyone.

I think of this as the “free software solution,” because it draws on the strengths and methods of free software to shape and enhance people’s user experience and to improve the possible security of the public network as a whole. One of the things that has always drawn me to free software, and one of its least understood properties, deals with the power of source code to create an environment that facilitates education and inquiry. People who regularly use free software, I’d bet, have a better understanding of how technology works than people who don’t, and it’s not because free software users have to deal with less polished software (not terribly true), but has something to do with a different relationship between creators and users of software. I think it would be interesting to take this model and apply it to the “security problem.”

With luck, teaching more people to think about security processes will mean that users will generally understand:

  • how encryption works, and be more amenable to managing their own cryptographic identities and infrastructure. (PGP and SSH)
  • how networking works on a basic level to be able to configure, set, and review network security. (LAN Administration, NetFilter)
  • how passwords are stored and used, and what makes strong passwords that are easy to remember and difficult to break.
  • how to control and consolidate identity systems to minimize social engineering vulnerabilities. (OpenID, OAuth, etc.)
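On the password point, a little arithmetic helps show why “easy to remember” and “difficult to break” aren’t necessarily at odds. This is a minimal sketch that assumes every character or word is picked uniformly at random (which is not how most people choose passwords). The function names are mine, and the 7776-word list size is borrowed from Diceware-style lists as an assumption.

```python
import math
import secrets


def entropy_bits(symbols, length):
    """Bits of entropy when each of `length` positions is chosen
    uniformly at random from `symbols` possibilities."""
    return length * math.log2(symbols)


def random_passphrase(wordlist, words=5):
    """Pick words with a cryptographically strong random source."""
    return " ".join(secrets.choice(wordlist) for _ in range(words))


# 8 random characters drawn from the 94 printable ASCII symbols (no space)
print(round(entropy_bits(94, 8), 1))
# 5 random words drawn from an assumed 7776-word list
print(round(entropy_bits(7776, 5), 1))
```

Under these assumptions the five-word phrase comes out ahead of the eight-character jumble, and it is arguably easier to remember. The estimate only holds when the words really are chosen at random.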

There’s a lot of pretty basic knowledge that I think most people don’t have. At the same time, I think it’s safe to say that most of the difficult engineering questions regarding security have been solved, though there’s still a bunch of tooling and infrastructure work on the part of various services that would make better security practices easier to maintain (e.g. better PGP support in mail clients). In the meantime...

Stay smart.


  1. Security, being a process, rather than a product. Cite.

Flaws with SSH

In response to 9 Awesome SSH Tricks, someone posted the following quote (on the old commenting system):

The workarounds have become smoother and some of the things we can do with networks of Unix machines are pretty impressive, but when ssh is the foundation of your security architecture, you know things aren’t working as they should.

-- Rob Pike, 2004

So let’s clarify things a bit. SSH is great as an end user protocol, and great for dealing with the realities of our distributed computing environment in an exigent manner. SSH lets us:

  • connect securely to remote systems.
  • quickly establish tunnels through remote machines.
  • administer remote systems securely.
  • provide end-users with key-based authentication.

SSH is great for providing end users with a secure way of interacting with computer systems in a networked environment. It’s not, however, the magic bullet for security policy. If your or your organization’s security practices revolve entirely around SSH tunnels, then you’re probably in trouble or about to be in trouble. Use traditional VPNs and TLS/SSL when it makes sense and develop a sane security policy.

But don’t forget SSH, and if you do use SSH, know that there are some really awesome things that OpenSSH makes possible.

On Public Key Encryption and Security

As part of the moving process I got a bank account, and I was reminded, again, of how comically flawed the security systems of most online banks are, which led me to even greater anger about security in general. The following rant is what happened.

I should say first that I’m not really a security expert, and I just dabble in this stuff. Having said that…

“Security” online and in a digital context covers two pretty distinct aspects:

  • Identity. In real life we can show our driver’s license or passport, we can say “I’m [insert name here],” and in many situations another person is probably not too far away to be able to say, “I know them, they’re [insert name here].” Online? Well, identity is less easily and reliably verified. Identity matters both for individuals (and organizations) themselves and for the things that people (and organizations) produce or own: emails, documents, web pages, software, and so forth.
  • Encryption. Basically, we encrypt data so that we can be relatively certain that no one gains access to our data by listening in on our network connection or by gaining access to physical media. From encryption we get privacy, and as long as the encryption scheme works as it should and the encryption covers communications end-to-end, it’s pretty safe to assume some measure of privacy.

It turns out, from a technical perspective, that encryption is reasonably easy to achieve. It’s true that nearly all cryptographic schemes are ultimately breakable; however, if we can generally assume best practices (expiring old keys, keeping your private keys safe, etc.), then I feel fairly safe in asserting that encryption isn’t the weak part of the security equation.

This leaves identity on the table. Which is sort of a messy affair.

Just because someone says, “Hello my name is Alice,” it doesn’t mean that they are Alice. Just because they have Alice’s password doesn’t necessarily mean that they are Alice (but that’s a safer bet). The best and most reliable way to verify someone’s identity, it turns out, is to have a “web of trust.”

Which basically means, you assert that you are who you say you are, and then “vouch” for other people who you “know” are who they say they are. Once you’ve vouched for someone, you then “trust” the people they’ve vouched for, and so forth. Good web-of-trust systems allow you to revoke trust, and provide some mechanism for propagating trusted networks of identities among users.
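As a toy illustration of how trust might propagate and be revoked, here is a minimal sketch. It assumes vouches are kept as a simple mapping from a person to the people they’ve vouched for, and that trust is only followed a limited number of hops. The `trusted_identities` function, the `max_depth` cutoff, and the data layout are all hypothetical; real systems such as PGP weigh signatures in more nuanced ways.

```python
from collections import deque


def trusted_identities(vouches, me, max_depth=2, revoked=frozenset()):
    """Return the identities reachable from `me` by following vouches,
    at most `max_depth` hops away, never passing through a revoked one."""
    trusted = {me}
    queue = deque([(me, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # don't extend trust beyond the hop limit
        for other in vouches.get(person, ()):
            if other in revoked or other in trusted:
                continue
            trusted.add(other)
            queue.append((other, depth + 1))
    return trusted


vouches = {
    "me": ["alice"],
    "alice": ["bob"],
    "bob": ["carol"],
}
print(sorted(trusted_identities(vouches, "me")))
print(sorted(trusted_identities(vouches, "me", revoked={"alice"})))
```

Revoking alice also drops everyone who was only reachable through her, which is the behavior you would want from a revocation mechanism.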

The system described above is very peer-to-peer and ad hoc (bottom up, if you will). There are also more centralized (top down) systems, which can also function to verify identity in a digital context. These systems depend on commonly trusted third parties that are tasked with researching and verifying the identity of individuals and organizations. So-called “certificate authorities” make it possible to “trust identities” without needing a personal web-of-trust network that extends to cover every person and organization you come in contact with.


Let’s bring this back to the case study of the bank.

They encrypt their traffic, end to end, with SSL (i.e. TLS), and they pay for a certificate from a certificate authority with a good reputation. The weak part of this equation? You and me, apparently.

To verify our identity, we have this arcane and convoluted scheme whereby we have to enter hard-to-remember passwords in stages (my last bank had us enter passwords on three pages in succession) so that the bank can be sure we’re who we say we are. And the sad part is that while the technology for secure and reliable encryption and identity verification is pretty advanced (in the big picture), we still have to send passwords. Here are my thoughts on passwords:

  • The best passwords are the hardest to remember. The best passwords don’t contain words, and contain numbers, letters, and punctuation. But these passwords are difficult to remember, and I think many people avoid picking truly secure passwords because of the difficulty.

  • Passwords aren’t bad, and they’re--I suspect--most useful as a casual deterrent and a reminder to users of the potential gravity of the situation; but they’re not exactly a reliable fingerprinting mechanism.

  • Some sort of cryptographic handshake would be orders of magnitude more secure, and much less painful for users.

    I have this theory, that security for banks (and other similar institutions) is more about giving the appearance of being secure (asking for more complex passwords, making you jump through more hoops, etc.) and less about doing things that would be more secure in the long run. But maybe that’s just me.
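To give a sense of what a “cryptographic handshake” could look like, here is a minimal challenge-response sketch. One loud caveat: a public-key version would have the client sign the challenge with a private key, but Python’s standard library has no signature primitives, so this sketch substitutes a shared secret and HMAC. The function names are invented, and this is an illustration rather than a protocol anyone should deploy as-is.

```python
import hashlib
import hmac
import secrets


def new_challenge():
    """Server side: a fresh random nonce, meant to be used only once."""
    return secrets.token_bytes(32)


def respond(key, challenge):
    """Client side: prove possession of the key without ever sending it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify(key, challenge, response):
    """Server side: recompute the expected answer and compare in constant time."""
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


key = secrets.token_bytes(32)   # shared secret (assumption: provisioned out of band)
challenge = new_challenge()
print(verify(key, challenge, respond(key, challenge)))
```

The secret never crosses the network, and because each challenge is random, a captured response is useless for a later login. The user doesn’t have to remember or type anything, which is the “less painful” part.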

Anyway, back onto more general interest topics in the near future.