Generally, authorization security (determining whether a subject has access to data) is based on labels. For example, file pathnames determine what directory a file resides under, and accordingly, what discretionary access controls are assigned to the file. Firewalls determine what packets are authorized based on IP addresses and port numbers from packet headers. Document management systems often require users to apply tags to newly-scanned documents so the documents can be protected and routed appropriately.
These labels we assign to data (filenames, port numbers, tags, etc.) need to be representative of the information contents. We often depend on users to use appropriate and correct labels so we can implement hard and fast controls on the data.
Unfortunately, labels are often indeterminate or not representative of the content. For example, an HTTPS stream to a site like GotoMyPC that actually is providing remote access to a PC screen results in complete access to any data and applications on that PC, but the contents of that HTTPS stream can't be controlled short of blocking all access to the GotoMyPC web site.
Content-aware data loss prevention systems use a variety of approaches to authorize data (in use, at rest, or in motion) based on the actual content of the data. For those who understand and accept its approach, it enables deeper understanding of information and also enables more intelligent authorization decisions. DLP also provides a backstop when other access controls fail, such as when users forget to correctly tag a document.