As part of their "Defective by Design" anti-DRM campaign, the FSF recently made the following claim:
This is part of an overall argument that Microsoft's insistence that only hardware with a TPM can run Windows 11 is with the goal of aiding streaming companies in their attempt to ensure media can only be played in tightly constrained environments.
I'm going to be honest here and say that I don't know what Microsoft's actual motivation for requiring a TPM in Windows 11 is. I've been talking about TPM stuff for a long time. My job involves writing a lot of TPM code. I think having a TPM enables a number of worthwhile security features. Given the choice, I'd certainly pick a computer with a TPM. But in terms of whether it's of sufficient value to lock out Windows 11 on hardware with no TPM that would otherwise be able to run it? I'm not sure that's a worthwhile tradeoff.
What I can say is that the FSF's claim is just 100% wrong, and since this seems to be the sole basis of their overall claim about Microsoft's strategy here, the argument is pretty significantly undermined. I'm not aware of any streaming media platforms making use of TPMs in any way whatsoever. There is hardware DRM that the media companies use to restrict users, but it's not in the TPM - it's in the GPU.
Let's back up for a moment. There's multiple different DRM implementations, but the big three are Widevine (owned by Google, used on Android, Chromebooks, and some other embedded devices), Fairplay (Apple implementation, used for Mac and iOS), and Playready (Microsoft's implementation, used in Windows and some other hardware streaming devices and TVs). These generally implement several levels of functionality, depending on the capabilities of the device they're running on - this will range from all the DRM functionality being implemented in software up to the hardware path that will be discussed shortly. Streaming providers can choose what level of functionality and quality to provide based on the level implemented on the client device, and it's common for 4K and HDR content to be tied to hardware DRM. In any scenario, they stream encrypted content to the client and the DRM stack decrypts it before the compressed data can be decoded and played.
The "problem" with software DRM implementations is that the decrypted material is going to exist somewhere the OS can get at it at some point, making it possible for users to simply grab the decrypted stream, somewhat defeating the entire point. Vendors try to make this difficult by obfuscating their code as much as possible (and in some cases putting some of it in-kernel), but pretty much all software DRM is at least somewhat broken and copies of any new streaming media end up being available via Bittorrent pretty quickly after release. This is why higher quality media tends to be restricted to clients that implement hardware-based DRM.
The implementation of hardware-based DRM varies. On devices in the ARM world this is usually handled by performing the cryptography in a Trusted Execution Environment, or TEE. A TEE is an area where code can be executed without the OS having any insight into it at all, with ARM's TrustZone being an example of this. By putting the DRM code in TrustZone, the cryptography can be performed in RAM that the OS has no access to, making the scraping described earlier impossible. x86 has no well-specified TEE (Intel's SGX is an example, but is no longer implemented in consumer parts), so instead this tends to be handed off to the GPU. The exact details of this implementation are somewhat opaque - of the previously mentioned DRM implementations, only Playready does hardware DRM on x86, and I haven't found any public documentation of what drivers need to expose for this to work.
In any case, as part of the DRM handshake between the client and the streaming platform, encryption keys are negotiated with the key material being stored in the GPU or the TEE, inaccessible from the OS. Once decrypted, the material is decoded (again either on the GPU or in the TEE - even in implementations that use the TEE for the cryptography, the actual media decoding may happen on the GPU) and displayed. One key point is that the decoded video material is still stored in RAM that the OS has no access to, and the GPU composites it onto the outbound video stream (which is why if you take a screenshot of a browser playing a stream using hardware-based DRM you'll just see a black window - as far as the OS can see, there is only a black window there).
Now, TPMs are sometimes referred to as a TEE, and in a way they are. However, they're fixed function - you can't run arbitrary code on the TPM, you only have whatever functionality it provides. But TPMs do have the ability to decrypt data using keys that are tied to the TPM, so isn't this sufficient? Well, no. First, the TPM can't communicate with the GPU. The OS could push encrypted material to it, and it would get plaintext material back. But the entire point of this exercise was to avoid the decrypted version of the stream from ever being visible to the OS, so this would be pointless. And rather more fundamentally, TPMs are slow. I don't think there's a TPM on the market that could decrypt a 1080p stream in realtime, let alone a 4K one.
The FSF's focus on TPMs here is not only technically wrong, it's indicative of a failure to understand what's actually happening in the industry. While the FSF has been focusing on TPMs, GPU vendors have quietly deployed all of this technology without the FSF complaining at all. Microsoft has enthusiastically participated in making hardware DRM on Windows possible, and user freedoms have suffered as a result, but Playready hardware-based DRM works just fine on hardware that doesn't have a TPM and will continue to do so.
Today, most of the major streaming media platforms utilize the TPM to decrypt media streams, forcefully placing the decryption out of the user's control(from here).
This is part of an overall argument that Microsoft's insistence that only hardware with a TPM can run Windows 11 is with the goal of aiding streaming companies in their attempt to ensure media can only be played in tightly constrained environments.
I'm going to be honest here and say that I don't know what Microsoft's actual motivation for requiring a TPM in Windows 11 is. I've been talking about TPM stuff for a long time. My job involves writing a lot of TPM code. I think having a TPM enables a number of worthwhile security features. Given the choice, I'd certainly pick a computer with a TPM. But in terms of whether it's of sufficient value to lock out Windows 11 on hardware with no TPM that would otherwise be able to run it? I'm not sure that's a worthwhile tradeoff.
What I can say is that the FSF's claim is just 100% wrong, and since this seems to be the sole basis of their overall claim about Microsoft's strategy here, the argument is pretty significantly undermined. I'm not aware of any streaming media platforms making use of TPMs in any way whatsoever. There is hardware DRM that the media companies use to restrict users, but it's not in the TPM - it's in the GPU.
Let's back up for a moment. There's multiple different DRM implementations, but the big three are Widevine (owned by Google, used on Android, Chromebooks, and some other embedded devices), Fairplay (Apple implementation, used for Mac and iOS), and Playready (Microsoft's implementation, used in Windows and some other hardware streaming devices and TVs). These generally implement several levels of functionality, depending on the capabilities of the device they're running on - this will range from all the DRM functionality being implemented in software up to the hardware path that will be discussed shortly. Streaming providers can choose what level of functionality and quality to provide based on the level implemented on the client device, and it's common for 4K and HDR content to be tied to hardware DRM. In any scenario, they stream encrypted content to the client and the DRM stack decrypts it before the compressed data can be decoded and played.
The "problem" with software DRM implementations is that the decrypted material is going to exist somewhere the OS can get at it at some point, making it possible for users to simply grab the decrypted stream, somewhat defeating the entire point. Vendors try to make this difficult by obfuscating their code as much as possible (and in some cases putting some of it in-kernel), but pretty much all software DRM is at least somewhat broken and copies of any new streaming media end up being available via Bittorrent pretty quickly after release. This is why higher quality media tends to be restricted to clients that implement hardware-based DRM.
The implementation of hardware-based DRM varies. On devices in the ARM world this is usually handled by performing the cryptography in a Trusted Execution Environment, or TEE. A TEE is an area where code can be executed without the OS having any insight into it at all, with ARM's TrustZone being an example of this. By putting the DRM code in TrustZone, the cryptography can be performed in RAM that the OS has no access to, making the scraping described earlier impossible. x86 has no well-specified TEE (Intel's SGX is an example, but is no longer implemented in consumer parts), so instead this tends to be handed off to the GPU. The exact details of this implementation are somewhat opaque - of the previously mentioned DRM implementations, only Playready does hardware DRM on x86, and I haven't found any public documentation of what drivers need to expose for this to work.
In any case, as part of the DRM handshake between the client and the streaming platform, encryption keys are negotiated with the key material being stored in the GPU or the TEE, inaccessible from the OS. Once decrypted, the material is decoded (again either on the GPU or in the TEE - even in implementations that use the TEE for the cryptography, the actual media decoding may happen on the GPU) and displayed. One key point is that the decoded video material is still stored in RAM that the OS has no access to, and the GPU composites it onto the outbound video stream (which is why if you take a screenshot of a browser playing a stream using hardware-based DRM you'll just see a black window - as far as the OS can see, there is only a black window there).
Now, TPMs are sometimes referred to as a TEE, and in a way they are. However, they're fixed function - you can't run arbitrary code on the TPM, you only have whatever functionality it provides. But TPMs do have the ability to decrypt data using keys that are tied to the TPM, so isn't this sufficient? Well, no. First, the TPM can't communicate with the GPU. The OS could push encrypted material to it, and it would get plaintext material back. But the entire point of this exercise was to avoid the decrypted version of the stream from ever being visible to the OS, so this would be pointless. And rather more fundamentally, TPMs are slow. I don't think there's a TPM on the market that could decrypt a 1080p stream in realtime, let alone a 4K one.
The FSF's focus on TPMs here is not only technically wrong, it's indicative of a failure to understand what's actually happening in the industry. While the FSF has been focusing on TPMs, GPU vendors have quietly deployed all of this technology without the FSF complaining at all. Microsoft has enthusiastically participated in making hardware DRM on Windows possible, and user freedoms have suffered as a result, but Playready hardware-based DRM works just fine on hardware that doesn't have a TPM and will continue to do so.