** Patch Theory **
How the patch works and/or what it does
Disclaimer:
I am not a PCI spec expert, so the statements below may be an oversimplification or even incorrect.
PCI devices are not required to support all of the bus-level capabilities that are listed in the official spec. Each manufacturer can choose which capabilities their products will and will not support. Because PCIe is a newer version of the original PCI spec, it defines a number of additional capabilities, one of them being ETFs, or Extended Tag Fields.
Starting with Monterey, macOS enables two extended tag fields, namely 8b and 10b (8-bit and 10-bit), in the function probeBusGated, which is a member of the class IOPCIBridge. Before macOS can enable these fields, it has to discover whether the device supports them.
This is where bridgeProbeChild comes in. It is a member of the class IOPCIConfigurator and is involved in scanning the entire PCI bus, querying all devices for their capabilities, and saving all of this information in the IORegistry.
In Big Sur, probeBus does not enable ETFs.
In Monterey and newer, probeBusGated (which is a slightly modified version of probeBus that runs in a single thread) checks whether ETFs are supported and, if so, enables two of them: ETF 8b and ETF 10b. Each of these ETFs is checked separately and enabled only if it is supported.
Unfortunately for our AM5 motherboards (and MSI X570 motherboards), when ETF 10b is enabled on any device that supports it, that device becomes a zombie from that point onwards! This causes our I225, WIFI, NVME, and other devices to become zombies.
There are at least two ways to patch this problem, hence the two different sets of patches provided.
Method 1:
- Looking at the code below from probeBusGated, we see that both ETFs are checked and enabled inside an if { } clause. Specifically:
if (!(useDefaultETFSettings) && (deviceCaps & 0x20))
- If we modify this if-condition to always evaluate to false, then the inner block will never be executed, so neither ETF 8b nor ETF 10b gets enabled
- That is what the first patch does in x86 Assembly Code
In the code below, ETFE means Extended Tag Field Enable.
C:
// Enable extended tag fields, if supported
uint32_t deviceCaps = nub->reserved->configEntry->expressDeviceCaps1;
uint32_t deviceCaps2 = nub->reserved->configEntry->expressDeviceCaps2;
bool useDefaultETFSettings = gIOPCIFlags & kIOPCIConfiguratorDefaultETF;
if (!(useDefaultETFSettings) && (deviceCaps & 0x20))
{
    // Need to set ETFE for either 8b or 10b tags
    uint16_t deviceControl = nub->configRead16(nub->reserved->expressCapability + 0x08, "probeBusGated:ExpressCapa");
    deviceControl |= 0x100;
    nub->configWrite16(nub->reserved->expressCapability + 0x08, deviceControl, "probeBusGated2");
    // Enable 10b Tag Fields if the device can request them and its root port can complete them
    struct IOPCIConfigEntry *rootPortEntry = nub->reserved->configEntry->rootPortEntry;
    if ((deviceCaps2 & 0x20000) && (rootPortEntry->expressDeviceCaps2 & 0x10000))
    {
        uint16_t deviceControl2 = nub->configRead16(nub->reserved->expressCapability + 0x28, "probeBusGated:ExpressCapa");
        deviceControl2 |= 0x1000;
        nub->configWrite16(nub->reserved->expressCapability + 0x28, deviceControl2, "probeBusGated3");
    }
}
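To make the magic numbers in the excerpt above easier to follow, here is a minimal stand-alone sketch of the same decision logic with named constants. The struct, the function name, and the macro names are made up for illustration; only the numeric masks come from the excerpt, and the bit descriptions reflect my reading of the PCIe capability layout, so treat them as an assumption. Passing `true` for `use_default_etf` models the effect of Method 1 (the outer condition never holds); the real patch changes machine code rather than a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Names are this sketch's own; the values are the masks used in the excerpt. */
#define DEVCAP_EXT_TAG_SUPPORTED   0x00000020u /* Device Capabilities, bit 5 */
#define DEVCTL_EXT_TAG_ENABLE      0x0100u     /* Device Control, bit 8 (ETFE) */
#define DEVCAP2_10B_TAG_REQUESTER  0x00020000u /* Device Capabilities 2, bit 17 */
#define DEVCAP2_10B_TAG_COMPLETER  0x00010000u /* Device Capabilities 2, bit 16 */
#define DEVCTL2_10B_TAG_REQ_ENABLE 0x1000u     /* Device Control 2, bit 12 */

/* Hypothetical stand-in for the values the excerpt reads and writes. */
struct etf_regs {
    uint32_t dev_caps1;    /* expressDeviceCaps1 of the device */
    uint32_t dev_caps2;    /* expressDeviceCaps2 of the device */
    uint32_t root_caps2;   /* expressDeviceCaps2 of its root port */
    uint16_t dev_control;  /* register at expressCapability + 0x08 */
    uint16_t dev_control2; /* register at expressCapability + 0x28 */
};

/* Mirrors the two nested if-conditions from the excerpt. */
void etf_probe(struct etf_regs *r, bool use_default_etf)
{
    if (!use_default_etf && (r->dev_caps1 & DEVCAP_EXT_TAG_SUPPORTED)) {
        /* Outer block: set ETFE. */
        r->dev_control |= DEVCTL_EXT_TAG_ENABLE;

        /* Inner block: 10b tags need a requester-capable device
         * and a completer-capable root port. */
        if ((r->dev_caps2 & DEVCAP2_10B_TAG_REQUESTER) &&
            (r->root_caps2 & DEVCAP2_10B_TAG_COMPLETER)) {
            r->dev_control2 |= DEVCTL2_10B_TAG_REQ_ENABLE;
        }
    }
}
```

With `dev_caps1 = 0x20`, `dev_caps2 = 0x20000` and `root_caps2 = 0x10000`, calling `etf_probe(&r, false)` sets both control bits, while `etf_probe(&r, true)` leaves both control registers untouched.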
Method 2:
- Our trace logging shows that ETF 8b does not cause a problem. Instead, only ETF 10b zombifies our devices
- So instead of disabling both ETFs, how can we just disable ETF 10b?
- We can modify the nested if-condition located inside the main if-condition to always evaluate to false, but notice that this nested condition depends on expressDeviceCaps2, which is queried by bridgeProbeChild (part of IOPCIConfigurator)
- If we prevent expressDeviceCaps2 from being queried and set, it presumably keeps its initial value of zero, which causes the inner if-condition to fail
- Because expressDeviceCaps2 is queried and set in bridgeProbeChild (code shown below), we can make a change in x86 Assembly to stop it
- The alternative patch, therefore, converts the second statement below to a series of NOPs (no-operation instructions), which effectively erases the second line
C:
child->expressDeviceCaps1 = configRead32(child, child->expressCapBlock + 0x04, NULL, "bridgeProbeChild:ExpressCapa");
child->expressDeviceCaps2 = configRead32(child, child->expressCapBlock + 0x24, NULL, "bridgeProbeChild:ExpressCapa");
child->expressMaxPayload = (child->expressDeviceCaps1 & 7);
DLOG(" expressMaxPayload 0x%x\n", child->expressMaxPayload);
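The effect of NOP-ing out the second read can be sketched in plain C. This is only a model: `child_entry`, `fake_config_read32`, `probe_child`, and the `caps2_read_nopped` switch are hypothetical names, and the sketch assumes the config entry starts out zeroed before it is probed, which the source does not confirm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields filled in by the excerpt above. */
struct child_entry {
    uint32_t caps1;       /* expressDeviceCaps1 */
    uint32_t caps2;       /* expressDeviceCaps2 */
    uint8_t  max_payload; /* expressMaxPayload */
};

/* Fake config space: an array of 32-bit registers indexed by byte offset. */
uint32_t fake_config_read32(const uint32_t *cap_block, unsigned offset)
{
    return cap_block[offset / 4];
}

/* Models the excerpt; caps2_read_nopped = true models the Method 2 patch. */
void probe_child(struct child_entry *child, const uint32_t *cap_block,
                 bool caps2_read_nopped)
{
    child->caps1 = fake_config_read32(cap_block, 0x04);
    if (!caps2_read_nopped) {
        child->caps2 = fake_config_read32(cap_block, 0x24);
    }
    child->max_payload = (uint8_t)(child->caps1 & 7);
}

/* The nested condition from probeBusGated that gates 10b tags. */
bool wants_10b_tags(const struct child_entry *child, uint32_t root_caps2)
{
    return (child->caps2 & 0x20000u) && (root_caps2 & 0x10000u);
}
```

In this model, a device whose capability block advertises 10b tags passes `wants_10b_tags` after a normal probe but fails it after a probe with the read skipped, while `caps1` and `max_payload` are filled in the same way in both cases.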
I plan to create Patch 3 that will just disable the inner if-condition in probeBusGated. That will allow me to compare the behavior of each one.
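As a rough way to compare the approaches side by side, the sketch below models what each one leaves behind for a single device. The enum, struct, and function names are invented for this illustration. It assumes the default-ETF flag is clear and that a skipped read leaves expressDeviceCaps2 at zero; it says nothing about how the real kernel or hardware behaves beyond the two if-conditions quoted earlier.

```c
#include <stdbool.h>
#include <stdint.h>

enum etf_patch {
    PATCH_NONE,     /* unpatched probeBusGated */
    PATCH_METHOD_1, /* outer if-condition forced false */
    PATCH_METHOD_2, /* expressDeviceCaps2 read NOP-ed out in bridgeProbeChild */
    PATCH_METHOD_3  /* inner if-condition forced false */
};

struct etf_outcome {
    bool     tags_8b_enabled;  /* 0x100 set in the register at +0x08 */
    bool     tags_10b_enabled; /* 0x1000 set in the register at +0x28 */
    uint32_t caps2_seen;       /* value left in expressDeviceCaps2 */
};

struct etf_outcome etf_simulate(enum etf_patch patch, uint32_t caps1,
                                uint32_t caps2, uint32_t root_caps2)
{
    struct etf_outcome out = { false, false, 0 };

    /* Method 2 never stores the register value (assumed to stay zero). */
    out.caps2_seen = (patch == PATCH_METHOD_2) ? 0u : caps2;

    bool outer = (patch != PATCH_METHOD_1) && (caps1 & 0x20u);
    if (outer) {
        out.tags_8b_enabled = true;

        bool inner = (patch != PATCH_METHOD_3) &&
                     (out.caps2_seen & 0x20000u) &&
                     (root_caps2 & 0x10000u);
        if (inner) {
            out.tags_10b_enabled = true;
        }
    }
    return out;
}
```

For a device with `caps1 = 0x20`, `caps2 = 0x20000` and a root port with `root_caps2 = 0x10000`, the model gives: no patch enables both; Method 1 enables neither; Methods 2 and 3 both enable only 8b. The one difference the model shows between Methods 2 and 3 is that Method 3 leaves the real expressDeviceCaps2 value in place for any other code that might read it.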
- Update: Method 3 patch has been created and is preferred over the previous two. All patches are located here: