Linux PCI Resource Allocation
This is a collection of my notes gathered whilst trying to make sense of the PCI resource allocation mechanism up to and including v2.6.25.
Questions
- How is the highest resource allocation address determined? (PCIBIOS_MIN_MEM has no counterpart)
Other system resources in the address space are added to the generic resource list prior to PCI starting, and the assignment algorithm allocates a region that is above the requested minimum starting location (usually PCIBIOS_MIN_MEM in pci_assign_resource()) and below the first resource allocated beyond that point (new->end = last->start - 1 ). In PCI terms, initially the next used region is usually the PCI Express MMIO range. The combination of requesting a resource using the same starting address and assigning top-down works reasonably well when only one possible range is available.
Possible Solutions
The outlines of a solution appear possible using the following general algorithm:
- Maintain a list of available PCI IOMEM regions derived from BIOS Int 0x15 eax=0xE820 or EFI GetMemoryMap() maps.
- Expand pci_assign_resource() to call pci_iomem_get_region() which will determine which of the PCI IOMEM regions to request resources in. This will be used as the min value passed to pci_bus_alloc_resource()
- Provide additional allocation algorithms called from find_resource()
Code Overview
Early in the kernel initialisation before devices are even thought about, the memory-map identification functions discover usable memory ranges and their purpose. x86 architecture uses BIOS Int 0x15 eax=0xE820 or EFI GetMemoryMap() to report the used memory address ranges. From those the largest unused range is discovered and its starting address assigned to pci_mem_start. The symbol is exported to the rest of the kernel and is then made available in include/asm-x86/pci.h as:
#define PCIBIOS_MIN_MEM (pci_mem_start)
All architectures do something similar so the kernel's generic resource allocation functions use PCIBIOS_MIN_MEM
PCI relies on generic kernel resource allocation functions to obtain memory-mapped I/O region allocations. The resource structure along with many supporting definitions and function prototypes is defined in include/linux/ioport.h:
struct resource {
resource_size_t start;
resource_size_t end;
const char *name;
unsigned long flags;
struct resource *parent, *sibling, *child;
};
The PCI device structures are defined in include/linux/pci.h:
#define DEVICE_COUNT_RESOURCE 12
struct pci_dev {
struct list_head bus_list; /* node in per-bus list */
struct pci_bus *bus; /* bus this device is on */
struct pci_bus *subordinate; /* bus this device bridges to */
...
struct resource resource[DEVICE_COUNT_RESOURCE]; /* I/O and memory regions + expansion ROMs */
...
}
#define PCI_BUS_NUM_RESOURCES 16
struct pci_bus {
struct list_head node; /* node in list of buses */
struct pci_bus *parent; /* parent bus this bridge is on */
struct list_head children; /* list of child buses */
struct list_head devices; /* list of devices on this bus */
struct pci_dev *self; /* bridge device as seen by parent */
struct resource *resource[PCI_BUS_NUM_RESOURCES];
...
}
The code path followed during allocation is:
drivers/pci/setup-res.c::pci_assign_resources() drivers/pci/bus.c::pci_bus_alloc_resource() kernel/resource.c::allocate_resource() kernel/resource/c::find_resource()
The allocation routine from find_resource():
static int find_resource(struct resource *root, struct resource *new,
resource_size_t size, resource_size_t min,
resource_size_t max, resource_size_t align,
void (*alignf)(void *, struct resource *,
resource_size_t, resource_size_t),
void *alignf_data)
{
struct resource *this = root->child;
new->start = root->start;
/*
* Skip past an allocated resource that starts at 0, since the assignment
* of this->start - 1 to new->end below would cause an underflow.
*/
if (this && this->start == 0) {
new->start = this->end + 1;
this = this->sibling;
}
for(;;) {
if (this)
new->end = this->start - 1;
else
new->end = root->end;
if (new->start < min)
new->start = min;
if (new->end > max)
new->end = max;
new->start = ALIGN(new->start, align);
if (alignf)
alignf(alignf_data, new, size, align);
if (new->start < new->end && new->end - new->start >= size - 1) {
new->end = new->start + size - 1;
return 0;
}
if (!this)
break;
new->start = this->end + 1;
this = this->sibling;
}
return -EBUSY;
}
