gawk: Reformat Table Elements for C structure

Given a file containing data with this layout:

.rdata:1001AE00                 modelCapability <         1,          1, 0,          2>
.rdata:1001AE00                 modelCapability <         2,          2, 0,          2>
.rdata:1001AE00                 modelCapability <         3,          3, 0,          2>
.rdata:1001AE00                 modelCapability <         4,          4, 0,          2>
.rdata:1001AE00                 modelCapability <         5,          5, 0,          2>
.rdata:1001AE00                 modelCapability <         6,          6, 0,          2>
.rdata:1001AE00                 modelCapability <         7,          7, 0, 0>

This gawk script strips out the values and prints them in a form suitable for inclusion in a C-language structure definition:

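# Drop the trailing delimiter ("," or ">") from a field.
function stripLast(string) {
 return substr(string,1,length(string)-1)
}
# Convert a value such as "2" or "0Ah" to C hex notation ("0x02", "0x0A").
# Every value is interpreted as hexadecimal.
function stripHex(string) {
 # Remove the trailing "h" hex suffix, if present.
 if( substr(string,length(string),1) == "h") {
  string = substr(string, 1, length(string)-1)
 }
 # Remove one leading "0", as in "0Ah".
 if( length(string) > 1 && substr(string,1,1) == "0") {
  string = substr(string,2)
 }
 return sprintf("0x%02X", strtonum(sprintf("0x%s",string)))
}
# Fields $4..$7 hold the four values; $1..$3 are the address, the type name and "<".
{
 Type0 = stripHex(stripLast($4));
 Type1 = stripHex(stripLast($5));
 Cap0 = stripHex(stripLast($6));
 Cap1 = stripHex(stripLast($7));
 print " {",Type0 ",", Type1 ",", Cap0 ",", Cap1 "},"
}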

Note: gawk (GNU awk) is required because the script uses the strtonum() built-in function, which is a gawk extension and is not part of POSIX awk; other implementations such as traditional awk or mawk may not provide it. On systems where awk is a symbolic link to gawk, either name works; otherwise invoke gawk explicitly.

The output looks like:

$ gawk -f cleanup.gawk input.txt
 { 0x01, 0x01, 0x00, 0x02},
 { 0x02, 0x02, 0x00, 0x02},
 { 0x03, 0x03, 0x00, 0x02},
 { 0x04, 0x04, 0x00, 0x02},
 { 0x05, 0x05, 0x00, 0x02},
 { 0x06, 0x06, 0x00, 0x02},
 { 0x07, 0x07, 0x00, 0x00},
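
As a rough sketch of how these rows might be used, the output can be pasted directly into an array initializer. The struct layout here is an assumption: the field names are borrowed from the script's variable names, one byte per field is a guess based on the two-digit hex output, and modelCapabilities, MODEL_CAPABILITY_COUNT and findModelCapability() are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Assumed layout: field names are taken from the script's variable names,
   and one byte per field is a guess based on the two-digit hex output. */
struct modelCapability {
    unsigned char Type0;
    unsigned char Type1;
    unsigned char Cap0;
    unsigned char Cap1;
};

/* Rows pasted unchanged from the script output; the trailing comma
   after the last row is valid in a C initializer list. */
static const struct modelCapability modelCapabilities[] = {
 { 0x01, 0x01, 0x00, 0x02},
 { 0x02, 0x02, 0x00, 0x02},
 { 0x03, 0x03, 0x00, 0x02},
 { 0x04, 0x04, 0x00, 0x02},
 { 0x05, 0x05, 0x00, 0x02},
 { 0x06, 0x06, 0x00, 0x02},
 { 0x07, 0x07, 0x00, 0x00},
};

/* Number of rows in the table. */
#define MODEL_CAPABILITY_COUNT \
    (sizeof modelCapabilities / sizeof modelCapabilities[0])

/* Hypothetical helper: return the first row whose Type0 matches, or NULL. */
const struct modelCapability *findModelCapability(unsigned char type0)
{
    size_t i;
    for (i = 0; i < MODEL_CAPABILITY_COUNT; i++) {
        if (modelCapabilities[i].Type0 == type0)
            return &modelCapabilities[i];
    }
    return NULL;
}
```

Because the script ends every row with a comma, the generated lines need no hand editing before being dropped between the braces of the initializer.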