gdt | 6ea7cdc | 2005-02-15 17:10:03 +0000 | [diff] [blame] | 1 | $Id: IMPLEMENTATION.txt,v 1.2 2005/02/15 17:10:03 gdt Exp $ |
gdt | 5d6191e | 2005-02-10 16:38:09 +0000 | [diff] [blame] | 2 | |
| 3 | This file contains notes about the internals of the BGP |
| 4 | implementation. The initial impetus is understanding the memory usage |
| 5 | of Quagga'a BGP implementation. There may be some inaccuracies; it is |
| 6 | in the repository in the hopes that it will be significantly more |
| 7 | helpful than not. |
| 8 | |
| 9 | * FILES |
| 10 | |
| 11 | bgp_advertise.[hc]: |
| 12 | data structures: advertised prefixes, attributes |
| 13 | |
| 14 | bgp_aspath.[hc]: |
| 15 | struct aspath: |
| 16 | These are stored in a hash, apparently in wire format. |
| 17 | |
| 18 | bgp_attr.[hc]: |
| 19 | struct attr: contains all attributes |
| 20 | size(ILP32) 26 words/104 bytes (poor packing, v6/multicast is 10) |
| 21 | |
| 22 | bgp_attr_parse: origin, aspath, next hop probably most of interest |
| 23 | bgp_attr_origin: set flag bit |
| 24 | bgp_attr_aspath: put in refcounted hash table, so share pointer |
| 25 | bgp_attr_nexthop: store in attribute structure |
| 26 | |
| 27 | bgp_btoa.c: ? test program |
| 28 | |
| 29 | bgp_clist.[hc]: |
| 30 | data structures: community lists (including permit/deny state) |
| 31 | |
| 32 | bgp_community.[hc]: |
| 33 | data structures: community atttributes (multiple communities per struct) |
| 34 | |
| 35 | bgp_damp.[hc]: |
| 36 | per-route damping data, and damping control information |
| 37 | |
| 38 | bgp_debug.[hc]: |
| 39 | debugging support (vty config, dump of packets) |
| 40 | |
| 41 | bgp_dump.[hc]: |
| 42 | MRT-compatible dump format routines |
| 43 | |
| 44 | bgp_ecommunity.[hc]: |
| 45 | Extended communities attributes (multiple ecommmunities per struct) |
| 46 | |
| 47 | bgp_filter.[hc]: |
| 48 | AS path access list filtering |
| 49 | |
| 50 | bgp_fsm.[hc]: |
| 51 | Per-peer state machine for TCP connection, hold time, etc. |
| 52 | |
| 53 | bgp_main.c: |
| 54 | Daemon startup. |
| 55 | |
| 56 | bgp_mplsvpn.[hc]: |
| 57 | parsing of attribute structures for MPLS VPNs [need better description] |
| 58 | |
| 59 | bgp_network.[hc]: |
| 60 | Opening and binding of sockets, finding addresses for interfaces |
| 61 | |
| 62 | bgp_nexthop.[hc]: |
| 63 | data structures: Nexthop cache [not clear how used, if truly cache |
| 64 | in sense of memoization, or something else] |
| 65 | |
| 66 | importing EGP routes into IGP (thread created) |
| 67 | "scanning" (thread created) |
| 68 | bgp_scan: has useful clues to data structure complexity. Scanning |
| 69 | process iterates over database of received advertisements, and |
| 70 | builds 'cache' structure. |
| 71 | |
| 72 | bgp_open.[ch]: |
| 73 | Open messages, and capability negotiation |
| 74 | |
| 75 | bgp_packet.[hc] |
| 76 | sending and receiving of UPDATE/WITHDRAW |
| 77 | collision resolution for simultanteous opens |
| 78 | bgp_read: top-level read routine: reads whole packet (nonblocking) |
| 79 | and dispatches to per-message-type receive |
| 80 | |
| 81 | bgp_update_receive: |
| 82 | calls bgp_attr_parse |
| 83 | reads nrli into struct bgp_nrli update |
| 84 | |
| 85 | uninterning of aspath, community, ecommmunity, cluster, |
| 86 | transit which were interned in bgp_attr_parse |
| 87 | |
| 88 | bgp_regex.[ch]: |
| 89 | Glue to convert BGP regexps to standard (_ means many things). |
| 90 | |
| 91 | bgp_route.[hc]: |
| 92 | data structures: routes as received, static routes |
| 93 | Application of filters. Lots of route processing. |
| 94 | |
| 95 | bgp_nlri_parse: |
| 96 | sanity checks, then calls bgp_update with peer, prefix, attributes pointer |
| 97 | |
| 98 | bgp_update: bgp_update_main, then RS processing |
| 99 | |
| 100 | bgp_update_main: |
| 101 | find 'struct bgp_node *' for this afi/safi |
| 102 | look for route in table, then 'intern' attributes |
| 103 | ** interning is process of |
| 104 | looking for data in hash table, and putting there if missing, refcnt |
| 105 | using pointer to existing data |
| 106 | many validity checks |
| 107 | get new struct bgp_info (10 words/40 bytes) |
| 108 | call bgp_info_add with rn and bgp_info |
| 109 | call bgp_process |
| 110 | |
| 111 | bgp_routemap.c |
| 112 | implementation of route maps (match and set) |
| 113 | |
| 114 | bgp_snmp.c |
| 115 | SNMP glue. Not particularly interesting except to add variables or |
| 116 | debug SNMP. |
| 117 | |
| 118 | bgp_table.[hc] |
| 119 | data structures: struct bgp_table, struct bgp_node |
| 120 | allocation/lookup/utility operations - not a lot of protocol processin |
| 121 | |
| 122 | bgp_vty.[hc] |
| 123 | protocol-wide vty hooks |
| 124 | |
| 125 | bgp_zebra.[hc] |
| 126 | Processing interface events from zebra, redistribution of routes. |
| 127 | |
| 128 | bgpd.h |
| 129 | struct bgp_master: daemon main data structure |
| 130 | struct bgp: per-instance structure |
| 131 | struct peer_group |
| 132 | struct bgp_notify: (in-core representation of wire format?) |
| 133 | struct bgp_nexthop: (v4 and v6 addresses, *ifp) |
| 134 | struct bgp_rd: router distinguisher: 8 octects |
| 135 | struct bgp_filter: distribute, prefix, aslist, route_maps |
| 136 | struct peer: neighbor structure (very rich/complex) |
| 137 | struct bgp_nlri: reference to wire format |
| 138 | #define of protocol constants |
| 139 | attribute type codes |
| 140 | fsm states/events |
| 141 | timer values |
| 142 | |
| 143 | bgpd.c |
| 144 | instance/peer allocation |
| 145 | configuration |
| 146 | initialization/termination |
| 147 | |
| 148 | * DATA STRUCTURE SIZES |
| 149 | |
| 150 | Question: How much memory does quagga's bgpd use as a function of |
| 151 | state received from peers? |
| 152 | |
gdt | 6ea7cdc | 2005-02-15 17:10:03 +0000 | [diff] [blame] | 153 | It seems that a struct bgp_info is kept for each prefix. The "struct |
| 154 | attr *" is interned, and variables within that are interned. So, 40 |
| 155 | bytes are kept per received prefix, plus interned shared values. This |
| 156 | could be 36 if 'int suppress' where changed to a u_char and moved to |
| 157 | be with the other u_chars. Without MPLS, this could be 32 bytes. |
| 158 | Note that 8 bytes of this is linked list overhead, meaning that 24 |
| 159 | bytes are the raw per-prefix storage requirements. |
| 160 | |
| 161 | Also, a struct bgp_damp_info is apparently maintained per route; this |
| 162 | is fairly large (about 44 bytes). |
| 163 | |
| 164 | [TODO: the role of struct bgp_node.] |
gdt | 5d6191e | 2005-02-10 16:38:09 +0000 | [diff] [blame] | 165 | |
| 166 | * TIME COMPLEXITY |
| 167 | |
| 168 | It appears that received prefixes from each peer are stored in a |
| 169 | linked list. |