// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*

Package bpf implements marshaling and unmarshaling of programs for the
Berkeley Packet Filter virtual machine, and provides a Go implementation
of the virtual machine.

BPF's main use is to specify a packet filter for network taps, so that
the kernel doesn't have to expensively copy every packet it sees to
userspace. However, it's been repurposed to other areas where running
user code in-kernel is needed. For example, Linux's seccomp uses BPF
to apply security policies to system calls. For simplicity, this
documentation refers only to packets, but other uses of BPF have their
own data payloads.

BPF programs run in a restricted virtual machine. Programs have almost
no access to kernel functions, and while conditional branches are
allowed, they can only jump forwards, to guarantee that there are no
infinite loops.

The virtual machine

The BPF VM is an accumulator machine. Its main register, called
register A, is an implicit source and destination in all arithmetic
and logic operations. The machine also has 16 scratch registers for
temporary storage, and an indirection register (register X) for
indirect memory access. All registers are 32 bits wide.

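As a rough illustration of the register layout described above, the
following sketch models the registers as a plain Go struct. The type
machine and its methods are hypothetical names chosen for this example;
they are not exported by this package and say nothing about how the
package's VM is implemented.

```go
package main

import "fmt"

// machine is a hypothetical model of the register file described above.
// It is not a type provided by package bpf.
type machine struct {
	A       uint32     // accumulator: implicit operand and result
	X       uint32     // indirection register for indirect access
	Scratch [16]uint32 // temporary storage
}

// addConst models an accumulator-style add: A is both source and destination.
func (m *machine) addConst(k uint32) { m.A += k }

// store copies A into scratch slot n; load copies slot n back into A.
func (m *machine) store(n int) { m.Scratch[n] = m.A }
func (m *machine) load(n int)  { m.A = m.Scratch[n] }

func main() {
	var m machine
	m.A = 0xFFFFFFFF
	m.addConst(2) // registers are 32 bits wide, so the sum wraps around
	m.store(3)
	m.A = 0
	m.load(3)
	fmt.Println(m.A) // prints 1
}
```

Because every arithmetic operation reads and writes A, a program that
needs to keep an intermediate value has to park it in a scratch slot
first, as the store and load calls do here.
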
Each run of a BPF program is given one packet, which is placed in the
VM's read-only "main memory". LoadAbsolute and LoadIndirect
instructions can fetch up to 32 bits at a time into register A for
examination.

The goal of a BPF program is to produce and return a verdict (uint32),
which tells the kernel what to do with the packet. In the context of
packet filtering, the returned value is the number of bytes of the
packet to forward to userspace, or 0 to ignore the packet. Other
contexts like seccomp define their own return values.

In order to simplify programs, attempts to read past the end of the
packet terminate the program execution with a verdict of 0 (ignore
packet). This means that the vast majority of BPF programs don't need
to do any explicit bounds checking.

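The effect of the out-of-bounds rule can be sketched in plain Go,
without using this package. The helpers load16 and arpVerdict below are
hypothetical and only model the behavior described above: a load that
would run past the end of the packet ends the run with verdict 0. The
sketch assumes the 2-byte field is read in network byte order.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// load16 is a hypothetical helper that models a 2-byte absolute load.
// It reports ok=false when the read would go past the end of the packet.
func load16(pkt []byte, off int) (val uint32, ok bool) {
	if off < 0 || off+2 > len(pkt) {
		return 0, false
	}
	return uint32(binary.BigEndian.Uint16(pkt[off:])), true
}

// arpVerdict models a filter that keeps ARP frames. A failed load ends
// the run with verdict 0, so the filter logic has no length check of its own.
func arpVerdict(pkt []byte) uint32 {
	etherType, ok := load16(pkt, 12)
	if !ok {
		return 0 // read past the end: ignore packet
	}
	if etherType != 0x0806 {
		return 0 // not ARP: ignore packet
	}
	return 4096 // forward up to 4096 bytes to userspace
}

func main() {
	arp := make([]byte, 14)
	arp[12], arp[13] = 0x08, 0x06
	fmt.Println(arpVerdict(arp))              // prints 4096
	fmt.Println(arpVerdict(arp[:13]))         // prints 0, EtherType is cut off
	fmt.Println(arpVerdict(make([]byte, 14))) // prints 0, EtherType is 0x0000
}
```

A truncated frame and a non-matching frame both end with verdict 0, so
the filter author only has to write the matching logic.
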
In addition to the bytes of the packet, some BPF programs have access
to extensions, which are essentially calls to kernel utility
functions. Currently, the only extensions supported by this package
are the Linux packet filter extensions.

Examples

This packet filter selects all ARP packets.

	bpf.Assemble([]bpf.Instruction{
		// Load "EtherType" field from the ethernet header.
		bpf.LoadAbsolute{Off: 12, Size: 2},
		// Skip over the next instruction if EtherType is not ARP.
		bpf.JumpIf{Cond: bpf.JumpNotEqual, Val: 0x0806, SkipTrue: 1},
		// Verdict is "send up to 4k of the packet to userspace."
		bpf.RetConstant{Val: 4096},
		// Verdict is "ignore packet."
		bpf.RetConstant{Val: 0},
	})

This packet filter captures a random 1% sample of traffic.

	bpf.Assemble([]bpf.Instruction{
		// Get a 32-bit random number from the Linux kernel.
		bpf.LoadExtension{Num: bpf.ExtRand},
		// 1% dice roll? The threshold is 1/100th of the 32-bit range.
		// It is written with a shift because ^ is XOR in Go.
		bpf.JumpIf{Cond: bpf.JumpLessThan, Val: (1 << 32) / 100, SkipFalse: 1},
		// Capture.
		bpf.RetConstant{Val: 4096},
		// Ignore.
		bpf.RetConstant{Val: 0},
	})

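The 1% threshold is meant to be one hundredth of the 32-bit value
range. Go has no exponentiation operator: ^ is bitwise XOR and binds
more loosely than /, so a threshold spelled 2^32/100 evaluates to 2,
while a shift gives the intended value. A minimal check using only the
standard library (thresholds is an illustrative name, not part of this
package):

```go
package main

import "fmt"

// thresholds returns the value of the XOR spelling and of the shift
// spelling of "one hundredth of 2 to the power 32".
func thresholds() (xorForm, shiftForm uint32) {
	// Parsed as 2 ^ (32 / 100), which is 2 XOR 0.
	xorForm = 2 ^ 32/100
	// 4294967296 / 100, truncated to an integer.
	shiftForm = (1 << 32) / 100
	return xorForm, shiftForm
}

func main() {
	x, s := thresholds()
	fmt.Println(x, s) // prints 2 42949672
}
```

With a threshold of 2, the filter would capture about one packet in
two billion rather than one in a hundred.
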
*/
package bpf // import "golang.org/x/net/bpf"