mirror of
https://github.com/kata-containers/kata-containers.git
synced 2025-06-04 13:20:50 +00:00
Removed all TOCs now that GitHub auto-generates them. Also updated the documentation requirements doc removing the requirement to add a TOC. Fixes: #2022. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
123 lines
3.3 KiB
Markdown
123 lines
3.3 KiB
Markdown
# Setting Sysctls with Kata
|
|
|
|
## Sysctls
|
|
|
|
In Linux, the sysctl interface allows an administrator to modify kernel
|
|
parameters at runtime. Parameters are available via the `/proc/sys/` virtual
|
|
process file system.
|
|
|
|
The parameters include the following subsystems among others:
|
|
- `fs` (file systems)
|
|
- `kernel` (kernel)
|
|
- `net` (networking)
|
|
- `vm` (virtual memory)
|
|
|
|
To get a complete list of kernel parameters, run:
|
|
```
|
|
$ sudo sysctl -a
|
|
```
|
|
|
|
Kubernetes provide mechanisms for setting namespaced sysctls.
|
|
Namespaced sysctls can be set per pod in the case of Kubernetes.
|
|
The following sysctls are known to be namespaced and can be set with
|
|
Kubernetes:
|
|
|
|
- `kernel.shm*`
|
|
- `kernel.msg*`
|
|
- `kernel.sem`
|
|
- `fs.mqueue.*`
|
|
- `net.*`
|
|
|
|
### Namespaced Sysctls:
|
|
|
|
Kata Containers supports setting namespaced sysctls with Kubernetes.
|
|
All namespaced sysctls can be set in the same way as regular Linux based
|
|
containers, the difference being, in the case of Kata they are set inside the guest.
|
|
|
|
#### Setting Namespaced Sysctls with Kubernetes:
|
|
|
|
Kubernetes considers certain sysctls as safe and others as unsafe. For detailed
|
|
information about what sysctls are considered unsafe, please refer to the [Kubernetes sysctl docs](https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/).
|
|
For using unsafe sysctls, the cluster admin would need to allow these as:
|
|
|
|
```
|
|
$ kubelet --allowed-unsafe-sysctls 'kernel.msg*,net.ipv4.route.min_pmtu' ...
|
|
```
|
|
|
|
or using the declarative approach as:
|
|
|
|
```
|
|
$ cat kubeadm.yaml
|
|
apiVersion: kubeadm.k8s.io/v1alpha3
|
|
kind: InitConfiguration
|
|
nodeRegistration:
|
|
kubeletExtraArgs:
|
|
allowed-unsafe-sysctls: "kernel.msg*,kernel.shm.*,net.*"
|
|
...
|
|
```
|
|
|
|
The above YAML can then be passed to `kubeadm init` as:
|
|
```
|
|
$ sudo -E kubeadm init --config=kubeadm.yaml
|
|
```
|
|
|
|
Both safe and unsafe sysctls can be enabled in the same way in the Pod YAML:
|
|
|
|
```
|
|
apiVersion: v1
|
|
kind: Pod
|
|
metadata:
|
|
name: sysctl-example
|
|
spec:
|
|
securityContext:
|
|
sysctls:
|
|
- name: kernel.shm_rmid_forced
|
|
value: "0"
|
|
- name: net.ipv4.route.min_pmtu
|
|
value: "1024"
|
|
```
|
|
|
|
### Non-Namespaced Sysctls:
|
|
|
|
Kubernetes disallow sysctls without a namespace.
|
|
The recommendation is to set them directly on the host or use a privileged
|
|
container in the case of Kubernetes.
|
|
|
|
In the case of Kata, the approach of setting sysctls on the host does not
|
|
work since the host sysctls have no effect on a Kata Container running
|
|
inside a guest. Kata gives you the ability to set non-namespaced sysctls using a privileged container.
|
|
This has the advantage that the non-namespaced sysctls are set inside the guest
|
|
without having any effect on the `/proc/sys` values of any other pod or the
|
|
host itself.
|
|
|
|
The recommended approach to do this would be to set the sysctl value in a
|
|
privileged init container. In this way, the application containers do not need any elevated
|
|
privileges.
|
|
|
|
```
|
|
apiVersion: v1
|
|
kind: Pod
|
|
metadata:
|
|
name: busybox-kata
|
|
spec:
|
|
runtimeClassName: kata-qemu
|
|
securityContext:
|
|
sysctls:
|
|
- name: kernel.shm_rmid_forced
|
|
value: "0"
|
|
containers:
|
|
- name: busybox-container
|
|
securityContext:
|
|
privileged: true
|
|
image: debian
|
|
command:
|
|
- sleep
|
|
- "3000"
|
|
initContainers:
|
|
- name: init-sys
|
|
securityContext:
|
|
privileged: true
|
|
image: busybox
|
|
command: ['sh', '-c', 'echo "64000" > /proc/sys/vm/max_map_count']
|
|
```
|