Commit Graph

6055 Commits

Author SHA1 Message Date
Lee Verberne
8835f54480 kubelet: add support for pod PID namespace sharing
This adds the logic for sending a NamespaceMode_POD to the runtime, but
leaves it disconnected pending https://issues.k8s.io/58716.
2018-02-08 16:58:07 +01:00
Kubernetes Submit Queue
fb340a4695 Merge pull request #57824 from thockin/gcr-vanity
Automatic merge from submit-queue (batch tested with PRs 57824, 58806, 59410, 59280). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

2nd try at using a vanity GCR name

The 2nd commit here is the changes relative to the reverted PR.  Please focus review attention on that.

This is the 2nd attempt.  The previous try (#57573) was reverted while we
figured out the regional mirrors (oops).
    
New plan: k8s.gcr.io is a read-only facade that auto-detects your source
region (us, eu, or asia for now) and pulls from the closest.  To publish
an image, push k8s-staging.gcr.io and it will be synced to the regionals
automatically (similar to today).  For now the staging is an alias to
gcr.io/google_containers (the legacy URL).
    
When we move off of google-owned projects (working on it), then we just
do a one-time sync, and change the google-internal config, and nobody
outside should notice.
    
We can, in parallel, change the auto-sync into a manual sync - send a PR
to "promote" something from staging, and a bot activates it.  Nice and
visible, easy to keep track of.

xref https://github.com/kubernetes/release/issues/281

TL;DR:
  *  The new `staging-k8s.gcr.io` is where we push images.  It is literally an alias to `gcr.io/google_containers` (the existing repo) and is hosted in the US.
  * The contents of `staging-k8s.gcr.io` are automatically synced to `{asia,eu,us)-k8s.gcr.io`.
  * The new `k8s.gcr.io` will be a read-only alias to whichever regional repo is closest to you.
  * In the future, images will be promoted from `staging` to regional "prod" more explicitly and auditably.

 ```release-note
Use "k8s.gcr.io" for pulling container images rather than "gcr.io/google_containers".  Images are already synced, so this should not impact anyone materially.
    
Documentation and tools should all convert to the new name. Users should take note of this in case they see this new name in the system.
```
2018-02-08 03:29:32 -08:00
Tim Hockin
3586986416 Switch to k8s.gcr.io vanity domain
This is the 2nd attempt.  The previous was reverted while we figured out
the regional mirrors (oops).

New plan: k8s.gcr.io is a read-only facade that auto-detects your source
region (us, eu, or asia for now) and pulls from the closest.  To publish
an image, push k8s-staging.gcr.io and it will be synced to the regionals
automatically (similar to today).  For now the staging is an alias to
gcr.io/google_containers (the legacy URL).

When we move off of google-owned projects (working on it), then we just
do a one-time sync, and change the google-internal config, and nobody
outside should notice.

We can, in parallel, change the auto-sync into a manual sync - send a PR
to "promote" something from staging, and a bot activates it.  Nice and
visible, easy to keep track of.
2018-02-07 21:14:19 -08:00
Kubernetes Submit Queue
eff9f75f70 Merge pull request #59297 from joelsmith/master
Automatic merge from submit-queue (batch tested with PRs 59010, 59212, 59281, 59014, 59297). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve error returned when fetching container logs during pod termination

**What this PR does / why we need it**:

This change better handles fetching of logs when a container is in a crash loop backoff state. In cases where it is unable to fetch the logs, it gives a helpful error message back to a user who has requested logs of a container from a terminated pod. Rather than attempting to get logs for a container using an empty container ID, it returns a useful error message.

In cases where the container runtime gets an error, log the error but don't leak it back through the API to the user.


**Which issue(s) this PR fixes**:
Fixes #59296

**Release note**:

```release-note
NONE

```
2018-02-07 15:27:49 -08:00
Lantao Liu
a77450ec2d Add mountpoint as CRI image filesystem storage identifier. 2018-02-07 23:01:06 +00:00
Kubernetes Submit Queue
b40f865ae5 Merge pull request #59472 from hanxiaoshuai/fixtodo02072
Automatic merge from submit-queue (batch tested with PRs 59276, 51042, 58973, 59377, 59472). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

clean up unused function GetKubeletDockerContainers

**What this PR does / why we need it**:
fix todo: function GetKubeletDockerContainers is not unused,it has been migrated off in test/e2e_node/garbage_collector_test.go  in [#57976](https://github.com/kubernetes/kubernetes/pull/57976/files)
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-07 12:00:53 -08:00
Kubernetes Submit Queue
cf7073a831 Merge pull request #58973 from verb/cri-enum
Automatic merge from submit-queue (batch tested with PRs 59276, 51042, 58973, 59377, 59472). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update Container Runtime Interface to use enumerated namespace modes

**What this PR does / why we need it**: This updates the CRI as described in the [Shared PID Namespace](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/pod-pid-namespace.md#container-runtime-interface-changes) proposal. This change to the alpha API is not backwards compatible: implementations of the CRI will need to update to the new API version.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
WIP #1615

**Special notes for your reviewer**:
/assign @yujuhong 

**Release note**:

```release-note
[action-required] The Container Runtime Interface (CRI) version has increased from v1alpha1 to v1alpha2. Runtimes implementing the CRI will need to update to the new version, which configures container namespaces using an enumeration rather than booleans.
```
2018-02-07 12:00:47 -08:00
Kubernetes Submit Queue
475457537b Merge pull request #59276 from roboll/roboll/kubelet-fix
Automatic merge from submit-queue (batch tested with PRs 59276, 51042, 58973, 59377, 59472). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: only register api source when connecting

**What this PR does / why we need it**:
before this change, an api source was always registered, even when there
was no kubeclient. this lead to some operations blocking waiting for
podConfig.SeenAllSources to pass, which it never would.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #59275

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-07 12:00:40 -08:00
Joel Smith
749980b726 Handle fetch of container logs of error containers during pod termination
* improve error returned when failing to fetch container logs
* handle cases where logs are requested for containers without the container ID
2018-02-07 12:23:56 -07:00
hangaoshuai
e416f3746e clean up unused function GetKubeletDockerContainers 2018-02-07 19:21:05 +08:00
Dmitry Rozhkov
3175a687a0 devicemanager: increase code coverege of endpoint's unit test
Particularly cover the code path when an unhealthy device
becomes healthy.
2018-02-07 12:29:48 +02:00
Lee Verberne
e10042d22f Increment CRI version from v1alpha1 to v1alpha2
This also incorporates the version string into the package name so
that incompatibile versions will fail to connect.

Arbitrary choices:
- The proto3 package name is runtime.v1alpha2. The proto compiler
  normally translates this to a go package of "runtime_v1alpha2", but
  I renamed it to "v1alpha2" for consistency with existing packages.
- kubelet/apis/cri is used as "internalapi". I left it alone and put the
  public "runtimeapi" in kubelet/apis/cri/runtime.
2018-02-07 09:06:26 +01:00
Lee Verberne
0f1de41790 Update kubelet for enumerated CRI namespaces
This adds support to both the Generic Runtime Manager and the
dockershim for the CRI's enumerated namespaces.
2018-02-07 09:06:26 +01:00
Lee Verberne
f4ab2b6331 Switch CRI NamespaceOption from bools to enums 2018-02-07 09:06:25 +01:00
Kubernetes Submit Queue
e5b6026db6 Merge pull request #59287 from cheftako/cloud-context-level
Automatic merge from submit-queue (batch tested with PRs 59441, 58264, 59287, 59396, 59439). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add context to all relevant cloud APIs

**What this PR does / why we need it**:

This adds context to all the relevant cloud provider interface signatures.
Callers of those APIs are currently satisfied using context.TODO().
There will be follow on PRs to push the context through the stack.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #815

**Special notes for your reviewer**:
For an idea of the full scope of this change please look at PR #58532.

**Release note**:
```release-note
Implementers of the cloud provider interface will note the addition of a context to this interface. Trivial code modification will be necessary for a cloud provider to continue to compile.
```
2018-02-06 20:27:39 -08:00
Kubernetes Submit Queue
056e9ecc43 Merge pull request #58941 from vikaschoudhary16/test-allocate
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add unit test for endpoint allocate

**What this PR does / why we need it**:
Adds a unit test for covering `allocate` function at endpoint.


**Release note**:

```release-note
None
```

/kind testing
/area hw-accelerators
/cc @jiayingz @vishh @derekwaynecarr @RenaudWasTaken @resouer @ConnorDoyle
2018-02-06 17:19:41 -08:00
Kubernetes Submit Queue
8201e4ba00 Merge pull request #57124 from JiangtianLi/jiangtli-memfunc
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use GlobalMemoryStatusEx to get total physical memory on Windows node

**What this PR does / why we need it**:
This PR fixes issue #57110 due to failure in getting total physical memory on some Windows VM such as in VMWare Fusion or Virtualbox. This change uses GlobalMemoryStatusEx instead of GetPhysicallyInstalledSystemMemory to retrieve total physical memory on Windows node. The amount obtained this way is also closer in parity with reading MemTotal from /proc/meminfo on Linux node.
(thanks to @martinivanov and @marono for the help)

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #57110

**Special notes for your reviewer**:

**Release note**:

```release-note

```
2018-02-06 13:53:09 -08:00
Walter Fender
e18e8ec3c0 Add context to all relevant cloud APIs
This adds context to all the relevant cloud provider interface signatures.
Callers of those APIs are currently satisfied using context.TODO().
There will be follow on PRs to push the context through the stack.
For an idea of the full scope of this change please look at PR #58532.
2018-02-06 12:49:17 -08:00
Kubernetes Submit Queue
4bd22b5467 Merge pull request #58415 from gnufied/fix-volume-resize-messages
Automatic merge from submit-queue (batch tested with PRs 52942, 58415). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve messaging on volume expansion

- we now provide clear message to user what to do when cloudprovider resizing is finished
  and file system resizing is needed.
- add a event when resizing is successful
- Use PATCH both in controller-manager and kubelet for updating PVC status
- Remove code duplication between controller-manager and kubelet for updating PVC status
- Only remove conditions that are managed by resize controller



```release-note
Improve messages user gets during and after volume resizing is done.
```
2018-02-06 07:55:32 -08:00
Lantao Liu
5cbc8cc8e0 Fix the wrong comment in cri constants.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-05 23:53:36 +00:00
Kubernetes Submit Queue
c02b784b76 Merge pull request #58172 from NVIDIA/annotations
Automatic merge from submit-queue (batch tested with PRs 58184, 59307, 58172). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add annotations to the device plugin API

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** : Related to #56649 but does not fix it

This adds the ability for the device plugins to annotate containers.
Product wise, this allows the NVIDIA device plugin to support CRI-O (which allows hooks through container annotations).

**Special notes for your reviewer**:
/area hw-accelerators
/cc @vishh @jiayingz @vikaschoudhary16 

I'm wondering if it would make sense to fire a blank call to `newContainerAnnotations` at the start of the deviceplugin to get Annotations that are forbidden.
Current behavior is that any Annotations that conflicts with Kubelet will be overwritten by Kubelet.

**Release note**:
```release-note
NONE
```
2018-02-05 13:50:35 -08:00
Jing Xu
9588d2098a Redesign and implement volume reconstruction work
This PR is the first part of redesign of volume reconstruction work. The
changes include
1. Remove dependency on volume spec stored in actual state for volume
cleanup process (UnmountVolume and UnmountDevice)

Modify AttachedVolume struct to add DeviceMountPath so that volume
unmount operation can use this information instead of constructing from
volume spec

2. Modify reconciler's volume reconstruction process (syncState). Currently workflow
is when kubelet restarts, syncState() is only called once before
reconciler starts its loop.
a. If volume plugin supports reconstruction, it will use the
reconstructed volume spec information to update actual state as before.
b. If volume plugin cannot support reconstruction, it will use the
scanned mount path information to clean up the mounts.

In this PR, all the plugins still support reconstruction (except
glusterfs), so reconstruction of some plugins will still have issues.
The next PR will modify those plugins that cannot support reconstruction
well.

This PR addresses issue #52683, #54108 (This PR includes the changes to
update devicePath after local attach finishes)
2018-02-05 13:14:09 -08:00
Derek Carr
4afc0c8052 kubelet ignores hugepages if hugetlb is not enabled 2018-02-05 13:07:59 -05:00
vikaschoudhary16
abfb99645b Add unit test for endpoint allocate 2018-02-05 00:53:07 -05:00
Clayton Coleman
0346145615 Cap how long the kubelet waits when it has no client cert
If we go a certain amount of time without being able to create a client
cert and we have no current client cert from the store, exit. This
prevents a corrupted local copy of the cert from leaving the Kubelet in a
zombie state forever. Exiting allows a config loop outside the Kubelet
to clean up the file or the bootstrap client cert to get another client
cert.
2018-02-03 23:18:53 -05:00
Renaud Gaubert
db537e5954 Add Annotations from the deviceplugin to the runtime 2018-02-03 19:53:20 +01:00
Renaud Gaubert
eb5035b08d Regenerate the deviceplugin protobuf file 2018-02-03 19:53:20 +01:00
Renaud Gaubert
ece4bf4f7f Add annotations to the deviceplugin API 2018-02-03 19:53:20 +01:00
Kubernetes Submit Queue
f02e37b6ac Merge pull request #57076 from feiskyer/win-resources
Automatic merge from submit-queue (batch tested with PRs 59097, 57076, 59295). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add windows config to Kubelet CRI

**What this PR does / why we need it**:

Currently Container Runtime Interface (CRI) only supports LinuxContainerConfig and therefore LinuxContainerResources in ContainerConfig. Windows resource config is different from Linux's, although it shares some common properties. 

This PR adds windows config to CRI. Add newly added WindowsContainerResources is original from OCI spec (see https://github.com/opencontainers/runtime-spec/blob/master/specs-go/config.go#L437).


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

First part of #56734. A further PR is needed to fill the values after we have agreement on the spec.

**Special notes for your reviewer**:

**Release note**:

```release-note
Add windows config to Kubelet CRI
```

/assign @yujuhong @brendandburns 
/cc @taylorb-microsoft @JiangtianLi @dchen1107
2018-02-02 19:37:38 -08:00
Kubernetes Submit Queue
8c6be65f4c Merge pull request #58720 from joelsmith/ro-vol
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Ensure that the runtime mounts RO volumes read-only

**What this PR does / why we need it**:

This change makes it so that containers cannot write to secret, configMap, downwardAPI and projected volumes since the runtime will now mount them read-only. This change makes things less confusing for a user since any attempt to update a secret volume will result in an error rather than a successful change followed by a revert by the kubelet when the volume next syncs.

It also adds a feature gate `ReadOnlyAPIDataVolumes` to a provide a way to disable the new behavior in 1.10, but for 1.11, the new behavior will become non-optional.

Also, E2E tests for downwardAPI and projected volumes are updated to mount the volumes somewhere other than /etc.

**Which issue(s) this PR fixes**
Fixes #58719 

**Release note**:
```release-note
Containers now mount secret, configMap, downwardAPI and projected volumes read-only. Previously,
container modifications to files in these types of volumes were temporary and reverted by the kubelet
during volume sync. Until version 1.11, setting the feature gate ReadOnlyAPIDataVolumes=false will
preserve the old behavior.
```
2018-02-02 06:42:12 -08:00
Kubernetes Submit Queue
d3b783d5ec Merge pull request #58743 from NickrenREN/pv-protection
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Postpone PV deletion with finalizer when it is being used

Postpone PV deletion if it is bound to a PVC

xref: https://github.com/kubernetes/community/pull/1608


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #33355

**Special notes for your reviewer**:

**Release note**:
```release-note
Postpone PV deletion when it is being bound to a PVC
```

WIP, assign to myself first

/assign @NickrenREN
2018-02-01 19:39:52 -08:00
rob boll
7da7b750fd kubelet: only register api source when connecting
before this change, an api source was always registered, even when there
was no kubeclient. this lead to some operations blocking waiting for
podConfig.SeenAllSources to pass, which it never would.
2018-02-01 15:28:02 -05:00
Kubernetes Submit Queue
06472a054a Merge pull request #58930 from smarterclayton/background_rotate
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Only rotate certificates in the background

Change the Kubelet to not block until the first certs have rotated (we didn't act on it anyway) and fall back to the bootstrap cert if the most recent rotated cert is expired on startup.

The certificate manager originally had a "block on startup" rotation behavior to ensure at least one rotation happened on startup. However, since rotation may not succeed within the first time window the code was changed to simply print the error rather than return it. This meant that the blocking rotation has no purpose - it cannot cause the kubelet to fail, and it *does* block the kubelet from starting static pods before the api server becomes available.

The current block behavior causes a bootstrapped kubelet that is also set to run static pods to wait several minutes before actually launching the static pods, which means self-hosted masters using static pods have a pointless delay on startup.

Since blocking rotation has no benefit and can't actually fail startup, this commit removes the blocking behavior and simplifies the code at the same time. The goroutine for rotation now completely owns the deadline, the shouldRotate() method is removed, and the method that sets rotationDeadline now returns it. We also explicitly guard against a negative sleep interval and omit the message.

Should have no impact on bootstrapping except the removal of a long delay on startup before static pods start.

The other change is that an expired certificate from the cert manager is *not* considered a valid cert, which triggers an immediate rotation.  This causes the cert manager to fall back to the original bootstrap certificate until a new certificate is issued.  This allows the bootstrap certificate on masters to be "higher powered" and allow the node to function prior to initial approval, which means someone configuring the masters with a pre-generated client cert can be guaranteed that the kubelet will be able to communicate to report self-hosted static pod status, even if the first client rotation hasn't happened.  This makes master self-hosting more predictable for static configuration environments.

```release-note
When using client or server certificate rotation, the Kubelet will no longer wait until the initial rotation succeeds or fails before starting static pods.  This makes running self-hosted masters with rotation more predictable.
```
2018-02-01 12:05:15 -08:00
Joel Smith
66b061dad2 Ensure that the runtime mounts RO volumes read-only
Add a feature gate ReadOnlyAPIDataVolumes to a provide a way to
disable the new behavior in 1.10, but for 1.11, the new
behavior will become non-optional.

Also, update E2E tests for downwardAPI and projected volumes
to mount the volumes somewhere other than /etc.
2018-02-01 10:02:29 -07:00
Kubernetes Submit Queue
0d900769d6 Merge pull request #59126 from filbranden/ipcs3
Automatic merge from submit-queue (batch tested with PRs 59106, 58985, 59068, 59120, 59126). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix cross-build breakage after #58174

**What this PR does / why we need it**:
Fix cross-build breakage after #58174

@cblecker 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #59121

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-01 05:53:45 -08:00
Kubernetes Submit Queue
f96ac05774 Merge pull request #59062 from mtaufen/fix-pod-pids-limit
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix PodPidsLimit and ConfigTrialDuration on internal KubeletConfig type

They should both follow the convention of not being a pointer on the internal type. 

This required adding a conversion function between `int64` and `*int64`. A side effect is this removes a warning in the generated code for the apps API group.

@dims

```release-note
NONE
```
2018-02-01 01:45:55 -08:00
Kubernetes Submit Queue
a644e611dd Merge pull request #58751 from feiskyer/hyperv
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add support of hyperv isolation for windows containers

**What this PR does / why we need it**:

Add support of hyperv isolation for windows containers.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58750

**Special notes for your reviewer**:

Only one container per pod is supported yet.

**Release note**:

```release-note
Windows containers now support experimental Hyper-V isolation by setting annotation `experimental.windows.kubernetes.io/isolation-type=hyperv` and feature gates HyperVContainer. Only one container per pod is supported yet.
```
2018-01-31 21:10:17 -08:00
Kubernetes Submit Queue
465e925564 Merge pull request #58994 from RobertKrawitz/fake-runtime-start-race-condition-branch
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Race condition between listener and client in remote_runtime_test

Fix race condition in remote_runtime_test.
Fixes #58993
2018-01-31 20:31:50 -08:00
Filipe Brandenburger
2f2d886734 Fix cross-build breakage after #58174 2018-01-31 09:46:36 -08:00
NickrenREN
2a2f88b939 Rename PVCProtection feature gate so that PV protection can share the feature gate with PVC protection 2018-01-31 20:02:01 +08:00
Kubernetes Submit Queue
c817765b0e Merge pull request #58445 from hanxiaoshuai/typo
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix some typos in comments

**What this PR does / why we need it**:

Fixes # fix some typos in comments
2018-01-30 19:44:44 -08:00
YuxiJin-tobeyjin
af6b4e39c2 codeClean-merge-logfAndFailnow-to-fatalf 2018-01-31 11:39:31 +08:00
Kubernetes Submit Queue
84408378f9 Merge pull request #58174 from filbranden/ipcs1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes for HostIPC tests to work when Docker has SELinux support enabled.

**What this PR does / why we need it**:

Fixes for HostIPC tests to work when Docker has SELinux support enabled.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

N/A

**Special notes for your reviewer**:

The core of the matter is to use `ipcs` from util-linux rather than the one from busybox. The typical SELinux policy has enough to allow Docker containers (running under svirt_lxc_net_t SELinux type) to access IPC information by reading the contents of the files under /proc/sysvipc/, but not by using the shmctl etc. syscalls.

The `ipcs` implementation in busybox will use `shmctl(0, SHM_INFO, ...)` to detect whether it can read IPC info (see source code [here](https://git.busybox.net/busybox/tree/util-linux/ipcs.c?h=1_28_0#n138)), while the one in util-linux will prefer to read from the /proc files directly if they are available (see source code [here](https://github.com/karelzak/util-linux/blob/v2.27.1/sys-utils/ipcutils.c#L108)).

It turns out the SELinux policy doesn't allow the shmctl syscalls in an unprivileged container, while access to it through the /proc interface is fine. (One could argue this is a bug in the SELinux policy, but getting it fixed on stable OSs is hard, and it's not that hard for us to test it with an util-linux `ipcs`, so I propose we do so.)

This PR also contains a refactor of the code setting IpcMode, since setting it in the "common options" function is misleading, as on containers other than the sandbox, it ends up always getting overwritten, so let's only set it to "host" in the Sandbox.

It also has a minor fix for the `ipcmk` call, since support for size suffix was only introduced in recent versions of it.

**Release note**:

```release-note
NONE
```
2018-01-30 17:18:52 -08:00
Kubernetes Submit Queue
a18f086220 Merge pull request #59020 from brendandburns/kubelet-hang
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove setInitError.

**What this PR does / why we need it**:
Removes setInitError, it's not sure it was ever really used, and it causes the kubelet to hang and get wedged.

**Which issue(s) this PR fixes** 
Fixes #46086

**Special notes for your reviewer**:
If `initializeModules()` in `kubelet.go` encounters an error, it calls `runtimeState.setInitError(...)`

47d61ef472/pkg/kubelet/kubelet.go (L1339)

The trouble with this is that `initError` is never cleared, which means that `runtimeState.runtimeErrors()` always returns this `initError`, and thus pods never start sync-ing.

In normal operation, this is expected and desired because eventually the runtime is expected to become healthy, but in this case, `initError` is never updated, and so the system just gets wedged.

47d61ef472/pkg/kubelet/kubelet.go (L1751)

We could add some retry to `initializeModules()` but that seems unnecessary, as eventually we'd want to just die anyway. Instead, just log fatal and die, a supervisor will restart us.

Note, I'm happy to add some retry here too, if that makes reviewers happier.

**Release note**:
```release-note
Prevent kubelet from getting wedged if initialization of modules returns an error.
```

@feiskyer @dchen1107 @janetkuo 

@kubernetes/sig-node-bugs
2018-01-30 14:56:28 -08:00
Kubernetes Submit Queue
c244994af7 Merge pull request #58997 from Random-Liu/eviction-manager-use-cri
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make eviction manager work with CRI container runtime.

Previously, eviction manager uses a function `HasDedicatedImageFs` in `pkg/kubelet/cadvisor` to detect whether image fs and root fs are on the same device.

However, it doesn't work with CRI container runtime which provides container/image stats through CRI. Thus all eviction tests for containerd are failing now. https://k8s-testgrid.appspot.com/sig-node-containerd#node-e2e-flaky

This PR makes it work with CRI container runtime.

@kubernetes/sig-node-pr-reviews 
@yujuhong @yguo0905 @feiskyer @mrunalp @abhi @dashpole 
Signed-off-by: Lantao Liu <lantaol@google.com>



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2018-01-30 12:43:30 -08:00
Michael Taufen
da41a6e793 Fix PodPidsLimit and ConfigTrialDuration on internal KubeletConfig type
They should both follow the convention of not being a pointer on the
internal type. This required adding a conversion function between
`int64` and `*int64`.

A side effect is this removes a warning in the generated code for the
apps API group.
2018-01-30 11:43:41 -08:00
Lantao Liu
68dadcfd15 Make eviction manager work with CRI container runtime.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-01-30 17:57:46 +00:00
Robert Krawitz
2d050b8549 Fix race condition in fake runtime test. 2018-01-30 08:09:01 -05:00
Peng Gao
ac86428d59 Add detailed err in ensure docker process error
Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
2018-01-30 15:02:22 +08:00
Brendan Burns
3a23c678c5 Remove setInitError. 2018-01-29 21:44:54 -08:00