kernel: Cherry pick multicast fixes into 4.9.x

This cherry-picks:
- b6fe0440c637 ("bridge: implement missing ndo_uninit()")
- b1b9d366028f ("bridge: move bridge multicast cleanup to ndo_uninit")

The actual fix is b1b9d366028f ("bridge: move bridge multicast cleanup
to ndo_uninit"), which depends on b6fe0440c637 ("bridge: implement missing
ndo_uninit()"). In addition, the b1b9d366028f cherry-pick hit a conflict
that was resolved by hand because the surrounding code had changed.

Signed-off-by: Rolf Neugebauer <rn@rneugeba.io>
Rolf Neugebauer 2018-09-05 20:50:13 +01:00
parent 0a9361d769
commit 4c725f0318
14 changed files with 263 additions and 12 deletions

@@ -1,7 +1,7 @@
 From fe8dd3aef73a8404bc2aff0e61e8863e7203d8e5 Mon Sep 17 00:00:00 2001
 From: Arnaldo Carvalho de Melo <acme@redhat.com>
 Date: Thu, 2 Mar 2017 12:55:49 -0300
-Subject: [PATCH 01/12] tools build: Add test for sched_getcpu()
+Subject: [PATCH 01/14] tools build: Add test for sched_getcpu()

 Instead of trying to go on adding more ifdef conditions, do a feature
 test and define HAVE_SCHED_GETCPU_SUPPORT instead, then use it to

@@ -1,7 +1,7 @@
 From e3c72ac590752dd3812767cb6ea60e2b88ec0b06 Mon Sep 17 00:00:00 2001
 From: Arnaldo Carvalho de Melo <acme@redhat.com>
 Date: Thu, 13 Oct 2016 17:12:35 -0300
-Subject: [PATCH 02/12] perf jit: Avoid returning garbage for a ret variable
+Subject: [PATCH 02/14] perf jit: Avoid returning garbage for a ret variable

 When the loop body isn't executed at all, then the 'ret' local variable,
 that is uninitialized will be used as the return value.

@@ -1,7 +1,7 @@
 From 516270b76fff8f0c8aa158f62b71db726e807240 Mon Sep 17 00:00:00 2001
 From: Dexuan Cui <decui@microsoft.com>
 Date: Sat, 23 Jul 2016 01:35:51 +0000
-Subject: [PATCH 03/12] hv_sock: introduce Hyper-V Sockets
+Subject: [PATCH 03/14] hv_sock: introduce Hyper-V Sockets

 Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
 mechanism between the host and the guest. It's somewhat like TCP over

@@ -1,7 +1,7 @@
 From 5fd6750b81e73b4c5206a4709b263a1dd69287b6 Mon Sep 17 00:00:00 2001
 From: Rolf Neugebauer <rolf.neugebauer@gmail.com>
 Date: Mon, 23 May 2016 18:55:45 +0100
-Subject: [PATCH 04/12] vmbus: Don't spam the logs with unknown GUIDs
+Subject: [PATCH 04/14] vmbus: Don't spam the logs with unknown GUIDs

 With Hyper-V sockets device types are introduced on the fly. The pr_info()
 then prints a message on every connection, which is way too verbose. Since

@@ -1,7 +1,7 @@
 From 3431174e32f2e60bd2fa8cc34525f87229faee02 Mon Sep 17 00:00:00 2001
 From: Alex Ng <alexng@messages.microsoft.com>
 Date: Sun, 6 Nov 2016 13:14:07 -0800
-Subject: [PATCH 05/12] Drivers: hv: utils: Fix the mapping between host
+Subject: [PATCH 05/14] Drivers: hv: utils: Fix the mapping between host
  version and protocol to use

 We should intentionally declare the protocols to use for every known host

@@ -1,7 +1,7 @@
 From 79f84ba6a2c606a4c2dd40bd6401d9ba95fed232 Mon Sep 17 00:00:00 2001
 From: Alex Ng <alexng@messages.microsoft.com>
 Date: Sun, 6 Nov 2016 13:14:10 -0800
-Subject: [PATCH 06/12] Drivers: hv: vss: Improve log messages.
+Subject: [PATCH 06/14] Drivers: hv: vss: Improve log messages.

 Adding log messages to help troubleshoot error cases and transaction
 handling.

@@ -1,7 +1,7 @@
 From 167af36cb474059a4a6a3a1ec1af391c287c804e Mon Sep 17 00:00:00 2001
 From: Alex Ng <alexng@messages.microsoft.com>
 Date: Sun, 6 Nov 2016 13:14:11 -0800
-Subject: [PATCH 07/12] Drivers: hv: vss: Operation timeouts should match host
+Subject: [PATCH 07/14] Drivers: hv: vss: Operation timeouts should match host
  expectation

 Increase the timeout of backup operations. When system is under I/O load,

@@ -1,7 +1,7 @@
 From 3a5e9488ac5d15df835ee90f7dadbd4fa128ae0c Mon Sep 17 00:00:00 2001
 From: Alex Ng <alexng@messages.microsoft.com>
 Date: Sat, 28 Jan 2017 12:37:17 -0700
-Subject: [PATCH 08/12] Drivers: hv: vmbus: Use all supported IC versions to
+Subject: [PATCH 08/14] Drivers: hv: vmbus: Use all supported IC versions to
  negotiate

 Previously, we were assuming that each IC protocol version was tied to a

@@ -1,7 +1,7 @@
 From ed28e9313a5e52b282485e32643724ec6706fc97 Mon Sep 17 00:00:00 2001
 From: Alex Ng <alexng@messages.microsoft.com>
 Date: Sat, 28 Jan 2017 12:37:18 -0700
-Subject: [PATCH 09/12] Drivers: hv: Log the negotiated IC versions.
+Subject: [PATCH 09/14] Drivers: hv: Log the negotiated IC versions.

 Log the negotiated IC versions.

@@ -1,7 +1,7 @@
 From 256a26fbbffbd065b45acf5f9abd58838a25a49d Mon Sep 17 00:00:00 2001
 From: Dexuan Cui <decui@microsoft.com>
 Date: Sun, 26 Mar 2017 16:42:20 +0800
-Subject: [PATCH 10/12] vmbus: fix missed ring events on boot
+Subject: [PATCH 10/14] vmbus: fix missed ring events on boot

 During initialization, the channel initialization code schedules the
 tasklet to scan the VMBUS receive event page (i.e. simulates an

@@ -1,7 +1,7 @@
 From 24a091fd9fc828d6e46a2ed7d6147ffcf377f8c7 Mon Sep 17 00:00:00 2001
 From: Dexuan Cui <decui@microsoft.com>
 Date: Wed, 29 Mar 2017 18:37:10 +0800
-Subject: [PATCH 11/12] vmbus: remove "goto error_clean_msglist" in
+Subject: [PATCH 11/14] vmbus: remove "goto error_clean_msglist" in
  vmbus_open()

 This is just a cleanup patch to simplify the code a little.

@@ -1,7 +1,7 @@
 From ca627325993df99b3007657f7480618550fc8a84 Mon Sep 17 00:00:00 2001
 From: Dexuan Cui <decui@microsoft.com>
 Date: Fri, 24 Mar 2017 20:53:18 +0800
-Subject: [PATCH 12/12] vmbus: dynamically enqueue/dequeue the channel on
+Subject: [PATCH 12/14] vmbus: dynamically enqueue/dequeue the channel on
  vmbus_open/close

 Signed-off-by: Dexuan Cui <decui@microsoft.com>

@@ -0,0 +1,142 @@
From 09becdd784c0295e9f9d70677e18d262b7c15d45 Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch@mellanox.com>
Date: Mon, 10 Apr 2017 14:59:27 +0300
Subject: [PATCH 13/14] bridge: implement missing ndo_uninit()
While the bridge driver implements an ndo_init(), it was missing a
symmetric ndo_uninit(), causing the different de-initialization
operations to be scattered around its dellink() and destructor().
Implement a symmetric ndo_uninit() and remove the overlapping operations
from its dellink() and destructor().
This is a prerequisite for the next patch, as it allows us to have a
proper cleanup upon changelink() failure during the bridge's newlink().
Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b6fe0440c63716e09cfc0d1484e3898a0f29d1d1)
---
net/bridge/br_device.c | 20 +++++++++++---------
net/bridge/br_if.c | 1 -
net/bridge/br_multicast.c | 7 +++++--
net/bridge/br_private.h | 5 +++++
4 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 5f5e28f210e0..15be72678bc8 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -122,6 +122,15 @@ static int br_dev_init(struct net_device *dev)
 	return err;
 }

+static void br_dev_uninit(struct net_device *dev)
+{
+	struct net_bridge *br = netdev_priv(dev);
+
+	br_multicast_uninit_stats(br);
+	br_vlan_flush(br);
+	free_percpu(br->stats);
+}
+
 static int br_dev_open(struct net_device *dev)
 {
 	struct net_bridge *br = netdev_priv(dev);
@@ -337,6 +346,7 @@ static const struct net_device_ops br_netdev_ops = {
 	.ndo_open = br_dev_open,
 	.ndo_stop = br_dev_stop,
 	.ndo_init = br_dev_init,
+	.ndo_uninit = br_dev_uninit,
 	.ndo_start_xmit = br_dev_xmit,
 	.ndo_get_stats64 = br_get_stats64,
 	.ndo_set_mac_address = br_set_mac_address,
@@ -363,14 +373,6 @@ static const struct net_device_ops br_netdev_ops = {
 	.ndo_features_check = passthru_features_check,
 };

-static void br_dev_free(struct net_device *dev)
-{
-	struct net_bridge *br = netdev_priv(dev);
-
-	free_percpu(br->stats);
-	free_netdev(dev);
-}
-
 static struct device_type br_type = {
 	.name = "bridge",
 };
@@ -383,7 +385,7 @@ void br_dev_setup(struct net_device *dev)
 	ether_setup(dev);

 	dev->netdev_ops = &br_netdev_ops;
-	dev->destructor = br_dev_free;
+	dev->destructor = free_netdev;
 	dev->ethtool_ops = &br_ethtool_ops;
 	SET_NETDEV_DEVTYPE(dev, &br_type);
 	dev->priv_flags = IFF_EBRIDGE | IFF_NO_QUEUE;
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 8e173324693d..1764de88483c 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -311,7 +311,6 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)

 	br_fdb_delete_by_port(br, NULL, 0, 1);

-	br_vlan_flush(br);
 	br_multicast_dev_del(br);
 	del_timer_sync(&br->gc_timer);
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 2136e45f5277..4c16e01031d4 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1898,8 +1898,6 @@ void br_multicast_dev_del(struct net_bridge *br)

 out:
 	spin_unlock_bh(&br->multicast_lock);
-
-	free_percpu(br->mcast_stats);
 }

 int br_multicast_set_router(struct net_bridge *br, unsigned long val)
@@ -2354,6 +2352,11 @@ int br_multicast_init_stats(struct net_bridge *br)
 	return 0;
 }

+void br_multicast_uninit_stats(struct net_bridge *br)
+{
+	free_percpu(br->mcast_stats);
+}
+
 static void mcast_stats_add_dir(u64 *dst, u64 *src)
 {
 	dst[BR_MCAST_DIR_RX] += src[BR_MCAST_DIR_RX];
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 1b63177e0ccd..417edbe7b8b2 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -601,6 +601,7 @@ void br_rtr_notify(struct net_device *dev, struct net_bridge_port *port,
 void br_multicast_count(struct net_bridge *br, const struct net_bridge_port *p,
 			const struct sk_buff *skb, u8 type, u8 dir);
 int br_multicast_init_stats(struct net_bridge *br);
+void br_multicast_uninit_stats(struct net_bridge *br);
 void br_multicast_get_stats(const struct net_bridge *br,
 			    const struct net_bridge_port *p,
 			    struct br_mcast_stats *dest);
@@ -741,6 +742,10 @@ static inline int br_multicast_init_stats(struct net_bridge *br)
 	return 0;
 }

+static inline void br_multicast_uninit_stats(struct net_bridge *br)
+{
+}
+
 static inline int br_multicast_igmp_type(const struct sk_buff *skb)
 {
 	return 0;
--
2.18.0
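
Patch 13 pairs the bridge's ndo_init() with a symmetric ndo_uninit() so that a failure after init can be unwound through a single teardown hook. The following userspace C sketch only illustrates that pairing. It is not kernel code: `toy_dev`, `toy_ops`, `toy_register()` and `toy_unregister()` are hypothetical names, and the `config_fails` flag is an assumed stand-in for a later step such as changelink() failing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device with two allocations made at init time. */
struct toy_dev {
	int *stats;
	int *mcast_stats;
	int live_allocs;	/* outstanding allocations, for checking leaks */
};

struct toy_ops {
	int (*init)(struct toy_dev *dev);
	void (*uninit)(struct toy_dev *dev);
};

/* Single teardown hook: releases whatever init set up. */
static void toy_uninit(struct toy_dev *dev)
{
	if (dev->mcast_stats) {
		free(dev->mcast_stats);
		dev->mcast_stats = NULL;
		dev->live_allocs--;
	}
	if (dev->stats) {
		free(dev->stats);
		dev->stats = NULL;
		dev->live_allocs--;
	}
}

static int toy_init(struct toy_dev *dev)
{
	dev->stats = calloc(1, sizeof(*dev->stats));
	if (!dev->stats)
		return -1;
	dev->live_allocs++;

	dev->mcast_stats = calloc(1, sizeof(*dev->mcast_stats));
	if (!dev->mcast_stats) {
		toy_uninit(dev);	/* unwind the partial init */
		return -1;
	}
	dev->live_allocs++;
	return 0;
}

static const struct toy_ops toy_dev_ops = {
	.init = toy_init,
	.uninit = toy_uninit,
};

/* Register: init, then a later configuration step that may fail. */
static int toy_register(struct toy_dev *dev, const struct toy_ops *ops,
			int config_fails)
{
	int err = ops->init(dev);

	if (err)
		return err;
	if (config_fails) {
		ops->uninit(dev);	/* same hook as normal teardown */
		return -1;
	}
	return 0;
}

static void toy_unregister(struct toy_dev *dev, const struct toy_ops *ops)
{
	ops->uninit(dev);
	assert(dev->live_allocs == 0);
}
```

Because the error path and the normal unregister path share `uninit`, neither needs its own copy of the cleanup sequence, which is the property the commit message relies on.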

@@ -0,0 +1,109 @@
From f8df437a8d7f1afebc7ebfbc695dc2032d7d5f66 Mon Sep 17 00:00:00 2001
From: Xin Long <lucien.xin@gmail.com>
Date: Tue, 25 Apr 2017 22:58:37 +0800
Subject: [PATCH 14/14] bridge: move bridge multicast cleanup to ndo_uninit
During removing a bridge device, if the bridge is still up, a new mdb entry
still can be added in br_multicast_add_group() after all mdb entries are
removed in br_multicast_dev_del(). Like the path:
mld_ifc_timer_expire ->
mld_sendpack -> ...
br_multicast_rcv ->
br_multicast_add_group
The new mp's timer will be set up. If the timer expires after the bridge
is freed, it may cause use-after-free panic in br_multicast_group_expired.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa07ed2c8>] br_multicast_group_expired+0x28/0xb0 [bridge]
Call Trace:
<IRQ>
[<ffffffff81094536>] call_timer_fn+0x36/0x110
[<ffffffffa07ed2a0>] ? br_mdb_free+0x30/0x30 [bridge]
[<ffffffff81096967>] run_timer_softirq+0x237/0x340
[<ffffffff8108dcbf>] __do_softirq+0xef/0x280
[<ffffffff8169889c>] call_softirq+0x1c/0x30
[<ffffffff8102c275>] do_softirq+0x65/0xa0
[<ffffffff8108e055>] irq_exit+0x115/0x120
[<ffffffff81699515>] smp_apic_timer_interrupt+0x45/0x60
[<ffffffff81697a5d>] apic_timer_interrupt+0x6d/0x80
Nikolay also found it would cause a memory leak - the mdb hash is
reallocated and not freed due to the mdb rehash.
unreferenced object 0xffff8800540ba800 (size 2048):
backtrace:
[<ffffffff816e2287>] kmemleak_alloc+0x67/0xc0
[<ffffffff81260bea>] __kmalloc+0x1ba/0x3e0
[<ffffffffa05c60ee>] br_mdb_rehash+0x5e/0x340 [bridge]
[<ffffffffa05c74af>] br_multicast_new_group+0x43f/0x6e0 [bridge]
[<ffffffffa05c7aa3>] br_multicast_add_group+0x203/0x260 [bridge]
[<ffffffffa05ca4b5>] br_multicast_rcv+0x945/0x11d0 [bridge]
[<ffffffffa05b6b10>] br_dev_xmit+0x180/0x470 [bridge]
[<ffffffff815c781b>] dev_hard_start_xmit+0xbb/0x3d0
[<ffffffff815c8743>] __dev_queue_xmit+0xb13/0xc10
[<ffffffff815c8850>] dev_queue_xmit+0x10/0x20
[<ffffffffa02f8d7a>] ip6_finish_output2+0x5ca/0xac0 [ipv6]
[<ffffffffa02fbfc6>] ip6_finish_output+0x126/0x2c0 [ipv6]
[<ffffffffa02fc245>] ip6_output+0xe5/0x390 [ipv6]
[<ffffffffa032b92c>] NF_HOOK.constprop.44+0x6c/0x240 [ipv6]
[<ffffffffa032bd16>] mld_sendpack+0x216/0x3e0 [ipv6]
[<ffffffffa032d5eb>] mld_ifc_timer_expire+0x18b/0x2b0 [ipv6]
This could happen when ip link remove a bridge or destroy a netns with a
bridge device inside.
With Nikolay's suggestion, this patch is to clean up bridge multicast in
ndo_uninit after bridge dev is shutdown, instead of br_dev_delete, so
that netif_running check in br_multicast_add_group can avoid this issue.
v1->v2:
- fix this issue by moving br_multicast_dev_del to ndo_uninit, instead
of calling dev_close in br_dev_delete.
(NOTE: Depends upon b6fe0440c637 ("bridge: implement missing ndo_uninit()"))
(NOTE: Manually fixed cherry-pick conflict as the code in the vicinity had
changed. The logic of this fix should be preserved)
Fixes: e10177abf842 ("bridge: multicast: fix handling of temp and perm entries")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Rolf Neugebauer <rn@rneugeba.io>
(cherry picked from commit b1b9d366028ff580e6dd80b48a69c473361456f1)
---
net/bridge/br_device.c | 1 +
net/bridge/br_if.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 15be72678bc8..a0a7bb6a991f 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -126,6 +126,7 @@ static void br_dev_uninit(struct net_device *dev)
 {
 	struct net_bridge *br = netdev_priv(dev);

+	br_multicast_dev_del(br);
 	br_multicast_uninit_stats(br);
 	br_vlan_flush(br);
 	free_percpu(br->stats);
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 1764de88483c..e25b75654256 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -311,7 +311,6 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)

 	br_fdb_delete_by_port(br, NULL, 0, 1);

-	br_multicast_dev_del(br);
 	del_timer_sync(&br->gc_timer);

 	br_sysfs_delbr(br->dev);
--
2.18.0
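
The ordering problem that patch 14 describes can be illustrated with a small userspace C model. This is a sketch under stated assumptions, not the bridge code: `toy_bridge` and every `toy_*` function are hypothetical, `running` stands in for the netif_running() check in br_multicast_add_group(), and `mdb_entries` stands in for mdb entries together with their timers.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_bridge {
	bool running;		/* models netif_running() */
	int mdb_entries;	/* models mdb entries and their timers */
};

/* Models br_multicast_add_group(): refuses new entries once the device is down. */
static void toy_add_group(struct toy_bridge *br)
{
	if (!br->running)
		return;
	br->mdb_entries++;
}

/* Models br_multicast_dev_del(): drops every entry present at call time. */
static void toy_mcast_cleanup(struct toy_bridge *br)
{
	br->mdb_entries = 0;
}

/* Models the device being shut down during unregistration. */
static void toy_close(struct toy_bridge *br)
{
	br->running = false;
}

/*
 * Old order: multicast cleanup runs in the delete path while the bridge
 * is still up, so a late report can add an entry after the cleanup.
 * Returns the number of entries still present when the bridge is freed.
 */
static int toy_delete_old_order(struct toy_bridge *br)
{
	assert(br->running);
	toy_mcast_cleanup(br);
	toy_add_group(br);	/* late report, device still up: accepted */
	toy_close(br);
	return br->mdb_entries;
}

/*
 * New order: cleanup runs from the uninit step, after the device is
 * down, so nothing can be added once the cleanup has run.
 */
static int toy_delete_new_order(struct toy_bridge *br)
{
	assert(br->running);
	toy_add_group(br);	/* late report, device still up: accepted */
	toy_close(br);
	toy_add_group(br);	/* after shutdown: rejected by the running check */
	toy_mcast_cleanup(br);	/* removes everything that was added */
	return br->mdb_entries;
}
```

In this model the old order leaves one entry behind, which corresponds to the stale timer and leaked mdb hash in the traces above, while the new order leaves none.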