kernel: Cherry pick multicast fixes into 4.9.x

This cherry picks:
- b6fe0440c637 ("bridge: implement missing ndo_uninit()")
- b1b9d366028f ("bridge: move bridge multicast cleanup to ndo_uninit")

The fix is in b1b9d366028f ("bridge: move bridge multicast cleanup
to ndo_uninit") but it requires b6fe0440c637 ("bridge: implement missing
ndo_uninit()"). Furthermore, b1b9d366028f needed some manual resolution
of a cherry-pick conflict because the surrounding code had changed.

Signed-off-by: Rolf Neugebauer <rn@rneugeba.io>
This commit is contained in:
Rolf Neugebauer 2018-09-05 20:50:13 +01:00
parent 0a9361d769
commit 4c725f0318
14 changed files with 263 additions and 12 deletions

View File

@ -1,7 +1,7 @@
From fe8dd3aef73a8404bc2aff0e61e8863e7203d8e5 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Thu, 2 Mar 2017 12:55:49 -0300
Subject: [PATCH 01/12] tools build: Add test for sched_getcpu()
Subject: [PATCH 01/14] tools build: Add test for sched_getcpu()
Instead of trying to go on adding more ifdef conditions, do a feature
test and define HAVE_SCHED_GETCPU_SUPPORT instead, then use it to

View File

@ -1,7 +1,7 @@
From e3c72ac590752dd3812767cb6ea60e2b88ec0b06 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Thu, 13 Oct 2016 17:12:35 -0300
Subject: [PATCH 02/12] perf jit: Avoid returning garbage for a ret variable
Subject: [PATCH 02/14] perf jit: Avoid returning garbage for a ret variable
When the loop body isn't executed at all, then the 'ret' local variable,
that is uninitialized will be used as the return value.

View File

@ -1,7 +1,7 @@
From 516270b76fff8f0c8aa158f62b71db726e807240 Mon Sep 17 00:00:00 2001
From: Dexuan Cui <decui@microsoft.com>
Date: Sat, 23 Jul 2016 01:35:51 +0000
Subject: [PATCH 03/12] hv_sock: introduce Hyper-V Sockets
Subject: [PATCH 03/14] hv_sock: introduce Hyper-V Sockets
Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
mechanism between the host and the guest. It's somewhat like TCP over

View File

@ -1,7 +1,7 @@
From 5fd6750b81e73b4c5206a4709b263a1dd69287b6 Mon Sep 17 00:00:00 2001
From: Rolf Neugebauer <rolf.neugebauer@gmail.com>
Date: Mon, 23 May 2016 18:55:45 +0100
Subject: [PATCH 04/12] vmbus: Don't spam the logs with unknown GUIDs
Subject: [PATCH 04/14] vmbus: Don't spam the logs with unknown GUIDs
With Hyper-V sockets device types are introduced on the fly. The pr_info()
then prints a message on every connection, which is way too verbose. Since

View File

@ -1,7 +1,7 @@
From 3431174e32f2e60bd2fa8cc34525f87229faee02 Mon Sep 17 00:00:00 2001
From: Alex Ng <alexng@messages.microsoft.com>
Date: Sun, 6 Nov 2016 13:14:07 -0800
Subject: [PATCH 05/12] Drivers: hv: utils: Fix the mapping between host
Subject: [PATCH 05/14] Drivers: hv: utils: Fix the mapping between host
version and protocol to use
We should intentionally declare the protocols to use for every known host

View File

@ -1,7 +1,7 @@
From 79f84ba6a2c606a4c2dd40bd6401d9ba95fed232 Mon Sep 17 00:00:00 2001
From: Alex Ng <alexng@messages.microsoft.com>
Date: Sun, 6 Nov 2016 13:14:10 -0800
Subject: [PATCH 06/12] Drivers: hv: vss: Improve log messages.
Subject: [PATCH 06/14] Drivers: hv: vss: Improve log messages.
Adding log messages to help troubleshoot error cases and transaction
handling.

View File

@ -1,7 +1,7 @@
From 167af36cb474059a4a6a3a1ec1af391c287c804e Mon Sep 17 00:00:00 2001
From: Alex Ng <alexng@messages.microsoft.com>
Date: Sun, 6 Nov 2016 13:14:11 -0800
Subject: [PATCH 07/12] Drivers: hv: vss: Operation timeouts should match host
Subject: [PATCH 07/14] Drivers: hv: vss: Operation timeouts should match host
expectation
Increase the timeout of backup operations. When system is under I/O load,

View File

@ -1,7 +1,7 @@
From 3a5e9488ac5d15df835ee90f7dadbd4fa128ae0c Mon Sep 17 00:00:00 2001
From: Alex Ng <alexng@messages.microsoft.com>
Date: Sat, 28 Jan 2017 12:37:17 -0700
Subject: [PATCH 08/12] Drivers: hv: vmbus: Use all supported IC versions to
Subject: [PATCH 08/14] Drivers: hv: vmbus: Use all supported IC versions to
negotiate
Previously, we were assuming that each IC protocol version was tied to a

View File

@ -1,7 +1,7 @@
From ed28e9313a5e52b282485e32643724ec6706fc97 Mon Sep 17 00:00:00 2001
From: Alex Ng <alexng@messages.microsoft.com>
Date: Sat, 28 Jan 2017 12:37:18 -0700
Subject: [PATCH 09/12] Drivers: hv: Log the negotiated IC versions.
Subject: [PATCH 09/14] Drivers: hv: Log the negotiated IC versions.
Log the negotiated IC versions.

View File

@ -1,7 +1,7 @@
From 256a26fbbffbd065b45acf5f9abd58838a25a49d Mon Sep 17 00:00:00 2001
From: Dexuan Cui <decui@microsoft.com>
Date: Sun, 26 Mar 2017 16:42:20 +0800
Subject: [PATCH 10/12] vmbus: fix missed ring events on boot
Subject: [PATCH 10/14] vmbus: fix missed ring events on boot
During initialization, the channel initialization code schedules the
tasklet to scan the VMBUS receive event page (i.e. simulates an

View File

@ -1,7 +1,7 @@
From 24a091fd9fc828d6e46a2ed7d6147ffcf377f8c7 Mon Sep 17 00:00:00 2001
From: Dexuan Cui <decui@microsoft.com>
Date: Wed, 29 Mar 2017 18:37:10 +0800
Subject: [PATCH 11/12] vmbus: remove "goto error_clean_msglist" in
Subject: [PATCH 11/14] vmbus: remove "goto error_clean_msglist" in
vmbus_open()
This is just a cleanup patch to simplify the code a little.

View File

@ -1,7 +1,7 @@
From ca627325993df99b3007657f7480618550fc8a84 Mon Sep 17 00:00:00 2001
From: Dexuan Cui <decui@microsoft.com>
Date: Fri, 24 Mar 2017 20:53:18 +0800
Subject: [PATCH 12/12] vmbus: dynamically enqueue/dequeue the channel on
Subject: [PATCH 12/14] vmbus: dynamically enqueue/dequeue the channel on
vmbus_open/close
Signed-off-by: Dexuan Cui <decui@microsoft.com>

View File

@ -0,0 +1,142 @@
From 09becdd784c0295e9f9d70677e18d262b7c15d45 Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch@mellanox.com>
Date: Mon, 10 Apr 2017 14:59:27 +0300
Subject: [PATCH 13/14] bridge: implement missing ndo_uninit()
While the bridge driver implements an ndo_init(), it was missing a
symmetric ndo_uninit(), causing the different de-initialization
operations to be scattered around its dellink() and destructor().
Implement a symmetric ndo_uninit() and remove the overlapping operations
from its dellink() and destructor().
This is a prerequisite for the next patch, as it allows us to have a
proper cleanup upon changelink() failure during the bridge's newlink().
Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b6fe0440c63716e09cfc0d1484e3898a0f29d1d1)
---
net/bridge/br_device.c | 20 +++++++++++---------
net/bridge/br_if.c | 1 -
net/bridge/br_multicast.c | 7 +++++--
net/bridge/br_private.h | 5 +++++
4 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 5f5e28f210e0..15be72678bc8 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -122,6 +122,15 @@ static int br_dev_init(struct net_device *dev)
return err;
}
+static void br_dev_uninit(struct net_device *dev)
+{
+ struct net_bridge *br = netdev_priv(dev);
+
+ br_multicast_uninit_stats(br);
+ br_vlan_flush(br);
+ free_percpu(br->stats);
+}
+
static int br_dev_open(struct net_device *dev)
{
struct net_bridge *br = netdev_priv(dev);
@@ -337,6 +346,7 @@ static const struct net_device_ops br_netdev_ops = {
.ndo_open = br_dev_open,
.ndo_stop = br_dev_stop,
.ndo_init = br_dev_init,
+ .ndo_uninit = br_dev_uninit,
.ndo_start_xmit = br_dev_xmit,
.ndo_get_stats64 = br_get_stats64,
.ndo_set_mac_address = br_set_mac_address,
@@ -363,14 +373,6 @@ static const struct net_device_ops br_netdev_ops = {
.ndo_features_check = passthru_features_check,
};
-static void br_dev_free(struct net_device *dev)
-{
- struct net_bridge *br = netdev_priv(dev);
-
- free_percpu(br->stats);
- free_netdev(dev);
-}
-
static struct device_type br_type = {
.name = "bridge",
};
@@ -383,7 +385,7 @@ void br_dev_setup(struct net_device *dev)
ether_setup(dev);
dev->netdev_ops = &br_netdev_ops;
- dev->destructor = br_dev_free;
+ dev->destructor = free_netdev;
dev->ethtool_ops = &br_ethtool_ops;
SET_NETDEV_DEVTYPE(dev, &br_type);
dev->priv_flags = IFF_EBRIDGE | IFF_NO_QUEUE;
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 8e173324693d..1764de88483c 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -311,7 +311,6 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)
br_fdb_delete_by_port(br, NULL, 0, 1);
- br_vlan_flush(br);
br_multicast_dev_del(br);
del_timer_sync(&br->gc_timer);
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 2136e45f5277..4c16e01031d4 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1898,8 +1898,6 @@ void br_multicast_dev_del(struct net_bridge *br)
out:
spin_unlock_bh(&br->multicast_lock);
-
- free_percpu(br->mcast_stats);
}
int br_multicast_set_router(struct net_bridge *br, unsigned long val)
@@ -2354,6 +2352,11 @@ int br_multicast_init_stats(struct net_bridge *br)
return 0;
}
+void br_multicast_uninit_stats(struct net_bridge *br)
+{
+ free_percpu(br->mcast_stats);
+}
+
static void mcast_stats_add_dir(u64 *dst, u64 *src)
{
dst[BR_MCAST_DIR_RX] += src[BR_MCAST_DIR_RX];
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 1b63177e0ccd..417edbe7b8b2 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -601,6 +601,7 @@ void br_rtr_notify(struct net_device *dev, struct net_bridge_port *port,
void br_multicast_count(struct net_bridge *br, const struct net_bridge_port *p,
const struct sk_buff *skb, u8 type, u8 dir);
int br_multicast_init_stats(struct net_bridge *br);
+void br_multicast_uninit_stats(struct net_bridge *br);
void br_multicast_get_stats(const struct net_bridge *br,
const struct net_bridge_port *p,
struct br_mcast_stats *dest);
@@ -741,6 +742,10 @@ static inline int br_multicast_init_stats(struct net_bridge *br)
return 0;
}
+static inline void br_multicast_uninit_stats(struct net_bridge *br)
+{
+}
+
static inline int br_multicast_igmp_type(const struct sk_buff *skb)
{
return 0;
--
2.18.0

View File

@ -0,0 +1,109 @@
From f8df437a8d7f1afebc7ebfbc695dc2032d7d5f66 Mon Sep 17 00:00:00 2001
From: Xin Long <lucien.xin@gmail.com>
Date: Tue, 25 Apr 2017 22:58:37 +0800
Subject: [PATCH 14/14] bridge: move bridge multicast cleanup to ndo_uninit
During removing a bridge device, if the bridge is still up, a new mdb entry
still can be added in br_multicast_add_group() after all mdb entries are
removed in br_multicast_dev_del(). Like the path:
mld_ifc_timer_expire ->
mld_sendpack -> ...
br_multicast_rcv ->
br_multicast_add_group
The new mp's timer will be set up. If the timer expires after the bridge
is freed, it may cause use-after-free panic in br_multicast_group_expired.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa07ed2c8>] br_multicast_group_expired+0x28/0xb0 [bridge]
Call Trace:
<IRQ>
[<ffffffff81094536>] call_timer_fn+0x36/0x110
[<ffffffffa07ed2a0>] ? br_mdb_free+0x30/0x30 [bridge]
[<ffffffff81096967>] run_timer_softirq+0x237/0x340
[<ffffffff8108dcbf>] __do_softirq+0xef/0x280
[<ffffffff8169889c>] call_softirq+0x1c/0x30
[<ffffffff8102c275>] do_softirq+0x65/0xa0
[<ffffffff8108e055>] irq_exit+0x115/0x120
[<ffffffff81699515>] smp_apic_timer_interrupt+0x45/0x60
[<ffffffff81697a5d>] apic_timer_interrupt+0x6d/0x80
Nikolay also found it would cause a memory leak - the mdb hash is
reallocated and not freed due to the mdb rehash.
unreferenced object 0xffff8800540ba800 (size 2048):
backtrace:
[<ffffffff816e2287>] kmemleak_alloc+0x67/0xc0
[<ffffffff81260bea>] __kmalloc+0x1ba/0x3e0
[<ffffffffa05c60ee>] br_mdb_rehash+0x5e/0x340 [bridge]
[<ffffffffa05c74af>] br_multicast_new_group+0x43f/0x6e0 [bridge]
[<ffffffffa05c7aa3>] br_multicast_add_group+0x203/0x260 [bridge]
[<ffffffffa05ca4b5>] br_multicast_rcv+0x945/0x11d0 [bridge]
[<ffffffffa05b6b10>] br_dev_xmit+0x180/0x470 [bridge]
[<ffffffff815c781b>] dev_hard_start_xmit+0xbb/0x3d0
[<ffffffff815c8743>] __dev_queue_xmit+0xb13/0xc10
[<ffffffff815c8850>] dev_queue_xmit+0x10/0x20
[<ffffffffa02f8d7a>] ip6_finish_output2+0x5ca/0xac0 [ipv6]
[<ffffffffa02fbfc6>] ip6_finish_output+0x126/0x2c0 [ipv6]
[<ffffffffa02fc245>] ip6_output+0xe5/0x390 [ipv6]
[<ffffffffa032b92c>] NF_HOOK.constprop.44+0x6c/0x240 [ipv6]
[<ffffffffa032bd16>] mld_sendpack+0x216/0x3e0 [ipv6]
[<ffffffffa032d5eb>] mld_ifc_timer_expire+0x18b/0x2b0 [ipv6]
This could happen when ip link remove a bridge or destroy a netns with a
bridge device inside.
With Nikolay's suggestion, this patch is to clean up bridge multicast in
ndo_uninit after bridge dev is shutdown, instead of br_dev_delete, so
that netif_running check in br_multicast_add_group can avoid this issue.
v1->v2:
- fix this issue by moving br_multicast_dev_del to ndo_uninit, instead
of calling dev_close in br_dev_delete.
(NOTE: Depends upon b6fe0440c637 ("bridge: implement missing ndo_uninit()"))
(NOTE: Manually fixed cherry-pick conflict as the code in the vicinity had
changed. The logic of this fix should be preserved)
Fixes: e10177abf842 ("bridge: multicast: fix handling of temp and perm entries")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Rolf Neugebauer <rn@rneugeba.io>
(cherry picked from commit b1b9d366028ff580e6dd80b48a69c473361456f1)
---
net/bridge/br_device.c | 1 +
net/bridge/br_if.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 15be72678bc8..a0a7bb6a991f 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -126,6 +126,7 @@ static void br_dev_uninit(struct net_device *dev)
{
struct net_bridge *br = netdev_priv(dev);
+ br_multicast_dev_del(br);
br_multicast_uninit_stats(br);
br_vlan_flush(br);
free_percpu(br->stats);
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 1764de88483c..e25b75654256 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -311,7 +311,6 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)
br_fdb_delete_by_port(br, NULL, 0, 1);
- br_multicast_dev_del(br);
del_timer_sync(&br->gc_timer);
br_sysfs_delbr(br->dev);
--
2.18.0