whiterose

linux unikernel
git clone https://git.ne02ptzero.me/git/whiterose

commit 1a9df9e29c2afecf6e3089442d429b377279ca3c
parent 14c741de93861749dfb60b4964028541f5c506ca
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed, 27 Mar 2019 12:22:57 -0700

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "Fixes here and there, a couple new device IDs, as usual:

   1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei.

   2) Fix 64-bit division in iwlwifi, from Arnd Bergmann.

   3) Fix documentation for some eBPF helpers, from Quentin Monnet.

   4) Some UAPI bpf header sync with tools, also from Quentin Monnet.

   5) Set descriptor ownership bit at the right time for jumbo frames in
      stmmac driver, from Aaro Koskinen.

   6) Set IFF_UP properly in tun driver, from Eric Dumazet.

   7) Fix load/store doubleword instruction generation in powerpc eBPF
      JIT, from Naveen N. Rao.

   8) nla_nest_start() return value checks all over, from Kangjie Lu.

   9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this
      merge window. From Marcelo Ricardo Leitner and Xin Long.

  10) Fix memory corruption with large MTUs in stmmac, from Aaro
      Koskinen.

  11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric
      Dumazet.

  12) Fix topology subscription cancellation in tipc, from Erik Hugne.

  13) Memory leak in genetlink error path, from Yue Haibing.

  14) Validate control actions properly in packet scheduler, from Davide
      Caratti.

  15) Even if we get EEXIST, we still need to rehash if a shrink was
      delayed. From Herbert Xu.

  16) Fix interrupt mask handling in interrupt handler of r8169, from
      Heiner Kallweit.

  17) Fix leak in ehea driver, from Wen Yang"
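
Item 7 can be illustrated with a small sketch. The powerpc `ld` and `std` instructions use the DS instruction form, in which the low two bits of the instruction word are not part of the displacement, so a doubleword access at an offset that is not a multiple of 4 has to use the indexed forms (`ldx`, `stdx`) with the offset held in a register. The following is a minimal user-space sketch of that selection, not the kernel code: the `ld` and `ldx` opcode values and the `0xfffc` mask are taken from the diff in this merge, while the `addi` opcode value, the register-field helpers, and the function name `emit_bpf_ll` are assumptions made for illustration.

```c
#include <stdint.h>

/* Opcode values as defined in ppc-opcode.h in this merge. */
#define PPC_INST_LD   0xe8000000u
#define PPC_INST_LDX  0x7c00002au
/* Assumption: addi opcode, used to model PPC_LI (li r, i == addi r, 0, i). */
#define PPC_INST_ADDI 0x38000000u

/* Assumed register-field helpers: 5-bit fields at bits 21, 16 and 11. */
static uint32_t ppc_rt(unsigned r) { return (uint32_t)(r & 0x1f) << 21; }
static uint32_t ppc_ra(unsigned r) { return (uint32_t)(r & 0x1f) << 16; }
static uint32_t ppc_rb(unsigned r) { return (uint32_t)(r & 0x1f) << 11; }

/*
 * Hypothetical helper modeled on the 64-bit PPC_BPF_LL macro.
 * Writes the instruction words for "load doubleword rt <- off(base)"
 * into buf and returns how many words were written.
 * tmp is a scratch register that is clobbered on the fallback path.
 * Only offsets that fit a signed 16-bit immediate are handled.
 */
static int emit_bpf_ll(uint32_t *buf, unsigned rt, unsigned base,
                       int off, unsigned tmp)
{
    if (off % 4) {
        /* li tmp, off ; ldx rt, base, tmp */
        buf[0] = PPC_INST_ADDI | ppc_rt(tmp) | ((uint32_t)off & 0xffffu);
        buf[1] = PPC_INST_LDX | ppc_rt(rt) | ppc_ra(base) | ppc_rb(tmp);
        return 2;
    }
    /* ld rt, off(base): low two bits of the displacement must be zero */
    buf[0] = PPC_INST_LD | ppc_rt(rt) | ppc_ra(base) |
             ((uint32_t)off & 0xfffcu);
    return 1;
}
```

Under these assumptions, `emit_bpf_ll(buf, 3, 1, 8, 12)` yields a single `ld` word, while an offset of 6 yields a load-immediate into the scratch register followed by `ldx`. As the warning in the patch notes, the fallback path overwrites the temporary register, so the caller has to make sure it is free.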

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits)
  dpaa2-eth: fix race condition with bql frame accounting
  chelsio: use BUG() instead of BUG_ON(1)
  net: devlink: skip info_get op call if it is not defined in dumpit
  net: phy: bcm54xx: Encode link speed and activity into LEDs
  tipc: change to check tipc_own_id to return in tipc_net_stop
  net: usb: aqc111: Extend HWID table by QNAP device
  net: sched: Kconfig: update reference link for PIE
  net: dsa: qca8k: extend slave-bus implementations
  net: dsa: qca8k: remove leftover phy accessors
  dt-bindings: net: dsa: qca8k: support internal mdio-bus
  dt-bindings: net: dsa: qca8k: fix example
  net: phy: don't clear BMCR in genphy_soft_reset
  bpf, libbpf: clarify bump in libbpf version info
  bpf, libbpf: fix version info and add it to shared object
  rxrpc: avoid clang -Wuninitialized warning
  tipc: tipc clang warning
  net: sched: fix cleanup NULL pointer exception in act_mirr
  r8169: fix cable re-plugging issue
  net: ethernet: ti: fix possible object reference leak
  net: ibm: fix possible object reference leak
  ...
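
The qca8k entries in this list add access to the switch's internal PHYs through an MDIO master control register. As a rough illustration of how such a control word is composed, here is a user-space sketch. The bit positions and the port-to-PHY mapping (port N uses PHY address N - 1) are copied from the qca8k.h and qca8k.c hunks of this merge; the function names, the `BIT` definition, and the `PHY_MAX_ADDR` value of 32 are assumptions made to keep the example self-contained, and neither register access nor the busy-wait is modeled.

```c
#include <errno.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Field layout as defined in drivers/net/dsa/qca8k.h in this merge. */
#define QCA8K_MDIO_MASTER_BUSY        BIT(31)
#define QCA8K_MDIO_MASTER_EN          BIT(30)
#define QCA8K_MDIO_MASTER_READ        BIT(27)
#define QCA8K_MDIO_MASTER_WRITE       0u
#define QCA8K_MDIO_MASTER_PHY_ADDR(x) ((uint32_t)(x) << 21)
#define QCA8K_MDIO_MASTER_REG_ADDR(x) ((uint32_t)(x) << 16)
#define QCA8K_MDIO_MASTER_DATA(x)     ((uint32_t)(x))
#define QCA8K_MDIO_MASTER_MAX_REG     32

/* Assumption: number of addresses on an MDIO bus. */
#define PHY_MAX_ADDR 32

/* Port 1 maps to PHY address 0, ..., port 5 maps to PHY address 4. */
static uint32_t port_to_phy(int port)
{
    /* The modulo keeps a bad port from spilling into other fields. */
    return (uint32_t)(port - 1) % PHY_MAX_ADDR;
}

/* Hypothetical helper: build the control word for a PHY register write. */
static int mdio_master_write_word(int port, uint32_t regnum, uint16_t data,
                                  uint32_t *word)
{
    if (regnum >= QCA8K_MDIO_MASTER_MAX_REG)
        return -EINVAL;

    *word = QCA8K_MDIO_MASTER_BUSY | QCA8K_MDIO_MASTER_EN |
            QCA8K_MDIO_MASTER_WRITE |
            QCA8K_MDIO_MASTER_PHY_ADDR(port_to_phy(port)) |
            QCA8K_MDIO_MASTER_REG_ADDR(regnum) |
            QCA8K_MDIO_MASTER_DATA(data);
    return 0;
}

/* Hypothetical helper: build the control word for a PHY register read. */
static int mdio_master_read_word(int port, uint32_t regnum, uint32_t *word)
{
    if (regnum >= QCA8K_MDIO_MASTER_MAX_REG)
        return -EINVAL;

    *word = QCA8K_MDIO_MASTER_BUSY | QCA8K_MDIO_MASTER_EN |
            QCA8K_MDIO_MASTER_READ |
            QCA8K_MDIO_MASTER_PHY_ADDR(port_to_phy(port)) |
            QCA8K_MDIO_MASTER_REG_ADDR(regnum);
    return 0;
}
```

In the driver, the composed word is written to `QCA8K_MDIO_MASTER_CTRL`, the busy bit is polled, and for reads the low 16 bits are returned as the register value. Port 0 has no internal PHY, so `port_to_phy(0)` wraps to 31 here instead of producing an out-of-range field.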

Diffstat:
M Documentation/devicetree/bindings/net/dsa/qca8k.txt | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
M Documentation/networking/msg_zerocopy.rst | 2 +-
M Documentation/networking/netdev-FAQ.rst | 13 +++++++++++++
M Documentation/networking/nf_flowtable.txt | 8 ++++----
M Documentation/networking/snmp_counter.rst | 12 +++++++++---
M arch/powerpc/include/asm/ppc-opcode.h | 2 ++
M arch/powerpc/net/bpf_jit.h | 17 +++++------------
M arch/powerpc/net/bpf_jit32.h | 4 ++++
M arch/powerpc/net/bpf_jit64.h | 20 ++++++++++++++++++++
M arch/powerpc/net/bpf_jit_comp64.c | 12 ++++++------
M drivers/isdn/hardware/mISDN/hfcmulti.c | 3 ++-
M drivers/net/Kconfig | 4 ++--
M drivers/net/dsa/qca8k.c | 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
M drivers/net/dsa/qca8k.h | 13 +++++++++++++
M drivers/net/ethernet/3com/3c515.c | 2 +-
M drivers/net/ethernet/8390/mac8390.c | 19 ++++++++++++-------
M drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 5 +++--
M drivers/net/ethernet/cadence/macb_main.c | 10 ++++++++--
M drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 2 +-
M drivers/net/ethernet/chelsio/cxgb4/sge.c | 2 +-
M drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 15 ++++++++++-----
M drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 13 ++++++-------
M drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 1 -
M drivers/net/ethernet/ibm/ehea/ehea_main.c | 1 +
M drivers/net/ethernet/mellanox/mlxsw/core_env.c | 2 +-
M drivers/net/ethernet/micrel/ks8851.c | 42 ++++++++++++++++++++++++------------------
M drivers/net/ethernet/micrel/ks8851.h | 93 ++++++++++++++++++++++++++++++++++++++++++-------------------------------------
M drivers/net/ethernet/micrel/ks8851_mll.c | 317 ++-----------------------------------------------------------------------------
M drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c | 2 ++
M drivers/net/ethernet/realtek/atp.c | 2 +-
M drivers/net/ethernet/realtek/r8169.c | 8 +++++---
M drivers/net/ethernet/sis/sis900.c | 10 +++++-----
M drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 13 ++++++++-----
M drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 14 ++++++++------
M drivers/net/ethernet/ti/netcp_ethss.c | 8 ++++++--
M drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 2 ++
M drivers/net/ieee802154/adf7242.c | 4 ++++
M drivers/net/ieee802154/mac802154_hwsim.c | 2 +-
M drivers/net/phy/Kconfig | 3 +++
M drivers/net/phy/broadcom.c | 13 +++++++++++++
M drivers/net/phy/dp83822.c | 34 ++++++++++++++++++++--------------
M drivers/net/phy/meson-gxl.c | 6 ++++++
M drivers/net/phy/phy_device.c | 2 +-
M drivers/net/tun.c | 16 ++++++++++++----
M drivers/net/usb/aqc111.c | 15 +++++++++++++++
M drivers/net/usb/cdc_ether.c | 8 ++++++++
M drivers/net/vxlan.c | 4 +---
M drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c | 4 +---
M drivers/net/wireless/mediatek/mt76/dma.c | 7 ++++++-
M drivers/net/wireless/mediatek/mt76/mac80211.c | 18 +++++++++++-------
M drivers/net/wireless/mediatek/mt76/mt76.h | 4 ++++
M drivers/net/wireless/mediatek/mt76/mt7603/beacon.c | 3 +--
M drivers/net/wireless/mediatek/mt76/mt7603/dma.c | 17 +++++++++++++++--
M drivers/net/wireless/mediatek/mt76/mt7603/init.c | 2 +-
M drivers/net/wireless/mediatek/mt76/mt7603/mac.c | 2 +-
M drivers/net/wireless/mediatek/mt76/mt7603/main.c | 16 ++++++++++++++++
M drivers/net/wireless/mediatek/mt76/mt7603/mcu.c | 2 +-
M drivers/net/wireless/mediatek/mt76/mt7603/soc.c | 4 ++--
M drivers/net/wireless/mediatek/mt76/mt76x0/initvals.h | 2 +-
M drivers/net/wireless/mediatek/mt76/mt76x0/usb.c | 10 +++++++---
M drivers/net/wireless/mediatek/mt76/mt76x02.h | 11 +++++++++++
M drivers/net/wireless/mediatek/mt76/mt76x02_debugfs.c | 27 +++++++++++++++++++++++++++
M drivers/net/wireless/mediatek/mt76/mt76x02_dfs.c | 3 ++-
M drivers/net/wireless/mediatek/mt76/mt76x02_mac.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
M drivers/net/wireless/mediatek/mt76/mt76x02_mac.h | 2 ++
M drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
M drivers/net/wireless/mediatek/mt76/mt76x02_phy.c | 2 ++
M drivers/net/wireless/mediatek/mt76/mt76x02_usb_core.c | 3 ++-
M drivers/net/wireless/mediatek/mt76/mt76x02_util.c | 14 ++++++++++----
M drivers/net/wireless/mediatek/mt76/mt76x2/init.c | 2 +-
M drivers/net/wireless/mediatek/mt76/mt76x2/mt76x2.h | 1 +
M drivers/net/wireless/mediatek/mt76/mt76x2/pci_init.c | 2 +-
M drivers/net/wireless/mediatek/mt76/mt76x2/pci_mcu.c | 21 +++++++++++++++++++++
M drivers/net/wireless/mediatek/mt76/mt76x2/phy.c | 30 ++++++++++++++++++++++--------
M drivers/net/wireless/mediatek/mt76/mt76x2/usb.c | 7 +++++--
M drivers/net/wireless/mediatek/mt76/mt76x2/usb_mac.c | 1 -
M drivers/net/wireless/mediatek/mt76/tx.c | 11 ++++++++++-
M drivers/net/wireless/mediatek/mt76/usb.c | 6 +++++-
M drivers/net/wireless/mediatek/mt7601u/usb.c | 4 ++++
M drivers/s390/net/qeth_core_main.c | 5 ++++-
M drivers/s390/net/qeth_l2_main.c | 7 +++----
M drivers/s390/net/qeth_l3_main.c | 8 ++++----
M include/linux/atalk.h | 2 +-
M include/linux/bpf.h | 1 -
M include/linux/bpf_verifier.h | 40 ++++++++++++++++++++++++++++++++++++++++
M include/linux/brcmphy.h | 16 ++++++++++++++++
M include/linux/net.h | 6 ++++++
M include/linux/socket.h | 12 ++++++------
M include/net/act_api.h | 9 +++++++--
M include/net/sch_generic.h | 1 +
M include/net/sctp/checksum.h | 2 +-
M include/net/sock.h | 6 ++++++
M include/net/tc_act/tc_gact.h | 2 +-
M include/net/xdp_sock.h | 1 -
M include/uapi/linux/bpf.h | 188 +++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------
M kernel/bpf/syscall.c | 22 +++++++++++++++-------
M kernel/bpf/verifier.c | 154 +++++++++++++++++++++++++++++++++++++++++++++++--------------------------------
M lib/rhashtable.c | 8 ++++++--
M net/appletalk/aarp.c | 15 ++++++++++++---
M net/appletalk/ddp.c | 20 ++++++++++++--------
M net/bridge/br_netfilter_hooks.c | 1 +
M net/bridge/br_netfilter_ipv6.c | 2 ++
M net/core/devlink.c | 5 +++++
M net/core/filter.c | 27 ++++++++++++++++++++++-----
M net/core/net-sysfs.c | 20 +++++++++++++-------
M net/dccp/ipv6.c | 4 ++--
M net/ipv6/netfilter/ip6t_srh.c | 6 ++++++
M net/ipv6/route.c | 18 ++++++++++--------
M net/ipv6/tcp_ipv6.c | 8 ++++----
M net/mpls/mpls_iptunnel.c | 12 +++++++++---
M net/ncsi/ncsi-netlink.c | 4 ++++
M net/netfilter/Kconfig | 1 +
M net/netfilter/nf_conntrack_sip.c | 37 ++++++++++++++++---------------------
M net/netfilter/nf_tables_api.c | 5 ++++-
M net/netfilter/nft_objref.c | 19 ++++++++++++++++---
M net/netfilter/nft_redir.c | 2 +-
M net/netfilter/nft_set_rbtree.c | 7 +++----
M net/netlink/genetlink.c | 3 ++-
M net/nfc/llcp_sock.c | 9 +++++++--
M net/openvswitch/datapath.c | 12 ++++++++++++
M net/packet/af_packet.c | 7 ++++---
M net/rose/rose_subr.c | 21 ++++++++++++---------
M net/rxrpc/output.c | 11 +++++------
M net/sched/Kconfig | 3 +--
M net/sched/act_api.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++---------------------------------
M net/sched/act_bpf.c | 25 +++++++++++++++++++------
M net/sched/act_connmark.c | 22 +++++++++++++++++++---
M net/sched/act_csum.c | 22 ++++++++++++++++++----
M net/sched/act_gact.c | 15 +++++++++++++--
M net/sched/act_ife.c | 35 +++++++++++++++++++++--------------
M net/sched/act_ipt.c | 11 ++++++-----
M net/sched/act_mirred.c | 25 +++++++++++++++++++++----
M net/sched/act_nat.c | 15 +++++++++++++--
M net/sched/act_pedit.c | 18 +++++++++++++++---
M net/sched/act_police.c | 13 ++++++++++++-
M net/sched/act_sample.c | 21 +++++++++++++++++----
M net/sched/act_simple.c | 54 +++++++++++++++++++++++++++++++++++++++---------------
M net/sched/act_skbedit.c | 20 +++++++++++++++++---
M net/sched/act_skbmod.c | 20 +++++++++++++++++---
M net/sched/act_tunnel_key.c | 19 +++++++++++++++++--
M net/sched/act_vlan.c | 22 ++++++++++++++++++----
M net/sched/cls_api.c | 2 +-
M net/sched/sch_cake.c | 25 ++++++++++++-------------
M net/sctp/socket.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--------
M net/socket.c | 277 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
M net/strparser/strparser.c | 2 ++
M net/tipc/group.c | 3 +++
M net/tipc/net.c | 5 +----
M net/tipc/node.c | 7 ++++---
M net/tipc/socket.c | 22 +++++++++++++++++-----
M net/tipc/topsrv.c | 1 +
M net/xdp/xdp_umem.c | 19 +------------------
M tools/include/uapi/linux/bpf.h | 188 +++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------
M tools/lib/bpf/Makefile | 42 ++++++++++++++++++++++++++++--------------
M tools/lib/bpf/README.rst | 1 +
M tools/lib/bpf/btf.c | 51 ++++++++++++++++++++++++++++++++++++++-------------
M tools/lib/bpf/libbpf.c | 13 ++++++++++---
M tools/lib/bpf/xsk.c | 15 +++++++++++----
M tools/testing/selftests/bpf/bpf_helpers.h | 2 ++
M tools/testing/selftests/bpf/prog_tests/map_lock.c | 2 +-
M tools/testing/selftests/bpf/prog_tests/spinlock.c | 2 +-
M tools/testing/selftests/bpf/progs/test_sock_fields_kern.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------
M tools/testing/selftests/bpf/test_btf.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
M tools/testing/selftests/bpf/test_sock_fields.c | 134 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
M tools/testing/selftests/bpf/verifier/calls.c | 25 +++++++++++++++++++++++++
M tools/testing/selftests/bpf/verifier/ref_tracking.c | 168 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M tools/testing/selftests/bpf/verifier/sock.c | 4 ++--
M tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/connmark.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/csum.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/gact.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/ife.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/nat.json | 25 +++++++++++++++++++++++++
A tools/testing/selftests/tc-testing/tc-tests/actions/pedit.json | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/police.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/sample.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/simple.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json | 25 +++++++++++++++++++++++++
M tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json | 25 +++++++++++++++++++++++++
182 files changed, 3121 insertions(+), 1137 deletions(-)
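
One of the smaller changes counted above, the `macb_clk_init()` hunk in drivers/net/ethernet/cadence/macb_main.c, switches from `IS_ERR()` to `IS_ERR_OR_NULL()` and substitutes `-ENODEV` when the clock pointer is NULL, since converting a NULL pointer to an error code gives 0. The sketch below mimics that logic in user space under stated assumptions: the lowercase helpers are simplified stand-ins for the kernel's error-pointer macros (errors encoded as the top 4095 pointer values), and `check_clk` is a hypothetical name, not a kernel function.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: largest errno value that can be encoded in a pointer. */
#define MAX_ERRNO 4095

/* Simplified stand-ins for the kernel's ERR_PTR/PTR_ERR/IS_ERR helpers. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static int is_err_or_null(const void *p) { return !p || is_err(p); }

/*
 * Hypothetical check modeled on the patched macb_clk_init():
 * an error pointer yields its errno, NULL yields -ENODEV,
 * and any other pointer is accepted.
 */
static int check_clk(const void *clk)
{
    if (is_err_or_null(clk)) {
        int err = (int)ptr_err(clk);

        if (!err)
            err = -ENODEV; /* NULL carries no error code of its own */
        return err;
    }
    return 0;
}
```

With only an error-pointer check, a NULL clock would pass as valid; with the combined check but without the fallback, the function would report failure while returning 0.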

diff --git a/Documentation/devicetree/bindings/net/dsa/qca8k.txt b/Documentation/devicetree/bindings/net/dsa/qca8k.txt @@ -12,10 +12,15 @@ Required properties: Subnodes: The integrated switch subnode should be specified according to the binding -described in dsa/dsa.txt. As the QCA8K switches do not have a N:N mapping of -port and PHY id, each subnode describing a port needs to have a valid phandle -referencing the internal PHY connected to it. The CPU port of this switch is -always port 0. +described in dsa/dsa.txt. If the QCA8K switch is connect to a SoC's external +mdio-bus each subnode describing a port needs to have a valid phandle +referencing the internal PHY it is connected to. This is because there's no +N:N mapping of port and PHY id. + +Don't use mixed external and internal mdio-bus configurations, as this is +not supported by the hardware. + +The CPU port of this switch is always port 0. A CPU port node has the following optional node: @@ -31,8 +36,9 @@ For QCA8K the 'fixed-link' sub-node supports only the following properties: - 'full-duplex' (boolean, optional), to indicate that full duplex is used. When absent, half duplex is assumed. 
-Example: +Examples: +for the external mdio-bus configuration: &mdio0 { phy_port1: phy@0 { @@ -55,12 +61,12 @@ Example: reg = <4>; }; - switch0@0 { + switch@10 { compatible = "qca,qca8337"; #address-cells = <1>; #size-cells = <0>; - reg = <0>; + reg = <0x10>; ports { #address-cells = <1>; @@ -108,3 +114,56 @@ Example: }; }; }; + +for the internal master mdio-bus configuration: + + &mdio0 { + switch@10 { + compatible = "qca,qca8337"; + #address-cells = <1>; + #size-cells = <0>; + + reg = <0x10>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + label = "cpu"; + ethernet = <&gmac1>; + phy-mode = "rgmii"; + fixed-link { + speed = 1000; + full-duplex; + }; + }; + + port@1 { + reg = <1>; + label = "lan1"; + }; + + port@2 { + reg = <2>; + label = "lan2"; + }; + + port@3 { + reg = <3>; + label = "lan3"; + }; + + port@4 { + reg = <4>; + label = "lan4"; + }; + + port@5 { + reg = <5>; + label = "wan"; + }; + }; + }; + }; diff --git a/Documentation/networking/msg_zerocopy.rst b/Documentation/networking/msg_zerocopy.rst @@ -50,7 +50,7 @@ the excellent reporting over at LWN.net or read the original code. patchset [PATCH net-next v4 0/9] socket sendmsg MSG_ZEROCOPY - http://lkml.kernel.org/r/20170803202945.70750-1-willemdebruijn.kernel@gmail.com + https://lkml.kernel.org/netdev/20170803202945.70750-1-willemdebruijn.kernel@gmail.com Interface diff --git a/Documentation/networking/netdev-FAQ.rst b/Documentation/networking/netdev-FAQ.rst @@ -131,6 +131,19 @@ it to the maintainer to figure out what is the most recent and current version that should be applied. If there is any doubt, the maintainer will reply and ask what should be done. +Q: I made changes to only a few patches in a patch series should I resend only those changed? 
+-------------------------------------------------------------------------------------------- +A: No, please resend the entire patch series and make sure you do number your +patches such that it is clear this is the latest and greatest set of patches +that can be applied. + +Q: I submitted multiple versions of a patch series and it looks like a version other than the last one has been accepted, what should I do? +------------------------------------------------------------------------------------------------------------------------------------------- +A: There is no revert possible, once it is pushed out, it stays like that. +Please send incremental versions on top of what has been merged in order to fix +the patches the way they would look like if your latest patch series was to be +merged. + Q: How can I tell what patches are queued up for backporting to the various stable releases? -------------------------------------------------------------------------------------------- A: Normally Greg Kroah-Hartman collects stable commits himself, but for diff --git a/Documentation/networking/nf_flowtable.txt b/Documentation/networking/nf_flowtable.txt @@ -44,10 +44,10 @@ including the Netfilter hooks and the flowtable fastpath bypass. / \ / \ |Routing | / \ --> ingress ---> prerouting ---> |decision| | postrouting |--> neigh_xmit \_________/ \__________/ ---------- \____________/ ^ - | ^ | | ^ | - flowtable | | ____\/___ | | - | | | / \ | | - __\/___ | --------->| forward |------------ | + | ^ | ^ | + flowtable | ____\/___ | | + | | / \ | | + __\/___ | | forward |------------ | |-----| | \_________/ | |-----| | 'flow offload' rule | |-----| | adds entry to | diff --git a/Documentation/networking/snmp_counter.rst b/Documentation/networking/snmp_counter.rst @@ -413,7 +413,7 @@ algorithm. .. 
_F-RTO: https://tools.ietf.org/html/rfc5682 TCP Fast Path -============ +============= When kernel receives a TCP packet, it has two paths to handler the packet, one is fast path, another is slow path. The comment in kernel code provides a good explanation of them, I pasted them below:: @@ -681,6 +681,7 @@ The TCP stack receives an out of order duplicate packet, so it sends a DSACK to the sender. * TcpExtTCPDSACKRecv + The TCP stack receives a DSACK, which indicates an acknowledged duplicate packet is received. @@ -690,7 +691,7 @@ The TCP stack receives a DSACK, which indicate an out of order duplicate packet is received. invalid SACK and DSACK -==================== +====================== When a SACK (or DSACK) block is invalid, a corresponding counter would be updated. The validation method is base on the start/end sequence number of the SACK block. For more details, please refer the comment @@ -704,11 +705,13 @@ explaination: .. _Add counters for discarded SACK blocks: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18f02545a9a16c9a89778b91a162ad16d510bb32 * TcpExtTCPSACKDiscard + This counter indicates how many SACK blocks are invalid. If the invalid SACK block is caused by ACK recording, the TCP stack will only ignore it and won't update this counter. * TcpExtTCPDSACKIgnoredOld and TcpExtTCPDSACKIgnoredNoUndo + When a DSACK block is invalid, one of these two counters would be updated. Which counter will be updated depends on the undo_marker flag of the TCP socket. If the undo_marker is not set, the TCP stack isn't @@ -719,7 +722,7 @@ will be updated. If the undo_marker is set, TcpExtTCPDSACKIgnoredOld will be updated. As implied in its name, it might be an old packet. SACK shift -========= +========== The linux networking stack stores data in sk_buff struct (skb for short). If a SACK block acrosses multiple skb, the TCP stack will try to re-arrange data in these skb. E.g. 
if a SACK block acknowledges seq @@ -730,12 +733,15 @@ seq 14 to 20. All data in skb2 will be moved to skb1, and skb2 will be discard, this operation is 'merge'. * TcpExtTCPSackShifted + A skb is shifted * TcpExtTCPSackMerged + A skb is merged * TcpExtTCPSackShiftFallback + A skb should be shifted or merged, but the TCP stack doesn't do it for some reasons. diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h @@ -302,6 +302,7 @@ /* Misc instructions for BPF compiler */ #define PPC_INST_LBZ 0x88000000 #define PPC_INST_LD 0xe8000000 +#define PPC_INST_LDX 0x7c00002a #define PPC_INST_LHZ 0xa0000000 #define PPC_INST_LWZ 0x80000000 #define PPC_INST_LHBRX 0x7c00062c @@ -309,6 +310,7 @@ #define PPC_INST_STB 0x98000000 #define PPC_INST_STH 0xb0000000 #define PPC_INST_STD 0xf8000000 +#define PPC_INST_STDX 0x7c00012a #define PPC_INST_STDU 0xf8000001 #define PPC_INST_STW 0x90000000 #define PPC_INST_STWU 0x94000000 diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h @@ -51,6 +51,8 @@ #define PPC_LIS(r, i) PPC_ADDIS(r, 0, i) #define PPC_STD(r, base, i) EMIT(PPC_INST_STD | ___PPC_RS(r) | \ ___PPC_RA(base) | ((i) & 0xfffc)) +#define PPC_STDX(r, base, b) EMIT(PPC_INST_STDX | ___PPC_RS(r) | \ + ___PPC_RA(base) | ___PPC_RB(b)) #define PPC_STDU(r, base, i) EMIT(PPC_INST_STDU | ___PPC_RS(r) | \ ___PPC_RA(base) | ((i) & 0xfffc)) #define PPC_STW(r, base, i) EMIT(PPC_INST_STW | ___PPC_RS(r) | \ @@ -65,7 +67,9 @@ #define PPC_LBZ(r, base, i) EMIT(PPC_INST_LBZ | ___PPC_RT(r) | \ ___PPC_RA(base) | IMM_L(i)) #define PPC_LD(r, base, i) EMIT(PPC_INST_LD | ___PPC_RT(r) | \ - ___PPC_RA(base) | IMM_L(i)) + ___PPC_RA(base) | ((i) & 0xfffc)) +#define PPC_LDX(r, base, b) EMIT(PPC_INST_LDX | ___PPC_RT(r) | \ + ___PPC_RA(base) | ___PPC_RB(b)) #define PPC_LWZ(r, base, i) EMIT(PPC_INST_LWZ | ___PPC_RT(r) | \ ___PPC_RA(base) | IMM_L(i)) #define PPC_LHZ(r, base, i) EMIT(PPC_INST_LHZ | ___PPC_RT(r) | \ @@ -85,17 +89,6 @@ ___PPC_RA(a) | ___PPC_RB(b)) 
#define PPC_BPF_STDCX(s, a, b) EMIT(PPC_INST_STDCX | ___PPC_RS(s) | \ ___PPC_RA(a) | ___PPC_RB(b)) - -#ifdef CONFIG_PPC64 -#define PPC_BPF_LL(r, base, i) do { PPC_LD(r, base, i); } while(0) -#define PPC_BPF_STL(r, base, i) do { PPC_STD(r, base, i); } while(0) -#define PPC_BPF_STLU(r, base, i) do { PPC_STDU(r, base, i); } while(0) -#else -#define PPC_BPF_LL(r, base, i) do { PPC_LWZ(r, base, i); } while(0) -#define PPC_BPF_STL(r, base, i) do { PPC_STW(r, base, i); } while(0) -#define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0) -#endif - #define PPC_CMPWI(a, i) EMIT(PPC_INST_CMPWI | ___PPC_RA(a) | IMM_L(i)) #define PPC_CMPDI(a, i) EMIT(PPC_INST_CMPDI | ___PPC_RA(a) | IMM_L(i)) #define PPC_CMPW(a, b) EMIT(PPC_INST_CMPW | ___PPC_RA(a) | \ diff --git a/arch/powerpc/net/bpf_jit32.h b/arch/powerpc/net/bpf_jit32.h @@ -122,6 +122,10 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh); #define PPC_NTOHS_OFFS(r, base, i) PPC_LHZ_OFFS(r, base, i) #endif +#define PPC_BPF_LL(r, base, i) do { PPC_LWZ(r, base, i); } while(0) +#define PPC_BPF_STL(r, base, i) do { PPC_STW(r, base, i); } while(0) +#define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0) + #define SEEN_DATAREF 0x10000 /* might call external helpers */ #define SEEN_XREG 0x20000 /* X reg is used */ #define SEEN_MEM 0x40000 /* SEEN_MEM+(1<<n) = use mem[n] for temporary diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h @@ -68,6 +68,26 @@ static const int b2p[] = { /* PPC NVR range -- update this if we ever use NVRs below r27 */ #define BPF_PPC_NVR_MIN 27 +/* + * WARNING: These can use TMP_REG_2 if the offset is not at word boundary, + * so ensure that it isn't in use already. 
+ */ +#define PPC_BPF_LL(r, base, i) do { \ + if ((i) % 4) { \ + PPC_LI(b2p[TMP_REG_2], (i)); \ + PPC_LDX(r, base, b2p[TMP_REG_2]); \ + } else \ + PPC_LD(r, base, i); \ + } while(0) +#define PPC_BPF_STL(r, base, i) do { \ + if ((i) % 4) { \ + PPC_LI(b2p[TMP_REG_2], (i)); \ + PPC_STDX(r, base, b2p[TMP_REG_2]); \ + } else \ + PPC_STD(r, base, i); \ + } while(0) +#define PPC_BPF_STLU(r, base, i) do { PPC_STDU(r, base, i); } while(0) + #define SEEN_FUNC 0x1000 /* might call external helpers */ #define SEEN_STACK 0x2000 /* uses BPF stack */ #define SEEN_TAILCALL 0x4000 /* uses tail calls */ diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c @@ -252,7 +252,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 * if (tail_call_cnt > MAX_TAIL_CALL_CNT) * goto out; */ - PPC_LD(b2p[TMP_REG_1], 1, bpf_jit_stack_tailcallcnt(ctx)); + PPC_BPF_LL(b2p[TMP_REG_1], 1, bpf_jit_stack_tailcallcnt(ctx)); PPC_CMPLWI(b2p[TMP_REG_1], MAX_TAIL_CALL_CNT); PPC_BCC(COND_GT, out); @@ -265,7 +265,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 /* prog = array->ptrs[index]; */ PPC_MULI(b2p[TMP_REG_1], b2p_index, 8); PPC_ADD(b2p[TMP_REG_1], b2p[TMP_REG_1], b2p_bpf_array); - PPC_LD(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_array, ptrs)); + PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_array, ptrs)); /* * if (prog == NULL) @@ -275,7 +275,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 PPC_BCC(COND_EQ, out); /* goto *(prog->bpf_func + prologue_size); */ - PPC_LD(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_prog, bpf_func)); + PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_prog, bpf_func)); #ifdef PPC64_ELF_ABI_v1 /* skip past the function descriptor */ PPC_ADDI(b2p[TMP_REG_1], b2p[TMP_REG_1], @@ -606,7 +606,7 @@ bpf_alu32_trunc: * the instructions generated will remain the * same across all passes */ - 
PPC_STD(dst_reg, 1, bpf_jit_stack_local(ctx)); + PPC_BPF_STL(dst_reg, 1, bpf_jit_stack_local(ctx)); PPC_ADDI(b2p[TMP_REG_1], 1, bpf_jit_stack_local(ctx)); PPC_LDBRX(dst_reg, 0, b2p[TMP_REG_1]); break; @@ -662,7 +662,7 @@ emit_clear: PPC_LI32(b2p[TMP_REG_1], imm); src_reg = b2p[TMP_REG_1]; } - PPC_STD(src_reg, dst_reg, off); + PPC_BPF_STL(src_reg, dst_reg, off); break; /* @@ -709,7 +709,7 @@ emit_clear: break; /* dst = *(u64 *)(ul) (src + off) */ case BPF_LDX | BPF_MEM | BPF_DW: - PPC_LD(dst_reg, src_reg, off); + PPC_BPF_LL(dst_reg, src_reg, off); break; /* diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c b/drivers/isdn/hardware/mISDN/hfcmulti.c @@ -4365,7 +4365,8 @@ setup_pci(struct hfc_multi *hc, struct pci_dev *pdev, if (m->clock2) test_and_set_bit(HFC_CHIP_CLOCK2, &hc->chip); - if (ent->device == 0xB410) { + if (ent->vendor == PCI_VENDOR_ID_DIGIUM && + ent->device == PCI_DEVICE_ID_DIGIUM_HFC4S) { test_and_set_bit(HFC_CHIP_B410P, &hc->chip); test_and_set_bit(HFC_CHIP_PCM_MASTER, &hc->chip); test_and_clear_bit(HFC_CHIP_PCM_SLAVE, &hc->chip); diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig @@ -216,8 +216,8 @@ config GENEVE config GTP tristate "GPRS Tunneling Protocol datapath (GTP-U)" - depends on INET && NET_UDP_TUNNEL - select NET_IP_TUNNEL + depends on INET + select NET_UDP_TUNNEL ---help--- This allows one to create gtp virtual interfaces that provide the GPRS Tunneling Protocol datapath (GTP-U). This tunneling protocol diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c @@ -481,6 +481,155 @@ qca8k_port_set_status(struct qca8k_priv *priv, int port, int enable) qca8k_reg_clear(priv, QCA8K_REG_PORT_STATUS(port), mask); } +static u32 +qca8k_port_to_phy(int port) +{ + /* From Andrew Lunn: + * Port 0 has no internal phy. + * Port 1 has an internal PHY at MDIO address 0. + * Port 2 has an internal PHY at MDIO address 1. + * ... + * Port 5 has an internal PHY at MDIO address 4. + * Port 6 has no internal PHY. 
+ */ + + return port - 1; +} + +static int +qca8k_mdio_write(struct qca8k_priv *priv, int port, u32 regnum, u16 data) +{ + u32 phy, val; + + if (regnum >= QCA8K_MDIO_MASTER_MAX_REG) + return -EINVAL; + + /* callee is responsible for not passing bad ports, + * but we still would like to make spills impossible. + */ + phy = qca8k_port_to_phy(port) % PHY_MAX_ADDR; + val = QCA8K_MDIO_MASTER_BUSY | QCA8K_MDIO_MASTER_EN | + QCA8K_MDIO_MASTER_WRITE | QCA8K_MDIO_MASTER_PHY_ADDR(phy) | + QCA8K_MDIO_MASTER_REG_ADDR(regnum) | + QCA8K_MDIO_MASTER_DATA(data); + + qca8k_write(priv, QCA8K_MDIO_MASTER_CTRL, val); + + return qca8k_busy_wait(priv, QCA8K_MDIO_MASTER_CTRL, + QCA8K_MDIO_MASTER_BUSY); +} + +static int +qca8k_mdio_read(struct qca8k_priv *priv, int port, u32 regnum) +{ + u32 phy, val; + + if (regnum >= QCA8K_MDIO_MASTER_MAX_REG) + return -EINVAL; + + /* callee is responsible for not passing bad ports, + * but we still would like to make spills impossible. + */ + phy = qca8k_port_to_phy(port) % PHY_MAX_ADDR; + val = QCA8K_MDIO_MASTER_BUSY | QCA8K_MDIO_MASTER_EN | + QCA8K_MDIO_MASTER_READ | QCA8K_MDIO_MASTER_PHY_ADDR(phy) | + QCA8K_MDIO_MASTER_REG_ADDR(regnum); + + qca8k_write(priv, QCA8K_MDIO_MASTER_CTRL, val); + + if (qca8k_busy_wait(priv, QCA8K_MDIO_MASTER_CTRL, + QCA8K_MDIO_MASTER_BUSY)) + return -ETIMEDOUT; + + val = (qca8k_read(priv, QCA8K_MDIO_MASTER_CTRL) & + QCA8K_MDIO_MASTER_DATA_MASK); + + return val; +} + +static int +qca8k_phy_write(struct dsa_switch *ds, int port, int regnum, u16 data) +{ + struct qca8k_priv *priv = ds->priv; + + return qca8k_mdio_write(priv, port, regnum, data); +} + +static int +qca8k_phy_read(struct dsa_switch *ds, int port, int regnum) +{ + struct qca8k_priv *priv = ds->priv; + int ret; + + ret = qca8k_mdio_read(priv, port, regnum); + + if (ret < 0) + return 0xffff; + + return ret; +} + +static int +qca8k_setup_mdio_bus(struct qca8k_priv *priv) +{ + u32 internal_mdio_mask = 0, external_mdio_mask = 0, reg; + struct device_node *ports, 
*port; + int err; + + ports = of_get_child_by_name(priv->dev->of_node, "ports"); + if (!ports) + return -EINVAL; + + for_each_available_child_of_node(ports, port) { + err = of_property_read_u32(port, "reg", &reg); + if (err) + return err; + + if (!dsa_is_user_port(priv->ds, reg)) + continue; + + if (of_property_read_bool(port, "phy-handle")) + external_mdio_mask |= BIT(reg); + else + internal_mdio_mask |= BIT(reg); + } + + if (!external_mdio_mask && !internal_mdio_mask) { + dev_err(priv->dev, "no PHYs are defined.\n"); + return -EINVAL; + } + + /* The QCA8K_MDIO_MASTER_EN Bit, which grants access to PHYs through + * the MDIO_MASTER register also _disconnects_ the external MDC + * passthrough to the internal PHYs. It's not possible to use both + * configurations at the same time! + * + * Because this came up during the review process: + * If the external mdio-bus driver is capable magically disabling + * the QCA8K_MDIO_MASTER_EN and mutex/spin-locking out the qca8k's + * accessors for the time being, it would be possible to pull this + * off. + */ + if (!!external_mdio_mask && !!internal_mdio_mask) { + dev_err(priv->dev, "either internal or external mdio bus configuration is supported.\n"); + return -EINVAL; + } + + if (external_mdio_mask) { + /* Make sure to disable the internal mdio bus in cases + * a dt-overlay and driver reload changed the configuration + */ + + qca8k_reg_clear(priv, QCA8K_MDIO_MASTER_CTRL, + QCA8K_MDIO_MASTER_EN); + return 0; + } + + priv->ops.phy_read = qca8k_phy_read; + priv->ops.phy_write = qca8k_phy_write; + return 0; +} + static int qca8k_setup(struct dsa_switch *ds) { @@ -502,6 +651,10 @@ qca8k_setup(struct dsa_switch *ds) if (IS_ERR(priv->regmap)) pr_warn("regmap initialization failed"); + ret = qca8k_setup_mdio_bus(priv); + if (ret) + return ret; + /* Initialize CPU port pad mode (xMII type, delays...) 
*/
 	phy_mode = of_get_phy_mode(ds->ports[QCA8K_CPU_PORT].dn);
 	if (phy_mode < 0) {
@@ -624,22 +777,6 @@ qca8k_adjust_link(struct dsa_switch *ds, int port, struct phy_device *phy)
 	qca8k_port_set_status(priv, port, 1);
 }

-static int
-qca8k_phy_read(struct dsa_switch *ds, int phy, int regnum)
-{
-	struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-
-	return mdiobus_read(priv->bus, phy, regnum);
-}
-
-static int
-qca8k_phy_write(struct dsa_switch *ds, int phy, int regnum, u16 val)
-{
-	struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-
-	return mdiobus_write(priv->bus, phy, regnum, val);
-}
-
 static void
 qca8k_get_strings(struct dsa_switch *ds, int port, u32 stringset, uint8_t *data)
 {
@@ -879,8 +1016,6 @@ static const struct dsa_switch_ops qca8k_switch_ops = {
 	.setup = qca8k_setup,
 	.adjust_link = qca8k_adjust_link,
 	.get_strings = qca8k_get_strings,
-	.phy_read = qca8k_phy_read,
-	.phy_write = qca8k_phy_write,
 	.get_ethtool_stats = qca8k_get_ethtool_stats,
 	.get_sset_count = qca8k_get_sset_count,
 	.get_mac_eee = qca8k_get_mac_eee,
@@ -923,7 +1058,8 @@ qca8k_sw_probe(struct mdio_device *mdiodev)
 		return -ENOMEM;

 	priv->ds->priv = priv;
-	priv->ds->ops = &qca8k_switch_ops;
+	priv->ops = qca8k_switch_ops;
+	priv->ds->ops = &priv->ops;
 	mutex_init(&priv->reg_mutex);
 	dev_set_drvdata(&mdiodev->dev, priv);

diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h
@@ -49,6 +49,18 @@
 #define QCA8K_MIB_FLUSH BIT(24)
 #define QCA8K_MIB_CPU_KEEP BIT(20)
 #define QCA8K_MIB_BUSY BIT(17)
+#define QCA8K_MDIO_MASTER_CTRL 0x3c
+#define QCA8K_MDIO_MASTER_BUSY BIT(31)
+#define QCA8K_MDIO_MASTER_EN BIT(30)
+#define QCA8K_MDIO_MASTER_READ BIT(27)
+#define QCA8K_MDIO_MASTER_WRITE 0
+#define QCA8K_MDIO_MASTER_SUP_PRE BIT(26)
+#define QCA8K_MDIO_MASTER_PHY_ADDR(x) ((x) << 21)
+#define QCA8K_MDIO_MASTER_REG_ADDR(x) ((x) << 16)
+#define QCA8K_MDIO_MASTER_DATA(x) (x)
+#define QCA8K_MDIO_MASTER_DATA_MASK GENMASK(15, 0)
+#define QCA8K_MDIO_MASTER_MAX_PORTS 5
+#define QCA8K_MDIO_MASTER_MAX_REG 32
 #define QCA8K_GOL_MAC_ADDR0 0x60
 #define QCA8K_GOL_MAC_ADDR1 0x64
 #define QCA8K_REG_PORT_STATUS(_i) (0x07c + (_i) * 4)
@@ -169,6 +181,7 @@ struct qca8k_priv {
 	struct dsa_switch *ds;
 	struct mutex reg_mutex;
 	struct device *dev;
+	struct dsa_switch_ops ops;
 };

 struct qca8k_mib_desc {
diff --git a/drivers/net/ethernet/3com/3c515.c b/drivers/net/ethernet/3com/3c515.c
@@ -1521,7 +1521,7 @@ static void update_stats(int ioaddr, struct net_device *dev)
 static void set_rx_mode(struct net_device *dev)
 {
 	int ioaddr = dev->base_addr;
-	short new_mode;
+	unsigned short new_mode;

 	if (dev->flags & IFF_PROMISC) {
 		if (corkscrew_debug > 3)
diff --git a/drivers/net/ethernet/8390/mac8390.c b/drivers/net/ethernet/8390/mac8390.c
@@ -153,8 +153,6 @@ static void dayna_block_input(struct net_device *dev, int count,
 static void dayna_block_output(struct net_device *dev, int count,
 			       const unsigned char *buf, int start_page);

-#define memcmp_withio(a, b, c) memcmp((a), (void *)(b), (c))
-
 /* Slow Sane (16-bit chunk memory read/write) Cabletron uses this */
 static void slow_sane_get_8390_hdr(struct net_device *dev,
 				   struct e8390_pkt_hdr *hdr, int ring_page);
@@ -233,19 +231,26 @@ static enum mac8390_type mac8390_ident(struct nubus_rsrc *fres)

 static enum mac8390_access mac8390_testio(unsigned long membase)
 {
-	unsigned long outdata = 0xA5A0B5B0;
-	unsigned long indata = 0x00000000;
+	u32 outdata = 0xA5A0B5B0;
+	u32 indata = 0;
+
 	/* Try writing 32 bits */
-	memcpy_toio((void __iomem *)membase, &outdata, 4);
-	/* Now compare them */
-	if (memcmp_withio(&outdata, membase, 4) == 0)
+	nubus_writel(outdata, membase);
+	/* Now read it back */
+	indata = nubus_readl(membase);
+	if (outdata == indata)
 		return ACCESS_32;
+
+	outdata = 0xC5C0D5D0;
+	indata = 0;
+
 	/* Write 16 bit output */
 	word_memcpy_tocard(membase, &outdata, 4);
 	/* Now read it back */
 	word_memcpy_fromcard(&indata, membase, 4);
 	if (outdata == indata)
 		return ACCESS_16;
+
 	return ACCESS_UNKNOWN;
 }

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -186,11 +186,12 @@ static void aq_rx_checksum(struct aq_ring_s *self,
 	}
 	if (buff->is_ip_cso) {
 		__skb_incr_checksum_unnecessary(skb);
-		if (buff->is_udp_cso || buff->is_tcp_cso)
-			__skb_incr_checksum_unnecessary(skb);
 	} else {
 		skb->ip_summed = CHECKSUM_NONE;
 	}
+
+	if (buff->is_udp_cso || buff->is_tcp_cso)
+		__skb_incr_checksum_unnecessary(skb);
 }

 #define AQ_SKB_ALIGN SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
@@ -3370,14 +3370,20 @@ static int macb_clk_init(struct platform_device *pdev, struct clk **pclk,
 		*hclk = devm_clk_get(&pdev->dev, "hclk");
 	}

-	if (IS_ERR(*pclk)) {
+	if (IS_ERR_OR_NULL(*pclk)) {
 		err = PTR_ERR(*pclk);
+		if (!err)
+			err = -ENODEV;
+
 		dev_err(&pdev->dev, "failed to get macb_clk (%u)\n", err);
 		return err;
 	}

-	if (IS_ERR(*hclk)) {
+	if (IS_ERR_OR_NULL(*hclk)) {
 		err = PTR_ERR(*hclk);
+		if (!err)
+			err = -ENODEV;
+
 		dev_err(&pdev->dev, "failed to get hclk (%u)\n", err);
 		return err;
 	}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -2620,7 +2620,7 @@ static inline struct port_info *ethqset2pinfo(struct adapter *adap, int qset)
 	}

 	/* should never happen!
*/
-	BUG_ON(1);
+	BUG();
 	return NULL;
 }

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -476,7 +476,7 @@ static inline int get_buf_size(struct adapter *adapter,
 		break;

 	default:
-		BUG_ON(1);
+		BUG();
 	}

 	return buf_size;
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -815,6 +815,14 @@ static netdev_tx_t dpaa2_eth_tx(struct sk_buff *skb, struct net_device *net_dev)
 	 */
 	queue_mapping = skb_get_queue_mapping(skb);
 	fq = &priv->fq[queue_mapping];
+
+	fd_len = dpaa2_fd_get_len(&fd);
+	nq = netdev_get_tx_queue(net_dev, queue_mapping);
+	netdev_tx_sent_queue(nq, fd_len);
+
+	/* Everything that happens after this enqueues might race with
+	 * the Tx confirmation callback for this frame
+	 */
 	for (i = 0; i < DPAA2_ETH_ENQUEUE_RETRIES; i++) {
 		err = priv->enqueue(priv, fq, &fd, 0);
 		if (err != -EBUSY)
@@ -825,13 +833,10 @@ static netdev_tx_t dpaa2_eth_tx(struct sk_buff *skb, struct net_device *net_dev)
 		percpu_stats->tx_errors++;
 		/* Clean up everything, including freeing the skb */
 		free_tx_fd(priv, fq, &fd, false);
+		netdev_tx_completed_queue(nq, 1, fd_len);
 	} else {
-		fd_len = dpaa2_fd_get_len(&fd);
 		percpu_stats->tx_packets++;
 		percpu_stats->tx_bytes += fd_len;
-
-		nq = netdev_get_tx_queue(net_dev, queue_mapping);
-		netdev_tx_sent_queue(nq, fd_len);
 	}

 	return NETDEV_TX_OK;
@@ -1817,7 +1822,7 @@ static int dpaa2_eth_xdp_xmit_frame(struct net_device *net_dev,
 	dpaa2_fd_set_format(&fd, dpaa2_fd_single);
 	dpaa2_fd_set_ctrl(&fd, FD_CTRL_PTA);

-	fq = &priv->fq[smp_processor_id()];
+	fq = &priv->fq[smp_processor_id() % dpaa2_eth_queue_count(priv)];
 	for (i = 0; i < DPAA2_ETH_ENQUEUE_RETRIES; i++) {
 		err = priv->enqueue(priv, fq, &fd, 0);
 		if (err != -EBUSY)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -22,6 +22,7 @@
 #include "hns3_enet.h"

 #define hns3_set_field(origin, shift, val) ((origin) |= ((val) << (shift)))
+#define hns3_tx_bd_count(S) DIV_ROUND_UP(S, HNS3_MAX_BD_SIZE)

 static void hns3_clear_all_ring(struct hnae3_handle *h);
 static void hns3_force_clear_all_rx_ring(struct hnae3_handle *h);
@@ -1079,7 +1080,7 @@ static int hns3_fill_desc(struct hns3_enet_ring *ring, void *priv,

 	desc_cb->length = size;

-	frag_buf_num = (size + HNS3_MAX_BD_SIZE - 1) >> HNS3_MAX_BD_SIZE_OFFSET;
+	frag_buf_num = hns3_tx_bd_count(size);
 	sizeoflast = size & HNS3_TX_LAST_SIZE_M;
 	sizeoflast = sizeoflast ? sizeoflast : HNS3_MAX_BD_SIZE;

@@ -1124,14 +1125,13 @@ static int hns3_nic_maybe_stop_tso(struct sk_buff **out_skb, int *bnum,
 	int i;

 	size = skb_headlen(skb);
-	buf_num = (size + HNS3_MAX_BD_SIZE - 1) >> HNS3_MAX_BD_SIZE_OFFSET;
+	buf_num = hns3_tx_bd_count(size);

 	frag_num = skb_shinfo(skb)->nr_frags;
 	for (i = 0; i < frag_num; i++) {
 		frag = &skb_shinfo(skb)->frags[i];
 		size = skb_frag_size(frag);
-		bdnum_for_frag = (size + HNS3_MAX_BD_SIZE - 1) >>
-				 HNS3_MAX_BD_SIZE_OFFSET;
+		bdnum_for_frag = hns3_tx_bd_count(size);
 		if (unlikely(bdnum_for_frag > HNS3_MAX_BD_PER_FRAG))
 			return -ENOMEM;

@@ -1139,8 +1139,7 @@ static int hns3_nic_maybe_stop_tso(struct sk_buff **out_skb, int *bnum,
 	}

 	if (unlikely(buf_num > HNS3_MAX_BD_PER_FRAG)) {
-		buf_num = (skb->len + HNS3_MAX_BD_SIZE - 1) >>
-			  HNS3_MAX_BD_SIZE_OFFSET;
+		buf_num = hns3_tx_bd_count(skb->len);
 		if (ring_space(ring) < buf_num)
 			return -EBUSY;
 		/* manual split the send packet */
@@ -1169,7 +1168,7 @@ static int hns3_nic_maybe_stop_tx(struct sk_buff **out_skb, int *bnum,
 	buf_num = skb_shinfo(skb)->nr_frags + 1;

 	if (unlikely(buf_num > HNS3_MAX_BD_PER_FRAG)) {
-		buf_num = (skb->len + HNS3_MAX_BD_SIZE - 1) / HNS3_MAX_BD_SIZE;
+		buf_num = hns3_tx_bd_count(skb->len);
 		if (ring_space(ring) < buf_num)
 			return -EBUSY;
 		/* manual split the send packet */
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -193,7 +193,6 @@ enum hns3_nic_state {
 #define HNS3_VECTOR_INITED 1

 #define HNS3_MAX_BD_SIZE 65535
-#define HNS3_MAX_BD_SIZE_OFFSET 16
 #define HNS3_MAX_BD_PER_FRAG 8
 #define HNS3_MAX_BD_PER_PKT MAX_SKB_FRAGS

diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -3160,6 +3160,7 @@ static ssize_t ehea_probe_port(struct device *dev,

 	if (ehea_add_adapter_mr(adapter)) {
 		pr_err("creating MR failed\n");
+		of_node_put(eth_dn);
 		return -EIO;
 	}

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_env.c b/drivers/net/ethernet/mellanox/mlxsw/core_env.c
@@ -113,7 +113,7 @@ int mlxsw_env_module_temp_thresholds_get(struct mlxsw_core *core, int module,
 		return 0;
 	default:
 		/* Do not consider thresholds for zero temperature. */
-		if (!MLXSW_REG_MTMP_TEMP_TO_MC(module_temp)) {
+		if (MLXSW_REG_MTMP_TEMP_TO_MC(module_temp) == 0) {
 			*temp = 0;
 			return 0;
 		}
diff --git a/drivers/net/ethernet/micrel/ks8851.c b/drivers/net/ethernet/micrel/ks8851.c
@@ -142,6 +142,12 @@ struct ks8851_net {

 static int msg_enable;

+/* SPI frame opcodes */
+#define KS_SPIOP_RD (0x00)
+#define KS_SPIOP_WR (0x40)
+#define KS_SPIOP_RXFIFO (0x80)
+#define KS_SPIOP_TXFIFO (0xC0)
+
 /* shift for byte-enable data */
 #define BYTE_EN(_x) ((_x) << 2)

@@ -535,9 +541,8 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
 		/* set dma read address */
 		ks8851_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI | 0x00);

-		/* start the packet dma process, and set auto-dequeue rx */
-		ks8851_wrreg16(ks, KS_RXQCR,
-			       ks->rc_rxqcr | RXQCR_SDA | RXQCR_ADRFE);
+		/* start DMA access */
+		ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_SDA);

 		if (rxlen > 4) {
 			unsigned int rxalign;
@@ -568,7 +573,8 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
 			}
 		}

-		ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr);
+		/* end DMA access and dequeue packet */
+		ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_RRXEF);
 	}
 }

@@ -785,6 +791,15 @@ static void ks8851_tx_work(struct work_struct *work)
 static int ks8851_net_open(struct net_device *dev)
 {
 	struct ks8851_net *ks = netdev_priv(dev);
+	int ret;
+
+	ret = request_threaded_irq(dev->irq, NULL, ks8851_irq,
+				   IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+				   dev->name, ks);
+	if (ret < 0) {
+		netdev_err(dev, "failed to get irq\n");
+		return ret;
+	}

 	/* lock the card, even if we may not actually be doing anything
 	 * else at the moment */
@@ -849,6 +864,7 @@ static int ks8851_net_open(struct net_device *dev)
 	netif_dbg(ks, ifup, ks->netdev, "network device up\n");

 	mutex_unlock(&ks->lock);
+	mii_check_link(&ks->mii);
 	return 0;
 }

@@ -899,6 +915,8 @@ static int ks8851_net_stop(struct net_device *dev)
 		dev_kfree_skb(txb);
 	}

+	free_irq(dev->irq, ks);
+
 	return 0;
 }

@@ -1508,6 +1526,7 @@ static int ks8851_probe(struct spi_device *spi)

 	spi_set_drvdata(spi, ks);

+	netif_carrier_off(ks->netdev);
 	ndev->if_port = IF_PORT_100BASET;
 	ndev->netdev_ops = &ks8851_netdev_ops;
 	ndev->irq = spi->irq;
@@ -1529,14 +1548,6 @@ static int ks8851_probe(struct spi_device *spi)
 	ks8851_read_selftest(ks);
 	ks8851_init_mac(ks);

-	ret = request_threaded_irq(spi->irq, NULL, ks8851_irq,
-				   IRQF_TRIGGER_LOW | IRQF_ONESHOT,
-				   ndev->name, ks);
-	if (ret < 0) {
-		dev_err(&spi->dev, "failed to get irq\n");
-		goto err_irq;
-	}
-
 	ret = register_netdev(ndev);
 	if (ret) {
 		dev_err(&spi->dev, "failed to register network device\n");
@@ -1549,14 +1560,10 @@ static int ks8851_probe(struct spi_device *spi)

 	return 0;

-
 err_netdev:
-	free_irq(ndev->irq, ks);
-
-err_irq:
+err_id:
 	if (gpio_is_valid(gpio))
 		gpio_set_value(gpio, 0);
-err_id:
 	regulator_disable(ks->vdd_reg);
 err_reg:
 	regulator_disable(ks->vdd_io);
@@ -1574,7 +1581,6 @@ static int ks8851_remove(struct spi_device *spi)
 	dev_info(&spi->dev, "remove\n");

 	unregister_netdev(priv->netdev);
-	free_irq(spi->irq, priv);
 	if (gpio_is_valid(priv->gpio))
 		gpio_set_value(priv->gpio, 0);
 	regulator_disable(priv->vdd_reg);
diff --git a/drivers/net/ethernet/micrel/ks8851.h b/drivers/net/ethernet/micrel/ks8851.h
@@ -11,9 +11,15 @@
 */

 #define KS_CCR 0x08
+#define CCR_LE (1 << 10) /* KSZ8851-16MLL */
 #define CCR_EEPROM (1 << 9)
-#define CCR_SPI (1 << 8)
-#define CCR_32PIN (1 << 0)
+#define CCR_SPI (1 << 8) /* KSZ8851SNL */
+#define CCR_8BIT (1 << 7) /* KSZ8851-16MLL */
+#define CCR_16BIT (1 << 6) /* KSZ8851-16MLL */
+#define CCR_32BIT (1 << 5) /* KSZ8851-16MLL */
+#define CCR_SHARED (1 << 4) /* KSZ8851-16MLL */
+#define CCR_48PIN (1 << 1) /* KSZ8851-16MLL */
+#define CCR_32PIN (1 << 0) /* KSZ8851SNL */

 /* MAC address registers */
 #define KS_MAR(_m) (0x15 - (_m))
@@ -112,13 +118,13 @@
 #define RXCR1_RXE (1 << 0)

 #define KS_RXCR2 0x76
-#define RXCR2_SRDBL_MASK (0x7 << 5)
-#define RXCR2_SRDBL_SHIFT (5)
-#define RXCR2_SRDBL_4B (0x0 << 5)
-#define RXCR2_SRDBL_8B (0x1 << 5)
-#define RXCR2_SRDBL_16B (0x2 << 5)
-#define RXCR2_SRDBL_32B (0x3 << 5)
-#define RXCR2_SRDBL_FRAME (0x4 << 5)
+#define RXCR2_SRDBL_MASK (0x7 << 5) /* KSZ8851SNL */
+#define RXCR2_SRDBL_SHIFT (5) /* KSZ8851SNL */
+#define RXCR2_SRDBL_4B (0x0 << 5) /* KSZ8851SNL */
+#define RXCR2_SRDBL_8B (0x1 << 5) /* KSZ8851SNL */
+#define RXCR2_SRDBL_16B (0x2 << 5) /* KSZ8851SNL */
+#define RXCR2_SRDBL_32B (0x3 << 5) /* KSZ8851SNL */
+#define RXCR2_SRDBL_FRAME (0x4 << 5) /* KSZ8851SNL */
 #define RXCR2_IUFFP (1 << 4)
 #define RXCR2_RXIUFCEZ (1 << 3)
 #define RXCR2_UDPLFE (1 << 2)
@@ -143,8 +149,10 @@
 #define RXFSHR_RXCE (1 << 0)

 #define KS_RXFHBCR 0x7E
+#define RXFHBCR_CNT_MASK (0xfff << 0)
+
 #define KS_TXQCR 0x80
-#define TXQCR_AETFE (1 << 2)
+#define TXQCR_AETFE (1 << 2) /* KSZ8851SNL */
 #define TXQCR_TXQMAM (1 << 1)
 #define TXQCR_METFE (1 << 0)

@@ -167,6 +175,10 @@

 #define KS_RXFDPR 0x86
 #define RXFDPR_RXFPAI (1 << 14)
+#define RXFDPR_WST (1 << 12) /* KSZ8851-16MLL */
+#define RXFDPR_EMS (1 << 11) /* KSZ8851-16MLL */
+#define RXFDPR_RXFP_MASK (0x7ff << 0)
+#define RXFDPR_RXFP_SHIFT (0)

 #define KS_RXDTTR 0x8C
 #define KS_RXDBCTR 0x8E
@@ -184,7 +196,7 @@
 #define IRQ_RXMPDI (1 << 4)
 #define IRQ_LDI (1 << 3)
 #define IRQ_EDI (1 << 2)
-#define IRQ_SPIBEI (1 << 1)
+#define IRQ_SPIBEI (1 << 1) /* KSZ8851SNL */
 #define IRQ_DEDI (1 << 0)

 #define KS_RXFCTR 0x9C
@@ -257,42 +269,37 @@
 #define KS_P1ANLPR 0xEE

 #define KS_P1SCLMD 0xF4
-#define P1SCLMD_LEDOFF (1 << 15)
-#define P1SCLMD_TXIDS (1 << 14)
-#define P1SCLMD_RESTARTAN (1 << 13)
-#define P1SCLMD_DISAUTOMDIX (1 << 10)
-#define P1SCLMD_FORCEMDIX (1 << 9)
-#define P1SCLMD_AUTONEGEN (1 << 7)
-#define P1SCLMD_FORCE100 (1 << 6)
-#define P1SCLMD_FORCEFDX (1 << 5)
-#define P1SCLMD_ADV_FLOW (1 << 4)
-#define P1SCLMD_ADV_100BT_FDX (1 << 3)
-#define P1SCLMD_ADV_100BT_HDX (1 << 2)
-#define P1SCLMD_ADV_10BT_FDX (1 << 1)
-#define P1SCLMD_ADV_10BT_HDX (1 << 0)

 #define KS_P1CR 0xF6
-#define P1CR_HP_MDIX (1 << 15)
-#define P1CR_REV_POL (1 << 13)
-#define P1CR_OP_100M (1 << 10)
-#define P1CR_OP_FDX (1 << 9)
-#define P1CR_OP_MDI (1 << 7)
-#define P1CR_AN_DONE (1 << 6)
-#define P1CR_LINK_GOOD (1 << 5)
-#define P1CR_PNTR_FLOW (1 << 4)
-#define P1CR_PNTR_100BT_FDX (1 << 3)
-#define P1CR_PNTR_100BT_HDX (1 << 2)
-#define P1CR_PNTR_10BT_FDX (1 << 1)
-#define P1CR_PNTR_10BT_HDX (1 << 0)
+#define P1CR_LEDOFF (1 << 15)
+#define P1CR_TXIDS (1 << 14)
+#define P1CR_RESTARTAN (1 << 13)
+#define P1CR_DISAUTOMDIX (1 << 10)
+#define P1CR_FORCEMDIX (1 << 9)
+#define P1CR_AUTONEGEN (1 << 7)
+#define P1CR_FORCE100 (1 << 6)
+#define P1CR_FORCEFDX (1 << 5)
+#define P1CR_ADV_FLOW (1 << 4)
+#define P1CR_ADV_100BT_FDX (1 << 3)
+#define P1CR_ADV_100BT_HDX (1 << 2)
+#define P1CR_ADV_10BT_FDX (1 << 1)
+#define P1CR_ADV_10BT_HDX (1 << 0)
+
+#define KS_P1SR 0xF8
+#define P1SR_HP_MDIX (1 << 15)
+#define P1SR_REV_POL (1 << 13)
+#define P1SR_OP_100M (1 << 10)
+#define P1SR_OP_FDX (1 << 9)
+#define P1SR_OP_MDI (1 << 7)
+#define P1SR_AN_DONE (1 << 6)
+#define P1SR_LINK_GOOD (1 << 5)
+#define P1SR_PNTR_FLOW (1 << 4)
+#define P1SR_PNTR_100BT_FDX (1 << 3)
+#define P1SR_PNTR_100BT_HDX (1 << 2)
+#define P1SR_PNTR_10BT_FDX (1 << 1)
+#define P1SR_PNTR_10BT_HDX (1 << 0)

 /* TX Frame control */
-
 #define TXFR_TXIC (1 << 15)
 #define TXFR_TXFID_MASK (0x3f << 0)
 #define TXFR_TXFID_SHIFT (0)
-
-/* SPI frame opcodes */
-#define KS_SPIOP_RD (0x00)
-#define KS_SPIOP_WR (0x40)
-#define KS_SPIOP_RXFIFO (0x80)
-#define KS_SPIOP_TXFIFO (0xC0)
diff --git a/drivers/net/ethernet/micrel/ks8851_mll.c b/drivers/net/ethernet/micrel/ks8851_mll.c
@@ -40,6 +40,8 @@
 #include <linux/of_device.h>
 #include <linux/of_net.h>

+#include "ks8851.h"
+
 #define DRV_NAME "ks8851_mll"

 static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0x00, 0x10, 0xA1, 0x86, 0x95, 0x11 };
@@ -48,319 +50,10 @@ static u8 KS_DEFAULT_MAC_ADDRESS[] = { 0x00, 0x10, 0xA1, 0x86, 0x95, 0x11 };
 #define TX_BUF_SIZE 2000
 #define RX_BUF_SIZE 2000

-#define KS_CCR 0x08
-#define CCR_EEPROM (1 << 9)
-#define CCR_SPI (1 << 8)
-#define CCR_8BIT (1 << 7)
-#define CCR_16BIT (1 << 6)
-#define CCR_32BIT (1 << 5)
-#define CCR_SHARED (1 << 4)
-#define CCR_32PIN (1 << 0)
-
-/* MAC address registers */
-#define KS_MARL 0x10
-#define KS_MARM 0x12
-#define KS_MARH 0x14
-
-#define KS_OBCR 0x20
-#define OBCR_ODS_16MA (1 << 6)
-
-#define KS_EEPCR 0x22
-#define EEPCR_EESA (1 << 4)
-#define EEPCR_EESB (1 << 3)
-#define EEPCR_EEDO (1 << 2)
-#define EEPCR_EESCK (1 << 1)
-#define EEPCR_EECS (1 << 0)
-
-#define KS_MBIR 0x24
-#define MBIR_TXMBF (1 << 12)
-#define MBIR_TXMBFA (1 << 11)
-#define MBIR_RXMBF (1 << 4)
-#define MBIR_RXMBFA (1 << 3)
-
-#define KS_GRR 0x26
-#define GRR_QMU (1 << 1)
-#define GRR_GSR (1 << 0)
-
-#define KS_WFCR 0x2A
-#define WFCR_MPRXE (1 << 7)
-#define WFCR_WF3E (1 << 3)
-#define WFCR_WF2E (1 << 2)
-#define WFCR_WF1E (1 << 1)
-#define WFCR_WF0E (1 << 0)
-
-#define KS_WF0CRC0 0x30
-#define KS_WF0CRC1 0x32
-#define KS_WF0BM0 0x34
-#define KS_WF0BM1 0x36
-#define KS_WF0BM2 0x38
-#define KS_WF0BM3 0x3A
-
-#define KS_WF1CRC0 0x40
-#define KS_WF1CRC1 0x42
-#define KS_WF1BM0 0x44
-#define KS_WF1BM1 0x46
-#define KS_WF1BM2 0x48
-#define KS_WF1BM3 0x4A
-
-#define KS_WF2CRC0 0x50
-#define KS_WF2CRC1 0x52
-#define KS_WF2BM0 0x54
-#define KS_WF2BM1 0x56
-#define KS_WF2BM2 0x58
-#define KS_WF2BM3 0x5A
-
-#define KS_WF3CRC0 0x60
-#define KS_WF3CRC1 0x62
-#define KS_WF3BM0 0x64
-#define KS_WF3BM1 0x66
-#define KS_WF3BM2 0x68
-#define KS_WF3BM3 0x6A
-
-#define KS_TXCR 0x70
-#define TXCR_TCGICMP (1 << 8)
-#define TXCR_TCGUDP (1 << 7)
-#define TXCR_TCGTCP (1 << 6)
-#define TXCR_TCGIP (1 << 5)
-#define TXCR_FTXQ (1 << 4)
-#define TXCR_TXFCE (1 << 3)
-#define TXCR_TXPE (1 << 2)
-#define TXCR_TXCRC (1 << 1)
-#define TXCR_TXE (1 << 0)
-
-#define KS_TXSR 0x72
-#define TXSR_TXLC (1 << 13)
-#define TXSR_TXMC (1 << 12)
-#define TXSR_TXFID_MASK (0x3f << 0)
-#define TXSR_TXFID_SHIFT (0)
-#define TXSR_TXFID_GET(_v) (((_v) >> 0) & 0x3f)
-
-
-#define KS_RXCR1 0x74
-#define RXCR1_FRXQ (1 << 15)
-#define RXCR1_RXUDPFCC (1 << 14)
-#define RXCR1_RXTCPFCC (1 << 13)
-#define RXCR1_RXIPFCC (1 << 12)
-#define RXCR1_RXPAFMA (1 << 11)
-#define RXCR1_RXFCE (1 << 10)
-#define RXCR1_RXEFE (1 << 9)
-#define RXCR1_RXMAFMA (1 << 8)
-#define RXCR1_RXBE (1 << 7)
-#define RXCR1_RXME (1 << 6)
-#define RXCR1_RXUE (1 << 5)
-#define RXCR1_RXAE (1 << 4)
-#define RXCR1_RXINVF (1 << 1)
-#define RXCR1_RXE (1 << 0)
 #define RXCR1_FILTER_MASK (RXCR1_RXINVF | RXCR1_RXAE | \
 				RXCR1_RXMAFMA | RXCR1_RXPAFMA)
-
-#define KS_RXCR2 0x76
-#define RXCR2_SRDBL_MASK (0x7 << 5)
-#define RXCR2_SRDBL_SHIFT (5)
-#define RXCR2_SRDBL_4B (0x0 << 5)
-#define RXCR2_SRDBL_8B (0x1 << 5)
-#define RXCR2_SRDBL_16B (0x2 << 5)
-#define RXCR2_SRDBL_32B (0x3 << 5)
-/* #define RXCR2_SRDBL_FRAME (0x4 << 5) */
-#define RXCR2_IUFFP (1 << 4)
-#define RXCR2_RXIUFCEZ (1 << 3)
-#define RXCR2_UDPLFE (1 << 2)
-#define RXCR2_RXICMPFCC (1 << 1)
-#define RXCR2_RXSAF (1 << 0)
-
-#define KS_TXMIR 0x78
-
-#define KS_RXFHSR 0x7C
-#define RXFSHR_RXFV (1 << 15)
-#define RXFSHR_RXICMPFCS (1 << 13)
-#define RXFSHR_RXIPFCS (1 << 12)
-#define RXFSHR_RXTCPFCS (1 << 11)
-#define RXFSHR_RXUDPFCS (1 << 10)
-#define RXFSHR_RXBF (1 << 7)
-#define RXFSHR_RXMF (1 << 6)
-#define RXFSHR_RXUF (1 << 5)
-#define RXFSHR_RXMR (1 << 4)
-#define RXFSHR_RXFT (1 << 3)
-#define RXFSHR_RXFTL (1 << 2)
-#define RXFSHR_RXRF (1 << 1)
-#define RXFSHR_RXCE (1 << 0)
-#define RXFSHR_ERR (RXFSHR_RXCE | RXFSHR_RXRF |\
-			RXFSHR_RXFTL | RXFSHR_RXMR |\
-			RXFSHR_RXICMPFCS | RXFSHR_RXIPFCS |\
-			RXFSHR_RXTCPFCS)
-#define KS_RXFHBCR 0x7E
-#define RXFHBCR_CNT_MASK 0x0FFF
-
-#define KS_TXQCR 0x80
-#define TXQCR_AETFE (1 << 2)
-#define TXQCR_TXQMAM (1 << 1)
-#define TXQCR_METFE (1 << 0)
-
-#define KS_RXQCR 0x82
-#define RXQCR_RXDTTS (1 << 12)
-#define RXQCR_RXDBCTS (1 << 11)
-#define RXQCR_RXFCTS (1 << 10)
-#define RXQCR_RXIPHTOE (1 << 9)
-#define RXQCR_RXDTTE (1 << 7)
-#define RXQCR_RXDBCTE (1 << 6)
-#define RXQCR_RXFCTE (1 << 5)
-#define RXQCR_ADRFE (1 << 4)
-#define RXQCR_SDA (1 << 3)
-#define RXQCR_RRXEF (1 << 0)
 #define RXQCR_CMD_CNTL (RXQCR_RXFCTE|RXQCR_ADRFE)

-#define KS_TXFDPR 0x84
-#define TXFDPR_TXFPAI (1 << 14)
-#define TXFDPR_TXFP_MASK (0x7ff << 0)
-#define TXFDPR_TXFP_SHIFT (0)
-
-#define KS_RXFDPR 0x86
-#define RXFDPR_RXFPAI (1 << 14)
-
-#define KS_RXDTTR 0x8C
-#define KS_RXDBCTR 0x8E
-
-#define KS_IER 0x90
-#define KS_ISR 0x92
-#define IRQ_LCI (1 << 15)
-#define IRQ_TXI (1 << 14)
-#define IRQ_RXI (1 << 13)
-#define IRQ_RXOI (1 << 11)
-#define IRQ_TXPSI (1 << 9)
-#define IRQ_RXPSI (1 << 8)
-#define IRQ_TXSAI (1 << 6)
-#define IRQ_RXWFDI (1 << 5)
-#define IRQ_RXMPDI (1 << 4)
-#define IRQ_LDI (1 << 3)
-#define IRQ_EDI (1 << 2)
-#define IRQ_SPIBEI (1 << 1)
-#define IRQ_DEDI (1 << 0)
-
-#define KS_RXFCTR 0x9C
-#define RXFCTR_THRESHOLD_MASK 0x00FF
-
-#define KS_RXFC 0x9D
-#define RXFCTR_RXFC_MASK (0xff << 8)
-#define RXFCTR_RXFC_SHIFT (8)
-#define RXFCTR_RXFC_GET(_v) (((_v) >> 8) & 0xff)
-#define RXFCTR_RXFCT_MASK (0xff << 0)
-#define RXFCTR_RXFCT_SHIFT (0)
-
-#define KS_TXNTFSR 0x9E
-
-#define KS_MAHTR0 0xA0
-#define KS_MAHTR1 0xA2
-#define KS_MAHTR2 0xA4
-#define KS_MAHTR3 0xA6
-
-#define KS_FCLWR 0xB0
-#define KS_FCHWR 0xB2
-#define KS_FCOWR 0xB4
-
-#define KS_CIDER 0xC0
-#define CIDER_ID 0x8870
-#define CIDER_REV_MASK (0x7 << 1)
-#define CIDER_REV_SHIFT (1)
-#define CIDER_REV_GET(_v) (((_v) >> 1) & 0x7)
-
-#define KS_CGCR 0xC6
-#define KS_IACR 0xC8
-#define IACR_RDEN (1 << 12)
-#define IACR_TSEL_MASK (0x3 << 10)
-#define IACR_TSEL_SHIFT (10)
-#define IACR_TSEL_MIB (0x3 << 10)
-#define IACR_ADDR_MASK (0x1f << 0)
-#define IACR_ADDR_SHIFT (0)
-
-#define KS_IADLR 0xD0
-#define KS_IAHDR 0xD2
-
-#define KS_PMECR 0xD4
-#define PMECR_PME_DELAY (1 << 14)
-#define PMECR_PME_POL (1 << 12)
-#define PMECR_WOL_WAKEUP (1 << 11)
-#define PMECR_WOL_MAGICPKT (1 << 10)
-#define PMECR_WOL_LINKUP (1 << 9)
-#define PMECR_WOL_ENERGY (1 << 8)
-#define PMECR_AUTO_WAKE_EN (1 << 7)
-#define PMECR_WAKEUP_NORMAL (1 << 6)
-#define PMECR_WKEVT_MASK (0xf << 2)
-#define PMECR_WKEVT_SHIFT (2)
-#define PMECR_WKEVT_GET(_v) (((_v) >> 2) & 0xf)
-#define PMECR_WKEVT_ENERGY (0x1 << 2)
-#define PMECR_WKEVT_LINK (0x2 << 2)
-#define PMECR_WKEVT_MAGICPKT (0x4 << 2)
-#define PMECR_WKEVT_FRAME (0x8 << 2)
-#define PMECR_PM_MASK (0x3 << 0)
-#define PMECR_PM_SHIFT (0)
-#define PMECR_PM_NORMAL (0x0 << 0)
-#define PMECR_PM_ENERGY (0x1 << 0)
-#define PMECR_PM_SOFTDOWN (0x2 << 0)
-#define PMECR_PM_POWERSAVE (0x3 << 0)
-
-/* Standard MII PHY data */
-#define KS_P1MBCR 0xE4
-#define P1MBCR_FORCE_FDX (1 << 8)
-
-#define KS_P1MBSR 0xE6
-#define P1MBSR_AN_COMPLETE (1 << 5)
-#define P1MBSR_AN_CAPABLE (1 << 3)
-#define P1MBSR_LINK_UP (1 << 2)
-
-#define KS_PHY1ILR 0xE8
-#define KS_PHY1IHR 0xEA
-#define KS_P1ANAR 0xEC
-#define KS_P1ANLPR 0xEE
-
-#define KS_P1SCLMD 0xF4
-#define P1SCLMD_LEDOFF (1 << 15)
-#define P1SCLMD_TXIDS (1 << 14)
-#define P1SCLMD_RESTARTAN (1 << 13)
-#define P1SCLMD_DISAUTOMDIX (1 << 10)
-#define P1SCLMD_FORCEMDIX (1 << 9)
-#define P1SCLMD_AUTONEGEN (1 << 7)
-#define P1SCLMD_FORCE100 (1 << 6)
-#define P1SCLMD_FORCEFDX (1 << 5)
-#define P1SCLMD_ADV_FLOW (1 << 4)
-#define P1SCLMD_ADV_100BT_FDX (1 << 3)
-#define P1SCLMD_ADV_100BT_HDX (1 << 2)
-#define P1SCLMD_ADV_10BT_FDX (1 << 1)
-#define P1SCLMD_ADV_10BT_HDX (1 << 0)
-
-#define KS_P1CR 0xF6
-#define P1CR_HP_MDIX (1 << 15)
-#define P1CR_REV_POL (1 << 13)
-#define P1CR_OP_100M (1 << 10)
-#define P1CR_OP_FDX (1 << 9)
-#define P1CR_OP_MDI (1 << 7)
-#define P1CR_AN_DONE (1 << 6)
-#define P1CR_LINK_GOOD (1 << 5)
-#define P1CR_PNTR_FLOW (1 << 4)
-#define P1CR_PNTR_100BT_FDX (1 << 3)
-#define P1CR_PNTR_100BT_HDX (1 << 2)
-#define P1CR_PNTR_10BT_FDX (1 << 1)
-#define P1CR_PNTR_10BT_HDX (1 << 0)
-
-/* TX Frame control */
-
-#define TXFR_TXIC (1 << 15)
-#define TXFR_TXFID_MASK (0x3f << 0)
-#define TXFR_TXFID_SHIFT (0)
-
-#define KS_P1SR 0xF8
-#define P1SR_HP_MDIX (1 << 15)
-#define P1SR_REV_POL (1 << 13)
-#define P1SR_OP_100M (1 << 10)
-#define P1SR_OP_FDX (1 << 9)
-#define P1SR_OP_MDI (1 << 7)
-#define P1SR_AN_DONE (1 << 6)
-#define P1SR_LINK_GOOD (1 << 5)
-#define P1SR_PNTR_FLOW (1 << 4)
-#define P1SR_PNTR_100BT_FDX (1 << 3)
-#define P1SR_PNTR_100BT_HDX (1 << 2)
-#define P1SR_PNTR_10BT_FDX (1 << 1)
-#define P1SR_PNTR_10BT_HDX (1 << 0)
-
 #define ENUM_BUS_NONE 0
 #define ENUM_BUS_8BIT 1
 #define ENUM_BUS_16BIT 2
@@ -1475,7 +1168,7 @@ static void ks_setup(struct ks_net *ks)
 	ks_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI);

 	/* Setup Receive Frame Threshold - 1 frame (RXFCTFC) */
-	ks_wrreg16(ks, KS_RXFCTR, 1 & RXFCTR_THRESHOLD_MASK);
+	ks_wrreg16(ks, KS_RXFCTR, 1 & RXFCTR_RXFCT_MASK);

 	/* Setup RxQ Command Control (RXQCR) */
 	ks->rc_rxqcr = RXQCR_CMD_CNTL;
@@ -1488,7 +1181,7 @@ static void ks_setup(struct ks_net *ks)
 	 */

 	w = ks_rdreg16(ks, KS_P1MBCR);
-	w &= ~P1MBCR_FORCE_FDX;
+	w &= ~BMCR_FULLDPLX;
 	ks_wrreg16(ks, KS_P1MBCR, w);

 	w = TXCR_TXFCE | TXCR_TXPE | TXCR_TXCRC | TXCR_TCGIP;
@@ -1629,7 +1322,7 @@ static int ks8851_probe(struct platform_device *pdev)
 	ks_setup_int(ks);

 	data = ks_rdreg16(ks, KS_OBCR);
-	ks_wrreg16(ks, KS_OBCR, data | OBCR_ODS_16MA);
+	ks_wrreg16(ks, KS_OBCR, data | OBCR_ODS_16mA);

 	/* overwriting the default MAC address */
 	if (pdev->dev.of_node) {
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c
@@ -1048,6 +1048,8 @@ int qlcnic_do_lb_test(struct qlcnic_adapter *adapter, u8 mode)

 	for (i = 0; i < QLCNIC_NUM_ILB_PKT; i++) {
 		skb = netdev_alloc_skb(adapter->netdev, QLCNIC_ILB_PKT_SIZE);
+		if (!skb)
+			break;
 		qlcnic_create_loopback_buff(skb->data, adapter->mac_addr);
 		skb_put(skb, QLCNIC_ILB_PKT_SIZE);
 		adapter->ahw->diag_cnt = 0;
diff --git a/drivers/net/ethernet/realtek/atp.c b/drivers/net/ethernet/realtek/atp.c
@@ -482,7 +482,7 @@ static void hardware_init(struct net_device *dev)
 	write_reg_high(ioaddr, IMR, ISRh_RxErr);

 	lp->tx_unit_busy = 0;
-	lp->pac_cnt_in_tx_buf = 0;
+	lp->pac_cnt_in_tx_buf = 0;
 	lp->saved_tx_size = 0;
 }

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
@@ -678,6 +678,7 @@ struct rtl8169_private {
 		struct work_struct work;
 	} wk;

+	unsigned irq_enabled:1;
 	unsigned supports_gmii:1;
 	dma_addr_t counters_phys_addr;
 	struct rtl8169_counters *counters;
@@ -1293,6 +1294,7 @@ static void rtl_ack_events(struct rtl8169_private *tp, u16 bits)
 static void rtl_irq_disable(struct rtl8169_private *tp)
 {
 	RTL_W16(tp, IntrMask, 0);
+	tp->irq_enabled = 0;
 }

 #define RTL_EVENT_NAPI_RX (RxOK | RxErr)
@@ -1301,6 +1303,7 @@ static void rtl_irq_disable(struct rtl8169_private *tp)

 static void rtl_irq_enable(struct rtl8169_private *tp)
 {
+	tp->irq_enabled = 1;
 	RTL_W16(tp, IntrMask, tp->irq_mask);
 }

@@ -6520,9 +6523,8 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 {
 	struct rtl8169_private *tp = dev_instance;
 	u16 status = RTL_R16(tp, IntrStatus);
-	u16 irq_mask = RTL_R16(tp, IntrMask);

-	if (status == 0xffff || !(status & irq_mask))
+	if (!tp->irq_enabled || status == 0xffff || !(status & tp->irq_mask))
 		return IRQ_NONE;

 	if (unlikely(status & SYSErr)) {
@@ -6540,7 +6542,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 		set_bit(RTL_FLAG_TASK_RESET_PENDING, tp->wk.flags);
 	}

-	if (status & RTL_EVENT_NAPI) {
+	if (status & (RTL_EVENT_NAPI | LinkChg)) {
 		rtl_irq_disable(tp);
 		napi_schedule_irqoff(&tp->napi);
 	}
diff --git a/drivers/net/ethernet/sis/sis900.c b/drivers/net/ethernet/sis/sis900.c
@@ -730,10 +730,10 @@ static u16 sis900_default_phy(struct net_device * net_dev)
 		status = mdio_read(net_dev, phy->phy_addr, MII_STATUS);

 		/* Link ON & Not select default PHY & not ghost PHY */
-		if ((status & MII_STAT_LINK) && !default_phy &&
-		    (phy->phy_types != UNKNOWN))
-			default_phy = phy;
-		else {
+		if ((status & MII_STAT_LINK) && !default_phy &&
+		    (phy->phy_types != UNKNOWN)) {
+			default_phy = phy;
+		} else {
 			status = mdio_read(net_dev, phy->phy_addr, MII_CONTROL);
 			mdio_write(net_dev, phy->phy_addr, MII_CONTROL,
 				status | MII_CNTL_AUTO | MII_CNTL_ISOLATE);
@@ -741,7 +741,7 @@ static u16 sis900_default_phy(struct net_device * net_dev)
 				phy_home = phy;
 			else if(phy->phy_types == LAN)
 				phy_lan = phy;
-		}
+		}
 	}

 	if (!default_phy && phy_home)
diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -59,7 +59,7 @@ static int jumbo_frm(void *p, struct sk_buff *skb, int csum)

 		desc->des3 = cpu_to_le32(des2 + BUF_SIZE_4KiB);
 		stmmac_prepare_tx_desc(priv, desc, 1, bmax, csum,
-				STMMAC_RING_MODE, 1, false, skb->len);
+				STMMAC_RING_MODE, 0, false, skb->len);
 		tx_q->tx_skbuff[entry] = NULL;
 		entry = STMMAC_GET_ENTRY(entry, DMA_TX_SIZE);

@@ -79,7 +79,8 @@ static int jumbo_frm(void *p, struct sk_buff *skb, int csum)

 		desc->des3 = cpu_to_le32(des2 + BUF_SIZE_4KiB);
 		stmmac_prepare_tx_desc(priv, desc, 0, len, csum,
-				STMMAC_RING_MODE, 1, true, skb->len);
+				STMMAC_RING_MODE, 1, !skb_is_nonlinear(skb),
+				skb->len);
 	} else {
 		des2 = dma_map_single(priv->device, skb->data,
 				      nopaged_len, DMA_TO_DEVICE);
@@ -91,7 +92,8 @@ static int jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		tx_q->tx_skbuff_dma[entry].is_jumbo = true;
 		desc->des3 = cpu_to_le32(des2 + BUF_SIZE_4KiB);
 		stmmac_prepare_tx_desc(priv, desc, 1, nopaged_len, csum,
-				STMMAC_RING_MODE, 1, true, skb->len);
+				STMMAC_RING_MODE, 0, !skb_is_nonlinear(skb),
+				skb->len);
 	}

 	tx_q->cur_tx = entry;
@@ -111,10 +113,11 @@ static unsigned int is_jumbo_frm(int len, int enh_desc)

 static void refill_desc3(void *priv_ptr, struct dma_desc *p)
 {
-	struct stmmac_priv *priv = (struct stmmac_priv *)priv_ptr;
+	struct stmmac_rx_queue *rx_q = priv_ptr;
+	struct stmmac_priv *priv = rx_q->priv_data;

 	/* Fill DES3 in case of RING mode */
-	if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
+	if (priv->dma_buf_sz == BUF_SIZE_16KiB)
 		p->des3 = cpu_to_le32(le32_to_cpu(p->des2) + BUF_SIZE_8KiB);
 }

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3216,14 +3216,16 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		stmmac_prepare_tx_desc(priv, first, 1, nopaged_len,
 				csum_insertion, priv->mode, 1, last_segment,
 				skb->len);
-
-		/* The own bit must be the latest setting done when prepare the
-		 * descriptor and then barrier is needed to make sure that
-		 * all is coherent before granting the DMA engine.
-		 */
-		wmb();
+	} else {
+		stmmac_set_tx_owner(priv, first);
 	}

+	/* The own bit must be the latest setting done when prepare the
+	 * descriptor and then barrier is needed to make sure that
+	 * all is coherent before granting the DMA engine.
+	 */
+	wmb();
+
 	netdev_tx_sent_queue(netdev_get_tx_queue(dev, queue), skb->len);

 	stmmac_enable_dma_transmission(priv, priv->ioaddr);
diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -3657,12 +3657,16 @@ static int gbe_probe(struct netcp_device *netcp_device, struct device *dev,

 	ret = netcp_txpipe_init(&gbe_dev->tx_pipe, netcp_device,
 				gbe_dev->dma_chan_name, gbe_dev->tx_queue_id);
-	if (ret)
+	if (ret) {
+		of_node_put(interfaces);
 		return ret;
+	}

 	ret = netcp_txpipe_open(&gbe_dev->tx_pipe);
-	if (ret)
+	if (ret) {
+		of_node_put(interfaces);
 		return ret;
+	}

 	/* Create network interfaces */
 	INIT_LIST_HEAD(&gbe_dev->gbe_intf_head);
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1575,12 +1575,14 @@ static int axienet_probe(struct platform_device *pdev)
 	ret = of_address_to_resource(np, 0, &dmares);
 	if (ret) {
 		dev_err(&pdev->dev, "unable to get DMA resource\n");
+		of_node_put(np);
 		goto free_netdev;
 	}
 	lp->dma_regs = devm_ioremap_resource(&pdev->dev, &dmares);
 	if (IS_ERR(lp->dma_regs)) {
 		dev_err(&pdev->dev, "could not map DMA regs\n");
 		ret = PTR_ERR(lp->dma_regs);
+		of_node_put(np);
 		goto free_netdev;
 	}
 	lp->rx_irq = irq_of_parse_and_map(np, 1);
diff --git a/drivers/net/ieee802154/adf7242.c b/drivers/net/ieee802154/adf7242.c
@@ -1268,6 +1268,10 @@ static int adf7242_probe(struct spi_device *spi)
 	INIT_DELAYED_WORK(&lp->work, adf7242_rx_cal_work);
 	lp->wqueue = alloc_ordered_workqueue(dev_name(&spi->dev),
 					     WQ_MEM_RECLAIM);
+	if (unlikely(!lp->wqueue)) {
+		ret = -ENOMEM;
+		goto err_hw_init;
+	}

 	ret = adf7242_hw_init(lp);
 	if (ret)
diff --git a/drivers/net/ieee802154/mac802154_hwsim.c b/drivers/net/ieee802154/mac802154_hwsim.c
@@ -324,7 +324,7 @@ static int hwsim_get_radio_nl(struct sk_buff *msg, struct genl_info *info)
 			goto out_err;
 		}

-		genlmsg_reply(skb, info);
+		res = genlmsg_reply(skb, info);
 		break;
 	}

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
@@ -7,6 +7,8 @@ menuconfig MDIO_DEVICE
 	help
 	  MDIO devices and driver infrastructure code.

+if MDIO_DEVICE
+
 config MDIO_BUS
 	tristate
 	default m if PHYLIB=m
@@ -179,6 +181,7 @@ config MDIO_XGENE
 	  APM X-Gene SoC's.

 endif
+endif

 config PHYLINK
 	tristate
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
@@ -323,6 +323,19 @@ static int bcm54xx_config_init(struct phy_device *phydev)

 	bcm54xx_phydsp_config(phydev);

+	/* Encode link speed into LED1 and LED3 pair (green/amber).
+	 * Also flash these two LEDs on activity. This means configuring
+	 * them for MULTICOLOR and encoding link/activity into them.
+	 */
+	val = BCM5482_SHD_LEDS1_LED1(BCM_LED_SRC_MULTICOLOR1) |
+		BCM5482_SHD_LEDS1_LED3(BCM_LED_SRC_MULTICOLOR1);
+	bcm_phy_write_shadow(phydev, BCM5482_SHD_LEDS1, val);
+
+	val = BCM_LED_MULTICOLOR_IN_PHASE |
+		BCM5482_SHD_LEDS1_LED1(BCM_LED_MULTICOLOR_LINK_ACT) |
+		BCM5482_SHD_LEDS1_LED3(BCM_LED_MULTICOLOR_LINK_ACT);
+	bcm_phy_write_exp(phydev, BCM_EXP_MULTICOLOR, val);
+
 	return 0;
 }

diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
@@ -15,6 +15,8 @@
 #include <linux/netdevice.h>

 #define DP83822_PHY_ID 0x2000a240
+#define DP83825I_PHY_ID 0x2000a150
+
 #define DP83822_DEVADDR 0x1f

 #define MII_DP83822_PHYSCR 0x11
@@ -304,26 +306,30 @@ static int dp83822_resume(struct phy_device *phydev)
 	return 0;
 }

+#define DP83822_PHY_DRIVER(_id, _name) \
+	{ \
+		PHY_ID_MATCH_MODEL(_id), \
+		.name = (_name), \
+		.features = PHY_BASIC_FEATURES, \
+		.soft_reset = dp83822_phy_reset, \
+		.config_init = dp83822_config_init, \
+		.get_wol = dp83822_get_wol, \
+		.set_wol = dp83822_set_wol, \
+		.ack_interrupt = dp83822_ack_interrupt, \
+		.config_intr = dp83822_config_intr, \
+		.suspend = dp83822_suspend, \
+		.resume = dp83822_resume, \
+	}
+
 static struct phy_driver dp83822_driver[] = {
-	{
-		.phy_id = DP83822_PHY_ID,
-		.phy_id_mask = 0xfffffff0,
-		.name = "TI DP83822",
-		.features = PHY_BASIC_FEATURES,
-		.config_init = dp83822_config_init,
-		.soft_reset = dp83822_phy_reset,
-		.get_wol = dp83822_get_wol,
-		.set_wol = dp83822_set_wol,
-		.ack_interrupt = dp83822_ack_interrupt,
-		.config_intr = dp83822_config_intr,
-		.suspend = dp83822_suspend,
-		.resume = dp83822_resume,
-	},
+	DP83822_PHY_DRIVER(DP83822_PHY_ID, "TI DP83822"),
+	DP83822_PHY_DRIVER(DP83825I_PHY_ID, "TI DP83825I"),
 };
 module_phy_driver(dp83822_driver);

 static struct mdio_device_id __maybe_unused dp83822_tbl[] = {
 	{ DP83822_PHY_ID, 0xfffffff0 },
+	{ DP83825I_PHY_ID, 0xfffffff0 },
 	{ },
 };
 MODULE_DEVICE_TABLE(mdio, dp83822_tbl);
diff --git a/drivers/net/phy/meson-gxl.c b/drivers/net/phy/meson-gxl.c
@@ -201,6 +201,7 @@ static int meson_gxl_ack_interrupt(struct phy_device *phydev)
 static int meson_gxl_config_intr(struct phy_device *phydev)
 {
 	u16 val;
+	int ret;

 	if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
 		val = INTSRC_ANEG_PR
@@ -213,6 +214,11 @@ static int meson_gxl_config_intr(struct phy_device *phydev)
 		val = 0;
 	}

+	/* Ack any pending IRQ */
+	ret = meson_gxl_ack_interrupt(phydev);
+	if (ret)
+		return ret;
+
 	return phy_write(phydev, INTSRC_MASK, val);
 }

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
@@ -1831,7 +1831,7 @@ int genphy_soft_reset(struct phy_device *phydev)
 {
 	int ret;

-	ret = phy_write(phydev, MII_BMCR, BMCR_RESET);
+	ret = phy_set_bits(phydev, MII_BMCR, BMCR_RESET);
 	if (ret < 0)
 		return ret;

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
@@ -1763,9 +1763,6 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	int skb_xdp = 1;
 	bool frags = tun_napi_frags_enabled(tfile);

-	if (!(tun->dev->flags & IFF_UP))
-		return -EIO;
-
 	if (!(tun->flags & IFF_NO_PI)) {
 		if (len < sizeof(pi))
 			return -EINVAL;
@@ -1867,6 +1864,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		err = skb_copy_datagram_from_iter(skb, 0, from, len);

 	if (err) {
+		err = -EFAULT;
+drop:
 		this_cpu_inc(tun->pcpu_stats->rx_dropped);
 		kfree_skb(skb);
 		if (frags) {
@@ -1874,7 +1873,7 @@ static ssize_t
tun_get_user(struct tun_struct *tun, struct tun_file *tfile, mutex_unlock(&tfile->napi_mutex); } - return -EFAULT; + return err; } } @@ -1958,6 +1957,13 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, !tfile->detached) rxhash = __skb_get_hash_symmetric(skb); + rcu_read_lock(); + if (unlikely(!(tun->dev->flags & IFF_UP))) { + err = -EIO; + rcu_read_unlock(); + goto drop; + } + if (frags) { /* Exercise flow dissector code path. */ u32 headlen = eth_get_headlen(skb->data, skb_headlen(skb)); @@ -1965,6 +1971,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, if (unlikely(headlen > skb_headlen(skb))) { this_cpu_inc(tun->pcpu_stats->rx_dropped); napi_free_frags(&tfile->napi); + rcu_read_unlock(); mutex_unlock(&tfile->napi_mutex); WARN_ON(1); return -ENOMEM; @@ -1992,6 +1999,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, } else { netif_rx_ni(skb); } + rcu_read_unlock(); stats = get_cpu_ptr(tun->pcpu_stats); u64_stats_update_begin(&stats->syncp); diff --git a/drivers/net/usb/aqc111.c b/drivers/net/usb/aqc111.c @@ -1301,6 +1301,20 @@ static const struct driver_info trendnet_info = { .tx_fixup = aqc111_tx_fixup, }; +static const struct driver_info qnap_info = { + .description = "QNAP QNA-UC5G1T USB to 5GbE Adapter", + .bind = aqc111_bind, + .unbind = aqc111_unbind, + .status = aqc111_status, + .link_reset = aqc111_link_reset, + .reset = aqc111_reset, + .stop = aqc111_stop, + .flags = FLAG_ETHER | FLAG_FRAMING_AX | + FLAG_AVOID_UNLINK_URBS | FLAG_MULTI_PACKET, + .rx_fixup = aqc111_rx_fixup, + .tx_fixup = aqc111_tx_fixup, +}; + static int aqc111_suspend(struct usb_interface *intf, pm_message_t message) { struct usbnet *dev = usb_get_intfdata(intf); @@ -1455,6 +1469,7 @@ static const struct usb_device_id products[] = { {AQC111_USB_ETH_DEV(0x0b95, 0x2790, asix111_info)}, {AQC111_USB_ETH_DEV(0x0b95, 0x2791, asix112_info)}, {AQC111_USB_ETH_DEV(0x20f4, 0xe05a, trendnet_info)}, + 
{AQC111_USB_ETH_DEV(0x1c04, 0x0015, qnap_info)}, { },/* END */ }; MODULE_DEVICE_TABLE(usb, products); diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c @@ -851,6 +851,14 @@ static const struct usb_device_id products[] = { .driver_info = 0, }, +/* QNAP QNA-UC5G1T USB to 5GbE Adapter (based on AQC111U) */ +{ + USB_DEVICE_AND_INTERFACE_INFO(0x1c04, 0x0015, USB_CLASS_COMM, + USB_CDC_SUBCLASS_ETHERNET, + USB_CDC_PROTO_NONE), + .driver_info = 0, +}, + /* WHITELIST!!! * * CDC Ether uses two interfaces, not necessarily consecutive. diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c @@ -4335,10 +4335,8 @@ static void vxlan_destroy_tunnels(struct net *net, struct list_head *head) /* If vxlan->dev is in the same netns, it has already been added * to the list by the previous loop. */ - if (!net_eq(dev_net(vxlan->dev), net)) { - gro_cells_destroy(&vxlan->gro_cells); + if (!net_eq(dev_net(vxlan->dev), net)) unregister_netdevice_queue(vxlan->dev, head); - } } for (h = 0; h < PORT_HASH_SIZE; ++h) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c b/drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c @@ -460,9 +460,7 @@ static int iwl_mvm_ftm_range_resp_valid(struct iwl_mvm *mvm, u8 request_id, static void iwl_mvm_debug_range_resp(struct iwl_mvm *mvm, u8 index, struct cfg80211_pmsr_result *res) { - s64 rtt_avg = res->ftm.rtt_avg * 100; - - do_div(rtt_avg, 6666); + s64 rtt_avg = div_s64(res->ftm.rtt_avg * 100, 6666); IWL_DEBUG_INFO(mvm, "entry %d\n", index); IWL_DEBUG_INFO(mvm, "\tstatus: %d\n", res->status); diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c @@ -130,6 +130,8 @@ mt76_dma_tx_cleanup_idx(struct mt76_dev *dev, struct mt76_queue *q, int idx, static void mt76_dma_sync_idx(struct mt76_dev *dev, struct mt76_queue *q) { + iowrite32(q->desc_dma, &q->regs->desc_base); + iowrite32(q->ndesc, &q->regs->ring_size); q->head = ioread32(&q->regs->dma_idx); q->tail = q->head; iowrite32(q->head, 
&q->regs->cpu_idx); @@ -180,7 +182,10 @@ mt76_dma_tx_cleanup(struct mt76_dev *dev, enum mt76_txq_id qid, bool flush) else mt76_dma_sync_idx(dev, q); - wake = wake && qid < IEEE80211_NUM_ACS && q->queued < q->ndesc - 8; + wake = wake && q->stopped && + qid < IEEE80211_NUM_ACS && q->queued < q->ndesc - 8; + if (wake) + q->stopped = false; if (!q->queued) wake_up(&dev->tx_wait); diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -679,19 +679,15 @@ out: return ret; } -static void -mt76_sta_remove(struct mt76_dev *dev, struct ieee80211_vif *vif, - struct ieee80211_sta *sta) +void __mt76_sta_remove(struct mt76_dev *dev, struct ieee80211_vif *vif, + struct ieee80211_sta *sta) { struct mt76_wcid *wcid = (struct mt76_wcid *)sta->drv_priv; - int idx = wcid->idx; - int i; + int i, idx = wcid->idx; rcu_assign_pointer(dev->wcid[idx], NULL); synchronize_rcu(); - mutex_lock(&dev->mutex); - if (dev->drv->sta_remove) dev->drv->sta_remove(dev, vif, sta); @@ -699,7 +695,15 @@ mt76_sta_remove(struct mt76_dev *dev, struct ieee80211_vif *vif, for (i = 0; i < ARRAY_SIZE(sta->txq); i++) mt76_txq_remove(dev, sta->txq[i]); mt76_wcid_free(dev->wcid_mask, idx); +} +EXPORT_SYMBOL_GPL(__mt76_sta_remove); +static void +mt76_sta_remove(struct mt76_dev *dev, struct ieee80211_vif *vif, + struct ieee80211_sta *sta) +{ + mutex_lock(&dev->mutex); + __mt76_sta_remove(dev, vif, sta); mutex_unlock(&dev->mutex); } diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h @@ -126,6 +126,7 @@ struct mt76_queue { int ndesc; int queued; int buf_size; + bool stopped; u8 buf_offset; u8 hw_idx; @@ -143,6 +144,7 @@ struct mt76_mcu_ops { const struct mt76_reg_pair *rp, int len); int (*mcu_rd_rp)(struct mt76_dev *dev, u32 base, struct mt76_reg_pair *rp, int len); + int (*mcu_restart)(struct mt76_dev *dev); }; struct mt76_queue_ops { @@ -693,6 +695,8 @@ int mt76_sta_state(struct ieee80211_hw *hw, struct ieee80211_vif 
*vif, struct ieee80211_sta *sta, enum ieee80211_sta_state old_state, enum ieee80211_sta_state new_state); +void __mt76_sta_remove(struct mt76_dev *dev, struct ieee80211_vif *vif, + struct ieee80211_sta *sta); struct ieee80211_sta *mt76_rx_convert(struct sk_buff *skb); diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/beacon.c b/drivers/net/wireless/mediatek/mt76/mt7603/beacon.c @@ -135,8 +135,7 @@ void mt7603_pre_tbtt_tasklet(unsigned long arg) out: mt76_queue_tx_cleanup(dev, MT_TXQ_BEACON, false); - if (dev->mt76.q_tx[MT_TXQ_BEACON].queued > - __sw_hweight8(dev->beacon_mask)) + if (dev->mt76.q_tx[MT_TXQ_BEACON].queued > hweight8(dev->beacon_mask)) dev->beacon_check++; } diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/dma.c b/drivers/net/wireless/mediatek/mt76/mt7603/dma.c @@ -27,12 +27,16 @@ static void mt7603_rx_loopback_skb(struct mt7603_dev *dev, struct sk_buff *skb) { __le32 *txd = (__le32 *)skb->data; + struct ieee80211_hdr *hdr; + struct ieee80211_sta *sta; struct mt7603_sta *msta; struct mt76_wcid *wcid; + void *priv; int idx; u32 val; + u8 tid; - if (skb->len < sizeof(MT_TXD_SIZE) + sizeof(struct ieee80211_hdr)) + if (skb->len < MT_TXD_SIZE + sizeof(struct ieee80211_hdr)) goto free; val = le32_to_cpu(txd[1]); @@ -46,10 +50,19 @@ mt7603_rx_loopback_skb(struct mt7603_dev *dev, struct sk_buff *skb) if (!wcid) goto free; - msta = container_of(wcid, struct mt7603_sta, wcid); + priv = msta = container_of(wcid, struct mt7603_sta, wcid); val = le32_to_cpu(txd[0]); skb_set_queue_mapping(skb, FIELD_GET(MT_TXD0_Q_IDX, val)); + val &= ~(MT_TXD0_P_IDX | MT_TXD0_Q_IDX); + val |= FIELD_PREP(MT_TXD0_Q_IDX, MT_TX_HW_QUEUE_MGMT); + txd[0] = cpu_to_le32(val); + + sta = container_of(priv, struct ieee80211_sta, drv_priv); + hdr = (struct ieee80211_hdr *) &skb->data[MT_TXD_SIZE]; + tid = *ieee80211_get_qos_ctl(hdr) & IEEE80211_QOS_CTL_TID_MASK; + ieee80211_sta_set_buffered(sta, tid, true); + spin_lock_bh(&dev->ps_lock); __skb_queue_tail(&msta->psq, skb); if 
(skb_queue_len(&msta->psq) >= 64) { diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/init.c b/drivers/net/wireless/mediatek/mt76/mt7603/init.c @@ -112,7 +112,7 @@ static void mt7603_phy_init(struct mt7603_dev *dev) { int rx_chains = dev->mt76.antenna_mask; - int tx_chains = __sw_hweight8(rx_chains) - 1; + int tx_chains = hweight8(rx_chains) - 1; mt76_rmw(dev, MT_WF_RMAC_RMCR, (MT_WF_RMAC_RMCR_SMPS_MODE | diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/mac.c b/drivers/net/wireless/mediatek/mt76/mt7603/mac.c @@ -1072,7 +1072,7 @@ out: case MT_PHY_TYPE_HT: final_rate_flags |= IEEE80211_TX_RC_MCS; final_rate &= GENMASK(5, 0); - if (i > 15) + if (final_rate > 15) return false; break; default: diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/main.c b/drivers/net/wireless/mediatek/mt76/mt7603/main.c @@ -5,6 +5,7 @@ #include <linux/pci.h> #include <linux/module.h> #include "mt7603.h" +#include "mac.h" #include "eeprom.h" static int @@ -386,6 +387,15 @@ mt7603_sta_ps(struct mt76_dev *mdev, struct ieee80211_sta *sta, bool ps) } static void +mt7603_ps_set_more_data(struct sk_buff *skb) +{ + struct ieee80211_hdr *hdr; + + hdr = (struct ieee80211_hdr *) &skb->data[MT_TXD_SIZE]; + hdr->frame_control |= cpu_to_le16(IEEE80211_FCTL_MOREDATA); +} + +static void mt7603_release_buffered_frames(struct ieee80211_hw *hw, struct ieee80211_sta *sta, u16 tids, int nframes, @@ -399,6 +409,8 @@ mt7603_release_buffered_frames(struct ieee80211_hw *hw, __skb_queue_head_init(&list); + mt7603_wtbl_set_ps(dev, msta, false); + spin_lock_bh(&dev->ps_lock); skb_queue_walk_safe(&msta->psq, skb, tmp) { if (!nframes) @@ -409,11 +421,15 @@ mt7603_release_buffered_frames(struct ieee80211_hw *hw, skb_set_queue_mapping(skb, MT_TXQ_PSD); __skb_unlink(skb, &msta->psq); + mt7603_ps_set_more_data(skb); __skb_queue_tail(&list, skb); nframes--; } spin_unlock_bh(&dev->ps_lock); + if (!skb_queue_empty(&list)) + ieee80211_sta_eosp(sta); + mt7603_ps_tx_list(dev, &list); if (nframes) diff --git 
a/drivers/net/wireless/mediatek/mt76/mt7603/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7603/mcu.c @@ -433,7 +433,7 @@ int mt7603_mcu_set_channel(struct mt7603_dev *dev) { struct cfg80211_chan_def *chandef = &dev->mt76.chandef; struct ieee80211_hw *hw = mt76_hw(dev); - int n_chains = __sw_hweight8(dev->mt76.antenna_mask); + int n_chains = hweight8(dev->mt76.antenna_mask); struct { u8 control_chan; u8 center_chan; diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/soc.c b/drivers/net/wireless/mediatek/mt76/mt7603/soc.c @@ -23,9 +23,9 @@ mt76_wmac_probe(struct platform_device *pdev) } mem_base = devm_ioremap_resource(&pdev->dev, res); - if (!mem_base) { + if (IS_ERR(mem_base)) { dev_err(&pdev->dev, "Failed to get memory resource\n"); - return -EINVAL; + return PTR_ERR(mem_base); } mdev = mt76_alloc_device(&pdev->dev, sizeof(*dev), &mt7603_ops, diff --git a/drivers/net/wireless/mediatek/mt76/mt76x0/initvals.h b/drivers/net/wireless/mediatek/mt76/mt76x0/initvals.h @@ -46,7 +46,7 @@ static const struct mt76_reg_pair common_mac_reg_table[] = { { MT_MM20_PROT_CFG, 0x01742004 }, { MT_MM40_PROT_CFG, 0x03f42084 }, { MT_TXOP_CTRL_CFG, 0x0000583f }, - { MT_TX_RTS_CFG, 0x00092b20 }, + { MT_TX_RTS_CFG, 0x00ffff20 }, { MT_EXP_ACK_TIME, 0x002400ca }, { MT_TXOP_HLDR_ET, 0x00000002 }, { MT_XIFS_TIME_CFG, 0x33a41010 }, diff --git a/drivers/net/wireless/mediatek/mt76/mt76x0/usb.c b/drivers/net/wireless/mediatek/mt76/mt76x0/usb.c @@ -229,7 +229,7 @@ static int mt76x0u_probe(struct usb_interface *usb_intf, struct usb_device *usb_dev = interface_to_usbdev(usb_intf); struct mt76x02_dev *dev; struct mt76_dev *mdev; - u32 asic_rev, mac_rev; + u32 mac_rev; int ret; mdev = mt76_alloc_device(&usb_intf->dev, sizeof(*dev), &mt76x0u_ops, @@ -262,10 +262,14 @@ static int mt76x0u_probe(struct usb_interface *usb_intf, goto err; } - asic_rev = mt76_rr(dev, MT_ASIC_VERSION); + mdev->rev = mt76_rr(dev, MT_ASIC_VERSION); mac_rev = mt76_rr(dev, MT_MAC_CSR0); dev_info(mdev->dev, "ASIC revision: %08x 
MAC revision: %08x\n", - asic_rev, mac_rev); + mdev->rev, mac_rev); + if (!is_mt76x0(dev)) { + ret = -ENODEV; + goto err; + } /* Note: vendor driver skips this check for MT76X0U */ if (!(mt76_rr(dev, MT_EFUSE_CTRL) & MT_EFUSE_CTRL_SEL)) diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02.h b/drivers/net/wireless/mediatek/mt76/mt76x02.h @@ -51,6 +51,7 @@ struct mt76x02_calibration { u16 false_cca; s8 avg_rssi_all; s8 agc_gain_adjust; + s8 agc_lowest_gain; s8 low_gain; s8 temp_vco; @@ -114,8 +115,11 @@ struct mt76x02_dev { struct mt76x02_dfs_pattern_detector dfs_pd; /* edcca monitor */ + unsigned long ed_trigger_timeout; bool ed_tx_blocked; bool ed_monitor; + u8 ed_monitor_enabled; + u8 ed_monitor_learning; u8 ed_trigger; u8 ed_silent; ktime_t ed_time; @@ -188,6 +192,13 @@ void mt76x02_mac_start(struct mt76x02_dev *dev); void mt76x02_init_debugfs(struct mt76x02_dev *dev); +static inline bool is_mt76x0(struct mt76x02_dev *dev) +{ + return mt76_chip(&dev->mt76) == 0x7610 || + mt76_chip(&dev->mt76) == 0x7630 || + mt76_chip(&dev->mt76) == 0x7650; +} + static inline bool is_mt76x2(struct mt76x02_dev *dev) { return mt76_chip(&dev->mt76) == 0x7612 || diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_debugfs.c b/drivers/net/wireless/mediatek/mt76/mt76x02_debugfs.c @@ -116,6 +116,32 @@ static int read_agc(struct seq_file *file, void *data) return 0; } +static int +mt76_edcca_set(void *data, u64 val) +{ + struct mt76x02_dev *dev = data; + enum nl80211_dfs_regions region = dev->dfs_pd.region; + + dev->ed_monitor_enabled = !!val; + dev->ed_monitor = dev->ed_monitor_enabled && + region == NL80211_DFS_ETSI; + mt76x02_edcca_init(dev, true); + + return 0; +} + +static int +mt76_edcca_get(void *data, u64 *val) +{ + struct mt76x02_dev *dev = data; + + *val = dev->ed_monitor_enabled; + return 0; +} + +DEFINE_DEBUGFS_ATTRIBUTE(fops_edcca, mt76_edcca_get, mt76_edcca_set, + "%lld\n"); + void mt76x02_init_debugfs(struct mt76x02_dev *dev) { struct dentry *dir; @@ -127,6 +153,7 @@ 
void mt76x02_init_debugfs(struct mt76x02_dev *dev) debugfs_create_u8("temperature", 0400, dir, &dev->cal.temp); debugfs_create_bool("tpc", 0600, dir, &dev->enable_tpc); + debugfs_create_file("edcca", 0400, dir, dev, &fops_edcca); debugfs_create_file("ampdu_stat", 0400, dir, dev, &fops_ampdu_stat); debugfs_create_file("dfs_stats", 0400, dir, dev, &fops_dfs_stat); debugfs_create_devm_seqfile(dev->mt76.dev, "txpower", dir, diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_dfs.c b/drivers/net/wireless/mediatek/mt76/mt76x02_dfs.c @@ -885,7 +885,8 @@ mt76x02_dfs_set_domain(struct mt76x02_dev *dev, if (dfs_pd->region != region) { tasklet_disable(&dfs_pd->dfs_tasklet); - dev->ed_monitor = region == NL80211_DFS_ETSI; + dev->ed_monitor = dev->ed_monitor_enabled && + region == NL80211_DFS_ETSI; mt76x02_edcca_init(dev, true); dfs_pd->region = region; diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_mac.c b/drivers/net/wireless/mediatek/mt76/mt76x02_mac.c @@ -67,12 +67,39 @@ int mt76x02_mac_shared_key_setup(struct mt76x02_dev *dev, u8 vif_idx, } EXPORT_SYMBOL_GPL(mt76x02_mac_shared_key_setup); +void mt76x02_mac_wcid_sync_pn(struct mt76x02_dev *dev, u8 idx, + struct ieee80211_key_conf *key) +{ + enum mt76x02_cipher_type cipher; + u8 key_data[32]; + u32 iv, eiv; + u64 pn; + + cipher = mt76x02_mac_get_key_info(key, key_data); + iv = mt76_rr(dev, MT_WCID_IV(idx)); + eiv = mt76_rr(dev, MT_WCID_IV(idx) + 4); + + pn = (u64)eiv << 16; + if (cipher == MT_CIPHER_TKIP) { + pn |= (iv >> 16) & 0xff; + pn |= (iv & 0xff) << 8; + } else if (cipher >= MT_CIPHER_AES_CCMP) { + pn |= iv & 0xffff; + } else { + return; + } + + atomic64_set(&key->tx_pn, pn); +} + + int mt76x02_mac_wcid_set_key(struct mt76x02_dev *dev, u8 idx, struct ieee80211_key_conf *key) { enum mt76x02_cipher_type cipher; u8 key_data[32]; u8 iv_data[8]; + u64 pn; cipher = mt76x02_mac_get_key_info(key, key_data); if (cipher == MT_CIPHER_NONE && key) @@ -85,9 +112,22 @@ int mt76x02_mac_wcid_set_key(struct mt76x02_dev 
*dev, u8 idx, if (key) { mt76_rmw_field(dev, MT_WCID_ATTR(idx), MT_WCID_ATTR_PAIRWISE, !!(key->flags & IEEE80211_KEY_FLAG_PAIRWISE)); + + pn = atomic64_read(&key->tx_pn); + iv_data[3] = key->keyidx << 6; - if (cipher >= MT_CIPHER_TKIP) + if (cipher >= MT_CIPHER_TKIP) { iv_data[3] |= 0x20; + put_unaligned_le32(pn >> 16, &iv_data[4]); + } + + if (cipher == MT_CIPHER_TKIP) { + iv_data[0] = (pn >> 8) & 0xff; + iv_data[1] = (iv_data[0] | 0x20) & 0x7f; + iv_data[2] = pn & 0xff; + } else if (cipher >= MT_CIPHER_AES_CCMP) { + put_unaligned_le16((pn & 0xffff), &iv_data[0]); + } } mt76_wr_copy(dev, MT_WCID_IV(idx), iv_data, sizeof(iv_data)); @@ -920,6 +960,7 @@ void mt76x02_edcca_init(struct mt76x02_dev *dev, bool enable) } } mt76x02_edcca_tx_enable(dev, true); + dev->ed_monitor_learning = true; /* clear previous CCA timer value */ mt76_rr(dev, MT_ED_CCA_TIMER); @@ -929,6 +970,10 @@ EXPORT_SYMBOL_GPL(mt76x02_edcca_init); #define MT_EDCCA_TH 92 #define MT_EDCCA_BLOCK_TH 2 +#define MT_EDCCA_LEARN_TH 50 +#define MT_EDCCA_LEARN_CCA 180 +#define MT_EDCCA_LEARN_TIMEOUT (20 * HZ) + static void mt76x02_edcca_check(struct mt76x02_dev *dev) { ktime_t cur_time; @@ -951,11 +996,23 @@ static void mt76x02_edcca_check(struct mt76x02_dev *dev) dev->ed_trigger = 0; } - if (dev->ed_trigger > MT_EDCCA_BLOCK_TH && - !dev->ed_tx_blocked) + if (dev->cal.agc_lowest_gain && + dev->cal.false_cca > MT_EDCCA_LEARN_CCA && + dev->ed_trigger > MT_EDCCA_LEARN_TH) { + dev->ed_monitor_learning = false; + dev->ed_trigger_timeout = jiffies + 20 * HZ; + } else if (!dev->ed_monitor_learning && + time_is_after_jiffies(dev->ed_trigger_timeout)) { + dev->ed_monitor_learning = true; + mt76x02_edcca_tx_enable(dev, true); + } + + if (dev->ed_monitor_learning) + return; + + if (dev->ed_trigger > MT_EDCCA_BLOCK_TH && !dev->ed_tx_blocked) mt76x02_edcca_tx_enable(dev, false); - else if (dev->ed_silent > MT_EDCCA_BLOCK_TH && - dev->ed_tx_blocked) + else if (dev->ed_silent > MT_EDCCA_BLOCK_TH && dev->ed_tx_blocked) 
mt76x02_edcca_tx_enable(dev, true); } diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_mac.h b/drivers/net/wireless/mediatek/mt76/mt76x02_mac.h @@ -177,6 +177,8 @@ int mt76x02_mac_shared_key_setup(struct mt76x02_dev *dev, u8 vif_idx, u8 key_idx, struct ieee80211_key_conf *key); int mt76x02_mac_wcid_set_key(struct mt76x02_dev *dev, u8 idx, struct ieee80211_key_conf *key); +void mt76x02_mac_wcid_sync_pn(struct mt76x02_dev *dev, u8 idx, + struct ieee80211_key_conf *key); void mt76x02_mac_wcid_setup(struct mt76x02_dev *dev, u8 idx, u8 vif_idx, u8 *mac); void mt76x02_mac_wcid_set_drop(struct mt76x02_dev *dev, u8 idx, bool drop); diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c @@ -19,6 +19,7 @@ #include <linux/irq.h> #include "mt76x02.h" +#include "mt76x02_mcu.h" #include "mt76x02_trace.h" struct beacon_bc_data { @@ -418,9 +419,66 @@ static bool mt76x02_tx_hang(struct mt76x02_dev *dev) return i < 4; } +static void mt76x02_key_sync(struct ieee80211_hw *hw, struct ieee80211_vif *vif, + struct ieee80211_sta *sta, + struct ieee80211_key_conf *key, void *data) +{ + struct mt76x02_dev *dev = hw->priv; + struct mt76_wcid *wcid; + + if (!sta) + return; + + wcid = (struct mt76_wcid *) sta->drv_priv; + + if (wcid->hw_key_idx != key->keyidx || wcid->sw_iv) + return; + + mt76x02_mac_wcid_sync_pn(dev, wcid->idx, key); +} + +static void mt76x02_reset_state(struct mt76x02_dev *dev) +{ + int i; + + lockdep_assert_held(&dev->mt76.mutex); + + clear_bit(MT76_STATE_RUNNING, &dev->mt76.state); + + rcu_read_lock(); + ieee80211_iter_keys_rcu(dev->mt76.hw, NULL, mt76x02_key_sync, NULL); + rcu_read_unlock(); + + for (i = 0; i < ARRAY_SIZE(dev->mt76.wcid); i++) { + struct ieee80211_sta *sta; + struct ieee80211_vif *vif; + struct mt76x02_sta *msta; + struct mt76_wcid *wcid; + void *priv; + + wcid = rcu_dereference_protected(dev->mt76.wcid[i], + lockdep_is_held(&dev->mt76.mutex)); + if (!wcid) + continue; + + priv = msta = 
container_of(wcid, struct mt76x02_sta, wcid); + sta = container_of(priv, struct ieee80211_sta, drv_priv); + + priv = msta->vif; + vif = container_of(priv, struct ieee80211_vif, drv_priv); + + __mt76_sta_remove(&dev->mt76, vif, sta); + memset(msta, 0, sizeof(*msta)); + } + + dev->vif_mask = 0; + dev->beacon_mask = 0; +} + static void mt76x02_watchdog_reset(struct mt76x02_dev *dev) { u32 mask = dev->mt76.mmio.irqmask; + bool restart = dev->mt76.mcu_ops->mcu_restart; int i; ieee80211_stop_queues(dev->mt76.hw); @@ -434,6 +492,9 @@ static void mt76x02_watchdog_reset(struct mt76x02_dev *dev) mutex_lock(&dev->mt76.mutex); + if (restart) + mt76x02_reset_state(dev); + if (dev->beacon_mask) mt76_clear(dev, MT_BEACON_TIME_CFG, MT_BEACON_TIME_CFG_BEACON_TX | @@ -452,20 +513,21 @@ static void mt76x02_watchdog_reset(struct mt76x02_dev *dev) /* let fw reset DMA */ mt76_set(dev, 0x734, 0x3); + if (restart) + dev->mt76.mcu_ops->mcu_restart(&dev->mt76); + for (i = 0; i < ARRAY_SIZE(dev->mt76.q_tx); i++) mt76_queue_tx_cleanup(dev, i, true); for (i = 0; i < ARRAY_SIZE(dev->mt76.q_rx); i++) mt76_queue_rx_reset(dev, i); - mt76_wr(dev, MT_MAC_SYS_CTRL, - MT_MAC_SYS_CTRL_ENABLE_TX | MT_MAC_SYS_CTRL_ENABLE_RX); - mt76_set(dev, MT_WPDMA_GLO_CFG, - MT_WPDMA_GLO_CFG_TX_DMA_EN | MT_WPDMA_GLO_CFG_RX_DMA_EN); + mt76x02_mac_start(dev); + if (dev->ed_monitor) mt76_set(dev, MT_TXOP_CTRL_CFG, MT_TXOP_ED_CCA_EN); - if (dev->beacon_mask) + if (dev->beacon_mask && !restart) mt76_set(dev, MT_BEACON_TIME_CFG, MT_BEACON_TIME_CFG_BEACON_TX | MT_BEACON_TIME_CFG_TBTT_EN); @@ -486,9 +548,13 @@ static void mt76x02_watchdog_reset(struct mt76x02_dev *dev) napi_schedule(&dev->mt76.napi[i]); } - ieee80211_wake_queues(dev->mt76.hw); - - mt76_txq_schedule_all(&dev->mt76); + if (restart) { + mt76x02_mcu_function_select(dev, Q_SELECT, 1); + ieee80211_restart_hw(dev->mt76.hw); + } else { + ieee80211_wake_queues(dev->mt76.hw); + mt76_txq_schedule_all(&dev->mt76); + } } static void mt76x02_check_tx_hang(struct 
mt76x02_dev *dev) diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_phy.c b/drivers/net/wireless/mediatek/mt76/mt76x02_phy.c @@ -194,6 +194,8 @@ bool mt76x02_phy_adjust_vga_gain(struct mt76x02_dev *dev) ret = true; } + dev->cal.agc_lowest_gain = dev->cal.agc_gain_adjust >= limit; + return ret; } EXPORT_SYMBOL_GPL(mt76x02_phy_adjust_vga_gain); diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_core.c b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_core.c @@ -85,8 +85,9 @@ int mt76x02u_tx_prepare_skb(struct mt76_dev *mdev, void *data, mt76x02_insert_hdr_pad(skb); - txwi = skb_push(skb, sizeof(struct mt76x02_txwi)); + txwi = (struct mt76x02_txwi *)(skb->data - sizeof(struct mt76x02_txwi)); mt76x02_mac_write_txwi(dev, txwi, skb, wcid, sta, len); + skb_push(skb, sizeof(struct mt76x02_txwi)); pid = mt76_tx_status_skb_add(mdev, wcid, skb); txwi->pktid = pid; diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_util.c b/drivers/net/wireless/mediatek/mt76/mt76x02_util.c @@ -237,6 +237,8 @@ int mt76x02_sta_add(struct mt76_dev *mdev, struct ieee80211_vif *vif, struct mt76x02_vif *mvif = (struct mt76x02_vif *)vif->drv_priv; int idx = 0; + memset(msta, 0, sizeof(*msta)); + idx = mt76_wcid_alloc(dev->mt76.wcid_mask, ARRAY_SIZE(dev->mt76.wcid)); if (idx < 0) return -ENOSPC; @@ -274,6 +276,8 @@ mt76x02_vif_init(struct mt76x02_dev *dev, struct ieee80211_vif *vif, struct mt76x02_vif *mvif = (struct mt76x02_vif *)vif->drv_priv; struct mt76_txq *mtxq; + memset(mvif, 0, sizeof(*mvif)); + mvif->idx = idx; mvif->group_wcid.idx = MT_VIF_WCID(idx); mvif->group_wcid.hw_key_idx = -1; @@ -289,6 +293,12 @@ mt76x02_add_interface(struct ieee80211_hw *hw, struct ieee80211_vif *vif) struct mt76x02_dev *dev = hw->priv; unsigned int idx = 0; + /* Allow to change address in HW if we create first interface. 
*/ + if (!dev->vif_mask && + (((vif->addr[0] ^ dev->mt76.macaddr[0]) & ~GENMASK(4, 1)) || + memcmp(vif->addr + 1, dev->mt76.macaddr + 1, ETH_ALEN - 1))) + mt76x02_mac_setaddr(dev, vif->addr); + if (vif->addr[0] & BIT(1)) idx = 1 + (((dev->mt76.macaddr[0] ^ vif->addr[0]) >> 2) & 7); @@ -311,10 +321,6 @@ mt76x02_add_interface(struct ieee80211_hw *hw, struct ieee80211_vif *vif) if (dev->vif_mask & BIT(idx)) return -EBUSY; - /* Allow to change address in HW if we create first interface. */ - if (!dev->vif_mask && !ether_addr_equal(dev->mt76.macaddr, vif->addr)) - mt76x02_mac_setaddr(dev, vif->addr); - dev->vif_mask |= BIT(idx); mt76x02_vif_init(dev, vif, idx); diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/init.c b/drivers/net/wireless/mediatek/mt76/mt76x2/init.c @@ -106,7 +106,7 @@ void mt76_write_mac_initvals(struct mt76x02_dev *dev) { MT_TX_SW_CFG1, 0x00010000 }, { MT_TX_SW_CFG2, 0x00000000 }, { MT_TXOP_CTRL_CFG, 0x0400583f }, - { MT_TX_RTS_CFG, 0x00100020 }, + { MT_TX_RTS_CFG, 0x00ffff20 }, { MT_TX_TIMEOUT_CFG, 0x000a2290 }, { MT_TX_RETRY_CFG, 0x47f01f0f }, { MT_EXP_ACK_TIME, 0x002c00dc }, diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/mt76x2.h b/drivers/net/wireless/mediatek/mt76/mt76x2/mt76x2.h @@ -71,6 +71,7 @@ int mt76x2_mcu_load_cr(struct mt76x02_dev *dev, u8 type, u8 temp_level, void mt76x2_cleanup(struct mt76x02_dev *dev); +int mt76x2_mac_reset(struct mt76x02_dev *dev, bool hard); void mt76x2_reset_wlan(struct mt76x02_dev *dev, bool enable); void mt76x2_init_txpower(struct mt76x02_dev *dev, struct ieee80211_supported_band *sband); diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/pci_init.c b/drivers/net/wireless/mediatek/mt76/mt76x2/pci_init.c @@ -77,7 +77,7 @@ mt76x2_fixup_xtal(struct mt76x02_dev *dev) } } -static int mt76x2_mac_reset(struct mt76x02_dev *dev, bool hard) +int mt76x2_mac_reset(struct mt76x02_dev *dev, bool hard) { const u8 *macaddr = dev->mt76.macaddr; u32 val; diff --git 
a/drivers/net/wireless/mediatek/mt76/mt76x2/pci_mcu.c b/drivers/net/wireless/mediatek/mt76/mt76x2/pci_mcu.c @@ -165,9 +165,30 @@ error: return -ENOENT; } +static int +mt76pci_mcu_restart(struct mt76_dev *mdev) +{ + struct mt76x02_dev *dev; + int ret; + + dev = container_of(mdev, struct mt76x02_dev, mt76); + + mt76x02_mcu_cleanup(dev); + mt76x2_mac_reset(dev, true); + + ret = mt76pci_load_firmware(dev); + if (ret) + return ret; + + mt76_wr(dev, MT_WPDMA_RST_IDX, ~0); + + return 0; +} + int mt76x2_mcu_init(struct mt76x02_dev *dev) { static const struct mt76_mcu_ops mt76x2_mcu_ops = { + .mcu_restart = mt76pci_mcu_restart, .mcu_send_msg = mt76x02_mcu_msg_send, }; int ret; diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/phy.c b/drivers/net/wireless/mediatek/mt76/mt76x2/phy.c @@ -260,10 +260,15 @@ mt76x2_phy_set_gain_val(struct mt76x02_dev *dev) gain_val[0] = dev->cal.agc_gain_cur[0] - dev->cal.agc_gain_adjust; gain_val[1] = dev->cal.agc_gain_cur[1] - dev->cal.agc_gain_adjust; - if (dev->mt76.chandef.width >= NL80211_CHAN_WIDTH_40) + val = 0x1836 << 16; + if (!mt76x2_has_ext_lna(dev) && + dev->mt76.chandef.width >= NL80211_CHAN_WIDTH_40) val = 0x1e42 << 16; - else - val = 0x1836 << 16; + + if (mt76x2_has_ext_lna(dev) && + dev->mt76.chandef.chan->band == NL80211_BAND_2GHZ && + dev->mt76.chandef.width < NL80211_CHAN_WIDTH_40) + val = 0x0f36 << 16; val |= 0xf8; @@ -280,6 +285,7 @@ void mt76x2_phy_update_channel_gain(struct mt76x02_dev *dev) { u8 *gain = dev->cal.agc_gain_init; u8 low_gain_delta, gain_delta; + u32 agc_35, agc_37; bool gain_change; int low_gain; u32 val; @@ -318,6 +324,16 @@ void mt76x2_phy_update_channel_gain(struct mt76x02_dev *dev) else low_gain_delta = 14; + agc_37 = 0x2121262c; + if (dev->mt76.chandef.chan->band == NL80211_BAND_2GHZ) + agc_35 = 0x11111516; + else if (low_gain == 2) + agc_35 = agc_37 = 0x08080808; + else if (dev->mt76.chandef.width == NL80211_CHAN_WIDTH_80) + agc_35 = 0x10101014; + else + agc_35 = 0x11111116; + if (low_gain == 2) { 
mt76_wr(dev, MT_BBP(RXO, 18), 0xf000a990); mt76_wr(dev, MT_BBP(AGC, 35), 0x08080808); @@ -326,15 +342,13 @@ void mt76x2_phy_update_channel_gain(struct mt76x02_dev *dev) dev->cal.agc_gain_adjust = 0; } else { mt76_wr(dev, MT_BBP(RXO, 18), 0xf000a991); - if (dev->mt76.chandef.width == NL80211_CHAN_WIDTH_80) - mt76_wr(dev, MT_BBP(AGC, 35), 0x10101014); - else - mt76_wr(dev, MT_BBP(AGC, 35), 0x11111116); - mt76_wr(dev, MT_BBP(AGC, 37), 0x2121262C); gain_delta = 0; dev->cal.agc_gain_adjust = low_gain_delta; } + mt76_wr(dev, MT_BBP(AGC, 35), agc_35); + mt76_wr(dev, MT_BBP(AGC, 37), agc_37); + dev->cal.agc_gain_cur[0] = gain[0] - gain_delta; dev->cal.agc_gain_cur[1] = gain[1] - gain_delta; mt76x2_phy_set_gain_val(dev); diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/usb.c b/drivers/net/wireless/mediatek/mt76/mt76x2/usb.c @@ -21,11 +21,10 @@ #include "mt76x2u.h" static const struct usb_device_id mt76x2u_device_table[] = { - { USB_DEVICE(0x0e8d, 0x7612) }, /* Alfa AWUS036ACM */ { USB_DEVICE(0x0b05, 0x1833) }, /* Asus USB-AC54 */ { USB_DEVICE(0x0b05, 0x17eb) }, /* Asus USB-AC55 */ { USB_DEVICE(0x0b05, 0x180b) }, /* Asus USB-N53 B1 */ - { USB_DEVICE(0x0e8d, 0x7612) }, /* Aukey USB-AC1200 */ + { USB_DEVICE(0x0e8d, 0x7612) }, /* Aukey USBAC1200 - Alfa AWUS036ACM */ { USB_DEVICE(0x057c, 0x8503) }, /* Avm FRITZ!WLAN AC860 */ { USB_DEVICE(0x7392, 0xb711) }, /* Edimax EW 7722 UAC */ { USB_DEVICE(0x0846, 0x9053) }, /* Netgear A6210 */ @@ -66,6 +65,10 @@ static int mt76x2u_probe(struct usb_interface *intf, mdev->rev = mt76_rr(dev, MT_ASIC_VERSION); dev_info(mdev->dev, "ASIC revision: %08x\n", mdev->rev); + if (!is_mt76x2(dev)) { + err = -ENODEV; + goto err; + } err = mt76x2u_register_device(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/usb_mac.c b/drivers/net/wireless/mediatek/mt76/mt76x2/usb_mac.c @@ -93,7 +93,6 @@ int mt76x2u_mac_reset(struct mt76x02_dev *dev) mt76_wr(dev, MT_TX_LINK_CFG, 0x1020); mt76_wr(dev, MT_AUTO_RSP_CFG, 0x13); mt76_wr(dev, 
MT_MAX_LEN_CFG, 0x2f00); - mt76_wr(dev, MT_TX_RTS_CFG, 0x92b20); mt76_wr(dev, MT_WMM_AIFSN, 0x2273); mt76_wr(dev, MT_WMM_CWMIN, 0x2344); diff --git a/drivers/net/wireless/mediatek/mt76/tx.c b/drivers/net/wireless/mediatek/mt76/tx.c @@ -289,8 +289,11 @@ mt76_tx(struct mt76_dev *dev, struct ieee80211_sta *sta, dev->queue_ops->tx_queue_skb(dev, q, skb, wcid, sta); dev->queue_ops->kick(dev, q); - if (q->queued > q->ndesc - 8) + if (q->queued > q->ndesc - 8 && !q->stopped) { ieee80211_stop_queue(dev->hw, skb_get_queue_mapping(skb)); + q->stopped = true; + } + spin_unlock_bh(&q->lock); } EXPORT_SYMBOL_GPL(mt76_tx); @@ -374,7 +377,10 @@ mt76_release_buffered_frames(struct ieee80211_hw *hw, struct ieee80211_sta *sta, if (last_skb) { mt76_queue_ps_skb(dev, sta, last_skb, true); dev->queue_ops->kick(dev, hwq); + } else { + ieee80211_sta_eosp(sta); } + spin_unlock_bh(&hwq->lock); } EXPORT_SYMBOL_GPL(mt76_release_buffered_frames); @@ -577,6 +583,9 @@ void mt76_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *txq) struct mt76_txq *mtxq = (struct mt76_txq *) txq->drv_priv; struct mt76_queue *hwq = mtxq->hwq; + if (!test_bit(MT76_STATE_RUNNING, &dev->state)) + return; + spin_lock_bh(&hwq->lock); if (list_empty(&mtxq->list)) list_add_tail(&mtxq->list, &hwq->swq); diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c @@ -655,7 +655,11 @@ static void mt76u_tx_tasklet(unsigned long data) spin_lock_bh(&q->lock); } mt76_txq_schedule(dev, q); - wake = i < IEEE80211_NUM_ACS && q->queued < q->ndesc - 8; + + wake = q->stopped && q->queued < q->ndesc - 8; + if (wake) + q->stopped = false; + if (!q->queued) wake_up(&dev->tx_wait); diff --git a/drivers/net/wireless/mediatek/mt7601u/usb.c b/drivers/net/wireless/mediatek/mt7601u/usb.c @@ -303,6 +303,10 @@ static int mt7601u_probe(struct usb_interface *usb_intf, mac_rev = mt7601u_rr(dev, MT_MAC_CSR0); dev_info(dev->dev, "ASIC revision: %08x MAC revision: %08x\n", asic_rev, mac_rev); + if 
((asic_rev >> 16) != 0x7601) { + ret = -ENODEV; + goto err; + } /* Note: vendor driver skips this check for MT7601U */ if (!(mt7601u_rr(dev, MT_EFUSE_CTRL) & MT_EFUSE_CTRL_SEL)) diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c @@ -1150,13 +1150,16 @@ static void qeth_notify_skbs(struct qeth_qdio_out_q *q, static void qeth_release_skbs(struct qeth_qdio_out_buffer *buf) { + struct sk_buff *skb; + /* release may never happen from within CQ tasklet scope */ WARN_ON_ONCE(atomic_read(&buf->state) == QETH_QDIO_BUF_IN_CQ); if (atomic_read(&buf->state) == QETH_QDIO_BUF_PENDING) qeth_notify_skbs(buf->q, buf, TX_NOTIFY_GENERALERROR); - __skb_queue_purge(&buf->skb_list); + while ((skb = __skb_dequeue(&buf->skb_list)) != NULL) + consume_skb(skb); } static void qeth_clear_output_buffer(struct qeth_qdio_out_q *queue, diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c @@ -629,8 +629,7 @@ static netdev_tx_t qeth_l2_hard_start_xmit(struct sk_buff *skb, } /* else fall through */ QETH_TXQ_STAT_INC(queue, tx_dropped); - QETH_TXQ_STAT_INC(queue, tx_errors); - dev_kfree_skb_any(skb); + kfree_skb(skb); netif_wake_queue(dev); return NETDEV_TX_OK; } @@ -645,6 +644,8 @@ static int qeth_l2_probe_device(struct ccwgroup_device *gdev) struct qeth_card *card = dev_get_drvdata(&gdev->dev); int rc; + qeth_l2_vnicc_set_defaults(card); + if (gdev->dev.type == &qeth_generic_devtype) { rc = qeth_l2_create_device_attributes(&gdev->dev); if (rc) @@ -652,8 +653,6 @@ static int qeth_l2_probe_device(struct ccwgroup_device *gdev) } hash_init(card->mac_htable); - card->info.hwtrap = 0; - qeth_l2_vnicc_set_defaults(card); return 0; } diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c @@ -2096,8 +2096,7 @@ static netdev_tx_t qeth_l3_hard_start_xmit(struct sk_buff *skb, tx_drop: QETH_TXQ_STAT_INC(queue, tx_dropped); - QETH_TXQ_STAT_INC(queue, tx_errors); - dev_kfree_skb_any(skb); + kfree_skb(skb); netif_wake_queue(dev); 
return NETDEV_TX_OK; } @@ -2253,14 +2252,15 @@ static int qeth_l3_probe_device(struct ccwgroup_device *gdev) struct qeth_card *card = dev_get_drvdata(&gdev->dev); int rc; + hash_init(card->ip_htable); + if (gdev->dev.type == &qeth_generic_devtype) { rc = qeth_l3_create_device_attributes(&gdev->dev); if (rc) return rc; } - hash_init(card->ip_htable); + hash_init(card->ip_mc_htable); - card->info.hwtrap = 0; return 0; } diff --git a/include/linux/atalk.h b/include/linux/atalk.h @@ -108,7 +108,7 @@ static __inline__ struct elapaarp *aarp_hdr(struct sk_buff *skb) #define AARP_RESOLVE_TIME (10 * HZ) extern struct datalink_proto *ddp_dl, *aarp_dl; -extern void aarp_proto_init(void); +extern int aarp_proto_init(void); /* Inter module exports */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h @@ -193,7 +193,6 @@ enum bpf_arg_type { ARG_PTR_TO_CTX, /* pointer to context */ ARG_ANYTHING, /* any (initialized) argument is ok */ - ARG_PTR_TO_SOCKET, /* pointer to bpf_sock */ ARG_PTR_TO_SPIN_LOCK, /* pointer to bpf_spin_lock */ ARG_PTR_TO_SOCK_COMMON, /* pointer to sock_common */ }; diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h @@ -66,6 +66,46 @@ struct bpf_reg_state { * same reference to the socket, to determine proper reference freeing. */ u32 id; + /* PTR_TO_SOCKET and PTR_TO_TCP_SOCK could be a ptr returned + * from a pointer-cast helper, bpf_sk_fullsock() and + * bpf_tcp_sock(). + * + * Consider the following where "sk" is a reference counted + * pointer returned from "sk = bpf_sk_lookup_tcp();": + * + * 1: sk = bpf_sk_lookup_tcp(); + * 2: if (!sk) { return 0; } + * 3: fullsock = bpf_sk_fullsock(sk); + * 4: if (!fullsock) { bpf_sk_release(sk); return 0; } + * 5: tp = bpf_tcp_sock(fullsock); + * 6: if (!tp) { bpf_sk_release(sk); return 0; } + * 7: bpf_sk_release(sk); + * 8: snd_cwnd = tp->snd_cwnd; // verifier will complain + * + * After bpf_sk_release(sk) at line 7, both "fullsock" ptr and + * "tp" ptr should be invalidated also. 
In order to do that, + * the reg holding "fullsock" and "sk" need to remember + * the original refcounted ptr id (i.e. sk_reg->id) in ref_obj_id + * such that the verifier can reset all regs which have + * ref_obj_id matching the sk_reg->id. + * + * sk_reg->ref_obj_id is set to sk_reg->id at line 1. + * sk_reg->id will stay as NULL-marking purpose only. + * After NULL-marking is done, sk_reg->id can be reset to 0. + * + * After "fullsock = bpf_sk_fullsock(sk);" at line 3, + * fullsock_reg->ref_obj_id is set to sk_reg->ref_obj_id. + * + * After "tp = bpf_tcp_sock(fullsock);" at line 5, + * tp_reg->ref_obj_id is set to fullsock_reg->ref_obj_id + * which is the same as sk_reg->ref_obj_id. + * + * From the verifier perspective, if sk, fullsock and tp + * are not NULL, they are the same ptr with different + * reg->type. In particular, bpf_sk_release(tp) is also + * allowed and has the same effect as bpf_sk_release(sk). + */ + u32 ref_obj_id; /* For scalar types (SCALAR_VALUE), this represents our knowledge of * the actual value. 
* For pointer types, this represents the variable part of the offset diff --git a/include/linux/brcmphy.h b/include/linux/brcmphy.h @@ -148,6 +148,22 @@ #define BCM_LED_SRC_OFF 0xe /* Tied high */ #define BCM_LED_SRC_ON 0xf /* Tied low */ +/* + * Broadcom Multicolor LED configurations (expansion register 4) + */ +#define BCM_EXP_MULTICOLOR (MII_BCM54XX_EXP_SEL_ER + 0x04) +#define BCM_LED_MULTICOLOR_IN_PHASE BIT(8) +#define BCM_LED_MULTICOLOR_LINK_ACT 0x0 +#define BCM_LED_MULTICOLOR_SPEED 0x1 +#define BCM_LED_MULTICOLOR_ACT_FLASH 0x2 +#define BCM_LED_MULTICOLOR_FDX 0x3 +#define BCM_LED_MULTICOLOR_OFF 0x4 +#define BCM_LED_MULTICOLOR_ON 0x5 +#define BCM_LED_MULTICOLOR_ALT 0x6 +#define BCM_LED_MULTICOLOR_FLASH 0x7 +#define BCM_LED_MULTICOLOR_LINK 0x8 +#define BCM_LED_MULTICOLOR_ACT 0x9 +#define BCM_LED_MULTICOLOR_PROGRAM 0xa /* * BCM5482: Shadow registers diff --git a/include/linux/net.h b/include/linux/net.h @@ -83,6 +83,12 @@ enum sock_type { #endif /* ARCH_HAS_SOCKET_TYPES */ +/** + * enum sock_shutdown_cmd - Shutdown types + * @SHUT_RD: shutdown receptions + * @SHUT_WR: shutdown transmissions + * @SHUT_RDWR: shutdown receptions/transmissions + */ enum sock_shutdown_cmd { SHUT_RD, SHUT_WR, diff --git a/include/linux/socket.h b/include/linux/socket.h @@ -26,7 +26,7 @@ typedef __kernel_sa_family_t sa_family_t; /* * 1003.1g requires sa_family_t and that sa_data is char. */ - + struct sockaddr { sa_family_t sa_family; /* address family, AF_xxx */ char sa_data[14]; /* 14 bytes of protocol address */ @@ -44,7 +44,7 @@ struct linger { * system, not 4.3. Thus msg_accrights(len) are now missing. They * belong in an obscure libc emulation or the bin. 
*/ - + struct msghdr { void *msg_name; /* ptr to socket address structure */ int msg_namelen; /* size of socket address structure */ @@ -54,7 +54,7 @@ struct msghdr { unsigned int msg_flags; /* flags on received message */ struct kiocb *msg_iocb; /* ptr to iocb for async requests */ }; - + struct user_msghdr { void __user *msg_name; /* ptr to socket address structure */ int msg_namelen; /* size of socket address structure */ @@ -122,7 +122,7 @@ struct cmsghdr { * inside range, given by msg->msg_controllen before using * ancillary object DATA. --ANK (980731) */ - + static inline struct cmsghdr * __cmsg_nxthdr(void *__ctl, __kernel_size_t __size, struct cmsghdr *__cmsg) { @@ -264,10 +264,10 @@ struct ucred { /* Maximum queue length specifiable by listen. */ #define SOMAXCONN 128 -/* Flags we can use with send/ and recv. +/* Flags we can use with send/ and recv. Added those for 1003.1g not all are supported yet */ - + #define MSG_OOB 1 #define MSG_PEEK 2 #define MSG_DONTROUTE 4 diff --git a/include/net/act_api.h b/include/net/act_api.h @@ -39,7 +39,7 @@ struct tc_action { struct gnet_stats_basic_cpu __percpu *cpu_bstats_hw; struct gnet_stats_queue __percpu *cpu_qstats; struct tc_cookie __rcu *act_cookie; - struct tcf_chain *goto_chain; + struct tcf_chain __rcu *goto_chain; }; #define tcf_index common.tcfa_index #define tcf_refcnt common.tcfa_refcnt @@ -90,7 +90,7 @@ struct tc_action_ops { int (*lookup)(struct net *net, struct tc_action **a, u32 index); int (*init)(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **act, int ovr, - int bind, bool rtnl_held, + int bind, bool rtnl_held, struct tcf_proto *tp, struct netlink_ext_ack *extack); int (*walk)(struct net *, struct sk_buff *, struct netlink_callback *, int, @@ -181,6 +181,11 @@ int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int); int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int); int tcf_action_copy_stats(struct sk_buff *, struct tc_action *, 
int); +int tcf_action_check_ctrlact(int action, struct tcf_proto *tp, + struct tcf_chain **handle, + struct netlink_ext_ack *newchain); +struct tcf_chain *tcf_action_set_ctrlact(struct tc_action *a, int action, + struct tcf_chain *newchain); #endif /* CONFIG_NET_CLS_ACT */ static inline void tcf_action_stats_update(struct tc_action *a, u64 bytes, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h @@ -378,6 +378,7 @@ struct tcf_chain { bool flushing; const struct tcf_proto_ops *tmplt_ops; void *tmplt_priv; + struct rcu_head rcu; }; struct tcf_block { diff --git a/include/net/sctp/checksum.h b/include/net/sctp/checksum.h @@ -61,7 +61,7 @@ static inline __wsum sctp_csum_combine(__wsum csum, __wsum csum2, static inline __le32 sctp_compute_cksum(const struct sk_buff *skb, unsigned int offset) { - struct sctphdr *sh = sctp_hdr(skb); + struct sctphdr *sh = (struct sctphdr *)(skb->data + offset); const struct skb_checksum_ops ops = { .update = sctp_csum_update, .combine = sctp_csum_combine, diff --git a/include/net/sock.h b/include/net/sock.h @@ -710,6 +710,12 @@ static inline void sk_add_node_rcu(struct sock *sk, struct hlist_head *list) hlist_add_head_rcu(&sk->sk_node, list); } +static inline void sk_add_node_tail_rcu(struct sock *sk, struct hlist_head *list) +{ + sock_hold(sk); + hlist_add_tail_rcu(&sk->sk_node, list); +} + static inline void __sk_nulls_add_node_rcu(struct sock *sk, struct hlist_nulls_head *list) { hlist_nulls_add_head_rcu(&sk->sk_nulls_node, list); diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h @@ -56,7 +56,7 @@ static inline bool is_tcf_gact_goto_chain(const struct tc_action *a) static inline u32 tcf_gact_goto_chain_index(const struct tc_action *a) { - return a->goto_chain->index; + return READ_ONCE(a->tcfa_action) & TC_ACT_EXT_VAL_MASK; } #endif /* __NET_TC_GACT_H */ diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h @@ -36,7 +36,6 @@ struct xdp_umem { u32 headroom; u32 chunk_size_nohr; struct 
user_struct *user; - struct pid *pid; unsigned long address; refcount_t users; struct work_struct work; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h @@ -502,16 +502,6 @@ union bpf_attr { * Return * 0 on success, or a negative error in case of failure. * - * int bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags) - * Description - * Push an element *value* in *map*. *flags* is one of: - * - * **BPF_EXIST** - * If the queue/stack is full, the oldest element is removed to - * make room for this. - * Return - * 0 on success, or a negative error in case of failure. - * * int bpf_probe_read(void *dst, u32 size, const void *src) * Description * For tracing programs, safely attempt to read *size* bytes from @@ -1435,14 +1425,14 @@ union bpf_attr { * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx) * Description * Equivalent to bpf_get_socket_cookie() helper that accepts - * *skb*, but gets socket from **struct bpf_sock_addr** contex. + * *skb*, but gets socket from **struct bpf_sock_addr** context. * Return * A 8-byte long non-decreasing number. * * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx) * Description * Equivalent to bpf_get_socket_cookie() helper that accepts - * *skb*, but gets socket from **struct bpf_sock_ops** contex. + * *skb*, but gets socket from **struct bpf_sock_ops** context. * Return * A 8-byte long non-decreasing number. * @@ -2098,52 +2088,52 @@ union bpf_attr { * Return * 0 on success, or a negative error in case of failure. * - * int bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle) + * int bpf_rc_repeat(void *ctx) * Description * This helper is used in programs implementing IR decoding, to - * report a successfully decoded key press with *scancode*, - * *toggle* value in the given *protocol*. The scancode will be - * translated to a keycode using the rc keymap, and reported as - * an input key down event. After a period a key up event is - * generated. 
This period can be extended by calling either - * **bpf_rc_keydown**\ () again with the same values, or calling - * **bpf_rc_repeat**\ (). + * report a successfully decoded repeat key message. This delays + * the generation of a key up event for previously generated + * key down event. * - * Some protocols include a toggle bit, in case the button was - * released and pressed again between consecutive scancodes. + * Some IR protocols like NEC have a special IR message for + * repeating last button, for when a button is held down. * * The *ctx* should point to the lirc sample as passed into * the program. * - * The *protocol* is the decoded protocol number (see - * **enum rc_proto** for some predefined values). - * * This helper is only available is the kernel was compiled with * the **CONFIG_BPF_LIRC_MODE2** configuration option set to * "**y**". * Return * 0 * - * int bpf_rc_repeat(void *ctx) + * int bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle) * Description * This helper is used in programs implementing IR decoding, to - * report a successfully decoded repeat key message. This delays - * the generation of a key up event for previously generated - * key down event. + * report a successfully decoded key press with *scancode*, + * *toggle* value in the given *protocol*. The scancode will be + * translated to a keycode using the rc keymap, and reported as + * an input key down event. After a period a key up event is + * generated. This period can be extended by calling either + * **bpf_rc_keydown**\ () again with the same values, or calling + * **bpf_rc_repeat**\ (). * - * Some IR protocols like NEC have a special IR message for - * repeating last button, for when a button is held down. + * Some protocols include a toggle bit, in case the button was + * released and pressed again between consecutive scancodes. * * The *ctx* should point to the lirc sample as passed into * the program. 
* + * The *protocol* is the decoded protocol number (see + * **enum rc_proto** for some predefined values). + * * This helper is only available is the kernel was compiled with * the **CONFIG_BPF_LIRC_MODE2** configuration option set to * "**y**". * Return * 0 * - * uint64_t bpf_skb_cgroup_id(struct sk_buff *skb) + * u64 bpf_skb_cgroup_id(struct sk_buff *skb) * Description * Return the cgroup v2 id of the socket associated with the *skb*. * This is roughly similar to the **bpf_get_cgroup_classid**\ () @@ -2159,30 +2149,12 @@ union bpf_attr { * Return * The id is returned or 0 in case the id could not be retrieved. * - * u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level) - * Description - * Return id of cgroup v2 that is ancestor of cgroup associated - * with the *skb* at the *ancestor_level*. The root cgroup is at - * *ancestor_level* zero and each step down the hierarchy - * increments the level. If *ancestor_level* == level of cgroup - * associated with *skb*, then return value will be same as that - * of **bpf_skb_cgroup_id**\ (). - * - * The helper is useful to implement policies based on cgroups - * that are upper in hierarchy than immediate cgroup associated - * with *skb*. - * - * The format of returned id and helper limitations are same as in - * **bpf_skb_cgroup_id**\ (). - * Return - * The id is returned or 0 in case the id could not be retrieved. - * * u64 bpf_get_current_cgroup_id(void) * Return * A 64-bit integer containing the current cgroup id based * on the cgroup within which the current task is running. * - * void* get_local_storage(void *map, u64 flags) + * void *bpf_get_local_storage(void *map, u64 flags) * Description * Get the pointer to the local storage area. * The type and the size of the local storage is defined @@ -2209,6 +2181,24 @@ union bpf_attr { * Return * 0 on success, or a negative error in case of failure. 
* + * u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level) + * Description + * Return id of cgroup v2 that is ancestor of cgroup associated + * with the *skb* at the *ancestor_level*. The root cgroup is at + * *ancestor_level* zero and each step down the hierarchy + * increments the level. If *ancestor_level* == level of cgroup + * associated with *skb*, then return value will be same as that + * of **bpf_skb_cgroup_id**\ (). + * + * The helper is useful to implement policies based on cgroups + * that are upper in hierarchy than immediate cgroup associated + * with *skb*. + * + * The format of returned id and helper limitations are same as in + * **bpf_skb_cgroup_id**\ (). + * Return + * The id is returned or 0 in case the id could not be retrieved. + * * struct bpf_sock *bpf_sk_lookup_tcp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u64 netns, u64 flags) * Description * Look for TCP socket matching *tuple*, optionally in a child @@ -2289,6 +2279,16 @@ union bpf_attr { * Return * 0 on success, or a negative error in case of failure. * + * int bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags) + * Description + * Push an element *value* in *map*. *flags* is one of: + * + * **BPF_EXIST** + * If the queue/stack is full, the oldest element is + * removed to make room for this. + * Return + * 0 on success, or a negative error in case of failure. + * * int bpf_map_pop_elem(struct bpf_map *map, void *value) * Description * Pop an element from *map*. @@ -2343,29 +2343,94 @@ union bpf_attr { * Return * 0 * + * int bpf_spin_lock(struct bpf_spin_lock *lock) + * Description + * Acquire a spinlock represented by the pointer *lock*, which is + * stored as part of a value of a map. Taking the lock allows to + * safely update the rest of the fields in that value. The + * spinlock can (and must) later be released with a call to + * **bpf_spin_unlock**\ (\ *lock*\ ). 
+ * + * Spinlocks in BPF programs come with a number of restrictions + * and constraints: + * + * * **bpf_spin_lock** objects are only allowed inside maps of + * types **BPF_MAP_TYPE_HASH** and **BPF_MAP_TYPE_ARRAY** (this + * list could be extended in the future). + * * BTF description of the map is mandatory. + * * The BPF program can take ONE lock at a time, since taking two + * or more could cause dead locks. + * * Only one **struct bpf_spin_lock** is allowed per map element. + * * When the lock is taken, calls (either BPF to BPF or helpers) + * are not allowed. + * * The **BPF_LD_ABS** and **BPF_LD_IND** instructions are not + * allowed inside a spinlock-ed region. + * * The BPF program MUST call **bpf_spin_unlock**\ () to release + * the lock, on all execution paths, before it returns. + * * The BPF program can access **struct bpf_spin_lock** only via + * the **bpf_spin_lock**\ () and **bpf_spin_unlock**\ () + * helpers. Loading or storing data into the **struct + * bpf_spin_lock** *lock*\ **;** field of a map is not allowed. + * * To use the **bpf_spin_lock**\ () helper, the BTF description + * of the map value must be a struct and have **struct + * bpf_spin_lock** *anyname*\ **;** field at the top level. + * Nested lock inside another struct is not allowed. + * * The **struct bpf_spin_lock** *lock* field in a map value must + * be aligned on a multiple of 4 bytes in that value. + * * Syscall with command **BPF_MAP_LOOKUP_ELEM** does not copy + * the **bpf_spin_lock** field to user space. + * * Syscall with command **BPF_MAP_UPDATE_ELEM**, or update from + * a BPF program, do not update the **bpf_spin_lock** field. + * * **bpf_spin_lock** cannot be on the stack or inside a + * networking packet (it can only be inside of a map values). + * * **bpf_spin_lock** is available to root only. + * * Tracing programs and socket filter programs cannot use + * **bpf_spin_lock**\ () due to insufficient preemption checks + * (but this may change in the future). 
+ * * **bpf_spin_lock** is not allowed in inner maps of map-in-map. + * Return + * 0 + * + * int bpf_spin_unlock(struct bpf_spin_lock *lock) + * Description + * Release the *lock* previously locked by a call to + * **bpf_spin_lock**\ (\ *lock*\ ). + * Return + * 0 + * * struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk) * Description * This helper gets a **struct bpf_sock** pointer such - * that all the fields in bpf_sock can be accessed. + * that all the fields in this **bpf_sock** can be accessed. * Return - * A **struct bpf_sock** pointer on success, or NULL in + * A **struct bpf_sock** pointer on success, or **NULL** in * case of failure. * * struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk) * Description * This helper gets a **struct bpf_tcp_sock** pointer from a * **struct bpf_sock** pointer. - * * Return - * A **struct bpf_tcp_sock** pointer on success, or NULL in + * A **struct bpf_tcp_sock** pointer on success, or **NULL** in * case of failure. * * int bpf_skb_ecn_set_ce(struct sk_buf *skb) - * Description - * Sets ECN of IP header to ce (congestion encountered) if - * current value is ect (ECN capable). Works with IPv6 and IPv4. - * Return - * 1 if set, 0 if not set. + * Description + * Set ECN (Explicit Congestion Notification) field of IP header + * to **CE** (Congestion Encountered) if current value is **ECT** + * (ECN Capable Transport). Otherwise, do nothing. Works with IPv6 + * and IPv4. + * Return + * 1 if the **CE** flag is set (either by the current helper call + * or because it was already present), 0 if it is not set. + * + * struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk) + * Description + * Return a **struct bpf_sock** pointer in **TCP_LISTEN** state. + * **bpf_sk_release**\ () is unnecessary and not allowed. + * Return + * A **struct bpf_sock** pointer on success, or **NULL** in + * case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -2465,7 +2530,8 @@ union bpf_attr { FN(spin_unlock), \ FN(sk_fullsock), \ FN(tcp_sock), \ - FN(skb_ecn_set_ce), + FN(skb_ecn_set_ce), \ + FN(get_listener_sock), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c @@ -136,21 +136,29 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr) void *bpf_map_area_alloc(size_t size, int numa_node) { - /* We definitely need __GFP_NORETRY, so OOM killer doesn't - * trigger under memory pressure as we really just want to - * fail instead. + /* We really just want to fail instead of triggering OOM killer + * under memory pressure, therefore we set __GFP_NORETRY to kmalloc, + * which is used for lower order allocation requests. + * + * It has been observed that higher order allocation requests done by + * vmalloc with __GFP_NORETRY being set might fail due to not trying + * to reclaim memory from the page cache, thus we set + * __GFP_RETRY_MAYFAIL to avoid such situations. 
*/ - const gfp_t flags = __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO; + + const gfp_t flags = __GFP_NOWARN | __GFP_ZERO; void *area; if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { - area = kmalloc_node(size, GFP_USER | flags, numa_node); + area = kmalloc_node(size, GFP_USER | __GFP_NORETRY | flags, + numa_node); if (area != NULL) return area; } - return __vmalloc_node_flags_caller(size, numa_node, GFP_KERNEL | flags, - __builtin_return_address(0)); + return __vmalloc_node_flags_caller(size, numa_node, + GFP_KERNEL | __GFP_RETRY_MAYFAIL | + flags, __builtin_return_address(0)); } void bpf_map_area_free(void *area) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c @@ -212,7 +212,7 @@ struct bpf_call_arg_meta { int access_size; s64 msize_smax_value; u64 msize_umax_value; - int ptr_id; + int ref_obj_id; int func_id; }; @@ -346,35 +346,23 @@ static bool reg_type_may_be_null(enum bpf_reg_type type) type == PTR_TO_TCP_SOCK_OR_NULL; } -static bool type_is_refcounted(enum bpf_reg_type type) -{ - return type == PTR_TO_SOCKET; -} - -static bool type_is_refcounted_or_null(enum bpf_reg_type type) -{ - return type == PTR_TO_SOCKET || type == PTR_TO_SOCKET_OR_NULL; -} - -static bool reg_is_refcounted(const struct bpf_reg_state *reg) -{ - return type_is_refcounted(reg->type); -} - static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) { return reg->type == PTR_TO_MAP_VALUE && map_value_has_spin_lock(reg->map_ptr); } -static bool reg_is_refcounted_or_null(const struct bpf_reg_state *reg) +static bool reg_type_may_be_refcounted_or_null(enum bpf_reg_type type) { - return type_is_refcounted_or_null(reg->type); + return type == PTR_TO_SOCKET || + type == PTR_TO_SOCKET_OR_NULL || + type == PTR_TO_TCP_SOCK || + type == PTR_TO_TCP_SOCK_OR_NULL; } -static bool arg_type_is_refcounted(enum bpf_arg_type type) +static bool arg_type_may_be_refcounted(enum bpf_arg_type type) { - return type == ARG_PTR_TO_SOCKET; + return type == ARG_PTR_TO_SOCK_COMMON; } /* 
Determine whether the function releases some resources allocated by another @@ -392,6 +380,12 @@ static bool is_acquire_function(enum bpf_func_id func_id) func_id == BPF_FUNC_sk_lookup_udp; } +static bool is_ptr_cast_function(enum bpf_func_id func_id) +{ + return func_id == BPF_FUNC_tcp_sock || + func_id == BPF_FUNC_sk_fullsock; +} + /* string representation of 'enum bpf_reg_type' */ static const char * const reg_type_str[] = { [NOT_INIT] = "?", @@ -466,6 +460,8 @@ static void print_verifier_state(struct bpf_verifier_env *env, verbose(env, ",call_%d", func(env, reg)->callsite); } else { verbose(env, "(id=%d", reg->id); + if (reg_type_may_be_refcounted_or_null(t)) + verbose(env, ",ref_obj_id=%d", reg->ref_obj_id); if (t != SCALAR_VALUE) verbose(env, ",off=%d", reg->off); if (type_is_pkt_pointer(t)) @@ -2414,16 +2410,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno, /* Any sk pointer can be ARG_PTR_TO_SOCK_COMMON */ if (!type_is_sk_pointer(type)) goto err_type; - } else if (arg_type == ARG_PTR_TO_SOCKET) { - expected_type = PTR_TO_SOCKET; - if (type != expected_type) - goto err_type; - if (meta->ptr_id || !reg->id) { - verbose(env, "verifier internal error: mismatched references meta=%d, reg=%d\n", - meta->ptr_id, reg->id); - return -EFAULT; + if (reg->ref_obj_id) { + if (meta->ref_obj_id) { + verbose(env, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n", + regno, reg->ref_obj_id, + meta->ref_obj_id); + return -EFAULT; + } + meta->ref_obj_id = reg->ref_obj_id; } - meta->ptr_id = reg->id; } else if (arg_type == ARG_PTR_TO_SPIN_LOCK) { if (meta->func_id == BPF_FUNC_spin_lock) { if (process_spin_lock(env, regno, true)) @@ -2740,32 +2735,38 @@ static bool check_arg_pair_ok(const struct bpf_func_proto *fn) return true; } -static bool check_refcount_ok(const struct bpf_func_proto *fn) +static bool check_refcount_ok(const struct bpf_func_proto *fn, int func_id) { int count = 0; - if (arg_type_is_refcounted(fn->arg1_type)) + if 
(arg_type_may_be_refcounted(fn->arg1_type)) count++; - if (arg_type_is_refcounted(fn->arg2_type)) + if (arg_type_may_be_refcounted(fn->arg2_type)) count++; - if (arg_type_is_refcounted(fn->arg3_type)) + if (arg_type_may_be_refcounted(fn->arg3_type)) count++; - if (arg_type_is_refcounted(fn->arg4_type)) + if (arg_type_may_be_refcounted(fn->arg4_type)) count++; - if (arg_type_is_refcounted(fn->arg5_type)) + if (arg_type_may_be_refcounted(fn->arg5_type)) count++; + /* A reference acquiring function cannot acquire + * another refcounted ptr. + */ + if (is_acquire_function(func_id) && count) + return false; + /* We only support one arg being unreferenced at the moment, * which is sufficient for the helper functions we have right now. */ return count <= 1; } -static int check_func_proto(const struct bpf_func_proto *fn) +static int check_func_proto(const struct bpf_func_proto *fn, int func_id) { return check_raw_mode_ok(fn) && check_arg_pair_ok(fn) && - check_refcount_ok(fn) ? 0 : -EINVAL; + check_refcount_ok(fn, func_id) ? 0 : -EINVAL; } /* Packet data might have moved, any old PTR_TO_PACKET[_META,_END] @@ -2799,19 +2800,20 @@ static void clear_all_pkt_pointers(struct bpf_verifier_env *env) } static void release_reg_references(struct bpf_verifier_env *env, - struct bpf_func_state *state, int id) + struct bpf_func_state *state, + int ref_obj_id) { struct bpf_reg_state *regs = state->regs, *reg; int i; for (i = 0; i < MAX_BPF_REG; i++) - if (regs[i].id == id) + if (regs[i].ref_obj_id == ref_obj_id) mark_reg_unknown(env, regs, i); bpf_for_each_spilled_reg(i, state, reg) { if (!reg) continue; - if (reg_is_refcounted(reg) && reg->id == id) + if (reg->ref_obj_id == ref_obj_id) __mark_reg_unknown(reg); } } @@ -2820,15 +2822,20 @@ static void release_reg_references(struct bpf_verifier_env *env, * resources. Identify all copies of the same pointer and clear the reference. 
*/ static int release_reference(struct bpf_verifier_env *env, - struct bpf_call_arg_meta *meta) + int ref_obj_id) { struct bpf_verifier_state *vstate = env->cur_state; + int err; int i; + err = release_reference_state(cur_func(env), ref_obj_id); + if (err) + return err; + for (i = 0; i <= vstate->curframe; i++) - release_reg_references(env, vstate->frame[i], meta->ptr_id); + release_reg_references(env, vstate->frame[i], ref_obj_id); - return release_reference_state(cur_func(env), meta->ptr_id); + return 0; } static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, @@ -3047,7 +3054,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn memset(&meta, 0, sizeof(meta)); meta.pkt_access = fn->pkt_access; - err = check_func_proto(fn); + err = check_func_proto(fn, func_id); if (err) { verbose(env, "kernel subsystem misconfigured func %s#%d\n", func_id_name(func_id), func_id); @@ -3093,7 +3100,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn return err; } } else if (is_release_function(func_id)) { - err = release_reference(env, &meta); + err = release_reference(env, meta.ref_obj_id); if (err) { verbose(env, "func %s#%d reference has not been acquired before\n", func_id_name(func_id), func_id); @@ -3154,8 +3161,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn if (id < 0) return id; - /* For release_reference() */ + /* For mark_ptr_or_null_reg() */ regs[BPF_REG_0].id = id; + /* For release_reference() */ + regs[BPF_REG_0].ref_obj_id = id; } else { /* For mark_ptr_or_null_reg() */ regs[BPF_REG_0].id = ++env->id_gen; @@ -3170,6 +3179,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn return -EINVAL; } + if (is_ptr_cast_function(func_id)) + /* For release_reference() */ + regs[BPF_REG_0].ref_obj_id = meta.ref_obj_id; + do_refine_retval_range(regs, fn->ret_type, func_id, &meta); err = 
check_map_func_compatibility(env, meta.map_ptr, func_id); @@ -3368,7 +3381,7 @@ do_sim: *dst_reg = *ptr_reg; } ret = push_stack(env, env->insn_idx + 1, env->insn_idx, true); - if (!ptr_is_dst_reg) + if (!ptr_is_dst_reg && ret) *dst_reg = tmp; return !ret ? -EFAULT : 0; } @@ -4665,11 +4678,19 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state, } else if (reg->type == PTR_TO_TCP_SOCK_OR_NULL) { reg->type = PTR_TO_TCP_SOCK; } - if (is_null || !(reg_is_refcounted(reg) || - reg_may_point_to_spin_lock(reg))) { - /* We don't need id from this point onwards anymore, - * thus we should better reset it, so that state - * pruning has chances to take effect. + if (is_null) { + /* We don't need id and ref_obj_id from this point + * onwards anymore, thus we should better reset it, + * so that state pruning has chances to take effect. + */ + reg->id = 0; + reg->ref_obj_id = 0; + } else if (!reg_may_point_to_spin_lock(reg)) { + /* For not-NULL ptr, reg->ref_obj_id will be reset + * in release_reg_references(). + * + * reg->id is still used by spin_lock ptr. Other + * than spin_lock ptr type, reg->id can be reset. */ reg->id = 0; } @@ -4684,11 +4705,16 @@ static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno, { struct bpf_func_state *state = vstate->frame[vstate->curframe]; struct bpf_reg_state *reg, *regs = state->regs; + u32 ref_obj_id = regs[regno].ref_obj_id; u32 id = regs[regno].id; int i, j; - if (reg_is_refcounted_or_null(&regs[regno]) && is_null) - release_reference_state(state, id); + if (ref_obj_id && ref_obj_id == id && is_null) + /* regs[regno] is in the " == NULL" branch. + * No one could have freed the reference state before + * doing the NULL check. + */ + WARN_ON_ONCE(release_reference_state(state, id)); for (i = 0; i < MAX_BPF_REG; i++) mark_ptr_or_null_reg(state, &regs[i], id, is_null); @@ -6052,15 +6078,17 @@ static int propagate_liveness(struct bpf_verifier_env *env, } /* Propagate read liveness of registers... 
*/ BUILD_BUG_ON(BPF_REG_FP + 1 != MAX_BPF_REG); - /* We don't need to worry about FP liveness because it's read-only */ - for (i = 0; i < BPF_REG_FP; i++) { - if (vparent->frame[vparent->curframe]->regs[i].live & REG_LIVE_READ) - continue; - if (vstate->frame[vstate->curframe]->regs[i].live & REG_LIVE_READ) { - err = mark_reg_read(env, &vstate->frame[vstate->curframe]->regs[i], - &vparent->frame[vstate->curframe]->regs[i]); - if (err) - return err; + for (frame = 0; frame <= vstate->curframe; frame++) { + /* We don't need to worry about FP liveness, it's read-only */ + for (i = frame < vstate->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) { + if (vparent->frame[frame]->regs[i].live & REG_LIVE_READ) + continue; + if (vstate->frame[frame]->regs[i].live & REG_LIVE_READ) { + err = mark_reg_read(env, &vstate->frame[frame]->regs[i], + &vparent->frame[frame]->regs[i]); + if (err) + return err; + } } } diff --git a/lib/rhashtable.c b/lib/rhashtable.c @@ -416,8 +416,12 @@ static void rht_deferred_worker(struct work_struct *work) else if (tbl->nest) err = rhashtable_rehash_alloc(ht, tbl, tbl->size); - if (!err) - err = rhashtable_rehash_table(ht); + if (!err || err == -EEXIST) { + int nerr; + + nerr = rhashtable_rehash_table(ht); + err = err ?: nerr; + } mutex_unlock(&ht->mutex); diff --git a/net/appletalk/aarp.c b/net/appletalk/aarp.c @@ -879,15 +879,24 @@ static struct notifier_block aarp_notifier = { static unsigned char aarp_snap_id[] = { 0x00, 0x00, 0x00, 0x80, 0xF3 }; -void __init aarp_proto_init(void) +int __init aarp_proto_init(void) { + int rc; + aarp_dl = register_snap_client(aarp_snap_id, aarp_rcv); - if (!aarp_dl) + if (!aarp_dl) { printk(KERN_CRIT "Unable to register AARP with SNAP.\n"); + return -ENOMEM; + } timer_setup(&aarp_timer, aarp_expire_timeout, 0); aarp_timer.expires = jiffies + sysctl_aarp_expiry_time; add_timer(&aarp_timer); - register_netdevice_notifier(&aarp_notifier); + rc = register_netdevice_notifier(&aarp_notifier); + if (rc) { + 
del_timer_sync(&aarp_timer); + unregister_snap_client(aarp_dl); + } + return rc; } /* Remove the AARP entries associated with a device. */ diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c @@ -1904,9 +1904,6 @@ static unsigned char ddp_snap_id[] = { 0x08, 0x00, 0x07, 0x80, 0x9B }; EXPORT_SYMBOL(atrtr_get_dev); EXPORT_SYMBOL(atalk_find_dev_addr); -static const char atalk_err_snap[] __initconst = - KERN_CRIT "Unable to register DDP with SNAP.\n"; - /* Called by proto.c on kernel start up */ static int __init atalk_init(void) { @@ -1921,17 +1918,22 @@ static int __init atalk_init(void) goto out_proto; ddp_dl = register_snap_client(ddp_snap_id, atalk_rcv); - if (!ddp_dl) - printk(atalk_err_snap); + if (!ddp_dl) { + pr_crit("Unable to register DDP with SNAP.\n"); + goto out_sock; + } dev_add_pack(&ltalk_packet_type); dev_add_pack(&ppptalk_packet_type); rc = register_netdevice_notifier(&ddp_notifier); if (rc) - goto out_sock; + goto out_snap; + + rc = aarp_proto_init(); + if (rc) + goto out_dev; - aarp_proto_init(); rc = atalk_proc_init(); if (rc) goto out_aarp; @@ -1945,11 +1947,13 @@ out_proc: atalk_proc_exit(); out_aarp: aarp_cleanup_module(); +out_dev: unregister_netdevice_notifier(&ddp_notifier); -out_sock: +out_snap: dev_remove_pack(&ppptalk_packet_type); dev_remove_pack(&ltalk_packet_type); unregister_snap_client(ddp_dl); +out_sock: sock_unregister(PF_APPLETALK); out_proto: proto_unregister(&ddp_proto); diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c @@ -502,6 +502,7 @@ static unsigned int br_nf_pre_routing(void *priv, nf_bridge->ipv4_daddr = ip_hdr(skb)->daddr; skb->protocol = htons(ETH_P_IP); + skb->transport_header = skb->network_header + ip_hdr(skb)->ihl * 4; NF_HOOK(NFPROTO_IPV4, NF_INET_PRE_ROUTING, state->net, state->sk, skb, skb->dev, NULL, diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c @@ -235,6 +235,8 @@ unsigned int br_nf_pre_routing_ipv6(void *priv, nf_bridge->ipv6_daddr = 
ipv6_hdr(skb)->daddr; skb->protocol = htons(ETH_P_IPV6); + skb->transport_header = skb->network_header + sizeof(struct ipv6hdr); + NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING, state->net, state->sk, skb, skb->dev, NULL, br_nf_pre_routing_finish_ipv6); diff --git a/net/core/devlink.c b/net/core/devlink.c @@ -3897,6 +3897,11 @@ static int devlink_nl_cmd_info_get_dumpit(struct sk_buff *msg, continue; } + if (!devlink->ops->info_get) { + idx++; + continue; + } + mutex_lock(&devlink->lock); err = devlink_nl_info_fill(msg, devlink, DEVLINK_CMD_INFO_GET, NETLINK_CB(cb->skb).portid, diff --git a/net/core/filter.c b/net/core/filter.c @@ -1796,8 +1796,6 @@ static const struct bpf_func_proto bpf_skb_pull_data_proto = { BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk) { - sk = sk_to_full_sk(sk); - return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL; } @@ -5266,7 +5264,7 @@ static const struct bpf_func_proto bpf_sk_release_proto = { .func = bpf_sk_release, .gpl_only = false, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_SOCKET, + .arg1_type = ARG_PTR_TO_SOCK_COMMON, }; BPF_CALL_5(bpf_xdp_sk_lookup_udp, struct xdp_buff *, ctx, @@ -5407,8 +5405,6 @@ u32 bpf_tcp_sock_convert_ctx_access(enum bpf_access_type type, BPF_CALL_1(bpf_tcp_sock, struct sock *, sk) { - sk = sk_to_full_sk(sk); - if (sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP) return (unsigned long)sk; @@ -5422,6 +5418,23 @@ static const struct bpf_func_proto bpf_tcp_sock_proto = { .arg1_type = ARG_PTR_TO_SOCK_COMMON, }; +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk) +{ + sk = sk_to_full_sk(sk); + + if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE)) + return (unsigned long)sk; + + return (unsigned long)NULL; +} + +static const struct bpf_func_proto bpf_get_listener_sock_proto = { + .func = bpf_get_listener_sock, + .gpl_only = false, + .ret_type = RET_PTR_TO_SOCKET_OR_NULL, + .arg1_type = ARG_PTR_TO_SOCK_COMMON, +}; + BPF_CALL_1(bpf_skb_ecn_set_ce, struct sk_buff *, skb) { unsigned int 
iphdr_len; @@ -5607,6 +5620,8 @@ cg_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) #ifdef CONFIG_INET case BPF_FUNC_tcp_sock: return &bpf_tcp_sock_proto; + case BPF_FUNC_get_listener_sock: + return &bpf_get_listener_sock_proto; case BPF_FUNC_skb_ecn_set_ce: return &bpf_skb_ecn_set_ce_proto; #endif @@ -5702,6 +5717,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_sk_release_proto; case BPF_FUNC_tcp_sock: return &bpf_tcp_sock_proto; + case BPF_FUNC_get_listener_sock: + return &bpf_get_listener_sock_proto; #endif default: return bpf_base_func_proto(func_id); diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c @@ -928,6 +928,8 @@ static int rx_queue_add_kobject(struct net_device *dev, int index) if (error) return error; + dev_hold(queue->dev); + if (dev->sysfs_rx_queue_group) { error = sysfs_create_group(kobj, dev->sysfs_rx_queue_group); if (error) { @@ -937,7 +939,6 @@ static int rx_queue_add_kobject(struct net_device *dev, int index) } kobject_uevent(kobj, KOBJ_ADD); - dev_hold(queue->dev); return error; } @@ -1464,6 +1465,8 @@ static int netdev_queue_add_kobject(struct net_device *dev, int index) if (error) return error; + dev_hold(queue->dev); + #ifdef CONFIG_BQL error = sysfs_create_group(kobj, &dql_group); if (error) { @@ -1473,7 +1476,6 @@ static int netdev_queue_add_kobject(struct net_device *dev, int index) #endif kobject_uevent(kobj, KOBJ_ADD); - dev_hold(queue->dev); return 0; } @@ -1745,16 +1747,20 @@ int netdev_register_kobject(struct net_device *ndev) error = device_add(dev); if (error) - return error; + goto error_put_device; error = register_queue_kobjects(ndev); - if (error) { - device_del(dev); - return error; - } + if (error) + goto error_device_del; pm_runtime_set_memalloc_noio(dev, true); + return 0; + +error_device_del: + device_del(dev); +error_put_device: + put_device(dev); return error; } diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c @@ -436,8 +436,8 @@ static struct sock 
*dccp_v6_request_recv_sock(const struct sock *sk, newnp->ipv6_mc_list = NULL; newnp->ipv6_ac_list = NULL; newnp->ipv6_fl_list = NULL; - newnp->mcast_oif = inet6_iif(skb); - newnp->mcast_hops = ipv6_hdr(skb)->hop_limit; + newnp->mcast_oif = inet_iif(skb); + newnp->mcast_hops = ip_hdr(skb)->ttl; /* * No need to charge this sock to the relevant IPv6 refcnt debug socks count diff --git a/net/ipv6/netfilter/ip6t_srh.c b/net/ipv6/netfilter/ip6t_srh.c @@ -210,6 +210,8 @@ static bool srh1_mt6(const struct sk_buff *skb, struct xt_action_param *par) psidoff = srhoff + sizeof(struct ipv6_sr_hdr) + ((srh->segments_left + 1) * sizeof(struct in6_addr)); psid = skb_header_pointer(skb, psidoff, sizeof(_psid), &_psid); + if (!psid) + return false; if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_PSID, ipv6_masked_addr_cmp(psid, &srhinfo->psid_msk, &srhinfo->psid_addr))) @@ -223,6 +225,8 @@ static bool srh1_mt6(const struct sk_buff *skb, struct xt_action_param *par) nsidoff = srhoff + sizeof(struct ipv6_sr_hdr) + ((srh->segments_left - 1) * sizeof(struct in6_addr)); nsid = skb_header_pointer(skb, nsidoff, sizeof(_nsid), &_nsid); + if (!nsid) + return false; if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_NSID, ipv6_masked_addr_cmp(nsid, &srhinfo->nsid_msk, &srhinfo->nsid_addr))) @@ -233,6 +237,8 @@ static bool srh1_mt6(const struct sk_buff *skb, struct xt_action_param *par) if (srhinfo->mt_flags & IP6T_SRH_LSID) { lsidoff = srhoff + sizeof(struct ipv6_sr_hdr); lsid = skb_header_pointer(skb, lsidoff, sizeof(_lsid), &_lsid); + if (!lsid) + return false; if (NF_SRH_INVF(srhinfo, IP6T_SRH_INV_LSID, ipv6_masked_addr_cmp(lsid, &srhinfo->lsid_msk, &srhinfo->lsid_addr))) diff --git a/net/ipv6/route.c b/net/ipv6/route.c @@ -1040,14 +1040,20 @@ static struct rt6_info *ip6_create_rt_rcu(struct fib6_info *rt) struct rt6_info *nrt; if (!fib6_info_hold_safe(rt)) - return NULL; + goto fallback; nrt = ip6_dst_alloc(dev_net(dev), dev, flags); - if (nrt) - ip6_rt_copy_init(nrt, rt); - else + if (!nrt) { 
fib6_info_release(rt); + goto fallback; + } + ip6_rt_copy_init(nrt, rt); + return nrt; + +fallback: + nrt = dev_net(dev)->ipv6.ip6_null_entry; + dst_hold(&nrt->dst); return nrt; } @@ -1096,10 +1102,6 @@ restart: dst_hold(&rt->dst); } else { rt = ip6_create_rt_rcu(f6i); - if (!rt) { - rt = net->ipv6.ip6_null_entry; - dst_hold(&rt->dst); - } } rcu_read_unlock(); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c @@ -1110,11 +1110,11 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * newnp->ipv6_fl_list = NULL; newnp->pktoptions = NULL; newnp->opt = NULL; - newnp->mcast_oif = tcp_v6_iif(skb); - newnp->mcast_hops = ipv6_hdr(skb)->hop_limit; - newnp->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(skb)); + newnp->mcast_oif = inet_iif(skb); + newnp->mcast_hops = ip_hdr(skb)->ttl; + newnp->rcv_flowinfo = 0; if (np->repflow) - newnp->flow_label = ip6_flowlabel(ipv6_hdr(skb)); + newnp->flow_label = 0; /* * No need to charge this sock to the relevant IPv6 refcnt debug socks count diff --git a/net/mpls/mpls_iptunnel.c b/net/mpls/mpls_iptunnel.c @@ -140,9 +140,15 @@ static int mpls_xmit(struct sk_buff *skb) if (rt) err = neigh_xmit(NEIGH_ARP_TABLE, out_dev, &rt->rt_gateway, skb); - else if (rt6) - err = neigh_xmit(NEIGH_ND_TABLE, out_dev, &rt6->rt6i_gateway, - skb); + else if (rt6) { + if (ipv6_addr_v4mapped(&rt6->rt6i_gateway)) { + /* 6PE (RFC 4798) */ + err = neigh_xmit(NEIGH_ARP_TABLE, out_dev, &rt6->rt6i_gateway.s6_addr32[3], + skb); + } else + err = neigh_xmit(NEIGH_ND_TABLE, out_dev, &rt6->rt6i_gateway, + skb); + } if (err) net_dbg_ratelimited("%s: packet transmission failed: %d\n", __func__, err); diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c @@ -251,6 +251,10 @@ static int ncsi_pkg_info_all_nl(struct sk_buff *skb, } attr = nla_nest_start(skb, NCSI_ATTR_PACKAGE_LIST); + if (!attr) { + rc = -EMSGSIZE; + goto err; + } rc = ncsi_write_package_info(skb, ndp, package->id); if (rc) { nla_nest_cancel(skb, attr); diff --git 
a/net/netfilter/Kconfig b/net/netfilter/Kconfig @@ -1007,6 +1007,7 @@ config NETFILTER_XT_TARGET_TEE depends on NETFILTER_ADVANCED depends on IPV6 || IPV6=n depends on !NF_CONNTRACK || NF_CONNTRACK + depends on IP6_NF_IPTABLES || !IP6_NF_IPTABLES select NF_DUP_IPV4 select NF_DUP_IPV6 if IP6_NF_IPTABLES ---help--- diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c @@ -20,9 +20,9 @@ #include <linux/udp.h> #include <linux/tcp.h> #include <linux/netfilter.h> +#include <linux/netfilter_ipv4.h> +#include <linux/netfilter_ipv6.h> -#include <net/route.h> -#include <net/ip6_route.h> #include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_core.h> #include <net/netfilter/nf_conntrack_expect.h> @@ -871,38 +871,33 @@ static int set_expected_rtp_rtcp(struct sk_buff *skb, unsigned int protoff, } else if (sip_external_media) { struct net_device *dev = skb_dst(skb)->dev; struct net *net = dev_net(dev); - struct rtable *rt; - struct flowi4 fl4 = {}; -#if IS_ENABLED(CONFIG_IPV6) - struct flowi6 fl6 = {}; -#endif + struct flowi fl; struct dst_entry *dst = NULL; + memset(&fl, 0, sizeof(fl)); + switch (nf_ct_l3num(ct)) { case NFPROTO_IPV4: - fl4.daddr = daddr->ip; - rt = ip_route_output_key(net, &fl4); - if (!IS_ERR(rt)) - dst = &rt->dst; + fl.u.ip4.daddr = daddr->ip; + nf_ip_route(net, &dst, &fl, false); break; -#if IS_ENABLED(CONFIG_IPV6) case NFPROTO_IPV6: - fl6.daddr = daddr->in6; - dst = ip6_route_output(net, NULL, &fl6); - if (dst->error) { - dst_release(dst); - dst = NULL; - } + fl.u.ip6.daddr = daddr->in6; + nf_ip6_route(net, &dst, &fl, false); break; -#endif } /* Don't predict any conntracks when media endpoint is reachable * through the same interface as the signalling peer. 
*/ - if (dst && dst->dev == dev) - return NF_ACCEPT; + if (dst) { + bool external_media = (dst->dev == dev); + + dst_release(dst); + if (external_media) + return NF_ACCEPT; + } } /* We need to check whether the registration exists before attempting diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c @@ -2806,8 +2806,11 @@ err2: nf_tables_rule_release(&ctx, rule); err1: for (i = 0; i < n; i++) { - if (info[i].ops != NULL) + if (info[i].ops) { module_put(info[i].ops->type->owner); + if (info[i].ops->type->release_ops) + info[i].ops->type->release_ops(info[i].ops); + } } kvfree(info); return err; diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c @@ -65,21 +65,34 @@ nla_put_failure: return -1; } -static void nft_objref_destroy(const struct nft_ctx *ctx, - const struct nft_expr *expr) +static void nft_objref_deactivate(const struct nft_ctx *ctx, + const struct nft_expr *expr, + enum nft_trans_phase phase) { struct nft_object *obj = nft_objref_priv(expr); + if (phase == NFT_TRANS_COMMIT) + return; + obj->use--; } +static void nft_objref_activate(const struct nft_ctx *ctx, + const struct nft_expr *expr) +{ + struct nft_object *obj = nft_objref_priv(expr); + + obj->use++; +} + static struct nft_expr_type nft_objref_type; static const struct nft_expr_ops nft_objref_ops = { .type = &nft_objref_type, .size = NFT_EXPR_SIZE(sizeof(struct nft_object *)), .eval = nft_objref_eval, .init = nft_objref_init, - .destroy = nft_objref_destroy, + .activate = nft_objref_activate, + .deactivate = nft_objref_deactivate, .dump = nft_objref_dump, }; diff --git a/net/netfilter/nft_redir.c b/net/netfilter/nft_redir.c @@ -233,5 +233,5 @@ module_exit(nft_redir_module_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Arturo Borrero Gonzalez <arturo@debian.org>"); -MODULE_ALIAS_NFT_AF_EXPR(AF_INET4, "redir"); +MODULE_ALIAS_NFT_AF_EXPR(AF_INET, "redir"); MODULE_ALIAS_NFT_AF_EXPR(AF_INET6, "redir"); diff --git a/net/netfilter/nft_set_rbtree.c 
b/net/netfilter/nft_set_rbtree.c @@ -308,10 +308,6 @@ static void *nft_rbtree_deactivate(const struct net *net, else if (d > 0) parent = parent->rb_right; else { - if (!nft_set_elem_active(&rbe->ext, genmask)) { - parent = parent->rb_left; - continue; - } if (nft_rbtree_interval_end(rbe) && !nft_rbtree_interval_end(this)) { parent = parent->rb_left; @@ -320,6 +316,9 @@ static void *nft_rbtree_deactivate(const struct net *net, nft_rbtree_interval_end(this)) { parent = parent->rb_right; continue; + } else if (!nft_set_elem_active(&rbe->ext, genmask)) { + parent = parent->rb_left; + continue; } nft_rbtree_flush(net, set, rbe); return rbe; diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c @@ -366,7 +366,7 @@ int genl_register_family(struct genl_family *family) start, end + 1, GFP_KERNEL); if (family->id < 0) { err = family->id; - goto errout_locked; + goto errout_free; } err = genl_validate_assign_mc_groups(family); @@ -385,6 +385,7 @@ int genl_register_family(struct genl_family *family) errout_remove: idr_remove(&genl_fam_idr, family->id); +errout_free: kfree(family->attrbuf); errout_locked: genl_unlock_all(); diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c @@ -726,6 +726,10 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr, llcp_sock->service_name = kmemdup(addr->service_name, llcp_sock->service_name_len, GFP_KERNEL); + if (!llcp_sock->service_name) { + ret = -ENOMEM; + goto sock_llcp_release; + } nfc_llcp_sock_link(&local->connecting_sockets, sk); @@ -745,10 +749,11 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr, return ret; sock_unlink: - nfc_llcp_put_ssap(local, llcp_sock->ssap); - nfc_llcp_sock_unlink(&local->connecting_sockets, sk); +sock_llcp_release: + nfc_llcp_put_ssap(local, llcp_sock->ssap); + put_dev: nfc_put_device(dev); diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c @@ -448,6 +448,10 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff 
*skb, upcall = genlmsg_put(user_skb, 0, 0, &dp_packet_genl_family, 0, upcall_info->cmd); + if (!upcall) { + err = -EINVAL; + goto out; + } upcall->dp_ifindex = dp_ifindex; err = ovs_nla_put_key(key, key, OVS_PACKET_ATTR_KEY, false, user_skb); @@ -460,6 +464,10 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, if (upcall_info->egress_tun_info) { nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_EGRESS_TUN_KEY); + if (!nla) { + err = -EMSGSIZE; + goto out; + } err = ovs_nla_put_tunnel_info(user_skb, upcall_info->egress_tun_info); BUG_ON(err); @@ -468,6 +476,10 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, if (upcall_info->actions_len) { nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_ACTIONS); + if (!nla) { + err = -EMSGSIZE; + goto out; + } err = ovs_nla_put_actions(upcall_info->actions, upcall_info->actions_len, user_skb); diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c @@ -1852,7 +1852,8 @@ oom: static void packet_parse_headers(struct sk_buff *skb, struct socket *sock) { - if (!skb->protocol && sock->type == SOCK_RAW) { + if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) && + sock->type == SOCK_RAW) { skb_reset_mac_header(skb); skb->protocol = dev_parse_header_protocol(skb); } @@ -3243,7 +3244,7 @@ static int packet_create(struct net *net, struct socket *sock, int protocol, } mutex_lock(&net->packet.sklist_lock); - sk_add_node_rcu(sk, &net->packet.sklist); + sk_add_node_tail_rcu(sk, &net->packet.sklist); mutex_unlock(&net->packet.sklist_lock); preempt_disable(); @@ -4209,7 +4210,7 @@ static struct pgv *alloc_pg_vec(struct tpacket_req *req, int order) struct pgv *pg_vec; int i; - pg_vec = kcalloc(block_nr, sizeof(struct pgv), GFP_KERNEL); + pg_vec = kcalloc(block_nr, sizeof(struct pgv), GFP_KERNEL | __GFP_NOWARN); if (unlikely(!pg_vec)) goto out; diff --git a/net/rose/rose_subr.c b/net/rose/rose_subr.c @@ -105,16 +105,17 @@ void rose_write_internal(struct sock *sk, int frametype) struct 
sk_buff *skb; unsigned char *dptr; unsigned char lci1, lci2; - char buffer[100]; - int len, faclen = 0; + int maxfaclen = 0; + int len, faclen; + int reserve; - len = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + ROSE_MIN_LEN + 1; + reserve = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + 1; + len = ROSE_MIN_LEN; switch (frametype) { case ROSE_CALL_REQUEST: len += 1 + ROSE_ADDR_LEN + ROSE_ADDR_LEN; - faclen = rose_create_facilities(buffer, rose); - len += faclen; + maxfaclen = 256; break; case ROSE_CALL_ACCEPTED: case ROSE_CLEAR_REQUEST: @@ -123,15 +124,16 @@ void rose_write_internal(struct sock *sk, int frametype) break; } - if ((skb = alloc_skb(len, GFP_ATOMIC)) == NULL) + skb = alloc_skb(reserve + len + maxfaclen, GFP_ATOMIC); + if (!skb) return; /* * Space for AX.25 header and PID. */ - skb_reserve(skb, AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + 1); + skb_reserve(skb, reserve); - dptr = skb_put(skb, skb_tailroom(skb)); + dptr = skb_put(skb, len); lci1 = (rose->lci >> 8) & 0x0F; lci2 = (rose->lci >> 0) & 0xFF; @@ -146,7 +148,8 @@ void rose_write_internal(struct sock *sk, int frametype) dptr += ROSE_ADDR_LEN; memcpy(dptr, &rose->source_addr, ROSE_ADDR_LEN); dptr += ROSE_ADDR_LEN; - memcpy(dptr, buffer, faclen); + faclen = rose_create_facilities(dptr, rose); + skb_put(skb, faclen); dptr += faclen; break; diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c @@ -335,7 +335,6 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb, struct kvec iov[2]; rxrpc_serial_t serial; size_t len; - bool lost = false; int ret, opt; _enter(",{%d}", skb->len); @@ -393,14 +392,14 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb, static int lose; if ((lose++ & 7) == 7) { ret = 0; - lost = true; + trace_rxrpc_tx_data(call, sp->hdr.seq, serial, + whdr.flags, retrans, true); + goto done; } } - trace_rxrpc_tx_data(call, sp->hdr.seq, serial, whdr.flags, - retrans, lost); - if (lost) - goto done; + trace_rxrpc_tx_data(call, sp->hdr.seq, serial, 
whdr.flags, retrans, + false); /* send the packet with the don't fragment bit set if we currently * think it's small enough */ diff --git a/net/sched/Kconfig b/net/sched/Kconfig @@ -358,8 +358,7 @@ config NET_SCH_PIE help Say Y here if you want to use the Proportional Integral controller Enhanced scheduler packet scheduling algorithm. - For more information, please see - http://tools.ietf.org/html/draft-pan-tsvwg-pie-00 + For more information, please see https://tools.ietf.org/html/rfc8033 To compile this driver as a module, choose M here: the module will be called sch_pie. diff --git a/net/sched/act_api.c b/net/sched/act_api.c @@ -28,27 +28,10 @@ #include <net/act_api.h> #include <net/netlink.h> -static int tcf_action_goto_chain_init(struct tc_action *a, struct tcf_proto *tp) -{ - u32 chain_index = a->tcfa_action & TC_ACT_EXT_VAL_MASK; - - if (!tp) - return -EINVAL; - a->goto_chain = tcf_chain_get_by_act(tp->chain->block, chain_index); - if (!a->goto_chain) - return -ENOMEM; - return 0; -} - -static void tcf_action_goto_chain_fini(struct tc_action *a) -{ - tcf_chain_put_by_act(a->goto_chain); -} - static void tcf_action_goto_chain_exec(const struct tc_action *a, struct tcf_result *res) { - const struct tcf_chain *chain = a->goto_chain; + const struct tcf_chain *chain = rcu_dereference_bh(a->goto_chain); res->goto_tp = rcu_dereference_bh(chain->filter_chain); } @@ -71,6 +54,51 @@ static void tcf_set_action_cookie(struct tc_cookie __rcu **old_cookie, call_rcu(&old->rcu, tcf_free_cookie_rcu); } +int tcf_action_check_ctrlact(int action, struct tcf_proto *tp, + struct tcf_chain **newchain, + struct netlink_ext_ack *extack) +{ + int opcode = TC_ACT_EXT_OPCODE(action), ret = -EINVAL; + u32 chain_index; + + if (!opcode) + ret = action > TC_ACT_VALUE_MAX ? 
-EINVAL : 0; + else if (opcode <= TC_ACT_EXT_OPCODE_MAX || action == TC_ACT_UNSPEC) + ret = 0; + if (ret) { + NL_SET_ERR_MSG(extack, "invalid control action"); + goto end; + } + + if (TC_ACT_EXT_CMP(action, TC_ACT_GOTO_CHAIN)) { + chain_index = action & TC_ACT_EXT_VAL_MASK; + if (!tp || !newchain) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, + "can't goto NULL proto/chain"); + goto end; + } + *newchain = tcf_chain_get_by_act(tp->chain->block, chain_index); + if (!*newchain) { + ret = -ENOMEM; + NL_SET_ERR_MSG(extack, + "can't allocate goto_chain"); + } + } +end: + return ret; +} +EXPORT_SYMBOL(tcf_action_check_ctrlact); + +struct tcf_chain *tcf_action_set_ctrlact(struct tc_action *a, int action, + struct tcf_chain *goto_chain) +{ + a->tcfa_action = action; + rcu_swap_protected(a->goto_chain, goto_chain, 1); + return goto_chain; +} +EXPORT_SYMBOL(tcf_action_set_ctrlact); + /* XXX: For standalone actions, we don't need a RCU grace period either, because * actions are always connected to filters and filters are already destroyed in * RCU callbacks, so after a RCU grace period actions are already disconnected @@ -78,13 +106,15 @@ static void tcf_set_action_cookie(struct tc_cookie __rcu **old_cookie, */ static void free_tcf(struct tc_action *p) { + struct tcf_chain *chain = rcu_dereference_protected(p->goto_chain, 1); + free_percpu(p->cpu_bstats); free_percpu(p->cpu_bstats_hw); free_percpu(p->cpu_qstats); tcf_set_action_cookie(&p->act_cookie, NULL); - if (p->goto_chain) - tcf_action_goto_chain_fini(p); + if (chain) + tcf_chain_put_by_act(chain); kfree(p); } @@ -654,6 +684,10 @@ repeat: return TC_ACT_OK; } } else if (TC_ACT_EXT_CMP(ret, TC_ACT_GOTO_CHAIN)) { + if (unlikely(!rcu_access_pointer(a->goto_chain))) { + net_warn_ratelimited("can't go to NULL chain!\n"); + return TC_ACT_SHOT; + } tcf_action_goto_chain_exec(a, res); } @@ -800,15 +834,6 @@ static struct tc_cookie *nla_memdup_cookie(struct nlattr **tb) return c; } -static bool tcf_action_valid(int action) -{ - int 
opcode = TC_ACT_EXT_OPCODE(action); - - if (!opcode) - return action <= TC_ACT_VALUE_MAX; - return opcode <= TC_ACT_EXT_OPCODE_MAX || action == TC_ACT_UNSPEC; -} - struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, @@ -890,10 +915,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, /* backward compatibility for policer */ if (name == NULL) err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind, - rtnl_held, extack); + rtnl_held, tp, extack); else err = a_o->init(net, nla, est, &a, ovr, bind, rtnl_held, - extack); + tp, extack); if (err < 0) goto err_mod; @@ -907,18 +932,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, if (err != ACT_P_CREATED) module_put(a_o->owner); - if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN)) { - err = tcf_action_goto_chain_init(a, tp); - if (err) { - tcf_action_destroy_1(a, bind); - NL_SET_ERR_MSG(extack, "Failed to init TC action chain"); - return ERR_PTR(err); - } - } - - if (!tcf_action_valid(a->tcfa_action)) { + if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN) && + !rcu_access_pointer(a->goto_chain)) { tcf_action_destroy_1(a, bind); - NL_SET_ERR_MSG(extack, "Invalid control action value"); + NL_SET_ERR_MSG(extack, "can't use goto chain with NULL chain"); return ERR_PTR(-EINVAL); } diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c @@ -17,6 +17,7 @@ #include <net/netlink.h> #include <net/pkt_sched.h> +#include <net/pkt_cls.h> #include <linux/tc_act/tc_bpf.h> #include <net/tc_act/tc_bpf.h> @@ -278,10 +279,11 @@ static void tcf_bpf_prog_fill_cfg(const struct tcf_bpf *prog, static int tcf_bpf_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **act, int replace, int bind, bool rtnl_held, - struct netlink_ext_ack *extack) + struct tcf_proto *tp, struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, bpf_net_id); 
 	struct nlattr *tb[TCA_ACT_BPF_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tcf_bpf_cfg cfg, old;
 	struct tc_act_bpf *parm;
 	struct tcf_bpf *prog;
@@ -323,12 +325,16 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
 		return ret;
 	}
+	ret = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (ret < 0)
+		goto release_idr;
+
 	is_bpf = tb[TCA_ACT_BPF_OPS_LEN] && tb[TCA_ACT_BPF_OPS];
 	is_ebpf = tb[TCA_ACT_BPF_FD];
 	if ((!is_bpf && !is_ebpf) || (is_bpf && is_ebpf)) {
 		ret = -EINVAL;
-		goto out;
+		goto put_chain;
 	}
 	memset(&cfg, 0, sizeof(cfg));
@@ -336,7 +342,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
 	ret = is_bpf ? tcf_bpf_init_from_ops(tb, &cfg) :
 		       tcf_bpf_init_from_efd(tb, &cfg);
 	if (ret < 0)
-		goto out;
+		goto put_chain;
 	prog = to_bpf(*act);
@@ -350,10 +356,13 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
 	if (cfg.bpf_num_ops)
 		prog->bpf_num_ops = cfg.bpf_num_ops;
-	prog->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*act, parm->action, goto_ch);
 	rcu_assign_pointer(prog->filter, cfg.filter);
 	spin_unlock_bh(&prog->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+
 	if (res == ACT_P_CREATED) {
 		tcf_idr_insert(tn, *act);
 	} else {
@@ -363,9 +372,13 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
 	}
 	return res;
-out:
-	tcf_idr_release(*act, bind);
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+
+release_idr:
+	tcf_idr_release(*act, bind);
 	return ret;
 }
diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
@@ -21,6 +21,7 @@
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
 #include <net/act_api.h>
+#include <net/pkt_cls.h>
 #include <uapi/linux/tc_act/tc_connmark.h>
 #include <net/tc_act/tc_connmark.h>
@@ -97,13 +98,15 @@ static const struct nla_policy connmark_policy[TCA_CONNMARK_MAX + 1] = {
 static int tcf_connmark_init(struct net *net, struct nlattr *nla,
 			     struct nlattr *est, struct tc_action **a,
 			     int ovr, int bind, bool rtnl_held,
+			     struct tcf_proto *tp,
 			     struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, connmark_net_id);
 	struct nlattr *tb[TCA_CONNMARK_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tcf_connmark_info *ci;
 	struct tc_connmark *parm;
-	int ret = 0;
+	int ret = 0, err;
 	if (!nla)
 		return -EINVAL;
@@ -128,7 +131,11 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla,
 		}
 		ci = to_connmark(*a);
-		ci->tcf_action = parm->action;
+		err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch,
+					       extack);
+		if (err < 0)
+			goto release_idr;
+		tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 		ci->net = net;
 		ci->zone = parm->zone;
@@ -142,15 +149,24 @@ static int tcf_connmark_init(struct net *net, struct nlattr *nla,
 			tcf_idr_release(*a, bind);
 			return -EEXIST;
 		}
+		err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch,
+					       extack);
+		if (err < 0)
+			goto release_idr;
 		/* replacing action and zone */
 		spin_lock_bh(&ci->tcf_lock);
-		ci->tcf_action = parm->action;
+		goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 		ci->zone = parm->zone;
 		spin_unlock_bh(&ci->tcf_lock);
+		if (goto_ch)
+			tcf_chain_put_by_act(goto_ch);
 		ret = 0;
 	}
 	return ret;
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static inline int tcf_connmark_dump(struct sk_buff *skb, struct tc_action *a,
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
@@ -33,6 +33,7 @@
 #include <net/sctp/checksum.h>
 #include <net/act_api.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_csum.h>
 #include <net/tc_act/tc_csum.h>
@@ -46,12 +47,13 @@ static struct tc_action_ops act_csum_ops;
 static int tcf_csum_init(struct net *net, struct nlattr *nla,
 			 struct nlattr *est, struct tc_action **a, int ovr,
-			 int bind, bool rtnl_held,
+			 int bind, bool rtnl_held, struct tcf_proto *tp,
 			 struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, csum_net_id);
 	struct tcf_csum_params *params_new;
 	struct nlattr *tb[TCA_CSUM_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_csum *parm;
 	struct tcf_csum *p;
 	int ret = 0, err;
@@ -87,21 +89,27 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
 		return err;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
+
 	p = to_tcf_csum(*a);
 	params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
 	if (unlikely(!params_new)) {
-		tcf_idr_release(*a, bind);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto put_chain;
 	}
 	params_new->update_flags = parm->update_flags;
 	spin_lock_bh(&p->tcf_lock);
-	p->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	rcu_swap_protected(p->params, params_new,
 			   lockdep_is_held(&p->tcf_lock));
 	spin_unlock_bh(&p->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (params_new)
 		kfree_rcu(params_new, rcu);
@@ -109,6 +117,12 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 /**
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
@@ -20,6 +20,7 @@
 #include <linux/init.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_gact.h>
 #include <net/tc_act/tc_gact.h>
@@ -57,10 +58,11 @@ static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = {
 static int tcf_gact_init(struct net *net, struct nlattr *nla,
 			 struct nlattr *est, struct tc_action **a,
 			 int ovr, int bind, bool rtnl_held,
-			 struct netlink_ext_ack *extack)
+			 struct tcf_proto *tp, struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, gact_net_id);
 	struct nlattr *tb[TCA_GACT_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_gact *parm;
 	struct tcf_gact *gact;
 	int ret = 0;
@@ -116,10 +118,13 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
 		return err;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	gact = to_gact(*a);
 	spin_lock_bh(&gact->tcf_lock);
-	gact->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 #ifdef CONFIG_GACT_PROB
 	if (p_parm) {
 		gact->tcfg_paction = p_parm->paction;
@@ -133,9 +138,15 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
 #endif
 	spin_unlock_bh(&gact->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static int tcf_gact_act(struct sk_buff *skb, const struct tc_action *a,
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
@@ -29,6 +29,7 @@
 #include <net/net_namespace.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <uapi/linux/tc_act/tc_ife.h>
 #include <net/tc_act/tc_ife.h>
 #include <linux/etherdevice.h>
@@ -469,11 +470,12 @@ static int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb,
 static int tcf_ife_init(struct net *net, struct nlattr *nla,
 			struct nlattr *est, struct tc_action **a,
 			int ovr, int bind, bool rtnl_held,
-			struct netlink_ext_ack *extack)
+			struct tcf_proto *tp, struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, ife_net_id);
 	struct nlattr *tb[TCA_IFE_MAX + 1];
 	struct nlattr *tb2[IFE_META_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tcf_ife_params *p;
 	struct tcf_ife_info *ife;
 	u16 ife_type = ETH_P_IFE;
@@ -531,6 +533,10 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 	}
 	ife = to_ife(*a);
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
+
 	p->flags = parm->flags;
 	if (parm->flags & IFE_ENCODE) {
@@ -563,13 +569,8 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 	if (tb[TCA_IFE_METALST]) {
 		err = nla_parse_nested(tb2, IFE_META_MAX, tb[TCA_IFE_METALST],
 				       NULL, NULL);
-		if (err) {
-metadata_parse_err:
-			tcf_idr_release(*a, bind);
-			kfree(p);
-			return err;
-		}
-
+		if (err)
+			goto metadata_parse_err;
 		err = populate_metalist(ife, tb2, exists, rtnl_held);
 		if (err)
 			goto metadata_parse_err;
@@ -581,21 +582,20 @@ metadata_parse_err:
 		 * going to bail out
 		 */
 		err = use_all_metadata(ife, exists);
-		if (err) {
-			tcf_idr_release(*a, bind);
-			kfree(p);
-			return err;
-		}
+		if (err)
+			goto metadata_parse_err;
 	}
 	if (exists)
 		spin_lock_bh(&ife->tcf_lock);
-	ife->tcf_action = parm->action;
 	/* protected by tcf_lock when modifying existing action */
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	rcu_swap_protected(ife->params, p, 1);
 	if (exists)
 		spin_unlock_bh(&ife->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (p)
 		kfree_rcu(p, rcu);
@@ -603,6 +603,13 @@ metadata_parse_err:
 		tcf_idr_insert(tn, *a);
 	return ret;
+metadata_parse_err:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	kfree(p);
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static int tcf_ife_dump(struct sk_buff *skb, struct tc_action *a, int bind,
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
@@ -97,7 +97,8 @@ static const struct nla_policy ipt_policy[TCA_IPT_MAX + 1] = {
 static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla,
 			  struct nlattr *est, struct tc_action **a,
-			  const struct tc_action_ops *ops, int ovr, int bind)
+			  const struct tc_action_ops *ops, int ovr, int bind,
+			  struct tcf_proto *tp)
 {
 	struct tc_action_net *tn = net_generic(net, id);
 	struct nlattr *tb[TCA_IPT_MAX + 1];
@@ -205,20 +206,20 @@ err1:
 static int tcf_ipt_init(struct net *net, struct nlattr *nla,
 			struct nlattr *est, struct tc_action **a, int ovr,
-			int bind, bool rtnl_held,
+			int bind, bool rtnl_held, struct tcf_proto *tp,
 			struct netlink_ext_ack *extack)
 {
 	return __tcf_ipt_init(net, ipt_net_id, nla, est, a, &act_ipt_ops, ovr,
-			      bind);
+			      bind, tp);
 }
 static int tcf_xt_init(struct net *net, struct nlattr *nla,
 		       struct nlattr *est, struct tc_action **a, int ovr,
-		       int bind, bool unlocked,
+		       int bind, bool unlocked, struct tcf_proto *tp,
 		       struct netlink_ext_ack *extack)
 {
 	return __tcf_ipt_init(net, xt_net_id, nla, est, a, &act_xt_ops, ovr,
-			      bind);
+			      bind, tp);
 }
 static int tcf_ipt_act(struct sk_buff *skb, const struct tc_action *a,
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
@@ -94,10 +94,12 @@ static struct tc_action_ops act_mirred_ops;
 static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			   struct nlattr *est, struct tc_action **a,
 			   int ovr, int bind, bool rtnl_held,
+			   struct tcf_proto *tp,
 			   struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, mirred_net_id);
 	struct nlattr *tb[TCA_MIRRED_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	bool mac_header_xmit = false;
 	struct tc_mirred *parm;
 	struct tcf_mirred *m;
@@ -157,18 +159,23 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 		tcf_idr_release(*a, bind);
 		return -EEXIST;
 	}
+
 	m = to_mirred(*a);
+	if (ret == ACT_P_CREATED)
+		INIT_LIST_HEAD(&m->tcfm_list);
+
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	spin_lock_bh(&m->tcf_lock);
-	m->tcf_action = parm->action;
-	m->tcfm_eaction = parm->eaction;
 	if (parm->ifindex) {
 		dev = dev_get_by_index(net, parm->ifindex);
 		if (!dev) {
 			spin_unlock_bh(&m->tcf_lock);
-			tcf_idr_release(*a, bind);
-			return -ENODEV;
+			err = -ENODEV;
+			goto put_chain;
 		}
 		mac_header_xmit = dev_is_mac_header_xmit(dev);
 		rcu_swap_protected(m->tcfm_dev, dev,
@@ -177,7 +184,11 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			dev_put(dev);
 		m->tcfm_mac_header_xmit = mac_header_xmit;
 	}
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
+	m->tcfm_eaction = parm->eaction;
 	spin_unlock_bh(&m->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED) {
 		spin_lock(&mirred_list_lock);
@@ -188,6 +199,12 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	}
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
@@ -21,6 +21,7 @@
 #include <linux/string.h>
 #include <linux/tc_act/tc_nat.h>
 #include <net/act_api.h>
+#include <net/pkt_cls.h>
 #include <net/icmp.h>
 #include <net/ip.h>
 #include <net/netlink.h>
@@ -38,10 +39,12 @@ static const struct nla_policy nat_policy[TCA_NAT_MAX + 1] = {
 static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
 			struct tc_action **a, int ovr, int bind,
-			bool rtnl_held, struct netlink_ext_ack *extack)
+			bool rtnl_held, struct tcf_proto *tp,
+			struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, nat_net_id);
 	struct nlattr *tb[TCA_NAT_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_nat *parm;
 	int ret = 0, err;
 	struct tcf_nat *p;
@@ -76,6 +79,9 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
 	} else {
 		return err;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	p = to_tcf_nat(*a);
 	spin_lock_bh(&p->tcf_lock);
@@ -84,13 +90,18 @@ static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
 	p->mask = parm->mask;
 	p->flags = parm->flags;
-	p->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	spin_unlock_bh(&p->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static int tcf_nat_act(struct sk_buff *skb, const struct tc_action *a,
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
@@ -23,6 +23,7 @@
 #include <linux/tc_act/tc_pedit.h>
 #include <net/tc_act/tc_pedit.h>
 #include <uapi/linux/tc_act/tc_pedit.h>
+#include <net/pkt_cls.h>
 static unsigned int pedit_net_id;
 static struct tc_action_ops act_pedit_ops;
@@ -138,10 +139,11 @@ nla_failure:
 static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 			  struct nlattr *est, struct tc_action **a,
 			  int ovr, int bind, bool rtnl_held,
-			  struct netlink_ext_ack *extack)
+			  struct tcf_proto *tp, struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, pedit_net_id);
 	struct nlattr *tb[TCA_PEDIT_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_pedit_key *keys = NULL;
 	struct tcf_pedit_key_ex *keys_ex;
 	struct tc_pedit *parm;
@@ -205,6 +207,11 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 		goto out_free;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0) {
+		ret = err;
+		goto out_release;
+	}
 	p = to_pedit(*a);
 	spin_lock_bh(&p->tcf_lock);
@@ -214,7 +221,7 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 		if (!keys) {
 			spin_unlock_bh(&p->tcf_lock);
 			ret = -ENOMEM;
-			goto out_release;
+			goto put_chain;
 		}
 		kfree(p->tcfp_keys);
 		p->tcfp_keys = keys;
@@ -223,16 +230,21 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 	memcpy(p->tcfp_keys, parm->keys, ksize);
 	p->tcfp_flags = parm->flags;
-	p->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	kfree(p->tcfp_keys_ex);
 	p->tcfp_keys_ex = keys_ex;
 	spin_unlock_bh(&p->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 out_release:
 	tcf_idr_release(*a, bind);
 out_free:
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
@@ -21,6 +21,7 @@
 #include <linux/slab.h>
 #include <net/act_api.h>
 #include <net/netlink.h>
+#include <net/pkt_cls.h>
 struct tcf_police_params {
 	int tcfp_result;
@@ -83,10 +84,12 @@ static const struct nla_policy police_policy[TCA_POLICE_MAX + 1] = {
 static int tcf_police_init(struct net *net, struct nlattr *nla,
 			       struct nlattr *est, struct tc_action **a,
 			       int ovr, int bind, bool rtnl_held,
+			       struct tcf_proto *tp,
 			       struct netlink_ext_ack *extack)
 {
 	int ret = 0, tcfp_result = TC_ACT_OK, err, size;
 	struct nlattr *tb[TCA_POLICE_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_police *parm;
 	struct tcf_police *police;
 	struct qdisc_rate_table *R_tab = NULL, *P_tab = NULL;
@@ -128,6 +131,9 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
 		tcf_idr_release(*a, bind);
 		return -EEXIST;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	police = to_police(*a);
 	if (parm->rate.rate) {
@@ -213,12 +219,14 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
 	if (new->peak_present)
 		police->tcfp_ptoks = new->tcfp_mtu_ptoks;
 	spin_unlock_bh(&police->tcfp_lock);
-	police->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	rcu_swap_protected(police->params,
 			   new,
 			   lockdep_is_held(&police->tcf_lock));
 	spin_unlock_bh(&police->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (new)
 		kfree_rcu(new, rcu);
@@ -229,6 +237,9 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
 failure:
 	qdisc_put_rtab(P_tab);
 	qdisc_put_rtab(R_tab);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
 	tcf_idr_release(*a, bind);
 	return err;
 }
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
@@ -22,6 +22,7 @@
 #include <linux/tc_act/tc_sample.h>
 #include <net/tc_act/tc_sample.h>
 #include <net/psample.h>
+#include <net/pkt_cls.h>
 #include <linux/if_arp.h>
@@ -37,12 +38,13 @@ static const struct nla_policy sample_policy[TCA_SAMPLE_MAX + 1] = {
 static int tcf_sample_init(struct net *net, struct nlattr *nla,
 			   struct nlattr *est, struct tc_action **a, int ovr,
-			   int bind, bool rtnl_held,
+			   int bind, bool rtnl_held, struct tcf_proto *tp,
 			   struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, sample_net_id);
 	struct nlattr *tb[TCA_SAMPLE_MAX + 1];
 	struct psample_group *psample_group;
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_sample *parm;
 	u32 psample_group_num;
 	struct tcf_sample *s;
@@ -79,18 +81,21 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
 		tcf_idr_release(*a, bind);
 		return -EEXIST;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	psample_group_num = nla_get_u32(tb[TCA_SAMPLE_PSAMPLE_GROUP]);
 	psample_group = psample_group_get(net, psample_group_num);
 	if (!psample_group) {
-		tcf_idr_release(*a, bind);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto put_chain;
 	}
 	s = to_sample(*a);
 	spin_lock_bh(&s->tcf_lock);
-	s->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	s->rate = nla_get_u32(tb[TCA_SAMPLE_RATE]);
 	s->psample_group_num = psample_group_num;
 	RCU_INIT_POINTER(s->psample_group, psample_group);
@@ -100,10 +105,18 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
 		s->trunc_size = nla_get_u32(tb[TCA_SAMPLE_TRUNC_SIZE]);
 	}
 	spin_unlock_bh(&s->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static void tcf_sample_cleanup(struct tc_action *a)
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
@@ -18,6 +18,7 @@
 #include <linux/rtnetlink.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_defact.h>
 #include <net/tc_act/tc_defact.h>
@@ -60,14 +61,26 @@ static int alloc_defdata(struct tcf_defact *d, const struct nlattr *defdata)
 	return 0;
 }
-static void reset_policy(struct tcf_defact *d, const struct nlattr *defdata,
-			 struct tc_defact *p)
+static int reset_policy(struct tc_action *a, const struct nlattr *defdata,
+			struct tc_defact *p, struct tcf_proto *tp,
+			struct netlink_ext_ack *extack)
 {
+	struct tcf_chain *goto_ch = NULL;
+	struct tcf_defact *d;
+	int err;
+
+	err = tcf_action_check_ctrlact(p->action, tp, &goto_ch, extack);
+	if (err < 0)
+		return err;
+	d = to_defact(a);
 	spin_lock_bh(&d->tcf_lock);
-	d->tcf_action = p->action;
+	goto_ch = tcf_action_set_ctrlact(a, p->action, goto_ch);
 	memset(d->tcfd_defdata, 0, SIMP_MAX_DATA);
 	nla_strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);
 	spin_unlock_bh(&d->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+	return 0;
 }
 static const struct nla_policy simple_policy[TCA_DEF_MAX + 1] = {
@@ -78,10 +91,11 @@ static const struct nla_policy simple_policy[TCA_DEF_MAX + 1] = {
 static int tcf_simp_init(struct net *net, struct nlattr *nla,
 			 struct nlattr *est, struct tc_action **a,
 			 int ovr, int bind, bool rtnl_held,
-			 struct netlink_ext_ack *extack)
+			 struct tcf_proto *tp, struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, simp_net_id);
 	struct nlattr *tb[TCA_DEF_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_defact *parm;
 	struct tcf_defact *d;
 	bool exists = false;
@@ -122,27 +136,37 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
 		}
 		d = to_defact(*a);
-		ret = alloc_defdata(d, tb[TCA_DEF_DATA]);
-		if (ret < 0) {
-			tcf_idr_release(*a, bind);
-			return ret;
-		}
-		d->tcf_action = parm->action;
+		err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch,
+					       extack);
+		if (err < 0)
+			goto release_idr;
+
+		err = alloc_defdata(d, tb[TCA_DEF_DATA]);
+		if (err < 0)
+			goto put_chain;
+
+		tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 		ret = ACT_P_CREATED;
 	} else {
-		d = to_defact(*a);
-
 		if (!ovr) {
-			tcf_idr_release(*a, bind);
-			return -EEXIST;
+			err = -EEXIST;
+			goto release_idr;
 		}
-		reset_policy(d, tb[TCA_DEF_DATA], parm);
+		err = reset_policy(*a, tb[TCA_DEF_DATA], parm, tp, extack);
+		if (err)
+			goto release_idr;
 	}
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static int tcf_simp_dump(struct sk_buff *skb, struct tc_action *a,
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
@@ -26,6 +26,7 @@
 #include <net/ip.h>
 #include <net/ipv6.h>
 #include <net/dsfield.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_skbedit.h>
 #include <net/tc_act/tc_skbedit.h>
@@ -96,11 +97,13 @@ static const struct nla_policy skbedit_policy[TCA_SKBEDIT_MAX + 1] = {
 static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 			    struct nlattr *est, struct tc_action **a,
 			    int ovr, int bind, bool rtnl_held,
+			    struct tcf_proto *tp,
 			    struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, skbedit_net_id);
 	struct tcf_skbedit_params *params_new;
 	struct nlattr *tb[TCA_SKBEDIT_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_skbedit *parm;
 	struct tcf_skbedit *d;
 	u32 flags = 0, *priority = NULL, *mark = NULL, *mask = NULL;
@@ -186,11 +189,14 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 			return -EEXIST;
 		}
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
 	if (unlikely(!params_new)) {
-		tcf_idr_release(*a, bind);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto put_chain;
 	}
 	params_new->flags = flags;
@@ -208,16 +214,24 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 		params_new->mask = *mask;
 	spin_lock_bh(&d->tcf_lock);
-	d->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	rcu_swap_protected(d->params, params_new,
 			   lockdep_is_held(&d->tcf_lock));
 	spin_unlock_bh(&d->tcf_lock);
 	if (params_new)
 		kfree_rcu(params_new, rcu);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static int tcf_skbedit_dump(struct sk_buff *skb, struct tc_action *a,
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
@@ -16,6 +16,7 @@
 #include <linux/rtnetlink.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_skbmod.h>
 #include <net/tc_act/tc_skbmod.h>
@@ -82,11 +83,13 @@ static const struct nla_policy skbmod_policy[TCA_SKBMOD_MAX + 1] = {
 static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 			   struct nlattr *est, struct tc_action **a,
 			   int ovr, int bind, bool rtnl_held,
+			   struct tcf_proto *tp,
 			   struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, skbmod_net_id);
 	struct nlattr *tb[TCA_SKBMOD_MAX + 1];
 	struct tcf_skbmod_params *p, *p_old;
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_skbmod *parm;
 	struct tcf_skbmod *d;
 	bool exists = false;
@@ -153,21 +156,24 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 		tcf_idr_release(*a, bind);
 		return -EEXIST;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
 	d = to_skbmod(*a);
 	p = kzalloc(sizeof(struct tcf_skbmod_params), GFP_KERNEL);
 	if (unlikely(!p)) {
-		tcf_idr_release(*a, bind);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto put_chain;
 	}
 	p->flags = lflags;
-	d->tcf_action = parm->action;
 	if (ovr)
 		spin_lock_bh(&d->tcf_lock);
 	/* Protected by tcf_lock if overwriting existing action. */
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	p_old = rcu_dereference_protected(d->skbmod_p, 1);
 	if (lflags & SKBMOD_F_DMAC)
@@ -183,10 +189,18 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 	if (p_old)
 		kfree_rcu(p_old, rcu);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static void tcf_skbmod_cleanup(struct tc_action *a)
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
@@ -17,6 +17,7 @@
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
 #include <net/dst.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_tunnel_key.h>
 #include <net/tc_act/tc_tunnel_key.h>
@@ -210,12 +211,14 @@ static void tunnel_key_release_params(struct tcf_tunnel_key_params *p)
 static int tunnel_key_init(struct net *net, struct nlattr *nla,
 			   struct nlattr *est, struct tc_action **a,
 			   int ovr, int bind, bool rtnl_held,
+			   struct tcf_proto *tp,
 			   struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, tunnel_key_net_id);
 	struct nlattr *tb[TCA_TUNNEL_KEY_MAX + 1];
 	struct tcf_tunnel_key_params *params_new;
 	struct metadata_dst *metadata = NULL;
+	struct tcf_chain *goto_ch = NULL;
 	struct tc_tunnel_key *parm;
 	struct tcf_tunnel_key *t;
 	bool exists = false;
@@ -359,6 +362,12 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
 		goto release_tun_meta;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0) {
+		ret = err;
+		exists = true;
+		goto release_tun_meta;
+	}
 	t = to_tunnel_key(*a);
 	params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
@@ -366,23 +375,29 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
 		NL_SET_ERR_MSG(extack, "Cannot allocate tunnel key parameters");
 		ret = -ENOMEM;
 		exists = true;
-		goto release_tun_meta;
+		goto put_chain;
 	}
 	params_new->tcft_action = parm->t_action;
 	params_new->tcft_enc_metadata = metadata;
 	spin_lock_bh(&t->tcf_lock);
-	t->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	rcu_swap_protected(t->params, params_new,
 			   lockdep_is_held(&t->tcf_lock));
 	spin_unlock_bh(&t->tcf_lock);
 	tunnel_key_release_params(params_new);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+
 release_tun_meta:
 	if (metadata)
 		dst_release(&metadata->dst);
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
@@ -15,6 +15,7 @@
 #include <linux/if_vlan.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/pkt_cls.h>
 #include <linux/tc_act/tc_vlan.h>
 #include <net/tc_act/tc_vlan.h>
@@ -105,10 +106,11 @@ static const struct nla_policy vlan_policy[TCA_VLAN_MAX + 1] = {
 static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 			 struct nlattr *est, struct tc_action **a,
 			 int ovr, int bind, bool rtnl_held,
-			 struct netlink_ext_ack *extack)
+			 struct tcf_proto *tp, struct netlink_ext_ack *extack)
 {
 	struct tc_action_net *tn = net_generic(net, vlan_net_id);
 	struct nlattr *tb[TCA_VLAN_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
 	struct tcf_vlan_params *p;
 	struct tc_vlan *parm;
 	struct tcf_vlan *v;
@@ -200,12 +202,16 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 		return -EEXIST;
 	}
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
+
 	v = to_vlan(*a);
 	p = kzalloc(sizeof(*p), GFP_KERNEL);
 	if (!p) {
-		tcf_idr_release(*a, bind);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto put_chain;
 	}
 	p->tcfv_action = action;
@@ -214,16 +220,24 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 	p->tcfv_push_proto = push_proto;
 	spin_lock_bh(&v->tcf_lock);
-	v->tcf_action = parm->action;
+	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	rcu_swap_protected(v->vlan_p, p,
 			   lockdep_is_held(&v->tcf_lock));
 	spin_unlock_bh(&v->tcf_lock);
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
 	if (p)
 		kfree_rcu(p, rcu);
 	if (ret == ACT_P_CREATED)
 		tcf_idr_insert(tn, *a);
 	return ret;
+put_chain:
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
 }
 static void tcf_vlan_cleanup(struct tc_action *a)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
@@ -367,7 +367,7 @@ static void tcf_chain_destroy(struct tcf_chain *chain, bool free_block)
 	struct tcf_block *block = chain->block;
 	mutex_destroy(&chain->filter_chain_lock);
-	kfree(chain);
+	kfree_rcu(chain, rcu);
 	if (free_block)
 		tcf_block_destroy(block);
 }
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
@@ -211,6 +211,9 @@ struct cake_sched_data {
 	u8 ack_filter;
 	u8 atm_mode;
+	u32 fwmark_mask;
+	u16 fwmark_shft;
+
 	/* time_next = time_this + ((len * rate_ns) >> rate_shft) */
 	u16 rate_shft;
 	ktime_t time_next_packet;
@@ -258,8 +261,7 @@ enum {
 	CAKE_FLAG_AUTORATE_INGRESS = BIT(1),
 	CAKE_FLAG_INGRESS = BIT(2),
 	CAKE_FLAG_WASH = BIT(3),
-	CAKE_FLAG_SPLIT_GSO = BIT(4),
-	CAKE_FLAG_FWMARK = BIT(5)
+	CAKE_FLAG_SPLIT_GSO = BIT(4)
 };
 /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
@@ -1543,7 +1545,7 @@ static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
 					     struct sk_buff *skb)
 {
 	struct cake_sched_data *q = qdisc_priv(sch);
-	u32 tin;
+	u32 tin, mark;
 	u8 dscp;
 	/* Tin selection: Default to diffserv-based selection, allow overriding
@@ -1551,14 +1553,13 @@ static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
 	 */
 	dscp = cake_handle_diffserv(skb,
 				    q->rate_flags & CAKE_FLAG_WASH);
+	mark = (skb->mark & q->fwmark_mask) >> q->fwmark_shft;
 	if (q->tin_mode == CAKE_DIFFSERV_BESTEFFORT)
 		tin = 0;
-	else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
-		 skb->mark &&
-		 skb->mark <= q->tin_cnt)
-		tin = q->tin_order[skb->mark - 1];
+	else if (mark && mark <= q->tin_cnt)
+		tin = q->tin_order[mark - 1];
 	else if (TC_H_MAJ(skb->priority) == sch->handle &&
 		 TC_H_MIN(skb->priority) > 0 &&
@@ -2172,6 +2173,7 @@ static const struct nla_policy cake_policy[TCA_CAKE_MAX + 1] = {
 	[TCA_CAKE_MPU] = { .type = NLA_U32 },
 	[TCA_CAKE_INGRESS] = { .type = NLA_U32 },
 	[TCA_CAKE_ACK_FILTER] = { .type = NLA_U32 },
+	[TCA_CAKE_FWMARK] = { .type = NLA_U32 },
 };
 static void cake_set_rate(struct cake_tin_data *b, u64 rate, u32 mtu,
@@ -2619,10 +2621,8 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 	}
 	if (tb[TCA_CAKE_FWMARK]) {
-		if (!!nla_get_u32(tb[TCA_CAKE_FWMARK]))
-			q->rate_flags |= CAKE_FLAG_FWMARK;
-		else
-			q->rate_flags &= ~CAKE_FLAG_FWMARK;
+		q->fwmark_mask = nla_get_u32(tb[TCA_CAKE_FWMARK]);
+		q->fwmark_shft = q->fwmark_mask ? __ffs(q->fwmark_mask) : 0;
 	}
 	if (q->tins) {
@@ -2784,8 +2784,7 @@ static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
 			!!(q->rate_flags & CAKE_FLAG_SPLIT_GSO)))
 		goto nla_put_failure;
-	if (nla_put_u32(skb, TCA_CAKE_FWMARK,
-			!!(q->rate_flags & CAKE_FLAG_FWMARK)))
+	if (nla_put_u32(skb, TCA_CAKE_FWMARK, q->fwmark_mask))
 		goto nla_put_failure;
 	return nla_nest_end(skb, opts);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
@@ -999,7 +999,7 @@ static int sctp_setsockopt_bindx(struct sock *sk,
 	if (unlikely(addrs_size <= 0))
 		return -EINVAL;
-	kaddrs = vmemdup_user(addrs, addrs_size);
+	kaddrs = memdup_user(addrs, addrs_size);
 	if (unlikely(IS_ERR(kaddrs)))
 		return PTR_ERR(kaddrs);
@@ -1007,7 +1007,7 @@ static int sctp_setsockopt_bindx(struct sock *sk,
 	addr_buf = kaddrs;
 	while (walk_size < addrs_size) {
 		if (walk_size + sizeof(sa_family_t) > addrs_size) {
-			kvfree(kaddrs);
+			kfree(kaddrs);
 			return -EINVAL;
 		}
@@ -1018,7 +1018,7 @@ static int sctp_setsockopt_bindx(struct sock *sk,
 		 * causes the address buffer to overflow return EINVAL.
*/ if (!af || (walk_size + af->sockaddr_len) > addrs_size) { - kvfree(kaddrs); + kfree(kaddrs); return -EINVAL; } addrcnt++; @@ -1054,7 +1054,7 @@ static int sctp_setsockopt_bindx(struct sock *sk, } out: - kvfree(kaddrs); + kfree(kaddrs); return err; } @@ -1329,7 +1329,7 @@ static int __sctp_setsockopt_connectx(struct sock *sk, if (unlikely(addrs_size <= 0)) return -EINVAL; - kaddrs = vmemdup_user(addrs, addrs_size); + kaddrs = memdup_user(addrs, addrs_size); if (unlikely(IS_ERR(kaddrs))) return PTR_ERR(kaddrs); @@ -1349,7 +1349,7 @@ static int __sctp_setsockopt_connectx(struct sock *sk, err = __sctp_connect(sk, kaddrs, addrs_size, flags, assoc_id); out_free: - kvfree(kaddrs); + kfree(kaddrs); return err; } @@ -2920,6 +2920,9 @@ static int sctp_setsockopt_delayed_ack(struct sock *sk, return 0; } + if (sctp_style(sk, TCP)) + params.sack_assoc_id = SCTP_FUTURE_ASSOC; + if (params.sack_assoc_id == SCTP_FUTURE_ASSOC || params.sack_assoc_id == SCTP_ALL_ASSOC) { if (params.sack_delay) { @@ -3024,6 +3027,9 @@ static int sctp_setsockopt_default_send_param(struct sock *sk, return 0; } + if (sctp_style(sk, TCP)) + info.sinfo_assoc_id = SCTP_FUTURE_ASSOC; + if (info.sinfo_assoc_id == SCTP_FUTURE_ASSOC || info.sinfo_assoc_id == SCTP_ALL_ASSOC) { sp->default_stream = info.sinfo_stream; @@ -3081,6 +3087,9 @@ static int sctp_setsockopt_default_sndinfo(struct sock *sk, return 0; } + if (sctp_style(sk, TCP)) + info.snd_assoc_id = SCTP_FUTURE_ASSOC; + if (info.snd_assoc_id == SCTP_FUTURE_ASSOC || info.snd_assoc_id == SCTP_ALL_ASSOC) { sp->default_stream = info.snd_sid; @@ -3531,6 +3540,9 @@ static int sctp_setsockopt_context(struct sock *sk, char __user *optval, return 0; } + if (sctp_style(sk, TCP)) + params.assoc_id = SCTP_FUTURE_ASSOC; + if (params.assoc_id == SCTP_FUTURE_ASSOC || params.assoc_id == SCTP_ALL_ASSOC) sp->default_rcv_context = params.assoc_value; @@ -3670,6 +3682,9 @@ static int sctp_setsockopt_maxburst(struct sock *sk, return 0; } + if (sctp_style(sk, TCP)) + 
params.assoc_id = SCTP_FUTURE_ASSOC; + if (params.assoc_id == SCTP_FUTURE_ASSOC || params.assoc_id == SCTP_ALL_ASSOC) sp->max_burst = params.assoc_value; @@ -3798,6 +3813,9 @@ static int sctp_setsockopt_auth_key(struct sock *sk, goto out; } + if (sctp_style(sk, TCP)) + authkey->sca_assoc_id = SCTP_FUTURE_ASSOC; + if (authkey->sca_assoc_id == SCTP_FUTURE_ASSOC || authkey->sca_assoc_id == SCTP_ALL_ASSOC) { ret = sctp_auth_set_key(ep, asoc, authkey); @@ -3853,6 +3871,9 @@ static int sctp_setsockopt_active_key(struct sock *sk, if (asoc) return sctp_auth_set_active_key(ep, asoc, val.scact_keynumber); + if (sctp_style(sk, TCP)) + val.scact_assoc_id = SCTP_FUTURE_ASSOC; + if (val.scact_assoc_id == SCTP_FUTURE_ASSOC || val.scact_assoc_id == SCTP_ALL_ASSOC) { ret = sctp_auth_set_active_key(ep, asoc, val.scact_keynumber); @@ -3904,6 +3925,9 @@ static int sctp_setsockopt_del_key(struct sock *sk, if (asoc) return sctp_auth_del_key_id(ep, asoc, val.scact_keynumber); + if (sctp_style(sk, TCP)) + val.scact_assoc_id = SCTP_FUTURE_ASSOC; + if (val.scact_assoc_id == SCTP_FUTURE_ASSOC || val.scact_assoc_id == SCTP_ALL_ASSOC) { ret = sctp_auth_del_key_id(ep, asoc, val.scact_keynumber); @@ -3954,6 +3978,9 @@ static int sctp_setsockopt_deactivate_key(struct sock *sk, char __user *optval, if (asoc) return sctp_auth_deact_key_id(ep, asoc, val.scact_keynumber); + if (sctp_style(sk, TCP)) + val.scact_assoc_id = SCTP_FUTURE_ASSOC; + if (val.scact_assoc_id == SCTP_FUTURE_ASSOC || val.scact_assoc_id == SCTP_ALL_ASSOC) { ret = sctp_auth_deact_key_id(ep, asoc, val.scact_keynumber); @@ -4169,6 +4196,9 @@ static int sctp_setsockopt_default_prinfo(struct sock *sk, goto out; } + if (sctp_style(sk, TCP)) + info.pr_assoc_id = SCTP_FUTURE_ASSOC; + if (info.pr_assoc_id == SCTP_FUTURE_ASSOC || info.pr_assoc_id == SCTP_ALL_ASSOC) { SCTP_PR_SET_POLICY(sp->default_flags, info.pr_policy); @@ -4251,6 +4281,9 @@ static int sctp_setsockopt_enable_strreset(struct sock *sk, goto out; } + if (sctp_style(sk, TCP)) 
+		params.assoc_id = SCTP_FUTURE_ASSOC;
+
 	if (params.assoc_id == SCTP_FUTURE_ASSOC ||
 	    params.assoc_id == SCTP_ALL_ASSOC)
 		ep->strreset_enable = params.assoc_value;
@@ -4376,6 +4409,9 @@ static int sctp_setsockopt_scheduler(struct sock *sk,
 	if (asoc)
 		return sctp_sched_set_sched(asoc, params.assoc_value);

+	if (sctp_style(sk, TCP))
+		params.assoc_id = SCTP_FUTURE_ASSOC;
+
 	if (params.assoc_id == SCTP_FUTURE_ASSOC ||
 	    params.assoc_id == SCTP_ALL_ASSOC)
 		sp->default_ss = params.assoc_value;
@@ -4541,6 +4577,9 @@ static int sctp_setsockopt_event(struct sock *sk, char __user *optval,
 	if (asoc)
 		return sctp_assoc_ulpevent_type_set(&param, asoc);

+	if (sctp_style(sk, TCP))
+		param.se_assoc_id = SCTP_FUTURE_ASSOC;
+
 	if (param.se_assoc_id == SCTP_FUTURE_ASSOC ||
 	    param.se_assoc_id == SCTP_ALL_ASSOC)
 		sctp_ulpevent_type_set(&sp->subscribe,
@@ -9169,7 +9208,7 @@ static inline void sctp_copy_descendant(struct sock *sk_to,
 {
 	int ancestor_size = sizeof(struct inet_sock) +
 			    sizeof(struct sctp_sock) -
-			    offsetof(struct sctp_sock, auto_asconf_list);
+			    offsetof(struct sctp_sock, pd_lobby);

 	if (sk_from->sk_family == PF_INET6)
 		ancestor_size += sizeof(struct ipv6_pinfo);
@@ -9253,7 +9292,6 @@ static int sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
 	 * 2) Peeling off partial delivery; keep pd_lobby in new pd_lobby.
 	 * 3) Peeling off non-partial delivery; move pd_lobby to receive_queue.
 	 */
-	skb_queue_head_init(&newsp->pd_lobby);
 	atomic_set(&sctp_sk(newsk)->pd_mode, assoc->ulpq.pd_mode);

 	if (atomic_read(&sctp_sk(oldsk)->pd_mode)) {
diff --git a/net/socket.c b/net/socket.c
@@ -384,6 +384,18 @@ static struct file_system_type sock_fs_type = {
  *	but we take care of internal coherence yet.
  */

+/**
+ *	sock_alloc_file - Bind a &socket to a &file
+ *	@sock: socket
+ *	@flags: file status flags
+ *	@dname: protocol name
+ *
+ *	Returns the &file bound with @sock, implicitly storing it
+ *	in sock->file. If dname is %NULL, sets to "".
+ *	On failure the return is a ERR pointer (see linux/err.h).
+ *	This function uses GFP_KERNEL internally.
+ */
+
 struct file *sock_alloc_file(struct socket *sock, int flags, const char *dname)
 {
 	struct file *file;
@@ -424,6 +436,14 @@ static int sock_map_fd(struct socket *sock, int flags)
 	return PTR_ERR(newfile);
 }

+/**
+ *	sock_from_file - Return the &socket bounded to @file.
+ *	@file: file
+ *	@err: pointer to an error code return
+ *
+ *	On failure returns %NULL and assigns -ENOTSOCK to @err.
+ */
+
 struct socket *sock_from_file(struct file *file, int *err)
 {
 	if (file->f_op == &socket_file_ops)
@@ -532,11 +552,11 @@ static const struct inode_operations sockfs_inode_ops = {
 };

 /**
- *	sock_alloc	-	allocate a socket
+ *	sock_alloc - allocate a socket
  *
  *	Allocate a new inode and socket object. The two are bound together
  *	and initialised. The socket is then returned. If we are out of inodes
- *	NULL is returned.
+ *	NULL is returned. This functions uses GFP_KERNEL internally.
  */

 struct socket *sock_alloc(void)
@@ -561,7 +581,7 @@ struct socket *sock_alloc(void)
 EXPORT_SYMBOL(sock_alloc);

 /**
- *	sock_release	-	close a socket
+ *	sock_release - close a socket
  *	@sock: socket to close
  *
  *	The socket is released from the protocol stack if it has a release
@@ -617,6 +637,15 @@ void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags)
 }
 EXPORT_SYMBOL(__sock_tx_timestamp);

+/**
+ *	sock_sendmsg - send a message through @sock
+ *	@sock: socket
+ *	@msg: message to send
+ *
+ *	Sends @msg through @sock, passing through LSM.
+ *	Returns the number of bytes sent, or an error code.
+ */
+
 static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 {
 	int ret = sock->ops->sendmsg(sock, msg, msg_data_left(msg));
@@ -633,6 +662,18 @@ int sock_sendmsg(struct socket *sock, struct msghdr *msg)
 }
 EXPORT_SYMBOL(sock_sendmsg);

+/**
+ *	kernel_sendmsg - send a message through @sock (kernel-space)
+ *	@sock: socket
+ *	@msg: message header
+ *	@vec: kernel vec
+ *	@num: vec array length
+ *	@size: total message data size
+ *
+ *	Builds the message data with @vec and sends it through @sock.
+ *	Returns the number of bytes sent, or an error code.
+ */
+
 int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
 		   struct kvec *vec, size_t num, size_t size)
 {
@@ -641,6 +682,19 @@ int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
 }
 EXPORT_SYMBOL(kernel_sendmsg);

+/**
+ *	kernel_sendmsg_locked - send a message through @sock (kernel-space)
+ *	@sk: sock
+ *	@msg: message header
+ *	@vec: output s/g array
+ *	@num: output s/g array length
+ *	@size: total message data size
+ *
+ *	Builds the message data with @vec and sends it through @sock.
+ *	Returns the number of bytes sent, or an error code.
+ *	Caller must hold @sk.
+ */
+
 int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 			  struct kvec *vec, size_t num, size_t size)
 {
@@ -811,6 +865,16 @@ void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
 }
 EXPORT_SYMBOL_GPL(__sock_recv_ts_and_drops);

+/**
+ *	sock_recvmsg - receive a message from @sock
+ *	@sock: socket
+ *	@msg: message to receive
+ *	@flags: message flags
+ *
+ *	Receives @msg from @sock, passing through LSM. Returns the total number
+ *	of bytes received, or an error.
+ */
+
 static inline int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg,
 				     int flags)
 {
@@ -826,20 +890,21 @@ int sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags)
 EXPORT_SYMBOL(sock_recvmsg);

 /**
- * kernel_recvmsg - Receive a message from a socket (kernel space)
- * @sock: The socket to receive the message from
- * @msg: Received message
- * @vec: Input s/g array for message data
- * @num: Size of input s/g array
- * @size: Number of bytes to read
- * @flags: Message flags (MSG_DONTWAIT, etc...)
+ *	kernel_recvmsg - Receive a message from a socket (kernel space)
+ *	@sock: The socket to receive the message from
+ *	@msg: Received message
+ *	@vec: Input s/g array for message data
+ *	@num: Size of input s/g array
+ *	@size: Number of bytes to read
+ *	@flags: Message flags (MSG_DONTWAIT, etc...)
  *
- * On return the msg structure contains the scatter/gather array passed in the
- * vec argument. The array is modified so that it consists of the unfilled
- * portion of the original array.
+ *	On return the msg structure contains the scatter/gather array passed in the
+ *	vec argument. The array is modified so that it consists of the unfilled
+ *	portion of the original array.
  *
- * The returned value is the total number of bytes received, or an error.
+ *	The returned value is the total number of bytes received, or an error.
  */
+
 int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
 		   struct kvec *vec, size_t num, size_t size, int flags)
 {
@@ -1005,6 +1070,13 @@ static long sock_do_ioctl(struct net *net, struct socket *sock,
  *	what to do with it - that's up to the protocol still.
  */

+/**
+ *	get_net_ns - increment the refcount of the network namespace
+ *	@ns: common namespace (net)
+ *
+ *	Returns the net's common namespace.
+ */
+
 struct ns_common *get_net_ns(struct ns_common *ns)
 {
 	return &get_net(container_of(ns, struct net, ns))->ns;
@@ -1099,6 +1171,19 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
 	return err;
 }

+/**
+ *	sock_create_lite - creates a socket
+ *	@family: protocol family (AF_INET, ...)
+ *	@type: communication type (SOCK_STREAM, ...)
+ *	@protocol: protocol (0, ...)
+ *	@res: new socket
+ *
+ *	Creates a new socket and assigns it to @res, passing through LSM.
+ *	The new socket initialization is not complete, see kernel_accept().
+ *	Returns 0 or an error. On failure @res is set to %NULL.
+ *	This function internally uses GFP_KERNEL.
+ */
+
 int sock_create_lite(int family, int type, int protocol, struct socket **res)
 {
 	int err;
@@ -1224,6 +1309,21 @@ call_kill:
 }
 EXPORT_SYMBOL(sock_wake_async);

+/**
+ *	__sock_create - creates a socket
+ *	@net: net namespace
+ *	@family: protocol family (AF_INET, ...)
+ *	@type: communication type (SOCK_STREAM, ...)
+ *	@protocol: protocol (0, ...)
+ *	@res: new socket
+ *	@kern: boolean for kernel space sockets
+ *
+ *	Creates a new socket and assigns it to @res, passing through LSM.
+ *	Returns 0 or an error. On failure @res is set to %NULL. @kern must
+ *	be set to true if the socket resides in kernel space.
+ *	This function internally uses GFP_KERNEL.
+ */
+
 int __sock_create(struct net *net, int family, int type, int protocol,
 			 struct socket **res, int kern)
 {
@@ -1333,12 +1433,35 @@ out_release:
 }
 EXPORT_SYMBOL(__sock_create);

+/**
+ *	sock_create - creates a socket
+ *	@family: protocol family (AF_INET, ...)
+ *	@type: communication type (SOCK_STREAM, ...)
+ *	@protocol: protocol (0, ...)
+ *	@res: new socket
+ *
+ *	A wrapper around __sock_create().
+ *	Returns 0 or an error. This function internally uses GFP_KERNEL.
+ */
+
 int sock_create(int family, int type, int protocol, struct socket **res)
 {
 	return __sock_create(current->nsproxy->net_ns, family, type, protocol, res, 0);
 }
 EXPORT_SYMBOL(sock_create);

+/**
+ *	sock_create_kern - creates a socket (kernel space)
+ *	@net: net namespace
+ *	@family: protocol family (AF_INET, ...)
+ *	@type: communication type (SOCK_STREAM, ...)
+ *	@protocol: protocol (0, ...)
+ *	@res: new socket
+ *
+ *	A wrapper around __sock_create().
+ *	Returns 0 or an error. This function internally uses GFP_KERNEL.
+ */
+
 int sock_create_kern(struct net *net, int family, int type, int protocol, struct socket **res)
 {
 	return __sock_create(net, family, type, protocol, res, 1);
@@ -3322,18 +3445,46 @@ static long compat_sock_ioctl(struct file *file, unsigned int cmd,
 }
 #endif

+/**
+ *	kernel_bind - bind an address to a socket (kernel space)
+ *	@sock: socket
+ *	@addr: address
+ *	@addrlen: length of address
+ *
+ *	Returns 0 or an error.
+ */
+
 int kernel_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
 {
 	return sock->ops->bind(sock, addr, addrlen);
 }
 EXPORT_SYMBOL(kernel_bind);

+/**
+ *	kernel_listen - move socket to listening state (kernel space)
+ *	@sock: socket
+ *	@backlog: pending connections queue size
+ *
+ *	Returns 0 or an error.
+ */
+
 int kernel_listen(struct socket *sock, int backlog)
 {
 	return sock->ops->listen(sock, backlog);
 }
 EXPORT_SYMBOL(kernel_listen);

+/**
+ *	kernel_accept - accept a connection (kernel space)
+ *	@sock: listening socket
+ *	@newsock: new connected socket
+ *	@flags: flags
+ *
+ *	@flags must be SOCK_CLOEXEC, SOCK_NONBLOCK or 0.
+ *	If it fails, @newsock is guaranteed to be %NULL.
+ *	Returns 0 or an error.
+ */
+
 int kernel_accept(struct socket *sock, struct socket **newsock, int flags)
 {
 	struct sock *sk = sock->sk;
@@ -3359,6 +3510,19 @@ done:
 }
 EXPORT_SYMBOL(kernel_accept);

+/**
+ *	kernel_connect - connect a socket (kernel space)
+ *	@sock: socket
+ *	@addr: address
+ *	@addrlen: address length
+ *	@flags: flags (O_NONBLOCK, ...)
+ *
+ *	For datagram sockets, @addr is the addres to which datagrams are sent
+ *	by default, and the only address from which datagrams are received.
+ *	For stream sockets, attempts to connect to @addr.
+ *	Returns 0 or an error code.
+ */
+
 int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags)
 {
@@ -3366,18 +3530,48 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 }
 EXPORT_SYMBOL(kernel_connect);

+/**
+ *	kernel_getsockname - get the address which the socket is bound (kernel space)
+ *	@sock: socket
+ *	@addr: address holder
+ *
+ * 	Fills the @addr pointer with the address which the socket is bound.
+ *	Returns 0 or an error code.
+ */
+
 int kernel_getsockname(struct socket *sock, struct sockaddr *addr)
 {
 	return sock->ops->getname(sock, addr, 0);
 }
 EXPORT_SYMBOL(kernel_getsockname);

+/**
+ *	kernel_peername - get the address which the socket is connected (kernel space)
+ *	@sock: socket
+ *	@addr: address holder
+ *
+ * 	Fills the @addr pointer with the address which the socket is connected.
+ *	Returns 0 or an error code.
+ */
+
 int kernel_getpeername(struct socket *sock, struct sockaddr *addr)
 {
 	return sock->ops->getname(sock, addr, 1);
 }
 EXPORT_SYMBOL(kernel_getpeername);

+/**
+ *	kernel_getsockopt - get a socket option (kernel space)
+ *	@sock: socket
+ *	@level: API level (SOL_SOCKET, ...)
+ *	@optname: option tag
+ *	@optval: option value
+ *	@optlen: option length
+ *
+ *	Assigns the option length to @optlen.
+ *	Returns 0 or an error.
+ */
+
 int kernel_getsockopt(struct socket *sock, int level, int optname,
 			char *optval, int *optlen)
 {
@@ -3400,6 +3594,17 @@ int kernel_getsockopt(struct socket *sock, int level, int optname,
 }
 EXPORT_SYMBOL(kernel_getsockopt);

+/**
+ *	kernel_setsockopt - set a socket option (kernel space)
+ *	@sock: socket
+ *	@level: API level (SOL_SOCKET, ...)
+ *	@optname: option tag
+ *	@optval: option value
+ *	@optlen: option length
+ *
+ *	Returns 0 or an error.
+ */
+
 int kernel_setsockopt(struct socket *sock, int level, int optname,
 			char *optval, unsigned int optlen)
 {
@@ -3420,6 +3625,17 @@ int kernel_setsockopt(struct socket *sock, int level, int optname,
 }
 EXPORT_SYMBOL(kernel_setsockopt);

+/**
+ *	kernel_sendpage - send a &page through a socket (kernel space)
+ *	@sock: socket
+ *	@page: page
+ *	@offset: page offset
+ *	@size: total size in bytes
+ *	@flags: flags (MSG_DONTWAIT, ...)
+ *
+ *	Returns the total amount sent in bytes or an error.
+ */
+
 int kernel_sendpage(struct socket *sock, struct page *page, int offset,
 		    size_t size, int flags)
 {
@@ -3430,6 +3646,18 @@ int kernel_sendpage(struct socket *sock, struct page *page, int offset,
 }
 EXPORT_SYMBOL(kernel_sendpage);

+/**
+ *	kernel_sendpage_locked - send a &page through the locked sock (kernel space)
+ *	@sk: sock
+ *	@page: page
+ *	@offset: page offset
+ *	@size: total size in bytes
+ *	@flags: flags (MSG_DONTWAIT, ...)
+ *
+ *	Returns the total amount sent in bytes or an error.
+ *	Caller must hold @sk.
+ */
+
 int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
 			   size_t size, int flags)
 {
@@ -3443,17 +3671,30 @@ int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
 }
 EXPORT_SYMBOL(kernel_sendpage_locked);

+/**
+ *	kernel_shutdown - shut down part of a full-duplex connection (kernel space)
+ *	@sock: socket
+ *	@how: connection part
+ *
+ *	Returns 0 or an error.
+ */
+
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
 {
 	return sock->ops->shutdown(sock, how);
 }
 EXPORT_SYMBOL(kernel_sock_shutdown);

-/* This routine returns the IP overhead imposed by a socket i.e.
- * the length of the underlying IP header, depending on whether
- * this is an IPv4 or IPv6 socket and the length from IP options turned
- * on at the socket. Assumes that the caller has a lock on the socket.
+/**
+ *	kernel_sock_ip_overhead - returns the IP overhead imposed by a socket
+ *	@sk: socket
+ *
+ *	This routine returns the IP overhead imposed by a socket i.e.
+ *	the length of the underlying IP header, depending on whether
+ *	this is an IPv4 or IPv6 socket and the length from IP options turned
+ *	on at the socket. Assumes that the caller has a lock on the socket.
  */
+
 u32 kernel_sock_ip_overhead(struct sock *sk)
 {
 	struct inet_sock *inet;
diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
@@ -550,6 +550,8 @@ EXPORT_SYMBOL_GPL(strp_check_rcv);
 static int __init strp_mod_init(void)
 {
 	strp_wq = create_singlethread_workqueue("kstrp");
+	if (unlikely(!strp_wq))
+		return -ENOMEM;

 	return 0;
 }
diff --git a/net/tipc/group.c b/net/tipc/group.c
@@ -919,6 +919,9 @@ int tipc_group_fill_sock_diag(struct tipc_group *grp, struct sk_buff *skb)
 {
 	struct nlattr *group = nla_nest_start(skb, TIPC_NLA_SOCK_GROUP);

+	if (!group)
+		return -EMSGSIZE;
+
 	if (nla_put_u32(skb, TIPC_NLA_SOCK_GROUP_ID,
 			grp->type) ||
 	    nla_put_u32(skb, TIPC_NLA_SOCK_GROUP_INSTANCE,
diff --git a/net/tipc/net.c b/net/tipc/net.c
@@ -163,12 +163,9 @@ void tipc_sched_net_finalize(struct net *net, u32 addr)
 void tipc_net_stop(struct net *net)
 {
-	u32 self = tipc_own_addr(net);
-
-	if (!self)
+	if (!tipc_own_id(net))
 		return;

-	tipc_nametbl_withdraw(net, TIPC_CFG_SRV, self, self, self);
 	rtnl_lock();
 	tipc_bearer_stop(net);
 	tipc_node_stop(net);
diff --git a/net/tipc/node.c b/net/tipc/node.c
@@ -817,10 +817,10 @@ static void __tipc_node_link_down(struct tipc_node *n, int
*bearer_id,
 static void tipc_node_link_down(struct tipc_node *n, int bearer_id, bool delete)
 {
 	struct tipc_link_entry *le = &n->links[bearer_id];
+	struct tipc_media_addr *maddr = NULL;
 	struct tipc_link *l = le->link;
-	struct tipc_media_addr *maddr;
-	struct sk_buff_head xmitq;
 	int old_bearer_id = bearer_id;
+	struct sk_buff_head xmitq;

 	if (!l)
 		return;
@@ -844,7 +844,8 @@ static void tipc_node_link_down(struct tipc_node *n, int bearer_id, bool delete)
 	tipc_node_write_unlock(n);
 	if (delete)
 		tipc_mon_remove_peer(n->net, n->addr, old_bearer_id);
-	tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr);
+	if (!skb_queue_empty(&xmitq))
+		tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr);
 	tipc_sk_rcv(n->net, &le->inputq);
 }
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
@@ -2349,6 +2349,16 @@ static int tipc_wait_for_connect(struct socket *sock, long *timeo_p)
 	return 0;
 }

+static bool tipc_sockaddr_is_sane(struct sockaddr_tipc *addr)
+{
+	if (addr->family != AF_TIPC)
+		return false;
+	if (addr->addrtype == TIPC_SERVICE_RANGE)
+		return (addr->addr.nameseq.lower <= addr->addr.nameseq.upper);
+	return (addr->addrtype == TIPC_SERVICE_ADDR ||
+		addr->addrtype == TIPC_SOCKET_ADDR);
+}
+
 /**
  * tipc_connect - establish a connection to another TIPC port
  * @sock: socket structure
@@ -2384,18 +2394,18 @@ static int tipc_connect(struct socket *sock, struct sockaddr *dest,
 		if (!tipc_sk_type_connectionless(sk))
 			res = -EINVAL;
 		goto exit;
-	} else if (dst->family != AF_TIPC) {
-		res = -EINVAL;
 	}
-	if (dst->addrtype != TIPC_ADDR_ID && dst->addrtype != TIPC_ADDR_NAME)
+	if (!tipc_sockaddr_is_sane(dst)) {
 		res = -EINVAL;
-	if (res)
 		goto exit;
-
+	}
 	/* DGRAM/RDM connect(), just save the destaddr */
 	if (tipc_sk_type_connectionless(sk)) {
 		memcpy(&tsk->peer, dest, destlen);
 		goto exit;
+	} else if (dst->addrtype == TIPC_SERVICE_RANGE) {
+		res = -EINVAL;
+		goto exit;
 	}

 	previous = sk->sk_state;
@@ -3255,6 +3265,8 @@ static int __tipc_nl_add_sk_con(struct sk_buff *skb, struct tipc_sock *tsk)
 	peer_port = tsk_peer_port(tsk);

 	nest = nla_nest_start(skb, TIPC_NLA_SOCK_CON);
+	if (!nest)
+		return -EMSGSIZE;

 	if (nla_put_u32(skb, TIPC_NLA_CON_NODE, peer_node))
 		goto msg_full;
diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c
@@ -363,6 +363,7 @@ static int tipc_conn_rcv_sub(struct tipc_topsrv *srv,
 	struct tipc_subscription *sub;

 	if (tipc_sub_read(s, filter) & TIPC_SUB_CANCEL) {
+		s->filter &= __constant_ntohl(~TIPC_SUB_CANCEL);
 		tipc_conn_delete_sub(con, s);
 		return 0;
 	}
diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
@@ -193,9 +193,6 @@ static void xdp_umem_unaccount_pages(struct xdp_umem *umem)
 static void xdp_umem_release(struct xdp_umem *umem)
 {
-	struct task_struct *task;
-	struct mm_struct *mm;
-
 	xdp_umem_clear_dev(umem);

 	ida_simple_remove(&umem_ida, umem->id);
@@ -214,21 +211,10 @@ static void xdp_umem_release(struct xdp_umem *umem)
 	xdp_umem_unpin_pages(umem);

-	task = get_pid_task(umem->pid, PIDTYPE_PID);
-	put_pid(umem->pid);
-	if (!task)
-		goto out;
-	mm = get_task_mm(task);
-	put_task_struct(task);
-	if (!mm)
-		goto out;
-
-	mmput(mm);
 	kfree(umem->pages);
 	umem->pages = NULL;

 	xdp_umem_unaccount_pages(umem);
-out:
 	kfree(umem);
 }
@@ -357,7 +343,6 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
 	if (size_chk < 0)
 		return -EINVAL;

-	umem->pid = get_task_pid(current, PIDTYPE_PID);
 	umem->address = (unsigned long)addr;
 	umem->chunk_mask = ~((u64)chunk_size - 1);
 	umem->size = size;
@@ -373,7 +358,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
 	err = xdp_umem_account_pages(umem);
 	if (err)
-		goto out;
+		return err;

 	err = xdp_umem_pin_pages(umem);
 	if (err)
@@ -392,8 +377,6 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
 out_account:
 	xdp_umem_unaccount_pages(umem);
-out:
-	put_pid(umem->pid);
 	return err;
 }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
@@ -502,16 +502,6 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of
failure.
  *
- * int bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)
- * 	Description
- * 		Push an element *value* in *map*. *flags* is one of:
- *
- * 		**BPF_EXIST**
- * 		If the queue/stack is full, the oldest element is removed to
- * 		make room for this.
- * 	Return
- * 		0 on success, or a negative error in case of failure.
- *
  * int bpf_probe_read(void *dst, u32 size, const void *src)
  * 	Description
  * 		For tracing programs, safely attempt to read *size* bytes from
@@ -1435,14 +1425,14 @@ union bpf_attr {
  * u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
  * 	Description
  * 		Equivalent to bpf_get_socket_cookie() helper that accepts
- * 		*skb*, but gets socket from **struct bpf_sock_addr** contex.
+ * 		*skb*, but gets socket from **struct bpf_sock_addr** context.
  * 	Return
  * 		A 8-byte long non-decreasing number.
  *
  * u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
  * 	Description
  * 		Equivalent to bpf_get_socket_cookie() helper that accepts
- * 		*skb*, but gets socket from **struct bpf_sock_ops** contex.
+ * 		*skb*, but gets socket from **struct bpf_sock_ops** context.
  * 	Return
  * 		A 8-byte long non-decreasing number.
  *
@@ -2098,52 +2088,52 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
- * int bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle)
+ * int bpf_rc_repeat(void *ctx)
  * 	Description
  * 		This helper is used in programs implementing IR decoding, to
- * 		report a successfully decoded key press with *scancode*,
- * 		*toggle* value in the given *protocol*. The scancode will be
- * 		translated to a keycode using the rc keymap, and reported as
- * 		an input key down event. After a period a key up event is
- * 		generated. This period can be extended by calling either
- * 		**bpf_rc_keydown**\ () again with the same values, or calling
- * 		**bpf_rc_repeat**\ ().
+ * 		report a successfully decoded repeat key message. This delays
+ * 		the generation of a key up event for previously generated
+ * 		key down event.
  *
- * 		Some protocols include a toggle bit, in case the button was
- * 		released and pressed again between consecutive scancodes.
+ * 		Some IR protocols like NEC have a special IR message for
+ * 		repeating last button, for when a button is held down.
  *
  * 		The *ctx* should point to the lirc sample as passed into
  * 		the program.
  *
- * 		The *protocol* is the decoded protocol number (see
- * 		**enum rc_proto** for some predefined values).
- *
  * 		This helper is only available is the kernel was compiled with
  * 		the **CONFIG_BPF_LIRC_MODE2** configuration option set to
  * 		"**y**".
  * 	Return
  * 		0
  *
- * int bpf_rc_repeat(void *ctx)
+ * int bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle)
  * 	Description
  * 		This helper is used in programs implementing IR decoding, to
- * 		report a successfully decoded repeat key message. This delays
- * 		the generation of a key up event for previously generated
- * 		key down event.
+ * 		report a successfully decoded key press with *scancode*,
+ * 		*toggle* value in the given *protocol*. The scancode will be
+ * 		translated to a keycode using the rc keymap, and reported as
+ * 		an input key down event. After a period a key up event is
+ * 		generated. This period can be extended by calling either
+ * 		**bpf_rc_keydown**\ () again with the same values, or calling
+ * 		**bpf_rc_repeat**\ ().
  *
- * 		Some IR protocols like NEC have a special IR message for
- * 		repeating last button, for when a button is held down.
+ * 		Some protocols include a toggle bit, in case the button was
+ * 		released and pressed again between consecutive scancodes.
  *
  * 		The *ctx* should point to the lirc sample as passed into
  * 		the program.
  *
+ * 		The *protocol* is the decoded protocol number (see
+ * 		**enum rc_proto** for some predefined values).
+ *
  * 		This helper is only available is the kernel was compiled with
  * 		the **CONFIG_BPF_LIRC_MODE2** configuration option set to
  * 		"**y**".
  * 	Return
  * 		0
  *
- * uint64_t bpf_skb_cgroup_id(struct sk_buff *skb)
+ * u64 bpf_skb_cgroup_id(struct sk_buff *skb)
  * 	Description
  * 		Return the cgroup v2 id of the socket associated with the *skb*.
  * 		This is roughly similar to the **bpf_get_cgroup_classid**\ ()
@@ -2159,30 +2149,12 @@ union bpf_attr {
  * 	Return
  * 		The id is returned or 0 in case the id could not be retrieved.
  *
- * u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level)
- * 	Description
- * 		Return id of cgroup v2 that is ancestor of cgroup associated
- * 		with the *skb* at the *ancestor_level*. The root cgroup is at
- * 		*ancestor_level* zero and each step down the hierarchy
- * 		increments the level. If *ancestor_level* == level of cgroup
- * 		associated with *skb*, then return value will be same as that
- * 		of **bpf_skb_cgroup_id**\ ().
- *
- * 		The helper is useful to implement policies based on cgroups
- * 		that are upper in hierarchy than immediate cgroup associated
- * 		with *skb*.
- *
- * 		The format of returned id and helper limitations are same as in
- * 		**bpf_skb_cgroup_id**\ ().
- * 	Return
- * 		The id is returned or 0 in case the id could not be retrieved.
- *
  * u64 bpf_get_current_cgroup_id(void)
  * 	Return
  * 		A 64-bit integer containing the current cgroup id based
  * 		on the cgroup within which the current task is running.
  *
- * void* get_local_storage(void *map, u64 flags)
+ * void *bpf_get_local_storage(void *map, u64 flags)
  * 	Description
  * 		Get the pointer to the local storage area.
  * 		The type and the size of the local storage is defined
@@ -2209,6 +2181,24 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
+ * u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level)
+ * 	Description
+ * 		Return id of cgroup v2 that is ancestor of cgroup associated
+ * 		with the *skb* at the *ancestor_level*. The root cgroup is at
+ * 		*ancestor_level* zero and each step down the hierarchy
+ * 		increments the level.
If *ancestor_level* == level of cgroup
+ * 		associated with *skb*, then return value will be same as that
+ * 		of **bpf_skb_cgroup_id**\ ().
+ *
+ * 		The helper is useful to implement policies based on cgroups
+ * 		that are upper in hierarchy than immediate cgroup associated
+ * 		with *skb*.
+ *
+ * 		The format of returned id and helper limitations are same as in
+ * 		**bpf_skb_cgroup_id**\ ().
+ * 	Return
+ * 		The id is returned or 0 in case the id could not be retrieved.
+ *
  * struct bpf_sock *bpf_sk_lookup_tcp(void *ctx, struct bpf_sock_tuple *tuple, u32 tuple_size, u64 netns, u64 flags)
  * 	Description
  * 		Look for TCP socket matching *tuple*, optionally in a child
@@ -2289,6 +2279,16 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
+ * int bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)
+ * 	Description
+ * 		Push an element *value* in *map*. *flags* is one of:
+ *
+ * 		**BPF_EXIST**
+ * 			If the queue/stack is full, the oldest element is
+ * 			removed to make room for this.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
  * int bpf_map_pop_elem(struct bpf_map *map, void *value)
  * 	Description
  * 		Pop an element from *map*.
@@ -2343,29 +2343,94 @@ union bpf_attr {
  * 	Return
  * 		0
  *
+ * int bpf_spin_lock(struct bpf_spin_lock *lock)
+ * 	Description
+ * 		Acquire a spinlock represented by the pointer *lock*, which is
+ * 		stored as part of a value of a map. Taking the lock allows to
+ * 		safely update the rest of the fields in that value. The
+ * 		spinlock can (and must) later be released with a call to
+ * 		**bpf_spin_unlock**\ (\ *lock*\ ).
+ *
+ * 		Spinlocks in BPF programs come with a number of restrictions
+ * 		and constraints:
+ *
+ * 		* **bpf_spin_lock** objects are only allowed inside maps of
+ * 		  types **BPF_MAP_TYPE_HASH** and **BPF_MAP_TYPE_ARRAY** (this
+ * 		  list could be extended in the future).
+ * 		* BTF description of the map is mandatory.
+ * 		* The BPF program can take ONE lock at a time, since taking two
+ * 		  or more could cause dead locks.
+ * 		* Only one **struct bpf_spin_lock** is allowed per map element.
+ * 		* When the lock is taken, calls (either BPF to BPF or helpers)
+ * 		  are not allowed.
+ * 		* The **BPF_LD_ABS** and **BPF_LD_IND** instructions are not
+ * 		  allowed inside a spinlock-ed region.
+ * 		* The BPF program MUST call **bpf_spin_unlock**\ () to release
+ * 		  the lock, on all execution paths, before it returns.
+ * 		* The BPF program can access **struct bpf_spin_lock** only via
+ * 		  the **bpf_spin_lock**\ () and **bpf_spin_unlock**\ ()
+ * 		  helpers. Loading or storing data into the **struct
+ * 		  bpf_spin_lock** *lock*\ **;** field of a map is not allowed.
+ * 		* To use the **bpf_spin_lock**\ () helper, the BTF description
+ * 		  of the map value must be a struct and have **struct
+ * 		  bpf_spin_lock** *anyname*\ **;** field at the top level.
+ * 		  Nested lock inside another struct is not allowed.
+ * 		* The **struct bpf_spin_lock** *lock* field in a map value must
+ * 		  be aligned on a multiple of 4 bytes in that value.
+ * 		* Syscall with command **BPF_MAP_LOOKUP_ELEM** does not copy
+ * 		  the **bpf_spin_lock** field to user space.
+ * 		* Syscall with command **BPF_MAP_UPDATE_ELEM**, or update from
+ * 		  a BPF program, do not update the **bpf_spin_lock** field.
+ * 		* **bpf_spin_lock** cannot be on the stack or inside a
+ * 		  networking packet (it can only be inside of a map values).
+ * 		* **bpf_spin_lock** is available to root only.
+ * 		* Tracing programs and socket filter programs cannot use
+ * 		  **bpf_spin_lock**\ () due to insufficient preemption checks
+ * 		  (but this may change in the future).
+ * 		* **bpf_spin_lock** is not allowed in inner maps of map-in-map.
+ * 	Return
+ * 		0
+ *
+ * int bpf_spin_unlock(struct bpf_spin_lock *lock)
+ * 	Description
+ * 		Release the *lock* previously locked by a call to
+ * 		**bpf_spin_lock**\ (\ *lock*\ ).
+ * 	Return
+ * 		0
+ *
  * struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)
  *	Description
  *		This helper gets a **struct bpf_sock** pointer such
- *		that all the fields in bpf_sock can be accessed.
+ *		that all the fields in this **bpf_sock** can be accessed.
  *	Return
- *		A **struct bpf_sock** pointer on success, or NULL in
+ *		A **struct bpf_sock** pointer on success, or **NULL** in
  *		case of failure.
  *
  * struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk)
  *	Description
  *		This helper gets a **struct bpf_tcp_sock** pointer from a
  *		**struct bpf_sock** pointer.
- *
  *	Return
- *		A **struct bpf_tcp_sock** pointer on success, or NULL in
+ *		A **struct bpf_tcp_sock** pointer on success, or **NULL** in
  *		case of failure.
  *
  * int bpf_skb_ecn_set_ce(struct sk_buf *skb)
- *	Description
- *		Sets ECN of IP header to ce (congestion encountered) if
- *		current value is ect (ECN capable). Works with IPv6 and IPv4.
- *	Return
- *		1 if set, 0 if not set.
+ * 	Description
+ * 		Set ECN (Explicit Congestion Notification) field of IP header
+ * 		to **CE** (Congestion Encountered) if current value is **ECT**
+ * 		(ECN Capable Transport). Otherwise, do nothing. Works with IPv6
+ * 		and IPv4.
+ * 	Return
+ * 		1 if the **CE** flag is set (either by the current helper call
+ * 		or because it was already present), 0 if it is not set.
+ *
+ * struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk)
+ * 	Description
+ * 		Return a **struct bpf_sock** pointer in **TCP_LISTEN** state.
+ * 		**bpf_sk_release**\ () is unnecessary and not allowed.
+ * 	Return
+ * 		A **struct bpf_sock** pointer on success, or **NULL** in
+ * 		case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2465,7 +2530,8 @@ union bpf_attr {
 	FN(spin_unlock),		\
 	FN(sk_fullsock),		\
 	FN(tcp_sock),			\
-	FN(skb_ecn_set_ce),
+	FN(skb_ecn_set_ce),		\
+	FN(get_listener_sock),

 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
@@ -3,7 +3,7 @@
 BPF_VERSION = 0
 BPF_PATCHLEVEL = 0
-BPF_EXTRAVERSION = 1
+BPF_EXTRAVERSION = 2

 MAKEFLAGS += --no-print-directory
@@ -79,8 +79,6 @@ export prefix libdir src obj
 libdir_SQ = $(subst ','\'',$(libdir))
 libdir_relative_SQ = $(subst ','\'',$(libdir_relative))

-LIB_FILE = libbpf.a libbpf.so
-
 VERSION = $(BPF_VERSION)
 PATCHLEVEL = $(BPF_PATCHLEVEL)
 EXTRAVERSION = $(BPF_EXTRAVERSION)
@@ -88,7 +86,10 @@ EXTRAVERSION = $(BPF_EXTRAVERSION)
 OBJ = $@
 N =

-LIBBPF_VERSION = $(BPF_VERSION).$(BPF_PATCHLEVEL).$(BPF_EXTRAVERSION)
+LIBBPF_VERSION	= $(BPF_VERSION).$(BPF_PATCHLEVEL).$(BPF_EXTRAVERSION)
+
+LIB_TARGET	= libbpf.a libbpf.so.$(LIBBPF_VERSION)
+LIB_FILE	= libbpf.a libbpf.so*

 # Set compile option CFLAGS
 ifdef EXTRA_CFLAGS
@@ -128,16 +129,18 @@ all:
 export srctree OUTPUT CC LD CFLAGS V
 include $(srctree)/tools/build/Makefile.include

-BPF_IN := $(OUTPUT)libbpf-in.o
-LIB_FILE := $(addprefix $(OUTPUT),$(LIB_FILE))
-VERSION_SCRIPT := libbpf.map
+BPF_IN		:= $(OUTPUT)libbpf-in.o
+VERSION_SCRIPT	:= libbpf.map
+
+LIB_TARGET	:= $(addprefix $(OUTPUT),$(LIB_TARGET))
+LIB_FILE	:= $(addprefix $(OUTPUT),$(LIB_FILE))

 GLOBAL_SYM_COUNT = $(shell readelf -s --wide $(BPF_IN) | \
 			   awk '/GLOBAL/ && /DEFAULT/ && !/UND/ {s++} END{print s}')
 VERSIONED_SYM_COUNT = $(shell readelf -s --wide $(OUTPUT)libbpf.so | \
 			      grep -Eo '[^ ]+@LIBBPF_' | cut -d@ -f1 | sort -u | wc -l)

-CMD_TARGETS = $(LIB_FILE)
+CMD_TARGETS = $(LIB_TARGET)

 CXX_TEST_TARGET = $(OUTPUT)test_libbpf
@@ -170,9 +173,13 @@ $(BPF_IN): force elfdep bpfdep
 	echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest
version at 'include/uapi/linux/if_xdp.h'" >&2 )) || true $(Q)$(MAKE) $(build)=libbpf -$(OUTPUT)libbpf.so: $(BPF_IN) - $(QUIET_LINK)$(CC) --shared -Wl,--version-script=$(VERSION_SCRIPT) \ - $^ -o $@ +$(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION) + +$(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN) + $(QUIET_LINK)$(CC) --shared -Wl,-soname,libbpf.so.$(VERSION) \ + -Wl,--version-script=$(VERSION_SCRIPT) $^ -o $@ + @ln -sf $(@F) $(OUTPUT)libbpf.so + @ln -sf $(@F) $(OUTPUT)libbpf.so.$(VERSION) $(OUTPUT)libbpf.a: $(BPF_IN) $(QUIET_LINK)$(RM) $@; $(AR) rcs $@ $^ @@ -192,6 +199,12 @@ check_abi: $(OUTPUT)libbpf.so exit 1; \ fi +define do_install_mkdir + if [ ! -d '$(DESTDIR_SQ)$1' ]; then \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$1'; \ + fi +endef + define do_install if [ ! -d '$(DESTDIR_SQ)$2' ]; then \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$2'; \ @@ -200,8 +213,9 @@ define do_install endef install_lib: all_cmd - $(call QUIET_INSTALL, $(LIB_FILE)) \ - $(call do_install,$(LIB_FILE),$(libdir_SQ)) + $(call QUIET_INSTALL, $(LIB_TARGET)) \ + $(call do_install_mkdir,$(libdir_SQ)); \ + cp -fpR $(LIB_FILE) $(DESTDIR)$(libdir_SQ) install_headers: $(call QUIET_INSTALL, headers) \ @@ -219,7 +233,7 @@ config-clean: clean: $(call QUIET_CLEAN, libbpf) $(RM) $(TARGETS) $(CXX_TEST_TARGET) \ - *.o *~ *.a *.so .*.d .*.cmd LIBBPF-CFLAGS + *.o *~ *.a *.so *.so.$(VERSION) .*.d .*.cmd LIBBPF-CFLAGS $(call QUIET_CLEAN, core-gen) $(RM) $(OUTPUT)FEATURE-DUMP.libbpf diff --git a/tools/lib/bpf/README.rst b/tools/lib/bpf/README.rst @@ -111,6 +111,7 @@ starting from ``0.0.1``. Every time ABI is being changed, e.g. because a new symbol is added or semantic of existing symbol is changed, ABI version should be bumped. +This bump in ABI version is at most once per kernel development cycle. 
 For example, if current state of ``libbpf.map`` is:
 
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
@@ -1602,16 +1602,12 @@ static bool btf_equal_int(struct btf_type *t1, struct btf_type *t2)
 /* Calculate type signature hash of ENUM. */
 static __u32 btf_hash_enum(struct btf_type *t)
 {
-	struct btf_enum *member = (struct btf_enum *)(t + 1);
-	__u32 vlen = BTF_INFO_VLEN(t->info);
-	__u32 h = btf_hash_common(t);
-	int i;
+	__u32 h;
 
-	for (i = 0; i < vlen; i++) {
-		h = hash_combine(h, member->name_off);
-		h = hash_combine(h, member->val);
-		member++;
-	}
+	/* don't hash vlen and enum members to support enum fwd resolving */
+	h = hash_combine(0, t->name_off);
+	h = hash_combine(h, t->info & ~0xffff);
+	h = hash_combine(h, t->size);
 	return h;
 }
 
@@ -1637,6 +1633,22 @@ static bool btf_equal_enum(struct btf_type *t1, struct btf_type *t2)
 	return true;
 }
 
+static inline bool btf_is_enum_fwd(struct btf_type *t)
+{
+	return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM &&
+	       BTF_INFO_VLEN(t->info) == 0;
+}
+
+static bool btf_compat_enum(struct btf_type *t1, struct btf_type *t2)
+{
+	if (!btf_is_enum_fwd(t1) && !btf_is_enum_fwd(t2))
+		return btf_equal_enum(t1, t2);
+	/* ignore vlen when comparing */
+	return t1->name_off == t2->name_off &&
+	       (t1->info & ~0xffff) == (t2->info & ~0xffff) &&
+	       t1->size == t2->size;
+}
+
 /*
  * Calculate type signature hash of STRUCT/UNION, ignoring referenced type IDs,
  * as referenced type IDs equivalence is established separately during type
@@ -1860,6 +1872,17 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
 				new_id = cand_node->type_id;
 				break;
 			}
+			if (d->opts.dont_resolve_fwds)
+				continue;
+			if (btf_compat_enum(t, cand)) {
+				if (btf_is_enum_fwd(t)) {
+					/* resolve fwd to full enum */
+					new_id = cand_node->type_id;
+					break;
+				}
+				/* resolve canonical enum fwd to full enum */
+				d->map[cand_node->type_id] = type_id;
+			}
 		}
 		break;
 
@@ -2084,15 +2107,15 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
 		return fwd_kind == real_kind;
 	}
 
-	if (cand_type->info != canon_type->info)
-		return 0;
-
 	switch (cand_kind) {
 	case BTF_KIND_INT:
 		return btf_equal_int(cand_type, canon_type);
 
 	case BTF_KIND_ENUM:
-		return btf_equal_enum(cand_type, canon_type);
+		if (d->opts.dont_resolve_fwds)
+			return btf_equal_enum(cand_type, canon_type);
+		else
+			return btf_compat_enum(cand_type, canon_type);
 
 	case BTF_KIND_FWD:
 		return btf_equal_common(cand_type, canon_type);
@@ -2103,6 +2126,8 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
 	case BTF_KIND_PTR:
 	case BTF_KIND_TYPEDEF:
 	case BTF_KIND_FUNC:
+		if (cand_type->info != canon_type->info)
+			return 0;
 		return btf_dedup_is_equiv(d, cand_type->type, canon_type->type);
 
 	case BTF_KIND_ARRAY: {
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
@@ -840,12 +840,19 @@ static int bpf_object__elf_collect(struct bpf_object *obj, int flags)
 			obj->efile.maps_shndx = idx;
 		else if (strcmp(name, BTF_ELF_SEC) == 0) {
 			obj->btf = btf__new(data->d_buf, data->d_size);
-			if (IS_ERR(obj->btf) || btf__load(obj->btf)) {
+			if (IS_ERR(obj->btf)) {
 				pr_warning("Error loading ELF section %s: %ld. Ignored and continue.\n",
 					   BTF_ELF_SEC, PTR_ERR(obj->btf));
-				if (!IS_ERR(obj->btf))
-					btf__free(obj->btf);
 				obj->btf = NULL;
+				continue;
+			}
+			err = btf__load(obj->btf);
+			if (err) {
+				pr_warning("Error loading %s into kernel: %d. Ignored and continue.\n",
+					   BTF_ELF_SEC, err);
+				btf__free(obj->btf);
+				obj->btf = NULL;
+				err = 0;
 			}
 		} else if (strcmp(name, BTF_EXT_ELF_SEC) == 0) {
 			btf_ext_data = data;
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
@@ -126,8 +126,8 @@ static void xsk_set_umem_config(struct xsk_umem_config *cfg,
 	cfg->frame_headroom = usr_cfg->frame_headroom;
 }
 
-static void xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
-				      const struct xsk_socket_config *usr_cfg)
+static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
+				     const struct xsk_socket_config *usr_cfg)
 {
 	if (!usr_cfg) {
 		cfg->rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
@@ -135,14 +135,19 @@ static void xsk_set_xdp_socket_config(struct xsk_socket_config *cfg,
 		cfg->libbpf_flags = 0;
 		cfg->xdp_flags = 0;
 		cfg->bind_flags = 0;
-		return;
+		return 0;
 	}
 
+	if (usr_cfg->libbpf_flags & ~XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)
+		return -EINVAL;
+
 	cfg->rx_size = usr_cfg->rx_size;
 	cfg->tx_size = usr_cfg->tx_size;
 	cfg->libbpf_flags = usr_cfg->libbpf_flags;
 	cfg->xdp_flags = usr_cfg->xdp_flags;
 	cfg->bind_flags = usr_cfg->bind_flags;
+
+	return 0;
 }
 
 int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,
@@ -557,7 +562,9 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	}
 
 	strncpy(xsk->ifname, ifname, IFNAMSIZ);
-	xsk_set_xdp_socket_config(&xsk->config, usr_config);
+	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
+	if (err)
+		goto out_socket;
 
 	if (rx) {
 		err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -180,6 +180,8 @@ static struct bpf_sock *(*bpf_sk_fullsock)(struct bpf_sock *sk) =
 	(void *) BPF_FUNC_sk_fullsock;
 static struct bpf_tcp_sock *(*bpf_tcp_sock)(struct bpf_sock *sk) =
 	(void *) BPF_FUNC_tcp_sock;
+static struct bpf_sock *(*bpf_get_listener_sock)(struct bpf_sock *sk) =
+	(void *) BPF_FUNC_get_listener_sock;
 static int (*bpf_skb_ecn_set_ce)(void *ctx) =
 	(void *) BPF_FUNC_skb_ecn_set_ce;
 
diff --git a/tools/testing/selftests/bpf/prog_tests/map_lock.c b/tools/testing/selftests/bpf/prog_tests/map_lock.c
@@ -37,7 +37,7 @@ void test_map_lock(void)
 	const char *file = "./test_map_lock.o";
 	int prog_fd, map_fd[2], vars[17] = {};
 	pthread_t thread_id[6];
-	struct bpf_object *obj;
+	struct bpf_object *obj = NULL;
 	int err = 0, key = 0, i;
 	void *ret;
 
diff --git a/tools/testing/selftests/bpf/prog_tests/spinlock.c b/tools/testing/selftests/bpf/prog_tests/spinlock.c
@@ -5,7 +5,7 @@ void test_spinlock(void)
 {
 	const char *file = "./test_spin_lock.o";
 	pthread_t thread_id[4];
-	struct bpf_object *obj;
+	struct bpf_object *obj = NULL;
 	int prog_fd;
 	int err = 0, i;
 	void *ret;
diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields_kern.c b/tools/testing/selftests/bpf/progs/test_sock_fields_kern.c
@@ -8,38 +8,51 @@
 #include "bpf_helpers.h"
 #include "bpf_endian.h"
 
-enum bpf_array_idx {
-	SRV_IDX,
-	CLI_IDX,
-	__NR_BPF_ARRAY_IDX,
+enum bpf_addr_array_idx {
+	ADDR_SRV_IDX,
+	ADDR_CLI_IDX,
+	__NR_BPF_ADDR_ARRAY_IDX,
+};
+
+enum bpf_result_array_idx {
+	EGRESS_SRV_IDX,
+	EGRESS_CLI_IDX,
+	INGRESS_LISTEN_IDX,
+	__NR_BPF_RESULT_ARRAY_IDX,
+};
+
+enum bpf_linum_array_idx {
+	EGRESS_LINUM_IDX,
+	INGRESS_LINUM_IDX,
+	__NR_BPF_LINUM_ARRAY_IDX,
 };
 
 struct bpf_map_def SEC("maps") addr_map = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(__u32),
 	.value_size = sizeof(struct sockaddr_in6),
-	.max_entries = __NR_BPF_ARRAY_IDX,
+	.max_entries = __NR_BPF_ADDR_ARRAY_IDX,
 };
 
 struct bpf_map_def SEC("maps") sock_result_map = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(__u32),
 	.value_size = sizeof(struct bpf_sock),
-	.max_entries = __NR_BPF_ARRAY_IDX,
+	.max_entries = __NR_BPF_RESULT_ARRAY_IDX,
 };
 
 struct bpf_map_def SEC("maps") tcp_sock_result_map = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(__u32),
 	.value_size = sizeof(struct bpf_tcp_sock),
-	.max_entries = __NR_BPF_ARRAY_IDX,
+	.max_entries = __NR_BPF_RESULT_ARRAY_IDX,
 };
 
 struct bpf_map_def SEC("maps") linum_map = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(__u32),
 	.value_size = sizeof(__u32),
-	.max_entries = 1,
+	.max_entries = __NR_BPF_LINUM_ARRAY_IDX,
 };
 
 static bool is_loopback6(__u32 *a6)
@@ -100,18 +113,20 @@ static void tpcpy(struct bpf_tcp_sock *dst,
 
 #define RETURN { \
 	linum = __LINE__; \
-	bpf_map_update_elem(&linum_map, &idx0, &linum, 0); \
+	bpf_map_update_elem(&linum_map, &linum_idx, &linum, 0); \
 	return 1; \
 }
 
 SEC("cgroup_skb/egress")
-int read_sock_fields(struct __sk_buff *skb)
+int egress_read_sock_fields(struct __sk_buff *skb)
 {
-	__u32 srv_idx = SRV_IDX, cli_idx = CLI_IDX, idx;
+	__u32 srv_idx = ADDR_SRV_IDX, cli_idx = ADDR_CLI_IDX, result_idx;
 	struct sockaddr_in6 *srv_sa6, *cli_sa6;
 	struct bpf_tcp_sock *tp, *tp_ret;
 	struct bpf_sock *sk, *sk_ret;
-	__u32 linum, idx0 = 0;
+	__u32 linum, linum_idx;
+
+	linum_idx = EGRESS_LINUM_IDX;
 
 	sk = skb->sk;
 	if (!sk || sk->state == 10)
@@ -132,14 +147,55 @@ int read_sock_fields(struct __sk_buff *skb)
 		RETURN;
 
 	if (sk->src_port == bpf_ntohs(srv_sa6->sin6_port))
-		idx = srv_idx;
+		result_idx = EGRESS_SRV_IDX;
 	else if (sk->src_port == bpf_ntohs(cli_sa6->sin6_port))
-		idx = cli_idx;
+		result_idx = EGRESS_CLI_IDX;
 	else
 		RETURN;
 
-	sk_ret = bpf_map_lookup_elem(&sock_result_map, &idx);
-	tp_ret = bpf_map_lookup_elem(&tcp_sock_result_map, &idx);
+	sk_ret = bpf_map_lookup_elem(&sock_result_map, &result_idx);
+	tp_ret = bpf_map_lookup_elem(&tcp_sock_result_map, &result_idx);
+	if (!sk_ret || !tp_ret)
+		RETURN;
+
+	skcpy(sk_ret, sk);
+	tpcpy(tp_ret, tp);
+
+	RETURN;
+}
+
+SEC("cgroup_skb/ingress")
+int ingress_read_sock_fields(struct __sk_buff *skb)
+{
+	__u32 srv_idx = ADDR_SRV_IDX, result_idx = INGRESS_LISTEN_IDX;
+	struct bpf_tcp_sock *tp, *tp_ret;
+	struct bpf_sock *sk, *sk_ret;
+	struct sockaddr_in6 *srv_sa6;
+	__u32 linum, linum_idx;
+
+	linum_idx = INGRESS_LINUM_IDX;
+
+	sk = skb->sk;
+	if (!sk || sk->family != AF_INET6 || !is_loopback6(sk->src_ip6))
+		RETURN;
+
+	srv_sa6 = bpf_map_lookup_elem(&addr_map, &srv_idx);
+	if (!srv_sa6 || sk->src_port != bpf_ntohs(srv_sa6->sin6_port))
+		RETURN;
+
+	if (sk->state != 10 && sk->state != 12)
+		RETURN;
+
+	sk = bpf_get_listener_sock(sk);
+	if (!sk)
+		RETURN;
+
+	tp = bpf_tcp_sock(sk);
+	if (!tp)
+		RETURN;
+
+	sk_ret = bpf_map_lookup_elem(&sock_result_map, &result_idx);
+	tp_ret = bpf_map_lookup_elem(&tcp_sock_result_map, &result_idx);
 	if (!sk_ret || !tp_ret)
 		RETURN;
 
diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
@@ -5874,6 +5874,50 @@ const struct btf_dedup_test dedup_tests[] = {
 		.dont_resolve_fwds = false,
 	},
 },
+{
+	.descr = "dedup: enum fwd resolution",
+	.input = {
+		.raw_types = {
+			/* [1] fwd enum 'e1' before full enum */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 4),
+			/* [2] full enum 'e1' after fwd */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+				BTF_ENUM_ENC(NAME_NTH(2), 123),
+			/* [3] full enum 'e2' before fwd */
+			BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+				BTF_ENUM_ENC(NAME_NTH(4), 456),
+			/* [4] fwd enum 'e2' after full enum */
+			BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 4),
+			/* [5] incompatible fwd enum with different size */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 1),
+			/* [6] incompatible full enum with different value */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+				BTF_ENUM_ENC(NAME_NTH(2), 321),
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0e1\0e1_val\0e2\0e2_val"),
+	},
+	.expect = {
+		.raw_types = {
+			/* [1] full enum 'e1' */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+				BTF_ENUM_ENC(NAME_NTH(2), 123),
+			/* [2] full enum 'e2' */
+			BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+				BTF_ENUM_ENC(NAME_NTH(4), 456),
+			/* [3] incompatible fwd enum with different size */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 1),
+			/* [4] incompatible full enum with different value */
+			BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4),
+				BTF_ENUM_ENC(NAME_NTH(2), 321),
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0e1\0e1_val\0e2\0e2_val"),
+	},
+	.opts = {
+		.dont_resolve_fwds = false,
+	},
+},
 
 };
 
diff --git a/tools/testing/selftests/bpf/test_sock_fields.c b/tools/testing/selftests/bpf/test_sock_fields.c
@@ -16,10 +16,23 @@
 #include "cgroup_helpers.h"
 #include "bpf_rlimit.h"
 
-enum bpf_array_idx {
-	SRV_IDX,
-	CLI_IDX,
-	__NR_BPF_ARRAY_IDX,
+enum bpf_addr_array_idx {
+	ADDR_SRV_IDX,
+	ADDR_CLI_IDX,
+	__NR_BPF_ADDR_ARRAY_IDX,
+};
+
+enum bpf_result_array_idx {
+	EGRESS_SRV_IDX,
+	EGRESS_CLI_IDX,
+	INGRESS_LISTEN_IDX,
+	__NR_BPF_RESULT_ARRAY_IDX,
+};
+
+enum bpf_linum_array_idx {
+	EGRESS_LINUM_IDX,
+	INGRESS_LINUM_IDX,
+	__NR_BPF_LINUM_ARRAY_IDX,
 };
 
 #define CHECK(condition, tag, format...) ({ \
@@ -41,8 +54,16 @@ static int linum_map_fd;
 static int addr_map_fd;
 static int tp_map_fd;
 static int sk_map_fd;
-static __u32 srv_idx = SRV_IDX;
-static __u32 cli_idx = CLI_IDX;
+
+static __u32 addr_srv_idx = ADDR_SRV_IDX;
+static __u32 addr_cli_idx = ADDR_CLI_IDX;
+
+static __u32 egress_srv_idx = EGRESS_SRV_IDX;
+static __u32 egress_cli_idx = EGRESS_CLI_IDX;
+static __u32 ingress_listen_idx = INGRESS_LISTEN_IDX;
+
+static __u32 egress_linum_idx = EGRESS_LINUM_IDX;
+static __u32 ingress_linum_idx = INGRESS_LINUM_IDX;
 
 static void init_loopback6(struct sockaddr_in6 *sa6)
 {
@@ -93,29 +114,46 @@ static void print_tp(const struct bpf_tcp_sock *tp)
 
 static void check_result(void)
 {
-	struct bpf_tcp_sock srv_tp, cli_tp;
-	struct bpf_sock srv_sk, cli_sk;
-	__u32 linum, idx0 = 0;
+	struct bpf_tcp_sock srv_tp, cli_tp, listen_tp;
+	struct bpf_sock srv_sk, cli_sk, listen_sk;
+	__u32 ingress_linum, egress_linum;
 	int err;
 
-	err = bpf_map_lookup_elem(linum_map_fd, &idx0, &linum);
+	err = bpf_map_lookup_elem(linum_map_fd, &egress_linum_idx,
+				  &egress_linum);
 	CHECK(err == -1, "bpf_map_lookup_elem(linum_map_fd)",
 	      "err:%d errno:%d", err, errno);
 
-	err = bpf_map_lookup_elem(sk_map_fd, &srv_idx, &srv_sk);
-	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &srv_idx)",
+	err = bpf_map_lookup_elem(linum_map_fd, &ingress_linum_idx,
+				  &ingress_linum);
+	CHECK(err == -1, "bpf_map_lookup_elem(linum_map_fd)",
+	      "err:%d errno:%d", err, errno);
+
+	err = bpf_map_lookup_elem(sk_map_fd, &egress_srv_idx, &srv_sk);
+	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &egress_srv_idx)",
+	      "err:%d errno:%d", err, errno);
+	err = bpf_map_lookup_elem(tp_map_fd, &egress_srv_idx, &srv_tp);
+	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &egress_srv_idx)",
+	      "err:%d errno:%d", err, errno);
+
+	err = bpf_map_lookup_elem(sk_map_fd, &egress_cli_idx, &cli_sk);
+	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &egress_cli_idx)",
 	      "err:%d errno:%d", err, errno);
-	err = bpf_map_lookup_elem(tp_map_fd, &srv_idx, &srv_tp);
-	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &srv_idx)",
+	err = bpf_map_lookup_elem(tp_map_fd, &egress_cli_idx, &cli_tp);
+	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &egress_cli_idx)",
 	      "err:%d errno:%d", err, errno);
 
-	err = bpf_map_lookup_elem(sk_map_fd, &cli_idx, &cli_sk);
-	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &cli_idx)",
+	err = bpf_map_lookup_elem(sk_map_fd, &ingress_listen_idx, &listen_sk);
+	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &ingress_listen_idx)",
 	      "err:%d errno:%d", err, errno);
-	err = bpf_map_lookup_elem(tp_map_fd, &cli_idx, &cli_tp);
-	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &cli_idx)",
+	err = bpf_map_lookup_elem(tp_map_fd, &ingress_listen_idx, &listen_tp);
+	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &ingress_listen_idx)",
 	      "err:%d errno:%d", err, errno);
 
+	printf("listen_sk: ");
+	print_sk(&listen_sk);
+	printf("\n");
+
 	printf("srv_sk: ");
 	print_sk(&srv_sk);
 	printf("\n");
@@ -124,6 +162,10 @@ static void check_result(void)
 	print_sk(&cli_sk);
 	printf("\n");
 
+	printf("listen_tp: ");
+	print_tp(&listen_tp);
+	printf("\n");
+
 	printf("srv_tp: ");
 	print_tp(&srv_tp);
 	printf("\n");
@@ -132,6 +174,19 @@ static void check_result(void)
 	print_tp(&cli_tp);
 	printf("\n");
 
+	CHECK(listen_sk.state != 10 ||
+	      listen_sk.family != AF_INET6 ||
+	      listen_sk.protocol != IPPROTO_TCP ||
+	      memcmp(listen_sk.src_ip6, &in6addr_loopback,
+		     sizeof(listen_sk.src_ip6)) ||
+	      listen_sk.dst_ip6[0] || listen_sk.dst_ip6[1] ||
+	      listen_sk.dst_ip6[2] || listen_sk.dst_ip6[3] ||
+	      listen_sk.src_port != ntohs(srv_sa6.sin6_port) ||
+	      listen_sk.dst_port,
+	      "Unexpected listen_sk",
+	      "Check listen_sk output. ingress_linum:%u",
+	      ingress_linum);
+
 	CHECK(srv_sk.state == 10 ||
 	      !srv_sk.state ||
 	      srv_sk.family != AF_INET6 ||
@@ -142,7 +197,8 @@ static void check_result(void)
 		     sizeof(srv_sk.dst_ip6)) ||
 	      srv_sk.src_port != ntohs(srv_sa6.sin6_port) ||
 	      srv_sk.dst_port != cli_sa6.sin6_port,
-	      "Unexpected srv_sk", "Check srv_sk output. linum:%u", linum);
+	      "Unexpected srv_sk", "Check srv_sk output. egress_linum:%u",
+	      egress_linum);
 
 	CHECK(cli_sk.state == 10 ||
 	      !cli_sk.state ||
@@ -154,21 +210,31 @@ static void check_result(void)
 		     sizeof(cli_sk.dst_ip6)) ||
 	      cli_sk.src_port != ntohs(cli_sa6.sin6_port) ||
 	      cli_sk.dst_port != srv_sa6.sin6_port,
-	      "Unexpected cli_sk", "Check cli_sk output. linum:%u", linum);
+	      "Unexpected cli_sk", "Check cli_sk output. egress_linum:%u",
+	      egress_linum);
+
+	CHECK(listen_tp.data_segs_out ||
+	      listen_tp.data_segs_in ||
+	      listen_tp.total_retrans ||
+	      listen_tp.bytes_acked,
+	      "Unexpected listen_tp", "Check listen_tp output. ingress_linum:%u",
+	      ingress_linum);
 
 	CHECK(srv_tp.data_segs_out != 1 ||
 	      srv_tp.data_segs_in ||
 	      srv_tp.snd_cwnd != 10 ||
 	      srv_tp.total_retrans ||
 	      srv_tp.bytes_acked != DATA_LEN,
-	      "Unexpected srv_tp", "Check srv_tp output. linum:%u", linum);
+	      "Unexpected srv_tp", "Check srv_tp output. egress_linum:%u",
+	      egress_linum);
 
 	CHECK(cli_tp.data_segs_out ||
 	      cli_tp.data_segs_in != 1 ||
 	      cli_tp.snd_cwnd != 10 ||
 	      cli_tp.total_retrans ||
 	      cli_tp.bytes_received != DATA_LEN,
-	      "Unexpected cli_tp", "Check cli_tp output. linum:%u", linum);
+	      "Unexpected cli_tp", "Check cli_tp output. egress_linum:%u",
+	      egress_linum);
 }
 
 static void test(void)
@@ -211,10 +277,10 @@ static void test(void)
 	      err, errno);
 
 	/* Update addr_map with srv_sa6 and cli_sa6 */
-	err = bpf_map_update_elem(addr_map_fd, &srv_idx, &srv_sa6, 0);
+	err = bpf_map_update_elem(addr_map_fd, &addr_srv_idx, &srv_sa6, 0);
 	CHECK(err, "map_update", "err:%d errno:%d", err, errno);
 
-	err = bpf_map_update_elem(addr_map_fd, &cli_idx, &cli_sa6, 0);
+	err = bpf_map_update_elem(addr_map_fd, &addr_cli_idx, &cli_sa6, 0);
 	CHECK(err, "map_update", "err:%d errno:%d", err, errno);
 
 	/* Connect from cli_sa6 to srv_sa6 */
@@ -273,9 +339,9 @@ int main(int argc, char **argv)
 	struct bpf_prog_load_attr attr = {
 		.file = "test_sock_fields_kern.o",
 		.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
-		.expected_attach_type = BPF_CGROUP_INET_EGRESS,
 	};
-	int cgroup_fd, prog_fd, err;
+	int cgroup_fd, egress_fd, ingress_fd, err;
+	struct bpf_program *ingress_prog;
 	struct bpf_object *obj;
 	struct bpf_map *map;
 
@@ -293,12 +359,24 @@ int main(int argc, char **argv)
 	err = join_cgroup(TEST_CGROUP);
 	CHECK(err, "join_cgroup", "err:%d errno:%d", err, errno);
 
-	err = bpf_prog_load_xattr(&attr, &obj, &prog_fd);
+	err = bpf_prog_load_xattr(&attr, &obj, &egress_fd);
 	CHECK(err, "bpf_prog_load_xattr()", "err:%d", err);
 
-	err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_INET_EGRESS, 0);
+	ingress_prog = bpf_object__find_program_by_title(obj,
+							 "cgroup_skb/ingress");
+	CHECK(!ingress_prog,
+	      "bpf_object__find_program_by_title(cgroup_skb/ingress)",
+	      "not found");
+	ingress_fd = bpf_program__fd(ingress_prog);
+
+	err = bpf_prog_attach(egress_fd, cgroup_fd, BPF_CGROUP_INET_EGRESS, 0);
 	CHECK(err == -1, "bpf_prog_attach(CPF_CGROUP_INET_EGRESS)",
 	      "err:%d errno%d", err, errno);
+
+	err = bpf_prog_attach(ingress_fd, cgroup_fd,
+			      BPF_CGROUP_INET_INGRESS, 0);
+	CHECK(err == -1, "bpf_prog_attach(CPF_CGROUP_INET_INGRESS)",
+	      "err:%d errno%d", err, errno);
 	close(cgroup_fd);
 
 	map = bpf_object__find_map_by_name(obj, "addr_map");
diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
@@ -1940,3 +1940,28 @@
 	.errstr = "!read_ok",
 	.result = REJECT,
 },
+{
+	"calls: cross frame pruning - liveness propagation",
+	.insns = {
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32),
+	BPF_MOV64_IMM(BPF_REG_8, 0),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_MOV64_IMM(BPF_REG_8, 1),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32),
+	BPF_MOV64_IMM(BPF_REG_9, 0),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_MOV64_IMM(BPF_REG_9, 1),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_8, 1, 1),
+	BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_2, 0),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
+	.errstr_unpriv = "function calls to other bpf functions are allowed for root only",
+	.errstr = "!read_ok",
+	.result = REJECT,
+},
diff --git a/tools/testing/selftests/bpf/verifier/ref_tracking.c b/tools/testing/selftests/bpf/verifier/ref_tracking.c
@@ -605,3 +605,171 @@
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.result = ACCEPT,
 },
+{
+	"reference tracking: use ptr from bpf_tcp_sock() after release",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_tcp_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_tcp_sock, snd_cwnd)),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "invalid mem access",
+},
+{
+	"reference tracking: use ptr from bpf_sk_fullsock() after release",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_sock, type)),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "invalid mem access",
+},
+{
+	"reference tracking: use ptr from bpf_sk_fullsock(tp) after release",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_tcp_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct bpf_sock, type)),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "invalid mem access",
+},
+{
+	"reference tracking: use sk after bpf_sk_release(tp)",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_tcp_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct bpf_sock, type)),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "invalid mem access",
+},
+{
+	"reference tracking: use ptr from bpf_get_listener_sock() after bpf_sk_release(sk)",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_get_listener_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct bpf_sock, src_port)),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = ACCEPT,
+},
+{
+	"reference tracking: bpf_sk_release(listen_sk)",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_get_listener_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct bpf_sock, type)),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "reference has not been acquired before",
+},
+{
+	/* !bpf_sk_fullsock(sk) is checked but !bpf_tcp_sock(sk) is not checked */
+	"reference tracking: tp->snd_cwnd after bpf_sk_fullsock(sk) and bpf_tcp_sock(sk)",
+	.insns = {
+	BPF_SK_LOOKUP,
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_tcp_sock),
+	BPF_MOV64_REG(BPF_REG_8, BPF_REG_0),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_8, offsetof(struct bpf_tcp_sock, snd_cwnd)),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "invalid mem access",
+},
diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c
@@ -342,7 +342,7 @@
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.result = REJECT,
-	.errstr = "type=sock_common expected=sock",
+	.errstr = "reference has not been acquired before",
 },
 {
 	"bpf_sk_release(bpf_sk_fullsock(skb->sk))",
@@ -380,5 +380,5 @@
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.result = REJECT,
-	.errstr = "type=tcp_sock expected=sock",
+	.errstr = "reference has not been acquired before",
 },
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json b/tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json
@@ -286,5 +286,30 @@
         "teardown": [
             "$TC action flush action bpf"
         ]
+    },
+    {
+        "id": "b8a1",
+        "name": "Replace bpf action with invalid goto_chain control",
+        "category": [
+            "actions",
+            "bpf"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action bpf",
+                0,
+                1,
+                255
+            ],
+            "$TC action add action bpf bytecode '1,6 0 0 4294967295' pass index 90"
+        ],
+        "cmdUnderTest": "$TC action replace action bpf bytecode '1,6 0 0 4294967295' goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC action list action bpf",
+        "matchPattern": "action order [0-9]*: bpf.* default-action pass.*index 90",
+        "matchCount": "1",
+        "teardown": [
+            "$TC action flush action bpf"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/connmark.json b/tools/testing/selftests/tc-testing/tc-tests/actions/connmark.json
@@ -287,5 +287,30 @@
         "teardown": [
             "$TC actions flush action connmark"
         ]
+    },
+    {
+        "id": "c506",
+        "name": "Replace connmark with invalid goto chain control",
+        "category": [
+            "actions",
+            "connmark"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action connmark",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action connmark pass index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action connmark goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions get action connmark index 90",
+        "matchPattern": "action order [0-9]+: connmark zone 0 pass.*index 90 ref",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action connmark"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/csum.json b/tools/testing/selftests/tc-testing/tc-tests/actions/csum.json
@@ -500,5 +500,30 @@
         "matchPattern": "^[ \t]+index [0-9]+ ref",
         "matchCount": "0",
         "teardown": []
+    },
+    {
+        "id": "d128",
+        "name": "Replace csum action with invalid goto chain control",
+        "category": [
+            "actions",
+            "csum"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action csum",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action csum iph index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action csum iph goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions get action csum index 90",
+        "matchPattern": "action order [0-9]*: csum \\(iph\\) action pass.*index 90 ref",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action csum"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/gact.json b/tools/testing/selftests/tc-testing/tc-tests/actions/gact.json
@@ -560,5 +560,30 @@
         "teardown": [
             "$TC actions flush action gact"
         ]
+    },
+    {
+        "id": "ca89",
+        "name": "Replace gact action with invalid goto chain control",
+        "category": [
+            "actions",
+            "gact"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action gact",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action pass random determ drop 2 index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action goto chain 42 random determ drop 5 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions list action gact",
+        "matchPattern": "action order [0-9]*: gact action pass.*random type determ drop val 2.*index 90 ref",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action gact"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/ife.json b/tools/testing/selftests/tc-testing/tc-tests/actions/ife.json
@@ -1060,5 +1060,30 @@
         "matchPattern": "action order [0-9]*: ife encode action pipe.*allow prio.*index 4",
         "matchCount": "0",
         "teardown": []
+    },
+    {
+        "id": "a0e2",
+        "name": "Replace ife encode action with invalid goto chain control",
+        "category": [
+            "actions",
+            "ife"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action ife",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action ife encode allow mark pass index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action ife encode allow mark goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions get action ife index 90",
+        "matchPattern": "action order [0-9]*: ife encode action pass.*type 0[xX]ED3E .*allow mark.*index 90 ref",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action ife"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
@@ -434,5 +434,30 @@
         "teardown": [
             "$TC actions flush action mirred"
         ]
+    },
+    {
+        "id": "2a9a",
+        "name": "Replace mirred action with invalid goto chain control",
+        "category": [
+            "actions",
+            "mirred"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action mirred",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action mirred ingress mirror dev lo drop index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action mirred ingress mirror dev lo goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions get action mirred index 90",
+        "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) drop.*index 90 ref",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action mirred"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/nat.json b/tools/testing/selftests/tc-testing/tc-tests/actions/nat.json
@@ -589,5 +589,30 @@
         "teardown": [
             "$TC actions flush action nat"
         ]
+    },
+    {
+        "id": "4b12",
+        "name": "Replace nat action with invalid goto chain control",
+        "category": [
+            "actions",
+            "nat"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action nat",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action nat ingress 1.18.1.1 1.18.2.2 drop index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action nat ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions get action nat index 90",
+        "matchPattern": "action order [0-9]+: nat ingress 1.18.1.1/32 1.18.2.2 drop.*index 90 ref",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action nat"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/pedit.json b/tools/testing/selftests/tc-testing/tc-tests/actions/pedit.json
@@ -0,0 +1,51 @@
+[
+    {
+        "id": "319a",
+        "name": "Add pedit action that mangles IP TTL",
+        "category": [
+            "actions",
+            "pedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action pedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action pedit ex munge ip ttl set 10",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions ls action pedit",
+        "matchPattern": "action order [0-9]+: pedit action pass keys 1.*index 1 ref.*key #0 at ipv4\\+8: val 0a000000 mask 00ffffff",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action pedit"
+        ]
+    },
+    {
+        "id": "7e67",
+        "name": "Replace pedit action with invalid goto chain",
+        "category": [
+            "actions",
+            "pedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action pedit",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action pedit ex munge ip ttl set 10 pass index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action pedit ex munge ip ttl set 10 goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions ls action pedit",
+        "matchPattern": "action order [0-9]+: pedit action pass keys 1.*index 90 ref.*key #0 at ipv4\\+8: val 0a000000 mask 00ffffff",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action pedit"
+        ]
+    }
+]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/police.json b/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
@@ -739,5 +739,30 @@
         "teardown": [
             "$TC actions flush action police"
         ]
+    },
+    {
+        "id": "689e",
+        "name": "Replace police action with invalid goto chain control",
+        "category": [
+            "actions",
+            "police"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action police",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action police rate 3mbit burst 250k drop index 90"
+        ],
+        "cmdUnderTest": "$TC actions replace action police rate 3mbit burst 250k goto chain 42 index 90 cookie c1a0c1a0",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions get action police index 90",
+        "matchPattern": "action order [0-9]*: police 0x5a rate 3Mbit burst 250Kb mtu 2Kb action drop",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action police"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/sample.json b/tools/testing/selftests/tc-testing/tc-tests/actions/sample.json
@@ -584,5 +584,30 @@
         "teardown": [
             "$TC actions flush action sample"
         ]
+    },
+    {
+        "id": "0a6e",
+        "name": "Replace sample action with invalid goto chain control",
+        "category": [
+            "actions",
+            "sample"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action sample",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action sample
rate 1024 group 4 pass index 90" + ], + "cmdUnderTest": "$TC actions replace action sample rate 1024 group 7 goto chain 42 index 90 cookie c1a0c1a0", + "expExitCode": "255", + "verifyCmd": "$TC actions list action sample", + "matchPattern": "action order [0-9]+: sample rate 1/1024 group 4 pass.*index 90", + "matchCount": "1", + "teardown": [ + "$TC actions flush action sample" + ] } ] diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/simple.json b/tools/testing/selftests/tc-testing/tc-tests/actions/simple.json @@ -126,5 +126,30 @@ "teardown": [ "" ] + }, + { + "id": "b776", + "name": "Replace simple action with invalid goto chain control", + "category": [ + "actions", + "simple" + ], + "setup": [ + [ + "$TC actions flush action simple", + 0, + 1, + 255 + ], + "$TC actions add action simple sdata \"hello\" pass index 90" + ], + "cmdUnderTest": "$TC actions replace action simple sdata \"world\" goto chain 42 index 90 cookie c1a0c1a0", + "expExitCode": "255", + "verifyCmd": "$TC actions list action simple", + "matchPattern": "action order [0-9]*: Simple <hello>.*index 90 ref", + "matchCount": "1", + "teardown": [ + "$TC actions flush action simple" + ] } ] diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json @@ -484,5 +484,30 @@ "teardown": [ "$TC actions flush action skbedit" ] + }, + { + "id": "1b2b", + "name": "Replace skbedit action with invalid goto_chain control", + "category": [ + "actions", + "skbedit" + ], + "setup": [ + [ + "$TC actions flush action skbedit", + 0, + 1, + 255 + ], + "$TC actions add action skbedit ptype host pass index 90" + ], + "cmdUnderTest": "$TC actions replace action skbedit ptype host goto chain 42 index 90 cookie c1a0c1a0", + "expExitCode": "255", + "verifyCmd": "$TC actions list action skbedit", + "matchPattern": "action order [0-9]*: skbedit ptype host pass.*index 90 ref", + "matchCount": "1", + "teardown": [ + "$TC actions 
flush action skbedit" + ] } ] diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json @@ -392,5 +392,30 @@ "teardown": [ "$TC actions flush action skbmod" ] + }, + { + "id": "b651", + "name": "Replace skbmod action with invalid goto_chain control", + "category": [ + "actions", + "skbmod" + ], + "setup": [ + [ + "$TC actions flush action skbmod", + 0, + 1, + 255 + ], + "$TC actions add action skbmod set etype 0x1111 pass index 90" + ], + "cmdUnderTest": "$TC actions replace action skbmod set etype 0x1111 goto chain 42 index 90 cookie c1a0c1a0", + "expExitCode": "255", + "verifyCmd": "$TC actions ls action skbmod", + "matchPattern": "action order [0-9]*: skbmod pass set etype 0x1111\\s+index 90 ref", + "matchCount": "1", + "teardown": [ + "$TC actions flush action skbmod" + ] } ] diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json b/tools/testing/selftests/tc-testing/tc-tests/actions/tunnel_key.json @@ -884,5 +884,30 @@ "teardown": [ "$TC actions flush action tunnel_key" ] + }, + { + "id": "8242", + "name": "Replace tunnel_key set action with invalid goto chain", + "category": [ + "actions", + "tunnel_key" + ], + "setup": [ + [ + "$TC actions flush action tunnel_key", + 0, + 1, + 255 + ], + "$TC actions add action tunnel_key set src_ip 10.10.10.1 dst_ip 20.20.20.2 dst_port 3128 nocsum id 1 pass index 90" + ], + "cmdUnderTest": "$TC actions replace action tunnel_key set src_ip 10.10.10.2 dst_ip 20.20.20.1 dst_port 3129 id 2 csum goto chain 42 index 90 cookie c1a0c1a0", + "expExitCode": "255", + "verifyCmd": "$TC actions get action tunnel_key index 90", + "matchPattern": "action order [0-9]+: tunnel_key.*set.*src_ip 10.10.10.1.*dst_ip 20.20.20.2.*key_id 1.*dst_port 3128.*csum pass.*index 90 ref", + "matchCount": "1", + "teardown": [ + "$TC actions flush action tunnel_key" + ] } ] diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json 
b/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json @@ -688,5 +688,30 @@ "teardown": [ "$TC actions flush action vlan" ] + }, + { + "id": "e394", + "name": "Replace vlan push action with invalid goto chain control", + "category": [ + "actions", + "vlan" + ], + "setup": [ + [ + "$TC actions flush action vlan", + 0, + 1, + 255 + ], + "$TC actions add action vlan push id 500 pass index 90" + ], + "cmdUnderTest": "$TC actions replace action vlan push id 500 goto chain 42 index 90 cookie c1a0c1a0", + "expExitCode": "255", + "verifyCmd": "$TC actions get action vlan index 90", + "matchPattern": "action order [0-9]+: vlan.*push id 500 protocol 802.1Q priority 0 pass.*index 90 ref", + "matchCount": "1", + "teardown": [ + "$TC actions flush action vlan" + ] } ]