Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ bigger change than it should be._

The DOCKER chain has a single DROP rule for the bridge network, to drop any
packets routed to the network that have not originated in the network. Added by
[defaultDrop][21].
[setDefaultForwardRule][21].
_This means there is no dependency on the filter-FORWARD chain's default policy.
Even if it is ACCEPT, packets will be dropped unless container ports/protocols
are published._
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
## Container on a nat-unprotected network, with a published port

Running the daemon with the userland proxy disable then, as before, adding a network running a container with a mapped port, equivalent to:

docker network create \
-o com.docker.network.bridge.name=bridge1 \
-o com.docker.network.bridge.gateway_mode_ipv4=nat-unprotected \
--subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
docker run --network bridge1 -p 8080:80 --name c1 busybox

The filter table is:

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 DOCKER-USER 0 -- * * 0.0.0.0/0 0.0.0.0/0
2 0 0 ACCEPT 0 -- * * 0.0.0.0/0 0.0.0.0/0 match-set docker-ext-bridges-v4 dst ctstate RELATED,ESTABLISHED
3 0 0 DOCKER-ISOLATION-STAGE-1 0 -- * * 0.0.0.0/0 0.0.0.0/0
4 0 0 DOCKER 0 -- * * 0.0.0.0/0 0.0.0.0/0 match-set docker-ext-bridges-v4 dst
5 0 0 ACCEPT 0 -- bridge1 !bridge1 0.0.0.0/0 0.0.0.0/0
6 0 0 ACCEPT 0 -- bridge1 bridge1 0.0.0.0/0 0.0.0.0/0
7 0 0 ACCEPT 0 -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
8 0 0 ACCEPT 0 -- docker0 docker0 0.0.0.0/0 0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination

Chain DOCKER (1 references)
num pkts bytes target prot opt in out source destination
1 0 0 DROP 0 -- !docker0 docker0 0.0.0.0/0 0.0.0.0/0
2 0 0 ACCEPT 0 -- !bridge1 bridge1 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num pkts bytes target prot opt in out source destination
1 0 0 DOCKER-ISOLATION-STAGE-2 0 -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
2 0 0 DOCKER-ISOLATION-STAGE-2 0 -- bridge1 !bridge1 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-2 (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 DROP 0 -- * bridge1 0.0.0.0/0 0.0.0.0/0
2 0 0 DROP 0 -- * docker0 0.0.0.0/0 0.0.0.0/0

Chain DOCKER-USER (1 references)
num pkts bytes target prot opt in out source destination
1 0 0 RETURN 0 -- * * 0.0.0.0/0 0.0.0.0/0


<details>
<summary>iptables commands</summary>

-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-A FORWARD -j DOCKER-USER
-A FORWARD -m set --match-set docker-ext-bridges-v4 dst -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -m set --match-set docker-ext-bridges-v4 dst -j DOCKER
-A FORWARD -i bridge1 ! -o bridge1 -j ACCEPT
-A FORWARD -i bridge1 -o bridge1 -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER ! -i docker0 -o docker0 -j DROP
-A DOCKER ! -i bridge1 -o bridge1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i bridge1 ! -o bridge1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-2 -o bridge1 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-USER -j RETURN


</details>

Differences from [nat mode][400]:

- In the DOCKER chain:
- Where `nat` mode appended a default-DROP rule for any packets not accepted
by the per-port/protocol rules, `nat-unprotected` appends a default-ACCEPT
rule. [setDefaultForwardRule][402]
- The ACCEPT rule is needed in case the filter-FORWARD chain's default
policy is DROP.
- Because the default for this network is ACCEPT, there is no per-port/protocol
rule to ACCEPT packets for the published port `80/tcp`, [setPerPortIptables][401]
doesn't set it up.
- _If the userland proxy is enabled, it is still started._

The nat table is identical to [nat mode][400].

<details>
<summary>nat table</summary>

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 DOCKER 0 -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 DOCKER 0 -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 MASQUERADE 0 -- * !bridge1 192.0.2.0/24 0.0.0.0/0
2 0 0 MASQUERADE 0 -- * !docker0 172.17.0.0/16 0.0.0.0/0

Chain DOCKER (2 references)
num pkts bytes target prot opt in out source destination
1 0 0 RETURN 0 -- bridge1 * 0.0.0.0/0 0.0.0.0/0
2 0 0 RETURN 0 -- docker0 * 0.0.0.0/0 0.0.0.0/0
3 0 0 DNAT 6 -- !bridge1 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:192.0.2.2:80


-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 192.0.2.0/24 ! -o bridge1 -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i bridge1 -j RETURN
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i bridge1 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 192.0.2.2:80


</details>

[400]: usernet-portmap.md
[401]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/port_mapping_linux.go#L747
[402]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L261-L266
Original file line number Diff line number Diff line change
Expand Up @@ -90,8 +90,8 @@ Note that:
(unlike all the other rules so-far, which were created during driver or
network initialisation). [setPerPortForwarding][1]
- These per-port rules are inserted at the head of the chain, so that they
appear before the network's DROP rule [defaultDrop][2] which is always
appended to the end of the chain. In this case, because `docker0` was
appear before the network's DROP rule [setDefaultForwardRule][2] which is
always appended to the end of the chain. In this case, because `docker0` was
created before `bridge1`, the `bridge1` rules appear above and below the
`docker0` DROP rule.

Expand Down
1 change: 1 addition & 0 deletions integration/network/bridge/iptablesdoc/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,4 @@ Scenarios:
- [Container on a user-defined network with inter-container communication disabled, with a published port](generated/usernet-portmap-noicc.md)
- [Container on a user-defined --internal network](generated/usernet-internal.md)
- [Container on a routed-mode network, with a published port](generated/usernet-portmap-routed.md)
- [Container on a nat-unprotected network, with a published port](generated/usernet-portmap-natunprot.md)
13 changes: 13 additions & 0 deletions integration/network/bridge/iptablesdoc/iptablesdoc_linux_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,19 @@ var index = []section{
},
}},
},
{
name: "usernet-portmap-natunprot.md",
networks: []bridgeNetwork{{
bridge: "bridge1",
gwMode: "nat-unprotected",
containers: []ctr{
{
name: "c1",
portMappings: nat.PortMap{"80/tcp": {{HostPort: "8080"}}},
},
},
}},
},
}

// iptCmdType is used to look up iptCmds in the markdown (can't use an int
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ bigger change than it should be._

The DOCKER chain has a single DROP rule for the bridge network, to drop any
packets routed to the network that have not originated in the network. Added by
[defaultDrop][21].
[setDefaultForwardRule][21].
_This means there is no dependency on the filter-FORWARD chain's default policy.
Even if it is ACCEPT, packets will be dropped unless container ports/protocols
are published._
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
## Container on a nat-unprotected network, with a published port

Running the daemon with the userland proxy disable then, as before, adding a network running a container with a mapped port, equivalent to:

docker network create \
-o com.docker.network.bridge.name=bridge1 \
-o com.docker.network.bridge.gateway_mode_ipv4=nat-unprotected \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason we need an option to explicitly set this for IPv4/IPv6 is that we changed the default for IPv6 already (but not IPv4 for backward-compat), correct?

(Mostly trying to get my head around what's needed to make the overall experience more aligned in future, and possibly reduce some complexity in codepaths)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaults haven't changed ... in 27.0 (#47871) we introduced the IPv4 and IPv6 gateway mode options - nat became a new name for the old behaviour, and routed mode was added. The default is still nat for both IPv4 and IPv6.

For IPv4, NAT is normal, so you might want to publish ports to the host. For IPv6, direct routing is preferred and, if you have the routing set up in your network, you might want to disable NAT. So, they're configurable separately.

In 28.0, we're changing the default behaviour (nat mode) by blocking access to unpublished ports from other hosts on the same layer-2 network that have set up a route to the container network ("direct routing"). Previously, these ports were only blocked if the filter-FORWARD chain's default policy was DROP. But, docker only sets to DROP if it enables IP forwarding itself.

The old behaviour for a host with filter-FORWARD policy ACCEPT is used as a feature, so we're adding nat-unprotected as a way to restore the old nat behaviour.

It's possible to use nat-unprotected for IPv6, but its filter-FORWARD policy was always set to DROP when ip6tables was enabled - so the new restrictions in nat mode are less likely to be breaking changes.

--subnet 192.0.2.0/24 --gateway 192.0.2.1 bridge1
docker run --network bridge1 -p 8080:80 --name c1 busybox

The filter table is:

{{index . "LFilter4"}}

<details>
<summary>iptables commands</summary>

{{index . "SFilter4"}}

</details>

Differences from [nat mode][400]:

- In the DOCKER chain:
- Where `nat` mode appended a default-DROP rule for any packets not accepted
by the per-port/protocol rules, `nat-unprotected` appends a default-ACCEPT
rule. [setDefaultForwardRule][402]
- The ACCEPT rule is needed in case the filter-FORWARD chain's default
policy is DROP.
- Because the default for this network is ACCEPT, there is no per-port/protocol
rule to ACCEPT packets for the published port `80/tcp`, [setPerPortIptables][401]
doesn't set it up.
- _If the userland proxy is enabled, it is still started._

The nat table is identical to [nat mode][400].

<details>
<summary>nat table</summary>

{{index . "LNat4"}}

{{index . "SNat4"}}

</details>

[400]: usernet-portmap.md
[401]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/port_mapping_linux.go#L747
[402]: https://github.com/robmry/moby/blob/52c89d467fc5326149e4bbb8903d23589b66ff0d/libnetwork/drivers/bridge/setup_ip_tables_linux.go#L261-L266
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,8 @@ Note that:
(unlike all the other rules so-far, which were created during driver or
network initialisation). [setPerPortForwarding][1]
- These per-port rules are inserted at the head of the chain, so that they
appear before the network's DROP rule [defaultDrop][2] which is always
appended to the end of the chain. In this case, because `docker0` was
appear before the network's DROP rule [setDefaultForwardRule][2] which is
always appended to the end of the chain. In this case, because `docker0` was
created before `bridge1`, the `bridge1` rules appear above and below the
`docker0` DROP rule.

Expand Down
28 changes: 20 additions & 8 deletions integration/networking/port_mapping_linux_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import (
"os/exec"
"strconv"
"strings"
"syscall"
"testing"
"time"

Expand Down Expand Up @@ -671,10 +672,14 @@ func TestDirectRoutingOpenPorts(t *testing.T) {
// Run http servers on ports 80 and 81, but only map/open port 80.
createNet := func(gwMode string) ctrDesc {
netName := "test-" + gwMode
brName := "br-" + gwMode
if len(brName) > syscall.IFNAMSIZ {
brName = brName[:syscall.IFNAMSIZ-1]
}
network.CreateNoError(ctx, t, c, netName,
network.WithDriver("bridge"),
network.WithIPv6(),
network.WithOption(bridge.BridgeName, "br-"+gwMode),
network.WithOption(bridge.BridgeName, brName),
network.WithOption(bridge.IPv4GatewayMode, gwMode),
network.WithOption(bridge.IPv6GatewayMode, gwMode),
)
Expand Down Expand Up @@ -711,12 +716,19 @@ func TestDirectRoutingOpenPorts(t *testing.T) {
)

networks := map[string]ctrDesc{
"nat": createNet("nat"),
"routed": createNet("routed"),
"nat": createNet("nat"),
"nat-unprotected": createNet("nat-unprotected"),
"routed": createNet("routed"),
}
expPingExit := map[string]int{
"nat": pingFail,
"routed": pingSuccess,
"nat": pingFail,
"nat-unprotected": pingSuccess,
"routed": pingSuccess,
}
expUnmappedPortHTTP := map[string]string{
"nat": httpFail,
"nat-unprotected": httpSuccess,
"routed": httpFail,
}

testPing := func(t *testing.T, cmd, addr string, expExit int) {
Expand Down Expand Up @@ -746,7 +758,7 @@ func TestDirectRoutingOpenPorts(t *testing.T) {
for _, fwdPolicy := range []string{"ACCEPT", "DROP"} {
networking.SetFilterForwardPolicies(t, fwdPolicy)
t.Run(fwdPolicy, func(t *testing.T) {
for _, gwMode := range []string{"nat", "routed"} {
for gwMode := range networks {
t.Run(gwMode+"/v4/ping", func(t *testing.T) {
testPing(t, "ping", networks[gwMode].ipv4, expPingExit[gwMode])
})
Expand All @@ -757,13 +769,13 @@ func TestDirectRoutingOpenPorts(t *testing.T) {
testHttp(t, networks[gwMode].ipv4, "80", httpSuccess)
})
t.Run(gwMode+"/v4/http/81", func(t *testing.T) {
testHttp(t, networks[gwMode].ipv4, "81", httpFail)
testHttp(t, networks[gwMode].ipv4, "81", expUnmappedPortHTTP[gwMode])
})
t.Run(gwMode+"/v6/http/80", func(t *testing.T) {
testHttp(t, networks[gwMode].ipv6, "80", httpSuccess)
})
t.Run(gwMode+"/v6/http/81", func(t *testing.T) {
testHttp(t, networks[gwMode].ipv6, "81", httpFail)
testHttp(t, networks[gwMode].ipv6, "81", expUnmappedPortHTTP[gwMode])
})
}
})
Expand Down
22 changes: 19 additions & 3 deletions libnetwork/drivers/bridge/bridge_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -167,9 +167,10 @@ type driver struct {
type gwMode string

const (
gwModeDefault gwMode = ""
gwModeNAT gwMode = "nat"
gwModeRouted gwMode = "routed"
gwModeDefault gwMode = ""
gwModeNAT gwMode = "nat"
gwModeNATUnprot gwMode = "nat-unprotected"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to make nat-unprotected the default for existing networks -- only new networks should use nat. At least, that's what it should be for Docker Desktop (and DD doesn't have a way to 'migrate' Docker objects during upgrades).

Copy link
Contributor Author

@robmry robmry Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting our discussion here ...

Setting nat-unprotected for existing IPv4 networks would be right if the host had previously been running with the filter-FORWARD chain's policy set to ACCEPT ... which happens when sysctl ip_forward is enabled before the daemon starts.

But, when it's set to DROP, and for IPv6 - only published ports would be exposed. So, we'd silently be making things worse in many cases by defaulting to nat-unprotected.

So, the change to the default nat mode will break a few setups by blocking remote access to unpublished ports, and anyone who's relying on that will have to re-create their networks with the new mode - but, at least it's fail-safe, and this change will block remote access for users (possibly the majority) who hadn't realised those ports were open.

gwModeRouted gwMode = "routed"
)

// New constructs a new bridge driver
Expand Down Expand Up @@ -366,6 +367,8 @@ func newGwMode(gwMode string) (gwMode, error) {
switch gwMode {
case "nat":
return gwModeNAT, nil
case "nat-unprotected":
return gwModeNATUnprot, nil
case "routed":
return gwModeRouted, nil
}
Expand All @@ -376,6 +379,10 @@ func (m gwMode) routed() bool {
return m == gwModeRouted
}

func (m gwMode) unprotected() bool {
return m == gwModeNATUnprot
}

func parseErr(label, value, errString string) error {
return types.InvalidParameterErrorf("failed to parse %s value: %v (%s)", label, value, errString)
}
Expand Down Expand Up @@ -428,6 +435,15 @@ func (n *bridgeNetwork) getNATDisabled() (ipv4, ipv6 bool) {
return n.config.GwModeIPv4.routed(), n.config.GwModeIPv6.routed()
}

func (n *bridgeNetwork) gwMode(v iptables.IPVersion) gwMode {
n.Lock()
defer n.Unlock()
Comment on lines +439 to +440
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly curious; we have so much locking everywhere; do we have good insight what exactly is it synchronising? (was contemplating if it's purely n.config, and if we should have an accessor for that to return it 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it's protecting concurrent access to n.config ... an accessor to fetch a copy of the whole of config might be better - but most of the locking is probably unnecessary because the config won't change once created.

But, when adding new accessors, I'm sticking with the scheme others us - it needs proper review and tidy-up and, for me, the cost of acquiring the mutex is outweighed by the risk of making it crashy by missing something!

if v == iptables.IPv4 {
return n.config.GwModeIPv4
}
return n.config.GwModeIPv6
}

func (n *bridgeNetwork) userlandProxyPath() string {
n.Lock()
defer n.Unlock()
Expand Down
7 changes: 5 additions & 2 deletions libnetwork/drivers/bridge/port_mapping_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -776,8 +776,11 @@ func (n *bridgeNetwork) setPerPortIptables(b portBinding, enable bool) error {
if err := setPerPortNAT(b, v, proxyPath, bridgeName, enable); err != nil {
return err
}
if err := setPerPortForwarding(b, v, bridgeName, enable); err != nil {
return err

if !n.gwMode(v).unprotected() {
if err := setPerPortForwarding(b, v, bridgeName, enable); err != nil {
return err
}
}
return nil
}
Expand Down
Loading
Loading