Stuck on Starting v6 Control Plane #35

Open
saguarobrian opened this issue Nov 26, 2024 · 4 comments

Comments

@saguarobrian

Starting with no containers or images, I get the following on Ubuntu. This worked with version 1.20 of ZeroTier on another Ubuntu box.

Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble

When I run the following command (NetworkID Changed)

docker run --name zerotier-one --device=/dev/net/tun --cap-add=NET_ADMIN --cap-add=NET_RAW --cap-add=SYS_ADMIN --env TZ=Etc/UTC --env PUID=999 --env PGID=994 --env ZEROTIER_ONE_LOCAL_PHYS=enp1s0f0 --env ZEROTIER_ONE_USE_IPTABLES_NFT=false --env ZEROTIER_ONE_GATEWAY_MODE=inbound --env ZEROTIER_ONE_NETWORK_IDS=xxxxxxxxxx -v /var/lib/zerotier-one:/var/lib/zerotier-one zyclonite/zerotier:router

I get the following

sagadmin@mn-zerotier:~$ sudo docker run --name zerotier-one --device=/dev/net/tun --cap-add=NET_ADMIN --cap-add=NET_RAW --cap-add=SYS_ADMIN --env TZ=Etc/UTC --env PUID=999 --env PGID=994 --env ZEROTIER_ONE_LOCAL_PHYS=enp1s0f0 --env ZEROTIER_ONE_USE_IPTABLES_NFT=false --env ZEROTIER_ONE_GATEWAY_MODE=inbound --env ZEROTIER_ONE_NETWORK_IDS=xxxxxxxxx -v /var/lib/zerotier-one:/var/lib/zerotier-one zyclonite/zerotier:router
Unable to find image 'zyclonite/zerotier:router' locally
router: Pulling from zyclonite/zerotier
20aa84b242f8: Pull complete
04e7b3706e72: Pull complete
43f96568de03: Pull complete
Digest: sha256:78540326002a2b6fa2249f64e3d0d716fe8b457c7be990d16c1f75245f42796e
Status: Downloaded newer image for zyclonite/zerotier:router
Tue Nov 26 01:22:26 UTC 2024 - launching ZeroTier-One in routing mode
adding iptables-legacy rules for inbound traffic (ZeroTier to local interfaces enp1s0f0)
Tue Nov 26 01:22:26 UTC 2024 - ZeroTier daemon is running as process 17
Starting Control Plane...
Starting V6 Control Plane...

@Paraphraser
Contributor

I'm not immediately sure what question you're asking but:

  1. I use docker compose rather than docker run; and
  2. I'm running it on Raspberry Pi Bullseye so that might make a difference.

Here's my service definition:

  zerotier-router:
    container_name: zerotier
    image: "zyclonite/zerotier:router"
    restart: unless-stopped
    environment:
      - TZ=${TZ:-Etc/UTC}
      - PUID=1000
      - PGID=1000
      - ZEROTIER_ONE_NETWORK_IDS=whatever
      - ZEROTIER_ONE_LOCAL_PHYS=eth0 wlan0
      - ZEROTIER_ONE_USE_IPTABLES_NFT=true
      - ZEROTIER_ONE_GATEWAY_MODE=both
    network_mode: host
    volumes:
      - ./volumes/zerotier-one:/var/lib/zerotier-one
    devices:
      - "/dev/net/tun:/dev/net/tun"
    cap_add:
      - NET_ADMIN
      - SYS_ADMIN
      - NET_RAW

The one thing I did note is your docker run doesn't put the container into host mode. Perhaps try that and see what happens.
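Whether an existing container is in host mode can be checked rather than guessed. The sketch below is an illustration only: `is_host_mode` is a hypothetical helper name, and the container name `zerotier-one` in the usage comment is taken from the original `docker run` command.

```shell
# Succeeds (exit status 0) only when the network mode read from stdin is "host".
is_host_mode() {
  read -r mode
  [ "$mode" = "host" ]
}

# Usage (assumes the container is named zerotier-one):
#   docker inspect --format '{{.HostConfig.NetworkMode}}' zerotier-one | is_host_mode \
#     && echo "host mode" || echo "not host mode"
```

A container started without `--network=host` reports some other mode, so the check fails.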

If that doesn't help, try setting ZEROTIER_ONE_USE_IPTABLES_NFT to true. You can also try checking to see if the network filters are in place. Something like:

$ sudo nft list ruleset | grep zt
		oifname "zt*" counter packets 762303 bytes 43577628 masquerade 
		iifname "zt*" oifname "eth0" counter packets 1147731 bytes 454431979 accept
		iifname "eth0" oifname "zt*" counter packets 2488151 bytes 1536233198 accept
		iifname "zt*" oifname "wlan0" counter packets 0 bytes 0 accept
		iifname "wlan0" oifname "zt*" counter packets 0 bytes 0 accept

In my case, everything is transiting the Ethernet port. The WiFi port is really only there as a backup.

If you prefer to stick with iptables:

$ sudo iptables -S | grep zt

Another possibility is to start with ZEROTIER_ONE_GATEWAY_MODE=both until you get it working, then back off to inbound if that is what you want.

If it looks like the container is running and the net-filters are in place, try sniffing the zerotier interface:

$ sudo tcpdump -i ztr2qsmswx 

where you can get the interface name from ip a | grep zt. Then try to trigger some traffic that should be going into the tunnel and see whether it gets there.
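If you would rather not copy the interface name by hand, it can be pulled out of the one-line-per-interface output of `ip -o link show`. This is a minimal sketch, assuming ZeroTier interface names start with `zt` as they do in this thread; `zt_interfaces` is a hypothetical helper name.

```shell
# Print the names of ZeroTier interfaces found in `ip -o link show` output on stdin.
# Each input line looks like "4: ztr2qsmswx: <FLAGS> mtu ...", so splitting on ": "
# leaves the interface name in the second field.
zt_interfaces() {
  awk -F': ' '$2 ~ /^zt/ { print $2 }'
}

# Usage:
#   ip -o link show | zt_interfaces
#   sudo tcpdump -ni "$(ip -o link show | zt_interfaces | head -n 1)"
```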

If you're just getting started with ZeroTier-router, you might also find this helpful.

@saguarobrian
Author

I did try with NFT, but that did not work. NFT tables do not look to be supported on Ubuntu, and the ZeroTier docs say NFT is only on Raspberry Pi systems.

@saguarobrian
Author

I think this may be an issue with the permissions on /var/lib/zerotier. It appears to set a weird ownership.

@Paraphraser
Contributor

the ZeroTier docs say NFT is only on Raspberry Pi systems.

I assume you mean the following words from README-router.md:

"Try true if NAT does not seem to be working. This is needed on Raspberry Pi Bullseye."

That doesn't actually say only on Raspberry Pi. It simply says it is needed on Pi Bullseye. It may also be needed on other systems.

Some background may help. At the time I was writing that readme file, I was doing the work on a Bullseye Pi. The only thing I knew for certain was that setting up the net filters with iptables didn't work, whereas iptables-nft did. I didn't have other systems to run tests on so I didn't know whether iptables-nft would always work, or if there were situations where iptables would work but iptables-nft would not work. The ZEROTIER_ONE_USE_IPTABLES_NFT variable solves that conundrum by letting the user pick the approach that works.

NFT tables do not look to be supported on Ubuntu

I do not believe that is true.

All this is from an Ubuntu guest running on Proxmox-VE:

$ grep "^VERSION=" /etc/os-release 
VERSION="24.04.1 LTS (Noble Numbat)"

No containers, no images, nothing up my sleeves:

$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

$ docker images
REPOSITORY   TAG       IMAGE ID   CREATED   SIZE

What's the story with net filters:

$ sudo nft list ruleset | grep -ci docker
# Warning: table ip nat is managed by iptables-nft, do not touch!
# Warning: table ip filter is managed by iptables-nft, do not touch!
25

$ sudo nft list ruleset 2>/dev/null | grep -ci zt 
0

$ sudo nft list ruleset 2>/dev/null | wc -l
75

The # lines are written to stderr. The 25 hits on the word "docker", the zero hits on "zt", and the overall count of 75 lines in the output should be enough to confirm that Docker has set up a bunch of net filters. And this is all long before any containers are running.
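The three counts can be gathered in a single pass if you want to repeat the check later. This is only a sketch: `summarise_ruleset` is a hypothetical helper, and it matches case-insensitively in the same way as `grep -ci`.

```shell
# Read an nftables ruleset on stdin and print how many lines mention "docker",
# how many mention "zt", and the total number of lines.
summarise_ruleset() {
  awk '
    { total++ }
    tolower($0) ~ /docker/ { docker++ }
    tolower($0) ~ /zt/     { zt++ }
    END { printf "docker=%d zt=%d lines=%d\n", docker, zt, total }
  '
}

# Usage (2>/dev/null hides the iptables-nft warnings written to stderr):
#   sudo nft list ruleset 2>/dev/null | summarise_ruleset
```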

If you don't get a similar result then I'd suggest maybe revisiting how you installed Docker on this system. Please treat that as a serious suggestion. You'd be surprised at how many problems can be traced to an improper installation of Docker. And those improper installations can often be traced to bad advice in YouTube videos or "how to" documents on the web. If you want me to expand on this, I'd be happy to.

You also wrote this:

I think this may be an issue with the permissions on /var/lib/zerotier. It appears to set a weird ownership.

The answer here is roughly the same as for iptables-nft. Please take a look at these lines in the entrypoint script.

At the time I didn't know the origin of 999 and 994. I still don't. Those IDs aren't defined inside the container so I'd definitely agree this is a bit "weird". But, because I didn't understand it, I preserved those IDs as the defaults. On the Linux systems I play with, the first non-root user gets UID=1000, GID=1000 so I tested that ZeroTier (as implemented in the container) didn't object to those values being applied via environment variables. I work mainly with docker compose so I prefer to use 1000/1000 for persistent stores unless the container objects.

In any event, the ZeroTier processes run as root so, as far as I'm aware, the only thing that gets affected by PUID/PGID is the ownership of /var/lib/zerotier-one inside the container, and whatever that path maps to outside the container.

In short, I doubt that 999/994 is going to be the answer.
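To see what PUID/PGID actually did, you can compare the numeric owner of the host side of the bind mount with the values passed to the container. A minimal sketch, assuming GNU `stat` (as shipped on Ubuntu); `owner_of` is a hypothetical helper name.

```shell
# Print the numeric owner and group of a path as "uid:gid".
# Relies on GNU stat's -c option, which is what Ubuntu provides.
owner_of() {
  stat -c '%u:%g' "$1"
}

# Usage: the result should match the PUID:PGID given to the container.
#   owner_of /var/lib/zerotier-one
```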


We're going to be talking about "routing" so let's baseline the routing table on this Ubuntu system so we can spot changes later:

$ ip r
default via 192.168.132.1 dev ens18 proto dhcp src 192.168.132.252 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.132.0/24 dev ens18 proto kernel scope link src 192.168.132.252 metric 100 
192.168.132.1 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.55 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.57 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 

Here's my compose file:

---

services:

  zerotier-router:
    container_name: zerotier
    image: "zyclonite/zerotier:router"
    restart: unless-stopped
    environment:
      - TZ=${TZ:-Etc/UTC}
      - PUID=1000
      - PGID=1000
      - ZEROTIER_ONE_NETWORK_IDS=${ZEROTIER_ONE_NETWORK_IDS}
      - ZEROTIER_ONE_LOCAL_PHYS=ens18
      - ZEROTIER_ONE_USE_IPTABLES_NFT=true
      - ZEROTIER_ONE_GATEWAY_MODE=both
    network_mode: host
    volumes:
      - ./volumes/zerotier-one:/var/lib/zerotier-one
    devices:
      - "/dev/net/tun:/dev/net/tun"
    cap_add:
      - NET_ADMIN
      - SYS_ADMIN
      - NET_RAW

Let's give it a whirl:

$ docker compose up -d
[+] Running 4/4
 ✔ zerotier-router Pulled  5.6s
   ✔ 20aa84b242f8 Pull complete  1.7s
   ✔ 04e7b3706e72 Pull complete  1.8s
   ✔ 43f96568de03 Pull complete  2.1s
[+] Running 1/1
 ✔ Container zerotier  Started  0.3s

$ docker logs zerotier 
Assuming container first run.
Configuring auto-join of network ID: 9999888877776666
You will need to authorize this host at:
   https://my.zerotier.com/network/9999888877776666
changed ownership of '/var/lib/zerotier-one/networks.d/9999888877776666.conf' to 1000:1000
changed ownership of '/var/lib/zerotier-one/networks.d' to 1000:1000
changed ownership of '/var/lib/zerotier-one' to 1000:1000
Wed Nov 27 10:36:32 AEDT 2024 - launching ZeroTier-One in routing mode
adding iptables-nft rules for bi-directional traffic (local interfaces ens18 to/from ZeroTier)
Wed Nov 27 10:36:32 AEDT 2024 - ZeroTier daemon is running as process 18
Starting Control Plane...
Starting V6 Control Plane...

Now, yes, the last line is "Starting V6 Control Plane" but that doesn't mean it's somehow stuck. That's just the last message it emitted.
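Because the last log line says nothing about health, a script that needs to know whether the daemon is alive can ask it directly through `zerotier-cli info`. The sketch below assumes the reply has the shape `200 info <address> <version> <status>` with `ONLINE` as the healthy status; `zt_is_online` is a hypothetical helper name.

```shell
# Succeeds when `zerotier-cli info` output on stdin reports ONLINE.
zt_is_online() {
  awk '$1 == "200" && $2 == "info" && $NF == "ONLINE" { ok = 1 } END { exit !ok }'
}

# Usage (assumes the container is named zerotier):
#   docker exec zerotier zerotier-cli info | zt_is_online && echo "daemon is online"
```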

What does the container think is going on?

$ docker exec zerotier zerotier-cli info
200 info b9b925f6f2 1.14.2 ONLINE

$ docker exec zerotier zerotier-cli listnetworks
200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>
200 listnetworks 9999888877776666  22:2b:06:d3:2e:6b ACCESS_DENIED PRIVATE ztr2qsmswx -

It thinks ZeroTier is up but this newly established client is not authorised to connect to my ZeroTier network. That's expected: I haven't gone to ZeroTier Central to authorise this new client. Also expected is that the routing table hasn't changed, because the client hasn't joined my ZeroTier network and so has nothing to forward traffic to or from:

$ ip r
default via 192.168.132.1 dev ens18 proto dhcp src 192.168.132.252 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.132.0/24 dev ens18 proto kernel scope link src 192.168.132.252 metric 100 
192.168.132.1 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.55 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.57 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 

Meanwhile, what's the story with net filters:

$ sudo nft list ruleset | grep -ci docker
# Warning: table ip nat is managed by iptables-nft, do not touch!
# Warning: table ip filter is managed by iptables-nft, do not touch!
25

$ sudo nft list ruleset 2>/dev/null | grep -ci zt
3

$ sudo nft list ruleset 2>/dev/null | wc -l
79

Thus some new rules have been added.
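Counting lines shows that something changed; to see what changed, you can snapshot the ruleset before and after starting the container and compare the two. This is a rough sketch: `added_rules` and the file names are hypothetical.

```shell
# Print lines that appear in the second snapshot but not in the first.
# diff marks such lines with a leading "> ", which sed strips off.
added_rules() {
  diff "$1" "$2" | sed -n 's/^> //p'
}

# Usage:
#   sudo nft list ruleset 2>/dev/null > before.txt   # before starting the container
#   sudo nft list ruleset 2>/dev/null > after.txt    # after starting it
#   added_rules before.txt after.txt
```

Rules with packet and byte counters will also show up if their counters moved between the two snapshots, so expect some noise on a busy host.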

I'll go to ZeroTier Central and authorise this client ...

What does the container think now?

$ docker exec zerotier zerotier-cli listnetworks
200 listnetworks <nwid> <name> <mac> <status> <type> <dev> <ZT assigned ips>
200 listnetworks 9999888877776666 My_ZeroTier 22:2b:06:d3:2e:6b OK PRIVATE ztr2qsmswx 10.244.34.107/16
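The status column is easy to read by eye but awkward to script, because the network name is empty until the client is authorised and so the column position shifts. A minimal sketch that counts fields from the end of the line instead; `zt_network_status` is a hypothetical helper, and it assumes the assigned IPs form a single token without spaces.

```shell
# Print the status column for one network ID from `zerotier-cli listnetworks`
# output on stdin. The last four fields are status, type, device and assigned IPs,
# so the status is the fourth field from the end.
zt_network_status() {
  awk -v nwid="$1" '$2 == "listnetworks" && $3 == nwid { print $(NF - 3) }'
}

# Usage (assumes the container is named zerotier):
#   docker exec zerotier zerotier-cli listnetworks | zt_network_status 9999888877776666
```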

It has joined the network. The routing table?

$ ip r
default via 192.168.132.1 dev ens18 proto dhcp src 192.168.132.252 metric 100 
10.244.0.0/16 dev ztr2qsmswx proto kernel scope link src 10.244.34.107 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.0.0/23 via 10.244.124.118 dev ztr2qsmswx proto static metric 5000 
192.168.132.0/24 dev ens18 proto kernel scope link src 192.168.132.252 metric 100 
192.168.132.1 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.55 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.57 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 

All as expected. 10.244/16 is the ZeroTier network, while 192.168/23 is being advertised by ZeroTier Cloud. So what about reachability of a host where the only possible path is via the ZeroTier Cloud?

$ traceroute 192.168.1.60
traceroute to 192.168.1.60 (192.168.1.60), 30 hops max, 60 byte packets
 1  10.244.124.118 (10.244.124.118)  102.100 ms  102.070 ms  103.146 ms
 2  * 192.168.1.60 (192.168.1.60)  104.616 ms  104.630 ms

All works.
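If you want to confirm which destinations are routed into the tunnel without reading the whole table, the `ip r` output can be filtered on the outgoing device. A small sketch, again assuming interface names that start with `zt`; `routes_via_zt` is a hypothetical helper name.

```shell
# Print the destination of every route in `ip r` output on stdin whose
# outgoing device (the word after "dev") starts with "zt".
routes_via_zt() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev" && $(i + 1) ~ /^zt/) print $1 }'
}

# Usage:
#   ip r | routes_via_zt
```

Against the routing table above, this would list 10.244.0.0/16 and 192.168.0.0/23.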


Now, I've never done this before but I'll try using docker run to achieve the same thing. First, down the running container and check that the routes are withdrawn and net filter rules have gone away:

$ docker compose down
[+] Running 1/1
 ✔ Container zerotier  Removed  2.4s

$ ip r
default via 192.168.132.1 dev ens18 proto dhcp src 192.168.132.252 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.132.0/24 dev ens18 proto kernel scope link src 192.168.132.252 metric 100 
192.168.132.1 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.55 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.57 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 

$ sudo nft list ruleset 2>/dev/null | grep -ci zt
0

The decks are cleared. Holding my breath:

$ docker run \
  --name zerotier \
  --network=host -d \
  --device=/dev/net/tun \
  --cap-add=NET_ADMIN --cap-add=NET_RAW --cap-add=SYS_ADMIN \
  --env TZ=Etc/UTC --env PUID=$(id -u) --env PGID=$(id -g) \
  --env ZEROTIER_ONE_LOCAL_PHYS=ens18 \
  --env ZEROTIER_ONE_USE_IPTABLES_NFT=true \
  --env ZEROTIER_ONE_GATEWAY_MODE=both \
  --env ZEROTIER_ONE_NETWORK_IDS=9999888877776666 \
  -v /home/moi/IOTstack/volumes/zerotier-one:/var/lib/zerotier-one \
  zyclonite/zerotier:router
790e02d420a363cf227ff1f47ced9058349a4feaa565e292fa7a894ffcae6128

Notes:

  • I prefer the name zerotier rather than zerotier-one.
  • I've added --network=host to put the container into host mode, and -d to detach the container so it runs in the background.
  • I've kept the owner and group IDs of the current user (1000 in both cases).
  • iptables-nft is enabled because I already know it works that way.
  • I don't think both matters here but I enabled it anyway.
  • The -v keeps the same persistent storage location as I was using for docker compose. That way I don't have to authorise a new client.

What's the situation now this is running?

$ sudo nft list ruleset 2>/dev/null | grep -ci zt
3

$ ip r
default via 192.168.132.1 dev ens18 proto dhcp src 192.168.132.252 metric 100 
10.244.0.0/16 dev ztr2qsmswx proto kernel scope link src 10.244.34.107 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.0.0/23 via 10.244.124.118 dev ztr2qsmswx proto static metric 5000 
192.168.132.0/24 dev ens18 proto kernel scope link src 192.168.132.252 metric 100 
192.168.132.1 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.55 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 
192.168.132.57 dev ens18 proto dhcp scope link src 192.168.132.252 metric 100 

$ docker logs zerotier 
changed ownership of '/var/lib/zerotier-one/planet' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/164b13f1c0.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/bc8a31dddc.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/cafe9efeb9.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/778cde7190.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/cafe04eba9.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/6168925b5d.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/8ef945cc52.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/cafe9ccda7.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/62f865ae71.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d/565799d8f6.peer' to 1000:1000
changed ownership of '/var/lib/zerotier-one/peers.d' to 1000:1000
changed ownership of '/var/lib/zerotier-one/identity.secret' to 1000:1000
changed ownership of '/var/lib/zerotier-one/controller.d/network' to 1000:1000
changed ownership of '/var/lib/zerotier-one/controller.d/trace' to 1000:1000
changed ownership of '/var/lib/zerotier-one/controller.d' to 1000:1000
changed ownership of '/var/lib/zerotier-one/identity.public' to 1000:1000
changed ownership of '/var/lib/zerotier-one/metrics.prom' to 1000:1000
changed ownership of '/var/lib/zerotier-one/zerotier-one.port' to 1000:1000
changed ownership of '/var/lib/zerotier-one/metricstoken.secret' to 1000:1000
changed ownership of '/var/lib/zerotier-one/authtoken.secret' to 1000:1000
Wed Nov 27 00:02:21 UTC 2024 - launching ZeroTier-One in routing mode
adding iptables-nft rules for bi-directional traffic (local interfaces ens18 to/from ZeroTier)
Wed Nov 27 00:02:21 UTC 2024 - ZeroTier daemon is running as process 18
Starting Control Plane...
Starting V6 Control Plane...

I reckon it works. On Ubuntu 24.04.1 LTS (Noble Numbat). At least when that's running as a Proxmox-VE guest on an old Intel-based MacBook Pro.


That said, I now realise that the "Command line example" in the documentation lacks a lot. It really should be something like this:

$ docker run --name zerotier-one --device=/dev/net/tun \
  --network=host -d \
  --cap-add=NET_ADMIN --cap-add=NET_RAW --cap-add=SYS_ADMIN \
  --env TZ=Etc/UTC --env PUID=$(id -u) --env PGID=$(id -g) \
  --env ZEROTIER_ONE_LOCAL_PHYS=eth0 \
  --env ZEROTIER_ONE_USE_IPTABLES_NFT=true \
  --env ZEROTIER_ONE_GATEWAY_MODE=inbound \
  --env ZEROTIER_ONE_NETWORK_IDS=«yourDefaultNetworkID(s)» \
  -v /var/lib/zerotier-one:/var/lib/zerotier-one \
  zyclonite/zerotier:router

Similarly, the example service definition needs some tweaks:

  1. The version key is deprecated and should be replaced with the --- "here comes YAML" indicator.
  2. ZEROTIER_ONE_USE_IPTABLES_NFT should be set to true.

The section on ZEROTIER_ONE_USE_IPTABLES_NFT needs a better explanation so the reader is guided in the right direction. I'm thinking something like this:

  • ZEROTIER_ONE_USE_IPTABLES_NFT - controls the command the container uses to set up NAT forwarding. Example:

      environment:
      - ZEROTIER_ONE_USE_IPTABLES_NFT=true
    
    • false means the container uses iptables. This is the default if the variable is omitted but that is only to maintain backwards compatibility.
    • true means the container uses iptables-nft. This is generally what you need.

    Keep in mind that the container is manipulating filter tables on the host so the container needs to meet the host's expectations. The best advice is to start with true and only switch to false if Network Address Translation (NAT) doesn't seem to be working.
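One possible way to guess the host's expectations before starting the container is to look at which backend the host's own `iptables` reports. This is a heuristic sketch, not something the README or the entrypoint script does: it assumes an iptables 1.8 or later host, where `iptables --version` ends with `(nf_tables)` or `(legacy)`, and `suggest_nft_setting` is a hypothetical helper name.

```shell
# Read `iptables --version` output on stdin and print the value to try first
# for ZEROTIER_ONE_USE_IPTABLES_NFT. Prints "unknown" if no backend is reported.
suggest_nft_setting() {
  read -r version_line
  case "$version_line" in
    *"(nf_tables)"*) echo true ;;
    *"(legacy)"*)    echo false ;;
    *)               echo unknown ;;
  esac
}

# Usage (run on the host, not inside the container):
#   iptables --version | suggest_nft_setting
```

Treat the result as a starting point only; the advice to switch if NAT doesn't seem to be working still applies.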


Please tell me if you are able to get it working and whether you'd like to see anything else in the documentation. Then I'll prepare a pull request.
