This repository was archived by the owner on Jul 19, 2025. It is now read-only.

Ksync on connection error due to the watcher and kubernetes timebound API connection #310

@alok87


As discussed in #285 (comment), ksync should run behind a daemon manager. We have put ksync behind a daemon manager that automatically restarts it when it crashes.

However, we found a case where ksync logs a connection error, does not recover from it, and stays stuck: the process keeps running but stops syncing. Because the error is only logged and is not treated as fatal, the process never exits, so the daemon manager does not restart ksync either.

[daemon-manager] Using "ksync" service for syncing local code to remote container
[daemon-manager] Running "ksync watch" in auto restart mode
INFO[0000] listening                                     bind=127.0.0.1 port=40322
INFO[0002] syncthing listening                           port=8384 syncthing=localhost
INFO[0004] new pod detected                              pod=accounts spec=consult
ERRO[0026] (connection.go:99): msg: unable to start tunnel
location: /go/src/github.com/vapor-ware/ksync/pkg/ksync/cluster/connection.go:86
struct: *cluster.Connection
----------
nodename: hidden.internal
----------
next: msg: error forwarding ports
location: /go/src/github.com/vapor-ware/ksync/pkg/ksync/cluster/tunnel.go:110
struct: *cluster.Tunnel
----------
localport: 59378
remoteport: 40321
podname: ksync-9xdhv
namespace: kube-system
out: {}
----------
next: error upgrading connection: error sending request: Post https://XXXX/api/v1/namespaces/kube-system/pods/ksync-9xdhv/portforward: unexpected EOF  ContainerName= LocalPath=/Users/alok87/code/accounts LocalReadOnly=true Name=consult Namespace=may16 Pod= Reload=false RemotePath=/test RemoteReadOnly=false Selector="[product=accounts]"
ERRO[0304] lost connection to cluster                    ContainerName= LocalPath=/Users/alok87/code/accounts LocalReadOnly=true Name=consult Namespace=may16 Pod= Reload=false RemotePath=/test RemoteReadOnly=false Selector="[product=accounts]"

The daemon manager essentially does the following:

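for {
	fmt.Println("[daemon-manager] Running \"ksync watch\" in auto restart mode")
	// The next iteration (the restart) is reached only after ksync exits;
	// a ksync that is stuck but still running never gets here.
	err := utils.RunCommand("ksync watch")
	if err != nil {
		fmt.Printf("[practl] ksync crashed!! err: %v\n", err)
		fmt.Println("[practl] ksync would be restarted")
	}
}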

https://github.com/vapor-ware/ksync/blob/master/pkg/ksync/cluster/connection.go#L99

There are two things we can do:

  1. Treat this error as fatal and exit, since ksync does not recover from it and stays stuck. The daemon manager would then restart it.
  2. Return the error, recover from it, and resume syncing instead of staying stuck.
