[New Scheduler]Implement PFCInvokerServer#5098
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5098 +/- ##
==========================================
- Coverage 81.25% 74.61% -6.64%
==========================================
Files 218 220 +2
Lines 10700 10723 +23
Branches 450 456 +6
==========================================
- Hits 8694 8001 -693
- Misses 2006 2722 +716
Continue to review full report at Codecov.
|
faa7bab to
43caa23
Compare
| invoker: InvokerCore)(implicit ec: ExecutionContext, actorSystem: ActorSystem, logger: Logging): BasicRasService | ||
| } | ||
|
|
||
| object DefaultInvokerServer extends InvokerServerProvider { |
There was a problem hiding this comment.
I moved DefaultInvokerServer to a separate file, keep the same as FPCInvokerServer.
This is mainly to facilitate management
| "health", | ||
| PingMessage( | ||
| InvokerInstanceId( | ||
| -1 - instance.instance, |
There was a problem hiding this comment.
Use the same topic name health to send unhealthy message via negative number of invoker.
In controller side(e.g. InvokerSupervision), get the message from kafka and make the invoker to unhealthy and down finally.
| invoker.enable() | ||
| } ~ (path("disable") & post) { | ||
| invoker.disable() | ||
| } ~ (path("memory") & get) { |
There was a problem hiding this comment.
Regarding /memory, this is for FPCInvokerServer only, this is used for judge whether the invoker is idle, if the result(busyMemory + inProgressMemory) > 0, it shows the invoker is executing activations, so it is busy, if the result == 0, it shows the invoker is idle.
For DefaultInvokerServer, can't use /memory to judge whether the invoker is idle, this is another topic for this. btw, we judge it like this, get every controller slot of the invoker, and do sum. if the result > 0, the invoker is busy, otherwise, it is idle.
43caa23 to
241d26a
Compare
d58eed9 to
c2ea992
Compare
|
Have any comment ? |
| val healthActionNamePrefix = "invokerHealthTestAction" | ||
| val bufferSize = 10 | ||
| val bufferErrorTolerance = 3 | ||
| var useMemory = 0l |
There was a problem hiding this comment.
Why do we need this mutable variable?
There was a problem hiding this comment.
There has a get used memory API: http://$invoker:$port/memory,
complete(InvokerHealthManager.useMemory.toString)
The FunctionPullingContainerPool actor(which will be contributed) sends the pool's used memory info(busyMemory + inprogressMemory) to InvokerHealthyManager actor, so use a mutable variable to store the value.
There was a problem hiding this comment.
I know that's there only one instance of the invoker health manager per instance / jvm, but it still seems risky to be mutating a var in an object like this especially when it's the source of truth on whether the invoker is idle or not.
There was a problem hiding this comment.
@style95 @bdoyle0182 , yes, not a good idea of use mutable variable
It is better to do like below
functionPullingContainerPool.ask(UsedMemory)(Timeout(5.seconds)).mapTo[Int]
But functionPullingContainerPool is implementing now, so i removed /memory api in this pr: 86f5531
and i will add /memory api in [New Scheduler] FunctionPullingContainerPool pr.
|
|
||
| object DefaultInvokerServer extends InvokerServerProvider { | ||
|
|
||
| // TODO: TBD, after FPCInvokerReactive is ready, can read the credentials from pureconfig |
There was a problem hiding this comment.
Not a problem for this mr, but I don't think that's a secure way of setting admin credentials
There was a problem hiding this comment.
Yes, after FPCInvokerReactive is ready, can read the credentials from pureconfig like other components.
| val healthActionNamePrefix = "invokerHealthTestAction" | ||
| val bufferSize = 10 | ||
| val bufferErrorTolerance = 3 | ||
| var useMemory = 0l |
There was a problem hiding this comment.
Ok, good catch, as i said above, i removed /memory api in this pr. i will add /memory api in another pr: [New Scheduler] FunctionPullingContainerPool with functionPullingContainerPool.ask(UsedMemory)(Timeout(5.seconds)).mapTo[Int] solution.
|
LGTM other than the discussion on the mutable variable |
|
Have any more comment? |
| } | ||
| }) | ||
|
|
||
| override def enable(): Route = ??? |
There was a problem hiding this comment.
I haven't tried it but would it be ok to leave these methods empty to run it?
There was a problem hiding this comment.
Sorry, you mean implement it like below?
override def enable(): Route = {
complete("todo or pong")
}
Due to InvokerReactive extends InvokerCore which has enabled/disabled,
so need to implement enable/disabled
There was a problem hiding this comment.
when deploy using InvokerReactive and send enable/disable requests to invoker, what will happen?
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5098 +/- ##
===========================================
+ Coverage 32.52% 75.16% +42.64%
===========================================
Files 211 220 +9
Lines 10435 10723 +288
Branches 447 460 +13
===========================================
+ Hits 3394 8060 +4666
+ Misses 7041 2663 -4378 ☔ View full report in Codecov by Sentry. |

Description
Manage New scheduler's invokerServer. provide basic disable/enable for the invoker.
These http endpoints are for deployment or get the invoker's current situation, nomally, when do zerodowntime deployment,
need to send /disable request to invoker, make the invoker's status to
down, then, the new activations will not be sent to the invoker, we can wait the invoker executed all flying activations, then, we can stop the invoker safely and deploy the invoker ./disable
e.g. curl -X POST -u xxx:xxx http://$invoker:12001/disable, disabled the invoker, the invoker status will be changed to
unhealthyfirstly and changed todownfinally after several seconds./enable
e.g. curl -X POST -u xxx:xxx http://$invoker:12001/enable, enable the invoker, the invoker status will be changed to
upagain./memory (this is for FPCInvokerServer only)
e.g. curl -X GET -u xxx:xxx http://$invoker:12001/memory, if the result >0, the invoker is busy, otherwise, it is idle.
Due to simple, no document.
Related issue and scope
My changes affect the following components
Types of changes
Checklist: