Refactoring and consolidating services dependencies calculation logic#60800
Refactoring and consolidating services dependencies calculation logic#60800
Conversation
240d9d2 to
33db76b
Compare
4b8badc to
10ef0a4
Compare
|
jenkins test dashboard cephadm |
1 similar comment
|
jenkins test dashboard cephadm |
adk3798
left a comment
There was a problem hiding this comment.
This seems good looking it over, although this is one of those PRs that's particularly difficult to review since it's a lot of moving code around
src/pybind/mgr/cephadm/serve.py
Outdated
| self.log.info(f'Reconfiguring {dd.name()} (dependencies changed)...') | ||
| sym_diff = set(deps).symmetric_difference(last_deps) | ||
| self.log.debug(f'Reconfiguring {dd.name()} deps changed {last_deps} -> {deps} diff is {sym_diff}') | ||
| self.log.info(f'Reconfiguring {dd.name()} because dependencies have changed: diff {sym_diff}') |
There was a problem hiding this comment.
Do we want to be logging the diffs at info level? I think the reason there is a separate log line to begin with is so we don't flood the logs with some large blob of deps at info level.
There was a problem hiding this comment.
@adk3798 yeah, probably it's too much info for info level. I'll only keep that for debug.
| ServiceMap = { | ||
| CephadmAgent.TYPE: CephadmAgent, | ||
| NodeProxy.TYPE: NodeProxy, | ||
| IscsiService.TYPE: IscsiService, | ||
| PrometheusService.TYPE: PrometheusService, | ||
| GrafanaService.TYPE: GrafanaService, | ||
| AlertmanagerService.TYPE: AlertmanagerService, | ||
| PromtailService.TYPE: PromtailService, | ||
| CephExporterService.TYPE: CephExporterService, | ||
| NodeExporterService.TYPE: NodeExporterService, | ||
| JaegerAgentService.TYPE: JaegerAgentService, | ||
| MgmtGatewayService.TYPE: MgmtGatewayService |
There was a problem hiding this comment.
I feel like there ought to be some way to automate the creation of this mapping. Maybe you could take a look at the self.cephdam_services attribute in cephadm/module.py, since it's a similar thing but instead of mapping types to services it's just the list of services. I don't see any reason this list couldn't include all the services without deps either. They'll just route back to the parent function class and get an empty list.
There was a problem hiding this comment.
Thanks for pointing that out, I wasn't aware of the self.cephdam_services. Since it serves exactly the same purpose I'll just go and us it directly. As you said services with no dependencies will return an empty list (from the base class).
There was a problem hiding this comment.
FWIW there's a "registration via decorator" pattern I have used many times in the past. It looks vageuly like:
MY_REG = {}
def reg_decorator(cls):
MY_REG[cls.KEY] = cls
return cls
@reg_decorator
class MyCoolClass:
pass
I do believe we used it in cephadm for the daemon classes, IIRC.
There was a problem hiding this comment.
@phlogistonjohn @adk3798 I like the idea of automatically register/generates the services map (cephadm_services) instead of having to specify the whole list. However, in order to avoid putting more changes on this PR (which already has a lot), and to avoid mix this change with the dependencies movement I think it's better to have a dedicated PR, so I went and created the PR #61114 on top of this one with a new commit. I'll be rebasin it later once this one is merged.
src/pybind/mgr/cephadm/utils.py
Outdated
| def get_daemon_names(mgr: "CephadmOrchestrator", daemons: List[str]) -> List[str]: | ||
| daemon_names = [] | ||
| for daemon_type in daemons: | ||
| for dd in mgr.cache.get_daemons_by_type(daemon_type): | ||
| daemon_names.append(dd.name()) | ||
| return daemon_names |
There was a problem hiding this comment.
now that this isn't some inline function in the calc_daemon_deps function, we might want to clean it up. Specifically I think it's a bit unintuitive that the parameter is daemons when it's expected to be a list of daemon types. We could also consider just having a new function in HostCache get_daemons_by_types that handles getting daemons of multiple types at once, since all it's doing is repeatedly calling to the HostCache with a different type.
There was a problem hiding this comment.
@adk3798 I think it's a good idea. I created the new method and updated the code to use it. All the changes related to the review are packed in the last commit. I'll be squashing the changes later once all the comments are solved.
3c375de to
4db99d2
Compare
|
@rkachach from my last teuthology run, I think it's possible there are some issues with this for alertmanager and ingress. From the logs of a failed test (I shortened the lines and removed a lot of the ones that weren't helpful from https://pulpito.ceph.com/adking-2024-12-10_06:38:28-orch:cephadm-wip-adk-testing-2024-12-09-2041-distro-default-smithi/8028102/, I see some daemons getting reconfigured quite a bit. Do you think this is a result of some bug in this PR? |
4db99d2 to
9e9b492
Compare
| @staticmethod | ||
| def get_dependencies(mgr: "CephadmOrchestrator") -> List[str]: |
There was a problem hiding this comment.
This could be a classmethod since it uses class level parameters
There was a problem hiding this comment.
@phlogistonjohn TBH I'm not sure it's the case as we don't use any class argument (such as cls). This method only relies on the mgr argument in general so I think in this case is in fact a staticmethod.
There was a problem hiding this comment.
I generally (almost always) review code patch-by-patch. In my view, every patch should tell it's own "story". So when I first looked at the patch I immediately saw code in this function like:
port = cast(int, mgr.get_module_option_ex('prometheus', 'server_port', PrometheusService.DEFAULT_MGR_PROMETHEUS_PORT))
Where PrometheusService is the class that this method is attached to. When I see a staticmethod that refers to it's own class by name I always recommend using a classmethod. The main reason is that a classmethod works better with inheritance. For example:
class Foo:
SECRET = 'abracadabra'
@staticmethod
def whats_my_line():
return Foo.SECRET
class Bar:
SECRET = 'open sesame'
In this example, the behavior of Bar.whats_my_line is a bit surprising to the unaware: it'll return 'abracadabra' not 'open sesame' because the function call always returns Foo's SECRET attribute. If the funciton was instead written as:
@classmethod
def whats_my_line(cls):
return cls.SECRET
then things generally work as many people would expect.
Note I do not object to staticmethod when there are zero references to the class at all, like in many other examples in this patch.
That all said, I do now see that the final version of this function in the PR removes these class references. But I was looking at it patch-by-patch. I prefer each patch to be "complete" - but I won't insist on it in a case like this.
Does all this make sense?
There was a problem hiding this comment.
Yes it does! thank your for your patience and for the detailed explanation :)
In fact, I didn’t look/notice the specific example you pointed out regarding the reference to the class itself for Prometheus in the code.:
port = cast(int, mgr.get_module_option_ex('prometheus', 'server_port', PrometheusService.DEFAULT_MGR_PROMETHEUS_PORT))
I think this usage alone is strong enough to justify using a classmethod instead of a staticmethod in this case. I’ve updated the code to use a classmethod in all the methods.
| ServiceMap = { | ||
| CephadmAgent.TYPE: CephadmAgent, | ||
| NodeProxy.TYPE: NodeProxy, | ||
| IscsiService.TYPE: IscsiService, | ||
| PrometheusService.TYPE: PrometheusService, | ||
| GrafanaService.TYPE: GrafanaService, | ||
| AlertmanagerService.TYPE: AlertmanagerService, | ||
| PromtailService.TYPE: PromtailService, | ||
| CephExporterService.TYPE: CephExporterService, | ||
| NodeExporterService.TYPE: NodeExporterService, | ||
| JaegerAgentService.TYPE: JaegerAgentService, | ||
| MgmtGatewayService.TYPE: MgmtGatewayService |
There was a problem hiding this comment.
FWIW there's a "registration via decorator" pattern I have used many times in the past. It looks vageuly like:
MY_REG = {}
def reg_decorator(cls):
MY_REG[cls.KEY] = cls
return cls
@reg_decorator
class MyCoolClass:
pass
I do believe we used it in cephadm for the daemon classes, IIRC.
src/pybind/mgr/cephadm/module.py
Outdated
| svc_cls = ServiceMap.get(daemon_type, None) | ||
| deps = svc_cls.get_dependencies(self, spec, daemon_type) if svc_cls else [] |
| @staticmethod | ||
| def get_dependencies(mgr: "CephadmOrchestrator", |
There was a problem hiding this comment.
This could be a classmethod and it would behave "correctly" wrt inheritance.
There was a problem hiding this comment.
same comment as above :)
|
jenkins retest this please |
f1294c2 to
cd56c79
Compare
5a21ec8 to
52a7f75
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
currently, the dependency logic is duplicated between the different Service classes and the module.py::_calc_daemon_deps function, which can lead to issues such as BUGSs, difficulty in maintenance, and other problems associated with duplicated code. In this change, we are consolidating all the dependency logic into the Service subclasses to eliminate this duplication. This way, we also force anybody creating a new Service to think about its potential dependencies. Fixes: https://tracker.ceph.com/issues/69021 Signed-off-by: Redouane Kachach <[email protected]>
52a7f75 to
1280f01
Compare
|
jenkins test api |
|
jenkins test api |
currently, the dependency logic is duplicated between the different Service classes and the
module.py::_calc_daemon_depsfunction, which can lead to issues such as BUGSs, difficulty in maintenance, and other problems associated with duplicated code. In this change, we are consolidating all the dependency logic into the Service subclasses to eliminate this duplication. This way, we also force anybody creating a new Service to think about its potential dependencies.https://tracker.ceph.com/issues/69021
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e