Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JUJU-299] Store unit CharmURL as a string reference #13546

Merged
merged 2 commits into from
Dec 10, 2021

Conversation

manadart
Copy link
Member

@manadart manadart commented Dec 3, 2021

We have long stored in the unit Mongo document, an actual *charm.URL reference.

This of course is fraught with danger, as upstream changes to the library can change what is stored, and potentially render old documents incompatible with an updated type.

In addition, we instituted GetBSON and SetBSON helpers to handle persistence and retrieval, which were made non-trivial by the introduction of CharmHub and multiple possible formats for this URL.

We have ended up in a situation where we are paying a small overhead to parse this URL upon every access, even when we might not use the field.

Here we update the document to store a string, and only parse the URL at need. The patch is scoped only to what is necessary to change the one field in the unit document. We still do unnecessary conversion, and could benefit from changing the application document as well, including all the sundry methods that accept *charm.URL, but immediately stringify it for use.

Upgrades had one usage of the doc for a version 2.8 step. This is replaced with the old version of the doc still storing a *charm.URL.

QA steps

  • Bootstrap a 2.7 controller and juju install ubuntu -n 3.
  • With this patch, juju upgrade-controller --build-agent.
  • Run juju upgrade-model twice. It will jump to the latest 2.8, then this patch.
  • Ensure that the logs show no errors relating to unit documents.

Documentation changes

None.

Bug reference

N/A

@manadart manadart force-pushed the 2.9-unit-charm-url branch 6 times, most recently from 58c44f6 to 1ff6c36 Compare December 3, 2021 17:24
@hpidcock hpidcock added the 2.9 label Dec 5, 2021
@manadart manadart changed the title Store unit CharmURL as a string reference instead of charm.URL reference [JUJU-299] Store unit CharmURL as a string reference Dec 6, 2021
@manadart manadart force-pushed the 2.9-unit-charm-url branch 2 times, most recently from f033b25 to d5b2ff0 Compare December 6, 2021 14:50
of *charm.URL.

There is an overhead for parsing the stored URL, which we are paying
whether we access it or not.
@manadart
Copy link
Member Author

manadart commented Dec 6, 2021

!!build!!

was at the time (version 2.8) that stored *charm.URL instead of *string.
if cURL != nil {
ok = true
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a default case here that populates results.Results[i].Error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple possible errors here that are handled by the catch-all on line 981. I had to look twice at this myself.

chURL, found := unit.CharmURL()
if !found {
chURL, err := unit.CharmURL()
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check that this is not a NotFound error here? Doesn't make much of a difference but perhaps a caller might expect a "no charm url" error instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only error will come from parsing, which is only attempted if not nil. "no charm url" is returned otherwise.

@@ -2612,11 +2612,17 @@ func (a *Application) removeUnitOps(u *Unit, asserts bson.D, op *ForcedOperation
if u.doc.CharmURL != nil {
// If the unit has a different URL to the application, allow any final
// cleanup to happen; otherwise we just do it when the app itself is removed.
maybeDoFinal := u.doc.CharmURL != a.doc.CharmURL
maybeDoFinal := *u.doc.CharmURL != a.doc.CharmURL.String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
maybeDoFinal := *u.doc.CharmURL != a.doc.CharmURL.String()
maybeDoFinal := u.doc.CharmURL != nil && *u.doc.CharmURL != a.doc.CharmURL.String()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inside a block that already checks that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh! sorry, didn't spot that.

Copy link
Contributor

@achilleasa achilleasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@manadart
Copy link
Member Author

$$merge$$

@jujubot jujubot merged commit c5e5548 into juju:2.9 Dec 10, 2021
@manadart manadart deleted the 2.9-unit-charm-url branch December 10, 2021 14:14
jujubot added a commit that referenced this pull request Dec 15, 2021
#13565

Merge from 2.9 bringing forward:
- #13546 from manadart/2.9-unit-charm-url
- #13564 from jujubot/increment-to-2.9.23
- #13556 from hmlanigan/deploy-bundle-2-8-controller

Minor conflicts including those typical to version bumps:
- apiserver/facades/agent/uniter/mocks/newlxdprofile.go
- cmd/juju/application/deployer/bundlehandler_test.go
- scripts/win-installer/setup.iss
- snap/snapcraft.yaml
- version/version.go
SimonRichardson added a commit to SimonRichardson/juju that referenced this pull request Jan 19, 2022
@wallyworld wallyworld mentioned this pull request Jan 31, 2022
juanmanuel-tirado pushed a commit that referenced this pull request Feb 1, 2022
* MVP verify app health after controller/model upgrade

Sometimes after a new release users can have issues with deployed
applications during an upgrade. Add some CI to verify that
existing applications remain healthy after a controller and model
upgrade

* Remove NewProviderAddressInSpace constructor

Instead use the cleaner NewMachineAddress().AsProviderAddress

* Remove NewProviderAddress constructor

Instead use the cleaner NewMachineAddress().AsProviderAddress

* Ignore exit code for kubectl auth check;

* Extract "assume" section feature descriptions to standalone file

This should make it easier to locate and manage the list of features in
the future.

* Update show-model feature test to include supported features
expectations

* Reorder commands run in run_model_metrics.

Ensure that ntp/0 is related to focal/0 and ntp/1 is related to
juju-qa-test/0 for consistent test results.

* Reduce API server metric cardinality

The number of API server metric quantile permutations are growing
because the rpc.ErrorCode can lead to too many deviations. The fix for
now is to reduce this and only compute success or errors. This way we
can be sure that the cardinality can be finite.

This should reduce the total number of memory used for every API server
instance.

* Do not force "stable" channel when re-deploying existing local charms

When deploying local charms, they do not get associated with a
particular channel as this information is not available at deploy time.

When deploying a bundle containing local charms multiple times, we
should never force the channel for local charms as this prevents Juju
from locating the already-existing local charm causing the error
described in LP1954933.

* Handles permission errors for listing instance profiles.

We have introduced some changes into 2.9 where people with very scoped acess keys can't use the new destroy features that require listing instance profiles.

* Remove offers when a model is destroyed

* Update pubsub library

The pubsub library updates to clean up potential runtime errors. Every
so often we can end up with a nil message. Which either means we
dropped a message or we're attempting to process a message that isn't
there. This can lead to issues in people attempting to use the library.

Removing the complexity of knowing when to request the next message. It
removes the ambiguity of knowing if you have a next message or not. The
trick of knowing when next is blocking or not is also gone, as we can
use the existing data channel. If that's already full (it's a buffered
channel of 1), then just offer a default case in the select to allow the
bypassing of the blocking operation.

Message pointers in the implementation is gone, you either have a
message or you don't. We no longer need to check for empty. And in
doing so, the implementation iterates all the pending handlers at once,
without checking for data or next each iteration.

The interface is exactly the same, just the underlying queue
implementation.

* Add explicit owner to params when creating an offer

* Ensures that we copy the value of APIInfo into a new reference in order
to avoid testing races in the raftlease client tests.

* Remove stub sentence from add-machine helper

* Remove NewProviderAddresses constructors

Instead use the cleaner NewMachineAddresses().AsProviderAddresses

* Update to latest version of Pebble

This includes:

* Show process's last output in error msg when service exits quickly:
  canonical/pebble#69
* Fix subtle Service merging issues:
  canonical/pebble#98
* Fix bug with replan/start/restart not updating internal config:
  canonical/pebble#96

* Ensure 'hostname -f' returns juju-assigned hostname on equinix metal nodes

This fixes https://bugs.launchpad.net/juju/+bug/1956538

* Make apiserver log file size configurable

* Remove the format2 test charm
It is unused and has a broken symlink

* The interactive version command doesn't need a controller

* Remove txn watcher wrench

The following removes the txn watcher wrench. It calls too frequently in
production applications. This removes it completely, but I'm not adverse
to either creating a controller/model config for enabling wrenches
(calling wrench.SetEnable method), similar to the setup for logging
config. Alternatively, just throttling the existence of the wench every
minute or so.

Either way, we just need to back off calling this.

* Make log size configurable for other use cases

* apiserver/apiserver.go (eg: /var/log/juju/logsink.log)
* cmd/jujud/agent/machine.go (eg: /var/log/juju/machine-0.log)
* worker/deployer/unit_agent.go (eg: /var/log/juju/unit-mysql-0.log)

* Update relevant mocks

* Fix potential race with signal handling and make code go vet clean

This is using: go vet -composites=false ./...

* Need to serialize the new logging settings

* juju info/find/download run without a controller

* Remove application worker from caasapplicationprovisioner runner when it is stopped;

* Unit machine test fixes for firewalled env (s390x)

The following patches the charmrevision Facade to prevent outbound API
requests to external services. This isn't ideal, as I don't really want
to patch a function call in a unit tests, but this is an integration
masking as a unit test.

* Add more properties to ProviderAddress

And also add accompanying functional options

This is so these params can be included inside addresses, meaning
InterfaceInfos doesn't need to duplicate each nic for each address

* Actually add new logging settings to controller config

* Show the correct config details in "juju bootstrap -h" output

* Consistently log "created rotating log file" messages

* Add test for app worker shutdown for a dead app and refactor CAASApplicationSuite;

* Remove app worker from runner when the worker peacefully exits;

* Add support for mgo scram-sha256 authl default to mongo 4.4 on bootstrap

* Fixes racy tests for the CAAS firewaller worker.

* Softens Raft op queue test to ensure immediate dispatch instead of
batches of 1. We have observed that on slow systems, such as our arm64
test node, that there can be dequeued batches with more than one
operation.

* Prevent leaking of mongo connections

The following prevents the leaking of a mongo connection when iterating
over the collections. The defer would only be registered for the first
one because of the for loop semantics.

The fix is simple, wrap it in a closure.

* Fix some intermittent unit test failures

* Use mongo 4.4 for macos client tests

* Run mock gen;

* Change to makefile to be consistent with develop branch for mongo dependencies

* Increment juju to 2.9.24

* Updates state package for better handling of deferred collection
closure.

* Updates state package to ensure that we close query iterators upon every
usage.

* Add missing wiring to instancecfg and worker/deployer

* Update import paths to juju/lumberjack repo (instead of go.mod replace)

Oddly, go.sum now references gopkg.in/natefinch/lumberjack.v2, but
that's because the juju/lumberjack tests reference that import path.
Later we should clean up the juju/lumberjack repo, add go.mod there
and update the import paths on the tests in that repo.

* Fix unmanaged "controller" namespace clash with "controller" model.

* go fmt - order imports

* Add a couple of wiring tests

* Fix intermittent peergrouper worker test failures

* Reduce default agent-logfile-max-size from 300MB to 100MB

Per Ian's code review comment

* Use agent-logfile settings for unit agent (instead of model-logfile)

* Remove model-logfile settings from agent wiring

* Process offer permissions when migrating models

* Remove offer permissions when the offer is removed

* Fix TestApplicationsWithExposedOffers unit test

* Prevent panic during reporting in relation state tracker.

The following prevents the reporting in relation state tracker causing a
panic if the relation doesn't exist. Additionally, we may want to check
if the relationer is dying before inspecting the relation unit.

* Use juju/retry to handle retries - service

* Use juju/retry to handle retries - windows

* Fixes variable clash and minor comment issues in the instance-poller
worker.

* Use juju/retry to handle retries - common

* Use juju/retry to handle retries - commands

* Ensures that the instance-poller API does not attempt to update
link-layer device addresses using network config that has no device
information.

This occurs when providers do not implement the NetworkInterfaces
method, and the instance-poller worker supplies "fake" config built from
instance addresses.

* Fix various tests for agent-logfile changes

* Use juju/retry to handle retries - state

Turns out in this case the only function in question wasn't in use, so
I've droppoed it

* Engine Report: Model Cache

The following improves the reporting for the model cache. Currently,
it's not possible to know what the current state of the model cache is
for a given controller.

In an ideal world we would have an API to dump the model cache, except
that falls down for HA environments. The correct way would be to query
every controller in the fleet to report back their current information.
That's a bit of work and something that can be done in the future, but
for now, we can piggy-back off the juju_engine_report to dump out more
information.

We don't dump all the information, just enough to help diagnose
potential problems with in the eventual consistent cache.

* Improve yaml parsing for yaml files without 'clouds:' keyword ...

... so that the 'list-clouds' output can be used directly as an input
to 'update-clouds'.

* Try to coerce expected format with schema

* Style fix camelCase

* Retry verify 3 times until success

There is a rare race condition where this job will poll for juju status
before an upgrade has completed, leading to an error.

Resolve this by retrying a small number of times

Also return correct error code on deploy apache timeout

* Removes bridge-utils from the list of installed packages on Ubuntu.
Juju doesn't use it.

* Removes the antiquated LXD setup script. No one needs this to get the
localhost/lxd provider working any more.

* Removes unused testing charms client-forwardproxy and
squid-forwardproxy.

* Removes some old TODO comments that are no longer relevant.

* Fixes nested deployer report test to ensure that all workers are
reported as starting, not just the third, which does not guarantee that
the others are up.

This can fail on slow machines and has been observed on arm64 test jobs.

* Improve startUnitRequest to start a new agent if the current one isn't running.

Used by juju_start_unit introspection.  The current version only restarts the workers
of the agent.  However if the agent is lost and not running, there is no way to get it
back without manual intervention in the machine's agent.conf. Not a user friendly method.

* Add core/network helper to normalize MAC addresses

* Revert "Merge pull request #13546 from manadart/2.9-unit-charm-url"

This reverts commit c5e5548, reversing
changes made to 878fc7c.

* Ensure MAC addresses are normalized when converting from/to API params

This change ensures that the network-related facades always process MAC
addresses consistently.

* Implement NetworkInterfaces call for azure provider

* Record the error if an external cmr controller goes bad

* Ensure that public IPs are only reported as shadow addresses when
listing NICs

This makes the provider's behavior consistent with the rest of the
providers.

* Bump AllModelWatcher and AllWatcher facade version, remove .Parameters and .Results from the action delta in the new facades;

* Add tests for the action delta change;

* Re-generate schema;

* add message query to juju wait-for

* fix linting

* Move mockgen to package_test.go;

* Use default storage class for AKS CI tests;

* ran autolinter

* Ensures that unit public address is acquired one time on an as-needed
basis rather than pre-populating for every hook context.

This should result in fewer API calls, and lower chance of the address
not yet being populated.

* Stop polling external cmr controller when it is removed

* Implement NetworkInterfaces call for openstack provider

* Remove fake NIC generation fallback logic from instancepoller worker

Now that all Juju providers (except manual) implement the
NetworkInterfaces method we can finally remove the fallback logic within
the instancepoller worker that would otherwise generate a list of fake
NICs to emulate the output of NetworkInterfaces.

The prior logic has been replaced with a check that outputs a warning to
the logs if the instancepoller ever encounters a provider that does not
implement this method. This is meant as a hint to folks working on
adding support for new providers in the future that they must implement
the method.

Note that the instancepoller ignores manual machines so this particular
call does not need to be implemented for that provider.

* Use juju/retry to handle retries - worker/apicaller

* Add per controller and per app limits for downloading resources

* Code review fixes

* Emit instancepoller error if env does not implement NetworkInterfaces

This commit replaces the warning message introduced via the previous
commit with an error if the instancepoller detects a (non-manual)
substrate that does not implement the NetworkInterfaces method.

* Revert "Ensures that the instance-poller API does not attempt to update"

This reverts commit fd5be60 which has
been made redundant due to the changes in the previous 2 commits.

* Bump charmrepo/v6 dependency

* Implement GetEssentialMetadata for charmstore repository

* Fake GetEssentialMetadata for charmhub repository

The proposed method implementation groups together requests based on the
macaroons provided (requests without macaroons are lumped together) and
attempts to fetch the data in batches (note that macaroons are not
currently supported by charmhub but the repository is wired to use them
once support becomes available).

At the moment, since the charmhub API (via the refresh method) only
allows us to access the metadata.yaml contents, the implementation falls
back to downloading the charms and populating the essential metadata
responses.

When the charmhub API rolls out support for fetching the metadata we
need (metadata.yaml, manifest.yaml, config.yaml, lxd-profile.yaml), we
will amend the implementation accordingly.

For the time being nothing is calling this particular repository method
so it is safe to land so we can try out the asynchronous charm download
flow (behind a feature flag).

* Rebuild repository mocks

* Bump juju/retry and juju/errors dependencies

* Define async-charm-downloads feature flag

This flag will enable us to incrementally test the changes around
asynchronous charm downloads without breaking the existing deploy logic.

* Refactor charm lookup by URL method in state

The original implementation of the lookup code returns ErrNotFound for
placeholder and/or charms pending to be uploaded. However, the
asynchronous charm downloader code will not use the two-phase commit
approach of the current code (i.e. prepare the charm doc entry, download
charm, update the entry with both the metadata *and* the stored blob
path/SHA256) but rather inserts a charm document with the metadata (but
no blob details) and the pending-download flag set.

To this end, the lookup code has been modified to inspect the controller
flag list and allow pending charms to be retrieved if the async charm
download flag is set. This will allow the rest of the existing deploy
steps to function properly.

* Add helper to create new charm docs and populate the charm metadata
fields

To avoid breaking changes, instead of modifying AddCharm, a new method
has been created and AddCharm has been annotated with a TODO comment
that the AddCharm method must be replaced once the remaining of the
server-side bundle expansion work lands.

* Make the revno threshold for the collection watcher configurable

Prior to this change, the collect() function which the collection
watcher invokes would flag the ID of documents that have a revno <= 0 as
not present. As a result the mergeIDs function which is subsequently
called to figure out which changes should be emitted by the watcher
would ignore those documents.

When a new document gets inserted into a collection it gets assigned a
revno of 0. As a result, the collection watcher will not report it even
if the configured filter function returns true for the document. By
making the revno threshold configurable, we can work around this
limitation.

The default zero-value of the threshold allows us to retain the original
watcher behavior and only override as needed to support special
use-cases.

* Implement watcher for applications that reference charms not yet
downloaded

* Add escape hatch to AddCharmXXX to trigger an async charm download

The escape hatch is behind the async charm downloads feature flag. When
enabled, the AddCharm and friends will fetch the relevant charm metadata
and persist it as a new charm document with the pending flag set.

Currently, the repository implementation do not provide the means to do
a batch give-me-the-charm-meta-only request so the call simply emulates
this behavior by downloading the charm (as the current AddCharm
implementation does) and reseting the calculated SHA256 value (note: the
code uses the repository DownloadCharm method directly (instead of the
DownloadAndStore abstraction) so no blobs are stored.

The server-side-bundle expansion work will (eventually) provide its own
dedicated facade (the batch metadata download functionality will be
implemented then) thus rendering the methods in the charms facade
obsolete. However, the use of the controller flag allows us to test the
async charm download functionality *now* without breaking the existing
deploy flow.

* Support marshaling/unmarshaling of NotYetAvailable apiserver errors

The API server will map NotYetAvailable errors to a 409 (conflict)
status code. As per the spec (see
https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.10):

"The request could not be completed due to a conflict with the current
state of the resource. This code is only allowed in situations where it
is expected that the user might be able to resolve the conflict and
resubmit the request".

* Reply with HTTP 409 (conflict) when trying to get blob data for pending charms

The blob streaming code in apiserver/charms.go now checks whether a
charm is pending to be downloaded and replies with a 409 status code
instead of a 404 to let the client know that the blob data is not yet
available but will be in the future and they need to retry the request.

* Update blob downloader client to properly handle 409 status codes

409 status codes are mapped to NotYetAvailable errors.

* Create and register charmdownloader facade

This is a controller facade which allows clients to:
- watch applications with pending charms.
- trigger (asynchronously) a download for any pending charms.

* Add client for the charmdownloader facade

* Create charmdownloader worker

This worker detects applications that have a pending charm to be
downloaded and triggers an asynchronous download.

Note that the watcher will only report applications with pending charms
iff the async-charm-downloads flag is set. Consequently, we can safely
enable the worker without any issues.

* Add the charmdownloader worker to the model manifold

* Wrap the bundle reader used by the manifest deployer in a retry loop

In the async charm download scenario, having a retry loop allows us to
retry the download if the charm takes longer to download than the time
it takes for a machine to provision.

* Only request supported field names when fetching charmstore metadata

The charmstore client implementation for the Metadata call uses
reflection to extract the list of names from the provided argument and
adds them to the query string that gets sent to the charmstore API.

So, instead of passing a core/charm.EssentialCharmMetadata value which
includes additional fields that trigger BadRequest errors from the
store API we need to define an auxilliary structure with the required
field names and copy the response into the EssentialCharmMetadata value.

* Update async charm d/l code in charms facade to only fetch essential
metadata

The initial (placeholder) implementation of the queueAsyncCharmDownload
method in the charms facade downloaded the charm and extracted the
required metadata from it.

With this commit, the implementation now uses the GetEssentialMetadata
method from the charm repository to grab the minimum set of metadata
needed (metadata.yaml, manifest.yaml, config.yaml and lxd-profile.yaml)
for a deployment without having to download the full charm and pass
those to state.AddCharmMetadata.

The remaining bits of metadata (e.g. actions, metrics etc.) are
populated once the charm gets asynchronously downloaded and before the
charm blob becomes available for agents to download.

* Improve application status messages emitted by the charmdownloader
facade

* Do not fetch lxd-profile.yaml when fetching essential charm metadata

We can simply start LXD containers with the default profile, wait for
the charmdownloader worker to fetch and populate the rest of the charm
metadata (inc. custom lxd profiles) and rely on the instancemutator
worker to apply them to the instances.

* Update lxdprofile watcher to emit changes when charm metadata changes

When the asynchronous charm download flag is enabled, the charms client
facade will only populate the essential bits of the charm document which
are required to commence deployment (metadata, manifest and config).

The full charm blob is then downloaded asynchronously while the machine
for the application units are being provisioned and the full metadata is
populated once the download completes.

Any potential lxd profile defined by the charm will only become known to
Juju once the download is complete. We must therefore rely on the
instancemutator worker to apply the lxd profile to the relevant
containers.

In order for the instancemutator to pick up this change, it needs to
watch for charm metadata changes, match the charm URLs to the ones
referenced by the applications whose units are assigned to a particular
machine and then compare the LXD profiles for changes. This commit
introduces this extra bit of functionality.

* Run go mod tidy

* Rebuild facade schema

* Fix AddCharm test

* Avoid breaking migration because of NYI

Migration fails on some providers (e.g. oci) because of some not yet
implemented/optional features. This unblocks it by returning nil
whenever it sees an nyi. All the other errors are raised as usual.

* Fixes comments for various all-watcher and model-cache logic.

* Charmhub refresh results are unordered

The following updates the charmhub refresh API clients so that all
results are correctly ordered. This was written in the documentation of
the charmhub client API, but it was not observed when using the client.
This can be seen using the charmrevisionupdater expects all results to
be in order. If the results are out of order we end up creating place
holders with invalid revisions.

These are valid:

 - ch:amd64/focal/mongodb-64: {}
 - ch:amd64/focal/ubuntu-19: {}

After running they become:

 - ch:amd64/focal/mongodb-19: {}
 - ch:amd64/focal/mongodb-64: {}
 - ch:amd64/focal/ubuntu-19: {}
 - ch:amd64/focal/ubuntu-64: {}

* Upgrade steps to remove orphaned charm docs

The following removes charm orphaned docs that where left when we run
the upgradecharmrevisioner. This ensures that we don't have any
dependency on them to a given application.

Hopefully this will remove any potential issue in large deployments with
charmhub charms.

* Additional logging in PasswordValid failure cases.

For LP: 1956975, add error logs on why the password validation failed in
the error case.  Cause of bug currently undetermined.

* Ensure azure NICs have their public IP address (if assigned) populated

The recently added implementation for NetworkInterfaces did not populate
the public IPs (shadow IPs) section of the returned NIC list. As a
result, Juju could not figure out a suitable public IP for workloads and
therefore fell back to using the private IP instead.

* Use ProviderCallContext instead of context.Context for azure API calls

* Prevent charm revision calling on startup

On very large deployments we do not want the charm revision worker
starting up and instantly calling the charmhub/charmstore APIs. In a HA
environment this is compounded by the fact that it does it for every
controller node. This can happen if you bounce the agents, so jujus
under load or in trouble are not helping themselves out here.

The solution here is half baked, I've left a TODO to state what the
correct solution is, for now we will just jitter the worker for the
delay so that not every node is smashing the API at the same time.

The real solution is to have a lease for the worker so that it is only
responsible for calling the API. Releasing that lease when it's
terminated. We can not add ifResponsible flag in the manifold, because
if the agent is being (hard) bounced then there is a chance that this
never runs.

I think for now, this is a happy medium.

* Move 2.9.25 upgrade step to 2.9.24

* Update juju/mutex to juju/mutex/v2

* Ensures that upon cache timeout for setting a unit's charm URL, we
indicate whether the unit is in the cache, and/or the *current* value
for it's charm URL.

* Removes the state serving info dependency from the presence manifold.

This is not required, because it is implied by the dependency on the
central hub.

* Removes the ifController wrapper from manifolds already wrapped by
ifDatabaseUpgradeComplete - it is already implied, and therefore
redundant.

* Transfer machine DisplayName through migration

includes temporary code that will be removed before the PR lands:

- go.mod for external changed dependency (description pkg), and
- provider/oci/environ.go for quick fix to allow the PR to run

* Transfer display-name over CloudInstance instead of MachineDoc

* Update go.mod & go.sum

To use the juju/description edge

* Revert back temporary AdoptResources with a TODO

* Optimize code a bit

DisplayName field is not a pointer, nor omitempty, so no need to check
creating extra variables

* Add unit tests for migrate_export & import

Includes a helper function DisplayName() for state.Machine, normally
not available since DisplayName is a field of instanceData (i.e. not
Machine)

* Re-use existing functionality instead of introducing new method

Using existing Machine.InstanceNames() method instead of exporting a
new one

* move names import to correct stanza

* Add Leadership.

To improve the run time for juju ssh <app>/leader, add a direct call to
get the application's leader unit.  Bumped the facade revision.

* Add version 3 of the sshclient facade to introduce Leader().

* Updating juju/os to support macos Monterey

Adds support for compiling Juju on macos Monterey

* Update to use juju/utils/v3

* Make the juju snap strictly confined

* Alters the isControllerFlagManifold to depend on the state config
watcher instead of state.

This means that when a state ping fails, controller-gated workers that
do not depend on state will not bounce.

* Adds comments to machine manifolds for transitive isController
dependencies.

* Fixes machine manifolds tests for removed transitive state dependencies.

* Remove non-passing acceptance tests

In order to get the signal back from our CI tests, we have some tests
that just never passed. As the people whom wrote them are long gone and
the new preferred way is to write using the shell tests we should nuke
from orbit.

These tests will be added to a document to state that they do need
re-writing in the future, but these tests have always been flaky and we
REALLY want to improve our signal to noise ratio.

* Fix comment about disabled os (#13669)

* Add Leader to SSHClient and start using version 3 of facade.

Which implements Leader on the apiserver side. Required to make
juju ssh <app>/leader faster, rather than getting the data from status.

* Allow Leader to be called during upgrades.

It executes a small subset of FullStatus, thus is okay.

* Improve time to run juju ssh application/leader.

Use a specific API call to get the leader unit name, fall back to using FullStatus
if the new sshclient facade version is not available.  FullStatus can be very slow
on large models, juju ssh should not be if possible.

Juju debughooks time to ssh will also improved.

Moved mockgen lines to package_test.go, per convention, easier to find in one place.

Fixed some unhandled error and unused parameter compiler warnings.

* Moved mockgen lines to package_test.go, per convention, easier to find in one place.

* Document use of the @ symbol when setting a config value in help message

* Comment out the snap interfaces till approved

* Comment out the snap plugs till approved

Co-authored-by: Juju bot <[email protected]>
Co-authored-by: Jack Shaw <[email protected]>
Co-authored-by: Kelvin Liu <[email protected]>
Co-authored-by: Achilleas Anagnostopoulos <[email protected]>
Co-authored-by: Heather Lanigan <[email protected]>
Co-authored-by: Thomas Miller <[email protected]>
Co-authored-by: Ian Booth <[email protected]>
Co-authored-by: Joseph Phillips <[email protected]>
Co-authored-by: Ben Hoyt <[email protected]>
Co-authored-by: Jujubot <[email protected]>
Co-authored-by: Harry Pidcock <[email protected]>
Co-authored-by: Caner Derici <[email protected]>
Co-authored-by: Bas de Bruijne <[email protected]>
Co-authored-by: Arnaud Delobelle <[email protected]>
jujubot added a commit that referenced this pull request Feb 1, 2022
#13673

Merge 2.9

#13546 [JUJU-299] Store unit CharmURL as a string reference
#13640 add message query to juju wait-for
#13646 [JUJU-484] Stop polling external cmr controller when it is removed
#13634 [JUJU-463] Implement NetworkInterfaces method for azure provider
#13624 [JUJU-416] Consistantly use juju/retry to handle retries 3 (state/*)
#13648 [JUJU-464] Implement NetworkInterfaces method for openstack provider
#13636 [JUJU-479] Revert "unit charm url"
#13650 [JUJU-485] Add per controller and per app limits for downloading resources
#13649 [JUJU-481] Remove fake NIC generation fallback logic from instancepoller worker
#13681 Comment out the snap interfaces till approved
#13670 [JUJU-521] 1959351 fix slow juju ssh leader
#13671 [JUJU-527] Updating juju/os to support macos Monterey
#13678 [Juju-541] Document use of the @ symbol when setting a config value in help message

Lots of trivial conflicts due ot import path changes and the forward port of the charm download changes which first landed in this branch and were backported.

## QA steps

See PRs



[JUJU-299]: https://warthogs.atlassian.net/browse/JUJU-299?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-484]: https://warthogs.atlassian.net/browse/JUJU-484?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-463]: https://warthogs.atlassian.net/browse/JUJU-463?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-416]: https://warthogs.atlassian.net/browse/JUJU-416?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-464]: https://warthogs.atlassian.net/browse/JUJU-464?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-479]: https://warthogs.atlassian.net/browse/JUJU-479?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-485]: https://warthogs.atlassian.net/browse/JUJU-485?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-481]: https://warthogs.atlassian.net/browse/JUJU-481?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

[JUJU-521]: https://warthogs.atlassian.net/browse/JUJU-521?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-527]: https://warthogs.atlassian.net/browse/JUJU-527?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
manadart added a commit to manadart/juju that referenced this pull request May 16, 2022
…charm-url""

This reverts commit b2a1223, which was
itself a reversion of the patch that originally made CharmURL on the
unit Mongo document a string reference instead of a typed CharmURL
reference.
jujubot added a commit that referenced this pull request May 17, 2022
#14030

This is an effective reapplication of #13546. See that patch for the full description.

That patch was reverted under #13636, which this in-turn reverts.

## QA steps

See #13546.

## Documentation changes

None.

## Bug reference

N/A
@wallyworld wallyworld mentioned this pull request May 19, 2022
jujubot added a commit that referenced this pull request May 19, 2022
#14050

Merge 2.9, but revert the commit which deleted secrets from 2.9

#13996 [JUJU-1019] Bug 1969929 bundle revision only
#14019 [JUJU-1082] Fix inst filtering to avoid arch mismatches
#14020 [JUJU-1069] CI improvements
#14021 [JUJU-1079] Update volumes if statefulset spec changed;
#13684 [JUJU-544] Remove redundant ifCredentialValid wrappers from model manifolds
#14002 [JUJU-1054] Ensure to convert pod status to juju status consistent across operators
#14024 Juju 1061 add machine private key
#14026 [JUJU-1077] Refactor unit tests
#14033 [JUJU-1103] Add --cert option to microk8s refresh-cert
#13546 [JUJU-299] Store unit CharmURL as a string reference
#14025 [JUJU-1089] Deprecated note for --no-download flag in create-backup
#14023 [JUJU-1070] Fix/lp 1971560
#14027 Fix K8s application removal in pre-initialized error state
#14030 [JUJU-1099] Restore "Store unit CharmURL as a string reference"
#14037 [JUJU-1109] Fix encoding for interfaceAddressDisplay, used by the network-get hook tool
#14028 [JUJU-1091] Cloud-init wait for IP
#14029 [JUJU-1070] Use first 6 digits for short model UUID;
#14034 [JUJU-1070] Use first 6 digits for short model UUID;
#14042 Use default arch when provisioning a machine
#14045 Update references to jujucharms.com
#14046 Address consistency in use of id/Id/ID in command line output
#14047 Adjust the default log level for installing/starting a service

```
# Conflicts:
# apiserver/facades/controller/firewaller/firewaller.go
# apiserver/facades/controller/firewaller/firewaller_test.go
# apiserver/facades/controller/firewaller/firewaller_unit_test.go
# apiserver/facades/controller/remoterelations/mock_test.go
# apiserver/facades/controller/remoterelations/remoterelations_test.go
# caas/kubernetes/provider/application/application_test.go
# caas/kubernetes/provider/k8s.go
# go.mod
# go.sum
# mongo/mongo_test.go
# rpc/params/apierror.go
# snap/snapcraft.yaml
# version/version.go
# worker/uniter/operation/interface.go
# worker/uniter/operation/runhook_test.go
# worker/uniter/secrets/rotatesecrets.go
# worker/uniter/secrets/rotatesecrets_test.go
```

## QA steps

See PRs

[JUJU-1019]: https://warthogs.atlassian.net/browse/JUJU-1019?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1082]: https://warthogs.atlassian.net/browse/JUJU-1082?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1069]: https://warthogs.atlassian.net/browse/JUJU-1069?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1079]: https://warthogs.atlassian.net/browse/JUJU-1079?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-544]: https://warthogs.atlassian.net/browse/JUJU-544?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1054]: https://warthogs.atlassian.net/browse/JUJU-1054?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1077]: https://warthogs.atlassian.net/browse/JUJU-1077?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1103]: https://warthogs.atlassian.net/browse/JUJU-1103?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-299]: https://warthogs.atlassian.net/browse/JUJU-299?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1089]: https://warthogs.atlassian.net/browse/JUJU-1089?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1070]: https://warthogs.atlassian.net/browse/JUJU-1070?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1099]: https://warthogs.atlassian.net/browse/JUJU-1099?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1109]: https://warthogs.atlassian.net/browse/JUJU-1109?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[JUJU-1091]: https://warthogs.atlassian.net/browse/JUJU-1091?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants