Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚶‍➡️ [DNM] feat: new controller_build_info metric #13269

Open
wants to merge 1 commit into
base: opentelemetry-configuration
Choose a base branch
from

Conversation

Joibel
Copy link
Member

@Joibel Joibel commented Jun 28, 2024

One of the new metrics of #12589.

Add new, static metric which contains labels from the product version fields.

This is common good practice, and puts the information about which version of a controller was running that a given point in time.

This can be useful when diagnosing issues that occurred, and also shows the progress of a version rollout when that happens.

Note to reviewers: this is part of a stack of reviews for metrics changes. Please don't merge until the rest of the stack is also ready.

Signed-off-by: Alan Clucas alan@clucas.org

@Joibel Joibel force-pushed the opentelemetry-configuration branch from 0ed5767 to f4a1b61 Compare July 1, 2024 07:42
@Joibel Joibel added area/controller Controller issues, panics area/metrics labels Jul 1, 2024
@agilgur5 agilgur5 changed the title feat: new controller_build_info metric 🚶‍➡️ [DNM] feat: new controller_build_info metric Jul 1, 2024
@agilgur5 agilgur5 self-assigned this Jul 1, 2024
Copy link
Member

@agilgur5 agilgur5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs and naming comments

@@ -201,6 +201,21 @@ Metrics for the [Four Golden Signals](https://sre.google/sre-book/monitoring-dis
Some metric attributes may have high cardinality and are marked with ⚠️ to warn you. You may need to disable this metric or disable the attribute.
<!-- titles should be the exact metric name for deep-linking, alphabetical ordered -->
<!-- titles are without argo_workflows prefix -->
#### `controller_build_info`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#### `controller_build_info`
#### `build_info`

isn't "controller" redundant here?

we should have more server metrics as well, but they're emitted separately by separate containers, so I would think we wouldn't need to distinguish

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also wonder if build_metadata might be more explicit

Copy link
Member

@agilgur5 agilgur5 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just version to match argo version

Suggested change
#### `controller_build_info`
#### `version`

version_metadata would work too

@@ -201,6 +201,21 @@ Metrics for the [Four Golden Signals](https://sre.google/sre-book/monitoring-dis
Some metric attributes may have high cardinality and are marked with ⚠️ to warn you. You may need to disable this metric or disable the attribute.
<!-- titles should be the exact metric name for deep-linking, alphabetical ordered -->
<!-- titles are without argo_workflows prefix -->
#### `controller_build_info`

The build information for this workflow controller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The build information for this workflow controller
Build metadata for this Controller.

simplify


| attribute | explanation |
|-------------|------------------------------------------------------------------------|
| `version` | The version of argo-workflows |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| `version` | The version of argo-workflows |
| `version` | The version of Argo |

It's referred to as "Argo" in nearly all docs (similar to #13077 (comment))

| attribute | explanation |
|-------------|------------------------------------------------------------------------|
| `version` | The version of argo-workflows |
| `platform` | Platform this is running on, as go describes it e.g. `linux/amd64` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| `platform` | Platform this is running on, as go describes it e.g. `linux/amd64` |
| `platform` | The [Go platform](https://go.dev/doc/install/source#environment) compiled for. Example: `linux/amd64` |
  • link to the docs for GOOS + GOARCH, otherwise this feels quite unclear. I'm tempted to further explain that as a combination of "OS and architecture".
    • it's also what it's compiled for, not necessarily what it's running on
    • I could not find a better official link for this either 😕
  • avoid Latin phrases, per k8s style guide

|-------------|------------------------------------------------------------------------|
| `version` | The version of argo-workflows |
| `platform` | Platform this is running on, as go describes it e.g. `linux/amd64` |
| `gover` | Version of go that built this workflow controller |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| `gover` | Version of go that built this workflow controller |
| `goversion` | Version of Go used |
  • why not use the full "version"? I read "gover" as "gopher" initially (which ironically works)
  • simplify
    • half of the metrics don't say "this workflow controller" and half do. be consistent
      • it's redundant, so we can just be consistent in removing it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also isn't the convention to use underscores? or is it different for attribute names vs metric names?

Suggested change
| `gover` | Version of go that built this workflow controller |
| `go_version` | Version of Go used |

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Underscores are explicitly mentioned as word delimiters for attribute names in the convention docs: https://opentelemetry.io/docs/specs/semconv/general/attribute-naming/

So the second suggestion should be used.

It would also be be consistent with argo version which prints GoVersion

Comment on lines +215 to +217
| `commit` | The full git SHA1 of the code that built this workflow controller |
| `treestate` | Whether the git tree was `dirty` or `clean` when this was built |
| `tag` | The tag on the git commit or `untagged` if it was not tagged |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| `commit` | The full git SHA1 of the code that built this workflow controller |
| `treestate` | Whether the git tree was `dirty` or `clean` when this was built |
| `tag` | The tag on the git commit or `untagged` if it was not tagged |
| `git_commit` | The full Git SHA1 commit |
| `git_tree_state` | Whether the Git tree was `dirty` or `clean` when built |
| `git_tag` | The Git tag or `untagged` if it was not tagged |
  • match argo version and split words with underscores per OTel convention
  • simplify
  • "Git" as proper noun

const nameBuildInfo = `controller_build_info`
err := m.createInstrument(int64Counter,
nameBuildInfo,
"Build Information for the Argo Workflows Controller",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Build Information for the Argo Workflows Controller",
"Build metadata for this Controller",

same as above

Comment on lines +6 to +11
labelBuildGoVer string = `goversion`
labelBuildDate string = `build`
labelBuildCompiler string = `compiler`
labelBuildGitCommit string = `commit`
labelBuildGitTreeState string = `treestate`
labelBuildGitTag string = `tag`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
labelBuildGoVer string = `goversion`
labelBuildDate string = `build`
labelBuildCompiler string = `compiler`
labelBuildGitCommit string = `commit`
labelBuildGitTreeState string = `treestate`
labelBuildGitTag string = `tag`
labelBuildGoVersion string = `go_version`
labelBuildDate string = `build_date`
labelBuildCompiler string = `compiler`
labelBuildGitCommit string = `git_commit`
labelBuildGitTreeState string = `git_tree_state`
labelBuildGitTag string = `git_tag`

same as above for docs, matching argo version and splitting words with underscores per OTel convention


version := argo.GetVersion()
m.addInt(ctx, nameBuildInfo, 1, instAttribs{
{name: labelBuildVersion, value: version.Version},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any way to add descriptions to attributes in OTel?


func TestBuildInfo(t *testing.T) {
_, te, err := CreateDefaultTestMetrics()
if assert.NoError(t, err) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could invert the conditional here as well

@Joibel Joibel force-pushed the opentelemetry-configuration branch from f4a1b61 to 3030d84 Compare July 5, 2024 10:05
@Joibel Joibel force-pushed the opentelemetry-configuration branch from 3030d84 to aab55e2 Compare July 5, 2024 10:12
@Joibel Joibel force-pushed the opentelemetry-configuration branch from aab55e2 to 4c454e5 Compare July 5, 2024 10:24
One of the metrics #12589.

Add new, static metric which contains labels from the product version
fields.

This is common good practice, and puts the information about which
version of a controller was running that a given point in time.

This can be useful when diagnosing issues that occurred, and also
shows the progress of a version rollout when that happens.

Note to reviewers: this is part of a stack of reviews for metrics
changes. Please don't merge until the rest of the stack is also ready.

Signed-off-by: Alan Clucas <alan@clucas.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/controller Controller issues, panics area/metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants
-