Weekly Notes 38
- STAR method for interview questions: A format is suggested for talking about an organizational or technical change you helped carry out in order to solve a problem or make your teammates lives better in some way.
- Situation describe the context you started in and challenges
- Task briefly what was your plan to address
- Actions you did towards goal
- Results how did it turn out
-
A guide to destructuring in JavaScript: Neat ways to unpack objects and arrays in javascript.
-
Why I don’t like discussing action items during incident reviews: Talks about what time feels good spent in an incident review to him. I’ve had good experience talking about action items too. Hard to predict what thread is worth pulling on that will trigger insight. Kinda just have to keep doing the work and showing up.
-
Building an automatically updating live blog in Django: A tiny app that does one thing well that was built just in time. :) Python.
-
Wonderful Rails World Vibes: Looking back at rails world in Toronto 2024.
-
Thoughts From The First SEV0 Conference: Nice rundown of what sounds like an interesting conference. Can’t wait for the videos. Possibly my fave description was an incident response exercise @ slack where a lunch order was screwed up and the participants had to work the problem! lol
- sync_pd_slack_oncall.py: Nice example of a short python script that is well documented that interacts with pagerduty and slack from Honeycomb. :)
#!/usr/bin/env python3
# Copyright 2024 Honeycomb.io
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of
# this software and associated documentation files (the "Software"), to deal in
# the Software without restriction, including without limitation the rights to
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
# the Software, and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
# FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
# COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
# IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# Dependencies:
#
# The script expects to find two API Keys:
#
# - a Slack Bot User OAuth Token's key with the following permissions:
# - "usergroups:read",
# - "usergroups:write",
# - "users.profile:read",
# - "users:read",
# - "users:read.email"
# - a PagerDuty read-only HTTP API key
#
# The keys are to be located in text files at:
#
# - $SYNC_SECRETS_PATH/$ENV.pagerduty_readonly
# - $SYNC_SECRETS_PATH/$ENV.slack_groupsync
#
# The default value for $SYNC_SECRETS_PATH is /mnt/secrets-store
# and for $ENV is 'development'. Override them as you see fit for
# your container environment.
# Setup:
#
# There's no database or nothing of this kind. Scroll down the file and edit the
# NOOP_USERS list to contain the list of any PagerDuty users you have that act
# as placeholders and can't be in slack groups.
#
# Then edit the ROTATIONS dict to map the slack group handle to one or more
# rotations in PagerDuty.
#
# We run this script in a Kubernetes cronjob in EKS where the secrets are
# managed and mounted via the AWS provider for the Secrets Store CSI Driver.
# Getting this set up is left as an exercise to the reader.
# Usage:
#
# $ ./sync_pd_slack_oncall.py
#
# Notes:
# - It is expected that all rotations and group aliases are valid
# and already exist. No error handling is done.
# - If any rotation contains only NO-OP users, it won't be updated
# as Slack does not allow empty groups.
# - There's no rate limiting nor retries. Our setup is small enough
# that none is required yet.
import http.client
import json
import datetime
import os
# no-op users are fake users to let us have gap in coverage in some
# rotations, which shouldn't be expected to be added to slack groups
NOOP_USERS=["noop@example.org"]
ROTATIONS = {
# "<slack-alias>" : ["<rotation IDs>"]
"platform-oncall": [
"PC5BGL2", # primary
"PNB6DMH", # extra primary
"PR892VA" # secondary
],
"storage-oncall": [
"PIMZEZC", # primary
"P8KF5JL" # extra primary
],
"platform-leads": [
"P9SFCKV"
],
"eng-oncall": [
# platform
"PC5BGL2", # primary
"PNB6DMH", # extra primary
"PR892VA", # secondary
# storage
"PIMZEZC", # primary
"P8KF5JL" # extra primary
]
}
#
SECRETS_PATH = os.getenv('SYNC_SECRETS_PATH', '/mnt/secrets-store')
ENV = os.getenv('ENV', 'development')
# we expect these tokens to be mounted somewhere on disk in a utilcronjob chart
with open(f"{SECRETS_PATH}/{ENV}.pagerduty_readonly") as f:
PD_TOKEN = f.read().rstrip()
with open(f"{SECRETS_PATH}/{ENV}.slack_groupsync") as f:
SLACK_TOKEN = f.read().rstrip()
def email_per_rotation(rotations):
conn = http.client.HTTPSConnection("api.pagerduty.com")
headers = {
'Accept': "application/json",
'Content-Type': "application/json",
'Authorization': f"Token token={PD_TOKEN}"
}
now = datetime.datetime.now(datetime.UTC).strftime("%Y-%m-%dT%H:%M:%SZ")
emails = {}
for rotation in rotations:
emails[rotation] = []
user_urls = []
for schedule in rotations[rotation]:
# Send ?since=<t>&until=<t> to give whoever is on-call _right now_. This
# lets the API create a 'final_schedule' report that includes overrides and that
# otherwise does not exist.
conn.request("GET", f"/schedules/{schedule}?since={now}&until={now}", headers=headers)
res = conn.getresponse()
# format here is of the form:
#
# {'schedule': {
# 'escalation_policies': [...],
# 'final_schedule': {
# 'name': ...,
# 'final_schedule': {
# 'rendered_schedule_entries': [
# {'id': ..., 'start': ..., 'end': ...,
# 'user': {'id': ...,
# 'self': <url>}}
# ]
# }
# }
# }}
#
# What we want is to extract the URLs for all our users.
d = json.loads(res.read())
entries = d['schedule']['final_schedule']['rendered_schedule_entries']
for obj in entries:
user_urls.append(obj['user']['self'])
for url in user_urls:
# fetch the emails from the user profiles
conn.request("GET", url, headers=headers)
res = conn.getresponse()
# response looks like this, so add the right stuff
# {'user':
# {'name': ...,
# 'email': 'noop@example.org',
# ...}
# }
d = json.loads(res.read())
email = d['user']['email']
if email not in NOOP_USERS and email not in emails[rotation]:
emails[rotation].append(email)
return emails
def slack_ids_by_emails(emails):
conn = http.client.HTTPSConnection("slack.com")
headers = {
'Authorization': f"Bearer {SLACK_TOKEN}"
}
ids = []
for email in emails:
conn.request("GET", f"/api/users.lookupByEmail?email={email}", headers=headers)
res = conn.getresponse()
# {'ok': True,
# 'user': {
# 'id': 'U01JVM0SP4N',
# 'team_id': 'T0BQM7CV2',
# ...}
# }
d = json.loads(res.read())
ids.append(d['user']['id'])
return ids
def slack_groups_ids():
groups = {}
conn = http.client.HTTPSConnection("slack.com")
headers = {
'Authorization': f"Bearer {SLACK_TOKEN}"
}
conn.request("GET", f"/api/usergroups.list", headers=headers)
res = conn.getresponse()
# {'ok': True,
# 'usergroups': [
# {'id': 'U01JXM6SB2G',
# 'handle': 'platform-oncall',
# ...},
# ...]
# }
d = json.loads(res.read())
for usergroup in d['usergroups']:
groups[usergroup['handle']] = usergroup['id']
return groups
def map_to_groups(rotation_emails, groups):
mapped = {}
for rotation, emails in rotation_emails.items():
mapped[groups[rotation]] = slack_ids_by_emails(emails)
return mapped
def update_slack_groups(mapping, groups):
conn = http.client.HTTPSConnection("slack.com")
headers = {
'Authorization': f"Bearer {SLACK_TOKEN}",
'Content-Type': "application/json"
}
for group_id, user_id_list in mapping.items():
if not user_id_list:
for handle, id in groups.items():
if id == group_id:
group_name = handle
break
else:
# we should always be able to find the handle for a given id, but if
# it's not there, just display the raw group id
group_name = group_id
print(f"skipped @{group_name} as it would be empty")
continue
user_arg = str.join(',', user_id_list)
payload = json.dumps({'usergroup': group_id, 'users': user_arg})
conn.request("POST", f"/api/usergroups.users.update", payload, headers=headers)
res = conn.getresponse()
# {'ok': True, 'usergroup': <usergroup object>}
d = json.loads(res.read())
if d['ok']:
print(f"updated @{d['usergroup']['handle']}")
if __name__ == "__main__":
# go from the rotation list and extract the emails of everyone
# who is currently on-call in PagerDuty, sorted by rotation
rotation_emails = email_per_rotation(ROTATIONS)
# fetch all the slack groups in a single call, mapped by their
# slack handle pointing at their IDs
groups = slack_groups_ids()
# Turn the rotation handles' dict pointing at user emails into
# a bunch of group IDs pointing at slack user IDs
mapped = map_to_groups(rotation_emails, groups)
# update all the groups
update_slack_groups(mapped, groups)