TIL
Today I Learned!
ansible, aws, bash, docker, elastic search, frontend, homebrew, intellij, java, jenkins, linux, macos, mongodb, network, ngrok, nodejs, openssl, postfix, python, resilience, security, ssh, strace, subversion, tailwindcss, terraform, yum
Services I have helped run in the past
- Sending email
- Email is delightful :)
Ansible
ansible-playbook -i "localhost" playbook-nagios.yml --list-tasks
: Lists the tasks that would be run by ansible, without running them
ansible hostgroup -i inv -m shell -a "command"
: Executes a task across all hosts in group hostgroup in inventory inv
AWS
aws sts get-caller-identity
: Returns the identity (user id, account, ARN) that my current credentials belong to
Bash
Beautiful bash
- gradlew
- locust run-redistributed-headless.sh: This one has a nice capture of process ids in an array with looping
- Github checks ssh key encryption algorithm in use
- Zed script directory
- Pleroma web site build script
More links
- Documented bash ‘set’ shell modifiers (eg What does set -e mean?)
- Best practices for writing bash
- Anybody can write good bash
- Google’s style guide: Lots of great tips / things to think about here
- Difference between [ and [[: Always have to look this up
- ShellCheck: static analysis of scripts, looking for common mistakes
Docker
The difference between entrypoint and cmd
The ENTRYPOINT specifies a command that will always be executed when the container starts.
The CMD specifies arguments that will be fed to the ENTRYPOINT.
If you want to make an image dedicated to a specific command you will use ENTRYPOINT ["/path/dedicated_command"]
Otherwise, if you want to make an image for general purpose, you can leave ENTRYPOINT unspecified and use CMD ["/path/dedicated_command"], as you will be able to override the setting by supplying arguments to docker run.
Elastic
CAT api
Management urls in elastic that give cluster status, and index info
- _cat/nodes returns info about cluster nodes (jvm use, cpu, leader / follower + several other things) (elastic docs)
- _cat/master returns info about this cluster’s leader node
- _cat/indices returns info about each index (number of primary+replica shards, disk use, health)
- _cat/shards returns shard stats (primary or replica, disk use, node)
- _cat/allocation?v returns the allocation of shards to data nodes (disk use)
?v prints column headers with the output and tries to line up data
Notes
- Can add replicas easily but merging shards is hard
- Dedicated leader nodes means less data copying during failover in aws
- Production clusters should have an odd number of nodes (3 or 5 are common for small clusters; the latter tolerates 2 node failures while remaining operable)
- Without dedicated leader nodes, a cluster can become very busy after a failure copying data (high network, cpu are observed)
- Cluster nodes should be “big enough” to handle this extra load
- Running elastic with a high availability requirement in production is expensive $$$
- 20% of storage is reserved for elastic overhead by AWS OpenSearch
- AWS OpenSearch says a cluster with standby has 99.99% availability and doesn’t copy shards as aggressively (or at all?) during a failure
- This config automatically added dedicated leader nodes x3
- A cluster without standby has 99.9% availability
Frontend
- html5boilerplate a reasonable starting point for frontend projects. Pretty much everything you could ever be worried about is here 🙂 This is similar but much simpler
- Vue instance lifecycle events
- Setting up Vue computed properties for unit tests with vue-test-utils: In the past, I have had to override a computed value to create specific test context
Homebrew
# Status of all services
brew services
# Start a service
brew services start [service]
# Stop a service
brew services stop [service]
Intellij
Shortcuts
- Cmd+Shift+V Choose from the previous n clipboard values and paste
- Ctrl+g Multi select the word under the cursor (need to press multiple times)
- Alt+Cmd+[ jumps to matching opening brace
- Alt+Cmd+] jumps to matching closing brace
Java
- Java links that are my favourites
- Db migrations in java: Incredibly Spring data doesn’t have a story for db migrations out of the box (It seems to have everything else!)
- Elasticsearch’s guidance for setting java’s heap size: 50% of main memory available in a system is a recommendation I’ve read in a couple of places now
- Get version of the jvm that ran tomcat by looking at catalina.out for log lines: [‘JVM Version’, ‘JVM Vendor’]
A full garbage collection is triggered when a copy from newgen to oldgen fails because there isn’t enough available memory.
Set default java on macos
Performance analysis
CPU
Links
Memory
JConsole
A program that ships with the jdk that lets you introspect a running java process. (You can see exposed jmx beans, heap memory, threads, and a few other things) I usually use it to look at jmx beans
# Figure out the pid of your java process doing something like this or pidof ...
ps aux | grep java
# Then run
jconsole
# JConsole will display a list of java processes it finds on your system. Pick the one you want!
Note: I have only ever done this in dev
Maven
# Compute dependency tree for a package in a java program that was brought in by a maven dependency either directly or indirectly
# All dependencies
mvn dependency:tree
# Filter output from above
mvn dependency:tree -Dincludes=org.yaml
Links
- How root / intermediate certs work, and how they’re protected
- A list of tutorials from Baeldung.com. This is a really good resource!
- Testing in Java
- Java logging: slf4j, apache commons logging, log4j, java.util.logging … All the logs! Nice overview of logging in java with increasing degrees of complexity depending on need
- Analyzing maven dependency tree
Openssl
Cacerts, keystores, trust, pkcs12, trust anchors, WebPKI
# View trusted root certs in centos
keytool -list -keystore /etc/pki/ca-trust/extracted/java/cacerts -storepass changeit
# View an ssl cert from a remote host
openssl s_client -showcerts -connect telusemrapi.telushealth.com:443
# View a particular cert in my trust store
# Get alias from the cert list command above
keytool -list -v -keystore /etc/pki/ca-trust/extracted/java/cacerts -alias digicertassuredidrootca -storepass changeit
View keys in a key store
keytool -list -v -keystore a.p12 \
-storepass changeme \
-storetype PKCS12
Extract key
openssl pkcs12 -info -in a.p12 -nodes -nocerts
Extract cert
openssl pkcs12 -info -in a.p12 -nokeys
Make a keystore
openssl pkcs12 -export -in cert.pem -inkey key.pem -out a.p12
# Generate a private key in pem format
openssl genrsa -out key.pem 2048
# Then do this if I need a public key too
openssl rsa -in key.pem -outform PEM -pubout -out public.pem
# Generate a public/private key in rsa format (Can be used with github + SSH)
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
# Github recommends creating a key with the ed25519 algorithm
ssh-keygen -t ed25519 -C "your_email@example.com"
# Check a certificate
openssl x509 -in server.crt -text -noout
# Check a key
openssl rsa -in server.key -check
# Check a csr
openssl req -text -noout -verify -in server.csr
Linux
zip -er a.zip a/
: Creates an encrypted zipfile of contents with folders from a/
dd if=/dev/zero of=a.txt bs=1024 count=10240
: Creates a 10 MiB “empty” file for testing (10240 chunks, at 1024 bytes per chunk)
pstree -s tomcat
: Shows processes containing tomcat in their commandline (brew install pstree)
split -l 50 filename
: Splits file into 50 line chunks named xaa, xab, xac, …
mkfs.xfs /dev/sdc
: Makes an xfs filesystem
- After initializing a volume with a filesystem, get its UUID by running
sudo blkid
and update /etc/fstab. Don’t use the device file (eg /dev/nvme1n1) in fstab. Device name to volume mapping can change between restarts depending on which volume comes up first when multiple volumes are attached to a server.
- Background long running tasks started in a login shell
  - Suspend the program with ctrl-z, then run
  bg
  then
  disown -h
  to tell your shell not to send SIGHUP to the job when the shell exits.
  - Run
  nohup <cmd>
  to launch a command not bound to the lifetime of the current shell.
- Disk utilization per process with pidstat
pidstat -dl 20
batches and reports disk use (r/w) every 20s (source)
- Filesystem Hierarchy Standard: Where to put things in the linux file system
nc -l 8080 -k
starts a tiny server that prints incoming client messages to its terminal and keeps listening after a client disconnects
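A quick sanity check of the dd and split one-liners above, as a minimal sketch. It assumes a throwaway directory from mktemp and a made-up 120-line input file (lines.txt); neither is part of the original notes.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing in the current dir is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 10240 blocks x 1024 bytes = 10485760 bytes (10 MiB) of zeros.
dd if=/dev/zero of=a.txt bs=1024 count=10240 2>/dev/null
size=$(wc -c < a.txt | tr -d ' ')
echo "a.txt is $size bytes"

# A 120-line file (hypothetical input) split into 50-line chunks
# gives three files holding 50, 50 and 20 lines.
seq 1 120 > lines.txt
split -l 50 lines.txt
chunk_count=$(ls x* | wc -l | tr -d ' ')
last_lines=$(wc -l < xac | tr -d ' ')
echo "$chunk_count chunks, last one has $last_lines lines"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The chunk names (xaa, xab, xac) come from split's default prefix x; pass a second argument after the filename to use a different prefix.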
Systemd
Systemd stops mongodb by sending a SIGTERM signal. SIGTERM asks a process to terminate itself gracefully. It’s the default signal sent by kill
SIGTERM is the “normal” kill signal. The application will be given time to shut down cleanly (save its state, free resources such as temporary files, etc.), and an application that is programmed to not immediately terminate upon a SIGTERM signal may take a moment to be terminated.
SIGKILL (also known as Unix signal 9)—kills the process abruptly, producing a fatal error. It is always effective at terminating the process, but can have unintended consequences. SIGTERM (also known as Unix signal 15)—tries to kill the process, but can be blocked or handled in various ways.
https://stackoverflow.com/questions/42978358/how-does-the-systemd-stop-command-actually-work
Always want to be sending SIGTERM first.
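The difference between the two signals can be seen with a small experiment. This is only a sketch: worker is a hypothetical stand-in for a real service such as mongod, and the one second sleeps are an assumption to give it time to install its handler.

```shell
#!/bin/sh
# Stand-in for a service: handles SIGTERM by cleaning up and exiting 0.
worker() {
  trap 'echo "worker: got SIGTERM, cleaning up"; exit 0' TERM
  while :; do sleep 1; done
}

worker &
pid=$!
sleep 1            # give the worker time to install its trap

kill "$pid"        # kill sends SIGTERM (15) by default
wait "$pid"
term_status=$?
echo "after SIGTERM: exit status $term_status"

# SIGKILL cannot be trapped, so the cleanup handler never runs.
worker &
pid=$!
sleep 1
kill -KILL "$pid"
wait "$pid"
kill_status=$?
echo "after SIGKILL: exit status $kill_status"
```

The first worker exits with status 0 after running its handler. The second is reported as 137 (128 + signal number 9), which is how the shell shows a process that was killed rather than one that exited on its own.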
Links
- Tcpdump: A great guide to using tcpdump to investigate network traffic in a vm running a relatively modern version of linux.
MacOS
cmd + shift + .
Reveals hidden files in a finder window. eg ~/.sdkman/
cmd + .
Is a special key combo that stands in for the physical escape key on the iPad magic keyboard
cmd + ctrl + space
Pops up the emoji keyboard on a mac
MongoDB
Connection Strings
Links
- Connection strings
- Blog post about Mongo 3.6 and dns seeding
- Another blog post about service discovery: This one is particularly good. Walks you through a connection lifecycle in pyMongo. (I’m assuming a recent java driver is similar) There’s a formal service discovery protocol it sounds like
- db.isMaster(): The query drivers send to the mongo host they’re configured with to learn about the topology of a mongo cluster
- Update arrays in a document: Modify an array in a document using $ and $[] notation
- Design Patterns for MongoDB
- Compact command concerns
- Mongodb manual - compact
Network
Tools
- NetSpot: Map out areas of strong vs weak wifi signal strength to help you decide where to put wireless access points. There’s a free version!
Ngrok
- ngrok: A tool to create a public tunnel to a tcp service running on localhost
OpenSSL
Cross posted a few ssh things here because I can never remember where in this doc I put stuff like this …
Postfix
Various helpful commands
Email entering postfix queues
Email leaving postfix
Postfix log entry
# Message delivery time stamps
# delays=a/b/c/d, where
# a = time before queue manager, including message transmission
# b = time in queue manager
# c = connection setup including DNS, HELO and TLS;
# d = message transmission time
- The postfix logfile entry: Postfix feature # 20051103 added the following (from the 2.3.13 release notes): Better insight into the nature of performance bottle necks, with detailed logging of delays in various stages of message delivery. Postfix logs additional delay information as “delays=a/b/c/d” where a=time before queue manager, including message transmission; b=time in queue manager; c=connection setup time including DNS, HELO and TLS; d=message transmission time.
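To read the four stages out of a log entry, something like the following can help. It is a minimal sketch: the log line is made up for illustration (queue id, addresses and timings are all hypothetical), and the stage labels in the output are my own names, not postfix's.

```shell
#!/bin/sh
# Hypothetical log line, made up for illustration; only the delays= field matters.
line='postfix/smtp[1234]: ABC123: to=<user@example.com>, relay=mx.example.com[192.0.2.1]:25, delay=1.5, delays=0.1/0.2/0.7/0.5, dsn=2.0.0, status=sent'

# Pull out the a/b/c/d value that follows "delays=".
delays=$(printf '%s\n' "$line" | sed -n 's|.*delays=\([0-9./]*\).*|\1|p')

# Split on "/" and label each stage in the order a/b/c/d.
report=$(printf '%s\n' "$delays" | awk -F/ '{
  printf "before_qmgr=%s in_qmgr=%s conn_setup=%s transmission=%s", $1, $2, $3, $4
}')
echo "$report"
```

A large third value points at connection setup (DNS, HELO, TLS) with the remote side, while a large second value means the message sat in the queue manager.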
A few more links about postfix :)
- Postfix architecture (input handlers, queues, output handlers): Fairly important for understanding how postfix works
- SPF records: Qualifiers, mechanisms, oh my! This is a way to validate the envelope sender (MAIL FROM) of a message: that the sending ip is permitted to send email on behalf of that domain. Works with dns
- Basic postfix config: Good high level guidance for setting up postfix for specific use cases
- Extend postfix smtpd input filtering with custom code: We were looking for a way to show backpressure to clients based on health of active and deferred queues (Don’t accept new messages addressed to email service providers we are currently having delivery trouble with. eg A large number of delayed messages). This may be a way to do that
- On destination rate delays: If you are relaying directly to email service providers, the rate applies per recipient domain. If you relay indirectly, on the other hand, domain == ‘smtp nexthop’. For example, if you’re sending messages to an internal smtp server that relays through another before external delivery, domain in this case is NOT the recipient address domain. It is the relay server. If you only have 1 relay, then email will go out 1 at a time at the defined period
Reverse DNS (ptr) records
Mail servers will cross-check your SMTP server’s advertised HELO hostname against the PTR record for the connecting IP address, and then check that the returned name has an address record matching the connecting IP address. If any of these checks fail, then your outgoing mail may be rejected or marked as spam.
So, you need to set all three consistently: The server’s hostname and the name in the PTR record must match, and that name must resolve to the same IP address.
Note that these do not have to be the same as the domain names for which you are sending mail, and it’s common that they are not.
Reverse dns records (ptr): A discussion of how they’re used. The first comment is the most helpful (Included here for posterity :))
Python
python3 -m http.server 8000
: Runs an http server in the current dir
python3 -m site
: Lists python’s module load path
Resilience
Security
Security, cryptography, whatever
- Choosing good cryptographic constructs for specific circumstances: Cryptographic Right Answers
- Generate a random secret on linux / mac:
Hexadecimal (0-9, a-f)
dd if=/dev/urandom bs=16 count=1 2> /dev/null | xxd -p
Uppercase, lowercase, numbers, +, =, /
head -c 32 /dev/urandom | base64
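A sketch of what the two commands above produce, with length checks. One swap to note: od stands in for xxd here only because od is part of POSIX, while xxd usually ships with vim; the variable names are my own.

```shell
#!/bin/sh
# 16 random bytes rendered as hex: 2 characters per byte, so 32 characters.
hex_secret=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
echo "hex:    $hex_secret (${#hex_secret} chars)"

# 32 random bytes in base64: 4 characters per 3 bytes, padded, so 44 characters.
b64_secret=$(head -c 32 /dev/urandom | base64)
echo "base64: $b64_secret (${#b64_secret} chars)"
```

The hex form is safe to paste into URLs and config files as-is; the base64 form packs more randomness per character but may need quoting because of +, / and =.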
SSH
~/.ssh/config
Create an ssh tunnel to reach an internal service through a jump host: I can connect to the jump host, and from there to another machine that can see the internal service
Strace
Subversion
- Convention at cog
  - For a merge commit message, start with “Merged from”
  - Add a re #
Tailwindcss
Terraform
Yum
NodeJS
- npx serve - Runs a tiny webserver in the current directory. Very handy!
- npm install -g npx - Installs npx assuming you have npm :)