Why Your Bootstrap Script Fails on Amazon Linux 2023 (and How to Fix It)
If you recently migrated from Amazon Linux 2 (AL2) to Amazon Linux 2023 (AL2023), you may have encountered a frustrating scenario: your EC2 instance launches successfully, but your applications arenโt running. The culprit is often the User Data (bootstrap) script. AL2023 introduces strict security defaults and architectural changes that break many legacy bash scripts. In this guide, weโll breakdown the top 5 reasons your bootstrap scripts fail on AL2023 and provide the exact code snippets to fix them.
AL2023 Bootstrap script Fixes
- 1. The DNF “Transaction Lock” Error
- 2. Permission Denied in /tmp
- 3. IMDSv2 Enforcement (Metadata Failures)
- 4. The Missing python Command
- 5. Strict Shebang Requirements
- Summary
1. The DNF “Transaction Lock” Error
The Symptom:Your script attempts to install packages, but fails with:
Error: Failed to obtain the transaction lock on /var/lib/rpm/.rpm.lock
The Cause: AL2023 replaces yum with dnf. On boot, the OS often runs automatic updates or initializes the SSM Agent in the background. If your User Data script tries to run dnf install at the exact same moment, it crashes because the package manager is locked.
The Fix: Do not run a naked dnf install command. You must implement a retry loop to wait for the lock to release.
Bash
# Add this function to your User Data
dnf_install_safe() {
local packages="$*"
local max_retries=10
local count=0
until dnf install -y $packages; do
count=$((count + 1))
if [ $count -ge $max_retries ]; then
echo "Failed to install $packages after $max_retries attempts"
return 1
fi
echo "DNF is locked. Waiting 10 seconds... (Attempt $count/$max_retries)"
sleep 10
done
}
# Usage:
dnf_install_safe httpd git
2. Permission Denied in /tmp
The Symptom:Your script downloads an installer to the temporary directory, but when you try to run it, you see:
bash: /tmp/install.sh: Permission denied
The Cause: For enhanced security, AL2023 mounts the /tmp directory with the noexec flag by default. This prevents any binaries or scripts located in /tmp from being executed.
The Fix: Stop using /tmp for executable scripts. Switch to /var/tmp, which allows execution.
Don’t do this:
wget -O /tmp/install.sh https://example.com/install.sh
sh /tmp/install.sh # Fails
Do this instead:
wget -O /var/tmp/install.sh https://example.com/install.sh
sh /var/tmp/install.sh # Success
3. IMDSv2 Enforcement (Metadata Failures)
The Symptom: Your script fails to retrieve the instance IP, Region, or Instance ID using curl.
The Cause: AL2023 enforces IMDSv2 (Instance Metadata Service Version 2) by default. The old method (IMDSv1) allowed simple curl requests. IMDSv2 requires a session token for authentication to prevent Server-Side Request Forgery (SSRF).
The Fix: You must generate a token before requesting metadata.
AL2 Style (Broken in AL2023):
REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/region)
AL2023 Style (Working):
# 1. Get Token
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
# 2. Use Token to get Data
REGION=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/placement/region)
4. The Missing python Command
The Symptom:
Scripts executing python script.py return command not found.
The Cause:
AL2023 does not install Python 2.7. Furthermore, it does not alias python to python3 by default to avoid ambiguity.
The Fix:
Update your scripts to explicitly call python3.
Bash
python3 my_script.py
5. Strict Shebang Requirements
The Symptom:
The User Data script is ignored entirely and doesn’t run.
The Cause:
Cloud-init in AL2023 is stricter about file headers. If your script does not begin with a valid shebang (interpreter directive), the system may treat it as plain text rather than a script.
The Fix:
Ensure the very first line of your User Data is:
Bash
#!/bin/bash
Note: Do not put comments or blank lines above this line.
Summary
Migrating to Amazon Linux 2023 offers better performance and security, but it requires modernizing your automation. By handling DNF locks gracefully, respecting the noexec /tmp mount, and adopting IMDSv2, your bootstrap scripts will be robust and future-proof.
Need to debug further?If your instance still fails to configure, SSH into the instance and check the cloud-init logs:
Bash
cat /var/log/cloud-init-output.log
