The Monitoring Agent provides email notification capabilities across the entire SAP landscape. All Linux servers are configured to relay email through smtp2go, a cloud SMTP service, using the verified domain fivetran-internal-sales.com.
Key features:
- Distribution lists stored in /usr/sap/sap_skills/mailing_lists.json, loaded dynamically on page load
- Each server sends as <hostname>@fivetran-internal-sales.com
- The portal sends as sap-skills@fivetran-internal-sales.com

| Property | Value |
|---|---|
| Provider | smtp2go |
| SMTP Server | mail.smtp2go.com |
| Port | 2525 |
| Auth User | antonio.carbone@fivetran.com |
| Auth Password | Stored in vault key smtp2go |
| Verified Domain | fivetran-internal-sales.com |
| TLS | Enabled (opportunistic) |
| Portal From | sap-skills@fivetran-internal-sales.com |
Each Linux server in the SAP landscape relays email via smtp2go, using <hostname>@fivetran-internal-sales.com as its from address.
| Server | From Address | MTA | OS |
|---|---|---|---|
| sapidesecc8 | sapidesecc8@fivetran-internal-sales.com | Postfix (lmdb) | SUSE 15 SP5 |
| sapidess4 | sapidess4@fivetran-internal-sales.com | Postfix (lmdb) | SUSE 15 SP3 |
| ts-sap-hana-s4-ssh-tunnel | sshtunnel@fivetran-internal-sales.com | Exim4 | Debian 12 |
| saprouter | saprouter@fivetran-internal-sales.com | Postfix (lmdb) | Rocky Linux 10 |
| saphvrhub | hvrhub@fivetran-internal-sales.com | Postfix (hash) | Rocky Linux 8 |
| MTA | Config File | Credentials File |
|---|---|---|
| Postfix (SUSE) | /etc/postfix/main.cf | /etc/postfix/sasl_passwd (lmdb map) |
| Postfix (Rocky) | /etc/postfix/main.cf | /etc/postfix/sasl_passwd (hash or lmdb map) |
| Exim4 (Debian) | /etc/exim4/update-exim4.conf.conf | /etc/exim4/passwd.client |
SUSE 15 and Rocky Linux 10 use lmdb maps; Rocky Linux 8 uses hash (Berkeley DB). After editing sasl_passwd, regenerate the map: postmap lmdb:/etc/postfix/sasl_passwd (lmdb) or postmap /etc/postfix/sasl_passwd (hash).
Send a test email from any server:
```bash
# From sapidesecc8 (local)
echo "Test" | mailx -s "Test from sapidesecc8" -r "sapidesecc8@fivetran-internal-sales.com" recipient@email.com

# From sapidess4 (via SSH)
ssh root@sapidess4 'echo "Test" | mailx -s "Test from sapidess4" -r "sapidess4@fivetran-internal-sales.com" recipient@email.com'

# From sshtunnel (Exim4 uses different syntax)
ssh root@10.142.0.37 'echo "Test" | mail -s "Test from sshtunnel" -a "From: sshtunnel@fivetran-internal-sales.com" recipient@email.com'

# From saprouter
ssh root@10.128.0.111 'echo "Test" | mailx -s "Test from saprouter" -r "saprouter@fivetran-internal-sales.com" recipient@email.com'

# From hvrhub
ssh root@10.128.15.240 'echo "Test" | mailx -s "Test from hvrhub" -r "hvrhub@fivetran-internal-sales.com" recipient@email.com'
```
To configure a new Linux server to relay email via smtp2go:
```bash
# For Postfix (RHEL/Rocky/SUSE):
postconf -e "relayhost = [mail.smtp2go.com]:2525"
postconf -e "smtp_sasl_auth_enable = yes"
postconf -e "smtp_sasl_password_maps = lmdb:/etc/postfix/sasl_passwd"   # or hash: on older systems
postconf -e "smtp_sasl_security_options = noanonymous"
postconf -e "smtp_tls_security_level = may"
echo "[mail.smtp2go.com]:2525 antonio.carbone@fivetran.com:PASSWORD" > /etc/postfix/sasl_passwd
chmod 600 /etc/postfix/sasl_passwd
postmap lmdb:/etc/postfix/sasl_passwd   # or: postmap /etc/postfix/sasl_passwd
systemctl restart postfix

# For Exim4 (Debian):
# Edit /etc/exim4/update-exim4.conf.conf:
#   dc_eximconfig_configtype='smarthost'
#   dc_smarthost='mail.smtp2go.com::2525'
# Add to /etc/exim4/passwd.client:
#   mail.smtp2go.com:antonio.carbone@fivetran.com:PASSWORD
update-exim4.conf
systemctl restart exim4
```
The SMTP password is stored in the vault under the key smtp2go. Do not hardcode it in documentation.
Distribution lists are stored in a JSON file on the server: /usr/sap/sap_skills/mailing_lists.json. The web UI loads the lists dynamically on page load via the API. When a list is updated via the web UI:
- The UI sends POST /sap_skills/api/update_mailing_list with the list name and email array
- The change is persisted to /usr/sap/sap_skills/mailing_lists.json

| List Name | Recipients | Usage |
|---|---|---|
| SAPSpecialists | antonio.carbone@fivetran.com, richard.brouwer@fivetran.com | Alerts, notifications, reports |
| Endpoint | Method | Auth | Description |
|---|---|---|---|
| /sap_skills/api/get_mailing_list | POST | None | Read a distribution list by name. Body: {"list_name": "SAPSpecialists"} |
| /sap_skills/api/update_mailing_list | POST | None | Update a distribution list. Body: {"list_name": "...", "emails": [...]} |
| /sap_skills/api/send_test_email | POST | None | Send a test email to a list. Body: {"list_name": "...", "emails": [...]} |
Example — read list via curl:
```bash
curl -sk -X POST -H "Content-Type: application/json" \
  -d '{"list_name":"SAPSpecialists"}' \
  https://sapidesecc8.fivetran-internal-sales.com/sap_skills/api/get_mailing_list
```
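Lists can likewise be updated programmatically. A minimal Python sketch, assuming only the request body documented in the endpoint table above (a `list_name` plus an `emails` array) and the same portal URL; `build_update_body` is a hypothetical helper, not part of the portal code:

```python
import json
import urllib.request

PORTAL = "https://sapidesecc8.fivetran-internal-sales.com"

def build_update_body(list_name, emails):
    """Build the JSON body expected by /sap_skills/api/update_mailing_list."""
    return json.dumps({"list_name": list_name, "emails": emails}).encode()

def update_mailing_list(list_name, emails, base=PORTAL):
    """POST the new recipient list and return the parsed JSON response."""
    req = urllib.request.Request(
        base + "/sap_skills/api/update_mailing_list",
        data=build_update_body(list_name, emails),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Note: the curl example above uses -k, so the portal cert may be
    # self-signed; pass an ssl.SSLContext to urlopen() if verification fails.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```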
Example — send to list from Python:
```python
import subprocess

# Recipients can be read from the get_mailing_list API or hardcoded
recipients = ["antonio.carbone@fivetran.com", "richard.brouwer@fivetran.com"]
for r in recipients:
    subprocess.run(
        f'echo "Alert message" | mailx -s "SAP Alert" '
        f'-r "sap-skills@fivetran-internal-sales.com" {r}',
        shell=True,
    )
```
| Issue | Fix |
|---|---|
| Email not delivered | Check mail log: tail -20 /var/log/mail (SUSE) or journalctl -u postfix -n 20 (Rocky/Debian) |
| unsupported dictionary type: hash | Use lmdb instead: postconf -e "smtp_sasl_password_maps = lmdb:/etc/postfix/sasl_passwd" then postmap lmdb:/etc/postfix/sasl_passwd |
| sender domain not verified | Only fivetran-internal-sales.com is verified. Use *@fivetran-internal-sales.com as the from address. |
| TLS engine unavailable | Enable tlsmgr in /etc/postfix/master.cf (uncomment the line) and restart postfix |
| SASL authentication failed | Verify credentials in /etc/postfix/sasl_passwd match the vault. Regenerate the map after editing. |
| Distribution list changes not saving | Check browser console for errors. Verify the API returns {"status": "ok"}. Check permissions on /usr/sap/sap_skills/mailing_lists.json. |
| Test email button does nothing | List may be empty. Add at least one recipient first. |
```bash
# Check postfix queue
postqueue -p

# Flush stuck mail
postqueue -f

# Check postfix config
postconf relayhost smtp_sasl_auth_enable smtp_sasl_password_maps smtp_tls_security_level

# Check Exim4 queue (sshtunnel)
exim4 -bp

# Check smtp2go delivery log
# Go to https://app.smtp2go.com > Activity > Email Activity
```
The Portal Watchdog runs on sapidess4 and monitors the web server on sapidesecc8. If the portal goes down unexpectedly (without a planned maintenance flag), the watchdog sends email alerts to the SAP Specialists distribution list.
Because it runs on a separate server from the portal, it can detect and report outages even when sapidesecc8 is completely unreachable.
| Property | Value |
|---|---|
| Script | /usr/local/bin/portal_watchdog.py on sapidess4 |
| Schedule | Every 5 minutes via cron (*/5 * * * *) on sapidess4 |
| Check method | HTTPS to https://sapidesecc8.fivetran-internal-sales.com/sap_skills/ + SSH fallback |
| State file | /var/run/portal_watchdog_state.json |
| Planned flag | /var/run/portal_planned_restart (set by cockpit before restart) |
| Local log | /var/log/portal_watchdog.log |
| Activity CSV | /var/log/portal_watchdog_activity.csv |
| Parquet output | gs://sap_cds_dbt/portal_log/ (written directly from sapidess4) |
- Before alerting, the watchdog checks /var/run/portal_planned_restart. If the flag exists, the outage is treated as planned and no alert is sent.
- It compares each check against /var/run/portal_watchdog_state.json to detect transitions (up → down, down → up).
- sapidess4 has gsutil and PyArrow installed, allowing the watchdog to write Parquet event files directly to gs://sap_cds_dbt/portal_log/ even when sapidesecc8 is down. This ensures activity logging continuity during portal outages.
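The transition-plus-planned-flag decision can be sketched as a pure function (a simplification for illustration; the real /usr/local/bin/portal_watchdog.py also performs the HTTPS check, SSH fallback, and logging):

```python
def classify(prev_up: bool, now_up: bool, planned: bool):
    """Decide which event, if any, this check cycle should emit.

    prev_up  -- portal state recorded in the state file
    now_up   -- result of the current HTTPS/SSH check
    planned  -- whether the planned-restart flag file exists
    """
    if prev_up and not now_up:
        # Outage detected: stay silent if maintenance was planned
        return None if planned else "alert_down"
    if not prev_up and now_up:
        return "recovered"
    return None  # no state transition
```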
```bash
# Run watchdog manually
ssh root@sapidess4 "/usr/local/bin/portal_watchdog.py"

# View state
ssh root@sapidess4 "cat /var/run/portal_watchdog_state.json"

# View recent log
ssh root@sapidess4 "tail -20 /var/log/portal_watchdog.log"

# Check cron is active
ssh root@sapidess4 "crontab -l | grep watchdog"

# Set planned maintenance flag (prevents alerts)
ssh root@sapidess4 "touch /var/run/portal_planned_restart"

# Remove planned flag after maintenance
ssh root@sapidess4 "rm -f /var/run/portal_planned_restart"
```
Every monitoring event across the SAP landscape is logged to both a local CSV file and individual Parquet files in GCS. This provides a durable audit trail for all service state changes, health checks, restarts, and watchdog events.
| Format | Location | Retention |
|---|---|---|
| CSV | /var/log/sap_portal_activity.csv | 4-month retention (monthly cron cleanup) |
| Parquet | gs://sap_cds_dbt/portal_log/ | Individual files per event |
| Column | Description |
|---|---|
| date | Event date (YYYY-MM-DD) |
| time | Event time (HH:MM:SS, local) |
| timestamp_utc | UTC timestamp (ISO 8601) |
| source | Origin of the event (e.g., health_check, watchdog, cockpit) |
| type | Event type (e.g., state_change, startup, alert) |
| action | What happened (e.g., start, stop, restart, check) |
| server | Target server hostname |
| service | Target service name |
| result | Outcome (e.g., success, failure, running, stopped) |
| detail | Additional context or error message |
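A row matching this schema can be assembled as below. This is a sketch of the column layout only; `activity_row` is a hypothetical helper, not the actual logging code on the servers:

```python
import csv
import io
from datetime import datetime, timezone

COLUMNS = ["date", "time", "timestamp_utc", "source", "type", "action",
           "server", "service", "result", "detail"]

def activity_row(source, type_, action, server, service, result,
                 detail="", now=None):
    """Build one activity record keyed by the schema columns above."""
    now = now or datetime.now(timezone.utc)
    local = now.astimezone()  # date/time columns are local per the schema
    return {
        "date": local.strftime("%Y-%m-%d"),
        "time": local.strftime("%H:%M:%S"),
        "timestamp_utc": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": source, "type": type_, "action": action,
        "server": server, "service": service,
        "result": result, "detail": detail,
    }

# Append one row in CSV form (written to an in-memory buffer here)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=COLUMNS)
writer.writerow(activity_row("watchdog", "state_change", "check",
                             "sapidesecc8", "web_server", "failure",
                             "HTTP timeout"))
```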
The Service Monitor is a real-time dashboard that checks the health of all 6 servers and their critical services every 5 minutes. It runs as a cron job on sapidesecc8 and writes results to a JSON file that the web UI reads.
Access it at: SAP_Monitoring.html → Service Monitor
| Server | Services Checked | Check Method |
|---|---|---|
| sapidess4 | SAP Dispatcher, Gateway, ICM, IGS; HANA FIV (Nameserver, Indexserver, XSEngine, Compileserver); HANA PIT (Nameserver, Indexserver) | SSH + sapcontrol -nr 03 (SAP); SSH + sapcontrol -nr 00 (FIV); SSH + sapcontrol -nr 96 (PIT) |
| sapidesecc8 | SAP Dispatcher, Gateway, ICM; Oracle Database; Web Server (Portal) | Local sapcontrol -nr 00; ps -ef \| grep ora_pmon; curl https://localhost/sap_skills/ |
| saprouter | Server reachability; SAPRouter systemd service | SSH to saprouter-internal; systemctl is-active saprouter |
| sap-sql-ides | Server reachability; SAP Instance (SQ1); SQL Server Database | Ping 10.128.0.51; SOAP to http://10.128.0.51:50013/ (sapcontrol); WinRM via sq1_system_status API |
| saphvrhub | Server reachability; HVR Hub Service; PostgreSQL 14 | SSH to saphvrhub; systemctl is-active hvrhubserver; systemctl is-active postgresql-14 |
| SSH Tunnel | Server reachability | SSH to 10.142.0.37 |
| Color | Meaning |
|---|---|
| Green | Service is running / server is reachable |
| Red | Service is down / server is unreachable |
| Gray | Initial state — check in progress |
Each server card also shows a REACHABLE or UNREACHABLE badge at the top right.
| Property | Value |
|---|---|
| Script | /usr/local/bin/sap_health_check.sh |
| Runs on | sapidesecc8 |
| Schedule | Every 5 minutes via cron (*/5 * * * *) |
| JSON Output | /usr/sap/sap_skills/health_status.json |
| History Log | /var/log/sap_health_check.log |
| Log Rotation | Auto-truncated to last 2000 lines |
```json
{
  "timestamp": "2026-04-14T05:16:19Z",
  "servers": {
    "sapidess4": {
      "reachable": true,
      "sap_app_server": {"dispatcher":1,"gateway":1,"icm":1,"igs":1},
      "hana_fiv": {"nameserver":1,"indexserver":1,"xsengine":3,"compileserver":1},
      "hana_pit": {"nameserver":1,"indexserver":0}
    },
    "sapidesecc8": {
      "reachable": true,
      "sap_app_server": {"dispatcher":1,"gateway":1,"icm":1},
      "oracle_db": true,
      "web_server": true
    },
    "saprouter": {"reachable": true, "saprouter_service": true},
    "sap_sql_ides": {"reachable": true, "sap_instance": true, "sql_server": true},
    "saphvrhub": {"reachable": true, "hvr_hub_service": true, "postgresql": true},
    "ssh_tunnel": {"reachable": true}
  }
}
```
Values of 1 or true = running. Values of 0 or false = down.
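A consumer of health_status.json can flatten this convention into a list of down services. A sketch, assuming only the structure shown above and treating any nonzero/true value as running (`down_services` is a hypothetical helper):

```python
def down_services(status):
    """Return (server, check) pairs whose value is falsy (0 or false)."""
    problems = []
    for server, checks in status["servers"].items():
        if not checks.get("reachable", False):
            # Unreachable server: individual service checks are meaningless
            problems.append((server, "reachable"))
            continue
        for name, val in checks.items():
            if name == "reachable":
                continue
            if isinstance(val, dict):  # nested process groups (e.g. hana_pit)
                for proc, state in val.items():
                    if not state:
                        problems.append((server, f"{name}.{proc}"))
            elif not val:
                problems.append((server, name))
    return problems
```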
| Endpoint | Method | Description |
|---|---|---|
| /sap_skills/health_status.json | GET | Current health status JSON (static file, updated by cron) |
| /sap_skills/api/admin/health_log | POST | Last 50 lines of the health check history log |
After each health check, sap_health_alert.py runs automatically to detect and report state changes.
The planned cron wrapper (/usr/local/bin/planned_cron_wrapper.py) sets planned maintenance flags before running scheduled tasks, so the health alert system does not send false alerts during expected downtime.
The --also-flag option is used when a scheduled action has indirect consequences on other services. For example, an Oracle offline_force backup also stops SAP services on the same server:
```bash
# Oracle offline_force backup: flags both Oracle AND SAP as planned
planned_cron_wrapper.py --also-flag sapidesecc8:sap sapidesecc8:oracle -- brbackup -t offline_force ...
```
This prevents false alerts for both the directly stopped Oracle database and the indirectly affected SAP application server.
```bash
# Run health check manually
ssh root@sapidesecc8 "/usr/local/bin/sap_health_check.sh"

# View current status
ssh root@sapidesecc8 "cat /usr/sap/sap_skills/health_status.json"

# View health check history
ssh root@sapidesecc8 "tail -20 /var/log/sap_health_check.log"

# Check cron is active
ssh root@sapidesecc8 "crontab -l | grep health"
```