Maturing Security Workflows

I’ve been thinking a lot lately about maturing security workflows. Building workflows takes years of onboarding tools, learning systems, and wiring automations together. The harder question is what happens after they’re built.

How do you manage constant churn without being overwhelmed by the systems you’ve spent years building? That’s what maturing security workflows is really about.

If you don’t have stability in upstream systems, you’ll only introduce pain downstream. If you depend on an upstream IdP for attributes and group membership, it has to be reliable. Data needs to be standardized and validated. Always ask: Do I have the data I expected and need? If not, fail fast rather than waste cycles processing and enriching bad data.

Adding severity and confidence scores to your data is a good way to improve maturity. It shows you understand the context of your logs or alerts and gives you options later by applying action gates — for example, only creating a ticket if criticality is high or critical. These scores also help preserve integrity in human-in-the-loop decisions. We might think we should network-contain an endpoint or reset a user’s password, but if the severity is critical, the automation stops and waits for a human to click the button.
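
A minimal sketch of such an action gate in Ruby (the field names, thresholds, and helper functions are illustrative assumptions, not any specific product’s API):

#!/usr/bin/ruby
# Action-gate sketch: fields, thresholds, and helpers are hypothetical.
SEVERITIES = %w[low medium high critical].freeze

# Hypothetical stand-ins for your ticketing / chat / SOAR integrations.
def create_ticket(event);          puts "ticket: #{event[:title]}"; end
def request_human_approval(event); puts "awaiting approval: #{event[:title]}"; end
def auto_remediate(event);         puts "auto-remediating: #{event[:title]}"; end

def gate(event)
  rank = SEVERITIES.index(event[:severity].to_s.downcase) || -1

  # Only open a ticket when criticality is high or critical.
  create_ticket(event) if rank >= SEVERITIES.index("high")

  if event[:severity] == "critical"
    request_human_approval(event)  # stop and wait for a human to click the button
  elsif event[:confidence].to_f >= 0.9
    auto_remediate(event)          # high-confidence, non-critical: let automation act
  end
end

gate(severity: "critical", confidence: 0.95, title: "impossible travel for jdoe")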

Reduce alert fatigue and automation errors by dedicating time each week to tuning and iterating on pain points. You should have metrics — even if they’re anecdotal — showing which alerts fire most often or which ones you consistently ignore due to high false-positive rates. Fix them. Block time on your calendar one day a week to refactor and increase alert fidelity.

Failure is not wrong. Failure is not bad. Failure is valuable input. When you miss an alert you should have caught, or fail to notice an HTTP 401 from your API because a key expired, that’s not just a mistake — it’s an opportunity to become a better engineer. Invest time in error handling and custom logging. You may not need it while you’re in the middle of building a workflow, but your teammates — and future you — will need it to quickly understand why things were built the way they were.

To grow and mature, assume failure. Call it out. Make it obvious. You can’t fix what you refuse to see.

Alert Fatigue Is Self-Inflicted And Better Detection Makes It Worse

Alert fatigue is a real thing. When you’re building and scaling logging and alerting, it’s easy for things to escalate faster than you can respond or mature your systems. The more logs you ingest, the more detections you make, the more alerts you fire, the more overhead you add.

To make it worse, the better you get at detections, the more time you spend responding!

Management responds with more tools instead of more hands. Tools add opportunity but also increase complexity, and now you’re in even deeper as the only owner of the system that is firing alerts.

Automation and AI can help reduce fatigue and burden if you’re careful not to take your hands completely off the wheel. But where do you start?

That’s where I find myself: armed with tools and ability, building an alert pipeline to standardize what an “Event” looks like for downstream systems. Everyone wants automation and AI, but it’s hard to scale or mature your security operations if you don’t first have stability and predictability in your data. Your team should know exactly what an “Event” looks like when it’s delivered to them.


I started with the concept of catching a webhook. But then what do I do with it? With some trial and error, I started building the rough outlines for a SOC or Alert pipeline.

(Diagram: simplified SOC alert pipeline)

I attempted to create a flow that begins with validation: validating headers or HMAC signatures. Is the request timestamp within an acceptable window, so we avoid replay attacks? Is the content-type expected? Is the schema correct? Was it sent from an approved IP? Validation rules should confirm that this event is expected.
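
As a concrete sketch, here’s what that validation gate might look like in Ruby. A Rack-style request object, the header names, the secret, and the IP allowlist are all assumptions for illustration:

require "openssl"
require "json"

ALLOWED_IPS   = %w[203.0.113.10].freeze
SHARED_SECRET = ENV.fetch("WEBHOOK_SECRET")
MAX_SKEW      = 300 # seconds; reject anything outside this window to avoid replays

def valid_webhook?(request)
  return false unless ALLOWED_IPS.include?(request.ip)           # approved sender?
  return false unless request.content_type == "application/json" # expected content-type?

  # Timestamp within an acceptable window?
  ts = request.get_header("HTTP_X_TIMESTAMP").to_i
  return false if (Time.now.to_i - ts).abs > MAX_SKEW

  # HMAC signature over the raw body, compared in constant time.
  body     = request.body.read
  expected = OpenSSL::HMAC.hexdigest("SHA256", SHARED_SECRET, body)
  return false unless OpenSSL.secure_compare(expected, request.get_header("HTTP_X_SIGNATURE").to_s)

  # Schema correct? At minimum, the payload parses and has the key we expect.
  payload = JSON.parse(body) rescue (return false)
  payload.key?("events")
end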

Once you have an expected event, it’s time to start processing it, making sure grouped or bulk events are broken out into individual events, so each event is processed through the pipeline.

You may not have a unique id to work with, or you’re dealing with multiple log sources that have different fields. So, create and insert an alert id for the root payload and an event id for each broken-out event.
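
A minimal sketch of that break-out step, assuming the bulk payload carries its events under an "events" key:

require "securerandom"

# Break a bulk payload into individual events and stamp ids.
def explode(payload)
  alert_id = SecureRandom.uuid
  events   = payload.fetch("events", [payload]) # treat a lone event as a batch of one

  events.map do |event|
    event.merge(
      "alert_id" => alert_id,          # shared by every event from this delivery
      "event_id" => SecureRandom.uuid  # unique per broken-out event
    )
  end
end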

Before you waste cycles or credits processing or enriching events, check that you have the required fields. If you don’t, fail quickly and either fix your upstream detection to include them in the payload, or insert them before you continue.
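
For example, a fail-fast check might look like this (the required-field list is an example, not a standard):

REQUIRED = %w[event_id source user host timestamp].freeze

# Fail fast before spending cycles (or API credits) on enrichment.
def check_required!(event)
  missing = REQUIRED.reject { |field| event.key?(field) }
  return event if missing.empty?

  raise ArgumentError, "event #{event['event_id']} is missing: #{missing.join(', ')}"
end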

Now that you have broken-out individual events with unique ids and your required fields, it’s time to enrich. Make your calls to APIs or vendors and pull back your data. Validate the response like before. Break the response out into individual events if multiple results are found.

Sanitize the response if needed: downcase values if you know downstream systems are case-sensitive, escape any special characters, and reject unexpected fields. You should be able to answer the question: is this response safe and valid to proceed?

Normalize the response. Is the time format in UTC and formatted the way you expect? If not, fix it now. Are fields empty that should have values? Insert a default value if needed. End with data that is consistent.
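
A sketch of those two steps together, where the allowed fields and defaults are illustrative assumptions:

require "time"

ALLOWED_FIELDS = %w[event_id user host timestamp severity].freeze

# Sanitize: keep only expected fields, downcase the case-insensitive ones.
def sanitize(event)
  clean = event.slice(*ALLOWED_FIELDS)  # reject unexpected fields
  %w[user host].each do |key|           # downcase for case-sensitive consumers
    clean[key] = clean[key].to_s.strip.downcase if clean.key?(key)
  end
  clean
end

# Normalize: consistent UTC timestamps and defaults for empty fields.
def normalize(event)
  event["timestamp"] = (Time.parse(event["timestamp"].to_s).utc.iso8601 rescue Time.now.utc.iso8601)
  event["severity"]  = "informational" if event["severity"].to_s.empty?
  event
end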

Rinse and repeat with each API or vendor callout for enrichment data: add the computer name and serial, last login times, user risk scores, IP/domain validation, etc.

At the end, you should have a merged final event, containing the alert data (what alert fired and why), the event data (individual event data), and enrichment data.
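
Shape-wise, the merged event might look something like this (keys and values are examples):

# Illustrative shape of a merged final event; keys and values are examples.
final_event = {
  "alert" => {                       # what alert fired and why
    "alert_id" => "3f6c…", "rule" => "impossible_travel", "severity" => "high"
  },
  "event" => {                       # the individual broken-out event
    "event_id" => "9a1b…", "user" => "jdoe", "host" => "jdoe-mbp"
  },
  "enrichment" => {                  # everything the callouts added
    "user_risk_score" => 87, "last_login" => "2024-05-01T08:30:00Z"
  }
}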

Now you can decide what to do with it. Send it to AI to validate your findings, provide a summary, or give it a criticality? Send it to your notification channels?

This alone doesn’t decrease the number of alerts, but it sets you up for better success. If you have a predictable enriched alert that hits your notification pipeline, you can make logical decisions. Maybe only high and critical alerts go to case management. Maybe team members can subscribe to emails if they want. Maybe there’s a Slack channel for each alert by severity. Team members can decide to mute or unmute as needed. Or if a certain confidence score is met, you send the event to your SOAR for auto-remediation.

Is any of this right? It’s an iterative process, but it’s a starting point.

JamfReports Ruby Gem

Ruby Gem for Jamf Pro API

JamfReports is a Ruby gem that uses the Jamf Pro API for hosted instances. Currently, it only reports on Applications, pulling from the /api/v1/computers-inventory?section=APPLICATIONS endpoint.

For gem install instructions, see here.

Future reports may be added.


Usage

To run JamfReports, add these two variables to the top of your file, below the require "JamfReports" line:

#!/usr/bin/ruby
require "JamfReports"

## UPDATE THESE VARIABLES -------------------------------
$jamfpro_url = "https://company.jamfcloud.com" 
$api_pw = "Your API KEY HERE"

If you don’t have an API key for the $api_pw variable, run this command to create one (a base64-encoded username:password string) and paste the output into the variable.

printf "jamfUserName:JamfPassword" | iconv -t ISO-8859-1 | base64 -i -

Then add these three lines to generate the API token and pull the data from Jamf, so you can run the report commands:

JamfReports.getToken
JamfReports.checkTokenExpiration
JamfReports.getAllApplicationInventory

A full working file might look like this.

#!/usr/bin/ruby
require "JamfReports"

## UPDATE THESE VARIABLES ----------------------------------------
$jamfpro_url = "https://company.jamfcloud.com" 
$api_pw = "Your API KEY HERE"

## COMMANDS ------------------------------------------------------
JamfReports.getToken
JamfReports.checkTokenExpiration
JamfReports.getAllApplicationInventory

## REPORTS -------------------------------------------------------
JamfReports.listAllInstalledApps
# JamfReports.webBrowserReport
# JamfReports.totalNumberOfOneInstalledApps
# JamfReports.listofOneInstallApps

## After reporting, revoke current token
JamfReports.invalidateToken

Examples

listAllInstalledApps

Returns a sorted list of all apps installed across your fleet of computers in Jamf, from most-installed to least.

webBrowserReport

Returns a report of the web browsers installed across your fleet, such as:

  • Google Chrome.app
  • Google Chrome Canary.app
  • Firefox.app
  • Firefox Developer Edition.app
  • Safari.app
  • Safari Technology Preview.app
  • Microsoft Edge.app
  • Brave Browser.app
  • Arc.app
  • Opera.app
  • LinCastor Browser.app
  • LockDown Browser.app
  • Tor Browser.app
  • Vivaldi.app
  • DuckDuckGo.app

totalNumberOfOneInstalledApps

Returns a single metric showing the number of “one-off” or single-install apps across your fleet.

listofOneInstallApps

Returns a list of all the “one-off” or single-install apps across your fleet. This can be helpful when scoping Jamf uninstall policies.

Jamf Pro API All Applications Installed Report

This is a script for hosted Jamf Pro that returns all the installed apps across your fleet, sorted from most installed to least. Update the variables $jamfpro_url and $api_pw in the script for your environment. It returns results for up to 2000 computers; if you need more, you’ll have to add pagination.

The basic concept is to call the /api/v1/computers-inventory?section=APPLICATIONS endpoint, loop through and add all apps and versions into a hash (dictionary) and then print and sort the results.

It takes your Jamf username and password, converted to a base64 string that you put in the $api_pw variable, and uses that to generate a bearer token for the additional calls. The invalidateToken function at the very end revokes the bearer token. (A minimal sketch of this flow appears at the end of this section.)

To get the base64 string, run this in Terminal and paste the output into $api_pw:

printf "jamfUserName:JamfPassword" | iconv -t ISO-8859-1 | base64 -i -

This script is helpful for getting an understanding of all the apps installed in your fleet, including the most and least popular apps. I also use it to help determine which apps we should prioritize in patch management.

You can download the script from here

Note: Replace jamfUserName and JamfPassword with a user that has at least read-only privileges in Jamf. Make sure to keep the “:” between them.
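
For reference, here’s a minimal sketch of the script’s core flow, assuming the standard /api/v1/auth/token endpoint and the inventory endpoint described above (not the full script; error handling and pagination omitted):

#!/usr/bin/ruby
# Sketch of the token + inventory + tally flow.
require "net/http"
require "json"
require "uri"

jamfpro_url = "https://company.jamfcloud.com"
api_pw      = "your-base64-string-here"

# Trade the base64 basic-auth string for a bearer token.
uri = URI("#{jamfpro_url}/api/v1/auth/token")
req = Net::HTTP::Post.new(uri)
req["Authorization"] = "Basic #{api_pw}"
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
token = JSON.parse(res.body)["token"]

# Pull application inventory (single page; add pagination past 2000 computers).
uri = URI("#{jamfpro_url}/api/v1/computers-inventory?section=APPLICATIONS&page-size=2000")
req = Net::HTTP::Get.new(uri)
req["Authorization"] = "Bearer #{token}"
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }

# Tally every app name into a hash, then sort from most installed to least.
counts = Hash.new(0)
JSON.parse(res.body)["results"].each do |computer|
  (computer["applications"] || []).each { |app| counts[app["name"]] += 1 }
end
counts.sort_by { |_name, count| -count }.each { |name, count| puts "#{count}\t#{name}" }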

macOS Content Caching Metrics

Content caching is a service in macOS that speeds up downloading of software distributed by Apple, and of data that users store in iCloud, by saving content that local Apple devices have already downloaded. If you’ve enabled it in “System Preferences > Sharing > Content Caching”, you know it couldn’t be easier to set up. But have you ever wondered: what is content caching actually doing? Is it even working?

Luckily, Apple provides the AssetCacheManagerUtil command-line tool to manage content caching; its status option outputs some metrics.

AssetCacheManagerUtil status                                                          
2022-06-19 15:14:20.159 AssetCacheManagerUtil[95604:12021093] Content caching status:
    Activated: true
    Active: true
    ActualCacheUsed: 61.6 GB
    CacheDetails: (4)
        iCloud: 5.62 GB
        iOS Software: 7.54 GB
        Mac Software: 34.31 GB
        Other: 29.13 GB
    CacheFree: 110.83 GB
    CacheLimit: 250 GB
    CacheStatus: OK
    CacheUsed: 76.6 GB
    MaxCachePressureLast1Hour: 0%
    Parents: (none)
    Peers: (none)
    PersonalCacheFree: 110.83 GB
    PersonalCacheLimit: 250 GB
    PersonalCacheUsed: 5.62 GB
    Port: 56183
    PrivateAddresses: (1)
        10.0.1.102
    PublicAddress: xx.xx.xx.xx
    RegistrationStatus: 1
    RestrictedMedia: false
    ServerGUID: D2BE5986-6E42-41AC-845D-27A87C6EE6B6
    StartupStatus: OK
    TetheratorStatus: 0
    TotalBytesAreSince: 2022-05-10 12:10:15
    TotalBytesDropped: 7.3 MB
    TotalBytesImported: 1.72 GB
    TotalBytesReturnedToChildren: Zero KB
    TotalBytesReturnedToClients: 135.66 GB
    TotalBytesReturnedToPeers: Zero KB
    TotalBytesStoredFromOrigin: 59.88 GB
    TotalBytesStoredFromParents: Zero KB
    TotalBytesStoredFromPeers: Zero KB

This is a good start, but you might want to know: what does TotalBytesReturnedToClients mean? You can reference all the metrics in Apple’s Developer Docs, where we find that TotalBytesReturnedToClients is:

The amount of data, in bytes, that the content cache served to client iOS, macOS, and tvOS devices since the TotalBytesAreSince date.

This is progress. But the AssetCacheManagerUtil tool spits out one large JSON output. How can I pull individual metrics?

jq is a lightweight and flexible command-line JSON processor that can be easily installed with Homebrew.

brew install jq

And with the below command, you can parse the output for a single metric:


AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.TotalBytesReturnedToClients'
135658827669

But now you’ll notice the output isn’t nice and human-readable like it was before we parsed it: jq returns the raw value in bytes. You’ll need to convert it to MBs or GBs if you want.
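
The conversion is simple enough; in Ruby, for example:

# Bytes to GB (base-10, matching the tool's own human-readable output).
bytes = 135_658_827_669
puts((bytes / 1_000_000_000.0).round(2)) # => 135.66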

Now that we can use jq, we can pull the metrics we care about:


AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.Active'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.ActualCacheUsed'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheDetails.iCloud'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheDetails."iOS Software"'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheDetails."Mac Software"'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheDetails.Other'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheFree'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheUsed'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.CacheLimit'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.MaxCachePressureLast1Hour'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.TotalBytesReturnedToClients'
AssetCacheManagerUtil status -j 2>/dev/null | jq '.result.TotalBytesStoredFromOrigin'


Visualizing macOS Content Cache Metrics

I’m a big fan of Grafana and InfluxDB, so it only makes sense for me to send these metrics to InfluxDB so I can build a dashboard with Grafana. What’s nice about sending them to InfluxDB is that you don’t have to convert the bytes; that can be handled directly in Grafana while building the dashboard.

Here’s my script, written in Ruby: I run the AssetCacheManagerUtil command and send the data to InfluxDB.
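
The original script isn’t reproduced here, but a minimal sketch of the idea looks like this, assuming an InfluxDB 1.x instance and its line-protocol /write endpoint (the host and database names are placeholders):

#!/usr/bin/ruby
# Sketch: pull content-caching metrics and write them to InfluxDB 1.x.
require "net/http"
require "json"
require "uri"

status = JSON.parse(`AssetCacheManagerUtil status -j 2>/dev/null`)["result"]

# Build one line-protocol point on the "Caching" measurement.
fields = {
  "cacheused"              => status["CacheUsed"],
  "cachefree"              => status["CacheFree"],
  "bytesreturnedtoclients" => status["TotalBytesReturnedToClients"]
}.map { |k, v| "#{k}=#{v}" }.join(",")

uri = URI("http://influxdb.example:8086/write?db=caching") # placeholder host/db
Net::HTTP.post(uri, "Caching #{fields}")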

Then I create a dashboard in Grafana, add a new panel, and select Caching as the measurement. Here I’m selecting the variable cacheused from my Ruby script.

After everything is said and done, I have a whole dashboard and finally have some visibility into what the caching server is doing and how much bandwidth it’s saving me.

Download a package from a Jamf Pro Cloud Distribution Point

I ran into a new issue: I have a policy pushing a package, and I’m not clear on what it’s doing. What’s worse, the policy is pending or failed on a large number of machines. To confuse matters further, I have three different “Google Chrome” packages but no idea what the differences are.

I wanted to download the packages and open them with something like Suspicious Package to see what they are actually doing.

Since there’s no “Download” option in Jamf for your pkg, and I didn’t want to install the package just to figure out what it does, you can instead create a policy, cache the package, scope it to your machine, and then copy the package to your desktop.

  • Create a new policy
  • Add each package you want to download
  • Set the Action to Cache
  • I made the policy available in Self Service, so I wouldn’t have to wait for a check-in
  • Scope the policy to only your machine

After the packages are cached, open Terminal and switch to root. Change to the Waiting Room directory and verify the packages downloaded.

sh-3.2$ sudo su
sh-3.2# cd /Library/Application\ Support/JAMF/Waiting\ Room/
sh-3.2# ls -1

Now, let’s copy the packages (ignore the cache.xml files) to our desktop and change the owner to our account, so we can examine them.

#copy pkgs to your desktop (quote names that contain spaces)
cp "Google Chrome new.pkg" /Users/username/Desktop/
cp "Google Chrome.pkg" /Users/username/Desktop/
cp googlechrome.pkg /Users/username/Desktop/

#change the owner from root to your user account
chown username "/Users/username/Desktop/Google Chrome new.pkg"
chown username "/Users/username/Desktop/Google Chrome.pkg"
chown username /Users/username/Desktop/googlechrome.pkg

If all you needed to do was download the packages, you’re done! In my case, I want to see what they are installing, so I’ll open them with Suspicious Package.



This package, Google Chrome new.pkg, is installing v89.0.4389.90! Maybe this was new at some point, but it’s clearly poorly named now.

It turns out the package that was causing me the most issues was googlechrome.pkg, which was trying to install Chrome to the System volume instead of /Applications, causing it to fail on every client it was scoped to.

macOS: Enable Chrome Auto Updates For All Users

This script is meant to be run after installing Chrome for the first time on a machine, to enable auto-updates for all users. It creates the /Library/Google/GoogleSoftwareUpdate folder, including the TicketStore, then registers the system for auto-updates.

Before running, /Library/Google/GoogleSoftwareUpdate doesn’t exist


Run the script with sudo to install to the system-wide location at /Library/Google instead of the user location at ~/Library/Google.

 sudo chrome_enable_autoupdates.rb

After running, the folder and ticket store are created


To check settings after running:

  • Open Chrome
  • Click on Chrome in upper left menu bar
  • Click on “About Google Chrome”
  • Check update settings

The bulk of the work is done by the ksadmin command (the KSADMIN constant below), which registers a Keystone ticket and checks for updates. The full script can be found here.

def self.keystone_register
  command = [
    "'#{KSADMIN}'",
    "--register",
    "--productid",    "'#{product_id}'",
    "--version",      "'#{chrome_version}'",
    "--xcpath",       "'#{CHROMEPATH}'",
    "--url",          "'#{update_url}'",
    "--tag-path",     "'#{TAGPATH}'",
    "--tag-key",      "'#{TAGKEY}'",
    "--brand-path",   "'#{BRANDPATH}'",
    "--brand-key",    "'#{BRANDKEY}'",
    "--version-path", "'#{VERSIONPATH}'",
    "--version-key",  "'#{VERSIONKEY}'",
    "--system-store",
    "2>/dev/null"
  ]
  system(command.join(" "))
end

Jenkins Build Scripts For macOS 12 Monterey

I automate part of my macOS builds for new OS releases. I use a simple script (_version_check.sh) that I update with the latest release, e.g. 12.2.1. Then I commit it to my repo and fire a webhook to Jenkins. Jenkins takes the version number and checks whether it’s available via the "softwareupdate" command.

softwareupdate --fetch-full-installer --full-installer-version "$version_to_download"

If it is, it downloads the full installer app, “Install macOS Monterey.app”, to the Applications folder. When that succeeds, a second Jenkins job converts the .app to a .dmg. I archive the .dmg and use it in automated workflows with tools like MDS.
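
A minimal sketch of those two steps, where the version and paths are examples (hdiutil’s UDZO format is a compressed, read-only image):

#!/usr/bin/ruby
# Sketch of the fetch-and-convert steps; version and paths are examples.
version = "12.2.1"
app     = "/Applications/Install macOS Monterey.app"
dmg     = "/Users/Shared/Install_macOS_#{version}.dmg"

system("softwareupdate", "--fetch-full-installer", "--full-installer-version", version) \
  or abort "#{version} is not available via softwareupdate"

system("hdiutil", "create", "-volname", "Install macOS Monterey",
       "-srcfolder", app, "-ov", "-format", "UDZO", dmg) \
  or abort "dmg conversion failed"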

Scripts are on GitHub.

Showing Bluetooth Battery Percentage On Your Desktop

I use a tool called GeekTool (a fun tool that can be used in a lot of ways) to show the battery percentage of my Bluetooth-connected devices on my desktop. It makes it easier to keep an eye on them and know when they need charging.

Since switching to an Apple M1 laptop, my old script broke, so I took a little time to update it. I used to pull the information from system_profiler, but it seems that’s been removed. The ioreg command now lists the current battery percentage of any Bluetooth-connected devices:

ioreg -r -l -k "BatteryPercent"  

This might look like a lot of output, but we really just need to parse it to get what we want. The command below will give you the current battery percentage for an Apple Magic Keyboard and Magic Mouse. Other products can be listed too, but you’d have to run the ioreg command above and get the product name to swap into the command.

KeyboardPercent=$(ioreg -r -l -k "BatteryPercent" \
| grep -A 9 "Magic Keyboard with Touch ID" \
| grep "BatteryPercent" \
| awk '{print $3}')

MousePercent=$(ioreg -r -l -k "BatteryPercent" \
| grep -A 9 "Magic Mouse" \
| grep "BatteryPercent" \
| awk '{print $3}')

echo "Keyboard Battery Level: $KeyboardPercent"
echo "Mouse Battery Level: $MousePercent"

Once I have the command working, I update my GeekTool script. There’s a little more formatting here than just showing the percentage, and GeekTool is sensitive to whitespace. The command below is what I paste into the script editor to get the output in the screenshot.

#!/bin/zsh

# Bluetooth Keyboard -----------------------------------------------------------
KeyboardPercent=$(ioreg -r -l -k "BatteryPercent" \
| grep -A 9 "Magic Keyboard with Touch ID" \
| grep "BatteryPercent" \
| awk '{print $3}')

typeset -i b=5
echo "Keyboard:\t\t\c"

if [ "${KeyboardPercent:-0}" = 0 ]   # empty (disconnected) reads as 0
then
    echo "Disconnected\c"
else
    if [ $KeyboardPercent -lt 11 ]
    then
        echo "\033[1;31m\c"
    else
        echo "\033[0m\c"
    fi
    while [ $b -le $KeyboardPercent ]
    do
        echo "|\c"
        b=`expr $b + 5`
    done
    while [ $b -le 100 ]
    do
        echo "\033[1;37m|\033[0m\c"
        b=`expr $b + 5`
    done
    echo "\033[0m $KeyboardPercent%\c"
    unset KeyboardPercent
    unset b
fi

echo "\033[0m\nMouse:\t\t\t\c"

# Bluetooth Mouse --------------------------------------------------------------

MousePercent=$(ioreg -r -l -k "BatteryPercent" \
| grep -A 9 "Magic Mouse" \
| grep "BatteryPercent" \
| awk '{print $3}')

if [ "${MousePercent:-0}" = 0 ]   # empty (disconnected) reads as 0
then
    echo "Disconnected\c"
else
    if [ $MousePercent -lt 11 ]
    then
        echo "\033[1;31m\c"
    else
        echo "\033[0m\c"
    fi
    typeset -i b=5
    while [ $b -le $MousePercent ]
    do
        echo "|\c"
        b=`expr $b + 5`
    done
    while [ $b -le 100 ]
    do
        echo "\033[1;37m|\033[0m\c"
        b=`expr $b + 5`
    done
    echo "\033[0m $MousePercent%\c"
    unset MousePercent
    unset b
fi

Scripts can be found here

Monitor Scheduled Tasks on Windows Server

Scheduled Tasks are a great way to run scripts at specified time intervals, similar to Unix cron. Because it’s such a powerful tool, it’s easy for programs to abuse: many install their own scheduled tasks, most times without any transparency to the user that a task is being created. It’s also a potential target for malware to install persistent tasks that go unnoticed.

On managed servers, I want transparency into which tasks are installed. I created a PowerShell script with an array of “whitelisted” tasks, then created a scheduled task that checks the names of the installed tasks. The results are sent to InfluxDB, and if a task is not on my whitelist, an alert is sent with Grafana.

The PowerShell script and scheduled task can be found here.
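
The real script is PowerShell (linked above); purely to illustrate the comparison logic, here’s the same idea sketched in Ruby, with example task names:

# Whitelist comparison sketch; task names are examples.
WHITELIST = [
  "GoogleUpdateTaskMachineCore",
  "NightlyBackup"
].freeze

def classify(installed_tasks)
  unapproved = installed_tasks - WHITELIST
  {
    total:      installed_tasks.size,
    approved:   installed_tasks.size - unapproved.size,
    unapproved: unapproved  # ship to InfluxDB; Grafana alerts when non-empty
  }
end

puts classify(["GoogleUpdateTaskMachineCore", "SomeVendorUpdater"]).inspect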

When first installed, it reports the total number of scheduled tasks and the total number of non-Microsoft tasks. By default, I ignore the Microsoft tasks present on a fresh install.

In this scenario, you can see an alert has been triggered because there is an unapproved task; to the right, it shows the name of the task, User_Feed_ ...

Now you can log into the computer and check the task to decide whether it needs to be added to your whitelist or deleted.

Once the task has been deleted, the alarm is cleared and all tasks report as Approved.