Spectrometer usage statistics

A bash script we use to extract spectrometer usage statistics from the “raw” data.

What it does: it finds all raw spectra files (fid and ser) created within a given number of days, then extracts from the corresponding audit trail files the pairs of timestamps between which each spectrum was measured, checks that the measurement did not start before the desired time period (to exclude wrpa’d datasets), and records the duration of the measurement, the probehead, and the user (the latter two are extracted only for the most recent run, though). The results are saved to a CSV file, which can be further processed in Excel.
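
For orientation, the CSV columns are Start Time, End Time, Duration (sec), Directory, Probe ID, User (the header the script adds at the end). A row could look like this — all values below are made up, purely for illustration:

2024-03-01 10:15:30, 2024-03-01 11:02:05, 2795, /opt/nmrdata/alice/nmr/sucrose/10, 5 mm PABBO BB-1H/D Z-GRD, alice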

We use it to cross-check with the reservation system records. It works relatively well on our workstations (all CentOS 7) with TopSpin 3.5.7 and 4.1.4.

To do: add popt detection

Suggestions and corrections are welcome

#!/bin/bash

# Get user input

# Check if number of days is provided as a parameter, otherwise prompt the user
if [[ -z "$1" ]]; then
    read -p "Enter the number of days: " days
else
    days="$1"
fi

# path="/opt/nmrdata/"
# or check if path is provided as a parameter, otherwise prompt the user
if [[ -z "$2" ]]; then
    read -p "Enter the path to search: " path
else
    path="$2"
fi

# Validate input
if [[ ! -d "$path" ]]; then
    echo "Error: The path '$path' does not exist."
    exit 1
fi

if ! [[ "$days" =~ ^[0-9]+$ ]]; then
    echo "Error: Number of days must be a valid integer."
    exit 1
fi

# Output file with unique name including hostname
output_file="summary-$(uname -n)-$(date +%Y-%m-%d_%H-%M-%S)-last-$days-days.csv"
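# e.g. summary-nmr500-2024-03-01_10-15-30-last-30-days.csv (hostname and timestamp here are illustrative)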

# Initialize the output file
> "$output_file"

# Convert input days to a cutoff date (in epoch for comparison)
cutoff_date_epoch=$(date -d "-$days days" +%s)

# Find all spectra files created within the specified number of days
files=$(find "$path" -type f -mtime -$days \( -name 'fid' -o -name 'ser' \))

# Check if any files were found
if [[ -z "$files" ]]; then
    echo "No data found in the specified path."
    exit 0
fi

# Convert timestamps to a unified format
convert_timestamp() {
    local timestamp=$1
    date -d "${timestamp//[<>]/ }" +"%Y-%m-%d %H:%M:%S"
}

# Regex for matching timestamps (Perl-compatible regex)
timestamp_regex='\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\.\d{3}[ ]?[\+\-]\d{4}'
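# e.g. it matches strings such as "2019-05-21 14:21:26.272 +0200" (illustrative value
# in the format TopSpin writes to audita.txt; any surrounding <...> is not captured)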

# Set IFS to handle filenames with spaces
IFS=$'\n'

# Process each file
for file in $files; do
    # Get the directory path
    # echo "Found file: ${file}"
    dir_path=$(dirname "$(realpath "$file")")
    echo "Processing directory: $dir_path"
    prev_timestamp=""

    while IFS= read -r line; do
        # Check if the line contains a valid timestamp
        if echo "$line" | grep -qP "$timestamp_regex"; then
            current_timestamp=$(echo "$line" | grep -oP "$timestamp_regex")

            # Check if the line contains "started at"
            if [[ $line =~ "started at" ]]; then
                start_timestamp=$current_timestamp
                unified_start_ts=$(convert_timestamp "$start_timestamp")
                unified_prev_ts=$(convert_timestamp "$prev_timestamp")

                # Compare timestamps using epoch values
                start_epoch=$(date -d "$start_timestamp" +%s)
                end_epoch=$(date -d "$prev_timestamp" +%s)

                if [[ $start_epoch -gt $cutoff_date_epoch ]]; then
                    duration=$((end_epoch - start_epoch))
                    probe=$(grep -o '##$PROBHD= .*' "$dir_path"/acqus | cut -d" " -f2-)
                    user=$(grep -o '##OWNER= .*' "$dir_path"/acqus | cut -d" " -f2-)
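                    # In acqus these are JCAMP-DX entries looking roughly like this
                    # (illustrative values):
                    #   ##$PROBHD= <5 mm PABBO BB-1H/D Z-GRD>
                    #   ##OWNER= alice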
                    # Write to the output file, stripping commas, angle brackets, and carriage returns
                    printf "%s\n" "$unified_start_ts, $unified_prev_ts, $duration, ${dir_path//,/}, ${probe//[<>.\r]/}, ${user//$'\r'/}" >> "$output_file"
                fi

                # Reset timestamps
                start_timestamp=""
                prev_timestamp=""
            else
                prev_timestamp=$current_timestamp
            fi
        fi
    done < "$dir_path"/audita.txt
done

unset IFS

# Sort and remove duplicates
sort -u -t ',' -k2,2 "$output_file" > "${output_file}.sorted"
mv "${output_file}.sorted" "$output_file"

# Add header
sed -i '1s/^/Start Time, End Time, Duration (sec), Directory, Probe ID, User\n/' "$output_file"

echo "Script finished. Results saved to '$output_file'"