Generating a workflow using omicron-process

PyOmicron provides the omicron-process command-line executable, designed for creating and managing HTCondor workflows.

The only hard requirement for running omicron-process is a configuration file. Once you have one, you can process the most recent chunk of data automatically with:

$ omicron-process <group> --config-file <config-file>

where <config-file> is the name of your configuration file, and <group> is the name of the section inside that file that configures this workflow.
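Before launching a workflow, it can help to check that the group name really exists in the configuration file. The following is a minimal sketch using Python's standard configparser module; it is not part of PyOmicron, the helper name parse_group is hypothetical, and the sample options are a subset of the example configuration printed in the --help message.

```python
import configparser

# Sample configuration text; values copied from the example in the --help output.
EXAMPLE_CONFIG = """
[GW]
q-range = 3.3166 150
frametype = H1_HOFT_C00
state-flag = H1:DMT-CALIBRATED:1
chunk-duration = 124
segment-duration = 64
overlap-duration = 4
channels = H1:GDS-CALIB_STRAIN
"""


def parse_group(config_text, group):
    """Return the options of one group (INI section) as a dict of strings.

    Hypothetical helper for illustration; raises ValueError if the
    requested group is not a section of the configuration.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(group):
        raise ValueError(
            f"no [{group}] section found; available: {parser.sections()}"
        )
    return dict(parser[group])


params = parse_group(EXAMPLE_CONFIG, "GW")
print(params["channels"])
```

For a file on disk, parser.read(path) can be used in place of read_string.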

Warning

By default, this command will automatically generate the workflow as an HTCondor DAG and submit it via condor_submit_dag. To generate the workflow but not submit the DAG, add the --no-submit option on the command line.

Note

By default, omicron-process won’t output very much to the screen, so it can be useful to supply --verbose --verbose (i.e. --verbose twice) to get the DEBUG logging statements. This will provide a running progress report for the workflow, which can be very informative.

Warning

omicron-process will complain loudly if it can’t find omicron.exe on the path somewhere.

You can either specify --executable manually on the command line or, if you are working in the detchar account on the LIGO Data Grid, you can enable the standard conda environment for omicron with the following command:

conda_omicron

The other alternative is to install omicron into your conda environment with:

mamba install omicron

Details of the workflow

The omicron-process executable will do the following:

  • find the relevant time segments to process (if state-flag or state-channel has been defined in the configuration),

  • find the frame files containing the data (using gw_data_find),

  • build a Directed Acyclic Graph (DAG) defining the workflow.

The DAG will normally do something like this:

  1. process raw data using omicron.exe

  2. merge contiguous output files with .root, .h5, and .xml extensions

  3. gzip .xml files to save space

  4. copy the merged files to the archive directory, nominally /home/detchar/triggers/<ifo>/<channel-filetag>/<metric day>

  5. delete the trigger and log files, if everything completed successfully
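The ordering of these steps can be pictured as a small dependency graph. The sketch below is only an illustration using Python's standard graphlib module (Python 3.9+); the step labels are hypothetical shorthand for the five items above, and the real DAG written by omicron-process contains many jobs per step.

```python
from graphlib import TopologicalSorter

# Each (hypothetical) step label maps to the set of steps it must wait for.
dependencies = {
    "merge": {"omicron"},    # merging needs the raw omicron.exe output
    "gzip": {"merge"},       # .xml files are compressed after merging
    "archive": {"gzip"},     # merged files are copied once compressed
    "cleanup": {"archive"},  # files are deleted only after archiving
}

# A topological sort yields an execution order that respects every dependency.
order = list(TopologicalSorter(dependencies).static_order())
print(" -> ".join(order))
```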

[Figure: typical DAG diagram]

Note

The workflow breaks the interval into chunks whose length is set by the chunkdur parameter in the configuration file; this length includes the padding. The nominal value is 120 seconds. As a result, a single interval is processed by multiple HTCondor jobs.
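To get a feel for how many chunks an interval produces, the following is a rough sketch and not PyOmicron's actual segmentation code. The function name split_into_chunks is hypothetical, and it assumes that consecutive chunks share `overlap` seconds and that the final chunk is simply truncated at the end of the interval.

```python
def split_into_chunks(start, end, chunkdur=120, overlap=0):
    """Split [start, end) into chunks of at most `chunkdur` seconds.

    Assumption for illustration: each chunk starts `chunkdur - overlap`
    seconds after the previous one, and the last chunk is cut at `end`.
    """
    if not 0 <= overlap < chunkdur:
        raise ValueError("overlap must satisfy 0 <= overlap < chunkdur")
    stride = chunkdur - overlap
    chunks = []
    t = start
    # Stop once the previous chunk has already reached the end of the interval.
    while t < end - overlap:
        chunks.append((t, min(t + chunkdur, end)))
        t += stride
    return chunks


chunks = split_into_chunks(0, 360)
print(len(chunks), chunks)
```

With the nominal 120-second chunks and no overlap, a 360-second interval gives three chunks; any overlap increases the count.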

Archiving multiple workflows

Optionally, you can specify the --archive option to copy files from the run directory into a structured archive under ~/triggers/. Each file is re-located as follows:

~/triggers/{IFO}/{filetag}/{gps5}/{filename}

where the path components are as follows:

  • {IFO} is the two-character interferometer prefix for the raw data channel (e.g. L1),

  • {filetag} is an underscore-delimited tag including the rest of the channel name and OMICRON (e.g. GDS_CALIB_STRAIN_OMICRON),

  • {gps5} is the 5-digit GPS epoch for the start time of the file (e.g. 12345 if the file starts at GPS 1234567890),

  • {filename} is the T050017-compatible name, which will be of the form {IFO}-{filetag}-<gpsstart>-<duration>.<ext>.

For example:

~/triggers/L1/GDS_CALIB_STRAIN_OMICRON/12345/L1-GDS_CALIB_STRAIN_OMICRON-1234567890-100.xml.gz
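The layout above can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not PyOmicron's own implementation: the function name archive_path is hypothetical, the file tag is assumed to be the channel name (after the IFO prefix) with hyphens replaced by underscores plus _OMICRON, and {gps5} is taken as the first five digits of the integer GPS start time.

```python
def archive_path(channel, gpsstart, duration, ext="xml.gz", base="~/triggers"):
    """Build the archive location for one trigger file.

    Hypothetical helper: `channel` is a full channel name such as
    "L1:GDS-CALIB_STRAIN"; the tag and epoch rules are assumptions
    inferred from the description of the path components.
    """
    ifo, rest = channel.split(":", 1)
    # Underscore-delimited tag: rest of the channel name, then OMICRON.
    filetag = rest.replace("-", "_") + "_OMICRON"
    # 5-digit GPS epoch: leading five digits of the start time.
    gps5 = str(int(gpsstart))[:5]
    # T050017-style name: {IFO}-{filetag}-<gpsstart>-<duration>.<ext>
    filename = f"{ifo}-{filetag}-{int(gpsstart)}-{int(duration)}.{ext}"
    return f"{base}/{ifo}/{filetag}/{gps5}/{filename}"


path = archive_path("L1:GDS-CALIB_STRAIN", 1234567890, 100)
print(path)
# ~/triggers/L1/GDS_CALIB_STRAIN_OMICRON/12345/L1-GDS_CALIB_STRAIN_OMICRON-1234567890-100.xml.gz
```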

Processing a specific time interval

To process a specific time interval, use the --gps option on the command line:

$ omicron-process <group> --config-file <config-file> --gps <gpsstart> <gpsend>

where <gpsstart> and <gpsend> are your two GPS times.

Note

You can also give the GPS arguments as date strings, in quotes, as follows:

$ omicron-process <group> --config-file <config-file> --gps "Jan 1" "Jan 2"

Additionally, when using --gps, you can specify --cache-file to submit your own LAL-formatted data cache file:

$ omicron-process <group> --config-file <config-file> --gps <gpsstart> <gpsend> --cache-file /path/to/cache.lcf
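When omicron-process is driven from a script, the command line can be assembled programmatically. The sketch below is a hypothetical wrapper (build_command is not part of PyOmicron) that uses only the options described on this page and only builds the argument list; it does not run anything.

```python
import shlex


def build_command(group, config_file, gps=None, cache_file=None,
                  no_submit=False, verbose=0):
    """Assemble an omicron-process argument list (hypothetical helper).

    `gps` is an optional (start, end) pair of GPS times or date strings.
    """
    cmd = ["omicron-process", group, "--config-file", str(config_file)]
    if gps is not None:
        start, end = gps
        cmd += ["--gps", str(start), str(end)]
    if cache_file is not None:
        cmd += ["--cache-file", str(cache_file)]
    if no_submit:
        # Generate the DAG without submitting it to HTCondor.
        cmd.append("--no-submit")
    # Repeat --verbose to raise the logging level (twice gives DEBUG).
    cmd += ["--verbose"] * verbose
    return cmd


cmd = build_command("GW", "./config.ini", gps=("Jan 1", "Jan 2"),
                    no_submit=True, verbose=2)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True); keep in mind that without --no-submit this submits the DAG.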

More help

For detailed documentation of all command-line options and arguments, print the --help message:

$ omicron-process --help
usage: omicron-process [-h] [-V] [-t GPSTIME GPSTIME] [-f CONFIG_FILE]
                       [-i IFO] [-v] [-o OUTPUT_DIR] [-a] [-g FILE_TAG]
                       [-l LOG_FILE] [-C MAX_CHUNKS_PER_JOB]
                       [-N MAX_CHANNELS_PER_JOB]
                       [--max-online-lookback MAX_ONLINE_LOOKBACK]
                       [--max-concurrent MAX_CONCURRENT] [-x EXCLUDE_CHANNEL]
                       [--reattach | --rescue | --no-submit]
                       [--universe {vanilla,local}] [--executable EXECUTABLE]
                       [--condor-retry CONDOR_RETRY]
                       [--condor-accounting-group CONDOR_ACCOUNTING_GROUP]
                       [--condor-accounting-group-user CONDOR_ACCOUNTING_GROUP_USER]
                       [--condor-request-disk CONDOR_REQUEST_DISK]
                       [--submit-rescue-dag SUBMIT_RESCUE_DAG]
                       [-c "key=value"] [-d "opt | opt=value"]
                       [--cache-file FILE | --use-dev-shm] [--no-segdb]
                       [--skip-omicron] [--skip-root-merge]
                       [--skip-hdf5-merge] [--skip-ligolw_add] [--skip-gzip]
                       [--skip-postprocessing] [--skip-rm]
                       group

Process LIGO data using the Omicron event trigger generator (ETG)

This utility can be used to process one or more channels or LIGO data using
Omicron with minimal manual labour in determining segments, finding data,
and configuring HTCondor.

The input to this should be an INI-format configuration file that lists the
processing parameters and channels that pass to Omicron, something like:

```ini
[GW]
q-range = 3.3166 150
frequency-range = 4.0 8192.0
frametype = H1_HOFT_C00
state-flag = H1:DMT-CALIBRATED:1
sample-frequency = 16384
chunk-duration = 124
segment-duration = 64
overlap-duration = 4
mismatch-max = 0.2
snr-threshold = 5
channels = H1:GDS-CALIB_STRAIN
```

The above 'GW' group name should then be passed to `omicron-process` along
with any customisations available from the command line, e.g.

```
omicron-process GW --config-file ./config.ini
```

By default `omicron-process` will look at the most recent data available
('online' mode), to run in 'offline' mode, pass the `--gps` argument

```
omicron-process GW --config-file ./config.ini --gps <gpsstart> <gpsstop>
```

The output of `omicron-process` is a Directed Acyclic Graph (DAG) that is
*automatically* submitted to condor for processing.

Positional arguments:
  group                 name of configuration group to process

Optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -t GPSTIME GPSTIME, --gps GPSTIME GPSTIME
                        GPS times for offline processing
  -f CONFIG_FILE, --config-file CONFIG_FILE
                        path to configuration file (default: None)
  -i IFO, --ifo IFO     IFO prefix to process (default: None)
  -v, --verbose         print verbose output, give more times for more verbose
                        output

Output options:
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
                        path to output directory (default: /home/docs/checkout
                        s/readthedocs.org/user_builds/pyomicron/checkouts/late
                        st/docs)
  -a, --archive         archive created files under /home/docs/triggers
                        (default: False)
  -g FILE_TAG, --file-tag FILE_TAG
                        additional file tag to be appended to final file
                        descriptions
  -l LOG_FILE, --log-file LOG_FILE
                        save a copy of all logger messages to this file

Processing options:
  -C MAX_CHUNKS_PER_JOB, --max-chunks-per-job MAX_CHUNKS_PER_JOB
                        maximum number of chunks to process in a single condor
                        job (default: 4)
  -N MAX_CHANNELS_PER_JOB, --max-channels-per-job MAX_CHANNELS_PER_JOB
                        maximum number of channels to process in a single
                        condor job (default: 20)
  --max-online-lookback MAX_ONLINE_LOOKBACK
                        With no immediately previous run, or one that was long
                        ago this is the max time of an online job. Default:
                        1200
  --max-concurrent MAX_CONCURRENT
                        Max omicron jobs run at one time [64]
  -x EXCLUDE_CHANNEL, --exclude-channel EXCLUDE_CHANNEL
                        exclude channel from the analysis (can be given
                        multiple times)

Condor options:
  --reattach            if DAG already running, try and reattach to it and
                        follow it's progress, this is only designed for online
                        running
  --rescue              rescue a failed DAG instead of creating a new one
                        (default: False)
  --no-submit           do not submit the DAG to condor (default: False)
  --universe {vanilla,local}
                        condor universe (default: vanilla)
  --executable EXECUTABLE
                        omicron executable (default: /home/docs/checkouts/read
                        thedocs.org/user_builds/pyomicron/conda/latest/bin/omi
                        cron)
  --condor-retry CONDOR_RETRY
                        number of times to retry each job if failed (default:
                        2)
  --condor-accounting-group CONDOR_ACCOUNTING_GROUP
                        accounting_group for condor submission on the LIGO
                        Data Grid (default:
                        ligo.prod.o4.detchar.transient.omicron)
  --condor-accounting-group-user CONDOR_ACCOUNTING_GROUP_USER
                        accounting_group_user for condor submission on the
                        LIGO Data Grid (default: docs)
  --condor-request-disk CONDOR_REQUEST_DISK
                        Required LIGO argument: local disk use (default: 50G)
  --submit-rescue-dag SUBMIT_RESCUE_DAG
                        number of times to automatically submit the rescue DAG
                        (default: 0)
  -c "key=value", --condor-command "key=value"
                        Extra commands to add to the HTCondor submit files,
                        can be given multiple times
  -d "opt | opt=value", --dagman-option "opt | opt=value"
                        Extra options to pass to condor_submit_dag as "-{opt}
                        [{value}]". Can be given multiple times (default:
                        ['force', '-import_env'])

Data options:
  --cache-file FILE     use frame locations from FILE
  --use-dev-shm         use low-latency frame buffer in /dev/shm (default:
                        False)
  --no-segdb            don't use the segment database for state determination
                        (default: False)

Pipeline options:
  --skip-omicron        skip running omicron (default: False)
  --skip-root-merge     skip running omicron-root-merge (default: False)
  --skip-hdf5-merge     skip running omicron-hdf5-merge (default: False)
  --skip-ligolw_add     skip running ligolw_add (default: False)
  --skip-gzip           skip running gzip (default: False)
  --skip-postprocessing
                        skip all post-processing, equivalent to --skip-root-
                        merge --skip-hdf5-merge --skip-ligolw_add --skip-gzip
                        (default: False)
  --skip-rm             Do not remove all the trigger files created by the
                        job.Useful for debugging(default: False)

This source code for this project is available here:

https://github.com/gwpy/pyomicron/

All issues regarding this software should be raised using the GitHub web
interface, bug reports and feature requests are encouraged.

Documentation is available here:

https://pyomicron.readthedocs.io/en/latest/