Generating a workflow using omicron-process
PyOmicron provides the omicron-process command-line executable, designed for creating and managing HTCondor workflows.
The only hard requirement for running omicron-process is a configuration file. Once you have one, you can run automatically over the most recent chunk of data with:
$ omicron-process <group> --config-file <config-file>
where <config-file> is the name of your configuration file, and <group> is the name of the section inside that file that configures this workflow.
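Before running, it can help to confirm which group names a configuration file actually defines. The following is a minimal sketch using Python's standard configparser; it is not part of PyOmicron, the helper names are hypothetical, and omicron-process's own parsing may differ. The sample values are taken from the example shown in the --help message.

```python
import configparser

# Sample configuration, taken from the example in the --help message.
SAMPLE_CONFIG = """
[GW]
q-range = 3.3166 150
frequency-range = 4.0 8192.0
frametype = H1_HOFT_C00
state-flag = H1:DMT-CALIBRATED:1
chunk-duration = 124
segment-duration = 64
overlap-duration = 4
channels = H1:GDS-CALIB_STRAIN
"""


def list_groups(text):
    """Return the section (group) names found in INI-format text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.sections()


def group_channels(text, group):
    """Return the whitespace-separated channel names listed for one group."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser[group]["channels"].split()


if __name__ == "__main__":
    print(list_groups(SAMPLE_CONFIG))
    print(group_channels(SAMPLE_CONFIG, "GW"))
```

Any name returned by list_groups is a candidate for the <group> argument.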
Warning
By default, this command will automatically generate the workflow as an
HTCondor DAG and submit it via condor_submit_dag.
To generate the workflow but not submit the DAG, add the --no-submit
option on the command line.
Note
By default, omicron-process won’t output very much to the screen, so it
can be useful to supply --verbose --verbose (i.e. --verbose twice)
to get the DEBUG logging statements. This will provide a running
progress report for the workflow, which can be very informative.
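When omicron-process is driven from a script, the options described above can be assembled programmatically. This is a minimal sketch, assuming only the flags named in this document (--config-file, --gps, --no-submit, --verbose); the build_command helper is hypothetical and only builds the argument list, it does not run anything.

```python
import shlex


def build_command(group, config_file, gps=None, submit=True, verbosity=0):
    """Assemble an omicron-process argument list without running it."""
    cmd = ["omicron-process", group, "--config-file", config_file]
    if gps is not None:
        start, end = gps
        cmd += ["--gps", str(start), str(end)]
    if not submit:
        # generate the DAG but do not hand it to condor_submit_dag
        cmd.append("--no-submit")
    # each repetition of --verbose raises the logging level
    cmd += ["--verbose"] * verbosity
    return cmd


if __name__ == "__main__":
    cmd = build_command("GW", "./config.ini", submit=False, verbosity=2)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell-quoting problems with date-string GPS arguments.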
Warning
omicron-process will complain loudly if it can’t find omicron.exe
on the path somewhere.
You can either specify --executable manually
on the command line, or, if you are working in the detchar account on the LIGO Data Grid,
you can enable the standard conda environment for omicron with the following command:
conda_omicron
The other alternative is to install omicron into your conda environment with:
mamba install omicron
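To check in advance whether an Omicron executable is discoverable, a small lookup with the standard library is enough. This is a sketch only: the candidate names below are assumptions (this page mentions omicron.exe, while the --help default path ends in omicron), and the way omicron-process itself searches may differ.

```python
import shutil


def find_omicron(candidates=("omicron", "omicron.exe")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    exe = find_omicron()
    if exe is None:
        print("no Omicron executable on PATH; pass --executable explicitly")
    else:
        print(f"found {exe}")
```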
Details of the workflow
The omicron-process executable will do the following:
- find the relevant time segments to process (if state-flag or state-channel has been defined in the configuration),
- find the frame files containing the data (using gw_data_find),
- build a Directed Acyclic Graph (DAG) defining the workflow.
The DAG will normally do something like this:
- process raw data using omicron.exe
- merge contiguous output files with .root, .h5, and .xml extensions
- gzip .xml files to save space
- the merged files are copied to the archive directory, nominally /home/detchar/triggers/<ifo>/<channel-filetag>/<metric day>
- if everything completes successfully, trigger and log files are deleted
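The "merge contiguous output files" step depends on knowing which files abut in time. The sketch below shows one way to group files into contiguous runs from their names alone. It is an illustration, not the PyOmicron implementation: it assumes names of the form <obs>-<tag>-<gpsstart>-<duration>.<ext> with integer start and duration, and the helper names are hypothetical.

```python
import re
from pathlib import PurePosixPath

# <obs>-<tag>-<gpsstart>-<duration>.<ext>, with integer start and duration
NAME_RE = re.compile(
    r"^(?P<obs>[^-]+)-(?P<tag>[^-]+)-(?P<start>\d+)-(?P<dur>\d+)\.(?P<ext>.+)$"
)


def parse_name(path):
    """Return (gpsstart, duration) parsed from a file name."""
    match = NAME_RE.match(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"unrecognised file name: {path}")
    return int(match["start"]), int(match["dur"])


def contiguous_groups(paths):
    """Group files so each file in a group starts where the previous one ends."""
    groups = []
    prev_end = None
    for path in sorted(paths, key=lambda p: parse_name(p)[0]):
        start, dur = parse_name(path)
        if groups and start == prev_end:
            groups[-1].append(path)  # abuts the previous file
        else:
            groups.append([path])  # gap (or first file): start a new group
        prev_end = start + dur
    return groups
```

Each returned group is a candidate for merging into a single file spanning the whole run.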
Note
The workflow will break the interval into chunks whose length is set by the
chunkdur parameter in the configuration file; this length includes the padding.
The nominal value is 120 sec. This results in multiple HTCondor jobs.
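The effect of chunking on the number of jobs can be illustrated with a short sketch. This is not the PyOmicron algorithm: it assumes, for illustration only, that consecutive chunks overlap by a fixed number of seconds and that the final chunk is truncated at the end of the interval. The function name and parameters are hypothetical.

```python
def split_into_chunks(start, end, chunk, overlap):
    """Split [start, end) into chunks of at most `chunk` seconds.

    ASSUMPTION (illustration only): consecutive chunks overlap by
    `overlap` seconds and the last chunk is truncated at `end`.
    """
    if chunk <= overlap:
        raise ValueError("chunk must be longer than overlap")
    step = chunk - overlap
    chunks = []
    t = start
    # stop once only the overlap region (or nothing) would remain
    while t + overlap < end:
        chunks.append((t, min(t + chunk, end)))
        t += step
    return chunks


if __name__ == "__main__":
    # values from the example configuration: chunk-duration 124, overlap 4
    for chunk in split_into_chunks(0, 364, 124, 4):
        print(chunk)
```

Under these assumptions a 364-second interval yields three chunks, each of which could become part of a separate job.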
Archiving multiple workflows
Optionally, you can specify the --archive option to copy files from the run directory into a structured archive under ~/triggers/. Each file is re-located as follows:
~/triggers/{IFO}/{filetag}/{gps5}/{filename}
where the path components are as follows:
- {IFO} is the two-character interferometer prefix for the raw data channel (e.g. L1),
- {filetag} is an underscore-delimited tag including the rest of the channel name and OMICRON (e.g. GDS_CALIB_STRAIN_OMICRON),
- {gps5} is the 5-digit GPS epoch for the start time of the file, e.g. 12345 if the file starts at GPS 1234567890,
- {filename} is the T050017-compatible name, which will be of the form {IFO}-{filetag}-<gpsstart>-<duration>.<ext>
e.g.:
~/triggers/L1/GDS_CALIB_STRAIN_OMICRON/12345/L1-GDS_CALIB_STRAIN_OMICRON-1234567890-100.xml.gz
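The path scheme above can be reproduced in a few lines, which is handy when locating archived triggers from a script. This is a sketch under stated assumptions, not PyOmicron's own code: it assumes hyphens in the channel name become underscores in the file tag and that {gps5} is the first five digits of the GPS start time. The archive_path helper is hypothetical.

```python
from pathlib import PurePosixPath


def archive_path(channel, gpsstart, duration, ext, base="~/triggers"):
    """Build the archive location for one trigger file.

    ASSUMPTIONS: hyphens in the channel name become underscores in the
    file tag, and {gps5} is the first five digits of the GPS start time.
    """
    ifo, name = channel.split(":", 1)
    filetag = name.replace("-", "_") + "_OMICRON"
    gps5 = str(int(gpsstart))[:5]
    filename = f"{ifo}-{filetag}-{int(gpsstart)}-{int(duration)}.{ext}"
    return PurePosixPath(base) / ifo / filetag / gps5 / filename


if __name__ == "__main__":
    print(archive_path("L1:GDS-CALIB_STRAIN", 1234567890, 100, "xml.gz"))
```

The leading ~ is left unexpanded here; use os.path.expanduser on the result to get an absolute path.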
Processing a specific time interval
If you have a specific time interval that you’re most interested in, you will need to use the --gps option on the command line:
$ omicron-process <group> --config-file <config-file> --gps <gpsstart> <gpsend>
where <gpsstart> and <gpsend> are your two GPS times.
Note
You can also give the GPS arguments as date strings, in quotes, as follows
$ omicron-process <group> --config-file <config-file> --gps "Jan 1" "Jan 2"
Additionally, when using --gps, you can specify --cache-file to submit your own LAL-formatted data cache file:
$ omicron-process <group> --config-file <config-file> --gps <gpsstart> <gpsend> --cache-file /path/to/cache.lcf
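A cache file can be written by hand from a list of frame files. The sketch below assumes the common LAL cache layout of one entry per line, with five space-separated fields: observatory, description, GPS start, duration, and a file URL. It also assumes absolute paths to T050017-named frame files. The directory in the example is hypothetical, and the exact format omicron-process accepts should be checked against its documentation.

```python
from pathlib import PurePosixPath


def cache_line(frame_path):
    """Format one cache entry: observatory, description, start, duration, URL.

    ASSUMPTION: `frame_path` is an absolute path whose file name follows
    <obs>-<description>-<gpsstart>-<duration>.<ext>.
    """
    path = PurePosixPath(frame_path)
    obs, desc, start, rest = path.name.split("-", 3)
    duration = rest.split(".", 1)[0]
    return f"{obs} {desc} {start} {duration} file://localhost{path}"


def write_cache(frame_paths, cache_file):
    """Write one cache entry per frame file to `cache_file`."""
    with open(cache_file, "w") as handle:
        for frame in frame_paths:
            handle.write(cache_line(frame) + "\n")


if __name__ == "__main__":
    # hypothetical frame location, for illustration only
    print(cache_line("/data/frames/H-H1_HOFT_C00-1234567890-4096.gwf"))
```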
More help
For detailed documentation of all command-line options and arguments, print the --help message:
$ omicron-process --help
usage: omicron-process [-h] [-V] [-t GPSTIME GPSTIME] [-f CONFIG_FILE]
[-i IFO] [-v] [-o OUTPUT_DIR] [-a] [-g FILE_TAG]
[-l LOG_FILE] [-C MAX_CHUNKS_PER_JOB]
[-N MAX_CHANNELS_PER_JOB]
[--max-online-lookback MAX_ONLINE_LOOKBACK]
[-x EXCLUDE_CHANNEL]
[--reattach | --rescue | --no-submit]
[--universe {vanilla,local}] [--executable EXECUTABLE]
[--conda-env CONDA_ENV] [--condor-retry CONDOR_RETRY]
[--max-concurrent MAX_CONCURRENT]
[--condor-accounting-group CONDOR_ACCOUNTING_GROUP]
[--condor-accounting-group-user CONDOR_ACCOUNTING_GROUP_USER]
[--condor-request-disk CONDOR_REQUEST_DISK]
[--submit-rescue-dag SUBMIT_RESCUE_DAG]
[-c "key=value"] [-d "opt | opt=value"]
[--auth-type {x509,igwn,scitoken}]
[--cache-file FILE | --use-dev-shm] [--no-segdb]
[--skip-omicron] [--skip-root-merge]
[--skip-hdf5-merge] [--skip-ligolw_add] [--skip-gzip]
[--skip-postprocessing] [--skip-rm]
group
Process LIGO data using the Omicron event trigger generator (ETG)
This utility can be used to process one or more channels or LIGO data using
Omicron with minimal manual labour in determining segments, finding data,
and configuring HTCondor.
The input to this should be an INI-format configuration file that lists the
processing parameters and channels that pass to Omicron, something like:
```ini
[GW]
q-range = 3.3166 150
frequency-range = 4.0 8192.0
frametype = H1_HOFT_C00
state-flag = H1:DMT-CALIBRATED:1
sample-frequency = 16384
chunk-duration = 124
segment-duration = 64
overlap-duration = 4
mismatch-max = 0.2
snr-threshold = 5
channels = H1:GDS-CALIB_STRAIN
```
The above 'GW' group name should then be passed to `omicron-process` along
with any customisations available from the command line, e.g.
```
omicron-process GW --config-file ./config.ini
```
By default `omicron-process` will look at the most recent data available
('online' mode), to run in 'offline' mode, pass the `--gps` argument
```
omicron-process GW --config-file ./config.ini --gps <gpsstart> <gpsstop>
```
The output of `omicron-process` is a Directed Acyclic Graph (DAG) that is
*automatically* submitted to condor for processing.
Positional arguments:
group name of configuration group to process default=all
groups
Optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-t GPSTIME GPSTIME, --gps GPSTIME GPSTIME
GPS times or date/time for offline processing
-f CONFIG_FILE, --config-file CONFIG_FILE
path to configuration file (default: None)
-i IFO, --ifo IFO IFO prefix to process (default: None)
-v, --verbose print verbose output, give more times for more verbose
output
Output options:
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
path to output directory (default: /home/docs/checkout
s/readthedocs.org/user_builds/pyomicron/checkouts/stab
le/docs)
-a, --archive archive created files under /home/docs/triggers
(default: False)
-g FILE_TAG, --file-tag FILE_TAG
additional file tag to be appended to final file
descriptions
-l LOG_FILE, --log-file LOG_FILE
save a copy of all logger messages to this file
Processing options:
-C MAX_CHUNKS_PER_JOB, --max-chunks-per-job MAX_CHUNKS_PER_JOB
maximum number of chunks to process in a single condor
job (default: 4)
-N MAX_CHANNELS_PER_JOB, --max-channels-per-job MAX_CHANNELS_PER_JOB
maximum number of channels to process in a single
condor job (default: 20)
--max-online-lookback MAX_ONLINE_LOOKBACK
With no immediately previous run, or one that was long
ago this is the max time of an online job. Default:
1800
-x EXCLUDE_CHANNEL, --exclude-channel EXCLUDE_CHANNEL
exclude channel from the analysis (can be given
multiple times)
Condor options:
--reattach if DAG already running, try and reattach to it and
follow it's progress, this is only designed for online
running
--rescue rescue a failed DAG instead of creating a new one
(default: False)
--no-submit do not submit the DAG to condor (default: False)
--universe {vanilla,local}
condor universe (default: vanilla)
--executable EXECUTABLE
path to omicron executable (default: /home/docs/checko
uts/readthedocs.org/user_builds/pyomicron/conda/stable
/bin/omicron)
--conda-env CONDA_ENV
conda environment (name or path) for all programs in
DAG [ligo-omicron-3.10]
--condor-retry CONDOR_RETRY
number of times to retry each job if failed (default:
2)
--max-concurrent MAX_CONCURRENT
Max omicron jobs run at one time [64]
--condor-accounting-group CONDOR_ACCOUNTING_GROUP
accounting_group for condor submission on the LIGO
Data Grid (default:
ligo.prod.o4.detchar.transient.omicron)
--condor-accounting-group-user CONDOR_ACCOUNTING_GROUP_USER
accounting_group_user for condor submission on the
LIGO Data Grid (default: docs)
--condor-request-disk CONDOR_REQUEST_DISK
Required LIGO argument: local disk use (default: 20G)
--submit-rescue-dag SUBMIT_RESCUE_DAG
number of times to automatically submit the rescue DAG
(default: 0)
-c "key=value", --condor-command "key=value"
Extra commands to add to the HTCondor submit files,
can be given multiple times
-d "opt | opt=value", --dagman-option "opt | opt=value"
Extra options to pass to condor_submit_dag as "-{opt}
[{value}]". Can be given multiple times (default:
['force', 'import_env'])
--auth-type {x509,igwn,scitoken}
How to authenticate to dqsegdb, datafind, and cvmfs
Data options:
--cache-file FILE use frame locations from FILE
--use-dev-shm use low-latency frame buffer in /dev/shm (default:
False)
--no-segdb don't use the segment database for state determination
(default: False)
Pipeline options:
--skip-omicron skip running omicron (default: False)
--skip-root-merge skip running omicron-root-merge (default: False)
--skip-hdf5-merge skip running omicron-hdf5-merge (default: False)
--skip-ligolw_add skip running ligolw_add (default: False)
--skip-gzip skip running gzip (default: False)
--skip-postprocessing
skip all post-processing, equivalent to --skip-root-
merge --skip-hdf5-merge --skip-ligolw_add --skip-gzip
(default: False)
--skip-rm Do not remove all the trigger files created by the
job.Useful for debugging(default: False)
This source code for this project is available here:
https://github.com/gwpy/pyomicron/
All issues regarding this software should be raised using the GitHub web
interface, bug reports and feature requests are encouraged.
Documentation is available here:
https://pyomicron.readthedocs.io/en/latest/