Generating a workflow using omicron-process
PyOmicron provides the omicron-process command-line executable, designed for creating and managing HTCondor workflows.
The only hard requirement for running omicron-process is a configuration file. Once you have one, you can run automatically over the most recent chunk of data with:
$ omicron-process <group> --config-file <config-file>
where <config-file> is the name of your configuration file, and <group> is the name of the section inside that file that configures this workflow.
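As a quick sanity check before launching a workflow, the group names in a configuration file can be listed with Python's standard configparser module. The sketch below is illustrative only: it is not part of PyOmicron, the helper name list_groups is hypothetical, and the sample settings are an abbreviated copy of the example printed by omicron-process --help.

```python
import configparser

# Sample configuration, abbreviated from the example in `omicron-process --help`.
SAMPLE_CONFIG = """
[GW]
frametype = H1_HOFT_C00
state-flag = H1:DMT-CALIBRATED:1
chunk-duration = 124
channels = H1:GDS-CALIB_STRAIN
"""


def list_groups(text):
    """Map each section (group) name to its list of channels.

    Hypothetical helper for illustration; assumes channels are
    whitespace- or newline-separated within the `channels` option.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        name: parser.get(name, "channels", fallback="").split()
        for name in parser.sections()
    }


groups = list_groups(SAMPLE_CONFIG)
print(groups)
```

Any key of the resulting dictionary (here GW) is a valid value for the <group> argument.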
Warning
By default, this command will automatically generate the workflow as an HTCondor DAG and submit it via condor_submit_dag.
To generate the workflow without submitting the DAG, add the --no-submit option on the command line.
Note
By default, omicron-process won’t output very much to the screen, so it can be useful to supply --verbose --verbose (i.e. --verbose twice) to get the DEBUG logging statements. This will provide a running progress report for the workflow, which can be very informative.
Warning
omicron-process will complain loudly if it can’t find omicron.exe on the path.
You can either specify --executable manually on the command line or, if you are working in the detchar account on the LIGO Data Grid, enable the standard conda environment for omicron with the following command:
conda_omicron
The other alternative is to install omicron into your conda environment with:
mamba install omicron
Details of the workflow
The omicron-process executable will do the following:
- find the relevant time segments to process (if state-flag or state-channel has been defined in the configuration),
- find the frame files containing the data (using gw_data_find),
- build a Directed Acyclic Graph (DAG) defining the workflow.
The DAG will normally do something like this:
- process raw data using omicron.exe
- merge contiguous output files with .root, .h5, and .xml extensions
- gzip .xml files to save space
- the merged files are copied to the archive directory, nominally /home/detchar/triggers/<ifo>/<channel-filetag>/<metric day>
- if everything completes successfully, trigger and log files are deleted
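The ordering of these steps can be pictured as a small dependency graph. The sketch below uses Python's standard graphlib module (Python 3.9 or later) to order them. The step names and the exact parent/child edges are assumptions made for illustration; the real DAG written by omicron-process may group or split jobs differently.

```python
from graphlib import TopologicalSorter

# Each key lists the steps that must finish before it can start.
# Step names and edges are illustrative assumptions, not PyOmicron's job names.
DEPENDENCIES = {
    "omicron": set(),
    "merge-root": {"omicron"},
    "merge-hdf5": {"omicron"},
    "merge-xml": {"omicron"},
    "gzip-xml": {"merge-xml"},
    "archive": {"merge-root", "merge-hdf5", "gzip-xml"},
    "cleanup": {"archive"},
}

# static_order() returns one valid execution order for the whole graph.
order = list(TopologicalSorter(DEPENDENCIES).static_order())
print(" -> ".join(order))
```

The three merge steps have no dependencies on each other, so their relative order in the output is arbitrary; only the constraints in DEPENDENCIES are guaranteed.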
Note
The workflow will break the interval into chunks whose length is set by the chunkdur parameter in the configuration file; this length includes the padding. The nominal value is 120 seconds. Splitting the interval this way results in multiple HTCondor jobs.
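To see why a long interval turns into many pieces, here is a minimal sketch of one way to split an interval into fixed-length chunks that overlap by a padding amount. It is not PyOmicron's implementation: the function name, the separate overlap parameter, the value of 4 seconds used for it, and the rule that the last chunk is simply truncated are all assumptions made for illustration.

```python
def split_into_chunks(start, end, chunkdur, overlap):
    """Yield (chunk_start, chunk_end) pairs covering [start, end).

    Illustrative assumption: consecutive chunks share `overlap` seconds,
    so each new chunk begins `chunkdur - overlap` seconds after the last.
    """
    if not 0 <= overlap < chunkdur:
        raise ValueError("overlap must be non-negative and shorter than chunkdur")
    if end <= start:
        return  # nothing to process
    step = chunkdur - overlap
    t = start
    while True:
        chunk_end = min(t + chunkdur, end)  # truncate the final chunk
        yield t, chunk_end
        if chunk_end >= end:
            break
        t += step


chunks = list(split_into_chunks(0, 300, chunkdur=120, overlap=4))
for chunk in chunks:
    print(chunk)
```

With these made-up numbers a 300-second interval already needs three chunks, so a full day of data produces several hundred.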
Archiving multiple workflows
Optionally, you can specify the --archive option to copy files from the run directory into a structured archive under ~/triggers/. Each file is relocated as follows:
~/triggers/{IFO}/{filetag}/{gps5}/{filename}
where the path components are as follows:
- {IFO} is the two-character interferometer prefix for the raw data channel (e.g. L1),
- {filetag} is an underscore-delimited tag including the rest of the channel name and OMICRON (e.g. GDS_CALIB_STRAIN_OMICRON),
- {gps5} is the 5-digit GPS epoch for the start time of the file, e.g. 12345 if the file starts at GPS 1234567890,
- {filename} is the T050017-compatible name, which will be of the form {IFO}-{filetag}-<gpsstart>-<duration>.<ext>
e.g.:
~/triggers/L1/GDS_CALIB_STRAIN_OMICRON/12345/L1-GDS_CALIB_STRAIN_OMICRON-1234567890-100.xml.gz
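The layout above can be reproduced with a few lines of Python, which is handy when locating archived files from a script. This is a sketch under stated assumptions, not PyOmicron's own code: archive_path is a hypothetical helper, the replacement of "-" with "_" in the channel name is inferred from the GDS_CALIB_STRAIN_OMICRON example, and the epoch is computed by dropping the last five digits of the GPS start time (which gives five digits for present-day, ten-digit GPS times).

```python
from pathlib import PurePosixPath


def archive_path(channel, gpsstart, duration, ext="xml.gz", base="~/triggers"):
    """Build the archive location for one trigger file.

    Hypothetical helper following the documented layout
    {base}/{IFO}/{filetag}/{gps5}/{filename}.
    """
    ifo, name = channel.split(":", 1)
    # Assumption: hyphens in the channel name become underscores in the tag.
    filetag = name.replace("-", "_") + "_OMICRON"
    gps5 = int(gpsstart) // 100000  # drop the last five digits of the GPS time
    filename = f"{ifo}-{filetag}-{int(gpsstart)}-{int(duration)}.{ext}"
    return PurePosixPath(base) / ifo / filetag / str(gps5) / filename


path = archive_path("L1:GDS-CALIB_STRAIN", 1234567890, 100)
print(path)
```

The "~" is kept literally here; expand it with os.path.expanduser before opening the file.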
Processing a specific time interval
If you are interested in a specific time interval, use the --gps option on the command line:
$ omicron-process <group> --config-file <config-file> --gps <gpsstart> <gpsend>
where <gpsstart> and <gpsend> are your two GPS times.
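When many intervals need processing, it can help to assemble the command from a script. The following sketch only builds the argument list and prints it; build_command is a hypothetical wrapper, not part of PyOmicron, and it uses only flags that appear in the --help output.

```python
import shlex


def build_command(group, config_file, gps=None, submit=True, verbosity=0):
    """Return the omicron-process argument list for the given settings.

    Hypothetical convenience wrapper; it assembles documented flags
    and does not run anything.
    """
    cmd = ["omicron-process", group, "--config-file", str(config_file)]
    if gps is not None:
        gpsstart, gpsend = gps
        cmd += ["--gps", str(gpsstart), str(gpsend)]
    if not submit:
        cmd.append("--no-submit")  # generate the DAG without submitting it
    cmd += ["--verbose"] * verbosity
    return cmd


cmd = build_command("GW", "./config.ini", gps=(1234567890, 1234567990),
                    submit=False, verbosity=2)
print(shlex.join(cmd))
```

To actually run the command, the list can be handed to subprocess.run(cmd, check=True) on a machine where omicron-process is installed.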
Note
You can also give the GPS arguments as date strings, in quotes, as follows:
$ omicron-process <group> --config-file <config-file> --gps "Jan 1" "Jan 2"
Additionally, when using --gps, you can specify --cache-file to submit your own LAL-formatted data cache file:
$ omicron-process <group> --config-file <config-file> --gps <gpsstart> <gpsend> --cache-file /path/to/cache.lcf
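If you need to write such a cache file yourself, the commonly used LAL cache layout has one line per frame file with five space-separated columns: observatory, description, GPS start, duration, and URL. The sketch below formats one line from a frame file path. It assumes the file name follows the T050017 pattern <obs>-<description>-<gpsstart>-<duration>.gwf; cache_line is a hypothetical helper and the path in the example is made up.

```python
from pathlib import PurePosixPath


def cache_line(frame_path):
    """Format one cache entry for a frame file with a T050017-style name.

    Assumes an absolute path whose file name looks like
    <obs>-<description>-<gpsstart>-<duration>.gwf.
    """
    path = PurePosixPath(frame_path)
    obs, description, gpsstart, duration = path.stem.split("-")
    # int() rejects malformed GPS or duration fields early.
    return (f"{obs} {description} {int(gpsstart)} {int(duration)} "
            f"file://localhost{path}")


line = cache_line("/data/frames/H-H1_HOFT_C00-1234567890-4096.gwf")
print(line)
```

Writing one such line per frame file to a text file gives something that can be passed as --cache-file; check the result against a cache produced by your usual data-discovery tool before relying on it.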
More help
For detailed documentation of all command-line options and arguments, print the --help message:
$ omicron-process --help
usage: omicron-process [-h] [-V] [-t GPSTIME GPSTIME] [-f CONFIG_FILE]
                       [-i IFO] [-v] [-o OUTPUT_DIR] [-a] [-g FILE_TAG]
                       [-l LOG_FILE] [-C MAX_CHUNKS_PER_JOB]
                       [-N MAX_CHANNELS_PER_JOB]
                       [--max-online-lookback MAX_ONLINE_LOOKBACK]
                       [--max-concurrent MAX_CONCURRENT] [-x EXCLUDE_CHANNEL]
                       [--reattach | --rescue | --no-submit]
                       [--universe {vanilla,local}] [--executable EXECUTABLE]
                       [--condor-retry CONDOR_RETRY]
                       [--condor-accounting-group CONDOR_ACCOUNTING_GROUP]
                       [--condor-accounting-group-user CONDOR_ACCOUNTING_GROUP_USER]
                       [--condor-request-disk CONDOR_REQUEST_DISK]
                       [--submit-rescue-dag SUBMIT_RESCUE_DAG]
                       [-c "key=value"] [-d "opt | opt=value"]
                       [--cache-file FILE | --use-dev-shm] [--no-segdb]
                       [--skip-omicron] [--skip-root-merge]
                       [--skip-hdf5-merge] [--skip-ligolw_add] [--skip-gzip]
                       [--skip-postprocessing] [--skip-rm]
                       group
Process LIGO data using the Omicron event trigger generator (ETG)
This utility can be used to process one or more channels or LIGO data using
Omicron with minimal manual labour in determining segments, finding data,
and configuring HTCondor.
The input to this should be an INI-format configuration file that lists the
processing parameters and channels that pass to Omicron, something like:
```ini
[GW]
q-range = 3.3166 150
frequency-range = 4.0 8192.0
frametype = H1_HOFT_C00
state-flag = H1:DMT-CALIBRATED:1
sample-frequency = 16384
chunk-duration = 124
segment-duration = 64
overlap-duration = 4
mismatch-max = 0.2
snr-threshold = 5
channels = H1:GDS-CALIB_STRAIN
```
The above 'GW' group name should then be passed to `omicron-process` along
with any customisations available from the command line, e.g.
```
omicron-process GW --config-file ./config.ini
```
By default `omicron-process` will look at the most recent data available
('online' mode), to run in 'offline' mode, pass the `--gps` argument
```
omicron-process GW --config-file ./config.ini --gps <gpsstart> <gpsstop>
```
The output of `omicron-process` is a Directed Acyclic Graph (DAG) that is
*automatically* submitted to condor for processing.
Positional arguments:
group name of configuration group to process
Optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-t GPSTIME GPSTIME, --gps GPSTIME GPSTIME
GPS times for offline processing
-f CONFIG_FILE, --config-file CONFIG_FILE
path to configuration file (default: None)
-i IFO, --ifo IFO IFO prefix to process (default: None)
-v, --verbose print verbose output, give more times for more verbose
output
Output options:
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
path to output directory (default: /home/docs/checkout
s/readthedocs.org/user_builds/pyomicron/checkouts/late
st/docs)
-a, --archive archive created files under /home/docs/triggers
(default: False)
-g FILE_TAG, --file-tag FILE_TAG
additional file tag to be appended to final file
descriptions
-l LOG_FILE, --log-file LOG_FILE
save a copy of all logger messages to this file
Processing options:
-C MAX_CHUNKS_PER_JOB, --max-chunks-per-job MAX_CHUNKS_PER_JOB
maximum number of chunks to process in a single condor
job (default: 4)
-N MAX_CHANNELS_PER_JOB, --max-channels-per-job MAX_CHANNELS_PER_JOB
maximum number of channels to process in a single
condor job (default: 20)
--max-online-lookback MAX_ONLINE_LOOKBACK
With no immediately previous run, or one that was long
ago this is the max time of an online job. Default:
1200
--max-concurrent MAX_CONCURRENT
Max omicron jobs run at one time [64]
-x EXCLUDE_CHANNEL, --exclude-channel EXCLUDE_CHANNEL
exclude channel from the analysis (can be given
multiple times)
Condor options:
--reattach if DAG already running, try and reattach to it and
follow it's progress, this is only designed for online
running
--rescue rescue a failed DAG instead of creating a new one
(default: False)
--no-submit do not submit the DAG to condor (default: False)
--universe {vanilla,local}
condor universe (default: vanilla)
--executable EXECUTABLE
omicron executable (default: /home/docs/checkouts/read
thedocs.org/user_builds/pyomicron/conda/latest/bin/omi
cron)
--condor-retry CONDOR_RETRY
number of times to retry each job if failed (default:
2)
--condor-accounting-group CONDOR_ACCOUNTING_GROUP
accounting_group for condor submission on the LIGO
Data Grid (default:
ligo.prod.o4.detchar.transient.omicron)
--condor-accounting-group-user CONDOR_ACCOUNTING_GROUP_USER
accounting_group_user for condor submission on the
LIGO Data Grid (default: docs)
--condor-request-disk CONDOR_REQUEST_DISK
Required LIGO argument: local disk use (default: 50G)
--submit-rescue-dag SUBMIT_RESCUE_DAG
number of times to automatically submit the rescue DAG
(default: 0)
-c "key=value", --condor-command "key=value"
Extra commands to add to the HTCondor submit files,
can be given multiple times
-d "opt | opt=value", --dagman-option "opt | opt=value"
Extra options to pass to condor_submit_dag as "-{opt}
[{value}]". Can be given multiple times (default:
['force', '-import_env'])
Data options:
--cache-file FILE use frame locations from FILE
--use-dev-shm use low-latency frame buffer in /dev/shm (default:
False)
--no-segdb don't use the segment database for state determination
(default: False)
Pipeline options:
--skip-omicron skip running omicron (default: False)
--skip-root-merge skip running omicron-root-merge (default: False)
--skip-hdf5-merge skip running omicron-hdf5-merge (default: False)
--skip-ligolw_add skip running ligolw_add (default: False)
--skip-gzip skip running gzip (default: False)
--skip-postprocessing
skip all post-processing, equivalent to --skip-root-
merge --skip-hdf5-merge --skip-ligolw_add --skip-gzip
(default: False)
--skip-rm Do not remove all the trigger files created by the
job.Useful for debugging(default: False)
This source code for this project is available here:
https://github.com/gwpy/pyomicron/
All issues regarding this software should be raised using the GitHub web
interface, bug reports and feature requests are encouraged.
Documentation is available here:
https://pyomicron.readthedocs.io/en/latest/