Midas Talk

Midas DAQ System

30 Apr 2022, Giovanni Mazzitelli, Forum, S3 Object Storage

30 Apr 2022, Konstantin Olchanski, Forum, S3 Object Storage

01 May 2022, Giovanni Mazzitelli, Forum, S3 Object Storage

Message ID: 2388 Entry time: 30 Apr 2022 In reply to: 2387 Reply to this: 2393

Author:	Konstantin Olchanski
Topic:	Forum
Subject:	S3 Object Storage

> We are storing raw MIDAS files to S3 Object Storage, but MIDAS files are not 
> optimised for readout from this kind of storage. Is there any work under way on 
> evolving the midas raw output or, beyond a simulated POSIX fs, on developing a midas 
> python library optimised to stream data from S3 (it is not really clear to me if this 
> is possible)?

We have plans for adding S3 object storage support to lazylogger, but have not gotten 
around to it yet.

We do not plan to add this in mlogger. mlogger works well for writing data to locally-
attached storage (local ext4, XFS, ZFS) but always runs into problems with timeouts and 
delays when writing to anything network-attached (even writing to NFS).

I envision that each midas raw data file (mid.gz or mid.lz4 or mid.bz2) will
be stored as an S3 object and there will be some kind of directory object
to map object ids to run and subrun numbers.
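
As a rough illustration of such a directory object, here is a minimal Python sketch. 
Nothing like this exists in MIDAS today: the key pattern "run%05dsub%03d" plus 
extension, the JSON layout and the function names object_key, build_index and lookup 
are all assumptions made up for this example.

```python
import json

def object_key(run, subrun, ext="mid.lz4"):
    # Hypothetical object naming scheme, one object per subrun file.
    return f"run{run:05d}sub{subrun:03d}.{ext}"

def build_index(files):
    # files: iterable of (run, subrun) pairs already uploaded.
    # Returns {run: {subrun: object key}} with string keys so it survives JSON.
    index = {}
    for run, subrun in files:
        index.setdefault(str(run), {})[str(subrun)] = object_key(run, subrun)
    return index

def lookup(index, run, subrun):
    # Map a run and subrun number back to the object key.
    return index[str(run)][str(subrun)]

index = build_index([(123, 0), (123, 1), (124, 0)])

# The serialized index would itself be stored as one small object.
index_json = json.dumps(index, indent=2, sort_keys=True)
```

A reader would fetch the index object first, then request only the objects for the 
runs it needs.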

The choice of best file size is open. Normally we use subruns to limit file size to 1-2 
Gbytes. If cloud storage prefers some other object size, we can easily go up to 10 
Gbytes and down to "a few megabytes" (ODB dumps will have to be turned off for this).

Other than that, in your view, what else is needed to optimize midas files for storage 
in the Amazon S3 cloud?

P.S. For reading files from the cloud, code needs to be written and added to 
midasio/midasio.cxx, for example, see the code that is already there for reading ssh-
attached files and dcache/dccp-attached files. (CERN EOS files can be read directly 
from POSIX mount point /eos).
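
On the python side, streaming is possible in principle because a midas file is just 
a sequence of events, each with a 16-byte little-endian header (event_id, 
trigger_mask, serial_number, time_stamp, data_size) followed by data_size bytes of 
data. The sketch below is not the midasio code and uses no S3 library. It only 
assumes a file-like object with a read() method, which is what a streamed object 
body would look like. The names read_exact and iter_events are made up for this 
example, and the input must already be uncompressed.

```python
import io
import struct

# MIDAS EVENT_HEADER: two 16-bit fields, then three 32-bit fields (16 bytes).
EVENT_HEADER = struct.Struct("<HHIII")

def read_exact(stream, n):
    # Network streams may return short reads, so loop until n bytes or EOF.
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf

def iter_events(stream):
    # Yield (event_id, serial_number, time_stamp, payload) until the stream
    # ends or an event is truncated.
    while True:
        header = read_exact(stream, EVENT_HEADER.size)
        if len(header) < EVENT_HEADER.size:
            return
        event_id, _mask, serial, timestamp, data_size = EVENT_HEADER.unpack(header)
        payload = read_exact(stream, data_size)
        if len(payload) < data_size:
            return
        yield event_id, serial, timestamp, payload

# Demo with two fake events held in memory instead of a real stream.
raw = (EVENT_HEADER.pack(1, 0, 0, 1651300000, 4) + b"abcd"
       + EVENT_HEADER.pack(1, 0, 1, 1651300001, 2) + b"xy")
events = list(iter_events(io.BytesIO(raw)))
```

For mid.gz files the stream could be wrapped in gzip.GzipFile(fileobj=stream) 
before it is passed in. Sequential reading like this never needs to seek, so it 
does not depend on a POSIX file system.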

K.O.
