This is an important improvement and should have a post of its own. K.O.
> > > RFE filed:
> > > https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically
> >
> > Implemented and closed: https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically
> >
> > Stefan
>
> Stefan's comments from the closed bug report:
>
> Ok I implemented some periodic flushing. Here is what I did:
>
> Created
>
> /System/Flush/Flush period : TID_UINT32
> /System/Flush/Last flush : TID_UINT32
>
> which control the flushing to disk. The default value for “Flush period” is 60 seconds (one minute).
>
> All clients call db_flush_database() through their cm_yield() function.
> db_flush_database() checks the “Last flush” and only flushes the ODB when the period has expired. This test is done inside the ODB semaphore so that we don’t get a race condition.
> If the period has expired, db_flush_database() calls ss_shm_flush().
> ss_shm_flush() tries to allocate a buffer the size of the shared memory. If the allocation is not successful (out of memory), ss_shm_flush() writes directly to the binary file as before.
> If the allocation is successful, ss_shm_flush() copies the shared memory to the buffer and passes this buffer to a dedicated thread which writes the buffer to the binary file. This causes ss_shm_flush() to return immediately and not block the calling program during the disk write operation.
> Added back the “if (destroy_flag) ss_shm_flush()” so that the ODB is flushed for sure before the shared memory gets deleted.
> This means that under normal circumstances, exiting programs like odbedit do NOT flush the ODB. This allows calling “odbedit -c” many times in a row without the flush penalty. Nevertheless, the ODB then gets flushed by other clients at the latest 60 seconds (or whatever the flush period is) after odbedit exits.
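
To illustrate the "test inside the semaphore" step described above, here is a minimal sketch in C. It is not the MIDAS code: flush_state_t and flush_due() are hypothetical names, and a pthread mutex stands in for the ODB semaphore. The point is only that the comparison against "Last flush" and the update of "Last flush" happen under the same lock, so two clients calling at the same moment cannot both decide to flush.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the values kept under /System/Flush.
   In MIDAS they live in the ODB and are guarded by the ODB semaphore;
   here a pthread mutex plays that role. */
typedef struct {
    pthread_mutex_t lock;
    uint32_t flush_period;  /* seconds, like "Flush period" */
    uint32_t last_flush;    /* seconds since epoch, like "Last flush" */
} flush_state_t;

/* Returns 1 if the caller should flush now, 0 otherwise.
   The check and the timestamp update are done under one lock, so
   only one of several concurrent callers wins for each period. */
int flush_due(flush_state_t *st, uint32_t now)
{
    int due = 0;

    assert(st != NULL);
    pthread_mutex_lock(&st->lock);
    if (now - st->last_flush >= st->flush_period) {
        st->last_flush = now;  /* claim this period before unlocking */
        due = 1;
    }
    pthread_mutex_unlock(&st->lock);
    return due;
}
```

A client loop would call flush_due() with the current time on every pass and start a flush only when it returns 1.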
>
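
The copy-and-hand-off scheme described for ss_shm_flush() can be sketched as follows. Again this is an illustration under assumptions, not the actual implementation: write_job_t, write_image() and flush_image() are made-up names, and for brevity the sketch starts one thread per flush rather than keeping a single dedicated writer thread. It shows the two paths: if the snapshot buffer cannot be allocated, write directly from the shared memory; otherwise copy the image and return while a thread does the disk write.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical job handed to the writer thread. */
typedef struct {
    char  *path;  /* private copy of the destination file name */
    void  *buf;   /* private snapshot of the shared memory */
    size_t size;
} write_job_t;

/* Write a memory region to a file; returns 0 on success, -1 on error. */
static int write_image(const char *path, const void *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t n = fwrite(data, 1, size, f);
    int rc = fclose(f);
    return (n == size && rc == 0) ? 0 : -1;
}

/* Writer thread: writes the snapshot to disk, then frees the job. */
static void *writer_thread(void *arg)
{
    write_job_t *job = arg;
    write_image(job->path, job->buf, job->size);
    free(job->buf);
    free(job->path);
    free(job);
    return NULL;
}

/* Flush `size` bytes at `shm` to `path`.
   Returns 1 if the write was handed to a background thread (*tid is set
   and must be joined or detached by the caller), 0 if the image was
   written synchronously as a fallback, -1 on error. */
int flush_image(const char *path, const void *shm, size_t size, pthread_t *tid)
{
    assert(path != NULL && shm != NULL && tid != NULL);

    write_job_t *job = malloc(sizeof *job);
    void *buf = malloc(size);
    char *p = malloc(strlen(path) + 1);

    if (job == NULL || buf == NULL || p == NULL) {
        /* Out of memory: write directly, blocking the caller. */
        free(job); free(buf); free(p);
        return write_image(path, shm, size) == 0 ? 0 : -1;
    }

    memcpy(buf, shm, size);  /* snapshot taken while the caller holds the data */
    strcpy(p, path);
    job->path = p;
    job->buf = buf;
    job->size = size;

    if (pthread_create(tid, NULL, writer_thread, job) != 0) {
        /* Could not start a thread: fall back to the direct write. */
        free(buf); free(p); free(job);
        return write_image(path, shm, size) == 0 ? 0 : -1;
    }
    return 1;  /* caller continues while the thread writes to disk */
}
```

The price of not blocking is a temporary second copy of the image in memory, which is why the out-of-memory fallback to a direct write matters.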
> Please note that ODB flushing has two purposes:
>
> When all programs exit, we need a persistent storage for the ODB. In most experiments this happens very seldom, maybe at the end of a beam time period.
> If the computer crashes, a recent version of the ODB is kept on disk to simplify recovery after the crash.
> Since crashes are not frequent (during production periods we have maybe one hardware failure every few years), flushing the ODB too often does not make sense and just consumes resources. Flushing also does not protect against corrupted ODBs, since the binary image will also get corrupted. So the only reason for periodic flushes is to ease recovery after a total crash. I put the default at 60 seconds, but if people are really paranoid they can decrease it to 10 seconds or so. Or increase it to 600 seconds if their system does not crash every week and disks are slow.
>
> I made a dedicated branch feature/periodic_odb_flush so people can test the new functionality. If there are no complaints within the next few days, I will merge that into develop.
>
> Stefan