    Reply  27 Apr 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> Congratulations. created != "it works".

Two other things to consider:

1) The ODB shared memory is dumped into a binary file (".ODB.SHM") after the last client finishes and read back when the first client starts, to make it persistent. 
So this could slow down starting and stopping, but only for the first client to start and the last one to stop, so I guess it's not an issue.

2) Traditionally, the ODB gets dumped to the .mid file at the beginning and end of every run, so that one knows the exact configuration of the experiment
for offline analysis. This can be turned off of course, but most experiments use it. If the ODB is dumped in an ASCII format, this can take quite a long time.
Assume it takes 10 seconds at the beginning of each run, and we take a run every five minutes: 10 s out of every 300 s is about 3% of the day, so we lose 48 minutes of precious beam time every day.

Best,
Stefan
    Reply  28 Apr 2023, Marius Koeppel, Suggestion, Maximum ODB size  create_time.pdf fails.pdf test_odb.py
> my vote is to bump the ODB size limit to 1999*1000*1000 (not quite 2GB). but this needs to be tested. especially save and restore from ODB, XML and JSON files, including how long it takes to save and load a 1.9GB ODB. K.O.

I had some fun with Python and created a test script which can be executed in the MIDASSYS/online folder (test_odb.py). I did not really normalize the times, so they will differ between systems, but I guess the trend is what matters (see create_time.pdf).
What is surprising to me is that even though I only write one STRING key to the ODB, the time still increases. Is this maybe related to what Stefan said about the run start, i.e. that odbedit needs some time to load the bigger ODB?
The second thing is that the creation / storing and loading times also increase. Should this be the case, is there a bug in the code I use, or is this again related to the previous point?
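
For reference, the core of the measurement is just timing an ODB save; a minimal C++ sketch of the same idea (not the actual test_odb.py, and assuming odbedit is in the PATH and the ODB has already been filled with the desired number of keys) would be:

// minimal timing sketch, illustrative only: time a single "odbedit -c save"
#include <chrono>
#include <cstdio>
#include <cstdlib>

int main()
{
   auto t0 = std::chrono::steady_clock::now();
   int rc = std::system("odbedit -c \"save test.json\"");   // dump the whole ODB to JSON
   auto t1 = std::chrono::steady_clock::now();

   double sec = std::chrono::duration<double>(t1 - t0).count();
   std::printf("odbedit save returned %d after %.3f s\n", rc, sec);
   return 0;
}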

The test comparing the ODB after store / load / store already fails for the JSON format. I know I only test whether the dicts are the same, so it already fails on timestamps.
But what is strange here is that sometimes the test works and sometimes it does not, and this differs from run to run.

I will try to improve the test a bit more, but as a short update this is how it looks so far.

Best,
Marius
    Reply  28 Apr 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> Is this maybe related to what Stefan said about the run start - so that odbedit needs some time to load the bigger ODB?

At the run start mlogger writes the ODB to the .mid file. This needs conversion (binary ODB -> XML ASCII) which can take time.
This does NOT depend on the ODB size, but on the ODB *content*. Every key in the ODB takes time to convert. So if your ODB has 1.5 GB
but only a few keys, this is still fast. Only if you have 200 million keys in the ODB will mlogger take a lot of time to convert
200 million values to XML or JSON strings.

Stefan
    Reply  28 Apr 2023, Marius Koeppel, Suggestion, Maximum ODB size 
> At the run start mlogger writes the ODB to the .mid file. This needs conversion (binary ODB -> XML ASCII) which can take time.
> This does NOT depend on the ODB size, but on the ODB *content*. Every key in the ODB takes time to convert. So if your ODB has 1.5 GB
> but only a few keys, this is still fast. Only if you have 200 million keys in the ODB will mlogger take a lot of time to convert
> 200 million values to XML or JSON strings.
This was also my assumption. Is this the same for "odbedit -c save FILE"?
That is what I tested with the script, and one can see in the plot that the time to write the file increases as the ODB size increases.
The content of the ODB is always the same - one STRING key in the directory Test.

Best,
Marius
    Reply  28 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > Congratulations. created != "it works".
> 
> Two other things to consider:
> 
> 1) The ODB shared memory is dumped into a binary file (".ODB.SHM") after the last client finishes and read back when the first client starts, to make it persistent. 
> So this could slow down starting and stopping, but only for the first client to start and the last one to stop, so I guess it's not an issue.
>

typical disk writing speed is 100-1000 Mbytes/sec, so writing 1 GB .ODB.SHM will take 1-10 seconds. NFS over 1gige network is 100 Mbytes/sec, so 10 seconds to 
write .ODB.SHM. embedded ARM write speed to SD flash can be as low as 10 Mbytes/sec, so up to 100 seconds.

> 
> 2) Traditionally, the ODB gets dumped to the .mid file at the beginning and end of every run, so that one knows the exact configuration of the experiment
> for offline analysis. This can be turned off of course, but most experiments use it. If the ODB is dumped in an ASCII format, this can take quite a long time.
> Assume it takes 10 seconds at the beginning of each run, and we take a run every five minutes. Then we lose 48 minutes of precious beam time every day.
> 

new default is to save as JSON; (as of my last measurement) the JSON encoder is faster than the XML (and ODB?) encoder. by default the result is compressed with GZIP-1 (66 
Mbytes/sec is my old benchmark, should remeasure on new DDR5 machines), and the compressed JSON is written to the .mid.gz file at disk speed (as above). alternatively, use LZ4 
compression: it runs roughly at memcpy() speed, compresses less, and is written to .mid.lz4 at disk speed.

if data storage is ZFS, ZFS built-in LZ4 compression is now enabled by default, so writing an uncompressed .mid file (no compression of the ODB dump) should be 
roughly the same as using MIDAS LZ4 compression and writing .mid.lz4.

bottom line, I need to remeasure gzip and lz4 compression speeds on new computers (DDR4 AMD 5000 series and DDR5 AMD 7000 series).

K.O.
    Reply  28 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > Is this maybe related to what Stefan said about the run start - so that odbedit needs some time to load the bigger ODB?
> 
> At the run start mlogger writes the ODB to the .mid file. This needs conversion (binary ODB -> XML ASCII) which can take time.
> This does NOT depend on the ODB size, but on the ODB *content*.
>

Yes and no. They must be storing more than 100 Mbytes of stuff in ODB if they are asking to bump the ODB size from 100 Mbytes to 2 GBytes.

On the MIDAS side, though, we have to plan for the worst case: if the max ODB size is 1.9 GB and it is full of data,
and mlogger (and odbedit save and load) take 10-30 seconds, then at least all timeouts (watchdog timeout, RPC timeout, etc.)
must be increased accordingly.
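
For example, a client that expects such slow saves could raise its own watchdog timeout after connecting to the experiment. A minimal sketch, assuming cm_set_watchdog_params() keeps its (call_watchdog, timeout_in_ms) signature and that 60 seconds is a sufficient margin:

/* illustrative only: raise this client's watchdog timeout so a long ODB
   save/load does not get it declared dead; the value would have to be
   matched to the measured save/load times */
#include "midas.h"

INT allow_slow_odb_saves()
{
   /* cm_set_watchdog_params(call_watchdog, timeout_in_ms) */
   return cm_set_watchdog_params(TRUE, 60000);
}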

K.O.
    Reply  09 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > 1) The ODB shared memory is dumped into a binary file (".ODB.SHM") after the last client finished ...

correction: ODB shared memory is saved to .ODB.SHM each time a client stops, this is db_close_database().

I have just run into a problem with this in the DRAGON experiment. At begin and end of run they run
a script that makes a large number of "odbedit" calls to read values from ODB, and it was taking a very long time.
Each odbedit invocation took about 1 second: starting odbedit is quick, but stopping odbedit takes about 1 second.

It turns out each invocation of odbedit saves .ODB.SHM, ODB was 100 Mbytes size, home disk is an HDD (~100-200 Mbytes/sec writing speed), so yes, about 1 second to 
stop odbedit.

The solution was to reduce the ODB size from 100 Mbytes to 10 Mbytes; odbedit now runs quickly, and the begin and end of run scripts run quickly. Problem solved.

K.O.

P.S. no, I am not the dragon experiment, no, I did not write those scripts, no, I will not rewrite them, persons who wrote them are long gone, no, the persons running 
dragon today will not be rewriting them.
    Reply  12 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> correction: ODB shared memory is saved to .ODB.SHM each time a client stops, this is db_close_database().

The original design of the midas shared memory (back in the 1990's) was that the ODB shared memory gets
saved into the .ODB.SHM file only when the *last* client exits. This ensures the ODB stays persistent when the
shared memory gets deleted. I vaguely remember I put something in like:

db_close_database()
...
  destroy_flag = (pheader->num_clients == 0);

  if (destroy_flag)
     ss_shm_flush(pheader->name, pdb->shm_adr, pdb->shm_size, pdb->shm_handle);
...

Now I see that the "if (destroy_flag)" is missing. Not sure if it was removed at some point, or if it actually never
was there. But I see no point in flushing the ODB when a client ends. We need the flushing only before the
shared memory gets deleted. If we have to ensure that the shared memory and the binary dump file stay in sync
(e.g. if all midas clients die at the same time), we could add some code to flush the ODB, say once per minute,
but not attach it to db_close_database(). I know several experiments using "odbedit -c xxx" in vast quantities,
so all these experiments would then benefit.

Note: Mu3e at PSI also uses 100 MB ODB, and they really need it.

Thoughts and opinions?

Best,
Stefan
    Reply  12 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > correction: ODB shared memory is saved to .ODB.SHM each time a client stops, this is db_close_database().
> 
> The original design of the midas shared memory (back in the 1990's) was that the ODB shared memory file gets
> only saved into the .ODB.SHM when the *last* client exits. This ensures to keep the ODB persistent when the
> shared memory gets deleted. I vaguely remember I put something in like:
> 
> db_close_database()
> ...
>   destroy_flag = (pheader->num_clients == 0);
> 
>   if (destroy_flag)
>      ss_shm_flush(pheader->name, pdb->shm_adr, pdb->shm_size, pdb->shm_handle);

I remember the same, but I tracked it down in git to the very first commit, and there is no if() there:
odb is saved to .ODB.SHM on every client shutdown, not just the last client. I guess we both misremembered.

What's more, ss_shm_flush() is done while holding the ODB semaphore, so all other midas programs that try to access
odb at the same time (including the mserver) will stall until write() and close() return. at least we do not fsync(),
and there is no waiting until data is committed to physical media.

$ git annotate 3bb04af4d^ src/odb.c
...
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	875)  destroy_flag = (pheader->num_clients == 0);
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	876)
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	877)  /* flush shared memory to disk */
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	878)  ss_flush_shm(pheader->name, pheader, sizeof(DATABASE_HEADER)+2*pheader->data_size);
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	879)
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	880)  /* unmap shared memory, delete it if we are the last */
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	881)  ss_close_shm(pheader->name, pheader,
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	882)               _database[hDB-1].shm_handle, destroy_flag);
...

K.O.
    Reply  13 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> I remember the same, but I tracked it down in git to the very first commit, and there is no if() there,
> odb is saved to .ODB.SHM on every client shutdown, not just the last client. I guess we both misremembered.

I confirm. Really strange how your mind can trick you. I'm absolutely sure I had this planned originally (1995?), but it never got implemented.

Well, never too late. So I added the "if" and committed it to develop. I did a quick test and things seem to work fine here. Actually, programs now stop 
a bit faster. So please everybody give it a try and report back here.

BTW, how do I resize the ODB? I remember we discussed this some time ago and concluded that odbedit needs a resize flag. Has this ever been 
done? If not, what is the "official" way to resize the ODB? We had some documentation about that some time ago, but I can't find it anymore.

Stefan
    Reply  13 Jun 2023, Marius Koeppel, Suggestion, Maximum ODB size 
> BTW, how do I resize the ODB? I remember we discussed this some time ago and concluded that odbedit needs a resize flag. Has this ever been 
> done? If not, what is the "official" way to resize the ODB? We had some documentation about that some time ago, but I can't find it anymore.

I guess this is still not done and the issue is still open: https://bitbucket.org/tmidas/midas/issues/329/need-odbresize
If we touch this, maybe the problem with the wrong size should also be fixed: https://bitbucket.org/tmidas/midas/issues/328/odbinit-s-1024mb-creates-odb-with-wrong

Best,
Marius
    Reply  13 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > I remember the same, but I tracked it down in git to the very first commit, and there is no if() there,
> > odb is saved to .ODB.SHM on every client shutdown, not just the last client. I guess we both misremebered.

small problem. build an experiment, start taking data, observe how ODB is never saved to disk because the "last client" never stops. as bonus, crash 
the computer, observe how all changes to ODB are now lost. if mlogger is configured to save odb.json at the end of run, and to write ODB dumps at 
begin and end of every data file, you can recover some of the lost data.

for better effect, ODB should be dumped to disk at periodic intervals. but the current implementation writes odb to disk while holding the ODB 
semaphore, which means all ODB access stops for the duration; specifically, there will be gaps in the history because mlogger cannot read history 
data from ODB.

a better implementation could take the ODB lock, make a copy of the ODB shared memory, release the ODB lock, and complete the write to disk without holding the 
lock. protection is needed against 100 midas programs trying to do this all at the same time. computers with 0.5 GB RAM (many ARM FPGA SoCs) will be 
limited to a ~100 Mbyte ODB. plus one has to deal with memory allocation failures when taking a copy of a 2 GB ODB.
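
a minimal sketch of that idea, using generic C++ locking and stdio as stand-ins for the actual MIDAS semaphore and ss_shm_flush() internals (all names here are illustrative, not the real odb.cxx code):

// illustrative only: copy the shared memory while holding the lock,
// then write the copy to disk with the lock released
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <mutex>
#include <vector>

static std::mutex odb_lock;   // stand-in for the ODB semaphore

bool flush_odb_copy(const void *shm, size_t size, const char *filename)
{
   std::vector<char> copy;
   {
      std::lock_guard<std::mutex> guard(odb_lock);  // hold the lock only while copying
      try {
         copy.resize(size);        // can fail for a ~2 GB ODB on a small machine
      } catch (const std::bad_alloc &) {
         return false;             // caller could fall back to writing under the lock
      }
      std::memcpy(copy.data(), shm, size);
   }                               // lock released here, other clients continue

   FILE *f = std::fopen(filename, "wb");   // slow disk write happens without the lock
   if (f == nullptr)
      return false;
   size_t written = std::fwrite(copy.data(), 1, copy.size(), f);
   std::fclose(f);
   return written == copy.size();
}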

in theory, the mmap() shared memory (already implemented in midas) does this automatically, but we lose control
over disk writes: we see some OSes write odb to disk "too often" and at the wrong times, i.e. while we are in the middle
of creating or deleting something. the current sequence of open(), atomic write() and close() ensures ODB.SHM always
contains a valid odb (minus loss of OS and disk caches to a crash or power loss).

K.O.
    Reply  13 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > BTW, how do I resize the ODB.

ODB cannot be resized "online". Everything has to stop, save content to odb.json, get rid of old ODB.SHM, ensure ODB shared memory is destroyed (SysV or POSIX shared memory), 
create new ODB with new size, load odb.json. Feel free to punch this into chatgpt > odbresize.cxx, commit, test, push.

> I remember we discussed this some time ago, and concluded that odbedit needs a resize flag.

ODB cannot be resized online. The ODB API has clients holding ODB handles which are pointers (offsets) into the ODB shared memory.

> Has this even been done?
> I guess this is still not done and the issue is still open: https://bitbucket.org/tmidas/midas/issues/329/need-odbresize
> I guess if we touch this maybe the problem with the wrong size should be also fixed: https://bitbucket.org/tmidas/midas/issues/328/odbinit-s-1024mb-creates-odb-with-wrong

please contribute 14 distraction-free days to my patreon. thanks in advance!

K.O.
    Reply  13 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> small problem. build an experiment, start taking data, observe how ODB is never saved to disk because the "last client" never stops. as bonus, crash 
> the computer, observe how all changes to ODB are now lost. if mlogger is configured to save odb.json at the end of run, and to write ODB dumps at 
> begin and end of every data file, you can recover some of the lost data.

The new behavior is not much worse than before. Assume 10 programs running happily for days, computer crashes, all ODB changes lost. 
So indeed a periodic flush without holding the lock might be best. Use a semaphore to prevent all programs flushing at the same time, or put
the flush only in the logger after an end of run.

Stefan
    Reply  13 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> 
> > small problem. build an experiment, start taking data, observe how ODB is never saved to disk because the "last client" never stops. as bonus, crash 
> > the computer, observe how all changes to ODB are now lost. if mlogger is configured to save odb.json at the end of run, and to write ODB dumps at 
> > begin and end of every data file, you can recover some of the lost data.
> 
> The new behavior is not much worse than before. Assume 10 programs running happily for days, computer crashes, all ODB changes lost. 
> So indeed a periodic flush without holding the lock might be best. Use a semaphore to prevent all programs flushing at the same time, or put
> the flush only in the logger after an end of run.

are you sure? when/how often does "last midas program finishes" happen? it does not happen on a system crash, not on power loss, not on "shutdown -r now" 
(I am pretty sure). In the experiments you run, how often do you shut down all programs (and check that you did not forget one somehow)?

sanity check. dragon experiment, very active, .ODB.SHM timestamp is 1 second old. not-very-active agmini, today is June 13th, timestamp of .ODB.SHM is June 
2nd. inactive TACTIC, timestamp of .ODB.SHM is May 16th.

so yes, not great, but in the new scheme, ODB.SHM timestamps would probably be from 2021 or 2020.

my vote is to undo this change, it is dangerous because it causes odb to be saved to ODB.SHM never.

K.O.
    Reply  13 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> are you sure? when/how often does "last midas program finishes" happen? it does not happen on a system crash, not on power loss, not on "shutdown -r now" 
> (I am pretty sure). In the experiments you run, how often do you shut down all programs (and check that you did not forget one somehow)?

Indeed this is almost never the case, maybe once per month. On the other hand, we have a complete crash of the OS maybe once a year. Most of the time the programs 
run continuously (we do not need odbedit), so our timestamp is typically one or two days old, which is not good either.

> my vote is to undo this change, it is dangerous because it causes odb to be saved to ODB.SHM never.

My vote is to flush the odb either periodically or after each run.

Stefan 
    Reply  15 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> 
> > are you sure? when/how often does "last midas program finishes" happen? it does not happen on a system crash, not on power loss, not on "shutdown -r now" 
> > (I am pretty sure). In the experiments you run, how often do you shut down all programs (and check that you did not forget one somehow)?
> 
> Indeed this is almost never the case, maybe once per month. On the other hand, we have a complete crash of the OS maybe once a year. Most of the time the programs 
> run continuously (we do not need odbedit), so our timestamp is typically one or two days old, which is not good either.
> 
> > my vote is to undo this change, it is dangerous because it causes odb to be saved to ODB.SHM never.
> 
> My vote is to flush the odb either periodically or after each run.
> 

So we are in agreement.

RFE filed:
https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically

Dangerous change reverted:
60e4c44ad66346b89ba057391acf7a02890049be

K.O.

bash-3.2$ git diff
diff --git a/src/odb.cxx b/src/odb.cxx
index 0d3b88c2..d104ff28 100644
--- a/src/odb.cxx
+++ b/src/odb.cxx
@@ -2199,7 +2199,14 @@ INT db_close_database(HNDLE hDB)
       destroy_flag = (pheader->num_clients == 0);
 
       /* flush shared memory to disk */
-      if (destroy_flag)
+
+      /* if we save ODB to disk only after last client finishes, we will never save ODB to disk
+         in most experiments - none of them ever completely stop MIDAS in normal operation.
+         as result, all changes to ODB contents will be lost on system crash, power loss
+         or normal reboot. see https://daq00.triumf.ca/elog-midas/Midas/2539
+         K.O. June 2023. */
+
+      if (1 || destroy_flag)
          ss_shm_flush(pheader->name, pdb->shm_adr, pdb->shm_size, pdb->shm_handle);
 
       strlcpy(xname, pheader->name, sizeof(xname));

K.O.
    Reply  28 Jul 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> RFE filed:
> https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically

Implemented and closed: https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically

Stefan
    Reply  09 Aug 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > RFE filed:
> > https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically
> 
> Implemented and closed: https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically
> 
> Stefan

Stefan's comments from the closed bug report:

Ok I implemented some periodic flushing. Here is what I did:

Created

  /System/Flush/Flush period : TID_UINT32
  /System/Flush/Last flush   : TID_UINT32

which control the flushing to disk. The default value for "Flush period" is 60 seconds, i.e. one minute.

1) All clients call db_flush_database() through their cm_yield() function.
2) db_flush_database() checks "Last flush" and only flushes the ODB when the period has expired. This test is
done inside the ODB semaphore so that we don't get a race condition.
3) If the period has expired, db_flush_database() calls ss_shm_flush().
4) ss_shm_flush() tries to allocate a buffer the size of the shared memory. If the allocation is not successful (out of
memory), ss_shm_flush() writes directly to the binary file as before.
5) If the allocation is successful, ss_shm_flush() copies the shared memory to the buffer and passes this buffer to a
dedicated thread which writes it to the binary file. This causes ss_shm_flush() to return immediately and
not block the calling program during the disk write operation.
6) Added back the "if (destroy_flag) ss_shm_flush()" so that the ODB is flushed for sure before the shared memory
gets deleted.

This means that under normal circumstances, exiting programs like odbedit do NOT flush the ODB. This allows
calling many "odbedit -c" in a row without the flush penalty. Nevertheless, the ODB then gets flushed by other
clients at the latest 60 seconds (or whatever the flush period is) after odbedit exits.
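
In pseudo-code, the logic described above boils down to something like the following (a condensed sketch, not the actual odb.cxx code; names and details are simplified):

/* condensed sketch of the periodic flush check; illustrative only */
#include <cstdio>
#include <ctime>

/* stand-in for ss_shm_flush(): in the real code this copies the shared
   memory to a buffer (if the allocation succeeds) and hands it to a
   dedicated writer thread, so the caller is not blocked by the disk write */
static void trigger_flush()
{
   std::puts("write ODB image to .ODB.SHM here");
}

void db_flush_if_due(std::time_t *last_flush, unsigned period_sec)
{
   /* called from cm_yield(); the check runs while the ODB semaphore is
      held, so only one client triggers the flush when the period expires */
   std::time_t now = std::time(nullptr);
   if (now - *last_flush < (std::time_t)period_sec)
      return;              /* "Flush period" not expired yet, nothing to do */
   *last_flush = now;      /* update "Last flush" before releasing the lock */
   trigger_flush();
}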

Please note that ODB flushing has two purposes:

1) When all programs exit, we need persistent storage for the ODB. In most experiments this only happens very
seldom, maybe at the end of a beam time period.
2) If the computer crashes, a recent version of the ODB is kept on disk to simplify recovery after the crash.

Since crashes are not frequent (during production periods we have maybe one hardware failure every few years),
flushing the ODB too often does not make sense and just consumes resources. Flushing also does not help against
corrupted ODBs, since the binary image will get corrupted as well. So the only reason for periodic flushes is to ease
recovery after a total crash. I set the default to 60 seconds, but if people are really paranoid they can decrease
it to 10 seconds or so, or increase it to 600 seconds if their system does not crash every week and their disks are
slow.

I made a dedicated branch feature/periodic_odb_flush so people can test the new functionality. If there are no 
complaints within the next few days, I will merge that into develop.

Stefan
Entry  06 Jun 2004, Konstantin Olchanski, , Makefile: set -rpath 
I committed Makefile bits to set the RPATH on dynamically linked executables
to find libmidas.so and ROOT shared libraries without setting
LD_LIBRARY_PATH, etc. K.O.