Back Midas Rome Roody Rootana
  Midas DAQ System, Page 34 of 138  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Author Topicdown Subject
  1282   26 Apr 2017 Stefan RittInfoadded db_get_value_string()
Just some thought for discussion:

Rather than "spicing up" the MIDAS library here and there with C++ objects such as std::string, wouldn't it make more sense to "cleanly" wrap an ODB value in a C++ class? We could use then 
both APIs in parallel, and encourage the C++ API for new developments. We could then write things like:

   ODBKEY<std::string> name("/Experiment/Name"); // constructor calls automatically db_get_value
   name = "New Name"; // overloading the "=" operator, will call db_set_value()

or even

   ODBKEY<std::vector, std::string> nameArray("...");
   for (auto &s : nameArray)
      std::cout << s << std::endl; // print all elements of string array

so we treat ODB arrays as vectors, which fixes array boundary violations nicely.

If the key does not exist, we could properly throw exceptions and forget about tons of nested return parameters for error conditions.

Many nice things could be done, common errors could be prevented, and we can do a "smooth" migration: We don't have to change the whole library completely, just where we feel it's currently 
needed. So over time the code would be "objectified". Would be nice if we could rely on C++11 (like the "auto" feature above). Not sure about VxWorks, but every other OS should be fine.

Stefan

> Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).
> 
> It works the same as db_get_value(TID_STRING), except that the string value is returned into an std::string object,
> memory allocation is handled by std::string and there is no string length limit (other than std::string limits).
> 
> Accessing string arrays is done explicitly via an "index" parameter, if index is bigger than odb array size DB_OUT_OF_RANGE is returned
> without logging an error message (e.g. db_get_data_index() will log an error). This makes is safe to iterate over array entries with a simple
> loop of index from 0 and up until db_get returns an error.
> 
> As before, if the odb entry does not exist, it will be created (if create==true) and initialized with the value of the string parameter (zero-terminated in odb).
> 
> There is also newly added db_set_value_string() and cm_get_path_string(). if you want more of these, please ask, or send patches.
> 
> K.O.
  1284   02 May 2017 Konstantin OlchanskiInfoadded db_resize_string()
> Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).

I run into problems with string arrays - non-array strings have unlimited length, but string arrays have fixed string length, usually set at creation time.

This causes a problem with growing arrays using db_get_value_string(), when converting a non-array variable to an array, the wrong
string length gets used, and one gets an array with useless string length. There is no way to specify the correct array string length
without adding more parameters to db_get_value_string() and confusing and complicating it for the typical case where it is used
against simple (non-array) odb entries.

To clarify the situation, db_get_value_string() was changed to reject attempts to resize an array and
calls of db_get_value_string(index>0 and create==TRUE) now return an error.

To create and resize string arrays, I added a new function - db_resize_array(hdb, hkey, key_name, num_values, max_string_size).

Here,
num_values is the new array size, making it possible to grow or shrink an array
max_string_size is the new string size, making it possible to change the array string length after the array was created (there was no midas function to do this before now).

I added a json-rpc call for db_resize_string().

But it still needs to be added to odbedit and mhttpd.

K.O.
  1286   02 May 2017 Konstantin OlchanskiInfomhttpd inline-editor change
I changed the mhttpd odb inline editor to use the json-rpc interface. Good things:

- browser no longer complains about obsolete synchronous ajax calls
- can edit strings of arbitrary length (was limited to the max URL length)
- funny characters " (quote), > and < (angle brackets) are correctly escaped.
- after editing, the actual value from odb is loaded and displayed (confirming that the edit "took").

K.O.
  1289   02 May 2017 Konstantin OlchanskiInfoadded db_get_value_string()
> Just some thought for discussion:

Even more thoughts:

- c++ interface for odb. been there, done that. see VirtualODB in rootana. Can access live ODB, XML odb dump from midas file, even ODB through http/mhttpd (needs to be converted to json rpc api).
- c++11. the ROOT team made the decision for us, for all practical reasons. RH/SL/CentOS <= 6 are left for dead. (but we still have machines as old as SL4).
- odb interface via severe operator overloading. writing "let x=42;" to simulate the universe from the big band to thermal death is elegant (overload operator= of class "let")
  but there is a surprise for naive programmer (long run time, large memory consumption)
- c++ exceptions. defective by design, as they do not carry enough debug information (i.e. java exceptions carry the full stack trace). in the typical case, it is impossible to tell
  who and why is throwing exceptions. error handling is reduced to "main() { try { real_main } catch exception { printf("sorry!"); }}.
  see http://stackoverflow.com/questions/1736146/why-is-exception-handling-bad
- converting midas to a new simplified odb api. typical use via db_get_value() is already one (or two) line of code that cannot be reduced (have to specify odb path, tid, etc),
  so little is gained from using a different api. getting rid of db_find_key()/db_get_key() would be helpful, but with db_get_value(), they are hardly ever used in new code.

There are weaknesses in the current api, would be nice to fix them some day, and a c++ api seems like the right way to go:

- fix the race condition between db_enum_key() and db_delete_key(). (it is same as between "ls" and "rm" - with nfs, try to "rm" on one client while running "ls" on another, fun!)
- fix the race condition between odb handles (pointers into shared memory) and db_delete_key() (and whatever else moves the keys around). This means using full odb paths for
  all odb api functions.
- make it all work nice multithreaded - the above race conditions would become only worse if we encourage heavy use of threads in midas.

And I do need a "no-odb" odb api for my "no-midas" midas frontend framework (where I can build and run the frontend without linking and connecting with a real midas),
in practice it means all api "get" calls have to take a "default" value that is returned right back to me when I am not connected (or linked) with a real odb.

Good fodder for this summer discussions.

K.O.


> 
> Rather than "spicing up" the MIDAS library here and there with C++ objects such as std::string, wouldn't it make more sense to "cleanly" wrap an ODB value in a C++ class? We could use then 
> both APIs in parallel, and encourage the C++ API for new developments. We could then write things like:
> 
>    ODBKEY<std::string> name("/Experiment/Name"); // constructor calls automatically db_get_value
>    name = "New Name"; // overloading the "=" operator, will call db_set_value()
> 
> or even
> 
>    ODBKEY<std::vector, std::string> nameArray("...");
>    for (auto &s : nameArray)
>       std::cout << s << std::endl; // print all elements of string array
> 
> so we treat ODB arrays as vectors, which fixes array boundary violations nicely.
> 
> If the key does not exist, we could properly throw exceptions and forget about tons of nested return parameters for error conditions.
> 
> Many nice things could be done, common errors could be prevented, and we can do a "smooth" migration: We don't have to change the whole library completely, just where we feel it's currently 
> needed. So over time the code would be "objectified". Would be nice if we could rely on C++11 (like the "auto" feature above). Not sure about VxWorks, but every other OS should be fine.
> 
> Stefan
> 
> > Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> > I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).
> > 
> > It works the same as db_get_value(TID_STRING), except that the string value is returned into an std::string object,
> > memory allocation is handled by std::string and there is no string length limit (other than std::string limits).
> > 
> > Accessing string arrays is done explicitly via an "index" parameter, if index is bigger than odb array size DB_OUT_OF_RANGE is returned
> > without logging an error message (e.g. db_get_data_index() will log an error). This makes is safe to iterate over array entries with a simple
> > loop of index from 0 and up until db_get returns an error.
> > 
> > As before, if the odb entry does not exist, it will be created (if create==true) and initialized with the value of the string parameter (zero-terminated in odb).
> > 
> > There is also newly added db_set_value_string() and cm_get_path_string(). if you want more of these, please ask, or send patches.
> > 
> > K.O.
  1294   31 May 2017 Konstantin OlchanskiInfomodified db_watch() arguments
for reasons unknown, db_watch() did not have an "info" parameter passed through to the callback 
handler function, like it is done with db_open_record().

This omission makes it difficult to write db_watch handler functions that must watch multiple odb 
trees - db_watch only delivers the hkey of the modified item inside the tree, leaving us with no 
simple way to tell which tree it came from. An example of this is mfe.c watching the Common 
structure for multiple equipments. There are other
uses for the "info" parameter, for example it is needed to implement c++ wrapper classes.

this omission is now corrected at the cost of changing the definition db_watch().

all uses of db_watch() in the midas tree have been corrected, but all out-of-tree programs
will not compile. For quick conversion, add a NULL parameter to db_watch() calls and add a 
"void*info" parameter to your watch handler function.

sorry about this disturbance,
K.O.
  1305   13 Jul 2017 Konstantin OlchanskiInfoimplemented: json-rpc batch requests
The mhttpd json-rpc interface now implements batch requests per
http://www.jsonrpc.org/specification#batch

In the nutshell, instead of a single request, one can send a json array of requests and receive a json 
array of replies.

As a variance from the spec, the midas implementation executes the requests strictly in-order and 
the array of replies corresponds exactly to the array of requests (the spec requires user to use the 
"id" field to match replies to requests, in midas json-rpc, the 1st reply is always to the 1st request,
2nd reply is to the 2nd request and so forth).

See this in action look at resources/example.html and in resources/transition.html

K.O.
  1307   25 Jul 2017 Stefan RittInfoCurrent git repository "develop" branch broken
Dear all,

we are currently undergoing major modifications in the way mhttpd is working. I realized that 
we are now at a state where mhttpd is currently broken, and it will take a few weeks in order to 
get everything converted to the new scheme we plan to use. Therefore I moved the git branch 
"master" to the last known stable version of midas. So for any practical purpose, please do 
NOT update your "develop" branch until further notice. To get the last stable version, you can 
do a 

$ git checkout master

which moves you right before we started to make major modifications. Once we are finished, 
we will announce this here in the forum.

Best regards,
Stefan
  1310   04 Aug 2017 Konstantin OlchanskiInfoNotes on installing midas from scratch
Notes on installing midas from scratch. The instruction on midaswiki will be synced with this later.

cd ~/packages
git clone ...
cd midas
make
cd ~
mkdir ~/online
cd ~/online
~/git/midas/darwin/bin/odbinit --env
source env.sh
~/git/midas/darwin/bin/odbinit --exptab
~/git/midas/darwin/bin/odbinit
ls -la
send:online olchansk$ ls -la
total 2376
drwxr-xr-x   15 olchansk  staff      510 Aug  4 15:34 .
drwxr-xr-x+ 244 olchansk  staff     8296 Aug  4 15:33 ..
-rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .ALARM.SHM
-rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .ELOG.SHM
-rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .HISTORY.SHM
-rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .MSG.SHM
-rw-r--r--    1 olchansk  staff  1183808 Aug  4 15:34 .ODB.SHM
-rw-r--r--    1 olchansk  staff        8 Aug  4 15:34 .ODB_SIZE.TXT
-rw-r--r--    1 olchansk  staff       15 Aug  4 15:34 .SHM_HOST.TXT
-rw-r--r--    1 olchansk  staff       12 Aug  4 15:34 .SHM_TYPE.TXT
-rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .SYSMSG.SHM
-rw-r--r--    1 olchansk  staff      341 Aug  4 15:33 env.csh
-rw-r--r--    1 olchansk  staff      322 Aug  4 15:33 env.sh
-rw-r--r--    1 olchansk  staff       40 Aug  4 15:34 exptab
-rw-r--r--    1 olchansk  staff      287 Aug  4 15:34 midas.log
send:online olchansk$

odbedit ### works
mhttpd ### bombs, requires SSL certificate https://bitbucket.org/tmidas/midas/issues/57/initial-mhttpd-should-bind-to-localhost
odbedit ### cd /experiment, set "http redirect to https" to no, set "midas https port" to 0
mhttpd ### runs now
connect to http://localhost:8080 ### status page works
restart mhttpd as mhttpd -D
mlogger -D
fetest ### runs, prints time and data
start a run from web page ### works
### fetest generates crazy data rate https://bitbucket.org/tmidas/midas/issues/58/fetest-crazy-data-rate
### go to history, define plot for SLOW/SLOW, see sine wave ### works
### history is written to expt dir, no good, go to "history"
### data files written to expt dir, no good, go to "data"
### midas.log written to data dir, no good (want expt dir)
### elog written to expt dir, go to "elog"
### logger channel config is wrong - gzip compression and crc32c should be enabled by default
### history config is wrong - FILE per-variable history should be enabled by default

K.O.
 
  1311   07 Aug 2017 Stefan RittInfoNotes on installing midas from scratch
Thanks for documenting this in detail. A few suggestions:

- is it really necessary to call odbedit three times? Maybe two or even three functions can be merged. Like you call odbinit, it checks if the environment is 
there, and creates it automatically if not. Same with the exptab.

- can we make "http redirecto to https = n" and "midas https port = 0" as the default? Of course this has to go with binding to localhost only.

- does it make sense to define default directories for history, data files and midas.log? Maybe we could come with a "default scheme" which can then later 
adjusted if needed.

- will you take care of the wrong logger channel config and history config?

Best regards,
Stefan

> Notes on installing midas from scratch. The instruction on midaswiki will be synced with this later.
> 
> cd ~/packages
> git clone ...
> cd midas
> make
> cd ~
> mkdir ~/online
> cd ~/online
> ~/git/midas/darwin/bin/odbinit --env
> source env.sh
> ~/git/midas/darwin/bin/odbinit --exptab
> ~/git/midas/darwin/bin/odbinit
> ls -la
> send:online olchansk$ ls -la
> total 2376
> drwxr-xr-x   15 olchansk  staff      510 Aug  4 15:34 .
> drwxr-xr-x+ 244 olchansk  staff     8296 Aug  4 15:33 ..
> -rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .ALARM.SHM
> -rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .ELOG.SHM
> -rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .HISTORY.SHM
> -rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .MSG.SHM
> -rw-r--r--    1 olchansk  staff  1183808 Aug  4 15:34 .ODB.SHM
> -rw-r--r--    1 olchansk  staff        8 Aug  4 15:34 .ODB_SIZE.TXT
> -rw-r--r--    1 olchansk  staff       15 Aug  4 15:34 .SHM_HOST.TXT
> -rw-r--r--    1 olchansk  staff       12 Aug  4 15:34 .SHM_TYPE.TXT
> -rw-r--r--    1 olchansk  staff        0 Aug  4 15:34 .SYSMSG.SHM
> -rw-r--r--    1 olchansk  staff      341 Aug  4 15:33 env.csh
> -rw-r--r--    1 olchansk  staff      322 Aug  4 15:33 env.sh
> -rw-r--r--    1 olchansk  staff       40 Aug  4 15:34 exptab
> -rw-r--r--    1 olchansk  staff      287 Aug  4 15:34 midas.log
> send:online olchansk$
> 
> odbedit ### works
> mhttpd ### bombs, requires SSL certificate https://bitbucket.org/tmidas/midas/issues/57/initial-mhttpd-should-bind-to-localhost
> odbedit ### cd /experiment, set "http redirect to https" to no, set "midas https port" to 0
> mhttpd ### runs now
> connect to http://localhost:8080 ### status page works
> restart mhttpd as mhttpd -D
> mlogger -D
> fetest ### runs, prints time and data
> start a run from web page ### works
> ### fetest generates crazy data rate https://bitbucket.org/tmidas/midas/issues/58/fetest-crazy-data-rate
> ### go to history, define plot for SLOW/SLOW, see sine wave ### works
> ### history is written to expt dir, no good, go to "history"
> ### data files written to expt dir, no good, go to "data"
> ### midas.log written to data dir, no good (want expt dir)
> ### elog written to expt dir, go to "elog"
> ### logger channel config is wrong - gzip compression and crc32c should be enabled by default
> ### history config is wrong - FILE per-variable history should be enabled by default
> 
> K.O.
>  
  1317   11 Oct 2017 Konstantin OlchanskiInfoadded support for ucLinux
Support for building for ucLinux was added to MIDAS. I use the emcraft toolchain and userland on 
some kind of embedded ARM CPU that does not have an MMU. See the Makefile for details. The 
main difference of ucLinux is lack of fork(), which cannot be done without an MMU. Not everything 
works, but at the least I can run a frontend and connect to an experiment on a remote host 
computer (mserver connection). K.O.
  1318   13 Oct 2017 Konstantin OlchanskiInfoodb multithread support repaired
multithreaded access to odb was implemented back in 2013-2014. but recently a bug surfaced - 
there was a race condition in the odb locking code against cm_watchdog(). Somehow this only 
affected the mserver for the DRAGON experiment at TRIUMF. This is now fixed on the branch 
feature/midas-2017-10. (this branch collects all the code that needs additional testing before 
merging into develop and becoming the next release of midas).
K.O.
  1328   21 Nov 2017 Konstantin OlchanskiInfoMIDAS support on el5?
It has been reported that the current midas release candidate does not build on el5 linux (SL/RHEL/CentOS-5).

According to Red Hat, el5 is end-of-life, last SL 5 (SL5.11) was done in 2014, so this linux is very old. Also as it happens, I do not have access to any 
el5 machines to check if midas builds or runs (but this can be fixed).

https://www.scientificlinux.org/downloads/sl-versions/sl5/
https://access.redhat.com/support/policy/updates/errata

On the midas web page (https://midas.triumf.ca) we do not explicitly state which versions of which linux we definitely support. Most other open-
source projects only support current major linux distributions, hardly anybody supports end-of-life linuxes such as el5. Some projects do not even 
support recent linuxes still widely in use (ROOT6 does not build on stock el6 and there is no KDE5 for el7).

So back to midas. Support for different operating systems comes down to:

1) C/C++ language support. We still use el6 (GCC 4.4.7), so use of c++-11 language features should be avoided
2) operating system features support:
a) sysv semaphores (sysv shared memory no longer used, cannot be used on macos)
aa) (macos also is missing parts of the sysv semaphore api, such as "wait for lock, with timeout", we are using an ugly work-around)
b) posix shared memory with mprotect() & co
c) posix mutexes, including recursive-type mutexes (this seems to be the problem on el5)
d) bsd networking (need to migrate from select() to poll() and from gethostbyname() to getaddrinfo() & co (for IPv6 support))

Not all of these operating system functions are required for all of midas. Running mhttpd and mlogger requires
pretty much everything. Running just a frontend connected to midas through the mserver requires the least features,
just the networking is enough, I think.

Obviously we cannot support midas in perpetuity on all versions of all operating systems, once I do not have
access to a machine, I cannot even check that midas builds and that it runs the basic functions.

Instead, we could provide a "feature reduced" build of midas (makefile target) that includes "just enough" of midas
to (say) run a frontend, maybe even odbedit. We already have some provisions for this, but no obvious documented
way actually doing it.

So back to el5.

How important it is to support very old operating systems?
How many people still use el5?
How about old  versions of Ubuntu? Macos?

If you use anything older than el6, can you speak up,
(and if possible say why you cannot migrate to an up-to-date linux).

K.O.
  1371   08 Jun 2018 Lee PoolInfoMIDAS RTEMS PoRT
Hi,

So I finally got around to "publish" work I did in 2009/2010 with RTEMS.

The work was mainly between myself and Till Straumann (SLAC), and Dr. Joel 
Sherill, to get VME support for vme universe/vme tsi148 ( basic support ), into 
the i386 bsp.

https://bitbucket.org/lcpool2/midas-k600/src/develop/ ( our rtems port ).


What this did was to allow us to run our various VME single board controllers, 
with a single frontend application.

It is still classified testing but its been very successful, so
far, and I hope to use it in the next experiment, if possible.

The midas port, contains a makefile, and some changes to the 
midas.c/system.c/mfe.c files. I've not tested  the full functionality
as I'm super time limited.

Hope this is help full to others...
  1376   20 Jul 2018 Konstantin OlchanskiInfoROOT I/O workshop notable
The ROOT I/O workshop was held on June 20th at CERN. A few things of interest in MIDAS land:

- LZ4 is now used as default compression (replacing gzip-1)
- JSON class streamer is finally implemented (XML streamer updated/reworked)
- recursive read-write lock class implemented
- do not see any special mention of Javascript I/O or jsroot, but jsroot git repo seems to be quite active

Of these the recursive read-write lock is most interesting - using something similar would improve ODB performance
and presumably fix the existing lock fairness problems.

https://root.cern.ch/doc/master/TReentrantRWLock_8hxx_source.html
https://indico.cern.ch/event/715802/contributions/2942560/attachments/1670191/2680682/ROOT_IO_June_Workshop_v2.pdf
https://github.com/root-project/jsroot

K.O.
  1403   24 Oct 2018 Ryu SawadaInfobm_receive_event timeout in ROME
Hi all

There is a bug report in the ROME repository which says bm_receive_event timeouts.

https://bitbucket.org/muegamma/rome3/issues/8/rome-with-midas-produces-timeout-after

Does anybody have any ideas what could causing the problem ?

Ryu
  1410   22 Nov 2018 Konstantin OlchanskiInfostatus of self-signed https certificates
I just happened to check the current situation with self-signed https certificates as implemented in mhttpd.

(To remember, the powers-that-be are pushing for universal use of https for all web access. The https
implementation in mhttpd at the moment can only generate self-signed certificates, so...)

plain unencrypted http:
- both google chrome and firefox say "connection not secure", but connect without any fuss.
- apple safari does not say anything

https with self-signed certificate:
- google chrome goes through an "are you sure?" page, "red not secure" status in toolbar
- firefox does the same thing, requires adding a security exception, but still shows "not secure" status in toolbar
- apple safari goes through a sequence of "are you sure?" pages, asks for the user password to add the self-signed certificate to 
the macos key store, then marks the connection as "secure" (good)

So clearly powers-that-be do not want us to use self-signed certificates for https. (And frown on use of unencrypted
http even for localhost connections). Properly signed certificates can be obtained from letsencrypt almost
automatically, but of course mhttpd needs to know how to use them and how to do handle their automatic renewals.

I plan to update the mongoose web server library inside mhttpd and with luck I will straighten some of this certificate business at 
the same time.

In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
httpd, etc).

K.O.
  1411   30 Nov 2018 Stefan RittInfostatus of self-signed https certificates
> In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
> httpd, etc).

I guess this is what moste people do anyhow these days. Do I understand correctly that this then rules out the usage of letsencrype certificates, since the 
host needs to be accessed from outside, which is not possible if running behind a password protected firewall.

Stefan
  1412   03 Dec 2018 Konstantin OlchanskiInfostatus of self-signed https certificates
> > In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
> > httpd, etc).
> 
> I guess this is what moste people do anyhow these days. Do I understand correctly that this then rules out the usage of letsencrype certificates, since the 
> host needs to be accessed from outside, which is not possible if running behind a password protected firewall.
> 
> Stefan

Careful, firewall != proxy, very different things.

A firewall prevents network communications, period. (Like fences and locked doors, there are good reasons to have them).

An https proxy is a way to have encrypted (protected) web communications with a machine behind a firewall.

Basically, we have 4 main cases, all with trouble.

1) mhttpd running on localhost, "just for testing", is in trouble. there is no simple way to get a "blessed" certificate, and self-signed certificates are now "almost forbidden". http is "okey 
for now", but the writing is on the wall. There is no special exception for "local-only" connections.

2a) mhttpd running on an internet-connected machine, with apache httpd, our best case. To get this working one has to configure both apache httpd and the "blessed certificate" 
certbot tool. With luck, both tools work smoothly on current OSes (they do NOT).

2b) same, but without apache httpd. One still has to run certbot, and the "glue" between mhttpd and certbot is currently missing: need a way to point mhttpd to the certbot certificate 
files and a way to reload mhttpd when the certificate is auto-renewed.

3) mhttpd running on a machine behind a corporate firewall. worst case. if firewall Gods make an opening for ports 80 and 443, it becomes case (2a/b), otherwise, one must use some 
kind of https proxy. (Plus there is no trivial way to setup an encrypted secure communication channel between mhttpd and this proxy, a double bad).

K.O.

P.S. I guess one can use nginx as the https proxy instead of apache httpd. I did not try yet. My impression is that everybody uses nginx, except for people who started with apache httpd 
and are too lazy to try nginx.

K.O.
  1413   05 Dec 2018 Konstantin OlchanskiInfoPartial refactoring of ODB code
The current ODB code has several structural problems and I think I now figured out how to straighten them out.

Here is the problems:

a) nested (recursive) odb locks
b) no clear separation between read-only access and read-write access
c) no clear separation between odb validation and repair functions
d) cm_msg() is called while holding a database lock

Discussion:

a) odb locks are nested because most functions lock the database, then call other functions that lock the database again. Most locking primitives - SystemV 
semaphores, POSIX semaphores and mutexes - usually do not permit nested (recursive) locking.

For locking the odb shared memory we use a SystemV semaphore with recursion implemented "by hand" in ss_semaphore_wait_for(). This works ok.

For making odb thread-safe, we use POSIX mutexes, and we rely on an optional feature (PTHREAD_MUTEX_RECURSIVE) which seems to work on most OSes, but 
is not required to exist and work by any standard. For example, recursive mutexes do not work in uclinux (linux for machines without an MMU).

I looked at implementing recursive mutexes "by hand", same as we have the recursive semaphores, and realized that it is quite complicated and computationally 
expensive (read: inefficient). (Also I think nested and recursive locks is "not mainstream" and should rather be avoided). As an example you can see full 
complexity of a nested lock as recent implementation in ROOT. (good luck finding it).

A solution for this problem is well known. All functions are separated into "unlocked" user-callable functions and "locked" internal functions. Nested locking is 
naturally eliminated.

Call sequences:
db_get_key() -> db_find_key() // odb is locked twice
become
db_get_key() -> db_get_key_locked() -> db_find_key_locked() // odb is locked once

Actual implementation of this scheme turns out to be a very clean and mechanical refactoring (moving the code without changing what it does).

As a try, I refactored db_find_key() and db_get_key() and I like the result. Locking is now obvious - obscure error paths with hidden "unlock before return"  - are all 
gone. Extra conversions between hDB and pheader are gone.

b) in this refactoring, functions that do not (should not) modify odb become easy to identify - the pheader argument is tagged "const".

This simplifies the implementation of "write-protected" odb - instead of ad-hoc db_allow_write_locked() sprinkled everywhere, one can have obvious calls to 
"db_lock_read_only()" and "db_lock_read_write()".

Separation of locks into "read" and "write" locks, in turn, improves locking behaviour - helps against problems like lock starvation - which we did see with MIDAS - 
as "read" locks are much more efficient - all readers can read the data at the same time, locking is only done when somebody need to "write".

c) some db_validate() functions also try to do repair. this cannot work if validation is called from "read-only" functions like db_find_key(). I now think the "repair" 
functions should be separate from "validate" functions. validate functions should detect problems, repair functions would repair them. The question remains - 
when is good time to run a full repair. (probably at the time when we connect to the database - this way, simply starting "odbedit" will force a database check and 
repair).

d) calls to cm_msg() when odb is locked has been a problem for a long time. because cm_msg() itself calls odb and because it also calls event buffer code 
(SYSMSG buffer) which in turn call odb functions, there was trouble with deadlocks between ODB and event buffer semaphores, trouble with recursive use of 
ODB, etc.

Right now we have all this partially papered over by having cm_msg() put messages into a memory buffer that we periodically flush, but I was never super happy 
with that solution. For example, if we crash before the message buffer is flushed, all error messages are lost, they do not go into midas.log, they are not printed on 
the screen, they are not accessible in the core dump.

To resolve this problem, I have all "locked" functions call db_msg() instead of cm_msg(). db_msg() saves the messages in a linked list which is flushed into 
cm_msg() immediately after we unlock odb.

If we crash after generating an error message but before it is flushed to cm_msg(), we can still access it through the linked list inside the core dump. This is an 
improvement over what we have now. Ideally, all messages should be printed to the terminal and saved to midas.log and pushed into SYSMSG, but most of this is 
impractical at a moment when odb is locked - as we already know it leads to deadlocks and other trouble...

Bottom line, I now have a path to improve the odb code and to resolve some of the long standing structural problems.

K.O.
  1414   11 Dec 2018 Stefan RittInfoPartial refactoring of ODB code
All makes sense to me. I agree to proceed with the refactoring.

One additional comment: In the 90's when I developed this code, locking was expensive. On a decent computer you could do a couple of thousand lock operations per second before you hit the 100% 
CPU limit. Therefore I tried to reduce the number of lock operation as much as possible. Like a db_find_key locks the ODB once and then goes through all keys before it unlocks again. If I would lock for 
every key and have an ODB with ten thousands of keys, that would have taken very long in the old days. 

Now the world has changed, we can do almost a million locks a second. So a db_get_record() does not have to obtain a whole directory in one go, but can get each value separately, and if necessary lock 
the ODB on each key access. This would be slower, but only a negligible amount these days. So in the spirit of making midas more robust, we can even go a step beyond simple refactoring and change the 
locking scheme if it becomes more transparent and stable.

Best,
Stefan
ELOG V3.1.4-2e1708b5