  Midas DAQ System, Page 114 of 121
ID   Date   Author   Topic   Subject
  1166   26 Feb 2016   Konstantin Olchanski   Suggestion   script command limited to 256 characters; remove limit?
> Using low-level memory allocation routines in higher-level programs like mhttpd makes me nervous.

It should not, people have used malloc() for decades now without much injury to themselves. (Thomas corrects me: some people had big injury to their pride, me included).

> We could use vector arrays to allow variable-sized allocation, and use the data() member function to access the char* needed for functions like strlcat,
> db_get_data, and db_sprintf.

I thought auto_ptr was the correct tool to allocate "I just need a few bytes for a few minutes" arrays, but there is some discrepancy
between delete and delete[] (with brackets) and auto_ptr p(new char[i]) is verboten (even though it compiles just fine).
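
For what it is worth, since C++11 the delete versus delete[] mismatch has a standard answer: the array form std::unique_ptr<char[]> releases its buffer with delete[]. A minimal sketch, where make_scratch_copy is a hypothetical helper and not a MIDAS function:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper (not part of MIDAS): copy a C string into a heap
// buffer of exactly the needed size. The array form of unique_ptr
// releases the buffer with delete[] on every return path.
std::unique_ptr<char[]> make_scratch_copy(const char* src)
{
    const std::size_t n = std::strlen(src) + 1;   // include the terminating NUL
    std::unique_ptr<char[]> buf(new char[n]);
    std::memcpy(buf.get(), src, n);
    return buf;                                   // ownership moves to the caller
}
```

buf.get() yields the char* that C-style functions expect, and nothing has to be freed by hand.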

I ended up writing a custom replacement for auto_ptr called auto_string - now in mhttpd.cxx available for use in other places like this.

Still I think a db_get_data() that returns allocated memory is the correct solution. But this memory still needs to be released and lacking auto_ptr it opens the door for memory leaks.

> This conforms to the c++ standard, but doesn't require explicit freeing by the user - at least, not when you're allocating std::vector<char>

I do not think std::vector<char> can be cast into "char*" and used as replacement of "char str[100]" or "char* str = malloc(i);"
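
On this point: a std::vector<char> indeed cannot be cast to "char*", but its data() member returns a char* to contiguous storage, which is what the quoted suggestion relies on. A minimal sketch, assuming a C-style API that fills a caller-supplied buffer; fill_buffer is a hypothetical stand-in for such a call (for example db_get_data), not the real function:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a C API that fills a caller-supplied buffer.
// Returns the number of bytes needed, including the terminating NUL.
std::size_t fill_buffer(char* buf, std::size_t size)
{
    static const char value[] = "a command string that may be longer than 256 characters";
    if (buf != nullptr && size > 0) {
        std::strncpy(buf, value, size - 1);
        buf[size - 1] = '\0';            // always terminate, even when truncated
    }
    return sizeof(value);
}

std::string read_value()
{
    const std::size_t needed = fill_buffer(nullptr, 0);  // ask for the size first
    std::vector<char> buf(needed);                        // sized at run time
    fill_buffer(buf.data(), buf.size());                  // data() yields char*
    return std::string(buf.data());                       // the vector frees itself
}
```

The vector releases its storage when it goes out of scope, so there is no free() to forget on any return path.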

In other news, the limit on the command length is now removed.

K.O.

> 
> Amy
> 
> > Thank you for reporting this problem:
> > 
> > a) ODB key *names* are restricted to 31 characters (32 bytes, last byte is a NUL), not 256 characters.
> > b) ODB string length is unlimited (32-bit length field)
> > c) ODB C API "db_get_value" & co require fixed length buffer and most users of this API provide a 256-byte fixed buffer for strings, some of them also do not 
> > check the status code, resulting in silent truncation. (I think the ODB functions themselves report truncation to midas.log, so not completely silent).
> > 
> > We try to fix this where we must - but it is cumbersome with the current ODB API - as in your fix one has to:
> > - get the ODB key, extract size
> > - allocate buffer
> > - call db_get_value() & co
> > - use the data
> > - remember to free the buffer on each and every return path
> > 
> > The first three steps could become one if we had an ODB "get_data" function that automatically allocated the data buffer.
> > 
> > But the main source of bugs will be the last step - remember to free the buffer, always.
> > 
> > P.S.
> > 
> > We are not alone in pondering how to do this best. If you want to see it "done right",
> > read the fresh-off-the-presses book "Go Programming Language" by Alan Donovan and Brian Kernighan,
> > http://www.gopl.io/
> > 
> > Brian Kernighan is the "K" in K&R "C programming language", still around and kicking, now at Google.
> > Sadly the "R" passed away in 2011 - http://www.nytimes.com/2011/10/14/technology/dennis-ritchie-programming-trailblazer-dies-at-70.html
> > 
> > K.O.
> > 
> > > Both the /Script and /CustomScript trees in the ODB allow users to trigger a 
> > > script via Midas - which silently truncates command strings longer than 
> > > 256 characters.
> > > 
> > > I'd prefer that Midas place no limit on string length.  Failing that, it would be
> > > helpful to have character limits called out in the documentation 
> > > (https://midas.triumf.ca/MidasWiki/index.php//Script_ODB_tree#.3Cscript-name.3E_key_or_subtree,
> > > https://midas.triumf.ca/MidasWiki/index.php//Customscript_ODB_tree).
> > > 
> > > As far as I can tell, odb.c allows arbitrarily large strings in the ODB data.  
> > > (Although key *names* are restricted to 256 characters.)  I've submitted one 
> > > possible version of an arbitrary-length exec_script() as a pull request 
> > > (https://bitbucket.org/tmidas/midas/pull-requests/).
> > > 
> > > Am I misunderstanding any critical pieces?  Does Midas intentionally treat 
> > > strings in the ODB as limited to 256 characters?
  337   05 Feb 2007   Fedor Ignatov   Bug Report   segmentation violation of analyzer on a x86_64
Hello,

When I connect to the analyzer on an x86_64 processor (with Roody),
the analyzer crashes with a segmentation violation in the root_server_thread function.
The same code works fine on a 32-bit processor.
As far as I can tell, the problem is in the exchange of pointers between the analyzer and the client.
Before a pointer is sent, it is stored in an int (size 4 instead of 8) at
this place:
Index: src/mana.c
===================================================================
--- src/mana.c  (revision 3498)
+++ src/mana.c  (working copy)
@@ -5386,7 +5386,7 @@

             //write pointer
             message->Reset(kMESS_ANY);
-            int p = (POINTER_T) obj;
+            POINTER_T p = (POINTER_T) obj;
             *message << p;
             sock->Send(*message);
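
Why the one-line change matters can be sketched in isolation. The snippet below is illustrative only: it uses std::uintptr_t as a stand-in for the POINTER_T type in the patch, and both function names are made up for this example.

```cpp
#include <cstdint>

// Store an address in a 32-bit integer, as the old "int p" did on x86_64,
// then widen it again the way the receiving side would see it.
std::uint64_t through_32_bits(std::uint64_t address)
{
    const std::uint32_t narrow = static_cast<std::uint32_t>(address); // upper half is lost
    return narrow;
}

// Store an address in a pointer-sized integer; nothing is lost.
void* through_pointer_sized(void* obj)
{
    const std::uintptr_t p = reinterpret_cast<std::uintptr_t>(obj);
    return reinterpret_cast<void*>(p);
}
```

On a 32-bit machine both paths give the same value, which is why the bug only shows up on x86_64.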


Sincerely Yours,
Fedor Ignatov 
  341   06 Feb 2007   Stefan Ritt   Bug Report   segmentation violation of analyzer on a x86_64
> Hello,
> 
> When I connect to the analyzer on an x86_64 processor (with Roody),
> the analyzer crashes with a segmentation violation in the root_server_thread function.
> The same code works fine on a 32-bit processor.
> As far as I can tell, the problem is in the exchange of pointers between the analyzer and the client.
> Before a pointer is sent, it is stored in an int (size 4 instead of 8) at
> this place:
> Index: src/mana.c
> ===================================================================
> --- src/mana.c  (revision 3498)
> +++ src/mana.c  (working copy)
> @@ -5386,7 +5386,7 @@
> 
>              //write pointer
>              message->Reset(kMESS_ANY);
> -            int p = (POINTER_T) obj;
> +            POINTER_T p = (POINTER_T) obj;
>              *message << p;
>              sock->Send(*message);
> 
> 
> Sincerely Yours,
> Fedor Ignatov 

Do I understand you right? With your patch it works even on 64 bit, right? Or do you
mean there is still a segmentation violation? Anyhow I committed your patch since the
"int" is clearly incorrect.

- Stefan
  342   06 Feb 2007   Fedor Ignatov   Bug Report   segmentation violation of analyzer on a x86_64
Yes, right. The segmentation violation is solved with this patch. Now it works
fine on x86_64.

Fedor 

> Do I understand you right? With your patch it works even on 64 bit, right? Or do you
> mean there is still a segmentation violation? Anyhow I committed your patch since the
> "int" is clearly incorrect.
> 
> - Stefan
  345   17 Feb 2007   Konstantin Olchanski   Bug Report   segmentation violation of analyzer on a x86_64
> Yes, right. The segmentation violation is solved with this patch. Now it works
> fine on x86_64.

Right. I confirm this. I have this exact same fix in my stand-alone copy of the midas
histogram server, and should commit it to MIDAS CVS as well.

K.O.
  2270   19 Aug 2021   Konstantin Olchanski   Bug Report   select() FD_SETSIZE overrun
I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
mysteriously misbehaving during run start and stop.

The problem turns out to be with the select() system call.

The corresponding FD_SET(), FD_ISSET() & co operate on an array of fixed size 
FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
the FD_SET() array. Ouch.

I see that all uses of select() in midas have no protection against this.

(we should probably move away from select() to newer poll() or whatever it is)

Why does mlogger open so many file descriptors? The usual, scaling problems in the 
history. The old midas history does not reuse file descriptors, so opens the same 
3 history files (.hst, .idx, etc) for each history event. The new FILE history 
opens just one file per history event. But if the number of events is bigger than 
1024, we run into same trouble.

(BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
on some other machines, see "limit" or "ulimit -a").

K.O.
  2271   20 Aug 2021   Stefan Ritt   Bug Report   select() FD_SETSIZE overrun
> I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
> mysteriously misbehaving during run start and stop.
> 
> The problem turns out to be with the select() system call.
> 
> The corresponding FD_SET(), FD_ISSET() & co operate on an array of fixed size 
> FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
> the FD_SET() array. Ouch.
> 
> I see that all uses of select() in midas have no protection against this.
> 
> (we should probably move away from select() to newer poll() or whatever it is)
> 
> Why does mlogger open so many file descriptors? The usual, scaling problems in the 
> history. The old midas history does not reuse file descriptors, so opens the same 
> 3 history files (.hst, .idx, etc) for each history event. The new FILE history 
> opens just one file per history event. But if the number of events is bigger than 
> 1024, we run into same trouble.
> 
> (BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
> on some other machines, see "limit" or "ulimit -a").
> 
> K.O.

I cannot imagine that you have more than 1024 different events in ALPHA. That wouldn't 
fit on your status page. 

I have some other suspicion: The logger opens a history file on access, then closes it 
again after writing to it. In the old days we had a case where we had a return from the 
write function BEFORE the file has been closed. This is kind of a memory leak, but with 
file descriptors. After some time of course you run out of file descriptors and crash. 
Now that bug has been fixed many years ago, but it sounds to me like there is another 
"fd leak" somewhere. You should add some debugging in the history code to print the 
file descriptors when you open a file and when you leave that routine. The leak could 
however also be somewhere else, like writing to the message file, ODB dump, ...

The right thing of course would be to rewrite everything with std::ofstream which 
automatically closes the file when the object goes out of scope.
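
A minimal sketch of that idea, with a hypothetical append_line() helper that is not part of the midas history code. The early return is the kind of path where a manually managed descriptor would leak:

```cpp
#include <fstream>
#include <string>

// Append one line to a file. The std::ofstream destructor closes the
// underlying file on every return path, including the early ones.
bool append_line(const std::string& path, const std::string& line)
{
    std::ofstream out(path, std::ios::app);
    if (!out.is_open())
        return false;          // nothing was opened, nothing can leak
    if (line.empty())
        return true;           // early return: the file is still closed
    out << line << '\n';
    out.flush();               // push the data out so errors show up in the state
    return out.good();
}
```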

Stefan
  859   11 Feb 2013   Wes Gohn   Forum   send_tcp error
I am getting a series of errors from MIDAS that I do not understand, so I hope
someone can help me figure this out.

I am attempting to run many frontends on one machine. I can run 8 with no
problem, but if I try to add a 9th I get errors relating to send_tcp. 

I have tried adjusting the max event sizes and buffer sizes, but it has not
resolved the problem. I also tried adjusting the data rates and the total data
volume going through each frontend, but there was no change. And as far as I can
tell I am not up against any hardware limits.

The errors are repeated continuously while a run is going. The three errors I
get are:

16:45:22 [FakeData09,ERROR] [midas.c:9958:rpc_client_call,ERROR] send_tcp() failed
16:45:22 [FakeData09,ERROR] [frontend_rpc.c:191:rpc_call,ERROR] No RPC to master
16:45:22 [FakeData09,ERROR] [system.c:4166:send_tcp,ERROR]
send(socket=9,size=16) returned -1, errno: 32 (Broken pipe)

If you have any suggestions of how I can debug this, please let me know. Thanks!
  860   11 Feb 2013   Stefan Ritt   Forum   send_tcp error
> I am getting a series of errors from MIDAS that I do not understand, so I hope
> someone can help me figure this out.
> 
> I am attempting to run many frontends on one machine. I can run 8 with no
> problem, but if I try to add a 9th I get errors relating to send_tcp. 
> 
> I have tried adjusting the max event sizes and buffer sizes, but it has not
> resolved the problem. I also tried adjusting the data rates and the total data
> volume going through each frontend, but there was no change. And as far as I can
> tell I am not up against any hardware limits.
> 
> The errors are repeated continuously while a run is going. The three errors I
> get are:
> 
> 16:45:22 [FakeData09,ERROR] [midas.c:9958:rpc_client_call,ERROR] send_tcp() failed
> 16:45:22 [FakeData09,ERROR] [frontend_rpc.c:191:rpc_call,ERROR] No RPC to master
> 16:45:22 [FakeData09,ERROR] [system.c:4166:send_tcp,ERROR]
> send(socket=9,size=16) returned -1, errno: 32 (Broken pipe)
> 
> If you have any suggestions of how I can debug this, please let me know. Thanks!

Can you tell me

- why you need 9 frontends
- what kind of data your frontends produce
- what your event builder looks like and how you assemble the fragments
- what messages/errors you see when you run odbedit BEFORE the crash

/Stefan
  861   11 Feb 2013   Wes Gohn   Forum   send_tcp error
> > I am getting a series of errors from MIDAS that I do not understand, so I hope
> > someone can help me figure this out.
> > 
> > I am attempting to run many frontends on one machine. I can run 8 with no
> > problem, but if I try to add a 9th I get errors relating to send_tcp. 
> > 
> > I have tried adjusting the max event sizes and buffer sizes, but it has not
> > resolved the problem. I also tried adjusting the data rates and the total data
> > volume going through each frontend, but there was no change. And as far as I can
> > tell I am not up against any hardware limits.
> > 
> > The errors are repeated continuously while a run is going. The three errors I
> > get are:
> > 
> > 16:45:22 [FakeData09,ERROR] [midas.c:9958:rpc_client_call,ERROR] send_tcp() failed
> > 16:45:22 [FakeData09,ERROR] [frontend_rpc.c:191:rpc_call,ERROR] No RPC to master
> > 16:45:22 [FakeData09,ERROR] [system.c:4166:send_tcp,ERROR]
> > send(socket=9,size=16) returned -1, errno: 32 (Broken pipe)
> > 
> > If you have any suggestions of how I can debug this, please let me know. Thanks!
> 
> Can you tell me
> 
> - why you need 9 frontends
> - what kind of data your frontends produce
> - what your event builder looks like and how you assemble the fragments
> - what messages/errors you see when you run odbedit BEFORE the crash
> 
> /Stefan

Our experiment will need 24 frontends that will each run on its own machine. For now we
want to run 24 "fake" frontends on one machine for testing purposes. 9 is the limit
where it stops working properly. 

We have a pulser that is giving us periodic data at a constant rate. We have a master
frontend running on a different PC in interrupt mode that assembles the events, and then
N "FakeData" frontends running in polled mode on a single PC. 

We do have an event builder, but we get these errors whether the event builder is
running or not.

At the start of a run, I see the following messages:

[mtransition,INFO] Run #21 started
Sat Feb 9 16:14:57 2013 [FakeData09,ERROR] [system.c:4166:send_tcp,ERROR]
send(socket=9,size=16) returned -1, errno: 104 (Connection reset by peer)
Sat Feb 9 16:14:57 2013 [FakeData09,ERROR] [midas.c:9958:rpc_client_call,ERROR]
send_tcp() failed
Sat Feb 9 16:14:57 2013 [FakeData09,ERROR] [frontend_rpc.c:191:rpc_call,ERROR] No RPC to
master
Sat Feb 9 16:14:57 2013 [master,ERROR] [midas.c:10844:recv_tcp_server,ERROR] Cannot
allocate 268435512 bytes for network buffer
Sat Feb 9 16:14:57 2013 [master,ERROR] [midas.c:12893:rpc_server_receive,ERROR]
recv_tcp_server() returned -1, abort
Sat Feb 9 16:14:57 2013 [master,TALK] Program 'FakeData09' on host 'fe01' aborted

After this it recycles just the first three errors that I mentioned above.
  862   12 Feb 2013   Stefan Ritt   Forum   send_tcp error
Ok, now the picture is clearer. I have however no idea what the real problem is. The number of concurrent programs in midas is 64 as defined in midas.h (MAX_CLIENTS) so that should not be the problem. In our experiment we run 10 front-ends (but 
on 10 different machines) without problems. Other experiments used 27 front-ends.

The TCP error you see comes probably from the fact that the mserver side crashes or quits, then the socket gets broken. What you can try to debug this is to run mserver manually. Just remove mserver from inetd, and start it with "mserver -d" and 
watch what happens. Do you see any additional error messages? If the mserver segfaults, you should turn on core dumps and have a look there. Note that the mserver starts a child process on each incoming connection, so running mserver in gdb 
does not really help, since the child processes (which connect back to the front-ends) are not seen by gdb.

Have you tried to run the 9 front-ends on maybe two different PCs (5 and 4) to see if the problem is on the client side?


Best regards,
Stefan
  865   19 Feb 2013   Wes Gohn   Forum   send_tcp error

Thank you for the help. As it turns out, the problem was that we were compiling MIDAS on our 64-bit backend machine, but one of the frontend machines is 32-bit. The problem was resolved by compiling a 32-bit version of MIDAS in
addition to the 64-bit version.
  101   20 Nov 2003   Konstantin Olchanski   set-uid-root midas programs
I see that MIDAS installs several set-uid-root programs into /usr/local/bin.
In this age and time of evil computer hackers, this is not a good idea and
we should Do Something (TM) about it. Here is my risk assessment:

[olchansk@midtis06 midas]$ ls -l /usr/local/bin | grep wsr
-rwsr-sr-x    1 root     root        25811 Nov 20 09:27 dio
-rwsr-sr-x    1 root     root       344553 Nov 20 09:27 mhttpd
-rwsr-sr-x    1 root     root        70736 Nov 20 09:27 webpaw

dio- is required to be setuid-root to gain I/O permissions. I looked at it a
few times, and it is probably safe, but I would like to get a second
opinion. Stefan, can you show it to your local security geeks?

mhttpd- definitely unsafe. It has more buffer overflows than I can shake a
stick at. Why is it suid-root anyway?

webpaw- what is it?!?

K.O.
  102   20 Nov 2003   Stefan Ritt   set-uid-root midas programs
> dio- is required to be setuid-root to gain I/O permissions. I looked at it a
> few times, and it is probably safe, but I would like to get a second
> opinion. Stefan, can you show it to your local security geeks?
> 
> mhttpd- definitely unsafe. It has more buffer overflows than I can shake a
> stick at. Why is it suid-root anyway?
> 
> webpaw- what is it?!?

dio was written by Pierre. 

mhttpd and webpaw both are web servers. webpaw is used to display PAW 
pictures over the web. If you run these programs at a port <1024, and most 
people do run them at port 80 (at least at PSI), they need to be setuid-root. 
Unless you know a better way to do that...
  2115   02 Mar 2021   Konstantin Olchanski   Info   shortest possible sleep
Since I am implementing a polled equipment, I was curious what the smallest possible sleep time is on current computers.

In current UNIX, there are 2 system calls available for sleeping: select() (with microsecond granularity) and nanosleep() (with nanosecond granularity).

So I wrote a little test program to check it out (progs/test_sleep).

First, Linux result using select(). Typical run on AMD 3700X CPU (4.1 GHz turbo boost) with Ubuntu LTS 20, linux kernel 5.8:

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1003368.855 usec actual, 100336.885 usec actual per loop, oversleep 336.885 usec, 0.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1008512.020 usec actual, 10085.120 usec actual per loop, oversleep 85.120 usec, 0.9%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1062137.842 usec actual, 1062.138 usec actual per loop, oversleep 62.138 usec, 6.2%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1528650.999 usec actual, 152.865 usec actual per loop, oversleep 52.865 usec, 52.9%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250898.123 usec actual, 62.510 usec actual per loop, oversleep 52.510 usec, 525.1%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 54056918.144 usec actual, 54.057 usec actual per loop, oversleep 53.057 usec, 5305.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   210875.988 usec actual, 0.211 usec actual per loop, oversleep 0.111 usec, 110.9%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   204804.897 usec actual, 0.205 usec actual per loop, oversleep 0.195 usec, 1948.0%
daq13:midas$ 

How to read this:

The first line is 10 sleeps of 100 ms, for a total of 1 sec. This actually sleeps for a bit longer:
the average over-sleep of about 300 usec out of 100 ms is 0.3%.

Next few lines use progressively shorter sleep, 10 ms, 1 ms and 0.1 ms. over-sleep is consistently around 50-60 usec,
which I conclude to be this linux sleep granularity.

The last two lines try to sleep for 0.1 usec and 0.01 usec, resulting in a zero-time sleep of select(),
so we just measure the average time cost of a Linux syscall, around 200 ns on this machine.
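
The measurement described above can be sketched roughly as follows. This is not the actual progs/test_sleep source; measure_select_sleep is a hypothetical name, and the over-sleep is the returned value minus the requested sleep.

```cpp
#include <chrono>
#include <sys/select.h>
#include <sys/time.h>

// Sleep `loops` times for `usec` microseconds each, using select() with
// no file descriptors, and return the average actual time per loop in usec.
double measure_select_sleep(int loops, long usec)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < loops; i++) {
        struct timeval tv;
        tv.tv_sec  = usec / 1000000;
        tv.tv_usec = usec % 1000000;   // select() may modify tv, so set it each time
        select(0, nullptr, nullptr, nullptr, &tv);
    }
    const auto stop = std::chrono::steady_clock::now();
    const double total_usec =
        std::chrono::duration<double, std::micro>(stop - start).count();
    return total_usec / loops;
}
```

Calling it with a sleep of 0 usec measures only the cost of the select() call itself, which is what the "zero-sleep" numbers refer to.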

Going to different machines:

Intel E-2236 (4.8 GHz turbo boost), Ubuntu LTS 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 400 ns.
Intel E-2226G (same, see ark.intel.com), CentOS-7, linux kernel 3.10: over-sleep is 60 usec, zero-sleep is 600 ns.
VME processor (2 GHz Intel T7400), Ubuntu 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 1700 ns.

This is pretty consistent, select() over-sleep is 60 usec on all hardware, zero-sleep tracks CPU GHz ratings.

Next, MacOS result, MacBookAir2020, MacOS 10.15.7, CPU 1.2 GHz i7-1060G7:

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1031108.856 usec actual, 103110.886 usec actual per loop, oversleep 3110.886 usec, 3.1%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1091104.984 usec actual, 10911.050 usec actual per loop, oversleep 911.050 usec, 9.1%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1270800.829 usec actual, 1270.801 usec actual per loop, oversleep 270.801 usec, 27.1%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1370345.116 usec actual, 137.035 usec actual per loop, oversleep 37.035 usec, 37.0%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  1706473.112 usec actual, 17.065 usec actual per loop, oversleep 7.065 usec, 70.6%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  5150341.034 usec actual, 5.150 usec actual per loop, oversleep 4.150 usec, 415.0%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   595654.011 usec actual, 0.596 usec actual per loop, oversleep 0.496 usec, 495.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   591560.125 usec actual, 0.592 usec actual per loop, oversleep 0.582 usec, 5815.6%
4ed0:midas olchansk$ 

Things are quite different here: the OS is a Mach microkernel with an oldish FreeBSD UNIX single-server (from NeXTSTEP),
so the sleep granularity is different, better than Linux. Zero-sleep still measures the syscall time, 600 ns on this machine.

Next we measure the same using the nanosleep() syscall.

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1004133.940 usec actual, 100413.394 usec actual per loop, oversleep 413.394 usec, 0.4%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1046117.067 usec actual, 10461.171 usec actual per loop, oversleep 461.171 usec, 4.6%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1096894.979 usec actual, 1096.895 usec actual per loop, oversleep 96.895 usec, 9.7%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1526744.843 usec actual, 152.674 usec actual per loop, oversleep 52.674 usec, 52.7%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250154.018 usec actual, 62.502 usec actual per loop, oversleep 52.502 usec, 525.0%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 53344123.125 usec actual, 53.344 usec actual per loop, oversleep 52.344 usec, 5234.4%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total, 52641665.936 usec actual, 52.642 usec actual per loop, oversleep 52.542 usec, 52541.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 52637501.001 usec actual, 52.638 usec actual per loop, oversleep 52.628 usec, 526275.0%
daq13:midas$ 

Here everything is simple. Sleeps longer than 1000 usec work the same as select(); sleeps shorter than 100 usec sleep for 52 usec, regardless of what 
we ask for.

MacOS does no better: long sleeps are the same as select(), sleeps of 1 usec or less sleep for too long. No improvement over select().

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1023327.827 usec actual, 102332.783 usec actual per loop, oversleep 2332.783 usec, 2.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1130330.086 usec actual, 11303.301 usec actual per loop, oversleep 1303.301 usec, 13.0%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1333846.807 usec actual, 1333.847 usec actual per loop, oversleep 333.847 usec, 33.4%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1402330.160 usec actual, 140.233 usec actual per loop, oversleep 40.233 usec, 40.2%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  2034706.831 usec actual, 20.347 usec actual per loop, oversleep 10.347 usec, 103.5%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  6646192.074 usec actual, 6.646 usec actual per loop, oversleep 5.646 usec, 564.6%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,  7556284.189 usec actual, 7.556 usec actual per loop, oversleep 7.456 usec, 7456.3%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 15720005.035 usec actual, 15.720 usec actual per loop, oversleep 15.710 usec, 157100.1%
4ed0:midas olchansk$ 

On Linux, strace tells us that the actual syscall behind nanosleep() is this:
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=10000}, 0x7fffc159e200) = 0

Let's try it directly... result is the same.
Let's try it with CLOCK_MONOTONIC... result is the same.

The man page of clock_nanosleep() specifies that this syscall always suspends the calling thread,
so what we see here is the Linux scheduler tick size.
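
For reference, calling it directly might look like the sketch below. It is Linux-specific (clock_nanosleep() is not available on MacOS), and sleep_on_clock is a hypothetical wrapper name, not the test program's code.

```cpp
#include <time.h>

// Sleep for `nsec` nanoseconds (must be below one second) on the given
// clock, as a relative sleep. Returns 0 on success or an error number.
int sleep_on_clock(clockid_t clock, long nsec)
{
    struct timespec ts;
    ts.tv_sec  = 0;
    ts.tv_nsec = nsec;
    return clock_nanosleep(clock, 0, &ts, nullptr);   // flags = 0 means relative
}
```

sleep_on_clock(CLOCK_REALTIME, 10000) corresponds to the strace line above, and passing CLOCK_MONOTONIC selects the other clock.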

Bottom line.

On current Linux, the shortest sleep is around 100 usec with both select() and nanosleep().
On MacOS, shortest sleep is down to 5 usec using select(), but I cannot tell if CPU sleeps or busy-loops.

select() is still the best syscall for sleeping.

K.O.
  2116   02 Mar 2021   Stefan Ritt   Info   shortest possible sleep
Why do you need that? Periodic equipment typically runs every ten seconds or so, meaning one can do this easily in a scheduler.

For polled equipment, you don't want to sleep at all. Because if you sleep, you might miss an event. That's why I put my poll in mfe.c into a for() loop. No 
sleep, maximum polling rate. I just double checked on my macbook air. 

- If poll is always false (no event available), the loop executes 50M times in 100ms (calibrated during startup of the frontend). That means one iteration 
takes 2ns (!). So if an event occurs, the readout is started with a 2ns overhead. No sleep can beat that. In a real world application, one has to add of course 
the VME access or so to poll for the event.

- If poll is always true, the framework generates about 700k events each second (returning just a few bytes of event data).
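
The calibration mentioned in the first point can be sketched as below. This is not the mfe.c code, only one way to do it under the assumption that the poll routine is cheap and side-effect free; both function names are hypothetical. The arithmetic behind the 2 ns figure is 100 ms / 50M iterations = 2 ns per iteration.

```cpp
#include <chrono>

// Hypothetical poll routine: reports "no event" every time.
inline bool poll_no_event() { return false; }

// Find roughly how many calls of `poll` fit into `target_ms` milliseconds
// by doubling the batch size until one batch takes at least that long.
long long calibrate_poll_loops(bool (*poll)(), int target_ms)
{
    using clock = std::chrono::steady_clock;
    volatile bool sink = false;          // keeps the loop from being optimized away
    for (long long count = 1; ; count *= 2) {
        const auto start = clock::now();
        for (long long i = 0; i < count; i++)
            sink = poll();
        const double ms =
            std::chrono::duration<double, std::milli>(clock::now() - start).count();
        if (ms >= target_ms || count >= (1LL << 40))
            return count;                // batch is long enough (or cap reached)
    }
}
```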

So if one adds any sleep here, things can get only worse, so I don't see the point for that. Of course polling eats one core at 100%, but these days every 
CPU has more than one, even my 800 MHz Xilinx embedded ARM CPU (Zynq).

Best,
Stefan
  2117   03 Mar 2021   Konstantin Olchanski   Info   shortest possible sleep
> Why do you need that?

UNIX/POSIX advertises functions for sleeping in microseconds and nanoseconds,
for sure it is interesting to know what they actually do and what happens
when you ask them to sleep for 1 microsecond or 1 nanosecond.

To sleep or not to sleep that is a question.

But if I do decide to sleep, and I call the sleep function, I want to know what actually happens.

Now I do and I share it with all.

On current Linux, shortest sleep is around 60 usec. select() with sleep
shorter than that will not sleep at all, nanosleep() will always sleep for
the shortest amount.

P.S. For fans of interrupts ("because they are fast"), sleep waiting for interrupt
probably has same latency/granularity as above (60 usec), so if I drive a DMA engine
and I expect the DMA transfer to complete under 60 usec, I should use a busy loop
to poll the "DMA done" bit instead of going to sleep and wait for the DMA interrupt.
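
A minimal sketch of such a busy loop, assuming the "DMA done" bit can be read as a flag. Here a std::atomic<bool> stands in for the hardware status register, and busy_wait_done is a hypothetical name:

```cpp
#include <atomic>
#include <chrono>

// Spin until `done` becomes true or `timeout_usec` microseconds pass.
// `done` is a stand-in for a hardware "DMA done" status bit.
// Returns true if the flag was seen, false on timeout.
bool busy_wait_done(const std::atomic<bool>& done, long timeout_usec)
{
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + std::chrono::microseconds(timeout_usec);
    while (!done.load(std::memory_order_acquire)) {
        if (clock::now() >= deadline)
            return false;      // gave up without ever going to sleep
    }
    return true;
}
```

The timeout keeps a stuck transfer from hanging the loop forever, at the price of one clock read per iteration.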

K.O.
  158   13 Oct 2004   Konstantin Olchanski   Bug Report   silly odbedit "rename Display xxx/yyy"
odbedit command "rename Display xxx/yyy" creates a key named "xxx/yyy" (yes,
with a slash in the name) and this key cannot be deleted or renamed...
K.O.
  159   13 Oct 2004   Stefan Ritt   Bug Report   silly odbedit "rename Display xxx/yyy"
> odbedit command "rename Display xxx/yyy" creates a key named "xxx/yyy" (yes,
> with a slash in the name) and this key cannot be deleted or renamed...
> K.O.

"rename" is "rename", not "mv" under Unix. If you want this functionality, put it
in and don't complain!
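
One small step in that direction would be to reject a new name that contains a slash before renaming. The check below is a hypothetical sketch, not the actual odbedit code; the 31-character limit is the key name limit quoted elsewhere in this log.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical check: accept a new key name only if it is non-empty,
// fits in 31 characters, and contains no '/' path separator.
bool is_valid_key_name(const char* name)
{
    const std::size_t len = std::strlen(name);
    if (len == 0 || len > 31)
        return false;
    return std::strchr(name, '/') == nullptr;
}
```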
  758   10 May 2011   Jianglai Liu   Forum   simple example frontend for V1720
Hi,

Who has a good example of a frontend program using CAEN V1718 VME-USB bridge and
V1720 FADC? I am trying to set up the DAQ for such a simple system.

I put together a frontend which talks to the VME. However it gets stuck at
"Calibrating" in initialize_equipment().

I'd appreciate some help!

Thanks,
Jianglai