Back Midas Rome Roody Rootana
  Midas DAQ System, Page 115 of 136  Not logged in ELOG logo
ID Date Author Topicdown Subject
  2278   28 Sep 2021 Richard LonglandBug ReportInstall clash between MIDAS 2020-08 and mscb
All,

I am performing a fresh install of MIDAS on an Ubuntu linux box. I follow the 
usual installation procedure:

1) git clone https://bitbucket.org/tmidas/midas --recursive
2) cd midas
3) git checkout release/midas-2020-08
4) mkdir build
5) cd build
6) cmake ..
7) make

Step 3 warns me that 
"warning: unable to rmdir 'manalyzer': Directory not empty" and 
"warning: unable to rmdir 'midasio': Directory not empty"

Step 7 fails.
Compilation fails with an mhttp error related to mscb:
mhttpd.cxx:8224:59: error: too few arguments to function 'int mscb_ping(int, 
short unsigned int, int, int)'
 8224 |             status = mscb_ping(fd, (unsigned short) ind, 1);


I was able to get around this by rolling mscb back to some old version (commit 
74468dd), but am extremely nervous about mix-and-matching the code this way.

Any advice would be greatly appreciated.

Cheers,
Richard 
  2279   28 Sep 2021 Stefan RittBug ReportInstall clash between MIDAS 2020-08 and mscb
> 1) git clone https://bitbucket.org/tmidas/midas --recursive
> 2) cd midas
> 3) git checkout release/midas-2020-08
> 4) mkdir build
> 5) cd build
> 6) cmake ..
> 7) make

When you do step 3), you get

~/tmp/midas$ git checkout release/midas-2020-08
warning: unable to rmdir 'manalyzer': Directory not empty
warning: unable to rmdir 'midasio': Directory not empty
M	mjson
M	mscb
M	mvodb
M	mxml

The 'M' in front of the submodules like mscb tell you that you
have an older version of midas (namely midas-2020-08), but the 
*current* submodules, which won't match. So you have to roll back
also the submodules with:

3.5) git submodule update --recursive

This fetched those versions of the submodules which match the
midas version 2020-08. See here for details: 

https://git-scm.com/book/en/v2/Git-Tools-Submodules

From where did you get the command

git checkout release/xxxx ???

If you tell me the location of that documentation, I will take
care that it will be amended with the command

git submodule update --recursive

Best,
Stefan
  2280   29 Sep 2021 Richard LonglandBug Reportnstall clash between MIDAS 2020-08 and mscb
Thank you, Stefan.

I found these instructions under
1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)

I do see reference to updating the submodules under the TRIUMF install 
instructions 
(https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
all_MIDAS) although perhaps it can be clarified.

Cheers,
Richard
  2281   29 Sep 2021 Stefan RittBug Reportnstall clash between MIDAS 2020-08 and mscb
> Thank you, Stefan.
> 
> I found these instructions under
> 1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
> 2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)
> 
> I do see reference to updating the submodules under the TRIUMF install 
> instructions 
> (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
> all_MIDAS) although perhaps it can be clarified.
> 
> Cheers,
> Richard

Hi Richard,

I updated the documentation at

https://midas.triumf.ca/MidasWiki/index.php/Changelog#Updating_midas

by putting the submodule update command everywhere.

Best,
Stefan
  2296   29 Oct 2021 Frederik WautersBug Reportmidas::odb::iterator + operator
I have 16 array odb key

{"FIR Energy", {
            {"Energy Gap Value", std::array<uint32_t,16>(10) },

I can get the maximum of this array like 


uint32_t max_value = *std::max_element(values.begin(),values.end());

but when I need the maximum of a sub range

uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);

I get

/home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
  584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
      |                                                ~~~~~~~~~~~~~~^~
      |                                                            |  |
      |                                                            |  int
      |               

As the + operator is overloaded for midas::odb::iterator, I was expected this to work.

(and yes, I can find the max element by accessing the elements on by one)
  2297   29 Oct 2021 Frederik WautersBug Reportmidas::odb::iterator + operator | work around
ok, so retrieving as a std::array (as it was defined) does not work

    std::array<uint32_t,16> avalues = settings["FIR Energy"]["Energy Gap Value"];

but retrieving as an std::vector does, and then I have a standard c++ iterator which I can use in std stuff

    std::vector<uint32_t> values = settings["FIR Energy"]["Energy Gap Value"];

> I have 16 array odb key
> 
> {"FIR Energy", {
>             {"Energy Gap Value", std::array<uint32_t,16>(10) },
> 
> I can get the maximum of this array like 
> 
> 
> uint32_t max_value = *std::max_element(values.begin(),values.end());
> 
> but when I need the maximum of a sub range
> 
> uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);
> 
> I get
> 
> /home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
>   584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
>       |                                                ~~~~~~~~~~~~~~^~
>       |                                                            |  |
>       |                                                            |  int
>       |               
> 
> As the + operator is overloaded for midas::odb::iterator, I was expected this to work.
> 
> (and yes, I can find the max element by accessing the elements on by one)
  2298   29 Oct 2021 Kushal KapoorBug ReportUnknown Error 319 from client
I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
localhost.

Frontend code compiled without any errors and MIDAS reads the frontend 
successfully, this only comes when I start the new run on MIDAS, here are a few 
more details from the terminal-

11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
mismatch for "/" (expected size: 1, size in ODB: 41920)

11:46:32 [Logger,INFO] Deleting previous file 
"/home/rcmp/online3/run00621_000.root"

11:46:32 [ODBEdit,ERROR] [midas.cxx:5073:cm_transition,ERROR] transition START 
aborted: client "fetiglab" returned status 319

11:46:32 [ODBEdit,ERROR] [midas.cxx:5246:cm_transition,ERROR] Could not start a 
run: cm_transition() status 319, message 'Unknown error 319 from client 
'fetiglab' on host "localhost"'

TR_STARTABORT transition: cleanup after failure to start a run

&#8204;

I’ve also enclosed a screenshot for the same, any suggestions would be highly 
appreciated. thanks
Attachment 1: Screenshot_2021-10-26_114015.png
Screenshot_2021-10-26_114015.png
  2304   01 Dec 2021 Lars MartinBug ReportOff-by-one in sequencer documentation
The documentation for the sequencer loop says:

<quote>
LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
inside the loop via $name.
</quote>

In fact the loop variable runs from 1...n, as can be seen by running this exciting 
sequencer code:

1 COMMENT "Figuring out MSL"
2 
3 LOOP n,4
4   MESSAGE $n,1
5 ENDLOOP
  2305   02 Dec 2021 Stefan RittBug ReportOff-by-one in sequencer documentation
> The documentation for the sequencer loop says:
> 
> <quote>
> LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
> can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
> inside the loop via $name.
> </quote>
> 
> In fact the loop variable runs from 1...n, as can be seen by running this exciting 
> sequencer code:
> 
> 1 COMMENT "Figuring out MSL"
> 2 
> 3 LOOP n,4
> 4   MESSAGE $n,1
> 5 ENDLOOP

Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Stefan
  2307   02 Dec 2021 Alexey KalininBug Reportsome frontend kicked by cm_periodic_tasks
Hello,
We have a small experiment with MIDAS based DAQ.
Status page shows :
ES	ESFrontend@192.168.0.37	207	0.2	0.000
Trigger06	Sample Frontend06@192.168.0.37	1.297M	0.3	0.000
Trigger01	Sample Frontend01@192.168.0.37	1.297M	0.3	0.000
Trigger16	Sample Frontend16@192.168.0.37	1.297M	0.3	0.000
Trigger38	Sample Frontend38@192.168.0.37	1.297M	0.3	0.000
Trigger37	Sample Frontend37@192.168.0.37	1.297M	0.3	0.000
Trigger03	Sample Frontend03@192.168.0.38	1.297M	0.3	0.000
Trigger07	Sample Frontend07@192.168.0.38	1.297M	0.3	0.000
Trigger04	Sample Frontend04@192.168.0.38	59898	0.0	0.000
Trigger08	Sample Frontend08@192.168.0.38	59898	0.0	0.000
Trigger17	Sample Frontend17@192.168.0.38	59898	0.0	0.000


And SYSTEM buffers page shows:
ESFrontend	1968	198	47520	0	0x00000000	0		
193 ms
Sample Frontend06	1332547	1330826	379729872	0	0x00000000	
0		1.1 sec
Sample Frontend16	1332542	1330839	361988208	0	0x00000000	
0		94 ms
Sample Frontend37	1332530	1330841	337798408	0	0x00000000	
0		1.1 sec
Sample Frontend01	1332543	1330829	467136688	0	0x00000000	
0		34 ms
Sample Frontend38	1332528	1330830	291453608	0	0x00000000	
0		1.1 sec
Sample Frontend04	63254	61467	20882584	0	0x00000000	
0		208 ms
Sample Frontend08	63262	61476	27904056	0	0x00000000	
0		205 ms
Sample Frontend17	63271	61473	20433840	0	0x00000000	
0		213 ms
Sample Frontend03	1332549	1330818	386821728	0	0x00000000	
0		82 ms
Sample Frontend07	1332554	1330821	462210896	0	0x00000000	
0		37 ms
Logger	968742	0w+9500418r	0w+2718405736r	0	0x00000000	0	
GET_ALL Used 0 bytes 0.0%	303 ms
rootana	254561	0w+29856958r	0w+8718288352r	0	0x00000000	0		
762 ms


The problem is that eventually some of frontend closed with message 
:19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

in the meantime mserver loggging :
mserver started interactively
mserver will listen on TCP port 1175
double free or corruption (!prev)
double free or corruption (!prev)
free(): invalid next size (normal)
double free or corruption (!prev)


I can find some correlation between number of events/event size produced by 
frontend, cause its failed when its become big enough. 

frontend scheme is like this:

poll event time set to 0;

poll_event{
//if buffer not transferred return (continue cutting the main buffer)
//read main buffer from hardware
//buffer not transfered
}

read event{
// cut the main buffer to subevents (cut one event from main buffer) return;
//if (last subevent) {buffer transfered ;return}
}

What is strange to me that 2 frontends (1 per remote pc) causing this.

Also, I'm executing one FEcode with -i # flag , put setting eventid in 
frontend_init , and using SYSTEM buffer for all.

Is there something I'm missing?
Thanks. 
A.
  2308   12 Dec 2021 Marius KoeppelBug ReportWritting MIDAS Events via FPGAs
Dear all,

in 13 Feb 2020 to 21 Feb 2020 we had a talk about how I try to create MIDAS events directly on a FPGA and 
than use DMA to hand the event over to MIDAS. In the thread I also explained how I do it in my MIDAS frontend. 

For testing the DAQ I created a dummy frontend which was emulating my FPGA (see attached file). The interesting code is 
in the function read_stream_thread and there I just fill a array according to the 32b BANKS which are 64b aligned (more or less
the lines 306-369). And than I do:

    uint32_t * dma_buf_volatile;
    dma_buf_volatile = dma_buf_dummy;

    copy_n(&dma_buf_volatile[0], sizeof(dma_buf_dummy)/4, pdata);

    pdata+=sizeof(dma_buf_dummy);
    rb_increment_wp(rbh, sizeof(dma_buf_dummy)); // in byte length

to send the data to the buffer.

This summer (Mai - July) everything was working fine but today I did not get the data into MIDAS. 
I was hopping around a bit with the commits and everything was at least working until: 3921016ce6d3444e6c647cbc7840e73816564c78.

Thanks,
Marius
Attachment 1: dummy_fe.cpp
/********************************************************************\

  Name:         dummy_fe.cxx
  Created by:   Frederik Wauters
  Changed by:   Marius Koeppel

  Contents:     Dummy frontend producing stream data

\********************************************************************/

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <algorithm>
#include <random>

#include <iostream>
#include <unistd.h>
#include <bitset>

#include "midas.h"
#include "msystem.h"
#include "mcstd.h"

#include <thread>
#include <chrono>

#include "mfe.h"

using namespace std;

/*-- Globals -------------------------------------------------------*/

/* The frontend name (client name) as seen by other MIDAS clients   */
const char *frontend_name = "Dummy Stream Frontend";
/* The frontend file name, don't change it */
const char *frontend_file_name = __FILE__;

/* frontend_loop is called periodically if this variable is TRUE    */
BOOL frontend_call_loop = FALSE;

/* a frontend status page is displayed with this frequency in ms    */
INT display_period = 0;

/* maximum event size produced by this frontend */
INT max_event_size = 1 << 25; // 32MB

/* maximum event size for fragmented events (EQ_FRAGMENTED) */
INT max_event_size_frag = 5 * 1024 * 1024;

/* buffer size to hold events */
INT event_buffer_size = 10000 * max_event_size;

/*-- Function declarations -----------------------------------------*/

INT read_stream_thread(void *param);
uint64_t generate_random_pixel_hit(uint64_t time_stamp);
uint32_t generate_random_pixel_hit_swb(uint32_t time_stamp);

BOOL equipment_common_overwrite = TRUE; //true is overwriting the common odb  


/* DMA Buffer and related */
volatile uint32_t *dma_buf;
#define MUDAQ_DMABUF_DATA_ORDER         25 // 29, 25 for 32 MB
#define MUDAQ_DMABUF_DATA_LEN           (1 << MUDAQ_DMABUF_DATA_ORDER)  // in bytes
size_t dma_buf_size = MUDAQ_DMABUF_DATA_LEN;
uint32_t dma_buf_nwords = dma_buf_size/sizeof(uint32_t);




/*-- Equipment list ------------------------------------------------*/

EQUIPMENT equipment[] = {

   {"Stream",              /* equipment name */
    {1, 0,                     /* event ID, trigger mask */
     "SYSTEM",                  /* event buffer */
     EQ_USER,               /* equipment type */
     0,                         /* event source */
     "MIDAS",                   /* format */
     TRUE,                      /* enabled */
     RO_RUNNING ,        /* read always and update ODB */
     100,                     /* poll for 100ms */
     0,                         /* stop run after this event limit */
     0,                         /* number of sub events */
     0,                         /* log history every event */
     "", "", ""} ,
    NULL,              /* readout routine */
   },

   {""}
};


/*-- Dummy routines ------------------------------------------------*/

INT poll_event(INT source, INT count, BOOL test)
{
   return 1;
};

INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
{
   return 1;
};

/*-- Frontend Init -------------------------------------------------*/

INT frontend_init()
{
  // create ring buffer for readout thread
  create_event_rb(0);

  // create readout thread
  ss_thread_create(read_stream_thread, NULL);
    
  set_equipment_status(equipment[0].name, "Ready for running", "var(--mgreen)");

  return CM_SUCCESS;
}

/*-- Frontend Exit -------------------------------------------------*/

INT frontend_exit()
{
 return CM_SUCCESS;
}

/*-- Frontend Loop -------------------------------------------------*/

INT frontend_loop()
{
  return CM_SUCCESS;
}

/*-- Begin of Run --------------------------------------------------*/

INT begin_of_run(INT run_number, char *error)
{
  set_equipment_status(equipment[0].name, "Running", "var(--mgreen)");
  return CM_SUCCESS;
}

/*-- End of Run ----------------------------------------------------*/

INT end_of_run(INT run_number, char *error)
{
  set_equipment_status(equipment[0].name, "Ready for running", "var(--mgreen)");
  return CM_SUCCESS;
}

/*-- Pause Run -----------------------------------------------------*/

INT pause_run(INT run_number, char *error)
{
   return CM_SUCCESS;
}

/*-- Resume Run ----------------------------------------------------*/

INT resume_run(INT run_number, char *error)
{
   return CM_SUCCESS;
}


uint64_t generate_random_pixel_hit(uint64_t time_stamp)
{

  // Bits 63 - 35: TimeStamp (29 bits)
  // Bits 34 - 30: Tot       (5 bits)
  // Bits 29 - 26: Layer     (4 bits)
  // Bits 25 - 21: Phi       (5 bits)
  // Bits 20 - 16: ChipID    (5 bits)
  // Bits 15 -  8: Col       (8 bits)
  // Bits  7 -  0: Row       (8 bits)

  uint64_t tot = rand() % 31; // 0 to 31
  uint64_t layer = rand() % 15; // 0 to 15
  uint64_t phi = rand() % 31; // 0 to 31
  uint64_t chipID = rand() % 5; // 0 to 31
  uint64_t col = rand() % 250; // 0 to 255
  uint64_t row = rand() % 250; // 0 to 255

  uint64_t hit = (time_stamp << 35) | (tot << 30) | (layer << 26) | (phi << 21) | (chipID << 16) | (col << 8) | row;

  //printf("tot: 0x%2.2x,layer: 0x%2.2x,phi: 0x%2.2x,chipID: 0x%2.2x,col: 0x%2.2x,row: 0x%2.2x\n",tot,layer,phi,chipID,col,row);
  //cout << hex << hit << endl;
  //cout << hex << time_stamp << endl;
  //cout << hex << (hit >> 35 & 0x7FFFFFFFF) << endl;
  //cout << hex << (hit >> 32) << endl;

  return hit;
}

uint32_t generate_random_pixel_hit_swb(uint32_t time_stamp)
{

  uint32_t tot = rand() % 31; // 0 to 31
  uint32_t chipID = rand() % 5; // 0 to 5
  uint32_t col = rand() % 250; // 0 to 250
  uint32_t row = rand() % 250; // 0 to 250

  uint32_t hit = (time_stamp << 28) | (chipID << 22) | (row << 13) | (col << 5) | tot << 1;

//   if ( print ) {
//     printf("ts:%8.8x,chipID:%8.8x,row:%8.8x,col:%8.8x,tot:%8.8x\n", time_stamp,chipID,row,col,tot);
//     printf("hit:%8.8x\n", hit);
//     std::cout << std::bitset<32>(hit) << std::endl;
//   }

  return hit;
}

uint64_t generate_random_scifi_hit(uint32_t time_stamp, uint32_t counter1, uint32_t counter2)
{
    uint64_t asic = rand() % 8;
    uint64_t hit_type = 1;
    uint64_t channel_number = rand() % 32;
    uint64_t timestamp_bad = 0;
    uint64_t coarse_counter_value = (time_stamp >> 5) & 0x7FFF; 
    uint64_t fine_counter_value = time_stamp & 0x1F;
    uint64_t energy_flag = rand() % 2;
    uint64_t fpga_id = 10 + rand() % 4;
    if (counter1 > 0) {
        fpga_id = counter1;
    }

    return (asic << 60) | (hit_type << 59) | (channel_number << 54) | (timestamp_bad << 53) | (coarse_counter_value << 38) | (fine_counter_value << 33) | (energy_flag << 32) | (fpga_id << 28) | ((time_stamp & 0x0FFFFFFF));

}

uint64_t generate_random_delta_t_exponential(float lambda)
{
    float r = rand()/(1.0+RAND_MAX);
    return ceil(-log(1.0-r)/lambda);
}

uint64_t generate_random_delta_t_gauss(float mu, float sigma)
{
    std::default_random_engine generator;
    generator.seed(rand());
    std::normal_distribution<float> distribution(mu,sigma);
    float r = ceil(distribution(generator));
    if (r<0.0) r=0.0;
    return r;
}



INT read_stream_thread(void *param)
{

  uint32_t* pdata;

  // init bank structure - 64bit alignment
  uint32_t SERIAL = 0x00000001;
  uint32_t TIME = 0x00000001;

  uint64_t hit;

  // tell framework that we are alive
  signal_readout_thread_active(0, TRUE);

  // obtain ring buffer for inter-thread data exchange
  int rbh = get_event_rbh(0);
  int status;

  while (is_readout_thread_enabled()) {

    // obtain buffer space
    status = rb_get_wp(rbh, (void **)&pdata, 10);

    // just sleep and try again if buffer has no space
    if (status == DB_TIMEOUT) {
      set_equipment_status(equipment[0].name, "Buffer full", "var(--myellow)");
      continue;
    }

    if (status != DB_SUCCESS){
       cout << "!DB_SUCCESS" << endl;
       break;
    }

    // don't readout events if we are not running
    if (run_state != STATE_RUNNING) {
      set_equipment_status(equipment[0].name, "Not running", "var(--myellow)");
      continue;
    }

    set_equipment_status(equipment[0].name, "Running", "var(--mgreen)");

    int nEvents = 5000;
    size_t eventSize=76;
    uint32_t dma_buf_dummy[nEvents*eventSize];

    uint64_t prev_mutrig_hit = 0;
    uint64_t delta_t_1 = 100;
... 212 more lines ...
  2313   26 Jan 2022 Konstantin OlchanskiBug ReportWritting MIDAS Events via FPGAs
> today I did not get the data into MIDAS. 

Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes?

I do not understand what you mean by "did not get the data into midas". You create events
and send them to a midas event buffer and you do not see them there? With mdump?
Do you see this both connected locally and connected remotely through the mserver?

BTW, I see you are using the mfe.c frontend. Event data handling in mfe.c frontends
is quite convoluted and impossible to straighten out. I recommend that you use
the tmfe c++ frontend instead. Event data handling is much simplified and is easier to debug
compared to the mfe.c frontend. There is examples in the midas repository and there are
tutorials for converting frontends from mfe.c to tmfe posted in this forum here.

BTW, the commit you refer to only changed some html files, could not have affected
your data.

K.O.
  2314   26 Jan 2022 Konstantin OlchanskiBug Reportsome frontend kicked by cm_periodic_tasks
> The problem is that eventually some of frontend closed with message 
> :19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

This messages means what it says. A client was registered with the SYSMSG buffer and this 
client had pid 9789. At some point some other client (rootana, in this case) checked it and 
process pid 9789 was no longer running. (it then proceeded to remove the registration).

There is 2 possibilities:
- simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
the crash.
- unlikely: reported pid is bogus, real pid of your frontend is different, the client 
registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
memory buffers, not impossible if your frontend misbehaves and writes to random memory 
addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
"/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
(should be added?).

Do this. When you start your frontend, write down it's pid, when you see the crash message, 
confirm pid number printed is the same. As additional test, run your frontend inside gdb, 
after it crashes, you can print the stack trace, etc.

> 
> in the meantime mserver loggging :
> mserver started interactively
> mserver will listen on TCP port 1175
> double free or corruption (!prev)
> double free or corruption (!prev)
> free(): invalid next size (normal)
> double free or corruption (!prev)
> 

Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
them in different terminals, not all in the same terminal?).

If messages are coming from the mserver, this confirms possibility (1),
except that for frontends connected remotely, the pid is the pid of the mserver,
and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
debug.

You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
directly).

>
> I can find some correlation between number of events/event size produced by 
> frontend, cause its failed when its become big enough. 
> 

There is no limit on event size or event rate in midas, you should not see any crash
regardless of what you do. (there is a limit of event size, because an event has
to fit inside an event buffer and event buffer size is limited to 2 GB).

Obviously you hit a bug in mserver that makes it crash. Let's debug it.

One thing to try is set the write cache size to zero and see if your crash goes away. I see
some indication of something rotten in the event buffer code if write cache is enabled. This
is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
where odb settings have no effect depending on value of "equipment_common_overwrite").

>
> frontend scheme is like this:
> 

Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
have to debug the convoluted old code in mfe.c.

K.O.

>
> poll event time set to 0;
> 
> poll_event{
> //if buffer not transferred return (continue cutting the main buffer)
> //read main buffer from hardware
> //buffer not transfered
> }
> 
> read event{
> // cut the main buffer to subevents (cut one event from main buffer) return;
> //if (last subevent) {buffer transfered ;return}
> }
> 
> What is strange to me that 2 frontends (1 per remote pc) causing this.
> 
> Also, I'm executing one FEcode with -i # flag , put setting eventid in 
> frontend_init , and using SYSTEM buffer for all.
> 
> Is there something I'm missing?
> Thanks. 
> A.
  2315   26 Jan 2022 Konstantin OlchanskiBug ReportOff-by-one in sequencer documentation
> > 3 LOOP n,4
> > 4   MESSAGE $n,1
> > 5 ENDLOOP
> 
> Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

K.O.
  2317   26 Jan 2022 Marius KoeppelBug ReportWritting MIDAS Events via FPGAs
> Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes? 
> I do not understand what you mean by "did not get the data into midas". You create events
> and send them to a midas event buffer and you do not see them there? With mdump?
> Do you see this both connected locally and connected remotely through the mserver?

I simply don't see the event counter counting up and I also don't see them using mdump. No logs, no dumps and no crashes - every is quite. I only tested it locally.
 
> BTW, I see you are using the mfe.c frontend. Event data handling in mfe.c frontends
> is quite convoluted and impossible to straighten out. I recommend that you use
> the tmfe c++ frontend instead. Event data handling is much simplified and is easier to debug
> compared to the mfe.c frontend. There is examples in the midas repository and there are
> tutorials for converting frontends from mfe.c to tmfe posted in this forum here.

I know the code I used is really old that's why I was so surprised that it suddenly did not work. But I am on the way to change it. Also Stefan gave me some comments on how to improve the code. But still changing them did not really change the behavior. 

> BTW, the commit you refer to only changed some html files, could not have affected
> your data.

I just hopped around and the commit I send was the first one which worked again. But it's of course not the one where the stuff broke. I did a bit of git-bisect and ended up with this commit as the first one where my frontend is not working anymore: 91582e4172d534bf9b10e661a423c399fd1a69f4

Cheers,
Marius
  2319   26 Jan 2022 Stefan RittBug ReportOff-by-one in sequencer documentation
> Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

   for (i=1 ; i<=10 ; i++);     ;-)
  2321   26 Jan 2022 Konstantin OlchanskiBug ReportOff-by-one in sequencer documentation
> > Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.
> 
>    for (i=1 ; i<=10 ; i++);     ;-)

Similar code made big news just recently: (scroll down to the example main() program)

https://blog.qualys.com/vulnerabilities-threat-research/2022/01/25/pwnkit-local-privilege-escalation-
vulnerability-discovered-in-polkits-pkexec-cve-2021-4034

I forget if the FORTRAN rules were "loop once" or "never loop" or if it was different
between Fortran-4, fortran-77, DEC extensions and IBM extension, or if it was a compiler switch.

We should check that we do something reasonable with such loops to zero:

LOOP n,0
   MESSAGE $n,1
ENDLOOP

P.S. Yup. "man g77" option "-fonetrip".

K.O.
  2322   26 Jan 2022 Konstantin OlchanskiBug ReportWritting MIDAS Events via FPGAs
> 
> > Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes? 
> > I do not understand what you mean by "did not get the data into midas". You create events
> > and send them to a midas event buffer and you do not see them there? With mdump?
> > Do you see this both connected locally and connected remotely through the mserver?
> 
> I simply don't see the event counter counting up and I also don't see them using mdump. No logs, no dumps and no crashes - every is quite. I only tested it locally.
>

If you are connected locally (no mserver), I want to know the value returned by bm_send_event(). Simplest
if you edit mfe.c and everywhere it calls bm_send_event() and rpc_send_event(), print the returned value.

It would be very interesting to see if bm_send_event() returns 1 (SUCCESS), but the event vanishes
without a trace.

Before you do that, try something simpler:

Run "mdump -s -d", it will print some event buffer internals.

Watch to see if any data pointers change when you send your events ("wp", "rp", etc).

If nothing changes at all, then we are not sending anything (fault is in your code or on mfe.c).

If you see "wp" counting up, then we definitely write your events into the buffer and mdump & mlogger should see them.

But there is some funny logic for event_id and trigger_mask and it is worth checking their
values. For a good test, set event_id=1 and trigger_mask=0x1. There might be trouble if either is set to zero.

K.O.
  2323   26 Jan 2022 Konstantin OlchanskiBug ReportUnknown Error 319 from client
> I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
> after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
> localhost.

actually run never starts.

> 11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
> mismatch for "/" (expected size: 1, size in ODB: 41920)

this is the error that causes run start to fail. for reasons unknown
your frontend is trying to do a db_get_record() from "/" (ODB root top directory).

if this is an mfe.c frontend, I do not think I have ever seen it do something
like this.

so, a puzzle.

K.O.
  2325   26 Jan 2022 Marius KoeppelBug ReportWritting MIDAS Events via FPGAs
> If you are connected locally (no mserver), I want to know the value returned by bm_send_event(). Simplest
> if you edit mfe.c and everywhere it calls bm_send_event() and rpc_send_event(), print the returned value.
> 
> It would be very interesting to see if bm_send_event() returns 1 (SUCCESS), but the event vanishes
> without a trace.

I checked bm_send_event(rbh, (EVENT_HEADER*)(&pdata[0]), 0, 20); which gives me back 1. I also check the status of rb_increment_wp which is also 1.

> Before you do that, try something simpler:
> Run "mdump -s -d", it will print some event buffer internals.
> Watch to see if any data pointers change when you send your events ("wp", "rp", etc).

"rp" & "wp" are not counting up. 

> But there is some funny logic for event_id and trigger_mask and it is worth checking their
> values. For a good test, set event_id=1 and trigger_mask=0x1. There might be trouble if either is set to zero.

Changing both to 0x1 did not change the behavior. 

Cheers,
Marius
ELOG V3.1.4-2e1708b5