I confirm I see same on the agmini system. Two problems: (a) error message is wrong, it's a
short read, not a read error (clue: read() syscall does not return "no such file"). (b)
mlogger is supposed to write history in record-size blocks, read in the same record size
blocks. UNIX file semantics require that both reader and writer see read() and write() as
atomic, even on NFS, so mhttpd should never see partially written history records. I can
debug this on the agmini system. Probably should.
Problem (a) fixed in commit bb423c8680cc67220312534403840442868f2b3b, if you update, you
should see error messages about "short read" and the read sizes it reports are very
interesting, please put them in the elog here.
K.O.
> We sporadically (like once per few hours) have an error message when we access the
> history plots through mhttpd:
>
> 07:21:35.109 2023/08/03 [mhttpd,ERROR]
> [history_schema.cxx:2345:FileHistory::read_data,ERROR] Cannot read
> '/data2/history/mhf_1690890685_20230801_dc_hv.dat', read() errno 2 (No such file
> or directory)
>
> When I log in to the machine, I properly see the file and also can access it
>
> [meg@megon02 history]$ ls -l mhf_1690890685_20230801_dc_hv.dat
> -rw-rw-r--. 1 meg meg 34176312 Aug 3 07:23 mhf_1690890685_20230801_dc_hv.dat
>
> and I also can dump that file.
>
> When I try again with mhttpd, I properly see that file.
>
> Now in principle this is not a problem, but the error message is annoying, since this
> is the only error we get in 24 hours. I attached a 24h log to see what I mean. If this
> is an OS issue, I wonder if we should add code to retry the file access in case we get
> that error.
>
> Anybody seen a similar thing?
>
> Best,
> Stefan |