View Issue Details

ID: 0000736
Project: Apache 2.x
Category: Bug
View Status: public
Last Update: 2021-08-23 21:49
Reporter: emax
Assigned To: Steven Levine
Priority: urgent
Severity: block
Reproducibility: sometimes
Status: assigned
Resolution: open
Platform: VM guest on Vbox 2 cores, 3GBram
OS: AOS
OS Version: 503
Product Version: 2.4.x
Target Version:
Fixed in Version:
Summary: 0000736: Apache 2.4.4x build with php 5.6.40 sometimes don't restart
Description:
Server Version: Apache/2.4.46 (OS/2) OpenSSL/1.1.1j PHP/5.6.40
Server Built: Apr 1 2021 14:58:19

Hi all,

sometimes apache doesn't start/restart and gives this error:

02816271 01 ff 0000 Asrt: Assertion Failed!!!
02816271 01 ff 0000 Asrt: Function: <NULL>
02816271 01 ff 0000 Asrt: File: D:/Users/dmik/rpmbuild/BUILD/libcx-0.6.6/src/shared.c
02816271 01 ff 0000 Asrt: Line: 189
02816271 01 ff 0000 Asrt: Expr: arc == NO_ERROR
02816271 01 ff 0000 Asrt: 105

i have to setboot /b the VM to get it starting again

this is a major issue :(

sorry
Tags: No tags attached.

Activities

emax

2021-06-04 19:32

reporter   ~0003706

this kind of crash/dump doesn't produce any exceptQ dump file


apache error log:

[Fri Jun 04 10:28:17.649000 2021] [mpm_mpmt_os2:notice] [pid 493:tid 1] (OS 10035)Resource temporarily unavailable: apr_socket_accept
[Fri Jun 04 10:30:47.086000 2021] [mpm_mpmt_os2:notice] [pid 493:tid 1] (OS 10035)Resource temporarily unavailable: apr_socket_accept
Assertion info: 105
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.6.6/src/shared.c, line 564

Killed by SIGABRT
pid=0x01f2 ppid=0x01b4 tid=0x0001 slot=0x007d Assertion info: 105
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.6.6/src/shared.c, line 564

Killed by SIGABRT
pid=0x01f4 ppid=0x01b4 tid=0x0012 slot=0x00b9 pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
[Fri Jun 04 10:52:07.031000 2021] [mpm_mpmt_os2:error] [pid 502:tid 1] (OS 105)The previous ownership of this semaphore has ended. : Sem owner died pidSemaphoreOwner = 501, tidSemaphoreOwner = 1, ulRequestCount = 1 rc = 105 at ap_mpm_child_main #0
[Fri Jun 04 10:52:07.037000 2021] [mpm_mpmt_os2:error] [pid 502:tid 1] (OS 105)The previous ownership of this semaphore has ended. : AH00194: apr_socket_accept
[Fri Jun 04 10:52:07.028000 2021] [mpm_mpmt_os2:error] [pid 503:tid 1] (OS 105)The previous ownership of this semaphore has ended. : Sem owner died pidSemaphoreOwner = 501, tidSemaphoreOwner = 1, ulRequestCount = 1 rc = 105 at ap_mpm_child_main #0
[Fri Jun 04 10:52:07.038000 2021] [mpm_mpmt_os2:error] [pid 503:tid 1] (OS 105)The previous ownership of this semaphore has ended. : AH00194: apr_socket_accept

after reboot:

[Fri Jun 04 11:33:38.336000 2021] [mpm_mpmt_os2:error] [pid 67:tid 1] AH00200: DosGetNamedSharedMem returned 0
[Fri Jun 04 11:33:38.365000 2021] [mpm_mpmt_os2:notice] [pid 67:tid 1] AH00206: Apache/2.4.46 (OS/2) OpenSSL/1.1.1j PHP/5.6.40 configured -- resuming normal operations

psmedley

2021-06-04 19:33

administrator   ~0003707

This is a libcx issue and needs to be reported to bww

psmedley

2021-06-04 20:24

administrator   ~0003708

Suggest reporting at https://github.com/bitwiseworks/libcx/issues

psmedley

2021-06-04 20:25

administrator   ~0003709

Similar to https://github.com/bitwiseworks/libcx/issues/82

emax

2021-06-04 20:29

reporter   ~0003710

done, thanks

https://github.com/bitwiseworks/libcx/issues/92

emax

2021-06-04 20:40

reporter   ~0003711

it seems i have an old libcx; of course "yum update libcx" doesn't work properly
it detects the old version, but fails on update

Downloading packages:
Running Transaction Check
Running transaction test
Transaction check error:
  file /@unixroot/usr/bin/pwd_mkdb.exe from install of libc-1:0.1.7-1.oc00.

emax

2021-06-04 20:50

reporter   ~0003712

i've updated it manually from version 0.6.6 to 0.7.0
let's see if it improves the situation

psmedley

2021-06-05 07:25

administrator   ~0003713

I normally just run 'yum update'

emax

2021-06-07 21:18

reporter   ~0003717

after upgrading libcx0 from version 0.6.6 to 0.7.0 this issue seems to be fixed
please give me a few more days to confirm

thanks

emax

2021-06-14 18:01

reporter   ~0003733

this crash after a number of urls "not found or unable to stat" gives a very interesting dump
but NO dumps in popuplog.os2, NO exceptQ dumps


[Sun Jun 13 10:49:10.372000 2021] [:error] [pid 6055:tid 22] [client X.Y.Z.A:31621] script '/apache/htdocs/mywebsite/wp-login.php' not found or unable to stat

Killed by SIGSEGV
pid=0x17a7 ppid=0x10f1 tid=0x0001 slot=0x00f3 pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
PHP5 0:0078a90a
cs:eip=005b:1dbca90a ss:esp=0053:0022fbc0 ebp=0022fc08
 ds=0053 es=
Killed by SIGSEGV
pid=0x17bd ppid=0x10f1 tid=0x0001 slot=0x007f pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
PHP5 0:00706dff
cs:eip=005b:1db46dff ss:esp=0053:0022fb80 ebp=0022fbc8
 ds=0053 es=0053 fs=150b gs=0000 efl=00010202
eax=00000000 ebx=24eefcc0 ecx=00000022 edx=00000000 edi=28684220 esi=200e4890
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
HTTPD.EXE
PHP5 0:0078a90a
cs:eip=005b:1dbca90a ss:esp=0053:0022fbc0 ebp=0022fc08
 ds=0053 es=0053 fs=150b gs=0000 efl=00010206
eax=00000000 ebx=0003fff0 ecx=00000000 edx=22738300 edi=22738308 esi=20111600
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
0022fbc8
 ds=0053 es=Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 7910 bytes
ProcDesc structs used now: 8
FileDesc structs used now: 17
SharedFileDesc structs used now: 10
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x17ba ppid=0x10f1 tid=0x0001 slot=0x009b pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
[Sun Jun 13 11:01:21.532000 2021] [mpm_mpmt_os2:error] [pid 6334:tid 1] (OS 10038)Socket operation on non-socket: AH00194: apr_socket_accept
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size tAssertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 3492 bytes
ProcDesc structs used now: 5
FileDesc structs used now: 4
SharedFileDesc structs used now: 1
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x18bc ppid=0x0010 tid=0x0003 slot=0x00b2 pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size tAssertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 2856 bytes
ProcDesc structs used now: 4
FileDesc structs used now: 3
SharedFileDesc structs used now: 1
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x18bd ppid=0x0010 tid=0x0004 slot=0x0075 pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
[Sun Jun 13 11:05:52.902000 2021] [mpm_mpmt_os2:error] [pid 67:tid 1] AH00200: DosGetNamedSharedMem returned 0
[Sun Jun 13 11:05:52.927000 2021] [mpm_mpmt_os2:notice] [pid 67:tid 1] AH00206: Apache/2.4.46 (OS/2) OpenSSL/1.1.1j PHP/5.6.40 configured -- resuming normal operations

Killed by SIGSEGV
pid=0x0044 ppid=0x0043 tid=0x0001 slot=0x005b pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
PHP5 0:00706dff
cs:eip=005b:1db46dff ss:esp=0053:0022fb80 ebp=0022fbc8
 ds=0053 es=0053 fs=150b gs=0000 efl=00010202
eax=00000000 ebx=218f0d40 ecx=00000022 edx=00000000 edi=218f0d00 esi=200e4890
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
[client X.Y.Z.A:51006] script '/apache//mywebsite/wp-login.php' not found or unable to stat, referer: http://www.abcdef.com

Killed by SIGSEGV
pid=0x0048 ppid=0x0043 tid=0x0001 slot=0x0060 pri=0x0200 mc=0x0001 ps=0x0010
D:\APACHE\BIN\HTTPD.EXE
PHP5 0:00706dff
cs:eip=005b:1db46dff ss:esp=0053:0022fb80 ebp=0022fbc8
 ds=0053 es=0053 fs=150b gs=0000 efl=00010202
eax=00000000 ebx=219fa0c0 ecx=00000022 edx=00000000 edi=219f86a0 esi=200e4890
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.

emax

2021-06-14 18:04

reporter   ~0003735

also updated ticket on github

https://github.com/bitwiseworks/libcx/issues/92

emax

2021-06-21 23:38

reporter   ~0003750

today i had another serious issue:

when this issue occurs, apache does not restart anymore and i have to setboot /b the server :-(

[Mon Jun 21 08:21:06.933000 2021] [mpm_mpmt_os2:notice] [pid 426:tid 1] (OS 10035)Resource temporarily unavailable: apr_socket_accept
[Mon Jun 21 08:22:01.642000 2021] [mpm_mpmt_os2:notice] [pid 428:tid 1] (OS 10035)Resource temporarily unavailable: apr_socket_accept
[Mon Jun 21 08:23:25.590000 2021] [mpm_mpmt_os2:notice] [pid 427:tid 1] (OS 10035)Resource temporarily unavailable: apr_socket_accept
[Mon Jun 21 08:26:35.729000 2021] [mpm_mpmt_os2:notice] [pid 424:tid 1] (OS 10035)Resource temporarily unavailable: apr_socket_accept
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 7262 bytes
ProcDesc structs used now: 8
FileDesc structs used now: 14
SharedFileDesc structs used now: 8
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x01b1 ppid=0x019a tid=0x0001 slot=0x0083 pri=0x0200 mc=0x0001 ps=0x0010
X:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
600
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 4128 bytes
ProcDesc structs used now: 6
FileDesc structs used now: 5
SharedFileDesc structs used now: 1
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x01b4 ppid=0x019a tid=0x0004 slot=0x00ea pri=0x0200 mc=0x0001 ps=0x0010
X:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.
[Mon Jun 21 08:46:09.710000 2021] [mpm_mpmt_os2:error] [pid 438:tid 1] (OS 10038)Socket operation on non-socket: AH00194: apr_socket_accept
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 2856 bytes
ProcDesc structs used now: 4
FileDesc structs used now: 3
SharedFileDesc structs used now: 1
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x01b3 ppid=0x0010 tid=0x0001 slot=0x0065 pri=0x0200 mc=0x0001 ps=0x0010
X:\APACHE\BIN\HTTPD.EXE
Process dumping was disabled, use DUMPPROC / PROCDUMP to enable it.

emax

2021-06-22 03:14

reporter   ~0003751

update

i've upgraded to apache 2.4.48, let's see if this improves the situation, but i doubt it

Server Version: Apache/2.4.48 (OS/2) OpenSSL/1.1.1k PHP/5.6.40
Server MPM: mpmt_os2
Server Built: Jun 6 2021 15:51:55

emax

2021-06-22 06:05

reporter   ~0003752

same issue even on 2.4.48

about the github repository, it's almost abandonware there

psmedley

2021-06-22 06:59

administrator   ~0003753

It is no surprise that 2.4.48 behaves the same. As I've said, the 'issue' is coming from libcx.

Best I can do is attempt to build apache2 without a libcx dependency.

Not sure how feasible this is as libcx provides a number of functions that are considered 'standard' these days.

emax

2021-06-22 07:27

reporter   ~0003754

i don't know what to say, what i see is that the repository on github is abandonware,
not followed by anyone, and nobody answers
that is no good at all
apache 2.4.x is so unstable that i'm considering going back to 2.2.x

psmedley

2021-06-22 09:35

administrator   ~0003755

You're welcome to switch back to Apache 2.2 (it's your system), just be aware that there will be no future builds/fixes for it, it's way too far EOL.

psmedley

2021-06-22 18:54

administrator   ~0003756

https://smedley.id.au/tmp/httpd-2.4.48-os2-20210622-debug.zip

*minimises* use of libcx to only ssl.dll and md.dll and doesn't use mmap/munmap at all.

Note: you won't get exceptq logs from this one....

emax

2021-06-22 19:04

reporter   ~0003757

even after upgrading from libcx 0.6.6 to 0.7.0 i don't get exceptQ dumps anymore

emax

2021-06-24 23:23

reporter   ~0003758

https://smedley.id.au/tmp/httpd-2.4.48-os2-20210622-debug.zip

This new build dies on start:

A non-recoverable error occurred. The process ended.

06-24-2021 15:48:07 SYS2070
PID 0487 TID 0001 Slot 0085
X:\APACHE\BIN\HTTPD.EXE
AUTHN_CO->HTTPD._ap_hook_check_authn 127

no exceptQ dump at all
at this point i think that libcx 0.7.0 is not compatible with the exceptQ environment

psmedley

2021-06-25 06:52

administrator   ~0003759

That looks like you have an old DLL in libpath. However, I realised there's more to it than just building apache2 without libcx, as php also uses it, and for mmap support, it's required that the executable that loads a DLL is also linked with libcx.

I did advise that there would be no exceptq support - the reason for that is quite simple. There are two ways to add exceptq support to an executable. 1) modify the code; 2) link against libcx. The apache2 code used to use approach 1), but when we started using libcx, we removed the code as it made patching easier. So in this build, there is no mechanism to load exceptq.

In your error above, there would never be a TRP file anyway, as there is no exception - it just can't load a DLL due to incompatible symbols.

Your issues with libcx 'assertions failed' also won't generate an exceptq dump, as there is no crashing code - libcx is deliberately aborted as it got something unexpected.

emax

2021-06-29 06:21

reporter   ~0003765

hi Paul,

i guess there is no other way unless Bitwise or Dmik put their hands on libcx 0.7.0...

But that repository on github seems like a desert :(

Steven Levine

2021-07-02 06:22

manager   ~0003774

Just a reminder. When I see a summary like:

  Apache 2.4.46 build (1th/4/2021) with php 5.6.40 sometimes don't restart

and 2.4.48 has been out for a while, I tend to ignore the ticket, assuming it is stale. If you have replicated the issue with 2.4.48, you need to take the time to update the ticket summary.

Steven Levine

2021-07-10 09:56

manager   ~0003791

If you can reproduce this with httpd-2.4.48-os2-20210619-debug.zip, please capture a process dump using http://www.warpcave.com/os2diags/PDumpCtl-0.17-20210620.zip. Since the hang is most likely in a DLL, you will need to capture a full dump using something like:

  pdumpctl httpd f d o

emax

2021-07-10 19:17

reporter   ~0003794

- pdumpctl environment updated

- now i'm going to update apache, unless it is that version with reduced use of libcx, since here that one doesn't even start

- there is a big misunderstanding: this is not a hang of httpd and i don't have an unkillable process; apache exits with that crash and doesn't start anymore,
so i have to setboot /b the VM. so how can i use pdumpctl on httpd if there is no httpd running in memory?

thanks

emax

2021-07-10 19:26

reporter   ~0003796

this build, httpd-2.4.48-os2-20210619-debug.zip, doesn't start

A non-recoverable error occurred. The process ended.

07-10-2021 11:53:50 SYS2070 PID 0208 TID 0001 Slot 0086
X:\APACHE\BIN\HTTPD.EXE
AUTHN_CO->HTTPD._ap_hook_check_authn
127

probably it lacks some modules

Steven Levine

2021-07-15 14:16

manager   ~0003818

Is this ticket ready to mark resolved?

emax

2021-07-15 23:55

reporter   ~0003821

i've had this issue a lot in these days; this ticket is maybe the most important
and the issue the most problematic

apache doesn't restart anymore, so the script automatically issues a setboot /b after 3 retries
i had this issue 3 times today

i'm starting to believe that apache 2.4.x is not yet compatible with a production environment

i did not have all these issues with 2.2.x
sorry

massimo

Steven Levine

2021-07-16 01:27

manager   ~0003823

You are free to continue to run 2.2. The only downside is I will not provide any support for 2.2. I only have so much time available. You will be on your own unless someone else decides to support 2.2.

If you choose to help get the 2.4 issues resolved, you need to collect the required data so that others have data to look at. You know which data is required because this is not the first time you have asked for help on similar issues. You reported similar issues regularly for 2.2.

The choice is yours.

emax

2021-07-26 06:19

reporter   ~0003837

note: upgraded apache to debug build Jun 19 2021 07:52:39

emax

2021-08-09 18:31

reporter   ~0003867

even after upgrading to debug build Jun 19 2021 07:52:39
the issue is still here, apache does not restart anymore and i have to setboot /b the VM

apache error log:

[Mon Aug 09 09:17:49.080000 2021] [:error] [pid 477:tid 23] [client 66.249.72.216:65048] script 'X:/apache/htdocs/mysite/abc.php' not found or unable to stat
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
HeapAssertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 4128 bytes
ProcDesc structs used now: 6
FileDesc structs used now: 5
SharedFileDesc structs used now: 1
===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shared.c, line 564

Killed by SIGABRT
pid=0x01e6 ppid=0x01c8 tid=0x001b slot=0x0087 pri=0x0200 mc=0x0001 ps=0x0010
X:\APACHE\BIN\HTTPD.EXE
Creating 01E6_1B.TRP
lled by SIGABRT
pid=0x01e4 ppid=0x01c8 tid=0x001e slot=0x0047 pri=0x0200 mc=0x0001 ps=0x0010
X:\APACHE\BIN\HTTPD.EXE
Creating 01E4_1E.TRP
[Mon Aug 09 09:53:45.372000 2021] [mpm_mpmt_os2:error] [pid 488:tid 1] (OS 10038)Socket operation on non-socket: AH00194: apr_socket_accept
Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size Assertion info: 105

===== LIBCx resource usage =====
Reserved memory size: 2097152 bytes
Committed memory size: 65536 bytes
Heap size total: 65120 bytes
Heap size used now: 2856 bytes
ProcDesc structs used now: 4
FileDesc structs used now: 3
SharedFileDeCreating 01E8_01.TRP

===== LIBCx stats end =====
Assertion failed: arc == NO_ERROR, file D:/Users/dmik/rpmbuild/BUILD/libcx-0.7.0/src/shaCreating 01E7_01.TRP
led by SIGABRT
pid=0x01e9 ppid=0x0010 tid=0x0001 slot=0x00be pri=0x0200 mc=0x0001 ps=0x0010
X:\APACHE\BIN\HTTPD.EXE
Creating 01E9_01.TRP

01E6_1B.TRP (96,960 bytes)
01E4_1E.TRP (97,516 bytes)
01E9_01.TRP (47,627 bytes)
01E8_01.TRP (47,627 bytes)
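
The attached file names appear to follow the pattern visible in the log above, where `pid=0x01e6 ... tid=0x001b` is followed by `Creating 01E6_1B.TRP`: hexadecimal process id, underscore, hexadecimal thread id. A minimal Python sketch, assuming that pattern holds for every report, converts a name back to the decimal ids that the Apache error log prints in its `[pid N:tid N]` field. The helper name `decode_trp_name` is hypothetical and not part of exceptq.

```python
import re

# Assumed naming pattern, inferred from the log lines in this ticket:
# <pid in hex>_<tid in hex>.TRP
TRP_NAME = re.compile(r"^([0-9A-F]{4})_([0-9A-F]{2,4})\.TRP$", re.IGNORECASE)


def decode_trp_name(name):
    """Return (pid, tid) as decimal integers for a report name like 01E6_1B.TRP."""
    match = TRP_NAME.match(name)
    if match is None:
        raise ValueError("not a recognised TRP file name: %r" % name)
    return int(match.group(1), 16), int(match.group(2), 16)


if __name__ == "__main__":
    for trp in ("01E6_1B.TRP", "01E4_1E.TRP", "01E9_01.TRP", "01E8_01.TRP"):
        pid, tid = decode_trp_name(trp)
        print(trp, "-> pid", pid, "tid", tid)
```

This makes it easier to pair a report with the decimal pid in an Apache log line from the same process.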

Steven Levine

2021-08-11 12:07

manager   ~0003869

Looking at the log data you included, I need a bit more info. The log starts with:

  [Mon Aug 09 09:17:49.080000 2021] [:error] [pid 477:tid 23] [client 66.249.72.216:65048] script 'X:/apache/htdocs/mysite/abc.php' not found or unable to stat

Does abc.php actually exist or is this a hack attempt?

Next the log reports:

 Assertion info: 105

This line is not supposed to appear alone. Did you omit part of the error report?

The next timestamp reported in the log is from 01E4_1E.TRP

  Exception Report - created 2021/08/09 09:53:43

which is some 36 minutes later. What was the system doing during this time?
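
The 36 minute figure can be checked from the two timestamps quoted above. A small Python sketch, assuming both clocks are the same local time and that the format strings below match the Apache error log and the exceptq report header as they appear in this ticket (the function name `gap_minutes` is made up for illustration):

```python
from datetime import datetime

APACHE_FORMAT = "%a %b %d %H:%M:%S.%f %Y"   # e.g. Mon Aug 09 09:17:49.080000 2021
TRP_FORMAT = "%Y/%m/%d %H:%M:%S"            # e.g. 2021/08/09 09:53:43


def gap_minutes(apache_stamp, trp_stamp):
    """Minutes elapsed between an Apache log entry and an exceptq report."""
    apache_time = datetime.strptime(apache_stamp, APACHE_FORMAT)
    trp_time = datetime.strptime(trp_stamp, TRP_FORMAT)
    return (trp_time - apache_time).total_seconds() / 60.0


if __name__ == "__main__":
    gap = gap_minutes("Mon Aug 09 09:17:49.080000 2021", "2021/08/09 09:53:43")
    print("gap: %.1f minutes" % gap)
```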

Did you check %UNIXROOT%\var\log\libcx for reports? The error implies there should have been one or more libcx reports generated.

emax

2021-08-11 19:48

reporter   ~0003872

hi,

"Does abc.php actually exist or is the a hack attempt?"

no, it's a "url not found"; they try to call a webpage that does not exist anymore


This line is not supposed to appear alone. Did you omit part of the error report?

i don't recall, i keep only 2 days of error_log so it rotated and i don't have it anymore, sorry :(
i've now increased the log rotation to keep more days

"which is some 36 minutes later. What was the system doing during this time?"

nothing, rexx had gone away, so the fault daemon had gone away too
just like running out of system resources
if apache does not start anymore the fault daemon tries 3 times to restart apache, after which it does a reboot (setboot)
but since rexx itself went away, the daemon was not running anymore
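
The restart policy described above (up to three restart attempts, then a reboot) can be sketched as follows. This is only an illustration in Python: the real fault daemon is a REXX script whose source is not shown in this ticket, and every name here (`journal_line`, `supervise`, the callbacks) is hypothetical. Each action is written with a timestamp so that the journal can later be lined up against the Apache error log.

```python
from datetime import datetime


def journal_line(event, now=None):
    """Format one journal entry with a leading timestamp."""
    now = now or datetime.now()
    return f"{now:%Y-%m-%d %H:%M:%S} {event}"


def supervise(is_up, restart, reboot, journal, max_retries=3):
    """Try to restart the server max_retries times, then fall back to a reboot.

    is_up, restart and reboot are callables supplied by the caller;
    journal receives each formatted log line.
    """
    if is_up():
        return "up"
    for attempt in range(1, max_retries + 1):
        journal(journal_line(f"server not answering, restart attempt {attempt}"))
        restart()
        if is_up():
            journal(journal_line("server is answering again"))
            return "restarted"
    journal(journal_line("restart failed, requesting reboot"))
    reboot()
    return "reboot"
```

In a real script `reboot` would wrap the `setboot /b` call and `journal` would append to the journal file.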

when the notifications reached me on the phone (other servers notify that apache on the webserver is not answering) i went back to my office
and setboot /b the web server VM
this took a bit of time :)



"Did you check %UNIXROOT\var\log\libcx for reports? The error implies there should have been one or more libcx reports generated."

no, all reports e.g. "HTTPD-60f9b130-26e2.log" are dated 22nd July 2021

Steven Levine

2021-08-12 02:04

manager   ~0003873

OK. So you are saying the system was hung soon after the stat for abc.php failed. True?

Also, the

  Killed by SIGABRT

in the log was in response to you manually killing the httpd instance when you got into the office. True?

The reason for the hang is that a thread died while holding the libcx global lock. This is what the

  Assertion info: 105

means. Unfortunately, since no libcx logs were written, it appears that when this occurred, libcx was unable to report which thread failed. I am discussing this with Dmitriy.

emax

2021-08-12 18:19

reporter   ~0003875

I'm sorry, i don't know exactly when rexx went away
I don't know how to check it, since my fault daemon uses rexx
If the system is not "out of resources" rexx is working, so the fault-daemon tries to restart apache

I realize that the system is out of resources when i see with PGMCNTRL that the rexx fault-daemon is not running
if i try to restart it, of course it doesn't start
so at that point i go with setboot /b

the command i use to manually start apache first kills it and then starts it
so maybe it is that one

Steven Levine

2021-08-13 01:14

manager   ~0003876


You can estimate when rexx went away by checking the timestamps on the log records your fault daemon writes. If the fault daemon does not timestamp its actions, perhaps you should consider doing this. It makes troubleshooting a lot easier.

If you still think that apache is not restarting because the system is "out of resources", you are not yet fully understanding what I have written here and elsewhere. When libcx reports error 105, aka ERROR_SEM_OWNER_DIED, it means an apache thread died while it held a global lock. Libcx cannot recover from this failure. It can only report it.
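
As an analogy only, the failure mode can be reproduced in a few lines of Python. This is not libcx code and does not use the OS/2 semaphore API. It assumes nothing beyond the standard `threading` module, and the function name is made up. A thread takes a lock and ends without releasing it, and every later attempt to take the lock fails. Python gives no owner-died notification at all, so waiters can only time out. OS/2 at least reports error 105 to the next requester.

```python
import threading


def acquire_after_owner_exit(timeout=0.2):
    """Return True if the lock can still be taken after its holder's thread ended."""
    lock = threading.Lock()

    def worker():
        lock.acquire()  # take the lock ...
        # ... and let the thread end without ever releasing it

    holder = threading.Thread(target=worker)
    holder.start()
    holder.join()  # the holder is gone, but the lock is still held

    # Nobody is left to release the lock, so this attempt times out.
    return lock.acquire(timeout=timeout)


if __name__ == "__main__":
    print("lock acquired after owner exit:", acquire_after_owner_exit())
```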

There are two ways you can recover from this failure. One is to reboot. Another is to kill all the applications that are using libcx. This will unload all instances of libcx and close the global semaphore. This method will work as long as no application is hung in the exit list and preventing libcx from unloading. After this, applications should start and run normally.

You can check if any applications are using libcx with

  psfiles | find "LIBCX"

This should be in your notes.

If your fault daemon rexx script will not restart, this is most likely a different problem, unrelated to libcx. What is the error message you get when the daemon fails to restart? As you know, I cannot provide much useful help without an error message of some sort.

emax

2021-08-13 01:45

reporter   ~0003877

hi,

i now understand what happens, but "kill all the applications that are using libcx"
is "not possible", since the injoy firewall does not work anymore once restarted (it closes all connections with a "SYN_SENT"), so i have to setboot /b anyway

the fault daemon writes a timestamp whenever it does something, e.g. when apache "exits" and it tries to restart it
it writes them in a journal log

Steven Levine

2021-08-13 04:25

manager   ~0003878


Perhaps I was not sufficiently precise? When I say kill the application, I mean shut it down in the most graceful way possible. In the case of ijfw, this means using sync -kill to request gateway.exe to shut down cleanly.

The ijfw binaries do not link to libcx, so the libcx lock problem should not affect how ijfw runs. If the goal is to unload all the applications using libcx, ijfw is not one of the applications that needs to be shut down. You can check this with Theseus or pstat.

Regarding timestamps. While it might be true that your logs are timestamped, based on what you have posted to the ticket, no one but you would know this. Unless we can see the timestamps, it's as if they don't exist.

I was able to see the delay by looking at the apache timestamps and comparing them to the .trp file timestamps. However, you should not be making me work this hard.

emax

2021-08-18 04:52

reporter   ~0003879

about injoy FW

i've tried a lot of ways to exit injoy FW, but even the restart firewall feature from the UI or remote UI gives the same issue
anyway if i don't have to restart injoy fw, i could kill the other applications and see if apache and rexx restart correctly

about the timestamps

i'm sorry, but i got a bit confused about this point
please confirm for the next time if i've to do this check

1) check the time stamp in the eQ trap
2) check the time stamp of apache restart in the journal log and compare them?

what should i write in the ticket details?
both the timestamps?
am i right?

about this point:

"If your fault deamon rexx script will not restart, this is most likely a different problem, unrelated to libcx. What is the error message you get when the deamon fails to restart? As you know, I cannot provide must useful help without and error message of some sort."

next time it happens i will write down the error

i recall something about "RexxUtil" or "rxFtp", but i must verify next time it happens again
sorry

Steven Levine

2021-08-18 05:46

manager   ~0003880

We can leave the ijfw restart issues to a future ticket.

Regarding timestamps, what I am trying to say is that without a timestamp I cannot know when an event occurred. If the events effectively occur one right after another, all I usually need to know is when the 1st event occurred. If an event occurs a long time after the previously reported event and this is not obvious from the data you post to the ticket comment, you need to indicate this.

Consider the data you supplied in 0000736:0003867. There is no simple way for me to know that the exceptq dump was generated 30 minutes later. This is obviously (at least to me) something I need to be told. Sure I am perfectly capable of figuring it out myself, but if you make me do this, this leaves less time for me to work on your core problem which is the restart failures.

Back to the restart issue. Bww has pushed updated libc and libcx binaries to netlabs-exp. I've been running test builds for a couple of weeks, so it's unlikely that they will cause any new issues on quasarbbs. One important fix is a change to the logging code which should allow us to see which httpd thread died.

I recommend you install these packages from netlabs-exp. Don't forget to install the associated debug packages.

There's another relevant change to klibc and libcx's handling of log files and exceptq dumps. klibc log files are now written to %UNIXROOT%/var/log/app. Exceptq dumps are now renamed and copied to /var/log/app. Libcx log files continue to be written to /var/log/libcx.

See

  https://github.com/bitwiseworks/libc/issues/98
  https://github.com/bitwiseworks/libc/issues/99
  https://github.com/bitwiseworks/libc/issues/101

for the reasoning. Once everyone gets used to this, it should be easier to find exceptq dump files when the working directory is not known.

emax

2021-08-18 18:05

reporter   ~0003881

about libc

i'm running right now the updated libcx0.dll

Signature: @#bww bitwise works GmbH:0.7#@##1## 2021-07-30 20:58:27 novator::::1t5::@@kLIBC Extension Library
Vendor: bww bitwise works GmbH
Revision: 0.07
Date/Time: 2021-07-30 20:58:27
Build Machine: novator
File Version: 0.7.1
Description: kLIBC Extension Library

if i've understood well, there are also other updated libc* dlls

if i run yum update libc (or libcn or libcx) it says there is no package marked for update

psmedley

2021-08-18 19:15

administrator   ~0003882

As Steven said, the updates are in netlabs-exp - I'll wager you're only setup to update from update-rel

emax

2021-08-18 19:22

reporter   ~0003883

update

on this VM* "libcx0.dll" is used only by httpd (apache) and cron2 (2019 Yuri Dario build v1.4n)
there are no other processes using libcx0.dll

* this VM does only 2 roles: Apache+PHP and FtpD
  no others (except for the FW and cron), rsyncD only runs at night for bkups

another info

yesterday evening i had 7 eQ dumps at 22:39 with:

Cause: Invalid execution address 00000000
Trap -> 1ECB1AE1 LIBCN0 0001:000F1AE1 b_panic.c#519 ___libc_Back_panicV + 1589 0001:000F0558 (b_panic.obj)

rexx has not "gone away" this time and the fault daemon correctly rebooted the VM at 22:42

but in the journal log, the fault daemon tried to restart apache at 22:18

i've also found in POPUPLOG.OS2 "WGET crashing on LIBCX0.DLL" at 22:42 (these are checks from the fault-daemon)

08-17-2021 22:42:45 SYS3175 PID 21d1 TID 0001 Slot 006c
C:\USR\BIN\WGET.EXE
 c0000005
1ebb00fd
P1=00000001 P2=00000004 P3=XXXXXXXX P4=XXXXXXXX
EAX=00000000 EBX=000004fc ECX=00000004 EDX=000004fc
ESI=00000004 EDI=00000000
DS=0053 DSACC=d0f3 DSLIM=5fffffff
ES=0053 ESACC=d0f3 ESLIM=5fffffff
FS=150b FSACC=00f3 FSLIM=00000030
GS=0000 GSACC=**** GSLIM=********
CS:EIP=005b:1ebb00fd CSACC=d0df CSLIM=5fffffff
SS:ESP=0053:0018f768 SSACC=d0f3 SSLIM=5fffffff
EBP=0018f820 FLG=00010212
LIBCX0.DLL 0001:000100fd

hope this info can be of some help

Steven Levine

2021-08-20 10:19

manager   ~0003884

Did you forget to upload the exceptq reports for the httpd traps?

You need to open a separate ticket for the wget trap. This is a libcx issue, not an apache2 issue.

emax

2021-08-20 17:33

reporter   ~0003885

if these can be of any help
you can find here in attachment

emax

2021-08-20 17:36

reporter   ~0003886

mantis did not publish them, going to retry

21C7_05.TRP (74,384 bytes)
21C8_01.TRP (46,843 bytes)
21C9_01.TRP (46,961 bytes)
21CA_01.TRP (46,843 bytes)
2108_01.TRP (43,931 bytes)
2110_01.TRP (53,049 bytes)
21C0_01.TRP (47,191 bytes)

Steven Levine

2021-08-22 15:05

manager   ~0003887

How many times do you need to be asked to update to the most recently available libc and libcx packages?

 LIBCX0 1eba0000 00012d60 08/04/2021 13:05:11 230,570 C:\USR\LIB\LIBCX0.DLL
 LIBCN0 1ebc0000 0010bed0 02/26/2021 21:22:16 1,216,805 C:\USR\LIB\LIBCN0.DLL

emax

2021-08-23 16:58

reporter   ~0003889

Those DLLs are in the *experimental* repository, and this is a production server.

*Anyway, I will try to add them manually when I'm able to get them on the eCS desktop PC.*

I will not enable the experimental repository on the production server.

emax

2021-08-23 21:49

reporter   ~0003890

done

16/08/21 20:56 1.225.146 124 a--- libcn0.dll
16/08/21 21:09 65.099 124 a--- libcx0.dll

libcx0

Signature: @#bww bitwise works GmbH:0.7#@##1## 2021-08-17 00:09:39 no
Vendor: bww bitwise works GmbH
Revision: 0.07
Date/Time: 2021-08-17 00:09:39
Build Machine: novator
File Version: 0.7.1
Description: kLIBC Extension Library

Issue History

Date Modified Username Field Change
2021-06-04 19:27 emax New Issue
2021-06-04 19:32 emax Note Added: 0003706
2021-06-04 19:33 psmedley Note Added: 0003707
2021-06-04 20:24 psmedley Note Added: 0003708
2021-06-04 20:25 psmedley Note Added: 0003709
2021-06-04 20:29 emax Note Added: 0003710
2021-06-04 20:40 emax Note Added: 0003711
2021-06-04 20:50 emax Note Added: 0003712
2021-06-05 07:25 psmedley Note Added: 0003713
2021-06-07 21:18 emax Note Added: 0003717
2021-06-14 18:01 emax Note Added: 0003733
2021-06-14 18:04 emax Note Added: 0003735
2021-06-21 23:38 emax Note Added: 0003750
2021-06-22 03:14 emax Note Added: 0003751
2021-06-22 06:05 emax Note Added: 0003752
2021-06-22 06:59 psmedley Note Added: 0003753
2021-06-22 07:27 emax Note Added: 0003754
2021-06-22 09:35 psmedley Note Added: 0003755
2021-06-22 18:54 psmedley Note Added: 0003756
2021-06-22 19:04 emax Note Added: 0003757
2021-06-24 23:23 emax Note Added: 0003758
2021-06-25 06:52 psmedley Note Added: 0003759
2021-06-29 06:21 emax Note Added: 0003765
2021-07-02 06:22 Steven Levine Note Added: 0003774
2021-07-10 09:56 Steven Levine Assigned To => Steven Levine
2021-07-10 09:56 Steven Levine Status new => feedback
2021-07-10 09:56 Steven Levine Note Added: 0003791
2021-07-10 19:17 emax Note Added: 0003794
2021-07-10 19:17 emax Status feedback => assigned
2021-07-10 19:26 emax Note Added: 0003796
2021-07-15 14:16 Steven Levine Status assigned => feedback
2021-07-15 14:16 Steven Levine Note Added: 0003818
2021-07-15 23:55 emax Note Added: 0003821
2021-07-15 23:55 emax Status feedback => assigned
2021-07-16 01:27 Steven Levine Note Added: 0003823
2021-07-26 06:19 emax Note Added: 0003837
2021-08-09 18:31 emax File Added: 01E6_1B.TRP
2021-08-09 18:31 emax File Added: 01E4_1E.TRP
2021-08-09 18:31 emax File Added: 01E9_01.TRP
2021-08-09 18:31 emax File Added: 01E8_01.TRP
2021-08-09 18:31 emax Note Added: 0003867
2021-08-10 18:43 psmedley Summary Apache 2.4.46 build (1th/4/2021) with php 5.6.40 sometimes don't restart => Apache 2.4.4x build with php 5.6.40 sometimes don't restart
2021-08-11 12:07 Steven Levine Status assigned => feedback
2021-08-11 12:07 Steven Levine Note Added: 0003869
2021-08-11 19:48 emax Note Added: 0003872
2021-08-11 19:48 emax Status feedback => assigned
2021-08-12 02:04 Steven Levine Status assigned => feedback
2021-08-12 02:04 Steven Levine Note Added: 0003873
2021-08-12 18:19 emax Note Added: 0003875
2021-08-12 18:19 emax Status feedback => assigned
2021-08-13 01:14 Steven Levine Status assigned => feedback
2021-08-13 01:14 Steven Levine Note Added: 0003876
2021-08-13 01:45 emax Note Added: 0003877
2021-08-13 01:45 emax Status feedback => assigned
2021-08-13 04:25 Steven Levine Status assigned => feedback
2021-08-13 04:25 Steven Levine Note Added: 0003878
2021-08-18 04:52 emax Note Added: 0003879
2021-08-18 04:52 emax Status feedback => assigned
2021-08-18 05:46 Steven Levine Status assigned => feedback
2021-08-18 05:46 Steven Levine Note Added: 0003880
2021-08-18 18:05 emax Note Added: 0003881
2021-08-18 18:05 emax Status feedback => assigned
2021-08-18 19:15 psmedley Note Added: 0003882
2021-08-18 19:22 emax Note Added: 0003883
2021-08-20 10:19 Steven Levine Status assigned => feedback
2021-08-20 10:19 Steven Levine Note Added: 0003884
2021-08-20 17:33 emax Note Added: 0003885
2021-08-20 17:33 emax Status feedback => assigned
2021-08-20 17:36 emax File Added: 21C7_05.TRP
2021-08-20 17:36 emax File Added: 21C8_01.TRP
2021-08-20 17:36 emax File Added: 21C9_01.TRP
2021-08-20 17:36 emax File Added: 21CA_01.TRP
2021-08-20 17:36 emax File Added: 2108_01.TRP
2021-08-20 17:36 emax File Added: 2110_01.TRP
2021-08-20 17:36 emax File Added: 21C0_01.TRP
2021-08-20 17:36 emax Note Added: 0003886
2021-08-22 15:05 Steven Levine Status assigned => feedback
2021-08-22 15:05 Steven Levine Note Added: 0003887
2021-08-23 16:58 emax Note Added: 0003889
2021-08-23 16:58 emax Status feedback => assigned
2021-08-23 21:49 emax Note Added: 0003890