View Issue Details

IDProjectCategoryView StatusLast Update
0000769Samba ServerBugpublic2024-01-29 21:57
ReporterDavid McKenna Assigned Topsmedley  
PrioritynormalSeveritymajorReproducibilityalways
Status resolvedResolutionfixed 
PlatformIntel i3-8100OSArcaOSOS Version5.0.7
Summary0000769: Sys0008 when browsing SAMBA share version 4.13 and newer
DescriptionWhen browsing a SAMBA share using SAMBA server version 4.13 or newer with any client, evventually browsing fails with an out of memory (sys0008) error on the command line, or shows '0 files' using the WPS. Also files cannot be opened directly on the share (like pdf, jpg, or wpi files) and mp3's cannot be played.
Steps To Reproduce Browse a server share using the WPS from a client. Within 5 minutes the condition will present itself.
Additional Information Have been using the ArcaOS 4.11.6 version successfully for awhile now, even with newer clients. Running newer servers (using the same smb.conf file) exhibits the described behaviour with any client.

 I captured a trace from using the 4.15 server (sys0008.ftf) and the ArcaOS 4.11.6 server (nosys0008.ftf) for about 6 minutes each. Zipped them up and attached the zip file. The 4.15 server showed sys0008 errors in a couple minutes, the 4.11 server never does.
TagsNo tags attached.
Attached Files
trace.zip (150,413 bytes)

Activities

David McKenna

2022-09-06 21:13

reporter   ~0004560

I captured some ndpsmb log files on the client while browsing the SAMBA share - I'll upload. Went back and forth from a command line window and a WPS window. At 16:45 the command line window started giving the sys008 error (almost right away). The WPS window started showing some known full directories as '0 files' at 16:48. I stopped collecting log data once the WPS windows always said '0 files' and the command prompt always showed sys0008.
ndpsmblogs.zip (666,729 bytes)

David McKenna

2022-09-06 21:18

reporter   ~0004561

Forgot one...
ndctl.log (1,048 bytes)   
16:43:59.760: 
NetDrive for OS/2(R) Control Program version 3.1.6.876 May 09 2017

16:43:59.760: 2022-09-06

16:43:59.760: NDFSDIR environment variable found: C:\Programs\NDFS
16:43:59.760: NetDrive working directory is C:\Programs\NDFS
16:43:59.760: File system driver name is NDFS32.
16:43:59.770: NetDrive configuration file 'C:\Programs\NDFS\ndctl.cfg' has been processed.
16:43:59.770: Registered Version

16:43:59.770: Registered to David McKenna

16:43:59.770: [   local] internal plugin has been initialized.
16:43:59.860: [  ndpsmb] (level 2) plugin has been loaded.
16:43:59.860: [  ndpftp] (level 2) plugin has been loaded.
16:43:59.860: [ ndpntfs] (level 2) plugin has been loaded.
16:43:59.860: [  ndpiso] (level 2) plugin has been loaded.
16:43:59.880: [  ndpdav] (level 2) plugin has been loaded.
16:43:59.880: [  ndpudf] (level 2) plugin has been loaded.
16:43:59.880: Initialization completed. The control program is running.
16:49:15.320: NetDrive Control Program has been interrupted by File System Driver.
ndctl.log (1,048 bytes)   

David McKenna

2022-09-06 21:24

reporter   ~0004562

I didn't explicitly mention that both the Server and client are running on ArcaOS 5.0.7 computers. Have not tested any other OS yet.

psmedley

2024-01-02 06:36

administrator   ~0004650

Latest info from https://www.os2world.com/forum/index.php/topic,3515.30.html confirms that this only occurs with a share which is a whole drive - ie c:/

psmedley

2024-01-05 06:37

administrator   ~0004652

This *appears* to be related to some timeout issues we were seeing with the client with browsing a directory with a large number of files.

By default, samba commands have a 20 second timeout - increasing it to 60 seconds allowed a directory listing of a large folder (6 directories and over 2500 files for over 22GB of data) to succeed - and anecdotally, seems to have prevented the issues above.

I'll take a look at implementing some better error handling in the client for this (ie to detect the timeout). Interestingly, smbclient reports:
NT_STATUS_INVALID_NETWORK_RESPONSE listing \software\os2\*
smb: \software\os2\> SMBecho failed (NT_STATUS_CONNECTION_DISCONNECTED). The connection is disconnected now.

David McKenna

2024-01-05 20:59

reporter   ~0004653

Attached are 2 log files requested. In both cases, all I did was start the server, then from the client, used the WPS to list the contents of the root of the share. 411logs.zip are the logs when running the 4.11.06 server, 418logs.zip are the logs when running the 4.18.9 server. Using 4.18 client for both.
411logs.zip (157,768 bytes)
418logs.zip (121,980 bytes)

David McKenna

2024-01-13 13:14

reporter   ~0004657

Paul,

  I attached the logs you requested. Started the NDFS control program using the 4.19 client attached to the 4.19 server. I got '0 files' browsing using the WPS at 8:06AM, then immediately checked on command line and also got the 'out of memory' error. Hope this helps...
419client-serverlog.zip (990,160 bytes)

psmedley

2024-01-13 21:50

administrator   ~0004658

First observation is that you have an enormous quantity of log lines like the following in log.smbc, which don't seem to be 'normal' (he says having not checked his own):

[2024/01/13 08:04:08.291000, 5] ../../libcli/smb/smb2_signing.c:574(smb2_signing_sign_pdu)
  signed SMB2 message (sign_algo_id=2)

Something like 33873 entries in a 76700 line log file.

log.smbc also shows what 'might' be the error that's getting reported to the client:
[2024/01/13 08:06:45.934000, 3] ../../source3/libsmb/clierror.c:91(cli_status_to_errno)
  cli_status_to_errno: 0xc0000017 -> 12

the errno of 12 maps to ENOMEM which is an out of memory error - https://android.googlesource.com/kernel/lk/+/dima/for-travis/include/errno.h#45

More after a beach walk :)

psmedley

2024-01-14 00:22

administrator   ~0004659

It may be interesting to confirm this is indeed a SMB2 issue.

On the server, add the following to smb.conf:
server min protocol = NT1

On the client:
client max protocol = NT1

psmedley

2024-01-14 02:21

administrator   ~0004660

OK, for the first time I was able to reproduce this. Seems to be triggered using the WPS to open large folders. General browsing from the command line doesn't do it.

psmedley

2024-01-14 04:56

administrator   ~0004661

Last edited: 2024-01-14 08:41

Does https://smedley.id.au/tmp/ndpsmb.zip help at all?

Edit: it breaks opening files via the WPS. It may still be interesting to see if you can trigger the out of memory error browsing via the WPS

Edit2: It doesn't actually break it, it was just really slow cos of the massive number of files in the directory I was opening from.

David McKenna

2024-01-14 13:31

reporter   ~0004662

I tried this ndpsmb.zip, but I can't seem to copy to the server with it. I can browse and open files though. Setting the server to 'server min protocol = NT1' doesn't seem to help either. Same behaviors as before...

On a different note, I notice that files on the server over 2GB in size always display as 2GB in the WPS. On the command line they show the correct size. Files on a local drive over 2GB show the correct size in the WPS too. Is that to be expected?

LewisR

2024-01-14 20:07

developer   ~0004663

Hi, David...

This is a known (by us) WPS bug. Have a quick look at Julien's ticket in the AN bugtracker, here:

https://mantis.arcanoae.com/view.php?id=2323

Whether it is fixable (and distributable by AN) or not remains unknown (under investigation).

As for not being able to copy to the server, do you have some log entries from either side? You might want to ensure that packet signing is off, as well, as I've seen issues with mandatory packet signing under SMB1 (unrelated to the OS/2 client or server). Packet signing cannot be disabled for SMB2, but for SMB1, on the server side, in the global stanza of smb.conf, set the following:

server signing = disabled

and restart the server. Unless the client side is set to mandatory, it should In fact, you might want to disable signing on the client side for this SMB1 testing, as well. Again, in the global stanza, set:

client signing = disabled

and restart the NetDrive control program.

Thanks for testing all of this and providing such great feedback.

psmedley

2024-01-14 21:28

administrator   ~0004664

As Lewis said, 2gb limit in WPS is known. I'll have a look and see how I could have introduced an issue with copying files to the server, but are you able to reproduce the out of memory condition through browsing via the WPS?

David McKenna

2024-01-14 23:29

reporter   ~0004665

Using the smb.conf directives Lewis requested (and keeping the NT1 ones as well), I reloaded the test ndpsmb.dll on the client and I have not had the '0 files' condition appear for 15 minutes (a long time!) while browsing the share. But it hardly matters, because I can't do anything else - no copy or open of files, or play mp3 or read txt files. When I try, it eventually leads to a WPS hang. At least the original 4.19 client could be used for awhile. (EDIT: while typing this, I got the out of memory error with the test ndpsmb.dll).

I have tried the original 4.19 client in a Virtual Box install on a Windows host, and have not yet got the out of memory condition (2 days now). But I also don't have any trouble using the share with Windows either.

Have also tried the 4.19 client on my ArcaOS laptop and there I eventually get WPS hangs, and the command line will show 'directory not found' errors instead of 'out of memory'. The laptop is attaching to the network via a travel router.

 Also, smbclient does not work anywhere if I don't add -t 60 to the command - it always times out no matter where it is run.

 I noticed in one of the server logs that there were 'broken extended attributes' on some directories (in particular \software and \software\os2, 2 of the largest) and files on the share. I tried to run eaclean on the drive, but it immediately crashed. This makes me nervous about the health of the drive (even though it chkdsk's without error) so I got a new one and will replace it - probably tomorrow. I attach the log that shows the error and a random file it claims has broken extended attributes.
odinapp.zip (1,390,360 bytes)

psmedley

2024-01-15 08:11

administrator   ~0004666

Probably need to zip the defective file as the EA's will have been lost with the upload here...

I'll have another think about the OOM error...

David McKenna

2024-01-15 13:27

reporter   ~0004667

OK, I 7zipped the same file and attached. I tried just zip, but it would fail with a message: 'Interrupted (abort)'. Hope 7zip preserves EA's.
broken.7z (1,389,914 bytes)

David McKenna

2024-01-15 13:31

reporter   ~0004668

I wonder if it would be of any help to capture a TCP/IP trace while browsing for inspection in Wireshark? I could do it with the 4.11, then the 4.19 server running to see if there is any extraneous traffic being created to slow things down....

psmedley

2024-01-18 08:31

administrator   ~0004669

OK so the messages like:
os2getxattr : (software:0) user.DOSATTRIB 56
os2getxattr : (null:45) system.posix_acl_access -1
os2getxattr : (software:0) system.posix_acl_default -1
os2getxattr : (null:45) user.DOSATTRIB -1
os2getxattr : (null:45) user.DOSATTRIB -1
os2getxattr : (software/.:0) user.DOSATTRIB 56
os2listxattr: path software/., file 0 at libattr/syscalls.c(290)
Broken Extended Attributes detected for: software/.:0, Last EA:

Come from libattr, here, I can read odinapp.zip's EA's just fine. Weird.

psmedley

2024-01-22 08:32

administrator   ~0004682

I don't suspect it will make a difference, but can you test with https://smedley.id.au/tmp/ndpsmb-20240122.zip

David McKenna

2024-01-22 11:24

reporter   ~0004683

Unfortunately it does not help the 'out of memory' problem.

psmedley

2024-01-24 08:39

administrator   ~0004688

OK - so the first out of memory seems to be:
[2024/01/23 09:38:27.270000, 5] ../../source3/smbd/files.c:1985(file_free)
  file_free: freed files structure 0 (1 used)
[2024/01/23 09:38:27.270000, 5] ../../source3/smbd/files.c:77(fsp_new)
  fsp_new: allocated files structure (2 used)
[2024/01/23 09:38:27.270000, 5] ../../source3/smbd/files.c:179(file_new)
  file_new: new file fnum [invalid value]
[2024/01/23 09:38:27.270000, 5] ../../source3/smbd/files.c:321(openat_internal_dir_from_pathref)
  openat_internal_dir_from_pathref: fd_openat(".", ".") failed: NT_STATUS_NO_MEMORY
[2024/01/23 09:38:27.270000, 5] ../../source3/smbd/files.c:1985(file_free)
  file_free: freed files structure 0 (1 used)
[2024/01/23 09:38:27.270000, 3] ../../source3/smbd/filename.c:319(get_real_filename_full_scan_at)
  get_real_filename_full_scan_at: scan dir didn't open dir [.]: NT_STATUS_NO_MEMORY
[2024/01/23 09:38:27.270000, 5] ../../source3/smbd/files.c:1985(file_free)
  file_free: freed files structure 0 (0 used)
[2024/01/23 09:38:27.270000, 3] ../../source3/smbd/smb2_server.c:4033(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_NO_MEMORY] || at ../../source3/smbd/smb2_create.c:360

David McKenna

2024-01-29 21:27

reporter   ~0004694

The client version ndpsmb-20240127 no longer has the 'out of memory' issue (been using it a couple of days now). This ticket can be closed - thanks!

psmedley

2024-01-29 21:57

administrator   ~0004695

Latest client plugin resolves this

Issue History

Date Modified Username Field Change
2022-09-06 10:53 David McKenna New Issue
2022-09-06 10:53 David McKenna File Added: trace.zip
2022-09-06 21:13 David McKenna Note Added: 0004560
2022-09-06 21:13 David McKenna File Added: ndpsmblogs.zip
2022-09-06 21:18 David McKenna Note Added: 0004561
2022-09-06 21:18 David McKenna File Added: ndctl.log
2022-09-06 21:24 David McKenna Note Added: 0004562
2024-01-02 06:36 psmedley Note Added: 0004650
2024-01-05 06:37 psmedley Note Added: 0004652
2024-01-05 20:59 David McKenna Note Added: 0004653
2024-01-05 20:59 David McKenna File Added: 411logs.zip
2024-01-05 20:59 David McKenna File Added: 418logs.zip
2024-01-13 13:14 David McKenna Note Added: 0004657
2024-01-13 13:14 David McKenna File Added: 419client-serverlog.zip
2024-01-13 21:50 psmedley Note Added: 0004658
2024-01-14 00:22 psmedley Note Added: 0004659
2024-01-14 02:21 psmedley Note Added: 0004660
2024-01-14 04:56 psmedley Note Added: 0004661
2024-01-14 08:21 psmedley Note Edited: 0004661
2024-01-14 08:24 psmedley Note Edited: 0004661
2024-01-14 08:41 psmedley Note Edited: 0004661
2024-01-14 13:31 David McKenna Note Added: 0004662
2024-01-14 20:07 LewisR Note Added: 0004663
2024-01-14 21:28 psmedley Note Added: 0004664
2024-01-14 23:29 David McKenna Note Added: 0004665
2024-01-14 23:29 David McKenna File Added: odinapp.zip
2024-01-14 23:29 David McKenna File Added: log.smbd.davem.192.168.21.3
2024-01-15 08:11 psmedley Note Added: 0004666
2024-01-15 13:27 David McKenna Note Added: 0004667
2024-01-15 13:27 David McKenna File Added: broken.7z
2024-01-15 13:31 David McKenna Note Added: 0004668
2024-01-18 08:31 psmedley Note Added: 0004669
2024-01-19 21:01 psmedley Category 10 => Bug
2024-01-22 08:32 psmedley Note Added: 0004682
2024-01-22 11:24 David McKenna Note Added: 0004683
2024-01-24 08:39 psmedley Note Added: 0004688
2024-01-29 21:27 David McKenna Note Added: 0004694
2024-01-29 21:57 psmedley Assigned To => psmedley
2024-01-29 21:57 psmedley Status new => assigned
2024-01-29 21:57 psmedley Status assigned => resolved
2024-01-29 21:57 psmedley Resolution open => fixed
2024-01-29 21:57 psmedley Note Added: 0004695