processes killed automatically?

CUnknown
Posts: 22
Joined: Mon Dec 28, 2009 11:32 am

processes killed automatically?

Post by CUnknown » Tue Dec 29, 2009 9:54 pm

Hello,

Regarding my experiment with Transmission on my NAS, I am having some trouble with it.
Today, while downloading a DVD, the Transmission daemon was suddenly killed (after approximately 4 hours of downloading).

I was also listening to music streamed from the NAS and skipped forward through a lot of tracks. Maybe this had something to do with the Transmission service suddenly stopping?

Is there a service in the operating system of the NAS that kills services at a given moment, or kills them when they use too much system resources?

Thanks in advance

Mijzelf
Posts: 6206
Joined: Mon Jun 16, 2008 10:45 am

Re: processes killed automatically?

Post by Mijzelf » Wed Dec 30, 2009 10:59 am

Could be. Have a google on 'OOM killer'.

CUnknown
Posts: 22
Joined: Mon Dec 28, 2009 11:32 am

Re: processes killed automatically?

Post by CUnknown » Thu Dec 31, 2009 4:35 pm

Hmm.. it looks like Transmission gets the highest score, so it is the first to be killed when memory runs out.

It looks like there is nothing to do about it except changing the scoring weights (looks a bit tricky to me) or making a script that checks whether the daemon is running and restarts it if it isn't.
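
A minimal sketch of such a watchdog, assuming the daemon writes a PID file. The PID file path and start command in the commented example are hypothetical placeholders, not paths taken from this NAS:

```shell
#!/bin/sh
# Watchdog sketch: restart a daemon when its PID file is missing or stale.
# Nothing here is specific to Transmission; adapt the paths to your setup.

is_alive() {
    # $1 = PID file; succeeds only if it names a process that still exists
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

ensure_running() {
    # $1 = PID file, remaining arguments = command that starts the daemon
    pidfile=$1
    shift
    if is_alive "$pidfile"; then
        return 0
    fi
    echo "$(date): daemon not running, restarting" >&2
    "$@"
}

# Example call (hypothetical paths), e.g. from cron every few minutes:
# ensure_running /var/run/transmission.pid /opt/etc/init.d/S90transmission start
```

Note that this only restarts the daemon; it does nothing about the memory shortage that got it killed in the first place.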

Mijzelf
Posts: 6206
Joined: Mon Jun 16, 2008 10:45 am

Re: processes killed automatically?

Post by Mijzelf » Fri Jan 01, 2010 8:32 am

Don't know if restarting the daemon will help. When there's not enough memory, it will be killed again soon.

Do you have enough swapspace?

CUnknown
Posts: 22
Joined: Mon Dec 28, 2009 11:32 am

Re: processes killed automatically?

Post by CUnknown » Fri Jan 01, 2010 4:29 pm

I have not changed the partitions in any way, so it will be the default amount of swap space.
This will be the problem indeed. If the NAS doesn't have enough RAM, the swap space is used. If the NAS doesn't have enough swap, the OOM killer comes into action :twisted:
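
To see how much swap is actually configured and in use, the figures can be read from /proc/meminfo. A small sketch (the function names are made up for illustration; the optional file argument exists only so it can be tried on a saved copy):

```shell
#!/bin/sh
# Print one field from /proc/meminfo in kB, e.g. SwapTotal or SwapFree.
meminfo_kb() {
    # $1 = field name without the colon, $2 = file to read (optional)
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Swap currently in use, in kB (SwapTotal minus SwapFree).
swap_used_kb() {
    total=$(meminfo_kb SwapTotal "$1")
    free=$(meminfo_kb SwapFree "$1")
    echo $((total - free))
}

# Typical use on the box:
# echo "swap total: $(meminfo_kb SwapTotal) kB, used: $(swap_used_kb) kB"
```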

Do you know a way to extend the swap partition in place, without losing any data or opening the NAS?

Mijzelf
Posts: 6206
Joined: Mon Jun 16, 2008 10:45 am

Re: processes killed automatically?

Post by Mijzelf » Fri Jan 01, 2010 4:40 pm

Maybe. You can add a swapfile.

Code: Select all

# create 32 MB file
dd if=/dev/zero of=/path/to/swapfile count=32768 bs=1024
# convert to swapspace
mkswap /path/to/swapfile
# use it
swapon /path/to/swapfile
Only the swapon part has to be executed at each boot. If mkswap is not available on the box, you can use any Linux box to prepare the swapfile.
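
For the per-boot part, a helper along these lines could be called from a startup script. This is a sketch under assumptions: the function names are invented, and /path/to/swapfile is the same placeholder as above. It skips the swapon call when the file is already listed in /proc/swaps:

```shell
#!/bin/sh
# Boot-time sketch: enable a swap file only if it exists and is not active yet.

swap_is_active() {
    # $1 = swap file path, $2 = swaps table (defaults to /proc/swaps)
    # The first line of /proc/swaps is a header, so it is skipped.
    awk -v f="$1" 'NR > 1 && $1 == f { found = 1 } END { exit !found }' \
        "${2:-/proc/swaps}"
}

enable_swapfile() {
    # $1 = swap file path
    [ -f "$1" ] || { echo "no swap file at $1" >&2; return 1; }
    swap_is_active "$1" && return 0
    swapon "$1"
}

# enable_swapfile /path/to/swapfile
```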

davetuk
Posts: 4
Joined: Sat Mar 27, 2010 6:58 am

Re: processes killed automatically?

Post by davetuk » Sat Mar 27, 2010 7:24 am

First of all, this post is not a solution! It is a collection of my findings so far, in the hope that it will help you, me or others.

I know this is an old post, but I seem to be having the exact same problem as you (out of memory killer). In my case, it's wiping out twonkymediaserv every time I try to rebuild the database, and I think it may have never actually completed a full database rebuild for this reason:

Code: Select all

Mar 27 05:01:43 (none) user.warn kernel: oom-killer: gfp_mask=0x2d0
Mar 27 05:01:47 (none) user.warn kernel: DMA per-cpu:
Mar 27 05:01:47 (none) user.warn kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 27 05:01:47 (none) user.warn kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 27 05:01:47 (none) user.warn kernel: Normal per-cpu: empty
Mar 27 05:01:47 (none) user.warn kernel: HighMem per-cpu: empty
Mar 27 05:01:47 (none) user.warn kernel: 
Mar 27 05:01:47 (none) user.warn kernel: Free pages:        6836kB (0kB HighMem)
Mar 27 05:01:47 (none) user.warn kernel: Active:0 inactive:46 dirty:0 writeback:0 unstable:0 free:1709 slab:1032 mapped:0 pagetables:73
Mar 27 05:01:47 (none) user.warn kernel: DMA free:6836kB min:512kB low:640kB high:768kB active:0kB inactive:184kB present:16384kB pages_scanned:35 all_unreclaimable? no
Mar 27 05:01:47 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:47 (none) user.warn kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:47 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:47 (none) user.warn kernel: HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:47 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:47 (none) user.warn kernel: DMA: 483*4kB 283*8kB 117*16kB 20*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6836kB
Mar 27 05:01:47 (none) user.warn kernel: Normal: empty
Mar 27 05:01:47 (none) user.warn kernel: HighMem: empty
Mar 27 05:01:48 (none) user.warn kernel: Swap cache: add 9349, delete 9349, find 1194/2553, race 0+0
Mar 27 05:01:48 (none) user.warn kernel: Free swap  = 123684kB
Mar 27 05:01:48 (none) user.warn kernel: Total swap = 128448kB
Mar 27 05:01:48 (none) user.err kernel: Out of Memory: Killed process 1851 (mt-daapd).
Mar 27 05:01:48 (none) user.warn kernel: oom-killer: gfp_mask=0x2d0
Mar 27 05:01:48 (none) user.warn kernel: DMA per-cpu:
Mar 27 05:01:48 (none) user.warn kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 27 05:01:48 (none) user.warn kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 27 05:01:48 (none) user.warn kernel: Normal per-cpu: empty
Mar 27 05:01:48 (none) user.warn kernel: HighMem per-cpu: empty
Mar 27 05:01:48 (none) user.warn kernel: 
Mar 27 05:01:48 (none) user.warn kernel: Free pages:        6628kB (0kB HighMem)
Mar 27 05:01:48 (none) user.warn kernel: Active:17 inactive:89 dirty:0 writeback:0 unstable:0 free:1657 slab:1033 mapped:1 pagetables:68
Mar 27 05:01:48 (none) user.warn kernel: DMA free:6628kB min:512kB low:640kB high:768kB active:68kB inactive:356kB present:16384kB pages_scanned:30 all_unreclaimable? no
Mar 27 05:01:48 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:48 (none) user.warn kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:48 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:48 (none) user.warn kernel: HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:48 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:48 (none) user.warn kernel: DMA: 403*4kB 287*8kB 118*16kB 22*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6628kB
Mar 27 05:01:48 (none) user.warn kernel: Normal: empty
Mar 27 05:01:48 (none) user.warn kernel: HighMem: empty
Mar 27 05:01:48 (none) user.warn kernel: Swap cache: add 9410, delete 9409, find 1194/2562, race 0+0
Mar 27 05:01:48 (none) user.warn kernel: Free swap  = 124728kB
Mar 27 05:01:48 (none) user.warn kernel: Total swap = 128448kB
Mar 27 05:01:48 (none) user.err kernel: Out of Memory: Killed process 1732 (twonkymediaserv).
Mar 27 05:01:48 (none) user.warn kernel: oom-killer: gfp_mask=0x2d0
Mar 27 05:01:48 (none) user.warn kernel: DMA per-cpu:
Mar 27 05:01:48 (none) user.warn kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 27 05:01:48 (none) user.warn kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 27 05:01:48 (none) user.warn kernel: Normal per-cpu: empty
Mar 27 05:01:48 (none) user.warn kernel: HighMem per-cpu: empty
Mar 27 05:01:48 (none) user.warn kernel: 
Mar 27 05:01:48 (none) user.warn kernel: Free pages:        6708kB (0kB HighMem)
Mar 27 05:01:48 (none) user.warn kernel: Active:17 inactive:85 dirty:0 writeback:0 unstable:0 free:1677 slab:1033 mapped:2 pagetables:68
Mar 27 05:01:48 (none) user.warn kernel: DMA free:6708kB min:512kB low:640kB high:768kB active:68kB inactive:340kB present:16384kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:48 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:48 (none) user.warn kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:48 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:48 (none) user.warn kernel: HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 27 05:01:48 (none) user.warn kernel: lowmem_reserve[]: 0 0 0
Mar 27 05:01:48 (none) user.warn kernel: DMA: 413*4kB 288*8kB 120*16kB 22*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6708kB
Mar 27 05:01:48 (none) user.warn kernel: Normal: empty
Mar 27 05:01:48 (none) user.warn kernel: HighMem: empty
Mar 27 05:01:48 (none) user.warn kernel: Swap cache: add 9449, delete 9447, find 1194/2570, race 0+0
Mar 27 05:01:48 (none) user.warn kernel: Free swap  = 124728kB
Mar 27 05:01:48 (none) user.warn kernel: Total swap = 128448kB
Mar 27 05:01:48 (none) user.err kernel: Out of Memory: Killed process 1741 (twonkymediaserv).
I also notice this guy has the same problem.

The critical lines in the sequence, I think, are:

Code: Select all

Normal: empty
HighMem: empty
Essentially, the Linux kernel 'oom-killer' is being invoked because it thinks it has completely run out of memory. In reality, there is over 120MB of swap free at the time that this happens, so I think that either the system is not using its swap properly (it is using some swap, so the swap is operational), or it is a specific type of memory (highmem/lowmem/whatever) that is running out.
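
To pull just the kill events and the free-swap figures out of a log like the one above, something like the following sketch may help. It assumes the exact message wording shown in this log ("Out of Memory: Killed process PID (name)." and "Free swap  = NkB"); other kernel versions phrase these lines differently, and the function names are invented for illustration:

```shell
#!/bin/sh
# Names of the processes the OOM killer reported killing, in log order.
oom_victims() {
    # $1 = log file (defaults to /var/log/messages)
    sed -n 's/.*Out of Memory: Killed process [0-9][0-9]* (\(.*\))\.$/\1/p' \
        "${1:-/var/log/messages}"
}

# Free swap (in kB) reported in each OOM dump, in log order.
oom_free_swap() {
    sed -n 's/.*Free swap *= *\([0-9][0-9]*\)kB.*/\1/p' \
        "${1:-/var/log/messages}"
}

# Typical use:
# oom_victims | sort | uniq -c      # how often each process was killed
# oom_free_swap                     # was swap really exhausted each time?
```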

At the moment, all I have is links to useful resources:

http://linux-mm.org/OOM_Killer
http://serverfault.com/questions/56874/ ... -free-swap
http://www.redhat.com/magazine/001nov04/features/vm/
http://linux.derkeiler.com/Mailing-List ... 00062.html
http://www.sheepguardingllama.com/2008/ ... onitoring/

I have installed ssh and have root access, but this is just to monitor the situation, and the problem existed before.

I have been monitoring by clicking the 'Rebuild' button on the web page and monitoring the logs. The fault is hard to monitor because the oom-killer sometimes picks my ssh or telnet processes to kill first, thus killing my 'top' or 'tail -f /var/log/messages' shell!

I really want to adjust the vm parameters, but at the moment, I find that things like

Code: Select all

# echo "0" > /proc/sys/vm/oom-kill
or

Code: Select all

# echo "250" > /proc/sys/vm/lower_zone_protection
Just don't work.

I suspect (but am not fully sure) that these commands which map the parameters to a virtual filesystem actually invoke the sysctl command underneath, e.g.

Code: Select all

sysctl -w vm.max_map_count=65535
...but since sysctl is not included in the Networkspace's build of BusyBox, they don't work (try typing 'busybox' at a command shell to see all available commands).
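
For what it's worth, on mainline Linux the mapping normally runs the other way round: sysctl is a front end that reads and writes the files under /proc/sys, with the dots in the key becoming slashes in the path. A write can therefore also fail simply because the running kernel does not expose that file. The sketch below checks for the file before writing; the function names and the PROC_SYS override are assumptions added for illustration:

```shell
#!/bin/sh
# PROC_SYS can be overridden to try the functions on a scratch directory.
PROC_SYS=${PROC_SYS:-/proc/sys}

# Map a sysctl key to its file, e.g.
# vm.min_free_kbytes -> /proc/sys/vm/min_free_kbytes
sysctl_path() {
    echo "$PROC_SYS/$(echo "$1" | tr . /)"
}

# Write a value only if the kernel actually provides that parameter.
set_param() {
    # $1 = key, $2 = value
    path=$(sysctl_path "$1")
    if [ ! -e "$path" ]; then
        echo "$1: not provided by this kernel ($path missing)" >&2
        return 1
    fi
    echo "$2" > "$path"
}

# set_param vm.min_free_kbytes 1200
```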

I am considering updating the busybox executable, or installing sysctl separately somehow.

That's my progress so far, please post back if you find anything new (or a solution!)

Dave

davetuk
Posts: 4
Joined: Sat Mar 27, 2010 6:58 am

Re: processes killed automatically?

Post by davetuk » Tue Apr 06, 2010 12:06 pm

I have now solved this problem!

I did it by installing busybox and sysctl, then adding some settings to /etc/sysctl.conf to tweak the memory manager settings. It seems that LaCie have shipped these units with a mistake in the settings.

According to the linux kernel documentation, vm.min_free_kbytes=512 is much too low. I've set this to 1200 on my machine now, and have not had any oom problems since.

Most importantly, my twonkymediaserver can rebuild its database now without being killed by the oom-killer. For the first time ever the media server lists all the files in my collection, not just the ones it used to scan prior to being killed.

I will post all of the other /etc/sysctl.conf settings and how I got the kernel to load them automatically at boot shortly.

CUnknown
Posts: 22
Joined: Mon Dec 28, 2009 11:32 am

Re: processes killed automatically?

Post by CUnknown » Mon Apr 12, 2010 5:57 pm

Sounds good! I will wait till you post further documentation and try it out :D

davetuk
Posts: 4
Joined: Sat Mar 27, 2010 6:58 am

Re: processes killed automatically?

Post by davetuk » Sat Apr 17, 2010 8:19 am

Hi,

It's been a couple of weeks, but I'm pretty sure I can remember what I did...

Firstly you need to get root on your NetworkSpace and get remote access. The linked page gets you utelnetd.

Then you need to install ipkg by following the last part of this post.

[optional] At this point I'd recommend setting a root password and then installing openssh:

Code: Select all

/opt/bin/ipkg install openssh
Once this is working, you can remove the utelnetd hack.

Before we start, let me point out that ipkg seems to install files into /opt/. I was a little scared that ipkg was going to overwrite my existing binaries, but this does not seem to be the case.

To tweak the relevant kernel parameters, you need sysctl installed. This is normally part of the multi-call binary busybox, except unfortunately the build of busybox on the NetworkSpace does not include sysctl. (run /bin/busybox to list the available commands on the stock busybox) We do:

Code: Select all

/opt/bin/ipkg install busybox
This gives us the new binary /opt/bin/busybox, and if you run this, you should now see sysctl in the list.

You can now list the kernel parameters by doing /opt/sbin/sysctl -a. The lines starting vm.* are the interesting ones. You'll probably find that vm.min_free_kbytes is set to 512, which is one of the main reasons why the system invokes the oom killer (see previous link to kernel documentation). It is possible to start changing the kernel parameters by using sysctl -w xxxx=yyy, and this is what I did to tinker around with the settings while trying it out. You can make the settings permanent by adding them to a file called /etc/sysctl.conf. Or at least that's what most Linux documentation tells you. Unfortunately, there is no call in the boot up sequence which reads /etc/sysctl.conf so you have to add one yourself.
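
If sysctl is not installed yet, a rough stand-in for the vm.* part of sysctl -a is to read the files under /proc/sys/vm directly. A sketch (the function name is invented; the directory argument is optional and only there for dry runs):

```shell
#!/bin/sh
# Print "vm.<name> = <value>" for every readable file in a directory,
# approximating what `sysctl -a` shows for the vm.* keys.
list_vm_params() {
    dir=${1:-/proc/sys/vm}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Some entries are write-only; skip anything that cannot be read.
        value=$(cat "$f" 2>/dev/null) || continue
        echo "vm.$(basename "$f") = $value"
    done
}

# list_vm_params | grep min_free_kbytes
```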

My brain cannot cope with vi, so I did

Code: Select all

/opt/bin/ipkg install nano
then

Code: Select all

/opt/bin/nano /opt/etc/sysctl.conf
and typed

Code: Select all

vm.swappiness=80
vm.overcommit_memory=2
vm.min_free_kbytes=1200
Press Ctrl-X to exit; nano will ask whether to save the file, so answer yes.

Now create a symbolic link to your config file and test it out:

Code: Select all

ln -s /opt/etc/sysctl.conf /etc/sysctl.conf
/opt/sbin/sysctl -p
It should list the parameters that have been set; the output will look almost exactly like the contents of sysctl.conf (it just has spaces either side of the '=' signs).
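
To double-check that the values really took effect (now, or again after a reboot), the config file can be compared with the live values under /proc/sys. This is a sketch only: the function name is made up, it handles plain key=value lines and # comments, and the second argument exists purely for dry runs against a scratch directory:

```shell
#!/bin/sh
# Compare each key=value line in a sysctl.conf-style file with the live
# value under /proc/sys and print the ones that differ.
# Returns 0 if everything matches, 1 otherwise.
check_sysctl_conf() {
    conf=${1:-/etc/sysctl.conf}
    root=${2:-/proc/sys}
    status=0
    while IFS='=' read -r key want || [ -n "$key" ]; do
        # Strip spaces and tabs, then skip blank lines and comments.
        key=$(echo "$key" | tr -d ' \t')
        want=$(echo "$want" | tr -d ' \t')
        case "$key" in
            ''|'#'*) continue ;;
        esac
        have=$(cat "$root/$(echo "$key" | tr . /)" 2>/dev/null)
        if [ "$have" != "$want" ]; then
            echo "$key: wanted $want, found ${have:-nothing}"
            status=1
        fi
    done < "$conf"
    return $status
}

# check_sysctl_conf /opt/etc/sysctl.conf && echo "all settings applied"
```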

Make sure you are happy with the settings before you do the next step, as stupid kernel parameter tweaks could leave you with a system that doesn't boot.

Code: Select all

/opt/bin/nano /opt/etc/init.d/updatesysctl
copy and paste in the following:

Code: Select all

#!/bin/sh

# load the settings quietly; stdout and stderr both go to /dev/null
/opt/sbin/sysctl -e -p /etc/sysctl.conf >/dev/null 2>&1
Make the script executable and then create a symlink from the boot up scripts (rc.d) directory:

Code: Select all

chmod 0755 /opt/etc/init.d/updatesysctl
ln -s /opt/etc/init.d/updatesysctl /etc/rc.d/rc3.d/S02updatesysctl
You can test the script if you want by running /etc/rc.d/rc3.d/S02updatesysctl. The S02 just makes it run quite early in the boot sequence. I can only assume that LaCie's kernel settings defaults are compiled into the kernel.

Once you are happy, reboot the system:

Code: Select all

reboot
DONE.

Log back in again and watch your log messages:

Code: Select all

tail -f /var/log/messages
Then try doing the thing that caused the oom-killer to be invoked (for me it was rebuilding the media database from the web interface). You should see very little happening, where before you saw the oom-killer dancing around like the grim reaper (and if anything like me - killing your ssh terminal).

If you are tinkering, you'll find that a lot of the usual options for memory monitoring that people suggest are cut down in busybox, so they don't work. These two scripts I made were quite handy:

Code: Select all

NetworkSpace /root # cat mempoll
#!/bin/ash
while true
do
date
cat /proc/meminfo
sleep 2
done

NetworkSpace /root # cat slabpoll
#!/bin/ash
while true
do
date
awk '{printf "%5d kB %s\n", $3*$4/(1024), $1}' < /proc/slabinfo | sort -n
sleep 2
done

NetworkSpace /root #
Have fun!

Dave

miind
Posts: 2
Joined: Fri Jun 10, 2011 1:09 pm

Re: processes killed automatically?

Post by miind » Wed Jan 18, 2012 10:36 am

I'm trying out the fix outlined here. I guess lines with /opt/sbin/sysctl should be /opt/busybox sysctl instead?!

davetuk
Posts: 4
Joined: Sat Mar 27, 2010 6:58 am

Re: processes killed automatically?

Post by davetuk » Sat Nov 24, 2012 9:10 pm

miind wrote: I guess lines with /opt/sbin/sysctl should be /opt/busybox sysctl instead?!
I think the install creates symlinks or calling scripts for you, so either should work.

This fix has given me a couple of years of reliable service now.
