kernel 2.6.32 memory issue with lucid

Asked by Walter Richards on 2010-10-18

Hello,

We have a memory issue with the current stable kernel on Lucid (2.6.32-25-server x86_64). We have run into the same issue with every 2.6.32 kernel published for Lucid that we have tried with our application.

Our system has 32GB of memory. When we run our application, memory usage grows continuously until the OS invokes the OOM killer against nearly every process. It usually takes less than 8 hours to reproduce this behavior.

From sar, I can see the memory usage growing. Once more than 90% of memory is in use, "kbbuffers" and "kbcached" are freed, and when there is not much memory left to free, the OOM killer kicks in and starts killing processes.

The funny thing is that if we stop the application shortly before the OS would trigger the OOM killer, not much memory is freed. Only around 3GB is freed, which is pretty much what the application itself is using. I confirmed this by running "memstat" regularly against our application. We even flushed the caches with "echo 3 > /proc/sys/vm/drop_caches", but again not much was actually freed. That is understandable to me, since according to sar that part had already been freed by the OS. Now, if we leave the system idle overnight, the kernel slowly releases tiny bits, to the point where most of the memory is free again the next day.

We tried tuning the kernel parameters (sysctl), which did not really help. We also compared the sysctl settings with those of the Karmic kernel, without finding anything.

Finally I gave up and tried the kernels provided at http://kernel.ubuntu.com/~kernel-ppa/mainline/. I tried v2.6.33.5-lucid and v2.6.34-lucid, and with both my application runs fine without any issue.

I believe there is a bug in the 2.6.32 kernel. Is there something I missed in my approach? Should this be reported as a bug?

We chose to run our application on Lucid because it is an LTS release. This is the second application we have that shows this behavior, and we would really like a fix for our problem.

Thanks in advance.
--
Walter

Question information

Language: English
Status: Open
For: Ubuntu linux
Assignee: No assignee

Launchpad Janitor (janitor) said : #1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

I am currently running the lucid-proposed 2.6.32-26 kernel. So far it seems to solve my problems, which were most likely related to the page allocator (mm) bug reported in:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/649483

* mm: page allocator: update free page counters after pages are placed on the free list
* mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake
* mm: page allocator: drain per-cpu lists after direct reclaim allocation fails

Simon Déziel (sdeziel) said : #3

Hi Walter,

I'm glad to hear that the -proposed update fixed your problem. Could you please report your success in the bug report?

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/649483

Thank you and have a good day,
Simon

Hello Simon,

We are almost done with our testing and should complete all our tests within the next 2 days. I will update the bug report as soon as we have full confirmation that our problems are fixed.

But so far, it looks very good.
--
Walter

Hello,

We ran tests with 2.6.32-26, but the memory problem came back and the OOM killer was triggered once again.

We are still looking and would appreciate any help or suggestions.
--
Walter

Hello,

We still have this issue with the 2.6.32-26 kernel from the LTS release.

I can add a few details to this question.

The application we are running receives a lot of FTP traffic. We are currently running vsftpd.

I can see that the memory is slowly being consumed by pcpu_get_vm_areas allocations.

px-dev1:~# free
             total       used       free     shared    buffers     cached
Mem:      33077468   25238696    7838772          0     712052    8893704
-/+ buffers/cache:   15632940   17444528
Swap:      8191992          0    8191992
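
As a cross-check of the figures above, the "-/+ buffers/cache" row can be recomputed from the "Mem:" row. This is only a minimal sketch: the function name is hypothetical, and the numbers are copied from the free output in this post.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the two figures on free's '-/+ buffers/cache' row (all in kB)."""
    used_without_cache = used - buffers - cached
    free_with_cache = free + buffers + cached
    return used_without_cache, free_with_cache

# Values in kB, copied from the "Mem:" row of the free output above.
TOTAL, USED, FREE, BUFFERS, CACHED = 33077468, 25238696, 7838772, 712052, 8893704

used_without_cache, free_with_cache = buffers_cache_line(USED, FREE, BUFFERS, CACHED)
print(used_without_cache)  # 15632940, matching the '-/+ buffers/cache' row
print(free_with_cache)     # 17444528
# Share of total RAM in use that buffers and page cache do not explain.
print(round(100.0 * used_without_cache / TOTAL, 1))
```

Roughly 15GB is in use that neither buffers nor the page cache accounts for, which would be consistent with drop_caches freeing very little.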

px-dev1:~# grep pcpu_get_vm_areas /proc/vmallocinfo |wc -l
4240

It keeps growing like that until the system has no memory left (at about 12k pcpu_get_vm_areas entries), and then the OOM killer kicks in.
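
To track the size of these allocations and not only their count, the grep above could be extended with a small parser. The following is a rough sketch, assuming the usual /proc/vmallocinfo line shape (address range, size in bytes, then the caller symbol). The function name and the sample lines are hypothetical and are not taken from the affected system.

```python
from collections import defaultdict

def summarize_vmallocinfo(text):
    """Return {caller: (entry_count, total_bytes)} for /proc/vmallocinfo-style text."""
    totals = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        fields = line.split()
        # Assumed shape: <start>-<end> <size_in_bytes> <caller+offset/length> ...
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        caller = fields[2].split("+", 1)[0]  # drop the "+0x0/0x790" suffix
        totals[caller][0] += 1
        totals[caller][1] += int(fields[1])
    return {name: (count, size) for name, (count, size) in totals.items()}

# Illustrative sample lines only (hypothetical addresses and sizes).
SAMPLE = """\
0xffffc90000000000-0xffffc90000002000    8192 pcpu_get_vm_areas+0x0/0x790 vmalloc
0xffffc90000002000-0xffffc90000005000   12288 alloc_large_system_hash+0x14b/0x215 pages=2 vmalloc
0xffffe8ffffc00000-0xffffe8ffffe00000 2097152 pcpu_get_vm_areas+0x0/0x790 vmalloc
"""

summary = summarize_vmallocinfo(SAMPLE)
count, size = summary["pcpu_get_vm_areas"]
print(count, size)  # 2 entries, 2105344 bytes in this sample
```

On a live system the same function could be fed the contents of /proc/vmallocinfo (read as root) at regular intervals, to see whether the byte total grows along with the entry count.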

I would appreciate any idea or suggestion.

Thanks.
--
Walter

Hello,

If I change the FTP server from vsftpd to ftpd, the problem disappears.

The issue seems to be caused by the combination of vsftpd and the 2.6.32 kernel.
--
Walter

haref bfgds (ferel23) said : #9

Yes, it's really a serious issue that I am also facing on my new https://bikespassion.com/ website. Can someone please guide me on it?

Mark Jerry (markjerry) said : #10

I am also facing the same issue on my website https://pescobill.pk/. If anyone has found a solution, please let me know. Thanks.
