Shlomi Reiss created HHQ-5818:
-
Summary: Number of open files defined for large scale deployment is too low - Server crashes.
Key: HHQ-5818
URL: https://jira.hyperic.com/browse/HHQ-5818
Project: Hyperic HQ
Issue Type: Bug
Components: Installation / Upgrade
Affects Versions: 5.7
Environment: Scale environment: 2K agents
Reporter: Shlomi Reiss
Assignee: Mayan Weiss
Priority: Critical
Scenario:
1. Installed the Hyperic vApp with the large-scale definition.
2. Connected 2K agents to the server (agents were started gradually, 5 seconds apart).
Result:
After about 10 hours, the server crashed and became inaccessible.
The wrapper log shows "Too many open files" errors, and after a while the server crashes with OOM.
The server is a zombie: the Java processes appear to be up and running (in ps), but the server doesn't respond.
Log files are attached.
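One way to confirm descriptor exhaustion before the crash is to compare a process's open descriptor count with its soft limit. The following is a minimal Python sketch, assuming a Linux host with /proc mounted; the function names are hypothetical and are not part of Hyperic, and the server's Java PID has to be supplied by whoever runs it.

```python
import os

def open_fd_count(pid="self"):
    """Count the entries in /proc/<pid>/fd, one per open descriptor."""
    return len(os.listdir(f"/proc/{pid}/fd"))

def soft_nofile_limit(pid="self"):
    """Read the soft 'Max open files' limit from /proc/<pid>/limits."""
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Columns: "Max open files  <soft>  <hard>  files"
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    raise LookupError("no 'Max open files' row found")

if __name__ == "__main__":
    # "self" inspects this script; pass the server's Java PID instead.
    used, limit = open_fd_count(), soft_nofile_limit()
    print(f"open descriptors: {used}, soft limit: {limit}")
```

Reading another process's /proc entries normally requires running as the same user or as root. A count that climbs steadily toward the soft limit would be consistent with the "Too many open files" errors in the wrapper log.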
Looking at the open files limit on the vApp, I get the following:
localhost:/opt/hyperic/server-5.7.0-EE/logs # ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 96298
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 96298
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Looking at my other scale setup (which was deployed manually by Scott, not as a vApp), I see that the OS definitions are different:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 96056
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 96056
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
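The differences between the two listings can be picked out mechanically. The following is a small Python sketch that parses `ulimit -a` text and reports only the settings that differ; the helper names are hypothetical, and the two strings are excerpts of the listings above, not the full output.

```python
import re

# Matches lines such as "open files (-n) 1024" or "pipe size (512 bytes, -p) 8".
LINE = re.compile(r"^(?P<name>.+?)\s*\((?:[^()]*,\s*)?-\w\)\s+(?P<value>\S+)$")

def parse_ulimit(text):
    """Map each limit name to its value, both kept as strings."""
    limits = {}
    for line in text.strip().splitlines():
        match = LINE.match(line.strip())
        if match:
            limits[match["name"]] = match["value"]
    return limits

def diff_limits(a, b):
    """Return {name: (value_in_a, value_in_b)} for settings that differ."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}

# Excerpt from the vApp listing in this report.
VAPP = """
data seg size (kbytes, -d) unlimited
max locked memory (kbytes, -l) 64
open files (-n) 1024
stack size (kbytes, -s) 8192
"""

# Excerpt from the manually deployed setup in this report.
MANUAL = """
data seg size (kbytes, -d) unlimited
max locked memory (kbytes, -l) 32
open files (-n) 8192
stack size (kbytes, -s) 10240
"""

if __name__ == "__main__":
    for name, (vapp, manual) in diff_limits(parse_ulimit(VAPP), parse_ulimit(MANUAL)).items():
        print(f"{name}: vApp={vapp}, manual={manual}")
```

For this report the relevant difference is open files: 1024 on the vApp versus 8192 on the manually deployed setup.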