Issue:
We deployed new SAS software in the SAS web tier. While restarting the web services we were unable to bring the services up. We found the following error message in the web logs (/Lev1/Web/Logs/SASServer1_1):
com.atomikos.icatch.SysException: Error in init: Error in recover
We reached out to SAS technical support, and after investigation they determined that it was a PermGen issue.
Solution:
A restart of the SAS web application server instance would temporarily remedy this situation and get you back up, but it would likely re-occur unless it was caused by some unusual usage scenario.
SAS TS took the JVM memory settings from the log and discussed them with their performance expert to see whether some JVM arguments needed tweaking. They also suggested that more in-depth GC logging might be required to determine the root of the issue.
After further investigation it was found that the outage occurred due to heavy OLAP/cube usage exhausting PermGen again. This is a known issue if you have heavy OLAP/cube usage.
Immediate action:
Tweak the JVM arguments defined in /Lev1/Web/WebAppServer/SASServer1_1/bin/setenv.sh. You should see these: -XX:PermSize=768m -XX:MaxPermSize=1280m. To help with the PermGen issue, update these values to -XX:PermSize=1024m -XX:MaxPermSize=1500m, then restart the SASServer1_1 service when you can schedule time.
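As a sketch, the change can be scripted with GNU sed, assuming the default values appear verbatim in setenv.sh as shown above (back up the file first, and adjust the patterns if your setenv.sh formats the options differently):

```shell
# Back up setenv.sh, then raise the PermGen sizes in place.
SETENV=/Lev1/Web/WebAppServer/SASServer1_1/bin/setenv.sh
cp "$SETENV" "$SETENV.bak-$(date +%d%b%Y)"
sed -i 's/-XX:PermSize=768m/-XX:PermSize=1024m/; s/-XX:MaxPermSize=1280m/-XX:MaxPermSize=1500m/' "$SETENV"
# Verify the new values took effect
grep 'PermSize' "$SETENV"
```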
This may not permanently solve the issue. If it does not, we will have to enable GC logging and do some more in-depth performance analysis, which could cause further outage situations. Be aware that it may take a little more experimentation to get the JVM arguments tweaked just right and avoid this issue, but start with the recommendation above and it should help. Ultimately we ended up with the following JVM arguments:
-Xmx6000m
-Xms2048m
-XX:PermSize=1024m
-XX:MaxPermSize=1500m
This gives you a bit more JVM memory overhead, which may prevent further outages. Worst case, it does no harm.
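For illustration, these settings would end up on the JVM options line in setenv.sh; the fragment below is a sketch that appends them to the JVM_OPTS variable the diagnostics script further down also uses, though your setenv.sh may set them inline instead:

```shell
# Final JVM memory settings (illustrative placement in setenv.sh)
JVM_OPTS="$JVM_OPTS -Xmx6000m -Xms2048m"
JVM_OPTS="$JVM_OPTS -XX:PermSize=1024m -XX:MaxPermSize=1500m"
```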
Enabling GC logging:
Capturing the GC logging for a week or so will allow us to make more accurate JVM tuning suggestions. The following shell script code can be added to your setenv.sh file to enable GC logging, as well as some other diagnostics we often use for this type of situation. It should create two new log files in the specified LOG_DIR (you can change LOG_DIR if you choose).
#SAS web tier diagnostics
LOG_DIR="/Lev1/Web/Logs/SASServer1_1"
PROCESS_NAME="SASServer1_1"
# Output files
GC_LOG_FILE="$LOG_DIR/gc$PROCESS_NAME-$(date +%d%b%Y-%H:%M:%S).log"
ULIMIT_FILE="$LOG_DIR/ulimit$PROCESS_NAME-$(date +%d%b%Y-%H:%M:%S).log"
echo "GC data will be written to $GC_LOG_FILE"
echo "user limits written to $ULIMIT_FILE"
#Collect user limits for this process
echo "----------" > "$ULIMIT_FILE"
date >> "$ULIMIT_FILE"
echo "----------" >> "$ULIMIT_FILE"
echo "ulimit -a: " >> "$ULIMIT_FILE"
ulimit -a >> "$ULIMIT_FILE"
echo "----------" >> "$ULIMIT_FILE"
echo "ulimit -Ha: " >> "$ULIMIT_FILE"
ulimit -Ha >> "$ULIMIT_FILE"
#Enable verbose gc logging
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -verbose:gc -Xloggc:$GC_LOG_FILE"
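Once GC logging has been enabled and the service has run for a while, a quick way to gauge PermGen pressure before sending the log to TS is to count Full GC events and look at the longest recorded pauses. This is a sketch assuming the log file naming used by the diagnostics block above; the "Full GC" marker and trailing ", N.NNNNNNN secs" pause figures are standard in HotSpot -XX:+PrintGCDetails output:

```shell
# Pick the most recent GC log produced by the diagnostics block above
GC_LOG=$(ls -t /Lev1/Web/Logs/SASServer1_1/gcSASServer1_1-*.log | head -1)
# Full GC events are the ones that reclaim (or fail to reclaim) PermGen
echo "Full GC events: $(grep -c 'Full GC' "$GC_LOG")"
# Show the five longest pauses recorded so far
grep -o '[0-9]*\.[0-9]* secs' "$GC_LOG" | sort -rn | head -5
```

A steadily growing count of Full GC events between restarts is the signature we were chasing here; the pause times tell you how visible each collection was to users.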