Debugging cgi-scripts

From genomewiki

Debugging with GDB

Complete instructions:

Make sure you have compiled with -ggdb and without optimizations (so that all variables are visible in the debugger) by adding

 export COPT="-O0 -ggdb"

to your .bashrc if you use bash. If you use csh or tcsh, add this to your .cshrc instead:

  setenv COPT "-O0 -ggdb"

You might need to run make clean; make cgi afterwards. Also make sure that the CGIs use the right hg.conf. Run

 export HGDB_CONF=<PATHTOCGIS>/hg.conf

Then:

 cd cgi-bin
 gdb --args hgc 'hgsid=4777921&c=chr21&o=27542938&t=27543085&g=pubsDevBlat&i=1000235064'

Do not forget the quotes around the parameter string (otherwise the shell interprets the & characters), and do not include the question mark from the URL in your internet browser.
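
Copying the parameters out of a browser URL by hand is error-prone, so here is a minimal sketch of a helper that does it. It is not part of the Genome Browser source tree: the function name cgi_args_from_url and the host name in the example are made up for illustration. It prints everything after the first question mark of its argument.

```shell
# Hypothetical helper (not part of the kent tree): print the CGI parameter
# string of a browser URL, i.e. everything after the first '?'.
cgi_args_from_url() {
    case "$1" in
        *\?*) printf '%s\n' "${1#*\?}" ;;   # strip the shortest prefix ending in '?'
        *)    printf '%s\n' "$1" ;;         # no '?': assume it already is a parameter string
    esac
}

# Example with a made-up host name:
cgi_args_from_url 'http://genome.example.org/cgi-bin/hgc?hgsid=4777921&c=chr21&o=27542938'
```

The result can be handed to gdb inside double quotes, which keep the shell from interpreting the & characters, for example: gdb --args hgc "$(cgi_args_from_url '<URL from browser>')"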

To get a stacktrace of the place where it's aborting:

 break errAbort
 run
 where
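
If you need the same stacktrace repeatedly, the three commands can be kept in a gdb command file and replayed in batch mode. This is only a sketch: the file name errAbort.gdb is an arbitrary choice, and the gdb call itself is left as a comment because it needs the compiled CGI and your own parameter string.

```shell
# Write the three gdb commands from above into a command file.
# The file name errAbort.gdb is an arbitrary choice.
cat > errAbort.gdb <<'EOF'
break errAbort
run
where
EOF

# Then, from the cgi-bin directory, replay them non-interactively:
#   gdb -batch -x errAbort.gdb --args hgc '<your CGI parameters>'
# -batch makes gdb exit after the last command, -x names the command file.
```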

Get coredumps from CGIs

Add this to the apache virtualhost CGI-BIN directory config to make errabort.c produce coredumps instead of errAbort error messages. You can then call gdb with

 gdb /usr/local/apache/cgi-bin/hgTracks 

Add this to the main apache config /etc/apache2/apache2.conf to make apache allow coredumps

 CoreDumpDirectory /tmp

If you're using Apport (the Ubuntu crash reporter, not to be confused with AppArmor), you'll need to deactivate it or change its config (/etc/default/apport).

 sudo vi /etc/default/apport
 # add this line:
 enabled=1

 sudo vi /etc/sysctl.conf
 # add these lines:
 kernel.core_pattern=/usr/local/dump/core.%e.%p.%s.%t
 fs.suid_dumpable=2

 sudo vi /etc/security/limits.conf
 # add these lines:
 root soft core unlimited
 root hard core unlimited
 www-data soft core unlimited
 www-data hard core unlimited

Don't forget to restart apache.
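
To check whether the settings took effect, you can print the values the kernel and the shell actually use. The function below is a sketch with a made-up name (show_core_settings). It reports the limits of the shell it runs in, not those of the Apache process, so run it as the user you are interested in. The /proc files only exist on Linux, hence the guards.

```shell
# Hypothetical helper: show the core dump settings seen by the current shell.
show_core_settings() {
    # Maximum core file size for this shell ("unlimited" or a block count).
    printf 'core size limit: %s\n' "$(ulimit -c)"
    # Kernel settings live under /proc on Linux; skip them elsewhere.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        printf 'core pattern: %s\n' "$(cat /proc/sys/kernel/core_pattern)"
    fi
    if [ -r /proc/sys/fs/suid_dumpable ]; then
        printf 'suid_dumpable: %s\n' "$(cat /proc/sys/fs/suid_dumpable)"
    fi
}

show_core_settings

# Changes to /etc/sysctl.conf are only read at boot; to load them now:
#   sudo sysctl -p
```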

Finding memory problems with valgrind

Sometimes the program crashes at random places because the stack or other data structures have been corrupted by rogue code. In that case the crash site tells you little, and you need valgrind to find the buggy code.

Run the program like this:

 valgrind --tool=memcheck --leak-check=yes pslMap ~max/pslMapProblem.psl ~max/pslMap-dm3-refseq.psl out.temp
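
A small wrapper saves retyping the options. The sketch below uses a made-up function name (memcheck) and adds two valgrind options that are not in the command above: --track-origins=yes, which reports where an uninitialised value came from, and --log-file, which writes the report to a file instead of mixing it with the program's own output (%p is replaced by the process ID).

```shell
# Hypothetical wrapper: run any command under valgrind's memcheck tool.
# The log file name memcheck.<pid>.log is an arbitrary choice.
memcheck() {
    valgrind --tool=memcheck --leak-check=yes --track-origins=yes \
        --log-file=memcheck.%p.log "$@"
}

# Usage, with the pslMap call from above:
#   memcheck pslMap ~max/pslMapProblem.psl ~max/pslMap-dm3-refseq.psl out.temp
```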

CGI is too slow

First thing to try: Add the "measureTiming=1" parameter to the CGI call.

If you still have no idea, you can run the CGI in gdb, press Ctrl-C and look at where it's stuck. Or run gprof, which shows how much CPU time each function takes, or valgrind, which includes most of the I/O time.

If you cannot press Ctrl-C because it's a CGI that needs very special POST parameters and so cannot easily be started from the command line, you can attach to a running CGI to see where it's stuck:

 sudo gdb /usr/local/apache/cgi-bin/hgLiftOver `pidof hgLiftOver`
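
If several instances of the CGI are running, pidof prints more than one process ID and the command above no longer works as intended. A possible workaround, sketched here with a made-up function name (cgi_backtraces), is to loop over the IDs and let gdb print a stacktrace for each in batch mode. It uses gdb's -p option to attach by process ID.

```shell
# Hypothetical helper: print a stacktrace for every running instance of a CGI.
# $1 is the program name as pidof knows it, e.g. hgLiftOver.
cgi_backtraces() {
    for pid in $(pidof "$1"); do
        echo "== $1 pid $pid =="
        # -batch: exit after the commands, -ex where: print the stacktrace,
        # -p: attach to the running process with this ID.
        sudo gdb -batch -ex where -p "$pid"
    done
}

# Usage:
#   cgi_backtraces hgLiftOver
```

In batch mode gdb detaches again when it exits, so the CGI keeps running afterwards.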

Profiling with gprof

First, recompile with the additional gcc option -pg, either for a single build or by adding it to your .bashrc:

  export COPT='-ggdb -pg'

Running the programs now will create a file gmon.out in the current working directory.

Run hgTracks (e.g. through Apache), go to the cgi-bin directory and run gprof on the newly created gmon.out file:

 gprof hgTracks gmon.out | less

hgTracks with the default tracks gave me this today:

 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ms/call  ms/call  name     
  17.65      0.06     0.06  1145954     0.00     0.00  hashLookup
   8.82      0.09     0.03   281068     0.00     0.00  cloneString
   5.88      0.11     0.02   113781     0.00     0.00  hashAdd
   5.88      0.13     0.02   113781     0.00     0.00  hashAddN
   5.88      0.15     0.02    67666     0.00     0.00  lmCloneString
   4.41      0.17     0.02                             lmCloneMem
   2.94      0.18     0.01  1055248     0.00     0.00  hashFindVal
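
The full gprof output is long because it also contains the call graph and explanatory text. To see only the top of a flat profile like the one above, something along these lines may help. The function name flat_profile and its defaults are made up for this sketch. -p restricts gprof to the flat profile and -b suppresses the explanations.

```shell
# Hypothetical helper: show the first lines of gprof's flat profile.
# $1 = program, $2 = profile data file (default gmon.out), $3 = number of lines (default 20).
flat_profile() {
    gprof -b -p "$1" "${2:-gmon.out}" | head -n "${3:-20}"
}

# Usage, from the cgi-bin directory:
#   flat_profile hgTracks
```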

Profiling with valgrind

gprof shows only CPU time. If you're stuck in I/O somewhere, gprof won't show it. In that case, press Ctrl-C a few times in gdb and check where the program is each time (best), or use valgrind again, this time with the callgrind tool:

 valgrind --tool=callgrind --dump-instr=yes --simulate-cache=yes --collect-jumps=yes hgTracks
 callgrind_annotate callgrind.out.<yourPID> | less
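
Since every run leaves behind a new callgrind.out.<PID> file, it is easy to annotate a stale one. A small sketch to pick the most recent file in the current directory (the function name latest_callgrind_out is made up):

```shell
# Hypothetical helper: print the name of the newest callgrind output file
# in the current directory (ls -t sorts by modification time, newest first).
latest_callgrind_out() {
    ls -t callgrind.out.* 2>/dev/null | head -n 1
}

# Usage:
#   callgrind_annotate "$(latest_callgrind_out)" | less
```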

KCachegrind allows better inspection of the results than callgrind_annotate, but it is a GUI program. It's on the big dev VM.

How to set the right hg.conf for CGIs on the command line

There are two ways: change to CGI-BIN or stay in src/hg/hgTracks. See above for the variable to direct hgTracks to the right hg.conf.
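
A quick sanity check before starting gdb can save a confusing session with the wrong database settings. The following sketch (the function name check_hgconf is made up) only tests that HGDB_CONF is set and points to a readable file. It does not look at the contents of the file.

```shell
# Hypothetical helper: verify that HGDB_CONF points to a readable hg.conf.
check_hgconf() {
    if [ -z "${HGDB_CONF:-}" ]; then
        echo "HGDB_CONF is not set" >&2
        return 1
    fi
    if [ ! -r "$HGDB_CONF" ]; then
        echo "cannot read $HGDB_CONF" >&2
        return 1
    fi
    echo "using $HGDB_CONF"
}

# Usage: only start the debugger if the check passes.
#   check_hgconf && gdb --args hgTracks '<your CGI parameters>'
```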

Alternatively, Angie says:

I actually want this setting from my ~/.hg.conf:

 udc.cacheDir=/data/tmp/angie/udcCacheCmdLine

Otherwise, since my gdb is running as angie not apache, there is a permissions error when trying to update udcCache files. But then, when I run hgTracks on the command line, I generally run in ~/kent/src/hg/hgTracks not /usr/local/apache/cgi-bin-angie. So changing the hgConfig logic to look for ./hg.conf would not affect my gdb usage. (my ~/kent/src/hg/trash is a symlink to /usr/local/apache/trash, and hg/.gitignore has trash and */ct/*)

BTW this is the entire non-comment contents of my ~/.hg.conf :

 include /usr/local/apache/cgi-bin-angie/hg.conf
 db.user=XXXX
 db.password=XXXX
 udc.cacheDir=/data/tmp/angie/udcCacheCmdLine

So there is very little difference between cgi-bin-angie/hg.conf and ~/.hg.conf . Why not use your ~/.hg.conf for gdb debugging?