Training new Browser Engineers
Latest revision as of 19:47, 24 September 2018
parasol
Trainer: Galt Barber
Parasol is software for running batches of jobs on the compute cluster.
There is a full page of parasol information and examples here: Parasol_how_to
software engineering best practices
Trainer: Kate Rosenbloom
Recommended reading:
- The Art of UNIX Programming, second part of Chapter 1 (Basics of the Unix Philosophy, Eric Raymond). pp. 11-27
- Beautiful Code, Chapter 13 (Design of the Gene Sorter, Jim Kent). pp. 217-228
- The good: https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style#Lessons
- The bad & the ugly: https://en.wikipedia.org/wiki/Code_smell
- Pike's rules: http://users.ece.utexas.edu/~adnan/pike.html
Some useful acronyms:
- DRY programming (also OAOO) - Don't Repeat Yourself, Once & Only Once
- YAGNI - You Ain't Gonna Need It
expressions:
- Software rot, bit decay
- Technical debt
and aphorisms:
- Do the simplest thing that can possibly work (Ward Cunningham, Extreme Programming)
- Any fool can write code that a computer can understand. Good programmers write code that humans can understand. (Martin Fowler, Refactoring)
Some philosophers of good practice:
- Brian Kernighan, PJ Plauger, Rob Pike, Yourdon (early UNIX)
- Martin Fowler, Ward Cunningham, Kent Beck (XP crowd)
- Christopher Alexander, Erich Gamma, Gang of Four (Design Patterns)
- 'Uncle Bob' Martin
machine layout, clusters, data flow, etc
Trainer: Hiram Clawson
loading a track, making an assembly
Trainer: Hiram Clawson
C libraries and GB code gotchas
Trainer: Jim Kent
kent src coding standards & libraries overview
Trainer: Angie Hinrichs
When we write library code or CGI code, that code will (hopefully) be in use for many years, and readability is extremely important because somebody else may need to debug or extend your code some day. Code is easier to read if it looks like it was written by one person as opposed to a jumble of different styles and indentation rules. Try to make your code look like Jim wrote it, and readers will thank you (or at least refrain from cursing you ;).
Basics of kent/src C formatting conventions:
- multi-word variable or function names: camelCase, camel123Case (numbers and acronyms treated as words)
- tab = 8 spaces
- indentation = 4 spaces
- opening brace { is indented on next line (not at end of line)
- function declaration is followed by a brief comment describing inputs, outputs, gotchas like memory allocation details
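A small sketch of how these conventions might look together. The function name countGcBases and its behavior are hypothetical, made up only to show the layout; check the details against real kent/src files before relying on them.

```c
#include <assert.h>
#include <stddef.h>

int countGcBases(const char *dnaSeq, size_t seqSize)
/* Return the number of G or C bases (either case) in the first seqSize
 * characters of dnaSeq.  Allocates nothing; dnaSeq does not need to be
 * zero-terminated. */
{
int gcCount = 0;
size_t i;
assert(dnaSeq != NULL);
for (i = 0;  i < seqSize;  i++)
    {
    char base = dnaSeq[i];
    if (base == 'g' || base == 'G' || base == 'c' || base == 'C')
        gcCount++;
    }
return gcCount;
}
```

Note the camelCase names, the comment placed right after the declaration, the 4-space indentation, and the opening brace of the for block sitting on its own line, indented with the block body.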
If you use emacs, get Chuck Sugnet's jkent-c.emacs and add this to your ~/.emacs:
(load-file "~/jkent-c.emacs")
Higher-level conventions:
- try to keep functions short enough to view in one editor screenful
- empty lines between function declarations (but rarely within functions)
- define functions before their first use in the file (to avoid duplicate declarations)
- comment sparingly -- don't repeat what the code says, but say why you're doing something non-obvious
- errAbort when there's an error condition -- much easier to find bugs that way
Get to know src/inc/common.h well (and common.c), and use its utilities and error-checking wrappers around C lib functions:
- strcpy --> safecpy, sprintf --> safef, strncpy --> safencpy, strcat --> safecat
- malloc --> needMem (and variants)
- free --> freez, freeMem
- memcpy --> cloneString, cloneStringZ, cloneMem
- read --> mustRead, similar for write and close
- verbose() for debugging comments
- string utilities: sameString, startsWith*, stringIn, wildMatch, count*, chop*
- sl* functions for list operations
- sl* variants: slName, slRef, slPair
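To illustrate why the error-checking wrappers are worth using, here is a self-contained sketch of the general idea. These are NOT the common.h implementations: every name starting with "sketch" is a hypothetical stand-in written for this page, and the real functions may differ in signature and behavior (for example, sketchFreez is a macro that takes the pointer variable itself, a simplification assumed here). The point is only that each wrapper either succeeds or aborts loudly, so callers never need to check return codes.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void sketchAbort(const char *format, ...)
/* Print a formatted message to stderr and exit.  Hypothetical stand-in
 * for an errAbort-style function. */
{
va_list args;
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
fputc('\n', stderr);
exit(EXIT_FAILURE);
}

void *sketchNeedMem(size_t size)
/* Return size bytes of zeroed memory, or abort.  A request for 0 bytes
 * is treated as an error in this sketch.  Caller frees. */
{
void *pt;
if (size == 0)
    sketchAbort("sketchNeedMem: trying to allocate 0 bytes");
pt = calloc(1, size);
if (pt == NULL)
    sketchAbort("sketchNeedMem: out of memory, request size %zu", size);
return pt;
}

int sketchSafef(char *buffer, size_t bufSize, const char *format, ...)
/* Format into buffer, aborting instead of silently truncating when the
 * result does not fit.  Returns characters written, not counting the
 * zero terminator. */
{
int sz;
va_list args;
va_start(args, format);
sz = vsnprintf(buffer, bufSize, format, args);
va_end(args);
if (sz < 0 || (size_t)sz >= bufSize)
    sketchAbort("sketchSafef: buffer overflow, size %zu, format: %s",
        bufSize, format);
return sz;
}

char *sketchCloneString(const char *s)
/* Return a heap copy of s, or NULL if s is NULL.  Caller frees. */
{
size_t size;
char *copy;
if (s == NULL)
    return NULL;
size = strlen(s) + 1;
copy = sketchNeedMem(size);
memcpy(copy, s, size);
return copy;
}

/* Free a pointer variable and set it to NULL so that a stale pointer
 * cannot be reused by accident. */
#define sketchFreez(ptrVar) do { free(ptrVar); (ptrVar) = NULL; } while (0)
```

With wrappers like these, a call such as sketchSafef(buf, sizeof(buf), "chr%d", 21) needs no follow-up check: if the buffer were too small the program would stop with a message at the point of failure, which ties in with the errAbort convention above.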
Other absolutely fundamental src/{inc,lib} modules include:
- dyString: dynamically allocated, expandable strings
- errAbort: context-specific handling of warnings and errors
- hash: associate strings with any type of data
- linefile: read in a file (or URL! and automatically decompress compressed files!) line by line
- obscure: despite the name, this companion to common.h holds a lot of useful util functions
- options: command-line option parsing
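As a rough picture of what "dynamically allocated, expandable strings" means, the sketch below implements a tiny growable string in plain C. It is not the dyString module: the struct growString and the growString* functions are hypothetical names invented for this illustration, and the buffer-doubling strategy is an assumption, not a description of the library's internals.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct growString
/* Hypothetical growable string, for illustration only. */
    {
    char *string;       /* Zero-terminated contents. */
    size_t bufSize;     /* Allocated bytes, including terminator. */
    size_t stringSize;  /* Current length, not counting terminator. */
    };

struct growString *growStringNew(size_t initialBufSize)
/* Allocate an empty growable string.  Free with growStringFree. */
{
struct growString *gs = malloc(sizeof(*gs));
if (initialBufSize == 0)
    initialBufSize = 64;
if (gs == NULL || (gs->string = malloc(initialBufSize)) == NULL)
    {
    fprintf(stderr, "growStringNew: out of memory\n");
    exit(EXIT_FAILURE);
    }
gs->string[0] = 0;
gs->bufSize = initialBufSize;
gs->stringSize = 0;
return gs;
}

void growStringAppend(struct growString *gs, const char *text)
/* Append text, doubling the buffer as needed so that repeated appends
 * stay cheap on average. */
{
size_t textSize = strlen(text);
size_t needed = gs->stringSize + textSize + 1;
if (needed > gs->bufSize)
    {
    size_t newSize = gs->bufSize;
    char *newString;
    while (newSize < needed)
        newSize *= 2;
    newString = realloc(gs->string, newSize);
    if (newString == NULL)
        {
        fprintf(stderr, "growStringAppend: out of memory\n");
        exit(EXIT_FAILURE);
        }
    gs->string = newString;
    gs->bufSize = newSize;
    }
memcpy(gs->string + gs->stringSize, text, textSize + 1);
gs->stringSize += textSize;
}

void growStringFree(struct growString **pGs)
/* Free the string and its buffer, and set *pGs to NULL. */
{
struct growString *gs = *pGs;
if (gs != NULL)
    {
    free(gs->string);
    free(gs);
    *pGs = NULL;
    }
}
```

The caller never computes buffer sizes: appending "chr" and then "1:1000-2000" to a string created with a 4-byte buffer simply grows the buffer behind the scenes. In kent/src code, use the library's own dyString functions rather than writing something like this.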
Below are several slide sets that provide a good introduction to Jim's programming philosophy and other local conventions:
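- Jim's Software Sermon (2002): PPT: http://genomewiki.ucsc.edu/images/7/7c/SoftwareSermon2002_Jim_Kent.ppt , HTML: http://www.cse.ucsc.edu/~donnak/eng/softsermon2002.htm
- Jim's brief CGI programming intro (2007) - see esp. pages 6 & 7 about the cart. PPT: https://hgwdev.gi.edu/~kent/cgiProgramming.ppt
- Jim's Software Engineering & testing presentation (2012): PPTX: http://genomewiki.ucsc.edu/images/c/c7/SoftwareEngTesting.pptx
- Jim's Locality & Modularity presentation (2012): PPTX: https://hgwdev.gi.edu/~kent/locality.pptx
Others' overviews of kent/src:
- Hiram's BME 230 presentation on GB, TB & kent/src (2011) - see esp. pages 5-20. PPT: http://genomewiki.ucsc.edu/images/f/f7/BME230_Winter_2011.ppt
- Robert Baertsch's BME 230 presentation (2008): PPT: http://genomewiki.ucsc.edu/images/7/75/Baertsch-code-talk.ppt
- Angie's presentation to Bejerano Lab about working with and extending CGIs & utils (2008): PPT: http://genomewiki.ucsc.edu/images/b/b5/Bejerano_Lab_2008_03_31.ppt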
GBiB, release process, development tools
Trainer: Max Haeussler
- "make search" in kent/src
- xcode anyone?
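- GBiB page: http://genomewiki.ucsc.edu/genecats/index.php/Genome_Browser_in_a_Box_config
- release process: http://genomewiki.ucsc.edu/index.php/It%27s_a_long_way_to_the_RR
- debugging CGIs: http://genomewiki.ucsc.edu/index.php/Debugging_cgi-scripts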
debugging tools
Trainers: Angie Hinrichs & Max Haeussler
Read the genomewiki page Debugging cgi-scripts: http://genomewiki.ucsc.edu/index.php/Debugging_cgi-scripts
how not to break the build!
Trainer: Brian Raney
background reading material
hgFindSpec: http://genomewiki.cse.ucsc.edu/index.php/HgFindSpec
Our csh - bash equivalence document: $HOME/kent/src/hg/doc/bashVsCsh.txt
VI: http://genomewiki.ucsc.edu/genecats/index.php/VI_quick_start
Cluster Jobs: http://genomewiki.ucsc.edu/index.php/Cluster_Jobs