Sql injection protection

From genomewiki
Revision as of 20:11, 28 April 2022 by Galt (talk | contribs) (Updated for version 2 extensions. Frag functions are removed as unneeded since they were difficult to use correctly. Now we have a better approach for passing SQL clauses.)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

SQL Injection Protection

Background

SQL means Structured Query Language. It is interpreted by the SQL database engine, MySQL in our case, in order to execute queries and statements.

SQL Injection (sqli) is a method for hackers to insert their own SQL statements or clauses into the SQL command being executed, thereby causing the SQL database to run the hackers' own code.

There are advanced SQL-injection attack libraries available for hackers to use, and of course computers are happy to run the tedious repetitive attack variants over and over until a weakness is found.

MySQL Security Configuration

The first line of defense is properly configuring the database security.

We configure the CGI MySQL users so that they only have the minimum privileges required. For instance, the default user only has the SELECT privilege. Temporary tables can only be created in the hgTemp database. Finally, we remove access to the mysql database itself which stores meta-data, including hashed passwords, thereby removing one of the most valuable targets.

Programming Methods to Prevent SQL Injection

When constructing SQL strings for execution by the database server, we typically use a printf-style function that allows CGI parameters to be inserted into the SQL statement. It is at this moment that we must catch problems because after that the system cannot tell which parts of a SQL string came from the system and which came from the CGI inputs. Hackers will be attempting to pass in input strings containing quote-marks and other special characters and SQL-comments, etc., to hi-jack the statment. Even when hackers cannot directly see the output of their hacking, the database can still leak information out using blind-sqli techniques such as taking a long time to return versus returning quickly. Computers are happy to automate such attacks.

Special Printf-style Functions to Prevent SQL Injection

When we look at the various printf specifiers like %d, %f, %s, %c, etc., it becomes clear that only the %s specifier can be used by hackers.

The primary problem with %s can be broken into two parts:

  • Quoted %s like '%s' or "%s" or even '%s%s' or ' where id=%s'.
  • Unquoted %s like %s which is usually an identifier such as a table or field.

By using special functions like sqlSafef, quoted '%s' input strings get escaped automatically by calling the mysql escape string function, which protects them from hacking. The tedium of having to allocate, escape, and free escaped strings is handled for the programmer.

Quoted String example:

char *id; // parameter or variable which results from CGI input directly or indirectly, and can not be trusted.
char query[256];
sqlSafef(query, sizeof query, "select * from table where id = '%s'", id);


And for unquoted %s, checking that no illegal characters besides upper and lower case letters and digits and the underscore and period. Only AZaz09._ are allowed. This protects identifiers from being hacked in unquoted input strings.

Unquoted Identifier Table-Name Example:

char *tableName; // variable not known to be safe, as usual.
char query[256];
sqlSafef(query, sizeof query, "select * from %s", tableName);

Unquoted Identifier Field-Name Example:

char *fieldName; // variable not known to be safe, as usual.
char query[256];
sqlSafef(query, sizeof query, "select %s from mytable", fieldName);

NOSQLINJ Tag to Catch Unvetted Queries

In order to catch queries which programmers have accidentally forgotten to construct using the sqli-safe functions, a tag NOSQLINJ is added at the beginning of each SQL statement by the sqli-safe functions.

If safef were accidentally used instead of sqlSafef, then it would lack the NOSQLINJ tag at the beginning and the query when executed would produce an error or other response depending on the level setting.

hg.conf Settings for Error Level Handling

In order to allow users to customize their response to the sql-injection problems, these settings are defined in hg.conf.

noSqlInj.level={abort,warn,logOnly,ignore}
noSqlInj.dumpStack={on,off}  

The level tells how severely to treat the error.

If the level is ignore (the current default) then it is ignored and the user sees nothing.

If the level is logOnly, then the user sees nothing, but you may look for issues in the stderr output log.

If the level is warn, then the user will see the error on the screen, the error will be printed to the log, but the system will continue.

If the level is abort, then it will display an error to the user, write to the stderr log, and stop execution before any harm is done. This is the safest and recommended level.

The optional stack dump may be turned on to supply additional debugging information that is helpful to programmers in locating where in the system the error was triggered. The stack dump will be printed to stderr and appear in the log as additional information.

SQL Clauses and %-s

Sometimes, due to program complexity, part of the SQL string is constructed in one place and then passed to another routine possibly in another library. So the entire SQL literal formatting string is not present in one place for sqlSafef to operate on. Part of the string is created in one place, and then gets passed to another place where the rest of the string is created.

myLibFunc(char *where)
{
char query[1024];
sqlSafef(query, sizeof query, "select name from knownGene %-s", where);
[...]  // use the query string to fetch records 
}
myFunc1(char *name)
{
char where[1024];
sqlSafef(where, sizeof where, "where name='%s'", name);
myLibFunc(where);
}
myFunc2(char *proteinId)
{
char where[1024];
sqlSafe(where, sizeof where, "where proteinId='%s'", proteinId);
myLibFunc(where);
}


Because the where-clause passed to myLibFunc contains many characters not allowed in an identifier, a regular %s will not work and would cause the system to complain about invalid characters.

In myLibFunc above, %-s is a special case printf-operator. The sqli-safe functions use %-s to mean that the string is left alone, that it is neither escaped nor checked. For safety, it checks that the string passed into %-s has a NOSQLINJ prefix and will give an error if it does not. Using %-s is intended for passing SQL clauses to subroutines.

Because myLibFunc is using %-s, it requires that all callers of myLibFunc (like myFunc1 and myFunc2) must use functions like sqlSafef upstream, and then pass the safely constructed SQL clause to the function.

sqlDyString Functions

Use dyString variables and sqlDyString sqli-safe functions for complex SQL statements created with many conditionals.

sqlDyStringPrintf is the main dyString sqli-safe function. It is like dyStringPrintf, but it uses the same special printf-style functionality that sqlSafef uses. It handles quoted and unquoted %s, and %-s in the same way. It automatically adds the NOSQLINJ to the string if it is empty, which is usually the case the first time that a dyString variable is used.

Because sqlDyStringPrintf appends to the string, rather than over-writing the string, one can simply call it multiple times, appending more and more of the growing complex query string according to the various conditions and options found.

// id, orderByField, useLimit, and limit are parameters passed in to a function
struct dyString *dy = dyStringNew(256);

sqlDyStringPrintf(dy, "select %s from %s", fieldName, tableName);
if (id)
  sqlDyStringPrintf(dy, " where id ='%s'", id);
if (orderByField)
  sqlDyStringPrintf(dy, " order by %s", orderByField);
if (useLimit)
  sqlDyStringPrintf(dy, " limit %d", limit);

struct sqlResult *rs = sqlGetResult(dy->string);

[...]

dyStringFree(&dy);

Safe use of an explicit NOSQLINJ tag in a simple SQL literal

Example where NOSQLINJ tag is explicitly specified as a #define:

int chromCount = sqlQuickNum(NOSQLINJ "SELECT count(*) FROM chromInfo");

There are some places in the system where printf-style functions and user-input variables are NOT used. They are perfectly safe from sql-injection. These cases typically involve simple SQL literal strings. Simply add the NOSQLINJ define to the beginning of the SQL literal string once you have vetted the query as being safe.

  • Is it simple SQL Literal?
  • Are there no variables involved?

Then it is safe and you may insert the NOSQLINJ define at the beginning of the query that you have vetted as being safe.

jksql.h:
#define NOSQLINJ "NOSQLINJ "
The c compiler concatenates multiple string literals into a single literal.
Note we use the #define because the compiler can catch errors in spelling it.

One could have also do something like this:

char query[1024];
sqlSafef(query, sizeof query, "SELECT count(*) FROM chromInfo");
int chromCount = sqlQuickNum(query);

There is nothing wrong with this, but it is 3 lines instead of one. Nothing beats the brevity of adding the NOSQLINJ tag directly. For short simple literal queries with no input variables, adding the tag is quick and safe.

Reference

Sql-injection safe functions Reference