How Application Flaws Enable SQL Injection

In this unit, you previously learned about cross-site scripting and SQL injection attacks. In this article, you will learn how to define injection, and the terms threat agents, attack vectors, security weaknesses, technical impacts, and business impact as related to injection attacks. There is also a detailed example of how an injection attack works within code.

Defining injection

Let’s get started. I’m going to draw directly from the OWASP definition of injection:

"Injection flaws, such as SQL, OS, and LDAP injection, occur when untrusted data is sent to an interpreter as part of a command or query. The attacker’s hostile data can trick the interpreter into executing unintended commands or accessing unauthorized data".

The crux of the injection risk centres on the term "untrusted". We’re going to see this word a lot over coming posts so let’s clearly define it now:

"Untrusted data comes from any source – either direct or indirect – where integrity is not verifiable and intent may be malicious. This includes manual user input such as form data, implicit user input such as request headers and constructed user input such as query string variables. Consider the application to be a black box and any data entering it to be untrusted".

OWASP also includes a matrix describing the source, the exploit and the impact to business:

Threat
Agents
Attack
Vectors
Security
Weakness
Technical
Impacts
Business 
Impact
Exploitability 
EASY
Prevalence: COMMON
Detectability: AVERAGE
Impact
SEVERE

Consider anyone who can send untrusted data to the system, including external users, internal users, and administrators. Attacker sends simple text-based attacks that exploit the syntax of the targeted interpreter. Almost any source of data can be an injection vector, including internal sources. Injection flaws occur when an application sends untrusted data to an interpreter. Injection flaws are very prevalent, particularly in legacy code, often found in SQL queries, LDAP queries, XPath queries, OS commands, program arguments, etc. Injection flaws are easy to discover when examining code, but more difficult via testing. Scanners and fuzzers can help attackers find them. Injection can result in data loss or corruption, lack of accountability, or denial of access. Injection can sometimes lead to complete host takeover. Consider the business value of the affected data and the platform running the interpreter. All data could be stolen, modified, or deleted. Could your reputation be harmed?


Most of you are probably familiar with the concept (or at least the term) of SQL injection but the injection risk is broader than just SQL and indeed broader than relational databases. As the weakness above explains, injection flaws can be present in technologies like LDAP or theoretically in any platform which that constructs queries from untrusted data.


Anatomy of a SQL injection attack

Let’s jump straight into how the injection flaw surfaces itself in code. We’ll look specifically at SQL injection because it means working in an environment familiar to most .NET developers and it’s also a very prevalent technology for the exploit. In the SQL context, the exploit needs to trick SQL Server into executing an unintended query constructed with untrusted data.

For the sake of simplicity and illustration, let’s assume we’re going to construct a SQL statement in C# using a parameter passed in a query string and bind the output to a grid view. In this case it’s the good old Northwind database driving a product page filtered by the beverages category which happens to be category ID 1. The web application has a link directly to the page where the CategoryID parameter is passed through in a query string. Here’s a snapshot of what the Products and Customers (we’ll get to this one) tables look like:

image-01

Here’s what the code is doing:

var catID = Request.QueryString["CategoryID"];

        

var sqlString = "SELECT * FROM Products WHERE CategoryID = " + catID;

var connString = WebConfigurationManager.ConnectionStrings

["NorthwindConnectionString"].ConnectionString;

using (var conn = new SqlConnection(connString))

{

  var command = new SqlCommand(sqlString, conn);

  command.Connection.Open();

  grdProducts.DataSource = command.ExecuteReader();

  grdProducts.DataBind();

}

And here’s what we’d normally expect to see in the browser:

image-02

In this scenario, the CategoryID query string is untrusted data. We assume it is properly formed and we assume it represents a valid category and we consequently assume the requested URL and the sqlString variable end up looking exactly like this (I’m going to highlight the untrusted data in red and show it both in the context of the requested URL and subsequent SQL statement):

Products.aspx?CategoryID=1

SELECT * FROM Products WHERE CategoryID = 1

Of course much has been said about assumption. The problem with the construction of this code is that by manipulating the query string value we can arbitrarily manipulate the command executed against the database. For example:

Products.aspx?CategoryID=1 or 1=1

SELECT * FROM Products WHERE CategoryID = 1 or 1=1

Obviously 1=1 always evaluates to true so the filter by category is entirely invalidated. Rather than displaying only beverages we’re now displaying products from all categories. This is interesting, but not particularly invasive so let’s push on a bit:

Products.aspx?CategoryID=1 or name=''

SELECT * FROM Products WHERE CategoryID = 1 or name=''

When this statement runs against the Northwind database it’s going to fail as the Products table has no column called name. In some form or another, the web application is going to return an error to the user. It will hopefully be a friendly error message contextualised within the layout of the website but at worst it may be a yellow screen of death. For the purpose of where we’re going with injection, it doesn’t really matter as just by virtue of receiving some form of error message we’ve quite likely disclosed information about the internal structure of the application, namely that there is no column called name in the table(s) the query is being executed against.

Let’s try something different:

Products.aspx?CategoryID=1 or productname=''

SELECT * FROM Products WHERE CategoryID = 1 or productname=''

This time the statement will execute successfully because the syntax is valid against Northwind so we have therefore confirmed the existence of the ProductName column. Obviously it’s easy to put this example together with prior knowledge of the underlying data schema but in most cases data models are not particularly difficult to guess if you understand a little bit about the application they’re driving. Let’s continue:

Products.aspx?CategoryID=1 or 1=(select count(*) from products)

SELECT * FROM Products WHERE CategoryID = 1 or 1=(select count(*) from

products)

With the successful execution of this statement we have just verified the existence of the Products tables. This is a pretty critical step as it demonstrates the ability to validate the existence of individual tables in the database regardless of whether they are used by the query driving the page or not. This disclosure is starting to become serious information leakage we could potentially leverage to our advantage.

So far we’ve established that SQL statements are being arbitrarily executed based on the query string value and that there is a table called Product with a column called ProductName. Using the techniques above we could easily ascertain the existence of the Customers table and the CompanyName column by fairly assuming that an online system facilitating ordering may contain these objects. Let’s step it up a notch:

Products.aspx?CategoryID=1;update products set productname = productname

SELECT * FROM Products WHERE CategoryID = 1;update products set productname

= productname

The first thing to note about the injection above is that we’re now executing multiple statements. The semicolon is terminating the first statement and allowing us to execute any statement we like afterwards. The second really important observation is that if this page successfully loads and returns a list of beverages, we have just confirmed the ability to write to the database. It’s about here that the penny usually drops in terms of understanding the potential ramifications of injection vulnerabilities and why OWASP categorises the technical impact as "severe".

All the examples so far have been non-destructive. No data has been manipulated and the intrusion has quite likely not been detected. We’ve also not disclosed any actual data from the application, we’ve only established the schema. Let’s change that.

Products.aspx?CategoryID=1;insert into products(productname) select

companyname from customers

SELECT * FROM Products WHERE CategoryID = 1;insert into products

(productname) select companyname from customers

So as with the previous example, we’re terminating the CategoryID parameter then injecting a new statement but this time we’re populating data out of the Customers table. We’ve already established the existence of the tables and columns we’re dealing with and that we can write to the Products table so this statement executes beautifully. We can now load the results back into the browser:

Products.aspx?CategoryID=500 or categoryid is null

SELECT * FROM Products WHERE CategoryID = 500 or categoryid is null

The unfeasibly high CategoryID ensures existing records are excluded and we are making the assumption that the ID of new records defaults to null (obviously no default value on the column in this case). Here’s what the browser now discloses – note the company name of the customer now being disclosed in the ProductName column:

image-03

Bingo. Internal customer data now disclosed.


What made this possible?

The above example could only happen because of a series of failures in the application design. Firstly, the CategoryID query string parameter allowed any value to be assigned and executed by SQL Server without any parsing whatsoever. Although we would normally expect an integer, arbitrary strings were accepted.

Secondly, the SQL statement was constructed as a concatenated string and executed without any concept of using parameters. The CategoryID was consequently allowed to perform activities well outside the scope of its intended function.

Finally, the SQL Server account used to execute the statement had very broad rights. At the very least this one account appeared to have data reader and data writer rights. Further probing may have even allowed the dropping of tables or running of system commands if the account had the appropriate rights.


Validate all input against a whitelist

This is a critical concept not only this post but in the subsequent OWASP posts that will follow so I’m going to say it really, really loud:

All input must be validated against a whitelist of acceptable value ranges.

As per the definition I gave for untrusted data, the assumption must always be made that any data entering the system is malicious in nature until proven otherwise. The data might come from query strings like we just saw, from form variables, request headers or even file attributes such as the Exif metadata tags in JPG images.

In order to validate the integrity of the input we need to ensure it matches the pattern we expect. Blacklists looking for patterns such as we injected earlier on are hard work both because the list of potentially malicious input is huge and because it changes as new exploit techniques are discovered.

Validating all input against whitelists is both far more secure and much easier to implement. In the case above, we only expected a positive integer and anything outside that pattern should have been immediate cause for concern. Fortunately this is a simple pattern that can be easily validated against a regular expression. Let’s rewrite that first piece of code from earlier on with the help of whitelist validation:

var catID = Request.QueryString["CategoryID"];

var positiveIntRegex = new Regex(@"^0*[1-9][0-9]*$");

if(!positiveIntRegex.IsMatch(catID))

{

  lblResults.Text = "An invalid CategoryID has been specified.";

  return;

}

Just this one piece of simple validation has a major impact on the security of the code. It immediately renders all the examples further up completely worthless in that none of the malicious CategoryID values match the regex and the program will exit before any SQL execution occurs.

An integer is a pretty simple example but the same principal applies to other data types. A registration form, for example, might expect a "first name" form field to be provided. The whitelist rule for this field might specify that it can only contain the letters a-z and common punctuation characters (be careful with this – there are numerous characters outside this range that commonly appear in names), plus it must be within 30 characters of length. The more constraints that can be placed around the whitelist without resulting in false positives, the better.

Regular expression validators in ASP.NET are a great way to implement field level whitelists as they can easily provide both client side (which is never sufficient on its own) and server side validation plus they tie neatly into the validation summary control. MSDN has a good overview of how to use regular expressions to constrain input in ASP.NET so all you need to do now is actually understand how to write a regex.

Finally, no input validation story is complete without the infamous Bobby Tables:

image-04


Parameterised stored procedures

One of the problems we had above was that the query was simply a concatenated string generated dynamically at runtime. The account used to connect to SQL Server then needed broad permissions to perform whatever action was instructed by the SQL statement.

Let’s take a look at the stored procedure approach in terms of how it protects against SQL injection. Firstly, we’ll put together the SQL to create the procedure and grant execute rights to the user.

CREATE PROCEDURE GetProducts

  @CategoryID INT

AS

SELECT *

FROM dbo.Products

WHERE CategoryID = @CategoryID

GO

GRANT EXECUTE ON GetProducts TO NorthwindUser

GO

There are a couple of native defences in this approach. Firstly, the parameter must be of integer type or a conversion error will be raised when the value is passed. Secondly, the context of what this procedure – and by extension the invoking page – can do is strictly defined and secured directly to the named user. The broad reader and writer privileges which were earlier granted in order to execute the dynamic SQL are no longer needed in this context.

Moving on the .NET side of things:

var conn = new SqlConnection(connString);

using (var command = new SqlCommand("GetProducts", conn))

{

  command.CommandType = CommandType.StoredProcedure;

  command.Parameters.Add("@CategoryID", SqlDbType.Int).Value = catID;

  command.Connection.Open();

  grdProducts.DataSource = command.ExecuteReader();

  grdProducts.DataBind();

}

This is a good time to point out that parameterised stored procedures are an additional defence to parsing untrusted data against a whitelist. As we previously saw with the INT data type declared on the stored procedure input parameter, the command parameter declares the data type and if the catID string wasn’t an integer the implicit conversion would throw a System.FormatException before even touching the data layer. But of course that won’t do you any good if the type is already a string!

Just one final point on stored procedures; passing a string parameter and then dynamically constructing and executing SQL within the procedure puts you right back at the original dynamic SQL vulnerability. Don’t do this!


Named SQL parameters

One of problems with the code in the original exploit is that the SQL string is constructed in its entirety in the .NET layer and the SQL end has no concept of what the parameters are. As far as it’s concerned it has just received a perfectly valid command even though it may in fact have already been injected with malicious code.

Using named SQL parameters gives us far greater predictability about the structure of the query and allowable values of parameters. What you’ll see in the following code block is something very similar to the first dynamic SQL example except this time the SQL statement is a constant with the category ID declared as a parameter and added programmatically to the command object.

const string sqlString = "SELECT * FROM Products WHERE CategoryID =

@CategoryID";

var connString = WebConfigurationManager.ConnectionStrings

["NorthwindConnectionString"].ConnectionString;

using (var conn = new SqlConnection(connString))

{

var command = new SqlCommand(sqlString, conn);

  command.Parameters.Add("@CategoryID", SqlDbType.Int).Value = catID;

  command.Connection.Open();

  grdProducts.DataSource = command.ExecuteReader();

  grdProducts.DataBind();

}

What this will give us is a piece of SQL that looks like this:

exec sp_executesql N'SELECT * FROM Products WHERE CategoryID = 

@CategoryID',N'@CategoryID int',@CategoryID=1

There are two key things to observe in this statement:

  1. The sp_executesql command is invoked
  2. The CategoryID appears as a named parameter of INT data type

This statement is only going to execute if the account has data reader permissions to the Products table so one downside of this approach is that we’re effectively back in the same data layer security model as we were in the very first example. We’ll come to securing this further shortly.

The last thing worth noting with this approach is that the sp_executesql command also provides some query plan optimisations which although are not related to the security discussion, is a nice bonus.


LINQ to SQL

Stored procedures and parameterised queries are a great way of seriously curtailing the potential damage that can be done by SQL injection but they can also become pretty unwieldy. The case for using ORM as an alternative has been made many times before so I won’t rehash it here but I will look at this approach in the context of SQL injection. It’s also worthwhile noting that LINQ to SQL is only one of many ORMs out there and the principals discussed here are not limited purely to one of Microsoft’s interpretation of object mapping.

Firstly, let’s assume we’ve created a Northwind DBML and the data layer has been persisted into queryable classes. Things are now pretty simple syntax wise:

var dc = new NorthwindDataContext();

var catIDInt = Convert.ToInt16(catID);

grdProducts.DataSource = dc.Products.Where(p => p.CategoryID == catIDInt);

grdProducts.DataBind();

From a SQL injection perspective, once again the query string should have already been assessed against a whitelist and we shouldn’t be at this stage if it hasn’t passed. Before we can use the value in the "where" clause it needs to be converted to an integer because the DBML has persisted the INT type in the data layer and that’s what we’re going to be performing our equivalency test on. If the value wasn’t an integer we’d get that System.FormatException again and the data layer would never be touched.

LINQ to SQL now follows the same parameterised SQL route we saw earlier, it just abstracts the query so the developer is saved from being directly exposed to any SQL code. The database is still expected to execute what from its perspective, is an arbitrary statement:

exec sp_executesql N'SELECT [t0].[ProductID], [t0].[ProductName], [t0].

[SupplierID], [t0].[CategoryID], [t0].[QuantityPerUnit], [t0].[UnitPrice], 

[t0].[UnitsInStock], [t0].[UnitsOnOrder], [t0].[ReorderLevel], [t0].

[Discontinued]

FROM [dbo].[Products] AS [t0]

WHERE [t0].[CategoryID] = @p0',N'@p0 int',@p0=1

There was some discussion about the security model in the early days of LINQ to SQL and concern expressed in terms of how it aligned to the prevailing school of thought regarding secure database design. Much of the reluctance related to the need to provide accounts connecting to SQL with reader and writer access at the table level. Concerns included the risk of SQL injection as well as from the DBA’s perspective, authority over the context a user was able to operate in moved from their control – namely within stored procedures – to the application developer’s control. However with parameterised SQL being generated and the application developers now being responsible for controlling user context and access rights it was more a case of moving cheese than any new security vulnerabilities.


Applying the principle of least privilege

The final flaw in the successful exploit above was that the SQL account being used to browse products also had the necessary rights to read from the Customers table and write to the Products table, neither of which was required for the purposes of displaying products on a page. In short, the principle of least privilege had been ignored:

"In information security, computer science, and other fields, the principle of least privilege, also known as the principle of minimal privilege or just least privilege, requires that in a particular abstraction layer of a computing environment, every module (such as a process, a user or a program on the basis of the layer we are considering) must be able to access only such information and resources that are necessary to its legitimate purpose".

This was achievable because we took the easy way out and used a single account across the entire application to both read and write from the database. Often you’ll see this happen with the one SQL account being granted db_datareader and db_datawriter roles:

image-05

This is a good case for being a little more selective about the accounts we’re using and the rights they have. Quite frequently, a single SQL account is used by the application. The problem this introduces is that the one account must have access to perform all the functions of the application which most likely includes reading and writing data from and to tables you simply don’t want everyone accessing.

Let’s go back to the first example but this time we’ll create a new user with only select permissions to the Products table. We’ll call this user NorthwindPublicUser and it will be used by activities intended for the general public, i.e. not administrative activates such as managing customers or maintaining products.

image-07

Now let’s go back to the earlier request attempting to validate the existence of the Customers table:

Products.aspx?CategoryID=1 or 1=(select count(*) from customers)

image-07

In this case I’ve left custom errors off and allowed the internal error message to surface through the UI for the purposes of illustration. Of course doing this in a production environment is never a good thing not only because it’s information leakage but because the original objective of verifying the existence of the table has still been achieved. Once custom errors are on there’ll be no external error message hence there will be no verification the table exists. Finally – and most importantly - once we get to actually trying to read or write unauthorised data the exploit will not be successful.

This approach does come with a cost though. Firstly, you want to be pragmatic in the definition of how many logons are created. Ending up with 20 different accounts for performing different functions is going to drive the DBA nuts and be unwieldy to manage. Secondly, consider the impact on connection pooling. Different logons mean different connection strings which mean different connection pools.

On balance, a pragmatic selection of user accounts to align to different levels of access is a good approach to the principle of least privilege and shuts the door on the sort of exploit demonstrated above.


Getting more creative with HTTP request headers

On a couple of occasions above I’ve mentioned parsing input other than just the obvious stuff like query strings and form fields. You need to consider absolutely anything which could be submitted to the server from an untrusted source.

A good example of the sort of implicit untrusted data submission you need to consider is the accept-language attribute in the HTTP request headers. This is used to specify the spoken language preference of the user and is passed along with every request the browser makes. Here’s how the headers look after inspecting them with Fiddler:

image-08

Note the preference Firefox has delivered in this case is "en-gb". The developer can now access this attribute in code:

var language = HttpContext.Current.Request.UserLanguages[0];

lblLanguage.Text = "The browser language is: " + language;

And the result:

image-09

The language is often used to localise content on the page for applications with multilingual capabilities. The variable we’ve assigned above may be passed to SQL Server – possibly in a concatenated SQL string - should language variations be stored in the data layer.

But what if a malicious request header was passed? What if, for example, we used the Fiddler Request Builder to reissue the request but manipulated the header ever so slightly first:

image-10

It’s a small but critical change with a potentially serious result:

image-11

We’ve looked enough at where an exploit can go from here already, the main purpose of this section was to illustrate how injection can take different attack vectors in its path to successful execution. In reality, .NET has far more efficient ways of doing language localisation but this just goes to prove that vulnerabilities can be exposed through more obscure channels.


Summary

The potential damage from injection exploits is indeed, severe. Data disclosure, data loss, database object destruction and potentially limitless damage to reputation.

The thing is though, injection is a really easy vulnerability to apply some pretty thorough defences against. Fortunately it’s uncommon to see dynamic, parameterless SQL strings constructed in .NET code these days. ORMs like LINQ to SQL are very attractive from a productivity perspective and the security advantages that come with it are eradicating some of those bad old practices.

Input parsing, however, remains a bit more elusive. Often developers are relying on type conversion failures to detect rogue values which, of course, won’t do much good if the expected type is already a string and contains an injection payload. We’re going to come back to input parsing again in the next part of the series on XSS. For now, let’s just say that not parsing input has potential ramifications well beyond just injection vulnerabilities.

I suspect securing individual database objects to different accounts is not happening very frequently at all. The thing is though, it’s the only defence you have at the actual data layer if you’ve moved away from stored procedures. Applying the least privilege principle here means that in conjunction with the other measures, you’ve now erected injection defences on the input, the SQL statement construction and finally at the point of its execution. Ticking all these boxes is a very good place to be indeed.


Source: Troy Hunt, https://www.troyhunt.com/owasp-top-10-for-net-developers-part-1/
Creative Commons License This work is licensed under a Creative Commons Attribution 4.0 License.

Last modified: Saturday, November 21, 2020, 5:58 PM