llmstory
SQL Injection Vulnerability and Prevention
Introduction to SQL Injection

SQL Injection (SQLi) is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution (e.g., to dump the database contents to the attacker). It is one of the most common web hacking techniques. SQL Injection vulnerabilities arise when an application builds SQL queries by placing untrusted user input directly into the query text, so the database cannot tell the developer's SQL apart from the attacker's. Attackers can use this to bypass authentication, retrieve sensitive data, modify database contents, or even delete entire tables, compromising data integrity and confidentiality.

1.

Briefly explain what SQL Injection is and its primary cause.

Vulnerable Code Example

Consider a hypothetical backend application that retrieves user data based on an ID provided by the user, for example, through a web form. The following Python-like pseudocode demonstrates a common vulnerable pattern:


# Vulnerable Python backend code snippet
def get_user_data_vulnerable(user_id):
    # user_id comes directly from untrusted user input (e.g., a web form field)
    # The SQL query is constructed by concatenating the user_id string.
    sql_query = "SELECT * FROM users WHERE id = '" + user_id + "';"
    print("Executing SQL:", sql_query)
    # In a real application, this would execute against a database
    # cursor.execute(sql_query)
    # return cursor.fetchall()
    return sql_query # For demonstration, we return the query string

# Example usage with a legitimate user ID
print(get_user_data_vulnerable("123"))
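# Output (the function prints the query, then the caller prints the returned string):
# Executing SQL: SELECT * FROM users WHERE id = '123';
# SELECT * FROM users WHERE id = '123';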

In this example, the user_id is directly embedded into the SQL query string without any sanitization or validation. This opens the door for an attacker to manipulate the query.

2.

Identify the main vulnerability in the provided code snippet.

Malicious Input Example (Data Deletion)

Now, let's look at how an attacker can exploit the vulnerable code to perform a data deletion attack. Imagine an attacker provides the following input for user_id:

1'; DELETE FROM users; --

When this malicious input is concatenated into the vulnerable SQL query from the previous step, the resulting full SQL query that the database server would attempt to execute becomes:

SELECT * FROM users WHERE id = '1'; DELETE FROM users; --';

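This single string contains two SQL statements followed by a comment. If the database driver accepts several statements in one call (often called stacked queries), the database parses and executes them in order. Some drivers reject multi-statement strings by default, which blocks this particular payload but does not remove the underlying injection flaw.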
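To make the effect concrete, here is a minimal sketch that runs the payload against an in-memory SQLite database. The table layout, the sample rows, and the helper name run_vulnerable_lookup are assumptions made for illustration. Python's sqlite3 cursor.execute() refuses strings with more than one statement, so the sketch uses executescript() as a stand-in for a driver that permits stacked queries.

```python
import sqlite3


def run_vulnerable_lookup(conn, user_id):
    # Vulnerable on purpose: untrusted input is concatenated into the SQL text.
    sql_query = "SELECT * FROM users WHERE id = '" + user_id + "';"
    # executescript() runs every statement in the string. It stands in for a
    # driver that allows stacked queries (an assumption of this sketch).
    conn.executescript(sql_query)
    return sql_query


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("1", "alice"), ("2", "bob")])
conn.commit()

rows_before = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
executed = run_vulnerable_lookup(conn, "1'; DELETE FROM users; --")
rows_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print(executed)
print("rows before:", rows_before, "rows after:", rows_after)
```

The row count drops from 2 to 0 because the injected DELETE runs as its own statement after the SELECT.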

3.

Given the vulnerable query SELECT * FROM users WHERE id = '" + userInput + "';, what is the full SQL query that would be executed if the userInput is 1'; DELETE FROM users; --?

Explanation of Attack Mechanism

Let's break down the malicious input 1'; DELETE FROM users; -- and how it manipulates the query:

  1. 1': This part provides a legitimate ID (1) and then immediately closes the single quote that was intended to enclose the user_id in the original query. At this point, the WHERE clause condition id = '1' is fully formed and syntactically correct.
  2. ;: The semicolon is the standard SQL statement terminator. It tells the database that the preceding SELECT statement is complete and a new statement is about to begin.
  3. DELETE FROM users;: This is the attacker's payload. It is a completely new, valid SQL command that instructs the database to delete all records from the users table. Because it follows the legitimate SELECT query, the database will attempt to execute it.
  4. --: This is the SQL comment syntax (two hyphens start a single-line comment; MySQL also accepts #, and it treats -- as a comment only when whitespace follows it). Anything after -- on the same line is ignored by the database parser. Here it comments out the trailing single quote and semicolon (';) left over from the original query string (...WHERE id = 'user_id';). Without the comment, that leftover ' would cause a syntax error after DELETE FROM users;, potentially preventing the attack.
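The breakdown above can be checked with a few lines of string handling. This is only an illustrative sketch: the variable names are made up for the example, and splitting on ";" is a naive shortcut that works for this fixed string, not a real SQL parser.

```python
# The fixed parts of the vulnerable query and the attacker's input.
prefix = "SELECT * FROM users WHERE id = '"
suffix = "';"
payload = "1'; DELETE FROM users; --"

query = prefix + payload + suffix

# Everything after the first "--" is a comment that the parser ignores.
active_sql, _, commented_out = query.partition("--")

# Naive split on ";" to list the statements that remain active.
statements = [part.strip() for part in active_sql.split(";") if part.strip()]

for number, statement in enumerate(statements, start=1):
    print(number, statement)
print("ignored by the parser:", commented_out)
```

The two active statements are the original SELECT and the injected DELETE, while the leftover '; from the original template ends up inside the comment.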
4.

Explain the role of '; and -- in the malicious input 1'; DELETE FROM users; -- to achieve data deletion.

Secure Code Example (Parameterized Queries)

The industry-standard and most effective way to prevent SQL Injection is by using parameterized queries (also known as prepared statements). Here's how the previous vulnerable example would be rewritten securely:


# Secure Python backend code using parameterized queries (e.g., using psycopg2 for PostgreSQL)
def get_user_data_secure(user_id):
    # user_id still comes from untrusted user input
    # The SQL query now uses a placeholder (%s for psycopg2, ? for sqlite3, :name for SQLAlchemy)
    sql_query_template = "SELECT * FROM users WHERE id = %s;"

    # The user_id is passed as a separate parameter, NOT concatenated into the query string.
    # The database driver handles the safe binding of the parameter to the placeholder.
    # cursor.execute(sql_query_template, (user_id,))
    # return cursor.fetchall()
    print("SQL Query Template:", sql_query_template)
    print("Parameters (passed separately):", user_id)
    return "Query template and parameters handled separately by DB driver."

# Example usage with legitimate user ID
print(get_user_data_secure("123"))
# Example usage with malicious input (this will NOT work as an injection)
print(get_user_data_secure("1'; DELETE FROM users; --"))

In this secure approach, user_id is no longer concatenated. Instead, a placeholder (%s in this Python example, though ? or :name are also common) is used in the SQL query string. The actual value of user_id is then passed as a separate argument to the database driver's execution method.
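For a version that actually executes, the sketch below uses Python's built-in sqlite3 module, whose placeholder is ? rather than %s. The schema and sample rows are assumptions for illustration, and the function takes a connection argument that the snippet above does not have.

```python
import sqlite3


def get_user_data_secure(conn, user_id):
    # "?" is sqlite3's placeholder; the value is passed separately from the SQL text.
    cursor = conn.execute("SELECT id, name FROM users WHERE id = ?", (user_id,))
    return cursor.fetchall()


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("1", "alice"), ("2", "bob")])
conn.commit()

legit_rows = get_user_data_secure(conn, "1")
attack_rows = get_user_data_secure(conn, "1'; DELETE FROM users; --")
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print("legitimate lookup:", legit_rows)
print("malicious lookup:", attack_rows)
print("rows still in table:", remaining)
```

The malicious string simply matches no id, so the lookup returns an empty list and both rows remain in the table.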

5.

How does the secure code example handle user input differently from the vulnerable example to prevent SQL Injection?

Explanation of the Fix

Parameterized queries work because they separate the SQL code from the user-provided data. When a parameterized query is prepared, the database engine first parses the SQL statement template (e.g., SELECT * FROM users WHERE id = %s;). During this parsing phase, it understands the structure and logic of the query.

When the actual data (the value of user_id) is provided, it is treated only as a value for the placeholder's position and is never interpreted as executable SQL. Depending on the driver, the value is either sent to the server separately from the SQL text or escaped and quoted by the driver before the query is sent. In both cases, special characters within the data (like quotes, semicolons, or comment markers) are treated as literal characters within a string value, not as SQL syntax.

Therefore, even if an attacker provides 1'; DELETE FROM users; -- as input, the database engine will simply see it as a single string literal for the id parameter, effectively making the WHERE clause resolve to something like id = '1''; DELETE FROM users; --' (where '' is the standard SQL way of writing an escaped single quote inside a string). The malicious DELETE command will never be executed because it is treated as part of the data, not as a separate SQL statement.
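One way to see that a bound value stays data is to ask the database to hand it back. The sketch below again assumes SQLite and a hypothetical one-row users table. It binds the payload to a placeholder, reads it back unchanged, and uses SQLite's built-in quote() function to show how the same value looks when written as a SQL string literal.

```python
import sqlite3

payload = "1'; DELETE FROM users; --"

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("1", "alice"))
conn.commit()

# The bound value comes back exactly as it went in: it is data, not SQL.
echoed = conn.execute("SELECT ?", (payload,)).fetchone()[0]

# quote() renders a value as a SQL literal; the embedded quote is doubled.
as_literal = conn.execute("SELECT quote(?)", (payload,)).fetchone()[0]

remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print("echoed value:", echoed)
print("as SQL literal:", as_literal)
print("rows still in table:", remaining)
```

The doubled single quote in the literal form keeps the whole payload inside one string, and the users table is untouched.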

6.

Explain why parameterized queries prevent SQL Injection, focusing on the database engine's role.

Copyright © 2025 llmstory.com