Reportinator⚓︎
Difficulty:
Direct link: https://hhc23-reportinator-dot-holidayhack2023.ue.r.appspot.com/
Objective⚓︎
Request
Noel Boetie used ChatNPT to write a pentest report. Go to Christmas Island and help him clean it up.
Noel Boetie
Hey there, Noel Boetie speaking! I recently tried using ChatNPT to generate my penetration testing report.
It's a pretty nifty tool, but there are a few issues in the output that I've noticed.
I need some guidance in finding any errors in the way it generated the content, especially those odd hallucinations in the LLM output.
I know it's not perfect, but I'd really appreciate the extra eyes on this one.
Some of the issues might be subtle, so don't be afraid to dig deep and ask for further clarification if you're unsure.
I've heard that you folks are experts about LLM outputs and their common issues, so I trust you can help me with this.
Your input will be invaluable to me, so please feel free to share any insights or findings you may have.
I'm looking forward to working with you all and improving the quality of the ChatNPT-generated penetration testing report.
Thanks in advance for your help! I truly appreciate it! Let's make this report the best it can be!
Hints⚓︎
Reportinator
From: Noel Boetie
I know AI sometimes can get specifics wrong unless the prompts are well written. Maybe chatNPT made some mistakes here.
Solution⚓︎
We have to decide for 9 findings whether these are true findings or halucinations of ChatNPT. ChatNPT appears to be good at minting very convincing sentences. Trying to find where it might have gone wrong is tedious.
Instead, we game the game: There are only 9 binary choices to make, so 512 possibilities need to be tested, using the game itself as an "oracle" that will tell us when we hit the right combination.
The testing can be automated; watching the behaviour of the browser in the developer console shows POST requests to https://hhc23-reportinator-dot-holidayhack2023.ue.r.appspot.com/check. The payload are fields of "input-1" to "input-9", their values either 0 or 1.
This python code will try all combinations:
try.py | |
---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
|
Cookies and referer are from our real browser session.
In line 17, we create the values for the input fields from the bits of our counter variable "n". The script runs at about 1 try/second, and soon... at n=292, we hit gold:
...
Trying 292
n=292
{'input-1': 0, 'input-2': 0, 'input-3': 1, 'input-4': 0, 'input-5': 0, 'input-6': 1, 'input-7': 0, 'input-8': 0, 'input-9': 1}
...
We use these for the 9 reports - 0 for true, 1 for false - and our report validation gets accepted.
Out of Sheer Curiosity⚓︎
But... what are the actual errors in ChatNPT's report? Now that we know at which reports to look, we may hazard a guess at some incongruities:
Report 3: Remote Code Execution via Java Deserialization of Stored Database Objects⚓︎
"By intercepting HTTP request traffic on 88555/TCP," claims a TCP port above 65535, which cannot be correct.
Report 6: Stored Cross-Site Scripting Vulnerabilities⚓︎
The code in listing 5 starts
"<img/src='1.jpg'".
Report 9: Internal IP Address Disclosure⚓︎
The report states "When given an HTTP 7.4.33 request, ...". 7.4.33 is not a type of HTTP request; that would be GET, POST, PUT or similiar. Instead, at a guess, this looks like a curl version number.
Admonitions⚓︎
Beware of AIs
Their golden words might hide a sharp edge
Images⚓︎
Answer
Reports 3, 6 and 9 are false.
After solving the challenge, the fact will be listed as an "Achievements" in the player's badge.
Response⚓︎
Noel Boetie
Great job on completing that challenge! Ever thought about how your newfound skills might come into play later on? Keep that mind sharp, and remember, today's victories are tomorrow's strategies!