Link to archived challenge

Category  Difficulty  Solves  Author
web       easy        22      sudoBash418

Description

Our playtesters said this challenge is so easy, even Googlebot could solve it.

So I made some tweaks to ensure Googlebot doesn’t stand a chance.

Players are given a link to a website (now offline).

There is also a single hint available:

  1. What part of a website does Googlebot visit first?

Analysis

Clicking the link brings us to a bare-bones webpage with the following message:

Welcome! This site is a still a work-in-progress, sorry.

The HTML source contains nothing of interest apart from an easter egg in a comment:

<!DOCTYPE html>
<html>
    <head>
        <title>Under Construction</title>
    </head>
    <body>
        <p>Welcome! This site is a still a work-in-progress, sorry.</p>
        <!-- What are you, a cop? -->
    </body>
</html>

The hint and the challenge description both point to robots.txt, the standard file that well-behaved web crawlers (including Googlebot) fetch before crawling a site.
It tells crawlers which paths on the site they may and may not crawl.
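To make the crawler's side of this concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the example.com URLs, and the is_allowed helper are hypothetical and chosen only for illustration; they are not taken from the challenge.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not the challenge's actual robots.txt.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler honouring `rules` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler skips the disallowed prefix and crawls everything else.
print(is_allowed(EXAMPLE_RULES, "Googlebot", "https://example.com/private/notes.txt"))  # False
print(is_allowed(EXAMPLE_RULES, "Googlebot", "https://example.com/index.html"))         # True
```

Note that the check happens entirely on the crawler's side: nothing stops a client that ignores the rules from requesting the disallowed URL.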

Solution

If we visit /robots.txt on the challenge website, we find the following:

User-agent: *
Disallow: /de3a3be2-ecea-4fe5-9224-b65bf491b184/

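The Disallow rule asks every crawler (User-agent: *) not to crawl this path. Because robots.txt is purely advisory and is itself publicly readable, the rule does nothing to protect the path; it simply tells us where to look.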
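For sites with longer robots.txt files, pulling out the disallowed paths by hand gets tedious. The following is a rough sketch of how that could be automated; disallowed_paths is a hypothetical helper written for this writeup, and it deliberately ignores which User-agent group each rule belongs to.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect every non-empty Disallow value, regardless of user-agent group."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow means "nothing is disallowed"
                paths.append(value)
    return paths


# The robots.txt served by the challenge site, as quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /de3a3be2-ecea-4fe5-9224-b65bf491b184/
"""

print(disallowed_paths(ROBOTS_TXT))
# ['/de3a3be2-ecea-4fe5-9224-b65bf491b184/']
```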

If we navigate to that path, we find a directory listing:

[Image: directory listing showing one file named flag.txt]
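If the listing were longer, the file names could be extracted with a script instead of by eye. This is a minimal sketch using the standard html.parser module; the SAMPLE_LISTING markup is an assumption modelled on typical auto-generated index pages, since the challenge's actual listing HTML was not preserved.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def listing_entries(html: str) -> list[str]:
    """Return relative links from a listing, skipping parent and sort links."""
    collector = LinkCollector()
    collector.feed(html)
    return [h for h in collector.links if not h.startswith(("?", "/", "../"))]


# Hypothetical markup; the real page's HTML is an assumption.
SAMPLE_LISTING = """<html><body><h1>Index of /example/</h1>
<a href="../">../</a>
<a href="flag.txt">flag.txt</a>
</body></html>"""

print(listing_entries(SAMPLE_LISTING))
# ['flag.txt']
```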

Sure enough, flag.txt contains our flag: clubeh{h1d1ng_fr0m_4h3_r0b045_7d37c0ec}
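The whole solve can be sketched as a short script. Since the challenge site is offline, the demo below runs against a stubbed fetch function. The ctf.example host, the /secret/ path, the placeholder flag, and the helper names are all hypothetical, and the script assumes the flag file is named flag.txt as it was in the listing.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

FLAG_RE = re.compile(r"clubeh\{[^}]+\}")


def fetch_text(url: str) -> str:
    """Fetch a URL and decode the body as text (used against a live site)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_flag(base_url: str, fetch=fetch_text):
    """Read robots.txt, then look for flag.txt under each disallowed path."""
    robots = fetch(urljoin(base_url, "/robots.txt"))
    for line in robots.splitlines():
        field, _, value = line.partition(":")
        path = value.strip()
        if field.strip().lower() != "disallow" or not path:
            continue
        # Assumes the disallowed path is a directory ending in "/".
        flag_url = urljoin(urljoin(base_url, path), "flag.txt")
        match = FLAG_RE.search(fetch(flag_url))
        if match:
            return match.group(0)
    return None


# Stubbed pages standing in for the offline site (hypothetical values).
FAKE_PAGES = {
    "http://ctf.example/robots.txt": "User-agent: *\nDisallow: /secret/\n",
    "http://ctf.example/secret/flag.txt": "clubeh{placeholder_flag}\n",
}

print(find_flag("http://ctf.example/", fetch=FAKE_PAGES.__getitem__))
# clubeh{placeholder_flag}
```

Passing the fetch function as a parameter keeps the logic testable without network access; against a live target, calling find_flag with only the base URL would use fetch_text instead.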