"Learning Notes on IDA Reverse Engineering from Scratch" - 12 (Program Registration Reverse Analysis)

12.1 Determine the main function through command line parameters

This chapter uses the program TEST_REVERSER.exe as an exercise for practicing static reverse engineering and debugging.

First, check the program's architecture with a file identification tool such as DIE (Detect It Easy).

From the above image, we can see that this is a 32-bit program compiled with VC++ 2015.

In the second step, try running the program, which prompts for a username and password. Enter any username and password, and it displays "bad reverser."

In the third step, open IDA and load the target program.

One way to navigate to the key code is to search for strings. Another is to search for names related to the command line parameters, such as argc and argv. Since this is a C++ program, the prototype of the main function is as follows:

int main(int argc, char *argv[])

Press Ctrl+F to open the search box and search for names containing arg.

Double-click __p___argc. The content is as follows:

Press the X key to search for references.

In the image above, the program calls the __p___argc and __p___argv functions and then passes the returned values to the main function.

Double-click to enter the main function.

Three parameters of the main function

References of the main function (signed)

12.2 Stack analysis of the main function

Double-click on any function parameter or local variable to switch to the static stack view.

From the above image, we can see that the function parameters are at the bottom, always below the return address (r). This is because the caller pushes the parameters onto the stack first, and the call instruction then pushes the return address.

Directly above the return address is the saved ebp value (s) of the function that called main.

In the image above, the first instruction of main, push ebp, saves the caller's ebp onto the stack. The next instruction, mov ebp, esp, copies esp into ebp so that ebp serves as the base address: function parameters are referenced below it and local variables above it. Finally, sub esp, 94h moves esp up to reserve space for local variables and buffers. In this program the reserved size is 0x94 bytes, which the compiler calculates from the local variables declared in the source code.

After the prologue, esp points to the top of the local variable area and ebp points to the base address. Local variables lie above the base address, while the return address and function parameters lie below it, as shown in the image below.

Therefore, in an ebp-based function, once push ebp has saved the caller's ebp and esp has been copied into ebp, offset 00000000 in the static stack view corresponds to ebp. It serves as the baseline: offsets above it are negative (-) and offsets below it are positive (+).

In the image above, the relative address of var_4 is -00000004. If we take the value of ebp as the reference, the actual address of var_4 is ebp-4.
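The relationship between these offsets and real addresses can be modeled in a few lines of Python. This is only an illustrative sketch: the ebp value is made up, and the helper name frame_addresses is hypothetical. The offsets for var_4, the return address, and the 0x94 bytes reserved by sub esp, 94h come from the text above. Placing the three main parameters at ebp+8 and beyond, and naming the third one envp, assumes the usual 32-bit cdecl layout.

```python
# Sketch of the ebp-based frame of main, using offsets from the static stack view.
# The ebp value used below is an arbitrary example, not taken from the program.

FRAME_OFFSETS = {
    "esp after sub esp, 94h": -0x94,  # top of the local variable area
    "var_4": -0x04,                   # local variable just above the saved ebp
    "saved ebp (s)": 0x00,            # pushed by push ebp
    "return address (r)": 0x04,       # pushed by the call instruction
    "argc": 0x08,                     # first parameter (assumed cdecl layout)
    "argv": 0x0C,                     # second parameter
    "envp": 0x10,                     # third parameter (assumed name)
}


def frame_addresses(ebp):
    """Return the absolute 32-bit address of each slot for a given ebp."""
    return {name: (ebp + offset) & 0xFFFFFFFF
            for name, offset in FRAME_OFFSETS.items()}


if __name__ == "__main__":
    for name, address in frame_addresses(0x0019FF40).items():
        print(f"{address:08X}  {name}")
```

Negative offsets land at lower addresses than ebp (local variables), and positive offsets land at higher addresses (return address and parameters), which matches the sign convention of the static stack view.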

In the disassembly view, right-click on any instance using var_4 to verify the above content.

Above var_4, there is a blank area without variables, which may be a buffer.

Move the view up a bit, and you can see the first variable Buf above the blank area, as shown in the image below.

Right-click Buf and select Array. The window that pops up shows that the array consists of 120 one-byte elements, so the buffer is 120 bytes long.

Function stack view

In the image above, ebp is the base address. After mov ebp, esp executes, esp is reduced by 0x94 and ends up pointing to the top of the local variable area. The image below shows the value of esp after sub esp, 94h has executed.

In the image above, the 00000094 on the left means esp = ebp - 0x94. When main calls other functions, esp moves further up. Until main returns, however, all of its own local variables stay within the area between -0x94 and the base address.

12.3 Local variables of the main function

Next, we analyze the local variables from the static stack view. The parameters of the main function are already known.

Local variables

In the image above, the program reads a value (the stack cookie), XORs it with ebp, and saves the result in var_4. The compiler inserts this value so that a stack buffer overflow can be detected before the function returns.
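The idea can be sketched in Python, assuming the usual compiler-generated scheme: the prologue stores cookie XOR ebp in var_4, and before returning the function XORs var_4 with ebp again and checks that the original cookie comes back. The cookie and ebp values and the function names below are invented for illustration and are not taken from the binary.

```python
# Model of a stack-cookie check. All values are illustrative assumptions.
MASK32 = 0xFFFFFFFF


def protect(cookie, ebp):
    """Prologue: compute the value that is stored in var_4."""
    return (cookie ^ ebp) & MASK32


def is_intact(var_4, cookie, ebp):
    """Epilogue: var_4 XOR ebp must give back the original cookie."""
    return ((var_4 ^ ebp) & MASK32) == cookie


cookie = 0xBB40E64E   # hypothetical cookie value
ebp = 0x0019FF40      # hypothetical frame pointer

var_4 = protect(cookie, ebp)
print(is_intact(var_4, cookie, ebp))       # slot untouched: check passes
print(is_intact(0x41414141, cookie, ebp))  # slot overwritten: check fails
```

Because var_4 sits between the local buffers and the saved ebp and return address, an overflow that runs from a buffer toward the return address has to overwrite var_4 first, which is what makes the check useful.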

Double-click sub_4011B0 to enter, where you can see the sub_401040 function.

The sub_401040 function contains a printf function, indicating that this function is used to print characters.

After that, the Size variable is assigned the value 8. Its cross-references show two uses, both of which only read the value without modifying it.

Next comes a call to gets_s, which limits the length of user input. The image above shows that the size passed is 8. Since gets_s counts the terminating null character in this size, at most 7 characters can be entered. The size is passed with push eax, and lea then obtains the address of the variable Buf, which is the destination buffer.

If the user enters fewer characters and presses Enter, gets_s stops reading and returns. Therefore, Buf holds at most 8 bytes, including the terminating null character.

Then the program passes the buffer address to strlen() as a parameter through push edx. strlen() returns the length of the string in Buf, and the result is saved in the var_90 variable.

12.4 Loop and code block grouping

In the image above, the blue arrow pointing backward suggests a loop, and the var_84 variable serves as its counter. At 0x4019f5, a conditional jump exits the loop when the condition is met. The counter starts at 0 and is incremented until it is greater than or equal to the var_90 variable, at which point the loop ends.

Counter increment by 1

The value of the counter variable is passed into EAX, and after EAX is incremented by 1, it is returned to the counter variable.

In the image above, the program reads a byte of the buffer from [EBP+EDX+Buf]. EBP+Buf is the start of the buffer and EDX holds the counter, which starts at 0, so the first iteration reads the first byte. Each pass through the loop increments the counter by 1 to read the next byte. The loop adds the value of each byte of the buffer to the var_88 variable (initial value 0).

The content of this loop is character addition.
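A minimal Python sketch of this loop, with the variable names from the disassembly noted in comments. The function name sum_of_characters is hypothetical, and the sketch assumes plain ASCII input, so each byte is added as a small positive value.

```python
def sum_of_characters(buf: bytes) -> int:
    """Add up the byte values of buf, mirroring the loop in main."""
    length = len(buf)          # var_90 = strlen(Buf)
    total = 0                  # var_88, initial value 0
    counter = 0                # var_84, the loop counter
    while counter < length:    # the loop ends once counter >= var_90
        total += buf[counter]  # byte read from [ebp+edx+Buf], edx = counter
        counter += 1           # counter incremented by 1 through eax
    return total & 0xFFFFFFFF  # var_88 is a 32-bit value


if __name__ == "__main__":
    print(hex(sum_of_characters(b"pepe")))  # 0x1aa
```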

Mark all the blocks of the loop in the same color, and drag the bottom code blocks closer together.

These three blocks can be grouped: hold Ctrl and click the title bar of each block to select it, which turns the title bar cyan.

Then right-click to group the nodes.

Final effect

To see the specific content, right-click to ungroup the nodes.

12.5 Registration algorithm analysis

Continue analyzing the code below the loop content.

In the image above, the program prompts for the password. The sub_4011B0 that follows is the print function seen earlier, after which gets_s is called again. The username and password use the same buffer Buf with the same maximum length Size.

Since the program has already calculated the sum of the characters in the username, it no longer needs the username string, so the password can reuse the same buffer.

Next, the atoi function is called to convert the decimal string that was entered into an integer, which is saved in the variable var_94, the password variable.
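For readers unfamiliar with atoi, the following Python sketch approximates its parsing rules: skip leading whitespace, accept an optional sign, then read digits until the first non-digit. The name atoi_like is hypothetical, and overflow behavior is deliberately not modeled.

```python
def atoi_like(text: str) -> int:
    """Approximate C atoi: whitespace, optional sign, then decimal digits."""
    i = 0
    n = len(text)
    # Skip leading whitespace.
    while i < n and text[i] in " \t\n\r\v\f":
        i += 1
    # Optional sign.
    sign = 1
    if i < n and text[i] in "+-":
        if text[i] == "-":
            sign = -1
        i += 1
    # Accumulate digits and stop at the first non-digit character.
    value = 0
    while i < n and "0" <= text[i] <= "9":
        value = value * 10 + (ord(text[i]) - ord("0"))
        i += 1
    return sign * value


if __name__ == "__main__":
    print(atoi_like("213"))    # 213
    print(atoi_like("213abc")) # 213, parsing stops at 'a'
    print(atoi_like("abc"))    # 0, no digits found
```

One practical consequence is that a password containing no leading digits is read as 0.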

Then the program passes the var_94 password variable through push edx, and the var_88 variable through push eax, passing these two variables as parameters to the 0x401010 function.

Entering the function at 0x401010, we can see two parameters. Since the password variable is pushed onto the stack first, it ends up deeper in the stack and is arg_4. The arg_0 parameter above it is the value of var_88.

So how does the 0x401010 function use these two parameters?

Before performing a cmp comparison, the program passes the password variable arg_4 to eax and executes shl eax,1.

shl shifts the bits in eax to the left, filling the rightmost low bits with 0. As a special case, shl reg,1 is equivalent to multiplying by 2.

Thus, the program multiplies the password variable by 2 and compares it with arg_0.
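The effect of shl eax, 1 on a 32-bit register can be sketched in Python as follows. The helper name shl32 is hypothetical, and the mask models the fact that bits shifted out of the top of the register are lost.

```python
MASK32 = 0xFFFFFFFF


def shl32(value, count=1):
    """Shift left within 32 bits, filling the low bits with 0."""
    return (value << count) & MASK32


if __name__ == "__main__":
    print(shl32(213))              # 426, same as 213 * 2
    print(hex(shl32(0x80000001)))  # 0x2, the top bit is shifted out
```

For small values such as a typical password, the shift is exactly a multiplication by 2. The two only differ when the top bit of the register is set.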

Python's ord function can be used to obtain the decimal ASCII value of each character.

If "pepe" is used as the username, the sum of the characters in "pepe" is as follows:

The result is 0x1aa (426 in decimal). The program compares the entered password multiplied by 2 with 0x1aa, so the correct password is 0x1aa divided by 2:

The result is 213.
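The calculation can be wrapped in a small key generator. This is a sketch based only on the algorithm described above (sum of the username characters, compared with the password multiplied by 2); the function name password_for is hypothetical, and ASCII usernames are assumed.

```python
def password_for(username: str):
    """Return the password whose double equals the character sum, or None."""
    total = sum(ord(ch) for ch in username)  # same sum the program computes
    if total % 2 != 0:
        # password * 2 is always even, so an odd sum can never match.
        return None
    return total // 2


if __name__ == "__main__":
    print(hex(sum(ord(ch) for ch in "pepe")))  # 0x1aa
    print(password_for("pepe"))                # 213
```

Note that a username whose character sum is odd has no valid password under this scheme, so the sketch returns None in that case.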

At this point, you can open the program and enter the username "pepe" and the password "213".

It displays a success message.

From the image above, we can see that when these two values are not equal, it jumps to the red code block and returns 0. If they are equal, it jumps to the green code block and returns 1.

What is the purpose of the return value?

From the image above, we can see that the return value is passed to the var_7D variable.

From the image above, we can see that if the return value is 0, it jumps to "bad reverser," and if it is 1, it jumps to "good reverser."
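Putting the pieces together, the whole check can be modeled end to end. This is a sketch under stated assumptions, not the program's actual source: the function names are hypothetical, int() stands in for atoi (so it raises on non-numeric input, unlike atoi), and the two messages are the ones the program displays.

```python
MASK32 = 0xFFFFFFFF


def check(arg_0, arg_4):
    """Model of the function at 0x401010: 1 if arg_4 * 2 equals arg_0, else 0."""
    doubled = (arg_4 << 1) & MASK32   # shl eax, 1 on the password
    return 1 if doubled == arg_0 else 0


def register(username: str, password_text: str) -> str:
    """Model of main: sum the username, parse the password, test the result."""
    char_sum = sum(ord(ch) for ch in username)  # var_88
    password = int(password_text)               # var_94 (atoi in the original)
    result = check(char_sum, password)          # return value saved in var_7D
    return "good reverser" if result else "bad reverser"


if __name__ == "__main__":
    print(register("pepe", "213"))  # good reverser
    print(register("pepe", "123"))  # bad reverser
```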

This chapter mainly discusses how to reverse a registration check and derive a valid username and password pair, covering function stacks, local variables, and registration algorithm analysis.