Monday, February 27, 2023

Reverse Engineering Notes - CVE-2020-7460


static int
freebsd32_copyin_control(struct mbuf **mp, caddr_t buf, u_int buflen)
{
	struct mbuf *m;
	void *md;
	u_int idx, len, msglen;
	int error;

	buflen = FREEBSD32_ALIGN(buflen);

	if (buflen > MCLBYTES)
		return (EINVAL); 

This is a function named freebsd32_copyin_control that takes three arguments: mp, a pointer to a pointer to struct mbuf; buf, a caddr_t that points at the caller's control-message buffer in user space; and buflen, a u_int giving the length of that buffer. The function returns an int error code, with 0 meaning success.

Here's what each line of the function does:

  • struct mbuf *m; This line declares a pointer to an mbuf, the kernel's network buffer structure. It refers to the mbuf that will hold the converted control messages.
  • void *md; This line declares a generic pointer. There is no md field in struct mbuf; the variable serves as a cursor into the mbuf's data area while messages are copied in.
  • u_int idx, len, msglen; This line declares three unsigned integers. In the loop quoted later in these notes, idx is the current offset into the user buffer, msglen is the length of the control message found at that offset, and len accumulates the total length the messages need after realignment.
  • int error; This line declares an integer that holds the return value of copyin(), so the function can pass the error code back if a copy fails.
  • buflen = FREEBSD32_ALIGN(buflen); This line rounds buflen up to the alignment boundary of the 32-bit ABI, so the length matches the padded layout of 32-bit control messages.
  • if (buflen > MCLBYTES) return (EINVAL); This line checks whether buflen is larger than MCLBYTES, the size of an mbuf cluster. If it is, the input cannot fit in a single cluster and the function returns EINVAL, the standard Unix error code for an invalid argument.
  • The excerpt stops here. Apart from the length-computation loop quoted later in these notes, the rest of the function is not shown, so it is not described line by line. That loop reads each control message's length from user space with copyin() and validates it. The FreeBSD security advisory for CVE-2020-7460 describes the flaw as a time-of-check to time-of-use (TOCTOU) race in this 32-bit sendmsg(2) compatibility path: a length that was validated during one read of user memory can be changed by another thread before the kernel reads it again.
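To make the rounding concrete, here is a minimal sketch of what an alignment macro such as FREEBSD32_ALIGN does. The helper name align_up and the example boundaries (4 bytes for a 32-bit ABI, 8 bytes for a 64-bit kernel) are assumptions for illustration; they are not copied from the FreeBSD headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round n up to the next multiple of align.
 * align must be a nonzero power of two. */
static uint32_t align_up(uint32_t n, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}
```

With this helper, align_up(13, 4) gives 16 and align_up(12, 8) gives 16, while a value that is already aligned, such as align_up(16, 4), is returned unchanged.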

The overall purpose of this function is to copy control messages from a 32-bit process's caddr_t buffer into a struct mbuf, converting them from 32-bit to 64-bit alignment along the way. The function first rounds the buffer length up to the 32-bit alignment boundary and rejects it with EINVAL if the rounded length exceeds MCLBYTES. If the length is within limits, the function proceeds with the copy and returns 0 to indicate success. Otherwise, it returns an error code to indicate failure.

	idx = 0;
	len = 0;
	while (idx < buflen) {
		error = copyin(buf + idx, &msglen, sizeof(msglen));
		if (error)
			return (error);
		if (msglen < sizeof(struct cmsghdr))
			return (EINVAL);
		msglen = FREEBSD32_ALIGN(msglen);
		if (idx + msglen > buflen)
			return (EINVAL);
		idx += msglen;
		msglen += CMSG_ALIGN(sizeof(struct cmsghdr)) -
		    FREEBSD32_ALIGN(sizeof(struct cmsghdr));
		len += CMSG_ALIGN(msglen);
	}

This code block is a pass over the user-space buffer that buf points to. It sets idx and len to 0 before the loop, then repeats until idx reaches buflen.

Inside the loop, copyin() copies sizeof(msglen) bytes (4 bytes) from user space at offset idx into msglen. These bytes are the length field at the start of the current control message. If the copy fails, the function returns the error code.

The code then checks whether msglen is smaller than struct cmsghdr. A message cannot be shorter than its own header, so in that case the function returns EINVAL.

Next, msglen is rounded up with FREEBSD32_ALIGN, because each message in the 32-bit layout is padded to that boundary and the next message starts at the padded offset.

The code then checks whether idx plus the padded msglen runs past buflen. If it does, the message does not fit in the buffer and the function returns EINVAL.

If none of the error conditions are met, idx advances by the padded length to the start of the next message. msglen is then increased by the difference between the 64-bit and 32-bit padded header sizes, and len accumulates CMSG_ALIGN(msglen), the space the message needs in the 64-bit layout. When the loop ends, len holds the total size of the converted control data.
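The length computation can be tried out in user space with a small sketch. Everything here is an assumption made for illustration: memcpy() stands in for copyin(), HDR_SIZE assumes a 12-byte struct cmsghdr, ALIGN32 and ALIGN64 stand in for FREEBSD32_ALIGN and CMSG_ALIGN on a 64-bit kernel, and the function name control_len64 is hypothetical. The bounds check is also done before rounding rather than after, which is a simplification and not the kernel's exact logic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE   12u                     /* assumed sizeof(struct cmsghdr) */
#define ALIGN32(n) (((n) + 3u) & ~3u)      /* stand-in for FREEBSD32_ALIGN */
#define ALIGN64(n) (((n) + 7u) & ~7u)      /* stand-in for CMSG_ALIGN, 64-bit */

/* Walks 32-bit-aligned control messages in buf and stores in *out the total
 * length they need with 64-bit alignment. Returns 0 on success, -1 if a
 * message length is malformed. buflen must already be a multiple of 4. */
static int control_len64(const unsigned char *buf, uint32_t buflen, uint32_t *out)
{
    uint32_t idx = 0, len = 0, msglen;

    assert(buflen % 4 == 0);
    while (idx < buflen) {
        memcpy(&msglen, buf + idx, sizeof(msglen));   /* plays the role of copyin() */
        if (msglen < HDR_SIZE || msglen > buflen - idx)
            return -1;
        msglen = ALIGN32(msglen);
        idx += msglen;                                /* next message in 32-bit layout */
        msglen += ALIGN64(HDR_SIZE) - ALIGN32(HDR_SIZE);
        len += ALIGN64(msglen);                       /* space needed in 64-bit layout */
    }
    *out = len;
    return 0;
}
```

For a 28-byte buffer holding a 16-byte message followed by a 12-byte message, the sketch computes 24 + 16 = 40 bytes for the 64-bit layout.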

 

 

Reverse Engineering Notes - Race Conditions

Race Conditions

Race Conditions - Thank you, shared resources and parallelism!

A race condition is a situation in which the behavior of a program depends on the order and timing of events, which are not guaranteed or predictable. In other words, it occurs when multiple threads or processes try to access a shared resource at the same time and the result of the execution depends on the order of the operations. This can lead to unexpected behavior, such as data corruption or access violations.

Double Fetch

A double fetch vulnerability occurs when privileged code, typically a kernel, reads the same attacker-writable memory twice, for example once to validate a value and once to use it. Because the second read does not go through the check again, an attacker can modify the value between the two fetches, leading to memory corruption, privilege escalation, or denial of service.

Time of Check, Time of Use (TOCTOU)

A Time of Check, Time of Use (TOCTOU) vulnerability is a type of race condition in which the state of a resource changes between the time it is checked and the time it is used. This can lead to unexpected or malicious behavior, such as privilege escalation or denial of service attacks.

Root Causes

 The root cause of race conditions is the presence of shared resources that can be accessed by multiple threads or processes simultaneously, without proper synchronization. These shared resources can include both volatile and non-volatile memory, as well as other system resources like files, sockets, and hardware devices.

In a multi-threaded program, different threads can execute concurrently and access shared memory locations, such as global variables or heap-allocated memory, without proper synchronization. Similarly, in a multi-process program, different processes can access shared resources, such as files or sockets, without proper synchronization.

When multiple threads or processes access the same shared resource without proper synchronization, the order of access and updates can become unpredictable, leading to race conditions. For example, if two threads try to increment the value of a shared variable simultaneously, it's possible that both threads read the same value before incrementing it, resulting in a lost update.
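The lost update described above can be replayed deterministically. The sketch below is an illustration only: it runs on a single thread and simply performs the steps of two hypothetical threads, A and B, in one unlucky order.

```c
#include <assert.h>

/* Replays one bad interleaving of two "threads" A and B on a single thread,
 * so the outcome is deterministic. Returns the final value of the counter. */
static int lost_update_demo(void)
{
    int shared = 0;

    int a_tmp = shared;   /* A loads 0 */
    int b_tmp = shared;   /* B loads 0 before A has stored its result */
    shared = a_tmp + 1;   /* A stores 1 */
    shared = b_tmp + 1;   /* B stores 1, overwriting A's update */

    return shared;        /* two increments ran, but only one is visible */
}
```

Two increments were performed, yet the function returns 1: one update was lost because both loads happened before either store.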

In summary, race conditions can occur whenever there are shared resources that can be accessed concurrently by multiple threads or processes, regardless of whether those resources are stored in volatile or non-volatile memory. To prevent race conditions, it's important to use proper synchronization mechanisms, such as locks, semaphores, or atomic operations, to ensure that only one thread or process can access a shared resource at a time.

examples:

  • 2+ clients talking to the same server
  • 2+ tabs executing javascript in the same browser
  • 2+ userspace threads/apps executing system calls in the same OS
  • 2+ OSes running in the same hypervisor

logic flow

 Here are some general steps you can follow to test for race conditions using radare2:

  1. Identify shared resources: Use radare2 to analyze the binary code and identify which memory locations or system resources are shared among multiple threads or processes. This can include global variables, heap-allocated memory, or files and sockets that are accessed by multiple threads or processes.

  2. Analyze access patterns: Use radare2 to analyze the assembly code of the program and identify how shared resources are accessed by different threads or processes. Look for patterns where multiple threads or processes access the same resource without proper synchronization.

  3. Control thread scheduling under the debugger: radare2 does not create threads itself, and dcb is its "continue back" (reverse execution) command, so the target program has to start its own threads or processes. Open the program with r2 -d, set breakpoints with db at the instructions that check and use the shared resource, list the threads with dpt, and resume with dc. Holding one thread at a breakpoint between its check and its use while another thread runs lets you observe how the shared resource changes in that window.

  4. Test synchronization mechanisms: Use radare2 to locate calls to locking primitives, such as locks or semaphores, and confirm that every access to the shared resource happens while the lock is held. Breakpoints on the lock and unlock calls show which threads take the lock and in what order.

     

Here are some examples of how these concepts can be used in C code:

Race Condition example:

#include <stdio.h>
#include <pthread.h>

int global_variable = 0;

void* thread_func(void* arg){
    for (int i = 0; i < 1000000; i++){
        global_variable++;
    }
    return NULL;
}

int main(){
    pthread_t thread1, thread2;
    pthread_create(&thread1, NULL, thread_func, NULL);
    pthread_create(&thread2, NULL, thread_func, NULL);
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);
    printf("global_variable value: %d\n", global_variable);
    return 0;
}

In this example, two threads each increment a global variable one million times. The increment is not atomic: each thread loads the value, adds one, and stores it back, and the steps of the two threads can interleave. Updates are therefore lost, so the final value is usually below 2000000 and varies between executions of the program.
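One way to remove this race is to make the increment indivisible. The sketch below assumes a C11 compiler with <stdatomic.h> and POSIX threads; the names counter, worker, and run_two_workers are hypothetical, and the iteration count is reduced to keep the run short. A pthread mutex around the increment would work as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define INCREMENTS 100000

static atomic_int counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        atomic_fetch_add(&counter, 1);   /* indivisible read-modify-write */
    return NULL;
}

/* Runs two workers to completion and returns the final count. */
static int run_two_workers(void)
{
    pthread_t t1, t2;
    int rc;

    atomic_store(&counter, 0);
    rc = pthread_create(&t1, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, NULL);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return atomic_load(&counter);
}
```

Because every increment is atomic, run_two_workers() returns 2 * INCREMENTS on every run. On older toolchains the program may need to be built with -pthread.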

Double Fetch example:

#include <stdio.h>
#include <string.h>

struct message {
    volatile unsigned int length;   /* shared memory another party can rewrite at any time */
    char data[64];
};

void do_something(struct message* shared){
    char local[16];

    /* First fetch: validate the length */
    if (shared->length > sizeof(local)){
        return;
    }
    /* Second fetch: the length is read again and may have changed */
    memcpy(local, shared->data, shared->length);
    printf("Copied %u bytes\n", shared->length);
}

int main(){
    struct message msg = { .length = 8, .data = "example" };
    do_something(&msg);
    return 0;
}

In this example, the do_something() function receives a pointer to a message in shared memory. It reads shared->length once to check it against the size of the local buffer, and then reads it a second time when calling memcpy(). If another thread or process changes length between the two reads, the copy uses a value that was never checked and can overflow local. In this single-threaded main() nothing modifies the message, so the program just prints the number of bytes copied; the bug only shows when a second party writes to the shared memory in that window.
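A common way to avoid a double fetch is to read the shared value once into a private variable and then both check and use only that copy. The sketch below illustrates the idea; the struct shared_msg layout and the function name copy_message are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of a message living in memory another party can modify. */
struct shared_msg {
    volatile size_t length;
    char data[64];
};

/* Copies the payload into out (capacity out_cap).
 * Returns the number of bytes copied, or -1 if the length is too large. */
static long copy_message(const struct shared_msg *msg, char *out, size_t out_cap)
{
    size_t len = msg->length;            /* single fetch into a private copy */

    if (len > out_cap || len > sizeof(msg->data))
        return -1;                       /* check the private copy */
    memcpy(out, msg->data, len);         /* use the same private copy */
    return (long)len;
}
```

Even if the other party rewrites msg->length after the first line, the value that was checked is the value that is used, so the copy can never exceed out_cap.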

Time of Check, Time of Use (TOCTOU) example:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

void do_something(const char* filename, uid_t uid){
    struct stat st;
    /* Time of check: inspect whatever the path names right now */
    if (stat(filename, &st) < 0){
        perror("stat");
        return;
    }
    if (st.st_uid != uid){
        printf("Access denied\n");
        return;
    }
    /* Time of use: the path is resolved again and may now name another file */
    int fd = open(filename, O_RDONLY);
    if (fd < 0){
        perror("open");
        return;
    }
    char buffer[1024];
    ssize_t n = read(fd, buffer, sizeof(buffer));
    close(fd);
    if (n < 0){
        perror("read");
        return;
    }
    printf("File contents: %.*s\n", (int)n, buffer);
}

int main(int argc, char** argv){
    const char* filename = argc > 1 ? argv[1] : "/etc/passwd";
    do_something(filename, getuid());
    return 0;
}

In this example, the `do_something()` function takes a filename and a user ID as arguments and tries to read the contents of the file. Before opening the file, the function calls `stat()` on the path and checks that the user ID matches the owner of the file. This is intended to prevent unauthorized access to the file. However, the check is vulnerable to a Time of Check, Time of Use (TOCTOU) attack because `stat()` and `open()` each resolve the path separately, so the file that the path names can change between the check and the use. This matters when the program runs with more privileges than its caller, for example as a setuid-root binary, where `getuid()` returns the caller's real user ID. An attacker can exploit the window by replacing the file with a symbolic link to a file owned by a different user after the check has passed. Checking the already-opened descriptor with `fstat()` instead would not leave this window.

The following is an example of how this can be done:

$ echo harmless > /tmp/report
$ ./example /tmp/report &
$ ln -sf /etc/shadow /tmp/report

In this example, the attacker first creates `/tmp/report`, a file they own, so the ownership check passes. While the program is running, the attacker replaces `/tmp/report` with a symbolic link to `/etc/shadow`, which is owned by the root user. If the replacement lands after `stat()` and before `open()`, the privileged program opens and prints `/etc/shadow`, which contains sensitive information that should only be accessible by the root user. The window is small, so in practice the attacker repeats the attempt until the timing works.

Sunday, February 26, 2023

Reverse Engineering Notes - Radare2 Symbols + DLL Hijacking

Radare2 - Weak symbols


Weak symbols are symbols whose binding is marked WEAK rather than GLOBAL. A weak reference does not have to be resolved for the binary to link or load: if no library provides the symbol at runtime, its address is simply zero (NULL), so well-behaved code checks the address before calling it, and code that does not may crash. A weak definition, conversely, acts as a default implementation that any strong (GLOBAL) definition of the same name overrides.

To harden a binary with weak symbols, make sure that the libraries that resolve them at runtime are the ones you expect and that they are up-to-date with no known vulnerabilities. If the functionality behind a weak symbol is not required for the binary to run, you can remove the dependency by either linking the required function statically or removing the function call altogether.

In the case of the output from radare2 that you provided, some of the weak symbols are related to the POSIX threads library (pthread), which is commonly used for multithreaded programming in Unix-like systems. If your binary is using pthreads, you should ensure that the library is present and up-to-date at runtime to avoid vulnerabilities.
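The usual pattern for a weak reference looks like the sketch below. It assumes GCC or Clang on an ELF platform such as Linux, where __attribute__((weak)) is available; optional_hook is a hypothetical function that no library provides, so its address resolves to NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Weak reference to a hypothetical function that nothing here defines.
 * __attribute__((weak)) is a GCC/Clang extension for ELF targets. */
extern void optional_hook(void) __attribute__((weak));

/* Calls the hook only if it was resolved at link or load time.
 * Returns 1 if the hook was called, 0 if it is absent. */
static int call_hook_if_present(void)
{
    if (optional_hook == NULL)
        return 0;
    optional_hook();
    return 1;
}
```

If a library that defines optional_hook is linked or preloaded, the same binary calls it without being rebuilt, which is why weak references are worth reviewing when hardening a binary.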

how can I check that with radare2?


You can use the ii command in radare2 to display the imports of a binary and look for weak symbols. Here's an example:

    Open the binary in radare2 using the following command:

r2 /path/to/binary

    Once the binary is loaded, use the ii command to display the imports:

[0x004004a0]> ii
[Imports]
nth vaddr      bind   type   lib name
...

    Look for any symbols with a WEAK bind type, which indicates that they can be overridden at runtime. You can filter the output with radare2's built-in ~ grep (ii~WEAK) or by piping it to the shell's grep command:

[0x004004a0]> ii | grep WEAK
  1 0x000007b0 WEAK   FUNC       kernel32.dll_GetCurrentProcessId
  2 0x000007c0 WEAK   FUNC       kernel32.dll_GetCurrentThreadId
  3 0x000007d0 WEAK   FUNC       kernel32.dll_GetTickCount
  4 0x000007e0 WEAK   FUNC       kernel32.dll_QueryPerformanceCounter
  5 0x000007f0 WEAK   FUNC       kernel32.dll_IsDebuggerPresent
...

In this example, we can see that there are several weak symbols imported from kernel32.dll.

how can i use radare2 to make a proof of concept for this? and describe the logical steps it would take in radare2

To create a proof of concept for DLL hijacking using radare2, you can follow these logical steps:

    Find a vulnerable binary: Find a binary that imports functions from a DLL it loads by name, from a location where a substitute DLL could be placed. Use radare2 -A [binary] to open the binary in radare2 and analyze it.

    Identify the vulnerable symbol: Use the ii command in radare2 to list the symbols in the Import Table (iE lists exports, not imports) and the il command to list the libraries the binary links against. Note down the DLL name, and the name and virtual address of the imported symbol you want to hijack.

    Create a malicious DLL: Create a malicious DLL that exports a function with the same name as the vulnerable symbol. This function should perform the desired action, such as executing arbitrary code or displaying a message box. The DLL should also contain the code that will execute the desired function when it is loaded.

    Place the malicious DLL in the current working directory or another directory that is in the system's DLL search path.

    Load the binary in radare2 and set a breakpoint at the address of the vulnerable symbol: Use radare2 -d [binary] to open the binary in debug mode. Use the db command to set a breakpoint at the address of the vulnerable symbol.

    Run the binary: Use the dc command to run the binary. The binary will load the malicious DLL instead of the legitimate one, and the function in the DLL will be executed.

    Verify the exploit: The code in the malicious DLL should have been executed, and you should see the expected behavior, such as a message box popping up.

Note that this is a simplified example of DLL hijacking, and in a real-world scenario, you would need to consider additional factors, such as the location of the vulnerable DLL, the search order of the DLLs, and the permissions required to place the malicious DLL in the desired directory.

Reverse Engineering Notes - _ITM_deregisterTMCloneTable

 

what is _ITM_deregisterTMCloneTable?

_ITM_deregisterTMCloneTable is another symbol related to transactional memory support in GCC-compiled programs. It is used to remove a previously registered transactional memory clone table.

Transactional memory is a feature in some CPUs and programming languages that allows a group of memory operations to be performed as a single, atomic transaction. This can make it easier to write concurrent code, as the programmer does not need to worry about race conditions or manually synchronizing access to shared data.

GCC's startup code contains weak references to _ITM_registerTMCloneTable and _ITM_deregisterTMCloneTable and calls them only if they resolve to a non-NULL address. As a result, these two names show up as weak undefined symbols in almost every GCC-linked ELF binary, whether or not the program uses transactional memory. Functions such as _ITM_addUserCommitAction belong to the transactional memory runtime library, libitm.

For reverse engineers, the presence of _ITM_deregisterTMCloneTable in a program's symbols mainly indicates that the binary was linked with GCC's startup files. On its own it does not show that the program uses transactional memory. It becomes more interesting when other _ITM_ symbols or libitm itself are present, since analyzing their usage can then give insight into how the program handles concurrent access to shared data and help identify bugs or vulnerabilities related to transactional memory usage.

When reverse engineering an application or trying to harden it against attacks, identifying weak library symbols can be useful to find potential vulnerabilities or attack vectors. Radare2 can be used to search for such symbols using the is command, which lists the symbols in the binary, filtered on the WEAK bind type, for example is~WEAK. Note that iS, with a capital S, lists sections rather than symbols.

To identify and patch weak library symbols, one approach is to replace the weakly referenced symbol with a stronger, explicitly referenced symbol. This can be done using Radare2's patching capabilities, such as the wx command to write new values to memory.

In the case of the _ITM_addUserCommitAction symbol, which is a weakly referenced symbol of type NOTYPE, it may not be immediately clear how it is being used or if it is a potential vulnerability. However, investigating the context in which it is referenced and understanding its purpose in the program may reveal potential issues.

Similarly, the _ITM_deregisterTMCloneTable symbol may also be a potential target for patching or hardening, depending on the specific use case and potential vulnerabilities it may introduce. It is important to carefully analyze the use of all weakly referenced symbols in the binary and assess their potential impact on the security of the program.

Reverse Engineering Notes - _ITM_addUserCommitAction

 

what is _ITM_addUserCommitAction?

 
_ITM_addUserCommitAction is a function in the transactional memory ABI that GCC's runtime library, libitm, implements. It is used to register a function to be called when a transaction commits.

Transactional memory is a programming paradigm that allows multiple threads to work with shared data in a way that is similar to a database transaction. When a thread wants to modify shared data, it first enters a transaction, performs its operations, and then commits the transaction to make the changes visible to other threads.

The _ITM_addUserCommitAction function allows programmers to register a callback function that will be called after every transactional memory commit. This can be useful for performing certain operations that should occur only after the transaction has been completed successfully, such as logging or checkpointing.

Note that calls to _ITM_addUserCommitAction are normally emitted only when code is compiled with GCC's transactional memory support enabled, which requires a specific flag (-fgnu-tm) to be passed to the compiler. If you don't use this flag, you will usually not see this symbol in your program's imports or exports, although a library linked into the program may still reference it.

 how does it relate to reverse engineering?

In the context of reverse engineering, _ITM_addUserCommitAction can be seen as an indicator that the original program was compiled with GCC's transactional memory support enabled.

Knowing this information can be useful when analyzing a program, as it can provide insight into the original programmer's intent and the design of the program. It can also help to identify potential bugs or vulnerabilities that may be related to the use of transactional memory.

Additionally, the presence of the _ITM_addUserCommitAction symbol can be a useful clue when debugging or reverse engineering a program. By setting a breakpoint on this symbol or analyzing its usage in the program's code, a reverse engineer may be able to gain a better understanding of the program's behavior and control flow.

 

Saturday, February 25, 2023

Reverse Engineering Notes - TCP Dump

 

To capture the network packets of a specific process in Linux you can use a tool called tcpdump together with a tool that shows which ports the process uses.

The tcpdump command can be used to capture network traffic on a Linux system. By default, tcpdump captures all traffic on the selected interface. It has no option or filter for a process or thread ID: the -p flag only tells tcpdump not to put the interface into promiscuous mode, and capture filters match packet fields such as hosts, ports, and protocols. To capture traffic for a specific process, first find the ports it uses, for example with ss -tunp or lsof -a -i -p <process-id>, and then filter on those ports.

Here is an example command to capture network traffic on a port used by the process:

sudo tcpdump -i <interface> -p -s0 -w <output-file> '(udp or tcp) and port <port>'
  • <interface>: Name of the network interface to capture packets on, such as eth0 or wlan0.
  • <output-file>: Name of the file to write the captured packets to.
  • <port>: The port number used by the process to communicate over the network, as reported by ss or lsof.

Note that capturing network traffic requires root privileges, so you may need to run the command with sudo. Also, the tcpdump command may not be installed by default on your Linux system, so you may need to install it using your package manager before using it.

Reverse Engineering Notes - CVE-2022-26721

     xpc_object_t content = xpc_dictionary_get_value(req, "source");
    size_t count = xpc_array_get_count(content); //XENO: count SACI, based on number of array elements sent
    size_t *descriptors = malloc(sizeof(size_t) * 4 * count);
    size_t *accessBeginPointer = &descriptors[count * 0],
      *accessDataLength = &descriptors[count * 1],
      *mappedBaseAddress = &descriptors[count * 2],
      *mappedLength = &descriptors[count * 3];

    for(size_t i = 0; i < count; i++) {
      accessBeginPointer[i] = accessDataLength[i] =
      mappedBaseAddress[i] = mappedLength[i] = 0;

      xpc_object_t chunk = xpc_array_get_value(content, i);

      if(xpc_get_type(chunk) == XPC_TYPE_DATA) { /*...*/ }
      else if(xpc_get_type(chunk) == XPC_TYPE_SHMEM) {
        xpc_object_t map = xpc_array_get_value(chunk, 0);
        size_t offset = min(xpc_array_get_uint64(chunk, 1), 0xFFF), //XENO: offset ACID
        size = xpc_array_get_uint64(chunk, 2);                      //XENO: size ACID

        size_t mapped_address;
        size_t mapped_size = xpc_shmem_map(map, &mapped_address);   //XENO: mapped_size ACID

        size = min(size, mapped_size - offset);
        // ...
      }
    }
    // ...
    // cleanup
    for(size_t index = 0; index < count; index++) {
      if(mappedLength[index]) {
        munmap(
          mappedBaseAddress[index],
          mappedLength[index]
        );
      }
    }
    free(descriptors);

There are a few potential vulnerabilities or issues with the code:

  1. Integer overflow in the allocation: count comes from the number of array elements the sender supplies, and sizeof(size_t) * 4 * count is computed without an overflow check. Using sizeof(size_t) is correct here, because each slot holds a size_t, but a very large count would wrap the product and produce a buffer smaller than the four sub-arrays carved out of it. The result of malloc is also never checked for NULL.

  2. Integer underflow in the size clamp: offset is capped at 0xFFF, but mapped_size depends on the shared memory object the sender provides. If mapped_size is smaller than offset, mapped_size - offset wraps around to a huge value, so min(size, mapped_size - offset) leaves the attacker-controlled size unclamped.

  3. Missing error checking: the code calls xpc_shmem_map to map shared memory, but does not check the return value to ensure that the mapping was successful. If the mapping fails and 0 is returned, the subtraction above underflows for any nonzero offset, and mapped_address may be left uninitialized.

  4. Initialization and elided code: the code sets the accessBeginPointer, accessDataLength, mappedBaseAddress, and mappedLength values to zero for each array element, which is what lets the cleanup loop skip entries that were never mapped. How the slots are filled in afterwards is hidden behind the // ... comments, so that part cannot be judged from this excerpt.

  5. Resource lifetime: the cleanup code unmaps the shared memory and frees descriptors. The excerpt does not show the XPC objects being released, but values returned by the xpc_dictionary_get_value and xpc_array_get_value accessors are generally borrowed from their container rather than owned by the caller, so this is not necessarily a memory leak.
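The two arithmetic problems above can be guarded against with explicit checks. The sketch below is an illustration, not the vendor's fix: the helper names alloc_descriptors and clamp_to_mapping are hypothetical, and the parameters mirror the variables in the excerpt.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocates 4 * count zeroed size_t slots.
 * Returns NULL if count is zero, the size would overflow, or memory runs out. */
static size_t *alloc_descriptors(size_t count)
{
    if (count == 0 || count > SIZE_MAX / (4 * sizeof(size_t)))
        return NULL;
    return calloc(4 * count, sizeof(size_t));
}

/* Clamps a requested size to what is left in the mapping after offset.
 * Returns 0 when offset lies at or beyond the end of the mapping, which also
 * covers a failed mapping with mapped_size == 0. */
static size_t clamp_to_mapping(size_t size, size_t mapped_size, size_t offset)
{
    size_t remaining;

    if (offset >= mapped_size)
        return 0;
    remaining = mapped_size - offset;
    return size < remaining ? size : remaining;
}
```

With these checks, an oversized count is rejected before any memory is carved up, and a mapping smaller than offset yields a size of 0 instead of a wrapped-around value.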

Friday, February 24, 2023

Reverse Engineering Notes - CVE-2021-3608

 

What is QEMU?

QEMU (Quick Emulator) is an open-source virtualization software that allows running multiple operating systems on a single host machine. It emulates the underlying hardware environment for the guest operating system and provides various features like virtual machine management, live migration, and device emulation.

 What is para-virtualization?

Para-virtualization is a virtualization technique that allows guest operating systems to directly communicate with the hypervisor, thus providing better performance compared to full virtualization. In para-virtualization, the guest operating system is modified to use special hypercalls or API calls to communicate with the hypervisor.

RDMA (Remote Direct Memory Access) is a technology that allows data to be transferred directly from the memory of one computer to another over a network, without involving the CPU or operating system of the computers. This technology provides low-latency, high-bandwidth, and low-CPU overhead data transfers between systems.

CVE-2021-3608 is a security vulnerability in QEMU's emulation of VMware's paravirtual RDMA (pvrdma) device, affecting versions before 6.1.0. According to the public CVE description, the flaw is triggered while QEMU handles a guest write to the PVRDMA_REG_DSRHIGH register and can lead to the use of an uninitialized pointer, which may crash QEMU or cause undefined behavior. The main threat is to the availability of the host's QEMU process.

In this attack scenario, the guest virtual machine is the attacker: code running in the guest with enough privileges to program the pvrdma device writes to its registers and triggers the flaw in the QEMU process on the host. The para-virtualized device gives the guest a direct interface to code running in the hypervisor, which is what makes the bug reachable. This attack can be mitigated by updating QEMU to a fixed version or by not exposing the pvrdma device to untrusted guests.

Security experts often review source code to identify vulnerabilities and potential attack vectors. By analyzing the source code, they can identify potential security issues and provide recommendations on how to address them. This process is known as "source code analysis" or "static code analysis."

In the case of CVE-2021-3608, the flaw lies in the QEMU source code that emulates the pvrdma device. Reviewing that code shows how a guest-controlled register write can reach a pointer that was never initialized, which is the basis for patching the vulnerability or mitigating the risk of exploitation.




#include "qemu/osdep.h"

#include <glib/gprintf.h>
#include <utime.h>

#include "9p-iov-marshal.h"
#include "qemu/bswap.h"

/*These directives include the header files the code depends on.

qemu/osdep.h is a header file from the QEMU emulator that includes operating system dependent definitions and functions.

glib/gprintf.h is a header file from the GLib library, which provides many useful utilities and data structures; this header declares GLib's printf-style string formatting functions.

utime.h is a POSIX header file that declares utime(), which sets the access and modification times of a file.

9p-iov-marshal.h is a header file specific to the 9P protocol, which is a network protocol used for sharing files and other resources between computers.

qemu/bswap.h is a header file from the QEMU emulator that includes functions for byte swapping, which is a technique used to convert between big-endian and little-endian byte order.

These headers are necessary for the program to use the functions and data structures declared in them.*/



static ssize_t v9fs_packunpack(void *addr, struct iovec *sg, int sg_count,
                               size_t offset, size_t size, int pack)
{
    int i = 0;
    size_t copied = 0;
    size_t req_size = size;

/*This code is defining a function called v9fs_packunpack with several parameters, including a memory address, an array of input/output vectors, the number of vectors in the array, an offset, a size, and a flag called pack.
The function first initializes several variables including i, copied, and req_size. i is used as a counter variable for a loop, copied is used to keep track of how much data has been copied, and req_size stores the originally requested size, because size itself is decremented as data is copied.
The function's return type is ssize_t, a signed integer type used for byte counts that can also carry a negative value to signal an error.
The purpose of this function is to pack or unpack data from an array of input/output vectors. The pack flag indicates whether to pack or unpack the data. If pack is nonzero, the function packs the data from the memory address into the input/output vector array. If pack is 0, the function unpacks the data from the input/output vector array into the memory address.
The function uses a loop to iterate through each vector in the input/output vector array, copying data into or out of the vector as needed. The offset and size parameters are used to determine which portion of the data to copy. The copied variable is used to keep track of how much data has been copied so far. The function continues to copy data until the requested size has been copied or the end of the input/output vector array is reached.
Finally, the function returns the number of bytes that were copied.*/

    for (i = 0; size && i < sg_count; i++) {
        size_t len;
        if (offset >= sg[i].iov_len) {
            /* skip this sg */
            offset -= sg[i].iov_len;
            continue;
        } else {
            len = MIN(sg[i].iov_len - offset, size);
            if (pack) {
                memcpy(sg[i].iov_base + offset, addr, len);
            } else {
                memcpy(addr, sg[i].iov_base + offset, len);
            }
            size -= len;
            copied += len;
            addr += len;
            if (size) {
                offset = 0;
                continue;
            }
        }
    }

/*This code is a loop that iterates over each vector in the input/output vector array and copies data into or out of the vector as needed.
The loop iterates as long as there is still data to be copied (size is not zero) and the index i is less than the number of vectors in the array (sg_count).
For each vector, the loop first checks if the requested offset is greater than or equal to the length of the current vector (sg[i].iov_len). If it is, the loop skips this vector and moves on to the next one.
If the requested offset is less than the length of the current vector, the loop calculates the amount of data to copy (len) as the minimum of the remaining size and the remaining bytes in the current vector (sg[i].iov_len - offset).
If the pack flag is set to 1, the function copies data from the memory address to the input/output vector. If pack is 0, the function copies data from the input/output vector to the memory address.
The loop updates the copied variable to keep track of how much data has been copied so far, and updates the memory address to point to the next location to be copied to or from.
If there is still data to be copied, the loop sets the offset to 0 and continues to the next vector.
Once all data has been copied or the end of the input/output vector array has been reached, the function returns the number of bytes that were copied.*/

    if (copied < req_size) {
        /*
         * We copied less than requested size. error out
         */
        return -ENOBUFS;
    }
    return copied;
}

/*This code is the final part of the v9fs_packunpack function.
After the loop has finished copying data, the function checks whether the total amount copied (copied) is less than the requested size (req_size). If it is, the I/O vector array ran out before the whole request could be satisfied: there was either not enough room to pack the data or not enough data to unpack. In this case, the function returns -ENOBUFS ("no buffer space available").
Otherwise copied equals req_size (it can never exceed it, because each step copies at most the remaining size), and the function returns that byte count.*/
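The walk over the vector array can be tried outside QEMU. The following is a minimal sketch, not the QEMU code: `sg_copy` and its `to_iov` flag are hypothetical names, it returns -1 instead of -ENOBUFS, and it uses `unsigned char *` arithmetic rather than the GNU `void *` arithmetic used in the original.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Copy `size` bytes between a flat buffer and a scatter-gather list,
 * starting `offset` bytes into the list. A nonzero `to_iov` copies
 * flat -> list ("pack"); zero copies list -> flat ("unpack").
 * Returns the number of bytes copied, or -1 if the list is too short.
 */
static ssize_t sg_copy(void *flat, const struct iovec *sg, int sg_count,
                       size_t offset, size_t size, int to_iov)
{
    unsigned char *p = flat;
    size_t copied = 0;

    for (int i = 0; size > 0 && i < sg_count; i++) {
        if (offset >= sg[i].iov_len) {
            /* This segment lies entirely before the start offset. */
            offset -= sg[i].iov_len;
            continue;
        }
        size_t avail = sg[i].iov_len - offset;
        size_t len = avail < size ? avail : size;
        unsigned char *seg = (unsigned char *)sg[i].iov_base + offset;

        if (to_iov) {
            memcpy(seg, p, len);
        } else {
            memcpy(p, seg, len);
        }
        p += len;
        copied += len;
        size -= len;
        offset = 0; /* later segments are used from their beginning */
    }
    return size == 0 ? (ssize_t)copied : -1;
}
```

With two 4-byte segments, packing 5 bytes at offset 2 fills the last two bytes of the first segment and the first three bytes of the second; asking for 5 bytes at offset 6 fails because only 2 bytes remain.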

static ssize_t v9fs_unpack(void *dst, struct iovec *out_sg, int out_num,
size_t offset, size_t size)
{
    return v9fs_packunpack(dst, out_sg, out_num, offset, size, 0);
}

/*This code defines the v9fs_unpack function, which is a wrapper around the v9fs_packunpack function with the pack flag set to 0. This function is used to copy data from the input/output vector array (out_sg) to a memory location (dst), using the same offset and size parameters as v9fs_packunpack.
In other words, this function is simply calling v9fs_packunpack with the pack flag set to 0, which means that the function will copy data from the input/output vector array to the memory location. The function then returns the result of v9fs_packunpack, which is the total number of bytes that were copied.*/

ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
const void *src, size_t size)
{
    return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
}

/*This code defines the v9fs_pack function, which is another wrapper around the v9fs_packunpack function, but with the pack flag set to 1. This function is used to copy data from a memory location (src) to the input/output vector array (in_sg), using the same offset and size parameters as v9fs_packunpack.
In other words, this function is simply calling v9fs_packunpack with the pack flag set to 1, which means that the function will copy data from the memory location to the input/output vector array. The function then returns the result of v9fs_packunpack, which is the total number of bytes that were copied.*/


ssize_t v9fs_iov_vunmarshal(struct iovec *out_sg, int out_num, size_t offset,
                            int bswap, const char *fmt, va_list ap)
{
    int i;
    ssize_t copied = 0;
    size_t old_offset = offset;

    for (i = 0; fmt[i]; i++) {
        switch (fmt[i]) {
        case 'b': {
            uint8_t *valp = va_arg(ap, uint8_t *);
            copied = v9fs_unpack(valp, out_sg, out_num, offset, sizeof(*valp));
            break;
        }
        case 'w': {
            uint16_t val, *valp;
            valp = va_arg(ap, uint16_t *);
            copied = v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
            if (bswap) {
                *valp = le16_to_cpu(val);
            } else {
                *valp = val;
            }
            break;
        }
        case 'd': {
            uint32_t val, *valp;
            valp = va_arg(ap, uint32_t *);
            copied = v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
            if (bswap) {
                *valp = le32_to_cpu(val);
            } else {
                *valp = val;
            }
            break;
        }
        case 'q': {
            uint64_t val, *valp;
            valp = va_arg(ap, uint64_t *);
            copied = v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
            if (bswap) {
                *valp = le64_to_cpu(val);
            } else {
                *valp = val;
            }
            break;
        }
        case 's': {
            V9fsString *str = va_arg(ap, V9fsString *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "w", &str->size);
            if (copied > 0) {
                offset += copied;
                str->data = g_malloc(str->size + 1);
                copied = v9fs_unpack(str->data, out_sg, out_num, offset,
                                     str->size);
                if (copied >= 0) {
                    str->data[str->size] = 0;
                } else {
                    v9fs_string_free(str);
                }
            }
            break;
        }
        case 'Q': {
            V9fsQID *qidp = va_arg(ap, V9fsQID *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "bdq", &qidp->type, &qidp->version,
                                        &qidp->path);
            break;
        }
        case 'S': {
            V9fsStat *statp = va_arg(ap, V9fsStat *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "wwdQdddqsssssddd",
                                        &statp->size, &statp->type,
                                        &statp->dev, &statp->qid,
                                        &statp->mode, &statp->atime,
                                        &statp->mtime, &statp->length,
                                        &statp->name, &statp->uid,
                                        &statp->gid, &statp->muid,
                                        &statp->extension,
                                        &statp->n_uid, &statp->n_gid,
                                        &statp->n_muid);
            break;
        }
        case 'I': {
            V9fsIattr *iattr = va_arg(ap, V9fsIattr *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "ddddqqqqq",
                                        &iattr->valid, &iattr->mode,
                                        &iattr->uid, &iattr->gid,
                                        &iattr->size, &iattr->atime_sec,
                                        &iattr->atime_nsec,
                                        &iattr->mtime_sec,
                                        &iattr->mtime_nsec);
            break;
        }
        default:
            g_assert_not_reached();
        }
        if (copied < 0) {
            return copied;
        }
        offset += copied;
    }

    return offset - old_offset;
}

/*This function is part of QEMU's implementation of the 9P protocol (the protocol itself originated in the Plan 9 operating system). It unmarshals data from a series of iovec buffers according to a format string.
The function takes a pointer to an array of iovec structures, the number of iovecs in the array, an offset to the beginning of the data to be unmarshalled, a bswap flag indicating whether multi-byte values should be converted from little-endian wire order to the host's byte order, a format string specifying the layout of the data, and a va_list of pointers that receive the decoded values.
The function loops through the characters in the format string, switching on each character to determine how to unmarshal the next piece of data. Fixed-size integers ('b', 'w', 'd', 'q') are copied out of the iovec buffers with v9fs_unpack. Compound types ('s', 'Q', 'S', 'I') are decoded by calling v9fs_iov_unmarshal again with a nested format string. If the bswap flag is set, integers are converted with le16_to_cpu, le32_to_cpu, or le64_to_cpu.
The function returns the total number of bytes unmarshalled from the iovec buffers. If an error occurs during unmarshalling, the function returns a negative error code.*/
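To make the format-string idea concrete, here is a small sketch that decodes from a flat byte buffer instead of an iovec array. `demo_unmarshal` is a hypothetical helper, it supports only 'b', 'w', and 'd', and it always decodes little-endian input byte by byte, which gives the same result on any host without needing a bswap flag.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode little-endian fields from buf according to fmt:
 *   'b' = uint8_t, 'w' = uint16_t, 'd' = uint32_t.
 * Each format character consumes one pointer argument.
 * Returns the number of bytes consumed, or -1 on truncated input
 * or an unknown format character.
 */
static long demo_unmarshal(const uint8_t *buf, size_t buflen,
                           const char *fmt, ...)
{
    va_list ap;
    size_t off = 0;
    long ret = 0;

    va_start(ap, fmt);
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        size_t need = fmt[i] == 'b' ? 1 :
                      fmt[i] == 'w' ? 2 :
                      fmt[i] == 'd' ? 4 : 0;
        if (need == 0 || buflen - off < need) {
            ret = -1;
            break;
        }
        /* Assemble the value least-significant byte first. */
        uint32_t v = 0;
        for (size_t k = 0; k < need; k++) {
            v |= (uint32_t)buf[off + k] << (8 * k);
        }
        switch (fmt[i]) {
        case 'b': *va_arg(ap, uint8_t *) = (uint8_t)v; break;
        case 'w': *va_arg(ap, uint16_t *) = (uint16_t)v; break;
        case 'd': *va_arg(ap, uint32_t *) = v; break;
        default: break;
        }
        off += need;
    }
    va_end(ap);
    return ret < 0 ? ret : (long)off;
}
```

Decoding the seven bytes 07 34 12 78 56 34 12 with the format "bwd" yields 0x07, 0x1234, and 0x12345678. The bounds check before every read is the part that matters most when auditing real unmarshalling code.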


ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
                           int bswap, const char *fmt, ...)
{
    ssize_t ret;
    va_list ap;

    va_start(ap, fmt);
    ret = v9fs_iov_vunmarshal(out_sg, out_num, offset, bswap, fmt, ap);
    va_end(ap);

    return ret;
}

/*This function is a convenience wrapper around v9fs_iov_vunmarshal(). It takes a variable number of arguments, each a pointer to the location that receives one value unmarshalled from the out_sg scatter-gather list, in the order given by the format string fmt. The arguments are passed to v9fs_iov_vunmarshal() using the va_list interface.
The bswap parameter specifies whether multi-byte values are converted from little-endian to host byte order. If it is non-zero, the conversion is performed (on a little-endian host the conversion leaves the value unchanged).
The function returns the number of bytes unmarshalled, or a negative error code if an error occurs.*/


ssize_t v9fs_iov_vmarshal(struct iovec *in_sg, int in_num, size_t offset,
                          int bswap, const char *fmt, va_list ap)
{
    int i;
    ssize_t copied = 0;
    size_t old_offset = offset;

    for (i = 0; fmt[i]; i++) {
        switch (fmt[i]) {
        case 'b': {
            uint8_t val = va_arg(ap, int);
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 'w': {
            uint16_t val = va_arg(ap, int);
            if (bswap) {
                val = cpu_to_le16(val);
            }
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 'd': {
            uint32_t val = va_arg(ap, uint32_t);
            if (bswap) {
                val = cpu_to_le32(val);
            }
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 'q': {
            uint64_t val = va_arg(ap, uint64_t);
            if (bswap) {
                val = cpu_to_le64(val);
            }
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 's': {
            V9fsString *str = va_arg(ap, V9fsString *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
                                      "w", str->size);
            if (copied > 0) {
                offset += copied;
                copied = v9fs_pack(in_sg, in_num, offset, str->data, str->size);
            }
            break;
        }
        case 'Q': {
            V9fsQID *qidp = va_arg(ap, V9fsQID *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap, "bdq",
                                      qidp->type, qidp->version,
                                      qidp->path);
            break;
        }
        case 'S': {
            V9fsStat *statp = va_arg(ap, V9fsStat *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
                                      "wwdQdddqsssssddd",
                                      statp->size, statp->type, statp->dev,
                                      &statp->qid, statp->mode, statp->atime,
                                      statp->mtime, statp->length,
                                      &statp->name,
                                      &statp->uid, &statp->gid, &statp->muid,
                                      &statp->extension, statp->n_uid,
                                      statp->n_gid, statp->n_muid);
            break;
        }
        case 'A': {
            V9fsStatDotl *statp = va_arg(ap, V9fsStatDotl *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
                                      "qQdddqqqqqqqqqqqqqqq",
                                      statp->st_result_mask,
                                      &statp->qid, statp->st_mode,
                                      statp->st_uid, statp->st_gid,
                                      statp->st_nlink, statp->st_rdev,
                                      statp->st_size, statp->st_blksize,
                                      statp->st_blocks, statp->st_atime_sec,
                                      statp->st_atime_nsec,
                                      statp->st_mtime_sec,
                                      statp->st_mtime_nsec,
                                      statp->st_ctime_sec,
                                      statp->st_ctime_nsec,
                                      statp->st_btime_sec,
                                      statp->st_btime_nsec, statp->st_gen,
                                      statp->st_data_version);
            break;
        }
        default:
            g_assert_not_reached();
        }
        if (copied < 0) {
            return copied;
        }
        offset += copied;
    }

    return offset - old_offset;
}

/*This is a C function called v9fs_iov_vmarshal which marshals data according to a given format string fmt and writes it into an array of struct iovec called in_sg, starting at a given offset. The function takes a va_list ap which contains the values to be marshaled.
The function iterates over the format string character by character and, for each character, marshals the corresponding value from the argument list ap. The format string characters are as follows:
b: marshals an 8-bit unsigned integer (fetched from ap as int, because of default argument promotion)
w: marshals a 16-bit unsigned integer (also fetched as int)
d: marshals a 32-bit unsigned integer
q: marshals a 64-bit unsigned integer
s: marshals a V9fsString structure as a 16-bit size field followed by the string's bytes
Q: marshals a V9fsQID structure, which contains three fields: type, version, and path
S: marshals a V9fsStat structure, which contains fields for size, type, device ID, QID, mode, access time, modification time, length, name, owner user ID, group ID, modify user ID, extension, and the numeric n_uid, n_gid, and n_muid
A: marshals a V9fsStatDotl structure, which contains fields for result mask, QID, mode, owner user ID, group ID, number of links, device ID, file size, block size, number of blocks, access time, modification time, status change time (ctime), birth time (btime), generation number, and data version
The offset parameter determines where in the in_sg array the marshaled data is written. If the marshaling is successful, the function returns the number of bytes written to the in_sg array. If an error occurs during marshaling, the function returns a negative value.*/
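The use of va_arg(ap, int) for the 'b' and 'w' cases is worth seeing in isolation. The sketch below writes into a flat buffer rather than an iovec array; `demo_marshal` and `put_le` are hypothetical names, and the output is always little-endian regardless of host.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low n bytes of v at dst, least-significant byte first. */
static void put_le(uint8_t *dst, uint64_t v, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        dst[k] = (uint8_t)(v >> (8 * k));
    }
}

/*
 * Encode the variadic arguments into buf according to fmt:
 *   'b' = 8-bit, 'w' = 16-bit, 'd' = 32-bit, 'q' = 64-bit.
 * Returns the number of bytes written, or -1 if buf is too small
 * or fmt contains an unknown character.
 */
static long demo_marshal(uint8_t *buf, size_t buflen, const char *fmt, ...)
{
    va_list ap;
    size_t off = 0;
    long ret = 0;

    va_start(ap, fmt);
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        uint64_t v = 0;
        size_t n = 0;

        switch (fmt[i]) {
        /*
         * uint8_t and uint16_t arguments are promoted to int when
         * passed through "...", so they must be fetched as int.
         */
        case 'b': v = (uint8_t)va_arg(ap, int); n = 1; break;
        case 'w': v = (uint16_t)va_arg(ap, int); n = 2; break;
        case 'd': v = va_arg(ap, uint32_t); n = 4; break;
        case 'q': v = va_arg(ap, uint64_t); n = 8; break;
        default: break;
        }
        if (n == 0 || buflen - off < n) {
            ret = -1;
            break;
        }
        put_le(buf + off, v, n);
        off += n;
    }
    va_end(ap);
    return ret < 0 ? ret : (long)off;
}
```

Callers should cast 'd' and 'q' arguments explicitly, for example (uint64_t)1, because a bare integer literal is passed as int and would not match va_arg(ap, uint64_t).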

ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
                         int bswap, const char *fmt, ...) {
    ssize_t ret;
    va_list ap;

    va_start(ap, fmt);
    ret = v9fs_iov_vmarshal(in_sg, in_num, offset, bswap, fmt, ap);
    va_end(ap);

    return ret;
}

/*v9fs_iov_marshal is a convenience function that provides a simpler interface to the v9fs_iov_vmarshal function. It takes a variable number of arguments, which are passed to v9fs_iov_vmarshal along with a format string. The format string specifies the types of the arguments and the order in which they are passed.
Inside v9fs_iov_marshal, the va_list type is used to work with the variable argument list. va_start is called to initialize ap with the address of the first argument following fmt. v9fs_iov_vmarshal is then called with in_sg, in_num, offset, bswap, fmt, and ap. Finally, va_end is called to clean up the argument list.
The function returns the result of v9fs_iov_vmarshal, which is the number of bytes that were copied into the in_sg buffer, or a negative error code if an error occurred.*/
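The same "variadic wrapper around a va_list worker" pairing appears throughout C code, so it helps to recognize it in a smaller setting. This sketch uses the standard vsnprintf as the worker; `format_msg` and `vformat_msg` are hypothetical names chosen for illustration.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Worker: takes an already-started va_list, like v9fs_iov_vmarshal. */
static int vformat_msg(char *out, size_t outlen, const char *fmt, va_list ap)
{
    return vsnprintf(out, outlen, fmt, ap);
}

/* Wrapper: collects "..." into a va_list, like v9fs_iov_marshal. */
static int format_msg(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);                       /* begin after the last named parameter */
    ret = vformat_msg(out, outlen, fmt, ap);
    va_end(ap);                              /* every va_start needs a matching va_end */

    return ret;
}
```

Splitting the work this way lets other variadic functions reuse the worker by forwarding their own va_list, which is what the recursive 's', 'Q', and 'S' cases rely on indirectly.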

The code has a flaw in that the ring->pages array of pointers is not completely initialized. Specifically, if the tbl array contains a NULL pointer, then the corresponding ring->pages entry will not be initialized. This could cause issues later on if the uninitialized entry is used.
Additionally, there is no check for whether npages is a valid value or not. If npages is zero or negative, then ring->pages will be allocated with an invalid size, which could lead to memory issues or crashes.

OS Notes - POSIX

 

What is POSIX?

POSIX stands for "Portable Operating System Interface"; the trailing X reflects its Unix heritage. It is a family of standards for operating systems that are designed to be compatible with Unix. The POSIX standards were developed by the IEEE (Institute of Electrical and Electronics Engineers) and are maintained today by the Austin Group, a joint working group of the IEEE, The Open Group, and ISO/IEC.

The goal of POSIX is to provide a standardized interface for Unix-like operating systems, so that applications can be written to be portable across different Unix-based platforms. The standards define a set of APIs (application programming interfaces), command-line interfaces, and utilities for programming and managing operating systems.

Some examples of POSIX standards include:

  • POSIX.1: This standard defines the basic system interface, including APIs for file I/O, process control, signals, and more.
  • POSIX.2: This standard defines the shell command language and the standard utilities, such as awk, grep, and sed.
  • POSIX.4: This standard (later known as POSIX.1b) defines APIs for real-time programming, including clocks, timers, and message queues.

POSIX compatibility is common among Unix-like operating systems. macOS and several commercial Unix systems are formally certified, while Linux is largely compatible without being certified. C libraries and toolchains such as glibc and the GNU Compiler Collection (GCC) expose the POSIX interfaces. This makes it easier to write portable software that can run on multiple operating systems with little or no modification.

  How does POSIX relate to reverse engineering?

POSIX itself is not directly related to reverse engineering, as it is a set of standards for operating systems and software development, rather than a tool or technique for reverse engineering. However, reverse engineering can sometimes involve analyzing and understanding the behavior of programs that were developed according to POSIX standards.

In the context of reverse engineering, knowledge of POSIX can be useful in a few ways. For example:

  1. Understanding system calls: Reverse engineering often involves analyzing the behavior of programs at a low level, including system calls. POSIX specifies C library functions such as open(), read(), and fork() rather than the kernel system calls themselves, but on Unix-like systems many of these functions are thin wrappers around a system call of the same name. Knowing them helps in understanding how a program interacts with the underlying operating system.

  2. Identifying function signatures: When reverse engineering a program, it can be helpful to know the expected signatures of library functions or system calls. POSIX defines standard function signatures for many of its APIs, so knowing the POSIX conventions can make it easier to identify the functions being used by a program.

  3. Cross-platform analysis: One of the goals of POSIX is to provide a standard interface for Unix-like operating systems, which can make it easier to write cross-platform software. When analyzing a program that was developed using POSIX standards, it may be helpful to understand how it behaves on different operating systems that support POSIX.

Overall, while POSIX itself is not a tool or technique for reverse engineering, understanding its conventions and standards can be helpful in analyzing and understanding programs that were developed for Unix-like operating systems.
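As a small illustration of the kind of code this knowledge helps with, the sketch below uses only POSIX calls (pipe, write, read, close). In a disassembly these usually show up as imported symbols with exactly these names, which makes the data flow easy to follow. `pipe_roundtrip` is a hypothetical helper written for this note.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send msg through an anonymous pipe and read it back into out.
 * Returns the number of bytes read, or -1 on failure.
 * Assumes msg is short enough to fit in the pipe buffer.
 */
static ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int fds[2]; /* fds[0] is the read end, fds[1] is the write end */
    size_t len = strlen(msg);
    ssize_t n = -1;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (write(fds[1], msg, len) == (ssize_t)len) {
        n = read(fds[0], out, outlen);
    }
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

The argument order seen at each call site (file descriptor, buffer, length) matches the POSIX prototypes, which is what makes signature knowledge useful when labeling unknown functions.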

Linux Kernel Notes - Namespaces

 Namespaces

In Linux, namespaces are a feature of the kernel that allow for process isolation and resource control. Namespaces provide a way to create a separate context for a group of processes, so that they can have their own isolated view of system resources like process IDs, network interfaces, mount points, and more.

The concept of namespaces can be a bit abstract, so let's take a look at a few examples to see how they work in practice.

  1. PID namespace: Each process on a Linux system has a unique process ID (PID) assigned to it. The PID namespace provides a way to create a separate view of the PID space for a group of processes. This can be useful for process isolation, containerization, and resource control.

For example, if you start a new process in a new PID namespace, it becomes PID 1 in that namespace and sees only the processes that live in that namespace (itself and its descendants). The process is still visible from the parent namespace, where it has a different PID.

  2. Network namespace: The network namespace provides a way to create a separate network stack for a group of processes. This can be useful for network isolation and virtualization.

For example, if you start a new process in a new network namespace, it will have its own network interfaces, routing tables, and firewall rules that are separate from those in the parent namespace. A new network namespace starts with only a loopback interface, so the process has no network path to the outside until an interface (for example one end of a veth pair) is moved into the namespace and configured.

  3. Mount namespace: The mount namespace provides a way to create a separate view of the file system for a group of processes. This can be useful for file system isolation and virtualization.

For example, if you start a new process in a new mount namespace, it begins with a copy of the parent's mount points. Mounts and unmounts made inside the new namespace are then not visible in the parent namespace, and vice versa, unless the mount points are configured for shared propagation. A new mount namespace does not hide existing files by itself; restricting what the process can see additionally requires changing its root file system.

Overall, namespaces provide a powerful tool for process isolation and resource control in Linux. By using namespaces, you can create a secure and isolated environment for running processes, without affecting other processes running on the same system.
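One way to observe namespaces from a program is through the symbolic links under /proc/self/ns on Linux, whose targets look like "pid:[...]" and identify the namespace the process belongs to. The sketch below reads one of those links; `ns_identity` is a hypothetical helper, and it simply fails on systems without that /proc layout.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the identity of one of the calling process's namespaces
 * (for example "pid", "net", or "mnt") from /proc/self/ns/<name>.
 * Returns 0 and fills out on success, -1 on failure.
 */
static int ns_identity(const char *ns_name, char *out, size_t outlen)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/self/ns/%s", ns_name);

    if (n < 0 || (size_t)n >= sizeof path || outlen == 0) {
        return -1;
    }
    /* readlink() does not NUL-terminate, so leave room for the terminator. */
    ssize_t len = readlink(path, out, outlen - 1);
    if (len < 0) {
        return -1;
    }
    out[len] = '\0';
    return 0;
}
```

Two processes that report the same string for a given namespace type share that namespace, so comparing the output inside and outside a container is a quick way to confirm which namespaces actually differ.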

 Useful Commands

Here are some useful commands for working with Linux namespaces:

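  1. unshare: This command allows you to create a new namespace and run a command in that namespace. For example, the command unshare --pid --fork --mount-proc bash will create a new PID namespace and start a new bash shell in that namespace. The --fork option is needed because only children of the unsharing process enter the new PID namespace, and --mount-proc mounts a fresh /proc so that tools like ps show the new view. This usually requires root privileges.

  2. ip netns: This command is used to manage network namespaces. You can create a new network namespace with the command ip netns add <name>, and then run commands in that namespace with the command ip netns exec <name> <command>. For example, the command ip netns add myns; ip netns exec myns ping 8.8.8.8 will create a new network namespace called "myns" and then run the ping command in that namespace. The ping is expected to fail with a "network is unreachable" error until an interface and a route have been set up inside the namespace.

  3. mount --make-private: This command does not create a mount namespace (use unshare --mount for that). It changes the propagation type of an existing mount point. Running mount --make-private <mountpoint> marks that mount point as private, so that mount and unmount events under it are not propagated to or from other mount namespaces.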

  4. nsenter: This command allows you to enter an existing namespace and run a command in that namespace. For example, the command nsenter --pid=/proc/1234/ns/pid bash will enter the PID namespace of process 1234 and start a new bash shell in that namespace.

  5. ip link set: This command is used to manage network interfaces. You can use the ip link set <interface> netns <name> command to move a network interface to a different network namespace. For example, the command ip link set eth0 netns myns will move the eth0 interface to the network namespace called "myns".

These are just a few of the many commands available for working with Linux namespaces. The exact commands and syntax may vary depending on your distribution and version of Linux, so it's always a good idea to consult the documentation for your specific system.

Security

The security of a namespace depends on the specific use case and configuration. Here are a few things to consider when evaluating the security of a namespace:

  1. Isolation: The primary purpose of namespaces is to provide isolation between processes and system resources. A vulnerable namespace is one where processes in the namespace can access resources that they should not be able to access, or where resources outside the namespace can be accessed by processes inside the namespace.

  2. Escalation: In some cases, a process running in a namespace may be able to escape the namespace and gain access to resources outside the namespace. This can happen if there are bugs or vulnerabilities in the kernel or in the tools used to manage namespaces.

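  3. Privileges: Most namespaces require special privileges to create or modify. For example, creating a new network namespace requires the CAP_SYS_ADMIN capability. User namespaces are the exception: on many systems they can be created by unprivileged users, and a process gains capabilities inside the user namespace it creates, which exposes more kernel code to untrusted users. A vulnerable namespace is one where untrusted users or processes are able to create or modify namespaces, or where namespaces are created with more privileges than necessary.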

  4. Configuration: The security of a namespace also depends on how it is configured. For example, a network namespace with no firewall rules may be more vulnerable to attacks than one with strict firewall rules. Similarly, a mount namespace with read-write access to the host file system may be more vulnerable than one with read-only access.

To determine if a namespace is vulnerable, you should evaluate the above factors for your specific use case and configuration. You can also perform security assessments or penetration testing to identify vulnerabilities and potential attack vectors. It is important to regularly monitor and update your system and namespaces to ensure that they remain secure.

Reverse Engineering Notes - C Language Calling Conventions and Syntax Overview

 


Calling Conventions

A calling convention is a set of rules that determine how parameters are passed to a function, how the return value is handled, and how the function is called. There are several different calling conventions used in C programming, but two of the most common on 32-bit x86 are the cdecl and stdcall conventions.

cdecl Calling Convention

The cdecl calling convention is the default for C compilers targeting 32-bit x86. In this convention, the caller is responsible for cleaning up the stack after the function call. Parameters are pushed onto the stack in reverse (right-to-left) order, with the rightmost parameter being pushed first. Integer and pointer return values are stored in the EAX register. On x86-64 the cdecl/stdcall distinction no longer applies: the System V AMD64 and Microsoft x64 conventions pass the first arguments in registers and return integer values in RAX.

Here is an example of using the cdecl calling convention:

c
int add_numbers(int x, int y);

int main() {
    int result = add_numbers(2, 3);
    return 0;
}

int add_numbers(int x, int y) {
    int sum = x + y;
    return sum;
}

In this example, the add_numbers() function takes two integer parameters and returns their sum. The main() function calls add_numbers() with the values 2 and 3 and stores the result in the result variable.

stdcall Calling Convention

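The stdcall calling convention is used by most Win32 API functions on 32-bit x86. In this convention, the callee is responsible for cleaning up the stack after the function call, typically with a ret instruction that takes the number of bytes to pop. Parameters are pushed onto the stack in right-to-left order, with the rightmost parameter being pushed first, the same as in cdecl. Integer and pointer return values are stored in the EAX register. On 64-bit Windows the stdcall keyword is accepted but ignored, and the Microsoft x64 convention is used instead, with integer return values in RAX.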

Here is an example of using the stdcall calling convention:

c
#include <windows.h>

int main() {
    MessageBox(NULL, "Hello, World!", "Message", MB_OK);
    return 0;
}

In this example, the MessageBox() function is called with the parameters NULL, "Hello, World!", "Message", and MB_OK. The function displays a message box with the specified text and returns an integer value indicating which button was clicked.

Programming Syntax

C programming syntax includes a variety of constructs for declaring variables, defining functions, and controlling program flow. Here are some examples of basic C programming syntax:

Variable Declaration

Variables can be declared using the following syntax:

c
type variable_name;

For example, to declare an integer variable named x, use the following code:

c
int x;

Function Definition

Functions can be defined using the following syntax:

c
return_type function_name(parameter_type parameter_name) {
    // function body
}

For example, to define a function named add_numbers() that takes two integer parameters and returns their sum, use the following code:

c
int add_numbers(int x, int y) {
    int sum = x + y;
    return sum;
}

Control Structures

C provides several control structures for controlling program flow, including if statements, for loops, and while loops. Here are some examples of control structures:

if Statement

c
if (condition) {
    // code to execute if condition is true
} else {
    // code to execute if condition is false
}

For example:

c
int x = 5;
if (x > 0) {
    printf("x is positive");
} else if (x < 0) {
    printf("x is negative");
} else {
    printf("x is zero");
}
 

Variables 

In C, there are several ways to initialize variables, including:

  1. Initialization at declaration: This is the most common way to initialize a variable in C. You can declare a variable and initialize it with a value in a single line of code, like this:

    c
    int x = 10;

    In this example, we declare an integer variable "x" and initialize it with the value 10.

  2. Assignment after declaration: You can also give a variable its first value using the assignment operator "=" after it has been declared. Strictly speaking this is assignment rather than initialization, and a local variable holds an indeterminate value until the assignment runs:

    c
    int x; x = 10;

    In this example, we declare an integer variable "x" and assign the value 10 to it after it has been declared.

  3. Initialization with a compound literal: A compound literal (available since C99) is an expression that creates an unnamed object with a specific type and value. You can use a compound literal to initialize a variable, like this:

    c
    int x = (int){10};

    In this example, we use a compound literal to create an integer object with the value 10, and then use that value to initialize the variable "x".

  4. Initialization with an initializer list: An initializer list is a comma-separated list of values enclosed in braces "{}". You can use an initializer list to initialize an array or struct, like this:

    c
    int arr[3] = {1, 2, 3};

    In this example, we declare an integer array "arr" with three elements and initialize it with the values 1, 2, and 3.

  5. Static initialization: In C, variables with static storage duration are automatically initialized to zero if no other initializer is specified.

    c
    static int x; // automatically initialized to zero

    In this example, we declare a static integer variable "x" which is automatically initialized to zero.

These are some of the most common ways to initialize variables in C. By using these techniques, you can ensure that your variables are properly initialized before they are used in your program.

    Static Int

    In C, the keyword "static" can be used to modify the storage class of a variable. When used with an int (or any other data type), "static int" creates a variable that is only visible within the function or file where it is declared, and retains its value between function calls.

    Here are some key characteristics of a static int variable:

    1. Scope: A static int variable is only visible within the function or file where it is declared. It cannot be accessed from outside that scope.

    2. Lifetime: Unlike a normal (automatic) variable, which is created when the function is called and destroyed when the function returns, a static int variable is created when the program starts and persists throughout the program's lifetime. This means that its value is retained between function calls.

    3. Initialization: A static int variable is initialized to zero by default. If it is explicitly initialized to another value, that value will be retained between function calls.

    Here is an example of using a static int variable in a function:

    c
    #include <stdio.h>

    void myFunction() {
        static int count = 0; // declare and initialize static variable
        count++;              // increment the count
        printf("Count = %d\n", count); // print the current count
    }

    int main() {
        myFunction(); // prints "Count = 1"
        myFunction(); // prints "Count = 2"
        myFunction(); // prints "Count = 3"
        return 0;
    }

    In this example, the static int variable "count" is declared inside the "myFunction" function and initialized to zero. Each time the function is called, the value of "count" is incremented and printed to the console. Because "count" is a static variable, its value is retained between function calls, so the output will be "Count = 1", "Count = 2", and "Count = 3".


    Reverse Engineering Notes - Pattern Recognition

     

    This is a list of common function patterns that you might encounter when examining assembly code:

    1. Prologue and epilogue: Many functions in assembly language will start with a prologue and end with an epilogue that set up and tear down the function's stack frame. On x86 the return address is already on the stack when the function starts, because the call instruction pushed it. The prologue usually saves the previous frame pointer, sets up the new one, and reserves stack space for local variables. The epilogue releases that space, restores the saved frame pointer, and returns to the caller with ret.

    2. Argument handling: In many cases, function arguments are passed on the stack or in registers. You can often identify the argument handling code by looking for instructions that access memory locations where the arguments are stored.

    3. Function call: When one function calls another, it will often pass arguments in registers or on the stack, then jump to the called function's address. You can identify a function call by looking for instructions that push arguments onto the stack, load the address of the called function, and perform a jump or call operation.

    4. Looping constructs: Functions that include loops will often include conditional jump instructions that test a counter or flag value and jump back to the start of the loop if the condition is met. You can identify looping constructs by looking for instructions that include jump or conditional jump operations.

    5. Memory operations: Functions that manipulate memory will often include instructions that load or store values from or to specific memory addresses. You can identify memory operations by looking for instructions that include load or store operations, as well as by looking at the memory addresses being accessed.

    6. Arithmetic and logical operations: Functions that perform arithmetic or logical operations on data will often include instructions that add, subtract, multiply, divide, or perform logical operations like AND, OR, and XOR. You can identify arithmetic and logical operations by looking for instructions that include these operations, as well as by looking at the registers or memory locations being used as operands.

    7. Control flow operations: Functions that perform conditional branching or other control flow operations will often include instructions that compare values, test flags, or perform other operations to determine which code path to take. You can identify control flow operations by looking for instructions that include conditional jumps or other control flow operations.

    SIDE NOTE: These are just a few examples of the kinds of patterns you might encounter when examining assembly code. The specific patterns you'll encounter will depend on the programming language, platform, and compiler used to generate the code.

    Thursday, February 23, 2023

    Reverse Engineering Notes: Heap Grooming

    Uninitialized data access in C can also occur when a program reads or writes memory that has not been properly initialized or allocated, such as dynamically allocated memory on the heap. Heap grooming is a technique for deliberately shaping the heap layout, through a chosen sequence of allocations and frees, so that a bug in the program ends up touching memory the attacker cares about.

    Here's an example of how heap grooming can lead to uninitialized data access in C:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>  // needed for strcpy

    int main() {
        char *buffer = (char *)malloc(10);
        char *secret = (char *)malloc(10);

        printf("Enter a secret: ");
        gets(secret);            // Unsafe function that can write beyond secret boundaries (removed in C11)
        strcpy(buffer, secret);  // Copy secret to buffer with no length check

        printf("You entered: %s\n", buffer);

        free(buffer);
        free(secret);
        return 0;
    }

    In this example, the program declares two character pointers buffer and secret, and dynamically allocates 10 bytes of memory for each of them using the malloc function. It then reads a secret string from the user using the unsafe gets function, and copies the secret string to the buffer using the strcpy function. Finally, the program prints the contents of the buffer, and frees the memory allocated for both buffer and secret.

    However, this program contains a heap buffer overflow, the kind of bug that heap grooming makes exploitable: gets does not limit how much it writes into the 10-byte secret allocation, and strcpy does not check the size of the destination buffer. An attacker can craft an input string that is longer than the secret buffer and that includes a specific pattern of values, which then overwrites adjacent memory locations on the heap. For example, the attacker might choose the bytes that land just after the secret buffer so that they overwrite whatever the allocator has placed there. Which object that is depends on the allocator and on earlier allocations; the diagram below assumes that buffer follows secret.

    In this way, heap grooming can be used to cause uninitialized data access and other security vulnerabilities in C programs. Here is a diagram that illustrates the memory layout of the program after a heap grooming attack:

    +--------------------------+
    |       Input String       |
    +--------------------------+
    |          Secret          |
    |                          |
    +--------+--------+--------+
    |   A    |   B    |   C    |  Memory just after Secret
    +--------+--------+--------+
    |        0x41414141        |  A's overwritten value
    +--------------------------+
    |        0x42424242        |  B's overwritten value
    +--------------------------+
    |        0x43434343        |  C's overwritten value
    +--------------------------+
    |          Buffer          |
    |                          |
    +--------------------------+
    |    Secret's contents     |
    +--------------------------+

    In this diagram, the input string from the attacker is represented by the "Input String" box at the top. The "Secret" box is the secret allocation, which has a size of 10 bytes. A, B, and C stand for the memory immediately after the secret buffer on the heap; each is overwritten with a value taken from the input string (0x41414141 is "AAAA", 0x42424242 is "BBBB", and 0x43434343 is "CCCC"). The diagram places the buffer allocation after them, so a long enough input reaches it as well. This ordering is illustrative: where buffer actually sits relative to secret depends on the allocator.

    When the strcpy function is called, it copies the contents of the secret buffer into buffer. Because the input is longer than 10 bytes, the copy also runs past the end of the destination and overwrites adjacent memory locations on the heap with the pattern of values included in the input string. This can cause uninitialized data access and other security vulnerabilities, depending on what the overwritten memory locations are used for.
