Emotet seen distributing bloated files to evade detection
The SonicWall Capture Labs threat research team has once again observed a surge in Emotet. This notorious malware, which heavily targets large organizations, uses tactics and functionality similar to those observed in past variants. Originally a banking Trojan, Emotet has evolved into a dropper-type class of malware. It has been spreading through malicious Microsoft Office documents sent via email. Initially it used Excel 4.0 (XLM) macros; currently it uses VBA macros to compromise victims' machines. This time it also uses a very large file (approx. 538 MB) to evade scanners. The file size is inflated by padding with extra null bytes that serve no functional purpose.
The initial infection vector is a phishing email with an Office document attachment. The document contains obfuscated VBA macros. By default, the document opens in Protected View with macros disabled. To get around this, Emotet documents include an image with instructions (Figure 1) asking the user to enable content. Once the user enables content, the macros execute in the background.
Fig 1: Office document in Protected View with macros disabled
Fig 2: Obfuscated macro in the Office document file
Inside the document file there are multiple macros with obfuscated function and variable names. By debugging the macros, we were able to de-obfuscate them and extract all the URLs used to download the payload.
Fig 3: After de-obfuscating the macros, we can see all the URLs that download the payload.
URLs inside document file:
These URLs download an archive file that contains a DLL file.
Fig 4: DLL after extracting the archive.
Fig 5: Appended null bytes.
This file is an Emotet DLL bloated by appending null bytes as an overlay. The actual PE is only a few hundred KB, but the null data appended at the end of the file inflates it to a few hundred MB. (Here the actual size is 667 KB and the bloated file size is 538 MB.)
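As an analyst-side illustration of the padding described above, the following minimal Python sketch measures a null-byte overlay by comparing the file length with the end of the last section's raw data. The function names `pe_raw_end` and `null_overlay_size` are hypothetical, and the sketch assumes a well-formed PE with no certificate table or other legitimate data after the last section.

```python
import struct

def pe_raw_end(data: bytes) -> int:
    """Return the file offset at which the last section's raw data ends."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    num_sections = struct.unpack_from("<H", data, e_lfanew + 6)[0]
    opt_size = struct.unpack_from("<H", data, e_lfanew + 20)[0]
    section_table = e_lfanew + 24 + opt_size
    end = section_table + 40 * num_sections  # end of the headers themselves
    for i in range(num_sections):
        off = section_table + 40 * i
        # SizeOfRawData is at +16 and PointerToRawData at +20 in a section header
        raw_size, raw_ptr = struct.unpack_from("<II", data, off + 16)
        end = max(end, raw_ptr + raw_size)
    return end

def null_overlay_size(data: bytes) -> int:
    """Return the overlay length if the overlay is entirely null bytes, else 0."""
    overlay = data[pe_raw_end(data):]
    if overlay and overlay.count(0) == len(overlay):
        return len(overlay)
    return 0
```

For a sample like the one described here, `null_overlay_size` would report roughly the difference between the bloated size and the real PE size, and trimming the file at `pe_raw_end` would recover the original few hundred KB image.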
Encrypted shellcode and an encrypted PE file are embedded in the resource section of the binary. The encrypted PE file is 0x2B000 bytes in size.
Fig 6: Embedded encrypted PE file and shellcode blob in the resource section.
It uses the LdrFindResource_U API to locate the encoded resource data and the LdrAccessResource API to fetch its contents. It calls NtAllocateVirtualMemory to allocate memory in which to decrypt the shellcode and payload. The malware uses a hardcoded 0x3B-byte decryption key and a custom decryption loop to decrypt the payload PE file and the shellcode.
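The details of the custom decryption loop are not given here, so the following Python sketch only illustrates the general shape of a keyed loop with a 0x3B-byte key. It substitutes a plain repeating-key XOR, which is an assumption and not the sample's actual algorithm; the name `decrypt_blob` is hypothetical.

```python
KEY_LEN = 0x3B  # key length stated in the analysis

def decrypt_blob(blob: bytes, key: bytes) -> bytes:
    """Repeating-key XOR stand-in; NOT the sample's real decryption routine."""
    if len(key) != KEY_LEN:
        raise ValueError("expected a 0x3B-byte key")
    # The key index wraps around every KEY_LEN bytes of input
    return bytes(b ^ key[i % KEY_LEN] for i, b in enumerate(blob))
```

Because XOR is its own inverse, the same function would both encrypt and decrypt under this assumption; a real unpacker would need the loop recovered from the sample.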
Once the shellcode blob is decrypted, the malware transfers execution to it using NtQueueApcThread and NtTestAlert. NtQueueApcThread queues an asynchronous procedure call (APC) to a thread; here the shellcode is queued as an APC on the current thread.
NtTestAlert is a system call related to the Windows alert mechanism. It triggers execution of any pending APCs on the calling thread, so once NtTestAlert is called the shellcode starts executing.
Fig 9: Transfer of execution control to the injected shellcode
The following snapshot from the shellcode blob shows strings being pushed onto the stack; these are later used to resolve API addresses dynamically.
Fig 10: Start of the shellcode
The shellcode's task is to map the payload DLL into virtual memory, laid out the way the loader would align it for a process, and then start execution of the actual payload DLL.
The malware resolves APIs by hash: it passes a hardcoded checksum on the stack or in a register, then computes a checksum from the combination of the DLL name string and the API name. If the checksums match, it has found the API it wants to use. In Fig 11 we can see 0xBDBF9C13, the checksum DWORD used to resolve the LdrLoadDll API.
It follows these steps:
- Push the hardcoded hash into a register (or onto the stack).
- Call the API resolver function.
- Access the PEB structure.
- Access PEB_LDR_DATA, which is at offset 0x18 in the PEB.
- Access InLoadOrderModuleList, which is at offset 0x10 in PEB_LDR_DATA.
- Using InLoadOrderModuleList, read the address at which each DLL is loaded and its wide-character name.
- Check whether the loaded module has an export directory; if not, move to the next loaded module in the list.
- If an export directory is present, calculate the checksum of the DLL name.
- Walk the loaded DLL in memory to reach the export directory, read the RVA of the function-name array, and add it to the DLL's base address.
- Go to the last exported function name by multiplying NumberOfNames (the function counter) by the size of a name-table entry and adding the result to the function-name RVA.
- Access each API name one by one and calculate its checksum.
- Add the API-name checksum to the DLL-name checksum and compare the result with the hardcoded DWORD pushed earlier. If it matches, the malware has found the API name; otherwise it continues enumerating the loaded DLLs.
- Once the API name is found, read the RVA of the exported function and add it to the base address at which the DLL is loaded. This is how it resolves all API addresses dynamically.
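The steps above can be modelled in a few lines of Python. This is a simplified simulation, not the shellcode's real logic: the loaded-module list is a plain Python list instead of the PEB's InLoadOrderModuleList, and `toy_checksum` (rotate right by 13, then add) is a stand-in because the sample's actual checksum routine is not described here. All names, base addresses, and RVAs are hypothetical.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF

def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by the given number of bits."""
    return ((value >> bits) | (value << (32 - bits))) & MASK32

def toy_checksum(name: str) -> int:
    """Stand-in hash (assumption); NOT the checksum used by the sample."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & MASK32
    return h

@dataclass
class Module:
    name: str                                    # DLL name as seen in the loader list
    base: int                                    # address at which the DLL is loaded
    exports: dict = field(default_factory=dict)  # API name -> RVA

def resolve(modules, target):
    """Return base + RVA of the export whose combined checksum equals target."""
    for mod in modules:              # walk modules in load order
        if not mod.exports:          # no export directory: try the next module
            continue
        dll_sum = toy_checksum(mod.name)
        for api, rva in mod.exports.items():
            # DLL-name checksum + API-name checksum, compared as one DWORD
            if (dll_sum + toy_checksum(api)) & MASK32 == target:
                return mod.base + rva
    return None
```

With the real checksum routine recovered from the sample, the same loop run over the export names of common system DLLs would let an analyst map constants such as 0xBDBF9C13 back to API names.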
After resolving LdrLoadDll, it resolves the LdrGetProcedureAddress API using the hardcoded DWORD checksum 0x5ED941B5.
It uses LdrLoadDll and LdrGetProcedureAddress to resolve the following APIs. All of these API names are pushed onto the stack at the start of shellcode execution.
These APIs are used to map the payload DLL into memory so it can start executing.
DLL Mapping Functionality in Memory
- It enumerates the section headers of the payload DLL to calculate the size of the image the payload will occupy in memory. Essentially, it adds the virtual address of each section to that section's raw size.
- It allocates memory for the mapped DLL using the VirtualAlloc API, then copies the first 0x400 bytes of the decrypted payload DLL into the newly allocated memory, skipping the DOS header while doing so.
- It copies each section of the payload DLL into the newly allocated memory, with each section starting on a new page.
- It changes the memory protection of each section.
- It calls RtlAddFunctionTable to add a dynamic function table to the dynamic function table list, and calls the FlushInstructionCache API so that the newly written code is what the processor executes.
- Finally, it parses the mapped DLL to find its entry point and starts executing the DLL from AddressOfEntryPoint.
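The size calculation and section copy in the list above can be sketched in Python as follows. This is a simplified model using a `bytearray` in place of memory from VirtualAlloc; it omits protections, the function table, and the entry-point call. The `Section` tuple and function names are hypothetical, and treating "skips the DOS header" as leaving the first 0x40 bytes (the size of IMAGE_DOS_HEADER) zeroed is an assumption.

```python
from typing import NamedTuple

class Section(NamedTuple):
    virtual_address: int  # RVA at which the section is placed in memory
    raw_size: int         # size of the section's data in the file
    raw_pointer: int      # file offset of the section's data

HEADER_SIZE = 0x400      # header bytes copied, per the analysis
DOS_HEADER_SIZE = 0x40   # assumption: this is the part that is skipped
PAGE = 0x1000

def image_size(sections) -> int:
    """Largest (virtual address + raw size), rounded up to a page boundary."""
    end = max(s.virtual_address + s.raw_size for s in sections)
    return (end + PAGE - 1) & ~(PAGE - 1)

def map_image(file_bytes: bytes, sections) -> bytearray:
    """Lay the file out the way it would sit in memory."""
    image = bytearray(image_size(sections))
    # Copy the headers but leave the DOS header region zeroed
    headers = file_bytes[DOS_HEADER_SIZE:HEADER_SIZE]
    image[DOS_HEADER_SIZE:DOS_HEADER_SIZE + len(headers)] = headers
    for s in sections:
        raw = file_bytes[s.raw_pointer:s.raw_pointer + s.raw_size]
        image[s.virtual_address:s.virtual_address + len(raw)] = raw
    return image
```

One practical consequence of a skipped DOS header is that a memory dump of the mapped payload would lack the usual MZ signature at its base, which is worth keeping in mind when carving it out during analysis.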
The payload DLL's main entry point simply returns, and execution continues in the DllRegisterServer function of the main DLL. Note that this is DllRegisterServer in the main DLL, not in the mapped payload DLL; the injected payload has a DllRegisterServer function as well. The purpose of this function is to check whether the injected payload's exported function is DllRegisterServer. To do so, it calculates the checksum of the string "DllRegisterServer" and compares it with a hardcoded DWORD checksum; if they match it continues, otherwise it exits. It then transfers execution to DllRegisterServer in the payload DLL.
Payload DLL file
It decodes the names of all the DLLs it needs to load.
The following DLLs are loaded by the payload.
It loads all the DLLs in the list above.
API Resolve Functionality in Payload DLL
It resolves Windows API addresses from hardcoded DWORD checksums, using almost the same approach as the main DLL uses to resolve LdrFindResource_U and LdrAccessResource.
The difference is that for every API it passes two hardcoded checksums, one for the DLL name and one for the API name, whereas previously it passed a single DWORD checksum (a combination of the DLL name and API name).
It accesses the PEB structure, reaches InLoadOrderModuleList by traversing it, and reads each DLL's name; it calculates a checksum for every DLL and compares it with the DWORD checksum passed in.
For example, it compares against the hardcoded hash 0xAA83E8EA to find the load base address of kernel32.dll.
Once it has the DLL's base address, it enumerates the export directory of the loaded DLL, calculates the checksum of each API name, and compares it with the hardcoded DWORD checksum. If the API name is found, it takes the RVA from the export directory and adds it to the base address of the loaded DLL.
The functions that calculate the checksums for the DLL name and API name here are considerably more complex than the one found in the main DLL.
The following are some API names and their respective DWORD checksums used in the payload:
6AE056F0h : memset
609FD004h : CreateEventW
87CA8415h : CreateTimerQueue
9BD9AB80h : CreateTimerQueueTimer
160FBC8Dh : WaitForMultipleObjects
11B5B47Ah : BCryptGenRandom
35BF9169h : BCryptCloseAlgorithmProvider
69230A13h : GetModuleFileNameW
1D50CF79h : OpenSCManagerW
32655658h : CloseServiceHandle
0C60B628h : Process32FirstW
91BB593Ah : GetCurrentProcessId
6B466D98h : GetProcessHeap
D84B3D0Ah : GetModuleHandleA
8B1AD334h : HeapAlloc
3434C63Dh : CloseHandle
25A83B18h : OpenProcess
5D2B782Fh : PathFindFileNameW
0FC917Eh : SHGetFolderPathW
305B4B5Ah : lstrlenA
32C0D23Ch : SHFileOperationW
6A53AC4Dh : DeleteFileW
73BF5525h : kernel32_SetEvent
9D1964ACh : kernel32_ExitProcess
661CE361h : RtlExitUserProcess
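A checksum list like the one above is handy as a lookup table when annotating disassembly. The following Python sketch parses lines in the same `checksum : name` format into a dictionary. Only a few entries from the list are included for brevity, and the names `parse_checksums` and `annotate` are hypothetical.

```python
import re

# A few entries copied from the list above (trailing "h" marks hexadecimal)
CHECKSUM_LINES = """
6AE056F0h : memset
609FD004h : CreateEventW
D84B3D0A : GetModuleHandleA
3434C63Dh : CloseHandle
9D1964ACh : kernel32_ExitProcess
"""

LINE_RE = re.compile(r"^\s*([0-9A-Fa-f]+)h?\s*:\s*(\S+)\s*$")

def parse_checksums(text: str) -> dict:
    """Build {checksum: api_name} from 'XXXXXXXXh : Name' lines."""
    table = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            table[int(match.group(1), 16)] = match.group(2)
    return table

def annotate(value: int, table: dict) -> str:
    """Return the API name for a checksum, or a placeholder label."""
    return table.get(value, f"unknown_{value:08X}")
```

Feeding the full list into `parse_checksums` would give a table that can be applied to every constant passed to the payload's resolver function.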
It checks the folder from which the process is running. If it is not running from the user's AppData\Local directory, it moves the main DLL there. Regsvr32.exe then spawns a child process, passing the path of the moved DLL as a command-line argument, and kills the parent, becoming a 'non-existent process'. This is an anti-analysis technique intended to prevent debuggers from attaching to the process.
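The location check described above amounts to a path-prefix test. The Python sketch below reproduces that decision using `ntpath` so it behaves the same on any operating system; the function names and the example paths are hypothetical, and the real sample performs this check with Windows APIs, not Python.

```python
import ntpath

def is_under(path: str, root: str) -> bool:
    """True if path lies inside root, compared case-insensitively."""
    p = ntpath.normcase(ntpath.normpath(path))
    r = ntpath.normcase(ntpath.normpath(root))
    try:
        return ntpath.commonpath([p, r]) == r
    except ValueError:  # e.g. the two paths are on different drives
        return False

def next_step(module_path: str, local_appdata: str) -> str:
    """Model the branch: keep running, or relocate and relaunch."""
    if is_under(module_path, local_appdata):
        return "continue"
    return "relocate_and_relaunch"
```

The same test can be turned around for triage: a regsvr32.exe process whose DLL argument satisfies `is_under` for a user's AppData\Local directory matches the relaunch pattern described here and may deserve a closer look.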
Fig 20: Regsvr32.exe restarted with the dropped DLL as its argument.
It sends data over the internet using the WinHTTP API.
Fig 21: Encoded data sent to the C2 server
We found various C2 URLs through which it performs command-and-control activity.
Evidence of the detection by RTDMI(tm) engine can be seen below in the Capture ATP report for file: