Data types
Latest revision as of 09:03, 12 November 2023
This page describes the common binary data types used internally by Factorio and by most modern computer applications.
Properties
Endianness
All data in a computer is stored in small units called "bytes". Larger units of data are split across multiple bytes. The order in which sequential bytes are assembled to form a larger unit of data is referred to as its "endianness". All data saved locally by Factorio is stored in "little"-endian format (historically, because Intel favored little-endian processors). All data transferred over the network by Factorio is sent in "big"-endian format (the standard for network protocols, so much so that it is often referred to as "network"-endian). (This has not been fully verified; please correct any of it if it is wrong.) For more information about endianness, please see the Wikipedia entry.
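As a small illustration of the two byte orders, the sketch below packs the same 4-byte value both ways with Python's standard `struct` module. The value `0x12345678` is an arbitrary example and is not taken from any Factorio file.

```python
import struct

value = 0x12345678  # arbitrary 4-byte example value

little = struct.pack("<I", value)  # "<" = little-endian: least significant byte first
big = struct.pack(">I", value)     # ">" = big-endian: most significant byte first

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Reading the bytes back only gives the original value when the
# same byte order is used as when they were written.
assert struct.unpack("<I", little)[0] == value
assert int.from_bytes(big, "big") == value
```

Reading little-endian bytes as big-endian (or the reverse) does not fail; it silently produces a different number, which is why the byte order has to be known in advance.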
Signedness
Another property of integer data types is their signedness; please see the Wikipedia entry. Data types without a 'u-' prefix are signed, and those marked with a 'u-' prefix (written out as "unsigned" in the type names on this page) are unsigned. It is not certain which signed number representation Factorio uses; it may be machine dependent, but in observed data it always appears to be two's complement.
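To show what signedness changes in practice, the following sketch interprets the same two bytes as a signed and as an unsigned 2-byte integer, assuming two's complement and little-endian byte order as described above. The byte values are made up for the example.

```python
import struct

raw = b"\xff\xff"  # two bytes with every bit set (example data)

signed = struct.unpack("<h", raw)[0]    # 2-byte signed integer
unsigned = struct.unpack("<H", raw)[0]  # 2-byte unsigned integer

print(signed)    # -1
print(unsigned)  # 65535

# int.from_bytes gives the same results via its `signed` flag.
assert int.from_bytes(raw, "little", signed=True) == signed
assert int.from_bytes(raw, "little", signed=False) == unsigned
```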
Space Optimized
Some integer values are saved using a "space optimized" system: instead of always writing out the full multi-byte integer, Factorio checks whether the value is < 255 and, if so, writes it as a single unsigned byte. If the value is ≥ 255, the value 255 is written as an unsigned byte, followed by the full data type. This is mainly used when saving the sizes of strings, arrays, and dictionaries, because those sizes are almost always < 255; in those cases it saves 3 bytes of space compared with a full 4-byte unsigned int.
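The scheme above could be sketched roughly as follows. This is a minimal illustration, not Factorio's actual code: the function names are hypothetical, and it assumes the "full data type" is a 4-byte little-endian unsigned int.

```python
import io
import struct


def write_space_optimized_uint(stream, value):
    """Write `value` as 1 byte if it is < 255, else as 0xFF plus 4 bytes."""
    if value < 255:
        stream.write(struct.pack("<B", value))
    else:
        stream.write(b"\xff")                   # marker: full value follows
        stream.write(struct.pack("<I", value))  # assumed 4-byte unsigned int


def read_space_optimized_uint(stream):
    """Read a value written by write_space_optimized_uint."""
    first = stream.read(1)
    if len(first) != 1:
        raise EOFError("unexpected end of data")
    if first[0] < 255:
        return first[0]
    full = stream.read(4)
    if len(full) != 4:
        raise EOFError("unexpected end of data")
    return struct.unpack("<I", full)[0]


buf = io.BytesIO()
write_space_optimized_uint(buf, 10)   # fits in a single byte
write_space_optimized_uint(buf, 300)  # needs the marker plus 4 bytes
print(buf.getvalue().hex())  # 0aff2c010000

buf.seek(0)
print(read_space_optimized_uint(buf), read_space_optimized_uint(buf))  # 10 300
```

Note that the value 255 itself cannot use the short form, since the byte 255 is reserved as the marker.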
Data Types
bool
1 byte value. Considered True if and only if the value equals 1; otherwise it is considered False.
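A minimal sketch of that rule, using a hypothetical helper name; any byte other than 1 is treated as False, including other non-zero values.

```python
def read_bool(data: bytes) -> bool:
    """Interpret a single byte as a bool: True only when the byte equals 1."""
    if len(data) != 1:
        raise ValueError("a bool is exactly 1 byte long")
    return data[0] == 1


print(read_bool(b"\x01"))  # True
print(read_bool(b"\x00"))  # False
print(read_bool(b"\x02"))  # False
```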
byte
1 byte long signed integer.
unsigned byte
1 byte long unsigned integer.
short
2 byte long signed integer.
unsigned short
2 byte long unsigned integer.
int
4 byte long signed integer.
unsigned int
4 byte long unsigned integer.
long
8 byte long signed integer.
unsigned long
8 byte long unsigned integer.
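For readers parsing these values in Python, the integer types above map onto format codes of the standard `struct` module as sketched below. The dictionary name is just for illustration; the "<" prefix selects little-endian byte order and fixed standard sizes, matching locally saved data as described earlier.

```python
import struct

# Type name on this page -> struct format code (little-endian, standard sizes).
INTEGER_FORMATS = {
    "byte": "<b",
    "unsigned byte": "<B",
    "short": "<h",
    "unsigned short": "<H",
    "int": "<i",
    "unsigned int": "<I",
    "long": "<q",           # struct's own "l" code is only 4 bytes, so "q" is used
    "unsigned long": "<Q",
}

for name, fmt in INTEGER_FORMATS.items():
    print(f"{name}: {struct.calcsize(fmt)} byte(s)")

# Example: decode a 4-byte signed int from raw bytes.
print(struct.unpack(INTEGER_FORMATS["int"], b"\xfe\xff\xff\xff")[0])  # -2
```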
float
A 4 byte (32 bit) long floating point number stored in a special format with three fields: the number's sign, its exponent, and its mantissa. This format is described by the 32-bit data type in IEEE 754.
double
An 8 byte (64 bit) long floating point number stored in a special format with three fields: the number's sign, its exponent, and its mantissa. This format is described by the 64-bit data type in IEEE 754.
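To make the field layout concrete, this sketch splits a 32-bit float into its IEEE 754 parts (1 sign bit, 8 exponent bits, 23 mantissa bits). The helper name is hypothetical; the same approach works for a double with 1, 11, and 52 bits.

```python
import struct


def float_fields(value):
    """Return (sign, biased exponent, mantissa) of a 32-bit IEEE 754 float."""
    # Pack as a float, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # stored with a bias of 127
    mantissa = bits & 0x7FFFFF      # fraction bits, without the implicit leading 1
    return sign, exponent, mantissa


print(float_fields(1.0))   # (0, 127, 0)
print(float_fields(-2.5))  # (1, 128, 2097152)

# A double always occupies 8 bytes.
print(len(struct.pack("<d", 1.0)))  # 8
```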
string
Strings in Factorio are stored Pascal-style – they're a variable length data type with two fields: an unsigned int describing how many bytes long the string data is, and then the string data itself.
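A length-prefixed string as described above might be read like this. The sketch assumes a plain 4-byte little-endian unsigned int prefix (not the space optimized form) and uses hypothetical helper names. It returns raw bytes because this page does not state the text encoding; decoding as UTF-8 in the usage line is an assumption.

```python
import io
import struct


def read_exact(stream, size):
    """Read exactly `size` bytes or raise EOFError."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of data")
    return data


def read_string(stream):
    """Read an unsigned int length, then that many bytes of string data."""
    (length,) = struct.unpack("<I", read_exact(stream, 4))
    return read_exact(stream, length)


encoded = struct.pack("<I", 5) + b"hello"  # example data, built by hand
raw = read_string(io.BytesIO(encoded))
print(raw.decode("utf-8"))  # hello  (UTF-8 is an assumption)
```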
array
array<element type>
Arrays are a container type. Like a string, they're prefixed with an unsigned int describing how many elements are in the array, followed by that many elements of the specified element type back to back.
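Reading an array then amounts to reading the count and calling an element reader that many times. The sketch below assumes a 4-byte little-endian unsigned int count; `read_array` and `read_short` are hypothetical names used only for this example.

```python
import io
import struct


def read_exact(stream, size):
    """Read exactly `size` bytes or raise EOFError."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of data")
    return data


def read_short(stream):
    """Read one 2-byte signed integer."""
    return struct.unpack("<h", read_exact(stream, 2))[0]


def read_array(stream, read_element):
    """Read an unsigned int count, then that many elements back to back."""
    (count,) = struct.unpack("<I", read_exact(stream, 4))
    return [read_element(stream) for _ in range(count)]


# array<short> holding 3 elements: 1, -1, 256 (example data)
encoded = struct.pack("<Ihhh", 3, 1, -1, 256)
print(read_array(io.BytesIO(encoded), read_short))  # [1, -1, 256]
```

Passing the element reader as a parameter lets the same function handle any element type, including nested arrays.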
dict
dict<key type, value type>
Dictionaries (or "dict"s for short) are like arrays, but their elements consist of (key, value) pairs that map each key (usually a string or a number) to its value. Just like arrays, they're prefixed with an unsigned int describing how many (key, value) pairs are in the dict, followed by that many (key, value) pairs. Each key should be unique and appear only once. In a few esoteric languages, if a key is provided multiple times then only the last value is used, but in Factorio, assume any dict with multiple identical keys is malformed and treat it as an error.
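A dict reader following the description above, including the duplicate-key check, could look like this. It is a sketch under assumptions: a 4-byte little-endian unsigned int count, length-prefixed string keys, unsigned int values, and hypothetical function names.

```python
import io
import struct


def read_exact(stream, size):
    """Read exactly `size` bytes or raise EOFError."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of data")
    return data


def read_uint(stream):
    """Read one 4-byte unsigned integer."""
    return struct.unpack("<I", read_exact(stream, 4))[0]


def read_string(stream):
    """Read a length-prefixed string as raw bytes."""
    return read_exact(stream, read_uint(stream))


def read_dict(stream, read_key, read_value):
    """Read a pair count, then that many (key, value) pairs."""
    result = {}
    for _ in range(read_uint(stream)):
        key = read_key(stream)
        if key in result:
            # Duplicate keys are treated as malformed data, not overwritten.
            raise ValueError(f"duplicate key in dict: {key!r}")
        result[key] = read_value(stream)
    return result


def encode_pair(key, value):
    """Build example bytes for one (string, unsigned int) pair."""
    return struct.pack("<I", len(key)) + key + struct.pack("<I", value)


encoded = struct.pack("<I", 2) + encode_pair(b"a", 1) + encode_pair(b"b", 2)
print(read_dict(io.BytesIO(encoded), read_string, read_uint))  # {b'a': 1, b'b': 2}
```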