Support us on Patreon to keep GamingOnLinux alive. This ensures all of our main content remains free for everyone. Just good, fresh content! Alternatively, you can donate through PayPal. You can also buy games using our partner links for GOG and Humble Store.
We do often include affiliate links to earn us some pennies. See more here.
xoreos is a FLOSS project aiming to reimplement BioWare's Aurora engine (and derivatives), covering their games starting with Neverwinter Nights and potentially up to Dragon Age II. This post gives a short update on the current progress.

Note: This is a cross-post of a news item on the xoreos website (part 1, part 2, part 3).

And further down the path of getting all targeted games to show areas I go. Previously, I wrote about my progress with The Witcher, Jade Empire and Neverwinter Nights 2. For the next two months, I took a look at the odd one out: the Nintendo DS game Sonic Chronicles: The Dark Brotherhood.

Yes, a Nintendo DS game. I wasn't so sure myself that the game is actually a "proper" target for xoreos. I'm still not 100% sure, but I know now that it at least does use several BioWare file formats, as well as Nintendo DS formats. I also saw that some of those BioWare formats are used in Dragon Age: Origins as well, so Sonic Chronicles actually did provide a natural station on my path.

This report is divided into three pages. On this here first page, I go a bit into the details of those common BioWare file formats. On the next page, I cover the graphics (that are mostly Nintendo formats). And the third page shows how I tied it all together in xoreos.

So, onwards to the BioWare formats.

GFF4

GFF is BioWare's "General File Format", which is used as the basis for many things in BioWare games. It's an old format, already found in the Infinity Engine, but not quite as complex yet. (Correction: It seems I misremembered there; GFF is not used in the Infinity Engine. I apologize for this mistake.) Conceptually, it is comparable to XML[1]: hierarchical data, organized in a tree-like fashion, able to hold basically everything. As such, it's used to describe areas, characters, items, dialogues, ... Unlike XML, however, GFF is a binary format, not directly human-readable.

Since GFF is such an important format, xoreos already implemented a reader (thanks to BioWare releasing specifications for the Neverwinter Nights toolset). We also provide a tool to convert GFF files into XML for easier readability. That reader, however, only handled versions 3.2 (used by Neverwinter Nights, Neverwinter Nights 2, Knights of the Old Republic, Knights of the Old Republic 2 and Jade Empire) and 3.3 (used by The Witcher). Sonic Chronicles, Dragon Age: Origins and Dragon Age 2 needed a reader for versions 4.0 and 4.1 -- and boy did they change the format.

You see, after converting the GFF3 to XML, the whole thing is really quite readable and understandable. Every tag has a full string as a name, making the uses and intentions clear. But from the game's perspective, this has a huge drawback: it's slow. Strings are unwieldy, slow to read and compare, and variable length items are generally a pain when you want to quickly jump to a specific field. To curb that, GFF4 removes those pesky strings. Instead, fields use a single 32-bit integer as their "name", making comparisons easy as pie.
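To make the trade-off a bit more concrete, here is a minimal Python sketch of why integer labels and fixed-size records are convenient. The record layout (a 32-bit label followed by a 32-bit value) and the function names are assumptions made up for illustration; this is not the real GFF4 field layout.

```python
import struct

# Hypothetical fixed-size record: a 32-bit label followed by a 32-bit value.
# This is NOT the real GFF4 layout, only an illustration of the idea.
RECORD = struct.Struct("<II")


def pack_fields(fields):
    """Pack (label, value) pairs into one contiguous byte string."""
    return b"".join(RECORD.pack(label, value) for label, value in fields)


def read_field(data, index):
    """Jump straight to record `index` without scanning earlier records."""
    return RECORD.unpack_from(data, index * RECORD.size)


def find_by_label(data, wanted):
    """Find a value by comparing integers instead of strings."""
    for label, value in RECORD.iter_unpack(data):
        if label == wanted:
            return value
    return None
```

Because every record has the same size, the offset of the n-th field is a simple multiplication, which is exactly what variable-length string names make difficult.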

Lucky for me, the new GFF4 format is already documented on the Dragon Age Toolset Wiki. The huge amount of example files provided by the two Dragon Age games and Sonic Chronicles gave me ample opportunities to test out corner cases as well. Easy. The gff2xml tool mentioned above now supports GFF4 as well.

[1] In fact, BioWare generates their GFF4 files out of XML, as can be seen from the Dragon Age: Origins toolset.

TLK

Next up, I saw a new TLK format used in Sonic. TLK is a "talktable", a list of strings indexed by a numerical ID. The idea is that you have all text used in the game in one place, easy to use and easy to translate. TLK was already used in Neverwinter Nights, so xoreos has a reader for the old format. It's relatively simple, too.

However, the new format is quite different. In fact, it's a GFF4! I did say that you can basically stick everything in a GFF, right? That's what they did for Sonic Chronicles (and the two Dragon Age games). With the new GFF4 reader, adding GFF4'd TLK support was quick and painless.

GDA

Just like the GFF4'd TLK, GDA is an old friend in a GFF4 suit. This time, it's 2DA: a two-dimensional array, a table if you will. If you're still lost, think Excel spreadsheet, a simple collection of data organized on a grid.

2DAs are used to, for example, specify the models of different objects. The GIT file describing objects in an area would say "Here's an object, we call it Chair, it has Appearance 179". The game then looks into appearances.2da, at row 179 and column "ModelName", grabs the filename there and loads it as the object's model.
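The lookup described above can be sketched in a few lines of Python. The `TwoDA` class, the cell values and the model filename are all hypothetical stand-ins; only the row number 179 and the column name "ModelName" come from the example in the text.

```python
class TwoDA:
    """Tiny stand-in for a parsed 2DA: named columns, rows keyed by number."""

    def __init__(self, columns, rows):
        # Map each column name to its position within a row.
        self.columns = {name: i for i, name in enumerate(columns)}
        self.rows = rows

    def get(self, row, column):
        """Return the cell at the given row number and column name."""
        return self.rows[row][self.columns[column]]


# Made-up contents; a real appearances.2da has many more rows and columns.
appearances = TwoDA(
    columns=["Label", "ModelName"],
    rows={179: ["Chair", "hypothetical_chair_model"]},
)

# Stand-in for an object as a GIT file might describe it.
chair = {"name": "Chair", "appearance": 179}

# Row 179, column "ModelName" gives the filename of the model to load.
model = appearances.get(chair["appearance"], "ModelName")
```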

GDA is, essentially, just the same thing stored as a GFF4: a list of columns giving their name and type, and a list of rows with the data for each column. However... While real 2DAs have an actual column name (the "ModelName", for example), making guessing the meaning easy, GDAs don't actually store a name. They store a hash of the name (specifically, the CRC32 of the UTF-16LE encoded string in all lowercase), a number that's meaningless in and of itself.
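As a rough sketch, the hash described above could be computed in Python like this. The function name `gda_column_hash` is made up, and the sketch assumes that "CRC32" means the common CRC-32 variant that `zlib.crc32` implements.

```python
import zlib


def gda_column_hash(name: str) -> int:
    """CRC32 of the lowercased, UTF-16LE encoded column name.

    Assumes the standard CRC-32 as implemented by zlib.
    """
    encoded = name.lower().encode("utf-16-le")
    return zlib.crc32(encoded) & 0xFFFFFFFF


# Case does not matter, since the name is lowercased before hashing.
same = gda_column_hash("ModelName") == gda_column_hash("modelname")
```

Going from a name to its hash is trivial; going from the hash back to the name is the hard part.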

There are 845 unique hashes in the GDA files found in Sonic. There's no real way to turn them back into readable strings, but there's a certain trick I could apply: a dictionary attack. I got myself a huge list of words found in a dictionary, hashed them, and compared the hashes. Then I extracted all strings I could find in the game (from the GFFs, mostly), and did the same. Then I combined the words of these lists. Then I combined matches. Each time, I manually went through the list to kick out the many, many false positives: strings that hashed to a valid number, but that don't make sense in the context of the game ("necklessnoflyzone", "rareuniquemummifications", "properlyunsmoked").
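A dictionary attack along these lines might look roughly like the following Python sketch. It is not the script actually used for xoreos; the function names and the `max_words` parameter are assumptions, and the hash is assumed to be the standard CRC-32 as implemented by `zlib.crc32`.

```python
import zlib
from itertools import product


def gda_hash(name):
    """CRC32 of the lowercased, UTF-16LE encoded string (zlib's CRC-32)."""
    return zlib.crc32(name.lower().encode("utf-16-le")) & 0xFFFFFFFF


def dictionary_attack(targets, words, max_words=2):
    """Hash single words and concatenations of up to `max_words` words.

    Returns {hash: [candidate strings]} for every hash found in `targets`.
    """
    found = {}
    for count in range(1, max_words + 1):
        for combo in product(words, repeat=count):
            candidate = "".join(combo).lower()
            value = gda_hash(candidate)
            if value in targets:
                found.setdefault(value, []).append(candidate)
    return found


# Toy run: pretend we only know the hash and try to recover the name.
targets = {gda_hash("modelname")}
found = dictionary_attack(targets, ["model", "name", "label"])
```

The number of candidates grows very quickly with the size of the word list and the number of combined words, and with only 32 bits of hash, many candidates will match by pure chance. That is why every match still needs the manual sanity check described above.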

Phew, that was a lot of tedious work. Still, I managed to find the source strings for 534 of those 845 hashes, 63%. Sure, there are still 311 missing, but that'll have to wait for later.

And that's it for the common BioWare file formats. See page two for the graphical formats. Article taken from GamingOnLinux.com.
About the author -
Geek. Atheist+. Leftist. Metal-Head. Discordian. Lefty.
ScummVM dev, xoreos lead.
Free software zealot.
The comments on this article are closed.
8 comments

GustyGhost Jun 7, 2015
I don't often read an entire article but after doing so, I wish I could contribute.
GoCorinthians Jun 7, 2015
Batman Arkham Knight in pre-order with release date dayONE for linux too!

DrMcCoy Jun 7, 2015
Quoting AnxiousInfusion: I don't often read an entire article but after doing so, I wish I could contribute.

I'd also be very happy about any signal boosting and getting the word out more. :)

Quoting GoCorinthians: Batman Arkham Knight in pre-order with release date dayONE for linux too!

Did you mean to comment in a different article, perchance? :P
Keyrock Jun 7, 2015
Quoting GoCorinthians: Batman Arkham Knight in pre-order with release date dayONE for linux too!

False, not day 1.



As for this article, that was a really interesting read. I am mostly interested in this project for Jade Empire. I would also so love a sequel to that game.
GoCorinthians Jun 7, 2015
That hurted, boy! lmao

Quoting: Keyrock
Quoting GoCorinthians: Batman Arkham Knight in pre-order with release date dayONE for linux too!

False, not day 1.



As for this article, that was a really interesting read. I am mostly interested in this project for Jade Empire. I would also so love a sequel to that game.
Samsai Jun 7, 2015
Please, try to stay on topic. And if no article suits the topic you want to talk about, start a thread on the forums.
Anorelsan Jun 8, 2015
Every article you wrote is really very interesting. I wish I have the knowledge to contribute to the project, but at least I "boost the signal".

DrMcCoy you rock!
dibz Jun 8, 2015
This is really cool. Good luck.