Thursday, November 5, 2015

The intricacies of disassembling RTLink/Plus

After many years of half-hearted attempts, I've finally rewritten my RTLink decode tool practically from scratch to handle disassembling the RTLink/Plus overlay management used by the later Legend Entertainment games. See rtlink_decode for the result. The original version of the tool was pretty dodgy, and pretty much hardcoded to only handle the MADS games (Rex Nebular, Return of the Phantom, and Dragonsphere). In this posting, I'll go into more detail about what RTLink/Plus was, for those who may be interested.

In the latter days of DOS gaming, game developers started running into a problem: their executables were starting to hit the limits of main memory. Not every game could, or would, take advantage of scripting game content, so having all the game logic in the main executable caused the size to bloat. So what to do if your executable was now too big to fit in memory? This was the problem RTLink/Plus was designed to solve.

RTLink is essentially an overlay manager. It splits a program's compiled code into multiple different segments, and allows them to be loaded in as needed, and then replaced when code in other segments needs to execute. RTLink can handle recursive segments, with individual segments split up into their own set of swappable sub-segments. It also allows for multiple different "loading areas" in memory that can independently have their own set of segments. See this article for more general information about RTLink.

Before I go into more detail about the problem this posed for disassembly, let's go over how RTLink implements the overlay manager in code. So far, I've encountered three different variations on RTLink being used in executables. What I'll call variations 1 and 2 seem to be the most common forms of RTLink in the games I've examined. When a program is compiled with either of these versions of RTLink/Plus, one of the segments in the code will contain the RTLink logic, as well as two main areas: the dynamic segment list and the function thunks.

The segment list is a list of the dynamic segments within the application. It contains the following information:
  • The segment in memory where the dynamic segment should be loaded
  • Whether the segment is stored in the application or a secondary overlay file (variation 1 only; variation 2 only ever uses the executable)
  • The file offset and size of the segment
  • The number of relocation entries the segment has (variation 1 only; variation 2 has it as part of the starting header for the segment pointed to)
When a segment is needed, the above details are used to read the segment's data from file, and load it into the correct place in memory. The data for a segment consists of two parts: an initial header area, and the data/code for the segment. For variation 1, the header area simply consists of a list of relocation entries. For variation 2, the details of segment size and number of relocations are provided in a header at the start of the segment, before the relocations list.
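To make the fields above a bit more concrete, here's a minimal Python sketch of decoding a segment list. Be warned that the byte layout (field order, field widths, and the overlay flag bit) is purely an assumption for illustration, loosely modelled on the variation 1 fields; the real RTLink/Plus layout isn't spelled out in this post and differs between variations.

```python
import struct
from dataclasses import dataclass

# Hypothetical 12-byte little-endian layout, for illustration only:
#   uint16 load segment, uint16 flags, uint32 file offset,
#   uint16 size in paragraphs, uint16 relocation count
ENTRY_FORMAT = "<HHIHH"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)
FLAG_IN_OVERLAY = 0x0001  # assumed flag bit, not taken from RTLink itself


@dataclass
class SegmentEntry:
    load_segment: int      # memory segment the code is loaded at
    in_overlay: bool       # True if stored in the OVL file, not the EXE
    file_offset: int       # where the segment starts in its file
    size_paragraphs: int   # segment size in 16-byte paragraphs
    num_relocations: int   # relocation entries preceding the code/data


def parse_segment_list(raw: bytes, count: int) -> list[SegmentEntry]:
    """Decode `count` consecutive entries from the start of `raw`."""
    entries = []
    for index in range(count):
        fields = struct.unpack_from(ENTRY_FORMAT, raw, index * ENTRY_SIZE)
        load_seg, flags, offset, size, relocs = fields
        entries.append(SegmentEntry(load_seg, bool(flags & FLAG_IN_OVERLAY),
                                    offset, size, relocs))
    return entries
```

For a real executable you'd first have to locate the list inside the RTLink segment and work out the true entry size, which is the fiddly part.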

A segment's relocation entries are used for the same purpose as relocation entries in a standard application - executables can be loaded at different locations within memory, so all segment references need to be relative to the starting point of where the program is loaded. By keeping the relocation entries for each dynamic segment together with the segment data itself, it's easier for RTLink to apply any needed segment adjustments each time a dynamic segment is loaded.
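As a rough sketch of what applying those fixups looks like, the following Python adds the load segment to each 16-bit word that a relocation entry points at. It assumes each entry has already been reduced to a plain byte offset into the segment's data; the actual on-disk encoding of an RTLink relocation entry may well be different (for example, an offset/segment pair as in a standard EXE header).

```python
import struct


def apply_relocations(data: bytes, reloc_offsets: list[int], load_segment: int) -> bytes:
    """Add `load_segment` to the 16-bit word at each relocation offset."""
    patched = bytearray(data)
    for offset in reloc_offsets:
        (value,) = struct.unpack_from("<H", patched, offset)
        # Segment arithmetic wraps at 16 bits, as it would on real hardware.
        struct.pack_into("<H", patched, offset, (value + load_segment) & 0xFFFF)
    return bytes(patched)
```

The function returns a patched copy rather than modifying its input, so the same raw segment data can be re-relocated for a different load address.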

This is fine to handle shifting the segments in and out of memory, and allow them to have valid memory references, but what causes them to be loaded? The answer is the method "thunks" area of the RTLink segment. When dealing with dynamic segments, you can't just do a far call to some offset in the area of memory segments are loaded in.. you couldn't be sure that the segment you want is actually loaded, or still in memory and not unloaded by some other segment. For this purpose, the thunk list is present.

For every method in a dynamic segment that is referenced by any other segment, a thunk/stub method is created. These consist essentially of the following: a call to the RTLink manager to load the correct segment for the method, a far jump to the method in the correct memory location in the loaded segment, and a following 16-bit value specifying which segment the thunk is for. This way, the thunk method acts as a wrapper, ensuring the correct segment is loaded and passing control to the method to execute.

For variations 1 and 2, the thunk methods have some minor differences, such as variation 2 using far calls to the RTLink segment loading code, and having an optional word after the segment index. The segment selector in the far jump is also already loaded with the memory segment in variation 1, whereas in variation 2 it's normally 0 initially, and then set to the correct segment when the thunk method is called. This allows variation 2 to dynamically load the segment in different places in memory, whereas variation 1 is limited to a single specific loading point.
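Here's a small Python sketch of recognising a thunk from its bytes, following the call / far jump / segment index shape described above. The opcodes are the standard x86 ones (E8 for a near call, 9A for a far call, EA for a far jump), but the assumption that variation 1 uses a near call, and that nothing else sits between the three parts, is mine; the optional trailing word in variation 2 is deliberately not handled.

```python
import struct
from typing import NamedTuple

CALL_NEAR = 0xE8  # call rel16 (3 bytes)
CALL_FAR = 0x9A   # call ptr16:16 (5 bytes)
JMP_FAR = 0xEA    # jmp ptr16:16 (5 bytes)


class Thunk(NamedTuple):
    target_offset: int   # offset of the real method in its segment
    target_segment: int  # selector in the far jump (0 until loaded, in variation 2)
    segment_index: int   # which dynamic segment the thunk belongs to
    length: int          # bytes consumed, excluding any optional trailing word


def decode_thunk(code: bytes, pos: int = 0) -> Thunk:
    """Decode one thunk at `pos`; raise ValueError if the bytes don't match."""
    start = pos
    if code[pos] == CALL_NEAR:
        pos += 3
    elif code[pos] == CALL_FAR:
        pos += 5
    else:
        raise ValueError(f"no call opcode at {start:#x}")
    if code[pos] != JMP_FAR:
        raise ValueError(f"no far jump at {pos:#x}")
    # Far jump operand is offset then segment; the index word follows it.
    offset, segment, index = struct.unpack_from("<HHH", code, pos + 1)
    return Thunk(offset, segment, index, pos + 7 - start)
```

Walking the thunk area with a decoder like this gives you, for each stub, the segment index and method offset needed to repoint it at the right place in a flattened executable.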

The RTLink segment loader method also mucks around with the stack to push a new intermediate return address on the stack for when the method that's jumped to finishes. This return address points to a code fragment that also handles the case where a method in a dynamic segment calls a method in another one.. in that case, it handles reloading the original segment, so that the original caller's code can be safely returned to.

Put altogether, this scheme allows programs of practically any size needed. As the program grows, the code simply needs to be split into more and more dynamic segments which will get loaded only when needed, and remain on disk when not. Great for having big programs, but not so great for those of us interested in reverse engineering the game by disassembling the executable.

There were several problems to be solved for disassembling such games, which I'll go into now.

A standard IDA disassembly doesn't have all the code

Well, it wouldn't. If you try to disassemble an RTLink/Plus compiled game, IDA will give you an error about unused data at the end of the executable. This will be for one or more RTLink segments. Additionally, as previously mentioned, some of the code for the program can also be stored in a separate OVL (Overlay) file.

Well, I could just load the raw data for them into the disassembly, right?

Well, no. That wouldn't help much, because of all the thunk methods. They all have their references to the same area of memory where segments are expected to be loaded. If you were doing things manually, you'd need to get the details of each segment from RTLink, manually load the code and/or data into new IDA segments, and then manually adjust the thunk methods to point to those methods.

You'd also need to worry about the dynamic segment relocation entries. If you manually loaded the code for a segment, you'd have to read the list of relocation entries for the dynamic segment and manually adjust each relocation entry within the segment. Segment selectors may point to code within the segment (or another sub-segment within the loaded overall RTLink segment), to a low memory area of the executable that remains static in memory, or to the data segment (at a higher memory segment). All in all, you'd have to be extraordinarily patient to do all that by hand.

So that's why you wrote rtlink_decode, right? That's what it does?
Yes and no. A big part of what it does is indeed doing the above to create a new executable suitable for disassembly. This includes laying out all the dynamic segments sequentially (without their segment headers and relocation lists), handling relocation fixups, and adjusting the thunk methods to point to their methods in the decoded executable. However, another problem crops up in the handling of the data segment.

In my experience with RTLink, I've come across two types of data segments:
  1. In the case of the later Legend Entertainment games, the executable has a single RTLink segment, with the remainder of the segments coming from an OVL file. The single executable segment is for the main data segment as well as a few other miscellaneous segments.
  2. In the case of the MADS games, the data segment isn't an RTLink segment, but all the RTLink segments follow it in the executable.
In both cases, we have a problem doing a proper disassembly. Executables are normally expected to have the data segment at the end of the program, because the data segment may be longer than the end of the executable. For example, a game's data segment may only have 1Kb of pre-set values which are stored in the executable, but it still requires 40Kb of unallocated/uninitialized space. That's why you'll frequently see, when you do a disassembly of a program, areas at the end of a data segment with '?' mark values, indicating the memory isn't part of the executable, so doesn't have any specific value when the program starts.

So if we did just lay the segments end to end, the data segment, coming before other dynamic segments, would end up being shorter than it should be, and a lot of the references to data within it would end up wrapping onto the following dynamic segments in the reworked executable. To avoid this, the rtlink_decode tool ensures that the data segment falls at the end of the generated executable, after all the other segments. This, however, causes its own share of problems. All the existing references to the data segment refer to where the data segment was expected to be loaded in memory, not to where the data segment actually is in the new executable. Because of this, all the references to the data segment in the executable have to be adjusted accordingly.
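The idea can be sketched in a few lines of Python. This is only an illustration of the layout rule, not the tool's actual code: the segment names, the byte sizes, and both helper functions are made up for the example.

```python
PARAGRAPH = 16  # bytes per real-mode paragraph


def layout_segments(sizes: dict[str, int], data_name: str, base: int = 0) -> dict[str, int]:
    """Assign each segment a starting paragraph, placing `data_name` last.

    `sizes` maps a (hypothetical) segment name to its size in bytes;
    `base` is the paragraph at which the first segment starts.
    """
    order = [name for name in sizes if name != data_name] + [data_name]
    placement = {}
    next_paragraph = base
    for name in order:
        placement[name] = next_paragraph
        # Round each size up to a whole number of paragraphs.
        next_paragraph += (sizes[name] + PARAGRAPH - 1) // PARAGRAPH
    return placement


def remap_data_segment(value: int, old_data: int, new_data: int) -> int:
    """Rewrite a segment value that referred to the data segment's old home."""
    return new_data if value == old_data else value
```

With the data segment placed last, nothing follows it in the file, so its uninitialized tail can extend past the end of the executable without running into another segment. Every relocated word that held the old data segment value then gets passed through something like remap_data_segment.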

Ouch! Sounds fiddly.

It is. And it took a lot of messing around to get right. Even then, that's not the entirety of the picture. For Companions of Xanth, the Legend game I used for testing when rebuilding the tool, the data segment has some extra gotchas.. It contains segment references into the middle of the memory area RTLink segments are loaded into. Presumably these are used in some special controlled circumstances when a specific segment (or segments) are loaded to access particular data. But it's impossible to know without understanding the game a lot better.

Worse, the presence of the references was screwing up some of the loaded dynamic segments in the disassembly, causing them to be split in half. To handle this, the tool explicitly looks for such "bad" references in the data segment, and removes the relocation entries for them. This way, the value in the data segment will remain as a static word, and the segments don't get incorrectly split up. The user can always then later manually set up a pointer to an appropriate segment if they wish. This handles the bulk of such errors, but in Xanth at least, there are still references in the low part of the executable (that remains static in memory) to locations within the RTLink segments. Since I can't know which particular RTLink segment is meant to be loaded when the code they're in is called, these few remaining references will have to be later manually adjusted as well.
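A filter along these lines could look like the Python sketch below. Exactly how the tool decides what counts as a "bad" reference isn't described here, so the rule used (a value inside the overlay load area that isn't the start of any known segment) and all the names are assumptions for illustration.

```python
def drop_mid_segment_relocations(relocs, area_start, area_end, segment_starts):
    """Split relocations into (kept, dropped).

    `relocs` is a list of (offset_in_data_segment, segment_value) pairs.
    A relocation is dropped when its value lies inside the overlay load
    area [area_start, area_end) but isn't the start of a known segment.
    """
    kept, dropped = [], []
    for offset, value in relocs:
        inside_area = area_start <= value < area_end
        if inside_area and value not in segment_starts:
            dropped.append((offset, value))
        else:
            kept.append((offset, value))
    return kept, dropped
```

Returning the dropped entries as well makes it easy to print a list of the words that were left as static values, for anyone who wants to fix them up by hand later.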

So that's it?

Yep. After all these years, I'm finally able to generate a (mostly valid) "decoded" executable, and produce a clean disassembly of Companions of Xanth. I also, initially, had two separate versions of the tool: the old hacky version for MADS games, and the Legend variation for Legend-style RTLink usage. I've since updated my tool to properly handle MADS games, so now there's only the single rtlink_decode tool, and it can handle both variations 1 and 2.

Oh, wait.. what about the 3rd variation you mentioned?

Ah, yes, I didn't really get into that, did I? This version seems to be somewhat different than the other two variations. In this case, the RTLink code is stored in a separate rtlinkst.com file, and then loaded into memory. It then shifts part of the program downwards in memory, and uses its own relocation table to manually process relocation entries on the shifted code. This variation is proving tricky to disassemble, so whilst I have located the segment list, I still need to:
  • Figure out how relocation data is encoded. I think I've located the correct data in the executable, but the code RTLink uses to update relocation entries is pretty nasty and overcomplicated.
  • Work out how much of the start of the executable to remove, so that the produced executable doesn't have any of the old code at the start that gets overwritten.
  • Find the thunk methods, and see whether the existing code will handle them.

Hopefully I can figure out the remaining details for the third variation soon. The goal is to have a tool that both myself and others can use in the future to help them disassemble any game that used RTLink/Plus. Then no-one else will have to go through all the frustrations that I did trying to deal with this %#@! thing.

Saturday, October 31, 2015

Sherlock Testing Resounding Success

Hi everyone,

It seems like the testing of the Sherlock games has been a success. There were lots of bugs reported, and all the ones reported so far have been fixed. The foreign language versions haven't all been fully tested yet, but hopefully now the immediate crashes with conversations and inventory in both Serrated Scalpel and Rose Tattoo foreign language versions have been fixed, and the rest of the games can be tested. I'd like to thank everyone who's tested so far for your efforts, and feel free to post any more bugs you come across. Though hopefully there won't be too many more to find :). And if anyone else has copies of either game, particularly different foreign versions, any other testers would be appreciated.

Right at the moment I'm at the sweet point where all the outstanding bugs for Sherlock have been resolved, so unless new ones come in, I can turn my attention to other things.

So, what's coming next for me? Several things:

Serrated Scalpel 3DO

Serrated Scalpel 3DO still isn't completable. There are some areas, such as the darts game, to fix up, and there are still some differences in sprite positioning that would need to be accounted for. It's somewhat constrained by the fact that we don't have any original source for it, and I don't have any experience with reverse engineering 3DO games. It may simply be a case of doing the best we can, with hardcoded fixes as necessary.

The 3DO version also has a few missing things compared to the PC version; notably it lacks the journal the PC version has. A "nice to have" for the future would be to reintroduce it, so that the 3DO version could be the definitive version of the game, containing all the PC version functionality as well as the video for all conversations. We'd likely create a tool that extracts necessary graphics for UI buttons and the journal background from the PC version, and produce a Dat file that ScummVM can use automatically when playing the 3DO version.

RTLink Overlay manager

Next, there's the RTLink/Plus overlay handling in Companions of Xanth. I'd previously had some luck writing a tool to process Rex Nebular and create a flat executable with all the segments suitable for disassembling, but doing the same for Companions of Xanth proved elusive. Over the years I made several attempts, but none bore fruit. Until now. As of yesterday, I was finally able to write a new version of the tool that successfully generated a flat executable that could be disassembled. So one day, after a great deal of disassembly work, ScummVM may support the game.

My only disappointment is that RTLink/Plus seems to have had quite a number of variations. Rex and Xanth's RTLink mechanisms were fundamentally similar, with all the RTLink code, segment list, and method thunks/stubs in the executable and/or overlay file. Several other games with RTLink that I tried, however, seem to use a bizarre alternate method where a file called 'rtlinkst.com' is loaded, then an '.RTL' file for the game, and only then the game executable. Which means that my tool doesn't work with them, and I'd need to figure out this new mechanism from scratch if I want the tool to be general-purpose enough to handle any RTLink game anyone might want to use it on in the future.

Guess that can be another long-term project to muck around with. Hopefully it won't take as long as it did to finally get Xanth properly disassembled. :) I'll probably make another posting in a day or so about RTLink in more detail, for those that are interested.

Might & Magic, World of Xeen

Next there's my work on re-implementing Might & Magic - World of Xeen (and Clouds of Xeen, Dark Side of Xeen, and Swords of Xeen). With some free time last weekend, I finally returned to working on them, and was finally able to properly disassemble the algorithm the original used for scaling. Sprites are now correctly scaled, whereas before I was only using rough guess scaling code I'd nicked from another game engine. I've also fixed some other bugs with drawing outdoor areas (outside town). So as of now the game scenes display properly! :)



That's right, you can now walk around, fight monsters, go visit the various buildings in town, and even leave the town! In fact, most of the functionality for the games is already implemented. Apart from lots of testing and minor bugfixes that will be needed, only the following major areas remain to be implemented:

  • Introduction/ending sequences for the games
  • Character management and title screens
  • Sound. I'm hoping I can simply slot in one of ScummVM's audio decoders without much further work.
  • Savegames

At the moment I'm concentrating on getting World of Xeen (which combines Might & Magic 4 and 5 together) working. But then afterwards I'll implement separate support for 4 and 5, as well as for Swords of Xeen. It might even be feasible to handle Might and Magic 3 - Isles of Terra as well, since I'm given to understand the engines are nearly the same.

Those interested in following the progress can see it at the brand-spanking new RogueVM Github account. That's right; after all the years of idle talk, I've finally set up a place to properly store RPG related game engines. No website yet, but at least it's a start. :) I'll likely spend the near-term focusing mostly on finishing support for the game before I move onto any other adventure games, considering how far along the engine is already.

Return of the Phantom

Strangerke has put a lot of work recently into implementing scene logic for Return of the Phantom, the next MADS game that was published after Rex Nebular. There are quite a few stubs for missing engine functionality that were added in, though. So when I do return to working on adventures, it will likely be to assist him in completing the game.

DreamMaster.

Monday, July 6, 2015

Rose Tattoo is in progress

Hi all,

Work is progressing well on Rose Tattoo, and as of tonight I hit a major milestone.. the entire game intro sequence is now completable. See the screenshots below:



There are still some minor graphic glitches at various points, and one of the scenes which is a double-side scene isn't properly scrolled horizontally yet, but even so, it's a great step forward in supporting the game. We're actually lucky in that Rose Tattoo implemented all of the introduction using standard game scenes. So it saved a lot of effort implementing manual introduction code, as had to be done for Scalpel.

Now that the introduction is working, more or less, I'll be devoting more time to implementing the game-play. I'd already been spending some of my time working on it, so some interaction is possible.. you can look at objects, open up the inventory, and conversations with characters are partially working. The game map is also already implemented, so you can get to other game locations as well.

On another subject, I had good luck implementing the original EA logo at the start of Serrated Scalpel. I was able to complete support for it in the TsAGE engine, and then used that as a basis for copying necessary code into the Sherlock engine. With some most welcome assistance from others, the EA logo now displays when you start the game, just like the original does.

Finally, on yet another tack, there have been some promising first steps towards supporting the 3DO version of Serrated Scalpel by m-kiewitz, with some assistance from clone2727. The 3DO release was a superior version of the game, and included 16-bit color, video portraits, and full speech for every conversation in the game. Supporting this would be great. It would be nice if, one day, the 3DO version could be re-released with ScummVM, so a wider audience could properly enjoy the game.

Friday, May 22, 2015

Happy Birthday, Sir Arthur Conan Doyle

Hey everyone.

Looking back it seems, to my chagrin, that it's now been over a year since my last news posting. Whoops :P. Not that I haven't been keeping busy over the last year or so, with the release of Voyeur, Amazon - Guardians of Eden, Rex Nebular (finally), and of course, the newest game.. The Lost Files of Sherlock Holmes: Case of the Serrated Scalpel.



Many thanks go to EA for providing us access to the original source for this game. Also to forum user sirlemonhead, and to James Ferguson, who patiently over the last few years tried to make this happen. I've always been a big Sherlock Holmes fan, so it was fun to work on this project.  It feels fitting to merge the game into master on the 22nd May, which is the birthday of the character's creator, Sir Arthur Conan Doyle. The engine isn't quite ready for serious testing yet.. it's still missing music playback, and there's also a starting logo animation that's not present. It should be finished soon, though, so expect to see an official testing announcement in the near future.

So.. as things stand, what am I up to right now?

The Logo
Apart from the game proper, one of the more interesting things about the game is at the very beginning, where the publisher EA logo is shown. This logo display was actually implemented in a separate executable using the TsAGE engine, of all things. Original source for this couldn't be located, so it means that I'm having to reverse engineer it. Luckily, since we've already had experience with several other TsAGE titles, I've been able to make excellent progress in figuring out all the various TsAGE classes and their methods within the executable.

At the current point in time, I've identified the bulk of the core TsAGE classes, and the custom logic for the "game", which is contained in a single scene class. This scene class consists of several scene objects, a few palette containers, and an "Action" class for coordinating what happens in the logo display. There's only a minor variation in how object sprites are loaded compared to the games that I still need to figure out.

I've already started implementing a new sub-module within TsAGE for the game logo. Once I finish that, it will be easier to analyse all the movement and frame changes of the images within the scene. Hopefully, based on that, I'll be able to simulate a similar sequence in our new Sherlock engine using the bare necessities from TsAGE - likely just the RLB archive manager and sprite loader. Particularly given the thoughtfulness of EA in providing us access to the original source, it would be nice to give them (the company) proper attribution by showing their logo just like the original game does.

The Sequel
Apart from that, we have also been given access to source for the sequel, The Case of the Rose Tattoo. Implementing this is likely to be much more challenging, as the sequel changed over to a 640x480 display, and significantly altered the user interface. As such, it's likely it will need a lot of re-factoring of the code base to add support for it to the existing engine. If you thought a lot of re-factoring was done during the pull request, you ain't seen nothing yet. :)

I'm also somewhat constrained by the fact that the original uses DOS4GW and a 32-bit code segment. Whilst we do have the original source, I need to be able to run the game in DosBox so I can actually see the code running, and check registers and memory contents at given points in the program. I've had some significant trouble with the DosBox debugger, trying to set breakpoints in the code so I can inspect the game's state. Doing so either crashes DosBox or the game executable, or the breakpoints simply aren't hit.

So far, I've only done some preliminary loading of scene resources in the second game, and the lack of a way to display the program state meant that I had to take a more laborious route of poring over the various resource structures and scene loading code in both games, to try and figure out what the differences were between the two, so my code can support it. Likely, as I proceed with implementing more of the game, this will cause real issues that will make finding bugs a lot harder.

World of Xeen
It's been somewhat on a back-burner since I started work on the Sherlock Holmes games, but I had previously made real progress on re-implementing World of Xeen using the ScummVM framework. See below:


As you can see, I have much of the game interface implemented. You can move around, fight monsters (with a few minor glitches), and even leave the town. There are really only a few main areas left to implement, which include sound support, logic for all the various spells, savegames, and the intro/ending cut-scenes. I probably won't return to working on it until after Rose Tattoo is finished, though. But when I do, I don't anticipate it will take long to finish the remaining areas, and then it would simply be a matter of playing the game through in earnest, fixing minor bugs as they're identified.

Of course, as an RPG, World of Xeen is a bit outside the scope of ScummVM proper. At that point, it may be time to finally launch the RPG sister project Strangerke and I have been wanting to do. :)