Dialogs have always been an important thing in Larian games.
Because text is just text, ever since Divine Divinity we’ve always been trying to bring life to our in-game conversations. I think that so far we managed to do a decent job, but it’s fair to say that there have been some slip-ups. The most memorable of these was of course the voice of the Death Knight in Beyond Divinity. That one was judged to be so bad by our forum community (who are a very vocal bunch) that we had to do a tactical retreat and rerecord the entire thing.
Nowadays, bringing dialogs to life is no longer a question of voice recordings alone, but also of having perfect lip synchronization and fitting non-verbal communication on the virtual speakers. The latter is certainly the case when you’re dealing with high quality character artwork, which is more or less the norm in these modern times.
While cool for players, this requirement is pretty uncool for poor and not-so-poor independent developers, and I already talked extensively about the problems this has been causing for us during the development of Dragon Commander. Because there’s a shitload of choice and consequence going on in between game turns, the dialog asset requirements are pretty steep in that game, and when we did the initial research on how much it was going to cost us to animate all the dialogs, we came up with numbers that were bananas (between half a million and one million US$ for one language!).
For a long time, we actually thought that we’d have to resort to plan B, which was just recording and animating the opening lines, without having anything in terms of animation or voice for the rest of the conversation.
It’s been almost a year since I first wrote about this particular problem here, and I’m really glad to say now that we finally cracked it (just in time 😉 ). It’s amazing what a courageous lead animator who refuses to admit defeat can do if you give him a few good programmers and a bunch of cool cameras.
If you haven’t already, I suggest you check the video – it shows the results we’re getting today and while you’ve seen results like this in other games, the amazing part (for us at least) is how fast and cost-effective we can now bring our dialogs to life. No more plan B’s, that’s for sure!
The solution to the problem began with us acquiring a bunch of Optitrack cameras to try out facial capturing. We were quite hyped about buying these cameras at the time and made this memorable unboxing video to mark the occasion. When making that video, we actually didn’t realize exactly how well suited to its task this system was going to be.
You see, by doubling the number of cameras, the system could be used not only for facial capturing, but also for motion capturing, and really clean capturing at that. Originally we would never have contemplated this, but these cameras aren’t all that expensive and you really get clean results from them (i.e. you don’t have too much hand work afterwards removing the noise from the captured files).
Having clean motion capturing abilities allowed us to proceed to the next step of our cunning plan, i.e. mixing motion captured body animations with facial captured emotional expressions and automatically generated phoneme-based lip animations. To be honest, we didn’t expect the results to be that good. They were remarkably convincing and not that different from the really expensive full body/facial capture tests we’d seen from various studios.
So, next we started implementing all the software needed to tie all of this together and figuring out a pipeline for doing this in large quantities. Probably the coolest thing here is that the pipeline is almost fully automatic and requires almost no handwork. Manual labor is where the money usually disappears in this kind of endeavour when you’re doing large-scale content creation (there are about 9000 lines of dialog or so).
Today, all we need to get some life in the dialogs is our writer deciding what emotions are going to be present in a dialog and a certain process being followed during the recording of the voices. The rest comes for free, meaning that we can adapt the entire thing for each language, giving us pretty cool localization abilities. And all for a fairly modest price. There obviously was the R&D cost but compared to what we were going to be charged by all the service providers we contacted, that’s almost immaterial. The price of the cameras is also ridiculously low considering what you get for your money. And as a consequence, we could invest more in other aspects of the game, which obviously always is a topic of interest for any developer.
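For the programmers among you, here is a rough idea of what "the rest comes for free" could look like in code. This is only a minimal sketch and not our actual tooling: the clip names, the `DialogLine` and `AnimationJob` types and the `build_jobs` function are all made up for illustration, and it assumes the phonemes for each language have already been extracted from the voice recordings by some other step.

```python
from dataclasses import dataclass

# Hypothetical clip libraries keyed by the emotion tag the writer picks.
# In a real pipeline these would point at captured animation files.
BODY_CLIPS = {"neutral": "body_neutral_01", "angry": "body_angry_01"}
FACE_CLIPS = {"neutral": "face_neutral_01", "angry": "face_angry_01"}


@dataclass(frozen=True)
class DialogLine:
    line_id: str
    emotion: str    # chosen by the writer
    phonemes: dict  # language code -> list of phonemes (assumed input)


@dataclass(frozen=True)
class AnimationJob:
    line_id: str
    language: str
    body_clip: str    # motion captured body animation
    face_clip: str    # facial captured emotional expression
    lip_track: tuple  # phonemes driving the generated lip animation


def build_jobs(line):
    """Create one animation job per language for a single dialog line."""
    # Unknown emotions fall back to the neutral clips.
    body = BODY_CLIPS.get(line.emotion, BODY_CLIPS["neutral"])
    face = FACE_CLIPS.get(line.emotion, FACE_CLIPS["neutral"])
    # Body and face are shared across languages; only the lip track changes.
    return [
        AnimationJob(line.line_id, language, body, face, tuple(phonemes))
        for language, phonemes in sorted(line.phonemes.items())
    ]


if __name__ == "__main__":
    line = DialogLine("line_0001", "angry", {"en": ["HH", "AH"], "de": ["H", "A"]})
    for job in build_jobs(line):
        print(job)
```

The point of the sketch is the split: the emotion tag selects the body and face layers once, and only the lip layer is rebuilt per language, which is why adding a language does not mean redoing the animation work.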
Isn’t that cool? 😉
Well, I think it’s cool – and it shows that a bit of R&D can go a long way to get you results that ordinarily would be out of scope for your budget. You just need to be able to create the time.