In December, I wrote an entry here about the cost of dialogues in games, promising that I'd write a follow-up once we figured out exactly how we were going to tackle this in Dragon Commander.
The problem we needed to solve is that because Dragon Commander features tons of choices & consequences, it also features a veritable avalanche of dialogue that somehow needs to be presented to our players.
In our dream scenario, all of this dialogue is fully animated and voiced, but because we’re dealing with several hours of dialogue in multiple languages, the cost of doing this is quite high and it’d actually be insane to animate it all.
So after long discussions and deliberations, we decided to do the insane thing.
Here’s a video of something that transpired in the Larian offices a month ago. I suggest you have a look at it and then read on…
As you can see, we bought a facial capture system – I guess it’s obvious from the video that some in our team were quite excited about this momentous event 😉
Our plan of approach is to put those cameras you see in the video in the voice recording booth, and then use this captured data to put emotion in the faces of our 3D protagonists and antagonists.
This facial capture data will then be overlaid on top of motion captured body animations (we also have a motion capture system in the office), and the end result should be believable dialogues when talking to all of the characters in Dragon Commander.
At least that’s the plan.
The decision to do it this way came after checking plenty of other solutions, ranging from trying to set something up ourselves with Kinect devices (cheapest) to hiring simultaneous body & facial capture studios (most expensive).
The latter had prices in the range of $1000 to $2000 per minute, which would have cost us between $0.5M and $1M. I actually contemplated this for some time, but then decided against it. I figured that in the end we'd be best served by a homebrew solution, even if it causes a bit more pain and might not give us the highest quality in the short term.
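To put that range in context: at those rates, $0.5M to $1M corresponds to roughly 500 minutes of captured dialogue, a bit over eight hours, which lines up with the "several hours" I mentioned earlier. Here's the arithmetic spelled out; note that the 500-minute figure is my rough assumption, not a final script count:

```python
# Back-of-the-envelope cost of outsourcing simultaneous body & facial capture.
# The per-minute rates are the studio quotes mentioned above; the total number
# of dialogue minutes is an assumption ("several hours"), not an exact count.
DIALOGUE_MINUTES = 500            # assumed: a bit over eight hours of dialogue
RATE_LOW, RATE_HIGH = 1000, 2000  # quoted rates, US$ per captured minute

low_estimate = DIALOGUE_MINUTES * RATE_LOW    # $500,000
high_estimate = DIALOGUE_MINUTES * RATE_HIGH  # $1,000,000
print(f"Outsourced capture: ${low_estimate:,} to ${high_estimate:,}")
```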
My thinking was that for whatever game we make, we'll always need to hire voice actors, so that's a cost we'll have to carry in any case. And while they are acting, they are already generating the data we need – we just need the ability to extract that data and project it onto 3D characters.
The equipment we bought allows us to record the facial marker data at 100 frames per second from seven directions. Should that turn out not to be enough, we can always add extra cameras, but so far the raw data looks good enough to work with.
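As a side note, it's easy to get a feel for how much raw tracking footage that setup generates per minute in the booth; this just multiplies out the numbers above:

```python
# Rough estimate of the raw capture volume for one minute of recorded dialogue.
CAMERAS = 7   # capture directions mentioned above
FPS = 100     # marker frames per second, per camera

frames_per_minute = CAMERAS * FPS * 60   # 42,000 camera frames per minute
print(f"{frames_per_minute:,} camera frames per minute of dialogue")
```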
So if we organize ourselves such that for every future recording session, we record the facial expressions of the actors in addition to their voice, we should have sufficient base material to work from. Obviously, this does cause extra complications in the recording booth as we’re increasing actor/studio time and thus recording cost, but from the tests we’ve done, it looks like it should be manageable.
The real problem is mapping this data onto our 3D characters, preferably in an automated fashion, and if that's not possible, at least in a semi-automated manner. The software solutions we have at the moment give us reasonable results, but it's clear that the current off-the-shelf solutions don't let us use all of the rich detail that's present in the raw data. In other words: when you look at the markers that were tracked, and then at how this translates into animations, you see there's a significant loss of data.
On top of that, manual labor is required to fix wrong interpretations of the data. This is probably going to be the most expensive part of the entire operation, but it's also the area with the largest opportunity to reduce costs, if we can be clever about it.
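To give a feel for what that mapping step involves: one common textbook approach (I'm sketching a generic technique here, not our actual pipeline) is to express each captured frame of marker positions as a weighted mix of the character rig's blendshapes, which boils down to a least-squares solve per frame. Whatever the blendshape basis can't express is precisely the detail that gets lost.

```python
import numpy as np

def fit_blendshape_weights(markers, neutral, blend_deltas):
    """Fit per-frame blendshape weights to captured marker positions.

    markers:      (M, 3) tracked marker positions for one frame
    neutral:      (M, 3) marker positions on the neutral face
    blend_deltas: (B, M, 3) marker displacement of each blendshape at full strength
    """
    B, M, _ = blend_deltas.shape
    # Each blendshape's displacements become one column of A; we then solve
    # A @ w ~= (markers - neutral) in the least-squares sense.
    A = blend_deltas.reshape(B, M * 3).T      # (3M, B)
    b = (markers - neutral).reshape(M * 3)    # (3M,)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Crude stand-in for a proper constrained solve: clamp weights to [0, 1].
    return np.clip(w, 0.0, 1.0)

# Toy usage with made-up data, just to show the shapes involved.
rng = np.random.default_rng(0)
neutral = rng.normal(size=(30, 3))                # 30 markers
deltas = rng.normal(scale=0.1, size=(12, 30, 3))  # 12 blendshapes
frame = neutral + 0.5 * deltas[3]                 # a face driven by one shape
print(fit_blendshape_weights(frame, neutral, deltas))
```

The residual that survives a solve like this is the "loss of data" in practice, and the frames where the solver picks a wrong but plausible combination of shapes are exactly what the manual cleanup pass has to catch.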
To be honest, I have no clue exactly how much we’ll be spending in the end on this, but I do know a couple of things already:
a) Whatever result we get, it’s going to be better than what we did in Divinity II: The Dragon Knight Saga
b) If we invest sufficient time into working on the mapping process, it can only get better, which will benefit not only Dragon Commander but also our future games
Before concluding, there's one question I still need to answer: why did we decide to do the insane thing after all?
Some people commented that we shouldn't be doing this, and should instead focus on the gameplay. My gut says they're right, but gameplay is something you create from many layers, and visualisation is definitely an important aspect. In this particular game specifically, I really think it's vital that we provide you with believable characters, and we should invest sufficient resources to try to achieve that.
This game is about you living the life of a Dragon Commander who's forging an empire. That means that in addition to giving you the experience of dealing with your troops and deciding on what strategy/tactics to use, we also want you to deal with politics, the media and whatever social life you have left.
To do that convincingly, we need characters that look lifelike enough. Hence, facial expressions, motion capture, full voice recordings, truckloads of cash 😉
Whether or not it’ll be worth it, I’ll only know when the results are in, but I have good hopes.
I’ll keep you posted on how it goes. If all goes well, the E3 footage should show you how it turned out, because in addition to showing off project E there, we’re also planning on showing the multiplayer of Dragon Commander, together with more details on everything you can do in the game.
Have a great weekend!