Do they even performance capture/mocap it? I thought CDPR developed a system that moves bones and changes the facial structure depending on the line that is spoken...
It's two different technologies, although they probably do come together, and may be applied at the same time.
There's been technology around for a long time to handle lip-synching, to move the mouth and jaw areas automatically based on the phonemes of the speech. I'd expect it to still need manual tweaking, but it does take a LOT of the work out of synching, especially where it's being localised into many languages.
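To give a rough idea of what "based on the phonemes" means, here's a minimal sketch in Python. To be clear, this is not CDPR's system or any real tool's API: the table, the shape names and the `visemes_for` function are all made up for illustration. The idea is just that each phoneme maps to a mouth shape (a "viseme"), and the timed phonemes of a line become mouth keyframes.

```python
# Hypothetical phoneme-to-viseme table; real tools use larger, tuned sets.
PHONEME_TO_VISEME = {
    "P": "closed", "B": "closed", "M": "closed",
    "F": "lip_to_teeth", "V": "lip_to_teeth",
    "AA": "open", "AE": "open",
    "OW": "rounded", "UW": "rounded",
    "IY": "wide", "EH": "wide",
}


def visemes_for(timed_phonemes):
    """Turn (start_seconds, phoneme) pairs into mouth-shape keyframes."""
    keyframes = []
    for start, phoneme in timed_phonemes:
        # Unknown phonemes fall back to a neutral mouth shape.
        shape = PHONEME_TO_VISEME.get(phoneme, "neutral")
        # Skip a keyframe if the mouth is already in that shape.
        if keyframes and keyframes[-1][1] == shape:
            continue
        keyframes.append((start, shape))
    return keyframes


# Made-up timings for a short line of dialogue.
line = [(0.00, "M"), (0.08, "AA"), (0.20, "M"), (0.28, "B"), (0.35, "OW")]
print(visemes_for(line))
```

Swap in a different language's phoneme timings and the same table drives the mouth, which is why this kind of thing saves so much work on localisation.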
Body language and facial expression still need to be added though, and that's where the mocap comes in.
Just compare TW1, which I would expect to have used lip-synch tech but doesn't do much (if any) facial expression, to a modern game.