Personally, I am fine with first-person. What disappoints me is that they haven't implemented better support for dynamic reflections in the environment, or at least some limited way to see V's facial expressions and mannerisms during interactions with others. I want to go into a clothing store and see myself in the mirrors. Not smart mirrors either - just show my reflection when I walk by.
Yes, obviously performance is an issue right now, but as a developer with over twenty years in the industry, I find that most of these sorts of problems persist only because nobody has taken a step back and asked, "Is there a different way this technical challenge could be approached?"
Why not keep a second, full-body V overlaid on the first-person V, rendered only while the camera is temporarily switched out to third person? Have an optional third-person cam available for cut scenes. Also make it available on a held button, the same way that middle-click switches to the rear vehicle view for as long as the button is held. Photo mode is all well and good, but it doesn't matter what you were doing when you pressed N - either you get switched to a canned "action" pose which usually doesn't match what you were doing, or you get turned into a bowling pin with a dead-eye straight-ahead stare. How about letting me just see V standing there in the street, casually glancing around and shuffling her feet a little, maybe with mild reactions to the things happening around her?
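To make the idea concrete, here is a rough sketch of the toggle logic I have in mind. It is only an illustration: the class and field names (PlayerView, view_button_held, cutscene_third_person and so on) are made up for this post and have nothing to do with how the game's engine is actually structured. The point is just that the full-body V is flagged for drawing only while the third-person camera is active.

```python
from dataclasses import dataclass


@dataclass
class PlayerView:
    """Hypothetical view state: decides which version of V gets drawn."""
    third_person: bool = False

    def update(self, view_button_held: bool, in_cutscene: bool = False,
               cutscene_third_person: bool = False) -> None:
        # Third person is active only while the button is held, or during a
        # cut scene when the player has opted in to the third-person cam.
        self.third_person = view_button_held or (
            in_cutscene and cutscene_third_person
        )

    @property
    def render_full_body(self) -> bool:
        # The full-body V is flagged for drawing only in third person.
        return self.third_person

    @property
    def render_first_person_arms(self) -> bool:
        # The usual first-person arms are drawn the rest of the time.
        return not self.third_person
```

Calling update() once per frame with the current button state would give the hold-to-look behaviour: release the button and the view falls straight back to first person.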
For me, immersion is "V is me", which makes sense from a first-person perspective, but not if I have almost no freedom to look at myself or to build a connection between my face/body and my interactions with the rest of the world. I would get that connection if I could walk past mirrors or reflective windows and get visual confirmation that I am really here, doing this thing.