We present a novel semantic-driven multi-view reconstruction technique for producing realistic 3D human models. Our approach borrows the fragmentation concept from Cubist painting, in which the human body is decomposed into semantically meaningful fragments to convey space and movement. We first employ deep-learning-based skeleton estimation to warp a proxy human model from the canonical pose to the target multi-view input. We then conduct 3D fragment labeling on the warped model to separate the different body parts. Finally, we use the normal, depth, and fragment label of the proxy model as priors in the multi-view stereo (MVS) reconstruction process. Comprehensive experiments show that our reconstruction technique outperforms state-of-the-art methods in robustness and accuracy, especially near occlusion boundaries and in textureless regions. In particular, it significantly reduces the “adhesive” artifacts commonly observed in MVS, which incorrectly stitch different body parts together.