Xiang Yin
Bytedance AI Lab
Verified email at bytedance.com
Title
Cited by
Year
Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models
R Huang, J Huang, D Yang, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, Z Zhao
International Conference on Machine Learning, 13916-13932, 2023
Cited by: 125, Year: 2023
ByteSing: A Chinese singing voice synthesis system using duration allocated encoder-decoder acoustic models and WaveRNN vocoders
Y Gu, X Yin, Y Rao, Y Wan, B Tang, Y Zhang, J Chen, Y Wang, Z Ma
2021 12th International Symposium on Chinese Spoken Language Processing …, 2021
Cited by: 74, Year: 2021
PPG-based singing voice conversion with adversarial representation learning
Z Li, B Tang, X Yin, Y Wan, L Xu, C Shen, Z Ma
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 32, Year: 2021
Modeling F0 trajectories in hierarchically structured deep neural networks
X Yin, M Lei, Y Qian, FK Soong, L He, ZH Ling, LR Dai
Speech Communication 76, 82-92, 2016
Cited by: 32, Year: 2016
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis
J Pan, X Yin, Z Zhang, S Liu, Y Zhang, Z Ma, Y Wang
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 31, Year: 2020
Towards realistic visual dubbing with heterogeneous sources
T Xie, L Liao, C Bi, B Tang, X Yin, J Yang, M Wang, J Yao, Y Zhang, Z Ma
Proceedings of the 29th ACM International Conference on Multimedia, 1739-1747, 2021
Cited by: 26, Year: 2021
A hybrid text normalization system using multi-head self-attention for Mandarin
J Zhang, J Pan, X Yin, C Li, S Liu, Y Zhang, Y Wang, Z Ma
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 25, Year: 2020
Clinical efficacy of bone cement-injectable cannulated pedicle screw short segment fixation for lumbar spondylolisthesis with osteoporosis
Y Liu, J Xiao, X Yin, M Liu, J Zhao, P Liu, F Dai
Scientific Reports 10 (1), 3929, 2020
Cited by: 25, Year: 2020
Mega-TTS: Zero-shot text-to-speech at scale with intrinsic inductive bias
Z Jiang, Y Ren, Z Ye, J Liu, C Zhang, Q Yang, S Ji, R Huang, C Wang, ...
arXiv preprint arXiv:2306.03509, 2023
Cited by: 20, Year: 2023
Biomechanical influence of the surgical approaches, implant length and density in stabilizing ankylosing spondylitis cervical spine fracture
Y Liu, Z Wang, M Liu, X Yin, J Liu, J Zhao, P Liu
Scientific Reports 11 (1), 6023, 2021
Cited by: 16, Year: 2021
Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding
C Wang, Z Li, B Tang, X Yin, Y Wan, Y Yu, Z Ma
arXiv preprint arXiv:2110.04754, 2021
Cited by: 14, Year: 2021
Cross-speaker emotion transfer based on speaker condition layer normalization and semi-supervised training in text-to-speech
P Wu, J Pan, C Xu, J Zhang, L Wu, X Yin, Z Ma
arXiv preprint arXiv:2110.04153, 2021
Cited by: 14, Year: 2021
Fine-Grained Prosody Modeling in Neural Speech Synthesis Using ToBI Representation.
Y Zou, S Liu, X Yin, H Lin, C Wang, H Zhang, Z Ma
Interspeech, 3146-3150, 2021
Cited by: 14, Year: 2021
A chapter-wise understanding system for text-to-speech in Chinese novels
J Pan, L Wu, X Yin, P Wu, C Xu, Z Ma
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 13, Year: 2021
GeneFace++: Generalized and stable real-time audio-driven 3D talking face generation
Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao
arXiv preprint arXiv:2305.00787, 2023
Cited by: 12, Year: 2023
Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree.
X Yin, M Lei, Y Qian, FK Soong, L He, ZH Ling, LR Dai
INTERSPEECH, 2273-2277, 2014
Cited by: 12, Year: 2014
Make-An-Audio 2: Temporal-enhanced text-to-audio generation
J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ...
arXiv preprint arXiv:2305.18474, 2023
Cited by: 11, Year: 2023
Improving accent conversion with reference encoder and end-to-end text-to-speech
W Li, B Tang, X Yin, Y Zhao, W Li, K Wang, H Huang, Y Wang, Z Ma
arXiv preprint arXiv:2005.09271, 2020
Cited by: 11, Year: 2020
CLAPSpeech: Learning prosody from text context with contrastive language-audio pre-training
Z Ye, R Huang, Y Ren, Z Jiang, J Liu, J He, X Yin, Z Zhao
arXiv preprint arXiv:2305.10763, 2023
Cited by: 10, Year: 2023
Mega-TTS 2: Zero-shot text-to-speech with arbitrary length speech prompts
Z Jiang, J Liu, Y Ren, J He, C Zhang, Z Ye, P Wei, C Wang, X Yin, Z Ma, ...
arXiv preprint arXiv:2307.07218, 2023
Cited by: 9, Year: 2023