Single Speaker (LJSpeech Dataset)

Utterance | Ground truth | UDPNet (fsteps: 1200, rsteps: 8) | UDPNet (fsteps: 960, rsteps: 8) | UDPNet (fsteps: 720, rsteps: 8) | UDPNet (fsteps: 240, rsteps: 8)
#1
#2
#3
#4
#5

Unseen Speakers (VCTK Dataset)

Utterance | Ground truth | UDPNet (fsteps: 1200, rsteps: 8) | UDPNet (fsteps: 960, rsteps: 8) | UDPNet (fsteps: 720, rsteps: 8) | UDPNet (fsteps: 240, rsteps: 8)
#1
#2
#3
#4
#5