Paint Transformer: Feed Forward Neural Painting with Stroke Prediction
Neural painting refers to the procedure of producing a series of strokes for
a given image and non-photo-realistically recreating it using neural networks.
While reinforcement learning (RL) based agents can generate a stroke sequence
step by step for this task, it is not easy to train a stable RL agent. On the
other hand, stroke optimization methods search for a set of stroke parameters
iteratively in a large search space; such low efficiency significantly limits
their prevalence and practicality. Different from previous methods, in this
paper, we formulate the task as a set prediction problem and propose a novel
Transformer-based framework, dubbed Paint Transformer, to predict the
parameters of a stroke set with a feed forward network. This way, our model can
generate a set of strokes in parallel and obtain the final painting of size 512
* 512 in near real time. More importantly, since there is no dataset available
for training the Paint Transformer, we devise a self-training pipeline such
that it can be trained without any off-the-shelf dataset while still achieving
excellent generalization capability. Experiments demonstrate that our method
achieves better painting performance than previous ones with cheaper training
and inference costs. Codes and models are available.
Authors
Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang