The workload of servers in cloud clusters is a key factor in server maintenance and task scheduling, so balancing and optimizing hardware and computation resources deserves more attention. However, we have observed that the disordered execution of running applications and batch jobs seriously reduces server efficiency. To improve workload prediction accuracy, this paper proposes an approach based on a long short-term memory (LSTM) encoder-decoder network with an attention mechanism. First, the approach extracts the sequential and contextual features of historical workload data through the encoder network. Second, the model integrates the attention mechanism into the decoder network, which then produces the prediction for batch workloads. Third, experiments on the Alibaba and Dinda workload trace datasets demonstrate that our method achieves state-of-the-art performance in mixed workload prediction in cloud computing environments. Furthermore, we propose a scroll prediction method, which splits a long prediction sequence into several short sequences so that prediction accuracy can be monitored and controlled. This work helps to dynamically guide configuration for workload balancing.
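
The scroll prediction idea described above can be illustrated with a minimal sketch. The abstract does not give the procedure in detail, so everything here is an assumption for illustration: the names `scroll_predict` and `mean_predictor` are hypothetical, the predictor is a trivial stand-in for the trained LSTM encoder-decoder, and feeding each predicted chunk back into the context (rather than waiting for observed values) is one possible design, not necessarily the one used in this work.

```python
from typing import Callable, List, Sequence

# A predictor takes the context so far and a number of steps,
# and returns that many forecast values.
Predictor = Callable[[Sequence[float], int], List[float]]


def mean_predictor(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical stand-in for the trained model.

    Repeats the mean of the last three observations `steps` times.
    """
    window = list(history[-3:])
    value = sum(window) / len(window)
    return [value] * steps


def scroll_predict(
    history: Sequence[float],
    horizon: int,
    chunk: int,
    predict: Predictor = mean_predictor,
) -> List[float]:
    """Forecast `horizon` steps as a series of short chunks.

    Each chunk has at most `chunk` steps. After a chunk is predicted,
    it is appended to the context used for the next chunk (an assumption;
    observed values could be substituted when they become available).
    """
    if not history:
        raise ValueError("history must not be empty")
    if chunk < 1:
        raise ValueError("chunk must be at least 1")

    context = list(history)
    forecast: List[float] = []
    while len(forecast) < horizon:
        steps = min(chunk, horizon - len(forecast))
        piece = predict(context, steps)
        forecast.extend(piece)
        context.extend(piece)  # scroll the context forward
    return forecast


if __name__ == "__main__":
    # Five-step forecast produced as chunks of 2, 2, and 1 steps.
    print(scroll_predict([1.0, 2.0, 3.0], horizon=5, chunk=2))
```

Because each chunk is short, its error can be checked against incoming measurements before the next chunk is produced, which is the kind of monitoring a chunked forecast makes possible.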