what takes longer in LLMs, to encode tokens or to decode them and why

In LLMs, decoding tokens generally takes longer than encoding them. Encoding (the encoder in encoder-decoder models, or the prefill phase in decoder-only models) processes all input tokens in a single parallel forward pass, producing embeddings[1] suited to predictive tasks like classification. Decoding, by contrast, generates new text autoregressively: each output token requires its own forward pass, conditioned on all previously generated tokens, so tokens are produced one at a time[1]. This iterative, sequential nature means decoding time grows with the number of generated tokens, making it the slower phase compared to encoding.
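The asymmetry can be sketched with a toy model (not a real LLM): counting how many forward passes each phase needs, assuming encoding handles the whole prompt in one parallel pass while decoding needs one pass per generated token.

```python
def encode(prompt_tokens):
    """Encoding/prefill: all prompt tokens go through the model in one
    parallel forward pass, so the pass count is constant in prompt length."""
    forward_passes = 1
    return forward_passes

def decode(num_new_tokens):
    """Autoregressive decoding: token t+1 depends on token t, so each new
    token needs its own sequential forward pass."""
    forward_passes = 0
    for _ in range(num_new_tokens):
        forward_passes += 1  # one full model pass per generated token
    return forward_passes

print(encode(["tok"] * 100))  # -> 1
print(decode(100))            # -> 100
```

Real inference engines reduce the per-token cost with KV caching (reusing attention keys/values from earlier steps), but the one-pass-per-token structure of decoding remains.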