Recommender systems (RecSys) are the engine of the modern internet and play a critical role in driving user engagement, helping users find relevant items that match preferences learned from their historical interactions with other items. In many recommendation domains, such as news, e-commerce, and streaming video services, users may be anonymous or untraceable, their histories may be short, and their tastes may change rapidly. Providing recommendations based purely on the interactions within the current session is therefore an important and challenging problem. The field of NLP has evolved significantly over the past decade, particularly due to the increased use of deep learning. State-of-the-art NLP approaches have inspired RecSys practitioners and researchers to adapt those architectures, especially for sequential and session-based recommendation problems. In this work, we investigate the effectiveness of transformer-based architectures for next-click prediction on short user sequences in session-based recommendation tasks. In addition, we explore whether combining different transformer architectures with training techniques such as masked language modeling (MLM), permutation language modeling (PLM), and replaced token detection (RTD) is beneficial in session-based recommendation tasks with short sequences of interactions. To bridge the gap between modern NLP and sequential and session-based recommendation tasks, we developed a RecSys framework that integrates directly with the Hugging Face (HF) Transformers library. Experiments with this framework showed that training XLNet with RTD, to our knowledge a novel combination, improved NDCG@20 by 14.15% and 9.75% on the REES46 and YOOCHOOSE e-commerce datasets, respectively, relative to the best baseline. The framework provides the functionality needed to build sequential and session-based recommendation models with Transformers.
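
To make the reported metric concrete, NDCG@20 for next-click prediction can be sketched as below. This is a minimal illustration, not the framework's own implementation: it assumes binary relevance with exactly one relevant item per prediction (the item actually clicked next), so the ideal DCG is 1 and the score reduces to a rank discount. The function names `ndcg_at_k` and `mean_ndcg_at_k` are hypothetical.

```python
import math
from typing import Sequence


def ndcg_at_k(ranked_items: Sequence[int], target_item: int, k: int = 20) -> float:
    """NDCG@k for one prediction, assuming a single relevant item (the next click).

    ranked_items: item ids ordered from highest to lowest predicted score.
    target_item:  id of the item the user actually clicked next.
    """
    top_k = list(ranked_items[:k])
    if target_item not in top_k:
        return 0.0  # the next click was not recommended within the top k
    rank = top_k.index(target_item) + 1  # 1-indexed position in the ranking
    # With one relevant item the ideal DCG is 1, so NDCG equals the discount.
    return 1.0 / math.log2(rank + 1)


def mean_ndcg_at_k(
    rankings: Sequence[Sequence[int]], targets: Sequence[int], k: int = 20
) -> float:
    """Average NDCG@k over a set of next-click predictions."""
    scores = [ndcg_at_k(r, t, k) for r, t in zip(rankings, targets)]
    return sum(scores) / len(scores) if scores else 0.0
```

Under these assumptions, a target ranked first scores 1.0, a target ranked second scores 1/log2(3), and a target outside the top 20 scores 0. The relative improvements quoted above compare such averaged scores between a model and the best baseline.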