Fundamental units of text or code that LLM AI systems use for language processing and generation. Depending on the tokenization method or scheme adopted, these units can be characters, words, subwords, or other text segments. Tokenization partitions the input text or code into discrete tokens, enabling the AI model to represent and manipulate language effectively. By representing linguistic components as tokens, LLMs streamline processing and generation, supporting tasks such as text understanding, translation, summarization, and code generation with greater precision and efficiency.
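As a toy illustration of how the choice of scheme changes the resulting tokens, the sketch below contrasts a naive word-level split with a character-level split. The function names and the regex-based splitting rule are illustrative assumptions; production LLMs use learned subword methods such as byte-pair encoding (BPE) rather than these simple rules.

```python
import re

def word_tokenize(text: str) -> list[str]:
    # Toy word-level scheme: split on word characters and keep punctuation
    # as separate tokens. Real LLM tokenizers use learned subword vocabularies.
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokenize(text: str) -> list[str]:
    # Toy character-level scheme: every character (including spaces) is a token.
    return list(text)

sample = "Tokenization splits text."
print(word_tokenize(sample))   # word/punctuation tokens
print(len(char_tokenize(sample)))  # many more tokens at character level
```

The same input yields very different token counts under each scheme, which is why token limits and costs for a given model depend on its specific tokenizer.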