Library Guides: AI Tools: Glossary

Terms Related to AI - Part 1

Algorithm - A sequence of rules given to an AI machine to perform a task or solve a problem.
Annotation - The process of tagging language data by identifying and flagging grammatical, semantic, or phonetic elements.
Artificial Intelligence (AI) - The simulation of human intelligence in machines that are programmed to think and learn like humans. Example: A self-driving car that can navigate and make decisions on its own using AI technology.
Automation - The use of technology to perform tasks with minimal human intervention.
ChatBot - A user-friendly interface that allows the user to ask questions and receive answers. Depending on the backend system that powers the chatbot, it can range from basic pre-written responses to a fully conversational AI that automates issue resolution.
Collective Learning - An AI training approach that leverages diverse skills and knowledge across multiple models to achieve more powerful and robust intelligence.
Conversational AI - A subfield of AI that focuses on developing systems that can understand and generate human-like language and conduct a back-and-forth conversation.
Data Augmentation - A technique used to artificially increase the size and diversity of a training set by creating modified copies of the existing data. It involves making minor changes such as flipping, resizing, or adjusting the brightness of images, to enhance the dataset and prevent models from overfitting.
Data Mining - The examination of data sets to discover patterns from that data that can be of further use.
Deterministic Model - A model that follows a specific set of rules and conditions to reach a definite outcome, operating on a cause-and-effect basis. The same input always produces the same output.
Explainable AI - Methods or design choices used in automated systems so that AI and machine learning produce outputs that follow a logic humans can explain and understand.
Extensibility - The ability of AI systems to expand their capabilities to new domains, tasks, and datasets without needing full retraining or major architectural changes.
Fine Tuning - The process of adapting a pre-trained model to a specific task by training it further on a smaller, task-specific dataset. For example, an image classification model trained on a broad set of intersection pictures can be fine-tuned to detect when a car runs a red light.
Generative AI - Artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.
Grounding - The process of anchoring artificial intelligence (AI) systems in real-world experiences, knowledge, or data. The objective is to improve the AI's understanding of the world so that it can effectively interpret and respond to user inputs, queries, and tasks.
Hybrid AI - Any artificial intelligence technology that combines multiple AI methodologies. In Natural Language Processing, this often means that a workflow will leverage both symbolic and machine-learning techniques.
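Hyperparameters - Adjustable settings, such as the learning rate or the number of layers, that control how a model is built and trained. Unlike the parameters a model learns from data, hyperparameters are chosen before training and tuned to obtain optimal performance.
Instruction-Tuning - An approach where a pre-trained model is adapted to perform specific tasks by training it on examples of guidelines or directives that outline the desired operation, paired with the desired responses.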
Intelligent Tutoring Systems - AI-based instructional systems that adapt instruction based on a range of learner variables, such as personal interest and motivation to learn.
Knowledge Generation - The process of training models on extensive datasets, allowing them to analyze data, discover patterns, and craft new insights.
Knowledge Model - A process of creating a computer-interpretable model of knowledge or standards about a language, domain, or process. It is expressed in a data structure that enables the knowledge to be stored in a database and be interpreted by software.
Large Language Model (LLM) - A type of deep learning model trained on a large dataset to perform natural language understanding and generation tasks.
Lexicon - A dataset that contains information about all of the possible meanings of words, in their proper context. A lexicon is fundamental for processing text content with high precision.
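
Example (Data Augmentation): The flipping and brightness changes described in the Data Augmentation entry can be sketched in a few lines of Python. This is only an illustrative sketch, not a production method: the image is assumed to be a small grid of grayscale pixel values from 0 to 255, and the function names (flip_horizontal, adjust_brightness, augment) are hypothetical names chosen for this example.

```python
def flip_horizontal(image):
    """Return a copy of the image with each row mirrored left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Return a copy with delta added to every pixel, clamped to 0..255."""
    return [[max(0, min(255, pixel + delta)) for pixel in row] for row in image]


def augment(image):
    """Return the original image plus three modified copies of it."""
    return [
        image,
        flip_horizontal(image),
        adjust_brightness(image, 40),   # brighter copy
        adjust_brightness(image, -40),  # darker copy
    ]


# A tiny 2x3 grayscale "image" (assumed data, for illustration only).
original = [[0, 50, 100], [150, 200, 250]]
augmented = augment(original)
print(len(augmented), "training images produced from 1 original")
```

One original image becomes four training examples. Showing a model such varied copies is one way to reduce the risk of overfitting to the exact pixels of the originals.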

Terms Related to AI - Part 2

Machine Learning (ML) - A subfield of AI that involves the development of algorithms and statistical models that enable machines to improve their performance with experience. Example: A machine learning algorithm that can predict which customers are most likely to order a product based on their past behavior.
Metadata - Data that describes or provides information about other data.
Natural Language Ambiguity - Situations where a word, phrase, or sentence can have multiple meanings, making it challenging for both humans and AI systems to interpret correctly.
NLT (aka Natural Language Technology) - A subfield of linguistics, computer science, and artificial intelligence dealing with Natural Language Processing (NLP), Natural Language Understanding (NLU), and Natural Language Generation (NLG).
Overfitting - A problem that occurs when a model is too complex, performing well on the training data but poorly on unseen data. Example: A model that has memorized the training data instead of learning general patterns and thus performs poorly on new data.
Pre-Training - Training a model on a large dataset before fine-tuning it to a specific task.
Probabilistic AI Model - A model that makes decisions based on probabilities or likelihoods rather than fixed rules.
Prompt - A phrase or individual keywords used as input for Generative AI.
Prompt Engineering - Identifying inputs (prompts) that result in meaningful outputs. Prompt engineering is currently essential for LLMs because they are a fusion of layers of algorithms and, consequently, offer limited controllability, with few opportunities to control and override behavior.
Random Forest - A supervised machine learning algorithm that grows and combines multiple decision trees to create a “forest.”
Reasoning - The process by which artificial intelligence systems solve problems, think critically, and create new knowledge by analyzing and processing available information, allowing them to make well-informed decisions across various tasks and domains.
Reinforcement Learning from Human Feedback (RLHF) - A training technique in which a model learns how to perform a task by receiving feedback from humans on its outputs and adjusting to produce the responses people prefer.
Sentiment Analysis - A Natural Language function that identifies the sentiment in text. This can be applied to anything from a business document to a social media post. Sentiment is typically measured on a linear scale (negative, neutral, or positive), but advanced implementations can categorize text in terms of emotions, moods, and feelings.
Sequence Modeling - A subfield of NLP that focuses on modeling sequential data such as text, speech, or time series data. Example: A sequence model that can predict the next word in a sentence or generate coherent text.
Speech-to-Text - The process of converting spoken words into written text.
Stable Diffusion - An artificial intelligence system that uses deep learning to generate images from text prompts.
Stochastic Parrots - AI systems that use statistics to convincingly generate human-like text, while lacking true semantic understanding behind the word patterns.
Supervised Learning - The use of labeled data sets to train algorithms to classify data or predict outcomes accurately.
Syntax - The arrangement of words and phrases in a specific order to create meaning in language. If you change the position of one word, it is possible to change the context and meaning.
Test Set - A collection of sample documents representative of the challenges and types of content an ML solution will face once in production. A test set is used to measure the accuracy of an AI system after it has gone through a round of training.
Text-to-Speech (TTS) - A technology that converts written text into spoken voice output. It allows users to hear written content being read aloud, typically using synthesized speech.
Thesaurus - A language or terminological resource (a “dictionary”) that describes relationships between words and phrases in a formalized form of natural language(s), enabling those descriptions and relationships to be used in text processing.
Tokenization - The process of breaking text into individual words or subwords (tokens) to input them into a language model. Example: Tokenizing the sentence “I am ChatGPT” into the tokens “I,” “am,” “Chat,” “G,” and “PT.”
Training Data - The collection of data used to train an AI model.
Transfer Learning - A technique in which a pretrained model is used as a starting point for a new machine learning task.
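
Example (Tokenization): The split shown in the Tokenization entry can be reproduced with a simple greedy longest-match rule. This is a minimal sketch under stated assumptions: the vocabulary below is hand-picked for this example, and the names VOCAB, tokenize_word, and tokenize are hypothetical. Real language models learn much larger vocabularies from data and may split text differently.

```python
# Hypothetical hand-picked vocabulary; real tokenizers learn theirs from data.
VOCAB = {"I", "am", "Chat", "G", "PT"}


def tokenize_word(word, vocab):
    """Split one word into the longest vocabulary pieces, left to right."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is found in the vocabulary.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No known piece starts here: fall back to a single character.
            tokens.append(word[start])
            start += 1
        else:
            tokens.append(word[start:end])
            start = end
    return tokens


def tokenize(text, vocab=VOCAB):
    """Split text on whitespace, then break each word into subword tokens."""
    tokens = []
    for word in text.split():
        tokens.extend(tokenize_word(word, vocab))
    return tokens


print(tokenize("I am ChatGPT"))  # prints ['I', 'am', 'Chat', 'G', 'PT']
```

Because “ChatGPT” is not in the vocabulary as a whole word, it is broken into the smaller known pieces “Chat,” “G,” and “PT.”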
