Techiepedia: learn technology with clear definitions

We are Telefónica Tech

Solutions

Industries

Success Stories

ESG

News

Techiepedia

In our technological dictionary, we offer a glossary of essential terms in the world of IoT, Big Data, Blockchain and Artificial Intelligence.

Success Stories

News

An artificial intelligence (AI) agent is an autonomous system capable of perceiving its environment, processing information, and performing actions aimed at achieving a specific goal. Unlike traditional applications, an AI agent doesn’t just follow predefined instructions — it learns from experience, makes adaptive decisions, and can interact with other agents or with people.

AUC (Area Under the ROC Curve)

An evaluation metric that considers all possible classification thresholds. The ROC curve is the plot between sensitivity and (1- specificity). (1- specificity) is also known as the False Positive rate and sensitivity is also known as True Positive rate. The Area Under the ROC (Receiver operating characteristic) curve is the probability that a classifier will be more confident that a randomly chosen positive example is actually positive than that a randomly chosen negative example is positive.

Accuracy

The fraction of predictions that a classification model got right. In multi-class classification, accuracy is defined as follows:

Accuracy=Correct Predictions/Total Number Of Examples

In binary classification, accuracy has the following definition:

Accuracy= (True Positives +True Negatives)/Total Number Of Examples

Activation function

An activation function is a function that transmits the information generated by the linear combination of the weighted inputs, i.e. the way information is transmitted through the output connections. Since we want the network to be able to solve increasingly complex problems, the activation functions will generally make the models non-linear. The best known are the step function, the sigmoidal function, the ReLu function, the hyperbolic tangent function or the radial base function (Gaussian, multicuadratic, inverse multicuadratic...).

Agentic AI

Agentic AI is an evolution of traditional artificial intelligence agents. It refers to systems capable of operating with extended autonomy, meaning they not only respond to specific instructions but can also plan, make decisions, and execute complex chains of tasks to achieve a defined goal.

Algorithm

A series of repeatable steps for carrying out a certain type of task with data. As with data structures, people studying computer science learn about different algorithms and their suitability for various tasks.

Analyst firms

Analyst firms are key in our sector. They are experts in different technological matters and are opinion makers. They have great influence on investment banks, act as regulators and as suppliers of Information and Communication Technologies (ICT). Many of these organizations generate their own studies where they position the Telco in different aspects and, therefore, they get to intervene in the purchase decisions of the clients in the business segment. They act as a quality meter for companies.

Analytics

Today we live in a hyperconnected world. More and more devices around us are sensorized and provide valuable data for users or companies. This data alone has no added value. The value comes when you cross-reference it, analyze it to improve production, save costs and increase efficiency through altering behavioral patterns. Data analytics are essential to the digital transformation of a company.

Artificial Intelligence of Things (AI of Things)

When artificial intelligence and IoT, Big Data technologies join forces, so that "things are able to learn, share information with each other and make decisions in an almost unattended way" in order to help organizations make decisions that improve people's lives.

Artificial intelligence

In AI’s early days in the 1960s, researchers sought general principles of intelligence to implement, often using symbolic logic to automate reasoning. As the cost of computing resources dropped, the focus moved more toward statistical analysis of large amounts of data to drive decision making that gives the appearance of intelligence.

Asset (Data Governance)

Any corporate resource that is necessary for the correct provision of information services. It is any information or system related to its treatment and which adds value to the organization. It can be either business processes, data, applications, IT systems, personnel, information supports, networks, auxiliary equipment, or installations. It is susceptible to deliberate or accidental attack with consequences for the organization.

Attack surface

The attack surface refers to the collection of entry points that a cybercriminal could exploit to try to gain access to a company’s systems. It includes devices, applications, users, cloud services, and any other resource connected to the network.

Autonomous vehicle

An autonomous vehicle is an intelligent car driving experience that provides real-time information on vehicle operation and use, so customers can make more efficient decisions. It is integrated into the daily life of the user who remains connected while driving and allows him/her to access the information through a mobile application where the data collected by the device within the car is visible.

Availability (Data Governance)

Ownership that must comply with the information contained in an information system by which, such information is available for reading or modification whenever a user with the appropriate permissions requires it.

Backup as a Service (BaaS)

Backup as a Service (BaaS) is a cloud-based solution that allows organizations to store and protect their data without the need for their own physical infrastructure. The BaaS provider remotely manages data backup, storage, and recovery.

Bayes Theorem

Named after the eighteenth-century English statistician and Presbyterian minister Thomas Bayes, Bayes’ Theorem is used to calculate conditional probability. Conditional probability is the probability of an event ‘B’ occurring given that a related event ‘A’ has already occurred (P (B|A)).

Bayesian Statistics

Bayesian statistics is a mathematical procedure that applies probabilities to statistical problems. It provides people the tools to update their beliefs on the evidence of new data. It is based on the use of Bayesian probabilities to summarize evidence.

Bias

An intercept or offset from an origin. Bias (also known as the bias term) is referred to as b or w0 in machine learning models. For example, bias is the b in the following formula:

y′=b+w1x1+w2x2+…wnxn

In machine learning, “bias is a learner’s tendency to consistently learn the same wrong thing. Variance is the tendency to learn random things irrespective of the real signal.... It’s easy to avoid overfitting (variance) by falling into the opposite error of underfitting (bias).

Big Data

In general, it refers to the ability to work with collections of data that had been impractical before because of their volume, velocity, and variety (“the three Vs, or the four Vs if we include veracity”). A key driver of this new ability has been easier distribution of storage and processing across networks of inexpensive commodity hardware using technology such as Hadoop instead of requiring larger, more powerful individual computers. But it’s not the amount of data that’s important, it’s how organizations use this large amount of data to generate insights. Companies use various tools, techniques and resources to make sense of this data to derive effective business strategies.

Binary Class

Binary variables are those variables, which can have only two unique values. For example, a variable “Smoking Habit” can contain only two values like “Yes” and “No”.

Blockchain

Blockchain is a set of technologies that allow the transfer of a value or asset from one place to another, without the intervention of third parties. In this model, authenticity is not verified by a third party but by a network of nodes (computers connected to the network). Therefore, asset transfers are made through consensus and by storing the information in a transparent manner.

Bot

Bot, chatbot, talkbot, chatterbot, conversational assistant, virtual assistant etc are just different ways to name computer programs that communicate with us as if they were human. In this way, bots can do many tasks, some good, such as buying tickets for concerts, unlocking a user's account, or offering options to reserve a holiday home on specific dates; and some bad, such as carrying out cyber-attacks, or causing a financial catastrophe by conducting high-speed stock trading.

The bots (diminutive of "robot") can be designed in any programming language and function as a client, as a server, as a mobile agent, etc. When they specialize in a specific function they are usually called "Expert Systems".

Business Analytics

Business analytics is mainly used to show the practical methodology than an organization uses to extract insights from their data. The methodology focuses on statistical analysis of the data.

Business Intelligence

Business intelligence refers to a set of strategies, applications, data and technologies used by an organization for data collection, analysis and generating insights in order to derive strategic business opportunity.

Business Intelligence (BI)

Business Intelligence, or BI, refers to the set of processes, methodologies, and tools that enable organisations to transform data into actionable information for decision-making. It is based on collecting, integrating, and analysing data from multiple internal and external sources.

ERP (Enterprise Resource Planning)

An ERP, or Enterprise Resource Planning system, is software that integrates a company’s core processes—such as finance, procurement, inventory, production, human resources, and logistics—into a single platform. Its goal is to centralize information and improve operational efficiency.

Edge Computing

It is a new computing paradigm that focuses on bringing data processing and storage closer to the devices that generate it, eliminating dependence on servers in the cloud or in data centres located thousands of miles away.

Edge Convergence

Edge Convergence is a model that combines 5G private networks with Edge Computing to unify connectivity and processing power directly within the industrial environment. This convergence enables critical processes to be executed locally and in real time, without relying on the public cloud or external networks.

End-to-end encryption

End-to-end encryption is a security method that protects communications so that only the sender and the intended recipient can access the content. Data is transformed into an unreadable format while traveling across the network, preventing third parties—including service providers—from reading it.

Enterprise Data Architect

Responsible for creating the structure to collect and access the data. Defines how the data moves Its main function is the design of the data usage environment. How they are stored, how they are accessed and how they are shared / used by different departments, systems or applications, in line with the business strategy. It is a strategic role, for which a vision of the complete life cycle is required. Therefore, you should consider aspects of data modeling, database design, SQL development, and software project management. It is also important to know and understand how traditional and emerging technologies can contribute to the achievement of business objectives. In short, its function is to ensure that "define the global vision."

Enterprise blockchain

Enterprise blockchain is the application of blockchain technology in corporate environments to improve security, transparency, and efficiency in the management of data and transactions. Unlike public blockchains, designed for open use cases such as cryptocurrencies, enterprise blockchain is deployed on private or hybrid networks controlled by individual companies or consortia of organizations.

Evaluation metrics

he purpose of evaluation metric is to measure the quality of the statistical / machine learning model.

Expert system

Expert system It is a system that uses human knowledge captured in a computer to solve problems that would normally be solved by expert humans. Well-designed systems mimic the reasoning process that experts use to solve specific problems. These systems can work better than any human expert making decisions individually in certain domains and can be used by non-expert humans to improve their problem-solving skills.

Exploratory data analysis (EDA)

EDA or exploratory data analysis is a phase used for data science pipeline in which the focus is to understand insights of the data through visualization or by statistical analysis.

Feature

The machine learning expression for a piece of measurable information about something. If you store the age, annual income, and weight of a set of people, you are storing three features about them. In other areas of the IT world, people may use the terms property, attribute, or field instead of “feature.”

Feature Selection is a process of choosing those features that are required to explain the predictive power of a statistical model, and dropping out irrelevant features. This can be done by either filtering out less useful features or by combining features to make a new one.

Fifth Generation Technology (5G)

5G is one of the new connectivities being implemented in different countries whose main function is to support information upload speeds far superior to any other technology created so far. This means that for the services that benefit from this technology, the delivery of information will be even faster than the current one.

GDPR

On May 25, 2018, the new General Regulation of Data Protection (GDPR) came into force. The main objective of this regulation is to govern the collection, use and exchange of personal data. The amount of data we create every day is growing at an exponential rate, and as the regulation says, "the processing of personal data must be designed to serve humanity.

Github

GitHub is a non-profit company that offers a hosting service for repositories stored in the cloud. It was purchased by Microsoft in 2018. GitHub is based on collaboration between users by encouraging several developers to experiment with open source and share their different projects and ideas.

Gradient Boosting

Gradient boosting o Potenciación del gradiente, es una técnica de aprendizaje automático utilizado para el análisis de la regresión y para problemas de clasificación estadística, el cual produce un modelo predictivo en forma de un conjunto de modelos de predicción débil, normalmente, árboles de decisión. Construye el modelo de forma iterativa y lo generaliza permitiendo la optimización de una función pérdida diferenciable arbitraria. (Wikipedia)

Graphics processing unit (GPU)

The graphics processing unit (GPU) is the hardware component that ensures that content is displayed correctly on the computer screen or monitor. It manages everything from the user interface to applications and web pages and, of course, games.

The use of mass parallel computing with GPUs has been key to the development of Deep Learning.

Hadoop

Hadoop is an open-source project from the Apache Foundation, introduced in 2006 and developed in Java. It has to objective of offering a working environment that is appropriate for the demands of Big Data (the 4 V’s). As such, Hadoop is designed to work with large Volumes of data, both structured and unstructured ( Variety), and to process them in a secure and efficient way ( Veracity and Velocity).

To achieve this, it distributes both the storage and processing of information between various computers working together in “ clusters”. These clusters have one or more master nodes charged with managing the distributed files where the information is stored in different blocks, as well as coordinating and executing the different tasks among the cluster’s members. As such, it is a highly scalable system that also offers software “redundancy”.

Heuristic

A practical and non-optimal solution to a problem, which is sufficient for making progress or for learning from.

Hidden layer

A synthetic layer in a neural network between the input layer (the features) and the output layer (the prediction). A neural network contains one or more hidden layers.

Holdout data

This refers to examples intentionally not used ("held out") during training. The validation data set and test data set are examples of holdout data. Holdout data helps evaluate your model's ability to generalize data other than the data it was trained on. The loss on the holdout set provides a better estimate of the loss on an unseen data set than does the loss on the training set.

Hosting

Service that allows storing and publishing websites, applications or emails on physical or virtual servers, making them accessible on the Internet. It includes the management of resources such as disk space, bandwidth, and IP addresses.

Hybrid Cloud

Hybrid cloud is an infrastructure model that combines public cloud and private cloud environments, allowing organizations to leverage the best of both. With this approach, data and applications are distributed flexibly based on security, performance, or scalability requirements.

Hyperplane

A boundary that separates a space into two subspaces. For example, a line is a hyperplane in two dimensions and a plane is a hyperplane in three dimensions. More typically, in machine learning, a hyperplane is the boundary separating a high-dimensional space. Kernel Support Vector Machines use hyperplanes to separate positive classes from negative classes, often in a very high-dimensional space.

Hyperscaler

A hyperscaler is a cloud service provider capable of delivering massive, scalable, and highly automated infrastructures that can scale up or down according to demand. Hyperscalers operate globally distributed data centres and offer advanced services across areas such as computing, storage, networking, and artificial intelligence.

Identity and Access Management (IAM)

Identity and Access Management (IAM) is the set of policies, processes, and technologies that enable organizations to securely manage users’ digital identities and control their access to systems, applications, and data.

Imputation

Imputation is a technique used for handling missing values in the data. This is done either by statistical metrics like mean/mode imputation or by machine learning techniques like kNN imputation.

Incident Response

Incident response is the set of processes and procedures an organisation applies to manage a cyberattack or security breach from the moment it is detected until it is fully resolved. Its purpose is to contain the incident, minimise its impact, and restore normal operations as quickly as possible.

Industry 4.0

Also known as the 4th Industrial Revolution, it refers to the transformation of a company into an intelligent organization in order to achieve the optimization of its resources and reduction of costs. As a result of this digital transformation, the business becomes more efficient and achieves greater competence.

Inferential Statistics

In inferential statistics, we try to hypothesize about the population by only looking at a sample of it. For example, before releasing a drug in the market, internal tests are done to check if the drug is viable for release. But here we cannot check with the whole population for viability of the drug, so we do it on a sample which best represents the population.

Innovation

Innovation, in most cases, is a transformation through which changes are made to introduce improvements or new functionalities to existing solutions. In other cases, it is a process of creating new solutions from scratch. In any case, these developments are created thanks to human ingenuity to improve our quality of life as a species and are closely connected to science and technology.

Integrity (Data Governance)

Ownership that must comply with the information contained in an information system by which, such information cannot be modified without leaving a trace that such modification has taken place, either in the physical supports in which it is stored or within its transfer through communication networks.

Internet of Things (IoT)

The Internet of Things is based on the connectivity of millions of objects to each other that allow us to make the most of every aspect of our lives. These are physical objects with integrated sensors in order to connect and exchange data with other devices and automate tasks so that you can spend your time on what you really like.

Interpretability

The degree to which a model's predictions can be readily explained. Deep models are often uninterpretable; that is, a deep model's different layers can be hard to decipher.

IoMT

IoMT, or Internet of Medical Things, is the name given to the sensorization of medical devices in order to collect data and analyze it to offer a better service to patients and health care professionals.

This translates into great advantages for workers and patients alike:

- Saving economic resources by digitizing medical examinations through gadgets in order to reduce the cost of hospital bills
- Improving the quality of life of patients by managing and collecting data in order to detect and prevent diseases in a more personalized way
- Process automation to optimize health resources and personnel in the best possible way
- Improving the user experience in the health centre by optimising space through people counting to reduce waiting times

IoT Sensors

An IoT sensor is a device capable of detecting, measuring or indicating changes in a physical space/object, transforms them into an electrical signal and uploads information which can be read on the connected platform. These sensors can measure a multitude of variables (location, temperature, humidity, pressure, speed...) On their own they would not be useful, so all the data collected is placed on a platform where, through the Big Data, we can analyze it and identify behaviour patterns in order to define the values and get the most out of the device.

k-means clustering

It is a type of unsupervised algorithm which solves the clustering problem. It is a procedure which follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters). Data points inside a cluster are homogeneous and heterogeneous to peer groups.

k-nearest neighbors

K nearest neighbors is a simple algorithm that stores all available cases and classifies new cases by a majority vote of its k neighbors. The case being assigned to the class is most common amongst its K nearest neighbors measured by a distance function.

Legato

Legato (for example, Legato Sapient) is a Manufacturing Execution System (MES) used in industrial environments to monitor, control, and optimize production operations in real time. It enables integration between the shop floor (machines, sensors) and enterprise management systems.

Lift

In data mining, Lift compares the frequency of an observed pattern with how often you’d expect to see that pattern just by chance. If the lift is near 1, then there’s a good chance that the pattern you observed is occurring just by chance. The larger the lift, the more likely that the pattern is ‘real’.

Linear Regression

A technique to look for a linear relationship (that is, one where the relationship between two varying amounts, such as price and sales, can be expressed with an equation that you can represent as a straight line on a graph) by starting with a set of data points that don't necessarily line up nicely. This is done by computing the “ least squares” line: the one that has, on an x-y graph, the smallest possible sum of squared distances to the actual data point y values. Statistical software packages and even typical spreadsheet packages offer automated ways to calculate this.

Logistic Regression

A model similar to linear regression but where the potential results are a specific set of categories instead of being continuous.

Low-code / No-code

Low-code and no-code are application development approaches that reduce or eliminate the need for traditional coding.

M2M

Machine to Machine (M2M) is the connection or exchange of information, in data format, that is created between two connected machines. It is, in a way, the connectivity on which the Internet of Things (IoT) is based. Nowadays, the term M2M has become obsolete, since it has evolved into what we call IoT which, besides machines, also connects people.

Machine learning

Machine Learning refers to the techniques involved in dealing with vast data in the most intelligent fashion (by developing algorithms) to derive actionable insights. In these techniques, we expect the algorithms to learn by themselves without being explicitly programmed.

Managed Security Service Provider (MSSP)

A Managed Security Service Provider (MSSP) is a specialized company that delivers managed cybersecurity services to protect other organizations. Instead of each company deploying and maintaining its own defenses, the MSSP continuously monitors, detects, and responds to threats on their behalf.

Manufacturing Execution System (MES)

A Manufacturing Execution System (MES) is a software system that monitors, controls, and optimizes production processes in an industrial plant in real time. Its main role is to connect the machine and sensor control layer with enterprise management systems, ensuring traceability, quality, and efficiency.

Metadata

Data about data that allows to add context to the information. It describes characteristics of data that help its identification, discovery, assessment, and management. There are three types of metadata: technical, organizational, and business.

Multicloud

The multicloud approach consists of an organisation using cloud services from different providers at the same time. Instead of relying on a single provider, it combines multiple hyperscaler public clouds—large-scale infrastructures capable of delivering virtually unlimited resources on demand.

NB-IoT

NB-IoT are the first standard 3GPP technologies designed ad hoc for IoT in the licensed bands. Both technologies are part of the LPWA (low power wide area) networks and have been designed to optimize the massive consumption of Low Data and low cost IoT uses. Thanks to this technology we can reduce the cost of the devices and extend the life of the batteries for years. In addition, it provides better coverage both indoors (complicated coverage sites: e.g. basements) and outdoors (long range).

Naive Bayes classifier

“A collection of classification algorithms based on Bayes Theorem. It is not a single algorithm but a family of algorithms that all share a common principle, that every feature being classified is independent of the value of any other feature.

Natural language processing (NLP)

Natural Language Processing is the branch within the field of computer science, linguistics and artificial intelligence that is responsible for the study and development of techniques that enable computers to understand and process human language.

Neural network

A model that, taking inspiration from the brain, is composed of layers (at least one of which is hidden) consisting of simple connected units or neurons followed by nonlinearities. Neural Networks are used in deep learning research to match images to features and much more. What makes Neural Networks special is their use of a hidden layer of weighted functions called neurons, with which you can effectively build a network that maps a lot of other functions. Without a hidden layer of functions, Neural Networks would be just a set of simple weighted functions.”

New technologies

The new technologies are techniques that have not been used before but have emerged in recent years within the fields of computing and communication. They are small advances of humanity that help people evolve and make life easier for them. Tools such as the Internet, DVDs, desktop computers, and laptops were examples of this concept in their day. Today we understand concepts such as IoT, Big Data, Artificial Intelligence, Virtual Reality etc as new technologies.

OT Cybersecurity

It is the set of measures and technologies designed to protect industrial control systems, machinery, sensors, and networks that manage physical processes in sectors such as manufacturing, energy, transportation, or water.

Observability

Observability is the ability to understand the internal state of complex systems based on the data they generate, such as metrics, logs, and traces. It goes beyond traditional monitoring by providing a comprehensive, real-time view of systems and applications.

Outlier

“Extreme values that might be errors in measurement and recording, or might be accurate reports of rare events.”

Overfitting

A model of training data that, by taking too many of the data's quirks and outliers into account, is overly complicated and will not be as useful as it could be to find patterns in test data.

Perceptron

The perceptron is the simplest neural network, which approximates a single neuron with n binary inputs. It computes a weighted sum of its inputs and ‘fires’ if that weighted sum is zero or greater

Phishing

Phishing is a cyberattack technique that uses fraudulent messages (typically emails, SMS, or phone calls) to deceive users into handing over confidential information such as login credentials or banking details. Attackers impersonate legitimate organisations to gain the victim’s trust.

Pivot table

Pivot tables summarize long lists of data, without requiring you to write a single formula or copy a single cell. However, the most notable feature of pivot tables is that you can arrange them dynamically. The process of rearranging your table is known as pivoting your data: you're turning the same information around to examine it from different angles.

Precision and Recall

Precision is a metric for classifications models that answers the following question: Out of all the possible positive labels, how many did the model correctly identify?

It represents how near the actual value is to the one obtained from the model or measurement. It is also known as “True Positive Rate”.

Recall is described as the measured of how many of the positive predictions are correct.

Both precision and recall are therefore based on an understanding and measure of relevance. High precision means that an algorithm returned substantially more relevant results than irrelevant ones, while high recall means that an algorithm returned most of the relevant results

Predictive Modeling

Consiste en el desarrollo de modelos estadísticos y de aprendizaje automático que permitan predecir comportamientos futuros, basándose en datos históricos.

Predictive analytics

It consists of the analysis of historical business data in order to predict future behaviors that help to better planning. To do this, predictive modeling techniques are used, among others. These techniques are based on statistical algorithms and machine learning.

Prescriptive analytics

Principal component analysis (PCA)

It is a machine learning algorithm that aims to reduce the dimensionality of a set of observed variables to a set of variables without linear correlation, called main components. To do this, it calculates the direction with the greatest variance and defines it as the main component. It is used mainly in exploratory data analysis and to build predictive models.

Private Cloud

Private cloud is a cloud infrastructure model dedicated exclusively to a single organization. It can be hosted on the company’s own premises or managed by an external provider, but it always delivers an isolated environment with greater control over data and applications.

Privileged Access Management (PAM)

Privileged Access Management (PAM) refers to a set of practices and technologies designed to control and secure accounts with elevated permissions within an organization, such as system administrators or database managers.

Probability distribution

The probability distribution of a discrete random variable is the set of all the possible values that this variable can have, together with its probabilities of occurrence. For discrete variables, the main probability distributions are the binomial, the Poisson and the hypergeometric (the latter for dependent events). For continuous variable, the distribution that is generated is normal or Gaussian.

Profiling

Profiling is the process of using personal data to evaluate certain personal aspects to analyse and predict behaviour/performance/reliability, etc.

Pseudonymization

The pseudonymization process is an alternative to data anonymization. While anonymization involves the complete elimination of all identifiable information, pseudonymization aims to eliminate the link between a set of data and the identity of the individual. Pseudonymization examples are encryption and tokenization.

Public Cloud

Public cloud is a cloud services model in which the infrastructure is managed by an external provider and shared among multiple organizations. Through the internet, each company accesses the resources it needs on a pay-as-you-go basis, without having to invest in its own hardware.

Python

It is a programming language created in 1994 and is widely used in data science. For beginners, it is very easy to learn, but at the same time it is a very powerful language for advanced users, since it has specialized libraries for automatic learning and graphics generation.

Python Standard Library

A library is nothing more than a set of modules (see modules). Python standard library is very wide and offers a great variety of modules that perform functions of all kinds, from modules written in C that offer access to system features such as file access (file I / O). In Python website, you can find "The Python Standard Library", a reference guide to all the modules in Python. Installers for Windows platforms usually include the complete standard library, including some additional components. However, Python installations using packages will require specific installers.

RAG ( Retrieval-Augmented Generation)

Artificial intelligence technique that improves text generation by searching for information in external sources before answering. It allows creating more accurate and updated responses, combining data retrieval and content generation. Used in chatbots, virtual assistants, and advanced search engines.

Random forest

An algorithm used for regression or classification tasks that is based on a combination of predictive trees. "To classify a new object from an input vector, each of the trees in the forest is fed with that vector. Each tree offers a classification as a result, and we say "vote" for that result. The forest chooses the classification that has the most votes among all the trees in the forest. The term "random forest" is a trademark registered by its authors.

Regression

It is a supervised learning method where the output variable is a real and continuous value, such as "height" or "weight". Regression consists of fitting any data set to a given model. Within the regression algorithms we can find the linear, non-linear regression, by least squares, Lasso, etc.

Reinforcement learning

Based on studies on how to encourage learning in humans and rats based on rewards and punishments. The algorithm learns by observing the world around it. Your input information is the feedback you get from the outside world in response to your actions. Therefore, the system learns based on trial-error.

Resilience (Data Governance)

Systems capacity to maintain or restore its basic functionality after a risk or event (even unknown) occurs.

Robot

A robot is an electromechanical system with its own independence to create movements or perform operations that can be used, at least, as subject of research. They are created through a technique called robotics, which is used to design and build them.

Robotic Process Automation (RPA)

Robotic Process Automation (RPA) is a technology that enables the configuration of “software robots” capable of mimicking human actions on digital systems. These robots perform repetitive, rule-based tasks such as entering data into forms, generating reports, or processing requests.

SASE

SASE (Secure Access Service Edge) is a cloud-based networking and security model that integrates connectivity functions (such as SD-WAN) with security capabilities (Zero Trust, cloud firewall, access control) into a single managed service.

SIEM

SIEM, short for Security Information and Event Management, is a cybersecurity solution that centralizes and analyses in real time the logs generated by an organization’s systems, applications, and devices. Its goal is to detect anomalous patterns and threats at an early stage.

SOAR

SOAR (Security Orchestration, Automation & Response) is a technology designed to unify and coordinate an organization’s cybersecurity operations.

SOC (Security Operations Center)

A Security Operations Center (SOC) is a specialized centre dedicated to monitoring, detecting, analysing, and responding to cybersecurity incidents in real time. It brings together technology, processes, and expert security teams to protect an organization’s digital infrastructure 24 hours a day.

Scalar

A variable is scalar (as opposed to vectorial), when it has a value of magnitude but no direction in space, such as volume or temperature.

Self-supervised learning 

Self-supervised learning is a term that refers to a type of non-supervised learning framed within a supervised learning problem. It is a relatively recent learning technique where training data is labelled autonomously.

Semi-structured data

Semi-structured data does not have a defined schema. They do not fit into a table/row/column format but are organised by labels or "tags" that allow them to be grouped together and create hierarchies. They are also known as non-relational or NoSQL.

Sensitivity and Specificity

Son métricas estadísticas que se usan para medir el rendimiento de un clasificador binario.

La Sensibilidad (También llamada tasa de verdadero positivo, o probabilidad de detección en algunos campos) mide la proporción de casos positivos correctamente identificados por el algoritmo clasificador. Por ejemplo, el porcentaje de personas que padecen una enfermedad y que son correctamente detectadas. Su fórmula es:

Sensibilidad=Verdaderos Positivos/ (Verdaderos Positivos + Falsos Negativos)

La Especificidad (también llamada tasa de verdaderos negativos) mide la proporción de casos negativos correctamente identificados como tales por el algoritmo clasificador. Por ejemplo, se usa para indicar el número de personas sanas que han sido correctamente identificadas como tales por el algoritmo.

Especifidad=Verdaderos Negativos/ (Verdaderos Negativos + Falsos Positivos)

Serverless computing

Serverless computing is a cloud service model in which the provider itself automatically manages the entire server infrastructure (assigning, scaling , and maintaining them).

The user simply uploads and executes his code, without worrying about provisioning or managing physical or virtual servers.

Shell

When the operating system is accessed from the command line we are using the console. In addition to script languages such as Perl and Python, it is common to use Linux-based tools such as grep, diff, splitt, comm, head and tail to perform data preparation and debugging tasks from the console.

Smart Cities

A Smart City is a scenario in which technology is used to improve different infrastructures for citizens. It is a space with millions of devices and connected IoT solutions where the main challenge is how to manage the huge volume of data in real time and in a useful, efficient and integrated way.

Smart Contracts

Smart contracts are computer programs that automatically execute when predefined conditions are met. They run on blockchain technologies and allow agreements between parties to be carried out without the need for intermediaries.

Smart Retail

The connected shop is also known by other names such as connected shop, IoT shop, future shop or smart shop. In short, a connected store is a traditional commercial store that has undergone a digital transformation and has adapted its spaces with new functionalities thanks to IoT devices in order to offer its customers a better user experience. Brands are striving to translate the advantages of online commerce to the physical points of sale to attract new customers, increase sales and increase brand loyalty.

Spatiotemporal data

Time series data that also includes geographic identifiers such as latitude-longitude pairs.

Strata, stratified sampling

It consists in dividing the population samples into homeogenic groups or strata and taking a random sample of each of them. Strata is also an O'Reilly conference on Big Data, Data Science and related technologies.

Structured data  

Structured data is the typical data from most relational databases (RDBMS). These databases are characterised by having a particular schema that defines what the tables look like in which the data is stored, what type of fields they have and how they relate to each other.

Supervised learning

In supervised learning, the algorithms work with "tagged" data (labeled data), trying to find a function that, given the input variables, assign them the appropriate output tag. The algorithm is trained with a "historical" data and thus "learns" to assign the appropriate output label to a new value, that is, it predicts the output value. Supervised learning is often used in classification problems, such as identifying digits, diagnosing, or detecting identity fraud.

Support vector machine

A support vector machine is a supervised machine learning algorithm that is used for both classification and regression tasks. They are based on the idea of finding the hyperplane that best divides the data set into two differentiated classes. Intuitively, the farther away from the hyperplane our values are, the more certain we are that they are correctly classified. However, sometimes it is not easy to find the hyperplane that best classifies the data and it is necessary to jump to a larger dimension (from the plane to 3 dimensions or even n dimensions). SVMs are used for tasks of text classification, spam detection, sentiment analysis, etc. They are also used for image recognition

Tensor

Tensors are mathematical objects that store numerical values and can have different dimensions. Thus, for example, a 1D tensor is a vector, a 2D matrix, a 3D cube, etc.

Text summarization o Automatic summarization

It is the technique by which we can convert long passages of text into passages of shorter texts containing only that information that is relevant. Thanks to this we can design and develop models that help us to condense and present the information, saving us reading time and maximizing the amount of information per word.

Time series data

A time series is a sequence of measures spaced in time intervals not necessarily equal. Thus time series consist of a measure (for example, atmospheric pressure or price of an action) accompanied by a temporary seal.

Tokenization

Tokenization is a security technique that replaces sensitive data—such as credit card numbers, credentials, or personal information—with alternative values known as tokens. These tokens have no value outside the system that generates them, protecting the original data from theft or data breaches.

Transfer learning

This is a method widely used in artificial vision because it allows you to build accurate models saving a great deal of time. Instead of starting the learning process from scratch, you start using patterns or pre-trained models that you have learned by solving a different problem.

Transparency (Data Governance)

Concept on the basis of which the user is aware of the information stored, can give and revoke their explicit consent and be able to revoke it during the operation of the service and guarantee it has been removed once it finishes.

Unstructured data  

Unstructured data accounts for 80% of the volume of all data generated, and this percentage is growing steadily. This data may have an internal structure, but it does not follow any predefined schema or data model. It can be text or non-text data; it can be generated by a machine or a person; and can be stored in a NoSQL database or directly in a Datalake.

Unsupervised learning

Unsupervised learning occurs when "labeled" data is not available for training. We only know the input data, but there is no output data that corresponds to a certain input. Therefore, we can only describe the structure of the data, and try to find some kind of organization that simplifies the analysis. Therefore, they have an exploratory character.

Vector

The mathematical definition of a vector is "a value that has a magnitude and a direction, represented by an arrow whose length represents the magnitude and whose orientation in space represents the direction". However, data scientists use the term in this sense: "ordered set of real numbers denoting a distance on a coordinate axis. These numbers can represent characteristics of a person, movie, product or whatever we want to model. This mathematical representation of the variables allows working with software libraries that apply advanced mathematical operations to the data.

A vector space is a set of vectors, for example, a matrix.

Virtual Reality

Virtual reality is a computer system that generates simulations of real or fictitious spaces in which we can interact and explore as if we were there.

Vulnerability management

Vulnerability management is the process by which organizations identify, assess, and remediate weaknesses in their systems, applications, and networks. Its goal is to reduce the likelihood that these flaws will be exploited by attackers and to protect critical business assets.

XDR

XDR, or Extended Detection and Response, is a cybersecurity solution that integrates and correlates data from multiple sources (endpoints, networks, servers, applications, and email) to detect advanced threats and respond more effectively.

Zero Trust

Zero Trust is a security model based on the principle of “never trust, always verify.” Instead of assuming that anything inside the network is secure, it requires continuous validation of user and device identity, as well as the trust level of every transaction.

📶 We’re closing 2025 with 17 million IoT lines in Spain, strengthening our position as a market reference and achieving very significant growth so far this year.

DECEMBER 18, 2025

Follow us on LinkedIn
Our 2025 team Wrapped is here! 🙌💙 People, moments, and experiences that have shaped our year and made us feel truly, truly #ProudToBeTech.

DECEMBER 19, 2025

Follow us on Instagram
Our 2025 Success Stories Notebook is here! 🚀 This year, we’ve supported our customers in their #DigitalTransformation, overcoming challenges and reaching new milestones with technology as an ally.

DECEMBER 16, 2025

Follow us on X

Búsquedas recomendadas

Techiepedia

AI Agent

AUC (Area Under the ROC Curve)

Accuracy

Activation function

Agentic AI

Algorithm

Analyst firms

Analytics

Artificial Intelligence of Things (AI of Things)

Artificial intelligence

Asset (Data Governance)

Attack surface

Autonomous vehicle

Availability (Data Governance)

Backup as a Service (BaaS)

Bayes Theorem

Bayesian Statistics

Bias

Big Data

Binary Class

Blockchain

Bot

Business Analytics

Business Intelligence

Business Intelligence (BI)

CRM (Customer Relationship Management)

CTI (Cyber Threat Intelligence)

Categorical Variable

Chatbot

Chi-square test

Cibersecurity

Classification

Cloud

Cloud Hosting

Clustering

Coefficient

Cognitive intelligence

Computational linguistics

Confidence interval

Confidentiality (Data Governance)

Confusion matrix

Containers as a Service (CaaS)

Continuous variable

Convolutional neural networks (CNN)

Corporate NFT

Correlation matrix

Cross-validation

Cryptographic inventory

Customer Experience

Cyber intelligence

Cyber resilience

DFIR

Dark Web

Data Analyst

Data Controller

Data Engineer

Data Gobernance Manager

Data Governance

Data Governance table

Data Insight

Data Processor – Procesador de datos

Data Science

Data Scientist

Data Subject

Data mining

Data wrangling

Database Administrator (DBA)

Decentralized Digital Identity (DID)

Decision trees

Deep learning

Deepfake

Dependent Variable

Descriptive Analytics

DevSecOps

Digital Transformation

Digital forensics

Digital sovereignty

Dimension reduction