Dr. Adrian Lara Petitdemange

Publications

Classifying and Understanding Tor Traffic Using Tree-Based Models

Description:

Over the past years, the use of anonymization services has gained significant relevance as more users become interested in protecting their data and privacy on the Internet. One of the most popular ways to achieve this is Tor. The anonymity and untraceability that Tor provides, however, can also be exploited by ill-intentioned users who try to bypass security controls and policies. The Cybersecurity and Infrastructure Security Agency (CISA) mentions two methods of recognizing Tor traffic in the enterprise: indicator-based and behavior-based analysis. The former uses log analysis and lists of Tor exit nodes to identify suspicious activity, while the latter inspects patterns in TCP and UDP ports, DNS queries and packet payloads. In this paper, we propose a different approach using white-box machine learning models such as decision trees and Random Forest. On the one hand, our classifier achieves accuracy levels above 95%. On the other hand, our approach is the first one to allow understanding the importance of each traffic feature in the classification. Our results demonstrate that the TCP window size, the frame size and time-related traffic features can be used to identify Tor traffic. In this paper we describe a machine learning methodology used to identify Tor network traffic utilizing C5.0 decision trees and Random Forest. We followed a white-box approach and achieved an accuracy of over 95% in both models. We also present an analysis of the importance of the top predictor variables.

Publication type: Conference Paper

Published in: 2020 IEEE Latin-American Conference on Communications (LATINCOM)

Emotions Classifier based on Facial Expressions

Description:

Emotion recognition is important in the context of smart buildings and IoT, because it allows the environment to have a better notion of the mood of the people who are present. With a view to developing such projects, in this article we analyze the performance of an emotion classifier that uses a convolutional neural network. Specifically, we focus on analyzing the impact of the epochs and batch size hyperparameters. To do this, we propose an experimental design with the following hypothesis: "The number of epochs for which the model trains and the size of the batch given per iteration in each epoch influence the accuracy of an emotion classifier built from convolutional neural networks using the VGG16 architecture".

Publication type: Conference Paper

Published in: 2021 IEEE V Jornadas Costarricenses de Investigación en Computación e Informática (JoCICI)

Asynchronous Detection of Slowloris Attacks Via Random Forests

Description:

An asynchronous classifier of network flows was developed to detect Slowloris attacks. This classifier was implemented using random forests and its effectiveness was measured by the area under the ROC curve. These random forests were trained from a public dataset. We sought to minimize the number of necessary features that are required to analyze the flows satisfactorily. Finally, it was shown that the chosen features can be used individually to obtain reliable detections in the classifier, with two of the three individual features having an area under the curve greater than 0.95.

Publication type: Conference Paper

Published in: 2021 IEEE V Jornadas Costarricenses de Investigación en Computación e Informática (JoCICI)

Recognizing daily-life activities using sensor-collected data in a kitchen

Description:

This paper focuses on the recognition and classification of Activities of Daily Living (ADLs) that are carried out in a kitchen. To do this, a Recurrent Neural Network architecture of the Long Short-Term Memory (LSTM) type is implemented as a classifier. The ARAS dataset is used for training and evaluation. The resulting classifier achieves an average F1 score of 95.33% on the chosen dataset.

Publication type: Conference Paper

Published in: 2021 IEEE V Jornadas Costarricenses de Investigación en Computación e Informática (JoCICI)

Detecting Malicious Domains using the Splunk Machine Learning Toolkit

Description:

Malicious domains are often hidden amongst benign DNS requests. Given that DNS traffic is generally permitted, blocking malicious requests is a challenge for most network defenses. Using machine learning to classify DNS requests enables a scalable alternative to programmable blocklists. Studies in this field often reduce their dataset scope to a single attack behavior. However, organizations are being hit by a myriad of attack patterns across multiple objectives, so reducing the scope means closing the door to classifier operationalization in a real-world environment. In this paper, we propose a broader and more challenging scenario for our dataset by combining four malicious DNS behaviors (malware, phishing, spam and botnet) with legitimate domain samples. We use Splunk and its Machine Learning Toolkit to create, test and validate our classifier. We extract 12 static features from the domain name and analyze their weight in the prediction. We compare two supervised learning algorithms and measure their accuracy in this challenging environment. We obtained an accuracy of 88% with the Random Forest algorithm, against 87% with the Decision Tree.

Publication type: Conference Paper

Published in: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium

Desarrollo y evaluación de un prototipo de aplicación móvil para la administración de traslados de pacientes COVID-19

Description:

In this article we present a prototype of a mobile application for managing the transfers of COVID-19 patients carried out by the PRIME team of the CEACO medical center in Costa Rica. We describe the design of the application, the technical aspects of its implementation, and the results of the user experience evaluation carried out by the members of the PRIME team. The evaluation of the prototype shows the usefulness of the mobile application in supporting the PRIME team's processes, and the results of the user experience study indicate a very positive perception in the categories of attractiveness, transparency, efficiency, controllability and stimulation.

Publication type: Journal Article

Published in: Revista Ibérica de Sistemas e Tecnologias de Informação

User - Smart Building Interactions: An Analysis of Privacy and Productivity Human Factors

Description:

Smart buildings are becoming increasingly common and are changing the way we interact with our homes, workplaces, and cities. Consequently, it is important to study how human factors play a role in smart building-based environments. This research focuses on the privacy and productivity human factors, and presents the results of two complementary evaluations: a survey in which people’s privacy concerns were analyzed and an assessment that quantifies whether smart building functionalities impact people’s productivity.

Publication type: Conference Paper

Published in: Proceedings of the International Conference on Ubiquitous Computing & Ambient Intelligence (UCAmI 2022)

Taxonomy of Malicious URL Detection Techniques

Description:

Malicious URLs are often used by phishing campaigns, botnets and other attacks. Moreover, DNS traffic is necessary for the Internet to function correctly, which means that this data flow cannot be blocked. For these reasons, detecting malicious URLs is important, challenging and still an open research problem. There are two types of techniques used to detect malicious URLs: rules-based and machine learning-based. The traditional, rules-based techniques rely on blacklists and heuristics. These techniques struggle to keep up with a rapidly changing array of malicious URLs. Therefore, machine learning-based techniques have emerged. Both detection techniques rely on URL characteristics such as length, number of vowels and others to classify URLs as legitimate or malicious. The main contribution of this paper is to propose a taxonomy of detection techniques and to point out which URL characteristics are used by each method. While surveys on the topic exist, a precise mapping between the detection methods and the characteristics is not available. We also compare these techniques, highlighting that machine learning-based techniques are more complex to implement but better at keeping up with rapidly incoming new malicious URLs. In contrast, rules-based techniques are simpler and easier to implement, but they struggle to update fast enough to identify new malicious URLs.

Publication type: Conference Paper

Published in: International Conference on Information Technology & Systems

Smart Home Interface Design: An Information Architecture Approach

Description:

The usage of IoT devices is rapidly growing. Many users may want to add them to their homes to automate certain tasks, help themselves with information, or monitor their environment continuously. Quick IoT development is taking place on privacy, protocols, and other areas. However, the user interface area is being left out of the efforts. Mobile applications and websites are focused on technical requirements only. Research is not focused on the interaction of the user with the application, so standards and/or tendencies may not be utilized. This investigation aims to implement a smart house application interface focused on the user instead of the technical requirements. To start, the card sorting technique is used to group IoT devices into meaningful groups for users. Then, an interface prototype of a smart house application is created with the feedback obtained from the card sorting activity. Finally, the prototype is evaluated by using a standardized questionnaire.

Publication type: Conference Paper

Published in: International Conference on Ubiquitous Computing and Ambient Intelligence

Understanding Students' Perspectives About Human-Building Interactions in the Context of Smart Buildings

Description:

Smart buildings provide a variety of sensor-based services to support and enhance the quality of human activities. Advanced technologies such as robotics are increasingly added to smart buildings’ ecosystems, creating a need to incorporate affective computing techniques to augment the quality of human-building and human-robot interactions. To better understand users’ needs and expectations about human-building interactions, we conducted a pilot study using a mixed methods approach combining short surveys and controlled laboratory activities. We recruited 66 participants and collected several data elements characterizing their perceptions and expectations about smart building services. This paper presents preliminary evidence showing acceptance of specific human-building interaction methods based on ambient-sensor information such as in-context voice, behavior, and emotion recognition. We also identified a need for educational activities to promote the understanding of smart building concepts and their impact on modern society. These results can be leveraged to assist the design of future services that include human-building and human-robot interactions.

Publication type: Conference Paper

Published in: International Conference on Ubiquitous Computing and Ambient Intelligence

Tor Traffic Classification using Decision Trees

Description:

The number of users interested in protecting their data and privacy on the Internet has increased lately. This has boosted the popularity of anonymization services such as Tor. However, the anonymity and resistance to tracking provided by Tor have also been used for ill-intended purposes, such as evading security policies and controls. In this work, we implemented and evaluated an offline Tor traffic detector using white-box machine learning algorithms such as decision trees and random forests. On the one hand, our classifier achieves precision levels above 99%. On the other hand, our approach is the first one to allow understanding and interpreting the classifier, thus revealing which variables play a significant role in the classification. We show that TCP window size, packet size and some time-related features can be used to identify Tor traffic.

Publication type: Conference Paper

Published in: 2023 XLIX Latin American Computer Conference (CLEI)