FLOR ALBERTS

PROJECTS

ASSIGNMENT

TWEET EMOTION CLASSIFIER

2021

PREDICTS THE SENTIMENT OF A GIVEN SET OF TWEETS

FINAL PROJECT

SCREENPLAY PARSER

2021

LABELS LINES IN A SCREENPLAY (SCENES, CHARACTER DIALOGUES, ETC.) USING REGULAR EXPRESSIONS

ASSIGNMENT

REAL-TIME NEWS WEBSITE

2022

REAL-TIME RESPONSIVE NEWS WEBSITE

ASSIGNMENT

CRUD WEB INTERFACE

2022

SYSTEM FOR TRACKING (TV) SERIES

FINAL PROJECT

ASYNC. BROWSER MULTIPLAYER GAME

2022

CONNECT FOUR

FINAL PROJECT

SEMANTIC Q&A SYSTEM

2023

WIKIDATA BASED INFORMATION RETRIEVAL SYSTEM FOR VIDEO-GAME RELATED QUESTIONS

BACHELOR THESIS

2024
GRADE: 8.0

UTILIZING LARGE LANGUAGE MODELS FOR QUALITY BASED SUMMARIZATION OF BIOMEDICAL LITERATURE

LARGE PROJECT

SHARED TASK SUBMISSION - SEMEVAL 2025

2025

SUBMISSION FOR THE MU-SHROOM LLM HALLUCINATION DETECTION TASK

FINAL PROJECT

ABUSIVE & OFFENSIVE LANGUAGE DETECTION

2025

DETECTION OF HARMFUL LANGUAGE USING SEVERAL MACHINE LEARNING STRATEGIES

MASTER THESIS

2026
GRADE: 7.0

ADAPTING A STATE-OF-THE-ART FINE GRAINED ENTITY LINKING MODEL FOR DUTCH

PERSONAL PROJECT

MILKYWAY OF MUSIC

2026

WEB APP VISUALIZING MUSICAL ARTISTS AND GROUPS AS STARS IN A 3D GALAXY

PERSONAL PROJECT

WIP - KANBAN BOARD APP

2026

PRODUCTIVITY TOOL SIMILAR TO TRELLO

EDUCATION

BACHELOR - INFORMATION SCIENCE

UNIVERSITY OF GRONINGEN | 2020 - 2024

During my bachelor's, I built a broad and interdisciplinary foundation that combined computer science, linguistics, and data analysis. I started by developing a solid understanding of core computer science principles such as algorithms, abstraction, data storage, networking, and logical operations. By learning to program in Python, working on the command line, and writing shell scripts, I gradually became familiar with key programming concepts and learned to write legible code. Alongside this, I took courses in linguistics, in which I analyzed sentence structure and learned about different types of sentence constructions and patterns.

Following this, I learned more about performing data analysis and visualizing data. Additionally, I learned how to properly interpret and present results based on data by gaining a solid grasp of statistics and clear reporting practices.

Much of my bachelor's was centered around the principles of artificial intelligence and machine learning. This included building a solid grasp of the mathematical foundations underlying these methods, as well as working with a variety of classical classification approaches such as logistic regression, k-nearest neighbors (KNN), support vector machines (SVM), decision trees, random forests, and Naïve Bayes. I also gained hands-on experience with neural networks, such as feed-forward neural networks (FFNNs) and recurrent neural networks (RNNs) for sequence processing, particularly for handling problems in the domain of Natural Language Processing (NLP). In doing so, I became familiar with widely used Python libraries like Scikit-learn, Keras, and PyTorch.

Additionally, I developed practical skills in web technology, focusing on creating responsive, modular, and intuitive designs. Projects included building a responsive news website and developing an asynchronous browser game (Connect Four). Through these projects I was introduced to key concepts such as MVC architecture, routing, API implementation, and database interaction.

Around this time, I also learned about database management and design. I learned how to design efficient database schemas, create and interpret ER diagrams, normalize databases using several normal forms, and implement CRUD operations. My studies also included computational linguistics and language technology, where I explored parsing techniques and the intersection of language and computation. Additionally, I took part in courses on digital humanities, information security and encryption, machine translation, and the ethics of AI and NLP. This broadened my understanding of the impact of AI tools in societal contexts and of their ethical implications.

BACHELOR THESIS

For my Bachelor thesis, I used several open-source Large Language Models to generate summaries of biomedical literature, utilizing various prompting techniques and settings (zero-shot, article-top, article-all, article-all+explanation). I then performed automated (BERTScore) and manual evaluation to assess the quality of the generated summaries. I concluded that one-shot prompting on average led to the most factual summaries, and that automated evaluation does not align with human judgement. More information can be found on GitHub.

MASTER - COMMUNICATION AND INFORMATION STUDIES

(TRACK INFORMATION SCIENCE)

UNIVERSITY OF GRONINGEN | 2024 - 2026

During my master's, I deepened my understanding of the principles behind artificial intelligence. In particular, I gained more insight into the inner workings of the Transformer architecture. This included learning about key concepts such as gradient descent, gating, and (self-)attention, primarily through hands-on coding exercises.

A central theme of my master's was working with (open-source) Large Language Models and improving the reliability of their outputs. I explored retrieval-augmented generation (RAG) techniques, focusing on efficient and factual information retrieval using public knowledge bases. As part of a semester-long research project, I participated in the SemEval-2025 shared task on hallucination detection (Mu-SHROOM), where I worked on developing an efficient RAG component for fact-checking statements generated by LLMs.

Additionally, I participated in courses on semantic web technologies, learning how to represent and query structured knowledge using RDF triples and Turtle syntax, and how to work with graph-based data.

I also followed courses on the societal impact of computer-mediated communication. Here, I studied and analyzed social media discourse around sensitive topics such as discrimination and racism.

Furthermore, I explored computational semantics, including logic and inference, meaning representations, compositionality, and lexical semantics. This provided me with an understanding of how meaning can be modeled and captured computationally. For example, as a final project, we trained an AI model to predict the relation between two sentences (A and B): whether one sentence entailed the other, contradicted it, or the relation was neutral.

The master's program also gave me the flexibility to explore additional interests. I engaged with topics such as UI/UX design and speech science, including the analysis of human speech and speech impairments. For example, I created a case study on hypokinetic dysarthria, in which I proposed an analysis based on different acoustic measurements.

MASTER THESIS

For my thesis, I focused on the task of (Neural) Entity Linking (EL) for Dutch text. This task consists of two main steps, namely Named Entity Recognition (NER) and Named Entity Disambiguation (NED). NER involves recognizing named entities such as locations, persons, and organizations in raw input text and labeling them as entities. NED is the task of disambiguating these entities to the correct entry in a structured knowledge base (in my case, a corresponding Dutch Wikipedia page ID). To achieve this, I adapted the code from the existing English SpEL Entity Linking model (Shavarani and Sarkar, 2023) and trained it on Dutch training data that I extracted from a Dutch Wikipedia dump, as well as existing training data from the Dutch part of the MultiNERD dataset, which I also used for evaluation. The resulting model, which I dubbed NLSpEL, achieves performance competitive with state-of-the-art multilingual EL systems (mention detection F1 of 81.5%, entity linking F1 of 71.5%). The model as well as the training data can be accessed on GitHub.

Skills

[Interactive chart: skills plotted on a timeline from 2020 to 2026]