David Galera

Professional Software Engineer

I am a software engineer specializing in backend development on AWS, with expertise in building scalable applications that integrate ML and generative AI. I design intelligent systems and architect them on AWS using services such as EC2, ECS, Lambda, DynamoDB, SQS and Bedrock. I expose these services through API Gateway over both REST and WebSocket APIs, which covers real-time and asynchronous communication. I combine AI capabilities with robust cloud architecture to deliver solutions that meet complex business needs. I also hold a master's degree in Data Science and have developed many ML projects.

Datacamp portfolio → LinkedIn


Portfolio projects

Real Time Chat web application

Full-stack real-time chat application built with React and Material UI on the frontend, an Apollo GraphQL server running on NestJS, and MongoDB as the persistence layer. It uses GraphQL subscriptions over WebSocket for real-time communication and Apollo Client to cache GraphQL queries in the React app. MongoDB aggregation pipelines build the aggregated GraphQL entities from a single MongoDB collection and also drive pagination of the chat list and messages. The app also includes JWT-based authentication and more.

Learn more

NeuralNet for manufacturing control data

Kaggle competition project. Given a balanced dataset of 900,000 instances with 33 features from manufacturing control data, I built a neural network with Keras to predict the binary target. I applied feature engineering and feature interactions to expose patterns in the data, reaching 97.4% accuracy in the competition.

Learn more

Computer vision applied to CIFAR-10

Image classification on the CIFAR-10 dataset using deep learning and the fastai library. I applied transfer learning with pretrained ResNet-34 and ResNet-50 CNNs, together with data augmentation techniques such as mixup.
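
As a rough illustration of the mixup idea mentioned above (not the fastai implementation used in the project), the sketch below blends two samples and their one-hot labels with a weight drawn from a Beta distribution. The function name `mixup`, the flat-list image representation, and the default `alpha=0.4` are assumptions made for this example.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two samples and their one-hot labels (hypothetical helper).

    x1, x2: flat lists of pixel values; y1, y2: one-hot label lists.
    Returns the mixed input, the mixed label, and the mixing weight.
    """
    rng = rng or random.Random()
    # Mixing weight lam ~ Beta(alpha, alpha); small alpha keeps lam near 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    x, y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
    print(lam, x, y)
```

The mixed label stays a valid probability vector because both inputs are weighted by `lam` and `1 - lam`.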

Learn more


Movie recommender engine

Recommender system for the MovieLens dataset, implemented in Python. I implemented 4 recommendation approaches:

- Non-personalized
- Content-based (a Naive Bayes spam filter is also implemented)
- Collaborative filtering, item-item and user-user
- Collaborative filtering using the Python Surprise library
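
To give a feel for the item-item approach in the list above, here is a minimal sketch using only the standard library. It is not the project's code: the `ratings` layout (`{item: {user: rating}}`), the choice of cosine similarity, and the function names are assumptions for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {user: rating}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted mean of the user's
    ratings on other items. Returns None when no rated neighbor exists."""
    num = den = 0.0
    for other, users in ratings.items():
        if other == item or user not in users:
            continue
        sim = cosine(ratings[item], users)
        num += sim * users[user]
        den += abs(sim)
    return num / den if den else None


if __name__ == "__main__":
    toy = {
        "A": {"u1": 5, "u2": 4},
        "B": {"u1": 5, "u2": 4, "u3": 5},
        "C": {"u1": 1, "u3": 1},
    }
    print(predict(toy, "u3", "A"))
```

User-user filtering follows the same pattern with the roles of users and items swapped.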

Learn more

Density estimation

This project consists of 2 parts. In the first one, I estimate the probability density function of univariate data using kernel (non-parametric) methods. I select the kernel bandwidth using 2 approaches: minimizing the MISE (mean integrated squared error) and K-Fold cross-validation.
In the second part, I fit a Gaussian Mixture to multivariate data from diabetes patients, tuning the covariance type and number of components with K-Fold CV.
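
The K-Fold bandwidth selection from the first part can be sketched as follows. This is an illustrative version, not the project's code: it assumes a Gaussian kernel, scores each candidate bandwidth by mean held-out log-likelihood, and uses hypothetical names (`gaussian_kde`, `cv_log_likelihood`, `select_bandwidth`).

```python
import math
import random


def gaussian_kde(train, h):
    """Return a Gaussian kernel density estimate with bandwidth h."""
    norm = 1.0 / (len(train) * h * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - t) / h) ** 2) for t in train)

    return pdf


def cv_log_likelihood(data, h, k=5):
    """Mean held-out log-likelihood of bandwidth h over k folds."""
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        pdf = gaussian_kde(train, h)
        # Floor the density so log() stays finite for isolated points.
        total += sum(math.log(max(pdf(x), 1e-300)) for x in held_out)
    return total / len(data)


def select_bandwidth(data, grid, k=5):
    """Pick the bandwidth in grid with the best cross-validated score."""
    return max(grid, key=lambda h: cv_log_likelihood(data, h, k))


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(select_bandwidth(sample, [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]))
```

A very small bandwidth overfits the training folds and a very large one oversmooths, so the held-out score peaks somewhere in between.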

Learn more

Labyrinth

The goal of this project is to design an efficient algorithm that returns the minimum number of moves a rod needs to traverse a rectangular input grid from the top left corner to the bottom right corner. Cells can be blocked. The rod is composed of 3 adjacent cells and can rotate about its center cell if the rotation area does not contain blocked cells. Only Python builtin classes are allowed.
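
One possible way to approach this is a breadth-first search over rod states, sketched below with builtin types only. The details are assumptions rather than the project's actual rules: the grid is a list of strings with `.` for free and `#` for blocked cells, the rod starts horizontally on the first three cells of the top row, the rotation area is taken to be the 3x3 square around the center cell, and the goal is reached when one end of the rod covers the bottom right cell.

```python
def min_moves(grid):
    """Minimum moves for the rod to reach the bottom right cell, or -1."""
    rows, cols = len(grid), len(grid[0])

    def free(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "."

    def fits(r, c, horizontal):
        # (r, c) is the center cell; the rod extends one cell each way.
        if horizontal:
            return all(free(r, c + d) for d in (-1, 0, 1))
        return all(free(r + d, c) for d in (-1, 0, 1))

    def can_rotate(r, c):
        # Assumed rule: the whole 3x3 square around the center must be free.
        return all(free(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1))

    def at_goal(r, c, horizontal):
        if horizontal:
            return r == rows - 1 and c + 1 == cols - 1
        return r + 1 == rows - 1 and c == cols - 1

    start = (0, 1, True)
    if not fits(*start):
        return -1
    seen = {start}
    frontier = [start]
    moves = 0
    while frontier:
        next_frontier = []
        for r, c, horizontal in frontier:
            if at_goal(r, c, horizontal):
                return moves
            candidates = [
                (r + dr, c + dc, horizontal)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            ]
            if can_rotate(r, c):
                candidates.append((r, c, not horizontal))
            for state in candidates:
                if state not in seen and fits(*state):
                    seen.add(state)
                    next_frontier.append(state)
        frontier = next_frontier
        moves += 1
    return -1


if __name__ == "__main__":
    print(min_moves([".....", ".....", ".....", ".....", "....."]))
```

Because every move costs the same, processing states level by level guarantees that the first time a goal state is seen, the move count is minimal. The state space has at most 2 states per cell, so the search runs in time linear in the grid size.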

Learn more

MaxSAT optimal correlation clustering

CORRCLUSTERING ∈ NP
In this project, I created an instance of the CORRCLUSTERING problem and performed a polynomial-time reduction to SAT. I obtained a propositional formula by encoding the problem as the HARD and SOFT clauses described in the 2013 paper Optimal Correlation Clustering via MaxSAT by Jeremias Berg and Matti Järvisalo. Finally, I applied a MaxSAT solver to obtain an optimal clustering from the maximum-satisfiability solution.

Learn more

Certifications

Master thesis

IEC award & AIME conference paper

I received the Catalan Society of Technology award, presented by the Institut d'Estudis Catalans (IEC), for my master's thesis, titled Frequent patterns of childhood overweight from longitudinal data on parental and early-life of infants health.
I carried out this data science project in collaboration with the Institut d'Investigació Biomèdica de Girona, and it led to the following conference paper, published at the AIME '24 conference.

Frequent patterns of childhood overweight from longitudinal data on parental and early-life of infants health

Published in 22nd International Conference on AI in Medicine (AIME 24), Salt Lake City, Utah, US, July 7-12th, 2024.

In this work, frequent pattern mining is used to find the risk factors of childhood obesity, taking into account the relationship among the data gathered in different visits. The experiments carried out on the data collected from 386 children from Girona and Figueres (Spain) demonstrate the relevance of discriminant frequent patterns for childhood overweight prediction.

Recommended citation: López, B., Galera, D., López-Bermejo, A., Bassols, J. (2024). Frequent Patterns of Childhood Overweight from Longitudinal Data on Parental and Early-Life of Infants Health. In: Finkelstein, J., Moskovitch, R., Parimbelli, E. (eds) Artificial Intelligence in Medicine. AIME 2024. Lecture Notes in Computer Science, vol 14844. Springer, Cham.
https://doi.org/10.1007/978-3-031-66538-7_7