David Galera
Professional Software Engineer
I am a software engineer specializing in backend development on AWS, with expertise in building scalable applications that integrate ML and generative AI. I design intelligent systems and architect them on AWS using services such as EC2, ECS, Lambda, DynamoDB, SQS and Bedrock, and I expose them through API Gateway over both REST and WebSocket protocols for real-time and asynchronous communication. I bring together cutting-edge AI capabilities and robust cloud architecture to deliver impactful, future-ready solutions that meet complex business needs. I also hold a master's degree in Data Science and have developed many ML projects.
Datacamp portfolio → LinkedIn
Portfolio projects
01
Real Time Chat web application
Full-stack real-time chat application built with React and Material UI, an Apollo GraphQL server running on NestJS, and MongoDB as the persistence layer. It uses GraphQL subscriptions over WebSocket for real-time communication, the Apollo Client cache for GraphQL queries in the React app, and MongoDB aggregation pipelines to build aggregated GraphQL entities from a single MongoDB collection and to paginate the chat list and messages. It also includes JWT-based authentication and more.
Learn more

02
NeuralNet for manufacturing control data
Kaggle competition project. Given a balanced dataset of 900,000 instances with 33 features from manufacturing control data, I built a neural network with Keras to predict the binary output. I applied feature engineering and feature interactions to surface patterns and transform the data, reaching 97.4% accuracy in the competition.
Learn more

03
Computer vision applied to CIFAR-10
Image classification of the CIFAR-10 dataset using deep learning and the fastai library. I used transfer learning with resnet34 and resnet50 pretrained CNNs, applying data augmentation techniques such as mixup.
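As a rough illustration of the mixup idea mentioned above (not the fastai implementation used in the project), the sketch below blends two samples and their one-hot labels with a weight drawn from a Beta distribution. The function name `mixup`, the `alpha` value and the toy vectors are assumptions made for this example.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Blend two samples and their one-hot labels.

    The mixing weight lam is drawn from Beta(alpha, alpha); alpha=0.4 is an
    illustrative choice, not a value taken from the project.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy usage: two "images" of 2 pixels each, belonging to different classes.
rng = random.Random(0)
mixed_x, mixed_y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
```

The blended label stays a valid probability vector, which is what lets the network train on interpolated examples.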
Learn more

04
Movie recommender engine
Recommender system on the MovieLens dataset, implemented in Python. I implemented 4 recommendation approaches:
- Non-personalized
- Content-based (a Naive Bayes spam filter is also implemented)
- Collaborative filtering, item-item and user-user
- Collaborative filtering using the Python Surprise library
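To make the item-item approach in the list above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the project's code: the cosine similarity measure, the function names `cosine` and `predict`, and the toy ratings are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (dict: user -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm = sqrt(sum(r * r for r in u.values())) * sqrt(sum(r * r for r in v.values()))
    return dot / norm if norm else 0.0


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted mean of the user's other ratings."""
    # Invert user -> {item: rating} into item -> {user: rating}.
    by_item = {}
    for u, items in ratings.items():
        for i, r in items.items():
            by_item.setdefault(i, {})[u] = r

    num = den = 0.0
    for other, r in ratings[user].items():
        if other == item:
            continue
        s = cosine(by_item.get(item, {}), by_item[other])
        num += s * r
        den += abs(s)
    return num / den if den else None


# Toy ratings (made up for illustration).
ratings = {
    "ana": {"Alien": 5, "Aliens": 4, "Amelie": 1},
    "ben": {"Alien": 4, "Aliens": 5},
    "cai": {"Amelie": 5, "Alien": 1},
    "dee": {"Aliens": 5, "Amelie": 2},
}
estimate = predict(ratings, "dee", "Alien")
```

Because "Alien" is rated more like "Aliens" than like "Amelie" in this toy data, the estimate for "dee" lands closer to her rating of "Aliens".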
05
Density estimation
This project consists of 2 parts. In the first, I estimate the probability density function of univariate data using non-parametric kernel methods. I fit the kernel bandwidth using 2 approaches: minimizing the mean integrated squared error (MISE) and K-Fold cross-validation.
In the second part, I fit a Gaussian Mixture to multivariate data from diabetes patients, tuning the covariance type and number of components with K-Fold CV.
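The bandwidth selection by K-Fold cross-validation can be sketched as follows. This is a minimal example under assumptions not stated in the project description: a Gaussian kernel, held-out log-likelihood as the score, a hypothetical candidate grid, and synthetic data that is already in random order.

```python
import math
import random


def kde(x, data, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    c = 1.0 / (len(data) * h * math.sqrt(2 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)


def cv_log_likelihood(data, h, k=5):
    """Mean held-out log-likelihood of the KDE over k folds."""
    folds = [data[i::k] for i in range(k)]  # assumes data is already shuffled
    total = 0.0
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        # Clamp the density to avoid log(0) for very small bandwidths.
        total += sum(math.log(max(kde(x, train, h), 1e-300)) for x in held_out)
    return total / len(data)


def select_bandwidth(data, candidates, k=5):
    """Return the candidate bandwidth with the best cross-validated score."""
    return max(candidates, key=lambda h: cv_log_likelihood(data, h, k))


# Synthetic standard-normal sample and an illustrative candidate grid.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
best_h = select_bandwidth(sample, [0.01, 0.3, 5.0])
```

A very small bandwidth overfits the training folds and a very large one oversmooths, so the held-out score favors the intermediate candidate on this sample.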


06
Labyrinth
The goal of this project is to design an efficient algorithm that returns the minimum number of moves a rod needs to traverse a rectangular input grid from the top-left corner to the bottom-right corner. Cells can be blocked, and the rod consists of 3 adjacent cells that can rotate around its center cell if the rotation area contains no blocked cells. Only Python built-in classes are allowed.
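One way to approach the problem is a breadth-first search over states (center row, center column, orientation). The sketch below is not the project's solution and relies on several assumptions: the grid is a list of strings with "." for free and "#" for blocked cells, the rod starts horizontal in the top-left corner, it is done when it covers the bottom-right cell in either orientation, each one-cell shift or rotation counts as one move, and the rotation area is the 3x3 square around the center.

```python
def min_moves(grid):
    """Return the fewest moves to bring the rod to the bottom-right corner, or -1."""
    rows, cols = len(grid), len(grid[0])

    def free(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "."

    def fits(r, c, vertical):
        # The rod covers its center and one cell on each side along its axis.
        dr, dc = (1, 0) if vertical else (0, 1)
        return all(free(r + k * dr, c + k * dc) for k in (-1, 0, 1))

    def can_rotate(r, c):
        # Rotation sweeps the 3x3 square around the center cell.
        return all(free(r + i, c + j) for i in (-1, 0, 1) for j in (-1, 0, 1))

    start = (0, 1, False)  # horizontal rod covering (0,0), (0,1), (0,2)
    goals = {(rows - 1, cols - 2, False), (rows - 2, cols - 1, True)}
    if not fits(*start):
        return -1

    seen = {start}
    frontier = [start]  # level-by-level BFS using only built-in lists and sets
    moves = 0
    while frontier:
        next_frontier = []
        for r, c, vertical in frontier:
            if (r, c, vertical) in goals:
                return moves
            candidates = [
                (r + dr, c + dc, vertical)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            ]
            if can_rotate(r, c):
                candidates.append((r, c, not vertical))
            for state in candidates:
                if state not in seen and fits(*state):
                    seen.add(state)
                    next_frontier.append(state)
        frontier = next_frontier
        moves += 1
    return -1


print(min_moves(["...", "...", "..."]))  # prints 2: shift down twice
```

Since every move has the same cost, the first time the search reaches a goal state it has used the minimum number of moves.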
Learn more

07
MaxSAT optimal correlation clustering
CORRCLUSTERING ∈ NP
In this project, I created an instance of the CORRCLUSTERING problem and performed a polynomial-time reduction to SAT. I obtained a propositional formula by reducing the problem to the HARD and SOFT clauses described in the 2013 paper by Jeremias Berg and Matti Järvisalo, Optimal Correlation Clustering via MaxSAT. Finally, I applied a MaxSAT solver to obtain the clustering that corresponds to a maximum-satisfiability assignment.
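The general shape of such a reduction can be illustrated with a small sketch. It is written in the spirit of a transitivity-based encoding and is not claimed to reproduce the exact clauses of the paper or of the project: one Boolean variable per pair of points says whether they share a cluster, hard clauses enforce transitivity, and soft unit clauses reward agreeing with the similarity sign. The exhaustive `solve` function stands in for a real MaxSAT solver, and all names and the toy similarities are hypothetical.

```python
from itertools import combinations, product


def encode(n, similarity):
    """Build hard and soft clauses; similarity maps (i, j) with i < j to a weight."""
    pairs = list(combinations(range(n), 2))
    var = {p: idx + 1 for idx, p in enumerate(pairs)}  # DIMACS-style literals
    hard = []
    for i, j, k in combinations(range(n), 3):
        a, b, c = var[(i, j)], var[(j, k)], var[(i, k)]
        # Transitivity: any two "same cluster" pairs in a triple imply the third.
        hard += [[-a, -b, c], [-a, -c, b], [-b, -c, a]]
    soft = []
    for p, w in similarity.items():
        if w > 0:
            soft.append((w, [var[p]]))    # similar points should be together
        elif w < 0:
            soft.append((-w, [-var[p]]))  # dissimilar points should be apart
    return var, hard, soft


def satisfied(clause, assignment):
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def solve(var, hard, soft):
    """Exhaustive search over assignments; only viable for tiny instances."""
    best, best_cost = None, None
    for bits in product([False, True], repeat=len(var)):
        assignment = dict(enumerate(bits, start=1))
        if not all(satisfied(c, assignment) for c in hard):
            continue
        cost = sum(w for w, c in soft if not satisfied(c, assignment))
        if best_cost is None or cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


def clusters(n, var, assignment):
    """Read the clustering off the pair variables."""
    groups = []
    for i in range(n):
        for g in groups:
            if assignment[var[(g[0], i)]]:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups


# Toy instance: positive weights mean "similar", negative mean "dissimilar".
similarity = {(0, 1): 3, (2, 3): 2, (0, 2): -2, (1, 2): -1, (0, 3): -1, (1, 3): 1}
var, hard, soft = encode(4, similarity)
assignment, cost = solve(var, hard, soft)
result = clusters(4, var, assignment)
```

On this toy instance the cheapest assignment groups points 0 and 1 together and points 2 and 3 together, violating only the weakest soft clause.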

Certifications
IEC award & AIME conference paper
I received the Catalan Society of Technology award, presented by the Institut d'Estudis Catalans (IEC), for my master's thesis, titled Frequent patterns of childhood overweight from longitudinal data on parental and early-life of infants health. It is a data science project that I carried out in collaboration with the Institut d'Investigació Biomèdica de Girona, and from which we published the following conference paper at the AIME '24 conference.
Frequent patterns of childhood overweight from longitudinal data on parental and early-life of infants health
Published at the 22nd International Conference on AI in Medicine (AIME 24), Salt Lake City, Utah, US, July 7-12, 2024.
In this work, frequent pattern mining is used to find the risk factors of childhood obesity, taking into account the relationship among the data gathered in different visits. The experiments carried out on the data collected from 386 children from Girona and Figueres (Spain) demonstrate the relevance of discriminant frequent patterns for childhood overweight prediction.
Recommended citation: López, B., Galera, D., López-Bermejo, A., Bassols, J. (2024). Frequent Patterns of Childhood Overweight from Longitudinal Data on Parental and Early-Life of Infants Health. In: Finkelstein, J., Moskovitch, R., Parimbelli, E. (eds) Artificial Intelligence in Medicine. AIME 2024. Lecture Notes in Computer Science, vol 14844. Springer, Cham. https://doi.org/10.1007/978-3-031-66538-7_7