David Galera

Senior Software Engineer, AI

I am a software engineer specializing in backend development on AWS, with expertise in building scalable applications that integrate ML and generative AI. I design and develop intelligent systems, then architect and deploy them on AWS using services such as EC2, ECS, S3, Lambda, DynamoDB, SQS and Bedrock. I expose these services through API Gateway over both REST and WebSocket protocols, supporting real-time and asynchronous communication. I combine AI capabilities with robust cloud architecture to deliver solutions that meet complex business needs. I also hold a master's degree in Data Science and have completed more than 130 courses on DataCamp.

Datacamp portfolio → LinkedIn

Portfolio projects

01

GenAI services with FastAPI

A multi-modal generative AI platform built with FastAPI and PostgreSQL, with Qdrant as the embeddings persistence layer. The system follows a modular domain-driven architecture with the Repository pattern and serves open-source models from Hugging Face for several tasks: text generation with TinyLlama, image synthesis with Stable Diffusion, and audio/video generation. For Retrieval-Augmented Generation (RAG), it combines asynchronous PDF processing with Jina AI embeddings for semantic search. Model serving is handled by BentoML, and the platform also provides real-time usage monitoring, JWT-based authentication, and database schema migrations with Alembic.

Learn more

02

Real Time Chat web application

A full-stack real-time chat application built with React and Material UI, an Apollo GraphQL server running on NestJS, and MongoDB as the persistence layer. It uses GraphQL subscriptions over WebSocket for real-time communication and Apollo Client to cache GraphQL queries in the React app. MongoDB aggregation pipelines build the aggregated GraphQL entities from a single MongoDB collection and drive pagination of the chat list and messages. Authentication is JWT-based.

Learn more

03

NeuralNet for manufacturing control data

Kaggle competition project. Given a balanced dataset of 900,000 instances with 33 features from manufacturing control data, I built a neural network with Keras to predict the binary target. I applied feature engineering and feature interactions to expose patterns in the data, reaching 97.4% accuracy in the competition.

Learn more

04

Computer vision applied to CIFAR-10

Image classification on the CIFAR-10 dataset using deep learning and the fastai library. I used transfer learning with pretrained ResNet-34 and ResNet-50 CNNs and applied data augmentation techniques such as mixup.
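As an illustration of the mixup idea, here is a minimal sketch in plain Python. It is not the fastai implementation used in the project: the function name `mixup_pair`, the `alpha` default, and the flat-list representation of images and one-hot labels are all assumptions made for this example.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Blend two training examples with a Beta-distributed weight.

    x1, x2: flat lists of pixel values; y1, y2: one-hot label lists.
    All names and the alpha default are hypothetical, for illustration only.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], rng=rng)
    print(lam, x, y)
```

The blended label stays a valid probability vector, so the network is trained on soft targets that sit between the two original classes.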

Learn more

05

Movie recommender engine

A recommender system for the MovieLens dataset, implemented in Python. I implemented four recommendation approaches:

  • Non-personalized
  • Content-based (a Naive Bayes spam filter is also implemented)
  • Collaborative filtering, item-item and user-user
  • Collaborative filtering using the Python Surprise library
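To give a flavor of the item-item approach, the sketch below predicts a rating as a similarity-weighted average of the user's other ratings. It is an assumption-laden illustration rather than the project's code: it uses raw cosine similarity without mean-centering, and the names `cosine` and `predict_item_item` as well as the nested-dictionary layout are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {user: rating}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict_item_item(ratings, user, item):
    """Predict a rating for (user, item); ratings is {user: {item: rating}}.

    Returns None when no rated item is similar to the target item.
    """
    # Invert the data so that each item maps to the users who rated it.
    by_item = {}
    for u, rated in ratings.items():
        for i, r in rated.items():
            by_item.setdefault(i, {})[u] = r

    target = by_item.get(item, {})
    num = den = 0.0
    for other, r in ratings[user].items():
        if other == item:
            continue
        sim = cosine(target, by_item[other])
        num += sim * r
        den += abs(sim)
    return num / den if den else None


if __name__ == "__main__":
    ratings = {
        "ann": {"Alien": 5, "Aliens": 5, "Amelie": 1},
        "bob": {"Alien": 4, "Aliens": 5, "Amelie": 2},
        "cat": {"Alien": 5, "Amelie": 1},
    }
    print(predict_item_item(ratings, "cat", "Aliens"))
```

The user-user variant is symmetric: swap the roles of users and items and average the ratings of similar users instead.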
Learn more

06

Density estimation

This project has two parts. In the first, I estimate the probability density function of univariate data using non-parametric kernel methods, selecting the kernel bandwidth with two approaches: MISE (mean integrated squared error) and K-fold cross-validation.
In the second, I fit a Gaussian mixture to multivariate data from diabetes patients, tuning the covariance type and the number of components with K-fold CV.
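The K-fold bandwidth selection from the first part can be sketched as follows. This is a simplified illustration, not the project's code: it assumes a Gaussian kernel, scores each candidate bandwidth by held-out log-likelihood, and uses hypothetical names (`gaussian_kde`, `cv_log_likelihood`, `select_bandwidth`). The MISE-based approach is not shown.

```python
import math


def gaussian_kde(train, h):
    """Return a density function built from Gaussian kernels of bandwidth h."""
    norm = 1.0 / (len(train) * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in train)

    return density


def cv_log_likelihood(data, h, k=5):
    """Mean held-out log-likelihood of bandwidth h under K-fold CV."""
    total = 0.0
    for fold in range(k):
        held_out = data[fold::k]  # every k-th point goes to this fold
        train = [x for i, x in enumerate(data) if i % k != fold]
        density = gaussian_kde(train, h)
        # Clamp to avoid log(0) when a point is far from every kernel.
        total += sum(math.log(max(density(x), 1e-300)) for x in held_out)
    return total / len(data)


def select_bandwidth(data, candidates, k=5):
    """Pick the candidate bandwidth with the best cross-validated score."""
    return max(candidates, key=lambda h: cv_log_likelihood(data, h, k))


if __name__ == "__main__":
    sample = [2.0 * math.sin(1.7 * i) for i in range(60)]
    print(select_bandwidth(sample, [0.001, 0.3, 50.0]))
```

A bandwidth that is too small leaves held-out points with almost no density, and one that is too large flattens the estimate, so the cross-validated score penalizes both extremes.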

Learn more

07

Labyrinth

The goal of this project is to design an efficient algorithm that returns the minimum number of moves a rod needs to traverse a rectangular grid from the top-left corner to the bottom-right corner. Cells can be blocked. The rod occupies 3 adjacent cells and can rotate about its center cell, provided the rotation area contains no blocked cells. Only Python built-in classes are allowed.
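One way to approach this is a breadth-first search over states (center row, center column, orientation). The sketch below is an illustration under several assumptions that the description does not spell out: the rod starts horizontally on the first three cells of the top row, the goal is any placement covering the bottom-right cell, `"."` marks a free cell, the rotation area is the 3x3 square around the center, and `-1` means unreachable. It sticks to built-in types only.

```python
def min_moves(grid):
    """Minimum moves for a 3-cell rod, or -1 if the goal is unreachable.

    grid is a list of equal-length strings; "." is free, anything else blocked.
    Start and goal conventions are assumptions made for this sketch.
    """
    rows, cols = len(grid), len(grid[0])

    def free(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "."

    def fits(r, c, horizontal):
        if horizontal:
            return free(r, c - 1) and free(r, c) and free(r, c + 1)
        return free(r - 1, c) and free(r, c) and free(r + 1, c)

    def can_rotate(r, c):
        # The whole 3x3 square around the center must be free.
        return all(free(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1))

    start = (0, 1, True)
    if not fits(*start):
        return -1
    goals = {(rows - 1, cols - 2, True), (rows - 2, cols - 1, False)}

    seen = {start}
    frontier = [start]
    moves = 0
    while frontier:
        next_frontier = []
        for r, c, horizontal in frontier:
            if (r, c, horizontal) in goals:
                return moves
            candidates = [
                (r + 1, c, horizontal),
                (r - 1, c, horizontal),
                (r, c + 1, horizontal),
                (r, c - 1, horizontal),
            ]
            if can_rotate(r, c):
                candidates.append((r, c, not horizontal))
            for state in candidates:
                if state not in seen and fits(*state):
                    seen.add(state)
                    next_frontier.append(state)
        frontier = next_frontier
        moves += 1
    return -1


if __name__ == "__main__":
    print(min_moves(["...", "...", "...", "#.."]))
```

Because every move costs the same, processing the search level by level guarantees that the first time a goal state is reached, it is reached with the fewest moves. Each state is visited at most once, so the running time is linear in the number of cells.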

Learn more

08

MaxSAT optimal correlation clustering

CORRCLUSTERING ∈ NP
In this project, I created an instance of the CORRCLUSTERING problem and performed a polynomial-time reduction to SAT. I obtained a propositional formula by encoding the problem as the HARD and SOFT clauses described in the 2013 paper by Jeremias Berg and Matti Järvisalo, Optimal Correlation Clustering via MaxSAT. Finally, I ran a MaxSAT solver on the formula to obtain the resulting clustering.
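The hard/soft split can be illustrated with a small sketch. It uses one common transitivity-based formulation: a variable per pair of points meaning "same cluster", hard clauses enforcing transitivity, and weighted unit soft clauses for each similarity score. This may differ from the exact clause sets in the paper, and a brute-force search stands in for a real MaxSAT solver, so it only works on tiny instances. All function names are hypothetical.

```python
from itertools import combinations, product


def encode(similarity, n):
    """Build hard and soft clauses for n points.

    similarity maps (i, j) with i < j to a signed weight: positive means
    "should be together", negative means "should be apart".
    """
    pairs = list(combinations(range(n), 2))
    var = {p: idx + 1 for idx, p in enumerate(pairs)}  # DIMACS-style ids

    hard = []
    for i, j, k in combinations(range(n), 3):
        a, b, c = var[(i, j)], var[(j, k)], var[(i, k)]
        # Any two "same cluster" facts in a triangle imply the third.
        hard += [[-a, -b, c], [-a, -c, b], [-b, -c, a]]

    soft = []
    for pair, w in similarity.items():
        if w > 0:
            soft.append((w, [var[pair]]))
        elif w < 0:
            soft.append((-w, [-var[pair]]))
    return var, hard, soft


def satisfied(clause, assignment):
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def brute_force_maxsat(num_vars, hard, soft):
    """Try every assignment; keep the best one that satisfies all hard clauses."""
    best, best_weight = None, -1
    for bits in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(bits, start=1))
        if not all(satisfied(c, assignment) for c in hard):
            continue
        weight = sum(w for w, c in soft if satisfied(c, assignment))
        if weight > best_weight:
            best, best_weight = assignment, weight
    return best, best_weight


def clusters(n, var, assignment):
    """Turn the true "same cluster" variables into a list of clusters."""
    label = list(range(n))
    for (i, j), v in var.items():
        if assignment[v]:
            old, new = label[j], label[i]
            label = [new if lab == old else lab for lab in label]
    groups = {}
    for point, lab in enumerate(label):
        groups.setdefault(lab, []).append(point)
    return sorted(groups.values())


if __name__ == "__main__":
    sim = {(0, 1): 2, (2, 3): 3, (0, 2): -1, (1, 3): -2, (1, 2): 1}
    var, hard, soft = encode(sim, 4)
    best, weight = brute_force_maxsat(len(var), hard, soft)
    print(clusters(4, var, best), weight)
```

In this toy instance the soft clauses cannot all be satisfied at once, because keeping points 0, 1 and 2 together would contradict the preference to separate 0 and 2. The optimum gives up the lightest conflicting preference.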

Learn more

Certifications

Master thesis

IEC award & AIME conference paper

I received the Catalan Society of Technology award, presented by the Institut d'Estudis Catalans (IEC), for my master's thesis, titled Frequent patterns of childhood overweight from longitudinal data on parental and early-life of infants health.
I carried out this data science project in collaboration with the Institut d'Investigació Biomèdica de Girona, and we published the following paper from it at the AIME '24 conference.

Frequent patterns of childhood overweight from longitudinal data on parental and early-life of infants health

Published in 22nd International Conference on AI in Medicine (AIME 24), Salt Lake City, Utah, US, July 7-12th, 2024.

In this work, frequent pattern mining is used to find the risk factors of childhood obesity, taking into account the relationship among the data gathered in different visits. The experiments carried out on the data collected from 386 children from Girona and Figueres (Spain) demonstrate the relevance of discriminant frequent patterns for childhood overweight prediction.

Recommended citation: López, B., Galera, D., López-Bermejo, A., Bassols, J. (2024). Frequent Patterns of Childhood Overweight from Longitudinal Data on Parental and Early-Life of Infants Health. In: Finkelstein, J., Moskovitch, R., Parimbelli, E. (eds) Artificial Intelligence in Medicine. AIME 2024. Lecture Notes in Computer Science, vol 14844. Springer, Cham.
https://doi.org/10.1007/978-3-031-66538-7_7