MagentaGaming Cloud Gaming Platform

T-Systems, 2020–2021

Team: 25 people
Role: Cloud Architect

Amazon EC2, Amazon GameLift, Amazon CloudFront, AWS Lambda, Amazon DynamoDB, Amazon ElastiCache, Amazon S3, AWS Auto Scaling

Overview

MagentaGaming is Deutsche Telekom's cloud gaming service, delivering AAA game titles to end users via browser and thin clients — no local hardware required. As part of T-Systems, I served as Cloud Architect on the platform build, responsible for the AWS infrastructure that powers game session management, streaming delivery, and player state persistence.

The project ran from mid-2020 to early 2021. The team of 25 delivered a production-ready MVP in 5 months, launching publicly at Gamescom — one of the world's largest gaming trade shows — in August 2021.

Challenge

The defining constraint of this project was the Gamescom launch date. The deadline was fixed and non-negotiable: Deutsche Telekom had committed to a public launch at the trade show, with press coverage and partner announcements already scheduled. Missing it was not an option.

The fixed date shaped the engineering: every architectural decision had to balance correctness against delivery speed. The platform needed to absorb unpredictable burst traffic from a live launch event while remaining stable enough for public demos. Latency requirements for cloud gaming are unforgiving: any infrastructure bottleneck translates directly into a degraded player experience, visible to press and partners on the show floor.

Role

As Cloud Architect, I was responsible for the end-to-end AWS infrastructure design and technical oversight across the 25-person team. My responsibilities included:

  • Designing the game session orchestration layer using Amazon GameLift for fleet management and session placement
  • Architecting the EC2 gaming instance fleet with GPU-optimized instance types and Auto Scaling policies for burst capacity
  • Designing the CloudFront distribution for low-latency game stream delivery across European regions
  • Implementing the player state and session data layer using DynamoDB and ElastiCache
  • Defining the Lambda-based control plane for session lifecycle management
  • Coordinating infrastructure delivery across frontend, backend, and platform engineering workstreams

Process

Given the fixed deadline, the team adopted a time-boxed delivery model with weekly infrastructure milestones.

Weeks 1–4 — Core Infrastructure: Established the VPC topology, EC2 gaming instance baseline, and GameLift fleet configuration. Terraform modules were written for all core resources from day one to ensure reproducible environments across dev, staging, and production.

Weeks 5–8 — Session Management and Scaling: Implemented the GameLift matchmaking and session placement logic, integrated the Lambda control plane for session lifecycle events, and configured Auto Scaling policies for the EC2 fleet. Load testing began in week 7 to validate burst capacity assumptions.
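The burst capacity assumptions validated in load testing come down to simple arithmetic: how many instances are needed for a given number of concurrent sessions, plus headroom. The following is a minimal sketch of that calculation, not the Auto Scaling service's own algorithm. The function name, the sessions-per-instance figure, the headroom percentage, and the fleet bounds are all illustrative assumptions rather than values from the project.

```python
def desired_fleet_size(
    concurrent_sessions: int,
    sessions_per_instance: int,
    headroom_pct: int,
    min_size: int,
    max_size: int,
) -> int:
    """Estimate how many instances a fleet needs for a given session count.

    Integer arithmetic is used throughout so the result does not depend on
    floating-point rounding. All parameters are hypothetical planning inputs.
    """
    if sessions_per_instance <= 0:
        raise ValueError("sessions_per_instance must be positive")
    if concurrent_sessions < 0 or headroom_pct < 0:
        raise ValueError("session count and headroom must be non-negative")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")

    # Sessions to provision for, scaled by the headroom percentage (times 100).
    scaled_sessions = concurrent_sessions * (100 + headroom_pct)
    divisor = 100 * sessions_per_instance

    # Ceiling division: a partially filled instance still has to exist.
    needed = -(-scaled_sessions // divisor)

    # Clamp to the fleet bounds, as a scaling policy would.
    return max(min_size, min(max_size, needed))
```

Running a calculation like this against the peak session counts observed in a load test gives a quick check on whether the configured fleet maximum leaves enough room for a launch-day spike.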

Weeks 9–14 — Delivery Layer and Hardening: Deployed the CloudFront distribution with optimized cache behaviors for game stream traffic, integrated ElastiCache for session state, and ran end-to-end latency benchmarks. The final two weeks were dedicated to production hardening, runbook preparation, and Gamescom rehearsal scenarios.
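Latency benchmarks of this kind are usually summarized as percentiles rather than averages, because tail latency is what players notice. As a rough illustration, the sketch below computes nearest-rank percentiles over a list of measured latencies. The function names and the choice of p50, p95, and p99 are assumptions for the example, not the project's actual benchmark tooling.

```python
from typing import Dict, List


def percentile(samples_ms: List[float], p: int) -> float:
    """Return the nearest-rank percentile (p in 1..100) of the samples."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")

    ordered = sorted(samples_ms)
    # Nearest-rank definition: rank = ceil(p / 100 * n), computed with integers.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize_latency(samples_ms: List[float]) -> Dict[str, float]:
    """Summarize a benchmark run as median and tail percentiles."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }
```

Comparing the p95 and p99 values between runs makes regressions in the delivery path visible even when the median stays flat.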

Gamescom Launch: The platform handled the live launch event without incident. Post-launch, the team transitioned to a steady-state operations model.

Decisions

Amazon GameLift over custom session orchestration: Early design considered building a custom session broker on EC2 with a DynamoDB-backed queue. We chose GameLift because it provided battle-tested fleet management, built-in matchmaking, and session placement logic that would have taken weeks to replicate reliably. Given the deadline, reducing custom code surface area was a deliberate risk mitigation strategy.
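To give a sense of what a custom broker would have had to implement, here is a deliberately simplified sketch of latency-aware session placement: choose the fleet with the lowest reported player latency that still has a free slot. This is not GameLift's placement algorithm. The `Fleet` type, the `place_session` function, and the region names in the usage note are hypothetical, and a real broker would also need retries, timeouts, and concurrency control.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fleet:
    """A hypothetical regional fleet with a count of free session slots."""
    region: str
    free_slots: int


def place_session(
    fleets: List[Fleet],
    player_latency_ms: Dict[str, float],
    max_latency_ms: float,
) -> Optional[Fleet]:
    """Pick the lowest-latency fleet with capacity, or None if none qualifies."""
    candidates = [
        fleet
        for fleet in fleets
        if fleet.free_slots > 0
        # Regions without a latency measurement are skipped.
        and fleet.region in player_latency_ms
        and player_latency_ms[fleet.region] <= max_latency_ms
    ]
    if not candidates:
        return None

    best = min(candidates, key=lambda fleet: player_latency_ms[fleet.region])
    best.free_slots -= 1  # Reserve the slot for this session.
    return best
```

For example, with a full fleet in `eu-central-1` and free slots in `eu-west-1`, a player is placed in `eu-west-1` as long as the measured latency to that region is within the limit. Otherwise the function returns `None` and the caller has to queue or reject the request. Even this small sketch hints at the edge cases a managed service absorbs.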

CloudFront for stream delivery: Rather than routing game stream traffic directly from EC2 instances, we placed CloudFront in front of the delivery layer. This reduced origin load during burst events and provided a consistent edge presence across European PoPs — critical for the latency profile required by cloud gaming.

DynamoDB + ElastiCache for player state: Player session state required both durability (DynamoDB) and sub-millisecond read latency for active sessions (ElastiCache). The two-tier approach allowed the platform to serve active session data from cache while persisting state changes asynchronously to DynamoDB.
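The two-tier pattern described here is commonly called write-behind caching. The sketch below shows the idea using in-memory stand-ins: one dictionary plays the role of the cache tier, another plays the durable store, and a queue holds writes that have not yet been persisted. The class and method names are hypothetical, and no real ElastiCache or DynamoDB client calls are shown.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class WriteBehindStateStore:
    """Serve reads from a cache and persist writes to a durable store later."""

    def __init__(self) -> None:
        self.cache: Dict[str, dict] = {}      # stand-in for the cache tier
        self.durable: Dict[str, dict] = {}    # stand-in for the durable tier
        self._pending: Deque[Tuple[str, dict]] = deque()

    def put(self, session_id: str, state: dict) -> None:
        """Update the cache immediately and queue the write for persistence."""
        snapshot = dict(state)  # Copy so later caller mutations do not leak in.
        self.cache[session_id] = snapshot
        self._pending.append((session_id, snapshot))

    def get(self, session_id: str) -> Optional[dict]:
        """Read from the cache, falling back to the durable store on a miss."""
        if session_id in self.cache:
            return self.cache[session_id]
        state = self.durable.get(session_id)
        if state is not None:
            self.cache[session_id] = state  # Repopulate the cache.
        return state

    def flush(self, max_items: Optional[int] = None) -> int:
        """Persist queued writes in order; return how many were written."""
        written = 0
        while self._pending and (max_items is None or written < max_items):
            session_id, state = self._pending.popleft()
            self.durable[session_id] = state
            written += 1
        return written
```

The trade-off inherent in this pattern is that any write still waiting in the queue is lost if the process fails before `flush` runs, so the flush interval bounds how much recent state can be lost.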

Terraform from day one: All infrastructure was defined in Terraform from the first sprint. This decision paid dividends during the final weeks when production environment parity with staging was essential for reliable launch rehearsals.

Results

The platform launched on schedule at Gamescom, meeting all committed milestones:

  • 5-month MVP delivery from project kickoff to public Gamescom launch
  • 25-person cross-functional team coordinated across T-Systems and Deutsche Telekom stakeholders
  • Gamescom launch completed without incident — platform handled live event traffic and press demonstrations
  • EC2 gaming fleet with Auto Scaling handled burst capacity during the launch event
  • Sub-100ms session placement latency achieved via GameLift fleet configuration
  • CloudFront distribution deployed across European edge locations for consistent stream delivery
  • DynamoDB + ElastiCache player state layer sustained concurrent active sessions throughout the launch window

The fixed deadline forced a discipline around scope and architectural simplicity that produced a more maintainable platform than a longer, more open-ended engagement might have. The Gamescom launch served as a forcing function for production readiness.

