Here is a collection of Google Cloud Architecture interview questions and answers covering core services, compute, storage, networking, security, IAM, databases, serverless, big data, DevOps, migration, cost optimization, and architectural best practices. Each question is followed by a detailed answer.
What is Google Cloud Platform (GCP) and what are its key differentiators?
Answer: Google Cloud Platform is a suite of cloud computing services running on the same infrastructure Google uses internally. Key differentiators: highly scalable global network (using Google’s private fiber, B4 and Andromeda), big data and AI/ML leadership (BigQuery, TensorFlow, Vertex AI), live migration of VMs (no downtime for maintenance), and strong container support (GKE, Anthos).
Explain the difference between a region, zone, and multiregion in GCP.
Answer: A region is a specific geographic location (e.g., us-central1, europe-west4) containing three or more zones. A zone is an isolated deployment area within a region with independent power, cooling, and networking (e.g., us-central1-a). A multi-region is a large geographic area (e.g., US, EU) spanning multiple regions, used by services such as Cloud Storage and BigQuery for higher redundancy.
What is a Google Cloud Project?
Answer: A project is the fundamental organizational unit in GCP. It holds all your resources (Compute Engine VMs, Cloud Storage buckets, BigQuery datasets). Each project has a project ID (globally unique) and project number. Projects are used to manage IAM permissions, billing, and quotas.
What is IAM in GCP and what are its main components?
Answer: Identity and Access Management (IAM) controls who can do what on which resources. Components: Member (user, service account, group, domain), Role (collection of permissions), Policy (bindings between members and roles). Roles can be basic (formerly primitive: Owner, Editor, Viewer), predefined (service‑specific), or custom.
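An IAM policy is essentially a list of role-to-member bindings. A minimal Python sketch of that shape (all names below are hypothetical):

```python
# Illustrative shape of an IAM policy: a list of bindings, each pairing one
# role with the members it is granted to (all names hypothetical).
policy = {
    "bindings": [
        {
            "role": "roles/storage.objectViewer",
            "members": [
                "user:alice@example.com",
                "serviceAccount:ci@my-project.iam.gserviceaccount.com",
            ],
        },
        {"role": "roles/editor", "members": ["group:devs@example.com"]},
    ]
}

def members_with_role(policy: dict, role: str) -> list:
    """Return every member bound to the given role (empty if unbound)."""
    for binding in policy["bindings"]:
        if binding["role"] == role:
            return binding["members"]
    return []

print(members_with_role(policy, "roles/editor"))
```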
What is a service account? How is it different from a user account?
Answer: A service account is a special Google account used by applications or virtual machines (not people) to authenticate and authorize API calls. It has its own credentials (JSON key) and can be assigned specific IAM roles. User accounts represent human users and require interactive sign‑in. Use service accounts for automation and workload identity.
Explain the difference between Cloud IAM roles: primitive, predefined, and custom.
Answer: Basic (formerly primitive) roles (Owner, Editor, Viewer) are broad and not recommended for production (risk of privilege escalation). Predefined roles are granular and managed by Google (e.g., Compute Admin, Storage Object Viewer). Custom roles let you bundle exactly the permissions you need; they can be created only at the project or organization level.
What is Google Cloud Resource Hierarchy?
Answer: The hierarchy (from top to bottom): Organization → Folders → Projects → Resources. Allows centralized policy management (IAM, VPC Service Controls, billing) at organization or folder level. Inheritance: policies from higher levels flow down.
What is Cloud CDN? How does it work?
Answer: Cloud CDN caches content at edge locations (Google’s global network) to improve latency and reduce origin load. It works with HTTP(S) Load Balancing and Cloud Storage. Uses cache keys, TTL, and invalidation. Supports signed URLs and cookies.
What is Cloud Interconnect and what are its types?
Answer: Cloud Interconnect provides dedicated private connectivity between on‑premises and GCP. Types: Dedicated Interconnect (direct physical connection, 10/100 Gbps), Partner Interconnect (via supported service provider, 50 Mbps to 10 Gbps). Alternative: VPN (over internet, less reliable, lower bandwidth).
What is Cloud Load Balancing and what are its types?
Answer: Cloud Load Balancing is a fully distributed, software‑defined load balancer. Types: External HTTP(S) (global, L7), Internal HTTP(S) (regional, L7), SSL Proxy and TCP Proxy (global, L4), External TCP/UDP Network Load Balancing (regional, L4, pass‑through), Internal TCP/UDP (regional, L4). Global load balancers use a single anycast IP.
What is Virtual Private Cloud (VPC) in GCP?
Answer: A VPC is a logically isolated network in GCP. It is global (subnets can span regions). You can create multiple VPCs per project. Subnets are regional. Resources in the same VPC can communicate using internal IPs (without public internet). Supports firewall rules, routes, VPN, Cloud NAT, VPC peering.
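Since subnets are plain CIDR ranges, a quick way to reason about internal connectivity is to check whether an address falls inside a subnet's range. A sketch using Python's standard `ipaddress` module (the range shown is a hypothetical subnet, similar to the auto-mode defaults):

```python
import ipaddress

# Check whether a VM's internal IP falls inside a subnet's CIDR range --
# resources in the same VPC reach each other over these internal addresses.
subnet = ipaddress.ip_network("10.128.0.0/20")  # hypothetical subnet range
vm_ip = ipaddress.ip_address("10.128.3.7")

print(vm_ip in subnet)        # True: this VM lives in the subnet
print(subnet.num_addresses)   # 4096 addresses in a /20
```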
What is VPC peering?
Answer: VPC peering connects two VPCs (same or different projects) using private IP addresses, allowing them to communicate as if they were on the same network. No gateway required, no single point of failure. Does not support transitive peering (if A peers with B and B peers with C, A cannot reach C). For hub‑and‑spoke connectivity, use Shared VPC or Network Connectivity Center.
What is Shared VPC?
Answer: Shared VPC allows you to share a central VPC network across multiple projects in the same organization. The host project contains the VPC; service projects attach to it. Resources in service projects can use subnets in the host VPC. Enables centralized network administration.
What is Cloud NAT and when is it used?
Answer: Cloud NAT (Network Address Translation) allows Compute Engine VMs without public IP addresses to initiate outbound connections to the internet (e.g., for updates, pulling images). It does not allow inbound connections from internet. Managed NAT gateway is regional, supports both TCP and UDP.
What is a firewall rule in GCP?
Answer: Firewall rules control ingress (incoming) and egress (outgoing) traffic to compute instances. Each rule has a priority (0‑65535, lower number higher priority). Stateful: once allowed, return traffic is automatically allowed. You can specify source/destination ranges, tags, service accounts.
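The priority ordering can be illustrated with a toy evaluator (a sketch only; real firewall rules also match on protocol, direction, source ranges, tags, and service accounts):

```python
# Minimal sketch of GCP firewall evaluation: rules are checked in priority
# order (lower number wins); the first matching rule decides allow/deny.
def evaluate(rules, port):
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if port in rule["ports"]:
            return rule["action"]
    return "deny"  # implied deny for ingress when nothing matches

rules = [
    {"priority": 1000, "action": "allow", "ports": {80, 443}},
    {"priority": 900, "action": "deny", "ports": {443}},  # lower number wins
]
print(evaluate(rules, 443))  # the priority-900 deny beats the priority-1000 allow
print(evaluate(rules, 80))   # only the allow rule matches
```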
Explain the difference between Cloud VPN and Cloud Interconnect.
Answer: Cloud VPN is an encrypted tunnel over the public internet, lower cost but subject to internet variability and latency. Cloud Interconnect is dedicated private connection, higher bandwidth, lower latency, more reliable, but more expensive. Use VPN for development or backup; Interconnect for production hybrid workloads.
What is Cloud Armor?
Answer: Cloud Armor is a Web Application Firewall (WAF) and DDoS protection service. It filters traffic to HTTP(S) Load Balancing and Cloud CDN. Provides preconfigured rules (OWASP Top 10), custom rules (using CEL), and rate limiting. Security policies are applied at the backend service. Protects against XSS, SQLi, and L3‑L7 DDoS.
Explain Compute Engine and its machine families.
Answer: Compute Engine provides scalable virtual machines. Instance families: General‑purpose (N2, N2D, N1, E2 – balanced), Compute‑optimized (C2, C2D – high CPU), Memory‑optimized (M2, M3 – large RAM), Accelerator‑optimized (A2 – GPUs), Storage‑optimized (Z3 – high local SSD). Choose based on workload requirements.
What are sole‑tenant nodes in Compute Engine?
Answer: Sole‑tenant nodes are physical servers dedicated to only your VMs. They provide hardware isolation for compliance (BYOL, license restrictions). You are billed for the entire node, not per VM. Managed via node groups and node templates.
What is a sustained use discount? How about committed use discounts?
Answer: Sustained use discount (SUD) is automatically applied for VMs running most of the month (up to 30% for 100% usage), no upfront payment. Committed use discounts (CUD) are purchased for 1 or 3 years for specific resources (vCPU, memory, GPUs) and region, providing significant discounts (up to 57% for 3‑year). CUD applies to many services (Compute Engine, GKE, Cloud SQL).
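The difference between the two models is simple arithmetic; a back-of-the-envelope sketch using a hypothetical list price (real rates vary by machine type and region):

```python
# Compare pricing models for one vCPU-month against a hypothetical
# on-demand list price; the percentages mirror the maximums cited above.
on_demand = 100.0  # hypothetical monthly list price, not a real GCP rate

sustained_use = on_demand * (1 - 0.30)   # up to ~30% off for full-month usage
committed_3yr = on_demand * (1 - 0.57)   # up to ~57% off for a 3-year commitment

print(round(sustained_use, 2), round(committed_3yr, 2))
```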
What is a preemptible VM?
Answer: Preemptible VMs are short‑lived (max 24 hours), low‑cost (60‑80% discount) instances that can be terminated by Google at any time with 30‑second notice. Use for batch jobs, fault‑tolerant workloads, and CI/CD. Not suitable for production stateful services.
What is a spot VM (previously preemptible) and how is it different?
Answer: Spot VMs are the evolution of preemptible VMs; they have no maximum runtime, but can be terminated when Google needs the capacity. Discount up to 91%. Unlike AWS Spot Instances, pricing is not auction‑based; Spot VMs are reclaimed only for capacity, not market price. Use for fault‑tolerant and stateless workloads.
What is a Cloud TPU?
Answer: Tensor Processing Units (TPUs) are Google‑custom ASICs designed to accelerate machine learning workloads (training and inference). Available on Compute Engine, GKE, and Vertex AI. Optimized for TensorFlow, PyTorch (through XLA). Offer high throughput for matrix operations.
What is Google Kubernetes Engine (GKE)?
Answer: GKE is a managed Kubernetes service; Google operates the control plane. Features: auto‑scaling (cluster and node pools), auto‑repair, auto‑upgrade, Workload Identity, integrated logging and monitoring, and an Istio‑based service mesh (Anthos/Cloud Service Mesh). Two modes: Standard (you manage node pools, pay per node) and Autopilot (fully managed nodes, pay per Pod).
Explain the difference between GKE Standard and GKE Autopilot.
Answer: Standard mode gives you control over node management (you manage node pools, upgrades, and security). Autopilot is a fully managed mode where Google manages nodes, you only define Pod resources (vCPU, memory). Autopilot simplifies operations, optimizes costs, and enforces security best practices. It is more expensive per vCPU but saves operational overhead.
What is Anthos?
Answer: Anthos is a hybrid and multi‑cloud platform that allows you to run Kubernetes clusters consistently on‑premises, on GCP, and on other clouds (AWS, Azure). It includes GKE on‑prem, Anthos Config Management, Anthos Service Mesh, and policy enforcement. Enables application portability and security consistency.
Explain Cloud Functions and its event sources.
Answer: Cloud Functions is a serverless execution environment (FaaS) that runs code in response to events. Event sources: Cloud Storage (object change), Pub/Sub (message), Firestore (document change), HTTP triggers, Cloud Scheduler, etc. Supports Node.js, Python, Go, Java, .NET, Ruby, PHP. Auto‑scales. Two generations: 1st gen (legacy) and 2nd gen (Cloud Run based, more features).
What is Cloud Run and how does it differ from Cloud Functions?
Answer: Cloud Run is a serverless container platform that runs stateless HTTP containers. You can use any language/runtime. It scales down to zero. Cloud Functions is more limited (specific runtimes, shorter timeouts). Cloud Run also offers jobs (batch). Cloud Run is better for microservices and longer‑running tasks. Both are pay‑per‑request or CPU time.
What is App Engine? What are the two environments?
Answer: App Engine is a PaaS (Platform as a Service) that fully manages infrastructure. Standard Environment supports specific runtimes (Python, Java, PHP, Go, Node.js) with faster scaling, but has many restrictions (cannot write to local disk). Flexible Environment runs custom containers, but slower scaling. App Engine is often replaced by Cloud Run.
What is Cloud Storage? Explain storage classes.
Answer: Cloud Storage is object storage. Classes: Standard (frequently accessed, lowest per‑operation cost), Nearline (30‑day minimum, backup), Coldline (90‑day, archival), Archive (365‑day, compliance, lowest storage cost). Also Autoclass automatically moves objects between classes based on access patterns. Multi‑regional, regional, and dual‑regional locations.
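Storage classes are often combined with lifecycle rules. A hypothetical lifecycle configuration, in the JSON shape Cloud Storage accepts, that steps objects down as they age past each class's minimum storage duration:

```python
# Hypothetical bucket lifecycle configuration: downgrade objects as they
# age (30/90 days mirror the Nearline/Coldline minimums), delete at 1 year.
lifecycle = {
    "lifecycle": {
        "rule": [
            {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
             "condition": {"age": 30}},
            {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
             "condition": {"age": 90}},
            {"action": {"type": "Delete"}, "condition": {"age": 365}},
        ]
    }
}
print(len(lifecycle["lifecycle"]["rule"]))  # three rules defined
```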
What is a Cloud Storage bucket and its key properties?
Answer: A bucket is a container for objects (files). Properties: globally unique name, location (region, multi‑region), storage class, default encryption (CMEK, Google‑managed), versioning, lifecycle policies, IAM roles, and bucket lock.
What is Compute Engine persistent disk? Types?
Answer: Persistent Disk (PD) is durable block storage attached to VMs. Types: Standard (HDD, cost‑effective), Balanced (SSD for general purpose), SSD (zonal or regional), Extreme (high IOPS). Also Hyperdisk (extreme performance, configurable IOPS/throughput). Persistent disk can be resized without downtime and attached to multiple VMs (read‑only).
What is Cloud Filestore?
Answer: Cloud Filestore is a managed NFS (Network File System) file service for Compute Engine and GKE. Used for applications that require a shared POSIX file system or file locking (e.g., SAP, shared home directories). Supports NFSv3. You pay for provisioned capacity (from 1 TB up to roughly 100 TB, depending on service tier).
What is Bigtable and what are its use cases?
Answer: Cloud Bigtable is a fully managed, scalable NoSQL database (wide‑column, key‑value). It is not relational (no joins). Use cases: real‑time analytics (IoT, time‑series), personalization (recommendation engines), ad serving, financial data. Consistent sub‑10ms latency. Supports HBase API.
What is Cloud SQL?
Answer: Cloud SQL is a managed relational database service for MySQL, PostgreSQL, and SQL Server. Features: automated backup, point‑in‑time recovery, read replicas (regional and cross‑region), high availability (failover replica in another zone), and maintenance windows. Not as scalable as Spanner but simpler, cheaper.
What is Cloud Spanner?
Answer: Cloud Spanner is a globally distributed, strongly consistent relational database that scales horizontally. It combines relational semantics with very high availability (up to a 99.999% SLA for multi‑region instances) and supports SQL. Suitable for global applications needing ACID transactions across regions. Higher cost than Cloud SQL. Offers interleaved tables, secondary indexes, and TrueTime‑based external consistency.
What is Firestore?
Answer: Firestore is a serverless document database (NoSQL) for mobile and web applications. Real‑time listeners, scalable, offline support. Two modes: Native (real‑time, security rules) and Datastore mode (compatibility with old Datastore). Integrated with Firebase.
What is BigQuery? Explain BigQuery architecture.
Answer: BigQuery is a serverless, highly scalable data warehouse for analytics. Architecture: columnar storage (Capacitor), decoupled compute and storage (using Colossus, Borg). Query engine uses Dremel. You pay for bytes processed (for query) or storage. Supports standard SQL, machine learning (BQML), and BI Engine.
What are clustered and partitioned tables in BigQuery?
Answer: Partitioned tables split a table into smaller segments based on a timestamp, date, or integer column, reducing query cost and improving performance. Clustering organizes data within partitions based on columns (order matters). Use clustering for columns with high cardinality; use partitioning for time‑series data.
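A hypothetical helper that emits the DDL for a partitioned, clustered table (the table and column names are invented for illustration):

```python
# Build BigQuery DDL for a time-partitioned, clustered table. The schema
# (event_ts, user_id, country) is a made-up example, not a real dataset.
def ddl(table, partition_col, cluster_cols):
    return (
        f"CREATE TABLE {table} (event_ts TIMESTAMP, user_id STRING, country STRING)\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )

print(ddl("analytics.events", "event_ts", ["user_id", "country"]))
```

Queries filtering on `DATE(event_ts)` then prune partitions, and filters on `user_id`/`country` benefit from clustering, cutting bytes scanned and therefore cost.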
What is BigQuery slots and reservation?
Answer: Slots are virtual CPUs used by BigQuery to execute queries. By default, BigQuery uses on‑demand pricing (pay per query bytes). Reserved slots are purchased (flex or annual) and provide predictable performance and dedicated capacity. Reservations are assigned to projects, folders, or organizations.
What is Pub/Sub? Explain its components.
Answer: Pub/Sub is a messaging service for event ingestion and delivery. Components: Topic (named resource for messages), Subscription (pull or push), Publisher (sends messages), Subscriber (receives messages). At‑least‑once delivery, ordering optional, retention up to 31 days. Use for decoupling microservices, data pipelines.
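The topic/subscription fan-out can be sketched as a toy in-memory broker (purely illustrative; real Pub/Sub adds acknowledgements, retries, retention, and ordering keys):

```python
from collections import defaultdict, deque

# Toy model of Pub/Sub's core objects: a topic fans each message out to
# every attached subscription, and each subscriber pulls independently.
class Broker:
    def __init__(self):
        self.subs = defaultdict(list)  # topic -> list of subscription queues

    def subscribe(self, topic):
        q = deque()
        self.subs[topic].append(q)
        return q

    def publish(self, topic, message):
        for q in self.subs[topic]:
            q.append(message)          # every subscription gets its own copy

broker = Broker()
billing = broker.subscribe("orders")
shipping = broker.subscribe("orders")
broker.publish("orders", "order-42")
print(billing.popleft(), shipping.popleft())  # both subscriptions received it
```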
What is Cloud Scheduler?
Answer: Cloud Scheduler is a fully managed cron job scheduler. Triggers HTTP/S endpoints, Pub/Sub topics, or App Engine tasks at defined intervals (Unix cron format). Time zones configurable. Uses OIDC authentication.
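A rough validator for the 5-field Unix cron format (a sketch: it accepts numbers, `*`, steps, and simple ranges, but not day/month names or every crontab nuance):

```python
import re

# Sanity-check the 5-field cron format Cloud Scheduler accepts:
# minute hour day-of-month month day-of-week.
FIELD = r"(\*|\d+)(/\d+)?(-\d+)?(,(\*|\d+)(/\d+)?(-\d+)?)*"

def looks_like_cron(expr: str) -> bool:
    fields = expr.split()
    return len(fields) == 5 and all(re.fullmatch(FIELD, f) for f in fields)

print(looks_like_cron("*/10 * * * *"))   # every 10 minutes
print(looks_like_cron("0 9 * * MON"))    # day names are outside this sketch
```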
What is Cloud Tasks?
Answer: Cloud Tasks manages asynchronous task execution. Enqueue tasks (HTTP or App Engine) and they are delivered with configurable retries and rate limits. Used to decouple request handling and batch processing.
What is Dataflow?
Answer: Dataflow is a fully managed stream and batch data processing service based on Apache Beam. It handles autoscaling, dynamic work rebalancing, exactly‑once processing. Supports many I/O connectors (Pub/Sub, BigQuery, GCS, Kafka). Write Beam pipelines in Java/Python.
What is Dataproc?
Answer: Dataproc is a managed Hadoop and Spark service. Clusters start quickly (typically around 90 seconds) and support autoscaling, component configuration, and low‑cost preemptible/Spot VMs. Use it for big data processing (ETL, machine learning) when you have existing Spark/Hadoop code.
What is Dataprep?
Answer: Dataprep (Cloud Dataprep by Trifacta) is a serverless data preparation service that integrates with BigQuery, Dataflow, and Cloud Storage. It provides a visual interface to clean, transform, and explore data; recipes are executed as Dataflow jobs (or pushed down to BigQuery).
What is Vertex AI?
Answer: Vertex AI is a unified ML platform. It integrates AutoML, custom training, model registry, endpoints for serving, feature store, and ML pipelines. Supports both Google‑managed infrastructure and custom containers. Reduces operational overhead for ML.
What is TensorFlow on GCP?
Answer: TensorFlow is an open‑source ML framework. On GCP, you can run TensorFlow on Compute Engine (with GPUs/TPUs), GKE, Cloud TPU, and Vertex AI (training and prediction). Vertex AI TensorBoard is managed.
What is Looker and Looker Studio?
Answer: Looker is a business intelligence (BI) platform that queries BigQuery, Cloud SQL, etc., directly (in‑database). It uses LookML (modeling layer). Looker Studio (formerly Google Data Studio) is a free, easy‑to‑use dashboarding tool, less powerful than Looker but lower cost.
What is Cloud Build?
Answer: Cloud Build is a CI/CD platform that executes builds using steps (containers). Supported triggers: GitHub, Cloud Source Repositories, Pub/Sub, webhooks. Can build container images (Docker) and deploy to Cloud Run, GKE, App Engine, etc.
What is Artifact Registry?
Answer: Artifact Registry is a managed container and language package registry. It replaces Container Registry (deprecated). Supports Docker, Maven, npm, Python, Go, and Debian. Integrates with Cloud Build, Cloud Run, GKE.
Explain Deployment Manager.
Answer: Deployment Manager is an infrastructure as code (IaC) service that uses YAML (with Python or Jinja templates) to create and manage GCP resources, similar to AWS CloudFormation. Google now steers users toward Terraform and Infrastructure Manager; Pulumi is another alternative.
What is Cloud Security Command Center (Cloud SCC)?
Answer: Cloud SCC is a security management and data risk platform that provides asset inventory, vulnerability scanning, real‑time threat detection (from Event Threat Detection), and compliance dashboards (CIS, PCI, etc.). Integrates with Security Health Analytics, Web Security Scanner.
What is Cloud DLP (Data Loss Prevention)?
Answer: Cloud DLP is a service to discover, classify, and de‑identify sensitive data (PII, credit cards, etc.). It can scan text, images, BigQuery, Cloud Storage, and Datastore. Redaction (masking, tokenization) and risk analysis.
What is Identity‑Aware Proxy (IAP)?
Answer: IAP provides zero‑trust access to applications hosted on GCP (Compute Engine, GKE, App Engine, Cloud Run) without a VPN. It uses IAM to authorize users and provides TCP forwarding or HTTP(S) with user identity forwarding. Integrates with Cloud Load Balancing.
What is Cloud KMS?
Answer: Cloud Key Management Service (KMS) manages cryptographic keys for encryption. You can create, use, rotate, and destroy symmetric/asymmetric keys. Integrated with many GCP services (Cloud Storage, Compute Engine, BigQuery). Supports HSM (hardware security module) and external keys (Cloud EKM).
What is Secret Manager?
Answer: Secret Manager is a secure store for API keys, passwords, certificates, and other secrets. It provides versioning, audit logging, and fine‑grained IAM. Access is via API, Cloud Run, GKE (with Workload Identity), or Compute Engine. Secrets are encrypted (using KMS).
What is Cloud Identity?
Answer: Cloud Identity is an Identity as a Service (IDaaS) that provides user management, SSO, and device management. It can be used without Google Workspace. Integrates with GCP IAM and can federate with identity providers via SAML.
What is Workload Identity?
Answer: Workload Identity allows applications running on GKE, Compute Engine, or Cloud Run to authenticate to Google APIs without using service account keys (improved security). It uses OIDC tokens and Kubernetes service accounts mapped to IAM service accounts.
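On GKE, the binding is expressed as an annotation on the Kubernetes ServiceAccount naming the IAM service account it should impersonate. A sketch of that manifest as a Python dict (both account names are hypothetical):

```python
# Kubernetes ServiceAccount manifest (as a dict) for Workload Identity:
# the annotation maps the KSA to the IAM service account it impersonates.
# Both "app-ksa" and "app-gsa@my-project..." are made-up example names.
ksa = {
    "apiVersion": "v1",
    "kind": "ServiceAccount",
    "metadata": {
        "name": "app-ksa",
        "annotations": {
            "iam.gke.io/gcp-service-account":
                "app-gsa@my-project.iam.gserviceaccount.com",
        },
    },
}
print(ksa["metadata"]["annotations"]["iam.gke.io/gcp-service-account"])
```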
What is VPC Service Controls?
Answer: VPC Service Controls create a security perimeter that restricts access to GCP services (Cloud Storage, BigQuery, Cloud SQL) from untrusted networks. It prevents data exfiltration to unauthorized external IPs or VPCs. Adds a layer of protection beyond IAM.
What is Access Transparency?
Answer: Access Transparency logs are detailed records of Google personnel actions on your data (e.g., support engineer accessing a VM). Provides visibility and accountability. Available with certain support tiers.
What is Cloud Audit Logs?
Answer: Cloud Audit Logs records administrative (control plane) and data access (data plane) activity on GCP resources. Four types: Admin Activity (always on), Data Access (opt‑in for most services), System Event, and Policy Denied. Integrates with Cloud Logging.
What is Cloud Monitoring and Cloud Logging?
Answer: Cloud Monitoring collects metrics, dashboards, and alerts (formerly Stackdriver). Cloud Logging collects and stores logs (with querying, log‑based metrics, and sinks to Pub/Sub, BigQuery). Both are essential for observability.
What are Service Level Indicators (SLI), Service Level Objectives (SLO), and Service Level Agreements (SLA)?
Answer: SLI is a metric (e.g., request latency, availability). SLO is a target for SLI (e.g., 99.9% uptime over 30 days). SLA is the agreement with customers, often includes compensation if SLO missed. Use Cloud Monitoring to track custom SLOs.
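Turning an SLO into an error budget is a one-line calculation:

```python
# At 99.9% availability over a 30-day window, the error budget is the
# remaining 0.1% of the window -- the downtime you can "spend".
slo = 0.999
window_minutes = 30 * 24 * 60            # 43,200 minutes in 30 days
error_budget = (1 - slo) * window_minutes

print(round(error_budget, 1))            # 43.2 minutes of allowed downtime
```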
What is Cloud Profiler?
Answer: Cloud Profiler is a low‑overhead, production profiling tool that identifies CPU and memory bottlenecks across applications (Java, Go, Python, Node.js). It shows call trees and heap profiles. Works on Compute Engine, GKE, App Engine.
What is Cloud Trace?
Answer: Cloud Trace is a distributed tracing system (OpenTelemetry compatible) for analyzing latency in service calls. It collects traces, displays waterfall graphs, and identifies performance issues.
Explain Cloud Deployment options: Cloud Run, GKE, Compute Engine, App Engine.
Answer: Choose Cloud Run for stateless containers and serverless (ease, scalability). GKE for complex microservice architectures, batch jobs, or when you need Kubernetes features. Compute Engine for full control over OS, licensing, or legacy apps. App Engine when you want PaaS with specific runtime restrictions; Cloud Run is often better.
What is Firestore vs Datastore?
Answer: Firestore is the evolution of Datastore. Firestore Native mode adds real‑time updates, stronger consistency, and mobile SDKs. Datastore mode provides backward compatibility for existing Datastore customers. For new projects, use Firestore.
What is Cloud Composer?
Answer: Cloud Composer is a managed workflow orchestration service built on Apache Airflow. Used to author, schedule, and monitor data pipelines (DAGs). Integrates with BigQuery, Dataflow, Dataproc, etc.
What is Transfer Appliance?
Answer: Transfer Appliance is a physical device (rackable) to securely transfer large amounts (petabytes) of on‑premises data to Google Cloud (Cloud Storage). Offline transfer when bandwidth is insufficient.
What is Cloud Interconnect vs VPN?
Answer: Cloud VPN runs encrypted tunnels over the public internet (lower cost, variable latency and bandwidth); Cloud Interconnect is a dedicated private connection (higher bandwidth, lower latency, higher cost). A common pattern is Interconnect as the primary hybrid link with VPN as backup.
How would you design a multi‑region deployment for high availability?
Answer: Use global Cloud Load Balancing to distribute traffic across regions. Deploy services in multiple regions (e.g., us‑central1 and europe‑west1). Use Cloud Spanner for a globally distributed database, or Cloud SQL with cross‑region read replicas. Use Cloud CDN for static content. For GKE, deploy regional clusters or multiple clusters with Multi Cluster Ingress and Multi‑Cluster Services (MCS).
What is Cloud Foundation Toolkit?
Answer: Cloud Foundation Toolkit (CFT) is a set of reference Terraform modules maintained by Google. It follows best practices for creating VPCs, IAM, and billing. Used to accelerate landing zone deployments.
What is Policy Intelligence?
Answer: Policy Intelligence is a suite of tools for understanding and managing IAM policies: IAM Recommender (flags over‑granted permissions), Policy Troubleshooter, Policy Analyzer, and Policy Simulator. Helps enforce least privilege.
What is Billing Budget and Alert?
Answer: Billing budgets are thresholds (e.g., $500 or 90% of forecast) that trigger alerts. Budgets are created per project or billing account. Alerts can send to Pub/Sub, email. Use to prevent cost overrun.
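Threshold alerting can be sketched in a few lines (the thresholds here mirror the common 50%/90%/100% defaults):

```python
# Report which budget thresholds the current spend has crossed; each
# crossed threshold would trigger one alert (email or Pub/Sub message).
def crossed_thresholds(budget, spend, thresholds=(0.5, 0.9, 1.0)):
    return [t for t in thresholds if spend >= budget * t]

print(crossed_thresholds(500, 460))   # 92% of a $500 budget
```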
What are labels and tags in GCP?
Answer: Labels are key‑value pairs for organizing resources (e.g., environment:prod, cost‑center:marketing). They appear in billing exports. Tags (network tags) are attached to instance networking for firewall rules. Also Resource Manager tags for hierarchical policies.
What is Cloud Asset Inventory?
Answer: Cloud Asset Inventory provides metadata and usage history of all GCP resources across projects. It supports export to BigQuery, export to Cloud Storage, and searchable API. Used for asset management and compliance.
What is Cloud Shell and Cloud Shell Editor?
Answer: Cloud Shell is a browser‑based terminal with pre‑installed gcloud, kubectl, and other tools, with a persistent $HOME disk. Cloud Shell Editor is a web‑based IDE (code‑oss) inside Cloud Shell.
What is Cloud Run for Anthos?
Answer: Cloud Run for Anthos ran Knative‑based serverless containers on GKE/Anthos clusters, giving a consistent developer experience between GCP and on‑prem. The product has been deprecated; use fully managed Cloud Run, or run Knative serving directly on GKE.
What is Cloud Functions 2nd gen?
Answer: Cloud Functions (2nd gen) is built on Cloud Run and Eventarc, offering longer timeouts (up to 60 minutes for HTTP‑triggered functions), more CPU and memory, per‑instance concurrency, and larger request sizes. Use it for more complex workflows.
What is Eventarc?
Answer: Eventarc is a service to receive events from Google Cloud sources (Cloud Storage, Pub/Sub, Firestore, etc.) and deliver to Cloud Run or Cloud Functions (2nd gen). Provides a standardized eventing model.
What is Config Connector?
Answer: Config Connector is a Kubernetes add‑on that allows you to manage GCP resources using Kubernetes custom resources (kubectl). It is used in GKE and Anthos to manage cloud resources declaratively.
What is Cloud Build private pools?
Answer: Cloud Build private pools provide workers that run in your VPC network, enabling access to private resources (like Compute Engine VMs) without internet egress. Good for security and compliance.
What is Binary Authorization?
Answer: Binary Authorization is a policy‑based admission control system that ensures only trusted container images are deployed on GKE or Cloud Run. It can integrate with Container Analysis, vulnerability scanning.
What is Container Analysis?
Answer: Container Analysis stores metadata about container images (vulnerabilities, attestations). It works with Binary Authorization and Artifact Analysis (vulnerability scanning). Scans images stored in Artifact Registry.
What is Cloud HSM?
Answer: Cloud HSM provides hardware security module (FIPS 140‑2 Level 3) for key management. Keys are generated and stored in tamper‑resistant hardware. Managed via Cloud KMS, same APIs.
What is External Key Manager (Cloud EKM)?
Answer: Cloud EKM allows you to manage encryption keys outside GCP (on‑prem or external partner) and still use them with GCP services (Cloud Storage, BigQuery). Provides external control but adds latency.
What is Customer‑Supplied Encryption Keys (CSEK)?
Answer: CSEK allows you to provide your own AES‑256 key to encrypt data in Cloud Storage. Google does not store the key; you manage it. If you lose the key, data is unrecoverable. More cumbersome than KMS.
What are Shielded VMs?
Answer: Shielded VMs provide verifiable integrity (Secure Boot, virtual trusted platform module, integrity monitoring). Protects against boot‑level malware. Supported on Compute Engine, GKE nodes.
What are Confidential VMs?
Answer: Confidential VMs encrypt data in‑use using AMD SEV (Secure Encrypted Virtualization) or Intel TDX. The encryption keys are generated per VM and are not accessible to hypervisor or host. Protects against physical attacks.
What is Access Approval?
Answer: Access Approval lets you require explicit, time‑limited approval before Google personnel can access your data (e.g., during support requests). It builds on Access Transparency, which logs such access.
What is Assured Workloads?
Answer: Assured Workloads provides compliance controls for regulated workloads (FedRAMP, HIPAA, PCI, CMMC). It enforces resource locations, personnel access restrictions, and data residency, typically via a folder configured with a compliance blueprint.
What is Resource Manager and Folder?
Answer: Resource Manager is the service that organizes GCP resources. Folders group projects (e.g., by department, environment). Policies set on folders apply to all projects within. Enables scaling of IAM and organization policies.
What is Organization Policy Service?
Answer: Organization Policy Service allows you to define constraints on resource configurations (e.g., restrict VM machine types, disable public IP address, location restrictions). Applied at organization, folder, or project level. Prevents configuration drift.
What is Service Directory?
Answer: Service Directory is a managed service discovery and registry for services (internal or external). It stores endpoints, metadata, and annotations. Useful for microservices to discover each other without hardcoding IPs.
What is Private Google Access?
Answer: Private Google Access allows Compute Engine VMs with only internal IPs (no external IP) to reach Google APIs and services (Cloud Storage, BigQuery) over Google's internal network rather than the internet. It is enabled per subnet and can be combined with the private.googleapis.com or restricted.googleapis.com domains.
What is a Cloud Router?
Answer: Cloud Router is a managed BGP (Border Gateway Protocol) router for dynamic routing between your VPC and on‑premises networks (via Cloud VPN or Interconnect). Exchanges routes and learns changes automatically.
Is Cloud NAT regional or global?
Answer: Cloud NAT is regional: you configure one NAT gateway per region, attached to a Cloud Router. It only translates outbound connections initiated by VMs without external IPs; inbound connections from the internet are not allowed.
What is a load balancer health check? Types?
Answer: Health checks monitor backend instances. Types: HTTP, HTTPS, HTTP2, TCP, SSL. Unhealthy backends are removed from rotation. Health check parameters: check interval, timeout, healthy/unhealthy thresholds.
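The threshold behavior can be modeled as a small state machine (a sketch of the idea, not the exact product semantics):

```python
# A backend flips state only after the configured number of consecutive
# probes disagrees with its current state -- this damps flapping.
class HealthCheck:
    def __init__(self, healthy_threshold=2, unhealthy_threshold=3):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.state = "HEALTHY"
        self.streak = 0  # consecutive probes contradicting current state

    def probe(self, ok: bool) -> str:
        if ok == (self.state == "HEALTHY"):
            self.streak = 0                  # probe agrees with current state
        else:
            self.streak += 1
            limit = (self.healthy_threshold if self.state == "UNHEALTHY"
                     else self.unhealthy_threshold)
            if self.streak >= limit:
                self.state = "HEALTHY" if self.state == "UNHEALTHY" else "UNHEALTHY"
                self.streak = 0
        return self.state

hc = HealthCheck()
print([hc.probe(ok) for ok in (False, False, False, True, True)])
```

With the defaults above, three failed probes mark the backend unhealthy (removing it from rotation) and two successes bring it back.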
What is Traffic Director?
Answer: Traffic Director is a managed traffic control plane for service mesh on Compute Engine and GKE. It provides global load balancing, health checks, and traffic routing with Envoy sidecar proxies.
What is Cloud Endpoints?
Answer: Cloud Endpoints is an API management system (API gateway) that provides authentication, logging, monitoring, and API key validation for APIs running on GKE, Compute Engine, App Engine, Cloud Run.
What is Apigee?
Answer: Apigee is a full‑lifecycle API management platform, more powerful than Cloud Endpoints. It provides API analytics, developer portals, monetization, and security policies. Suitable for large‑scale API programs.
What is Cloud Life Sciences (formerly Google Genomics)?
Answer: Cloud Life Sciences was a fully managed service for running bioinformatics pipelines (containerized workflows) on GCP. It has been deprecated; Google Cloud Batch is the recommended successor for such workloads.
What is Cloud Healthcare API?
Answer: Cloud Healthcare API provides managed storage and processing for healthcare data (HL7v2, FHIR, DICOM). It integrates with BigQuery, Dataproc, and Cloud Dataflow for analytics.
What is Cloud IoT Core?
Answer: Cloud IoT Core is a fully managed service for ingesting and managing device data from sensors (MQTT, HTTP). It forwards to Pub/Sub. Deprecated as of August 2023; alternatives are Pub/Sub directly or other partners.
What is Google Cloud Armor Adaptive Protection?
Answer: Adaptive Protection (part of Cloud Armor) uses machine learning to detect and mitigate application layer DDoS attacks and bot traffic. It automatically learns normal traffic patterns and creates security policies.
What is reCAPTCHA Enterprise?
Answer: reCAPTCHA Enterprise is a managed service that protects websites from spam, bots, and abusive traffic. Provides score‑based risk analysis and user challenge. Integrates with WAF (Cloud Armor).
What is Chronicle (Google Security Operations)?
Answer: Chronicle is a cloud‑based security analytics platform (SIEM). It ingests logs from many sources and provides search at scale, detection rules, and threat intelligence. It is now branded Google Security Operations.
What is BeyondCorp?
Answer: BeyondCorp is Google’s zero‑trust security model, implemented through services like Identity‑Aware Proxy (IAP) and Access Context Manager. It grants access based on device and user context, not network location.
What is Certificate Authority Service (CAS)?
Answer: CAS is a managed private CA service for issuing and managing X.509 certificates for internal use. It supports IAM‑based access control, integrates with GKE, and provides audit logging. A managed alternative to running your own CA or relying on self‑signed certificates.
What is Cloud Storage transfer service?
Answer: Storage Transfer Service moves data into Cloud Storage from on‑premises file systems (via transfer agents), other clouds (AWS S3, Azure Blob Storage), or HTTP(S) sources. Supports scheduled, incremental transfers.
What is Storage Transfer Service for on‑premises?
Answer: It uses a software agent running on your server to transfer data to Cloud Storage. Agent copies files incrementally. Supports POSIX file systems. Can also schedule transfers.
What is BigQuery BI Engine?
Answer: BI Engine is an in‑memory analysis service that accelerates BigQuery queries for BI tools (Looker, Tableau, Looker Studio). It caches data in memory and supports sub‑second queries over large datasets.
What is BigQuery Omni?
Answer: BigQuery Omni is a multi‑cloud analytics product that allows you to query data stored in AWS S3 and Azure Blob Storage using BigQuery SQL, without moving data. Runs in the cloud provider’s environment.
What is BigQuery GIS?
Answer: BigQuery GIS provides geography data types and functions for spatial analytics (points, polygons). Use for location‑based queries, geospatial joins, and visualization.
What is BigQuery ML?
Answer: BigQuery ML allows you to build and train machine learning models (linear regression, logistic regression, XGBoost, etc.) using SQL directly on BigQuery data. Models can be exported to Vertex AI for serving.
What is BigQuery Data Transfer Service?
Answer: Data Transfer Service automates data ingestion from SaaS apps (Google Ads, Campaign Manager, etc.) and external sources (Teradata, Amazon S3) into BigQuery. Supports scheduled loads.
What is Cloud Data Fusion?
Answer: Cloud Data Fusion is a managed ETL service (CDAP) with a visual interface for data pipeline creation. Supports batch and real‑time, 150+ connectors. Good for data integration without coding.
What is Cloud Workflows?
Answer: Cloud Workflows is a serverless orchestration service for executing series of steps (HTTP, Cloud Functions, BigQuery jobs, etc.) using YAML or JSON definition. Uses durable execution, retries.
What is the serial console in Compute Engine?
Answer: The interactive serial console gives direct access to an instance's serial port, e.g., for debugging boot or SSH issues. Access is controlled by IAM, serial port output can be logged to Cloud Logging, and the feature can be disabled for security.
What is live migration of VMs?
Answer: Live migration lets Google move a running VM to another host without downtime, typically for host maintenance and security updates. It is enabled by default, but not supported for Spot/preemptible VMs or VMs with attached GPUs (those are set to terminate on host maintenance).
How do machine images differ from disk snapshots?
Answer: A machine image captures a VM's full configuration (all disks, metadata, and properties such as startup scripts) and can create new VMs directly. Snapshots are block‑level backups of a single disk. Use machine images to capture complete VM state; use snapshots for per‑disk backup and restore.
What are Filestore's tiers and capacity limits?
Answer: Filestore offers several tiers, including Basic HDD, Basic SSD, High Scale/Zonal SSD, and Enterprise, with capacity ranging from roughly 1 TiB to 100 TiB depending on tier (ranges have expanded over time; check current limits). You scale by adding capacity; on the higher tiers, performance also scales with capacity.
What is Persistent Disk performance?
Answer: Performance scales with disk size and machine type. Approximate per‑disk read IOPS ceilings: pd‑standard ~7.5K, pd‑balanced ~80K, pd‑ssd ~100K, pd‑extreme ~120K, and Hyperdisk Extreme up to ~350K (check current documentation for exact figures). Throughput limits vary similarly. Disks can be zonal or regional (synchronously replicated across two zones).
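As a rough model, baseline Persistent Disk IOPS scale linearly with provisioned size up to a per-disk cap. A sketch with illustrative per-GB rates and caps (assumed values, not authoritative figures; real performance also depends on machine type):

```python
# Illustrative per-GB read IOPS rates and per-disk caps for Persistent
# Disk types. These numbers are assumptions for the sketch; check the
# current documentation before capacity planning.
IOPS_PER_GB = {"pd-standard": 0.75, "pd-balanced": 6, "pd-ssd": 30}
IOPS_CAP = {"pd-standard": 7_500, "pd-balanced": 80_000, "pd-ssd": 100_000}

def provisioned_read_iops(disk_type: str, size_gb: int) -> int:
    """Baseline read IOPS grow linearly with size, up to a per-disk cap."""
    return min(int(IOPS_PER_GB[disk_type] * size_gb), IOPS_CAP[disk_type])

print(provisioned_read_iops("pd-balanced", 500))   # 3000
print(provisioned_read_iops("pd-ssd", 10_000))     # hits the 100000 cap
```

The practical consequence: undersized disks can be IOPS-starved, so sometimes you provision a larger disk purely for performance, not capacity.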
What is Cloud Storage Object Lifecycle Management?
Answer: You can set rules (e.g., delete after 30 days, move to Coldline after 90 days). Actions: SetStorageClass, Delete. Conditions: Age, CreatedBefore, IsLive, etc.
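The rules above are expressed as a small JSON document. A sketch that builds one in the shape accepted by the lifecycle configuration file (`rule` list of `action`/`condition` pairs); the helper function is hypothetical:

```python
import json

def lifecycle_rule(action_type, age_days, storage_class=None):
    """Build one Object Lifecycle Management rule in the JSON shape
    used by lifecycle configuration files (e.g. with
    `gcloud storage buckets update --lifecycle-file=...`)."""
    action = {"type": action_type}
    if storage_class:
        action["storageClass"] = storage_class
    return {"action": action, "condition": {"age": age_days}}

config = {"rule": [
    lifecycle_rule("SetStorageClass", 90, "COLDLINE"),  # demote after 90 days
    lifecycle_rule("Delete", 365),                      # delete after a year
]}
print(json.dumps(config, indent=2))
```

Conditions such as `createdBefore`, `isLive`, or `matchesStorageClass` slot into the same `condition` object alongside `age`.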
What is Cloud Storage Requester Pays?
Answer: With Requester Pays enabled on a bucket, the requester (not the bucket owner) is billed for requests and egress. Useful for sharing large public datasets without the owner absorbing transfer costs. Requesters must authenticate and specify a billing project.
What is Service Directory?
Answer: Service Directory is a managed registry for publishing, discovering, and connecting services across GCP, on‑premises, and other clouds. It integrates with Cloud DNS private zones, so registered services can be resolved by name inside a VPC.
What is Cloud DNS?
Answer: Cloud DNS is a managed authoritative DNS service with a 100% availability SLA. It supports public zones, private zones (visible only within a VPC), DNSSEC, DNS forwarding, and DNS peering.
What is a Network Endpoint Group (NEG)?
Answer: A NEG is a group of backend endpoints (IP:port pairs) for load balancing. Types include zonal (VM or pod endpoints, enabling container‑native load balancing on GKE), internet (external endpoints), serverless (Cloud Run, App Engine, Cloud Functions), and hybrid connectivity (on‑premises endpoints).
What is Internal Load Balancing (ILB) vs external?
Answer: ILB distributes traffic inside your VPC using internal IP addresses. External LB distributes public internet traffic. Both regional and global exist.
What is the priority order of Google Cloud Armor security policy rules?
Answer: Each rule has a numeric priority; lower numbers are evaluated first, and the first matching rule's action (e.g., allow or deny) applies. The default rule has the lowest possible priority (2147483647), is evaluated last, and its action is configurable.
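The first-match-by-priority evaluation can be sketched in a few lines. A toy model (real rules match on CIDR ranges and CEL expressions, not literal IP lists):

```python
def evaluate(rules, request_ip):
    """Evaluate security policy rules in ascending priority order;
    the first matching rule's action wins."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule["match"] == ["*"] or request_ip in rule["match"]:
            return rule["action"]
    return "allow"  # unreachable if a catch-all default rule exists

policy = [
    # Default rule: lowest possible priority, evaluated last.
    {"priority": 2147483647, "match": ["*"], "action": "allow"},
    {"priority": 1000, "match": ["203.0.113.9"], "action": "deny"},
]
print(evaluate(policy, "203.0.113.9"))   # deny
print(evaluate(policy, "198.51.100.1"))  # allow
```

Leaving gaps between priority numbers (1000, 2000, ...) makes it easy to insert rules later without renumbering.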
What is Cloud DLP de‑identification methods?
Answer: Masking (replace characters with a fixed character), redaction (remove the matched content), tokenization (replace with a surrogate value), format‑preserving encryption, bucketing (generalize values into ranges), and date shifting.
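To make the methods concrete, here are toy versions of three of them. These are illustrative stand-ins, not the Cloud DLP API (which applies such transforms via `deidentifyTemplates` and inspection configs):

```python
import datetime

def mask(value: str, keep_last: int = 4, char: str = "*") -> str:
    """Character masking: hide all but the last few characters."""
    return char * max(len(value) - keep_last, 0) + value[-keep_last:]

def bucket(age: int, size: int = 10) -> str:
    """Bucketing: generalize a number into a coarse range."""
    lo = (age // size) * size
    return f"{lo}-{lo + size - 1}"

def shift_date(d: datetime.date, days: int) -> datetime.date:
    """Date shifting: move dates by a consistent offset so intervals
    between a record's dates are preserved."""
    return d + datetime.timedelta(days=days)

print(mask("4111111111111111"))                     # ************1111
print(bucket(37))                                   # 30-39
print(shift_date(datetime.date(2024, 1, 15), -10))  # 2024-01-05
```

Notice the trade-off each method makes between utility (masked card numbers stay matchable by suffix, shifted dates keep intervals) and re-identification risk.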
How would you design cost optimization for a large data warehouse on BigQuery?
Answer: Use partitioned and clustered tables to cut the bytes each query scans, with clustering keys chosen for common filters. Set table and partition expiration. Buy slot reservations for predictable workloads. Use BI Engine for dashboards and materialized views for common aggregations. Avoid SELECT *; use the table preview to inspect data instead of querying it.
What is Cloud Run timeout and concurrency?
Answer: The maximum request timeout is 60 minutes for Cloud Run services (HTTP‑triggered Cloud Functions 2nd gen, which run on Cloud Run infrastructure, share this limit). Concurrency is the number of simultaneous requests one container instance handles (default 80, maximum 1000); choose it based on how your application uses CPU and memory per request.
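Concurrency directly drives how many instances a given load needs. A back-of-the-envelope sizing sketch using Little's law (illustrative only; real autoscaling also reacts to CPU utilization):

```python
import math

def instances_needed(rps: float, latency_s: float, concurrency: int) -> int:
    """Little's law: average in-flight requests = rps * latency.
    Dividing by per-instance concurrency estimates instance count."""
    in_flight = rps * latency_s
    return math.ceil(in_flight / concurrency)

# 1000 req/s at 200 ms latency with concurrency 80 -> ~3 instances
print(instances_needed(rps=1000, latency_s=0.2, concurrency=80))  # 3
```

This is why raising concurrency (when the app tolerates it) is one of the cheapest Cloud Run optimizations: the same traffic needs far fewer instances than with concurrency 1.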
What is Cloud Run revisions and traffic splitting?
Answer: Revisions are immutable snapshots of a Cloud Run service. You can split traffic (e.g., 90% to stable revision, 10% to canary) using percentage weights. Used for A/B testing and gradual rollouts.
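The percentage weights configured with `gcloud run services update-traffic --to-revisions=...` apportion requests across revisions. A deterministic sketch of that apportionment (real routing is per-request and probabilistic; revision names are made up):

```python
def allocate(requests: int, weights: dict) -> dict:
    """Apportion N requests across revisions by weight percentages."""
    assert sum(weights.values()) == 100, "weights must total 100%"
    return {rev: requests * pct // 100 for rev, pct in weights.items()}

# Canary rollout: 10% of traffic to the new revision.
print(allocate(1000, {"stable-rev": 90, "canary-rev": 10}))
# {'stable-rev': 900, 'canary-rev': 100}
```

A gradual rollout then just moves the weights (90/10, then 50/50, then 0/100), with instant rollback available by shifting 100% back to the stable revision.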
What are Cloud Run container limits?
Answer: Limits have grown over time; currently up to 8 vCPU and 32 GiB of memory per instance (the largest configurations are not available in every region, so check current quotas). With CPU allocated only during requests, CPU is throttled between requests; with CPU always allocated it is not. Private networking to a VPC is available via Serverless VPC Access connectors or Direct VPC egress.
What is Cloud Run jobs?
Answer: Cloud Run jobs are serverless batch processes that run to completion (not serving traffic). They can be scheduled or on‑demand, with retries. Pay for CPU and memory duration. No idle instances.
What is GKE node auto‑provisioning?
Answer: Node auto‑provisioning automatically creates and deletes node pools in a cluster based on unschedulable pods. It uses machine type recommendations. Helps to scale cluster capacity.
What is GKE cluster autoscaler?
Answer: Cluster autoscaler resizes the node pool size when there are unschedulable pods or underutilized nodes. Works with node auto‑provisioning. You define min/max nodes per node pool.
What is GKE Vertical Pod Autoscaler (VPA)?
Answer: VPA adjusts pod CPU and memory requests automatically based on historical usage. Reduces over‑provisioning and OOM kills. It can also evict pods to apply new recommendations.
What is GKE Horizontal Pod Autoscaler (HPA)?
Answer: HPA scales the number of pod replicas based on CPU usage, memory, or custom metrics (Prometheus). Works with the metrics server.
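The scaling decision uses a simple ratio. The core formula from the Kubernetes HPA documentation (the tolerance band and stabilization window are omitted here):

```python
import math

def desired_replicas(current: int, current_metric: float, target: float) -> int:
    """HPA core formula:
    desired = ceil(current * currentMetricValue / targetMetricValue)."""
    return math.ceil(current * current_metric / target)

# 4 pods averaging 90% CPU against a 60% target -> scale out to 6
print(desired_replicas(4, 90, 60))  # 6
```

The same formula scales in: 10 pods at 30% against a 60% target yields 5. Because the target sits in the denominator, setting it too low causes aggressive over-scaling.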
What is GKE Sandbox (gVisor)?
Answer: GKE Sandbox provides stronger isolation for untrusted workloads by running a user‑space kernel (gVisor) between the container and the host kernel, a harder boundary than standard Linux namespaces alone, at the cost of some performance overhead.
What is GKE Dataplane V2?
Answer: Dataplane V2 uses eBPF for network policy enforcement, load balancing, and observability, replacing iptables. Improves scalability and performance.
What is Cloud Service Mesh?
Answer: Cloud Service Mesh (formerly Anthos Service Mesh) is a managed Istio‑based service mesh for GKE. It provides traffic management (traffic splitting, retries), security (mTLS), and observability (service topology and metrics).
What is Config Sync?
Answer: Config Sync (part of Anthos) continuously applies Kubernetes configurations from a Git repository to GKE clusters. GitOps model. Ensures cluster state matches desired state.
What is Policy Controller?
Answer: Policy Controller (Gatekeeper) is a policy engine on GKE that enforces constraints (e.g., require labels, deny privileged containers). Uses OPA (Open Policy Agent). Prevents non‑compliant resources.
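The kind of constraint Policy Controller enforces can be sketched as a validation function. A toy version of a "no privileged containers" check (real constraints are written as Gatekeeper ConstraintTemplates in Rego, evaluated at admission time):

```python
def violates_no_privileged(pod: dict) -> list:
    """Return the names of containers in a pod spec that request
    privileged mode; an admission controller would reject the pod
    if this list is non-empty."""
    bad = []
    for c in pod.get("spec", {}).get("containers", []):
        if c.get("securityContext", {}).get("privileged"):
            bad.append(c["name"])
    return bad

pod = {"spec": {"containers": [
    {"name": "app", "securityContext": {"privileged": False}},
    {"name": "debug", "securityContext": {"privileged": True}},
]}}
print(violates_no_privileged(pod))  # ['debug']
```

The admission-time placement is the point: non-compliant resources are rejected before they run, rather than detected after the fact.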
What is Cloud Source Repositories?
Answer: Managed private Git repositories integrated with Cloud Build, Cloud Functions, App Engine, and Cloud Run, with mirroring from GitHub and Bitbucket. Note: Cloud Source Repositories is no longer available to new customers; Google points new projects to third‑party Git providers.
What is Cloud Code?
Answer: Cloud Code is an IDE extension for VS Code and IntelliJ for developing, debugging, and deploying to GCP (Kubernetes, Cloud Run, Cloud Functions).
What is Cloud SDK (gcloud)?
Answer: gcloud is the command‑line interface for GCP. Supports managing almost all resources and can invoke APIs. Companion tools: gsutil (Cloud Storage), bq (BigQuery), kubectl.
What is Cloud Console?
Answer: Cloud Console is the web‑based UI for GCP. Mobile app available. Offers quick access to services, monitoring, billing, IAM.
What is Terraform vs Deployment Manager?
Answer: Terraform is third‑party IaC (HashiCorp) with a large community, multi‑cloud support, and mature state management; it is the more popular choice. Deployment Manager is GCP‑native but is being phased out; Google's successor is Infrastructure Manager, a managed service that runs Terraform.
How do labels help with cost attribution?
Answer: Labels are key‑value pairs attached to resources; billing reports and the BigQuery billing export can aggregate cost by label. Use them for chargeback, showback, and cost analysis by team, environment, or application.
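Aggregating cost by label is a simple group-by over billing line items. A sketch of what the billing export lets you do (the line-item dicts are a made-up shape, not the real export schema):

```python
from collections import defaultdict

def cost_by_label(line_items, key):
    """Sum costs grouped by the value of one label key; items
    missing the label fall into an '(unlabeled)' bucket."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item["labels"].get(key, "(unlabeled)")] += item["cost"]
    return dict(totals)

items = [
    {"cost": 120.0, "labels": {"team": "data"}},
    {"cost": 45.5,  "labels": {"team": "web"}},
    {"cost": 10.0,  "labels": {}},
]
print(cost_by_label(items, "team"))
# {'data': 120.0, 'web': 45.5, '(unlabeled)': 10.0}
```

The "(unlabeled)" bucket is worth watching in practice; a large unlabeled share means your chargeback numbers understate each team's real spend.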
What is committed use discounts for BigQuery?
Answer: BigQuery offers capacity‑based pricing through slot commitments (formerly flat‑rate flex, monthly, and annual plans; now purchased via BigQuery editions). Commitments discount slot usage, and reservations assign committed slots to projects, folders, or the organization. Storage is billed separately.
How are long‑running BigQuery queries priced?
Answer: On‑demand pricing is per bytes processed, regardless of runtime. Interactive queries run immediately; batch queries may be queued until resources free up, but cost the same and do not count against the concurrent query limit. For predictable spend, use capacity‑based pricing with slot reservations.
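Since on-demand cost depends only on bytes scanned, it is easy to estimate up front (the query dry-run reports bytes processed without running the query). A sketch; the per-TiB rate below is an assumption, check current pricing for your region:

```python
def on_demand_cost(bytes_processed: int, usd_per_tib: float = 6.25) -> float:
    """Estimate on-demand query cost from bytes scanned.
    usd_per_tib is an assumed rate, not a quoted price;
    1 TiB = 2**40 bytes."""
    return bytes_processed / 2**40 * usd_per_tib

# A query scanning 500 GiB at the assumed rate:
print(round(on_demand_cost(500 * 2**30), 4))  # 3.0518
```

This also shows why partition pruning pays off directly: halving the bytes scanned halves the on-demand bill for that query.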
What is GKE cost optimization?
Answer: Use node auto‑provisioning with preemptible/spot nodes for fault‑tolerant workloads. Right‑size resource requests/limits. Use VPA and HPA. Use committed use discounts on node CPU/memory. Use cluster autoscaler. Consider GKE Autopilot (pay per pod).
What is Cloud Run cost optimization?
Answer: Set concurrency appropriately (default 80) so fewer instances serve the same load. Use CPU allocated only during requests unless you need background processing or consistently low latency. Keep minimum instances at 0 unless cold starts are a problem, and enable startup CPU boost to shorten them. Right‑size memory and CPU per instance.
What is Compute Engine cost optimization?
Answer: Use Spot/preemptible VMs for fault‑tolerant batch jobs and committed use discounts for steady workloads. Apply rightsizing recommendations and the idle‑VM recommender, and delete unused disks, snapshots, and reserved IP addresses.
Why should we hire you as a Google Cloud Architect?
Answer: I have deep understanding of GCP services (compute, storage, networking, security, data, serverless) and architectural best practices (high availability, disaster recovery, cost optimization). I can design scalable, secure solutions using managed services to reduce operational overhead. I also understand migration strategies (lift‑and‑shift, re‑platform, re‑architect) and hybrid connectivity (Interconnect, VPN). I am certified (Professional Cloud Architect) and have hands‑on experience with Terraform (IaC) and CI/CD pipelines. I communicate complex designs clearly and align them with business goals.
Conclusion
The interviewer leans back, arms crossed, and asks, “Design a globally scalable, highly available solution on Google Cloud.”
Three weeks ago, that question might have made your throat tighten. But today? Today something inside you just smiles.
Because you’ve walked through the corridors of this guide. You’ve shaken hands with Compute Engine, become friendly with Cloud Run, explored the quiet intelligence of BigQuery, and learned exactly when to whisper “Cloud CDN” for that instant performance boost. You’ve mapped VPCs like city grids and balanced traffic with the grace of a conductor.
☁️ This isn’t just preparation anymore. This is architectural fluency — the kind that lets you walk into any room and speak GCP’s language as if it were your own. No robotic memorization. No cold sweat. Just the easy confidence of someone who’s built a mental model of the cloud that actually holds up under pressure.
So step into your interview like the lead architect of your own career. You’re not guessing. You’re designing. And that’s a style no interviewer ever forgets.