by Suv Rec, 2 years ago
VPC Peering
Communication happens through private IP
Each GCP network admin pairs their network with the other
Outside the same organization
Shared VPC
The GCP projects are connected through a dedicated, privately generated network
Within the same organization
To create a dedicated, private connection between different GCP networks, whether or not they belong to the same organization node
There is no SLA for peering
Connection is established at the PoPs (GCP's Edge Points of Presence), where Google's network connects to the rest of the Internet via peering
It gives access not only to Google Cloud infrastructure, but to all of Google's services
Access: External IP
Carrier Peering
If Direct Peering is not an option, several providers offer to act as a bridge
Capacity: Depends on carrier
Establishing a direct connection with Google at a PoP
Requires: Connection in peering facility
Capacity: 10 Gbps / link
Physical link between the networks' points of access (PoA) within a data center
If Dedicated Interconnect is not an option, several providers offer to act as a bridge
Requires: Service provider
Capacity: 50 Mbps - 10 Gbps / connection
Install your own router within a data center hosting a Google PoA and link them directly
Requires: Connection in colocation facility
Capacity: 10 - 100 Gbps (100 in beta) / link
Create an IPsec tunnel, with data encryption performed by gateways at each end of the connection
Access: Internal IP
Requires: On-prem Gateway installation
Capacity: 1.5 - 3 Gbps / tunnel (scalable)
To connect an on-prem network with Google's network by creating a direct link between owned resources and Google Cloud resources
Internal IP is generated through link-local BGP routes
TCP proxy
SSL proxy
HTTP(S) Load Balancing
3.5 A network endpoint group (NEG) is a configuration object that specifies a group of back-end endpoints or services
Internet NEG
Specified by "FQDN:[port]" or "IP:[port]"
A single hybrid connectivity endpoint pointing to Traffic Director services outside Google Cloud
Zonal NEG
1 or more endpoints (VM instances or containers)
Serverless NEG
Points to Cloud Run, App Engine, Cloud Functions services residing in the same region
Contains no endpoints
3. Backend service
Priority is given to geographical proximity and, if available, to session affinity
Runs health checks and routes the request based on the required load balancing criteria
2. Target proxy / URL Map
Receives the request from 1. and checks it against a URL map to find the most appropriate backend service
1. Global forwarding rule
In case of HTTPS, the target proxy needs to hold a valid SSL certificate
IPv4/IPv6, scalable, requires no pre-warming, content-based, cross-regional
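The numbered steps above (forwarding rule, target proxy / URL map, backend service) can be pictured as a lookup from request path to backend. The following is only a minimal sketch: `URL_MAP`, `DEFAULT_BACKEND`, `pick_backend` and the backend names are hypothetical, and a real URL map also has host rules and richer path matchers.

```python
# Hypothetical URL map: path prefix -> backend service name.
URL_MAP = {
    "/video": "video-backend",
    "/static": "static-backend",
}
DEFAULT_BACKEND = "web-backend"  # used when no prefix matches


def pick_backend(path: str) -> str:
    """Return the backend whose path prefix is the longest match for `path`."""
    matches = [p for p in URL_MAP if path == p or path.startswith(p + "/")]
    if not matches:
        return DEFAULT_BACKEND
    return URL_MAP[max(matches, key=len)]
```

For example, `pick_backend("/video/cat.mp4")` selects the video backend, while a path that matches no prefix falls through to the default one.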
Internal HTTP(S)
Network TCP/UDP
Internal TCP/UDP
Health check management
Criteria are
Unhealthy threshold
How many failed attempts are decisive
Healthy threshold
How many successful attempts are decisive
Timeout
How long to wait for a response
Check interval
How often to check whether an instance is healthy.
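How the two thresholds interact can be sketched as a small state machine. This is an illustration, not Google's implementation: the class name `HealthTracker` and the default values are assumptions. A probe that gets no answer within the timeout would be recorded as a failure, and `record` would be called once per check interval.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    healthy_threshold: int = 2    # consecutive successes needed to become healthy
    unhealthy_threshold: int = 3  # consecutive failures needed to become unhealthy
    healthy: bool = True
    _streak: int = 0              # consecutive probes contradicting the current state

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            # The probe agrees with the current state: reset the streak.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if probe_ok else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy
```

With the defaults above, an instance is marked unhealthy only after three failed probes in a row, and needs two successful probes in a row to be marked healthy again.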
Instances created or crashed outside the group's own commands are recovered
Autoscaling is triggered based on the group's targets (CPU usage, load balancing, total capacity, budget, ...)
It's a group of identical VM instances generated from a template and managed as a whole
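As a rough illustration of autoscaling on a CPU target, the sketch below sizes the group so that average utilization would return to the target. The proportional formula, the function name `recommended_size` and the size limits are assumptions made for this example, not the autoscaler's documented algorithm.

```python
import math


def recommended_size(current_size: int, current_cpu_pct: int, target_cpu_pct: int,
                     min_size: int = 1, max_size: int = 10) -> int:
    """Scale the group proportionally to (current CPU / target CPU), within limits."""
    if target_cpu_pct <= 0:
        raise ValueError("target_cpu_pct must be positive")
    wanted = math.ceil(current_size * current_cpu_pct / target_cpu_pct)
    return max(min_size, min(max_size, wanted))
```

A group of 4 instances running at 90% CPU against a 60% target would grow to 6 instances; the same group at 30% CPU would shrink to 2.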
It's not covered by Google Service Level Agreement (SLA)
Gives direct access to Google's network
Carrier Peering
Using a third-party service provider, the customer network can be peered with those Google products that are exposed through a public IP
Direct peering
Placing a router in the same data center as a Google point of presence
Partner Interconnect
Useful if
The data connection needs are lower than 10 Gbps
The data center cannot be reached by Dedicated Interconnect
Connectivity between an on-premises network and a VPC network through a supported service provider.
Dedicated Interconnect
If the connection topologies meet Google's specifications, the SLA covers up to 99.99% availability
Highest uptimes possible
1 or more private, direct connections to Google
Bandwidth reliability
Security concerns
New subnets are automatically added to the connection
A "secure Internet Protocol" generated to create a tunnel connection
Layer 3 connections provide access to G Suite services, YouTube, and Google Cloud APIs using public IP addresses
Layer 2 connections use a VLAN that pipes directly into your GCP environment providing connectivity to internal IP addresses
Dedicated connections provide a direct connection to Google's network
Shared connections provide a connection to Google's network through a partner
Further specification
Billing indexing
Inventory purposes
A value ('HomeVM','Amagis','testing','ThomasEdison','ABC01234'...)
A key ('name', 'company', 'aim','contact','cost center'...)
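Since a label is just a key/value pair attached to a resource, using labels for inventory comes down to filtering on those pairs. A minimal sketch with a hand-written inventory: the resource names, the value 'XYZ99999' and `find_by_labels` are hypothetical, and the sample keys and values are taken from the notes as written (the syntax GCP accepts for labels is, as far as I know, more restrictive).

```python
# Hypothetical inventory: resource name -> labels (key/value pairs).
resources = {
    "vm-1": {"name": "HomeVM", "aim": "testing", "cost center": "ABC01234"},
    "vm-2": {"name": "Amagis", "aim": "testing", "cost center": "XYZ99999"},
    "vm-3": {"name": "ThomasEdison", "aim": "production", "cost center": "ABC01234"},
}


def find_by_labels(resources: dict, wanted: dict) -> list:
    """Return the sorted names of resources carrying every wanted key/value pair."""
    return sorted(
        name
        for name, labels in resources.items()
        if all(labels.get(key) == value for key, value in wanted.items())
    )
```

Filtering on `{"cost center": "ABC01234"}` gives the resources to bill to that cost center; filtering on `{"aim": "testing"}` gives the test machines.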
In case of need, the versioning of the interface allows for legacy compatibility
It can be deployed with just the required system specification (adaptability) and redefined on demand (adjustability); it is coordinated within the cloud with just the number of replicas needed to share the actual workload (scalability), and these are much quicker than an ordinary VM to boot or stop.
From the host's point of view it resembles a service; from the guest software's point of view it is a VM. Moreover, from the point of view of the whole application they are built for, each guest is a microservice.
What they use
More resources to be focused on application logic
What they allocate in advance
Network Capabilities
Raw Compute
Companies to concentrate on their goal rather than on maintaining technical infrastructure
Includes Cloud SDK and other utilities fully available, updated and authenticated.
It's actually a Debian-based VM with persistent 5 GB home dir
Provides command-line access to cloud resources from the browser
Set of tools useful to manage resources and applications
bq - a command line tool for BigQuery
gsutil - provides access to Cloud Storage from cmd line
gcloud tool - the main command line interface
SSH connection via browser
Find resources, check status, manage them, set budgets
Easily deploy, scale, and diagnose production issues
When possible, use the Identity-Aware Proxy (IAP) tool
On Service Accounts
Establish key rotation policies and methods
And audit keys with the "serviceAccounts.keys.list()" method
When creating one, use a clear explanatory name based on its purpose
Even better if a naming convention is established
Be careful when granting the "serviceAccountUser" role, since the user will receive all the permissions granted to the service account
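A key rotation audit can be as simple as flagging keys older than the rotation period. The sketch below works on a hand-written list of dictionaries rather than calling any API; the field names `name` and `validAfterTime` are assumed to mirror what the keys list method returns, and `stale_keys` and the 90-day period are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_keys(keys: list, now: datetime, max_age_days: int = 90) -> list:
    """Return the names of keys created more than `max_age_days` before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for key in keys:
        # Timestamps are assumed to look like "2021-01-01T00:00:00Z" (UTC).
        created = datetime.strptime(
            key["validAfterTime"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if created < cutoff:
            stale.append(key["name"])
    return stale
```

Passing `now` explicitly keeps the check reproducible; in a scheduled audit it would be `datetime.now(timezone.utc)`.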
On Policies
When possible, grant roles to groups instead of individuals, then...
Control the ownership of the groups used in IAM policies
Audit membership of groups used in policies
Audit policy changes in Cloud Audit Logs: SetIamPolicy
Apply the "principle of least privilege" when assigning roles
Check the policy granted on each resource and make sure to understand its inheritance
Use projects to group resources that share the same trust boundary
Each resource belongs to exactly one project
Can have several owners and users
If it's under an organization node, its identity will automatically be an owner
Are the basis for the use of Google services
Require an Organization node
Either your organization has a Google Workspace domain, or you will need to create an identity by Cloud Identity
SLO compliance
Error Reporting - to assist developers' work
Alerts - triggered automatically from signal data and, if needed, sent directly to personnel in key roles
by using
Profiler - from running Apps
Snapshot Debug - from running Apps
Health Checks - from Services (uptime and latency when facing external sites)
Service Monitoring - from Services (compliance with SLOs) and error alerts
Logs Explorer - from Logs
Metrics Explorer - from Signal Data
Dashboards - from Signal Data
Are divided in
Trace - for Apps
Logs - for Apps, Services, Platform
Categorized in
Service Logs: created by developers deploying code to Google Cloud.
Network Logs: Network and Security operations
NAT Gateway - capture information on NAT network connections and errors
Firewall Rules - allow you to audit, verify, and analyze the effects of your firewall rules
VPC flows - records samples of VPC network flow and can be used for network monitoring, forensics, real-time security analysis, and expense optimization
Agent Logs: generated by a Google agent installed on AWS or Google Cloud VM instances to ingest the logs these produce
Cloud Audit Logs: helps answer the question "Who did what, where, and when?"
Access Transparency - capture the actions Google personnel take when accessing your content
System events - non-human Google Cloud administrative actions that change the configuration of resources
Data access - tracks calls that read the configuration or metadata of resources and user-driven calls that create, modify, or read user-provided resource data
Admin activity - tracks configuration changes
Metrics - for Apps, Services, Platform, Microservices
Integrated in all of Google's tools from the hardware layer up
Set alert thresholds considerably higher than the defined minimum
Includes compensation for paying customers if not respected
The minimum levels of service promised to provide AND what happens when not respected
Commitments made to the client that systems and applications will have only a certain amount of “downtime”
To be something short of 100%, like 99.9% ("3 nines")
Have concrete, well-documented consequences in case of failure to meet the objectives
Be S.M.A.R.T.
Time-bound (or it becomes ephemeral)
Relevant (to the defined goals)
Achievable (realistic under current conditions)
Measurable
Specific (as in not subjective)
The target value for a monitored metric
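To see what a target like 99.9% ("3 nines") means in practice, it helps to turn it into a downtime allowance. A minimal sketch of the arithmetic; the function name and the 30-day period are assumptions chosen for the example.

```python
def allowed_downtime_minutes(slo_percent: float, days: int = 30) -> float:
    """Minutes of downtime a period can absorb while still meeting the SLO."""
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100
```

Over 30 days, 99.9% leaves roughly 43 minutes of downtime, while 99.99% leaves a little over 4 minutes.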
It's suggested
as the ratio: # good events / # all valid events.
Have a close linear relationship with the users' experience of that reliability
Selected monitoring metrics that measure one aspect of a service's reliability
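The ratio "# good events / # all valid events" and its comparison against the target value can be written down directly. This is a sketch with hypothetical function names, and the default target of 0.999 is only an example.

```python
def sli(good_events: int, valid_events: int) -> float:
    """Service level indicator: share of valid events that were good."""
    if valid_events <= 0:
        raise ValueError("at least one valid event is needed")
    return good_events / valid_events


def meets_slo(good_events: int, valid_events: int, slo: float = 0.999) -> bool:
    """True when the measured indicator reaches the target value."""
    return sli(good_events, valid_events) >= slo
```

For instance, 998 good requests out of 1000 valid ones give an indicator of 0.998, which misses a 0.999 target.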
# of dropped connections
Servers that fail liveness checks
# of stack traces
# of exceptions
# of failed requests
# of 400/500 HTTP codes
Wrong answers or incorrect content
Often arise when
a flaw, failure, or fault in a computer program or system causes it to produce incorrect or unexpected results, or behave in unintended ways
The moment to send out an alert
Service level objective violations
Configuration or capacity issues
They are important because
They may indicate that something is failing
They are
Events that measure system failures or other issues
# of users on the system
# of available connections
Memory quota
Disk quota
% CPU utilization
% disk utilization
% cache utilization
% memory utilization
% thread pool utilization
Degrading performance as capacity is reached
It's an indicator of how full the service is
The residual availability of the most constrained resources
To note
Is often a subjective measure depending on the application type
# of active connections
# of read ops
# of write ops
# of active requests
# of retrievals per second
# of transactions per second
# of concurrent sessions
Network I/O
# of requests for static vs. dynamic content
# of HTTP requests per second
User appreciation
Infrastructure spending
Capacity planning
It's important because
It’s an indicator of current system demand
how many requests are reaching the system
Some metrics
Time to complete data return
Time to first response
Transaction duration
Service response time
Query duration
# of requests waiting for a thread
Page load latency
Can be related to
Measurement of system improvement
Capacity demand
Emerging issues
It's important because
It directly affects the user experience
It measures
How long it takes a specific task to return a result
Vulnerability Rewards Programs
provides libraries that prevent developers from introducing certain classes of security bugs
two-party review of new code
central source control
Aggressively limits and actively monitors the activities of employees
Rules and machine intelligence
Multi-tier and multi-layer protections
Sheer scale of the infrastructure is a first layer of protection
Protection against Denial of Service attacks
Best practices are always in place
Every TLS connection is terminated using a public-private key pair and an X.509 certificate
Data in physical storage is encrypted with centrally managed keys and is typically accessed through storage services
Within data centers: Hardware cryptographic accelerators - Ongoing
Between data centers: Encryption of all inter-service RPC communication - Already in place
Third-party Data Center
Limited access
Physical protections
base operating system image
Cryptographic signatures over the BIOS
Security chips
Reduce distance between endpoints
Reduce redundancy by selecting the deployment areas
Protection from localized events
Low Latency
Measures the time a packet takes to reach its destination
North America
South America
Open source software library for machine learning
Designed to prevent over-usage due to malicious attacks
Allocation quotas: resource limits
Rate quotas: reset periodically
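The difference between the two kinds of quota is that a rate quota counts usage inside a time window and starts again when the window ends. A minimal fixed-window sketch: the class name `RateQuota`, the limit and the window length are hypothetical, and the real quota system may count differently.

```python
class RateQuota:
    """Allow at most `limit` calls per window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        self._window_start = None  # start time of the current window
        self._used = 0             # calls counted in the current window

    def allow(self, now: float) -> bool:
        """Count one call at time `now` (in seconds); return False once the quota is used up."""
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            # A new window begins: the counter resets.
            self._window_start = now
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True
```

With `RateQuota(2, 60)`, a third call inside the same minute is refused, and calls are accepted again once the window has passed. An allocation quota, by contrast, would have no reset: the count only goes down when resources are released.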
Notification alerts
App Engine flexible environment VMs
Kubernetes Engine
Compute Engine
Customizable VMs to tailor resources, and therefore pricing, to workloads
Discount applied if used for more than 25% of a month
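How a usage-based discount that starts after 25% of the month adds up can be sketched with incremental tiers. The multipliers below are an assumption (they follow the schedule commonly quoted for general-purpose machine types, where each further quarter of the month is billed at a lower rate); check the current pricing page before relying on them. `TIERS` and `billed_fraction` are hypothetical names.

```python
# Assumed tiers: (upper bound of usage as a fraction of the month, rate multiplier).
TIERS = [(0.25, 1.0), (0.50, 0.8), (0.75, 0.6), (1.00, 0.4)]


def billed_fraction(usage: float) -> float:
    """Fraction of a full month's base price billed for `usage` (0..1) of the month."""
    if not 0 <= usage <= 1:
        raise ValueError("usage must be between 0 and 1")
    billed, lower = 0.0, 0.0
    for upper, multiplier in TIERS:
        if usage > lower:
            # Bill only the part of the usage that falls inside this tier.
            billed += (min(usage, upper) - lower) * multiplier
        lower = upper
    return billed
```

Under these assumed tiers, a VM running the whole month is billed about 70% of the base price, while one running a quarter of the month or less gets no discount.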