
Introduction
The ability to use resources on demand, with high-speed connectivity across the globe and without having to purchase and maintain all the physical hardware, is one of the greatest benefits of a Cloud Service Provider (CSP). But how can we keep control over our data assets when there are suddenly so many possible egress points to consider? Take, for example, the ability to interact with various cloud services such as Cloud Storage, BigQuery and Cloud SQL. And did you also consider that other services like YouTube, Drive and AdSense use the same APIs?
When defining a cloud environment you will most likely configure a network (Virtual Private Cloud: VPC). One of the best practices when designing your cloud platform is to only use private IP addresses (as defined in RFC 1918) for the compute and data resources, so that they cannot be resolved from the public internet. For ingress access to your application a service like Cloud Load Balancing should be preferred, and for egress to the public internet a service like Cloud NAT. This approach will, in many ways, protect your platform from external threats, but open access to the public internet does not protect you from internal threats.
As can be seen from the above diagram, there is nothing protecting data from being sent anywhere across the internet. This is why many organizations choose to enforce a policy that bans or restricts the usage of Cloud NAT. This can cause problems for applications that depend on internet access or on reaching Google services. For these scenarios various solutions can be implemented. For Google API access, however, there is an easy method: 'checking the box' called Private Google Access on the subnetwork in which the resources are deployed. This configures access for the resources in that subnetwork to call the Google APIs. There is a catch: it opens up access to all Google APIs. For organizations that are subject to compliance with policies and regulations like the GDPR, HIPAA and NIS2(/NIST), this is a real problem.
The objective of this blog is to provide insight into how the risk of data exfiltration can be further reduced by controlling access to the Google APIs.
One step in the direction of restricting access to specific Google APIs is to disallow specific services from being enabled in the first place, through Organization Policies, and the policy called Restrict allowed Google Cloud APIs and services in particular. But this won't work if there is a demand for a specific service and you need to ensure that access is only available from within a defined context. This is often the case for organizations that store data in Cloud Storage or analyse it using BigQuery, while still being legally required to protect this data.
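As a sketch of how this could look in Terraform, the constraint behind this policy is gcp.restrictServiceUsage; the organization ID and denied service below are hypothetical:

```hcl
# Hypothetical example: deny specific services organization-wide through
# the "Restrict allowed Google Cloud APIs and services" constraint.
resource "google_org_policy_policy" "restrict_services" {
  name   = "organizations/123456789012/policies/gcp.restrictServiceUsage"
  parent = "organizations/123456789012"

  spec {
    rules {
      values {
        denied_values = ["file.googleapis.com"]
      }
    }
  }
}
```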
VPC Service Controls
This is where VPC Service Controls comes into play. VPC Service Controls is an organization-level service to control access to the available Google APIs. It provides fine-grained control over who can access which service and in which context. VPC Service Controls is implemented by configuring an organization-level resource called a Service Perimeter. This perimeter draws an access boundary around the Google APIs for which access needs to be controlled. In general, VPC Service Controls can be viewed as an additional layer of access control on top of the regular Cloud IAM access controls: applying a perimeter adds an extra line of defense for the Google APIs that it protects.
This may still seem a bit vague for now, but it will become clearer later in this blog.
VPC Service Controls building blocks
Note
It is considered best practice to use an Infrastructure as Code approach, for example with Terraform, to deploy your cloud resources. For the sake of this blog I will show all the steps using screenshots from the Google Cloud Console.
First off: you can find the configuration of the VPC Service Controls perimeter in the Google Cloud Console under the menu item Security -> VPC Service Controls. This is an Organization-level resource and cannot be configured at Folder or Project level.
Now let’s walk through the options that you can see when opening this menu item.
VPC Service Controls resources
Let’s start at the top. You can immediately spot three menu items:
- An item called default policy with a dropdown arrow
- An item called Manage policies
- An item called View Quota
Hierarchy
Starting at the right, because that sheds a bit of light on the hierarchy, we can see the quotas that apply to the current policy. The diagram below clarifies the values that you see when clicking this item; they are also listed in the VPC Service Controls Quota documentation.
Manage policies
This brings us to menu item number 2: Manage policies. As you can see from the quota, there is a difference between the Organization access policy, of which there can be only one (the default policy), and the scoped policies. The latter are the ones you can manage in this item.
Create or update a policy
Following best practices, a typical organization will have a hierarchy that is divided into at least a production and a non-production environment. This can be done with a hierarchy resource called a Folder. This provides the ability to create two scoped policies, one for each folder. When creating a policy, either one Folder or one Project can be set as its scope. This automatically means that any perimeter created under this policy can only include project(s) that are within the scope of that policy.
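As a sketch, a scoped policy could be created with Terraform as follows (the organization ID, folder ID and title are hypothetical):

```hcl
# Hypothetical scoped access policy, limited to a single (non-production) folder.
resource "google_access_context_manager_access_policy" "non_prod" {
  parent = "organizations/123456789012"
  title  = "non-prod-policy"
  scopes = ["folders/987654321098"]
}
```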
Enforced and dry run
In the earlier screenshot the default policy is selected, as marked in blue at the top. This determines under which policy the perimeter will be created, and how its configuration will be applied, when clicking the + NEW PERIMETER button.
- Enforced mode
- Dry run mode
The modes do what you would expect them to do: either enforce the rules you configure, or test rules before enforcing them. Dry run mode still yields the access logs in Cloud Audit Logs, so you can see how the perimeter would affect access to the Google services, but without actually blocking anything. This helps you in crafting the appropriate rules before enforcing access restrictions. The configuration of both options is otherwise the same.
Note
I want to point out an important part when it comes to deploying this using the Terraform resource called google_access_context_manager_service_perimeter. The configuration of the dry run perimeter and the enforced perimeter is done in one and the same resource. Looking more closely at the dry run configuration, you can see that the rules you want to add for dry run require both the spec block and the attribute use_explicit_dry_run_spec set to true. Rules you add under the status block are the rules that will be enforced.
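The note above can be sketched as follows; the policy number, project number and service lists are hypothetical:

```hcl
# Hypothetical perimeter: enforced rules live under "status",
# dry run rules under "spec", activated by use_explicit_dry_run_spec.
resource "google_access_context_manager_service_perimeter" "example" {
  parent = "accessPolicies/0123456789"
  name   = "accessPolicies/0123456789/servicePerimeters/example_perimeter"
  title  = "example_perimeter"

  # Enforced configuration.
  status {
    resources           = ["projects/111111111111"]
    restricted_services = ["storage.googleapis.com"]
  }

  # Dry run configuration: violations are logged in Cloud Audit Logs,
  # but not blocked.
  spec {
    resources           = ["projects/111111111111"]
    restricted_services = ["storage.googleapis.com", "bigquery.googleapis.com"]
  }
  use_explicit_dry_run_spec = true
}
```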
Perimeter types
When you start configuring your perimeter you are brought to the following screen. The name will be obvious, but what stands out is the distinction between a Regular perimeter and a Perimeter bridge. The Config Type selection is disabled, as this was determined when you clicked the button under Enforced or Dry run mode.
Let me explain the difference between these two perimeter types with some diagrams.
Regular perimeter
A regular perimeter draws a boundary around the GCP projects or VPC networks that you select. All resources within this regular perimeter are still subject to Cloud IAM, but access to and from outside the perimeter has added controls. An important factor is that a GCP project or VPC network can only be included in exactly one regular service perimeter!
The above diagram displays a basic setup with two perimeters: Production Service Perimeter and DTA Service Perimeter. In both these perimeters Cloud Storage is allowed, while the regular Cloud IAM permissions are still verified. It is not allowed to access the Cloud Storage API tied to the Production project from the DTA project.
Perimeter bridge
A perimeter bridge provides the ability to allow cross-perimeter Google service communication for the included resources. The difference here is that GCP projects and VPC networks can be part of zero or more perimeter bridges. Bear in mind that when you create a perimeter bridge, you effectively negate the purpose of a service perimeter between these projects or networks and depend only on the protection of Cloud IAM.
The above diagram introduces two new projects called Production DMZ project and DTA DMZ project, and a perimeter bridge called DMZ Service Perimeter Bridge. A typical use case for a perimeter bridge is where the data in the DTA environment should resemble production data as closely as possible, but needs to be anonymized before it can be used.
Resource to protect
After choosing your perimeter type, the next part of the configuration is to define the resources to protect.
Note
Keep in mind that you can only include resources (projects are considered resources) that have been added to your policy, or you must use the organization-level policy to be able to include any project or network in your perimeter. And remember that a project or network can only be added to one regular perimeter.
What stands out here is that you can select Projects and VPC networks. The most commonly used resources when configuring a perimeter are projects; less common are the VPC networks. When you do select networks as the resource(s) to protect in the perimeter (for example when you want to include a Shared VPC), this only effectively applies to network-level services, such as Cloud SQL. Project-level services, like Cloud Storage, can yield unexpected behavior, as these have no relationship to the networks defined in the perimeter and there is no discriminator to specify the access rules from outside of your perimeter.
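In Terraform, projects and VPC networks are mixed in the same resources list, but networks are referenced by their full resource name. A hypothetical fragment of a perimeter's status block (project number, host project and network name are made up):

```hcl
status {
  resources = [
    # Projects are referenced by project number.
    "projects/111111111111",
    # VPC networks (e.g. a Shared VPC) use the full resource name.
    "//compute.googleapis.com/projects/host-project/global/networks/shared-vpc",
  ]
}
```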
Restricted services
The selection of the restricted services is where the confusion for many already starts.
In this item you can either select all services or specific services. The services that you select here are the Google services that you want to restrict from usage, from both inside and outside the perimeter. This means that if you select Cloud Storage as a service to restrict, this service will no longer be accessible without explicit configuration: you have drawn a line for the usage of Cloud Storage in relation to the project(s) in the perimeter.
In the diagram below you can see it a bit more clearly. Both the Production project and the DTA project sit in their own respective service perimeters. Next to that there is some Other project, which can be in the same GCP organization or in another organization entirely. When adding Cloud Storage to the restricted services, you will have effectively blocked any access to this service, both from inside and outside of the perimeter. All other services, which have not been added to this list, will still be freely accessible and only subject to regular Cloud IAM permissions.
VPC Accessible services
The confusion increases for most when selecting the VPC Accessible services. How is this related to the restricted services? The confusion is often triggered by the four options you can choose here.
- All services
- No services
- All restricted services
- Selected services
To explain it more clearly, I add BigQuery to the services used by our application from the earlier diagrams. For these diagrams we consider the scenario where we've selected Cloud Storage to be in our restricted services, but not BigQuery. If you look closely at the diagrams, you will see that the Service Perimeter partially overlaps the project, but not completely. This is to make clear which service related to the project is included in the perimeter.
All services
From the diagram you can see that when you select All services, all services are accessible from within the perimeter. This includes all Google APIs, not just BigQuery from the diagram. The only difference here is that Cloud Storage is no longer accessible from outside the service perimeter.
No services
This diagram shows that when you select No services, it has more effect on service access from within the service perimeter than from outside it. In this scenario Cloud Storage is not accessible at all. BigQuery is only protected through Cloud IAM from outside the perimeter, but cannot be reached from inside the perimeter. A typical use case for this is to protect against data exfiltration from inside the perimeter to disallowed services.
All restricted services
This diagram shows that selecting All restricted services ensures that Cloud Storage can be reached from inside the perimeter, but not from outside. BigQuery, however, is accessible from outside the perimeter but not from inside it.
Selected services
For a bit more clarity on the last option, we need to make a change in our scenario. Let's consider that we've now also added BigQuery to our Restricted services.
For this diagram we used Selected services and only selected Cloud Storage as the service that should be accessible. However, since BigQuery is now in the restricted services but not in the selected VPC accessible services, it is no longer accessible from inside nor from outside the service perimeter.
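In Terraform terms, these options map to the vpc_accessible_services block inside the perimeter. A hypothetical fragment mirroring this last scenario:

```hcl
status {
  restricted_services = ["storage.googleapis.com", "bigquery.googleapis.com"]

  vpc_accessible_services {
    enable_restriction = true
    # "Selected services": only Cloud Storage is reachable from inside.
    # The special value "RESTRICTED-SERVICES" would correspond to the
    # "All restricted services" option instead.
    allowed_services = ["storage.googleapis.com"]
  }
}
```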
Access Levels
We've now covered one of the more confusing parts, but we are not there yet. Next up is the ability to add an Access Level. Note that this is singular, as you can only select one access level at the perimeter level. Do bear in mind that an access level can contain other access levels. These access levels are configured in a different menu item: Security -> Access Context Manager. You might already be familiar with Access Context Manager from other security features, like Identity-Aware Proxy, and these are indeed the same access levels.
Access Context Manager
The Access Context Manager provides the ability to configure sets of rules that will be applied for authorization. This can be considered as part of the broader Zero Trust principles that can be applied on network (Zero Trust Network Access, ZTNA) and application level (Zero Trust Application Access, ZTAA).
Creating and configuring an Access Level can include a wide variety of options, such as specifying the IP ranges that are allowed or denied, and a geographical context (applicable to External Application Load Balancers and service perimeters). Access levels can also be combined with other Access Levels. There are premium options available when you also purchase licences for Chrome Enterprise Premium. These further enhance the security capabilities, but lie far beyond the scope of this blog.
The important part for this section is to understand that it provides the ability to also validate the context in which the request is being made to the service(s) protected by the service perimeter.
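As a sketch, a basic access level could look like this in Terraform (the policy number, IP range and region are hypothetical):

```hcl
# Hypothetical access level: allow requests from a corporate IP range
# originating from the Netherlands.
resource "google_access_context_manager_access_level" "corp_network" {
  parent = "accessPolicies/0123456789"
  name   = "accessPolicies/0123456789/accessLevels/corp_network"
  title  = "corp_network"

  basic {
    conditions {
      ip_subnetworks = ["203.0.113.0/24"]
      regions        = ["NL"]
    }
  }
}
```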
Ingress policy
The Ingress policy is used to apply granular rules that grant resources outside of the perimeter access to resources inside the perimeter. The diagram below visualizes this process.
Here you can see that the ingress policy is applied to any request that wants to access a service protected by the service perimeter. The benefit of this approach over the usage of a perimeter bridge is two-fold:
1. You can configure rules for projects that are not part of any service perimeter, or even outside the organization
2. It provides a granular authorization process to determine which service is allowed to be accessed from outside the perimeter
The dependency on ingress policies does, however, come with some caveats. You need to explicitly configure all the rules for any access coming from outside the service perimeter. This includes not only access from other projects, but even affects the ability to access resources using the Cloud Console, and access from managed services that are deployed in Google-managed single tenant projects. This can lead to a whole lot of testing and trying, while going through the Cloud Audit Logs and checking how to resolve errors using the Troubleshooting guide.
From my personal experience, especially the error NETWORK_NOT_IN_SAME_PERIMETER can provide a lot of headaches. One particular case is where the requests are coming from a resource in a project that is contained in a Shared VPC, while the current perimeter is outside of this VPC. You then need to explicitly grant this network access. Another one is where there is use of a managed service, like Cloud SQL, where the network of the single tenant project is peered with the project in the service perimeter. The trick is to figure out what network it actually is, since in the logs the network name will show up as __unknown__.
The ingress policy itself is fairly self-explanatory. You can select where you expect the requests to be coming from and to which service:
- You can define the identities you expect
- You can specify the source locations, including other access levels
- You can specify the target project(s)
- And the target service(s)
- For some services, like Cloud Storage and BigQuery, you can even be more specific by defining the methods that are allowed
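The options above can be sketched as an ingress_policies block inside the perimeter's status (or spec, for dry run). All identities, numbers and access level names below are hypothetical:

```hcl
status {
  ingress_policies {
    # Where the request comes from: which identity, from which source context.
    ingress_from {
      identities = ["user:analyst@example.com"]
      sources {
        access_level = "accessPolicies/0123456789/accessLevels/corp_network"
      }
    }
    # What it may reach: target project, service and even specific methods.
    ingress_to {
      resources = ["projects/111111111111"]
      operations {
        service_name = "storage.googleapis.com"
        method_selectors {
          method = "google.storage.objects.get"
        }
      }
    }
  }
}
```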
Egress policy
Last but not least is the egress policy. Especially to protect against data exfiltration, this feature will help in achieving compliance with required security controls.
This diagram shows that the egress policy is applied to calls from within the perimeter to outside the perimeter. Note that if the target resource itself is inside a service perimeter, ingress rules must also be added in that perimeter for the request to be successful.
The configuration of the egress policy itself is a little less complex, and the fields that can be configured are pretty similar to those of an ingress policy.
One specific deviation from the ingress policy is that for an egress policy you can also specify a resource at an external CSP, like AWS or Azure.
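A hypothetical egress_policies fragment, including an external resource at another CSP (the service account and bucket names are made up):

```hcl
status {
  egress_policies {
    egress_from {
      identities = ["serviceAccount:exporter@my-project.iam.gserviceaccount.com"]
    }
    egress_to {
      # external_resources can point to a non-Google target, such as an S3 bucket.
      external_resources = ["s3://my-external-bucket"]
      operations {
        service_name = "bigquery.googleapis.com"
        method_selectors {
          method = "*"
        }
      }
    }
  }
}
```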
Conclusion
VPC Service Controls provide the capability to control access to Google APIs, which would otherwise be broadly accessible, even from the context of a private network with Private Google Access enabled. The scope goes a little further by ensuring that these services can now be protected from access outside of the perimeter, so that you can enforce data protection on multiple levels. It is often an important feature for proving that you comply with policies and regulations, like the GDPR and HIPAA.
The configuration of VPC Service Controls can become quite complex, especially when you are unfamiliar with the topic and lose sight of the forest for the trees. This blog intended to provide clarity on how to properly configure a VPC Service Perimeter, which parts there are to configure, and the considerations to make when implementing it (like when to choose perimeter bridges over ingress and egress policies). If you have any further questions, please feel free to reach out.