Wednesday, October 4, 2023

Back to the basics: Should I use security list or network security group or both to secure my OCI deployment?

Today I was in a customer call, it was a pretty straightforward scenario. The session turned into hands-on pretty fast, and I love it, sharing the curiosity and eagerness to solve the problem with technical people, few things can match that feeling. And as always we came to the point where we delve into the troubleshooting. So here we go...

The requirement is simple, deploy an Ubuntu server to host a demo application over HTTP port 80, a small VM in a public subnet with a public IP Address and supporting security rules. VCN is created with the wizard, and it comes with a Default Security List which is populated with 3 stateful ingress rules:

First rule enables SSH access to my host, the other two ICMP rules are there for debugging and they don't enable a ping response. All of them are stateful. And this is the Egress part:

There is one stateful egress rule which enables outgoing traffic to any destination with any protocol on any port. State will be important as we will find out later...

Security list is attached to subnet and enforced at all VNICs in the subnet. So setting general rules with security list makes sense, however we also need to open HTTP 80 port for one server and we don't want this for all servers in the subnet. For this purpose we use network security groups which is another type of virtual firewall that Oracle recommends over security lists. You can use security lists and network security groups together. How do the rules apply? At simplest: a union of all rules are applied to VNIC. Security list is tied to subnet so applies to all VNICs in the subnet, NSG is attached to individual VNIC, so it's granular. Here are the rules in our NSG:

First rule is allowing incoming TCP traffic on port 80 from any source, the second rule is allowing outgoing TCP traffic. And rules are stateless , which means connection tracking is disabled. Why would I want that? Maybe I am expecting high traffic, or maybe I was greedy and wanted everything at once.

Overall architecture can be simplified like this:

So we SSH into our Ubuntu server using our public IP, also add linux firewall rules by updating iptables as explained in detail on this tutorial .

All set, for a really quick dirty test, let's run python

And it works, but we quickly find out there is another problem. We can't access Ubuntu repositories to update the packages or install new ones. Although IPv6 in the error message is distracting, it doesn't work with IPv4 either. It is a problem with accessing the internet.

So after some debugging, we soon realize the problem is having overlapping stateful and stateless rules. Our stateful egress rule on the security list should be providing all the access we need towards internet. But it doesn't, why? Because our stateless egress rule in NSG is overlapping and overriding the SL as stateless has precedence over stateful. This is what documentation exactly warns us about.

If for some reason you use both stateful and stateless rules, 
and there's traffic that matches both a stateful and stateless rule 
in a particular direction (for example, ingress), the stateless rule 
takes precedence and the connection is not tracked. You would need 
a corresponding rule in the other direction (for example, egress, 
either stateless or stateful) for the response traffic to be allowed.

Lessons learned
1. Use stateful rules (which is default) unless I have a good reason to use stateless
2. If using stateless ingress always exactly match it with an egress rule, don't use a broader rule
3. Don't use overlapping stateless and stateful rules, as the stateless rule takes precedence and the connection is not tracked, thus acting different than expected.

How did we fix it?
On our NSG, we converted ingress rule from stateless to stateful, and removed egress rule as it's not needed anymore.

If we wanted to use stateless rules, then a viable solution will be restricting egress rule to exactly match the ingress thus protocol TCP, source port 80 and destination port any.


References:
1. OCI Security Rules: Stateful Versus Stateless Rules
2. Developer Tutorials: Free Tier: Install Apache and PHP on an Ubuntu Instance
3. Enabling Network Traffic to Ubuntu Images: Enabling Network Traffic to Ubuntu Images

No comments:

Post a Comment

Featured

Putting it altogether: How to deploy scalable and secure APEX on OCI

Oracle APEX is very popular, and it is one of the most common usecases that I see with my customers. Oracle Architecture Center offers a re...