What happened to the network???

Have you ever had one of those problems where there was no rhyme or reason to it? Where there was no consistency or trend to the problem? So what do you do? You inspect logs. What if there is nothing showing in the logs? Well, if it is the evening, a single malt scotch helps. During the day, walking to the lunch room to get coffee gives that little bit of a mental break to reset your thoughts.

We had one of those events over the weekend at 1709 Bloor which carried into Monday. Browsers would work and then they wouldn't. You could reach Google.com one time, then the next, it would time out. Sounded like the telco was having an issue, or there was something funky with DNS in-house or out in the big cloud. And so the hunt for the source of the issue began.

We hopped on the phone with the telco to have them investigate from their side; they found nothing. There were no errors on the 1709 Bloor router, and we had clean pings to external websites with no CRC or retry errors. So it was time to recap what we knew:

1) Browsing to everyday, common websites was inconsistent, but DNS was resolving when running a traceroute.
2) Telco was not experiencing any errors.
3) Pings to websites were clean, with response times of 4ms to 10ms, which is normal.
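The split in point 1, where names resolve fine but connections stall, is worth testing separately, since browsers blur the two failures together. Here is a minimal Python sketch of that check (the hostname in the usage note is just a placeholder, and this is only one rough way to separate the symptoms):

```python
import socket

def check_host(host, port=80, timeout=3):
    """Return (dns_ok, tcp_ok) for a host.

    Separates name-resolution failures from connection failures --
    the distinction that mattered here: DNS resolved, but the TCP
    side of browsing was intermittent.
    """
    # Step 1: DNS resolution only.
    try:
        ip, port_num = socket.getaddrinfo(
            host, port, proto=socket.IPPROTO_TCP)[0][4][:2]
    except socket.gaierror:
        return (False, False)
    # Step 2: TCP connect to the resolved address.
    try:
        with socket.create_connection((ip, port_num), timeout=timeout):
            return (True, True)
    except OSError:
        return (True, False)
```

Run it against a few sites from an affected machine; a pattern of `(True, False)` results while DNS keeps succeeding points the finger at something in the traffic path rather than at name resolution.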

After the review, the common denominator was the 1709 Bloor firewall: all internal and external traffic flows through it. Then we remembered we had a similar issue in the past where the current policy in the database had gotten corrupted, so we decided to revert to a previous policy. Before we did that, we needed to confirm it really was the firewall. We wired a laptop around the firewall, directly to the telco, and did some browsing. No issues at all. Everything was smooth.

We reverted to a previous policy on the 1709 Bloor firewall and all issues disappeared. It took a few minutes for the firewall cache to clear, but outside of that, everything was back to normal. Actually, there was one thing: the external IP address for Uunanet reverted to the old address, which stopped a couple of people who work remotely from accessing it. So we changed the external IP, pushed the policy, and everything was groovy again. We also made sure the policy was saved, so if we ever need to revert to it again, everything is there.

I want to thank everyone for their patience and understanding while we took the time to troubleshoot and resolve the issue. Once again, happy browsing!
